@daixuan1996
2015-01-03T16:25:09.000000Z
Probability and Statistics
Continuous random variables and probability density functions(概率密度函数)
Continuous random variables(连续随机变量)
- A random variable X is said to be continuous if its set of possible values is an entire interval of numbers; that is, if for some A < B, any number x between A and B is possible.
Probability distributions for continuous variables
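- When a probability histogram is scaled so that the area of each rectangle equals the proportion of values falling in its interval, the total area of all rectangles is 1.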
- Let X be a continuous rv. Then a probability distribution or probability density function - pdf(概率密度函数) of X is a function $f(x)$ such that for any two numbers a and b with $a \le b$,
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
- That is, the probability that X takes on a value in the interval [a,b] is the area under the graph of the density function, as illustrated in the figure below:
- The graph of f(x) is often referred to as the density curve(密度曲线).
- For f(x) to be a legitimate pdf, it must satisfy the following two conditions:
  - $f(x) \ge 0$ for all x
  - $\int_{-\infty}^{\infty} f(x)\,dx$ = area under the entire graph of f(x) = 1
- A continuous rv X is said to have a uniform distribution(均匀分布) on the interval [A,B] if the pdf of X is
$$f(x;A,B)=\begin{cases}\dfrac{1}{B-A} & A \le x \le B\\ 0 & \text{otherwise}\end{cases}$$
- If X is a continuous rv, then for any number c, P(X=c)=0. Furthermore, for any two numbers a and b with a < b,
$$P(a \le X \le b)=P(a < X \le b)=P(a \le X < b)=P(a < X < b)$$
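The area interpretation can be checked numerically. The sketch below is not part of the original notes: it assumes a uniform distribution on [A, B] = [0, 10] and approximates the integral with a midpoint Riemann sum. The helper names `uniform_pdf` and `integrate` are illustrative only.

```python
def uniform_pdf(x, A, B):
    """pdf of the uniform distribution on [A, B]."""
    return 1.0 / (B - A) if A <= x <= B else 0.0


def integrate(f, lo, hi, steps=100_000):
    """Midpoint Riemann sum approximating the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


A, B = 0.0, 10.0  # assumed interval, chosen only for illustration


def pdf(x):
    return uniform_pdf(x, A, B)


total_area = integrate(pdf, A, B)        # condition 2: should be close to 1
prob_3_to_7 = integrate(pdf, 3.0, 7.0)   # should be close to (7 - 3) / (B - A)
print(total_area, prob_3_to_7)
```

Because P(X = c) = 0 for a continuous rv, including or excluding the endpoints 3 and 7 does not change the result.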
Cumulative Distribution Functions and Expected Values
The cumulative distribution function F(x) for a continuous rv X is defined for every number x by
$$F(x)=P(X \le x)=\int_{-\infty}^{x} f(y)\,dy$$
- For each x, F(x) is the area under the density curve to the left of x.
Using F(x) to Compute Probabilities
- Let X be a continuous rv with cdf F(x). Then for any number a,
$$P(X>a)=1-F(a)$$
  and for any two numbers a and b with a < b,
$$P(a \le X \le b)=F(b)-F(a)$$
Obtaining f(x) from F(x)
- If X is a continuous rv with pdf f(x) and cdf F(x), then at every x at which the derivative(导数) $F'(x)$ exists, $F'(x)=f(x)$.
Percentiles(百分位) of a Continuous Distribution
- Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a continuous rv X, denoted by $\eta(p)$, is defined by
$$p=F(\eta(p))=\int_{-\infty}^{\eta(p)} f(y)\,dy$$
- The median of a continuous distribution, denoted by $\tilde{\mu}$, is the 50th percentile, so $\tilde{\mu}$ satisfies $0.5=F(\tilde{\mu})$. That is, half the area under the density curve is to the left of $\tilde{\mu}$ and half is to the right of $\tilde{\mu}$.
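As a small illustration of using a cdf to get probabilities and percentiles, the sketch below assumes the pdf f(x) = 2x on [0, 1], whose cdf is F(x) = x². This example distribution is not from the notes. The percentile is found by bisection on F, which is one simple way to solve p = F(η(p)) when no closed form is at hand.

```python
def cdf(x):
    """cdf F(x) = x^2 on [0, 1] for the assumed pdf f(x) = 2x."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return x * x


def percentile(p, lo=0.0, hi=1.0, tol=1e-12):
    """Solve F(eta) = p by bisection; this works because F is nondecreasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


prob = cdf(0.8) - cdf(0.5)   # P(0.5 <= X <= 0.8) = F(0.8) - F(0.5)
tail = 1 - cdf(0.5)          # P(X > 0.5) = 1 - F(0.5)
median = percentile(0.5)     # 50th percentile; analytically sqrt(0.5)
print(prob, tail, median)
```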
Expected Values for Continuous Random Variables
- The expected or mean value of a continuous rv X with pdf f(x) is
$$\mu_X=E(X)=\int_{-\infty}^{\infty} x\,f(x)\,dx$$
- If X is a continuous rv with pdf f(x) and h(X) is any function of X, then
$$E[h(X)]=\mu_{h(X)}=\int_{-\infty}^{\infty} h(x)\,f(x)\,dx$$
The Variance of a Continuous Random Variable
- The variance of a continuous random variable X with pdf f(x) and mean value $\mu$ is
$$\sigma_X^2=V(X)=\int_{-\infty}^{\infty}(x-\mu)^2 f(x)\,dx=E[(X-\mu)^2]$$
- As in the discrete case, we can calculate the variance following the formula
$$V(X)=E(X^2)-[E(X)]^2$$
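The shortcut formula follows from expanding the square inside the expectation and using linearity. As a worked illustration (the uniform distribution on [A, B] is chosen here only as an example), the same steps give its mean and variance:

$$\begin{aligned}
V(X) &= E[(X-\mu)^2] = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - [E(X)]^2 \\
E(X) &= \int_A^B \frac{x}{B-A}\,dx = \frac{A+B}{2} \\
E(X^2) &= \int_A^B \frac{x^2}{B-A}\,dx = \frac{A^2+AB+B^2}{3} \\
V(X) &= \frac{A^2+AB+B^2}{3} - \frac{(A+B)^2}{4} = \frac{(B-A)^2}{12}
\end{aligned}$$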
The Normal Distribution(正态分布) ★☆
Many numerical populations have distributions that can be fit very closely by an appropriate normal curve.
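A continuous rv X is said to have a normal distribution(正态分布) with parameters $\mu$ and $\sigma$ (or $\mu$ and $\sigma^2$), where $-\infty<\mu<\infty$ and $\sigma>0$, if the pdf of X is
$$f(x;\mu,\sigma)=\frac{1}{\sqrt{2\pi}\,\sigma}e^{-(x-\mu)^2/(2\sigma^2)},\quad -\infty<x<\infty$$
- The statement that X is normally distributed with parameters $\mu$ and $\sigma^2$ is often abbreviated $X \sim N(\mu,\sigma^2)$.
- The cdf of the normal distribution is
$$F(x)=\int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}\,\sigma}e^{-(t-\mu)^2/(2\sigma^2)}\,dt$$
- To compute the probability that $X\in(a,b)$:
$$P(a \le X \le b)=\int_{a}^{b}\frac{1}{\sqrt{2\pi}\,\sigma}e^{-(t-\mu)^2/(2\sigma^2)}\,dt$$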
The standard normal distribution(标准正态分布)
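- A random variable that has a normal distribution with a mean of zero and standard deviation of one is said to have a standard normal probability distribution ($\mu=0$, $\sigma=1$). Such a variable is usually denoted by Z.
- The density function of the standard normal distribution is
$$f(z;0,1)=\frac{1}{\sqrt{2\pi}}e^{-z^2/2},\quad -\infty<z<\infty$$
- The corresponding distribution function is
$$\Phi(z)=\int_{-\infty}^{z} f(t;0,1)\,dt=\int_{-\infty}^{z}\frac{1}{\sqrt{2\pi}}e^{-t^2/2}\,dt$$
- The properties of the standard normal distribution:
  - $\Phi(-z)=1-\Phi(z)$
  - The density function $\varphi(z)=f(z;0,1)$ achieves its maximum $\frac{1}{\sqrt{2\pi}}$ at $z=0$
  - $\Phi(0)=0.5$
  - Its mean is $\mu=0$ and its variance is $\sigma^2=1$
  - For $z \ge 0$, $P(|Z| \le z)=2\Phi(z)-1$
  - For $z \ge 0$, $P(|Z| \ge z)=2[1-\Phi(z)]$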
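These properties are easy to check numerically. The sketch below is an illustration rather than part of the notes: it computes Φ with the standard identity Φ(z) = (1 + erf(z/√2))/2 using Python's `math.erf`. The function names `phi_cdf` and `phi_pdf` and the value z = 1.96 are arbitrary choices.

```python
import math


def phi_cdf(z):
    """Standard normal cdf via Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def phi_pdf(z):
    """Standard normal density."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


z = 1.96  # assumed value, chosen only for illustration
symmetry_sum = phi_cdf(-z) + phi_cdf(z)   # Phi(-z) = 1 - Phi(z), so this is 1
inside = 2 * phi_cdf(z) - 1               # P(|Z| <= z)
outside = 2 * (1 - phi_cdf(z))            # P(|Z| >= z)
peak = phi_pdf(0.0)                       # maximum density, 1 / sqrt(2 * pi)
print(symmetry_sum, inside, outside, peak)
```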
Percentiles of the Standard Normal Distribution
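The 99th percentile of the standard normal distribution is the value on the measurement axis satisfying $P(Z \le \text{99th percentile})=0.99$.
$z_\alpha$ will denote the value on the measurement axis for which $\alpha$ of the area under the z curve lies to the right of $z_\alpha$, that is, $P(Z \ge z_\alpha)=\alpha$. Equivalently, $z_\alpha$ is the $100(1-\alpha)$th percentile of the standard normal distribution.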
Nonstandard normal distributions
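- If X has the normal distribution with mean $\mu$ and standard deviation $\sigma$, then $Z=\dfrac{X-\mu}{\sigma}$ has a standard normal distribution.
- So the probabilities are:
$$P(a \le X \le b)=P\left(\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right)=\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)$$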
Percentiles of an arbitrary normal distribution
- (100p)th percentile for normal$(\mu,\sigma)$ = $\mu$ + [(100p)th percentile for standard normal] · $\sigma$
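A minimal sketch of standardizing, using `statistics.NormalDist` from the Python standard library. The parameters μ = 100, σ = 15 and the interval [85, 130] are assumptions made up for the example. It computes a probability and a percentile once through the standard normal and once directly, to show that the two routes agree.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # assumed parameters, for illustration only
std = NormalDist()        # standard normal: mean 0, standard deviation 1
dist = NormalDist(mu, sigma)

# P(a <= X <= b) by standardizing, then directly from the N(mu, sigma^2) cdf.
a, b = 85.0, 130.0
prob_standardized = std.cdf((b - mu) / sigma) - std.cdf((a - mu) / sigma)
prob_direct = dist.cdf(b) - dist.cdf(a)

# 90th percentile: mu + (standard normal 90th percentile) * sigma.
p = 0.90
pct_standardized = mu + std.inv_cdf(p) * sigma
pct_direct = dist.inv_cdf(p)
print(prob_standardized, prob_direct, pct_standardized, pct_direct)
```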
The Normal Distribution and Discrete Populations
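- The correction for discreteness of the underlying distribution in the previous example is often called a continuity correction(连续校正). When an integer-valued X is approximated by a normal curve, each integer k is represented by the interval from k − 0.5 to k + 0.5, so:
  - P(X ≥ 125) -> P(X ≥ 124.5)
  - P(X ≤ 125) -> P(X ≤ 125.5)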
The Normal Approximation to the Binomial Distribution
- Let X be a binomial rv based on n trials with success probability p. Then if the binomial probability histogram is not too skewed, X has approximately a normal distribution with $\mu=np$ and $\sigma=\sqrt{npq}$, where $q=1-p$. In particular, for x = a possible value of X,
$$P(X \le x)=B(x;n,p)\approx\Phi\left(\frac{x+0.5-np}{\sqrt{npq}}\right)$$
  The 0.5 is the continuity correction.
- In practice, the approximation is adequate provided that both np ≥ 10 and nq ≥ 10.
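To see how much the continuity correction matters, the sketch below compares the exact binomial cdf with the normal approximation, with and without the 0.5 term. The values n = 50, p = 0.4, x = 22 are assumptions picked so that np = 20 and nq = 30 satisfy the rule of thumb. The function names are illustrative.

```python
import math
from statistics import NormalDist


def binom_cdf(x, n, p):
    """Exact binomial cdf B(x; n, p) = P(X <= x)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def normal_approx_cdf(x, n, p, correction=0.5):
    """Normal approximation to P(X <= x); correction=0.5 is the continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return NormalDist().cdf((x + correction - mu) / sigma)


n, p, x = 50, 0.4, 22  # assumed values; np = 20 and nq = 30 both exceed 10
exact = binom_cdf(x, n, p)
with_correction = normal_approx_cdf(x, n, p)
without_correction = normal_approx_cdf(x, n, p, correction=0.0)
print(exact, with_correction, without_correction)
```

For these values the corrected approximation is noticeably closer to the exact probability than the uncorrected one.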
The Gamma Distribution and Its Relatives
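For $\alpha>0$, the gamma function $\Gamma(\alpha)$ is defined by
$$\Gamma(\alpha)=\int_0^{\infty}x^{\alpha-1}e^{-x}\,dx$$
- The most important properties of the gamma function are the following:
  - For any $\alpha>1$, $\Gamma(\alpha)=(\alpha-1)\,\Gamma(\alpha-1)$
  - For any positive integer n, $\Gamma(n)=(n-1)!$
  - $\Gamma\left(\frac{1}{2}\right)=\sqrt{\pi}$
- If we let
$$f(x;\alpha)=\begin{cases}\dfrac{x^{\alpha-1}e^{-x}}{\Gamma(\alpha)} & x \ge 0\\ 0 & \text{otherwise}\end{cases}$$
  then the function satisfies the two properties of a pdf.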
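The three properties of the gamma function can be spot-checked with `math.gamma` from the Python standard library. This is only a numerical sanity check under an arbitrarily chosen α = 4.5, not a proof.

```python
import math

alpha = 4.5  # assumed value, any alpha > 1 would do

# Property 1: Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1)
recursion_holds = math.isclose(
    math.gamma(alpha), (alpha - 1) * math.gamma(alpha - 1)
)

# Property 2: Gamma(n) = (n - 1)! for positive integers n
factorial_holds = all(
    math.isclose(math.gamma(n), math.factorial(n - 1)) for n in range(1, 11)
)

# Property 3: Gamma(1/2) = sqrt(pi)
half_holds = math.isclose(math.gamma(0.5), math.sqrt(math.pi))

print(recursion_holds, factorial_holds, half_holds)
```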
The Family of Gamma Distributions
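- A continuous random variable X is said to have a gamma distribution if the pdf of X is
$$f(x;\alpha,\beta)=\begin{cases}\dfrac{1}{\beta^{\alpha}\Gamma(\alpha)}x^{\alpha-1}e^{-x/\beta} & x \ge 0\\ 0 & \text{otherwise}\end{cases}$$
  where the parameters $\alpha$ and $\beta$ satisfy $\alpha>0$, $\beta>0$.
- The standard gamma distribution has $\beta=1$.
- The mean and variance of a gamma rv are
$$E(X)=\mu=\alpha\beta,\qquad V(X)=\sigma^2=\alpha\beta^2$$
- When X is a standard gamma rv, the cdf of X,
$$F(x;\alpha)=\int_0^{x}\frac{y^{\alpha-1}e^{-y}}{\Gamma(\alpha)}\,dy,\quad x>0$$
  is called the incomplete gamma function(不完全伽玛函数).
- If X has a gamma distribution with parameters $\alpha$ and $\beta$, then for any x > 0,
$$P(X \le x)=F(x;\alpha,\beta)=F\left(\frac{x}{\beta};\alpha\right)$$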
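The mean, the variance, and the scaling relation between a general and a standard gamma cdf can be checked by numerical integration. The sketch below assumes α = 2, β = 3 and uses a plain midpoint rule; the upper limit 200 stands in for infinity, which is harmless here because the density is negligible beyond it. All names are illustrative.

```python
import math


def gamma_pdf(x, alpha, beta):
    """pdf of the gamma distribution with shape alpha and scale beta."""
    if x < 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta**alpha * math.gamma(alpha))


def integrate(f, lo, hi, steps=200_000):
    """Midpoint Riemann sum approximating the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


alpha, beta = 2.0, 3.0  # assumed parameters, for illustration only
upper = 200.0           # stand-in for infinity

mean = integrate(lambda x: x * gamma_pdf(x, alpha, beta), 0.0, upper)
second_moment = integrate(lambda x: x * x * gamma_pdf(x, alpha, beta), 0.0, upper)
variance = second_moment - mean**2   # V(X) = E(X^2) - [E(X)]^2

# P(X <= x) computed from the gamma(alpha, beta) pdf and from the standard gamma pdf.
x = 6.0
cdf_general = integrate(lambda t: gamma_pdf(t, alpha, beta), 0.0, x)
cdf_standard = integrate(lambda y: gamma_pdf(y, alpha, 1.0), 0.0, x / beta)
print(mean, variance, cdf_general, cdf_standard)
```

The results should be close to αβ = 6 and αβ² = 18, and the two cdf values should agree.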
The Exponential Distribution(指数分布)
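- X is said to have an exponential distribution if the pdf of X is
$$f(x;\lambda)=\begin{cases}\lambda e^{-\lambda x} & x \ge 0\\ 0 & \text{otherwise}\end{cases}\qquad\text{where }\lambda>0$$
- In fact, the exponential distribution is a special case of the gamma distribution, with $\alpha=1$ and $\beta=1/\lambda$, so
$$\mu=\alpha\beta=\frac{1}{\lambda},\qquad \sigma^2=\alpha\beta^2=\frac{1}{\lambda^2}$$
- The cdf of X is
$$F(x;\lambda)=\begin{cases}1-e^{-\lambda x} & x \ge 0\\ 0 & \text{otherwise}\end{cases}$$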
Application of the Exponential Distribution
- Suppose that the number of events occurring in any time interval of length t has a Poisson distribution with parameter $\alpha t$ and that numbers of occurrences in nonoverlapping intervals are independent of one another. Then the distribution of elapsed(消逝) time between the occurrence of two successive events is exponential with parameter $\lambda=\alpha$.
- Another important application of the exponential distribution is to model the distribution of component lifetime. A partial reason for the popularity of such applications is the “memoryless” property of the exponential distribution:
$$P(X \ge t+t_0 \mid X \ge t_0)=\frac{P[(X \ge t+t_0)\cap(X \ge t_0)]}{P(X \ge t_0)}=\frac{P(X \ge t+t_0)}{P(X \ge t_0)}=\frac{1-F(t+t_0;\lambda)}{1-F(t_0;\lambda)}=e^{-\lambda t}=P(X \ge t)$$
- Thus, the distribution of additional lifetime is exactly the same as the original distribution of lifetime, so at each point in time the component shows no effect of wear. In other words, the distribution of remaining lifetime is independent of current age.
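The memoryless property can be verified directly from the survival function 1 − F(x; λ) = e^(−λx). In the sketch below, λ = 0.5, t = 2 and t0 = 3 are assumed values; any positive choices give the same agreement.

```python
import math


def exp_survival(x, lam):
    """P(X >= x) = 1 - F(x; lam) for an exponential rv with rate lam."""
    return math.exp(-lam * x) if x >= 0 else 1.0


lam, t, t0 = 0.5, 2.0, 3.0  # assumed values, for illustration only

# P(X >= t + t0 | X >= t0) = P(X >= t + t0) / P(X >= t0)
conditional = exp_survival(t + t0, lam) / exp_survival(t0, lam)
# P(X >= t): survival probability of a brand-new component
unconditional = exp_survival(t, lam)
print(conditional, unconditional)
```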
The Chi-Squared Distribution(卡方分布)
- Let $\nu$ be a positive integer. Then a random variable X is said to have a chi-squared distribution with parameter $\nu$ if the pdf of X is the gamma density with $\alpha=\nu/2$ and $\beta=2$. The pdf of a chi-squared rv is thus
$$f(x;\nu)=\begin{cases}\dfrac{1}{2^{\nu/2}\Gamma(\nu/2)}x^{\nu/2-1}e^{-x/2} & x \ge 0\\ 0 & x<0\end{cases}$$
- The parameter $\nu$ is called the number of degrees of freedom - df (自由度数) of X. The symbol $\chi^2$ is often used in place of “chi-squared”.
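Since the chi-squared pdf is a gamma pdf with α = ν/2 and β = 2, its mean and variance follow from the gamma formulas: αβ = ν and αβ² = 2ν. The sketch below assumes ν = 2, for which the density reduces to the exponential density with λ = 1/2. The function names are illustrative.

```python
import math


def gamma_pdf(x, alpha, beta):
    """pdf of the gamma distribution with shape alpha and scale beta."""
    if x < 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta**alpha * math.gamma(alpha))


def chi2_pdf(x, nu):
    """Chi-squared pdf written as a gamma pdf with alpha = nu / 2, beta = 2."""
    return gamma_pdf(x, nu / 2, 2.0)


nu = 2    # assumed degrees of freedom
x = 1.5   # arbitrary evaluation point
pdf_value = chi2_pdf(x, nu)
exponential_value = 0.5 * math.exp(-0.5 * x)  # exponential pdf with lambda = 1/2

mean = (nu / 2) * 2.0          # alpha * beta = nu
variance = (nu / 2) * 2.0**2   # alpha * beta^2 = 2 * nu
print(pdf_value, exponential_value, mean, variance)
```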
Other Continuous Distributions
The Weibull Distribution(威布尔分布)
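- A random variable X is said to have a Weibull distribution with parameters α and β (α > 0, β > 0) if the pdf of X is
$$f(x;\alpha,\beta)=\begin{cases}\dfrac{\alpha}{\beta^{\alpha}}x^{\alpha-1}e^{-(x/\beta)^{\alpha}} & x \ge 0\\ 0 & x<0\end{cases}$$
- The mean and variance are
$$\mu=\beta\,\Gamma\left(1+\frac{1}{\alpha}\right),\qquad \sigma^2=\beta^2\left\{\Gamma\left(1+\frac{2}{\alpha}\right)-\left[\Gamma\left(1+\frac{1}{\alpha}\right)\right]^2\right\}$$
- cdf:
$$F(x;\alpha,\beta)=\begin{cases}1-e^{-(x/\beta)^{\alpha}} & x \ge 0\\ 0 & x<0\end{cases}$$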
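A short sketch of the Weibull cdf, mean and variance, with assumed parameters α = 2, β = 3. Setting α = 1 recovers the exponential distribution with λ = 1/β, which gives a convenient check of the formulas. The function names are illustrative.

```python
import math


def weibull_cdf(x, alpha, beta):
    """Weibull cdf F(x; alpha, beta) = 1 - exp(-(x / beta)^alpha) for x >= 0."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-((x / beta) ** alpha))


def weibull_mean(alpha, beta):
    return beta * math.gamma(1 + 1 / alpha)


def weibull_variance(alpha, beta):
    g1 = math.gamma(1 + 1 / alpha)
    g2 = math.gamma(1 + 2 / alpha)
    return beta**2 * (g2 - g1**2)


alpha, beta = 2.0, 3.0  # assumed parameters, for illustration only
median = beta * math.log(2) ** (1 / alpha)   # solves F(median) = 0.5
print(weibull_cdf(median, alpha, beta), weibull_mean(alpha, beta))

# With alpha = 1 the Weibull reduces to an exponential with lambda = 1 / beta.
print(weibull_mean(1.0, beta), weibull_variance(1.0, beta))
```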
The Lognormal Distribution(对数正态分布)
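- A nonnegative rv X is said to have a lognormal distribution if the rv Y = ln(X) has a normal distribution. The resulting pdf of a lognormal rv when ln(X) is normally distributed with parameters $\mu$ and $\sigma$ is
$$f(x;\mu,\sigma)=\begin{cases}\dfrac{1}{\sqrt{2\pi}\,\sigma x}e^{-[\ln(x)-\mu]^2/(2\sigma^2)} & x>0\\ 0 & \text{otherwise}\end{cases}$$
- Note that $\mu$ and $\sigma$ are the mean and standard deviation of ln(X), not of X. The mean and variance of X are
$$E(X)=e^{\mu+\sigma^2/2},\qquad V(X)=e^{2\mu+\sigma^2}\cdot\left(e^{\sigma^2}-1\right)$$
- cdf:
$$F(x;\mu,\sigma)=P(X \le x)=P[\ln(X) \le \ln(x)]=P\left(Z \le \frac{\ln(x)-\mu}{\sigma}\right)=\Phi\left(\frac{\ln(x)-\mu}{\sigma}\right)$$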
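Because the lognormal cdf is just Φ evaluated at a standardized logarithm, it can be computed with `statistics.NormalDist`. The sketch below assumes μ = 1 and σ = 0.5 for ln(X); the function names are illustrative.

```python
import math
from statistics import NormalDist


def lognormal_cdf(x, mu, sigma):
    """P(X <= x) = Phi((ln(x) - mu) / sigma) for x > 0, and 0 otherwise."""
    if x <= 0:
        return 0.0
    return NormalDist().cdf((math.log(x) - mu) / sigma)


def lognormal_mean(mu, sigma):
    return math.exp(mu + sigma**2 / 2)


def lognormal_variance(mu, sigma):
    return math.exp(2 * mu + sigma**2) * (math.exp(sigma**2) - 1)


mu, sigma = 1.0, 0.5    # assumed parameters of ln(X), for illustration only
median = math.exp(mu)   # ln(median) = mu, so F(median) = Phi(0) = 0.5
at_median = lognormal_cdf(median, mu, sigma)
print(at_median, lognormal_mean(mu, sigma), lognormal_variance(mu, sigma))
```

The mean e^(μ + σ²/2) exceeds the median e^μ, which reflects the right skew of the lognormal density.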
The Beta Distribution(贝塔分布)
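- A random variable X is said to have a beta distribution with parameters $\alpha$, $\beta$ (both positive), A, and B if the pdf of X is
$$f(x;\alpha,\beta,A,B)=\begin{cases}\dfrac{1}{B-A}\cdot\dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\left(\dfrac{x-A}{B-A}\right)^{\alpha-1}\left(\dfrac{B-x}{B-A}\right)^{\beta-1} & A \le x \le B\\ 0 & \text{otherwise}\end{cases}$$
- The case A=0, B=1 gives the standard beta distribution.
- The mean and variance are
$$\mu=A+(B-A)\cdot\frac{\alpha}{\alpha+\beta},\qquad \sigma^2=\frac{(B-A)^2\,\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$$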
Probability Plot(概率图)
Sample Percentiles
- Order the n sample observations from the smallest to the largest. Then the i-th smallest observation in the list is taken to be the $[100(i-0.5)/n]$th sample percentile.
Probability Plot
Normal Probability Plot(正态概率图)
- A plot of the n pairs ([100(i-.5)/n]th z percentile, ith smallest observation) on a two-dimensional coordinate system is called a normal probability plot.
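The pairs for a normal probability plot can be computed as sketched below. The sample values are hypothetical and the function name `normal_plot_pairs` is illustrative. The z percentiles come from `statistics.NormalDist().inv_cdf`. If the sample comes from a normal distribution, the resulting points should fall close to a straight line.

```python
from statistics import NormalDist


def normal_plot_pairs(sample):
    """Return (z percentile, i-th smallest observation) pairs for i = 1..n."""
    n = len(sample)
    ordered = sorted(sample)
    std = NormalDist()
    # The i-th smallest observation is the [100(i - 0.5)/n]th sample percentile,
    # so it is paired with the standard normal quantile at (i - 0.5) / n.
    return [(std.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(ordered, start=1)]


sample = [4.1, 5.3, 3.8, 6.0, 4.9]  # hypothetical data, for illustration only
pairs = normal_plot_pairs(sample)
for z, x in pairs:
    print(f"{z:8.4f}  {x}")
```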
Copyright © 2014 by Xuan Dai. All rights reserved.