@daixuan1996
2015-01-01T19:48:59.000000Z
Probability and Statistics
https://www.zybuluo.com/daixuan1996/note/57380
- The concept of a random variable allows us to pass from the experimental outcomes themselves to a numerical function of the outcomes.
- There are two fundamentally different types of random variables--discrete random variables and continuous random variables.
Random Variables
random variable (rv):
- For a given sample space S of some experiment, a random variable is any rule that associates a number with each outcome in S.
- A random variable is a function whose domain(定义域) is the sample space of the experiment and whose range(值域) is a set of real numbers.
$$X: S \to \mathbb{R}$$
- The notation $X(s) = x$ means that $x$ is the value associated with the outcome $s$ by the rv $X$.
- Any random variable whose only possible values are 0 and 1 is called a Bernoulli(伯努利) random variable.
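The idea of a random variable as a function on the sample space can be sketched in a few lines of Python. The two-coin-toss experiment and the names `sample_space`, `num_heads`, and `first_is_head` are illustrative assumptions, not part of the notes.

```python
from itertools import product

# Sample space S: all outcomes of two coin tosses (an illustrative choice).
sample_space = ["".join(t) for t in product("HT", repeat=2)]

def num_heads(s: str) -> int:
    """Random variable X: S -> R, counting the heads in outcome s."""
    return s.count("H")

def first_is_head(s: str) -> int:
    """Bernoulli random variable: 1 if the first toss is a head, else 0."""
    return 1 if s[0] == "H" else 0

# Print each outcome s together with the values X(s) of both variables.
for s in sample_space:
    print(s, num_heads(s), first_is_head(s))
```

Each outcome is mapped to exactly one number, which is all the definition requires.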
Two types of random variables
- A discrete random variable is an rv whose possible values either constitute a finite set or else can be listed in an infinite sequence in which there is a first element, a second element, and so on.
- A random variable is continuous if its set of possible values consists of an entire interval on the number line.
Probability Distributions for Discrete Random Variables
- The rule for describing the probability measure associated with all values of a random variable is called a probability distribution.
The probability distribution(概率分布) or probability mass function - pmf (概率质量函数) of a discrete rv is defined for every number x by
$$p(x) = P(X = x) = P(\{s \in S : X(s) = x\})$$
- Any pmf must satisfy the following conditions:
$$p(x) \ge 0 \quad \text{and} \quad \sum_{\text{all possible } x} p(x) = 1$$
- A probability distribution can also be represented graphically as a histogram. The histogram represents the long-run relative frequencies we would expect to see if we repeated the experiment indefinitely.
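As a rough illustration of the two pmf conditions, the sketch below builds a pmf by counting equally likely outcomes and then checks both conditions. The two-dice experiment and the helper `is_valid_pmf` are assumptions made for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed experiment: roll two fair dice, X = sum of the two faces.
outcomes = list(product(range(1, 7), repeat=2))
counts = Counter(a + b for a, b in outcomes)

# pmf as a dict: p(x) = P(X = x), stored as exact fractions.
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

def is_valid_pmf(p: dict) -> bool:
    """Check that p(x) >= 0 for every x and that the values sum to 1."""
    return all(v >= 0 for v in p.values()) and sum(p.values()) == 1

print(pmf[7])             # probability that the sum equals 7
print(is_valid_pmf(pmf))
```

Using `Fraction` keeps the sum exact, so the check against 1 does not depend on floating-point rounding.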
A parameter of a probability distribution
- Suppose p(x) depends on a quantity that can be assigned any one of a number of possible values, with each different value determining a different probability distribution. Such a quantity is called a parameter(参数) of the distribution.
- The collection of all probability distributions for different values of the parameter is called a family of probability distributions(概率分布族).
e.g.: The family of Bernoulli distributions:
$$p(x;\alpha) = \begin{cases} 1-\alpha & \text{if } x = 0 \\ \alpha & \text{if } x = 1 \\ 0 & \text{otherwise} \end{cases}$$
The cumulative distribution function - cdf (累积分布函数) F(x) of a discrete rv X with pmf p(x) is defined for every number x by
$$F(x) = P(X \le x) = \sum_{y:\, y \le x} p(y)$$
- The cdf satisfies the following properties:
- The cdf is non-decreasing.
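- The cdf satisfies:
$$\lim_{x \to -\infty} F(x) = 0 \qquad \lim_{x \to +\infty} F(x) = 1$$
- It is right-continuous in $x$. For a discrete rv it is a step function that jumps by $p(x)$ at each possible value $x$, so it is not continuous there.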
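For any two numbers $a$ and $b$ with $a \le b$,
$$P(a \le X \le b) = F(b) - F(a-)$$
where “$a-$” represents the largest possible $X$ value that is strictly less than $a$.
When $a$ and $b$ are both integers and $X$ takes only integer values,
$$P(a \le X \le b) = F(b) - F(a-1) \qquad P(X = a) = F(a) - F(a-1)$$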
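The relation between the cdf and interval probabilities can be checked numerically. The pmf below is a hypothetical integer-valued example, and `cdf` and `prob_between` are names chosen for this sketch.

```python
from fractions import Fraction

# Hypothetical pmf on the integers 0..3 (values chosen for illustration only).
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def cdf(x, p=pmf) -> Fraction:
    """F(x) = P(X <= x): sum p(y) over all possible values y <= x."""
    return sum((v for y, v in p.items() if y <= x), Fraction(0))

def prob_between(a: int, b: int) -> Fraction:
    """P(a <= X <= b) = F(b) - F(a - 1) for an integer-valued X."""
    return cdf(b) - cdf(a - 1)

print(cdf(1.5))            # step function: same value as F(1)
print(prob_between(1, 2))  # equals p(1) + p(2)
print(prob_between(2, 2))  # P(X = 2) = F(2) - F(1)
```

Evaluating `cdf` between two possible values returns the value at the lower one, which is the step behaviour of a discrete cdf.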
Expected Values(期望) of Discrete Random Variables
Let X be a discrete rv with set of possible values D and pmf p(x). The expected value(期望值) or mean value(均值) of X, denoted by $E(X)$ or $\mu_X$, is
$$E(X) = \mu_X = \sum_{x \in D} x \cdot p(x)$$
- When the sum does not exist, we say the expectation of X does not exist.
- The population mean(总体均值) is the mean value of the population.
- The probability distribution of X has "a heavy tail(重尾)" if its E(X) is not finite.
The Expected Value of a Function
- Let X be a discrete rv with set of possible values D and pmf p(x). Then the expected value or mean value of any function $h(X)$, denoted by $E[h(X)]$ or $\mu_{h(X)}$, is computed by
$$E[h(X)] = \sum_{x \in D} h(x) \cdot p(x)$$
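The sum for $E[h(X)]$ translates directly into code. This is a minimal sketch assuming a fair six-sided die as the pmf; the function name `expectation` is chosen for the example.

```python
from fractions import Fraction

# Assumed pmf: a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(p: dict, h=lambda x: x) -> Fraction:
    """E[h(X)] = sum of h(x) * p(x) over the possible values x."""
    return sum((h(x) * px for x, px in p.items()), Fraction(0))

mean = expectation(pmf)                           # E(X), with h(x) = x
mean_square = expectation(pmf, lambda x: x ** 2)  # E(X^2), with h(x) = x^2
print(mean, mean_square)
```

Note that `mean_square` is not the square of `mean`: in general $E[h(X)]$ differs from $h(E(X))$.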
Rules of Expected Value
- $E(Y_1 + Y_2) = E(Y_1) + E(Y_2)$
- $E(bY) = bE(Y)$
- $E(C) = C$ (for a constant $C$)
- $E(Y_1 Y_2) = E(Y_1)E(Y_2)$ when $Y_1$ and $Y_2$ are independent
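These rules can be verified by brute-force enumeration on a small example. The sketch assumes two independent fair dice, so every pair of faces has probability 1/36; the helper `expect` is hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
# Joint pmf of two independent fair dice: each pair has probability 1/36.
joint = {(a, b): Fraction(1, 36) for a, b in product(faces, repeat=2)}

def expect(h) -> Fraction:
    """E[h(Y1, Y2)] computed by summing h over the joint pmf."""
    return sum((h(a, b) * p for (a, b), p in joint.items()), Fraction(0))

e1 = expect(lambda a, b: a)   # E(Y1)
e2 = expect(lambda a, b: b)   # E(Y2)
print(expect(lambda a, b: a + b) == e1 + e2)  # additivity
print(expect(lambda a, b: 3 * a) == 3 * e1)   # E(bY) = bE(Y) with b = 3
print(expect(lambda a, b: a * b) == e1 * e2)  # product rule, needs independence
```

The first two rules hold for any joint distribution; only the product rule relies on the independence built into `joint`.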
The Variance of X
- Let X have pmf p(x) and expected value $\mu$. Then the variance(方差) of X, denoted by $V(X)$ or $\sigma_X^2$, or just $\sigma^2$, is
$$V(X) = \sum_{x \in D} (x-\mu)^2 \cdot p(x) = E[(X-\mu)^2]$$
- The standard deviation(标准差) of X is $\sigma_X = \sqrt{\sigma_X^2}$.
- population variance/standard deviation(总体方差/标准差)
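A short sketch of the variance definition, again assuming a fair six-sided die. The comparison with $E(X^2) - \mu^2$ uses a standard identity that is not stated in these notes and is included only as a cross-check.

```python
from fractions import Fraction
from math import sqrt

# Assumed pmf: a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(p: dict) -> Fraction:
    """mu = sum of x * p(x)."""
    return sum((x * px for x, px in p.items()), Fraction(0))

def variance(p: dict) -> Fraction:
    """V(X) = sum of (x - mu)^2 * p(x)."""
    mu = mean(p)
    return sum(((x - mu) ** 2 * px for x, px in p.items()), Fraction(0))

var = variance(pmf)
sd = sqrt(var)  # standard deviation: square root of the variance
# Cross-check with E(X^2) - mu^2 (identity assumed, not from the notes).
shortcut = mean({x ** 2: px for x, px in pmf.items()}) - mean(pmf) ** 2
print(var, sd, shortcut)
```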
The Binomial Probability Distribution(二项分布)
Properties of a Binomial experiment
- The experiment consists of a sequence of n smaller trials, where n is fixed in advance of the experiment.
- Two outcomes are possible on each trial.
- The probability of a success or a failure, denoted by p and 1-p, does not change from trial to trial.
- The trials are independent.
Suppose each trial of an experiment can result in S or F, but the sampling is without replacement from a population of size N. If the sample size (number of trials) n is at most 5% of the population size, the experiment can be analyzed as though it were exactly a binomial experiment.
The Binomial Random Variable and Distribution
- Possible values for X in an n-trial experiment are $x = 0, 1, 2, \ldots, n$. We will often write X~Bin(n,p) to indicate that X is a binomial rv based on n trials with success probability p.
- We denote the pmf by $b(x;n,p)$:
$$b(x;n,p) = \begin{cases} C_n^x \, p^x (1-p)^{n-x} & x = 0, 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases}$$
- For X~Bin(n,p), the cdf will be denoted by
$$P(X \le x) = B(x;n,p) = \sum_{y=0}^{x} b(y;n,p) \qquad x = 0, 1, 2, \ldots, n$$
The Mean and Variance of X
$$E(X) = np \qquad V(X) = np(1-p) = npq \qquad \sigma_X = \sqrt{npq}$$
where $q = 1 - p$.
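The binomial pmf, cdf, and moment formulas can be sketched with the standard library alone. Here `math.comb(n, x)` plays the role of $C_n^x$; the parameter values `n = 10` and `p = 0.3` are arbitrary choices for illustration.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p): probability of exactly x successes in n trials."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x: int, n: int, p: float) -> float:
    """B(x; n, p) = sum of b(y; n, p) for y = 0..x."""
    return sum(binom_pmf(y, n, p) for y in range(x + 1))

n, p = 10, 0.3  # illustrative parameter values
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
var = sum((x - mean) ** 2 * binom_pmf(x, n, p) for x in range(n + 1))
print(binom_pmf(3, n, p), binom_cdf(3, n, p))
print(mean, n * p)           # summed mean next to the formula np
print(var, n * p * (1 - p))  # summed variance next to the formula np(1-p)
```

The summed mean and variance should agree with $np$ and $np(1-p)$ up to floating-point rounding.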
Hypergeometric and Negative Binomial Distributions(超几何分布与负二项分布)
Properties of Hypergeometric Distribution(超几何分布)
- The population or set to be sampled consists of N individuals, objects, or elements.(a finite population)
- Each individual can be characterized as a success (S) or a failure (F), and there are M successes in the population.
- A sample of n individuals is selected without replacement in such a way that each subset of size n is equally likely to be chosen.
The Hypergeometric Random Variable and Distribution
- The random variable of interest is X = the number of S’s in the sample.
- If X is the number of S’s in a completely random sample of size n drawn from a population consisting of M S’s and (N-M) F’s, then the probability distribution of X, called the hypergeometric distribution(超几何分布), is given by
$$P(X = x) = h(x;n,M,N) = \frac{C_M^x \, C_{N-M}^{n-x}}{C_N^n}$$
The Mean and Variance of X
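$$E(X) = n \cdot \frac{M}{N} \qquad V(X) = \frac{N-n}{N-1} \cdot n \cdot \frac{M}{N} \cdot \left(1 - \frac{M}{N}\right)$$
The ratio $\frac{N-n}{N-1}$ is often called the finite population correction factor(有限总体校正因子).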
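The hypergeometric pmf and its mean and variance formulas can be checked exactly with rational arithmetic. The population values `N = 20`, `M = 8`, `n = 5` are hypothetical, and the guard that returns zero for impossible `x` is an assumption of this sketch.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x: int, n: int, M: int, N: int) -> Fraction:
    """h(x; n, M, N): probability of x successes in a sample of size n."""
    # Impossible counts get probability 0 (guard assumed for this sketch).
    if x < 0 or x > n or x > M or n - x > N - M:
        return Fraction(0)
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

N, M, n = 20, 8, 5  # hypothetical population size, successes, sample size
xs = range(n + 1)
total = sum(hypergeom_pmf(x, n, M, N) for x in xs)
mean = sum(x * hypergeom_pmf(x, n, M, N) for x in xs)
var = sum((x - mean) ** 2 * hypergeom_pmf(x, n, M, N) for x in xs)

ratio = Fraction(M, N)
fpc = Fraction(N - n, N - 1)  # finite population correction factor
print(total, mean, var)
print(n * ratio, fpc * n * ratio * (1 - ratio))  # formula values for comparison
```

Without the factor `fpc`, the variance expression would be the binomial variance with $p = M/N$, so the correction shows how sampling without replacement reduces the spread.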
Properties of Negative Binomial Distribution(负二项分布)
- The experiment consists of a sequence of independent trials.
- Each trial can result in either a success(S) or a failure(F).
- The probability of success is constant from trial to trial, so $P(S \text{ on trial } i) = p$ for $i = 1, 2, 3, \ldots$
- The experiment continues (trials are performed) until a total of r successes have been observed, where r is a specified positive integer.
The Negative Binomial Random Variable and Distribution
- The random variable of interest is X = the number of failures that precede the rth success.
X is called a negative binomial random variable because, in contrast to the binomial rv, the number of successes is fixed and the number of trials is random.
- The pmf of the negative binomial rv X with parameters r = number of S’s and p = P(S) is
$$P(X = x) = nb(x;r,p) = C_{x+r-1}^{r-1} \, p^r (1-p)^x \qquad x = 0, 1, 2, \ldots$$
- In the special case r = 1, the pmf is
$$nb(x;1,p) = (1-p)^x p \qquad x = 0, 1, 2, \ldots$$
Then the distribution is called the geometric distribution(几何分布).
The Mean and Variance of X
$$E(X) = \frac{r(1-p)}{p} \qquad V(X) = \frac{r(1-p)}{p^2}$$
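Because the negative binomial rv has infinitely many possible values, a numerical check has to truncate the sums. The sketch below assumes `r = 3`, `p = 0.4`, and a cutoff of 500 terms, beyond which the tail is negligible for these values.

```python
from math import comb

def nb_pmf(x: int, r: int, p: float) -> float:
    """nb(x; r, p): probability of x failures before the r-th success."""
    if x < 0:
        return 0.0
    return comb(x + r - 1, r - 1) * p ** r * (1 - p) ** x

r, p = 3, 0.4    # illustrative parameter values
xs = range(500)  # truncation point (assumed large enough for these values)
total = sum(nb_pmf(x, r, p) for x in xs)
mean = sum(x * nb_pmf(x, r, p) for x in xs)
var = sum((x - mean) ** 2 * nb_pmf(x, r, p) for x in xs)
print(total)
print(mean, r * (1 - p) / p)       # summed mean next to r(1-p)/p
print(var, r * (1 - p) / p ** 2)   # summed variance next to r(1-p)/p^2

# Special case r = 1: the geometric pmf (1-p)^x * p.
print(nb_pmf(4, 1, p), (1 - p) ** 4 * p)
```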
The Poisson Probability Distribution(泊松分布)
Definition:
- A random variable X is said to have a Poisson distribution if the pmf of X is
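$$p(x;\lambda) = \frac{e^{-\lambda}\lambda^x}{x!} \qquad x = 0, 1, 2, 3, \ldots \quad \text{with } \lambda > 0$$
The value of $\lambda$ is frequently a rate per unit time or per unit area. The Maclaurin series
$$e^{\lambda} = 1 + \lambda + \frac{\lambda^2}{2!} + \cdots = \sum_{x=0}^{\infty} \frac{\lambda^x}{x!}$$
shows that
$$\sum_{x=0}^{\infty} p(x;\lambda) = 1$$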
Properties of a Poisson variable
- There may be an infinite number of trials
- Each trial results in either S or F
- Trials are independent
- The probability that an event occurs in a short interval is proportional(成比例的) to the length of the interval
- The probability of two or more events occurring in a very short interval is negligible(可以忽略的)
The Poisson Distribution as a Limit
- Suppose that in the binomial pmf $b(x;n,p)$, we let $n \to \infty$ and $p \to 0$ in such a way that $np$ approaches a value $\lambda > 0$. Then $b(x;n,p) \to p(x;\lambda)$.
- As a rule of thumb, this approximation can safely be applied if $n \ge 100$, $p \le 0.01$, and $np \le 20$.
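To get a feel for the quality of the approximation, one can compare the two pmfs directly. The values `n = 1000` and `p = 0.005` are assumptions chosen to satisfy the rule of thumb, and the comparison only covers $x = 0, \ldots, 30$.

```python
from math import comb, exp, factorial

def binom_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p), the exact binomial probability."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """p(x; lambda), the Poisson probability."""
    return exp(-lam) * lam ** x / factorial(x)

n, p = 1000, 0.005  # satisfies n >= 100, p <= 0.01, np = 5 <= 20
lam = n * p

# Largest pointwise gap between the two pmfs over x = 0..30.
max_diff = max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(31))
print(max_diff)

# Side-by-side values for a few x.
for x in (0, 5, 10):
    print(x, binom_pmf(x, n, p), poisson_pmf(x, lam))
```

Increasing `p` while holding `n` fixed should make the gap grow, which is one way to see why the rule of thumb restricts $p$.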
The Mean and Variance of X
$$E(X) = V(X) = \lambda$$
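The equality of mean and variance can be checked numerically with truncated sums. This sketch assumes $\lambda = 3.5$ and stops at 100 terms, which keeps `factorial(x)` within floating-point range and leaves a negligible tail for this $\lambda$.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """p(x; lambda) = exp(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

lam = 3.5        # illustrative parameter value
xs = range(100)  # truncation point (assumed sufficient for this lambda)
total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)
print(total, mean, var)  # total near 1, mean and var both near lambda
```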
The Poisson Process
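$$\lambda = \alpha t \Rightarrow P_k(t) = \frac{e^{-\alpha t} (\alpha t)^k}{k!}$$
- Here $\alpha$ is the rate of the process, the expected number of events per unit time, and $P_k(t)$ is the probability of exactly $k$ events in an interval of length $t$.
- The number of events occurring during a fixed time interval of length t has a Poisson distribution with parameter $\alpha t$. Any process that has this distribution is called a Poisson process.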
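A small worked example of $P_k(t)$, under hypothetical values: events arrive at rate $\alpha = 2$ per minute and the interval lasts $t = 1.5$ minutes, so $\alpha t = 3$. The function name `poisson_process_pmf` is chosen for this sketch.

```python
from math import exp, factorial

def poisson_process_pmf(k: int, alpha: float, t: float) -> float:
    """P_k(t): probability of exactly k events in an interval of length t."""
    lam = alpha * t  # Poisson parameter for the interval
    return exp(-lam) * lam ** k / factorial(k)

alpha, t = 2.0, 1.5  # hypothetical: 2 events per minute, 1.5 minutes
p_none = poisson_process_pmf(0, alpha, t)
p_at_most_2 = sum(poisson_process_pmf(k, alpha, t) for k in range(3))
p_more_than_2 = 1 - p_at_most_2
print(p_none, p_at_most_2, p_more_than_2)
```

Doubling `t` doubles the parameter $\alpha t$, so longer intervals shift probability toward larger counts.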
Copyright © 2014 by Xuan Dai. All rights reserved.