@daixuan1996 2015-01-01T19:48:59.000000Z

Probability and Statistics Notes CH3

Discrete Random Variables and Probability Distributions

离散随机变量与概率分布

Probability and Statistics
https://www.zybuluo.com/daixuan1996/note/57380

  • The concept of a random variable allows us to pass from the experimental outcomes themselves to a numerical function of the outcomes.
  • There are two fundamentally different types of random variables--discrete random variables and continuous random variables.

  1. Random Variables

    • random variable (rv):

      • For a given sample space S of some experiment, a random variable is any rule that associates a number with each outcome in S.
      • A random variable is a function whose domain(定义域) is the sample space of the experiment and whose range(值域) is a set of real numbers: $X: S \to \mathbb{R}$
      • The notation X(s)=x means that x is the value associated with the outcome s by the rv X.
      • Any random variable whose only possible values are 0 and 1 is called a Bernoulli(伯努利) random variable.
    • Two types of random variables

      1. A discrete random variable is an rv whose possible values either constitute a finite set or else can be listed in an infinite sequence in which there is a first element, a second element, and so on.
      2. A random variable is continuous if its set of possible values consists of an entire interval on the number line.
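As a rough illustration of the definitions above, a random variable can be written as an ordinary function on a sample space. The two-coin-toss experiment and the names `S`, `X`, and `is_head_first` are assumptions made for this sketch; they do not come from the notes.

```python
from itertools import product

# Sample space S for tossing a coin twice; each outcome is a tuple such as ('H', 'T').
S = list(product("HT", repeat=2))

def X(s):
    """Random variable X(s): the number of heads in outcome s."""
    return s.count("H")

def is_head_first(s):
    """A Bernoulli rv: 1 if the first toss is a head, otherwise 0."""
    return 1 if s[0] == "H" else 0

# X is discrete: its possible values form a finite set.
values_of_X = sorted({X(s) for s in S})
```

Here `X` associates a number with every outcome in `S`, and `is_head_first` takes only the values 0 and 1.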
  2. Probability Distributions for Discrete Random Variables

    • The rule for describing the probability measure associated with all values of a random variable is called a probability distribution.
    • The probability distribution(概率分布) or probability mass function - pmf (概率质量函数) of a discrete rv is defined for every number x by p(x)=P(X=x)

      • Any pmf must be required to satisfy the following conditions:
        $p(x) \ge 0$ and $\sum_{\text{all possible } x} p(x) = 1$
      • A probability distribution can also be represented graphically as a histogram. The histogram represents the relative frequency we would expect to see if we repeated an experiment infinitely many times.
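The two pmf conditions can be checked mechanically. The sketch below assumes an experiment with equally likely outcomes (two fair dice, a made-up example) and uses the hypothetical helper names `pmf_from_outcomes` and `is_valid_pmf`.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def pmf_from_outcomes(outcomes, rv):
    """Build p(x) = P(X = x) when every outcome in `outcomes` is equally likely."""
    counts = Counter(rv(s) for s in outcomes)
    total = len(outcomes)
    return {x: Fraction(c, total) for x, c in counts.items()}

def is_valid_pmf(p):
    """Check p(x) >= 0 for all x and that the probabilities sum to exactly 1."""
    return all(v >= 0 for v in p.values()) and sum(p.values()) == 1

# Assumed example: X = sum of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
p = pmf_from_outcomes(outcomes, lambda s: s[0] + s[1])
```

Exact fractions are used so that the sum-to-one check is not affected by floating-point rounding.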
    • A parameter of a probability distribution

      • Suppose p(x) depends on a quantity that can be assigned any one of a number of possible values, with each different value determining a different probability distribution. Such a quantity is called a parameter(参数) of the distribution.
      • The collection of all probability distributions for different values of the parameter is called a family of probability distributions(概率分布族).
        e.g. the family of Bernoulli distributions:
        $p(x;\alpha)=\begin{cases} 1-\alpha & \text{if } x=0 \\ \alpha & \text{if } x=1 \\ 0 & \text{otherwise} \end{cases}$
    • The cumulative distribution function - cdf (累积分布函数) F(x) of a discrete rv X with pmf p(x) is defined for every number x by

      $F(x) = P(X \le x)=\sum_{y:\,y \le x} p(y)$

      • The cdf satisfies the following properties:
        1. The cdf is non-decreasing.
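        2. The cdf satisfies: $\lim_{x\to-\infty}F(x)=0$ and $\lim_{x\to+\infty}F(x)=1$.
        3. It is right-continuous in x; for a discrete rv it is a step function that jumps by p(x) at each possible value x.
      • For any two numbers a and b with $a \le b$, $P(a \le X \le b)=F(b)-F(a^-)$, where “a-” represents the largest possible X value that is strictly less than a.
        When a and b are integers and X takes only integer values, $P(a \le X \le b)=F(b)-F(a-1)$.
      • In the same integer-valued case, $P(X=a)=F(a)-F(a-1)$.
        [Figure: example for cdf]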
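A small sketch of how the cdf and interval probabilities follow from a pmf. The pmf values below are made up for illustration, and `cdf` and `prob_between` are hypothetical helper names; `prob_between` assumes an integer-valued rv.

```python
from fractions import Fraction

def cdf(p, x):
    """F(x) = P(X <= x): sum p(y) over the possible values y with y <= x."""
    return sum((prob for y, prob in p.items() if y <= x), Fraction(0))

def prob_between(p, a, b):
    """P(a <= X <= b) = F(b) - F(a - 1), valid when X takes only integer values."""
    return cdf(p, b) - cdf(p, a - 1)

# Hypothetical pmf on the values 1, 2, 3, 4.
p = {1: Fraction(1, 10), 2: Fraction(2, 10), 3: Fraction(3, 10), 4: Fraction(4, 10)}
```

Because the cdf is a step function, `cdf(p, 2.5)` equals `cdf(p, 2)`: nothing changes between possible values.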
  3. Expected Values(期望) of Discrete Random Variables

    • Let X be a discrete rv with set of possible values D and pmf p(x). The expected value(期望值) or mean value(均值) of X, denoted by E(X) or $\mu_X$, is

      $E(X)=\mu_X=\sum_{x\in D} x\,p(x)$

      • When the sum does not exist, we say the expectation of X does not exist.
      • The population mean(总体均值) is the mean value of the population.
      • The probability distribution of X has "a heavy tail(重尾)" if its E(X) is not finite.
    • The Expected Value of a Function

      • Let X be a discrete rv with set of possible values D and pmf p(x). Then the expected value or mean value of any function h(X), denoted by E[h(X)] or $\mu_{h(X)}$, is computed by
        $E[h(X)]=\sum_{x\in D} h(x)\,p(x)$
    • Rules of Expected Value

      1. $E(Y_1+Y_2)=E(Y_1)+E(Y_2)$
      2. $E(bY)=bE(Y)$
      3. $E(C)=C$ (for constant C)
      4. $E(Y_1Y_2)=E(Y_1)E(Y_2)$ when $Y_1$ and $Y_2$ are independent
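The expectation formulas and rules above can be checked numerically. This is a minimal sketch: the pmfs `p` and `q` are invented for the example, and `expect` and `expect_product_independent` are hypothetical helper names.

```python
from fractions import Fraction

def expect(p, h=lambda x: x):
    """E[h(X)] = sum of h(x) * p(x) over the possible values x (the keys of p)."""
    return sum(h(x) * prob for x, prob in p.items())

def expect_product_independent(p, q):
    """E(Y1 * Y2) for independent Y1 ~ p and Y2 ~ q, using the joint pmf p(x) * q(y)."""
    return sum(x * y * px * qy for x, px in p.items() for y, qy in q.items())

# Hypothetical pmfs, chosen only for illustration.
p = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
q = {1: Fraction(1, 3), 4: Fraction(2, 3)}

mean_p = expect(p)                           # E(X)
mean_of_square = expect(p, lambda x: x * x)  # E(X^2), i.e. h(x) = x^2
```

Combining rules 1 to 3 gives E(aX + b) = aE(X) + b, which `expect(p, lambda x: 3 * x + 2)` reproduces for a = 3, b = 2.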
    • The Variance of X

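      • Let X have pmf p(x) and expected value μ. Then the variance(方差) of X, denoted by V(X) or $\sigma_X^2$, or just $\sigma^2$, is
        $V(X)=\sum_{x\in D}(x-\mu)^2\,p(x)=E[(X-\mu)^2]$
      • The standard deviation(标准差) of X is $\sigma_X=\sqrt{\sigma_X^2}$.
      • When X describes a population, these are called the population variance / standard deviation(总体方差/标准差).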
    • Properties of Variance
      1. $V(X)=E(X^2)-[E(X)]^2$
      2. $V(C)=0$, for all constant C
      3. $V(aX+b)=a^2V(X)$
      4. $V(X+Y)=V(X)+V(Y)$ if X and Y are independent
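A short sketch comparing the definition of variance with the shortcut formula and the rule for aX + b. The pmf is a made-up example, and the helper names `mean`, `variance`, `variance_shortcut`, and `affine` are assumptions of this sketch.

```python
from fractions import Fraction

def mean(p):
    """E(X) = sum of x * p(x)."""
    return sum(x * prob for x, prob in p.items())

def variance(p):
    """V(X) from the definition: sum of (x - mu)^2 * p(x)."""
    mu = mean(p)
    return sum((x - mu) ** 2 * prob for x, prob in p.items())

def variance_shortcut(p):
    """V(X) from the shortcut formula E(X^2) - [E(X)]^2."""
    return sum(x * x * prob for x, prob in p.items()) - mean(p) ** 2

def affine(p, a, b):
    """pmf of aX + b; assumes a != 0 so that distinct x map to distinct values."""
    return {a * x + b: prob for x, prob in p.items()}

# Hypothetical pmf, chosen only for illustration.
p = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
```

The shift b drops out of the variance, while the scale a enters squared, which is why `variance(affine(p, 3, 2))` equals `9 * variance(p)`.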
  4. The Binomial Probability Distribution(二项分布)

    • Properties of a Binomial experiment

      1. The experiment consists of a sequence of n smaller trials, where n is fixed in advance of the experiment.
      2. Two outcomes are possible on each trial.
      3. The probability of a success or a failure, denoted by p and 1-p, does not change from trial to trial.
      4. The trials are independent.
    • Suppose each trial of an experiment can result in S or F, but the sampling is without replacement from a population of size N. If the sample size (number of trials) n is at most 5% of the population size, the experiment can be analyzed as though it were exactly a binomial experiment.

    • The Binomial Random Variable and Distribution

      • Possible values for X in an n-trial experiment are x = 0,1,2,...,n. We will often write X~Bin(n,p) to indicate that X is a binomial rv based on n trials with success probability p.
      • We denote the pmf by b(x;n,p)
        $b(x;n,p)=\begin{cases} C_n^x\,p^x(1-p)^{n-x} & x=0,1,2,\dots,n \\ 0 & \text{otherwise} \end{cases}$
      • For X~Bin(n,p), the cdf is denoted by
        $P(X\le x)=B(x;n,p)=\sum_{y=0}^{x} b(y;n,p)$,  x=0,1,2,...,n
    • The Mean and Variance of X

      • E(X)=np
      • $V(X)=np(1-p)=npq$, where q = 1 - p
      • $\sigma_X=\sqrt{npq}$
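The binomial pmf, cdf, mean, and variance can be evaluated directly from the formulas above. This is a minimal sketch using only the standard library; the function names and the choice n = 10, p = 0.3 are assumptions for the example.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) p^x (1 - p)^(n - x) for x = 0..n, and 0 otherwise."""
    if not (isinstance(x, int) and 0 <= x <= n):
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """B(x; n, p) = sum of b(y; n, p) for y = 0..x (x is assumed to be an integer)."""
    return sum(binom_pmf(y, n, p) for y in range(0, x + 1))

def binom_mean_var(n, p):
    """Mean and variance computed from the pmf, for comparison with np and np(1 - p)."""
    mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
    var = sum((x - mean) ** 2 * binom_pmf(x, n, p) for x in range(n + 1))
    return mean, var

mean, var = binom_mean_var(10, 0.3)  # compare with n*p and n*p*(1 - p)
```

Up to floating-point rounding, `mean` and `var` agree with np and np(1-p).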
  5. Hypergeometric and Negative Binomial Distributions(超几何分布与负二项分布)

    • Properties of Hypergeometric Distribution(超几何分布)

      1. The population or set to be sampled consists of N individuals, objects, or elements.(a finite population)
      2. Each individual can be characterized as a success (S) or a failure (F), and there are M successes in the population.
      3. A sample of n individuals is selected without replacement in such a way that each subset of size n is equally likely to be chosen.
    • The Hypergeometric Random Variable and Distribution

      • The random variable of interest is X = the number of S’s in the sample.
      • If X is the number of S’s in a completely random sample of size n drawn from a population consisting of M S’s and (N-M) F’s, then the probability distribution of X, called the hypergeometric distribution(超几何分布), is given by
        $P(X=x)=h(x;n,M,N)=\dfrac{C_M^x\,C_{N-M}^{n-x}}{C_N^n}$
    • The Mean and Variance of X

      • $E(X)=n\cdot\dfrac{M}{N}$
      • $V(X)=\dfrac{N-n}{N-1}\cdot n\cdot\dfrac{M}{N}\left(1-\dfrac{M}{N}\right)$
      • $\dfrac{N-n}{N-1}$ is often called the finite population correction factor(有限总体校正因子).
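The hypergeometric formulas can be verified with exact arithmetic. The sketch below assumes an illustrative population with N = 20, M = 12 and a sample of n = 5; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, n, M, N):
    """h(x; n, M, N) = C(M, x) C(N - M, n - x) / C(N, n); 0 outside 0..n."""
    if x < 0 or x > n:
        return Fraction(0)
    # math.comb(a, k) returns 0 when k > a, which covers x > M and n - x > N - M.
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

def hypergeom_mean_var(n, M, N):
    """Mean and variance computed directly from the pmf."""
    mean = sum(x * hypergeom_pmf(x, n, M, N) for x in range(n + 1))
    var = sum((x - mean) ** 2 * hypergeom_pmf(x, n, M, N) for x in range(n + 1))
    return mean, var

n, M, N = 5, 12, 20  # assumed example values
mean, var = hypergeom_mean_var(n, M, N)
# Variance from the closed form, including the finite population correction factor.
var_formula = Fraction(N - n, N - 1) * n * Fraction(M, N) * (1 - Fraction(M, N))
```

Without the correction factor, the variance expression reduces to the binomial value np(1-p) with p = M/N.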
    • Properties of Negative Binomial Distribution(负二项分布)

      1. The experiment consists of a sequence of independent trials.
      2. Each trial can result in either a success(S) or a failure(F).
      3. The probability of success is constant from trial to trial, so $P(S \text{ on trial } i)=p$ for i = 1,2,3,...
      4. The experiment continues (trials are performed) until a total of r successes have been observed, where r is a specified positive integer.
    • The Negative Binomial Random Variable and Distribution

      • The random variable of interest is X = the number of failures that precede the rth success.
        X is called a negative binomial random variable because, in contrast to the binomial rv, the number of successes is fixed and the number of trials is random.
      • The pmf of the negative binomial rv X with parameters r = number of S’s and p = P(S) is
        $P(X=x)=nb(x;r,p)=C_{x+r-1}^{r-1}\,p^r(1-p)^x$,  x=0,1,2,...
      • In the special case r = 1, the pmf is
        $nb(x;1,p)=(1-p)^x\,p$,  x=0,1,2,...

        Then the distribution is called the geometric distribution(几何分布).
    • The Mean and Variance of X

      • $E(X)=\dfrac{r(1-p)}{p}$
      • $V(X)=\dfrac{r(1-p)}{p^2}$
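A numerical sketch of the negative binomial pmf, its geometric special case, and its mean and variance. The support is infinite, so the sums are truncated at an assumed cutoff of 200 terms; the function names and the parameters r = 3, p = 0.5 are illustrative choices.

```python
from math import comb

def nb_pmf(x, r, p):
    """nb(x; r, p) = C(x + r - 1, r - 1) p^r (1 - p)^x for x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    return comb(x + r - 1, r - 1) * p**r * (1 - p) ** x

def nb_mean_var(r, p, cutoff=200):
    """Approximate mean and variance by summing the first `cutoff` terms of the pmf."""
    mean = sum(x * nb_pmf(x, r, p) for x in range(cutoff))
    var = sum((x - mean) ** 2 * nb_pmf(x, r, p) for x in range(cutoff))
    return mean, var

r, p = 3, 0.5  # assumed example values
mean, var = nb_mean_var(r, p)  # compare with r(1 - p)/p and r(1 - p)/p^2
```

The truncation is only safe when the tail beyond the cutoff is negligible, which holds here because (1 - p)^x decays geometrically.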
  6. The Poisson Probability Distribution(泊松分布)

    • Definition:

      • A random variable X is said to have a Poisson distribution if the pmf of X is
        $p(x;\lambda)=\dfrac{e^{-\lambda}\lambda^x}{x!}$,  x=0,1,2,3,...   with λ>0

        The value of λ is frequently a rate per unit time or per unit area.
      • $e^{\lambda}=1+\lambda+\dfrac{\lambda^2}{2!}+\cdots=\sum_{x=0}^{\infty}\dfrac{\lambda^x}{x!}$
        This shows that
        $\sum_{x=0}^{\infty}p(x;\lambda)=1$
    • Properties of a Poisson variable

      1. There may be an infinite number of trials
      2. Each trial results in either S or F
      3. Trials are independent
      4. The probability that an event occurs in a short interval is proportional(成比例的) to the length of the interval
      5. The probability of two or more events occurring in a very short interval is negligible(可以忽略的)
    • The Poisson Distribution as a Limit

      • Suppose that in the binomial pmf b(x;n,p), we let n → ∞ and p → 0 in such a way that np approaches a value λ > 0. Then b(x;n,p) → p(x;λ).
      • As a rule of thumb, this approximation can safely be applied if n ≥ 100, p ≤ 0.01, and np ≤ 20.
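The limit statement can be illustrated by comparing the two pmfs numerically. In this sketch the helper `max_abs_gap` and the parameter choices (np = 5 in both cases) are assumptions made for the example.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """b(x; n, p) for integer x in 0..n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """p(x; lambda) = e^(-lambda) lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

def max_abs_gap(n, p, upto=20):
    """Largest |b(x; n, p) - p(x; np)| over x = 0..upto."""
    lam = n * p
    return max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(upto + 1))

gap_rule_met = max_abs_gap(1000, 0.005)  # n >= 100, p <= 0.01, np = 5 <= 20
gap_larger_p = max_abs_gap(100, 0.05)    # same np, but p is too large for the rule
```

Holding np fixed while increasing n (and so decreasing p) shrinks the gap, which matches the direction of the limit.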
    • The Mean and Variance of X

      • E(X) = V(X)=λ
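One way to see why the mean and variance are both λ is to work directly from the pmf. The derivation below is a standard argument added for illustration, not taken from the notes; it uses the series for $e^{\lambda}$ and the identity $V(X)=E[X(X-1)]+E(X)-[E(X)]^2$.

```latex
\begin{align*}
E(X) &= \sum_{x=0}^{\infty} x\,\frac{e^{-\lambda}\lambda^{x}}{x!}
      = \lambda \sum_{x=1}^{\infty} \frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!}
      = \lambda, \\
E[X(X-1)] &= \sum_{x=2}^{\infty} x(x-1)\,\frac{e^{-\lambda}\lambda^{x}}{x!}
      = \lambda^{2} \sum_{x=2}^{\infty} \frac{e^{-\lambda}\lambda^{x-2}}{(x-2)!}
      = \lambda^{2}, \\
V(X) &= E[X(X-1)] + E(X) - [E(X)]^{2}
      = \lambda^{2} + \lambda - \lambda^{2} = \lambda.
\end{align*}
```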
    • The Poisson Process

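      • With $\lambda=\alpha t$: $P_k(t)=\dfrac{e^{-\alpha t}(\alpha t)^k}{k!}$, where $P_k(t)$ is the probability of k events in an interval of length t and α is the expected number of events per unit time.
      • The number of events occurring during a fixed time interval of length t has a Poisson distribution with parameter αt. Any process that has this distribution is called a Poisson process.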
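A minimal sketch of the Poisson process formula. The rate of 2 events per unit time and the interval length 3 are made-up values, and `poisson_process_prob` is a hypothetical function name.

```python
from math import exp, factorial

def poisson_process_prob(k, alpha, t):
    """P_k(t) = e^(-alpha t) (alpha t)^k / k!: probability of k events in time t."""
    lam = alpha * t
    return exp(-lam) * lam**k / factorial(k)

alpha, t = 2.0, 3.0  # assumed rate per unit time and interval length
prob_no_events = poisson_process_prob(0, alpha, t)
# Expected number of events, approximated by summing the first 100 terms.
expected_events = sum(k * poisson_process_prob(k, alpha, t) for k in range(100))
```

The expected count comes out as αt, so doubling the interval length doubles the expected number of events.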

Copyright © 2014 by Xuan Dai. All rights reserved.
