@daixuan1996
2015-01-07T23:45:16.000000Z
Probability and Statistics
Given a parameter of interest, such as a population mean μ or population proportion p, the objective of point estimation is to use a sample to compute a number that represents in some sense a good guess for the true value of the parameter. The resulting number is called a point estimate.
Some General Concepts of Point Estimation
Statistical inference is almost always directed toward drawing some type of conclusion about one or more parameters (population characteristics). To do so requires that an investigator obtain sample data from each of the populations under study. Conclusions can then be based on the computed values of various sample quantities.
When discussing general concepts and methods of inference, it is convenient to have a generic symbol for the parameter of interest. We will use the Greek letter θ for this purpose. The objective of point estimation is to select a single number, based on sample data, that represents a sensible value for θ.
- A point estimate of a parameter $\theta$ is a single number that can be regarded as a sensible value for $\theta$.
- A point estimate is obtained by selecting a suitable statistic and computing its value from the given sample data. The selected statistic is called the point estimator of $\theta$.
- In the best of all possible worlds, we could find an estimator $\hat{\theta}$ for which $\hat{\theta} = \theta$ always. However, $\hat{\theta}$ is a function of the sample $X_i$'s, so it is a random variable. For some samples, $\hat{\theta}$ will yield a value larger than $\theta$, whereas for other samples $\hat{\theta}$ will underestimate $\theta$.
- If we write
  $$\hat{\theta} = \theta + \text{error of estimation},$$
  then an accurate estimator is one that results in small estimation errors, so that estimated values are near the true value.
- An estimator that has the properties of unbiasedness and minimum variance will often be accurate in this sense.
Unbiased Estimators
- A point estimator $\hat{\theta}$ is said to be an unbiased estimator of $\theta$ if $E(\hat{\theta}) = \theta$ for every possible value of $\theta$. If $\hat{\theta}$ is not unbiased, the difference $E(\hat{\theta}) - \theta$ is called the bias of $\hat{\theta}$.
- Thus, $\hat{\theta}$ is unbiased if its probability distribution is always "centered" at the true value of the parameter. Note that "centered" here means that the expected value, not the median, of the distribution of $\hat{\theta}$ is equal to $\theta$.
- When $X$ is a binomial rv with parameters $n$ and $p$, the sample proportion $\hat{p} = X/n$ is an unbiased estimator of $p$.
- When choosing among several different estimators of $\theta$, select one that is unbiased.
- Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. Then the estimator
  $$\hat{\sigma}^2 = S^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$$
  is an unbiased estimator of $\sigma^2$.
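As a quick sanity check, here is a minimal simulation sketch (using NumPy; the sample size, repetition count, and $\sigma^2 = 4$ are illustrative choices, not from the text) contrasting the $n-1$ divisor with the biased $n$ divisor:

```python
import numpy as np

# Compare the unbiased (n-1 divisor) and biased (n divisor) variance
# estimators over many samples from a population with known sigma^2 = 4.
rng = np.random.default_rng(0)
n, reps, sigma2 = 10, 100_000, 4.0

samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(reps, n))
s2_unbiased = samples.var(axis=1, ddof=1)  # divides by n - 1
s2_biased = samples.var(axis=1, ddof=0)    # divides by n

print(s2_unbiased.mean())  # ~4.0: E(S^2) = sigma^2
print(s2_biased.mean())    # ~3.6: E = ((n-1)/n) * sigma^2, biased low
```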
Estimators with Minimum Variance
- Among all estimators of $\theta$ that are unbiased, choose the one that has minimum variance. The resulting $\hat{\theta}$ is called the minimum variance unbiased estimator (MVUE) of $\theta$.
- Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with parameters $\mu$ and $\sigma$. Then the estimator $\hat{\mu} = \bar{X}$ is the MVUE of $\mu$.
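The following sketch (illustrative parameters, assuming NumPy) shows what "minimum variance" buys: for normal data both the sample mean and the sample median are unbiased for $\mu$, but the mean varies less from sample to sample:

```python
import numpy as np

# For normal data both the sample mean and the sample median are unbiased
# estimators of mu, but the mean (the MVUE) has the smaller variance.
rng = np.random.default_rng(1)
n, reps = 25, 100_000

samples = rng.normal(loc=5.0, scale=2.0, size=(reps, n))
means = samples.mean(axis=1)
medians = np.median(samples, axis=1)

print(means.mean(), medians.mean())  # both ~5.0 (unbiased)
print(means.var(), medians.var())    # ~0.16 vs ~0.25: the mean wins
```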
Some Complications
- For distributions other than normal, the estimator $\bar{X}$ may not be the best choice for estimating a population mean $\mu$.
- The very important moral here is that the best estimator for $\mu$ depends crucially on which distribution is being sampled. In particular (see the simulation sketch after this list):
  - If the random sample comes from a normal distribution, then $\bar{X}$ is the best of the four estimators, since it is the MVUE.
  - If the random sample comes from a Cauchy distribution, then $\bar{X}$ and $\bar{X}_e$ (the average of the two extreme observations) are terrible estimators for $\mu$, whereas the sample median $\tilde{X}$ is quite good. $\bar{X}$ is bad because it is very sensitive to outlying observations, and the heavy tails of the Cauchy distribution make a few such observations likely to appear in any sample.
  - If the underlying distribution is uniform, the best estimator is $\bar{X}_e$; this estimator is greatly influenced by outlying observations, but the lack of tails makes such observations impossible.
  - The trimmed mean is best in none of these three situations, but it works reasonably well in all three. That is, $\bar{X}_{tr(10)}$ does not suffer too much in any of the three situations.
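A minimal simulation sketch of this comparison (assuming NumPy and SciPy's `trim_mean`; the distributions are centered at 0 and the sample sizes are illustrative):

```python
import numpy as np
from scipy import stats  # for trim_mean

# Compare four location estimators on three symmetric distributions
# centered at 0: sample mean, sample median, midrange, 10% trimmed mean.
rng = np.random.default_rng(2)
n, reps = 20, 20_000

draws = {
    "normal":  rng.normal(size=(reps, n)),
    "cauchy":  rng.standard_cauchy(size=(reps, n)),
    "uniform": rng.uniform(-1, 1, size=(reps, n)),
}

for name, x in draws.items():
    estimates = {
        "mean":     x.mean(axis=1),
        "median":   np.median(x, axis=1),
        "midrange": (x.min(axis=1) + x.max(axis=1)) / 2,
        "trim10":   stats.trim_mean(x, 0.1, axis=1),
    }
    # Smaller mean squared error around 0 means a better estimator here.
    mse = {k: float(np.mean(v**2)) for k, v in estimates.items()}
    print(name, mse)
```

The mean should come out best in the normal case, the median in the Cauchy case (where the mean's error explodes), and the midrange in the uniform case, with the trimmed mean respectable throughout.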
Reporting a Point Estimate: The Standard Error
- Besides reporting the value of a point estimate, some indication of its precision should be given. The usual measure of precision is the standard error of the estimator used.
- The standard error of an estimator is its standard deviation $\sigma_{\hat{\theta}} = \sqrt{V(\hat{\theta})}$.
- If the standard error itself involves unknown parameters whose values can be estimated, substituting these estimates into $\sigma_{\hat{\theta}}$ yields the estimated standard error (estimated standard deviation) of the estimator. The estimated standard error can be denoted either by $\hat{\sigma}_{\hat{\theta}}$ or by $s_{\hat{\theta}}$.
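For the sample mean, $\sigma_{\bar{X}} = \sigma/\sqrt{n}$ involves the unknown $\sigma$, so substituting $s$ gives the estimated standard error $s/\sqrt{n}$. A small sketch (the data values below are made up purely for illustration):

```python
import numpy as np

# Report a point estimate of mu together with its estimated standard
# error s / sqrt(n), where s replaces the unknown sigma.
x = np.array([24.1, 25.3, 23.8, 26.0, 24.7, 25.1])  # hypothetical data
n = x.size

x_bar = x.mean()                      # point estimate of mu
se_hat = x.std(ddof=1) / np.sqrt(n)   # estimated standard error of X-bar

print(f"estimate = {x_bar:.2f}, estimated standard error = {se_hat:.2f}")
```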
Methods of Point Estimation
The Method of Moments
- Let $X_1, X_2, \ldots, X_n$ be a random sample (i.e., independent and identically distributed observations) from a pmf or pdf $f(x)$. For $k = 1, 2, 3, \ldots$, the kth population moment, or kth moment of the distribution $f(x)$, is $E(X^k)$. The kth sample moment is $(1/n)\sum_{i=1}^{n} X_i^k$.
- (The idea: substitute sample moments for population moments, using as many moment equations as there are unknown parameters.) Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with pmf or pdf $f(x; \theta_1, \ldots, \theta_m)$, where $\theta_1, \ldots, \theta_m$ are parameters whose values are unknown. Then the moment estimators $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are obtained by equating the first m sample moments (expressions in the data) to the corresponding first m population moments (expressions in the unknown parameters) and solving for $\theta_1, \ldots, \theta_m$; the solutions are the moment estimators.
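As an illustration, here is a sketch of the method for a two-parameter Gamma sample (the shape 2.0 and scale 3.0 are assumed for the demo, and NumPy is used for sampling):

```python
import numpy as np

# Method-of-moments estimators for Gamma(alpha, beta) (shape, scale).
# Equate the first two sample moments to E(X) = alpha*beta and
# E(X^2) = alpha*(alpha+1)*beta^2, then solve for alpha and beta.
rng = np.random.default_rng(3)
x = rng.gamma(shape=2.0, scale=3.0, size=10_000)

m1 = x.mean()          # first sample moment
m2 = (x**2).mean()     # second sample moment

v = m2 - m1**2             # equals alpha * beta^2 by the moment equations
beta_hat = v / m1          # moment estimator of the scale
alpha_hat = m1 / beta_hat  # moment estimator of the shape

print(alpha_hat, beta_hat)  # close to the true values 2.0 and 3.0
```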
Maximum Likelihood Estimation
- Let $X_1, X_2, \ldots, X_n$ have joint pmf or pdf $f(x_1, x_2, \ldots, x_n; \theta_1, \ldots, \theta_m)$, where the parameters $\theta_1, \ldots, \theta_m$ have unknown values. When $x_1, \ldots, x_n$ are the observed sample values and $f$ is regarded as a function of $\theta_1, \ldots, \theta_m$, it is called the likelihood function.
- The maximum likelihood estimates (mle's) $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are those values of the $\theta_i$'s that maximize the likelihood function, so that
  $$f(x_1, x_2, \ldots, x_n; \hat{\theta}_1, \ldots, \hat{\theta}_m) \geq f(x_1, x_2, \ldots, x_n; \theta_1, \ldots, \theta_m) \quad \text{for all } \theta_1, \ldots, \theta_m.$$
  When the $X_i$'s are substituted in place of the $x_i$'s, the maximum likelihood estimators result.
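As a concrete sketch (illustrative rate and sample size, assuming NumPy), consider the exponential distribution with rate $\lambda$: the log-likelihood is $n\ln\lambda - \lambda\sum x_i$, and setting its derivative to zero gives the closed form $\hat{\lambda} = 1/\bar{x}$.

```python
import numpy as np

# MLE for the rate of an exponential sample: lambda_hat = 1 / x_bar.
rng = np.random.default_rng(4)
true_rate = 0.5
x = rng.exponential(scale=1 / true_rate, size=5_000)

lambda_hat = 1 / x.mean()  # closed-form maximum likelihood estimate

def log_lik(lam):
    # log-likelihood: n*log(lambda) - lambda * sum(x)
    return x.size * np.log(lam) - lam * x.sum()

print(lambda_hat)  # close to 0.5
# Sanity check: the log-likelihood at lambda_hat beats nearby values.
print(log_lik(lambda_hat) >= max(log_lik(0.9 * lambda_hat),
                                 log_lik(1.1 * lambda_hat)))  # True
```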
Estimating Functions of Parameters
- Let $\hat{\theta}_1, \ldots, \hat{\theta}_m$ be the mle's of the parameters $\theta_1, \ldots, \theta_m$. Then the mle of any function $h(\theta_1, \ldots, \theta_m)$ of these parameters is the function $h(\hat{\theta}_1, \ldots, \hat{\theta}_m)$ of the mle's.
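A standard instance of this property, sketched with an assumed normal sample: the mle of $\sigma^2$ is $\sum(x_i - \bar{x})^2/n$, so the mle of $\sigma = h(\sigma^2) = \sqrt{\sigma^2}$ is simply the square root of that estimate.

```python
import numpy as np

# For a normal sample the mle of sigma^2 divides by n (not n - 1);
# the mle of sigma is then h(sigma^2_mle) = sqrt(sigma^2_mle).
rng = np.random.default_rng(5)
x = rng.normal(loc=0.0, scale=2.0, size=5_000)

sigma2_mle = np.mean((x - x.mean())**2)  # mle of sigma^2
sigma_mle = np.sqrt(sigma2_mle)          # mle of sigma via the function h

print(sigma2_mle, sigma_mle)  # close to 4.0 and 2.0
```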
Large Sample Behavior of the MLE
- Under very general conditions on the joint distribution of the sample, when the sample size $n$ is large, the maximum likelihood estimator of any parameter $\theta$ is approximately unbiased, $E(\hat{\theta}) \approx \theta$, and has variance that is nearly as small as can be achieved by any estimator. Stated another way, the mle $\hat{\theta}$ is approximately the MVUE of $\theta$.
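To see this large-sample behavior concretely, here is a sketch (illustrative parameters, assuming NumPy) using the normal-sample mle of $\sigma^2$, which divides by $n$ and is therefore biased for small $n$:

```python
import numpy as np

# The mle of sigma^2 has expectation ((n-1)/n) * sigma^2, so it is biased,
# but the bias vanishes as the sample size n grows.
rng = np.random.default_rng(6)
sigma2, reps = 4.0, 50_000

for n in (5, 20, 100):
    samples = rng.normal(scale=np.sqrt(sigma2), size=(reps, n))
    mle = samples.var(axis=1, ddof=0)  # maximum likelihood estimator
    print(n, mle.mean())  # ~3.2, ~3.8, ~3.96: approaching sigma^2 = 4.0
```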
Copyright © 2015 by Xuan Dai. All rights reserved.