
Two-Variable Regression Model: The Problem of Estimation

Ordinary Least Squares Estimation

$$\min_{\beta_0,\,\beta_1}\ \sum (Y_i-\hat Y_i)^2=\sum (Y_i-\beta_0-\beta_1 X_i)^2$$

Differentiating with respect to $\beta_0$ and $\beta_1$ and setting the derivatives to zero:

$$2\sum(Y_i-\beta_0-\beta_1X_i)(-1)=0,\qquad 2\sum(Y_i-\beta_0-\beta_1X_i)(-X_i)=0$$

which simplify to the normal equations

$$\begin{aligned}\sum Y_i&=n\beta_0+\beta_1\sum X_i\\\sum X_iY_i&=\beta_0\sum X_i+\beta_1\sum X_i^2\end{aligned}$$

Multiplying the first equation by $\sum X_i$ and the second by $n$ gives

$$\begin{aligned}\sum X_i\sum Y_i&=n\beta_0\sum X_i+\beta_1\left(\sum X_i\right)^2\\n\sum X_iY_i&=n\beta_0\sum X_i+n\beta_1\sum X_i^2\end{aligned}$$

Subtracting the second from the first gives

$$\sum X_i\sum Y_i-n\sum X_iY_i=\beta_1\left[\left(\sum X_i\right)^2-n\sum X_i^2\right]$$

Solving for the coefficients gives

$$\hat \beta_1=\frac{n\sum X_i Y_i-\sum X_i\sum Y_i}{n\sum X_i^2-\left(\sum X_i\right)^2}=\frac{\sum (X_i-\overline X)(Y_i-\overline Y)}{\sum (X_i-\overline X)^2}=\frac{\sum x_iy_i}{\sum x_i^2}$$

where $x_i=X_i-\overline X$ and $y_i=Y_i-\overline Y$ denote deviations from the sample means. The intercept estimate then follows as

$$\hat \beta_0=\overline Y-\hat \beta_1 \overline X$$
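As a quick numerical check, here is a minimal sketch of these closed-form estimates (assuming numpy and synthetic data with known true parameters; none of this setup is from the original notes):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
X = rng.uniform(0, 10, n)
Y = 2.0 + 0.5 * X + rng.normal(0, 1.0, n)  # true beta0 = 2.0, beta1 = 0.5

x = X - X.mean()  # deviations from the mean, x_i
y = Y - Y.mean()  # deviations from the mean, y_i

beta1_hat = (x * y).sum() / (x**2).sum()     # sum(x_i y_i) / sum(x_i^2)
beta0_hat = Y.mean() - beta1_hat * X.mean()  # Ybar - beta1_hat * Xbar
print(f"beta1_hat = {beta1_hat:.4f}, beta0_hat = {beta0_hat:.4f}")
```

Both estimates should land close to the true values 0.5 and 2.0.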

Variances of the Estimators

  • The variance of $\hat\beta_1$

$$\begin{aligned}\operatorname{var}(\hat\beta_1)&=E\left[\hat\beta_1-E(\hat\beta_1)\right]^2=E\left(\sum \frac{x_i}{\sum x_i^2}Y_i-\beta_1\right)^2\\&=E\left[\sum \frac{x_i}{\sum x_i^2}(\beta_0+\beta_1 X_i+\mu_i)-\beta_1\right]^2=E\left(\sum \frac{x_i}{\sum x_i^2}\mu_i\right)^2\end{aligned}$$

Since $E(\mu_i^2)=\sigma^2$ and $E(\mu_i\mu_j)=0$ for $i\neq j$, this becomes

$$\operatorname{var}(\hat\beta_1)=\sigma^2\sum\left(\frac{x_i}{\sum x_i^2}\right)^2=\frac{\sigma^2}{\sum x_i^2}$$

  • The variance of $\hat\beta_0$

$$\begin{aligned}\operatorname{var}(\hat\beta_0)&=E\left(\hat\beta_0-\beta_0\right)^2=E\left[\overline Y-\hat\beta_1\overline X-\beta_0\right]^2=E\left[\overline\mu-\overline X\sum k_i\mu_i\right]^2\\&=\frac{\sigma^2}{n}+\overline X^2\frac{\sigma^2}{\sum x_i^2}=\sigma^2\frac{\sum x_i^2+n\overline X^2}{n\sum x_i^2}=\sigma^2\frac{\sum X_i^2}{n\sum x_i^2}\end{aligned}$$

where $k_i=x_i/\sum x_i^2$, so that $\hat\beta_1=\beta_1+\sum k_i\mu_i$ and $\overline Y=\beta_0+\beta_1\overline X+\overline\mu$; the cross term vanishes because $\sum k_i=0$, and the last step uses $\sum X_i^2=\sum x_i^2+n\overline X^2$.
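A minimal standalone sketch of these two formulas (synthetic regressors, true $\sigma^2=1$ assumed known for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n, sigma2 = 100, 1.0
X = rng.uniform(0, 10, n)
x = X - X.mean()  # deviations x_i

var_beta1 = sigma2 / (x**2).sum()                       # sigma^2 / sum(x_i^2)
var_beta0 = sigma2 * (X**2).sum() / (n * (x**2).sum())  # sigma^2 sum(X_i^2) / (n sum(x_i^2))
print(var_beta1, var_beta0)
```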

  • Estimating $\sigma^2$

$$Y_i=\beta_0+\beta_1X_i+\mu_i,\qquad \overline Y=\beta_0+\beta_1\overline X+\overline\mu$$

Subtracting the second equation from the first gives

$$y_i=\beta_1x_i+(\mu_i-\overline\mu)$$

We also know that

$$\hat\mu_i=y_i-\hat\beta_1 x_i$$

Substituting the previous expression for $y_i$,

$$\hat\mu_i=\beta_1x_i+(\mu_i-\overline\mu)-\hat\beta_1 x_i=(\beta_1-\hat\beta_1)x_i+(\mu_i-\overline\mu)$$

Then

$$\begin{aligned}E\sum \hat\mu_i^2&=E\left[\sum(\beta_1-\hat\beta_1)^2x_i^2+\sum(\mu_i-\overline\mu)^2+2(\beta_1-\hat\beta_1)\sum x_i(\mu_i-\overline\mu)\right]\\&=\operatorname{var}(\hat\beta_1)\sum x_i^2+(n-1)\sigma^2-2E\left[\left(\sum k_i\mu_i\right)\left(\sum x_i\mu_i\right)\right]\\&=\sigma^2+(n-1)\sigma^2-2\sigma^2=(n-2)\sigma^2\end{aligned}$$

using $\beta_1-\hat\beta_1=-\sum k_i\mu_i$, $\sum x_i(\mu_i-\overline\mu)=\sum x_i\mu_i$, and $E\left[\left(\sum k_i\mu_i\right)\left(\sum x_i\mu_i\right)\right]=\sigma^2\sum k_ix_i=\sigma^2$.

Now define

$$\hat\sigma^2=\frac{\sum\hat\mu_i^2}{n-2}$$

Then $E(\hat\sigma^2)=(n-2)\sigma^2/(n-2)=\sigma^2$, which shows that $\hat\sigma^2$ is an unbiased estimator of $\sigma^2$.
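In practice $\sigma^2$ is unknown, so the variance formulas are evaluated with this unbiased estimate. A self-contained sketch (same hypothetical simulated setup as above):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
X = rng.uniform(0, 10, n)
Y = 2.0 + 0.5 * X + rng.normal(0, 1.0, n)
x, y = X - X.mean(), Y - Y.mean()
beta1_hat = (x * y).sum() / (x**2).sum()
beta0_hat = Y.mean() - beta1_hat * X.mean()

resid = Y - beta0_hat - beta1_hat * X    # residuals mu_hat_i
sigma2_hat = (resid**2).sum() / (n - 2)  # unbiased: sum(mu_hat_i^2) / (n - 2)

se_beta1 = np.sqrt(sigma2_hat / (x**2).sum())
se_beta0 = np.sqrt(sigma2_hat * (X**2).sum() / (n * (x**2).sum()))
print(f"sigma2_hat = {sigma2_hat:.4f}, se(beta1) = {se_beta1:.4f}, se(beta0) = {se_beta0:.4f}")
```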

The Gauss-Markov Theorem

Under the classical assumptions, the OLS estimators are the best linear unbiased estimators (BLUE).

  • Unbiasedness

$$\hat\beta_1=\frac{\sum x_iy_i}{\sum x_i^2}=\sum \frac{x_i}{\sum x_i^2}Y_i=\sum k_iY_i=\sum k_i(\beta_0+\beta_1 X_i+\mu_i)$$

where $k_i=x_i/\sum x_i^2$, which has the following properties:

$\sum k_i=0$;

$\sum k_iX_i=\sum k_ix_i=1$.

$$E(\hat\beta_1)=E\left(\beta_0\sum k_i\right)+E\left(\beta_1\sum k_iX_i\right)+E\left(\sum k_i\mu_i\right)=\beta_1$$

Therefore $\hat\beta_1$ is an unbiased estimator of $\beta_1$.

$$\begin{aligned}\hat\beta_0&=\overline Y-\overline X\sum k_iY_i=\beta_0+\beta_1\overline X+\overline\mu-\overline X\sum k_i(\beta_0+\beta_1X_i+\mu_i)\\&=\beta_0+\beta_1\overline X+\overline\mu-\overline X\beta_1-\overline X\sum k_i\mu_i=\beta_0+\overline\mu-\overline X\sum k_i\mu_i\end{aligned}$$

Taking expectations,

$$E(\hat\beta_0)=\beta_0+E(\overline\mu)-E\left(\overline X\sum k_i\mu_i\right)=\beta_0$$

so $\hat\beta_0$ is also an unbiased estimator of $\beta_0$.
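A Monte Carlo sketch of unbiasedness: hold the regressors fixed, redraw the errors many times, and average the estimates (the setup is hypothetical, not from the original notes):

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 50, 20_000
beta0, beta1, sigma = 2.0, 0.5, 1.0
X = rng.uniform(0, 10, n)  # regressors held fixed across replications
x = X - X.mean()

b0_draws = np.empty(reps)
b1_draws = np.empty(reps)
for r in range(reps):
    Y = beta0 + beta1 * X + rng.normal(0, sigma, n)
    b1_draws[r] = (x * (Y - Y.mean())).sum() / (x**2).sum()
    b0_draws[r] = Y.mean() - b1_draws[r] * X.mean()

# Averages should be close to the true values beta1 = 0.5 and beta0 = 2.0
print(f"mean(beta1_hat) = {b1_draws.mean():.4f}, mean(beta0_hat) = {b0_draws.mean():.4f}")
```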

  • Efficiency

Consider an alternative linear estimator

$$\beta_1^*=\sum w_iY_i$$

Its expectation is

$$E(\beta_1^*)=\beta_0\sum w_i+\beta_1\sum w_iX_i$$

For $\beta_1^*$ to be unbiased, we must therefore have

$\sum w_i=0$;

$\sum w_iX_i=\sum w_ix_i=1$.

Its variance is

$$\begin{aligned}\operatorname{var}(\beta_1^*)&=\sum w_i^2\operatorname{var}(Y_i)=\sigma^2\sum\left[(w_i-k_i)+k_i\right]^2\\&=\sigma^2\sum\left[(w_i-k_i)^2+k_i^2+2k_i(w_i-k_i)\right]\\&=\sigma^2\sum(w_i-k_i)^2+\frac{\sigma^2}{\sum x_i^2}\end{aligned}$$

where the cross term drops out because $\sum k_iw_i=\sum k_i^2=1/\sum x_i^2$.

Only when $w_i=k_i$ does the variance reach its minimum $\sigma^2/\sum x_i^2$, which is exactly $\operatorname{var}(\hat\beta_1)$. Hence no other linear unbiased estimator beats OLS.
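To see the efficiency argument numerically, compare the OLS weights $k_i$ with hypothetical alternative weights $w_i=k_i+d_i$, where $d_i$ is any perturbation satisfying $\sum d_i=0$ and $\sum d_ix_i=0$ so that the unbiasedness constraints still hold (the construction below is an illustrative assumption, not from the original notes):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X = rng.uniform(0, 10, n)
x = X - X.mean()
k = x / (x**2).sum()  # OLS weights k_i

# Hypothetical perturbation d_i with sum(d_i) = 0 and sum(d_i x_i) = 0
d = rng.normal(size=n)
d -= d.mean()               # enforce sum(d_i) = 0
d -= (d @ x) / (x @ x) * x  # enforce sum(d_i x_i) = 0 (x has zero mean, so sum(d_i) stays 0)
w = k + 0.001 * d           # alternative weights, still unbiased

sigma2 = 1.0
print("var(OLS)        :", sigma2 * (k**2).sum())  # = sigma^2 / sum(x_i^2)
print("var(alternative):", sigma2 * (w**2).sum())  # strictly larger unless d = 0
```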