Friday

Checking for autocorrelation / serial correlation


What is autocorrelation?
1) Suppose you have an equation that predicts values of a dependent variable Y.
2) The predicted values (Y’i) are only ‘close’ to the real values (Yi); they rarely match exactly.
3) The difference between a real value and its predicted value (Yi - Y’i) goes by different names: the residual (what is left over) or the error term (the part that the equation cannot predict).
4) Two important assumptions concern the residuals/error terms: they should not correlate with each other, and their variance should be constant.
5) If the first assumption is violated, that is, if the residuals (between different waves of data, or different groups) correlate with each other, we have autocorrelation. This is particularly common in time series data, where it is called ‘serial correlation’.
6) If the second assumption is violated, we have the phenomenon called heteroskedasticity.
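To make points 1) to 3) concrete, here is a minimal sketch of how predicted values and residuals can be obtained in Stata. It uses the auto dataset that ships with Stata, with price and weight standing in for Y and the predictor; these are not the variables used in this post.

```stata
* Illustration only: bundled example data, not the d1/age data from this post
sysuse auto, clear
regress price weight
* Predicted values (Y'i)
predict yhat, xb
* Residuals (Yi - Y'i)
predict ehat, residuals
summarize ehat
* Plot residuals against fitted values; a funnel shape hints at non-constant variance
rvfplot, yline(0)
```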
How to check for them?
How to know if the first assumption is violated?
We can declare the data as panel data (with xtset), then run the Wooldridge test in Stata as follows:
xtserial DEP (list of predictors)
If the p-value is smaller than 0.05, the assumption is violated. For example, if I have d1 as DEP and age is the only predictor:
xtserial d1 age
Wooldridge test for autocorrelation in panel data
H0: no first-order autocorrelation
    F(  1,       1) =      0.059
           Prob > F =      0.8486
--> The p-value (0.8486) is well above 0.05, so we cannot reject H0: there is no evidence of first-order autocorrelation.
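A sketch of the full sequence, assuming the panel identifier is called id and the time variable is called year (both names are hypothetical; replace them with your own). As far as I know, xtserial is a user-written command, so it may need to be installed before it will run.

```stata
* id and year are hypothetical variable names, not taken from this post
* If Stata does not recognise xtserial, look for the package with: search xtserial
xtset id year
xtserial d1 age
```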
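Another way: declare the data as time series (with tsset), then run the regression followed by the Durbin-Watson test. A statistic close to 2 suggests no first-order autocorrelation.
reg DEP (list of predictors)
dwstat
(In recent versions of Stata, the same test is run with estat dwatson.)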
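To see what the test looks like when autocorrelation really is present, the sketch below simulates a series whose errors follow an AR(1) process with coefficient 0.7. All variable names and numbers are made up for illustration. The Durbin-Watson statistic should come out well below 2; the Breusch-Godfrey test is included as a second check.

```stata
* Simulated data for illustration only
clear
set seed 12345
set obs 200
generate t = _n
tsset t
generate x = rnormal()
* Build AR(1) errors: each error depends on the previous one
generate e = rnormal()
replace e = 0.7*e[_n-1] + rnormal() in 2/l
generate y = 1 + 2*x + e
regress y x
* Durbin-Watson statistic (newer syntax for dwstat)
estat dwatson
* Breusch-Godfrey test for first-order serial correlation
estat bgodfrey, lags(1)
```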
How to know if heteroskedasticity exists?
reg DEP (list of predictors)
estat hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of d3
         chi2(1)      =     0.73
         Prob > chi2  =   0.3942
--> The p-value (0.3942) is above 0.05, so we cannot reject constant variance: there is no evidence of heteroskedasticity.
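For contrast, here is a sketch of the same test on simulated data where the error variance grows with the predictor, so the test should reject constant variance. The variables and numbers are invented for illustration, and White's test (estat imtest, white) is added as an alternative check.

```stata
* Simulated data for illustration only
clear
set seed 2024
set obs 200
* x runs between 1 and 10
generate x = 1 + 9*runiform()
* Error standard deviation equals x, so the variance is not constant
generate y = 1 + 2*x + rnormal(0, x)
regress y x
* Breusch-Pagan / Cook-Weisberg test
estat hettest
* White's general test
estat imtest, white
```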
What to do?
What to do if you find autocorrelation? Remove its effect with a Cochrane-Orcutt regression:
prais d1 age, corc
Iteration 0:  rho = 0.0000
Iteration 1:  rho = 0.0795
Iteration 2:  rho = 0.0808
Iteration 3:  rho = 0.0808
Iteration 4:  rho = 0.0808
Cochrane-Orcutt AR(1) regression -- iterated estimates
      Source |       SS       df       MS              Number of obs =     127
-------------+------------------------------           F(  1,   125) =    4.85
       Model |  14.3667947     1  14.3667947           Prob > F      =  0.0295
    Residual |  370.582352   125  2.96465882           R-squared     =  0.0373
-------------+------------------------------           Adj R-squared =  0.0296
       Total |  384.949147   126  3.05515196           Root MSE      =  1.7218
------------------------------------------------------------------------------
          d1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0306289   .0139136     2.20   0.030     .0030922    .0581656
       _cons |   3.001655   .5507311     5.45   0.000     1.911689     4.09162
-------------+----------------------------------------------------------------
         rho |   .0808052
------------------------------------------------------------------------------
Durbin-Watson statistic (original)    1.838574
Durbin-Watson statistic (transformed) 2.000578
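The two Durbin-Watson statistics can be related to rho with a common rule of thumb: the statistic is roughly 2(1 - rho). The sketch below only redoes that arithmetic with the rho reported above; it is an approximation, not how Stata computes the statistic.

```stata
* Rule of thumb: DW is approximately 2*(1 - rho)
* With rho = .0808052 this gives about 1.838, close to the original 1.838574
display 2*(1 - .0808052)
```

After the transformation the statistic is about 2.0006, so essentially no first-order autocorrelation is left. Note that prais needs the data to be declared as time series with tsset, and that the corc option drops the first observation, which would explain the 127 observations reported here.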
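What to do if you find heteroskedasticity?
Heteroskedasticity leaves the coefficient estimates unbiased but makes the usual standard errors, and therefore the t-tests and confidence intervals, unreliable. In Stata, we correct for this by adding the robust option to reg, which reports heteroskedasticity-robust standard errors:
reg Y X, robust
reg d1 age, robust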
Linear regression                                      Number of obs =     128
                                                       F(  1,   126) =    5.55
                                                       Prob > F      =  0.0201
                                                       R-squared     =  0.0409
                                                       Root MSE      =   1.722
------------------------------------------------------------------------------
             |               Robust
          d1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0322989   .0137162     2.35   0.020      .005155    .0594428
       _cons |   2.945031   .5509205     5.35   0.000     1.854776    4.035286
------------------------------------------------------------------------------
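To check that the robust option changes only the standard errors and not the coefficients, the two fits can be put side by side. This is a minimal sketch on Stata's bundled auto dataset rather than the data from this post; vce(robust) is the newer spelling of the robust option.

```stata
* Illustration only: bundled example data, not the d1/age data from this post
sysuse auto, clear
regress price weight
estimates store ols
regress price weight, vce(robust)
estimates store rob
* Same coefficients in both columns, different standard errors
estimates table ols rob, b(%9.4f) se(%9.4f)
```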