Thursday

Repeated measures

Situation:

A group of students was asked twice about their opinions on studying. The first time, they were asked two questions:

1) How do you like studying individually?
2) How do you like studying in a group of 3?

The experimenter then carried out an intervention with the group and asked the same questions again. He wanted to know whether the intervention was effective.

What test should he use?














Tuesday

Framework candidate 1


Consumption of homemade alcohol in Muong Khen is closely linked with the image of a competent drinker. A competent drinker in Muong Khen is one who can drink many cups but remains relatively sober and does no harm to others. He is praised for his openness in conversation, his willingness to make connections with others, and his appreciation of relationships with others. He is, in short, good at both drinking and making social relations. At a time when social capital is important for coping with scarcity of resources and the uncertainties of life, the competent drinker has become a model of behaviour that many people would like to imitate. This image acts as a norm on which drinkers base their judgments of others' and their own drinking practices. Drinkers may try to neutralise or justify their drinking patterns if they find that they deviate from the norm.

Sunday

Gologit2 for ordinal outcome


If you have an ordinal outcome variable (for example, 'at risk' drinking and 'alcohol dependent' on an alcohol scale),

If you want to argue that the probabilities of levels 1, 2, 3, …, n of the outcome are cumulative (for example, people with an increased chance of being in the 'at risk' category also have an increased chance of being in the 'dependence' category),

If you are trying to adjust for problems caused by the nature of a survey (the sample does not represent the population due to non-response, finite population correction),

If you want to test the proportional odds assumption for your ordinal variable,

then do the following:


 

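  1. Install gologit2 in Stata (for example, with: ssc install gologit2)
  2. Key in (depvar, indepvars and groupvar are placeholders for your own variables; the svy option assumes the data have already been svyset)


    gologit2 depvar indepvars, autofit svy subpop(groupvar)

If the test of the proportional odds assumption is not significant, your model is OK.

If it is significant, use separate standard logistic regression equations, or try multinomial logistic regression.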

Saturday

Understanding interaction in logistic regression

If we have
y = x + z + xz

How to interpret the interaction term xz:

xz = when x is held constant, a one-unit change in z yields a change of ( ....) in y, or in the log odds of y (in the case of logistic regression)
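
To make the interpretation concrete, here is a minimal sketch in Python (not Stata). It assumes the model is written with explicit coefficients as log odds of y = b0 + b1*x + b2*z + b3*x*z. The coefficient values are made up for illustration and do not come from any real model. With x held constant, a one-unit change in z changes the log odds by b2 + b3*x, so the effect of z depends on the value of x.

```python
import math

# Hypothetical coefficients (illustration only, not estimated from data):
# log odds of y = b0 + b1*x + b2*z + b3*x*z
b0, b1, b2, b3 = -1.0, 0.5, 0.8, -0.3


def log_odds(x, z):
    """Log odds of y for given values of x and z."""
    return b0 + b1 * x + b2 * z + b3 * x * z


def effect_of_z(x):
    """Change in log odds for a one-unit increase in z, with x held constant."""
    return log_odds(x, 1) - log_odds(x, 0)


for x in (0, 1, 2):
    change = effect_of_z(x)
    print(f"x = {x}: change in log odds = {change:+.2f}, "
          f"odds ratio = {math.exp(change):.2f}")
```

The interaction coefficient b3 is the amount by which the effect of z shifts for each one-unit increase in x.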









Wednesday

Checking the parallel regression assumption in ologit regression

Context: If your dependent variable is ordered, like 'low', 'middle', 'high' levels of income, you may want to use ordinal logistic regression. An important assumption of this technique (the parallel regression, or proportional odds, assumption) is that the set of coefficients estimated at each cut-point of the outcome is similar to the sets estimated at the other cut-points. We need to check for that. If this assumption is not violated, we report the coefficients from the ordinal logistic regression. If it is violated, we should use multinomial logistic regression, which does not require the sets of coefficients to be stable.

How to do it in Stata?

The syntax in Stata is:

ologit y x1 x2 ... xn
brant, detail

Significant results mean this assumption is violated. Note that the test is sensitive to large sample sizes. Recode the violating variable and re-run.

If brant still does not work, download omodel and run this:



omodel logit y x1 x2 x3


Significant results mean that the assumption is violated! In other words, the coefficients do vary significantly between categories of the response variable.

Sample size is also a problem. A larger sample is needed for an ologit model than for an OLS model.


If neither tip works, go for gologit2 (see the post 'Gologit2 for ordinal outcome').



Understanding logit regression output

Look at this to see how to interpret odds

Diagnostic tests to run for regression

  • Detecting Unusual and Influential Data
    • predict -- used to create predicted values, residuals, and measures of influence.
    • rvpplot -- graphs a residual-versus-predictor plot.
    • rvfplot -- graphs residual-versus-fitted plot.
    • lvr2plot -- graphs a leverage-versus-squared-residual plot.
    • dfbeta -- calculates DFBETAs for all the independent variables in the linear model.
    • avplot -- graphs an added-variable plot, a.k.a. partial regression plot.
  • Tests for Normality of Residuals
    • kdensity -- produces kernel density plot with normal distribution overlaid.
    • pnorm -- graphs a standardized normal probability (P-P) plot.
    • qnorm -- plots the quantiles of varname against the quantiles of a normal distribution.
    • iqr -- resistant normality check and outlier identification.
    • swilk -- performs the Shapiro-Wilk W test for normality.
  • Tests for Heteroscedasticity
    • rvfplot -- graphs residual-versus-fitted plot.
    • hettest -- performs Cook and Weisberg test for heteroscedasticity.
    • whitetst -- computes the White general test for Heteroscedasticity.
  • Tests for Multicollinearity
    • vif -- calculates the variance inflation factor for the independent variables in the linear model.
    • collin -- calculates the variance inflation factor and other multicollinearity diagnostics
  • Tests for Non-Linearity
    • acprplot -- graphs an augmented component-plus-residual plot.
    • cprplot -- graphs component-plus-residual plot, a.k.a. residual plot.
  • Tests for Model Specification
    • linktest -- performs a link test for model specification.
    • ovtest -- performs regression specification error test (RESET) for omitted variables.

Check for multicollinearity

Collinearity: two predictors in the model are near-perfect linear combinations of one another.

Multicollinearity: more than two predictors are involved!

Method 1:

In Stata, after running the regression, type vif to get the variance inflation factors

vif

VIF values greater than 10 need further investigations.

1/VIF (called tolerance value) values smaller than 0.1: need further investigations. In other words, the variable could be seen as a linear combination of other independent variables.

Example of case where multicollinearity exists:

vif = 43

1/vif=0.02

Method 2: Use collin for predictors

Still, look at VIF and tolerance. Also look at the 'condition number' at the bottom of the output. As a common rule of thumb, a condition number of 10 or more suggests that collinearity is a problem.
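
The relation between VIF and tolerance can be sketched in a few lines of Python (not Stata). This assumes the usual definition VIF = 1 / (1 - R-squared), where R-squared comes from regressing one predictor on all the other predictors. The R-squared values below are hypothetical, chosen only to show the cut-offs mentioned above.

```python
def vif_from_r2(r2):
    """VIF of a predictor, given the R-squared obtained by
    regressing that predictor on all other predictors."""
    return 1.0 / (1.0 - r2)


def tolerance(vif):
    """Tolerance is simply 1 / VIF."""
    return 1.0 / vif


# Hypothetical R-squared values (illustration only)
for r2 in (0.50, 0.80, 0.977):
    v = vif_from_r2(r2)
    flag = "needs investigation" if v > 10 else "ok"
    print(f"R2 = {r2:.3f}  VIF = {v:6.2f}  tolerance = {tolerance(v):.3f}  {flag}")
```

A VIF of 43, as in the example above, means that about 98% of the variance of that predictor is explained by the other predictors.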

How to correct for heteroscedasticity in stata

Heteroscedasticity = the situation in which the variance of the residuals is not homogeneous: it changes across groups or waves of data.

Regression analysis usually assumes that the variance of the residuals is homogeneous, but this assumption needs to be checked. If you are not sure about it, you can use the option 'robust' or 'hc3' after 'reg' in Stata.

How to do it?
1) small sample: use the hc3 option in reg (Stata)
reg DEPVAR ListofPredictor, hc3
2) large sample: use the robust option in reg (Stata)
reg DEPVAR ListofPredictor, robust
Read here: http://www.sociology.ohio-state.edu/ptv/faq/heteroscedasticity.htm
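
To see what these options change, here is a rough Python sketch (not Stata) for a regression with a single predictor. It compares the classical standard error of the slope with two heteroscedasticity-consistent versions, HC0 and HC3. The data are invented for illustration, and HC0 is used as a stand-in for the general idea behind 'robust'; Stata's own 'robust' applies an additional degrees-of-freedom correction, so the numbers would not match exactly.

```python
import math


def slope_standard_errors(x, y):
    """Simple OLS of y on x; returns the slope and its classical,
    HC0 and HC3 standard errors."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)

    slope = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Leverage of each observation in a one-predictor regression
    lev = [1.0 / n + d * d / sxx for d in dx]

    # Classical SE: assumes one common residual variance
    s2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)

    # HC0: each observation contributes its own squared residual
    se_hc0 = math.sqrt(sum(d * d * e * e for d, e in zip(dx, resid))) / sxx

    # HC3: squared residuals are inflated by 1 / (1 - leverage)^2
    se_hc3 = math.sqrt(
        sum(d * d * e * e / (1.0 - h) ** 2 for d, e, h in zip(dx, resid, lev))
    ) / sxx

    return slope, se_classical, se_hc0, se_hc3


# Invented data in which the spread of y grows with x
x = list(range(1, 11))
y = [2.1, 3.9, 6.5, 7.2, 11.0, 10.1, 15.8, 13.0, 21.5, 16.4]

slope, se_classical, se_hc0, se_hc3 = slope_standard_errors(x, y)
print(f"slope = {slope:.3f}")
print(f"classical SE = {se_classical:.3f}, HC0 SE = {se_hc0:.3f}, HC3 SE = {se_hc3:.3f}")
```

HC3 is never smaller than HC0 because every squared residual is divided by a number below 1. This extra caution is why HC3 is usually the suggestion for small samples.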

Tuesday

Using predict in Stata after regression

1) predicted values of y (y is the dependent variable); no option needed

predict name

2) residuals

predict name, resid

3) standardized residuals

predict name, rstandard

4) studentized or jackknifed residuals

predict name, rstudent

5) leverage

predict name, leverage     (or: predict name, hat)

6) standard error of the residual

predict name, stdr

7) Cook's D

predict name, cooksd

8) standard error of predicted individual y

predict name, stdf

9) standard error of predicted mean y

predict name, stdp

Measurement–Techniques of Neutralization

Model

Learn basic regression with stata here

http://www.ats.ucla.edu/stat/stata/webbooks/reg/chapter1/statareg1.htm

Friday

Addiction model


Why no results in AMOS after bootstrapping?

You can't see some parts of the AMOS output after running the bootstrap because they only show up when you click on 'Estimates', then 'Scalars'.

'Estimates' shows results under the joint multivariate normality assumption, while the bootstrap panel (a smaller window below the main window) shows the same estimates, but based on 250-2000 random samples drawn (with replacement) from your sample.

The bootstrap creates new 'critical values' for non-normal data, then uses these values to judge the null hypothesis. Under normality, the critical values for the z statistic are -1.96 and +1.96 (at the 5% level). Under non-normal conditions, this technique creates a new 'z' and compares the obtained statistic (whatever it is) with this new one.
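
The resampling idea can be sketched in Python. This is a generic percentile bootstrap for a sample mean, not a reproduction of what AMOS does internally. The data, the number of resamples and the seed are all assumptions chosen for illustration.

```python
import random
import statistics


def bootstrap_percentile_ci(data, n_boot=1000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for the mean.

    Draws n_boot resamples (with replacement, same size as the data),
    computes the mean of each, and reads the interval off the sorted
    bootstrap means instead of relying on +/-1.96 standard errors.
    """
    rng = random.Random(seed)
    n = len(data)
    boot_means = sorted(
        statistics.mean(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Invented, clearly skewed (non-normal) data
sample = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 12, 20]

ci_low, ci_high = bootstrap_percentile_ci(sample)
print(f"sample mean = {statistics.mean(sample):.2f}")
print(f"95% bootstrap CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The interval limits come from the empirical distribution of the resampled means, so they do not have to be symmetric around the sample mean.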

Wednesday

Draw a random sample with drawnorm

Use drawnorm in Stata 11 to draw a random sample from a (multivariate) normal distribution!

corr2data: Make a new var or dataset with correlations/covariance

When correlations/covariances, means or other summary statistics are the only things available, we have to turn them into a new variable or dataset before doing further analysis. Use corr2data in Stata 11.

I guess when one wants to recalculate published statistics, this is one way to do it. Will check it again!

How to detect multivariate outliers using Mahalanobis distance?

Easy! Open the data in Stata 11. Then

 

biplot x1 x2 x3 x4 x5, mahalanobis

A nice graph appears, with observations scaled from about +3 to -3 on both the x and y axes. Observations that lie above +3 or below -3 are outliers!

Next step: Delete them!
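
For a sense of what is being measured, here is a small Python sketch (not Stata) of the Mahalanobis distance for two variables. The points are invented: eight lie close to a straight line and the last one is deliberately off the pattern. Where to put the cut-off for calling a point an outlier is a judgment call and is not settled by this sketch.

```python
import math


def mahalanobis_2d(points):
    """Mahalanobis distance of each (x, y) point from the sample centroid,
    using the sample covariance matrix of the two variables."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n

    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the covariance matrix

    distances = []
    for px, py in points:
        dx, dy = px - mx, py - my
        # Quadratic form d' S^-1 d, written out for the 2x2 case
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(d2))
    return distances


# Invented data: the last point does not follow the pattern of the others
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.9), (5, 5.1),
          (6, 6.0), (7, 6.8), (8, 8.1), (3, 7.5)]

distances = mahalanobis_2d(points)
for p, d in zip(points, distances):
    print(f"{p}: distance = {d:.2f}")
```

Note that the off-pattern point has ordinary values on x and on y taken separately. It stands out only because the distance takes the correlation between the two variables into account.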

Rudyanto: Handling Non-Normal Data in SEM

This is a very good post on handling non-normal data. It helped me with my own CFA model!

Rudyanto: Handling Non-Normal Data in SEM

Saturday

Many tools for sample size determination

This site, http://statpages.org/, provides many tools for sample size determination! Note that in the middle of the page there is a tool for logistic regression!

In many cases, information from previous studies (or a pilot study) is very important for determining sample size.

Friday

Self-training in multilevel modelling & package to deal with missing data :)

Go to this site and register for a FREE course on multilevel modelling! All materials are downloadable. See the guides on getting MLwiN for members of UK universities. Go through modules 1-3 to get the basics of statistics. Look at the full list of books on multilevel modelling.

Note that multilevel modelling can be done with many packages, including Stata!

http://www.bristol.ac.uk/cmm/learning/course.html#pay
http://www.bristol.ac.uk/cmm/software/mlwin/ordering/ac-uk.html

To deal with missing data (2-level missing data), go to this website; the package is freely available:

http://www.bristol.ac.uk/cmm/software/realcom/imputation.html