Website Worth

Total Pageviews

Monday

Checking for multivariate normality

 

When we run a multiple regression, we assume that the predictors in our model are normally distributed. Not each of them! But the combination of them is normally distributed. That is called ‘multivariate normality’.

We, again, need to check for that. In the Z table, we note that data points that lie below –3 standard deviations or above +3 standard deviations would mean ‘outliers’ because they are too far from the mean (3 sd). Conventionally, we can use these points (-3: +3) to say which observations are ‘out of the range’ when we consider multivariate normality.

zdemo

 

What we need to do now is to combine all predictors and see which points are outliers. This task is extremely difficult, unless we use a computer program called biplot.

 

Graph MV

We now see that observations 8, 265, 266 are potentially outliers, though they are not out of the range. We may delete them if our sample size is large enough, or keep them since these outliers are not severe.