Introduction to Statistics: A Modeling Approach

Assessing Model Fit with Sum of Squares

Finally, let’s examine the fit of our regression model by running the supernova() function on our model. And at the same time, let’s compare the table we get from the regression model (Height.model) with the one we produced before for the Height2Group.model.



Remember, the total sum of squares is the sum of squared deviations (or more generally, residuals) from the empty model. Total sum of squares is all about the outcome variable, and isn’t affected by the explanatory variable or variables. And when we compare statistical models, as we are doing here, we are always modeling the same outcome variable, so SS Total is the same for every model we compare.
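To make the definition concrete, here is a minimal sketch of the SS Total calculation in plain Python. It is not the supernova() workflow used in this course. It uses the six Thumb values from the small data set listed later in this section, and the variable names (thumb, grand_mean, ss_total) are chosen only for illustration.

```python
# Thumb lengths from the six-person data set used in this section.
thumb = [56, 60, 61, 63, 64, 68]

# The empty model predicts the same value, the Grand Mean, for everyone.
grand_mean = sum(thumb) / len(thumb)

# SS Total: square each residual from the empty model, then add them up.
ss_total = sum((y - grand_mean) ** 2 for y in thumb)

print(grand_mean, ss_total)  # prints 62.0 82.0
```

Notice that no explanatory variable appears anywhere in the calculation, which is why SS Total does not change from one model of Thumb to the next.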

Partitioning Sums of Squares

If you want to follow along with the video, you can paste in the data below.

Height2Group Thumb
0 56
0 60
1 61
0 63
1 64
1 68


Height Thumb
62 56
66 60
67 61
63 63
68 64
71 68


For any model with an explanatory variable (what we have been calling “complex models”), the SS Total can be partitioned into the SS Error and the SS Model. The SS Model is the amount by which the error is reduced under the complex model (e.g., the Height model) compared with the empty model.

As we developed previously for the group models, SS Model is easily calculated by subtracting SS Error from SS Total. The calculation is the same whether you are fitting a group model or a regression model. The only difference is how error is defined: in a group model, error is the residuals from the group means; in a regression model, it is the residuals from the regression line.
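The subtraction can be sketched for both models using the six-person data set above. This is plain Python rather than the course's supernova() function, and it assumes the regression line is fit by ordinary least squares; the helper names (mean, sum_sq) and variable names are hypothetical.

```python
thumb = [56, 60, 61, 63, 64, 68]
group = [0, 0, 1, 0, 1, 1]           # Height2Group
height = [62, 66, 67, 63, 68, 71]    # Height

def mean(values):
    return sum(values) / len(values)

def sum_sq(residuals):
    return sum(r ** 2 for r in residuals)

# SS Total: residuals from the empty model (the Grand Mean).
grand_mean = mean(thumb)
ss_total = sum_sq([y - grand_mean for y in thumb])

# Group model: error is the residual from each person's group mean.
group_means = {
    g: mean([y for y, gi in zip(thumb, group) if gi == g])
    for g in set(group)
}
ss_error_group = sum_sq([y - group_means[g] for y, g in zip(thumb, group)])

# Regression model: error is the residual from the least-squares line.
x_bar = mean(height)
b1 = (sum((x - x_bar) * (y - grand_mean) for x, y in zip(height, thumb))
      / sum((x - x_bar) ** 2 for x in height))
b0 = grand_mean - b1 * x_bar
ss_error_reg = sum_sq([y - (b0 + b1 * x) for x, y in zip(height, thumb)])

# SS Model is whatever part of SS Total the model no longer leaves as error.
ss_model_group = ss_total - ss_error_group
ss_model_reg = ss_total - ss_error_reg

print(round(ss_model_group, 2), round(ss_model_reg, 2))
```

The last two lines are identical in form for both models; only the way the residuals were produced differs.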

It is also possible to calculate the SS Model in the regression model directly, in much the same way we did for the group model. Recall that for the group model, SS Model was the sum of the squared deviations of each person’s predicted score (their group mean) from the Grand Mean. In the regression model, SS Model is calculated in exactly the same way, except that each person’s predicted score is defined as a point on the regression line. The Grand Mean is the same in both cases.
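As a rough check on this idea, the sketch below computes SS Model directly from each model's predictions and confirms that it matches SS Total minus SS Error. It is plain Python on the six-person data set above, not the course's R code; the least-squares fit and the helper ss() are assumptions made for illustration.

```python
import math

thumb = [56, 60, 61, 63, 64, 68]
group = [0, 0, 1, 0, 1, 1]           # Height2Group
height = [62, 66, 67, 63, 68, 71]    # Height

n = len(thumb)
grand_mean = sum(thumb) / n
grand = [grand_mean] * n             # empty-model prediction for each person

# Group model: each person's predicted score is their group mean.
group_means = {}
for g in set(group):
    members = [y for y, gi in zip(thumb, group) if gi == g]
    group_means[g] = sum(members) / len(members)
pred_group = [group_means[g] for g in group]

# Regression model: each predicted score is a point on the least-squares line.
x_bar = sum(height) / n
b1 = (sum((x - x_bar) * (y - grand_mean) for x, y in zip(height, thumb))
      / sum((x - x_bar) ** 2 for x in height))
b0 = grand_mean - b1 * x_bar
pred_reg = [b0 + b1 * x for x in height]

def ss(values, reference):
    """Sum of squared deviations of each value from its paired reference."""
    return sum((v - r) ** 2 for v, r in zip(values, reference))

for name, pred in [("Height2Group", pred_group), ("Height", pred_reg)]:
    ss_model_direct = ss(pred, grand)                    # predictions vs. Grand Mean
    ss_model_subtract = ss(thumb, grand) - ss(thumb, pred)  # SS Total - SS Error
    assert math.isclose(ss_model_direct, ss_model_subtract)
    print(name, round(ss_model_direct, 2))
```

The same ss() call produces SS Model for both models; the only thing that changes is the list of predicted scores passed in.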