12.1 What You Have Learned About Modeling Variation
At first, this was all we could do. You could spot a relationship in a graph, but you weren’t able to quantify it. But now you can do so much more! You can actually specify and fit models to the data, and figure out how strong the relationship is!
Below we’ve redrawn the graphs to include the best-fitting regression line. (You totally could have done this yourself; feel free to go back and try it if you want.)
Use the DataCamp window below to find the best-fitting estimates for these two models.
# load packages
library(ggformula)
library(mosaic)
library(supernova)
library(Lock5Data)
library(Lock5withR)
library(okcupiddata)
library(fivethirtyeight)  # provides the hate_crimes data frame

# find and print the best-fitting estimates for the unemployment model
lm(avg_hatecrimes_per_100k_fbi ~ share_unemp_seas, data = hate_crimes)

# find and print the best-fitting estimates for the income model
lm(avg_hatecrimes_per_100k_fbi ~ median_house_inc, data = hate_crimes)
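If you want to work with the estimates rather than just print them, one option is to pull them out of the fitted model with coef(). The sketch below uses a small made-up data frame (toy, with columns x and y, all hypothetical) instead of hate_crimes so that it runs with base R alone; the same calls should apply to any model fit with lm().

```r
# Toy data, made up for illustration only (not the hate_crimes data)
set.seed(1)
toy <- data.frame(x = runif(50, min = 0, max = 10))
toy$y <- 2 + 0.5 * toy$x + rnorm(50, sd = 1)

# Fit the model; lm() finds the estimates that minimize the sum of squared errors
toy.model <- lm(y ~ x, data = toy)

# coef() returns the estimates as a named vector
b0 <- coef(toy.model)[["(Intercept)"]]  # intercept
b1 <- coef(toy.model)[["x"]]            # slope

# A prediction is the intercept plus the slope times x
manual_pred <- b0 + b1 * toy$x

# predict() returns the same values as the hand calculation
all.equal(unname(predict(toy.model)), manual_pred)
```

Saving the estimates as b0 and b1 makes it easy to check that the model's predictions really are just the regression equation applied to each row.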
Just from our visualizations, we got the impression that median household income would explain more of the variation in hate crimes than unemployment would. But median household income has a very small slope: 0.00006, compared to 11.99 for unemployment.
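One reason the two slopes are hard to compare is that a slope depends on the units of the explanatory variable. A minimal sketch of this idea, using made-up data (the data frame toy and its columns income_dollars, income_thousands, and outcome are all hypothetical): expressing the same variable in thousands of dollars makes the slope 1000 times larger, yet the proportion of variation explained does not change.

```r
# Toy data, made up for illustration only (not the hate_crimes data)
set.seed(2)
toy <- data.frame(income_dollars = runif(50, min = 35000, max = 70000))
toy$outcome <- 1 + 0.00005 * toy$income_dollars + rnorm(50, sd = 1)

# Same explanatory variable, expressed in thousands of dollars
toy$income_thousands <- toy$income_dollars / 1000

dollars.model   <- lm(outcome ~ income_dollars, data = toy)
thousands.model <- lm(outcome ~ income_thousands, data = toy)

# The slope is 1000 times larger when income is in thousands...
slope_dollars   <- coef(dollars.model)[["income_dollars"]]
slope_thousands <- coef(thousands.model)[["income_thousands"]]
all.equal(slope_thousands, slope_dollars * 1000)

# ...but the proportion of variation explained is the same
all.equal(summary(dollars.model)$r.squared,
          summary(thousands.model)$r.squared)
```

So a tiny slope on its own says little about how much variation a model explains; it may simply reflect a variable measured in small units.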
You know by now that to get these statistics you will need to examine the ANOVA tables for the two models. How did you ever get along without the supernova() function before? Use the DataCamp window below (where we have fit two models for you: unemp.model and income.model) to get the supernova() tables for the two models.
# load packages
library(ggformula)
library(mosaic)
library(supernova)
library(Lock5Data)
library(Lock5withR)
library(okcupiddata)
library(fivethirtyeight)  # provides the hate_crimes data frame

# this code fits the models
unemp.model <- lm(avg_hatecrimes_per_100k_fbi ~ share_unemp_seas, data = hate_crimes)
income.model <- lm(avg_hatecrimes_per_100k_fbi ~ median_house_inc, data = hate_crimes)

# print the supernova table for unemp.model
supernova(unemp.model)

# print the supernova table for income.model
supernova(income.model)
From the two models we fit here, we would say that the income model explains more variation in hate crimes than does the unemployment model. States with higher household incomes seem to report more hate crimes to the FBI. This relationship isn’t perfectly predictive, but it does explain 10% of the total error around the empty model.
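The "proportion of total error explained" can also be worked out by hand, which may help make the number less mysterious. The sketch below assumes made-up data (toy, x, and y are hypothetical names) and uses only base R: it compares the sum of squared errors from the empty model with the sum of squared errors from a model that includes an explanatory variable.

```r
# Toy data, made up for illustration only (not the hate_crimes data)
set.seed(3)
toy <- data.frame(x = rnorm(60))
toy$y <- 3 + 0.4 * toy$x + rnorm(60)

# The empty model predicts the mean of y for every row
empty.model <- lm(y ~ 1, data = toy)

# The more complex model adds x as an explanatory variable
x.model <- lm(y ~ x, data = toy)

# Sum of squared errors around each model's predictions
ss_total <- sum(resid(empty.model)^2)
ss_error <- sum(resid(x.model)^2)

# Proportional reduction in error relative to the empty model
pre <- (ss_total - ss_error) / ss_total

# For a model like this one, PRE matches the R-squared from summary()
all.equal(pre, summary(x.model)$r.squared)
```

A PRE of 0.10 would mean the model's squared error is 10% smaller than the empty model's, which is the sense in which a model "explains" variation.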