Course Outline

segmentGetting Started (Don't Skip This Part)

segmentStatistics and Data Science: A Modeling Approach

segmentPART I: EXPLORING VARIATION

segmentChapter 1  Welcome to Statistics: A Modeling Approach

segmentChapter 2  Understanding Data

segmentChapter 3  Examining Distributions

segmentChapter 4  Explaining Variation

segmentPART II: MODELING VARIATION

segmentChapter 5  A Simple Model

segmentChapter 6  Quantifying Error

segmentChapter 7  Adding an Explanatory Variable to the Model

segmentChapter 8  Models with a Quantitative Explanatory Variable

segmentPART III: EVALUATING MODELS

segmentChapter 9  Distributions of Estimates

segmentChapter 10  Confidence Intervals and Their Uses

segmentChapter 11  Model Comparison with the F Ratio

segmentChapter 12  What You Have Learned

segmentFinishing Up (Don't Skip This Part!)

segmentResources
list Introduction to Statistics: A Modeling Approach
1.3 Introduction to R Functions
So far you know how to print some words and do some basic arithmetic in R. One of the great things about R is that there are a lot of built in commands that you can use. These are called functions. Functions are written by the open source community. You have already seen two functions in action, print()
and sum()
.
Functions have two basic parts. The first part is the name of the function (e.g., sum). The second part is the input to the function, which goes inside the parentheses. We call these inputs arguments. Here we’ve put in some instructions (as comments) into the code window. Write your code as a new line under each comment.
# Use the sum() function to add the numbers 5, 10, 15
# Use the print() function to print the word "hello"
# Use the sum() function to add the numbers 5, 10, 15
sum(5, 10, 15)
# Use the print() function to print the word "hello"
print("hello")
ex() %>% {
check_function(., 'sum') %>% {
check_arg(., "...") %>% check_equal()
check_result(.) %>% check_equal()
}
check_function(., 'print') %>% check_result() %>% check_equal()
}
Notice that the actual R code are the lines you wrote in the script.R window, such as sum(5,10,15)
or print("hello")
. The output or result of the code (e.g., 30) appears in the R Console.
Quirks in DataCamp (Some Patience Required)
DataCamp can be a little quirky. It’s a web application, and so it relies on a server in the background to get everything up and ready to run.
One thing that may be helpful is to click in the script window before you click Run, and then watch for it to say “Workspace Ready” next to the Submit button before you click Run.
If it seems really broken, you can try reloading your browser page. Sometimes just waiting a few minutes will solve the problem.
R is Picky; Sorry About That!
One thing to be aware of is that R is very, very picky. For example, if you type sum(1,100)
it will tell you the answer, 101. But if you type Sum(1,100)
, capitalizing the “s,” it will act like it has no idea what you are talking about!
To take another example: in the print()
function, if we left off the quotation marks, typing print(hello)
instead of print("hello")
, R would return an error message. Let us show you what we mean.
# Run the code below by pressing Run
# Now debug the code  fix the mistake and press Run
Sum(1,2)
# Try running the code below by pressing Run
# Now try debugging the code  fix the mistake press Run again
sum(1,2)
ex() %>% check_function('sum') %>% check_result() %>% check_equal()
If a human treated you this way it would be infuriating! A human would figure out what you meant. But R, a computer program, is not able to do that. It assumes you mean exactly what you type.
Here’s another example. Watch what happens if you forget to put in the close parenthesis in an R function.
# Try running this code that has left off the parenthesis at the end
# Now fix the code (by adding the closing parenthesis) and Run again
sum(50, 100
sum(50,100)
ex() %>% check_function('sum') %>% check_result() %>% check_equal()
ex() %>% check_error()
If you forget a parenthesis, R will give you an error. Sometimes R will drive you crazy, sending you off looking for tiny little mistakes that are holding it up. Argh!
You’ll learn a lot of functions as you progress through this course. It may be helpful to keep track of them in a notebook. We’ve also provided an R cheat sheet that you can mark up as you go through the course. (You can link to it here, and you can also see it in Resources section of the course.)
You Can’t Possibly Memorize All of R’s Functions
Even though you will learn a lot, there are literally thousands of functions in R, more than anyone could remember. And, there are often many functions that do similar things. Even advanced users of R can’t remember it all.
What we do—and you can do it too—is just search on the internet for functions we can’t remember. Not only will you find some new functions, but you’ll also find endless discussions about which ones are better than others. Oh, what fun!
R Functions and Packages
You might be wondering, “Where do all these functions come from?” Many R functions are written by people in the R community—in other words, other people who use R. People share functions and example data sets with each other by releasing R packages which can be downloaded and installed, much like you install apps on your computer or phone.
R packages—thousands of them—are available in an online repository called CRAN. We use several R packages in this course, some of them have been written specifically to help students learn and use R more easily. Mosaic is an example of a package written by educators. They thought about different functions that would be helpful to students and put them all together into a package.
For this course, you really don’t need to worry about all this. We will preinstall in the code windows all the packages we expect you to use, so you don’t need to install them. But it’s important for you to understand where packages come from, because if you decide to install RStudio on your own computer, you may find some of the functions you were taught to use in the course don’t work! The reason is simply that the packages haven’t been installed.
Speaking of the mosaic package, here’s a fun little function written by the educators behind the Mosaic package. Knowing that statistics instructors often ask their students to consider probabilities from flipping coins, they wrote a function called rflip()
that makes it easy to simulate a coin flip in R.
require(mosaic)
# Try running rflip() to see what it does.
rflip()
# Try running rflip() to see what it does.
rflip()
ex() %>% check_function("rflip")
If you are only going to flip one coin one time you could just as easily use a real coin. But if you want to flip a coin many times and save all the results, it makes sense to let the computer do it for you. You can input any number of coin flips into rflip()
. So rflip(3)
would give you the results of three simulated coin flips.
require(mosaic)
# Modify this code to simulate 10 coin flips.
rflip()
# Modify this code to simulate 10 coin flips.
rflip(n = 10)
ex() %>% check_function('rflip')%>% check_arg('n') %>% check_equal()
You may want to run rflip(10)
a few times to see that every time R flips 10 coins, it does not come up with the same number of heads just like real flips of coins would not give rise to the same number of heads. Later on in this course, we’ll tackle this question: Why is the probability of heads always .5 when the actual proportion of heads in a sample of coin flips is not always .5?
Trial and Error, and the Culture of Programming
Earlier we talked about the culture of math. Many students expect the teacher to teach them the right steps to follow for solving problems, and assume that their job is to remember the steps. We made the point that this isn’t a very useful way of thinking about math. It’s also not going to help you learn programming.
The best way to learn programming is to try things and see what happens. Write some code, run it, and think about why it didn’t work! (Sorry to be negative, but often things don’t work the first time.) There are so many ways to make tiny mistakes in programming (e.g., writing an uppercase letter when you need a lowercase letter). We often have to find these bugs by trial and error.
Trial and error can be frustrating if we are not used to learning this way, and it may seem inefficient. But trial and error is a great way to learn because we learn from wrong answers as well as right ones. In this course we might sometimes ask you to run code that is wrong just to see what happens!
By embracing the process of trial and error you will be learning about a whole new way of thinking and about the culture of programming. It will not always go in a straight line, getting better and better, but will be more like experimenting and exploring, making discoveries as you go. The benefit of exploring is that you will get a more thorough sense of R and statistics!