1.4 Introduction to R Functions

So far you know how to print some words and do some basic arithmetic in R. One of the great things about R is that there are a lot of built-in commands that you can use. These are called functions. Some functions come with R itself, and many more are written by the open-source community. You have already seen two functions in action, print() and sum().

Functions have two basic parts. The first part is the name of the function (e.g., sum). The second part is the input to the function, which goes inside the parentheses. We call these inputs arguments. Here we’ve put some instructions (as comments) into the code window. Write your code on a new line under each comment. See if your code works by clicking <Run>. If it works, click <Submit>.

    # Use the sum() function to add the numbers 5, 10, 15
    sum(5, 10, 15)

    # Use the print() function to print the word "hello"
    print("hello")

Notice that the actual R code consists of the lines you wrote in the code window, such as sum(5, 10, 15) or print("hello"). The output or result of the code (e.g., 30) appears in a new area underneath the buttons after you click <Run>.
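To make the two parts of a function concrete, here is a small sketch that labels the name and the arguments in a few calls. Besides sum() and print(), it uses max() and sqrt(), two functions that come with R but are not part of this page's exercises. It also uses the arrow <- to store a result under a name, which we have not introduced yet; treat both as assumptions made only for illustration.

```r
# Function name: sum. Arguments: 5, 10, 15.
total <- sum(5, 10, 15)
print(total)    # [1] 30

# Function name: max. Arguments: 3, 9, 4. It returns the largest argument.
largest <- max(3, 9, 4)
print(largest)  # [1] 9

# Function name: sqrt. It takes a single argument, 16.
root <- sqrt(16)
print(root)     # [1] 4
```

In each line the pattern is the same: the name comes first, and the arguments go inside the parentheses, separated by commas.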

R is Picky; Sorry About That!

One thing to be aware of is that R is very, very picky. For example, if you type sum(1,100) it will tell you the answer, 101. But if you type Sum(1,100), capitalizing the “s,” it will act like it has no idea what you are talking about!

To take another example: in the print() function, if we left off the quotation marks, typing print(hello) instead of print("hello"), R would return an error message. Let us show you what we mean.

    # Run the code below by pressing Run
    # Now debug the code - fix the mistake and press Run again
    Sum(1,2)

    # Corrected code
    sum(1,2)

If a human treated you this way it would be infuriating! A human would figure out what you meant. But R, a computer program, is not able to do that. It assumes you mean exactly what you type.
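If you would like to see both kinds of error side by side without your code stopping, the sketch below may help. It relies on tryCatch() and conditionMessage(), two functions that come with R but that this course has not introduced; we use them here only to catch each error and keep its message. The names msg_capital and msg_quotes are made up for this example, and the sketch assumes a fresh R session in which nothing called hello or Sum has been created.

```r
# Capital "S": R looks for a function named Sum and does not find one.
msg_capital <- tryCatch(Sum(1, 100), error = function(e) conditionMessage(e))
print(msg_capital)

# Missing quotation marks: R looks for an object named hello, not the word "hello".
msg_quotes <- tryCatch(print(hello), error = function(e) conditionMessage(e))
print(msg_quotes)

# The corrected versions run without complaint.
print(sum(1, 100))  # [1] 101
print("hello")      # [1] "hello"
```

In an English-language R session the first message typically says that the function "Sum" could not be found, and the second that the object 'hello' was not found. The exact wording can vary with your version of R.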

Here’s another example. Watch what happens if you forget to put in the closing parenthesis in an R function.

    # Try running this code that has left off the parenthesis at the end
    # Now fix the code (by adding the closing parenthesis) and Run again
    sum(50, 100

    # Corrected code
    sum(50, 100)

If you forget a parenthesis, R will give you an error. Sometimes R will drive you crazy, sending you off looking for tiny little mistakes that are holding it up. Argh!
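For the curious, here is one possible way to look at the missing-parenthesis error without breaking the rest of your code. This is only a sketch: it hands the incomplete code to R as text using parse(), and it catches the complaint with tryCatch(). Neither function has been taught in this course, and the names msg_paren and fixed are made up for this example.

```r
# Ask R to read the incomplete code; keep the error message instead of stopping.
msg_paren <- tryCatch(
  parse(text = "sum(50, 100"),
  error = function(e) conditionMessage(e)
)
print(msg_paren)

# With the closing parenthesis added, the same code reads and runs fine.
fixed <- eval(parse(text = "sum(50, 100)"))
print(fixed)  # [1] 150
```

The message usually points to the place where R ran out of code while still waiting for the closing parenthesis.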

R Functions and Packages

You might be wondering, “Where do all these functions come from?” Many R functions are written by people in the R community—in other words, other people who use R. People share functions and example data sets with each other by releasing R packages which can be downloaded and installed, much like you install apps on your computer or phone.

R packages, thousands of them, are available in an online repository called CRAN. We use several R packages in this course, and some of them have been written specifically to help students learn and use R more easily. The mosaic package is an example of a package written by educators. They thought about different functions that would be helpful to students and put them all together into a package.

For this course, you really don’t need to worry about all this. We will pre-install in the code windows all the packages we expect you to use, so you don’t need to install them. But it’s important for you to understand where packages come from, because if you decide to install RStudio on your own computer, you may find some of the functions you were taught to use in the course don’t work! The reason is simply that the packages haven’t been installed.
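If you do end up working on your own computer, a check along the following lines can tell you whether a package is available before you try to use it. This is a minimal sketch that uses the mosaic package as the example; requireNamespace(), install.packages(), and library() are standard R functions, while the name mosaic_installed is made up here. The sketch only reports what it finds and does not install anything.

```r
# TRUE if the mosaic package can be found on this computer, FALSE otherwise.
mosaic_installed <- requireNamespace("mosaic", quietly = TRUE)

if (mosaic_installed) {
  print("mosaic is installed; load it with library(mosaic)")
} else {
  print("mosaic is missing; install it once with install.packages('mosaic')")
}
```

Installing a package is something you do once per computer, whereas loading it with library() is something you do each time you start a new R session.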

Speaking of the mosaic package, here’s a fun little function written by the educators behind it. Knowing that statistics instructors often ask their students to consider probabilities from flipping coins, they wrote a function called rflip() that makes it easy to simulate a coin flip in R.

    require(coursekata)

    # Try running rflip() to see what it does.
    rflip()

If you are only going to flip one coin one time you could just as easily use a real coin. But if you want to flip a coin many times and save all the results, it makes sense to let the computer do it for you. You can input any number of coin flips into rflip(). So rflip(3) would give you the results of three simulated coin flips.

    require(coursekata)

    # Modify this code to simulate 10 coin flips.
    rflip()

    # Modified code
    rflip(n = 10)

You may want to run rflip(10) a few times to see that R does not come up with the same number of heads every time it flips 10 coins, just as real coin flips would not. Later on in this course, we’ll tackle this question: Why is the probability of heads always .5 when the actual proportion of heads in a sample of coin flips is not always .5?
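To see this variation in one go, the sketch below repeats a set of 10 simulated flips five times and counts the heads in each set. Note that it does not use rflip(): so that it runs without any packages, it swaps in sample() and replicate(), two functions that come with R. The name heads_counts is made up for this example, and your counts will differ from run to run because the flips are random.

```r
# Flip 10 fair coins and count the heads; repeat the whole thing 5 times.
heads_counts <- replicate(
  5,
  sum(sample(c("H", "T"), size = 10, replace = TRUE) == "H")
)

# Number of heads in each set of 10 flips.
print(heads_counts)

# The same results as proportions of heads.
print(heads_counts / 10)
```

Each of the five numbers falls somewhere between 0 and 10, and they will rarely all equal 5, even though each simulated coin is fair.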

Trial and Error, and the Culture of Programming

Earlier we talked about the culture of math. Many students expect the teacher to teach them the right steps to follow for solving problems, and assume that their job is to remember the steps. We made the point that this isn’t a very useful way of thinking about math. It’s also not going to help you learn programming.

The best way to learn programming is to try things and see what happens. Write some code, run it, and think about why it didn’t work! (Sorry to be negative, but often things don’t work the first time.) There are so many ways to make tiny mistakes in programming (e.g., writing an uppercase letter when you need a lowercase letter). We often have to find these bugs by trial and error.

Trial and error can be frustrating if we are not used to learning this way, and it may seem inefficient. But trial and error is a great way to learn because we learn from wrong answers as well as right ones. In this course we might sometimes ask you to run code that is wrong just to see what happens!

By embracing the process of trial and error you will be learning about a whole new way of thinking and about the culture of programming. It will not always go in a straight line, getting better and better, but will be more like experimenting and exploring, making discoveries as you go. The benefit of exploring is that you will get a more thorough sense of R and statistics!