1-10: For Loops

0.1 Changes…

  • Console in R format or no code format??
  • Put find difference in vector values application here.

1 Purpose

  • Show how to perform if-else statements on multiple values in a vector using for loops

2 Questions about the material…

The files for this lesson:

 

If you have any questions about the material in this lesson, feel free to email them to the instructor, Charlie Belinsky, at belinsky@msu.edu.

3 Four structures in programming

As mentioned in the second lesson on variables, there are basically 4 main structures that cover almost every aspect of programming:

  1. Variables

  2. If-Else Statements

  3. For Loops

  4. Functions

In the next three lessons we cover for loops.  One of the main benefits of for loops (and functions) in programming is they it generalize code.  for loops allow you to write code once and execute the code on multiple values. 

4 Conditional statements on multiple values

In the last couple of lessons, we used conditional statements on values in a vector (e.g., asking about the weather conditions or the high temperatures on a specific day) and then output a message to the Console based on the results of the conditional statement.

 

But, when working with data, we rarely look at just one value – usually, we are looking at a vector of values (e.g., 365 high temperatures for a year) and we want to perform the same conditional statement on each of the values. In this lesson and the next lesson, we will learn to use conditional statements on multiple values – using for loops with embedded if-else statements.

5 Repeating commands

A for loop is used whenever you want to execute a codeblock multiple times.  The most basic example is repeating the exact same output a certain number of times:

for(i in 1:5)  # repeat 5 times
{
  cat("Hello, World\n");
}
Figure 1: The most basic for loop – repeating an output command multiple times

And the Console tab will show 5 Hellos:

Repeating the same code multiple times:
Hello, World
Hello, World
Hello, World
Hello, World
Hello, World

5.1 The codeblock attached to the for loop

Just like if-else statements, for loops have an attached codeblock, designated by curly brackets ( {  } ).  If-else statements execute the codeblock conditionally based on the conditional statement in the parentheses and for loops executes a codeblock multiple times.  The number of times for loops execute their attached codeblocks is dependent upon the sequence inside the parentheses.  In Figure 1, the sequence is 1:5, and since the sequence has 5 values, the codeblock attached will execute 5 times. 

 

Note: if the sequence was 15:19, which has 5 values, the attached codeblock will execute 5 times.

6 Repeated code on values

Figure 1 repeats the exact same code 5 times. However, you rarely want to repeat the exact same code – usually, you want to repeat similar code.  For instance, you might want to say hello to 5 different people given by a vector:

helloVector = c("Ann", "Bob", "Charlie", "Dave", "Eve");

To do this we cycle through every value in the vector,  which is helloVector[1], helloVector[2], helloVector[3], helloVector[4], and helloVector[5].

 

To cycle through the values we set the variable (often called i) in the for loop to the values 1 through 5 and then use that variable inside the codeblock for the for loop. i, in this example, indexes helloVector each type the for loop cycles.

for(i in 1:5)
{
  # i takes on the values 1 through 5 through the 5 cycles of the for loop
  cat("Hello," helloVector[i], "\n");
}
Figure 2
Hello, Ann 
Hello, Bob 
Hello, Charlie 
Hello, Dave 
Hello, Eve 

7 The index variable in a for loop

The true power of the for loops lies in the indexing variable, which is often named iNote: the indexing variable can be given any name – it’s just common to call it i.  The indexing variable cycles through the sequence each time the for loop executes. This means the indexing variable can be used to modify the code every time the codeblock is executed.

7.1 The index variable as a counter

Let’s look at the simplest of examples – using the indexing variable, i, as a counter:

# Using the indexing variable i as a counter
for(i in 1:5) # repeat 5 times
{
  cat("The count is:", i, "\n");
}

This outputs:

Using the indexing variable as a counter:
The count is: 1
The count is: 2
The count is: 3
The count is: 4
The count is: 5

The number of times the for loop cycles is the same as the number of values in the sequence. So, we can use a different sequence of 5 values…

cat("---------\nAny sequence of 5 number cycles the for loop 5 times:\n");
for(i in 15:19) # repeat 5 times
{
  cat("The count is:", i, "\n");
}

…and the for loop will still cycle 5 time.

Using the indexing variable as a counter:
The count is: 15
The count is: 16
The count is: 17
The count is: 18
The count is: 19

7.2 Taking a step back: How a for loop cycles

A for loops always consists of 3 things:

  • An indexing variable:  for(i in 1:5) 

  • A sequence of values: for(i in 1:5)

  • A codeblock to execute: encapsulated by the { }

 

The sequence of values determines how many times the codeblock will cycle.  If there are 12 values in the sequence, then the for loop will cycle 12 times.

 

For each cycle, the indexing variable is set to a different value in the sequence.  The value of the indexing variable take will go in the same order as the sequence.

 

So, if the sequence is 12:8 (a backwards sequence with 5 values), then:

  • i will be set to 12 the first time the codeblock cycles

  • i will be set to 11 the first time the codeblock cycles

  • i will be set to 10 the first time the codeblock cycles

  • i will be set to 9 the first time the codeblock cycles

  • i will be set to 8 the first time the codeblock cycles

7.3 Changing the sequence

The for loop executes for every value in the sequence, and the index variable takes on the values in order.

# Using a sequence that does not start with 1
for(i in 12:8) # 12, 11, 10, 9 ,8 (there are five values in the sequence)
{
  cat("The count is:", i, "\n");
}
Figure 3: Using a sequence that does not start with 1 in a for loop

In this case, the for loop will still execute 5 times, but i will take on values 12 down to 8:

Changing the sequence numbers -- length is still 5:
The count is: 12
The count is: 11
The count is: 10
The count is: 9
The count is: 8

7.4 Indexing variables and more complicated sequence

You do not have to name the index i, you can name it whatever you want.  For instance, we will be working with weather so day might be a more appropriate name for the indexing variable.

 

In this example we we use a more complicated sequence and change the name of the index variable to day:

for(day in seq(from=20, to=2, by=-3))  # 20, 17, 14, 11, 8, 5, 2 (seven values)
{
  cat("The count is:", day, "\n");
}
Figure 4: Using a more complicated sequence in a for loop
Using a more complicated sequence with seven values:
The count is: 20
The count is: 17
The count is: 14
The count is: 11
The count is: 8
The count is: 5
The count is: 2

So, in Figure 4, the codeblock cycles 7 times because there are 7 values in the sequence (20, 17, 14, 11, 8, 5, and 2).  Each time the codeblock cycles,  day is set to a different value in the sequence (going from the first value in the sequence to the last).

8 Indexing values inside a for loop

Most examples so far use the index variable as a counter. The real power comes when you use the indexing variable to index values in a vector as we did in Figure 2. This is how we perform an if-else statement on every value in a vector.

 

Let’s use the same data we have been using in the previous lessons:

### read in data from twoWeekWeatherData.csv
weatherData = read.csv(file="data/twoWeekWeatherData.csv",
                       sep=",",
                       header=TRUE);

### Extract the highTemps column from the data frame -- save it to a variable
highTemps = weatherData$highTemp;
noonCond = weatherData$noonCondition;

Since we want to go through every value in the vector, we need to know the size of the vector.  The easiest way to find the size of a vector is to use the function length():

numDays = length(highTemps);  # numDays is 14 (the number of values in highTemps)

Now let’s use a for loop to output all of the highTemp values.  To do this we use the indexing variable, day, as the index for highTemp:

for(day in 1:numDays)  # numDays, in this case, is 14 (sequence is 1:14)
{
  cat(highTemp[day], "\n");  # day will take the values 1-14
}
Figure 5: Using the indexing variable as an index for highTemp
The 14 values in the vector highTemps in order:
57
50
54
40
39
58
60
53
55
44
39
54
61
75

In the following example, we use the indexing variable day as both a counter and an index for noonCond:

for(day in 1:numDays)
{
  cat(day, ") ", noonCond[day], "\n", sep="");
}
Figure 6: Using indexing variable as a counter and an index for noonCond
Using indexing variable as a counter and an index for noonCond:
1) Cloudy
2) Cloudy
3) Sunny
4) Rain
5) Fog
6) Sunny
7) Sunny
8) Cloudy
9) Rain
10) Rain
11) Snow
12) Sunny
13) Sunny
14) Sunny

8.1 Making code more extensible

In the code in Figure 5 and Figure 6,  we already know that the number of values in the vector is 14 so we could just write:

for(day in 1:14)
{
  cat(highTemp[day], "\n");
}

But this code will only work if the data frame does not change size.  In general, you want to make your code as generic as possible, which means using variables instead of fixed numbers. When you make your code generic, you do not have to change the code when your data changes.  In Figure 5, numDays will automatically adjust to the size of the vector.

9 If-else within a for loop

We usually have a lot of questions that we want to ask of the values we are looking at. In other words, we want to use if-else statements within the for loops.  Let’s first ask if each day was sunny or not and output the answer to the Console.

for(i in 1:numDays)
{
  if(noonCond[i] == "Sunny")
  {
    cat("day ", i, " was sunny\n", sep="");
  }else
  {
    cat("day ", i, " was not sunny\n", sep="");
  }
}
Figure 7: Using if-else statements in a for loop
Using if-else statements within a for loop:
day 1 was not sunny
day 2 was not sunny
day 3 was sunny
day 4 was not sunny
day 5 was not sunny
day 6 was sunny
day 7 was sunny
day 8 was not sunny
day 9 was not sunny
day 10 was not sunny
day 11 was not sunny
day 12 was sunny
day 13 was sunny
day 14 was sunny

10 Checking a subset of values

We can use any type of sequence in the for loop so that we are only checking some of the values in the vector:

for(i in c(8,2,13))
{
  if(noonCond[i] == "Sunny")
  {
    cat("day ", i, " was sunny\n", sep="");
  }else
  {
    cat("day ", i, " was not sunny\n", sep="");
  }
}
Figure 8: Using the sequence to subset the values
Using a sequence to subset the values:
day 8 was not sunny
day 2 was not sunny
day 13 was sunny

11 State variables and for loops

The code in Figure 7 is displaying specific information about each value in a vector.  This would get quickly out of hand if there are thousands of values!  We usually want more general information about a vector – for instance, we might want to know how many days were sunny?

 

Finding more general information about the values in a vector requires the introduction of the last component of most for loops: the state variable.  The state variable is a variable that maintains a running tab of some value through each cycle of the for loop.

 

Let’s count the number of sunny days using a for loop.  To do this we need to:

  • create a state variable that holds the number of sunny days (this will start at 0)

  • have a for loop that cycles through every value in the noonCond vector

  • include an if() statement asking if the day is sunny

  • have a command to increase the state variable by 1 if it is sunny

 

Note: The state variable must be declared outside of the for loop.  Think about why this is true – it is the first question in the Application.

 

sunnyDays = 0; # state variable -- will hold the count of cloudy days
for(i in 1:numDays)
{
  if(noonCond[i] == "Sunny")  # was the day sunny
  {
    sunnyDays = sunnyDays +1;   # it was -- increase sunnyDays by 1
  }
  # there is no else here -- we don't care about non-sunny days
}
Figure 9: Using a state variable to calculate how many sunny days there was in noonCond

And we can look in the Environment tab to see what the final value of sunnyDays was:

sunnyDays:   6

And if we look at the noonCond vector, we see there are 6 sunny days:

> noonCond
 [1] "Cloudy" "Cloudy" "Sunny"  "Rain"   "Fog"    "Sunny"  "Sunny"  "Cloudy"
 [9] "Rain"   "Rain"   "Snow"   "Sunny"  "Sunny"  "Sunny"

11.1 The state variable

sunnyDays does a lot of heavy lifting in Figure 9.  We initially set sunnyDays to 0 because 0 is the default value, or the value if the condition (noonCond[i] == "Sunny") fails every time.

 

Whenever the value in noonCond is “Sunny” the following code is executed:

   sunnyDays = sunnyDays +1;

This code works because the right side of the equation is always evaluated first.

 

So, R will first calculate: sunnyDays +1

  • If sunnyDays is 0, then sunnyDays +1 evaluates to 1

 

Then R assigns the value on the right-side to the variable on the left, which is sunnyDays

  • sunnyDays is assigned the value of 1.

 

If sunnyDays is 4, then sunnyDays +1 evaluates to 5, and sunnyDays is assigned the value of 5.

12 Application:

1) In Comments: Why must the state variable be declared outside the for loop as opposed to inside the for loop?  Think about what happens to the state variable every time the for loop cycles if it is declared inside the for loop.  Put the answer in comments at the top of your script.

 

2) Create one for loop that outputs the (a) square (2nd power), cube (3rd power), and cube root (1/3rd power) for the numbers 1 through 10.

 

3) Using a for loop, find out how many days had high temperatures less than 50.

 

4) Using a for loop, find out how many even days were cloudy

 

5) Using a for loop, find out how many of the last 8 days were cloudy.

 

6) In comments: If you are using a for loop to add up all values in a vector then what value should your state variable start with? Why?

 

7) Challenge: Using for loops, find the total precipitation (i.e., add up all the precipitation values).  Hint: you keep adding each new precipitation value to the total precipitation.

 

 

Save the script as app1-10.r in your scripts folder and  email your Project Folder to Charlie Belinsky at belinsky@msu.edu.

 

Instructions for zipping the Project Folder are here.

 

If you have any questions regarding this application, feel free to email them to Charlie Belinsky at belinsky@msu.edu.

12.1 Questions to answer

Answer the following in comments inside your application script:

  1. What was your level of comfort with the lesson/application?

  2. What areas of the lesson/application confused or still confuses you?

  3. What are some things you would like to know more about that is related to, but not covered in, this lesson?