1-10: For Loops
1 Purpose
- Show how to perform if-else statements on multiple values in a vector using for loops
2 Questions about the material…
The files for this lesson:
Script: you can download the script here
If you have any questions about the material in this lesson, feel free to email them to the instructor, Charlie Belinsky, at belinsky@msu.edu.
3 Four structures in programming
As mentioned in the second lesson on variables, there are basically 4 main structures that cover almost every aspect of programming:
Variables
If-Else Statements
For Loops
Functions
In the next three lessons we cover for loops. One of the main benefits of for loops (and functions) in programming is that they generalize code: for loops allow you to write code once and execute it on multiple values.
4 Conditional statements on multiple values
In the last couple of lessons, we used conditional statements on values in a vector (e.g., asking about the weather conditions or the high temperatures on a specific day) and then output a message to the Console based on the results of the conditional statement.
But, when working with data, we rarely look at just one value – usually, we are looking at a vector of values (e.g., 365 high temperatures for a year) and we want to perform the same conditional statement on each of the values. In this lesson and the next lesson, we will learn to use conditional statements on multiple values – using for loops with embedded if-else statements.
5 Repeating commands
A for loop is used whenever you want to execute a codeblock multiple times. The most basic example is repeating the exact same output a certain number of times:
for(i in 1:5) # repeat 5 times
{
  cat("Hello, World\n");
}
And the Console tab will show 5 Hellos:
Repeating the same code multiple times:
Hello, World
Hello, World
Hello, World
Hello, World
Hello, World
5.1 The codeblock attached to the for loop
Just like if-else statements, for loops have an attached codeblock, designated by curly brackets ( { } ). If-else statements execute the codeblock conditionally, based on the conditional statement in the parentheses, whereas for loops execute the codeblock multiple times. The number of times a for loop executes its attached codeblock depends on the sequence inside the parentheses. In Figure 1, the sequence is 1:5, and since the sequence has 5 values, the attached codeblock will execute 5 times.
Note: if the sequence were 15:19, which also has 5 values, the attached codeblock would still execute 5 times.
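One way to check how many times a for loop will cycle is to ask R how many values are in the sequence. The following is a minimal sketch; the variable names seqA and seqB are made up for illustration and are not part of the lesson script.

```r
# Two sequences with different values but the same number of values
seqA = 1:5;     # 1, 2, 3, 4, 5
seqB = 15:19;   # 15, 16, 17, 18, 19

# length() gives the number of values in each sequence, which is
# the number of times an attached codeblock would execute
cat("Values in 1:5:", length(seqA), "\n");
cat("Values in 15:19:", length(seqB), "\n");
```

Both sequences have 5 values, so a for loop using either one would cycle 5 times.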
6 Repeated code on values
Figure 1 repeats the exact same code 5 times. However, you rarely want to repeat the exact same code – usually, you want to repeat similar code. For instance, you might want to say hello to 5 different people given by a vector:
helloVector = c("Ann", "Bob", "Charlie", "Dave", "Eve");
To do this we cycle through every value in the vector, which is helloVector[1], helloVector[2], helloVector[3], helloVector[4], and helloVector[5].
To cycle through the values, we set the variable (often called i) in the for loop to the values 1 through 5 and then use that variable inside the for loop's codeblock. In this example, i indexes helloVector each time the for loop cycles.
for(i in 1:5)
{
  # i takes on the values 1 through 5 through the 5 cycles of the for loop
  cat("Hello,", helloVector[i], "\n");
}
Hello, Ann
Hello, Bob
Hello, Charlie
Hello, Dave
Hello, Eve
7 The index variable in a for loop
The true power of for loops lies in the indexing variable, which is often named i. Note: the indexing variable can be given any name; it is just common to call it i. The indexing variable takes on the next value in the sequence each time the for loop cycles. This means the indexing variable can be used to modify the code every time the codeblock is executed.
7.1 The index variable as a counter
Let’s look at the simplest of examples – using the indexing variable, i, as a counter:
# Using the indexing variable i as a counter
for(i in 1:5) # repeat 5 times
{
  cat("The count is:", i, "\n");
}
This outputs:
Using the indexing variable as a counter:
The count is: 1
The count is: 2
The count is: 3
The count is: 4
The count is: 5
The number of times the for loop cycles is the same as the number of values in the sequence. So, we can use a different sequence of 5 values…
cat("---------\nAny sequence of 5 numbers cycles the for loop 5 times:\n");
for(i in 15:19) # repeat 5 times
{
  cat("The count is:", i, "\n");
}
…and the for loop will still cycle 5 times:
---------
Any sequence of 5 numbers cycles the for loop 5 times:
The count is: 15
The count is: 16
The count is: 17
The count is: 18
The count is: 19
7.2 Taking a step back: How a for loop cycles
A for loop always consists of 3 things:
An indexing variable: for(i in 1:5)
A sequence of values: for(i in 1:5)
A codeblock to execute: encapsulated by the { }
The sequence of values determines how many times the codeblock will cycle. If there are 12 values in the sequence, then the for loop will cycle 12 times.
For each cycle, the indexing variable is set to a different value in the sequence. The indexing variable takes on the values in the same order as the sequence.
So, if the sequence is 12:8 (a backwards sequence with 5 values), then:
i will be set to 12 the first time the codeblock cycles
i will be set to 11 the second time the codeblock cycles
i will be set to 10 the third time the codeblock cycles
i will be set to 9 the fourth time the codeblock cycles
i will be set to 8 the fifth time the codeblock cycles
7.3 Changing the sequence
The for loop executes for every value in the sequence, and the index variable takes on the values in order.
# Using a sequence that does not start with 1
for(i in 12:8) # 12, 11, 10, 9, 8 (there are five values in the sequence)
{
  cat("The count is:", i, "\n");
}
In this case, the for loop will still execute 5 times, but i will take on values 12 down to 8:
Changing the sequence numbers -- length is still 5:
The count is: 12
The count is: 11
The count is: 10
The count is: 9
The count is: 8
7.4 Indexing variables and more complicated sequences
You do not have to name the index variable i; you can name it whatever you want. For instance, we will be working with weather, so day might be a more appropriate name for the indexing variable.
In this example, we use a more complicated sequence and change the name of the index variable to day:
for(day in seq(from=20, to=2, by=-3)) # 20, 17, 14, 11, 8, 5, 2 (seven values)
{
  cat("The count is:", day, "\n");
}
Using a more complicated sequence with seven values:
The count is: 20
The count is: 17
The count is: 14
The count is: 11
The count is: 8
The count is: 5
The count is: 2
So, in Figure 4, the codeblock cycles 7 times because there are 7 values in the sequence (20, 17, 14, 11, 8, 5, and 2). Each time the codeblock cycles, day is set to a different value in the sequence (going from the first value in the sequence to the last).
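When a sequence gets more complicated, it can help to look at its values before using it in a for loop. The sketch below stores the same seq() call in a variable so the values and their count can be checked first; the name daySeq is an illustrative choice, not something from the lesson script.

```r
# Store the sequence in a variable (daySeq is an illustrative name)
daySeq = seq(from=20, to=2, by=-3);

# Look at the sequence before looping over it
cat("Values in the sequence:", daySeq, "\n");
cat("Number of values:", length(daySeq), "\n");

# The for loop cycles once for each value in daySeq, in order
for(day in daySeq)
{
  cat("The count is:", day, "\n");
}
```

The for loop accepts the variable daySeq in place of the seq() call because both represent the same vector of seven values.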
8 Indexing values inside a for loop
Most examples so far use the index variable as a counter. The real power comes when you use the indexing variable to index values in a vector as we did in Figure 2. This is how we perform an if-else statement on every value in a vector.
Let’s use the same data we have been using in the previous lessons:
### read in data from twoWeekWeatherData.csv
weatherData = read.csv(file="data/twoWeekWeatherData.csv",
                       sep=",",
                       header=TRUE);
### Extract the highTemp and noonCondition columns from the data frame -- save them to variables
highTemps = weatherData$highTemp;
noonCond = weatherData$noonCondition;
Since we want to go through every value in the vector, we need to know the size of the vector. The easiest way to find the size of a vector is to use the function length():
numDays = length(highTemps); # numDays is 14 (the number of values in highTemps)
Now let’s use a for loop to output all of the highTemps values. To do this we use the indexing variable, day, as the index for highTemps:
for(day in 1:numDays) # numDays, in this case, is 14 (sequence is 1:14)
{
  cat(highTemps[day], "\n"); # day will take the values 1-14
}
The 14 values in the vector highTemps in order:
57
50
54
40
39
58
60
53
55
44
39
54
61
75
In the following example, we use the indexing variable day as both a counter and an index for noonCond:
for(day in 1:numDays)
{
  cat(day, ") ", noonCond[day], "\n", sep="");
}
Using indexing variable as a counter and an index for noonCond:
1) Cloudy
2) Cloudy
3) Sunny
4) Rain
5) Fog
6) Sunny
7) Sunny
8) Cloudy
9) Rain
10) Rain
11) Snow
12) Sunny
13) Sunny
14) Sunny
8.1 Making code more extensible
In the code in Figure 5 and Figure 6, we already know that the number of values in the vector is 14 so we could just write:
for(day in 1:14)
{
  cat(highTemps[day], "\n");
}
But this code will only work if the data frame does not change size. In general, you want to make your code as generic as possible, which means using variables instead of fixed numbers. When you make your code generic, you do not have to change the code when your data changes. In Figure 5, numDays will automatically adjust to the size of the vector.
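The point about generic code can be sketched with a vector that changes size. In this sketch, temps is a hypothetical vector typed in by hand (it is not read from the lesson's data file). The last loop uses seq_along(), a base R function that is not used in this lesson and is shown here only as an optional alternative to 1:numDays.

```r
# Hypothetical vector, typed in by hand for illustration
temps = c(57, 50, 54, 40, 39);
numDays = length(temps);          # 5

for(day in 1:numDays)
{
  cat(day, ") ", temps[day], "\n", sep="");
}

# The data grows -- only the data changes, the loop code stays the same
temps = c(temps, 58, 60, 53);
numDays = length(temps);          # now 8

for(day in 1:numDays)
{
  cat(day, ") ", temps[day], "\n", sep="");
}

# Optional alternative (not from the lesson): for a non-empty vector,
# seq_along(temps) produces the same sequence as 1:length(temps)
for(day in seq_along(temps))
{
  cat(day, ") ", temps[day], "\n", sep="");
}
```

One design note: if the vector were ever empty, 1:numDays would become 1:0, which is the two-value sequence 1, 0, so the codeblock would still execute twice. seq_along() produces an empty sequence in that case, and the codeblock would not execute at all.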
9 If-else within a for loop
We usually have a lot of questions that we want to ask of the values we are looking at. In other words, we want to use if-else statements within the for loops. Let’s first ask if each day was sunny or not and output the answer to the Console.
for(i in 1:numDays)
{
  if(noonCond[i] == "Sunny")
  {
    cat("day ", i, " was sunny\n", sep="");
  }
  else
  {
    cat("day ", i, " was not sunny\n", sep="");
  }
}
Using if-else statements within a for loop:
day 1 was not sunny
day 2 was not sunny
day 3 was sunny
day 4 was not sunny
day 5 was not sunny
day 6 was sunny
day 7 was sunny
day 8 was not sunny
day 9 was not sunny
day 10 was not sunny
day 11 was not sunny
day 12 was sunny
day 13 was sunny
day 14 was sunny
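The same pattern extends to more than two outcomes by adding an else if branch. The following is a minimal sketch, not taken from the lesson script: the noonCond values are typed in by hand (copied from the Console output shown in this lesson) so the example runs without the data file, and the messages are made up for illustration.

```r
# noonCond values typed in by hand so the sketch runs without the CSV file
noonCond = c("Cloudy", "Cloudy", "Sunny", "Rain", "Fog", "Sunny", "Sunny",
             "Cloudy", "Rain", "Rain", "Snow", "Sunny", "Sunny", "Sunny");
numDays = length(noonCond);

for(i in 1:numDays)
{
  if(noonCond[i] == "Rain")        # first question: was it raining?
  {
    cat("day ", i, " had rain at noon\n", sep="");
  }
  else if(noonCond[i] == "Snow")   # only asked if it was not raining
  {
    cat("day ", i, " had snow at noon\n", sep="");
  }
  else                             # neither rain nor snow
  {
    cat("day ", i, " had no rain or snow at noon\n", sep="");
  }
}
```

Each cycle of the for loop executes exactly one of the three codeblocks, depending on the value of noonCond[i].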
10 Checking a subset of values
We can use any type of sequence in the for loop so that we are only checking some of the values in the vector:
for(i in c(8,2,13))
{
  if(noonCond[i] == "Sunny")
  {
    cat("day ", i, " was sunny\n", sep="");
  }
  else
  {
    cat("day ", i, " was not sunny\n", sep="");
  }
}
Using a sequence to subset the values:
day 8 was not sunny
day 2 was not sunny
day 13 was sunny
11 State variables and for loops
The code in Figure 7 displays specific information about each value in a vector. This would quickly get out of hand if there were thousands of values! We usually want more general information about a vector. For instance, we might want to know how many days were sunny.
Finding more general information about the values in a vector requires the introduction of the last component of most for loops: the state variable. The state variable is a variable that maintains a running tab of some value through each cycle of the for loop.
Let’s count the number of sunny days using a for loop. To do this we need to:
create a state variable that holds the number of sunny days (this will start at 0)
have a for loop that cycles through every value in the noonCond vector
include an if() statement asking if the day is sunny
have a command to increase the state variable by 1 if it is sunny
Note: The state variable must be declared outside of the for loop. Think about why this is true – it is the first question in the Application.
sunnyDays = 0; # state variable -- will hold the count of sunny days
for(i in 1:numDays)
{
  if(noonCond[i] == "Sunny") # was the day sunny?
  {
    sunnyDays = sunnyDays + 1; # it was -- increase sunnyDays by 1
  } # there is no else here -- we don't care about non-sunny days
}
And we can look in the Environment tab to see what the final value of sunnyDays was:
sunnyDays: 6
And if we look at the noonCond vector, we see there are 6 sunny days:
> noonCond
[1] "Cloudy" "Cloudy" "Sunny" "Rain" "Fog" "Sunny" "Sunny" "Cloudy"
[9] "Rain" "Rain" "Snow" "Sunny" "Sunny" "Sunny"
11.1 The state variable
sunnyDays does a lot of heavy lifting in Figure 9. We initially set sunnyDays to 0 because 0 is the default value: the value sunnyDays should have if the condition (noonCond[i] == "Sunny") fails every time.
Whenever the value in noonCond is “Sunny” the following code is executed:
sunnyDays = sunnyDays + 1;
This code works because the right side of the assignment is always evaluated first.
So, R will first calculate: sunnyDays +1
- If sunnyDays is 0, then sunnyDays +1 evaluates to 1
Then R assigns the value on the right-side to the variable on the left, which is sunnyDays.
- sunnyDays is assigned the value of 1.
If sunnyDays is 4, then sunnyDays +1 evaluates to 5, and sunnyDays is assigned the value of 5.
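To watch the state variable change, the counting loop can be sketched with an extra cat() that reports sunnyDays each time it is increased. This is an illustration rather than the lesson's script: the noonCond values are typed in by hand (copied from the Console output above) so the sketch runs without the data file, and the message text is made up.

```r
# noonCond values typed in by hand so the sketch runs without the CSV file
noonCond = c("Cloudy", "Cloudy", "Sunny", "Rain", "Fog", "Sunny", "Sunny",
             "Cloudy", "Rain", "Rain", "Snow", "Sunny", "Sunny", "Sunny");
numDays = length(noonCond);

sunnyDays = 0;   # state variable -- starts at the default value of 0
for(i in 1:numDays)
{
  if(noonCond[i] == "Sunny")
  {
    # the right side (sunnyDays + 1) is evaluated first,
    # then the result is assigned back to sunnyDays
    sunnyDays = sunnyDays + 1;
    cat("day ", i, ": sunnyDays is now ", sunnyDays, "\n", sep="");
  }
}
cat("Total sunny days:", sunnyDays, "\n");
```

With these values, sunnyDays increases on days 3, 6, 7, 12, 13, and 14, so it steps from 1 up to 6, matching the final value shown in the Environment tab.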
12 Application:
1) In Comments: Why must the state variable be declared outside the for loop as opposed to inside the for loop? Think about what happens to the state variable every time the for loop cycles if it is declared inside the for loop. Put the answer in comments at the top of your script.
2) Create one for loop that outputs the square (2nd power), cube (3rd power), and cube root (1/3rd power) of the numbers 1 through 10.
3) Using a for loop, find out how many days had high temperatures less than 50.
4) Using a for loop, find out how many even days were cloudy.
5) Using a for loop, find out how many of the last 8 days were cloudy.
6) In comments: If you are using a for loop to add up all values in a vector then what value should your state variable start with? Why?
7) Challenge: Using for loops, find the total precipitation (i.e., add up all the precipitation values). Hint: you keep adding each new precipitation value to the total precipitation.
Save the script as app1-10.r in your scripts folder and email your Project Folder to Charlie Belinsky at belinsky@msu.edu.
Instructions for zipping the Project Folder are here.
If you have any questions regarding this application, feel free to email them to Charlie Belinsky at belinsky@msu.edu.
12.1 Questions to answer
Answer the following in comments inside your application script:
What was your level of comfort with the lesson/application?
What areas of the lesson/application confused or still confuse you?
What are some things you would like to know more about that are related to, but not covered in, this lesson?