1-08: Conditional Operations

0.1 To-do:

  • figure out how to merge cells in tables

1 Purpose

  • Creating “decision points” in a script using if-statements

  • Introduce the six different conditional operators

  • Code conditional statements on strings and numbers

2 Material

The files for this lesson:

 

If you have any questions about the material in this lesson, feel free to email them to the instructor, Charlie Belinsky, at belinsky@msu.edu.

3 Four structures in programming

As mentioned in the second lesson on variables, there are basically 4 main structures that cover almost every aspect of programming:

  1. Variables

  2. If-Else Statements

  3. For Loops

  4. Functions

 

In the next two lessons we cover If-Else statements

4 Non-linear scripts

So far all of our scripts have been linear – in other words, the execution of the script goes line-by-line, or command-by-command, until the end of the script.  However, most scripts do not work like this – most scripts have multiple points where the execution of code is:

  • moved to another location (i.e., functions – future lesson)

  • done multiple times (i.e., for loops – future lesson)

  • dependent on some condition (i.e., if-else statements – this lesson)

 

And you will often find combinations of the above.  When dealing with data frames, if-else statements are often found within for loops.  For example, a for loop goes through every value in a temperature column and an embedded if-else statement executes code if the temperature is above a certain amount.

 

For the next couple of lessons we are just going to focus on if-else statements.  After that, we will introduce for loops with embedded if-else statements.

5 Questions and Statements

The if() statement can almost be thought of as a question with a yes/no answer:

  • Is the fish’s weight below 100 grams?  (yes/no)

  • Is the first runner’s speed greater than the second runner’s speed? (yes/no)

  • Is the location Silver Lake? (yes/no)

  • Is the temperature less than or equal to 30? (yes/no)

 

Except, instead of questions with yes/no answer, if() contains statements with TRUE/FALSE conditions:

  • The fish’s weight is below 100 grams. (TRUE/FALSE)

  • The first runner’s speed is greater than the second runner’s speed. (TRUE/FALSE)

  • The location is Silver Lake. (TRUE/FALSE)

  • The temperature is less than or equal to 30. (TRUE/FALSE)

5.1 Structure of an if() statement

The basic structure of an if() statement is:

if (some conditional statement is TRUE)
{ 
   # Execute the code in this codeblock
   # if the conditional statement is TRUE.
   # note: All command within the codeblock are indented
}
# next lesson we will cover what happens if the conditional statement is FALSE.
Figure 1: Structure of an if() statement

Inside the curly brackets ( {  } ) attached to the if() is a codeblock, or a series of commands, that gets executed if the statement is TRUE.  If the statement is FALSE, the codeblock is skipped.

 

Let’s do a simple example where a random number is picked and an if() statement checks the random number:

randomTemp = sample(30:80, size=1);  # pick a random number between 30 and 80

if (randomTemp > 50)  # if randomTemp is greater than 50...
{ 
  # ...execute the commands in this codeblock
  cat("The temperature is", randomTemp);
  cat("warm enough to go outside\n");
}
# next lesson we will cover what happens if the conditional statement is FALSE.
Figure 2: An example of an if() statement

5.2 Alternate codeblock format

The curly brackets ( {  } ) designate the beginning and the end of the codeblock.  There are many ways to space the curly brackets but only two methods that are generally accepted in the programming world:

  • Figure 2 where the curly brackets are on their own line and lined up at the same level as the if() and all code inside is indented

  • Figure 3 where the start bracket is on the same line as the if() statement – everything else is the same as the first example

if (randomTemp > 50)  { # if randomTemp is greater than 50...
  # ...execute the commands in this codeblock
  cat("The temperature is", randomTemp);
  cat("warm enough to go outside\n");
}
# next lesson we will cover what happens if the conditional statement is FALSE.
Figure 3: Alternate structure of an if() statement

I prefer the first method because it makes it easier for me to visualize the hierarchy of the codeblocks and easier to comment. You are required to use one of these two methods in this class.

 

Extension: Curly bracket placement

6 If() statements

The conditional statement inside the parentheses of the if() contains at least two values being compared with a conditional operation:

if( fishWeight «<» 100 )                # is greater than
if( runner1Speed «>» runner2Speed)      # is less than
if( location «==» "Silver Lake" )       # is equal to
if( temperature «<=» 30 )               # is less than or equal to

A conditional operator does two things:

1) It compares the two values on both sides of the operator. 

2) It outputs either TRUE or FALSE based on results of the comparison.

Note: TRUE/ FALSE statements are often called Boolean statements

6.1 Conditional Operators in R

There are six conditional operators in programming:

Operator Meaning
==

equal to

(easily confused with the assignment operator, =)

!=

not equal to

(reverses the logic of ==)

>= greater than or equal to
<= less than or equal to
> greater then (only)
< less than (only)

Note: >=, <=, >, and < operators are usually used to compare two numeric values. However they can be used to compare string values.

Extension: Greater than and less than on strings

6.2 Set up conditional operators

The following if statements all compare two values using a conditional operator:

if( fishWeight «<» 100 )
{
  # code that gets executed if fishWeight < 100
}
if( runner1Speed «>» runner2Speed )
{
  # code that gets executed if runner1Speed > runner2Speed
}
if( location «==» "Silver Lake" )
{
  # code that gets executed if location == "Silver Lake"
}
if( temperature «<=» 30)
{
  # code that gets executed if temperature <= 30
}

Note: A single Conditional Operators compares two values (e.g., location == “Silver Lake”).  In a future lesson, we will learn how to deal with more complex conditions (e.g., how to check if location is “Silver Lake” or “Round Lake”).

7 Using if() statements

To set up the script to use if() statements we will:

  • read in the data from twoWeekWeatherData.csv

  • save the highTemp and noonCondition columns to vectors

### read in data from twoWeekWeatherData.csv
weatherData = read.csv(file="data/twoWeekWeatherData.csv", 
                       sep=",",
                       header=TRUE, 
                       stringsAsFactors = FALSE);  
  
### Extract the highTemps column from the data frame -- save it to a variable
highTemps = weatherData$highTemp;
noonCond = weatherData$noonCondition;

Our first example will check three value in highTemps (3rd, 4th, and 5th) and see which of them are greater than 50

cat("---------\nChecking highTemps 3, 4, and 5 to see which are > 50:\n");

if(highTemps[3] > 50)
{
  cat("  high temp 3 is greater than 50\n");
}
if(highTemps[4] > 50)
{
  cat("  high temp 4 is greater than 50\n");
} 
if(highTemps[5] > 50)
{
  cat("  high temp 5 is greater than 50\n");
}

Only the cat() in the first codeblock (highTemps[3] > 50) was executed because highTemps[3] was the only value greater than 50. The second and third codeblocks were ignored.

Checking highTemps 3, 4, and 5 to see which are > 50:
  high temp 3 is greater than 50

Note: This is an efficient way to check multiple values. The more efficient way is using for().Extension: Checking all values in a vector (for loops)

 

Extension: Removing curly brackets in if() statements

7.1 Using all 6 conditional operators

The above example used ( > ) to compare the values in highTemps to the value 50.

 

Let’s use all six conditional operator to compare highTemp[2], which is 50, to the value 50:

cat("\n---------\nChecking high temp 2 using all 6 conditional operators:\n");

if(highTemps[2] >= 50)
{
  cat("  high temp is greater than or equal to 50\n")
}
if(highTemps[2] <= 50)
{
  cat("  high temp is less than or equal to 50\n")
}
if(highTemps[2] > 50)
{
  cat("  high temp is greater than 50\n")
}
if(highTemps[2] < 50)
{
  cat("  high temp is less than 50\n")
}
if(highTemps[2] == 50)
{
  cat("  high temp is equal to 50\n")
}
if(highTemps[2] != 50)
{
  cat("  high temp is not equal to 50\n")
}

In the Console, we see that three of the conditional statement passed because highTemps[2] is >=, <=, and == 50:

Checking high temp 2 using all 6 conditional operators:
  high temp is greater than or equal to 50
  high temp is less than or equal to 50
  high temp is equal to 50

8 Conditional Operators with characters/strings

We are going to use conditional operators on the values in the noonCondition vector.  The noonCondition vector consists of character values, or, as they are more commonly called, string values (labelled chr in the Environment tab).  Since the value is a string/chr, we need to compare it to a value in quotes

 

The following code will check the second value in noonCond using the equal ( == ) condition and the not equal to ( != ) condition:

cat("\n---------\nCheck to see the noon condition on the day 2:\n");

# checking the second noonCond, which is "Cloudy"
if(noonCond[2] == "Cloudy")  # noonCond[2] is "Cloudy"
{
  cat("  Day was cloudy\n");
}
if(noonCond[2] != "Sunny")   # noonCond[2] is not "Sunny"
{
  cat("  Day was not Sunny\n")
}

In the Console, we see that noonCond[2] was equal to “Cloudy” and not equal to “Sunny”:

Check to see the noon condition on the day 2:
  Day was Cloudy
  Day was not Sunny

We can use all same six conditional operators to compare characters/strings – but only ( == ) and ( != ) are commonly used.  Extension: Greater than and less than on strings

8.1 Quotes around numeric and string values

When we compared a numeric value within a vector to a number we do not use quotes:

if(highTemps[2] >= 50)

But, when we compare a string/character value within a vector to a string we need quotes around the latter:

if(noonCond[2] == "Cloudy")

If we do not use quotes then R thinks there is some variable named Cloudy and you will get an object ‘Cloudy’ not found error:

if(noonCond[2] == Cloudy)  # error because there is no variable (Object) named Cloudy

Remember that, in R, a variable name cannot start with a number. This means that a number is unambiguously not a variable.

 

However, you can put quotes around a number:

  if(highTemps[2] >= "50")  # this will often give a wrong answer...

This change the number 50 to the two characters: “5” and “0”.  Coincidentally, the if() statement above is also TRUE, but for the wrong reason.  Extension: Numbers as Characters

8.2 Case counts with strings

When comparing two strings, any difference between the two strings means they are not equal (so, == is FALSE and != is TRUE).  This includes capital and lowercase letters.

 

noonCond[2] is “Cloudy” but “Cloudy” is not the same as “cloudy”:

cat("\n---------\nChecking same condition but changed 'Cloudy' to 'cloudy':\n");

if(noonCond[2] == "cloudy") # This is FALSE because of the lowercase c
{
  cat("  Day was cloudy\n");
}
if(noonCond[2] != "cloudy") # This is TRUE
{
  cat("  Day was NOT cloudy\n"); 
}

And in the Console we see that “Cloudy” does not equal “cloudy”:

Checking same condition but changed 'Cloudy' to 'cloudy':
  Day was NOT cloudy

9 Checking other columns

Many times, when we are checking values within a data frame, we want to know more about other values within that row.  In the case of the weatherData data frame, these are the weather readings that occurred on the same day:

cat("\n---------\nOutputting information from another column:\n");
if(noonCond[2] == "Cloudy")   # checking if the day was cloudy
{
  cat("  Day was Cloudy");
  cat(" and the high temperature that day was", highTemps[2], "\n");
}
Outputting information from another column:
  Day was Cloudy and the high temperature that day was 50 

10 if() within if()

We often want to check multiple conditions and one way to do this is to embed an if() statement within an if() statement.

 

For instance, if the condition is cloudy, we might want to make a second check on the temperature for that day:

if(noonCond[2] == "Cloudy")   # checking if the day was cloudy
{
  # the following if statements are only checked if conditions are cloudy
  if( highTemps[2] > 60 )
  {
    cat("Still nice enough to go out!");
  }
  if( highTemps[2] < 60 )
  {
    cat("Best to stay indoors");
  }
}

The two inner if() statements (checking highTemps) are inside a codeblock that is only checked if the outer if() statement (checking noonCond) is TRUE.

11 Application

All of the following code should be in one script file.

A) Using if(), find which of the first five days had at least 1 inch of rain.  For those days, output to the Console the high and low temperature.

Note: this is an inefficient way to do the problem – the more efficient way is for loops, which we will do in a couple lessons.

 

B) Using if(), find which of the last five days had a low temperature that was 40 degrees or less.  For those days output date to the Console.

C) Use sample() to pick a random temperature from 40-80 (make sure you save the value to a variable)

 

D) Use sample() to pick a random weather condition from these four choices: “Cloudy”, “Sunny”, “Rainy”, “Foggy”

sample(x=c('a', 'b', 'c'), size=1) # randomly picks a, b, or c

E) Using if() with embedded if() (Section 10) output a single message to the Console for each of these four scenarios:

  1. random temperature (part C) is more than or equal to 60 and the weather condition (part D) is sunny

  2. random temperature (part C) is less than 60 and the weather condition (part D) is sunny

  3. random temperature (part C) is more than or equal to 60 and the weather condition (part D) is not sunny

  4. random temperature (part C) is less than 60 and the weather condition (part D) is not sunny

 

The four scenario are mutually exclusive (do not overlap each other) and complete (cover every possible scenario). 

If you do this right, then exactly one of the four messages will appear every time you run your script.

 

 

Save the script as app1-08.r in your scripts folder and  email your Project Folder to Charlie Belinsky at belinsky@msu.edu.

 

Instructions for zipping the Project Folder are here.

 

If you have any questions regarding this application, feel free to email them to Charlie Belinsky at belinsky@msu.edu.

11.1 Questions to answer

Answer the following in comments inside your application script:

  1. What was your level of comfort with the lesson/application?

  2. What areas of the lesson/application confused or still confuses you?

  3. What are some things you would like to know more about that is related to, but not covered in, this lesson?

12 Extension: Curly bracket placement

The curly brackets attached to a if() statement are used to encapsulate the codeblock that gets executed when the conditional statement is TRUE. All the curly brackets have to do is begin and end the codeblock, so there are many ways you could place the curly brackets.

 

Here are three bad ways that work:

if (randomTemp > 50) «{» cat("The temperature is", randomTemp);
cat("warm enough to go outside\n");  «}»
if (randomTemp > 50)
      «{» 
cat("The temperature is", randomTemp);
cat("warm enough to go outside\n");
      «}»
if (randomTemp > 50)             «{»
  cat("The temperature is", randomTemp);
  cat("warm enough to go outside\n");    
  
                     «}»

The three if() statement above would all execute correctly because the codeblock is in between the curly brackets.  But, in programming there are standards to how curly brackets get placed.  The two main standards are below, and you need to use one of these two in this class:

# standard 1
if (randomTemp > 50)
{ 
  cat("The temperature is", randomTemp);
  cat("warm enough to go outside\n");
}

# standard 2
if (randomTemp > 50)  { 
  cat("The temperature is", randomTemp);
  cat("warm enough to go outside\n");
}

In both these standards, the commands inside the codeblock are also indented – this makes it easier to see that the code that is attached to the if() statement. 

13 Extension: Removing curly brackets in if() statements

The three if() statements below have only one command in the codeblock: 

if(highTemps[3] > 50)
{
  cat("  high temp 3 is greater than 50\n");
}
if(highTemps[4] > 50)
{
  cat("  high temp 4 is greater than 50\n");
} 
if(highTemps[5] > 50)
{
  cat("  high temp 5 is greater than 50\n");
}

The curly brackets are not needed when there is only one command in a codeblock. The following code will execute correctly:

if(highTemps[3] > 50)
  cat("  high temp 3 is greater than 50\n");
if(highTemps[4] > 50)
  cat("  high temp 4 is greater than 50\n");
if(highTemps[5] > 50)
  cat("  high temp 5 is greater than 50\n");

But, that is because there is only one command attached to each if() statement.  The curly brackets are always needed if there is more than one command. This author would not recommend removing brackets if you are a beginning, it can lead to problems down the road.

14 Extension: Greater than and less than on strings

Greater than and less than conditional operator do work on strings.  If the strings only have letters from the English alphabet, then > and < will do an alphabetical comparison between the values.  In this case, > means “later in the alphabet”.  So, “C” > “B”, “D” > “C”…

> "Frank" > "Charlie"
[1] TRUE
> "Bob" > "Charlie"
[1] FALSE
> "Bob" > "Barb"
[1] TRUE

If you use characters that are not in the English alphabet then R will look at the Unicode character code.  Unicode character codes are unique numbers assigned to every imaginable character.  The English letters start at 65 in the Unicode chart.  The main use for Unicode character codes is that they allow someone to use and output characters that are not on your keyboard like the ζ , which is character number 950. 

Even though > and < works on all characters, it has limited usefulness.

15 Extension: Numbers as Characters

A common problem in R is that variables that look like numbers can really be a string/character.

 

If you add these variables to your script:

num1 = 50;
num2 = "50";

Then num1 is an integer but num2 is a string/character.

In other words, num1 is 50 and num2 is the character “5” followed by the character “0”.

 

This means the mathematical and conditional operators will not work on num2 as you would expect:

> num2 > 30
[1] TRUE
> num2 > 70
[1] FALSE
> num2 > 300
[1] TRUE
> num2 > 7
[1] FALSE
Figure 4: Conditional operators on numbers that are really string/characters: num2 = “50”

Since num2 is “50”, the first two seem correct, “50” is greater than 30 and not greater than 70.

 

But adding or taking away “0” does not change the result of the conditional operator so “50” is still greater than 300 and not greater than 7.

 

This happens because R does not think of “50” as a number but as two characters, a “5” and a “0”.  And since R sees “50” as two character, R is making an alphabetical comparison of the “numbers”:

> num2 > "30"
[1] TRUE
> num2 > "70"
[1] FALSE
> num2 > "300"
[1] TRUE
> num2 > "7"
[1] FALSE
Figure 5: Conditional operators on strings that have numbers in them

Alphabetic comparisons start with the leftmost character so “Frank” is greater than “Charlie” and not greater than “Harry”. Using this logic, R sees “5” as greater than “3” and not greater than “7”.  It does not matter what, or how many, “numbers” come afterwards

 

When doing an alphabetical comparison, if the first character is different then you do not need to look beyond the first character when doing a conditional operator.

16 Extension: Checking all values in a vector (for loops)

This is a preview of things to come.  It is a pain to code an if() statement for every value in a vector – especially if there are hundreds or thousands of values.  For this reason if() statements are often found embedded within for loops.  The for loop’s job is to cycle through a set of values with just one command:

# This for loop will execute 14 times with i=1:14
for(i in 1:14) 
{
  # only one if() is needed to check all 14 values in the vector
  if(highTemps[i] > 50)
  {
    cat("  high temp", i, "is greater than 50\n");
  }
}

And the if() statement does the check on all 14 values, with 9 of the 14 values bring greater than 50:

high temp 1 is greater than 50   
high temp 3 is greater than 50   
high temp 6 is greater than 50   
high temp 7 is greater than 50  
high temp 8 is greater than 50   
high temp 9 is greater than 50   
high temp 12 is greater than 50   
high temp 13 is greater than 50   
high temp 14 is greater than 50