1-14: Functions 2
0.1 Changes
Extension: combining all four error checks into one (or do this as an application?)
Create a function with a count value
1 Purpose
Create a separate file to hold functions
Use arguments in functions
Use return values in functions
2 Questions about the material…
The files for this lesson:
Script: you can download the script here
Second Script: a second script containing functions can be downloaded here
If you have any questions about the material in this lesson, feel free to email them to the instructor, Charlie Belinsky, at belinsky@msu.edu.
3 Opening a script file
A reminder that anytime a script file is looking for some resource, (e.g., a data file, another script file), the script needs a starting point (i.e., a folder) to find the location of the resource. R calls this starting point the Working Directory. When you are in an RStudio Project, the starting folder/Working Directory is the Project Folder. This is what makes an RStudio Project easy to share – the file path used to link resources does not need to change when the Project Folder is moved.
The main script file for this lesson needs the functions within 1-14_myFunctions.R, which is in the scripts folder inside the Project Folder: scripts/1-14_myFunctions.R
And we put the functions in the Environment by calling:
source("scripts/1-14_myFunctions.R");
If the path is incorrect, because either the file name or the folder path is wrong, the line will give you the error:
cannot open file 'wrong_path/1-14_myFunctions.R': No such file or directory
or
cannot open file 'scripts/wrong_name.R': No such file or directory
Note: uppercase/lowercase does not matter in the file path on Windows and Mac. However, case does matter on Linux, so keep that in mind when sharing your files.
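If you are unsure why source() cannot find the file, it can help to look at the Working Directory and check the path before sourcing. The following is a minimal sketch, not part of the lesson script; the variable names funcFile and workDir are made up for this illustration:

```r
# Path to the functions script, relative to the Working Directory
funcFile = "scripts/1-14_myFunctions.R";

# The folder R uses as the starting point for relative paths
workDir = getwd();

if(file.exists(funcFile))
{
  source(funcFile);   # the functions are now in the Environment
}else
{
  # report where R looked instead of stopping with an error
  message("Cannot find '", funcFile, "' starting from: ", workDir);
}
```

If the message appears, compare the folder it prints with the location of your Project Folder.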
4 Modulus arithmetic
All of the functions in this lesson use the modulus operator: %%. The modulus operator divides the first number by the second number and returns only the remainder.
Some examples:
> 11 %% 4
[1] 3
> 12 %% 4
[1] 0
> 13 %% 4
[1] 1
> 14 %% 4
[1] 2
The modulus operator is often used in a for loop to perform a task at a regular interval. For instance, modulus can be used to check every 100th value in a large vector (val %% 100).
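As a rough sketch of that idea, the loop below picks out every 100th value of a vector by testing whether the loop index is evenly divided by 100. The vector values and the result vector checked are invented for this example:

```r
# A made-up vector of 1000 values: 5, 10, 15, ..., 5000
values = seq(from=5, by=5, length.out=1000);

checked = c();   # will hold every 100th value

for(i in 1:length(values))
{
  # the modulus is 0 only when i is 100, 200, 300, ...
  if(i %% 100 == 0)
  {
    checked = c(checked, values[i]);
  }
}
```

After the loop, checked holds the 10 values at positions 100, 200, ..., 1000.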
5 Modulus operator and checking for divisibility
We are going to use the %% operator to check if one number (the dividend) is evenly divided by another number (the divisor). If the divisor divides the dividend evenly, then the modulus will be 0 (i.e., no remainder). The function is called isDivisible().
isDivisible = function(dividend, divisor)
{
  ### get the remainder of the division using modulus
  remainder = dividend %% divisor;

  ### Check if the remainder is 0
  divBy0 = (remainder == 0);  # TRUE if 0, FALSE otherwise

  ### return whether the modulus was 0 (TRUE) or more than zero (FALSE)
  return(divBy0);
}
Extension: Variable and function names do not matter… to R
5.1 A function that checks divisibility
The first function in 1-14_myFunctions.R has two arguments: dividend and divisor.
isDivisible = function(dividend, divisor)
The codeblock calculates the modulus of dividend and divisor (i.e., it divides the dividend by the divisor and returns the remainder):
remainder = dividend %% divisor;
Then the codeblock checks whether remainder is 0 and saves the result to divBy0:
divBy0 = (remainder == 0);  # TRUE if 0, FALSE otherwise
Finally, the codeblock returns divBy0 to the caller:
return(divBy0);
5.2 A Boolean return value
divBy0 is a Boolean value. In other words, divBy0 can only have two possible values: TRUE and FALSE.
The Boolean value is created in this line:
divBy0 = (remainder == 0);
(remainder == 0) compares remainder to 0, just like it would if this were an if() statement. The result of this comparison is a Boolean TRUE/FALSE that is saved to the variable divBy0.
This line of code uses both the comparison operator ( == ) to create a Boolean value and the assignment operator ( = ) to save the Boolean value to a variable (divBy0).
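A short sketch of the same two steps outside of a function, using 12 and 5 as example values. Note that R reports the type of a Boolean value as "logical":

```r
remainder = 12 %% 5;         # 12 divided by 5 leaves a remainder of 2

divBy0 = (remainder == 0);   # comparison creates the Boolean, = saves it

divType = class(divBy0);     # R calls Boolean values "logical"
```

Here divBy0 is FALSE because the remainder is 2, not 0.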
5.3 Testing isDivisible()
We can pass in two arguments, the first representing the dividend, and the second representing the divisor:
div12_4 = isDivisible(12,4);
div12_5 = isDivisible(12,5);
And we get the results:
div12_4: TRUE
div12_5: FALSE
We can also put in the argument names to make the functions more readable:
div12_4a = isDivisible(dividend=12, divisor=4);
div12_5a = isDivisible(dividend=12, divisor=5);
And we get the same results:
div12_4a: TRUE
div12_5a: FALSE
Aside from readability, another advantage to using argument names is that you can put the arguments in any order you want:
div12_4b = isDivisible(divisor=4, dividend=12);
div12_5b = isDivisible(divisor=5, dividend=12);
The results are unchanged:
div12_4b: TRUE
div12_5b: FALSE
When calling isDivisible(), it is fine to skip the argument names, but argument names become a necessity for functions that have many arguments, such as geom_boxplot().
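The same rules apply to functions that come with R. As an illustration that does not require any packages, the sketch below calls the base R function seq() three ways: by position, by name, and by name in a different order. The variable names are chosen just for this example:

```r
# by position: the arguments are matched in order (from, to, by)
evens1 = seq(2, 10, 2);

# by name: same call, but easier to read
evens2 = seq(from=2, to=10, by=2);

# by name in a different order: still the same call
evens3 = seq(by=2, to=10, from=2);
```

All three variables hold the values 2, 4, 6, 8, 10.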
6 Prime function
The function, isDivisible(), performs the simple task of checking if one number is divisible by another number. We are going to expand that by checking whether the dividend can be evenly divided by any number other than 1 and itself. In other words, we are checking to see if the dividend is a prime number. This function, in the functions script, is called isPrime1().
note: The method we are using to check for prime works but is very inefficient!
isPrime1 = function(dividend)
{
  # check all numbers between 2 and one less than dividend
  for(i in 2:(dividend-1))
  {
    if(dividend %% i == 0)
    {
      ## number can be divided evenly by another number -- return FALSE
      return(FALSE);
    }
  }
  ## number cannot be divided evenly by another number -- return TRUE
  return(TRUE);
}
6.1 Parts of the isPrime1() function
The header of the function only has one argument this time, the value you want to check for prime:
isPrime1 = function(dividend)
To check if dividend is prime, you need to go through all numbers smaller than dividend, starting with 2. If none of those values evenly divide dividend then dividend is prime.
We check all possible divisors with a for loop that cycles from 2 to one less than dividend:
for(i in 2:(dividend-1))
Inside the for loop, we take the modulus of dividend and the divisor, which is the current for loop cycle value (i):
if(dividend %% i == 0)
If any modulus is 0, then we know a number evenly divides dividend and dividend cannot be prime. At this point we do not need to check any more values and can immediately return FALSE back to the caller.
return(FALSE); #dividend is not prime
If the for loop cycles through all of the values, and the modulus is never 0, then we know dividend is prime and return TRUE to the caller:
return(TRUE); #dividend is prime
6.2 Two return locations
This function has two places where it calls return().
- In the for loop if the modulus is 0. At this point we know the dividend cannot be prime because another number evenly divides it. We can immediately end the function and return FALSE to the caller (i.e., the dividend is not prime)
- At the end of the function after the for loop. If the for loop cycles through every number and none evenly divide the dividend then we know the dividend has to be prime and return TRUE to the caller.
Note: There is no way both return() can be executed in one function call.
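To see that the first return() really does end the function, here is a small sketch using a hypothetical helper, firstDivisor(), which is not part of the lesson script. It has the same two-return structure as isPrime1() but returns the divisor it found, which shows that the loop stopped at that value:

```r
firstDivisor = function(dividend)
{
  for(i in 2:(dividend-1))
  {
    if(dividend %% i == 0)
    {
      # the function ends here; larger values of i are never checked
      return(i);
    }
  }
  # only reached if the for loop finished without finding a divisor
  return(NA);
}

d81 = firstDivisor(81);   # 3 (the loop never gets to 9 or 27)
d83 = firstDivisor(83);   # NA (83 is prime)
```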
6.3 Checking for prime
We will call the function multiple times with different dividends. You can convince yourself that the function is correctly declaring prime numbers:
p0 = isPrime1(13);
p1 = isPrime1(14);
p2 = isPrime1(81);
p3 = isPrime1(dividend=83);
p4 = isPrime1(dividend=87);
p5 = isPrime1(dividend=89);
Note: the first (and only) argument in isPrime1() is dividend. We can either explicitly name the one argument or have R assume the one value is for the first argument.
p0: TRUE
p1: FALSE
p2: FALSE
p3: TRUE
p4: FALSE
p5: TRUE
7 Error checking
The assumption when someone calls isPrime1() is that the caller will send a valid integer as an argument. It is not a good strategy to assume this will be true as there are many other types of value the caller can pass in:
A vector of values (i.e., multiple values)
A non-numeric value (e.g., a string or Boolean value)
A negative value
A decimal value
isPrime2() is the same as isPrime1() except that isPrime2() first does a series of checks using an if-else-if structure. If any of the statements are TRUE (i.e., the argument is an invalid value) then the function will return an error message and end:
### Error checks on the argument value
if(length(dividend) > 1) # error check 1: too many values
{return("Error: too many values");
}else if (!is.numeric(dividend)) # error check 2: value not numeric
{return("Error: value is not numeric");
}else if (dividend < 0) # error check 3: value is negative
{return("Error: value must be positive");
}else if (dividend %% 1 != 0) # error check 4: value is a decimal
{return("Error: value must be an integer");
}
A truly robust function will check to make sure arguments are valid using some sort of error checking.
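The lesson shows only the error checks, so the following is a sketch of how the complete isPrime2() might look, assuming the checks sit at the top of the function and the rest is the unchanged loop from isPrime1(). The version in the functions script may differ in its details:

```r
isPrime2 = function(dividend)
{
  ### Error checks on the argument value
  if(length(dividend) > 1)           # error check 1: too many values
  {
    return("Error: too many values");
  }else if (!is.numeric(dividend))   # error check 2: value not numeric
  {
    return("Error: value is not numeric");
  }else if (dividend < 0)            # error check 3: value is negative
  {
    return("Error: value must be positive");
  }else if (dividend %% 1 != 0)      # error check 4: value is a decimal
  {
    return("Error: value must be an integer");
  }

  ### The argument passed every check -- same test as isPrime1()
  for(i in 2:(dividend-1))
  {
    if(dividend %% i == 0)
    {
      return(FALSE);   # evenly divided, so not prime
    }
  }
  return(TRUE);        # nothing divided it evenly, so prime
}
```

The order of the checks matters: the length check comes first so that the later comparisons are only ever made on a single value.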
7.1 The error checks:
There are four error checks:
1) Check to see if the argument has more than one value:
if(length(dividend) > 1) # error check 1: too many values
2) Check to see if the argument is not a numeric value:
else if (!is.numeric(dividend)) # error check 2: value not numeric
3) Check to see if the argument is a negative value:
else if (dividend < 0) # error check 3: value is negative
4) Check to see if the number is a decimal (i.e., not an integer):
else if (dividend %% 1 != 0) # error check 4: value is a decimal
The first three checks are self-explanatory. The last one is a bit trickier as R does not have a dedicated check for integers. Note: R has a function named is.integer(), but this function only checks if the number has been explicitly declared an integer, something the caller is unlikely to do.
To check if the value is an integer, we perform a modulus between the value and 1:
If the value is an integer, the modulus is 0
If the value is a decimal, the modulus is the decimal part
You can convince yourself of this in the Console:
> 5.5 %% 1
[1] 0.5
> 8.333 %% 1
[1] 0.333
> 10 %% 1
[1] 0
> 12.99 %% 1
[1] 0.99
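The difference between is.integer() and the modulus check can be seen in a short sketch. The variable names are invented for this example; the L suffix is how R explicitly declares an integer:

```r
declaredInt = is.integer(10L);      # TRUE: 10L is explicitly an integer
plainNumber = is.integer(10);       # FALSE: 10 is stored as a decimal number

wholeNumber = (10 %% 1 == 0);       # TRUE: no decimal part
decimalNumber = (5.5 %% 1 == 0);    # FALSE: the modulus is 0.5
```

Most callers type 10 rather than 10L, which is why the modulus check is the more practical test here.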
7.2 Testing the error checking
We will check for the four errors and test valid values to make sure we have not lost the functionality of the original isPrime1():
e1 = isPrime2(c(10,34)); # too many values
e2 = isPrime2("hello"); # not numeric
e3 = isPrime2(FALSE); # not numeric
e4 = isPrime2(-35); # negative numeric
e5 = isPrime2(74.24); # decimal numeric
e6 = isPrime2(13); # valid -- and prime
e7 = isPrime2(14); # valid -- and not prime
e8 = isPrime2(81); # valid -- and not prime
e1: "Error: too many values"
e2: "Error: value is not numeric"
e3: "Error: value is not numeric"
e4: "Error: value must be positive"
e5: "Error: value must be an integer"
e6: TRUE
e7: FALSE
e8: FALSE
8 Multiple return values
All the functions we have created so far in the past two lessons have returned one value, either a single Boolean value, or a single numeric value.
We are going to create a function that returns an undetermined number of values. Specifically, we are going to modify isPrime1() to return all factors of the dividend supplied by the caller, not counting 1 and the dividend itself. For example, 12 can be divided by 2, 3, 4, and 6 so the return has 4 values: c(2,3,4,6).
The function is called findFactors():
findFactors = function(dividend)
{
  ### Store the factors here
  factors = c();

  for(i in 2:(dividend-1))
  {
    if(dividend %% i == 0)
    {
      ## number can be divided evenly by another number
      ## insert this number as a factor
      factors = c(factors, i);
    }
  }
  ## return all the factors found (NULL if there are none)
  return(factors);
}
8.1 Storing multiple values
We start the function by creating the vector that will store the values returned to the caller (i.e., the factors of the dividend):
### Store the factors here (starts as a NULL vector)
factors = c();
factors starts as an empty, or NULL, vector. And a NULL vector will be returned to the caller if dividend is prime (i.e., dividend has no factors).
The for loop still cycles from 2 to one less than dividend and checks if the modulus is 0. Every time the modulus is 0, the value that evenly divides dividend is inserted in the factors vector:
for(i in 2:(dividend-1))
{
  if(dividend %% i == 0)
  {
    ## number can be divided evenly by another number
    ## insert this number as a factor
    factors = c(factors, i);
  }
}
8.2 Adding values to a vector
This line says that factors is equal to a vector of itself and the i value that we just calculated to be a factor:
factors = c(factors, i);
In other words, the code above creates a new vector that is the old vector with the i value inserted at the end.
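A minimal sketch of this pattern outside of the for loop, adding three example values one at a time (startsNull is a name made up for this illustration):

```r
factors = c();                   # empty (NULL) vector
startsNull = is.null(factors);   # TRUE: nothing has been added yet

factors = c(factors, 2);         # factors is now: 2
factors = c(factors, 3);         # factors is now: 2 3
factors = c(factors, 4);         # factors is now: 2 3 4
```

Each line replaces factors with a new vector that is one value longer than the old one.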
8.3 Returning the factors
After cycling through all the values in the for loop, we return the factors vector to the caller:
return(factors);
factors will either be NULL (dividend is prime) or a vector of all the factors of dividend.
8.4 Testing findFactors()
We will test findFactors() with values that we know are prime (13, 83), values we know are not prime (14, 81, 87), and one value that has many factors (72):
f0 = findFactors(dividend=13);
f1 = findFactors(14);
f2 = findFactors(dividend=81);
f3 = findFactors(83);
f4 = findFactors(dividend=87);
f5 = findFactors(72);
And we get NULL for the dividends that are prime or a vector of factors for the non-prime dividends:
f0: NULL
f1: int [1:2] 2 7
f2: int [1:3] 3 9 27
f3: NULL
f4: int [1:2] 3 29
f5: int [1:10] 2 3 4 6 8 9 12 18 24 36
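Because the return value is either NULL or a vector, the caller can inspect it with is.null() and length(). The sketch below repeats findFactors() so it can run on its own, assuming the argument is named dividend as in the calls above; the variables after the function are invented for this example:

```r
findFactors = function(dividend)
{
  factors = c();
  for(i in 2:(dividend-1))
  {
    if(dividend %% i == 0)
    {
      factors = c(factors, i);
    }
  }
  return(factors);
}

f83 = findFactors(83);
f72 = findFactors(72);

isPrime83 = is.null(f83);     # TRUE: no factors were found
numFactors72 = length(f72);   # 10: the number of factors found
```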
9 Application
1) For this application you need to create two scripts:
a functions script named app1-14_functions.r that contains the functions created in this application
a main script named app1-14.r where you will answer questions in comments and test the functions created in app1-14_functions.r
source() your functions script from the main script
Make sure you test all the functions thoroughly in your main script. I want to see the test code in app1-14.r.
2) In comments answer: Why is factors created as an empty vector in findFactors() before it is used in the for loop? What happens if factors is not created first?
3) Making modifications to isPrime1()
- copy isPrime1() to your function script for this application
- Fix isPrime1() so it can correctly handle the dividends 0, 1, and 2
- 0 and 1 are not prime, 2 is prime
- The for loop should not be executed if the dividend is 0, 1 or 2
- Make the function more efficient by having the for loop cycle from 2 to the square root of dividend
- note: the for loop will ignore the decimal in the square root value
4) Create a function that checks a vector of numbers to see which of those numbers are divisible by 7, 11, or 13
The function has one argument: a vector of dividends
The function returns all the dividends that can be evenly divided by at least one of 7, 11, or 13
5) Create a function that checks if the modulus of two numbers is a value given by the caller:
The function has three arguments: dividend, divisor, remainder
The function will check to see if the modulus of dividend and divisor is equal to remainder and return TRUE if it is and FALSE if it is not
Also…
Give remainder a default value
Have the function return an error if:
any of the three arguments is zero or negative
remainder is bigger than divisor
6) Create one function that converts one temperature value between the three temperature scales: Celsius (C), Fahrenheit (F), and Kelvin (K).
There are six possible conversions:
F -> C
C -> F
C -> K
K -> C
K -> F
F -> K
The conversion for Celsius to Kelvin is: \(K = C + 273\)
The conversion for Celsius to Fahrenheit is: \(F=\frac{9}{5} C+32\)
You need an argument for the temperature value.
You need two arguments to determine the conversion: from and to
- an if-else-if structure will be needed to pick the exact conversion.
Save the script as app1-14.r in your scripts folder and email your Project Folder to Charlie Belinsky at belinsky@msu.edu.
Instructions for zipping the Project Folder are here.
If you have any questions regarding this application, feel free to email them to Charlie Belinsky at belinsky@msu.edu.
9.1 Questions to answer
Answer the following in comments inside your application script:
What was your level of comfort with the lesson/application?
What areas of the lesson/application confused or still confuse you?
What are some things you would like to know more about that are related to, but not covered in, this lesson?
10 Extension: Variable and function names do not matter… to R
Variable and function names are generally chosen to make it easier for the reader to understand the script. But R could not care less what names you use. The following script executes the exact same calculation and returns the exact same TRUE/FALSE values as isDivisible(); it just uses variable and function names that are not intuitive to the reader. Do not do this in your script!
do_stuff = function(a_number, another_number)
{
  ### get the remainder of the division using modulus
  the_answer = a_number %% another_number;

  ### Check if the remainder is 0
  thing_to_return = (the_answer == 0);  # TRUE if 0, FALSE otherwise

  ### return whether the modulus was 0 (TRUE) or more than zero (FALSE)
  return(thing_to_return);
}