1. Checking arguments
If something goes wrong with a function, there are two people who could be responsible: the developer and the user.
If the person who wrote the function made a mistake, then it's called a bug. However, even if there are no bugs, it's still possible for users to make mistakes.
2. The geometric mean
Here's the geometric mean function from the previous video.
If you pass it a silly input, like the letters of the alphabet, it throws an error.
This is good behavior, but there are two problems. Firstly, the error message appears to come from log, not from calc_geometric_mean, and secondly, the message doesn't say which argument was non-numeric, which isn't helpful.
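The slide code is not part of this transcript, so here is a minimal sketch of what the function might look like, assuming it is simply the exponential of the mean of the logs, written in base R. The function body is an assumption. The call with letters is wrapped in tryCatch so you can inspect both the message and the call that raised it.

```r
# Assumed definition: the transcript does not show the original code.
# The geometric mean is the exponential of the mean of the logarithms.
calc_geometric_mean <- function(x, na.rm = FALSE) {
  exp(mean(log(x), na.rm = na.rm))
}

# Works for positive numbers
calc_geometric_mean(c(1, 10, 100))

# Passing letters fails inside log(); capture the error instead of halting
err <- tryCatch(calc_geometric_mean(letters), error = function(e) e)
conditionMessage(err)        # the message text
deparse(conditionCall(err))  # the call that raised the error
```

Looking at the captured call shows why the message is confusing: it points at log, a function the user never called directly.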
3. Checking for numeric values
Let's modify the function to throw an error message when x is not numeric.
Checks on user inputs should be included at the start of the function body, so the feedback is given to the user as quickly as possible.
If x is not numeric, there is a problem, so you throw an error, with stop.
Now when you pass letters to calc_geometric_mean, it gives a clearer error.
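A hand-written check along these lines could look like the sketch below. The function body and the exact wording of the error message are assumptions, since the slide code is not in the transcript.

```r
calc_geometric_mean <- function(x, na.rm = FALSE) {
  # Check the input first, so the user gets feedback immediately
  if (!is.numeric(x)) {
    stop("x is not of class 'numeric'; it has class '", class(x)[1], "'.")
  }
  # Assumed body: exponential of the mean of the logs
  exp(mean(log(x), na.rm = na.rm))
}

# Capture the error message rather than halting the script
msg <- tryCatch(
  calc_geometric_mean(letters),
  error = function(e) conditionMessage(e)
)
msg
```

The message now names the argument, x, and says what class it actually had.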
4. assertive makes errors easy
That code for providing the error message is called an assertion. It worked, but it was rather fiddly and far too boring to write over and over in all your functions.
Fortunately, R has many fine packages for writing assertions. Here, we'll use the assertive package, which is my favorite, because it has a strong focus on providing clear error messages. I also like it because I wrote it.
5. Checking types of inputs
The assertive package contains over seventy checks on variable types, from common types like numeric or character or data frame, down to weirder variable types like two-sided formulas or time series kernels.
6. Using assertive to check x
To check that x is numeric in calc_geometric_mean, you can just add the line assert_is_numeric(x).
That's much easier than before, and the error message is clear.
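As a rough sketch, the function with the assertion added might look like this. It assumes the assertive package is installed, and the function body is again an assumption rather than the original slide code.

```r
library(assertive)  # assumes the assertive package is installed

calc_geometric_mean <- function(x, na.rm = FALSE) {
  # One line replaces the hand-written if/stop check
  assert_is_numeric(x)
  # Assumed body: exponential of the mean of the logs
  exp(mean(log(x), na.rm = na.rm))
}

# Capture the error message rather than halting the script
result <- tryCatch(
  calc_geometric_mean(letters),
  error = function(e) conditionMessage(e)
)
result
```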
7. Checking x is positive
The geometric mean only makes sense when all values of x are positive, so we should write an assertion for this. Naturally, assertive has a check.
Now if we pass a vector with a negative value, we get a clear error message. This is great, but there is an issue: the error message explains what went wrong, but not why it was wrong. In this instance, it would be helpful to create a custom check on x.
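Here is a sketch of the positivity check, under the same assumptions as before: the assertive package is installed and the function body is a stand-in for the slide code. Without the assertion, log of a negative number would only give NaN with a warning. With it, the call stops with an error.

```r
library(assertive)  # assumes the assertive package is installed

calc_geometric_mean <- function(x, na.rm = FALSE) {
  assert_is_numeric(x)
  # Every element of x must be strictly greater than zero
  assert_all_are_positive(x)
  # Assumed body: exponential of the mean of the logs
  exp(mean(log(x), na.rm = na.rm))
}

# A negative value now triggers an error; capture its message
msg <- tryCatch(
  calc_geometric_mean(c(1, -1)),
  error = function(e) conditionMessage(e)
)
msg
```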
8. is_* functions
Underlying every assert function in assertive is a corresponding is function.
For example, underneath assert_is_numeric lies is_numeric. This is like the base R function is.numeric, but with more feedback when things go wrong. It checks the object as a whole, so it returns a single logical value.
Likewise, is_positive underpins assert_all_are_positive. This returns a logical vector of the same length as the input.
In this case, we want its opposite, is_non_positive, which flags the values that are zero or negative.
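To see the difference between the two kinds of is function, you could try something like the following, assuming the assertive package is installed. The input values are made up for illustration.

```r
library(assertive)  # assumes the assertive package is installed

# is_numeric checks the whole object: one logical value comes back
whole <- is_numeric(1:10)
whole

# is_positive and is_non_positive check element by element:
# the result has one logical value per input value
x <- c(1, -1, 0)
pos <- is_positive(x)
non_pos <- is_non_positive(x)
pos
non_pos
```

Only the first value is positive, so is_positive is true once and is_non_positive is true for the other two values.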
9. Custom checks
By placing is_non_positive inside an if condition, we can define any behavior we want when x has non-positive values. In this case, we can throw an error with the exact error message that we want.
Now it is clear that the problem is that the geometric mean needs positive values.
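A custom check of this kind might be sketched as follows. The message wording and the function body are assumptions, and the assertive package is assumed to be installed. Note the use of any, because is_non_positive returns one value per element, and na.rm = TRUE so that missing values do not trip the condition.

```r
library(assertive)  # assumes the assertive package is installed

calc_geometric_mean <- function(x, na.rm = FALSE) {
  assert_is_numeric(x)
  # Custom check: explain why non-positive values are a problem
  if (any(is_non_positive(x), na.rm = TRUE)) {
    stop("x contains non-positive values, so the geometric mean makes no sense.")
  }
  # Assumed body: exponential of the mean of the logs
  exp(mean(log(x), na.rm = na.rm))
}

# Capture the error message rather than halting the script
msg <- tryCatch(
  calc_geometric_mean(c(1, -1)),
  error = function(e) conditionMessage(e)
)
msg
```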
10. Fixing input
Rather than throwing an error when inputs aren't correct, an alternative is to fix them. Two assertive functions are useful for this.
use_first will keep only the first value of a vector, giving a warning if its length was greater than 1.
coerce_to converts a value to a different type, again with a warning.
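Here is a small sketch of the two functions on their own, assuming the assertive package is installed. The input values are made up. Each call gives a warning but still returns a usable value.

```r
library(assertive)  # assumes the assertive package is installed

# use_first keeps the first element and warns that the rest were dropped
first <- use_first(c(1, 4, 9))
first

# coerce_to changes the class and warns about the conversion
flag <- coerce_to(1, target_class = "logical")
flag
```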
11. Fixing na.rm
You can use these functions to fix the na.rm argument.
When na.rm is given as a numeric vector, you get warnings that only the first value is used, and that it is converted to logical.
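Putting this into the function could look like the sketch below. It assumes the assertive package is installed, and the function body and example inputs are assumptions rather than the original slide code.

```r
library(assertive)  # assumes the assertive package is installed

calc_geometric_mean <- function(x, na.rm = FALSE) {
  assert_is_numeric(x)
  if (any(is_non_positive(x), na.rm = TRUE)) {
    stop("x contains non-positive values, so the geometric mean makes no sense.")
  }
  # Fix na.rm rather than failing: take the first value, then make it logical
  na.rm <- coerce_to(use_first(na.rm), target_class = "logical")
  # Assumed body: exponential of the mean of the logs
  exp(mean(log(x), na.rm = na.rm))
}

# na.rm is a vector of numbers here, so both fixes apply, each with a warning
result <- calc_geometric_mean(c(1, NA, 100), na.rm = 1:5)
result
```

The first element of na.rm is 1, which becomes TRUE, so the missing value in x is dropped before the mean is taken.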
12. Let's practice!
Let's check some inputs!