1. Y kant I reed ur code?
Most variables represent objects. That is, they are "nouns". Functions are a little different because they perform actions. You can think of functions as verbs.
2. dplyr verbs
Let's consider some functions from dplyr for manipulating data frames.
select performs the action of selecting columns, and filter performs the action of filtering rows.
Notice that the words select and filter are verbs.
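As a rough illustration of those two verbs, here is a minimal sketch. It assumes the dplyr package is installed and uses R's built-in mtcars dataset; the dataset and the variable names selected_columns and filtered_rows are my own choices, not something from the slides.

```r
library(dplyr)

# select() acts on columns: keep only mpg and wt
selected_columns <- select(mtcars, mpg, wt)

# filter() acts on rows: keep only cars with mpg above 30
filtered_rows <- filter(mtcars, mpg > 30)

print(names(selected_columns))
print(nrow(filtered_rows))
```

In both calls the name tells you the action being performed before you read any of the arguments.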
3. Function names should contain a verb
In fact, it is good practice for every function name to contain a verb.
If you are stuck for ideas, this list should get you started.
4. lm() is badly named
lm is perhaps the most high-profile badly named function.
First of all, it's an abbreviation, so you have to read the documentation to find out that lm means "linear model".
Secondly, it doesn't contain a verb.
Thirdly, there are lots of types of linear model, and it isn't obvious from the name that this runs a linear regression.
I'd prefer a more literal name like "run_linear_regression".
5. Readability vs. typeability
One possible counterargument is that run_linear_regression is more effort to type than lm.
That's true, but there are some good reasons why that doesn't really matter.
Firstly, the time spent reading and understanding code is almost always longer than the time spent typing it. I'm a slow typist, but I can type run_linear_regression more quickly than I can open the help page for lm and read it to figure out what the function does.
6. Readability vs. typeability
Secondly, every modern code editor will autocomplete function names.
In the DataCamp script pane, start typing the name of a function and press TAB. You can select an option without having to type the whole function name.
7. Readability vs. typeability
Thirdly, you can assign a function to a new name, just as you would assign any other value to a variable.
I find myself calling head a lot, so I define h to be equal to head, and it saves me a few keystrokes.
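The shortcut described here might look like the following sketch. The alias h comes from the narration; using the built-in mtcars dataset to try it out is my assumption.

```r
# Functions are ordinary objects, so head can be bound to a shorter name
h <- head

# h now behaves exactly like head
first_rows <- h(mtcars)
print(first_rows)

# Extra arguments are passed along as usual
first_three_rows <- h(mtcars, n = 3)
```

A short alias like this is probably best kept to your own interactive sessions, since other readers of your code will not know what h means.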
8. Arguments of lm()
As well as naming your function, to make it easy to use, you have to put the arguments in a sensible order.
9. Types of argument
There are two types of function arguments. Data arguments are the things that you compute on, and detail arguments tell the function how to perform the computation.
For example, looking at the arguments of cor, for calculating correlations, x and y are data arguments, and use and method are detail arguments.
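To see the split between data and detail arguments in practice, here is a small sketch that inspects cor and then calls it. The choice of mtcars columns and the variable names are assumptions for illustration only.

```r
# Inspect the argument names of cor: x and y come first, then use and method
cor_argument_names <- names(formals(cor))
print(cor_argument_names)

# Data arguments: the two numeric vectors being correlated
# Detail arguments (left at their defaults here): use and method
pearson_correlation <- cor(mtcars$mpg, mtcars$wt)

# Same data arguments, different detail arguments
spearman_correlation <- cor(
  mtcars$mpg, mtcars$wt,
  use = "complete.obs",
  method = "spearman"
)

print(pearson_correlation)
print(spearman_correlation)
```

Changing the detail arguments changes how the correlation is calculated, but the data being computed on stays the same.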
10. Data args should precede detail args
lm has another problem. Its first argument is formula, which is a detail argument, and its data argument comes second, when data should come first. Since data isn't first, lm doesn't play nicely with the pipe operator, and this code will throw an error.
Of course, lm has an excuse, since it was written several decades before the pipe operator existed.
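A sketch of the failure, under some assumptions: it uses the native pipe |> (which needs R 4.1 or later) rather than any particular package's pipe, the built-in mtcars dataset, and tryCatch so that the script keeps running after the error. The variable name piped_result is made up for this example.

```r
# The pipe passes mtcars as the FIRST argument, so this is really
# lm(mtcars, mpg ~ wt): the data frame lands in the formula slot
piped_result <- tryCatch(
  mtcars |> lm(mpg ~ wt),
  error = function(e) {
    message("lm failed: ", conditionMessage(e))
    NULL
  }
)

# piped_result is NULL here because the call above threw an error
print(is.null(piped_result))
```

The magrittr pipe %>% also passes the left-hand side as the first argument, so it should hit the same problem.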
11. Our revised function for linear regression
Since we've established that lm has problems, we can write a wrapper function.
run_linear_regression has a clear name, the data argument comes first, and it works with pipes.
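One way such a wrapper could be written is sketched below. The function name comes from the narration, but the body is my own minimal guess at it, and the mtcars example and the native pipe |> (R 4.1 or later) are assumptions.

```r
# Wrapper around lm with the data argument first and the detail argument second
run_linear_regression <- function(data, formula) {
  lm(formula = formula, data = data)
}

# Because data comes first, the function works on the right of a pipe
model <- mtcars |> run_linear_regression(mpg ~ wt)

print(coef(model))
```

The wrapper changes nothing about the fitting itself; it only swaps the argument order and gives the function a name that says what it does.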
12. Let's practice!
Time to name your own functions!