1. The Nine Systems
One quirk of R is that there are often half a dozen different packages that do more or less the same thing. This is mostly by design.
R’s great strength is that anyone can create a package, submit it to CRAN (or one of the other package repositories), and share it with millions of other users. Unfortunately, it’s also easy to become overwhelmed by choice and not know which package is best for you. In the case of object-oriented programming, there are nine different options, so let’s take a moment to work out which of these you might want to spend time learning.
R5 and mutatr were both experimental systems that never made it into real-world use before being abandoned. It’s almost impossible to use these, even if you wanted to.
The OOP package is defunct and no longer available. Cross that off your list.
proto had a brief moment of popularity as the underlying system in early versions of the ggplot2 package but is no longer really used. Again, you shouldn’t consider this for your projects.
R.oo has been around for years, but hasn’t really taken off. Don’t bother with it.
That’s great progress. Already we’ve narrowed down our list from nine to four. Let’s take a look at the others in a bit more depth.
The S3 system was introduced in the third version of the S language that was the precursor to R. It’s been around since the 1980s, so it’s completely mature, and still in wide use.
S3 is a very simple system. In fact, it implements only one feature of object-oriented programming: the ability to have functions work in different ways on different types of object. It’s a one-trick pony, but it’s a great trick. Using S3 is a fundamental R skill that you need to learn.
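To make that one trick concrete, here is a minimal sketch of S3 dispatch. The `dog` class, the `new_dog()` constructor, and the `speak()` generic are all made-up names for illustration; only `structure()` and `UseMethod()` come from base R.

```r
# Hypothetical constructor: an S3 object is just a value with a class attribute.
new_dog <- function(name) {
  structure(list(name = name), class = "dog")
}

# The generic hands off to a method chosen by the class of its first argument.
speak <- function(x, ...) {
  UseMethod("speak")
}

# Method used when x has class "dog".
speak.dog <- function(x, ...) {
  paste(x$name, "says woof")
}

# Fallback used for any class without its own method.
speak.default <- function(x, ...) {
  "..."
}

rex_says <- speak(new_dog("Rex"))  # dispatches to speak.dog
number_says <- speak(42)           # dispatches to speak.default
```

The same call, `speak()`, does different work depending on what it is given, which is exactly the behavior you already rely on with functions like `print()` and `summary()`.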
S4 was introduced, as you might be able to guess, in the fourth version of S. Again, that means that it is mature, and there’s a lot of code built with it. Unfortunately, it’s also a bit weird, so in most cases I don’t recommend S4 as a first choice for new projects. There is one exception to this: most of the packages on Bioconductor use S4, so if you work with ‘omics data, then S4 is an essential skill.
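For a feel of how S4 differs, here is a small sketch using the methods package that ships with R. The `Person` class and the `describe()` generic are invented for this example; they are not taken from Bioconductor or any other package.

```r
library(methods)

# Hypothetical class: S4 classes declare their slots and slot types up front.
setClass("Person", slots = c(name = "character", age = "numeric"))

# Generics are created explicitly, then methods are registered against classes.
setGeneric("describe", function(object, ...) standardGeneric("describe"))

setMethod("describe", "Person", function(object, ...) {
  # Slots are accessed with @ rather than $.
  paste0(object@name, " (", object@age, ")")
})

alice <- new("Person", name = "Alice", age = 30)
description <- describe(alice)

# Slot types are checked at creation, so a numeric name is rejected.
bad <- try(new("Person", name = 1, age = 30), silent = TRUE)
```

Compared with S3 there is more ceremony, but in return the class definition is formal and objects are validated when they are created.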
ReferenceClasses are an attempt to create a system that behaves similarly to popular object-oriented languages like Java or C#. You get powerful features like encapsulation and inheritance. However, I’m hesitant to recommend them as your first choice for new projects because of the last system that we’re going to discuss.
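As a rough illustration of that Java-like style, here is a sketch built with `setRefClass()` from the methods package. The `Account` class, its `balance` field, and its `deposit()` method are hypothetical names chosen for the example.

```r
library(methods)

# Hypothetical class: fields and methods live together inside the object.
Account <- setRefClass(
  "Account",
  fields = list(balance = "numeric"),
  methods = list(
    deposit = function(amount) {
      # Fields are modified in place with <<- inside a method.
      balance <<- balance + amount
      invisible(.self)
    }
  )
)

acct <- Account$new(balance = 100)
acct$deposit(50)

# Assignment does not copy: both names refer to the same object.
same_acct <- acct
same_acct$deposit(25)

final_balance <- acct$balance
```

Notice that the deposit made through `same_acct` shows up in `acct` too. That reference behavior is the big departure from the copy-on-modify semantics of most R objects, including S3 and S4 ones.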
R6 covers much of the same ground as ReferenceClasses but does so in a simpler way. This means that the code is a little easier to work with and performs better.
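Here is a comparable sketch in R6, so you can see the family resemblance. It assumes the R6 package has been installed from CRAN; the `Account` class and its members are again hypothetical.

```r
library(R6)  # assumption: R6 is installed, e.g. via install.packages("R6")

# Hypothetical class: public members are listed together in one place.
Account <- R6Class(
  "Account",
  public = list(
    balance = 0,
    initialize = function(balance = 0) {
      self$balance <- balance
    },
    deposit = function(amount) {
      # Members are reached through self; returning self allows chaining.
      self$balance <- self$balance + amount
      invisible(self)
    }
  )
)

acct <- Account$new(balance = 100)
acct$deposit(50)$deposit(25)

final_balance <- acct$balance
```

The shape is close to the ReferenceClasses version, but members are accessed through `self` with ordinary assignment rather than `<<-`, which many people find easier to read.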
15. Summary
To summarise, if you want to work with object-oriented programming, S3 is the first thing that you should learn. It covers most of what you need for day-to-day data analysis. For the occasions where you need something more powerful, use R6. If you work with Bioconductor packages, learn S4 as well. Finally, if you have some spare time, learn ReferenceClasses.
16. Let's practice!
Now, let's try some examples.