Here be dragons
Before you get too excited, a word of warning. There are some things that you just can't do with Spark from R right now. The Scala and Python interfaces to Spark are more mature.
That means that you are sailing into uncharted territory with this course. The trip may be a little rough, so be prepared to be out of your comfort zone occasionally.
One further note of caution: in this course you'll be running code on your own personal Spark mini-cluster in the DataCamp cloud. This is ideal for learning the concepts of how to use Spark, but you won't get the same performance boost as you would from a remote cluster on a high-performance server. As a result, the examples here won't run any faster than they would in plain R, but you can use the skills you learn here to run analyses on your own big datasets.
If you wish to install Spark on your local system, simply install the sparklyr package and call spark_install().
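As a rough sketch, the local setup might look like the following. The install.packages() and spark_install() calls follow the text above; the spark_connect(master = "local") and spark_disconnect() steps are an added assumption, included only as one way to check that the local installation works. The code needs an internet connection and a working Java installation, and the Spark download can take a while.

```r
# Install the sparklyr package from CRAN (only needed once).
install.packages("sparklyr")

library(sparklyr)

# Download and install a local copy of Spark.
# With no arguments, sparklyr picks its default Spark version.
spark_install()

# Assumption: connect to the local Spark instance to check the setup.
sc <- spark_connect(master = "local")

# Close the connection when you are finished.
spark_disconnect(sc)
```

As with the course cluster, a local connection like this is meant for learning and experimenting, so don't expect a speed-up over plain R.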
Are you sure you want to continue?
This exercise is part of the course
Introduction to Spark with sparklyr in R
