Moving to parApply
To run code in parallel using the parallel package, the basic workflow has three steps.
- Create a cluster using makeCluster().
- Do some work.
- Stop the cluster using stopCluster().
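The three steps above can be sketched as follows. This is a minimal illustration, not the course's own code: the worker count of 2 and the squaring task are arbitrary choices made for the example.

```r
library(parallel)

# Step 1: create a cluster with 2 workers (2 is an arbitrary choice here)
cl <- makeCluster(2)

# Step 2: do some work; the inputs are split across the workers
squares <- parLapply(cl, 1:4, function(x) x^2)

# Step 3: stop the cluster to release the worker processes
stopCluster(cl)

unlist(squares)
```

Stopping the cluster matters because the workers are separate R processes that otherwise keep running in the background.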
The simplest way to make a cluster is to pass a number to makeCluster(). This creates a cluster of the default type, running the code on that many cores.
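As a rough sketch of what passing a number does, the snippet below starts a small cluster and checks how many workers it contains. The cap of 2 workers and the variable names are assumptions made for this example; detectCores() can return NA on some platforms, so the sketch falls back to a single worker in that case.

```r
library(parallel)

# Number of cores reported by the system (may be NA on some platforms)
n_available <- detectCores()

# Use at most 2 workers; fall back to 1 if the core count is unknown
n_workers <- if (is.na(n_available)) 1L else min(2L, n_available)

cl <- makeCluster(n_workers)

# A cluster object behaves like a list with one element per worker
n_started <- length(cl)
print(cl)

stopCluster(cl)
```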
The object dd is a data frame with 10 columns and 100 rows. The following code uses apply() to calculate the column medians:
apply(dd, 2, median)
To run this in parallel, you swap apply() for parApply(). The arguments to this function are the same, except that it takes a cluster argument before the usual apply() arguments.
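To make the swap concrete, here is a small sketch that runs both versions and compares them. The course's dd is not shown on this page, so the data frame below is a stand-in filled with random numbers of the same shape (100 rows, 10 columns); the worker count of 2 is likewise an assumption.

```r
library(parallel)

# Stand-in for the course's dd: 100 rows, 10 columns of random numbers
set.seed(1)
dd <- as.data.frame(matrix(rnorm(1000), nrow = 100, ncol = 10))

# Serial version
serial_medians <- apply(dd, 2, median)

# Parallel version: same arguments, with the cluster object first
cl <- makeCluster(2)
parallel_medians <- parApply(cl, dd, 2, median)
stopCluster(cl)

# Compare the values only, ignoring names
all.equal(unname(serial_medians), unname(parallel_medians))
```

For a task this small, the cost of starting workers and sending data to them will likely outweigh any gain, so the point here is the calling pattern rather than speed.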
This is a part of the course “Writing Efficient R Code”
Exercise instructions
- Use the detectCores() function to print the number of available cores to the console.
- Create a cluster using makeCluster(); set the number of cores used equal to 2. Save the result as cl.
- Rewrite the above apply() function as parApply(). Remember, the first argument should now be the cluster object, cl.
- Stop the cluster using stopCluster().
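One way to avoid forgetting the final step is to wrap the create, work, and stop sequence in a function and register the cleanup with on.exit(). The helper name par_col_medians and its n_workers argument are hypothetical, introduced only for this sketch; they are not part of the parallel package or the course.

```r
library(parallel)

# Hypothetical helper, not part of the parallel package
par_col_medians <- function(data, n_workers = 2) {
  cl <- makeCluster(n_workers)
  # Registered cleanup runs when the function exits, even after an error
  on.exit(stopCluster(cl), add = TRUE)
  parApply(cl, data, 2, median)
}

# Small example matrix with columns 1:3 and 4:6
m <- matrix(1:6, nrow = 3)
result <- par_col_medians(m)
result
```

The trade-off is that a new cluster is started on every call, which is convenient for a one-off computation but wasteful if the function is called repeatedly.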
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Determine the number of available cores
___
# Create a cluster via makeCluster
cl <- makeCluster(___)
# Parallelize this code
apply(dd, 2, median)
# Stop the cluster
stopCluster(cl)
Some problems can be solved faster using multiple cores on your machine. This chapter shows you how to write R code that runs in parallel.