
Moving to parApply

To run code in parallel using the parallel package, the basic workflow has three steps.

  1. Create a cluster using makeCluster().
  2. Do some work.
  3. Stop the cluster using stopCluster().

The simplest way to make a cluster is to pass a number to makeCluster(). This creates a cluster of the default type with that many worker processes, so the code can run on that many cores.
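The three steps can be sketched as follows. This is a minimal sketch, assuming two worker processes are acceptable on your machine. The "work" step here (asking each worker for its process ID with clusterCall()) is purely illustrative and is not part of the exercise.

```r
library(parallel)

# Step 1: create a cluster of the default type with 2 workers
cl <- makeCluster(2)

# Step 2: do some work; here each worker simply reports its process ID
worker_pids <- clusterCall(cl, Sys.getpid)

# Step 3: stop the cluster to release the worker processes
stopCluster(cl)

# One result per worker
length(worker_pids)
```

Stopping the cluster matters because the workers are separate R processes that otherwise keep running in the background.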

The object dd is a data frame with 10 columns and 100 rows. The following code uses apply() to calculate the column medians:

apply(dd, 2, median)
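The exercise does not show how dd is built. As an assumption for illustration only, a data frame of the stated shape could be filled with random normal values like this:

```r
# Hypothetical stand-in for dd: 100 rows and 10 columns of random normals
set.seed(1)
dd <- as.data.frame(matrix(rnorm(1000), nrow = 100, ncol = 10))

# Confirm the shape: rows first, then columns
dim(dd)

# MARGIN = 2 applies the function over columns, giving one median per column
col_medians <- apply(dd, 2, median)
length(col_medians)
```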

To run this in parallel, you swap apply() for parApply(). The arguments to this function are the same, except that it takes a cluster argument before the usual apply() arguments.
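A minimal sketch of the swap, assuming a two-worker cluster and a hypothetical random dd with the shape described above (the real dd is supplied by the exercise environment):

```r
library(parallel)

# Hypothetical stand-in for dd: 100 rows and 10 columns of random normals
set.seed(1)
dd <- as.data.frame(matrix(rnorm(1000), nrow = 100, ncol = 10))

cl <- makeCluster(2)

# Same arguments as apply(dd, 2, median), with the cluster passed first
par_medians <- parApply(cl, dd, 2, median)

stopCluster(cl)

# The parallel result should match the serial one (names dropped to compare values only)
all.equal(unname(par_medians), unname(apply(dd, 2, median)))
```

For a data frame this small, the cost of starting workers and sending data to them will likely outweigh any gain, so the sketch shows the calling pattern rather than a speed-up.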

This is a part of the course

“Writing Efficient R Code”


Exercise instructions

  • Use the detectCores() function to print the number of available cores to the console.
  • Create a cluster using makeCluster(); set the number of cores used equal to 2. Save the result as cl.
  • Rewrite the above apply() call to use parApply(). Remember, the first argument should now be the cluster object, cl.
  • Stop the cluster using stopCluster().
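As a brief aside on the first instruction, the sketch below shows how detectCores() can be queried. The variable names n_cores and n_physical are illustrative. Note that detectCores() counts logical cores by default and can return NA when the count cannot be determined.

```r
library(parallel)

# Logical cores (the default); may be NA on platforms where detection fails
n_cores <- detectCores()

# Physical cores, where the platform supports the distinction
n_physical <- detectCores(logical = FALSE)

print(n_cores)
print(n_physical)
```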

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Load the parallel package
library(parallel)

# Determine the number of available cores
___

# Create a cluster via makeCluster
cl <- makeCluster(___)

# Parallelize this code
apply(dd, 2, median)

# Stop the cluster
stopCluster(cl)
