Moving to parApply
To run code in parallel using the parallel package, the basic workflow has three steps.
- Create a cluster using makeCluster().
- Do some work.
- Stop the cluster using stopCluster().
The simplest way to make a cluster is to pass a number to makeCluster(). This creates a cluster of the default type with that many worker processes, so the code can run on that many cores.
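The three steps above can be sketched as follows. This is a minimal illustration rather than part of the exercise: the choice of 2 workers and the squaring task are assumptions made for the example, and parSapply() stands in for whatever work you actually need to do.

```r
library(parallel)  # ships with base R, no installation needed

# Step 1: create a cluster of the default type with 2 workers
# (2 is an illustrative choice, not a recommendation)
cl <- makeCluster(2)

# Step 2: do some work; here the workers square a few numbers
squares <- parSapply(cl, 1:4, function(x) x^2)
print(squares)

# Step 3: release the worker processes when finished
stopCluster(cl)
```

Calling stopCluster() matters because the workers are separate R processes that otherwise keep running until the session ends.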
The object dd is a data frame with 10 columns and 100 rows. The following code uses apply() to calculate the column medians:
apply(dd, 2, median)
To run this in parallel, you swap apply() for parApply(). The arguments to this function are the same, except that it takes a cluster argument before the usual apply() arguments.
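As a hedged sketch of that swap, the example below runs both versions side by side. The real dd is not shown in this exercise, so the data frame built here is a hypothetical stand-in with the same shape (100 rows, 10 columns) filled with random numbers, and the 2-worker cluster is likewise an assumption.

```r
library(parallel)

# Hypothetical stand-in for dd: 100 rows, 10 columns of random values
set.seed(1)
dd <- as.data.frame(matrix(rnorm(1000), nrow = 100, ncol = 10))

# Create a 2-worker cluster (illustrative size)
cl <- makeCluster(2)

# Serial version: column medians with apply()
serial_medians <- apply(dd, 2, median)

# Parallel version: same arguments, with the cluster object first
parallel_medians <- parApply(cl, dd, 2, median)

# Always stop the cluster once the work is done
stopCluster(cl)

# Compare the values from both approaches
print(all.equal(unname(serial_medians), unname(parallel_medians)))
```

For a data frame this small, the cost of starting workers and sending data to them will likely outweigh any gain; the point here is only to show that the call signature changes by one leading argument.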
This exercise is part of the course
Writing Efficient R Code
Exercise instructions
- Use the detectCores() function to print the number of available cores to the console.
- Create a cluster using makeCluster(); set the number of cores used equal to 2. Save the result as cl.
- Rewrite the above apply() function as parApply(). Remember, the first argument should now be the cluster object, cl.
- Stop the cluster using stopCluster().
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Determine the number of available cores
___
# Create a cluster via makeCluster
cl <- makeCluster(___)
# Parallelize this code
apply(dd, 2, median)
# Stop the cluster
stopCluster(cl)