Microbenchmark revenues
You work as a data analyst for an online seller. You have queried sales data for different products sold during a month. The data is available in your workspace as a list, ls_sales. Each element of this list is a vector of revenues for a given product.
You would like to see how the revenue grew day by day. This means calculating a cumulative sum. Base R has a function called cumsum() to do the job, but you would like to see whether parallelization can help. You want to apply cumsum() to every element of ls_sales sequentially and in parallel and compare the results. The parallel and microbenchmark packages have been loaded for you.
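As a small illustration of the sequential version, the sketch below applies cumsum() to a made-up list. The name ls_sales_demo and its values are hypothetical stand-ins, since the real ls_sales exists only in the exercise workspace.

```r
# Hypothetical stand-in for ls_sales: daily revenues for two made-up products
ls_sales_demo <- list(
  product_a = c(10, 20, 30),
  product_b = c(5, 0, 15)
)

# cumsum() returns the running total of a numeric vector
cumsum(ls_sales_demo$product_a)
# [1] 10 30 60

# lapply() applies cumsum() to each product's vector, one after another
cumulative <- lapply(ls_sales_demo, cumsum)
```

The result is a list with the same names as the input, where each element holds the running revenue total for that product.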
This exercise is part of the course Parallel Programming in R.
Exercise instructions
- Pass the sequential and parallel versions as arguments to a microbenchmark() call.
- Generate a cluster of all available cores minus two.
- Use the cluster to apply cumsum() to ls_sales in parallel using parLapply().
- Stop the cluster once computation is done.
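The cluster steps listed above can be sketched on toy data as follows. This is a minimal sketch and not the exercise solution: ls_sales_demo is a hypothetical stand-in for ls_sales, and the max(1, ...) guard is an added assumption so the code still runs on machines with two or fewer cores.

```r
library(parallel)

# Hypothetical stand-in for ls_sales
ls_sales_demo <- list(
  product_a = c(10, 20, 30),
  product_b = c(5, 0, 15)
)

# All available cores minus two; the guard (an assumption, not part of the
# exercise) keeps at least one worker if detectCores() is small or NA
n_workers <- max(1, detectCores() - 2, na.rm = TRUE)
cluster <- makeCluster(n_workers)

# Apply cumsum() to every list element on the cluster workers
res_par <- parLapply(cluster, ls_sales_demo, cumsum)

# Release the worker processes once the computation is done
stopCluster(cluster)

# Compare against the sequential result
res_seq <- lapply(ls_sales_demo, cumsum)
all.equal(res_seq, res_par)
```

Starting and stopping a cluster has a fixed cost, so for a cheap function such as cumsum() the parallel version is not guaranteed to be faster. That is the question the benchmark in this exercise is meant to answer.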
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Pass the code to microbenchmark
___
  "lapply" = lapply(ls_sales, cumsum),
  "parLapply" = {
    # Make a cluster of all cores minus two
    cluster <- ___
    # Use cluster to apply in parallel
    parLapply(cluster, ___, ___)
    # Stop cluster
    ___
  },
  times = 3
___
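To show how named expressions and the times argument fit together in a microbenchmark() call, here is a hedged sketch on an unrelated toy comparison. The vector x and the Reduce() alternative are illustrative assumptions, not part of the exercise, and the code is skipped if the microbenchmark package is not installed.

```r
# Assumes the microbenchmark package is installed; skipped otherwise
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  x <- runif(1e4)  # hypothetical data

  # Each named argument is an expression to time; times sets the repetitions
  res <- microbenchmark::microbenchmark(
    "cumsum" = cumsum(x),
    "Reduce" = Reduce(`+`, x, accumulate = TRUE),
    times = 3
  )

  # Printing shows min, median, max and related timings per expression
  print(res)

  # Median timing for each expression
  summary(res)$median
}
```

An expression wrapped in braces, such as the "parLapply" block in the sample code, is timed as a whole, so cluster creation and shutdown count toward its measured time.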