
Parallelizing calls to chunk.apply

The chunk.apply() function can also use parallel processes to process data more quickly. When the CH.PARALLEL parameter is set to a value greater than one on a Linux or Unix machine (including macOS), multiple processes read and process data at the same time, which reduces the execution time. On Windows, the CH.PARALLEL parameter is ignored.
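As a minimal sketch of how CH.PARALLEL fits into a call, the example below writes a small temporary CSV and sums one column chunk by chunk, once with a single process and once with two. The file contents, the helper names sum_values() and total_with(), and the choice of CH.MERGE = c are assumptions made for illustration and are not part of the exercise. It assumes the iotools package is installed.

```r
library(iotools)

# Hypothetical data: a header line plus 50,000 rows of (id, value).
path <- tempfile(fileext = ".csv")
n <- 50000L
write.csv(data.frame(id = seq_len(n), value = rep(1:5, length.out = n)),
          path, row.names = FALSE, quote = FALSE)

# Per-chunk worker (illustrative): parse the raw chunk into a data frame
# and return the sum of its second column.
sum_values <- function(chunk) {
    d <- dstrsplit(chunk, col_types = c("integer", "integer"), sep = ",")
    sum(d[[2]])
}

# Read the file in chunks of at most 1e5 bytes using `parallel` processes.
total_with <- function(parallel) {
    fc <- file(path, "rb")
    on.exit(close(fc))        # close the connection even if an error occurs
    readLines(fc, n = 1)      # skip the header line
    parts <- chunk.apply(fc, sum_values,
                         CH.MERGE = c,            # combine per-chunk sums into a vector
                         CH.MAX.SIZE = 1e5,
                         CH.PARALLEL = parallel)
    sum(parts)
}

# Both settings should produce the same total; only the run time may differ.
identical(total_with(1), total_with(2))
```

On a file this small, the cost of starting extra processes may outweigh any gain, so a timing comparison is only meaningful on larger inputs.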

This exercise is part of the course Scalable Data Processing in R.

Exercise instructions

  • Benchmark the function iotools_read_fun(), first with 1 process and then with 3 parallel processes.

Hands-on interactive exercise

Try this exercise by completing this sample code.

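iotools_read_fun <- function(parallel) {
    # Open the file in binary mode and skip the header line
    fc <- file("mortgage-sample.csv", "rb")
    readLines(fc, n = 1)
    # Apply make_msa_table() to chunks of at most 1e5 bytes,
    # using `parallel` processes
    chunk.apply(fc, make_msa_table,
                CH.MAX.SIZE = 1e5, CH.PARALLEL = parallel)
    close(fc)
}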

# Benchmark the new function
microbenchmark(
    # Use one process
    ___, 
    # Use three processes
    ___, 
    times = 20
)