Split-Apply-Combine
A common data science problem is to split a data frame by a grouping, apply some transformation to each group, and then recombine the pieces into one data frame. This class of problems is so common in R that it has a name: split-apply-combine. In Intermediate R for Finance, you will explore a number of these problems and the functions that help solve them, but for now, let's work through a simple example.
Suppose, for the cash data frame, you are interested in doubling the cash_flow for company A, and tripling it for company B:
# Split: break cash into a list of data frames, one per company
grouping <- cash$company
split_cash <- split(cash, grouping)
# We can access each list element's cash_flow column by:
split_cash$A$cash_flow
[1] 1000 4000 550
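# Apply: double company A's cash_flow and triple company B's
split_cash$A$cash_flow <- split_cash$A$cash_flow * 2
split_cash$B$cash_flow <- split_cash$B$cash_flow * 3
# Combine: reassemble the pieces into one data frame, in the original row order
new_cash <- unsplit(split_cash, grouping)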
Take a look again at how you access the cash_flow column. The first $ is to access the A element of the split_cash list. The second $ is to access the cash_flow column of the data frame in A.
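The steps above can be run end to end with a small stand-in for cash. This is only a sketch: the real cash data frame is not shown here, so company A's cash flows are taken from the printed output above, while company B's values and the year column are assumptions made for illustration.

```r
# Hypothetical stand-in for the cash data frame. Company A's cash flows match
# the printed output above; company B's values and the year column are assumed.
cash <- data.frame(
  company   = c("A", "A", "A", "B", "B", "B", "B"),
  cash_flow = c(1000, 4000, 550, 1500, 1100, 750, 6000),
  year      = c(1, 3, 4, 1, 2, 4, 5)
)

# Split: one data frame per company, stored in a named list
grouping <- cash$company
split_cash <- split(cash, grouping)

# The chained $ is the same lookup as two [[ ]] calls:
# first the list element A, then its cash_flow column
same_lookup <- identical(split_cash$A$cash_flow,
                         split_cash[["A"]][["cash_flow"]])

# Apply: a different multiplier for each group
split_cash$A$cash_flow <- split_cash$A$cash_flow * 2
split_cash$B$cash_flow <- split_cash$B$cash_flow * 3

# Combine: unsplit() puts the rows back in their original order
new_cash <- unsplit(split_cash, grouping)
print(new_cash)
```

Because unsplit() uses the same grouping that split() used, each row of new_cash lands back in the position it had in cash, with only the cash_flow values changed.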
This exercise is part of the course
Introduction to R for Finance
Instructions
- The split_cash list is available for you. Also, the grouping that was used to split cash is available.
- Print split_cash to get a look at the list.
- Print the cash_flow column for company B in split_cash.
- Tragically, you have learned that company A went out of business. Set the cash_flow for company A to 0.
- Use grouping to unsplit() the split_cash list. Assign this to cash_no_A.
- Finally, print cash_no_A to see the modified data frame.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Print split_cash
# Print the cash_flow column of B in split_cash
split_cash$___$___
# Set the cash_flow column of company A in split_cash to 0
split_cash$___$___ <- ___
# Use the grouping to unsplit split_cash
cash_no_A <- unsplit(___, ___)
# Print cash_no_A