Plotting bivariate relationships with rxLinePlot()
We can plot bivariate relationships in large datasets using rxLinePlot().
This exercise is part of the course
Big Data Analysis with Revolution R Enterprise
Exercise instructions
Use rxLinePlot() to plot a few relationships.
The syntax is: rxLinePlot(formula, data, …)
- formula - The formula specifying the relationship that you would like to visualize. For this function, this formula should have one variable on the left side of the ~ that reflects the Y-axis, and one variable on the right side of the ~ that reflects the X-axis. As in lattice and in rxHistogram(), you can also specify conditioning variables after a | symbol.
- data - The dataset in which to look for the variables specified in formula.
- … - Additional arguments
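The layout of such a formula can be inspected in plain R, without RevoScaleR. The sketch below uses hypothetical variable names (y, x, g) purely to show which part of the formula plays which role. It illustrates how R parses the formula and makes no claim about how rxLinePlot() processes it internally.

```r
# Hypothetical variable names, used only to illustrate the formula layout.
f <- y ~ x | g

lhs  <- f[[2]]    # left side of ~  : the Y-axis variable
rhs  <- f[[3]]    # right side of ~ : the call `x | g`
xvar <- rhs[[2]]  # before |        : the X-axis variable
cond <- rhs[[3]]  # after |         : the conditioning variable

print(c(y = deparse(lhs), x = deparse(xvar), cond = deparse(cond)))
```

Because | binds more tightly than ~ in R, everything to the right of ~ is grouped into a single x | g expression, which is why the conditioning variable can simply be appended to the right-hand side.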
Go ahead and make a plot of closing price as a function of the number of days since 1928.
You should see a strong non-linear relationship: the closing price accelerates sharply about 20,000 days after 1928, which corresponds to roughly 54.8 years.
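The conversion from days to years is simple arithmetic and can be checked directly in R. This small sketch assumes 365 days per year, as the text does; the names days and years are only illustrative.

```r
# Convert the approximate turning point from days to years (365 days per year assumed).
days  <- 20000
years <- round(days / 365, 1)
print(years)
```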
We can create different panels for different subsets of our data, just as we could with rxHistogram(): we simply specify the conditioning variable after a | symbol.
Go ahead and recreate the plot, but now create a different panel for each day of the week.
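Since the conditioning syntax follows lattice, the idea can be sketched with lattice's xyplot() on a small synthetic data frame. Everything in this block (toy, day, price, weekday) is a hypothetical stand-in, not the djiXdf data, and xyplot() is used here in place of rxLinePlot().

```r
library(lattice)

# Synthetic stand-in data; toy, day, price, and weekday are hypothetical names.
toy <- data.frame(
  day     = rep(1:50, times = 3),
  weekday = factor(rep(c("Mon", "Wed", "Fri"), each = 50),
                   levels = c("Mon", "Wed", "Fri"))
)
toy$price <- exp(0.05 * toy$day) + as.integer(toy$weekday)

# One panel per level of the variable that follows the | symbol.
p <- xyplot(price ~ day | weekday, data = toy, type = "l")
print(p)
```

Each level of weekday gets its own panel, and by default all panels share the same axis limits, which makes the subsets easy to compare.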
We can also plot each day of the week as a separate line within the same graph by using the groups argument. This parallels the interface of xyplot() (and other functions) in the lattice package.
Go ahead and replot this relationship, but instead of creating a different panel for each day of the week, create a different line within a single panel.
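The groups idea can likewise be sketched with lattice's xyplot(), which the text names as the model for this argument. The data frame and its column names below are hypothetical stand-ins, not the djiXdf data.

```r
library(lattice)

# Synthetic stand-in data; toy, day, price, and weekday are hypothetical names.
toy <- data.frame(
  day     = rep(1:50, times = 3),
  weekday = factor(rep(c("Mon", "Wed", "Fri"), each = 50),
                   levels = c("Mon", "Wed", "Fri"))
)
toy$price <- exp(0.05 * toy$day) + 5 * as.integer(toy$weekday)

# A single panel with one line per level of the grouping variable.
p <- xyplot(price ~ day, groups = weekday, data = toy,
            type = "l", auto.key = TRUE)
print(p)
```

Compared with conditioning after |, grouping keeps all lines on shared axes in one panel, which makes small differences between groups easier to see but can become cluttered when there are many levels.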
Finally, we can apply transformations to the variables directly in the formula, and they are computed on the fly.
This relationship looks exponential. Go ahead and plot the log of the closing price as a function of the days since 1928.
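To see why a log transform helps with an exponential-looking curve, here is a minimal sketch on synthetic data. The data frame toy and its columns are hypothetical, the growth rate 0.03 is arbitrary, and lattice's xyplot() again stands in for rxLinePlot().

```r
library(lattice)

# Synthetic exponential series; toy, day, and price are hypothetical names.
toy <- data.frame(day = 1:100)
toy$price <- 10 * exp(0.03 * toy$day)

# On the log scale the day-to-day increments are constant,
# so the curve becomes a straight line.
log_steps <- diff(log(toy$price))
print(range(log_steps))

# The transformation is written directly in the formula.
p <- xyplot(log(price) ~ day, data = toy, type = "l")
print(p)
```

If the real data are only approximately exponential, the log-scale plot will be roughly rather than exactly linear, and departures from the line show where the growth rate changed.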
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
## Simple bivariate line plot:
rxLinePlot(___, data = djiXdf)
## Using different panels for different days of the week:
rxLinePlot(Close ~ ___, data = djiXdf)
## Using different groups.
rxLinePlot(Close ~ DaysSince1928, ___ = DayOfWeek, data = djiXdf)
## Simple bivariate line plot, after taking the log() of the ordinate (y) variable.
rxLinePlot(___ ~ DaysSince1928, data = djiXdf)