1. Plotting multiple variables
Now that we've seen how to build scatter and line plots, let's explore how to customize our plots further and include multiple variables in a single figure, which can help uncover relationships within the data.
2. Exploring our dataset
We'll continue working with the funds DataFrame, looking at daily prices and volume of the Invesco QQQ Trust.
We define the daily range as the difference between the high and low prices, divided by the opening price and multiplied by one hundred, so it is expressed as a percentage of the open. We assign the result to the volatility column of the qqq DataFrame.
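As a minimal sketch of this calculation, assume the price columns are named open, high, and low. The three rows below are made-up values standing in for the real QQQ data, so only the formula itself comes from the video.

```julia
using DataFrames

# Hypothetical stand-in for the QQQ data: the values and the
# `open`, `high`, `low` column names are assumptions for illustration.
qqq = DataFrame(
    open = [300.0, 302.0, 298.0],
    high = [303.0, 305.0, 301.0],
    low  = [297.0, 299.0, 295.0],
)

# Daily range as a percentage of the opening price.
# The dotted operators apply the arithmetic element by element.
qqq.volatility = (qqq.high .- qqq.low) ./ qqq.open .* 100
```

For the first row, the range is 303 minus 297, which is 6, and 6 divided by 300 times one hundred gives a volatility of 2 percent.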
3. Customizing a scatter plot
We pass the volatility and volume columns to the scatter function to make a scatter plot of daily traded volume versus volatility. We also add axis labels and a title for clarity.
To hide the legend, we set the label argument to false. With only one series in the plot, this leaves no entry to show, so no legend is drawn.
To alter the color of the markers, we can specify a predefined color name using the markercolor argument. In this example, we've chosen ivory2.
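Put together, the customized scatter plot might look like the sketch below. The data values, the axis label text, and the title text are placeholders we made up for illustration; the keyword arguments are the ones described above.

```julia
using DataFrames, Plots

# Hypothetical values standing in for the real qqq columns
qqq = DataFrame(
    volatility = [1.2, 0.8, 2.5, 1.9, 3.1],
    volume     = [35.0, 28.0, 61.0, 47.0, 70.0],  # millions of shares (made up)
)

p = scatter(
    qqq.volatility, qqq.volume;
    xlabel = "Volatility (%)",               # placeholder label text
    ylabel = "Volume (millions)",            # placeholder label text
    title = "Daily Volume vs. Volatility",   # placeholder title text
    label = false,                           # no legend entry for this series
    markercolor = :ivory2,                   # named color from Colors.jl
)
```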
Here are some commonly used colors to help you get started with customization. For a comprehensive list of available colors, please refer to the documentation of the Colors.jl package.
4. Exclamation mark notation
We can recreate the same plot by using the exclamation mark notation.
First, we call the scatter function with the same arguments as before but without specifying a title and axis labels.
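Next, we modify the existing figure using the title, xlabel, and ylabel functions, each followed by an exclamation mark. By Julia convention, the exclamation mark signals that a function modifies an existing object, here the current figure, rather than creating a new one.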
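Here is a short sketch of the same plot built in two steps. As before, the data values and the label and title text are placeholders, not the values from the video.

```julia
using DataFrames, Plots

# Hypothetical values standing in for the real qqq columns
qqq = DataFrame(
    volatility = [1.2, 0.8, 2.5, 1.9, 3.1],
    volume     = [35.0, 28.0, 61.0, 47.0, 70.0],
)

# Step 1: create the figure without a title or axis labels
scatter(qqq.volatility, qqq.volume; label = false, markercolor = :ivory2)

# Step 2: each call ending in ! updates the current figure in place
title!("Daily Volume vs. Volatility")
xlabel!("Volatility (%)")
ylabel!("Volume (millions)")
```

The two-step form is handy when we want to build a figure incrementally, for example adding labels only after checking that the data looks right.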
5. Correlation
Scatter plots are a helpful tool to visualize the correlation between two variables. Correlation is a statistical measure that describes the relationship and strength of association between two variables.
Correlation can be positive, meaning that variables increase together, or negative, meaning that variables move in opposite directions.
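To make the sign of a correlation concrete, here is a small sketch using the cor function from Julia's Statistics standard library. The vectors are invented for illustration and are not taken from the QQQ data.

```julia
using Statistics

x      = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up   = [2.1, 3.9, 6.2, 8.0, 9.8]   # rises as x rises
y_down = [9.7, 8.1, 6.0, 4.2, 1.9]   # falls as x rises

r_up   = cor(x, y_up)     # positive: the variables increase together
r_down = cor(x, y_down)   # negative: the variables move in opposite directions
```

The coefficient always lies between minus one and one, and values near either end indicate a strong linear association.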
A regression line in a scatter plot summarizes the relationship between two variables. The sign of its slope shows the direction of the correlation, either positive or negative, while how tightly the points cluster around the line indicates its strength.
6. Adding a regression line
To add a regression line to a scatter plot, set the smooth argument to true.
We can customize the width and color of the regression line by setting the linewidth and linecolor arguments. In this case, we use a linewidth of 2.5 and set linecolor to magenta3 to highlight the regression line.
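A sketch of the scatter plot with its regression line could look like this. The data values are made up so that volume tends to rise with volatility; the smooth, linewidth, and linecolor settings are the ones mentioned above.

```julia
using DataFrames, Plots

# Hypothetical values standing in for the real qqq columns
qqq = DataFrame(
    volatility = [1.2, 0.8, 2.5, 1.9, 3.1],
    volume     = [35.0, 28.0, 61.0, 47.0, 70.0],
)

p = scatter(
    qqq.volatility, qqq.volume;
    label = false,
    markercolor = :ivory2,
    smooth = true,           # draw a linear regression line through the points
    linewidth = 2.5,         # width of the regression line
    linecolor = :magenta3,   # color of the regression line
)
```

Note that linewidth and linecolor affect the regression line only, while markercolor still controls the points.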
Note that the regression line slopes upward, which indicates a positive correlation between traded volume and volatility.
7. Multiple line plots
Let us explore how to plot multiple variables on the same graph.
We pass both the high and low columns as the second argument of the plot function, enclosed in square brackets and separated by spaces. This places the two columns side by side in a matrix, and each column of that matrix is drawn as its own line.
We then set the label keyword argument by passing multiple strings to it in row vector form, again separated by spaces rather than commas. Plots.jl matches the entries of a row vector to the series one by one, so this format is essential to ensure that each curve gets its own label.
Also, we can assign different line widths by passing multiple values to the linewidth argument in the same row vector form. Here, we highlight the high prices by making the line thicker.
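The sketch below shows this pattern on a few made-up rows. We assume the first argument is a date column; the dates, prices, label text, and width values are all placeholders chosen for illustration.

```julia
using DataFrames, Dates, Plots

# Hypothetical stand-in for the QQQ data; `date` as the x-axis is an assumption
qqq = DataFrame(
    date = collect(Date(2023, 1, 2):Day(1):Date(2023, 1, 6)),
    high = [303.0, 305.0, 301.0, 304.0, 307.0],
    low  = [297.0, 299.0, 295.0, 298.0, 301.0],
)

# Space-separated columns form a 5×2 matrix: one line per column
prices = [qqq.high qqq.low]

p = plot(
    qqq.date, prices;
    label = ["High" "Low"],   # 1×2 row: one label per line
    linewidth = [3 1],        # 1×2 row: thicker line for the high prices
)
```

Writing the labels with commas, as in ["High", "Low"], would create a column vector instead of a row, which is why the space-separated form matters here.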
8. Multiple plots done differently
We can also plot these same variables using the exclamation mark notation.
First, we plot the daily high prices separately. We again use the label and linewidth arguments to add a label for the first plot and set its line width.
Next, we add a second line to the figure by calling the plot function followed by an exclamation mark. We also set the label and linewidth arguments for this line.
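Here is the same figure built one line at a time. As before, the date column, the data values, and the specific labels and widths are assumptions made for this sketch.

```julia
using DataFrames, Dates, Plots

# Hypothetical stand-in for the QQQ data; `date` as the x-axis is an assumption
qqq = DataFrame(
    date = collect(Date(2023, 1, 2):Day(1):Date(2023, 1, 6)),
    high = [303.0, 305.0, 301.0, 304.0, 307.0],
    low  = [297.0, 299.0, 295.0, 298.0, 301.0],
)

# First line: creates a new figure
plot(qqq.date, qqq.high; label = "High", linewidth = 3)

# Second line: plot! adds to the current figure instead of replacing it
plot!(qqq.date, qqq.low; label = "Low", linewidth = 1)
```

Because each call handles a single series, the label and linewidth arguments take plain values here rather than row vectors.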
9. Cheat Sheet
Here's a handy summary of what we've learned in this video!
10. Let's practice!
That's it for this video! We learned some of the basic features of the Plots.jl library, but you can do a lot more with it. Now let's create more plots in the exercises!