
Visualizing time series

1. Visualizing time series

Visualizing variables as a function of time is a frequent requirement when working with data, so understanding how to plot dates becomes essential. Let us dive deeper into visualizing time series!

2. Time series

A time series is a collection of data points recorded over a period of time, usually at regular intervals (for example, hourly or daily). The values are ordered in time, are often influenced by past observations, and can display patterns, trends, and seasonality. For instance, here we show monthly tomato prices at a market center in Mumbai over time.

3. Tomato prices

Let us examine these monthly tomato prices at a market in Mumbai. We first sort the tomato DataFrame by date. We then use the plot function to plot the price over time, with the date column as the first argument and the price column as the second. Next, we customize the line plot by assigning it a line width of 2, a color resembling that of tomatoes, and providing a label. Furthermore, we include descriptive axis labels.
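The steps described here could be sketched roughly as follows. The `tomato` DataFrame, its column names `date` and `price`, and all values are made-up stand-ins for the lesson's data, and `:tomato` is just one plausible choice for a tomato-like color.

```julia
using DataFrames, Plots

# Hypothetical stand-in for the lesson's data; dates are stored as strings.
tomato = DataFrame(
    date = ["01-03-2020", "15-01-2021", "10-02-2020"],
    price = [18.0, 25.0, 21.5],
)

# Sort the rows by the date column (still strings at this point).
tomato = sort(tomato, :date)

# Date column first, price column second, then the styling options.
p = plot(
    tomato.date, tomato.price;
    linewidth = 2,           # line width of 2
    color = :tomato,         # named color resembling tomatoes
    label = "Tomato price",  # legend label
    xlabel = "Date",
    ylabel = "Price",
)
```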

4. Tomato prices

This produces a time series plot, but the x-axis tick labels overlap and the dates are not in chronological order. We could manually adjust the xticks parameter and reorder the dates to address this issue. However, the fundamental problem is that the dates in the date column are stored as strings. Notice how these strings ended up sorted in alphabetical order rather than by date. Let's explore a solution to this problem.
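A small illustration of why string dates sort incorrectly, using made-up day-month-year strings (the format in the lesson's data may differ):

```julia
# Hypothetical date strings in day-month-year order.
dates = ["01-03-2020", "15-01-2021", "10-02-2020"]

# Strings are compared character by character, so the day digits decide the
# order first and the year is only considered last.
sorted_strings = sort(dates)

# 1 March 2020 is now placed before 10 February 2020.
println(sorted_strings)
```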

5. Dates with Julia

Working with date variables is made simple by the Dates package in Julia. We start by importing the Dates package so that we can create a variable of type Date. Then, we call the Date function, passing the date as a string containing the year, month, and day. The Dates package also supports strings that contain dates in other formats. To parse them, we specify a date format as the second argument, matching the format of the input strings. The following table shows some of the codes used in the date format. For instance, y represents the year, and u represents the abbreviated month name. For more, check the Dates package documentation.
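As a minimal sketch of the parsing described here, with example strings chosen purely for illustration:

```julia
using Dates

# A string in year-month-day order parses without a format argument.
d1 = Date("2023-01-15")

# Other layouts need a matching date format:
# d = day, m = month number, u = abbreviated month name, y = year.
d2 = Date("15-Jan-2023", dateformat"d-u-y")
d3 = Date("15/01/2023", dateformat"d/m/y")

println(d1, " ", d2, " ", d3)
```

All three calls should yield the same Date value, since they describe the same day in different notations.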

6. Tomato prices with Dates

Using the Dates package, we can convert the date column of the tomato DataFrame to Date type. To achieve this, we apply the Date function element-wise to the date column using the dot notation while specifying the date format. Now, the data can be sorted by date using the sort function. Notice how the resulting DataFrame has a sorted date column containing Date types.
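The conversion and sort might look like the sketch below. The DataFrame contents and the `d-m-y` format are assumptions made for illustration; the real data may use a different layout.

```julia
using DataFrames, Dates

# Hypothetical stand-in for the lesson's data, with string dates.
tomato = DataFrame(
    date = ["01-03-2020", "15-01-2021", "10-02-2020"],
    price = [18.0, 25.0, 21.5],
)

# The dot applies Date to every element of the column.
tomato.date = Date.(tomato.date, dateformat"d-m-y")

# Sorting now compares Date values, so the order is chronological.
tomato = sort(tomato, :date)

println(tomato)
```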

7. Tomato price time series

Following this, we can employ the same code as before to plot the time series of tomato prices. The resulting plot now automatically formats and adjusts the x-axis labels to be presented as dates in an aesthetically pleasing manner. Notice that the issue of label overlapping is resolved.

8. Annotating a plot

We can also add annotations that highlight important dates in our time series plot. Let's create an annotation for the date with the highest tomato price. First, we use the argmax function to find the index of the row with the highest price, and use it to select that row. Next, we call the annotate! function to modify the previous plot. The first two arguments of annotate! specify the x and y coordinates of the annotation. In this case, we set the y coordinate to the maximum price plus two so that the annotation does not overlap with the curve. Finally, we provide the text for the annotation and set the font size using the annotationfontsize argument. The annotation makes our time series plot easier to interpret.
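A rough sketch of the annotation step, again with a made-up `tomato` DataFrame. The annotation text and the font size of 8 are arbitrary choices, not values from the lesson.

```julia
using DataFrames, Dates, Plots

# Hypothetical data, already converted to Date and sorted.
tomato = DataFrame(
    date = Date.(["10-02-2020", "01-03-2020", "15-01-2021"], dateformat"d-m-y"),
    price = [21.5, 18.0, 25.0],
)

p = plot(tomato.date, tomato.price; linewidth = 2, color = :tomato,
         label = "Tomato price", xlabel = "Date", ylabel = "Price")

# argmax returns the index of the largest price; use it to pick the row.
idx = argmax(tomato.price)
peak = tomato[idx, :]

# Place the text slightly above the peak so it does not sit on the curve.
annotate!(p, peak.date, peak.price + 2, "Highest price";
          annotationfontsize = 8)
```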

9. Let's practice!

Now that you know how to plot variables of type Date, let's proceed to the exercises!