Invariance in time
While you should always start by visualizing your raw data, this is often uninformative when it comes to discriminating between two classes of data points. Data is usually noisy or exhibits complex patterns that aren't discoverable by the naked eye.
Another common technique for finding simple differences between two sets of data is to average across multiple instances of the same class. This can cancel out noise and reveal underlying patterns, although it is not guaranteed to.
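To make the idea concrete, here is a minimal sketch on synthetic data, not the heartbeat recordings. The sine-wave signal, the noise level, and the names `signal`, `noisy`, and `mean_trace` are all assumptions chosen for illustration. It compares the error of a single noisy instance with the error of the average across 50 instances.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical underlying pattern: a 5 Hz sine wave sampled at 200 points
time = np.linspace(0, 1, 200)
signal = np.sin(2 * np.pi * 5 * time)

# 50 noisy instances of the same class, one per column (shape: 200 x 50)
n_instances = 50
noise = rng.normal(scale=1.0, size=(time.size, n_instances))
noisy = signal[:, np.newaxis] + noise

# Average across instances (columns), keeping the time dimension
mean_trace = noisy.mean(axis=1)

# Root-mean-square error relative to the clean signal
rmse_single = np.sqrt(np.mean((noisy[:, 0] - signal) ** 2))
rmse_mean = np.sqrt(np.mean((mean_trace - signal) ** 2))
print(f"single instance RMSE: {rmse_single:.3f}")
print(f"average of {n_instances} RMSE: {rmse_mean:.3f}")
```

The sketch assumes every instance is aligned in time. If the pattern starts at a different moment in each recording, averaging can cancel the pattern along with the noise.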
In this exercise, you'll average across many instances of each class of heartbeat sound.
The two DataFrames (normal and abnormal) and the time array (time) from the previous exercise are available in your workspace.
This exercise is part of the course Machine Learning for Time Series Data in Python.
Exercise instructions
- Average across the audio files contained in normal and abnormal, leaving the time dimension.
- Visualize these averages over time.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Average across the audio files of each DataFrame
mean_normal = np.mean(normal, axis=____)
mean_abnormal = np.mean(abnormal, axis=____)
# Plot each average over time
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 3), sharey=True)
ax1.plot(____, ____)
ax1.set(title="Normal Data")
ax2.plot(____, ____)
ax2.set(title="Abnormal Data")
plt.show()
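The same steps can be tried end to end on stand-in data. The sketch below is not the course's data or reference solution: it assumes each DataFrame has one row per time point and one column per audio file, and it fills `normal`, `abnormal`, and `time` with random values (the sampling rate `sfreq` and the file count are hypothetical). Under that assumed layout, averaging over the columns leaves the time dimension intact.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# Synthetic stand-ins (assumed layout: rows = time points, columns = files)
sfreq = 100                      # hypothetical sampling rate in Hz
time = np.arange(300) / sfreq    # 3 seconds of samples
n_files = 20
normal = pd.DataFrame(rng.normal(size=(time.size, n_files)), index=time)
abnormal = pd.DataFrame(rng.normal(size=(time.size, n_files)), index=time)

# axis=1 collapses the columns (files), leaving one value per time point
mean_normal = normal.mean(axis=1)
mean_abnormal = abnormal.mean(axis=1)

# Plot each average over time, sharing the y-axis for easy comparison
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 3), sharey=True)
ax1.plot(time, mean_normal)
ax1.set(title="Normal Data")
ax2.plot(time, mean_abnormal)
ax2.set(title="Abnormal Data")
fig.savefig("mean_heartbeats.png")  # written to the working directory
```

If your DataFrames are laid out the other way round (one row per file), the averaging axis would be 0 instead. Checking `normal.shape` against `len(time)` is a quick way to tell which case applies.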