1. Experiment tracking
Now we will discuss machine learning experiments, what experiment tracking is, and why this is an important concept in MLOps.
2. The machine learning experiment
Part of machine learning model development is performing machine learning experiments. In a machine learning experiment, we train and evaluate multiple machine learning models to find the best one.
3. Why is experiment tracking important?
During machine learning experiments, we can configure different machine learning models, for example, linear regression or deep neural networks. We can alter the model hyperparameters, like the number of layers in a neural network. We could use different versions of the data and different scripts to run the experiment. We can also use different environment configuration files per experiment, like what version of Python or R are used and what libraries.
4. Using experiment tracking in the ML lifecycle
But why is tracking all of this so crucial? Well, it helps us compare results, reproduce past experiments, collaborate with our team, and report findings to stakeholders.
5. How to track experiments?
Depending on the maturity of our machine learning development, there are different options to track experiments.
We could start by using a spreadsheet in Excel, where we write down the details of each experiment on each row. If we do a lot of experiments, this requires a lot of manual work.
We could also build our own experiment platform that automatically tracks the experiment during model training. Having our own experiment platform allows us to build a custom solution for our specific process. However, this can cost time and effort.
For most cases, using modern experiment tracking tools is the most effective, even though it might require some time to get familiar with the tool.
6. A machine learning experiment
Imagine we're building a model to classify images as either containing a dog or a cat. In our first experiment, we might train a neural network with one hidden layer using 1,000 images. In the next, we add puppies and kittens to our dataset, bump it up to 2,000 images, and use a model with two hidden layers.
7. The experiment process
A machine learning experiment will roughly go through the following steps.
We begin by formulating a hypothesis based on our business objectives.
Next, we gather the necessary data and then we set our initial hyperparameters.
Before starting the training, we establish our experiment tracking to log every detail meticulously.
Then, we proceed to train and evaluate the model, carefully storing information about the dataset and other relevant settings.
Once we've identified the best-performing model, we register it, along with all associated experiment details.
Finally, we visualize and report the results, sharing them with the team and stakeholders to plan our next steps.
8. Let's practice!
And that's it. Now you understand the basics of machine learning experiments and the importance of tracking them. Time to put these concepts into practice in the exercises ahead.