1. Running MLflow Projects
Now that we have learned the basic structure for creating an MLflow Project, let's learn how to execute it.
2. API and command line
MLflow Projects can be executed programmatically using an API from MLflow and can also be run from the command line.
Offering both a programmatic and a command-line interface provides flexibility for automation and makes it possible to chain multiple Projects together into workflows.
3. Projects API
MLflow has a module called MLflow-dot-projects that provides an API for running MLflow Projects.
The MLflow-dot-projects module contains a run method for running either a local or Git-stored project.
The run method accepts several arguments, as shown in the sketch after this list:
uri, which points to the location of the Project, either a local directory or a Git repository.
entry_point, which specifies the entry point in the MLproject file that the run method should execute.
experiment_name, which specifies the experiment that the training run should be logged to.
env_manager, which specifies which Python environment manager to use.
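Here is a minimal sketch of calling the run method with these arguments; the URI, experiment name, and environment manager values are placeholders for illustration.
```python
import mlflow

# Illustrative values only; replace the URI, experiment name, and
# environment manager with your own
submitted_run = mlflow.projects.run(
    uri="https://github.com/username/my-project",  # local path or Git URL
    entry_point="main",
    experiment_name="My Experiment",
    env_manager="virtualenv",  # environment manager used to build the run environment
)
```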
4. MLproject
Before executing our Project using the run method, let's review our local MLproject file which defines our MLflow Project.
We are using an entry point called main that executes a Python file called train_model. We also point to a python_env file that defines our Python environment.
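A minimal sketch of such an MLproject file is shown below; the project name is an assumption.
```yaml
# MLproject -- minimal sketch; the project name is an assumption
name: salary_model
python_env: python_env.yaml

entry_points:
  main:
    command: "python train_model.py"
```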
5. train_model.py
The train_model file is used to train a linear regression model to predict the salary of someone based on experience, age, and an interview score.
Here we import all needed Python libraries and modules and set up the training data from a Salary_predict CSV file.
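A minimal sketch of this part of train_model.py could look like the following; the exact column names in the CSV file are assumptions.
```python
# train_model.py (first part) -- minimal sketch
import pandas as pd
import mlflow
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Load the training data (column names are assumed for illustration)
data = pd.read_csv("Salary_predict.csv")
X = data[["experience", "age", "interview_score"]]
y = data["salary"]
```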
6. train_model.py
In our code, we split our data into training and test data.
Then we use the autolog method from our model flavor to automatically log metrics and parameters to MLflow Tracking during the training run.
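Continuing the sketch, the split and autologging steps might look like this; the test size and the scikit-learn flavor are assumptions.
```python
# Split the data into training and test sets (split ratio assumed)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Enable autologging for the scikit-learn flavor so parameters and
# metrics are logged to MLflow Tracking during training
mlflow.sklearn.autolog()

# Train the linear regression model; autolog captures the fit
model = LinearRegression()
model.fit(X_train, y_train)
```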
7. Projects run
To execute our MLflow Project with the run method from the MLflow-dot-projects module, we begin by importing MLflow.
When we call MLflow-dot-projects run, we set the uri argument equal to dot-forward-slash, which represents the current directory where the code is executed.
We define the entry_point as main which will execute the train_model file. We also define the experiment_name as "Salary Model" to log our run to the specified experiment.
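Putting this together, the call described here looks like the following sketch.
```python
import mlflow

# Run the local MLflow Project: "./" is the current directory,
# "main" is the entry point, and the run is logged to "Salary Model"
mlflow.projects.run(
    uri="./",
    entry_point="main",
    experiment_name="Salary Model",
)
```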
8. Run output
When our Project is executed, the MLflow Projects API begins by creating a new experiment, since the specified experiment did not yet exist.
Then a new Python environment is created, and dependencies are installed according to the python_env file referenced in our MLproject file.
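For reference, a python_env file for this Project could look like the following sketch; the Python version and dependency list are assumptions.
```yaml
# python_env.yaml -- minimal sketch; version and dependencies are assumptions
python: "3.10"
build_dependencies:
  - pip
dependencies:
  - mlflow
  - scikit-learn
  - pandas
```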
9. Run output
Once dependencies are installed, our train_model.py script is executed and a new run is created.
Once our script has completed and our model is trained, we receive a message that the run succeeded.
10. MLflow Tracking
Checking the Tracking UI, our run and model were logged to MLflow Tracking successfully.
11. Command line
MLflow Projects can also be executed from the MLflow command line interface using the run command.
The run command supports the same options as the run method in the MLflow Projects module.
12. Run command
Here we use the mlflow run command with dash-dash-entry-point main to specify the main entry point and dash-dash-experiment-dash-name Salary Model to specify the experiment name to use.
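The command described here corresponds to the following; the local URI dot is an assumption, matching the directory used in the earlier API example.
```bash
# Run the local Project with the main entry point, logging to "Salary Model"
mlflow run . --entry-point main --experiment-name "Salary Model"
```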
When executed, it uses the already created Python environment from the previous run.
The train_model.py script is then executed to create a new training run and returns a message when completed.
13. Let's practice!
Now let's test what we learned about running MLflow Projects.