1. GitHub Actions workflow for Hyperparameter Tuning
Hello again! In this video, we are going to discuss setting up a GitHub Actions pipeline that performs hyperparameter tuning.
2. Branching workflow
Since we have designed our hyperparameter tuning and training to be loosely coupled, we should trigger them independently from separate branches. We can do that by writing an appropriate if condition in the GitHub Actions workflow.
Our hyperparameter tuning job also prints the tuning results and opens a new pull request with the best configuration written to the parameter file.
This new PR is meant to kick off the training job on the entire training dataset with the training configuration generated by the hyperparameter run.
3. Setting conditionals
Here is the relevant section of the GitHub Actions YAML. Using the if conditional, we allow the job to run only when the branch name starts with hp_tune/. github.head_ref refers to the source branch of the pull request in a workflow run, so it is only set for pull request events. Notice that we are not wrapping the context in the usual dollar sign and double curly braces here, because GitHub Actions automatically evaluates the if conditional as an expression.
Additionally, we run hyperparameter tuning in the DVC step.
Similarly, we can guard the training branch by an appropriate prefix and run training in the DVC step.
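A minimal sketch of the two guarded jobs might look like the following. Only the if conditions and the branch prefixes come from the description above; the file name, the job names, the single-workflow layout, and the DVC stage names hp_tune and train are assumptions for illustration.

```yaml
# .github/workflows/ml.yml (hypothetical file name)
name: model-pipeline
on: pull_request   # github.head_ref is only populated for pull request events

jobs:
  hp_tune:
    # No ${{ }} wrapper needed: the if key is evaluated as an expression
    if: startsWith(github.head_ref, 'hp_tune/')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: iterative/setup-dvc@v1
      - name: Run DVC hyperparameter tuning
        run: dvc repro hp_tune   # stage name is an assumption

  train:
    if: startsWith(github.head_ref, 'train/')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: iterative/setup-dvc@v1
      - name: Run DVC training
        run: dvc repro train     # stage name is an assumption
```

With this layout, a PR from a branch such as hp_tune/first-try runs only the first job, and the second is reported as skipped.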
4. Setup workflow permissions
In order to create pull requests from a workflow, we need to set permissions. Navigate to the repository Settings, then Actions, then General, scroll down to Workflow permissions, and enable the option that allows GitHub Actions to create and approve pull requests.
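Alongside the repository setting, a workflow can also declare the token scopes it needs with the permissions key. The fragment below is an assumption about what this pipeline would require, not something shown in the video, and it does not replace the repository setting described above.

```yaml
# Top-level (or per-job) key in the workflow file
permissions:
  contents: write        # push the new training branch
  pull-requests: write   # open the PR and comment on it
```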
5. Hyperparameter tuning job kickoff
Upon opening the Hyperparameter tuning PR, we see that the relevant workflow has been triggered appropriately, while the training one has been skipped.
6. Hyperparameter tuning job metrics
Upon completion, the GitHub Actions bot comments on the hyperparameter tuning PR with the same results table that we output as Markdown. We have used the cml comment create command for this purpose.
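As a rough sketch, the commenting step could look like this. The file name hp_tuning_results.md is hypothetical, and installing CML through the iterative/setup-cml action is an assumption about how the runner is prepared.

```yaml
- uses: iterative/setup-cml@v2
- name: Post tuning results to the PR
  env:
    # CML reads its token from the REPO_TOKEN environment variable
    REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    # hp_tuning_results.md is a hypothetical name for the Markdown results table
    cml comment create hp_tuning_results.md
```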
7. Creating a training PR from hyperparameter run
Let's now take a look at augmenting the hyperparameter tuning workflow to open a new PR with the updated model training parameter configuration. First, we create a branch name starting with train/, followed by a shortened SHA hash of the commit.
Then we create a PR by using the cml pr create command, which takes the user info, the commit message, and the target branch name, followed by the name of the file to commit.
This abstraction of creating a PR using CML is quite helpful, and it uses the GITHUB_TOKEN to get the elevated permissions it needs.
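The two steps just described can be sketched as a single workflow step. The bot name and email, the commit message, the main target branch, and the file name best_params.json are all hypothetical, and the flag names follow the CML command reference as we understand it, so it is worth confirming them with cml pr create --help for the installed version.

```yaml
- name: Open a training PR with the best parameters
  env:
    REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    # Branch name: train/ followed by the short SHA of the checked-out commit
    BRANCH_NAME="train/$(git rev-parse --short HEAD)"
    # Commit the updated parameter file to the new branch and open a PR
    cml pr create \
      --user-email hp-bot@example.com \
      --user-name HPBot \
      --message "Update best hyperparameters" \
      --branch "$BRANCH_NAME" \
      --target-branch main \
      best_params.json
```

Because the branch name begins with train/, the resulting PR matches the training guard rather than the tuning one.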
8. New training branch PR
Once the hyperparameter tuning workflow runs to completion, we see that it creates a new PR for model training.
9. New training branch PR
We can confirm that the git diff shows the changes in the best parameter configuration file.
10. Starting training run manually
When we use the repository's GITHUB_TOKEN to perform tasks, events triggered by the GITHUB_TOKEN will not create a new workflow run. This is a safety measure that prevents us from accidentally creating recursive workflow runs.
To circumvent the issue, one may create a personal access token with permissions equivalent to those of the GITHUB_TOKEN, store it as a repository secret, and reference it in the steps in the usual fashion.
Another alternative could be to kick off training right after hyperparameter tuning. Still, a manual trigger would work best because it allows us to inspect the tuning job's results and decide whether to kick off training. To do this, simply check out the branch locally, make an empty commit, and force push.
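As a sketch of the personal access token route, the only change to the PR-creating step is which secret feeds the token. The secret name PERSONAL_ACCESS_TOKEN and the file name best_params.json are hypothetical.

```yaml
- name: Open a training PR that can trigger workflows
  env:
    # PERSONAL_ACCESS_TOKEN is a hypothetical secret holding the personal access token;
    # events created with it are not subject to the GITHUB_TOKEN restriction
    REPO_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
  run: cml pr create best_params.json
```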
11. Training job kickoff
This kicks off the training run and skips the tuning run. The training run can be configured to print or compare metrics and plots, as discussed in the first video of this chapter.
Note that to trigger the training run independently, you can simply create a new branch starting with train/, make changes, and open a PR.
12. Let's practice!
It is time to test your knowledge of hyperparameter tuning using GitHub Actions.