GitHub Actions for Machine Learning
This presentation discusses the use of GitHub Actions to automate certain steps of a toy ML project.
github mlops scikit-learn wandb code github-actions tutorial

In this deck, I discuss the importance of incorporating CI/CD into ML engineering. I walk through a small demo that uses scikit-learn and GitHub Actions to automate parts of an ML project and has a bot comment on a PR with the latest experimental results.

This repository demonstrates how to use GitHub Actions to automate the following steps.

Upon a new commit

  • Automatically authenticate with wandb (Weights & Biases) using a custom GitHub secret.
  • Automatically train a small Random Forest Regressor model on the wine quality dataset.
  • Automatically log training metrics and other important model metrics to wandb.
  • Cache Python dependencies so that unchanged dependencies are not reinstalled each time a run is triggered.
  • Generate a metrics.csv file after a run is successfully completed.
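
The commit-triggered steps above could be driven by a single training script that the workflow runs. What follows is only a minimal sketch under stated assumptions, not the repository's actual code: the function names, the project name `gh-actions-ml-demo`, the hyperparameters, and the choice of MSE and R² as metrics are all hypothetical, and the GitHub secret is assumed to reach the script as the `WANDB_API_KEY` environment variable.

```python
import csv
import os
from pathlib import Path


def train_and_evaluate(X, y, seed=42):
    """Fit a small random forest regressor and return hold-out metrics."""
    # scikit-learn is imported lazily so the helpers below work without it.
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=50, random_state=seed)
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    return {
        "mse": float(mean_squared_error(y_test, preds)),
        "r2": float(r2_score(y_test, preds)),
    }


def log_to_wandb(metrics, project="gh-actions-ml-demo"):
    """Log metrics to wandb and return the run URL (None if no API key is set)."""
    # Assumption: the workflow exposes the GitHub secret as WANDB_API_KEY.
    api_key = os.environ.get("WANDB_API_KEY")
    if not api_key:
        return None
    import wandb

    wandb.login(key=api_key)
    run = wandb.init(project=project)  # hypothetical project name
    run.log(metrics)
    url = run.url
    run.finish()
    return url


def write_metrics_csv(metrics, path="metrics.csv"):
    """Write the metrics as one header row followed by one value row."""
    path = Path(path)
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=sorted(metrics))
        writer.writeheader()
        writer.writerow(metrics)
    return path
```

In a workflow, the script would load the wine quality features and target, call `train_and_evaluate`, pass the resulting dictionary to `log_to_wandb`, and finish with `write_metrics_csv` so that `metrics.csv` exists only after a successful run.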

Upon a new pull request

  • Fetch the latest wandb run URL and comment that on the PR.
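
One way to implement the pull request step is sketched below. It is an illustration under assumptions, not the repository's own implementation, which may rely on an existing action instead. The function names and the comment wording are hypothetical; the wandb project path (`entity/project`), the repository name, the PR number, and the token are expected to be supplied by the workflow. The sketch uses the wandb public API to find the most recent run and the GitHub REST endpoint for issue comments, which also accepts pull request numbers.

```python
import json
import urllib.request


def latest_run_url(project_path):
    """Return the URL of the most recently created wandb run, or None."""
    import wandb  # imported lazily; only needed when querying wandb

    runs = wandb.Api().runs(project_path, order="-created_at")
    for run in runs:
        return run.url  # first item is the newest run
    return None


def build_comment_request(repo, pr_number, run_url, token):
    """Build the POST request that adds a comment to a pull request."""
    url = f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments"
    body = json.dumps({"body": f"Latest wandb run: {run_url}"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


def comment_on_pr(repo, pr_number, run_url, token):
    """Send the comment and return the HTTP status code."""
    request = build_comment_request(repo, pr_number, run_url, token)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the request construction separate from the network call makes it possible to check the payload locally before running the workflow on a real pull request.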

Author: @sayakpaul (Calling `model.fit()` @ https://pyimagesearch.com | Netflix Nerd)