AI Model Experiment Tracking - a guide for Data Scientists

Published on

November 7, 2023

Authors

Sara Gomes

ML Software Engineer, Deeper Insights

What, why and how!

These are the three questions we need to answer to understand what an experiment tracker is for.

Experiment tracking is simply the process of saving all experiment information for each and every experiment you run. That means saving what kind of model you used (or the model itself), what data it was trained/tested on, the training parameters, the resulting metrics, and whatever else might be important so that every experiment can be easily reproduced.
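As a minimal sketch of what one such saved record might look like, the Python below bundles the model name, a reference to the data, the training parameters and the resulting metrics into one JSON line per run. The `ExperimentRecord` class, its field names and the `log_experiment` and `load_experiments` helpers are assumptions made for illustration; they are not taken from any particular tracking tool, and the values are made up.

```python
import json
import tempfile
import time
import uuid
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class ExperimentRecord:
    """Everything needed to reproduce one run (hypothetical schema)."""
    model: str
    dataset: str
    params: dict
    metrics: dict
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: float = field(default_factory=time.time)


def log_experiment(record: ExperimentRecord, path: Path) -> None:
    """Append the record as one JSON line, so every run has the same shape."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")


def load_experiments(path: Path) -> list:
    """Read back every logged run as a list of dictionaries."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Illustrative values only, not real results.
log_path = Path(tempfile.mkdtemp()) / "experiments.jsonl"
log_experiment(
    ExperimentRecord(
        model="small-cnn",
        dataset="train-v1",
        params={"learning_rate": 0.001, "epochs": 10},
        metrics={"accuracy": 0.91},
    ),
    log_path,
)
log_experiment(
    ExperimentRecord(
        model="small-cnn",
        dataset="train-v1",
        params={"learning_rate": 0.01, "epochs": 10},
        metrics={"accuracy": 0.87},
    ),
    log_path,
)
```

Because both runs are written with the same fields, reading them back with `load_experiments(log_path)` gives records that can be compared directly.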

But why should you care? Because it makes for better science! Here's how:

Experiment tracking means all your experiments are saved in a single place. Whether you are working alone or with others, every change tested is logged in a single repository. Knowing what everyone else is working on makes the experimentation process more manageable and makes duplicated experiments less likely.
It's much easier to search, filter and compare experiment results, because every experiment is logged in the exact same way. When you have several people (or even one person over a longer period of time) manually taking note of the tests that were implemented, you can easily end up not having the same information for all experiments. Maybe this one is missing the learning rate, and that one is missing the number of epochs. Whatever it may be, it makes comparing very difficult. Having an automated and standard experiment tracker running solves this problem!
It improves sharing and collaboration by providing shareable dashboards, graphs and images that make writing reports easy (sometimes even automatic). This can be a valuable asset in client communication.
It allows model monitoring in real time. Depending on how the tracker is integrated into the training pipeline, the learning curve can be monitored as training progresses, not just when training ends. This means that bad experiments (or experiments with bugs) can be stopped and fixed before a full training run wastes a lot of time.
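The real-time monitoring point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `LiveMonitor`, its `patience` parameter and `run_training` are hypothetical names, and the list of losses stands in for a real training loop. The monitor receives the loss after every epoch and signals a stop when the loss is not a finite number, or when no new best loss has appeared for `patience` epochs.

```python
import math


class LiveMonitor:
    """Collects the loss after each epoch and flags runs worth stopping (hypothetical helper)."""

    def __init__(self, patience: int = 3):
        self.patience = patience
        self.losses = []

    def log(self, loss: float) -> bool:
        """Record a loss; return True if training should stop."""
        self.losses.append(loss)
        if math.isnan(loss) or math.isinf(loss):
            return True  # most likely a bug or a diverging run
        best_epoch = self.losses.index(min(self.losses))
        # Stop once `patience` epochs have passed without a new best loss.
        return len(self.losses) - 1 - best_epoch >= self.patience


def run_training(losses, monitor: LiveMonitor) -> int:
    """Stand-in for a training loop; returns the epoch at which it ended."""
    for epoch, loss in enumerate(losses, start=1):
        if monitor.log(loss):
            return epoch  # stopped early
    return len(losses)


# Simulated per-epoch losses (illustrative values): the run stops improving after epoch 3.
simulated = [0.9, 0.7, 0.6, 0.65, 0.7, 0.8, 0.9, 1.0]
stopped_at = run_training(simulated, LiveMonitor(patience=3))
print(f"training ended at epoch {stopped_at} of {len(simulated)}")
```

In a real pipeline the same check would run inside the training loop, with each loss also sent to the tracker so the curve can be watched while the run is in progress.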

Now that you are completely convinced that you should implement experiment tracking in all your projects, you may be wondering: how do I do that? Well, there are several publicly available tools, as well as paid ones, for this exact purpose. Here at Deeper Insights (DI) we've built our own Experiment Tracker module, which supports three different off-the-shelf tracking tools and wraps them in an abstract API, so we can switch from one tracker to another without changing the experiment code.
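To illustrate the general idea of an abstract tracking API, here is a minimal Python sketch. It is not the DI module: the `Tracker` interface, the two toy backends and the `make_tracker` factory are all hypothetical, and the backends are simple stand-ins where a real implementation would wrap an off-the-shelf tool's client. The point it shows is that `run_experiment` only talks to the interface, so the backend can be swapped by changing one string.

```python
from abc import ABC, abstractmethod


class Tracker(ABC):
    """Common interface the experiment code talks to (hypothetical)."""

    @abstractmethod
    def log_params(self, params: dict) -> None:
        """Store the training parameters for the current run."""

    @abstractmethod
    def log_metric(self, name: str, value: float, step: int) -> None:
        """Store one metric value at a given step."""


class InMemoryTracker(Tracker):
    """Toy backend that keeps everything in Python objects."""

    def __init__(self):
        self.params = {}
        self.metrics = []

    def log_params(self, params):
        self.params.update(params)

    def log_metric(self, name, value, step):
        self.metrics.append((name, value, step))


class StdoutTracker(Tracker):
    """Toy backend that prints what it receives."""

    def log_params(self, params):
        print(f"params: {params}")

    def log_metric(self, name, value, step):
        print(f"step {step}: {name}={value}")


# A real module would register one adapter per supported tracking tool here.
BACKENDS = {"memory": InMemoryTracker, "stdout": StdoutTracker}


def make_tracker(backend: str) -> Tracker:
    """Build the tracker selected by name, e.g. from a config file."""
    try:
        return BACKENDS[backend]()
    except KeyError:
        raise ValueError(f"unknown tracker backend: {backend!r}") from None


def run_experiment(tracker: Tracker) -> None:
    """Experiment code: depends only on the Tracker interface."""
    tracker.log_params({"learning_rate": 0.001, "epochs": 3})
    for epoch, loss in enumerate([0.9, 0.6, 0.4], start=1):  # illustrative values
        tracker.log_metric("loss", loss, step=epoch)


tracker = make_tracker("memory")
run_experiment(tracker)
```

Calling `run_experiment(make_tracker("stdout"))` runs the same experiment code against the other backend without any change to `run_experiment` itself.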

We continue to add better visualisation tools to our solution, making it easier to build dashboards that compare the results of different runs and help us tune our experiments, no matter what impossible problems we're solving!

Related blogs:

Find out more about how to run a Data Science team here: https://deeperinsights.com/how-to-run-a-data-science-team/

Learn how to manage your AI projects: https://deeperinsights.com/how-to-manage-your-ai-projects/