What is your ML test score?

Tania Allard

Abstract

Machine learning in production is different from ML in an R&D environment. So how can you test your ML models successfully before they even reach production? This session will present a number of techniques to appropriately test the quality and decay of your ML models in both your R&D and production environments.

Description

Using machine learning in real-world applications and production systems is a very complex task involving issues rarely encountered in toy problems, R&D environments, or offline cases. Testing, monitoring, and logging are key considerations for assessing the decay, current status, and production-readiness of machine learning systems. But how much of each is enough? Where do we even get started? Who should be responsible for the testing and the monitoring? How often have you heard the phrase “test in production” when it comes to machine learning? If your answer is “too often”, perhaps you need to change your strategy.

We will cover some of the most frequent issues encountered in real-life ML applications and how we can make our systems more robust. We will also consider a number of indicators pointing to the decay of models or algorithms in production systems. By the end of the talk, the audience will have a clear rubric with actionable tests and examples to ensure that the quality of a model in production is adequate. The talk will also provide valuable guidelines to help engineers, DevOps, and data scientists evaluate and improve the quality of ML models before they even reach the production stage.

Bio

Tania is a Research Engineer and Microsoft developer advocate with vast experience in academic research and industrial environments. Her main areas of expertise are data-intensive applications, scientific computing, and machine learning. Another of her areas of expertise is the improvement of processes, reproducibility, and transparency in research, data science, and artificial intelligence.
Over the last few years, she has trained hundreds of people on reproducible scientific computing workflows and on ML model testing, monitoring, and scaling, and has delivered talks on these topics worldwide.

She is passionate about mentoring, open source, and its community, and is involved in a number of initiatives aimed at building more diverse and inclusive communities. She is also a contributor, maintainer, and developer of a number of open source projects and the Founder of PyLadies NorthWest UK.