Reusability in Machine Learning

April 22, 1:26 pm - 1:36 pm (10 Minutes)

In this session we will explore modern techniques and tooling that enable reusability in data and analytics solutions. Creating and leveraging reusable machine-learning code has much in common with traditional software engineering, but it also differs in important respects.

We will discuss ways of developing, delivering, assembling and deploying reusable components. We will compare multi-repos with mono-repos, libraries with micro-libraries, and components with templates and pipelines, and present tooling that fosters discoverability and collaboration. We will touch on code and data dependency resolution and injection, reusable data assets, data lakes and feature stores. Additionally, we will discuss tooling and MLOps automation that enable rapid development and continuous integration/delivery. Throughout, the discussion will link back to functional and non-functional requirements such as modularity, composability, single source of truth, versioning, performance, isolation and security.

This talk aims to cover tools of choice, processes and design patterns for building and sharing production-ready ML components at scale. It will share lessons and battle scars from efforts to avoid reinventing the wheel at one of the largest consultancies, with 2000+ analytics practitioners.

Nayur Khan

Associate Partner - Global Head of Technical Delivery

McKinsey

Nayur is an Associate Partner and the Global Head of Technical Delivery at QuantumBlack.

Over the last few years, his focus has been working with clients as a thought partner and an IT sparring partner at the CxO level, specialising in productionising and scaling Artificial Intelligence (including DataOps and MLOps). He helps his clients move from single-digit AI pilot models to double-digit (in some cases triple-digit) production AI models, running reliably 24 hours a day, 7 days a week, 52 weeks a year.