David Self (00:04):
Welcome, everyone, to a discussion on The Development Life Cycle of Machine Learning. My name is David Self and I am a strategist with A Brave New. In this discussion, Paul Welch, Redapt senior vice president of production, will be covering lessons learned, technical debt, development life cycles, industry best practices, and more. Welcome, Paul, and thank you for being here.
Paul Welch (00:25):
Thanks, David. Happy to be here.
David Self (00:27):
Redapt has been working with customers to deploy machine learning projects, both in the cloud and on-premises. What are some of the best practices that drive success in this space?
Paul Welch (00:38):
I think one of the main categories of best practice is how you go about developing machine learning models: applying the lessons learned over the past several decades of software engineering to this slightly different, newer discipline of machine learning model development.
David Self (01:00):
We know that snowflakes build technical debt and also slow your ability to release new and improved production models. In fact, technical debt is estimated to cost businesses $5 trillion in the next 10 years, but it can be managed. What are some of the technical debt drivers that you've seen?
Paul Welch (01:19):
Technical debt in building machine learning models can often come from one-off processes, or dedicated environments where the data science and data engineering teams are not sharing how they do things, how they build and develop the model, and the environments and dependencies they use to build it.
For example, in some cases of experimentation and development, you have to go back and reverse engineer what you did: which libraries you used, and which versions of things are compatible with other things. Take that to a larger scale organization and that technical debt can really slow you down and eat up a lot of your time and budget.
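One way to avoid reverse engineering an environment later is to record it at the time of each experiment. The following is a minimal sketch, not something described in the discussion: it assumes a Python-based workflow, and the function name `snapshot_environment` and the layout of the returned dictionary are hypothetical choices made for illustration.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment():
    """Return the interpreter version and every installed package with its version."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "platform": sys.platform,
        "packages": dict(sorted(packages.items())),
    }


# Serialize the snapshot so it can be stored next to the experiment's outputs.
snapshot_json = json.dumps(snapshot_environment(), indent=2)
```

Storing a snapshot like this alongside each trained model gives the rest of the team a record of which libraries and versions produced it.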
David Self (02:04):
What are some of the ways Redapt has addressed these concerns for their clients?
Paul Welch (02:08):
Redapt has a long history of helping customers adopt cloud architectures and deploy applications to the cloud, using DevOps and SRE principles, as well as software engineering best practices. What we like to do is build on the things we're good at. We take what we've been doing for a long time and we apply that to the slightly newer machine learning model development processes.
David Self (02:39):
Let's talk about how to apply these best practices during the development life cycle. Can you take us through the ML development life cycle, Paul?
Paul Welch (02:49):
At a high level, developing a machine learning model starts with data and the prediction you want out of the model. Sourcing data, cleaning that data, and making it ready to use in the ML training process is a big part. There’s also the feature engineering stuff, where you really focus on which data attributes are most important to being able to make that prediction.
Then, the development stage of building the model looks a lot like traditional software engineering, where you write some code to read the data, create the structure of the model, and execute iterations of predictions, which is done over and over in training cycles. That training step is slightly newer compared to what you do when building traditional software.
Once you're happy with the outcome of the trained model, then you can package that and deploy it to a production environment, just like you would with other traditional software.
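The stages described above can be sketched end to end in a few lines. This is a toy illustration under stated assumptions, not Redapt's process: the data rows, the field names `size`, `noise`, and `price`, the single-feature linear model, and the hyperparameters are all hypothetical. The toy data follows price = 2 * size + 1, so training should land close to a weight of 2 and a bias of 1.

```python
import json

# Stage 1: source and clean the data. None marks a missing value (hypothetical rows).
raw_rows = [
    {"size": 1.0, "noise": 7.0, "price": 3.0},
    {"size": 2.0, "noise": 1.0, "price": 5.0},
    {"size": None, "noise": 4.0, "price": 6.0},
    {"size": 3.0, "noise": 9.0, "price": 7.0},
    {"size": 4.0, "noise": 2.0, "price": 9.0},
]
clean_rows = [row for row in raw_rows if all(v is not None for v in row.values())]

# Stage 2: feature engineering. Keep only the attribute assumed to drive the prediction.
features = [row["size"] for row in clean_rows]
targets = [row["price"] for row in clean_rows]


# Stage 3: training cycles. Predict, measure the error, adjust, and repeat.
def train(xs, ys, epochs=2000, learning_rate=0.05):
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        predictions = [weight * x + bias for x in xs]
        errors = [p - y for p, y in zip(predictions, ys)]
        # Gradients of the mean squared error with respect to weight and bias.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


weight, bias = train(features, targets)

# Stage 4: package the trained model so it can be deployed.
packaged_model = json.dumps({"structure": "linear", "weight": weight, "bias": bias})
```

Only the training loop in stage 3 is specific to machine learning; the other stages look much like reading input, writing code, and shipping an artifact in ordinary software.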
David Self (03:52):
What is the difference between an ML model versus traditional software?
Paul Welch (03:56):
An ML model has some similarities to other software—there's some code and it's packaged up to be deployed. But an ML model is not a standalone application, like many other software apps. An ML model is made up of the structure of the model, maybe a neural network, or some other structure, as well as a set of weights and biases that represent the state of the model.
The model structure, together with the state of the model that is the output of the training cycles, is the core of the model, and it goes along with the code. All of that has to be packaged up together. With traditional software, you normally package it in a way that's ready to run the entire business logic in production. An ML model is deployed into a framework, or a tool that knows how to deserialize that model and make new predictions in production.
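To make the split between structure and state concrete, here is a minimal sketch. It assumes a JSON artifact format and a small dense network with sigmoid activations; the names `package_model` and `ModelRuntime` are hypothetical and stand in for whatever framework or serving tool is actually used.

```python
import json
import math


def package_model(layer_sizes, weights, biases):
    """Serialize the model's structure and its trained state into one artifact."""
    return json.dumps({
        "structure": {
            "type": "dense_network",
            "layer_sizes": layer_sizes,
            "activation": "sigmoid",
        },
        "state": {"weights": weights, "biases": biases},
    })


class ModelRuntime:
    """Hypothetical serving tool that deserializes an artifact and makes predictions."""

    def __init__(self, artifact):
        payload = json.loads(artifact)
        self.structure = payload["structure"]
        self.weights = payload["state"]["weights"]
        self.biases = payload["state"]["biases"]

    def predict(self, inputs):
        activations = inputs
        # Each layer holds one weight list per neuron and one bias per neuron.
        for layer_weights, layer_biases in zip(self.weights, self.biases):
            activations = [
                1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(neuron, activations)) + b)))
                for neuron, b in zip(layer_weights, layer_biases)
            ]
        return activations


# Two inputs feeding one output neuron; the weights are placeholder values.
artifact = package_model([2, 1], [[[1.0, -1.0]]], [[0.0]])
runtime = ModelRuntime(artifact)
prediction = runtime.predict([3.0, 3.0])
```

The artifact contains no business logic of its own. It only becomes useful once a runtime that understands the format loads it, which is the difference from a standalone application.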
David Self (04:54):
How do you run an ML model?
Paul Welch (04:58):
How you run an ML model depends a lot on which tool set and frameworks you're using. There are different methods, but at a very high level, most of them take the model, that is, its state and structure, and use that to create a runtime instance of the model using the tool set it was built with.
David Self (05:22):
Let's dig a little more into model lifecycle practices. Model development is not traditional software, but there seem to be more similarities than differences. Can you help clarify?
Paul Welch (05:33):
Yeah, model development does have more similarities than differences. In the stages of developing code, as well as how you package that model, run it, and operate it in production, I think it is very similar to traditional software and you can get some of the same benefits out of using traditional software best practices.
David Self (05:57):
Thank you so much, Paul, for taking the time to join us for this discussion.
Paul Welch (05:57):
And thank you, David.
David Self (06:05):
You can visit redapt.com to learn how to successfully adopt AI and ML capabilities in the cloud or on-premises with ready-to-use solutions tailored for advanced analytics workloads.