MLOps is a tall order
When we talk about ML systems or ecosystems, we think of:
- Version control
- Development environments
- Training infrastructure/pipelines
- Workflow orchestration
- Model stores
- Feature stores
- ML metadata stores
- Model serving
- Continuous learning
- Prediction stores
… that’s a lot.
If you’re tasked with building an ML system from scratch, that’s pretty daunting.
Trying to build everything on the first pass is the wrong idea. You’re going to end up burning time and burning budget. When you come back to your stakeholders after three months with only half of what you promised and all the money spent, they’re likely to pull the plug.
Creating a mature ML system takes a lot of effort, even if you start with something out of the box like AWS SageMaker. An example of a mature infrastructure using SageMaker is this article.
Eat some cake
Rather than face a mammoth infrastructure project, the approach I take is to "eat some cake".
What do I mean by that? Think of the whole cake as the ML system, with each layer being a different component. You don’t eat a cake layer by layer; you take a slice so you get to taste every layer together.
Approach your system building in the same way. Don't build component by component. Build enough of each component so you get a workable system, an MVP. How do you know you've reached an MVP? When you can go from experimentation to deployment.
It doesn't have to be feature-rich or fast. It only has to give a taste of what MLOps can do.
An MVP gives you a return on investment fast. It might not have all the fancy features you want, but it works well enough to get more buy-in from the business.
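To make the slice idea concrete, here is a minimal sketch of what one thin slice could look like in plain Python. Everything in it is an assumption made for illustration: the function names (`train`, `save_model`, `load_latest`, `predict`, `run_slice`) are hypothetical, the "model store" is just a folder of numbered JSON files, and the "model" is a one-variable least-squares line. Each function stands in for a layer you would later swap for a real tool.

```python
import json
import statistics
import tempfile
from pathlib import Path


def train(xs, ys):
    """Training layer (toy): fit y = slope * x + intercept by least squares."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def save_model(store: Path, model: dict, metadata: dict) -> int:
    """Model store + metadata layer (toy): write the next numbered JSON file."""
    store.mkdir(parents=True, exist_ok=True)
    version = len(list(store.glob("model_v*.json"))) + 1
    payload = {"version": version, "model": model, "metadata": metadata}
    (store / f"model_v{version}.json").write_text(json.dumps(payload))
    return version


def load_latest(store: Path) -> dict:
    """Serving layer, step 1: load the highest-numbered model file."""
    latest = max(
        store.glob("model_v*.json"),
        key=lambda path: int(path.stem.split("_v")[1]),
    )
    return json.loads(latest.read_text())


def predict(payload: dict, x: float) -> float:
    """Serving layer, step 2: apply the stored parameters to one input."""
    model = payload["model"]
    return model["slope"] * x + model["intercept"]


def run_slice(store: Path) -> float:
    """One slice of the cake: experiment -> store -> serve."""
    xs, ys = [1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]  # made-up data
    model = train(xs, ys)
    save_model(store, model, {"n_rows": len(xs)})
    return predict(load_latest(store), 5.0)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_slice(Path(tmp) / "models"))
```

None of these layers is production-grade, and that is the point: the path from experiment to prediction exists end to end, so each piece can be thickened later without redesigning the whole.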
Once you have the green light to expand the system, you can start taking more slices of the cake. Soon enough you’ll have a mature ML system that didn’t break the bank or your sanity.