Machine Learning Lifecycle: Take Projects from Idea to Launch
- Rebecca Dodd
Machine learning is the process of teaching algorithms to make predictions based on a specific dataset. ML engineers typically use popular languages such as Python or R to build ML programs. While use cases for machine learning in software development are emerging, getting ML projects into production remains a challenge. Overcoming the failure to launch starts with identifying genuinely useful use cases for machine learning, understanding what to expect at each stage of a machine learning project, and communicating goals and expectations to get stakeholders’ support.
This article will cover:
- Why startups might launch machine learning projects
- The machine learning lifecycle:
  - Identifying your business goal
  - Framing your business problem as a machine learning problem
  - Data collection and processing
  - Model development and training
  - Model evaluation and retraining
  - Model deployment
  - Maintenance and monitoring
- Advice for getting stakeholders on board with machine learning initiatives
Why Startups Launch Machine Learning Projects
Machine learning can help startups and development teams with several kinds of use cases:
Automation Across a Variety of Development Tasks
ML algorithms can extract and analyze data for a variety of development use cases, including:
- Project and Requirement Scoping: ML models can help identify and catalog project requirements.
- Unit Testing: ML can also support software testing and QA tasks.
- Change Requests: ML can help automate the change request process, documenting and tracking requests across their lifecycles.
Predict Maintenance Requirements
You might also use machine learning to perform predictive maintenance of your hardware or infrastructure, helping to mitigate the cost of disruption by performing maintenance at off-peak times and pre-empting failure or compromised performance.
Personalize Offerings for Customers
Machine learning algorithms can analyze which content and keywords resonate with your target audience. Armed with these insights, you can more easily deliver personalized marketing and offers to customers, improving chances of engagement and conversion.
Forecast Sales (and Churn)
A machine learning model can generate sales predictions based on data about your prospects or customers. Analysis of customer behavior and interactions can enable you to answer questions like: How many prospects convert after engaging with your website? How many are likely to buy your product after taking a demo? How are those rates influenced by someone’s location, seniority, or role at their company? How likely are they to take out a subscription over a one-time purchase? Beyond forecasting, these insights can then be used to optimize your sales and support processes to better serve and retain customers, or predict churn.
Add Value to Your Product
You can also build machine learning capabilities into your product as a feature for customers. For example, ML in an incident response platform can uncover and compile context around incidents to make post-mortems more comprehensive and valuable.
A word of caution though–knowing your audience is extremely important here:
“I have spent 25+ years in infrastructure. Infrastructure people…[are] pretty risk-averse to things that are not deterministic. We knew we couldn’t just drop [artificial intelligence] into the product and ‘go with it.’ We had to think beyond AI being cool tech and interrogate where we could create real value for our customers.” – Andrew Fong, CEO and cofounder of Prodvana
So, machine learning is not a panacea and with any ML project you undertake, you will need to balance the perceived utility of your use case with the effort and investment required to set it up. Technology expert Michelle Yi suggests that “As more and more hallucinations or different challenges happen, I think the use cases are going to become narrower and narrower until the actual underlying technology improves.”
All of that is to say, ML is a big undertaking and not without its flaws, so it’s vital to be clear on what you want from an ML project and whether those expectations are realistic–starting with identifying your business goals.
The Machine Learning Lifecycle
The typical machine learning project lifecycle, in just eight easy steps.
Business Goal Identification
Machine learning projects are resource-intensive–so you don’t want to invest in one just for the sake of saying you’re doing ML. Start by making sure everyone is clear and aligned on what business value you expect to derive from the project. Prioritizing business goals that directly impact your company’s north-star metrics will help you gather support.
ML Problem Framing
With a business goal established, you need to frame your business problem as a machine learning problem. Is machine learning even the right tool for the job? What data will be observed and what should be predicted? What is the ideal outcome of the ML model? For example, if your ideal outcome is to get customers to convert from a free trial, the model’s goal could be to recommend content highlighting useful features and benefits based on the user’s interactions with your product during the trial period. The final step is defining the metrics you will use to measure success–in this example that might be a percentage increase in conversion from trials.
Data Collection
Now that an ML problem is defined, your data scientists can begin collecting data to develop the ML model. You might have existing datasets you can use; otherwise, you will need to gather the relevant data, potentially supplementing your in-house dataset with off-the-shelf or synthetic data. Watch out for two common challenges around data availability and data quality:
- Incomplete or irrelevant data can generate model results that are no better than a coin flip.
- It may be simple to capture telemetry around how customers interact with your website or product, but it can be harder to assign meaning to that behavior. Is someone failing to complete your product onboarding because the process is too cumbersome, or because the product wasn’t a fit to begin with?
It’s no small task to collect data that’s simultaneously relevant, accurate, and sufficient to train your model, but this step is vital to set up the machine learning project for success.
Data Preparation
Data in hand, your data scientists can begin processing the raw data into a usable format. This stage involves working out whether the data needs cleaning, transformation, or categorization. Exploratory data analysis (EDA) will help to reveal if there are any missing values or outliers, or if feature engineering is required to transform the data you have into a format that aligns more closely with the business objective. For our trial conversion example, that could mean determining the overall conversion rate by dividing the number of conversions by the total number of trials started. If you want to eventually run your trained model on an ad hoc or daily basis, you likely don’t want to be beholden to a lot of data preprocessing each time, and may find it more efficient to use an API for data management. Scalability for larger datasets also matters, which is why it’s important to choose a model with the capacity to process the volume of data you plan to use. Skipping or rushing the preparation stage can be a false economy.
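As a rough illustration of the EDA checks described above, here is a minimal sketch using only the Python standard library. The trial records, the field names (`minutes_in_app`, `converted`), and the 1.5 × IQR outlier rule are all assumptions made for this example; the conversion rate is computed as conversions divided by total trials. A real project would more likely use a dataframe library.

```python
from statistics import quantiles

# Hypothetical trial records; None marks a missing value.
trials = [
    {"user": "a", "minutes_in_app": 12.0, "converted": True},
    {"user": "b", "minutes_in_app": 15.0, "converted": False},
    {"user": "c", "minutes_in_app": None, "converted": False},
    {"user": "d", "minutes_in_app": 14.0, "converted": True},
    {"user": "e", "minutes_in_app": 13.0, "converted": False},
    {"user": "f", "minutes_in_app": 240.0, "converted": False},
    {"user": "g", "minutes_in_app": 11.0, "converted": False},
    {"user": "h", "minutes_in_app": 16.0, "converted": True},
    {"user": "i", "minutes_in_app": 14.0, "converted": False},
]


def count_missing(rows, field):
    """Count rows where the field is absent or None."""
    return sum(1 for row in rows if row.get(field) is None)


def iqr_outliers(values):
    """Return values more than 1.5 * IQR outside the quartiles (a rule of thumb)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def conversion_rate(rows):
    """Conversions divided by the total number of trials."""
    return sum(1 for row in rows if row["converted"]) / len(rows)


missing = count_missing(trials, "minutes_in_app")
observed = [r["minutes_in_app"] for r in trials if r["minutes_in_app"] is not None]
outliers = iqr_outliers(observed)
rate = conversion_rate(trials)
print(f"missing={missing} outliers={outliers} conversion_rate={rate:.2f}")
```

Whether a flagged value such as the 240-minute session is an error or a genuinely engaged user is a judgment call for the data science team; the check only surfaces it.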
Model Development and Training
The ML model development process encompasses building, selecting, training, and fine-tuning the model. The work you did in the earlier stages of gathering and preparing your data ensures that the raw material that goes into the model is relevant and usable. Now, the focus is on building early iterations of the model and selecting the one that performs best and is most closely aligned with your problem. Your data science team will continue tuning which features to add weight to, or adding any implementation-specific details. With a model selected, model training can begin: Your data scientists will expose the model to historical data so it can learn the patterns and relationships within your dataset and identify any dependencies. Then, with initial training complete, you can move on to model evaluation.
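To make the "build a few candidates, then pick one" idea concrete, here is a toy sketch in plain Python. The historical data, the single feature (feature views during a trial), and both candidate models are hypothetical; in practice a team would likely reach for an ML library rather than hand-rolled rules.

```python
# Hypothetical historical data: (feature_views_during_trial, converted)
history = [
    (1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1),
    (2, 0), (8, 1), (3, 0), (7, 1),
]
# Hold out the last rows so candidates are compared on data they never saw.
train, validation = history[:8], history[8:]


def fit_majority(rows):
    """Baseline: always predict the most common label in the training data."""
    ones = sum(label for _, label in rows)
    majority = 1 if ones * 2 >= len(rows) else 0
    return lambda x: majority


def fit_stump(rows):
    """One-feature threshold rule: keep the cut-off with the best training accuracy."""
    best_cut, best_hits = None, -1
    for cut in sorted({x for x, _ in rows}):
        hits = sum(1 for x, label in rows if (x >= cut) == bool(label))
        if hits > best_hits:
            best_cut, best_hits = cut, hits
    return lambda x: 1 if x >= best_cut else 0


def accuracy(model, rows):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(1 for x, label in rows if model(x) == label) / len(rows)


candidates = {"majority": fit_majority(train), "stump": fit_stump(train)}
scores = {name: accuracy(model, validation) for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
print(scores, "selected:", best_name)
```

Including a trivial baseline among the candidates is a cheap way to confirm that a more elaborate model is earning its keep.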
Model Validation and Retraining
You want a continuous loop that confirms your model’s accuracy is being maintained, or even improved, as new training data arrives, so ideally you will evaluate your model’s performance both before and after deploying it to production. There should be an ongoing reconciliation between what is expected from the machine learning model’s output and what is actually happening.
Your evaluation metrics will have been determined by your data scientists as a result of their EDA process and are typically iterated on in collaboration with your product team, for example through a Jupyter notebook. Typically, your model’s output would be piped into a monitoring solution that scores its performance and alerts you if it drops below a certain threshold. From there, you can refine the model through hyperparameter tuning or retraining with real-world data.
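The score-and-alert step might look something like the sketch below. It assumes binary labels, uses F1 as the metric, and picks an arbitrary threshold of 0.8; the function names and sample predictions are illustrative and not tied to any particular monitoring product.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels: the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def check_model(y_true, y_pred, threshold=0.8):
    """Score a batch of predictions and flag it if the score is below the threshold."""
    score = f1_score(y_true, y_pred)
    return {"f1": score, "alert": score < threshold}


# Hypothetical batch: what actually happened vs. what the model predicted.
actual = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 0, 1, 0, 1, 1, 0, 0]
report = check_model(actual, predicted)
if report["alert"]:
    print(f"F1 dropped to {report['f1']:.2f}; consider tuning or retraining")
```

The threshold itself is a product decision as much as a technical one, which is one reason to agree on it with stakeholders before launch.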
Production Deployment
Once your model is trained and fine-tuned, it’s ready for the production environment–whether you’re using open-source tools such as TensorFlow, PyTorch, or MLflow, or closed-source solutions such as Microsoft Azure or Amazon SageMaker. Your data science team will deliver a packaged binary of the trained model, which essentially turns the model into a function with inputs and outputs for model deployment. This can then be run in something like AWS Glue, which in turn integrates with other parts of your ML ecosystem (such as Spark and Redshift).
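The idea of a packaged model behaving like a function with inputs and outputs can be sketched as follows. This is a simplified illustration, not how any specific platform packages models: JSON-encoded bytes stand in for the binary artifact, and the linear scoring rule, weights, and feature names are made up for the example.

```python
import json


def package_model(weights, bias, feature_names):
    """Serialize trained parameters into a single artifact (bytes)."""
    payload = {"weights": weights, "bias": bias, "features": feature_names}
    return json.dumps(payload).encode("utf-8")


def load_model(artifact):
    """Turn the artifact back into a plain function: features in, prediction out."""
    params = json.loads(artifact.decode("utf-8"))

    def predict(inputs):
        total = params["bias"]
        for name, weight in zip(params["features"], params["weights"]):
            total += weight * inputs[name]
        return 1 if total >= 0 else 0

    return predict


# The data science side hands over the artifact...
artifact = package_model([0.5, 1.0], -4.0, ["sessions", "features_used"])

# ...and the serving side only needs to load it and call it.
predict = load_model(artifact)
prediction_active = predict({"sessions": 6, "features_used": 2})
prediction_idle = predict({"sessions": 1, "features_used": 1})
print(prediction_active, prediction_idle)
```

Keeping the serving side ignorant of how the model was trained is what lets engineering teams deploy it like any other versioned dependency.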
This is how it should work, ideally, but by now you will have gathered that the ML lifecycle isn’t neat and linear, but a looping, iterative process. One of the reasons many ML projects fail to launch is continued siloing between data science and engineering teams:
“Orgs will get past the bottlenecks of requests getting thrown over the fence by rethinking the development life cycle to include what MLOps teams need. There’s still a sense that data scientists are still very much viewed as an ‘other,’ as opposed to just a regular member of the engineering team. Right now, one of the biggest challenges is a cultural one, not a tooling one.” – Adam Zimman, angel investor and strategic advisor to early stage VC firms and software startups
Maintenance and Monitoring
Post-launch, it’s important to have visibility into your model’s performance so that you can make iterative improvements on future jobs, potentially using techniques such as data visualization to properly assess your results. For example, it can be useful to compare your expected and actual F1 scores: a widening gap can indicate that your model is experiencing data drift. It may also be useful to maintain a metadata store that captures the output data you’ve accumulated from various model jobs, to help you not only recall previous test conditions but also run future tests against one configuration or another.
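A metadata store can start out very simple. The sketch below is an in-memory stand-in, with hypothetical run IDs, configuration keys, and F1 values, that records each model job alongside the gap between expected and actual F1 so that runs can later be filtered by configuration or flagged for degradation.

```python
class MetadataStore:
    """In-memory stand-in for a metadata store of model job runs."""

    def __init__(self):
        self._runs = []

    def log(self, run_id, config, expected_f1, actual_f1):
        """Record one job with its configuration and expected-vs-actual F1 gap."""
        self._runs.append({
            "run_id": run_id,
            "config": config,
            "expected_f1": expected_f1,
            "actual_f1": actual_f1,
            "f1_gap": expected_f1 - actual_f1,
        })

    def by_config(self, **filters):
        """Return runs whose configuration matches every given key/value."""
        return [
            run for run in self._runs
            if all(run["config"].get(key) == value for key, value in filters.items())
        ]

    def degraded(self, max_gap):
        """Return IDs of runs whose actual F1 fell short of expectations by more than max_gap."""
        return [run["run_id"] for run in self._runs if run["f1_gap"] > max_gap]


store = MetadataStore()
store.log("run-1", {"model": "v1", "window_days": 30}, expected_f1=0.80, actual_f1=0.78)
store.log("run-2", {"model": "v1", "window_days": 30}, expected_f1=0.80, actual_f1=0.65)
store.log("run-3", {"model": "v2", "window_days": 30}, expected_f1=0.82, actual_f1=0.81)

v1_runs = store.by_config(model="v1")
flagged = store.degraded(max_gap=0.05)
print(len(v1_runs), flagged)
```

A large gap is a prompt to investigate rather than proof of drift; the cause could equally be a labeling change or a pipeline bug.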
Getting Stakeholders On Board for Machine Learning Projects
Given the investment of resources, people, and tooling required for machine learning pipelines, and the high-profile security and privacy risks, you may experience some resistance from stakeholders. With so few data science projects making it into production, there isn’t a lot of compelling evidence that it’s worth the effort. So, how can you make the case for your ML project?
1. Do your due diligence: As mentioned, the first stages of the ML lifecycle include identifying a clear business goal and validating that it can be framed as–and makes functional and business sense as–a machine learning problem. Such early steps can set your project up for success, but can also help with getting leaders on board. With proper validation, you can speak clearly to the value you expect to derive from the project and its overall impact to the business.
2. Align on metrics: You’ll have an easier time winning (and keeping) support if there is less lag between initial investment and seeing returns from your ML model. It could be worth agreeing on a few metrics that satisfy different stakeholders’ expectations at different stages of the lifecycle:
“In the very short term, you might look at metrics like iteration speed–define some boundaries of what success is. If your immediate goal is making execs happy, you can start with things that are inherently measurable, and narrow the scope, such as down to a specific use case.” – Stefan Krawczyk, CEO and founder of DAGWorks
3. Show responsible resource use: By monitoring your deployed model’s compute usage, you can not only keep an eye on costs, but ensure that any predictions you generate are actually being looked at and used by stakeholders.
4. Lean into ML ecosystem efficiencies: Building out your machine learning stack with tools that integrate well together is both more efficient and more secure. For example, if you’re a Google shop, BigQuery is a natural choice for your data platform. The most popular open-source models may offer first-party security features as well as community-built ones. Closed-source foundation models may offer vendor guarantees of security and privacy. In all cases, it’s a good idea to read the fine print and ensure your model provides the privacy and security your projects need, particularly if your team works in a highly regulated space such as healthcare, finance, or insurance.
5. Embrace change: Be ready to iterate on the process you build around the ML lifecycle. The landscape is evolving constantly and your methodology should too. Showing a willingness to adapt to changing conditions and feedback is more likely to get people on board than if they feel they have to commit long term. Stefan Krawczyk put it neatly: “If you’re going to get into the MLOps space, you need to design for change.”