Just a couple of years ago, AI/ML promised to revolutionize our world within a few decades. Yet already we hear rumblings that it is failing to deliver on those claims. Common complaints sound like, “My machine learning model did so well in training, but it’s a major disappointment when deployed to production!” First, it’s far too early to write off AI/ML. Second, achieving its greatest value requires a shift in approach on our part as engineers and developers. If your machine learning models work well in the lab but not on the production line, try these three straightforward steps from Janani Ravi, founder of Loonycorn.
1. Avoid overfitting
This is the easiest of the three tips to implement, because overfitting is so well researched and plenty of mitigation techniques are available. In short: “An overfitted model is one that does very well in the training phase, but works poorly with real data when it’s deployed,” Janani says. The causes vary, but generally the training data is too sparse or too simple for the model to learn general patterns, so it memorizes the specifics of the training set instead. To avoid overfitting with a traditional machine learning algorithm, you can use one of these strategies:
- Regularization – which penalizes complex models and encourages simpler ones
- Cross-validation – holding validation data out from the training data to get a more honest estimate of the model’s performance
- Dropout (neural networks only) – randomly turning off certain neurons during training forces the remaining neurons to learn more robust patterns from the data
- Ensemble learning – where predictions are aggregates of multiple individual prediction models instead of relying on a single machine learning model
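To make the first two strategies above concrete, here is a minimal plain-Python sketch; the one-feature data and the penalty value are made up for illustration. It shows closed-form ridge regression (an L2 regularization penalty shrinking the learned weight) and a simple k-fold index split for cross-validation.

```python
def ridge_weight(xs, ys, lam):
    """Closed-form ridge fit for y ~ w*x: the L2 penalty `lam`
    is added to the denominator, shrinking w toward zero."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs) + lam
    return num / den

def kfold_indices(n, k):
    """Split indices 0..n-1 into k disjoint validation folds,
    so every point is used for validation exactly once."""
    return [list(range(i, n, k)) for i in range(k)]

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]   # roughly y = 2x

w_plain = ridge_weight(xs, ys, lam=0.0)  # unregularized fit
w_reg = ridge_weight(xs, ys, lam=5.0)    # smaller weight: the penalty discourages complexity
folds = kfold_indices(len(xs), 2)        # two disjoint validation folds
```

In practice you would reach for a library such as scikit-learn rather than hand-rolling these, but the mechanics are the same: the penalty term biases the model toward simplicity, and the fold split keeps evaluation data out of training.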
2. Handle real-time data correctly
Training-serving skew has to do with how data is processed. “The training data that you use for your model is typically sourced from batch pipelines,” Janani explains. “The data probably lives on a file system or a database somewhere within your organization. You basically process that data well and you use it to train the model.” Yet the prediction data is streaming data and is often processed in a more ad-hoc manner.
This fundamental processing difference can cause a model to perform poorly on real-world prediction data. To mitigate it, make sure batch and streaming data follow the same path: processed in the same manner, in the same pipeline. Several existing architectures, such as the Lambda architecture and the Kappa architecture, integrate the processing of batch and streaming data. Building your machine learning pipelines on such design principles ensures that the model learns to handle real-time data correctly.
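One simple way to enforce that shared path can be sketched in plain Python; the record fields and transforms below are hypothetical. Both the batch job and the streaming handler call the same preprocessing function, so a feature can never be computed one way at training time and a different way at serving time.

```python
import math

def preprocess(record):
    # Single source of truth for feature engineering: any change here
    # applies to training (batch) and serving (streaming) alike.
    return {
        "amount_log": math.log1p(record["amount"]),
        "country": record["country"].strip().upper(),
    }

def batch_features(records):
    # Training path: transform a stored batch of records.
    return [preprocess(r) for r in records]

def stream_feature(record):
    # Serving path: transform one incoming record at prediction time.
    return preprocess(record)

raw = {"amount": 99.0, "country": " us "}
same = stream_feature(raw) == batch_features([raw])[0]  # identical features either way
```

Frameworks built on the Lambda or Kappa pattern apply the same idea at pipeline scale: one transformation definition, executed against both the historical store and the live stream.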
3. Babysit your model after deployment
Building a machine learning model is a lot like having a kid—a great deal of the work is still required after deployment. Models are often subject to concept drift; any ML model tries to capture relationships that exist in the real world, and these relationships are dynamic. They change over time. Models require ongoing monitoring and need to be fed with updated data, or else they grow stale and become less useful over time.
Machine learning models need human minders in order to continue learning and stay relevant.
Those humans constantly monitor and retrain the model on new instances. In essence, the learning has to stay as dynamic as the real world the model is trying to predict. Human minders are even more critical when a model has concerted adversaries, as in common AI use cases such as fraud detection, fake news detection, and quantitative trading, where attackers actively try to fool or cheat the model. Human minders review the output of these models to see which data is wrongly classified, then use those instances to retrain the model and keep it relevant.
“Remember, deploying your machine learning model is simply the first step,” Janani stresses. “There’s a lot of work yet to be done. Don’t just move on to the next cool problem.” Hear more on how to avoid the AI hype trap from Janani in this on-demand webinar.