3 steps to training a machine learning model

When you hear the words machine learning, you probably think of face recognition, robotics or self-driving cars. But it’s so much more than that. You don’t have to be inventing the next big thing to leverage the power of machine learning in your business. In fact, you should be considering all the ways machine learning could work for you today.

Machine learning is not a way to solve the problems you’re already familiar with. It’s a way to solve new problems, business issues and tasks with data-driven predictions. To understand how you can apply machine learning, you need to first understand how it works. Let’s start by training a machine learning model.

Step 1: Begin with existing data

Machine learning requires existing data: not the data your application will use when you run it, but data to learn from. You need a lot of real data. In fact, the more the better, because the more examples you provide, the better the computer should be able to learn. So just collect every scrap of data you have, dump it in and voila! Right?

Wrong. To train the computer to understand what you want and what you don’t want, you need to prepare, clean and label your data. Get rid of garbage entries, records with missing information and anything that’s ambiguous or confusing. Filter your dataset down to only the information you’re interested in right now. Without high-quality data, machine learning does not work. So take your time and pay attention to detail.
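To make the cleanup step concrete, here is a minimal sketch in plain Python. The records and field names (sqft, bedrooms, price) are invented for illustration, and real projects usually need more checks than these. It drops rows with missing or non-numeric values and removes exact duplicates.

```python
# Hypothetical raw records; the field names are made up for illustration.
raw_records = [
    {"sqft": "1200", "bedrooms": "3", "price": "250000"},
    {"sqft": "", "bedrooms": "2", "price": "180000"},       # missing value
    {"sqft": "950", "bedrooms": "two", "price": "175000"},  # garbage entry
    {"sqft": "1200", "bedrooms": "3", "price": "250000"},   # exact duplicate
    {"sqft": "2000", "bedrooms": "4", "price": "410000"},
]

FIELDS = ("sqft", "bedrooms", "price")


def clean(records):
    """Keep only complete, numeric, non-duplicate rows."""
    seen = set()
    cleaned = []
    for record in records:
        try:
            row = tuple(float(record[field]) for field in FIELDS)
        except (KeyError, ValueError):
            continue  # drop rows with missing or non-numeric fields
        if row in seen:
            continue  # drop exact duplicates
        seen.add(row)
        cleaned.append(dict(zip(FIELDS, row)))
    return cleaned


clean_records = clean(raw_records)
print(len(raw_records), "raw rows ->", len(clean_records), "clean rows")
```

In this sketch the price column plays the role of the label: it is the answer you want the computer to learn to produce from the other fields.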

Step 2: Analyze data to identify patterns

Unlike conventional software development, where humans are responsible for interpreting large data sets, machine learning has you apply an algorithm to the data and let it find the patterns. But don’t think you’re off the hook. Choosing the right algorithm, applying it, configuring it and testing it is where the human element comes back in.

There are several platforms to choose from, both commercial and open source. Explore solutions from Microsoft, Google, Amazon and IBM, or open source frameworks like TensorFlow, Torch and Caffe. They each have their own strengths and downsides, and each will interpret the same dataset in a different way. Some are faster to train. Some are more configurable. Some allow for more visibility into the decision process. To make the right choice, you need to experiment with a few algorithms and test until you find the one whose results best match what you’re trying to achieve with your data.
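As a rough illustration of "experiment and test", the sketch below trains two toy algorithms on the same made-up data and scores each on rows it never saw during training. Everything here is an assumption for the example: the data is synthetic, and the two candidates (a majority-class baseline and a nearest-neighbor rule) stand in for whatever algorithms your chosen platform offers.

```python
# Synthetic, made-up data: one numeric feature, label 1 when the feature is 102 or more.
data = [(float(i), int(i >= 102)) for i in range(200)]

# Hold out every fourth row for testing; train on the rest.
test = data[::4]
train = [row for index, row in enumerate(data) if index % 4 != 0]


def fit_majority(rows):
    """Baseline: always predict the most common training label."""
    labels = [label for _, label in rows]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


def fit_nearest_neighbor(rows):
    """Predict the label of the closest training example."""
    return lambda x: min(rows, key=lambda row: abs(row[0] - x))[1]


def accuracy(model, rows):
    """Share of held-out rows the model labels correctly."""
    return sum(model(x) == label for x, label in rows) / len(rows)


candidates = {
    "majority baseline": fit_majority,
    "nearest neighbor": fit_nearest_neighbor,
}
scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Scoring on held-out rows matters because a model can look perfect on the data it memorized and still fail on new data.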

When it’s all said and done, and you’ve successfully applied a machine learning algorithm to analyze your data and learn from it, you have a trained model.

Step 3: Make predictions

There is so much you can do with your newly trained model. You could import it into a software application you’re building, deploy it to a web back end or upload it to a cloud service that hosts it for you. Your trained model is now ready to take in new data and feed you predictions, aka results.
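One way to picture "importing a trained model" is as saved parameters that an application loads and applies to new data. The sketch below assumes a very simple linear model whose weights are invented for illustration. Real platforms use their own model formats, so treat this as a picture of the idea rather than any product's API.

```python
import json

# Assumed output of a training step: parameters of a linear model (made-up values).
trained_model = {"weights": {"sqft": 150.0, "bedrooms": 10000.0}, "bias": 20000.0}

# Training side: serialize the model so it can be shipped to an app or service.
serialized = json.dumps(trained_model)


def load_model(payload):
    """Application side: rebuild a predict function from the saved parameters."""
    params = json.loads(payload)

    def predict(features):
        return params["bias"] + sum(
            weight * features[name] for name, weight in params["weights"].items()
        )

    return predict


predict = load_model(serialized)
new_house = {"sqft": 1000, "bedrooms": 3}  # new data the model has not seen
print(predict(new_house))
```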

These results can look different depending on what kind of algorithm you go with. If you need to know what something is, go with a classification algorithm, which comes in two types. Binary classification sorts data into one of two categories. Multi-class classification sorts data into (you guessed it) multiple categories.
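Here is a small sketch of both types using a nearest-centroid classifier, chosen only because it fits in a few lines and not because it is the best algorithm. The features and labels are made up. Note that the same code handles the binary and the multi-class case; only the number of labels changes.

```python
from collections import defaultdict


def train_centroids(examples):
    """Average the feature vectors of each label to get one centroid per class."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))

    return min(centroids, key=distance)


# Binary: two categories. Made-up features: (links in message, exclamation marks).
binary = train_centroids([
    ((0, 0), "not spam"), ((1, 0), "not spam"),
    ((5, 4), "spam"), ((6, 5), "spam"),
])
print(classify(binary, (4, 5)))

# Multi-class: same code, more labels. Made-up features: (weight in kg, height in cm).
multi = train_centroids([
    ((4, 25), "cat"), ((5, 23), "cat"),
    ((30, 60), "dog"), ((28, 55), "dog"),
    ((500, 160), "horse"), ((450, 150), "horse"),
])
print(classify(multi, (27, 58)))
```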

When the result you’re looking for is an actual number, you’ll want to use a regression algorithm. Regression learns from historical data how much weight each input should carry, then combines those weighted inputs to predict a numeric result, such as a price or a sales figure.
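The simplest case, one input and one number to predict, can be sketched with ordinary least squares. The ad spend and sales figures below are made up, and a real model would typically weigh many inputs at once. The fitted slope is the learned weight for the single input.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Made-up historical data: ad spend (in thousands) vs. units sold.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [120.0, 170.0, 240.0, 290.0, 330.0]

slope, intercept = fit_line(ad_spend, units_sold)
forecast = slope * 6.0 + intercept  # predict for a spend level not in the data
print(f"predicted units at spend 6.0: {forecast:.1f}")
```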

Both regression and classification are supervised algorithms, meaning you need to provide labeled data and direction for the computer to learn from. There are also unsupervised algorithms, which don’t require labeled data or any guidance on the kind of result you’re looking for.

One form of unsupervised algorithm is clustering. You use clustering when you want to understand the structure of your data. You provide a set of data and let the algorithm identify the categories within that set. Anomaly detection, on the other hand, is an unsupervised technique you can use when most of your data looks normal and uniform, and you want the algorithm to pull out anything that doesn’t fit with the rest.
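Both ideas can be sketched without any labels. The example below assumes one-dimensional, made-up data and uses two deliberately simple techniques: a tiny k-means loop for clustering and a standard-deviation rule for anomalies. The starting centers and the threshold of 2.0 are arbitrary choices for the example.

```python
from statistics import mean, pstdev


def kmeans_1d(values, centers, rounds=10):
    """Tiny 1-D k-means: assign each value to its nearest center, then re-average."""
    for _ in range(rounds):
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


def find_anomalies(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if sigma and abs(v - mu) / sigma > threshold]


# Made-up order totals: the algorithm is not told there are two customer groups.
order_totals = [12, 15, 14, 13, 98, 102, 95, 101]
centers, clusters = kmeans_1d(order_totals, centers=[0, 50])
print(centers)

# Made-up sensor readings that are uniform except for one spike.
readings = [20.1, 19.8, 20.0, 20.3, 19.9, 20.2, 35.0, 20.0]
print(find_anomalies(readings))
```

The clustering call is given only the numbers and two rough starting guesses. The two groups it settles on come from the data itself.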

Although supervised algorithms are more common, it’s good to play around with each algorithm type and use case to better understand probability, and to practice splitting your data and training on it in different ways. The more you toy with your data, the better you will understand what machine learning can accomplish.

Ultimately, machine learning helps you find new ways to make life easier for your customers and easier for yourself. Self-driving cars not necessary.