Preparing Data for Machine Learning with Java

by Federico Mestrone

Data is at the heart of machine learning. This course will teach you how to bring data into Java from various sources, and how to perform basic tidying and transformations ahead of further processing by specialized Java ML libraries.

What you'll learn

Machine learning algorithms require data to be formatted and presented in very specific ways. In this course, Preparing Data for Machine Learning with Java, you’ll learn to use the standard Java API to make data ready for ML libraries. First, you’ll explore various options for reading files into Java objects and data structures. Next, you’ll discover how to scrape the web for data you could use in your ML models. Finally, you’ll learn how to perform transformations both in vanilla Java and at scale with the Beam SDK. When you’re finished with this course, you’ll have the data-gathering skills and knowledge needed to bring various sources into Java data structures.
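To give a flavor of the "vanilla Java" side of this work, here is a minimal sketch that reads comma-separated numeric rows into a Java data structure and rescales one column to the range 0 to 1. It is not taken from the course material: the class name `CsvPrep`, the methods `readRows` and `minMaxScale`, and the sample data are all hypothetical, and the parser assumes a header line, numeric fields only, and no quoted or embedded commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
public class CsvPrep {

    // Reads numeric rows from simple CSV text, discarding the first (header) line.
    // Assumption: no quoted fields, no embedded commas, every field parses as a double.
    public static List<double[]> readRows(BufferedReader reader) throws IOException {
        List<double[]> rows = new ArrayList<>();
        String line = reader.readLine(); // header line, ignored
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split(",");
            double[] row = new double[parts.length];
            for (int i = 0; i < parts.length; i++) {
                row[i] = Double.parseDouble(parts[i].trim());
            }
            rows.add(row);
        }
        return rows;
    }

    // Rescales one column in place to [0, 1] using min-max scaling.
    // A constant column (zero range) is mapped to 0.0 to avoid division by zero.
    public static void minMaxScale(List<double[]> rows, int col) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] row : rows) {
            min = Math.min(min, row[col]);
            max = Math.max(max, row[col]);
        }
        double range = max - min;
        for (double[] row : rows) {
            row[col] = (range == 0) ? 0.0 : (row[col] - min) / range;
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data; a real program would wrap a FileReader instead.
        String csv = "height,weight\n150,50\n175,70\n200,90\n";
        List<double[]> rows = readRows(new BufferedReader(new StringReader(csv)));
        minMaxScale(rows, 0);
        for (double[] row : rows) {
            System.out.println(row[0] + "," + row[1]);
        }
    }
}
```

Reading from a `StringReader` keeps the sketch self-contained; swapping in a `FileReader` or `Files.newBufferedReader` would read the same format from disk.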

About the author

Federico Mestrone is an IT professional who works mostly in technical training and education. He is fond of Java and Scala and is a Linux and Mac user, but he has experience in a much wider range of topics, including some not-so-technological ones.
