
Creating Your First Big Data Hadoop Cluster Using Cloudera CDH

by Xavier Morera

Data by itself has no meaning; it is what you do with it that counts. In this course, you'll take the fast track to Hadoop and Big Data with the Cloudera QuickStart VM, and then you'll learn how to set up a Hadoop cluster with Cloudera CDH.

What you'll learn

"Ask Bigger Questions" is Cloudera's vision. You may not be familiar with this phrase, but you're likely familiar with "Knowledge is Power". To get knowledge you need to analyze and understand huge amounts of structured and unstructured data - Big Data. In this course, Creating Your First Big Data Hadoop Cluster Using Cloudera CDH, you'll get started on Big Data with Cloudera, taking your first steps with Hadoop using a pseudo cluster and then moving on to set up our own cluster using CDH, which stands for Cloudera's Distribution including Hadoop. First, you'll explore the case for Hadoop, Big Data, and Cloudera. Next, you'll learn about the fast track to Big Data with Cloudera's QuickStart VM and you'll also learn how to create a visualization environment with VirtualBox. Then, you'll discover how to create a Linux clean cluster with CentOS. Finally, you'll follow the steps to install and configure a cluster with the help of Cloudera Manager. By the end of this course, you'll have a Hadoop cluster, and you'll be ready to start your journey to Big Data.

Course FAQ

What are Hadoop clusters and what are they used for?

Hadoop clusters are collections of computers, known as nodes, that are networked together to perform parallel computations on big data sets. A Hadoop cluster consists of a network of connected master and slave nodes that run on highly available, low-cost commodity hardware.
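To make the idea of parallel computation across nodes more concrete, here is a minimal sketch in plain Python. It does not use Hadoop at all: the "nodes" are simulated inside a single process, and the function names (split_into_blocks, map_word_counts, reduce_counts, word_count) are hypothetical names chosen for illustration. It only shows the general pattern of splitting data, processing each piece independently, and merging the partial results.

```python
from collections import Counter
from functools import reduce


def split_into_blocks(lines, num_nodes):
    """Distribute lines round-robin across hypothetical worker nodes."""
    return [lines[i::num_nodes] for i in range(num_nodes)]


def map_word_counts(block):
    """The work one node would do on its own block: count words locally."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Merge the per-node partial counts into a single result."""
    return reduce(lambda a, b: a + b, partials, Counter())


def word_count(lines, num_nodes=3):
    blocks = split_into_blocks(lines, num_nodes)
    # Processed one after another here; on a real cluster each block
    # would be handled by a different node at the same time.
    partials = [map_word_counts(block) for block in blocks]
    return reduce_counts(partials)


if __name__ == "__main__":
    sample = ["big data", "big cluster", "data data"]
    print(word_count(sample, num_nodes=2))
```

Because each block is counted without looking at any other block, the per-block step is the part that could be spread across many machines; only the final merge needs all the partial results together.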

What is Cloudera?

Cloudera is a software company that provides an enterprise data cloud, available by subscription. Its platform is built on open source technology and uses analytics and machine learning to yield insights from data through a secure connection.

What software is needed for this course?

To complete this course, you will need the Cloudera QuickStart VM and Cloudera CDH software.

What are data clusters?

A data cluster is a sub-group of data which shares similar characteristics and is significantly different to other clusters in a database, usually defined by the statistical technique of cluster analysis.

What will you learn in this Hadoop course?

In this course, you will learn about big data and how to create a Hadoop cluster. You will also learn how to create a virtualization environment with VirtualBox. Finally, you'll discover how to create a clean Linux cluster with CentOS. By the end of this course, you will have a Hadoop cluster, and you'll be ready to embark on your Big Data journey.

About the author

Xavier is very passionate about teaching, helping others understand search and Big Data. He is also an entrepreneur, project manager, technical author, trainer, and holds a few certifications with Cloudera, Microsoft, and the Scrum Alliance, along with being a Microsoft MVP. He has spent a great deal of his career working on cutting-edge projects with a primary focus on .NET, Solr, and Hadoop among a few other interesting technologies. Throughout multiple projects, he has acquired skills to deal...
