Deploying Hadoop with Cloudera CDH to AWS
Course info






Description
Years ago, hardware was expensive. It was not unusual for a project with large amounts of data to require seven figures' worth of hardware just to get started. But times have changed: with cloud services, you can now store data cheaply and spin up as many servers as you need, with the specs you want, to process that data and get the answers you need. In this course, Deploying Hadoop with Cloudera CDH to AWS, you will learn how to deploy Hadoop in the cloud. First, you'll learn about some key topics. Then, you'll learn how to perform a deployment manually. Finally, you'll learn about a specialized tool called Cloudera Director that helps automate deployments for either transient or long-running clusters. You will also learn about some differences between AWS and Azure/GCE. These differences can matter if you are working on a different platform, but they are by no means blockers for someone already familiar with their current platform. By the end of this course, you will be able to better manage your cloud needs.
Section Introduction Transcripts
Course Overview
Hi everyone, my name is Xavier Morera, and welcome to my course, Deploying Hadoop with Cloudera CDH to Amazon Web Services. I am very passionate about teaching, primarily helping developers understand search and big data. Here's a fun fact: did you know that the amount of data in the world right now is estimated at around 5 ZB and expected to grow up to 44 ZB by 2020? That's 44 trillion gigabytes. And at the moment, less than 0.5% of the data is ever analyzed. Imagine the possibilities of what you can discover with the help of big data. In this course, we're going to learn how to deploy Hadoop in the cloud using Cloudera's distribution, known as CDH, on AWS to be precise. Some of the major topics that we will cover include preparing the prerequisites in AWS to deploy Hadoop. The cloud has many features, but there is only a small subset that we need to know. Next is the planning required before deploying, which includes security, capacity planning, and understanding best practices for the different workload types. Then, we will deploy CDH manually, a process similar to deploying on-prem, but I will highlight the steps that differ. And finally, we'll learn how to automate cluster deployment and management with Cloudera Director. By the end of this course, you will be prepared to take your big data to the cloud, taking advantage of the flexibility and power that AWS has to offer. Before beginning the course, it is desirable that you know Linux and have an overall idea of CDH and AWS, but if you don't, that is okay, as I will present detailed steps that are easy to follow. I hope you'll join me on this journey to learn about Cloudera in AWS with the Deploying Hadoop with Cloudera CDH to Amazon Web Services course at Pluralsight.