Getting Started with HDFS

Learning to work with the Hadoop Distributed File System (HDFS) is a baseline skill for anyone administering or developing in the Hadoop ecosystem. In this course, you will learn how to work with HDFS, Hive, Pig, Sqoop, and HBase from the command line.
Course info
Rating: (113)
Level: Beginner
Updated: Feb 16, 2016
Duration: 2h 48m
Table of contents
Understanding HDFS
Creating, Manipulating, and Retrieving HDFS Files
Transferring Relational Data to HDFS Using Sqoop
Querying Data with Pig and Hive
Processing Sparse Data with HBase
Automating Basic HDFS Operations
Description

Getting Started with Hadoop Distributed File System (HDFS) is designed to give you everything you need to learn how to use HDFS to read, store, and remove files. In addition to working with files in Hadoop, you will learn how to take data from relational databases and import it into HDFS using Sqoop. After we have our data inside HDFS, we will learn how to use Pig and Hive to query that data. Building on our HDFS skills, we will look at how to use HBase for near real-time data processing. Whether you are a developer, administrator, or data analyst, the concepts in this course are essential to getting started with HDFS.
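
To give a flavor of the querying step mentioned above, here is a minimal sketch of running a Hive query over data already sitting in HDFS, straight from the command line. The table name, column layout, and HDFS path are illustrative assumptions, not material from the course.

    # Hypothetical example: expose a tab-delimited HDFS file to Hive, then query it.
    hive -e "CREATE EXTERNAL TABLE IF NOT EXISTS sales (id INT, item STRING, amount DOUBLE)
             ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
             LOCATION '/user/demo/sales';"
    hive -e "SELECT item, COUNT(*) AS orders FROM sales GROUP BY item;"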

About the author

Thomas is a Senior Software Engineer and Certified ScrumMaster. He spends most of his time working with the Hortonworks Data Platform and on Agile coaching.

More from the author
Enterprise Skills in Hortonworks Data Platform (Intermediate, 1h 37m, 21 Sep 2018)
Analyzing Machine Data with Splunk (Beginner, 2h 38m, 4 Nov 2016)
Section Introduction Transcripts

Creating, Manipulating, and Retrieving HDFS Files
Welcome to module 2 of Getting Started with HDFS. By now we have a good understanding of the Hadoop Distributed File System and we have our development environment set up. Now we're going to shift our focus to learning how to navigate HDFS and picking up some of the basic commands. This module is key to understanding how to interact with the HDFS command shell. It's going to set the tone for the next modules, where we'll actually be accessing the data that we have in HDFS. Let's get an overview of some of the topics that we're going to discuss in this module. First we'll look at specific actions we can perform in HDFS, things like reading, creating, and deleting files. There are also a few key points we want to cover in this section, because there are a few things we can't do with HDFS. Next, we want to look at how to interact with HDFS from the command line. We're going to look at a few key ways that we can access the shell and even learn how to structure our commands. Then we're going to jump in and look at the basic HDFS commands, like list and touch, and then we'll begin a walkthrough of how to move data around in HDFS. We'll look at where we can find sample data and how we can move it from a Windows machine into our Linux cluster. In the last part of this module we want to discuss a few maintenance and administrative commands. It won't be a deep dive, but it'll be enough to learn how to take out the trash in HDFS. Now let's look and see what we can do from the HDFS command line.
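
As a quick preview, here is a minimal sketch of the kind of commands this module walks through. The directory and file names are illustrative assumptions, not files from the course.

    # List, create, read, and remove files in HDFS (paths are hypothetical).
    hdfs dfs -ls /                                # list the root of HDFS
    hdfs dfs -mkdir -p /user/demo/input           # create a directory
    hdfs dfs -touchz /user/demo/input/empty.txt   # "touch" a zero-length file
    hdfs dfs -put sample.txt /user/demo/input/    # copy a local file into HDFS
    hdfs dfs -cat /user/demo/input/sample.txt     # read it back
    hdfs dfs -rm /user/demo/input/sample.txt      # delete it (moved to trash if enabled)
    hdfs dfs -expunge                             # take out the trash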

Transferring Relational Data to HDFS Using Sqoop
So far we've used the command line to move data into HDFS one file at a time. But what happens when you want to move an entire database with hundreds of tables? You'd better grab some coffee, because it's going to be a long day trying to move all of those with hdfs dfs commands. That's where Sqoop comes in. It's an application in the Hadoop ecosystem that can automate the process for you. In this module we're going to talk about Sqoop and even walk through a demo of moving data from HDFS into MySQL and vice versa. Let's get a good overview of what we're going to look at in this module. First we're going to define Sqoop and talk about how a tool that's spelled in a funny way can actually save developers hours in a day. We'll also highlight the benefits of using this open source tool and talk about the many use cases, which will show you how to use Sqoop in your own projects. Next we'll walk through the documentation, which is where you'll find the source code and even some third-party extensions that'll help you use Sqoop in your projects. And lastly we're going to walk through a demo where we'll transfer an entire table from MySQL into HDFS with Sqoop. It'll all be done from the command line, and you won't even have to write a MapReduce job to do it; Sqoop will do it for you. Now let's define what Sqoop is.
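
For reference, here is a minimal sketch of the kind of Sqoop import the demo builds toward, assuming a local MySQL database named shop with a customers table; the connection details and paths are illustrative assumptions. Running sqoop export with an --export-dir applies the same idea in the other direction.

    # Import an entire MySQL table into HDFS; Sqoop generates the MapReduce job.
    # Database name, credentials, and target directory are hypothetical.
    sqoop import \
      --connect jdbc:mysql://localhost/shop \
      --username demo -P \
      --table customers \
      --target-dir /user/demo/customers \
      --num-mappers 1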

Processing Sparse Data with HBase
Hadoop is all about unstructured data. It's about being able to take our favorite book or even a dictionary, throw it into HDFS, run a MapReduce job over it to do some analysis, and then have it produce the results for us. All this is done in a distributed environment where we're not really concerned about the key-value pairs, where the data is stored, or how it's distributed across the cluster. But what happens when we do care about that structure? What happens when we need a NoSQL database? What happens when you need access to your data in real time versus batch? That's what this module is all about: processing sparse data with HBase. How are we going to learn about HBase in this module? We'll start off by explaining what HBase is and where it sits in the Hadoop architecture. Next we'll build the case for HBase and define some of the projects where HBase is a good fit and those where it's not. And since this course is all about how to do things from the command line, we're going to jump into the HBase shell, then we'll learn to ingest data into HBase in a real-world project. Finally we'll look at some of the resources for HBase. Now let's start off by defining HBase.
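
To make that concrete, here is a minimal sketch of the sort of HBase shell session this module works up to. The table name, column family, and values are illustrative assumptions, not examples from the course.

    # Commands typed inside the HBase shell (start it with: hbase shell).
    # Table and column family names are hypothetical.
    create 'sensor_readings', 'metrics'
    put 'sensor_readings', 'row1', 'metrics:temperature', '21.5'
    get 'sensor_readings', 'row1'
    scan 'sensor_readings'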