Getting Started with HBase: The Hadoop Database

As the data you store grows, traditional relational databases may no longer be adequate. HBase can handle billions of rows of data, and each record can contain millions of fields. This course will help you get started with HBase.
Course info
Rating
(77)
Level
Beginner
Updated
Dec 29, 2016
Duration
2h 39m
Description

Billions of records with millions of fields of semi-structured, unformatted data are the reality of the data we store today. Traditional databases are bound by strict data layout requirements and constraints that, unfortunately, do not scale to meet big data requirements. HBase reimagines how data can be stored in a distributed system. This course, Getting Started with HBase: The Hadoop Database, teaches you how to use HBase from start to finish. First, you'll learn how to design and lay out data in a columnar format in order to optimize disk seeks and reduce read latency. Next, you'll learn how to manipulate and access this data using the command-line HBase shell as well as the HBase Java API. Finally, you'll learn to process this data by performing complex aggregation and grouping operations using the MapReduce programming model with HBase. By the end of this course, you'll be ready to start making your data much more manageable using HBase.

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

More from the author
Mining Data from Text
Intermediate
2h 21m
Jun 28, 2019
Building Regression Models with scikit-learn
Intermediate
2h 42m
Jun 28, 2019
Section Introduction Transcripts

Course Overview
Hi, my name is Janani Ravi, and I'm very happy to meet you today. I have a master's degree in electrical engineering from Stanford and have worked at companies such as Microsoft, Google, and Flipkart. At Google, I was one of the first engineers working on real-time collaborative editing in Google Docs, and I hold four patents for its underlying technologies. I currently work on my own startup, Loonycorn, a studio for high-quality video content. HBase is a distributed database built on the Hadoop distributed computing framework. HBase has the ability to deal with billions of rows of data where each record can contain millions of fields. HBase allows you real-time random access to the data stored within it. It completely reimagines the design and layout of unstructured, sparse data. This course teaches you how to use HBase from very first principles. Learn how to design and lay out data in a columnar format to optimize disk seeks and reduce read latency. Learn how to manipulate and access this data using both the command-line utility, the HBase shell, and the HBase Java API. Finally, process this data by performing complex aggregation and grouping operations using the MapReduce programming model with HBase.
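
To make the idea of sparse data keyed by row concrete, here is a minimal sketch in plain Python. It is not the HBase API and is not taken from the course: the `SparseTable` class, its method names, and the sample row keys are all hypothetical, chosen only to loosely mirror the `put`, `get`, and `scan` commands of the HBase shell. It assumes the basic HBase data model, in which rows are kept sorted by row key, columns are grouped into column families, and a cell that was never written takes up no space.

```python
from collections import defaultdict


class SparseTable:
    """Toy in-memory model of a sparse, column-family table (NOT the HBase API)."""

    def __init__(self):
        # row key -> {"family:qualifier" -> value}; cells never written are simply absent
        self._rows = defaultdict(dict)

    def put(self, row_key, family, qualifier, value):
        """Write a single cell."""
        self._rows[row_key][f"{family}:{qualifier}"] = value

    def get(self, row_key, family, qualifier):
        """Read a single cell, or None if that cell was never written."""
        # dict.get on the outer mapping avoids creating an empty row on lookup
        return self._rows.get(row_key, {}).get(f"{family}:{qualifier}")

    def scan(self, start_row, stop_row):
        """Yield (row_key, cells) for start_row <= row_key < stop_row, in key order."""
        for key in sorted(self._rows):
            if start_row <= key < stop_row:
                yield key, dict(self._rows[key])


# Hypothetical sample data: two user rows and one unrelated row.
table = SparseTable()
table.put("user#001", "info", "name", "Ada")
table.put("user#001", "info", "email", "ada@example.com")
table.put("user#002", "info", "name", "Grace")  # no email cell is stored for this row
table.put("video#001", "meta", "title", "Intro")

print(table.get("user#001", "info", "email"))
print(table.get("user#002", "info", "email"))
print([key for key, _ in table.scan("user#", "user#~")])
```

Because rows are ordered by key, giving related rows a shared prefix (here the assumed `user#` prefix) lets a range scan read them together, which is the kind of layout decision the course description refers to when it talks about designing data to reduce disk seeks.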