Course info
Rating: (15)
Level: Intermediate
Updated: Dec 11, 2018
Duration: 1h 9m
Description

Data lakes hold vast amounts of data, a must when working with Big Data. In this course, Microsoft Azure Developer: Implementing Data Lake Storage Gen2, you will learn foundational knowledge and gain the ability to work with a large, HDFS-compliant data repository in Microsoft Azure. First, you will learn how to ingest data. Next, you will discover how to manage and work with your Big Data. Finally, you will explore how to run jobs on a Hadoop cluster using platforms like Spark via the ABFS driver. When you're finished with this course, you will have the skills and knowledge to work with large data repositories in Microsoft's cloud, everything needed to build solutions at scale that help you discover trends and insights.
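To give a concrete flavor of the ingestion step described above, here is a minimal sketch using the azure-storage-file-datalake Python SDK; this is one of several possible approaches, not necessarily the one demonstrated in the course, and the account name, key, and file paths are placeholders.

# Minimal sketch: uploading a local file into an ADLS Gen2 file system
# with the azure-storage-file-datalake SDK. All names are hypothetical.
from azure.storage.filedatalake import DataLakeServiceClient

account_name = "mydatalake"            # placeholder storage account
account_key = "<storage-account-key>"  # supply your own key or credential

service = DataLakeServiceClient(
    account_url=f"https://{account_name}.dfs.core.windows.net",
    credential=account_key,
)

# A "file system" in Gen2 corresponds to a blob container.
fs = service.create_file_system(file_system="raw")

# Create a directory and upload a local file into it.
directory = fs.create_directory("sales/2018")
file_client = directory.create_file("transactions.csv")
with open("transactions.csv", "rb") as data:
    file_client.upload_data(data, overwrite=True)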

About the author

Xavier is passionate about teaching and helping others understand search and Big Data. He is also an entrepreneur, project manager, technical author, and trainer, holds certifications from Cloudera, Microsoft, and the Scrum Alliance, and is a Microsoft MVP.

More from the author
Programming Python Using an IDE
Intermediate
2h 0m
Jun 26, 2019
More courses by Xavier Morera
Section Introduction Transcripts

Course Overview
Hi everyone, my name is Xavier Morera, and welcome to my course, Implementing Microsoft Azure Data Lake Storage Gen2. I am very passionate about working with data. Do you know which is one of the most widely used large data repositories when dealing with Big Data? If you guessed HDFS, the Hadoop Distributed File System, you are correct. In this course we're going to learn how to work with an HDFS-compliant large object store that's built on top of Azure Blob Storage: Azure Data Lake Storage Gen2. Some of the major topics that we will cover include understanding Azure Data Lake Storage Gen2; creating a data lake in several different ways, including the portal and PowerShell; working with a data lake using several tools and services, like Azure Data Factory, DistCp, AzCopy, and the REST API; and running big data Spark jobs on an HDInsight cluster. By the end of this course, you will be able to create a data lake, configure security, ingest data, and finally, extract, transform, and load data. Before beginning the course, you should be familiar with Azure services. It's a big bonus if you know Blob Storage, but above all you should be familiar with applications that work with HDFS. I hope you'll join me on this journey to learn about large object stores in the cloud with the Implementing Microsoft Azure Data Lake Storage Gen2 course, at Pluralsight.
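As a taste of the Spark-over-ABFS workflow mentioned in the overview, here is a minimal PySpark sketch of reading data from ADLS Gen2 through the ABFS driver; the account, container, file path, and the 'region' column are hypothetical, and on an HDInsight cluster attached to the storage account the driver and credentials are already configured.

# Minimal PySpark sketch: reading from ADLS Gen2 over the ABFS driver.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("adls-gen2-demo").getOrCreate()

# ABFS URI scheme: abfss://<filesystem>@<account>.dfs.core.windows.net/<path>
path = "abfss://raw@mydatalake.dfs.core.windows.net/sales/2018/transactions.csv"

# Load the CSV and run a simple aggregation on a hypothetical column.
df = spark.read.option("header", "true").csv(path)
df.groupBy("region").count().show()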