Introduction to the Azure Data Lake and U-SQL

The Big Data revolution has exposed the limitations of traditional data processing models like cubes and ETL. Learn a different way of doing things with the Azure Data Lake, using the U-SQL language to query raw data files and create databases.
Course info
Rating
(59)
Level
Beginner
Updated
Dec 18, 2017
Duration
2h 51m
Table of contents
Course Overview
Abandoning ETL with an Azure Data Lake and U-SQL
Tool Time – Saving Money with Visual Studio
Cutting Development Timescales
Building Dams - Structuring the Data Lake
Converting and Manipulating Data
Description

Building good reporting structures can be difficult, especially when those pesky users keep asking for new reports. Throw Big Data into the mix and things become a lot more complicated. What if you didn’t need to build any data models at all, or you could build models that could be quickly put up and torn down? In this course, Introduction to the Azure Data Lake and U-SQL, you'll be introduced to Azure Data Lake and the U-SQL language, and learn how to abandon ETL. First, you'll delve into querying by using the powerful U-SQL language, built straight into the Azure Data Lake. Next, you'll discover how to throw your files into the Data Lake and query them directly without needing to load them into a database. Finally, you'll learn how the Azure Data Lake offers the best of both worlds, with support for unstructured files and structured databases. By the end of this course, you’ll not only know what a Data Lake is, you’ll know how to populate it, query it, and develop for it using Visual Studio. Software required: Visual Studio 2017 Community Edition and Azure subscription (optional).

About the author

Mike loves to mess around with data and programming problems, the bigger the better. He’s worked with a variety of companies, helping to build and improve systems of all shapes and sizes.

More from the author
Improving Azure Data Lake Performance
Intermediate
1h 39m
13 Jun 2018
Section Introduction Transcripts

Course Overview
Hi everyone, my name is Mike McQuillan, and welcome to my course, Introduction to Azure Data Lakes and U-SQL. I'm a data specialist, consulting at various organizations. We're all reliant on data these days: lots of data, big data. Big data has exposed the inflexibility of traditional data processing solutions, like ETL and cubes. The Azure Data Lake offers an alternative approach which is much faster, very flexible, and fun. In this course we'll learn what Azure Data Lakes are and why they are incredibly useful. No prior experience is required. Some of the major topics we will cover include: the advantages of a data lake against traditional data processing solutions, creating data lake solutions with Visual Studio, creating databases and tables in the cloud, and writing queries with the U-SQL language. By the end of this course you'll know exactly what a data lake is, how to populate it, and how to query it using the U-SQL language. You'll be able to start creating your own really cool data lake solutions. Before beginning the course you should be familiar with general database concepts, and some basic knowledge of the SQL language will be helpful. I hope you'll join me on this journey to learn about Azure Data Lakes, with the Introduction to Azure Data Lakes and U-SQL course at Pluralsight.

Abandoning ETL with an Azure Data Lake and U-SQL
Hi there, I'm Mike McQuillan. Welcome to this course, Introduction to the Azure Data Lake and U-SQL. In this course, we'll see how an Azure Data Lake stands up against a more traditional data processing solution. We'll learn how to develop Azure Data Lake solutions using Visual Studio and how an Azure Data Lake solution can help cut development timescales. We'll finish up by investigating how to structure data in the Data Lake and how we can write queries in the U-SQL language to interrogate the data. Does this all sound good? Great! Let's begin our introduction to the Azure Data Lake and U-SQL.

Tool Time – Saving Money with Visual Studio
Hey! Does everybody know what time it is? That's right, it's tool time! More specifically, it's Tool Time, Saving Money with Visual Studio, part of the Introduction to the Azure Data Lake and U-SQL course. I am Mike McQuillan. Let's see what this module has in store for us. The module begins by showing where you can find the Azure Data Lake tools for Visual Studio and how to install them. We'll then see the U-SQL project types available and how to create a U-SQL project in Visual Studio on the local file system. With a project created, we'll move on to adding a U-SQL script to the project and executing that script. Once the script has been executed locally, we'll see how to connect Visual Studio to Azure and how to execute scripts directly against Azure from Visual Studio. The module finishes up with a look at the stages a U-SQL job passes through when executed against an Azure Data Lake. All the way through, we'll be discussing the tools Visual Studio provides to aid you with your Azure Data Lake and U-SQL developments. Okay, the scene is set. Let's tool up.

Cutting Development Timescales
Developers are always being put under pressure. Regular questions developers hear from managers include "We need this feature to go live in two days," and "Why can't you just give me an estimate?" As developers, we're always looking for ways of cutting our development timescales, and in this module of the Introduction to the Azure Data Lake and U-SQL course, we'll see how the Azure Data Lake can help us do this. I'm Mike McQuillan for Pluralsight, and this is Cutting Development Timescales. We'll start this module off by taking a high-level look at the data we need to process. Once we know what needs to be done with the data, we'll compare the work involved in implementing a SQL Server Integration Services solution with the work required to implement the same thing using an Azure Data Lake. The majority of this module will focus on how the solution is created using a Data Lake and U-SQL, including how to debug common errors when developing using U-SQL. We'll finish off by comparing the finished Azure Data Lake solution to a completed SSIS solution. Time is marching on. Let's start cutting those timescales.

Building Dams - Structuring the Data Lake
Hello. Mike McQuillan here for Pluralsight. This module, Building Dams - Structuring the Data Lake, introduces structure to the Azure Data Lake in the form of databases. Let's take a look at what we'll learn in this module. We start off by investigating how the Azure Data Lake deals with structured and unstructured data before debating why using databases might be a better idea than querying files directly. The main part of the module concentrates on creating a database using U-SQL scripts, introducing the U-SQL statements needed to create databases and tables. This will include a brief introduction to indexes. We'll end the module by seeing how to insert data into database tables. I'm excited, are you? I hope so. Let's go build a database.

Converting and Manipulating Data
Hi there, I'm Mike McQuillan. In this final module of the Introduction to the Azure Data Lake and U-SQL course, we're going to introduce how U-SQL can be used to convert and manipulate data. We'll begin by discussing the reports we are going to produce in this module. Then we'll start querying an Azure Data Lake database for data. The reports will demonstrate core concepts like aggregation, joins, data ranges, and C# expressions. We've seen U-SQL in action throughout the course, but we're going to really concentrate on it here in Converting and Manipulating Data. Come on, time to write some code.