Data Preparation and Exploration in Databricks
Are you planning large-scale data preparation and analysis? This course teaches you how to use Databricks to explore, analyze, clean, and transform data; store it in Delta format; and visualize it using Databricks charts and dashboards.
What you'll learn
Databricks is a unified analytics platform that can handle large volumes of data, process them quickly, and help you explore and analyze the data in depth.
In this course, Data Preparation and Exploration in Databricks, you’ll gain the ability to explore, analyze, clean, and transform data using the Databricks platform; store the processed data in Delta Lake format; and visualize the data using Databricks charts and dashboards. First, you’ll learn how to set up the Databricks environment. Then, you’ll discover how to extract data in Databricks from multiple sources, such as Azure Data Lake Store and the Databricks File System (DBFS), and create Spark DataFrames. Next, you’ll go over how to explore and analyze data using Spark and Databricks features, clean and transform the data using Spark, and visualize the data using Databricks charts and dashboards.
Following this, you’ll see how to store the processed data in a data lake or as Databricks tables in Delta Lake format. Finally, you’ll learn how to optimize performance in Databricks. When you’re finished with this course, you’ll have the Databricks skills and knowledge needed to prepare and explore data.
About the author
Mohit is a Data Engineer, a Microsoft Certified Trainer (MCT), and a consultant. He has 15+ years of experience architecting large-scale Business Intelligence, Data Warehousing, and Big Data solutions with companies like Microsoft and some leading investment banks.