AWS Big Data

Authors: Andrew Brust, Kim Schmidt, Reza Salehi, Ivan Mushketyk, Russ Thomas, Andru Estes, Matthew Alexander

Processing big data jobs is a common use of cloud resources, mainly because of the sheer computing power needed. AWS has created several services that enable you to use big data...

What you will learn:

  • AWS Athena
  • S3 Storage
  • DynamoDB
  • Redshift Data Warehouse
  • Kinesis
  • Elasticsearch
  • AWS Elastic MapReduce

Prerequisites

This path is for learners who are already familiar with big data but need to learn how to process their jobs on the AWS platform.

Beginner

In this section you get a survey of the AWS services that are specific to big data and of their capabilities. You'll also start learning about one of those services, Athena.

Big Data on AWS: The Big Picture

by Andrew Brust

Oct 22, 2019 / 1h 33m

Description

This course will teach you big data fundamentals in the context of Amazon Web Services (AWS), the leading cloud computing platform. In this course, Big Data on AWS: The Big Picture, you will learn foundational knowledge of big data concepts and the major big data services on AWS. First, you will learn all about core big data concepts, like data lakes, NoSQL, and MapReduce. Next, you will discover the array of big data services available on AWS and how they tie together. After that, you'll learn the details of each service and see many of them demoed. When you’re finished with this course, you will have the skills and knowledge of big data on AWS needed to understand which combination of services is best suited to your organization's skill sets and best meets your organization's needs.

Table of contents
  1. Course Overview
  2. Introduction: Big Data Concepts
  3. AWS Big Data Services
  4. Batch Analytics with Elastic MapReduce (EMR)
  5. AI and Streaming Data Processing with EMR
  6. Data Warehousing with Amazon Redshift
  7. (Big) Data Integration and Pipelines
  8. Visualizing Your Big Data with QuickSight
  9. Strategy

Serverless Analytics on AWS

by Kim Schmidt

Aug 20, 2019 / 2h 40m

Description

AWS Glue and Amazon Athena have transformed how big data analytics is architected and built in the AWS cloud in the age of AI and ML. In this course, Serverless Analytics on AWS, you'll gain the ability to have one centralized data source for all your globally scattered data silos, regardless of whether the data is structured, unstructured, or semi-structured, so you can perform multiple types of advanced analytics on the data without affecting the underlying data store. First, you'll learn how to use AWS Glue Crawlers, AWS Glue Data Catalog, and AWS Glue Jobs to dramatically reduce data preparation time, doing ETL “on the fly”. Next, you’ll discover how to immediately analyze your data without regard to data format, giving actionable insights within seconds. Finally, you’ll explore how to use AWS best practices to keep up by incorporating AI and ML analytics into your analytics workflows, future-proofing your data. When you’re finished with this course, you'll have the skills and knowledge to use state-of-the-art serverless technologies to provide a myriad of insight types whenever you need them.

Table of contents
  1. Course Overview
  2. Download and Install Course Prerequisites
  3. The State of Analytics in the AWS Cloud
  4. Infrastructure and Data Setup via Amazon CloudFormation
  5. The Power of AWS Glue
  6. Creating AWS Glue Resources and Populating the AWS Glue Data Catalog
  7. The Power of Amazon Athena
  8. How to AI and ML Your Apps and Business Processes

Intermediate

In this section you will learn about the different databases and storage options that you can implement on AWS that are suited to big data jobs.

Implementing Amazon S3 Storage on AWS

by Reza Salehi

Feb 23, 2019 / 3h 26m

Description

Amazon S3 is one of the most fundamental services offered by AWS. S3 is also used by several other AWS services as well as Amazon's own websites. Securing, auditing, versioning, automating, and optimizing costs for S3 can be a challenge for engineers and architects who are new to AWS. In this course, Implementing Amazon S3 Storage on AWS, you will gain the ability to get the most out of your Amazon S3 service. First, you will learn how to create buckets, upload objects to the storage class matching your needs and budget, and retrieve them. Next, you will discover how to apply the recommended security practices to your S3 buckets and audit access to them. Finally, you will explore how to work with multiple object versions, archive cold data in S3 Glacier, and configure lifecycle rules to automatically reduce your S3 costs. When you are finished with this course, you will have the skills and knowledge of Amazon S3 needed to use it as your main cloud-based storage option.

Table of contents
  1. Course Overview
  2. Creating S3 Buckets
  3. Securing Your Data
  4. Managing S3 Buckets
  5. Transferring and Migrating Data
  6. Data Life-cycles

AWS DynamoDB Deep Dive

by Ivan Mushketyk

Jun 27, 2019 / 6h 8m

Description

With recent advancements in modern technologies, such as the sharp growth of the IoT sector, we need databases that can handle loads that are orders of magnitude higher than before. AWS DynamoDB is a NoSQL database that addresses these new challenges. It is easy to operate and has a myriad of powerful features. Unlike other databases that require complicated installation and support, DynamoDB allows you to bootstrap a full-fledged database that can operate at high scale within minutes. In this course, AWS DynamoDB Deep Dive, you will learn how to develop applications that fully utilize the power of DynamoDB. You will explore how to process a stream of updates to DynamoDB tables in real time, how DynamoDB works under the hood, how to use DynamoDB transactions, how other AWS services integrate with DynamoDB, and how you can use them to get the most out of it. By the end of this course, you will have a deeper understanding of DynamoDB, one of the core services that anyone serious about using AWS should study.

Table of contents
  1. Course Overview
  2. Introduction to DynamoDB
  3. Getting Started with DynamoDB
  4. DynamoDB API
  5. Introduction to the High-level Interface
  6. Queries with the High-level Interface
  7. DynamoDB Streams
  8. DynamoDB Transactions
  9. DynamoDB Best Practices
  10. Data Analytics with DynamoDB

Building Your First Amazon Redshift Data Warehouse

by Russ Thomas

Mar 9, 2018 / 2h 40m

Description

Amazon Redshift brings the power of scale-out architecture to the world of traditional data warehousing. In Building Your First Amazon Redshift Data Warehouse, you will explore this low-cost, cloud-based storage that can be scaled up or down to meet your true size and performance needs. First, you will learn to stand up and configure a sample data warehouse. Next, you will explore the internal workings and architecture of Redshift and what makes it so fast. Finally, you will get hands-on experience connecting, querying, and building BI and data visualization products, and learn how to secure, maintain, and administer your new platform. By the end of this course, you will be able to scale from gigabytes to petabytes on this high-performance, column-oriented SQL engine.

Table of contents
  1. Course Overview
  2. Answering the Question: "Why Amazon Redshift?"
  3. Populating Redshift
  4. Connecting, Querying, and Consuming Data
  5. Securing Your Data Warehouse
  6. Exploring Advanced System Topics

Advanced

In this section you will learn about the services that aid in processing your streams of big data on AWS.

Developing Stream Processing Applications with AWS Kinesis

by Ivan Mushketyk

Mar 1, 2018 / 3h 32m

Description

The landscape of the big data field is changing. Previously, you could get away with taking hours or even days to process incoming data. Now you need to do it in minutes or even seconds. These challenges require new solutions, new architectures, and new tools. In Developing Stream Processing Applications with AWS Kinesis, you will learn the ins and outs of AWS Kinesis. First, you will learn how it works, how to scale it up and down, and how to write applications with it. Next, you will explore how to use a variety of tools to work with it, such as Kinesis Client Library, Kinesis Connector Library, Apache Flink, and AWS Lambda. Finally, you will discover how to use higher-level Kinesis products such as Kinesis Firehose and how to write streaming applications using SQL queries with Kinesis Analytics. When you are finished with this course, you will have an in-depth knowledge of AWS Kinesis that will help you to build your streaming applications.

Table of contents
  1. Course Overview
  2. Kinesis Fundamentals
  3. Developing Applications Using Kinesis Client Library
  4. Implementing Advanced Kinesis Consumers
  5. Funneling Data with Kinesis Firehose
  6. Implementing Stream Analysis Applications Using Streaming SQL

Handling and Analyzing Data with AWS Elastic MapReduce

by Andru Estes

Aug 22, 2019 / 2h 19m

Description

A lot of people hear about big data analysis, but how can you apply it to your own use cases? In this course, Handling and Analyzing Data with AWS Elastic MapReduce, you’ll learn foundational knowledge and gain the ability to use AWS Elastic MapReduce to perform data analysis. First, you’ll explore configuring AWS EMR and Hadoop. Next, you’ll discover how to process, move, and query data using big data frameworks. Finally, you’ll learn how to stream and analyze data using Apache products and MLlib. When you’re finished with this course, you’ll have the skills and knowledge of AWS EMR needed to handle and analyze your own big data sets.

Table of contents
  1. Course Overview
  2. Configuring Elastic MapReduce in a Pipeline
  3. Processing, Moving, and Querying Data
  4. Streaming and Analyzing Data with Apache Products
  5. Adding Machine Learning to the Pipeline

AWS Big Data in Production

by Matthew Alexander

Jun 26, 2019 / 1h 27m

Description

As the world of business continues to operate at a normal pace, the amount of data that is generated grows almost exponentially. Handling this increase in data requires both intelligent applications and smart tooling. In this course, AWS Big Data in Production, you will learn how to strategically implement big data on AWS in production environments. First, you will learn how to automate infrastructure provisioning with CloudFormation while controlling costs. Next, you will discover how to secure customer data through IAM and encryption at rest with S3 and EBS. Finally, you will explore how to visualize data using QuickSight. When you're finished with this course, you will have the skills and knowledge of big data practices needed to enrich your current big data systems.

Table of contents
  1. Course Overview
  2. Automating Governance with CloudFormation
  3. Securing Data with IAM and Encryption at Rest
  4. Monitoring Availability with CloudWatch
  5. Visualizing Data with QuickSight