AWS Big Data

Authors: Aaron Medacco, Reza Salehi, Ivan Mushketyk, Russ Thomas, Matthew Alexander

Processing big data jobs is a common use of cloud resources mainly because of the sheer computing power needed. AWS has created several services that enable you to use big data...

What you will learn:

  • AWS Athena
  • S3 Storage
  • DynamoDB
  • Redshift Data Warehouse
  • Kinesis
  • Elasticsearch
  • AWS Elastic MapReduce

Prerequisites

This path is for learners who are already familiar with big data but need to learn how to process their jobs on the AWS platform.

Beginner

This section surveys the AWS services that are specific to big data and what they can do. You'll also start learning about one of those services, Athena.

Getting Started with AWS Athena

by Aaron Medacco

Aug 23, 2017 / 2h 14m

Description

Ever wish you could query data without needing to provision, manage, and configure infrastructure and software? Enter AWS Athena, a scalable, serverless, and interactive query service newly provided by Amazon Web Services. In this course, Getting Started with AWS Athena, you'll learn how to use Athena to perform ad-hoc analysis of data in the Amazon Cloud. First, you'll explore how to set up user access and define schemas that point to your S3 data. Next, you'll discover how to query information using SQL in a few simple steps. Finally, you'll delve into how Athena works behind the scenes and understand the best practices that drive Athena cost and performance optimization. By the end of this course, you'll have the skills and knowledge necessary to start implementing solutions with AWS Athena on your own datasets in your own AWS environments.

Table of contents
  1. Course Overview (1m)
  2. Exploring AWS Athena (26m)
  3. Establishing Access to Your Data (10m)
  4. Understanding AWS Athena Internals (17m)
  5. Creating Databases & Tables to Define Your Schema (21m)
  6. Retrieving Information by Querying Tables with SQL (13m)
  7. Optimizing Cost and Performance Using Best Practices (33m)
  8. AWS Athena vs. Other Solutions (10m)

Intermediate

In this section you will learn about the databases and storage options on AWS that are suited to big data jobs.

Implementing Amazon S3 Storage on AWS

by Reza Salehi

Feb 23, 2019 / 3h 25m

Description

AWS S3 is one of the most fundamental services offered by Amazon. S3 is also used by several other AWS services as well as Amazon's own websites. Securing, auditing, versioning, automating, and optimizing costs for S3 can be a challenge for engineers and architects who are new to AWS. In this course, Implementing Amazon S3 Storage on AWS, you will gain the ability to get the most out of your Amazon S3 service. First, you will learn how to create buckets, upload objects to the storage class matching your needs and budget, and retrieve them. Next, you will discover how to apply the recommended security practices to your S3 buckets and audit access to them. Finally, you will explore how to work with multiple object versions, archive cold data in S3 Glacier, and configure life-cycle rules to automatically reduce your S3 costs. When you are finished with this course, you will have the skills and knowledge of Amazon S3 needed to use it as your main cloud-based storage option.

Table of contents
  1. Course Overview (1m)
  2. Creating S3 Buckets (46m)
  3. Securing Your Data (48m)
  4. Managing S3 Buckets (37m)
  5. Transferring and Migrating Data (27m)
  6. Data Life-cycles (44m)

AWS DynamoDB Deep Dive

by Ivan Mushketyk

Jun 27, 2019 / 6h 8m

Description

With recent advancements in modern technologies, such as the sharp growth of the IoT sector, we need databases that can handle loads that are orders of magnitude higher than before. AWS DynamoDB is a NoSQL database that addresses these new challenges. It is easy to operate and has a myriad of powerful features. Unlike other databases that require complicated installation and support, DynamoDB allows you to bootstrap a fully-fledged database that can operate at high scale within minutes. In this course, AWS DynamoDB Deep Dive, you will learn how to develop applications that fully utilize the power of DynamoDB. You will explore how to process a stream of updates to DynamoDB tables in real time, how DynamoDB works under the hood, how to use DynamoDB transactions, how other AWS services integrate with DynamoDB, and how you can use them to get the most out of it. By the end of this course, you will have a deeper understanding of DynamoDB, one of the core services that should be studied by anyone who is serious about using AWS.

Table of contents
  1. Course Overview (1m)
  2. Introduction to DynamoDB (20m)
  3. Getting Started with DynamoDB (55m)
  4. DynamoDB API (38m)
  5. Introduction to the High-level Interface (25m)
  6. Queries with the High-level Interface (48m)
  7. DynamoDB Streams (58m)
  8. DynamoDB Transactions (34m)
  9. DynamoDB Best Practices (50m)
  10. Data Analytics with DynamoDB (33m)

Building Your First Amazon Redshift Data Warehouse

by Russ Thomas

Mar 9, 2018 / 2h 40m

Description

Amazon Redshift brings the power of scale-out architecture to the world of traditional data warehousing. In Building Your First Amazon Redshift Data Warehouse, you will explore this low-cost, cloud-based storage that can be scaled up or down to meet your true size and performance needs. First, you will learn to stand up and configure a sample data warehouse. Next, you will explore the internal workings and architecture of Redshift and what makes it so fast. Finally, you will get hands-on experience connecting, querying, and building BI and data visualization products, and learn how to secure, maintain, and administer your new platform. By the end of this course, you will be able to scale from gigabytes to petabytes on this high-performance, column-oriented SQL engine.

Table of contents
  1. Course Overview (1m)
  2. Answering the Question: "Why Amazon Redshift?" (40m)
  3. Populating Redshift (41m)
  4. Connecting, Querying, and Consuming Data (34m)
  5. Securing Your Data Warehouse (27m)
  6. Exploring Advanced System Topics (15m)

Advanced

In this section you will learn about the services that aid in processing your streams of big data on AWS.

Developing Stream Processing Applications with AWS Kinesis

by Ivan Mushketyk

Mar 1, 2018 / 3h 32m

Description

The landscape of the Big Data field is changing. Previously, you could get away with processing incoming data for hours or even days. Now you need to do it in minutes or even seconds. These challenges require new solutions, new architectures, and new tools. In Developing Stream Processing Applications with AWS Kinesis, you will learn the ins and outs of AWS Kinesis. First, you will learn how it works, how to scale it up and down, and how to write applications with it. Next, you will explore how to use a variety of tools to work with it, such as the Kinesis Client Library, the Kinesis Connector Library, Apache Flink, and AWS Lambda. Finally, you will discover how to use higher-level Kinesis products such as Kinesis Firehose and how to write streaming applications using SQL queries with Kinesis Analytics. When you are finished with this course, you will have an in-depth knowledge of AWS Kinesis that will help you to build your streaming applications.

Table of contents
  1. Course Overview (1m)
  2. Kinesis Fundamentals (47m)
  3. Developing Applications Using Kinesis Client Library (49m)
  4. Implementing Advanced Kinesis Consumers (40m)
  5. Funneling Data with Kinesis Firehose (22m)
  6. Implementing Stream Analysis Applications Using Streaming SQL (49m)

AWS Big Data in Production

by Matthew Alexander

Jun 26, 2019 / 1h 27m

Description

As the world of business continues to operate at a normal pace, the amount of data that is generated grows almost exponentially. Handling this increase in data requires both intelligent applications and smart tooling. In this course, AWS Big Data in Production, you will learn how to strategically implement big data on AWS in production environments. First, you will learn how to automate infrastructure provisioning with CloudFormation while controlling costs. Next, you will discover how to secure customer data through IAM and encryption at rest with S3 and EBS. Finally, you will explore how to visualize data using QuickSight. When you're finished with this course, you will have the skills and knowledge of big data practices needed to enrich your current big data systems.

Table of contents
  1. Course Overview (1m)
  2. Automating Governance with CloudFormation (23m)
  3. Securing Data with IAM and Encryption at Rest (21m)
  4. Monitoring Availability with CloudWatch (31m)
  5. Visualizing Data with QuickSight (10m)