The Big Data on AWS training course is designed to introduce cloud-based big data solutions such as Amazon EMR, Amazon Redshift, and Amazon Kinesis, along with other features of the AWS big data platform.
The course begins with Amazon EMR, using the broad ecosystem of Hadoop tools, such as Hive and Hue, to process data. Next, it explores how to build big data environments with Amazon DynamoDB, Amazon Redshift, Amazon QuickSight, Amazon Athena, and Amazon Kinesis. The course concludes by explaining how to design big data environments for security and cost-effectiveness.
Prerequisites:
- Basic familiarity with big data technologies, including Apache Hadoop, HDFS, and SQL/NoSQL querying
- Completion of the Data Analytics Fundamentals free digital training, or equivalent experience
- Working knowledge of core AWS services and public cloud implementation
- Completion of the AWS Technical Essentials classroom training, or equivalent experience
- Basic understanding of data warehousing, relational database systems, and database design
THIS COURSE IS NOT ELIGIBLE FOR TRAINING BUNDLES.
Purpose
| To demonstrate how to design and implement big data solutions on AWS, including big data environments built for security and cost-effectiveness. |
Audience
| Individuals responsible for designing and implementing big data solutions (such as Solutions Architects and SysOps Administrators), as well as Data Scientists and Data Analysts interested in learning about big data solutions on AWS |
Role
| Data Scientist | System Administrator |
Skill Level
| Intermediate |
Style
| Workshops |
Duration
| 3 Days |
Related Technologies
| Amazon DynamoDB | Amazon Redshift | Hadoop | Cloud Computing Training | AWS |
Productivity Objectives
- Use Apache Hadoop with Amazon EMR
- Launch and configure an Amazon EMR cluster
- Utilize common programming frameworks for Amazon EMR, including Hive, Pig, and Streaming
- Use Hue to improve the ease of use of Amazon EMR
- Perform in-memory analytics with Apache Spark on Amazon EMR
- Understand how services like AWS Glue, Amazon Kinesis, Amazon Redshift, Amazon Athena, and Amazon QuickSight can be used with big data workloads
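As a taste of the "launch and configure an Amazon EMR cluster" objective, a small cluster with Hive, Hue, and Spark installed can be created from the AWS CLI. This is a minimal sketch, not course material: the cluster name, key-pair name, release label, and instance settings below are illustrative placeholders.

```shell
# Sketch only: create a small EMR cluster with Hive, Hue, and Spark.
# "MyKeyPair", the release label, and instance sizes are placeholders.
aws emr create-cluster \
  --name "training-sandbox" \
  --release-label emr-6.15.0 \
  --applications Name=Hive Name=Hue Name=Spark \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --ec2-attributes KeyName=MyKeyPair
```

In keeping with the course's cost-effectiveness theme, remember to terminate the cluster when finished (e.g. `aws emr terminate-clusters --cluster-ids <cluster-id>`) so it does not accrue ongoing charges.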