The Big Data on AWS training course is designed to introduce cloud-based big data solutions such as Amazon EMR, Amazon Redshift, and Amazon Kinesis, along with other features of the AWS big data platform.
The course begins with Amazon EMR, using the broad ecosystem of Hadoop tools, such as Hive and Hue, to process data. Next, it explores how to build big data environments with Amazon DynamoDB, Amazon Redshift, Amazon QuickSight, Amazon Athena, and Amazon Kinesis. The course concludes by explaining how to design big data environments for security and cost-effectiveness.
Prerequisites:
- Basic familiarity with big data technologies, including Apache Hadoop, HDFS, and SQL/NoSQL querying
- Completion of the Data Analytics Fundamentals free digital training, or equivalent experience
- Working knowledge of core AWS services and public cloud implementation
- Completion of the AWS Technical Essentials classroom training, or equivalent experience
- Basic understanding of data warehousing, relational database systems, and database design
THIS COURSE IS NOT ELIGIBLE FOR TRAINING BUNDLES.
Purpose
| To demonstrate how to design and implement big data solutions on AWS, including big data environments built for security and cost-effectiveness. |
Audience
| Individuals responsible for designing and implementing big data solutions (such as Solutions Architects and SysOps Administrators), as well as Data Scientists and Data Analysts interested in learning about big data solutions on AWS |
Role
| Data Scientist | System Administrator |
Skill Level
| Intermediate |
Style
| Workshops |
Duration
| 3 Days |
Related Technologies
| Amazon DynamoDB | Amazon Redshift | Hadoop | Cloud Computing Training | AWS |
Productivity Objectives
- Use Apache Hadoop with Amazon EMR
- Launch and configure an Amazon EMR cluster
- Utilize common programming frameworks for Amazon EMR, including Hive, Pig, and Streaming
- Use Hue to improve the ease of use of Amazon EMR
- Perform in-memory analytics with Apache Spark on Amazon EMR
- Understand how services like AWS Glue, Amazon Kinesis, Amazon Redshift, Amazon Athena, and Amazon QuickSight can be used with big data workloads
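As a taste of the "launch and configure an Amazon EMR cluster" objective, a small cluster with Hive, Hue, and Spark installed can be created from the AWS CLI. This is a minimal sketch, not course material: the cluster name, key-pair name, release label, and instance settings below are illustrative placeholders.

```shell
# Sketch only: create a small EMR cluster with Hive, Hue, and Spark.
# "MyKeyPair", the release label, and instance sizes are placeholders.
aws emr create-cluster \
  --name "training-sandbox" \
  --release-label emr-6.15.0 \
  --applications Name=Hive Name=Hue Name=Spark \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --ec2-attributes KeyName=MyKeyPair
```

In keeping with the course's cost-effectiveness theme, remember to terminate the cluster when finished (e.g. `aws emr terminate-clusters --cluster-ids <cluster-id>`) so it does not accrue ongoing charges.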