- Lab
- A Cloud Guru

Performing Real-Time Data Analysis with Kinesis
Ingesting data from many sources and acting on it quickly is becoming a core capability for many businesses. In this lab, you get hands-on experience using Kinesis Data Firehose to capture data streams and load them into Amazon S3, then perform near real-time analysis on the stream with Kinesis Data Analytics.

Table of Contents
Challenge
Create a Kinesis Data Firehose Stream
- Log in to the AWS Console and navigate to Kinesis Data Firehose.
- Create a delivery stream named captains-kfh that will send our space captain scores to a new S3 bucket that you will create.
- To save time during the lab, set the buffer sizes to the minimum values so data gets flushed from the stream faster. In a real environment, you will need to tune these values based on what you're doing with the data.
- This lab isn't focused on IAM, so an IAM role named FirehoseDeliveryRole (with some characters for uniqueness) has been provided for this stream. For an extra challenge, you can create your own role.
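The same delivery stream can also be created from code rather than the console. The following is a minimal sketch using the boto3 Firehose client, not the lab's required method. The bucket and role ARNs are placeholders you would supply, and the defaults of 1 MB and 60 seconds are assumed to be the minimum buffer values; check the current Firehose limits before relying on them.

```python
from typing import Any, Dict


def build_delivery_stream_request(
    bucket_arn: str,
    role_arn: str,
    stream_name: str = "captains-kfh",
    size_mb: int = 1,  # assumption: smallest allowed buffer size
    interval_seconds: int = 60,  # assumption: smallest allowed buffer interval
) -> Dict[str, Any]:
    """Build keyword arguments for the Firehose create_delivery_stream call."""
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "DirectPut",  # producers write directly to the stream
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,
            "BucketARN": bucket_arn,
            "BufferingHints": {
                "SizeInMBs": size_mb,
                "IntervalInSeconds": interval_seconds,
            },
        },
    }


def create_delivery_stream(request: Dict[str, Any]) -> str:
    """Send the request with boto3 and return the new stream's ARN."""
    import boto3  # imported lazily so the builder works without the AWS SDK

    firehose = boto3.client("firehose")
    response = firehose.create_delivery_stream(**request)
    return response["DeliveryStreamARN"]
```

Keeping the request builder separate from the AWS call makes it easy to inspect or adjust the buffering values before anything is created in your account.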
Challenge
Send Data to the Stream
- Log in to the provided server using the credentials in the lab.
- View the send_captains_to_cloud.py script in your user's home directory.
- Run the send_captains_to_cloud.py script using Python3 to generate and send data to Firehose. The generated data will be displayed in the terminal.
- Back in the AWS Console, monitor the Firehose stream to see data coming in.
- This may take a minute to begin populating, so refresh a few times if you don't see any data.
- Once you see data on the Console, go back to the server and stop the script.
- Pull the generated data from S3 onto the server, then inspect it. It should match what was printed in the terminal.
- Start the data-generating script again so we have data coming into the stream.
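If you want a rough idea of what a producer like this does before opening the provided script, the sketch below shows one way to generate rating events and send them to Firehose with boto3's `put_record`. It is not the contents of send_captains_to_cloud.py: the captain names, the `captain` and `rating` field names, the 1 to 10 rating range, and the half-second delay are all assumptions made for illustration.

```python
import json
import random
import time
from typing import Any, Dict

# Hypothetical captain names; the lab script's real data set is not shown here.
CAPTAINS = ["Captain A", "Captain B", "Captain C"]


def make_record(rng: random.Random) -> Dict[str, Any]:
    """Generate one made-up rating event."""
    return {"captain": rng.choice(CAPTAINS), "rating": rng.randint(1, 10)}


def send_record(firehose_client: Any, record: Dict[str, Any],
                stream_name: str = "captains-kfh") -> None:
    """Serialize a record as one JSON line and put it on the delivery stream."""
    payload = (json.dumps(record) + "\n").encode("utf-8")
    firehose_client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": payload},
    )
    print(record)  # echo each event to the terminal


def main() -> None:
    """Send events until interrupted with Ctrl+C."""
    import boto3  # imported lazily so the helpers above work without the AWS SDK

    client = boto3.client("firehose")
    rng = random.Random()
    try:
        while True:
            send_record(client, make_record(rng))
            time.sleep(0.5)  # assumed pacing between events
    except KeyboardInterrupt:
        pass
```

Calling `main()` starts the loop. Writing one JSON object per line keeps the objects that land in S3 easy to compare with the terminal output.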
Challenge
Find the Average Captain Ratings
- Create a new Kinesis Data Analytics application using the data from the captains-kfh stream.
- Again, an IAM role has been provided. Feel free to use this, or for the extra challenge, create a new role yourself.
- Using the SQL editor, create a query that will show the average rating and total rating of each captain per minute.
- Check the Amazon Kinesis Data Analytics SQL Reference documentation for help.
- An example query has been provided for you on GitHub.
- Save and run the query.
- After about a minute, you will see the results of your query streaming in.
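To make the target of this challenge concrete, here is a plain-Python illustration of the aggregation the query should produce: ratings grouped into one-minute tumbling windows per captain, with an average and a total for each group. This is not Kinesis Data Analytics SQL, where the same grouping would be written as a windowed GROUP BY. The event shape of `(epoch_seconds, captain, rating)` is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each event is (epoch_seconds, captain, rating); this shape is an assumption.
Event = Tuple[float, str, int]
WindowKey = Tuple[int, str]


def per_minute_stats(events: Iterable[Event]) -> Dict[WindowKey, Tuple[float, int]]:
    """Group events into one-minute tumbling windows per captain.

    Returns {(window_start_seconds, captain): (average_rating, total_rating)}.
    """
    totals: Dict[WindowKey, int] = defaultdict(int)
    counts: Dict[WindowKey, int] = defaultdict(int)
    for timestamp, captain, rating in events:
        window_start = int(timestamp // 60) * 60  # floor to the start of the minute
        key = (window_start, captain)
        totals[key] += rating
        counts[key] += 1
    return {key: (totals[key] / counts[key], totals[key]) for key in totals}
```

Because the windows do not overlap, each event contributes to exactly one result row, which is why results appear roughly once per minute.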
Challenge
Find Anomalous Captain Ratings
- Using the SQL editor, create a query that will rank the incoming captain ratings by how anomalous the rating is, displaying the most anomalous values first.
- Check the Random Cut Forest documentation for help.
- An example query has been provided for you on GitHub.
- Save and run the query.
- After a few seconds, you will see the results of your query streaming in.
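The idea of ranking values by an anomaly score can be illustrated offline with a much simpler technique than Random Cut Forest. The sketch below uses an absolute z-score as a stand-in score; it is not Random Cut Forest and will not reproduce its results, but it shows the shape of the output the challenge asks for: each rating paired with a score, sorted with the most anomalous first.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def rank_by_anomaly(ratings: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (rating, score) pairs sorted with the most anomalous first.

    The score is the absolute z-score, a simple stand-in for the
    anomaly score that Random Cut Forest would assign.
    """
    mu = mean(ratings)
    sigma = pstdev(ratings)
    if sigma == 0:
        # All ratings are identical, so none stands out.
        return [(r, 0.0) for r in ratings]
    scored = [(r, abs(r - mu) / sigma) for r in ratings]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, `rank_by_anomaly([5, 5, 6, 5, 100])` places the rating of 100 at the top of the list.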

