A Cloud Guru
Querying Real Estate Data in S3 Using Amazon Athena
In this AWS hands-on lab, you’ll use Amazon Athena to query sample sales data for Manhattan houses stored in Amazon S3. You’ll first upload the sample data to Amazon S3 and partition it in Hive format, then create a table over that data in Amazon Athena, and finally use the Athena query editor to run SQL queries against the real estate data.
Table of Contents
Populate S3 with Manhattan Real Estate Data
Download the CSV files from this lab's GitHub repository, and upload them to Amazon S3.
Note: For partitioning purposes, create a separate folder for each CSV file.
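The folder-per-file layout is what lets Athena treat each folder as one partition. Below is a minimal sketch of building Hive-style object keys. It assumes a hypothetical partition column named `year` and hypothetical file names; the lab's actual names come from its GitHub repository and are not shown here.

```python
from pathlib import PurePosixPath


def hive_partition_key(prefix: str, column: str, value: str, filename: str) -> str:
    """Return an S3 object key in Hive format: <prefix>/<column>=<value>/<filename>."""
    return str(PurePosixPath(prefix) / f"{column}={value}" / filename)


# Hypothetical partition values and file names, for illustration only.
files = {"2018": "sales_2018.csv", "2019": "sales_2019.csv"}

keys = [
    hive_partition_key("real-estate", "year", year, name)
    for year, name in sorted(files.items())
]

for key in keys:
    print(key)
```

Each `column=value` folder name is what a Hive-compatible engine reads as the partition value for every file inside that folder.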
Set Up Amazon Athena
Create a folder in Amazon S3, then set that folder as the query result location in Amazon Athena.
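The same setting exists outside the console: the Athena `StartQueryExecution` API accepts a `ResultConfiguration` containing an `OutputLocation`. The sketch below only builds that value and does not call AWS. The bucket and folder names are hypothetical.

```python
def result_configuration(bucket: str, folder: str) -> dict:
    """Build the ResultConfiguration value that tells Athena where to write query results."""
    folder = folder.strip("/")  # tolerate leading or trailing slashes in the folder name
    return {"OutputLocation": f"s3://{bucket}/{folder}/"}


# Hypothetical bucket and folder names, for illustration only.
config = result_configuration("my-athena-lab-bucket", "query-results")
print(config["OutputLocation"])
```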
Create a Table from the S3 Bucket Metadata
Create a database and a table using the first and second queries found in the README file. If everything is successful, you should see the table listed under Tables and views.
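The README queries are not reproduced on this page. As a rough illustration of what such statements typically look like, the sketch below generates a `CREATE DATABASE` statement and a `CREATE EXTERNAL TABLE` statement for partitioned CSV data. The database, table, column, partition, and bucket names are all hypothetical and may differ from the lab's README.

```python
def create_database_sql(database: str) -> str:
    """Return a statement that creates the database if it does not exist yet."""
    return f"CREATE DATABASE IF NOT EXISTS {database};"


def create_table_sql(database, table, columns, partition_column, location) -> str:
    """Return DDL for an external table over comma-delimited files with a header row.

    Assumes simple CSV content with no quoted commas inside fields.
    """
    cols = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n  {cols}\n)\n"
        f"PARTITIONED BY (`{partition_column}` string)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{location}'\n"
        "TBLPROPERTIES ('skip.header.line.count'='1');"
    )


# Hypothetical schema and location, for illustration only.
columns = [("address", "string"), ("sale_price", "bigint")]
ddl = create_table_sql(
    "real_estate_db", "sales", columns, "year", "s3://my-athena-lab-bucket/real-estate/"
)

print(create_database_sql("real_estate_db"))
print(ddl)
```

The partition column is declared in `PARTITIONED BY` rather than in the column list, because its values come from the folder names and not from the CSV contents.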
Add Partition Metadata
Load the partitions and confirm they have been loaded using the third command in the README file.
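In Athena, Hive-style partitions are commonly loaded with `MSCK REPAIR TABLE` and listed with `SHOW PARTITIONS`. Whether the README uses exactly these statements is an assumption. The sketch below generates both for a hypothetical table named `sales`, with the database assumed to be already selected in the query editor.

```python
from typing import List


def partition_statements(table: str) -> List[str]:
    """Return statements that load Hive-style partitions and then list them."""
    return [
        # Scans the table location for column=value folders and registers them.
        f"MSCK REPAIR TABLE {table};",
        # Lists the partitions now registered for the table.
        f"SHOW PARTITIONS {table};",
    ]


# Hypothetical table name, for illustration only.
statements = partition_statements("sales")

for statement in statements:
    print(statement)
```

Run the statements one at a time. If the second one returns no rows, the folder names probably do not follow the `column=value` pattern.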
Query Data Using SQL
Query the data using SQL. You can use several different SQL queries to explore the data; for example, you can use the fourth command in the README file.
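As one example of an exploratory query, the sketch below builds a statement that counts sales and averages a price column within a single partition. The table name `sales`, the numeric column `sale_price`, and the partition column `year` are hypothetical and may not match the lab's schema.

```python
def average_by_partition_sql(table, value_column, partition_column, partition_value) -> str:
    """Return a query that aggregates one partition of the table.

    Assumes value_column has a numeric type so that AVG can be applied to it.
    """
    return (
        f"SELECT {partition_column}, COUNT(*) AS sales, AVG({value_column}) AS avg_value\n"
        f"FROM {table}\n"
        f"WHERE {partition_column} = '{partition_value}'\n"
        f"GROUP BY {partition_column};"
    )


# Hypothetical table, column, and partition value, for illustration only.
query = average_by_partition_sql("sales", "sale_price", "year", "2019")
print(query)
```

Filtering on the partition column lets Athena read only the matching folder instead of scanning every file in the table location.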