- Course
Big Data Foundations: Storage and Data Formats
Big data performance and efficiency depend on proper storage and data organization. This course will teach you how to use distributed storage, select file formats like Parquet and ORC, and apply data layout strategies effectively.
What you'll learn
Big data workloads often face challenges such as inefficient storage, slow queries, and poorly organized data.
In this course, Big Data Foundations: Storage and Data Formats, you’ll gain the ability to design and manage storage and data formats that improve performance, efficiency, and scalability.
First, you’ll explore distributed storage systems and their core concepts, including replication, partitioning, and durability.
Next, you’ll discover big data file and table formats, how to choose between row-based and columnar formats, and the roles of schema enforcement, metadata, and compression.
Finally, you’ll learn how to organize data using partitioning, bucketing, sorting, and modern table formats like Iceberg, Delta Lake, and Hudi.
When you’re finished with this course, you’ll have the storage design, file format, and data layout skills needed to manage big data workloads effectively.
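As a rough illustration of the layout strategies named above (partitioning, bucketing, and sorting), here is a minimal sketch in plain Python. It is not taken from the course: the function names, the `bucket-00000` file naming, and the use of CRC32 as the bucket hash are all assumptions made for the example, and real engines such as Hive or Spark use their own hash functions and file naming.

```python
import zlib
from collections import defaultdict


def partition_path(record, partition_keys):
    """Build a Hive-style directory path such as 'year=2024/country=DE'."""
    return "/".join(f"{key}={record[key]}" for key in partition_keys)


def bucket_id(value, num_buckets):
    """Map a value to a bucket with a stable hash.

    CRC32 is an illustrative choice only; real engines use other hashes.
    """
    return zlib.crc32(str(value).encode("utf-8")) % num_buckets


def layout(records, partition_keys, bucket_key, num_buckets):
    """Group records into hypothetical file paths: partition dir + bucket file."""
    files = defaultdict(list)
    for record in records:
        directory = partition_path(record, partition_keys)
        bucket = bucket_id(record[bucket_key], num_buckets)
        files[f"{directory}/bucket-{bucket:05d}"].append(record)
    # Sort rows inside each file by the bucket key so range lookups scan less data.
    for rows in files.values():
        rows.sort(key=lambda row: row[bucket_key])
    return dict(files)


if __name__ == "__main__":
    events = [
        {"year": 2024, "country": "DE", "user_id": 42},
        {"year": 2024, "country": "DE", "user_id": 7},
        {"year": 2023, "country": "US", "user_id": 42},
    ]
    for path, rows in sorted(layout(events, ["year", "country"], "user_id", 4).items()):
        print(path, rows)
```

Partitioning lets a query skip whole directories that cannot match its filter, while bucketing keeps rows with the same key in the same file within a partition, which is why the two are often combined.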