This course, Applying MapReduce to Common Data Problems, walks you through three common MapReduce patterns: summarizing numeric data, filtering large datasets, and building an index for fast data lookup. First, you'll learn how to start "Thinking MapReduce": what's involved and how a problem needs to be broken down to fit the model. Next, you'll explore how to compute numeric summary metrics and how to filter large datasets. Finally, you'll wrap up the course by learning about building indices and why an inverted index is so important in the context of search engines. After watching this course, you'll have the confidence to spot patterns in MapReduce problems and will be on your way to mastering this programming model.
A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.
Course Overview
Hi, my name is Janani Ravi, and I'm very happy to meet you today. I have a master's degree in Electrical Engineering from Stanford, and have worked with companies such as Microsoft, Google, and Flipkart. At Google, I was one of the first engineers working on real-time collaborative editing in Google Docs, and I hold four patents for its underlying technologies. I currently work on my own startup, Loony Corn, a studio for high-quality video content. This course focuses on the backbone of the data technologies, the MapReduce programming model. You'll see that different problems require the application of a different MapReduce design pattern, and you'll study three of the most common patterns: numeric summarization, filtering records, and building an index. This course will help you see how to identify the key-value output of the mapper, and the combining operation performed in the reducer, for summarization, filtering, and indexing problems. All of this is accompanied by actual code-alongs in Java, so you can see your solutions come to life: you'll see how marital status affects the working hours of individuals based on census data, and build an inverted index to help power your basic search engine.
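To make the idea of "key-value output of the mapper" and "combining operation in the reducer" concrete, here is a minimal sketch of the inverted-index pattern in plain Java. It is not the course's code and does not use Hadoop: the class name `InvertedIndexSketch`, the method names, and the sample documents are all hypothetical, and the shuffle step that a real MapReduce framework performs is simulated here with an in-memory map.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical, framework-free illustration of the inverted-index pattern.
class InvertedIndexSketch {

    // Map step: emit one (word, documentId) pair for every word in the document.
    static List<Map.Entry<String, String>> map(String docId, String text) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleEntry<>(token, docId));
            }
        }
        return pairs;
    }

    // Reduce step: combine every document ID seen for one word
    // into a sorted set, which also removes duplicates.
    static SortedSet<String> reduce(List<String> docIds) {
        return new TreeSet<>(docIds);
    }

    // Driver: run the map step, group the pairs by key (a stand-in for the
    // framework's shuffle), then run the reduce step once per word.
    static Map<String, SortedSet<String>> buildIndex(Map<String, String> docs) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            for (Map.Entry<String, String> pair : map(doc.getKey(), doc.getValue())) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, SortedSet<String>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            index.put(entry.getKey(), reduce(entry.getValue()));
        }
        return index;
    }

    public static void main(String[] args) {
        // Sample documents invented for this sketch.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("doc1", "MapReduce filters large datasets");
        docs.put("doc2", "An inverted index powers search");
        docs.put("doc3", "MapReduce can build an inverted index");
        buildIndex(docs).forEach((word, ids) -> System.out.println(word + " -> " + ids));
    }
}
```

The mapper's key is the word and its value is the document ID, so the reducer for a given word receives every document that contains it. Numeric summarization and filtering follow the same shape with a different key choice and a different combining operation.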