Indexing Data in Elasticsearch

This course explains the index distribution architecture of Elasticsearch, cluster configuration, shards and replicas, similarity models, advanced search, and mixed-language documents, all of which improve the performance of search queries.
Course info
Rating
(20)
Level
Intermediate
Updated
Mar 22, 2018
Duration
2h 47m
Description

Getting Elasticsearch up and running is very simple, but tuning it to have low latency and high performance for search queries requires a deep understanding of the index distribution architecture. In this course, Indexing Data in Elasticsearch, you will understand the structure of distributed indices and advanced search constructs such as similarity models, segment merging, suggesters, fuzzy searches and working with mixed-language documents. First, you will study why shard overallocation is a good thing and how you can configure your cluster to avoid the split-brain scenario. Then, you will see how indices can be configured to use different similarity models and how to use force merging of segments to improve the performance of large indices. Next, you will explore how to cache prudently and use advanced search features. Finally, you will learn to deal with different languages in the same document with the ICU plugin. At the end of this course, you will have a deep understanding of how indexing works in Elasticsearch and be comfortable with advanced query constructs.
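
As a rough illustration of the split-brain point above: in Elasticsearch versions from around the time of this course (before 7.0), the usual safeguard was to set `discovery.zen.minimum_master_nodes` to a strict majority of the master-eligible nodes. The helper below is a hypothetical sketch of that arithmetic only, not material taken from the course; the function name is an assumption made for illustration.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Return the quorum (strict majority) of master-eligible nodes.

    Hypothetical helper: it only computes the value commonly assigned to
    discovery.zen.minimum_master_nodes in pre-7.0 clusters.
    """
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    # Integer division, then add one: more than half of the nodes.
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    for nodes in (1, 2, 3, 5):
        print(nodes, "master-eligible ->", minimum_master_nodes(nodes))
```

With two master-eligible nodes the quorum is also two, so losing either node prevents a master from being elected; this is why three master-eligible nodes is the usual minimum for a fault-tolerant cluster.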

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

Section Introduction Transcripts

Course Overview
Hi, my name is Janani Ravi, and welcome to this course on Elasticsearch Indexing. I'll introduce myself first. I have a master's degree in electrical engineering from Stanford and have worked at companies such as Microsoft, Google, and Flipkart. At Google, I was one of the first engineers working on real-time collaborative editing in Google Docs, and I hold four patents for its underlying technologies. I currently work on my own startup, Loonycorn, a studio for high-quality video content.

Getting Elasticsearch up and running is very simple. Tuning it to have low latency and high performance for search queries requires a deep understanding of the index distribution architecture. In this course, we'll study why shard overallocation is a good thing, and how you can configure your cluster to avoid the split-brain scenario. We'll then study how our indices can be configured to use different similarity models, which affect how our documents are scored. We'll see how we can use force merging of segments to improve the performance of large indices that have been around a long time. We'll also study how we can use caching prudently to improve query performance.

Elasticsearch offers a number of advanced search features such as word, phrase, and context suggesters, fuzzy searches, and autocomplete. We'll cover examples of all of these. This course also covers the Elasticsearch functionality for dealing with different languages in the same document. We'll specifically cover installing and using the ICU plugin for Asian languages. At the end of this course, you should have a deep understanding of how indexing works in Elasticsearch, and be comfortable with advanced query constructs.
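
The fuzzy searches mentioned in the overview match terms that lie within a small edit distance of the query term. Here is a minimal sketch of that idea, assuming plain Levenshtein distance (Elasticsearch itself uses a Damerau-Levenshtein variant in which swapping two adjacent characters also counts as one edit) and the length thresholds documented for `fuzziness: AUTO`. The function names are hypothetical and the code is not taken from the course.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character inserts, deletes, or substitutions
    needed to turn a into b (classic dynamic-programming form)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def auto_fuzziness(term: str) -> int:
    """Allowed edits under the default AUTO rule: terms of 0-2 characters
    must match exactly, 3-5 allow one edit, longer terms allow two."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def fuzzy_match(query: str, candidate: str) -> bool:
    """True if candidate is within the AUTO edit budget of query."""
    return levenshtein(query, candidate) <= auto_fuzziness(query)


if __name__ == "__main__":
    for word in ("elastik", "plastic", "search"):
        print(word, fuzzy_match("elastic", word))
```

The edit budget depends on the length of the query term, so short terms are matched strictly while longer ones tolerate more typos.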