Mining Data from Text

This course discusses text and document feature vectors that can be passed into machine learning models, topic modeling using Latent Semantic Analysis, Latent Dirichlet Allocation, Non-negative Matrix Factorization, and keyword extraction using RAKE.
Course info
Level
Intermediate
Updated
Jun 28, 2019
Duration
2h 22m
Table of contents
Course Overview
Modeling Text Using Natural Language Processing
Building Classification Models Using Text Data
Understanding Topic Modeling
Implementing Topic Modeling
Understanding and Implementing Keyword Extraction
Description

A large part of the appeal of deep learning models is their ability to work with unstructured data types such as text, images, and video. However, such models are only as good as the feature vectors that they operate on. In this course, Mining Data from Text, you will gain the ability to build highly optimized and efficient feature vectors from textual and document data. First, you will learn how to represent documents as numeric data using simple numeric identifiers for individual words as well as more elegant methods such as term frequency and inverse document frequency. Next, you will discover how to perform topic modeling using techniques such as latent semantic analysis, latent Dirichlet allocation, and non-negative matrix factorization. Finally, you will explore how to implement keyword extraction using a popular algorithm, RAKE. When you're finished with this course, you will have the skills and knowledge to build efficient and optimized feature vectors from a large document corpus and use those feature vectors in building powerful machine learning models.
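
To make the idea of term frequency and inverse document frequency concrete, here is a minimal sketch in plain Python. It is not taken from the course materials: the function names `tokenize` and `tf_idf`, the whitespace tokenizer, the smoothed IDF formula, and the three sample sentences are all assumptions chosen for illustration. In practice a library vectorizer would likely be used instead.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real pipelines
    # usually also strip punctuation and stop words.
    return text.lower().split()


def tf_idf(docs):
    """Return (vocabulary, vectors), one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vocab = sorted(df)

    # Assumption: a smoothed IDF, log((1 + N) / (1 + df)) + 1,
    # one of several common variants.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency is the term's share of the document's tokens.
        vectors.append(
            [(counts[t] / total) * idf[t] if total else 0.0 for t in vocab]
        )
    return vocab, vectors


# Hypothetical sample corpus.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]
vocab, vectors = tf_idf(docs)
```

Each document becomes a fixed-length numeric vector with one position per vocabulary term. A term that appears in fewer documents, such as "mat", gets a higher weight than an equally frequent term that appears in more of them, such as "cat".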

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

Section Introduction Transcripts

Course Overview
Hi, my name is Janani Ravi, and welcome to this course on Mining Data from Text. A little about myself. I have a master's degree in electrical engineering from Stanford and have worked at companies such as Microsoft, Google, and Flipkart. At Google, I was one of the first engineers working on real-time collaborative editing in Google Docs, and I hold four patents for its underlying technologies. I currently work on my own startup, Loonycorn, a studio for high-quality video content.

A large part of the appeal of deep learning models is their ability to work with unstructured data types such as text, images, and video. However, such models are only as good as the feature vectors they operate on. In this course, you will gain the ability to build highly optimized and efficient feature vectors from textual and document data. First, you will learn how to represent documents as numeric data using simple numeric identifiers for individual words, as well as more elegant methods such as term frequency and inverse document frequency. Next, you will discover how to perform topic modeling using techniques such as latent Dirichlet allocation and non-negative matrix factorization. Finally, you will explore how to implement keyword extraction using a popular algorithm, RAKE. When you're finished with this course, you will have the skills and knowledge to build efficient and optimized feature vectors from a large document corpus and use those feature vectors in building powerful ML models.