Language Modeling with Recurrent Neural Networks in TensorFlow

If you are working with text data using neural networks, RNNs are a natural choice for sequences. This course works through two language modeling problems using RNNs: optical character recognition (OCR) and text generation using character prediction.
Course info
Rating
(14)
Level
Intermediate
Updated
Mar 30, 2018
Duration
2h 35m
Description

In this course, Language Modeling with Recurrent Neural Networks in TensorFlow, you will learn how RNNs are a natural fit for language modeling because of their inherent ability to store state, and how their performance and predictive abilities can be improved by using long memory cells such as the LSTM and the GRU cell. First, you will learn how to model OCR as a sequence labeling problem. Next, you will explore how you can architect an RNN to predict the next character based on past sequences. Finally, you will focus on understanding advanced functions that the TensorFlow library offers, such as bidirectional RNNs and the multi-RNN cell. By the end of this course, you will know how to apply and architect RNNs for use cases such as image recognition, character prediction, and text generation, and you will be comfortable using TensorFlow libraries for advanced functionality, such as the bidirectional RNN and the multi-RNN cell.

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

Section Introduction Transcripts

Course Overview
Hi, my name is Janani Ravi, and welcome to this course on Language Modeling using Recurrent Neural Networks in TensorFlow. A little about myself: I have a master's degree in electrical engineering from Stanford and have worked at companies such as Microsoft, Google, and Flipkart. At Google, I was one of the first engineers working on real-time collaborative editing in Google Docs, and I hold four patents for its underlying technologies. I currently work on my own startup, Loonycorn, a studio for high-quality video content. RNNs are a natural fit for language modeling because they work very well with sequence data. An RNN's performance and its predictive abilities can be improved by using long memory cells such as the LSTM and the GRU cell. The TensorFlow library has powerful built-in functions that allow us to build complex RNNs. This course applies RNNs to solve common problems in language modeling. We first build a neural network for optical character recognition, or OCR. Instead of using image data in two dimensions to identify patterns, our RNN uses the context in which a character occurs to perform OCR. That recognition performance can be further improved by using bidirectional RNNs, which use future data to predict the current state. The second problem is one of character prediction. A trained model can be used to generate very natural-sounding sentences, where every character input is used to predict the next character in the sequence. A model fully trained on technical papers can generate sentences which mimic the writing style of those papers. This course explains the RNN architecture for these problems and implements a fully fledged neural network to find their solutions.

Applying Bidirectional Recurrent Neural Networks to Word Recognition
Hi and welcome to this course on Language Modeling with Recurrent Neural Networks in TensorFlow. This very first module starts off with understanding how we can apply bidirectional recurrent neural networks to word recognition. RNNs, or recurrent neural networks, are built to understand the meaning of events which occur over a certain time period. They work very well with time series or sequential data. This is because the recurrent neuron, as opposed to a simple feedforward neuron, has the ability to maintain internal state. The recurrent neuron has memory, which allows it to learn and remember the past. Now the past can be as simple as feeding back the output from the last time period to the input of the neuron in the current time period, but it can also mean something more complicated. This is where LSTM and GRU cells come in. LSTM and GRU cells are recurrent neurons that can remember even the distant past. They can remember important events going very far back in time. LSTM stands for long short-term memory, and these are specially constructed memory cells which allow the neuron to retain state across very long time sequences, resulting in very deep recurrent neural networks. The GRU cell is a simplified version of the LSTM cell with better performance. In certain kinds of batch processing operations, you'll find that RNNs perform better when they make two passes over the input data, one in a forward direction and another in a backward direction. These RNNs are called bidirectional RNNs. Bidirectional RNNs far outperform ordinary RNNs in word recognition and speech recognition, which is why they're widely used in language modeling.
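
To make the two-pass idea concrete, here is a minimal sketch, not code from the course, of how a bidirectional RNN built from two GRU cells might be wired up with the TensorFlow 1.x API; the tensor shapes and sizes are assumptions chosen purely for illustration.

```python
import tensorflow as tf  # TensorFlow 1.x style API

# Assumed shapes: a batch of sequences, each time step a feature vector.
batch_size, num_steps, num_features, num_hidden = 32, 14, 128, 64

inputs = tf.placeholder(tf.float32, [batch_size, num_steps, num_features])

# One GRU cell reads the sequence forwards, another reads it backwards.
forward_cell = tf.nn.rnn_cell.GRUCell(num_hidden)
backward_cell = tf.nn.rnn_cell.GRUCell(num_hidden)

# bidirectional_dynamic_rnn runs both passes and returns one output tensor per
# direction, each of shape [batch_size, num_steps, num_hidden].
(fw_out, bw_out), _ = tf.nn.bidirectional_dynamic_rnn(
    forward_cell, backward_cell, inputs, dtype=tf.float32)

# Concatenating the two directions gives every time step access to both
# past and future context.
outputs = tf.concat([fw_out, bw_out], axis=2)
```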

Implementing Character Recognition Using Bidirectional RNNs
Hi, and welcome to this module where we'll implement character recognition using bidirectional RNNs. Our RNN implementation will use gated recurrent unit cells, a modified version of the LSTM which is more performant and more efficient. We'll work on a modified version of the MIT OCR dataset. We'll start off by building a conventional RNN with just one forward layer and use this for character recognition. We'll measure the accuracy of our setup. Once we've done so, we'll manually build a bidirectional RNN with a forward layer as well as a backward layer, which takes in the input in reverse. Bidirectional RNNs need not be constructed manually; the TensorFlow library offers a class which allows us to set one up in a very easy manner. That's what we'll do in the third demo. OCR can be thought of as an image classification problem if you focus on the image and the patterns within it, but here we are going to model OCR as a sequence labeling problem. We'll build a recurrent neural network which focuses on identifying individual characters, represented in the form of images, by understanding the context in which each character occurs. Every word will be represented as a sequence of character images.
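
As a rough sketch of the sequence-labeling framing described above (the shapes, names, and class count are assumptions, not the course's exact code), a single forward GRU layer can emit one character prediction per time step like this:

```python
import tensorflow as tf  # TensorFlow 1.x style API

# Assumed dimensions: each word is a sequence of character images,
# each image flattened into a pixel vector.
batch_size, max_word_len, pixels_per_char, num_hidden, num_classes = 32, 14, 128, 64, 26

char_images = tf.placeholder(tf.float32, [batch_size, max_word_len, pixels_per_char])

# A single forward GRU layer over the sequence of character images.
cell = tf.nn.rnn_cell.GRUCell(num_hidden)
outputs, _ = tf.nn.dynamic_rnn(cell, char_images, dtype=tf.float32)
# outputs: [batch_size, max_word_len, num_hidden]

# One shared dense layer turns every time step's output into logits over the
# character classes, i.e. one label per character image in the word.
logits = tf.layers.dense(outputs, num_classes)
predictions = tf.argmax(logits, axis=2)
```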

Applying RNNs to Character Prediction for Text Generation
Hi and welcome to this module. The problem that we solve here and in the next module is rather interesting. We'll apply RNNs to predict the next character in a sequence and use this prediction mechanism to generate arbitrary sentences of text. You'll be surprised at how coherent the generated text is once you've trained this RNN on a sufficiently large dataset for a sufficient number of epochs. In recent times, language modeling has become a particularly important area of research in machine learning. Language modeling involves applying a statistical probability distribution over word sequences and using that distribution to generate text. Language modeling is widely used in areas such as speech recognition. Text prediction and generation is one of several classic language modeling problems. In this module, we'll study how to set up a recurrent neural network that can be used for character-level prediction and use this prediction to generate text. We'll use something called multi-RNNs, which are recurrent neural networks built from multiple RNN cells that act as one memory cell. You'll see that a key technique for generating very plausible sentences is to smartly reinitialize the state of the RNN while we are predicting the next character. The evaluation measure that we'll use to determine how well our language model performs is called perplexity. Perplexity is a measure of how many choices a model has to choose between in order to make the next prediction.
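
As a quick illustration of how perplexity relates to the training loss (the number below is made up purely for illustration, not a result from the course):

```python
import math

# Hypothetical average per-character cross-entropy, in nats.
avg_cross_entropy = 2.3

# Perplexity is the exponential of the average cross-entropy: roughly,
# the effective number of equally likely characters the model is
# choosing between at each prediction step. Lower is better.
perplexity = math.exp(avg_cross_entropy)
print(round(perplexity, 1))  # ~10.0, as if choosing among about 10 characters
```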

Implementing RNNs for Character Prediction Used to Generate Text
Hi and welcome to this module where we'll build a recurrent neural network for character prediction, and once we've trained this model, we'll use it to generate text of arbitrary length. We'll see how well the generated text mimics actual, natural speech. The model that we'll build will train on abstracts of technical papers. These papers are on topics such as machine learning, neural networks, and deep learning, so very topical. The cell that we'll use to build the RNN layers will be a multi-RNN cell. We'll have two GRU cells which come together to act as one. Once the model has been trained, we'll generate text one character at a time; every generated character will be used to predict the next character in the sequence. In order to improve how our model performs, we'll reinitialize the state of the RNN during prediction with the last recurrent activation. The evaluation metric that we'll use to see how well our character prediction language model performs is perplexity.
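
As a rough sketch of how two GRU cells can be stacked into a single composite cell with the TensorFlow 1.x API (the sizes and names here are assumptions, not the course's exact code):

```python
import tensorflow as tf  # TensorFlow 1.x style API

num_hidden = 128  # assumed hidden size per GRU layer

# Two GRU cells wrapped in a MultiRNNCell behave as one memory cell: the first
# layer's output at each time step becomes the second layer's input.
multi_cell = tf.nn.rnn_cell.MultiRNNCell(
    [tf.nn.rnn_cell.GRUCell(num_hidden) for _ in range(2)])

# zero_state produces the initial state; during text generation, the state
# returned after predicting each character would be fed back in place of this
# so the network carries context forward from one character to the next.
initial_state = multi_cell.zero_state(batch_size=1, dtype=tf.float32)
```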