Debugging and Monitoring TensorFlow Programs

This course goes deep into two specific tools in the TensorFlow toolkit - tfdbg and TensorBoard. These tools can be used to examine the internal state of TensorFlow programs and to visualize execution metrics and state.
Course info
Level
Intermediate
Updated
Mar 21, 2018
Duration
2h 17m
Description

An important facet of building good ML models is the ability to debug TensorFlow code when your models do not converge. Traditional debuggers fall short in this regard, which is why tfdbg and TensorBoard are important tools to have in your toolkit. In this course, Debugging and Monitoring TensorFlow Programs, you will learn how to adapt TensorFlow commands and library functions to help debug your programs, and how to use the specialized tools tfdbg and TensorBoard. First, you will go over TensorFlow's special features for debugging your code. Partial graph executions, tf.Print() and tf.Assert() statements, traditional Python debuggers, and tf.py_func(), which interposes arbitrary Python code into your computation graph, all help debug the graph build phase. Next, you will see that the specialized TensorFlow debugger tfdbg works very much like traditional Python debuggers but can also step into session.run() statements and display the state of your computation graph at every step. It also has filters such as has_inf_or_nan, which lets you break at the exact point your model begins to diverge. Finally, you will be shown TensorBoard, a browser-based tool that helps you visualize your computation graph and view how control flows through your code. In addition, it can be used to display execution metrics and the current state of your program. After finishing this course, you will be closer to mastering TensorFlow, equipped with important tools to build and debug robust machine learning models.
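To make the filter idea concrete, here is a minimal plain-Python sketch of what a check like has_inf_or_nan does conceptually: walk the intermediate values of a computation in order and stop at the first one that contains an infinity or a NaN. This is not tfdbg's implementation or API. The function names `contains_inf_or_nan` and `first_bad_step`, and the sample step names and values, are hypothetical and chosen only for illustration.

```python
import math


def contains_inf_or_nan(values):
    """Return True if any number in a (possibly nested) list is inf or NaN."""
    for v in values:
        if isinstance(v, (list, tuple)):
            if contains_inf_or_nan(v):
                return True
        elif math.isinf(v) or math.isnan(v):
            return True
    return False


def first_bad_step(steps):
    """Return the name of the first step whose output has inf/NaN, else None.

    `steps` is an ordered list of (name, values) pairs standing in for the
    intermediate tensors a debugger would record during one run of a graph.
    """
    for name, values in steps:
        if contains_inf_or_nan(values):
            return name
    return None


# Hypothetical record of intermediate values from one training step.
dumped = [
    ("logits", [[2.0, -1.0], [0.5, 0.0]]),
    ("softmax", [[0.95, 0.05], [0.62, 0.38]]),
    ("log", [[-0.05, -3.0], [float("-inf"), -0.97]]),  # first bad value
    ("loss", [float("nan")]),                          # NaN propagates here
]

bad = first_bad_step(dumped)
print(bad)
```

The point of stopping at the first offending step is that later values, such as the loss, are usually only symptoms; the earliest bad value is where the investigation should start.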

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

More from the author
Building Features from Image Data
Advanced
2h 10m
Aug 13, 2019
Designing a Machine Learning Model
Intermediate
3h 25m
Aug 13, 2019
More courses by Janani Ravi
Section Introduction Transcripts

Course Overview
Hi, my name is Janani Ravi and welcome to this course on Debugging and Monitoring TensorFlow Programs. I'll introduce myself. I have a master's degree in electrical engineering from Stanford and have worked at companies such as Microsoft, Google, and Flipkart. At Google, I was one of the first engineers working on real-time collaborative editing in Google Docs and I hold four patents for its underlying technologies. I currently work on my own startup, Loonycorn, a studio for high-quality video content. TensorFlow offers a number of special features to debug your code. Partial graph executions, tf.Print and tf.Assert statements, traditional Python debuggers, and tf.py_func, which can be used to interpose arbitrary Python code into your computation graph, all help debug the graph build phase. In order to debug the execution of your computation graph, you need the specialized TensorFlow debugger tfdbg. It works very much like traditional Python debuggers, but has the ability to step into session.run statements and display the state of the computation graph at every step. This debugger also has filters like the has_inf_or_nan filter, which allows you to break at the exact point your model begins to diverge. This course also covers TensorBoard, a browser-based tool that allows you to visualize your computation graph and view how control flows through your code. In addition, it can be used to display execution metrics and the current state of your program. This course will help you gain mastery over TensorFlow by equipping you with important tools to build and debug robust machine learning models.
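As a rough illustration of partial graph execution, the sketch below builds a toy computation graph in plain Python and evaluates only the nodes that a requested intermediate result depends on. In TensorFlow's session-based API, the analogous move is to ask session.run for an intermediate tensor rather than the final output. This is not TensorFlow code: the `Node` class, the `run` function, and the node names are all hypothetical and exist only to show the idea.

```python
class Node:
    """A node in a toy computation graph: an operation plus its input nodes."""

    def __init__(self, name, op, inputs=()):
        self.name = name
        self.op = op
        self.inputs = tuple(inputs)


def run(fetch, evaluated=None):
    """Evaluate `fetch` and only the nodes it depends on.

    Returns (value, evaluated), where `evaluated` maps the name of every
    node that actually ran to its computed value.
    """
    if evaluated is None:
        evaluated = {}
    if fetch.name not in evaluated:
        # Recursively evaluate the inputs first, then apply this node's op.
        args = [run(node, evaluated)[0] for node in fetch.inputs]
        evaluated[fetch.name] = fetch.op(*args)
    return evaluated[fetch.name], evaluated


# Toy graph: output = (x * w) + bias
x = Node("x", lambda: 3.0)
w = Node("w", lambda: 2.0)
product = Node("product", lambda a, b: a * b, [x, w])
bias = Node("bias", lambda: 1.0)
output = Node("output", lambda a, b: a + b, [product, bias])

# Fetch only the intermediate node; "bias" and "output" never run.
value, ran = run(product)
print(value, sorted(ran))
```

Fetching an intermediate node this way narrows a problem to one part of the graph: if `product` is already wrong, there is no need to look at anything downstream of it.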