Gaurav Singhal

Implement Hyperparameter Tuning for TensorFlow 2.0

  • Jul 31, 2020
  • 10 Min read
  • 647 Views

Introduction

Remember how you used to tune the radio to improve the channel bandwidth for better sound quality and less background noise?

Similarly, in machine learning (ML), you can improve the accuracy of a model (learning algorithm) by tuning hyperparameters, such as the learning rate. Hyperparameters are settings whose values you choose, rather than learn from data, to get the best performance from a model.
Hyperparameter tuning is also known as hyperparameter optimization. Many programmers rely on exhaustive manual search, which is computationally expensive and not very interactive. TensorFlow 2.0 introduced the TensorBoard HParams dashboard to save time and provide better visualization in the notebook.

Model optimization is a continuous process, as shown in the image below:

model optimization

This guide uses the built-in MNIST dataset, which can be loaded directly through the Keras datasets API. Before jumping into the implementation, let's get familiar with some terms.

What is a Hyperparameter?

In neural network (NN) design, hyperparameters control how the model learns the node weights that capture patterns in images, text, or speech. Their values are set before training starts and do not change during training.
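To make the distinction concrete, here is a minimal sketch in plain Python (not TensorFlow) that fits a one-weight model by gradient descent. The toy data, the `train` function, and the hyperparameter values are all assumptions chosen for illustration: `learning_rate` and `epochs` are fixed before training starts, while the weight `w` is what training learns.

```python
# Hyperparameters: chosen before training and never updated by it.
hyperparams = {"learning_rate": 0.1, "epochs": 20}

# Toy data following y = 2 * x (made up for this illustration).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]


def train(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # model parameter: learned during training
    n = len(xs)
    for _ in range(epochs):
        # Gradient of mean((w * x - y) ** 2) with respect to w.
        grad = sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


w = train(xs, ys, **hyperparams)
print(f"learned weight: {w:.4f}")
print(f"hyperparameters after training: {hyperparams}")
```

The weight changes on every pass, whereas the dictionary of hyperparameters is the same before and after training. Tuning means rerunning `train` with different dictionaries and comparing the results.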

You can tune values for the following hyperparameters:

  1. Number of units (nodes) in the dense layer.

  2. Learning rate. This controls how quickly the model adapts to the problem. At each iteration, it sets the size of the step taken toward a minimum of the loss function. Typical values lie between 0.0 and 1.0.

  3. Dropout rate. This is the fraction of a layer's units that are randomly dropped (set to zero) at each training step, which helps reduce overfitting.

  4. Optimizer. An optimizer updates the weights of a NN to reduce the loss, and many optimizers also adapt the effective learning rate as training proceeds. Adam, SGD, RMSprop, and Nadam are some of the most commonly used optimizers.

  5. L2 regularization. This adds a penalty to the loss equal to the sum of the squares of all the weights, scaled by lambda. It favors weights of small magnitude and gives a non-sparse solution. Lambda is the hyperparameter tuned to strike the balance between simplicity and training-data fit. This can improve your NN performance by reducing overfitting.

  6. Epochs. This is the number of complete passes that the learning algorithm makes through the entire training set. For example, MNIST has 60,000 training images, so one epoch means going through all 60,000 images once.

  7. Activation functions. These introduce non-linearity into the output of the neurons. Common examples include ReLU, sigmoid, and tanh.
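The learning-rate and L2 entries in the list above can be sketched in a few lines of plain Python. The function names `l2_penalty` and `regularized_step` and all the numbers are hypothetical, and the update shown is ordinary gradient descent, not what any particular Keras optimizer does internally.

```python
def l2_penalty(weights, lam):
    """L2 term added to the loss: lam * (sum of squared weights)."""
    return lam * sum(w * w for w in weights)


def regularized_step(weights, grads, learning_rate, lam):
    """One plain gradient-descent step on loss + L2 penalty.

    The penalty contributes 2 * lam * w to each weight's gradient,
    which pulls every weight toward zero.
    """
    return [
        w - learning_rate * (g + 2.0 * lam * w)
        for w, g in zip(weights, grads)
    ]


weights = [0.5, -1.0, 2.0]   # hypothetical current weights
grads = [0.0, 0.0, 0.0]      # zero data gradient isolates the L2 effect

print(l2_penalty(weights, lam=0.01))
for lam in (0.001, 0.01):
    print(lam, regularized_step(weights, grads, learning_rate=0.1, lam=lam))
```

With the data gradient set to zero, the only thing moving the weights is the penalty, so the larger lambda shrinks them more per step. The learning rate scales the whole update, which is why the two values are usually tuned together.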

TensorBoard HParams Dashboard

Often in TensorFlow, while training a model, you only have the screen output displaying performance metrics, which makes it hard to track how the model is performing. To make it easier to understand, optimize, and debug TF programs, TensorFlow provides TensorBoard.

TensorBoard helps you visualize TF graphs, plot quantitative metrics, and more. This guide focuses on hyperparameter values using the HParams dashboard. The following steps show how to use the HParams dashboard to find the best-performing set of hyperparameters:

  1. Experiment setup and HParams summary
  2. Adapt TensorFlow runs to log hyperparameters and metrics
  3. Start runs and log them all under one parent directory
  4. Visualize the results in TensorBoard's HParams dashboard

Code Implementation

Pre-requisites

Start by installing TF 2.0 and loading the TensorBoard notebook extension:

%load_ext tensorboard

Clear any logs from previous runs:

!rm -rf ./logs/

Import TensorFlow and the TensorBoard HParams plugin:

import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

Download the MNIST dataset and scale it:

mnist = tf.keras.datasets.mnist

# Load the train/test splits, then scale pixel values from [0, 255] to [0, 1].
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

1. Experiment Setup and HParams Experiment Summary

Experiment with four hyperparameters in the model:

  1. Number of units in the first dense layer
  2. Dropout rate in dropout layer
  3. Optimizer
  4. L2 Regularizer

HP_NUM_UNITS = hp.HParam('num_units', hp.Discrete([256, 512]))
HP_DROPOUT = hp.HParam('dropout', hp.RealInterval(0.5, 0.6))
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd', 'rmsprop']))
HP_L2 = hp.HParam('l2 regularizer', hp.RealInterval(.001, .01))

METRIC_ACCURACY = 'accuracy'

# Record the search space and the metric so the dashboard knows what to display.
with tf.summary.create_file_writer('logs/hparam_tuning').as_default():
  hp.hparams_config(
    hparams=[HP_NUM_UNITS, HP_DROPOUT, HP_OPTIMIZER, HP_L2],
    metrics=[hp.Metric(METRIC_ACCURACY, display_name='Accuracy')],
  )

2. Adapt TensorFlow Runs to Log Hyperparameters and Metrics

The model contains two dense layers with a dropout layer between them. The training code looks like any other Keras training code, except that the hyperparameters are not hardcoded. Instead, they are all provided in the hparams dictionary.

def train_test_model(hparams):
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(
        hparams[HP_NUM_UNITS],
        # Use the tuned L2 value rather than a hardcoded constant.
        kernel_regularizer=tf.keras.regularizers.l2(hparams[HP_L2]),
        activation=tf.nn.relu),
    tf.keras.layers.Dropout(hparams[HP_DROPOUT]),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax),
  ])
  model.compile(
      optimizer=hparams[HP_OPTIMIZER],
      loss='sparse_categorical_crossentropy',
      metrics=['accuracy'],
  )

  model.fit(x_train, y_train, epochs=2)
  # evaluate() returns the loss followed by the compiled metrics.
  _, accuracy = model.evaluate(x_test, y_test)
  return accuracy
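To see what the dropout layer between the two dense layers does while the model trains, here is a rough sketch in plain Python. It assumes the usual "inverted dropout" behavior (units are zeroed with probability `rate` and the survivors are scaled by `1 / (1 - rate)`); the `dropout` function and the all-ones activations are hypothetical stand-ins, not the Keras implementation.

```python
import random


def dropout(values, rate, rng):
    """Apply inverted dropout to a list of activations.

    Each value is zeroed with probability `rate`; survivors are scaled by
    1 / (1 - rate) so the expected sum stays the same.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)          # fixed seed so the sketch is repeatable
activations = [1.0] * 10_000    # hypothetical layer outputs

for rate in (0.5, 0.6):         # the two dropout rates tried in this guide
    dropped = dropout(activations, rate, rng)
    zero_fraction = dropped.count(0.0) / len(dropped)
    print(f"rate={rate}: fraction zeroed ~ {zero_fraction:.3f}")
```

A higher rate removes more units on each step, which regularizes more strongly but leaves less signal to learn from. That trade-off is why the rate is worth tuning.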

For each run, log an HParams summary with the hyperparameters and final accuracy:

def run(run_dir, hparams):
  with tf.summary.create_file_writer(run_dir).as_default():
    hp.hparams(hparams)  # record the values used in this trial
    accuracy = train_test_model(hparams)
    tf.summary.scalar(METRIC_ACCURACY, accuracy, step=2)

3. Start Runs and Log them All Under One Parent Directory

You can now try multiple experiments, training each one with a different set of hyperparameters.

session_num = 0

for num_units in HP_NUM_UNITS.domain.values:
  for dropout_rate in (HP_DROPOUT.domain.min_value, HP_DROPOUT.domain.max_value):
    for l2 in (HP_L2.domain.min_value, HP_L2.domain.max_value):
      for optimizer in HP_OPTIMIZER.domain.values:
        hparams = {
            HP_NUM_UNITS: num_units,
            HP_DROPOUT: dropout_rate,
            HP_L2: l2,
            HP_OPTIMIZER: optimizer,
        }
        run_name = "run-%d" % session_num
        print('--- Starting trial: %s' % run_name)
        print({h.name: hparams[h] for h in hparams})
        run('logs/hparam_tuning/' + run_name, hparams)
        session_num += 1
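The four nested loops amount to a grid search over every combination of values. As a sketch of the same idea without TensorFlow, `itertools.product` can enumerate the grid in one loop. The plain lists below copy the values from this guide's search space, and the dictionary keys are illustrative names rather than part of the HParams API.

```python
import itertools

# Values copied from the search space defined earlier in this guide.
num_units_values = [256, 512]
dropout_values = [0.5, 0.6]          # min and max of the dropout interval
l2_values = [0.001, 0.01]            # min and max of the L2 interval
optimizer_values = ["adam", "sgd", "rmsprop"]

trials = []
grid = itertools.product(
    num_units_values, dropout_values, l2_values, optimizer_values
)
for session_num, (num_units, dropout, l2, optimizer) in enumerate(grid):
    trials.append(
        {
            "run_name": f"run-{session_num}",
            "num_units": num_units,
            "dropout": dropout,
            "l2": l2,
            "optimizer": optimizer,
        }
    )

print(len(trials))   # number of trials in the full grid
print(trials[0])
print(trials[-1])
```

The grid has 2 x 2 x 2 x 3 = 24 combinations, so the number of training runs grows multiplicatively with every value you add. This is worth keeping in mind before widening the search space.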

4. Visualize the Results in TensorBoard's HParams Dashboard

Open the HParams Dashboard. Once TensorBoard starts, click HParams at the top.

%tensorboard --logdir logs/hparam_tuning

Table View

Table View lists each session along with its hyperparameter values and performance metrics. The square checkboxes let you limit which metrics are shown.

Parallel Coordinate View

This view displays each run as a color-coded line that passes through an axis for each hyperparameter and ends at the metric axes, which show the accuracy. It helps you see which hyperparameters matter most. If you place the mouse pointer on any axis, the runs that pass through that point are highlighted. You can reorder the axes by dragging them.

Scatter Plot View

This view plots each hyperparameter and metric against each metric, which helps you identify correlations between them. Click or hover over a session group to highlight the session across plots.

Conclusion

Sorting by accuracy in descending order shows that the best-performing model has 512 units, a dropout rate of 0.5, the Adam optimizer, and an L2 regularization rate of 0.01, with an accuracy of 95.710%. The model can be optimized further. You can also include more performance metrics for better visualization and understanding.
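If you collect the trial results yourself, the same ranking can be reproduced outside the dashboard. The sketch below is plain Python, and its accuracy values are placeholders invented for illustration, not measurements from this guide.

```python
# Placeholder results for illustration only; these are NOT measurements
# from this guide.
results = [
    {"run": "run-0", "num_units": 256, "optimizer": "adam", "accuracy": 0.91},
    {"run": "run-1", "num_units": 256, "optimizer": "sgd", "accuracy": 0.88},
    {"run": "run-2", "num_units": 512, "optimizer": "adam", "accuracy": 0.93},
]

# Rank trials from highest to lowest accuracy and pick the top one.
ranked = sorted(results, key=lambda r: r["accuracy"], reverse=True)
best = ranked[0]

for r in ranked:
    print(r["run"], r["accuracy"])
print("best:", best)
```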

This guide gave a brief introduction to TensorBoard. TensorBoard's HParams dashboard provides amazing visualization to help you understand which hyperparameter can be further fine-tuned to make your NN model more accurate and reliable.

You can also explore other TensorBoard features, such as the graphs and projector dashboards.

I hope you enjoyed learning. If you have any queries, feel free to contact me at CodeAlphabet.