Grad-CAM from scratch

The goal of this exercise is to make a saliency map using Grad-CAM.

Your final image may resemble the one below:

For this exercise, we will use the pre-trained MobileNetV2 model. You will apply Grad-CAM to the input cat image using what we learned in lecture:


  • Load the pre-trained model and pre-process the given image to make a prediction.
  • Find the predicted class of the image. It should be an Egyptian cat.
  • Using the tf.keras Functional API, build a model that gives the model predictions and the feature maps after the last convolution in the pre-trained network.
  • Using tf.GradientTape(), find the gradients of the predicted class score with respect to the feature-map activations.
  • As per the Grad-CAM implementation, pool the gradients and find the heatmap.
  • Upsample the heatmap using the helper function and superimpose it on the original image to get the output like the one shown above.
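The pooling and heatmap steps above can be sketched in plain NumPy before touching the real network. This is a minimal illustration on made-up arrays, not the notebook's solution: the function name `grad_cam_heatmap` and the toy inputs are assumptions, and it assumes the map has at least one positive value.

```python
import numpy as np

def grad_cam_heatmap(feature_maps, grads):
    """Compute a normalized Grad-CAM heatmap from two (H, W, K) arrays."""
    # One weight per channel: average the gradient over the spatial axes
    weights = grads.mean(axis=(0, 1))              # shape (K,)
    # Weighted sum of the feature maps along the channel axis
    cam = (feature_maps * weights).sum(axis=-1)    # shape (H, W)
    # ReLU: keep only the regions that push the class score up
    cam = np.maximum(cam, 0.0)
    # Scale so the strongest location equals 1 (assumes a positive maximum)
    return cam / cam.max()

# Toy inputs (hypothetical): 2 channels on a 7x7 grid
feature_maps = np.arange(7 * 7 * 2, dtype=np.float64).reshape(7, 7, 2) / 97.0
grads = np.stack([np.ones((7, 7)), -0.5 * np.ones((7, 7))], axis=-1)

heatmap = grad_cam_heatmap(feature_maps, grads)
print(heatmap.shape, heatmap.min(), heatmap.max())
```

In the exercise itself the two inputs come from the network: the feature maps are the activations of the chosen layer and the gradients come from `tf.GradientTape()`.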


model.layers: Accesses the layers of the model

tf.keras.activations.linear: Linear activation function

model.predict(): Used to predict the values given the model

tf.keras.applications.mobilenet_v2.MobileNetV2: Instantiates the MobileNet v2 architecture.

In [1]:
# Import required libraries

import tensorflow as tf
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model
import numpy as np
import matplotlib.pyplot as plt
import cv2
import pickle
In [2]:
# Load the MobileNetV2 pre-trained model
# Rather than training a model from scratch, we can use a model
# that has already been trained on the ImageNet dataset
# MobileNetV2 is a lightweight CNN designed for efficient image classification
model = MobileNetV2(weights='imagenet')
In [5]:
### edTest(test_chow1) ###

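# Find the last convolutional layer
# Inspect model.summary(): the network ends with
# Conv_1 -> Conv_1_bn -> out_relu -> global average pooling -> predictions,
# so the last convolution (Conv_1) is five layers from the end
# (model.layers[-3] would be the final ReLU, out_relu)
conv_layer_name = model.layers[-5].name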
In [6]:
# Take a sample image to find the saliency map
img_path = './cat.png'

# Load the image with the target_size for mobilenet
img = image.load_img(img_path, target_size=(224, 224))

# Convert the image to a numpy array 
x = image.img_to_array(img)

# Add an extra dimension for batch size 
# to change it to (1,224,224,3)
x = np.expand_dims(x, axis=0)

# Use the MobileNetV2 preprocess_input function on the image
x = preprocess_input(x)
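For MobileNetV2, `preprocess_input` rescales pixel values from [0, 255] to [-1, 1]. The sketch below re-implements that scaling in NumPy purely for illustration; `mobilenet_v2_scale` is a hypothetical name, and the notebook should keep using the Keras function.

```python
import numpy as np

def mobilenet_v2_scale(pixels):
    """Map pixel values from [0, 255] to [-1, 1] (illustrative re-implementation)."""
    return pixels.astype(np.float32) / 127.5 - 1.0

# Black, mid-gray and white pixel values
sample = np.array([0.0, 127.5, 255.0])
scaled = mobilenet_v2_scale(sample)
print(scaled)
```

This is why the preprocessed array `x` should not be plotted directly: its values are no longer in the displayable 0 to 255 range.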
In [7]:
# Use the pretrained model to make a prediction 
preds = model.predict(x)

# Useful dictionary to go from label index to actual label
with open('idx2name.pkl', 'rb') as handle:
    keras_idx_to_name = pickle.load(handle)
In [8]:
# See what the predicted class is:
prediction_class = keras_idx_to_name[np.argmax(preds,axis=1).item(0)]
print(f'Prediction class is {prediction_class}')
Prediction class is Egyptian cat

In [9]:
# We use the tf.keras Functional API to get 
# 1. The model prediction probabilities
# 2. The feature maps after the last convolution in the model

# Get the last convolution layer in the network
last_conv_layer = model.get_layer(conv_layer_name)

# Build a model that returns both the predictions and the
# feature maps of last_conv_layer, using the tf.keras Functional API
get_maps = Model(inputs=model.inputs, outputs=[model.output, last_conv_layer.output])
In [10]:
# Now we perform the Grad-CAM,
# We take the gradient of the output with respect to the feature maps
# after the convolution 
with tf.GradientTape() as tape:

    # Getting the required outputs 
    model_out, last_conv_layer = get_maps(x)
    # We choose the output with maximum probability
    # But you can make a different choice here
    # For example, you could select the second-highest probability value 
    class_out = tf.reduce_max(model_out)
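To explain a class other than the top prediction, you first need its index. The sketch below shows one way to find the k-th most probable class with NumPy; `kth_best_class` and the probabilities are made up for illustration.

```python
import numpy as np

def kth_best_class(probs, k=1):
    """Return the index of the k-th largest entry (k=1 is the top class)."""
    order = np.argsort(probs)[::-1]   # indices sorted by descending probability
    return int(order[k - 1])

probs = np.array([0.05, 0.60, 0.25, 0.10])  # made-up class probabilities
top = kth_best_class(probs, k=1)
runner_up = kth_best_class(probs, k=2)
print(top, runner_up)
```

Inside the tape, indexing the model output with such an index (for instance `model_out[0, runner_up]`) would then select that class score in place of `tf.reduce_max(model_out)`.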

Take the gradients

In [11]:
### edTest(test_chow2) ###

# We take the gradients 
# tape.gradient() takes the gradient of one tensor with respect to 
# another. Here we want the derivative of the output class score
# with respect to the last conv layer's feature maps
grads = tape.gradient(class_out, last_conv_layer)
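To build intuition for what these gradients contain, consider a toy head made of global average pooling followed by a linear layer. This is a simplification of the real network (no softmax and no ReLU), and all names below are hypothetical. For this toy head the gradient of the score with respect to every activation in channel k equals w_k / (H * W), which a finite-difference check confirms.

```python
import numpy as np

def toy_score(feature_maps, w):
    """Class score of a toy head: global average pooling, then a linear layer."""
    pooled = feature_maps.mean(axis=(0, 1))   # shape (K,)
    return float(pooled @ w)

def numerical_grad(feature_maps, w, eps=1e-6):
    """Central finite-difference gradient of the score w.r.t. every activation."""
    grad = np.zeros_like(feature_maps)
    for idx in np.ndindex(feature_maps.shape):
        bumped_up = feature_maps.copy()
        bumped_down = feature_maps.copy()
        bumped_up[idx] += eps
        bumped_down[idx] -= eps
        grad[idx] = (toy_score(bumped_up, w) - toy_score(bumped_down, w)) / (2 * eps)
    return grad

H, W, K = 3, 3, 2
feature_maps = np.arange(H * W * K, dtype=np.float64).reshape(H, W, K)
w = np.array([2.0, -1.0])   # made-up weights of the linear layer

grads = numerical_grad(feature_maps, w)
analytic = np.broadcast_to(w / (H * W), (H, W, K))
print(np.allclose(grads, analytic, atol=1e-6))
```

`tape.gradient()` computes the same kind of quantity for the real network by automatic differentiation, without the slow loop over activations.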
In [12]:
# Average the gradients over the batch and spatial axes
# to get one weight per feature map
pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))
# As per the Grad-CAM literature, we weight each feature map by its
# pooled gradient and combine the results across the channel axis
# to make the heatmap (the mean used here differs from a sum only by
# a constant factor, which the later normalization removes)
heatmap = tf.reduce_mean(tf.multiply(pooled_grads, last_conv_layer), axis=-1)
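Grad-CAM is usually written as a weighted sum over channels, while the cell above takes a mean. The two differ only by the constant factor 1/K, so after ReLU and normalization they give the same heatmap. A small NumPy check on made-up values (all names are hypothetical, and the map is assumed to have a positive maximum):

```python
import numpy as np

def normalize(cam):
    """Apply ReLU, then divide by the maximum (assumes a positive maximum)."""
    cam = np.maximum(cam, 0.0)
    return cam / cam.max()

K = 3
feature_maps = np.arange(2 * 2 * K, dtype=np.float64).reshape(2, 2, K)
weights = np.array([0.5, -0.25, 1.0])   # made-up pooled gradients

weighted = feature_maps * weights        # broadcast over the channel axis
cam_sum = normalize(weighted.sum(axis=-1))
cam_mean = normalize(weighted.mean(axis=-1))
print(np.allclose(cam_sum, cam_mean))
```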
In [13]:
# Below we convert the heatmap to a NumPy array,
# set negative values to zero (ReLU), scale to [0, 1],
# and reshape from (1,7,7) to (7,7) for ease of plotting
heatmap = heatmap.numpy()
heatmap[heatmap < 0] = 0 #relu
heatmap = (heatmap - heatmap.min())/(heatmap.max() - heatmap.min())
heatmap = heatmap.reshape((7, 7))

# We plot the (7,7) heatmap
plt.matshow(heatmap)
plt.show()
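One edge case worth knowing about: if every value is zero after the ReLU, the min-max scaling divides by zero and produces NaNs. The sketch below shows one possible guard; `safe_normalize` is a hypothetical helper, not part of the exercise.

```python
import numpy as np

def safe_normalize(cam, eps=1e-8):
    """ReLU, then min-max scale to [0, 1], returning zeros for a flat map."""
    cam = np.maximum(cam, 0.0)
    span = cam.max() - cam.min()
    if span < eps:
        # Nothing to scale: avoid dividing by zero
        return np.zeros_like(cam)
    return (cam - cam.min()) / span

flat = safe_normalize(np.full((7, 7), -3.0))                    # all negative
ramp = safe_normalize(np.linspace(-1.0, 2.0, 49).reshape(7, 7))
print(flat.max(), ramp.min(), ramp.max())
```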
In [14]:
# In order to map onto the original image,
# the heatmap has to be resized
resized_heatmap = np.uint8(cv2.resize(heatmap,(224,224))*255)
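The resize stretches each of the 7x7 cells over a 32x32 pixel block of the 224x224 image. By default `cv2.resize` uses bilinear interpolation, which blends neighboring cells smoothly. The sketch below uses the simpler nearest-neighbor scheme instead, only to make the cell-to-pixel mapping visible; `upsample_nearest` is a hypothetical helper.

```python
import numpy as np

def upsample_nearest(grid, factor):
    """Repeat every cell `factor` times along both axes."""
    return np.repeat(np.repeat(grid, factor, axis=0), factor, axis=1)

coarse = np.arange(49, dtype=np.float32).reshape(7, 7) / 48.0   # values in [0, 1]
fine = upsample_nearest(coarse, 32)                              # 7 * 32 = 224
fine_u8 = np.uint8(fine * 255)                                   # 8-bit grayscale
print(fine.shape, fine_u8.dtype)
```

The blocky result makes clear that the heatmap only localizes evidence at the resolution of the feature maps; the smooth look of the final overlay comes from interpolation.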
In [15]:
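# We add a processing step to convert the grayscale heatmap
# to a 3-channel JET colormap for ease of viewing
# cv2 returns BGR while matplotlib expects RGB; with the inversion
# below, high heatmap values are displayed in red
# Subtract from 255 (not 256) so every value stays within the uint8 range
val = np.uint8(255 - resized_heatmap)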
heatmap_final = cv2.applyColorMap(val, cv2.COLORMAP_JET)
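When inverting an 8-bit image, the complement is taken with respect to 255, the largest value a uint8 can hold. Subtracting from 256 instead sends a zero pixel to 256, which does not fit in 8 bits. A short NumPy illustration on made-up pixel values:

```python
import numpy as np

gray = np.array([0, 128, 255], dtype=np.uint8)

# Complement with respect to 255: every result fits in uint8
inverted = 255 - gray

# Complement with respect to 256, computed in a wider type and cast back:
# the value 256 does not fit in 8 bits and wraps around to 0
wrapped = (256 - gray.astype(np.int32)).astype(np.uint8)

print(inverted.tolist(), wrapped.tolist())
```

With the wrapped version, the coldest pixel (0) ends up with the same value as if it were the hottest, so it would be colored incorrectly.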
In [16]:
# We also prepare the image for plotting
# by converting it to a NumPy array
# with dtype uint8
img = image.img_to_array(img)
img = np.uint8(img)
In [17]:
# Finally, we use the cv2.addWeighted function
# to superimpose the heatmap on the original image
# The helper code below does this
fig, ax = plt.subplots(1,1, figsize=(6,6))
ax.imshow(cv2.addWeighted(heatmap_final, 0.5, img, 0.5, 0))
fig.suptitle(f'Predicted class: {prediction_class}',y=0.92,fontsize=14);
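`cv2.addWeighted(src1, alpha, src2, beta, gamma)` computes a per-pixel weighted sum, `src1*alpha + src2*beta + gamma`, saturated to the valid range; both inputs must have the same size and type. The NumPy sketch below approximates that behavior for uint8 images; `blend` is a hypothetical stand-in, not the OpenCV implementation.

```python
import numpy as np

def blend(a, b, alpha=0.5, beta=0.5, gamma=0.0):
    """Weighted sum of two uint8 images, rounded and clipped to [0, 255]."""
    mixed = a.astype(np.float64) * alpha + b.astype(np.float64) * beta + gamma
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)

heat = np.full((2, 2, 3), 200, dtype=np.uint8)    # stand-in for the colored heatmap
photo = np.full((2, 2, 3), 100, dtype=np.uint8)   # stand-in for the original image

overlay = blend(heat, photo)                # equal mix of both images
saturated = blend(heat, heat, 1.0, 1.0)     # 400 is clipped to 255
print(overlay[0, 0].tolist(), saturated[0, 0].tolist())
```

Raising `alpha` relative to `beta` makes the heatmap more prominent; lowering it lets more of the original image show through.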

⏸ Will Grad-CAM work if we take the output from the last ReLU instead? (True or False)

In [18]:
### edTest(test_chow3) ###

# Type your answer within the quotes given 
answer3 = 'True'

⏸ The heatmap output is displaying:

A: The weights of the layer

B: A 7x7 mask from the input image

C: The pixels that activate the most in red and the least in blue

D: A feature map of the input image

In [20]:
### edTest(test_chow4) ###

# Type your answer within the quotes given 
answer4 = 'C'