Hands-On Tutorial for Applying Grad-CAMs for Explaining Image Classifiers Using Keras and TensorFlow | by Aditya Bhattacharya | Oct, 2022


Learn how to apply Grad-CAM using Keras and TensorFlow for explaining deep learning-based image classifiers

Output of Grad-CAM method for explaining image classifiers (image by author, base image source: Unsplash)

Classical machine learning (ML) algorithms are generally less effective than deep learning (DL) algorithms on unstructured data such as images and text. Because DL models learn features automatically, rather than relying on the manual feature engineering that classical ML requires, they tend to achieve higher accuracy and are usually preferred. However, these models are also more complex and less interpretable than classical ML models, so explainability is a persistent concern for DL models trained on unstructured data like images. Layer-wise Relevance Propagation (LRP) is one explainability approach that highlights the regions of an image most relevant to the model's prediction.

If you are not very familiar with Explainable AI (XAI) concepts, I would strongly recommend watching one of my past sessions on XAI, delivered at the AI Accelerator Festival APAC, 2021.

You can also go through my book Applied Machine Learning Explainability Techniques and take a look at its code repository to get hands-on exposure to other XAI methods. In this article, I will walk through a hands-on application of Grad-CAM, one of the popular LRP techniques, for explaining image classifiers. I will also present a step-by-step code tutorial for applying Grad-CAMs using Keras and TensorFlow.

Now, let’s get started!

LRP is one of the most prominent approaches for explaining DL models. Intuitively, the method uses the network weights and the neural activations from the forward pass to propagate the output back to the input layer, layer by layer. This lets us visualize the data elements (pixels for images, words for text) that contributed most to the final model output. The contribution of each data element is a qualitative measure of relevance that is propagated through the network layers. Moreover, in deep neural networks with many layers, learning depends on information flowing consistently between the layers through the gradients. So, to explain a deep learning model, LRP lets us visualize the activated or most influential data elements across the different layers of the network and qualitatively inspect how the algorithm works.
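To make the idea of propagating relevance backwards more concrete, here is a minimal NumPy sketch. It assumes a tiny two-layer network with bias-free dense layers and uses the basic proportional redistribution rule (often called LRP-0). The function name lrp_linear and all weights are made up for illustration and are not taken from the tutorial notebook.

```python
import numpy as np

def lrp_linear(activations, weights, relevance_out, eps=1e-9):
    """Redistribute relevance through one bias-free dense layer (basic LRP-0 rule)."""
    # Contribution of input j to output k: z[j, k] = a_j * w_jk
    z = activations[:, None] * weights
    # Total contribution received by each output neuron, nudged away from zero
    z_sum = z.sum(axis=0)
    z_sum = np.where(z_sum >= 0, z_sum + eps, z_sum - eps)
    # Each input gets a share of the relevance proportional to its contribution
    return (z / z_sum) @ relevance_out

# Hypothetical toy network: 3 inputs -> 4 hidden units (ReLU) -> 2 class scores
x = np.array([1.0, 2.0, 0.5])
w1 = np.array([[0.5, -1.0, 0.2, 0.0],
               [0.3, 0.4, -0.6, 0.1],
               [-0.2, 0.8, 1.0, 0.5]])
w2 = np.array([[1.0, -0.5],
               [0.5, 0.3],
               [-1.0, 0.7],
               [0.2, 0.4]])

# Forward pass
hidden = np.maximum(x @ w1, 0)
logits = hidden @ w2

# Start with all relevance on the winning class score, then propagate it back
target = int(np.argmax(logits))
relevance_out = np.zeros_like(logits)
relevance_out[target] = logits[target]
relevance_hidden = lrp_linear(hidden, w2, relevance_out)
relevance_input = lrp_linear(x, w1, relevance_hidden)

print("Relevance per input feature:", relevance_input)
print("Sum of input relevance:", relevance_input.sum(), "| class score:", logits[target])
```

In this bias-free setting the relevance is conserved from layer to layer, so the input relevances add up to the class score being explained (up to the small stabilising term).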

Class Activation Maps (CAMs) are visualization methods used for explaining deep learning models. In this method, the class scores predicted by the model are traced back to the last convolution layer to highlight discriminative regions of interest in the image that are class-specific rather than generic. Gradient-weighted Class Activation Mapping, popularly known as Grad-CAM, generalizes CAM by using the gradients of the class score with respect to the convolutional feature maps to highlight class-discriminative regions of interest, without attributing importance to individual pixels. The paper's Guided Grad-CAM variant additionally combines this map with guided backpropagation to recover finer detail. Grad-CAM can be applied to any CNN architecture, unlike CAM, which can only be applied to architectures that perform global average pooling over the output feature maps of the convolution layer, just prior to the prediction layer. For a more detailed understanding of the Grad-CAM process, have a look at the research paper Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, Selvaraju et al.: https://arxiv.org/abs/1610.02391.
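For readers who prefer a formula, the Grad-CAM computation can be summarized as follows, using the notation of the Grad-CAM paper: y^c is the score for class c before the softmax, A^k is the k-th feature map of the chosen convolutional layer, and Z is the number of spatial positions in that feature map.

```latex
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left( \sum_k \alpha_k^c \, A^k \right)
```

The weights \alpha_k^c are the spatially averaged gradients (one per channel), and the ReLU keeps only the regions that have a positive influence on the class of interest.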

Architecture diagram of Guided Grad-CAM (Source: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, Selvaraju et al., https://arxiv.org/abs/1610.02391)

Now comes the fun part of this article: learning how to apply Grad-CAMs! We will use Keras and TensorFlow to apply Grad-CAMs for explaining pre-trained image classifiers. You will need the following Python libraries, which can be installed using pip:

!pip install --upgrade numpy matplotlib tensorflow

Let’s start by loading the required modules in Python. I would recommend using a local Jupyter notebook or Google Colab to run this code tutorial.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as c_map
from IPython.display import Image, display
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications.xception import Xception, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image
import os

We will use TensorFlow and Keras to load a network pre-trained on the ImageNet dataset and test the approach on a sample open image obtained from: https://images.unsplash.com/photo-1615963244664-5b845b2025ee?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=464&q=80. For more examples using Keras and TensorFlow, please visit: https://keras.io/examples/.

model_builder = Xception
IMG_SIZE = (299, 299)  # Input size expected by Xception
last_conv_layer = "block14_sepconv2_act"  # Last convolutional activation layer of Xception
# Download the target image (cached locally) and keep its local path
image_path = keras.utils.get_file(
    "tiger.jpg",
    "https://images.unsplash.com/photo-1615963244664-5b845b2025ee?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=464&q=80",
)

display(Image(image_path))

Source inference image (Source: Unsplash)

Once the image is loaded, it needs to be pre-processed. Since we will be using the pre-trained Xception model from Keras and TensorFlow, we need to apply the same pre-processing that was used when the model was trained. First, we convert the image into a batched NumPy array:

def vectorize_image(img_path, size):
    '''
    Vectorize the given image to get a numpy array
    '''
    img = image.load_img(img_path, target_size=size)
    array = image.img_to_array(img)
    array = np.expand_dims(array, axis=0)  # Adding dimension to convert array into a batch of size (1,299,299,3)
    return array
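As a rough illustration of what the Xception pre-processing does to the pixel values, the sketch below rescales them from the [0, 255] range to [-1, 1]. The helper scale_like_xception is a hypothetical stand-in written for this example only; in the tutorial itself you should keep using the preprocess_input function imported from tensorflow.keras.applications.xception.

```python
import numpy as np

def scale_like_xception(pixels):
    """Map pixel values from [0, 255] to [-1, 1] (illustrative stand-in for preprocess_input)."""
    return np.asarray(pixels, dtype=np.float32) / 127.5 - 1.0

# Made-up pixel values covering the darkest, middle, and brightest intensities
sample = np.array([0.0, 127.5, 255.0])
scaled = scale_like_xception(sample)
print(scaled)
```

Feeding the network values in the range it was trained on matters, because the same image in the raw [0, 255] range can produce very different activations and predictions.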

Now, let’s apply the pre-trained model on our pre-processed image and see the prediction.

vectorized_image = preprocess_input(vectorize_image(image_path, size=IMG_SIZE))
model = model_builder(weights="imagenet")
model.layers[-1].activation = None  # Disable the softmax of the last layer so the model outputs raw class scores (logits)

model_prediction = model.predict(vectorized_image)
print(f"The predicted class is : {decode_predictions(model_prediction, top=1)[0][0][1]}")

This is the output that we get:

The predicted class is : tiger
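Note that, with the softmax activation of the last layer disabled, the values returned by model.predict are raw class scores rather than probabilities. If you want probabilities, a softmax can be applied afterwards. The following is a minimal NumPy sketch with made-up scores; the softmax helper is written for this example and is not part of the tutorial notebook.

```python
import numpy as np

def softmax(logits):
    """Convert raw class scores to probabilities in a numerically stable way."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Made-up scores for a batch of one image and three classes
toy_logits = np.array([[2.0, 0.5, -1.0]])
probs = softmax(toy_logits)

print("Probabilities:", probs)
print("Top class index:", int(np.argmax(probs, axis=-1)[0]))
```

The softmax does not change the ranking of the classes, which is why the top-1 class can be read directly from the raw scores.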

So, our model correctly predicted our inference image as a tiger. Now, let’s understand the rationale behind the prediction using Grad-CAMs.

We will build a Grad-CAM heatmap visualizer to highlight the regions of the image that most influenced the model's prediction.

def get_heatmap(vectorized_image, model, last_conv_layer, pred_index=None):
    '''
    Function to visualize grad-cam heatmaps
    '''
    # Model mapping the input image to the last conv layer activations and the predictions
    gradient_model = tf.keras.models.Model(
        model.inputs, [model.get_layer(last_conv_layer).output, model.output]
    )

    # Gradient Computations
    with tf.GradientTape() as tape:
        last_conv_layer_output, preds = gradient_model(vectorized_image)
        if pred_index is None:
            pred_index = tf.argmax(preds[0])  # Default to the top predicted class
        class_channel = preds[:, pred_index]

    # Gradient of the class score with respect to the feature maps
    grads = tape.gradient(class_channel, last_conv_layer_output)
    # Average over the batch and spatial axes to get one weight per channel
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))
    last_conv_layer_output = last_conv_layer_output[0]
    # Weighted sum of the feature maps across channels
    heatmap = last_conv_layer_output @ pooled_grads[..., tf.newaxis]
    heatmap = tf.squeeze(heatmap)
    heatmap = tf.maximum(heatmap, 0) / tf.math.reduce_max(heatmap)  # Keep positive values and normalize the heatmap
    return heatmap.numpy()

plt.matshow(get_heatmap(vectorized_image, model, last_conv_layer))
plt.show()

Since we compute the heatmap on the last convolutional activation layer of our pre-trained model (block14_sepconv2_act), this is the coarse output heatmap image that we get:

Grad-CAM Heatmap from inference tiger image (Source: By Author)
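To see the arithmetic inside get_heatmap without a real model, here is a toy NumPy sketch. It assumes the feature maps and their gradients are already available as arrays of shape (height, width, channels). The function name grad_cam_from_arrays, the tiny 2x2 arrays, and the small eps guard against division by zero are assumptions made for this illustration and are not part of the tutorial code.

```python
import numpy as np

def grad_cam_from_arrays(feature_maps, grads, eps=1e-8):
    """Combine feature maps (H, W, K) and gradients (H, W, K) into a heatmap (H, W)."""
    weights = grads.mean(axis=(0, 1))   # one importance weight per channel
    cam = feature_maps @ weights        # weighted sum over channels -> (H, W)
    cam = np.maximum(cam, 0)            # keep only positive evidence for the class
    return cam / (cam.max() + eps)      # scale to [0, 1]; eps guards an all-zero map

# Made-up example: channel 0 fires in the top-left corner, channel 1 in the bottom-right
feature_maps = np.zeros((2, 2, 2))
feature_maps[0, 0, 0] = 1.0
feature_maps[1, 1, 1] = 2.0

# Made-up gradients: channel 0 supports the class, channel 1 speaks against it
grads = np.zeros((2, 2, 2))
grads[..., 0] = 1.0
grads[..., 1] = -1.0

heatmap = grad_cam_from_arrays(feature_maps, grads)
print(heatmap)
```

Only the top-left cell ends up highlighted: the channel with a negative weight is suppressed by the ReLU step, which is what makes the map class-discriminative.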

But the raw heatmap doesn’t tell us much unless we superimpose it on our inference image. So, let’s see how to do this using the following code snippet:

def superimpose_gradcam(img_path, heatmap, output_path="grad_cam_image.jpg", alpha=0.4):
    '''
    Superimpose Grad-CAM Heatmap on image
    '''
    img = image.load_img(img_path)
    img = image.img_to_array(img)

    heatmap = np.uint8(255 * heatmap)  # Back scaling to 0-255 from 0 - 1
    jet = plt.get_cmap("jet")  # Colorizing heatmap (matplotlib.cm.get_cmap is no longer available in recent Matplotlib releases)
    jet_colors = jet(np.arange(256))[:, :3]  # Using RGB values
    jet_heatmap = jet_colors[heatmap]
    jet_heatmap = image.array_to_img(jet_heatmap)
    jet_heatmap = jet_heatmap.resize((img.shape[1], img.shape[0]))  # Resize the heatmap to the original image size
    jet_heatmap = image.img_to_array(jet_heatmap)

    superimposed_img = jet_heatmap * alpha + img  # Superimposing the heatmap on original image
    superimposed_img = image.array_to_img(superimposed_img)

    superimposed_img.save(output_path)  # Saving the superimposed image
    display(Image(output_path))  # Displaying Grad-CAM Superimposed Image

superimpose_gradcam(image_path, get_heatmap(vectorized_image, model, last_conv_layer))

And, voilà! We get the following heatmap superimposed on the inference image:

Output of Grad-CAM method for explaining image classifiers (image by author, base image source: Unsplash)
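The two key steps of the overlay, upsampling the coarse heatmap to the image size and alpha-blending it with the image, can also be sketched in plain NumPy. This is a simplified variation for illustration, not what superimpose_gradcam does internally: it assumes nearest-neighbour upsampling and explicit clipping to the 0-255 range, and the helper names upsample_nearest and blend, as well as the toy arrays, are hypothetical.

```python
import numpy as np

def upsample_nearest(heatmap, height, width):
    """Stretch a coarse (h, w) heatmap to (height, width) by repeating its cells."""
    rows = np.arange(height) * heatmap.shape[0] // height
    cols = np.arange(width) * heatmap.shape[1] // width
    return heatmap[rows][:, cols]

def blend(image_rgb, heat_rgb, alpha=0.4):
    """Add a colored heatmap to an RGB image with weight alpha and clip to 0-255."""
    mixed = image_rgb.astype(np.float32) + alpha * heat_rgb.astype(np.float32)
    return np.clip(mixed, 0, 255).astype(np.uint8)

# Made-up inputs: a 2x2 heatmap and a uniform gray 4x4 image
coarse = np.array([[0.0, 1.0],
                   [1.0, 0.0]])
gray_image = np.full((4, 4, 3), 200, dtype=np.uint8)

fine = upsample_nearest(coarse, 4, 4)
heat_rgb = np.zeros((4, 4, 3), dtype=np.float32)
heat_rgb[..., 0] = 255 * fine  # paint the heatmap into the red channel only

overlay = blend(gray_image, heat_rgb)
print(overlay[..., 0])  # red channel of the blended image
```

A larger alpha makes the heatmap more dominant and the underlying image harder to see, so the value is usually tuned by eye.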

Was it too difficult to apply Grad-CAMs? Absolutely not! Keras and TensorFlow make it easy to apply this explainability technique to image classifiers. Grad-CAM is a powerful technique for explaining how complex deep learning algorithms work on unstructured data like images. The method can be difficult for beginners to understand, but once you get the hang of it, it is very helpful for model explainability.

Hope you have enjoyed this article! The full tutorial notebook is available at: https://github.com/PacktPublishing/Applied-Machine-Learning-Explainability-Techniques/blob/main/Chapter02/Layerwise%20Propagation.ipynb.

Follow me on Medium and LinkedIn to learn more about Explainable AI and Machine Learning.

  1. Explainable Machine Learning for Models Trained on Text Data: Combining SHAP with Transformer Models
  2. EUCA — An effective XAI framework to bring artificial intelligence closer to end-users
  3. Understand the Workings of SHAP and Shapley Values Used in Explainable AI
  4. How to Explain Image Classifiers Using LIME
  1. Keras TensorFlow Tutorial Examples: https://keras.io/examples/
  2. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, Selvaraju et al., https://arxiv.org/abs/1610.02391.
  3. Applied Machine Learning Explainability Techniques
  4. GitHub repo from the book Applied Machine Learning Explainability Techniques — https://github.com/PacktPublishing/Applied-Machine-Learning-Explainability-Techniques/


