{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Title :\n", "Investigating CNNs\n", "\n", "## Description :\n", "The goal of the exercise is to investigate the building blocks of a CNN, such as kernels, filters, and feature maps using a CNN model trained on the CIFAR-10 dataset.\n", "\n", "\n", "\n", "\n", "## Instructions:\n", "\n", "- Import the CIFAR-10 dataset, and the pre-trained model from the helper file by calling the `get_cifar10()` function.\n", "- Evaluate the model on the test set in order to verify if the selected model has trained weights. You should get a test set accuracy of about **75%**.\n", "- Take a quick look at the model architecture using `model.summary()`.\n", "- Investigate the weights of the pre-trained model and plot the weights of the 1st filter of the 1st convolution layer.\n", "- Plot all the filters of the first convolution layer.\n", "- Use the helper code give to visualize the `feature maps` of the first convolution layer along with the input image.\n", "- Use the helper code give to visualize the `activations` of the first convolution layer along with the input image.\n", "\n", "## Hints:\n", "\n", "model.layersAccesses various layers of the model\n", "\n", "model.predict()Used to predict the values given the model\n", "\n", "model.layers.get_weights()Get the weights of a particular layer\n", "\n", "tensorflow.keras.Model()Functional API to group layers into an object with training and inference features." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visual Demonstration of CNNs" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Import necessary libraries\n", "import numpy as np\n", "import random\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "import tensorflow as tf\n", "from tensorflow.keras.models import Sequential, Model\n", "\n", "from matplotlib import cm\n", "import helper\n", "from helper import cnn_model, get_cifar10, plot_featuremaps" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# As we are using a pre-trained model, \n", "# we will only use 1000 images from the 'unseen' test data \n", "# The get_cifar10() function will load 1000 cifar10 images\n", "(x_test, y_test) = get_cifar10()\n", "\n", "# We also provide a handy dictionary to map response values to image labels\n", "cifar10dict = helper.cifar10dict\n", "cifar10dict" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Let's look at some sample images with their labels\n", "# Run the helper code below to plot the image and its label\n", "num_images = 5\n", "fig, ax = plt.subplots(1,num_images,figsize=(12,12))\n", "for i in range(num_images):\n", " image_index = random.randint(0,1000)\n", " img = (x_test[image_index] + 0.5)\n", " ax[i].imshow(img)\n", " label = cifar10dict[np.argmax(y_test[image_index])]\n", " ax[i].set_title(f'Actual: {label}')\n", " ax[i].axis('off')" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# For this exercise we use a pre-trained network by calling\n", "# the cnn_model() function\n", "model = cnn_model()\n", "model.summary()" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Evaluate the pretrained model on the test set\n", "model_score = model.evaluate(x_test,y_test)\n", "print(f'The test set accuracy for the pre-trained model is 
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Visualizing the predictions on 5 randomly selected images\n", "num_images = 5\n", "fig, ax = plt.subplots(1, num_images, figsize=(12, 12))\n", "for i in range(num_images):\n", "    image_index = random.randint(0, 999)\n", "    # model.predict expects a batch, so keep the slice 4-dimensional\n", "    prediction = cifar10dict[int(np.argmax(model.predict(x_test[image_index:image_index+1]), axis=1)[0])]\n", "    img = (x_test[image_index] + 0.5)\n", "    ax[i].imshow(img)\n", "    actual = cifar10dict[np.argmax(y_test[image_index])]\n", "    ax[i].set_title(f'Predicted: {prediction} \\n Actual: {actual}')\n", "    ax[i].axis('off')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize kernels corresponding to the filters for the 1st layer\n" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# The 'weights' variable is of the form\n", "# [height, width, channel, number of filters]\n", "\n", "# Use .get_weights() with the appropriate layer number\n", "# to get the weights and bias of the first layer, i.e. layer number 0\n", "weights, bias = ___\n", "\n", "assert weights.shape == (3,3,3,32), \"Computed weights are incorrect\"" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# How many filters are in the first convolution layer?\n", "n_filters = ___\n", "print(f'Number of filters: {n_filters}')\n", "\n", "# Print the filter size\n", "filter_channel = ___\n", "filter_height = ___\n", "filter_width = ___\n", "print(f'Number of channels: {filter_channel}')\n", "print(f'Filter height: {filter_height}')\n", "print(f'Filter width: {filter_width}')\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### ⏸ Based on the dimensions of the input image given to the defined model, how many kernels constitute the first filter?\n", "\n", "#### A. $3$\n", "#### B. $32$\n", "#### C. $1$\n", "#### D. $24$" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "### edTest(test_chow1) ###\n", "# Submit an answer choice as a string below (e.g. if you choose option C, put 'C')\n", "answer1 = '___'" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# The 'weights' variable (defined above) is of the form\n", "# [height, width, channel, number of filters]\n", "# From this, select all three channels, the entire height and width,\n", "# and the first filter\n", "kernels_filter1 = ___\n", "\n", "# Test case to check if you have indexed correctly\n", "assert kernels_filter1.shape == (3,3,3)" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Use the helper code below to plot each kernel of the chosen filter\n", "fig, axes = plt.subplots(1, 3, figsize=(12, 4))\n", "colors = ['Reds', 'Greens', 'Blues']\n", "for num, axis in enumerate(axes):\n", "    # Index the channel dimension to get one 3x3 kernel per channel\n", "    axis.imshow(kernels_filter1[:, :, num], cmap=colors[num])\n", "    axis.set_title(f'Kernel for {colors[num]} channel')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Visualizing one *filter* for the first convolutional layer\n", "\n", "Each of the above *kernels*, stacked together, forms a filter, which interacts with the input; a short sketch of that interaction follows below." ] },
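{ "cell_type": "markdown", "metadata": {}, "source": [ "Below is a minimal NumPy sketch (not part of the graded exercise) of what this interaction looks like during convolution: the $3 \\times 3 \\times 3$ filter is multiplied element-wise with a $3 \\times 3 \\times 3$ image patch and the products are summed, yielding a single feature-map value. The `patch` array here is random and purely illustrative." ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Illustrative sketch: how one feature-map value is computed\n", "# 'patch' is a random stand-in for a 3x3 RGB region of an input image\n", "patch = np.random.rand(3, 3, 3)\n", "\n", "# Element-wise product with the stacked filter, summed to one number\n", "one_feature_value = np.sum(patch * kernels_filter1)\n", "print(f'Feature-map value for this patch: {one_feature_value:.4f}')" ] },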
] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# For the same filter above, we perform normalization because the current\n", "# values are between -1 and 1 and the imshow function would truncate all values \n", "# less than 0 making the visual difficult to infer from.\n", "kernels_filter1 = (kernels_filter1 - kernels_filter1.min())/(kernels_filter1.max() - kernels_filter1.min())\n", "\n", "# Plotting the filter\n", "fig, ax = plt.subplots(1,1,figsize = (4,4))\n", "ax.imshow(kernels_filter1)\n", "ax.set_title(f'1st Filter of convolution')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualizing all the filters (32) for the first convolutional layer" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Use the helper code below to visualize all filters for the first layer\n", "\n", "fig,ax=plt.subplots(4,8,figsize=(14,14))\n", "fig.subplots_adjust(bottom=0.2,top=0.5)\n", "for i in range(4):\n", " for j in range(8):\n", " filters= weights[:,:,:,(8*i)+j]\n", " filters = (filters - filters.min())/(filters.max() - filters.min())\n", " ax[i,j].imshow(filters)\n", " \n", "fig.suptitle('All 32 filters for 1st convolution layer',fontsize=20, y=0.53); " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize Feature Maps & Activations\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ⏸ Which of the following statements is true?\n", "\n", "\n", "#### A. Feature maps are a collection of weights, and filters are outputs of convolved inputs. \n", "#### B. Filters are a collection of learned weights, and feature maps are outputs of convolved inputs.\n", "#### C. Feature maps are learned features of a trained CNN model.\n", "#### D. Filters are the outputs of an activation layer on a feature map." ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "### edTest(test_chow2) ###\n", "# Submit an answer choice as a string below (eg. if you choose option C, put 'C')\n", "answer2 = '___'" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Use model.layers to get a list of all the layers in the model\n", "layers_list = ___\n", "print('\\n'.join([layer.name for layer in layers_list]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this exercise, we take a look at only the first convolution layer and the first activation layer." 
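{ "cell_type": "markdown", "metadata": {}, "source": [ "Below is a small standalone sketch (the names `toy_model`, `intermediate`, and `toy_maps` are illustrative, not part of the exercise) of the `tf.keras` functional-API pattern for reading an intermediate layer's output: wrap the existing model's input and the chosen layer's output in a new `Model`, then call `predict()`." ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Standalone illustration: extract an intermediate layer's output\n", "# from a tiny throwaway model (illustrative names only)\n", "toy_model = Sequential([\n", "    tf.keras.Input(shape=(8, 8, 1)),\n", "    tf.keras.layers.Conv2D(4, (3, 3)),\n", "    tf.keras.layers.Activation('relu'),\n", "])\n", "\n", "# New Model sharing the toy model's input, but stopping at the Conv2D output\n", "intermediate = Model(inputs=toy_model.input, outputs=toy_model.layers[0].output)\n", "\n", "# Predict on a random image; the output shape should be (1, 6, 6, 4)\n", "toy_maps = intermediate.predict(np.random.rand(1, 8, 8, 1))\n", "print(f'Toy feature-map shape: {toy_maps.shape}')" ] },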
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Get the output of the first convolution layer\n", "layer0_output = model.layers[0].output\n" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Use the tf.keras functional API: Model(inputs= , outputs= ), where\n", "# the input comes from model.input and the output is layer0_output\n", "feature_model = ___\n", "\n", "# Use a sample image from the test set to visualize the feature maps\n", "img = x_test[16].reshape(1, 32, 32, 3)\n", "\n", "# NOTE: We have to reshape the image to a 4-d tensor so that\n", "# it can be fed to the trained model" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Use the helper code below to plot the feature maps\n", "features = feature_model.predict(img)\n", "plot_featuremaps(img, features, [model.layers[0].name])" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Visualizing the first activation" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Get the output of the first activation layer\n", "layer1_output = model.layers[1].output" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Follow the same steps as above for the next layer\n", "activation_model = ___\n" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "# Use the helper code to again visualize the outputs\n", "img = x_test[16].reshape(1, 32, 32, 3)\n", "activations = activation_model.predict(img)\n", "\n", "# You can download the plot_featuremaps helper file\n", "# to see exactly how we make the plot below\n", "plot_featuremaps(img, activations, [model.layers[1].name])" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### ⏸ Using the feature maps, is it possible to locate the part of the image that is most responsible for predicting the output class?\n", "#### A. Yes\n", "#### B. No" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "### edTest(test_chow3) ###\n", "# Submit an answer choice as a string below\n", "# (e.g. if you choose option B, put 'B')\n", "answer3 = '___'" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }