Title:

Feed-Forward Neural Networks vs Convolutional Neural Networks

Description:

The aim of this exercise is to train a feed-forward neural network and a convolutional neural network, and to compare their accuracy and number of parameters on the image classification task described below.

Instructions:

Since we have only one 'Pavlos' image and one 'Not Pavlos' image, we need to augment the dataset. An image generator creates 'translated' versions of the two images, and training is performed on these translated images, which are provided in the data folder.

Feed-Forward Neural Network:
- Build a simple feed-forward neural network and compile the model with binary cross-entropy as the loss.
- Fit the model on the training data and save the history.
- Predict on the entire data.
- Visualize the loss and accuracy on the train and validation data with respect to the epochs.

Convolutional Neural Network:
- Build a convolutional neural network and compile the model with binary cross-entropy as the loss.
- Fit the model on the training data and save the history.
- Predict on the entire data.
- Visualize the loss and accuracy on the train and validation data with respect to the epochs.

Compare the accuracy and the number of parameters of the two models.
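
The 'translated' versions mentioned in the instructions can be pictured with a small NumPy sketch. This is not the course helper code (which, per the hints, relies on ImageDataGenerator); the function name `translate`, the zero padding and the shift amounts are all assumptions made for illustration.

```python
import numpy as np

def translate(image, dx, dy, fill=0.0):
    """Shift an H x W x C image dx pixels right and dy pixels down, padding with `fill`."""
    h, w = image.shape[:2]
    out = np.full_like(image, fill)
    # A shift larger than the image leaves nothing visible
    if abs(dx) >= w or abs(dy) >= h:
        return out
    # Windows of the source and destination that stay inside the frame
    src_y = slice(max(0, -dy), min(h, h - dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out

# A random stand-in for one 150 x 150 RGB image (hypothetical data)
rng = np.random.default_rng(1)
image = rng.random((150, 150, 3))

# Four shifted copies: left, right, up, down by 20 pixels
shifts = [(-20, 0), (20, 0), (0, -20), (0, 20)]
augmented = np.stack([translate(image, dx, dy) for dx, dy in shifts])
print(augmented.shape)  # (4, 150, 150, 3)
```

Every copy contains the same content at a different position, which is the kind of variation a translation-tolerant model should handle.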

Hints:

keras.Sequential(): Creates a sequential model. A Sequential model is appropriate for a plain stack of layers where each layer has exactly one input tensor and one output tensor.

model.compile(): Configures the model for training.

model.fit(): Trains the model for a fixed number of epochs.

history.history: The "history" object returned by model.fit() holds, in its history attribute, a dictionary of the loss and metric values recorded during training.

model.evaluate(): Returns the loss value and metric values for the model in test mode.

tf.keras.preprocessing.image.ImageDataGenerator(): Generates batches of tensor image data with real-time data augmentation. It is used in our helper code.

tf.keras.layers.Flatten(): Flattens the input. Does not affect the batch size.

tf.keras.layers.Conv2D(): 2D convolution layer (e.g. spatial convolution over images).

tf.keras.layers.Dense(): A regular densely-connected NN layer.

NOTE - The accuracy tests are run on the original network. Be sure to reset the network to its original parameters after answering the pause-and-think questions so that the tests pass.

Image Classification: FFNN vs CNN

In [1]:

# Importing necessary libraries
import os

import numpy as np
import tensorflow as tf
from numpy import asarray
from numpy.random import seed
from matplotlib import pyplot as plt
from PIL import Image
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten, Input
# Import from tensorflow.keras throughout rather than mixing in standalone keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img
from helper import plot_history
%matplotlib inline

# Fixing the random seeds for reproducibility
seed(1)
tf.random.set_seed(1)

In [2]:

# Initialise an image generator object
generator = ImageDataGenerator(rescale=1./255)

# Initialising number of data images
num_data = len(os.listdir('data/pavlos') + os.listdir('data/not_pavlos'))

# Read the image data from the directory using the generator object
img_generator = generator.flow_from_directory(directory="data/", color_mode='rgb', seed=1,
                                    batch_size=16, target_size=(150, 150), class_mode='binary')

# Print the target size, i.e. the height and width each image is resized to
TARGET_SIZE = img_generator.target_size
print(f'Generator produces images of size {TARGET_SIZE} (with 3 color channels)')

# Print the batch size
BATCH_SIZE = img_generator.batch_size
print(f'Images are generated in batches of size {BATCH_SIZE}')

Found 100 images belonging to 2 classes.
Generator produces images of size (150, 150) (with 3 color channels)
Images are generated in batches of size 16
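
The two numbers printed above (100 images, batches of 16) determine the step counts passed to fit() in the training cells. The arithmetic below simply mirrors the expressions used there; the variable names `batch_size` and `leftover` are chosen here for illustration.

```python
num_data = 100     # "Found 100 images belonging to 2 classes."
batch_size = 16    # batch size reported by the generator

# Number of full batches contained in the dataset
steps_per_epoch = num_data // batch_size

# Images that do not fit into those full batches
leftover = num_data - steps_per_epoch * batch_size

# Same expression as in the fit() calls; the 0.25 factor makes this a float
validation_steps = num_data * 0.25 // batch_size

print(steps_per_epoch, leftover, validation_steps)  # 6 4 1.0
```

The value 6 matches the `6/6` progress counter in the training logs, and a single validation step means each reported validation accuracy is based on one batch.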

In [3]:

# Plotting a sample of the generated images
# (each batch is an (images, labels) tuple; keep only the images)
sample_batch = next(img_generator)[0]
fig, ax = plt.subplots(4,4)
ax = ax.ravel()
for i, img in enumerate(sample_batch):
    ax[i].set_axis_off()
    ax[i].imshow(img)
plt.suptitle('Sample Batch of Generated Images', y=1.05)
plt.tight_layout()

Feed-Forward Neural Network

Our first network will be a feed-forward neural network. The only layers with learned parameters we will be using are dense layers.

In [28]:

# Fixing the random seed
seed(1)
tf.random.set_seed(1)

# Creating a feed-forward Neural Network
FFNN = Sequential()

# Specify a layer that takes the input with input shape
# the same as the size of the image defined during image generation
# Remember to take into account that the image has 3 channels
FFNN.add(tf.keras.layers.Input(shape=(150,150, 3)))

# Add a flatten layer to enable FFNN to process images
FFNN.add(tf.keras.layers.Flatten())

# Specify a list of the number of nodes for each dense layer
ffnn_filters = [6,4,2]

# Add dense layers for the number of nodes in ffnn_filters with ReLU activation
for n_nodes in ffnn_filters:
    FFNN.add(tf.keras.layers.Dense(n_nodes, activation='relu'))

# Add the final dense layer with 1 output node and sigmoid activation
# to differentiate between the two classes
FFNN.add(tf.keras.layers.Dense(1, activation='sigmoid'))

# Compile the model with bce as the loss, accuracy as the metric and adam optimizer
FFNN.compile(loss='binary_crossentropy', metrics=['accuracy'], optimizer='adam')

In [29]:

# Print a summary of the model and observe the total number of parameters
FFNN.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten_3 (Flatten)          (None, 67500)             0         
_________________________________________________________________
dense_10 (Dense)             (None, 6)                 405006    
_________________________________________________________________
dense_11 (Dense)             (None, 4)                 28        
_________________________________________________________________
dense_12 (Dense)             (None, 2)                 10        
_________________________________________________________________
dense_13 (Dense)             (None, 1)                 3         
=================================================================
Total params: 405,047
Trainable params: 405,047
Non-trainable params: 0
_________________________________________________________________
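
The parameter counts in the summary above can be reproduced by hand: a dense layer has one weight per input-output pair plus one bias per output node. The sketch below assumes only the layer sizes shown in the summary; `dense_params` is a helper name introduced here.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# The Flatten layer turns a 150 x 150 x 3 image into a vector of this length
flattened = 150 * 150 * 3

# Hidden layers of 6, 4 and 2 nodes, followed by the single output node
layer_units = [6, 4, 2, 1]

counts = []
n_in = flattened
for n_units in layer_units:
    counts.append(dense_params(n_in, n_units))
    n_in = n_units

print(counts)       # [405006, 28, 10, 3]
print(sum(counts))  # 405047
```

Almost all of the parameters sit in the first dense layer, because every one of its 6 nodes is connected to all 67,500 input values.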

In [30]:

# Train the model
# Note: validation batches are drawn from the same generator as the training batches
FFNN_history = FFNN.fit(
        img_generator,
        steps_per_epoch=num_data // BATCH_SIZE,
        epochs=10, shuffle=False, workers=0,
        validation_data=img_generator,
        validation_steps=num_data * 0.25 // BATCH_SIZE)

Epoch 1/10
6/6 [==============================] - 1s 46ms/step - loss: 1.4907 - accuracy: 0.5323 - val_loss: 0.6934 - val_accuracy: 0.4375
Epoch 2/10
6/6 [==============================] - 0s 19ms/step - loss: 0.6931 - accuracy: 0.5182 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 3/10
6/6 [==============================] - 0s 19ms/step - loss: 0.6931 - accuracy: 0.5012 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 4/10
6/6 [==============================] - 0s 20ms/step - loss: 0.6931 - accuracy: 0.5132 - val_loss: 0.6921 - val_accuracy: 0.6250
Epoch 5/10
6/6 [==============================] - 0s 19ms/step - loss: 0.6933 - accuracy: 0.4854 - val_loss: 0.6937 - val_accuracy: 0.4375
Epoch 6/10
6/6 [==============================] - 0s 19ms/step - loss: 0.6928 - accuracy: 0.5424 - val_loss: 0.6921 - val_accuracy: 0.6250
Epoch 7/10
6/6 [==============================] - 0s 19ms/step - loss: 0.6932 - accuracy: 0.4999 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 8/10
6/6 [==============================] - 0s 20ms/step - loss: 0.6929 - accuracy: 0.5344 - val_loss: 0.6914 - val_accuracy: 0.6875
Epoch 9/10
6/6 [==============================] - 0s 26ms/step - loss: 0.6936 - accuracy: 0.4563 - val_loss: 0.6937 - val_accuracy: 0.4375
Epoch 10/10
6/6 [==============================] - 0s 21ms/step - loss: 0.6933 - accuracy: 0.4893 - val_loss: 0.6937 - val_accuracy: 0.4375
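
The training loss in the log above settles at about 0.6931, which is ln 2. That is the binary cross-entropy of a prediction of exactly 0.5, so the log is consistent with a network that outputs roughly 0.5 regardless of the input. This is an interpretation of the numbers, not something verified against the model itself. A quick check of the arithmetic, using a hand-written `binary_cross_entropy` helper (not the Keras implementation):

```python
import math

def binary_cross_entropy(y_true, p):
    """Binary cross-entropy for one example with label y_true and predicted probability p."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# Predicting 0.5 costs the same whichever class is the true one
print(round(binary_cross_entropy(1, 0.5), 4))  # 0.6931
print(round(binary_cross_entropy(0, 0.5), 4))  # 0.6931
print(round(math.log(2), 4))                   # 0.6931
```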

⏸ Enter the number of parameters in the given FFNN architecture.

In [31]:

### edTest(test_chow1) ###

# Enter the answer by typing in a number in the space provided
answer1 = '405,047'

In [32]:

# Use the plot history function from the helper file to plot the data
plot_history(FFNN_history, 'Feed-Forward Neural Network')

In [33]:

### edTest(test_ffnn_acc) ###

# Evaluate your model on two batches drawn from the generator
FFNN_loss, FFNN_acc = FFNN.evaluate(img_generator, steps=2)
print(f'FFNN Accuracy: {FFNN_acc}')

2/2 [==============================] - 0s 12ms/step - loss: 0.6913 - accuracy: 0.7188
FFNN Accuracy: 0.71875
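
Because evaluate() is called with steps=2, the reported accuracy is based on a small sample. Assuming both batches are full batches of 16 (an assumption, since the generator can also yield a smaller final batch), the figure above corresponds to a whole number of correct predictions:

```python
batch_size = 16             # from the generator settings
steps = 2                   # as passed to FFNN.evaluate
reported_accuracy = 0.71875

# Number of images scored, assuming two full batches
n_evaluated = batch_size * steps

# Number of correct predictions implied by the reported accuracy
n_correct = reported_accuracy * n_evaluated

print(n_evaluated, n_correct)  # 32 23.0
print(1 / n_evaluated)         # 0.03125
```

With only 32 images, each single prediction moves the accuracy by about 3 percentage points, so this figure is a coarse estimate.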

⏸ Alter the network architecture by increasing the number of nodes and/or layers. Enter the number of parameters of a network that achieves a validation accuracy above 80%.

In [42]:

### edTest(test_chow2) ###

# Enter the answer by typing in a number in the space provided
answer2 = '405,047'

Convolutional Neural Network

In [43]:

### edTest(test_cnn_count_param) ###

# Fixing the random seed
seed(1)
tf.random.set_seed(1)

# Creating a CNN
CNN = Sequential()

# Add a layer to take the input with shape (150,150,3)
CNN.add(Input(shape=(150, 150, 3)))

# Specify a list of the number of filters for each convolutional layer
cnn_filters = [8,8,8,8,8]

# Add convolutional layers with number of filters in cnn_filters
# with kernel size as 3, stride of 2 and relu activation
for n_filters in cnn_filters:
    CNN.add(Conv2D(n_filters, kernel_size=3, strides=(2, 2), activation='relu'))

# Add the flatten layer between the CNN and dense layer
CNN.add(Flatten())

# Add a dense layer with 64 nodes and relu activation
CNN.add(Dense(64, activation='relu'))
 
# Specify the output layer with sigmoid activation and one node
CNN.add(Dense(1, activation='sigmoid'))
 
# Compile the model with bce as the loss, accuracy as the metric and adam optimizer
CNN.compile(loss='binary_crossentropy', metrics=['accuracy'], optimizer='adam')

In [44]:

# Print a summary of the model and observe the total number of parameters
CNN.summary()

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_15 (Conv2D)           (None, 74, 74, 8)         224       
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 36, 36, 8)         584       
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 17, 17, 8)         584       
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 8, 8, 8)           584       
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 3, 3, 8)           584       
_________________________________________________________________
flatten_5 (Flatten)          (None, 72)                0         
_________________________________________________________________
dense_16 (Dense)             (None, 64)                4672      
_________________________________________________________________
dense_17 (Dense)             (None, 1)                 65        
=================================================================
Total params: 7,297
Trainable params: 7,297
Non-trainable params: 0
_________________________________________________________________
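
The output shapes and parameter counts in the summary above follow from two formulas, sketched below. A convolution with no padding, kernel size k and stride s maps a side of length n to floor((n - k) / s) + 1, and a convolutional layer has k * k * in_channels weights plus one bias per filter. The helper names `conv_out` and `conv_params` are introduced here for illustration.

```python
def conv_out(size, kernel=3, stride=2):
    """Output side length of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1

def conv_params(in_channels, n_filters, kernel=3):
    """kernel x kernel x in_channels weights plus one bias, for each filter."""
    return kernel * kernel * in_channels * n_filters + n_filters

size, channels = 150, 3
total = 0

# Five convolutional layers with 8 filters each
for n_filters in [8, 8, 8, 8, 8]:
    params = conv_params(channels, n_filters)
    size, channels = conv_out(size), n_filters
    total += params
    print(f'Conv2D -> ({size}, {size}, {channels}), params = {params}')

# Flatten, then Dense(64) and Dense(1)
flat = size * size * channels
total += flat * 64 + 64
total += 64 * 1 + 1

print(flat)   # 72
print(total)  # 7297
```

Each convolutional layer needs only a few hundred parameters because its filters are shared across all image positions, and the strided convolutions shrink the feature map to 72 values before the dense layers.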

⏸ Enter the number of parameters in the given CNN architecture.

In [45]:

### edTest(test_chow3) ###

# Enter the answer by typing in a number in the space provided
answer3 = '7,297'

In [46]:

# Fit the model on the image generator
# Note: validation batches are drawn from the same generator as the training batches
CNN_history = CNN.fit(
        img_generator,
        steps_per_epoch=num_data // BATCH_SIZE,
        epochs=10, shuffle=False, workers=0,
        validation_data=img_generator,
        validation_steps=num_data * 0.25 // BATCH_SIZE)

Epoch 1/10
6/6 [==============================] - 1s 50ms/step - loss: 0.6937 - accuracy: 0.4350 - val_loss: 0.6906 - val_accuracy: 0.6250
Epoch 2/10
6/6 [==============================] - 0s 23ms/step - loss: 0.6899 - accuracy: 0.6763 - val_loss: 0.6850 - val_accuracy: 0.7500
Epoch 3/10
6/6 [==============================] - 0s 33ms/step - loss: 0.6834 - accuracy: 0.8830 - val_loss: 0.6754 - val_accuracy: 1.0000
Epoch 4/10
6/6 [==============================] - 0s 23ms/step - loss: 0.6721 - accuracy: 0.9557 - val_loss: 0.6581 - val_accuracy: 0.9375
Epoch 5/10
6/6 [==============================] - 0s 23ms/step - loss: 0.6460 - accuracy: 0.9808 - val_loss: 0.6095 - val_accuracy: 1.0000
Epoch 6/10
6/6 [==============================] - 0s 91ms/step - loss: 0.5915 - accuracy: 0.9826 - val_loss: 0.5142 - val_accuracy: 1.0000
Epoch 7/10
6/6 [==============================] - 0s 27ms/step - loss: 0.4830 - accuracy: 0.9737 - val_loss: 0.3584 - val_accuracy: 1.0000
Epoch 8/10
6/6 [==============================] - 0s 24ms/step - loss: 0.3068 - accuracy: 0.9900 - val_loss: 0.1658 - val_accuracy: 1.0000
Epoch 9/10
6/6 [==============================] - 0s 23ms/step - loss: 0.1193 - accuracy: 1.0000 - val_loss: 0.0316 - val_accuracy: 1.0000
Epoch 10/10
6/6 [==============================] - 0s 23ms/step - loss: 0.0379 - accuracy: 1.0000 - val_loss: 0.0097 - val_accuracy: 1.0000

In [47]:

# Plot the history of the model
plot_history(CNN_history, 'Convolutional Neural Network')

In [48]:

### edTest(test_cnn_acc) ###

# Evaluate the model on two batches drawn from the generator
CNN_loss, CNN_acc = CNN.evaluate(img_generator, steps=2)
print(f'CNN Test Accuracy: {CNN_acc}')

2/2 [==============================] - 0s 15ms/step - loss: 0.0153 - accuracy: 1.0000
CNN Test Accuracy: 1.0

⏸ Remove the last convolutional layer in the convolutional neural network defined above. How does this affect the number of parameters?

A. The number of parameters decreases.

B. The number of parameters increases.

C. The number of parameters remains the same.

In [49]:

### edTest(test_chow4) ###

# Enter the answer by typing in the letter of your choice in the space provided
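# Without the last convolution, the Flatten layer outputs 8*8*8 = 512 values instead of 72,
# so the 64-node dense layer grows by more than the 584 parameters that are removed
answer4 = 'B'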
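
The effect of removing the last convolutional layer can be checked with plain arithmetic, without rebuilding the model. The sketch below assumes the architecture defined above (3x3 kernels, stride 2, no padding, 8 filters per layer, then Dense(64) and Dense(1)); `cnn_total_params` is a helper name introduced here.

```python
def cnn_total_params(n_conv_layers, size=150, channels=3, n_filters=8,
                     kernel=3, stride=2, dense_units=64):
    """Total parameters of the CNN above with the given number of convolutional layers."""
    total = 0
    for _ in range(n_conv_layers):
        # Weights plus biases of one convolutional layer
        total += kernel * kernel * channels * n_filters + n_filters
        # Unpadded strided convolution shrinks each spatial side
        size = (size - kernel) // stride + 1
        channels = n_filters
    flat = size * size * channels
    total += flat * dense_units + dense_units  # Dense(64) after Flatten
    total += dense_units + 1                   # Dense(1) output layer
    return total

print(cnn_total_params(5))  # 7297
print(cnn_total_params(4))  # 34873
```

Dropping the layer removes 584 convolutional parameters, but the Flatten layer then passes on 8 * 8 * 8 = 512 values instead of 72, and the first dense layer grows from 4,672 to 32,832 parameters. The total therefore increases.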
