Key Word(s): autoencoder, MNIST



Title

Introduction to Autoencoders using MNIST

Description:

The goal of the exercise is to use an autoencoder to first compress hand-written digit images from the MNIST dataset down to lower-dimensional representations and then expand them back to the original images.

Your final output will look similar to the image below:

Instructions:

  • Load the MNIST dataset with the mnist.load_data() function provided by Keras. Load the images into two separate arrays, x_train and x_test.
  • Normalize your data by dividing each pixel by 255.
  • Create the first part of the autoencoder - the encoder model.
    • The encoder compresses the input image down to a lower-dimensional latent space.
  • Next, create the second half of the autoencoder - the decoder.
    • The decoder expands an image representation in the latent space back to the full dimensions of the original input image.
  • Finally, combine the encoder and decoder into the autoencoder.
    • The autoencoder shrinks the image down to its latent-space representation and then expands it back to the original dimensions.
  • Visualize the model's predictions on the test set after every epoch using the helper code given.
  • Experiment with different values of latent_size, layer sizes, and regularization.

Hints:

More on the Keras Functional API here: https://www.tensorflow.org/guide/keras/functional

tf.keras.Model.compile() Configures the model for training with a loss function and an optimizer.

tf.keras.Sequential() Groups a linear stack of layers into a model.

tf.keras.layers.Dense() A regular densely-connected NN layer.

tf.keras.layers.Flatten() Flattens the input. Does not affect the batch size.

tf.keras.Input() Used to instantiate a Keras tensor.

NOTE: To keep things simple we will use dense layers, so no convolutions here.
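To see how these hinted pieces fit together, here is a toy snippet (the layer size and loss below are placeholders for illustration, not the ones the exercise asks for):

from tensorflow.keras import models, layers

toy = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),  # (28, 28) image -> 784-vector
    layers.Dense(16, activation='relu'),   # densely-connected layer, 16 units
])
toy.compile(loss='mse', optimizer='nadam')  # compile attaches loss + optimizer
toy.summary()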

In [0]:
# import required libraries

from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Dense, Input, Flatten, Reshape
from tensorflow.keras import models
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras import layers
from matplotlib import pyplot as plt
from IPython import display
import numpy as np

%matplotlib inline
In [0]:
### edTest(test_normalize) ###
# First we load in the MNIST dataset.

# The underscores below are not blanks to fill in; they discard the labels,
# which an autoencoder does not need
(x_train, _), (x_test, _) = mnist.load_data()


# We take only 4000 data points from the original dataset to demonstrate the autoencoder
sample_size = 4000 
x_train = x_train[:sample_size]
x_test = x_test[:sample_size]

# We normalize the pixel data (i.e., divide each pixel by 255)

x_train = ___
x_test = ___

# We print image dimensions to confirm 
print(f'image shape: {x_train[0].shape} and a sample pixel value is {x_train[0][20][10]}')

# We also plot an example image from x_train
plt.imshow(x_train[0], cmap = "gray")
plt.show()
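If you get stuck on the blanks above, here is one possible fill-in (MNIST pixels are uint8 values in [0, 255], so dividing by 255.0 yields floats in [0, 1]):

x_train = x_train / 255.0
x_test = x_test / 255.0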
In [0]:
### edTest(test_model_encoder) ###
# Now we create the encoder model to compress each image down to a lower-dimensional latent space.

# pick a size for the latent dimension like 32
latent_size = ___

# Note: Sequential models can also be passed a list of layers,
# which can be more concise than calling add()
model_1 = models.Sequential(name='Encoder')

# add a flatten layer to convert image of size (28,28) to 784
# don't forget to include the `input_shape` argument
model_1.add(___)

# add a dense layer with 128 neurons
model_1.add(___)

# add another dense layer with 64 neurons
model_1.add(___)

# Finally add the last dense layer with latent_size number of neurons
model_1.add(___)

# Take a quick look at the model summary
model_1.summary()
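One possible completion of the encoder; the ReLU activations are our choice here, since the instructions only fix the layer sizes:

latent_size = 32

model_1 = models.Sequential(name='Encoder')
model_1.add(Flatten(input_shape=(28, 28)))          # (28, 28) -> 784
model_1.add(Dense(128, activation='relu'))
model_1.add(Dense(64, activation='relu'))
model_1.add(Dense(latent_size, activation='relu'))  # compress to latent space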
In [0]:
### edTest(test_model_decoder) ###
# Now we create the decoder model to expand the latent representation back to the original image dimensions.
model_2 = models.Sequential(name='Decoder')

# add a dense layer with 64 neurons
model_2.add(___)

# add a dense layer with 128 neurons
model_2.add(___)

# add a dense layer with 784 neurons; be sure to choose an appropriate activation function
model_2.add(___)

# finally, reshape the output back to size (28, 28)
model_2.add(___)

# Take a quick look at the model summary
model_2.summary()
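A matching sketch for the decoder; the sigmoid on the last dense layer is one natural choice because the normalized pixels live in [0, 1], and the explicit input_shape (our addition) lets summary() run before the model has seen data:

model_2 = models.Sequential(name='Decoder')
model_2.add(Dense(64, activation='relu', input_shape=(latent_size,)))
model_2.add(Dense(128, activation='relu'))
model_2.add(Dense(784, activation='sigmoid'))  # outputs in [0, 1], like the inputs
model_2.add(Reshape((28, 28)))                 # 784-vector -> (28, 28) image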
In [0]:
### edTest(test_model_autoencoder) ###
# To build the autoencoder, we will use the Keras functional API
# read more here -> https://www.tensorflow.org/guide/keras/functional

# define an input with the same dimensions as the image
img = Input(shape=(28,28))

# Use the 'encoder' i.e model_1 from above to get a variable `latent_vector`
latent_vector = model_1(___)

# Use the 'decoder' i.e model_2 from above to get the output variable
output = model_2(___)


# using functional api to define autoencoder model
autoencoder = Model(inputs = ___, outputs = ___)

# choose an appropriate loss function for the 'reconstruction error' and set the optimizer to nadam
autoencoder.compile(___)

# Take a quick look at the model summary
autoencoder.summary()
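A sketch of the functional-API wiring; binary cross-entropy is one common reconstruction loss when pixels lie in [0, 1], and 'mse' is a reasonable alternative:

latent_vector = model_1(img)     # (28, 28) -> latent_size
output = model_2(latent_vector)  # latent_size -> (28, 28)

autoencoder = Model(inputs=img, outputs=output)
autoencoder.compile(loss='binary_crossentropy', optimizer='nadam')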
In [0]:
# You can train for 10 or more epochs to see how well our autoencoder model performs
epochs = 10

for i in range(epochs + 1):
    # Note: iteration 0 plots the untrained model, before any fitting
    fig, axs = plt.subplots(1, 2, figsize=(8, 4))
    sample_x = x_test[np.random.choice(x_test.shape[0])]
    axs[0].imshow(sample_x, cmap="gray")
    axs[0].set_title('Test image', fontsize=16)
    axs[1].imshow(autoencoder.predict(sample_x.reshape(1, 28, 28))[0], cmap="gray")
    axs[1].set_title('Autoencoder Prediction', fontsize=16)
    fig.suptitle(f'Autoencoder recreation after epoch number {i}', fontsize=14)
    plt.show()

    # the images serve as both predictors and targets; skip the fit after the
    # final plot so the model trains for exactly `epochs` epochs
    if i < epochs:
        autoencoder.fit(x=x_train,
                        y=x_train,
                        validation_data=(x_test, x_test))
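If you would rather see a single updating figure than a growing stack of plots, the from IPython import display line at the top is there for exactly that; calling this at the start of each loop iteration is a minimal way to do it:

display.clear_output(wait=True)  # clear the previous figure before plotting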

Mindchow 🍲

Go back and change latent_size to a lower value like 2. Do your autoencoder's reconstructions become better or worse? Why?

Your answer here