{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Title\n",
"\n",
"**Exercise: Early Stopping**\n",
"\n",
"# Description\n",
"\n",
"The goal of this exercise is to understand early stopping. Early stopping is a method of avoiding overfitting, not exactly regularizing. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n",
"\n",
"NOTE: This graph is only a sample."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Instructions:\n",
"\n",
"- Generate the predictor and response data using the helper code given.\n",
"- Split the data into train and test sets.\n",
"- Visualise the split data using the helper code.\n",
"- Build a simple neural network with 5 hidden layers with 100 neurons each with the given pre-trained weights. This network has no regularization.\n",
"- Compile the model with MSE as the loss.\n",
"- Fit the model on the training data and save the history.\n",
"- Use the helper code to visualise the MSE of the train and test data with respect to the epochs.\n",
"- Predict on the entire data. \n",
"- Use the helper function to plot the predictions along with the generated data.\n",
"- Repeat steps 4 to 8 by building the same neural network with early stopping.\n",
"- The last plot will consist of the predictions of both the neural networks. The graph will look similar to the one given above.\n",
"\n",
"\n",
"# Hints:\n",
"\n",
"Use the Dense layer to regularize using l2 and l1 regularization. More details can be found here.\n",
"\n",
"tf.keras.sequential() : A sequential model is for a plain stack of layers where each layer has exactly one input tensor and one output tensor.\n",
"\n",
"tf.keras.optimizers() : An optimizer is one of the two arguments required for compiling a Keras model\n",
"\n",
"model.add() : Adds layers to the model.\n",
"\n",
"model.compile() : Compiles the layers defined into a neural network\n",
"\n",
"model.fit() : Fits the data to the neural network\n",
"\n",
"model.predict() : Used to predict the values given the model\n",
"\n",
"history() : The history object is returned from calls to the fit() function used to train the model. Metrics are stored in a dictionary in the history member of the object returned.\n",
"\n",
"tf.keras.regularizers.L2() : A regularizer that applies a L2 regularization penalty."
]
},
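{
"cell_type": "markdown",
"metadata": {},
"source": [
"For orientation, here is a minimal sketch of how these pieces fit together. It assumes the imports used later in this notebook; the layer sizes and parameter values are illustrative, not the ones required by the exercise.\n",
"\n",
"```python\n",
"# Illustrative sketch only: a tiny Sequential model trained with early stopping\n",
"model = tf.keras.Sequential()\n",
"model.add(tf.keras.layers.Dense(10, activation='relu', input_shape=(1,)))\n",
"model.add(tf.keras.layers.Dense(1, activation='linear'))\n",
"\n",
"# Stop training once the monitored loss fails to improve for `patience` epochs\n",
"callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)\n",
"\n",
"model.compile(loss='MSE', optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))\n",
"\n",
"# fit() returns a History object; the recorded metrics live in history.history\n",
"history = model.fit(x_train, y_train, validation_split=0.2,\n",
"                    epochs=100, batch_size=10, callbacks=[callback], verbose=0)\n",
"```"
]
},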
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Import the necessary libraries\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import warnings\n",
"warnings.filterwarnings(\"ignore\")\n",
"import tensorflow as tf\n",
"np.random.seed(0)\n",
"tf.random.set_seed(0)\n",
"from tensorflow.keras import layers\n",
"from tensorflow.keras import models\n",
"from tensorflow.keras import optimizers\n",
"from tensorflow.keras.models import load_model\n",
"from tensorflow.keras import regularizers\n",
"from sklearn.metrics import mean_squared_error\n",
"from tensorflow.keras.models import load_model\n",
"from sklearn.model_selection import train_test_split\n",
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# Use the helper code below to generate the data\n",
"\n",
"# Defines the number of data points to generate\n",
"num_points = 30 \n",
"\n",
"# Generate predictor points (x) between 0 and 5\n",
"x = np.linspace(0,5,num_points)\n",
"\n",
"# Generate the response variable (y) using the predictor points\n",
"y = x * np.sin(x) + np.random.normal(loc=0, scale=1, size=num_points)\n",
"\n",
"# Generate data of the true function y = x*sin(x) \n",
"# x_b will be used for all predictions below \n",
"x_b = np.linspace(0,5,100)\n",
"y_b = x_b*np.sin(x_b)\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Split the data into train and test sets with .33 and random_state = 42\n",
"x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# Helper code to plot the generated data \n",
"\n",
"# Plot the train data\n",
"plt.rcParams[\"figure.figsize\"] = (10,8)\n",
"\n",
"plt.plot(x_train,y_train, '.', label='Train data', markersize=15, color='#FF9A98')\n",
"\n",
"# Plot the test data\n",
"plt.plot(x_test,y_test, '.', label='Test data', markersize=15, color='#75B594')\n",
"\n",
"# Plot the true data\n",
"plt.plot(x_b, y_b, '-', label='True function', linewidth=3, color='#5E5E5E')\n",
"\n",
"# Set the axes labels\n",
"plt.xlabel('X')\n",
"plt.ylabel('Y')\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Begin with an unregularized NN. \n",
"\n",
"#### Same as the previous exercise"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# Building an unregularized NN. \n",
"# Initialise the NN, give it an appropriate name for the ease of reading\n",
"# The FCNN has 5 layers, each with 100 nodes\n",
"model_1 = models.Sequential(name='Unregularized')\n",
"\n",
"# Add 5 hidden layers with 100 neurons each\n",
"model_1.add(layers.Dense(100, activation='tanh', input_shape=(1,)))\n",
"model_1.add(layers.Dense(100, activation='relu'))\n",
"model_1.add(layers.Dense(100, activation='relu'))\n",
"model_1.add(layers.Dense(100, activation='relu'))\n",
"model_1.add(layers.Dense(100, activation='relu'))\n",
"\n",
"# Add the output layer with one neuron \n",
"model_1.add(layers.Dense(1, activation='linear'))\n",
"\n",
"# View the model summary\n",
"model_1.summary()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# Load with the weights already provided for the unregularized network\n",
"\n",
"model_1.load_weights('weights.h5')\n",
"\n",
"# Compile the model\n",
"model_1.compile(loss='MSE',optimizer=optimizers.Adam(learning_rate=0.001)) "
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"# Use the model above to predict for x_b (used exclusively for plotting) \n",
"y_pred = model_1.predict(x_b)\n",
"\n",
"# Use the model above to predict on the test data\n",
"y_pred_test = model_1.predict(x_test)\n",
"\n",
"# Compute the MSE on the test data\n",
"mse = mean_squared_error(y_test,y_pred_test)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"# Use the helper code to plot the predicted data\n",
"plt.rcParams[\"figure.figsize\"] = (10,8)\n",
"plt.plot(x_b, y_pred, label = 'Unregularized model', color='#5E5E5E', linewidth=3)\n",
"plt.plot(x_train,y_train, '.', label='Train data', markersize=15, color='#FF9A98')\n",
"plt.plot(x_test,y_test, '.', label='Test data', markersize=15, color='#75B594')\n",
"plt.xlabel('X')\n",
"plt.ylabel('Y')\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Implement previous NN with early stopping \n",
"For early stopping we build the same network but then we implement early stopping using callbacks. "
]
},
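{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reference for the fill-in cells below, a Keras early-stopping callback is created and passed to fit() roughly like this (the monitored metric and patience value here are illustrative):\n",
"\n",
"```python\n",
"# Illustrative sketch: stop when the loss has not improved for 10 epochs\n",
"es = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=10)\n",
"\n",
"# The callback is passed to fit() through the `callbacks` argument:\n",
"# history = model.fit(..., callbacks=[es])\n",
"```"
]
},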
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Building an unregularized NN with early stopping. \n",
"# Initialise the NN, give it an appropriate name for the ease of reading\n",
"# The FCNN has 5 layers, each with 100 nodes\n",
"model_2 = models.Sequential(name='EarlyStopping')\n",
"\n",
"# Add 5 hidden layers with 100 neurons each \n",
"# tanh is the activation for the first layer\n",
"# relu is the activation for all other layers\n",
"model_2.add(layers.Dense(100, activation='tanh', input_shape=(1,)))\n",
"model_2.add(layers.Dense(100, activation='relu'))\n",
"model_2.add(layers.Dense(100, activation='relu'))\n",
"model_2.add(layers.Dense(100, activation='relu'))\n",
"model_2.add(layers.Dense(100, activation='relu'))\n",
"\n",
"# Add the output layer with one neuron \n",
"model_2.add(layers.Dense(1, activation='linear'))\n",
"\n",
"# View the model summary\n",
"model_2.summary()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Use the keras early stopping callback with patience=10 while monitoring the loss\n",
"callback = ___\n",
"\n",
"# Compile the model with MSE as loss and Adam optimizer with learning rate as 0.001\n",
"___\n",
"\n",
"# Save the history about the model after fitting on the train data\n",
"# Use 0.2 validation split with 1500 epochs and batch size of 10\n",
"# Use the callback for early stopping here\n",
"history_2 = ___\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Helper function to plot the data\n",
"# Plot the MSE of the model\n",
"plt.rcParams[\"figure.figsize\"] = (10,8)\n",
"plt.title(\"Early stop model\")\n",
"plt.semilogy(history_2.history['loss'], label='Train Loss', color='#FF9A98', linewidth=2)\n",
"plt.semilogy(history_2.history['val_loss'], label='Validation Loss', color='#75B594', linewidth=2)\n",
"plt.legend()\n",
"\n",
"# Set the axes labels\n",
"plt.xlabel('Epochs')\n",
"plt.ylabel('Log MSE Loss')\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Use the early stop implemented model above to predict for x_b (used exclusively for plotting)\n",
"y_early_stop_pred = ___\n",
"\n",
"# Use the model above to predict on the test data\n",
"y_earl_stop_pred_test = ___\n",
"\n",
"# Compute the test MSE by predicting on the test data\n",
"mse_es = ___"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Use the helper code to plot the predicted data\n",
"\n",
"# Plotting the predicted data using the L2 regularized model\n",
"plt.rcParams[\"figure.figsize\"] = (10,8)\n",
"plt.plot(x_b, y_early_stop_pred, label='Early stop regularized model', color='black', linewidth=2)\n",
"\n",
"# Plotting the predicted data using the unregularized model\n",
"plt.plot(x_b, y_pred, label = 'Unregularized model', color='#005493', linewidth=2)\n",
"\n",
"# Plotting the training data\n",
"plt.plot(x_train,y_train, '.', label='Train data', markersize=15, color='#FF9A98')\n",
"\n",
"# Plotting the testing data\n",
"plt.plot(x_test,y_test, '.', label='Test data', markersize=15, color='#75B594')\n",
"\n",
"# Set the axes labels\n",
"plt.xlabel('X')\n",
"plt.ylabel('Y')\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Mindchow 🍲\n",
"\n",
"**After marking change the patience parameter once to 2 and once to 100 in the early stopping callback with the same data. Do you notice any change? While value is more efficient?**"
]
},
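{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to run this comparison is sketched below. It reuses the imports and training data from above; the three-layer network is deliberately smaller than the exercise's, for brevity.\n",
"\n",
"```python\n",
"# Illustrative sketch: compare how long training runs for different patience values\n",
"for patience in (2, 100):\n",
"    cb = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=patience)\n",
"    m = models.Sequential()\n",
"    m.add(layers.Dense(100, activation='tanh', input_shape=(1,)))\n",
"    m.add(layers.Dense(100, activation='relu'))\n",
"    m.add(layers.Dense(1, activation='linear'))\n",
"    m.compile(loss='MSE', optimizer=optimizers.Adam(learning_rate=0.001))\n",
"    h = m.fit(x_train, y_train, validation_split=0.2, epochs=1500,\n",
"              batch_size=10, callbacks=[cb], verbose=0)\n",
"    print(f\"patience={patience}: stopped after {len(h.history['loss'])} epochs\")\n",
"```"
]
},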
{
"cell_type": "markdown",
"metadata": {},
"source": [
" "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}