{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Title\n", "\n", "**Exercise: Early Stopping**\n", "\n", "# Description\n", "\n", "The goal of this exercise is to understand early stopping. Early stopping is a method of avoiding overfitting, not exactly regularizing. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "NOTE: This graph is only a sample." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Instructions:\n", "\n", "- Generate the predictor and response data using the helper code given.\n", "- Split the data into train and test sets.\n", "- Visualise the split data using the helper code.\n", "- Build a simple neural network with 5 hidden layers with 100 neurons each with the given pre-trained weights. This network has no regularization.\n", "- Compile the model with MSE as the loss.\n", "- Fit the model on the training data and save the history.\n", "- Use the helper code to visualise the MSE of the train and test data with respect to the epochs.\n", "- Predict on the entire data. \n", "- Use the helper function to plot the predictions along with the generated data.\n", "- Repeat steps 4 to 8 by building the same neural network with early stopping.\n", "- The last plot will consist of the predictions of both the neural networks. The graph will look similar to the one given above.\n", "\n", "\n", "# Hints:\n", "\n", "Use the Dense layer to regularize using l2 and l1 regularization. More details can be found here.\n", "\n", "tf.keras.sequential() : A sequential model is for a plain stack of layers where each layer has exactly one input tensor and one output tensor.\n", "\n", "tf.keras.optimizers() : An optimizer is one of the two arguments required for compiling a Keras model\n", "\n", "model.add() : Adds layers to the model.\n", "\n", "model.compile() : Compiles the layers defined into a neural network\n", "\n", "model.fit() : Fits the data to the neural network\n", "\n", "model.predict() : Used to predict the values given the model\n", "\n", "history() : The history object is returned from calls to the fit() function used to train the model. Metrics are stored in a dictionary in the history member of the object returned.\n", "\n", "tf.keras.regularizers.L2() : A regularizer that applies a L2 regularization penalty." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Import the necessary libraries\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")\n", "import tensorflow as tf\n", "np.random.seed(0)\n", "tf.random.set_seed(0)\n", "from tensorflow.keras import layers\n", "from tensorflow.keras import models\n", "from tensorflow.keras import optimizers\n", "from tensorflow.keras.models import load_model\n", "from tensorflow.keras import regularizers\n", "from sklearn.metrics import mean_squared_error\n", "from tensorflow.keras.models import load_model\n", "from sklearn.model_selection import train_test_split\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Use the helper code below to generate the data\n", "\n", "# Defines the number of data points to generate\n", "num_points = 30 \n", "\n", "# Generate predictor points (x) between 0 and 5\n", "x = np.linspace(0,5,num_points)\n", "\n", "# Generate the response variable (y) using the predictor points\n", "y = x * np.sin(x) + np.random.normal(loc=0, scale=1, size=num_points)\n", "\n", "# Generate data of the true function y = x*sin(x) \n", "# x_b will be used for all predictions below \n", "x_b = np.linspace(0,5,100)\n", "y_b = x_b*np.sin(x_b)\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Split the data into train and test sets with .33 and random_state = 42\n", "x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Helper code to plot the generated data \n", "\n", "# Plot the train data\n", "plt.rcParams[\"figure.figsize\"] = (10,8)\n", "\n", "plt.plot(x_train,y_train, '.', label='Train data', markersize=15, color='#FF9A98')\n", "\n", "# Plot the test data\n", "plt.plot(x_test,y_test, '.', label='Test data', markersize=15, color='#75B594')\n", "\n", "# Plot the true data\n", "plt.plot(x_b, y_b, '-', label='True function', linewidth=3, color='#5E5E5E')\n", "\n", "# Set the axes labels\n", "plt.xlabel('X')\n", "plt.ylabel('Y')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Begin with an unregularized NN. \n", "\n", "#### Same as the previous exercise" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Building an unregularized NN. 
\n", "# Initialise the NN, give it an appropriate name for the ease of reading\n", "# The FCNN has 5 layers, each with 100 nodes\n", "model_1 = models.Sequential(name='Unregularized')\n", "\n", "# Add 5 hidden layers with 100 neurons each\n", "model_1.add(layers.Dense(100, activation='tanh', input_shape=(1,)))\n", "model_1.add(layers.Dense(100, activation='relu'))\n", "model_1.add(layers.Dense(100, activation='relu'))\n", "model_1.add(layers.Dense(100, activation='relu'))\n", "model_1.add(layers.Dense(100, activation='relu'))\n", "\n", "# Add the output layer with one neuron \n", "model_1.add(layers.Dense(1, activation='linear'))\n", "\n", "# View the model summary\n", "model_1.summary()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Load with the weights already provided for the unregularized network\n", "\n", "model_1.load_weights('weights.h5')\n", "\n", "# Compile the model\n", "model_1.compile(loss='MSE',optimizer=optimizers.Adam(learning_rate=0.001)) " ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Use the model above to predict for x_b (used exclusively for plotting) \n", "y_pred = model_1.predict(x_b)\n", "\n", "# Use the model above to predict on the test data\n", "y_pred_test = model_1.predict(x_test)\n", "\n", "# Compute the MSE on the test data\n", "mse = mean_squared_error(y_test,y_pred_test)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "# Use the helper code to plot the predicted data\n", "plt.rcParams[\"figure.figsize\"] = (10,8)\n", "plt.plot(x_b, y_pred, label = 'Unregularized model', color='#5E5E5E', linewidth=3)\n", "plt.plot(x_train,y_train, '.', label='Train data', markersize=15, color='#FF9A98')\n", "plt.plot(x_test,y_test, '.', label='Test data', markersize=15, color='#75B594')\n", "plt.xlabel('X')\n", "plt.ylabel('Y')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Implement previous NN with early stopping \n", "For early stopping we build the same network but then we implement early stopping using callbacks. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Building an unregularized NN with early stopping. 
\n", "# Initialise the NN, give it an appropriate name for the ease of reading\n", "# The FCNN has 5 layers, each with 100 nodes\n", "model_2 = models.Sequential(name='EarlyStopping')\n", "\n", "# Add 5 hidden layers with 100 neurons each \n", "# tanh is the activation for the first layer\n", "# relu is the activation for all other layers\n", "model_2.add(layers.Dense(100, activation='tanh', input_shape=(1,)))\n", "model_2.add(layers.Dense(100, activation='relu'))\n", "model_2.add(layers.Dense(100, activation='relu'))\n", "model_2.add(layers.Dense(100, activation='relu'))\n", "model_2.add(layers.Dense(100, activation='relu'))\n", "\n", "# Add the output layer with one neuron \n", "model_2.add(layers.Dense(1, activation='linear'))\n", "\n", "# View the model summary\n", "model_2.summary()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Use the keras early stopping callback with patience=10 while monitoring the loss\n", "callback = ___\n", "\n", "# Compile the model with MSE as loss and Adam optimizer with learning rate as 0.001\n", "___\n", "\n", "# Save the history about the model after fitting on the train data\n", "# Use 0.2 validation split with 1500 epochs and batch size of 10\n", "# Use the callback for early stopping here\n", "history_2 = ___\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Helper function to plot the data\n", "# Plot the MSE of the model\n", "plt.rcParams[\"figure.figsize\"] = (10,8)\n", "plt.title(\"Early stop model\")\n", "plt.semilogy(history_2.history['loss'], label='Train Loss', color='#FF9A98', linewidth=2)\n", "plt.semilogy(history_2.history['val_loss'], label='Validation Loss', color='#75B594', linewidth=2)\n", "plt.legend()\n", "\n", "# Set the axes labels\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Log MSE Loss')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Use the early stop implemented model above to predict for x_b (used exclusively for plotting)\n", "y_early_stop_pred = ___\n", "\n", "# Use the model above to predict on the test data\n", "y_earl_stop_pred_test = ___\n", "\n", "# Compute the test MSE by predicting on the test data\n", "mse_es = ___" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Use the helper code to plot the predicted data\n", "\n", "# Plotting the predicted data using the L2 regularized model\n", "plt.rcParams[\"figure.figsize\"] = (10,8)\n", "plt.plot(x_b, y_early_stop_pred, label='Early stop regularized model', color='black', linewidth=2)\n", "\n", "# Plotting the predicted data using the unregularized model\n", "plt.plot(x_b, y_pred, label = 'Unregularized model', color='#005493', linewidth=2)\n", "\n", "# Plotting the training data\n", "plt.plot(x_train,y_train, '.', label='Train data', markersize=15, color='#FF9A98')\n", "\n", "# Plotting the testing data\n", "plt.plot(x_test,y_test, '.', label='Test data', markersize=15, color='#75B594')\n", "\n", "# Set the axes labels\n", "plt.xlabel('X')\n", "plt.ylabel('Y')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mindchow 🍲\n", "\n", "**After marking change the patience parameter once to 2 and once to 100 in the early stopping callback with the same data. Do you notice any change? 
{ "cell_type": "markdown", "metadata": {}, "source": [ " " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }