{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Title\n", "\n", "**Exercise: Early Stopping**\n", "\n", "# Description\n", "\n", "The goal of this exercise is to understand early stopping. Early stopping is a method of avoiding overfitting, not exactly regularizing. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "NOTE: This graph is only a sample." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Instructions:\n", "\n", "- Generate the predictor and response data using the helper code given.\n", "- Split the data into train and test sets.\n", "- Visualise the split data using the helper code.\n", "- Build a simple neural network with 5 hidden layers with 100 neurons each with the given pre-trained weights. This network has no regularization.\n", "- Compile the model with MSE as the loss.\n", "- Fit the model on the training data and save the history.\n", "- Use the helper code to visualise the MSE of the train and test data with respect to the epochs.\n", "- Predict on the entire data. \n", "- Use the helper function to plot the predictions along with the generated data.\n", "- Repeat steps 4 to 8 by building the same neural network with early stopping.\n", "- The last plot will consist of the predictions of both the neural networks. The graph will look similar to the one given above.\n", "\n", "\n", "# Hints:\n", "\n", "Use the Dense layer to regularize using l2 and l1 regularization. More details can be found here.\n", "\n", "tf.keras.sequential() : A sequential model is for a plain stack of layers where each layer has exactly one input tensor and one output tensor.\n", "\n", "tf.keras.optimizers() : An optimizer is one of the two arguments required for compiling a Keras model\n", "\n", "model.add() : Adds layers to the model.\n", "\n", "model.compile() : Compiles the layers defined into a neural network\n", "\n", "model.fit() : Fits the data to the neural network\n", "\n", "model.predict() : Used to predict the values given the model\n", "\n", "history() : The history object is returned from calls to the fit() function used to train the model. Metrics are stored in a dictionary in the history member of the object returned.\n", "\n", "tf.keras.regularizers.L2() : A regularizer that applies a L2 regularization penalty." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Import the necessary libraries\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")\n", "import tensorflow as tf\n", "np.random.seed(0)\n", "tf.random.set_seed(0)\n", "from tensorflow.keras import layers\n", "from tensorflow.keras import models\n", "from tensorflow.keras import optimizers\n", "from tensorflow.keras.models import load_model\n", "from tensorflow.keras import regularizers\n", "from sklearn.metrics import mean_squared_error\n", "from tensorflow.keras.models import load_model\n", "from sklearn.model_selection import train_test_split\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Use the helper code below to generate the data\n", "\n", "# Defines the number of data points to generate\n", "num_points = 30 \n", "\n", "# Generate predictor points (x) between 0 and 5\n", "x = np.linspace(0,5,num_points)\n", "\n", "# Generate the response variable (y) using the predictor points\n", "y = x * np.sin(x) + np.random.normal(loc=0, scale=1, size=num_points)\n", "\n", "# Generate data of the true function y = x*sin(x) \n", "# x_b will be used for all predictions below \n", "x_b = np.linspace(0,5,100)\n", "y_b = x_b*np.sin(x_b)\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Split the data into train and test sets with .33 and random_state = 42\n", "x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Helper code to plot the generated data \n", "\n", "# Plot the train data\n", "plt.rcParams[\"figure.figsize\"] = (10,8)\n", "\n", "plt.plot(x_train,y_train, '.', label='Train data', markersize=15, color='#FF9A98')\n", "\n", "# Plot the test data\n", "plt.plot(x_test,y_test, '.', label='Test data', markersize=15, color='#75B594')\n", "\n", "# Plot the true data\n", "plt.plot(x_b, y_b, '-', label='True function', linewidth=3, color='#5E5E5E')\n", "\n", "# Set the axes labels\n", "plt.xlabel('X')\n", "plt.ylabel('Y')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Begin with an unregularized NN. \n", "\n", "#### Same as the previous exercise" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Building an unregularized NN. 
\n", "# Initialise the NN, give it an appropriate name for the ease of reading\n", "# The FCNN has 5 layers, each with 100 nodes\n", "model_1 = models.Sequential(name='Unregularized')\n", "\n", "# Add 5 hidden layers with 100 neurons each\n", "model_1.add(layers.Dense(100, activation='tanh', input_shape=(1,)))\n", "model_1.add(layers.Dense(100, activation='relu'))\n", "model_1.add(layers.Dense(100, activation='relu'))\n", "model_1.add(layers.Dense(100, activation='relu'))\n", "model_1.add(layers.Dense(100, activation='relu'))\n", "\n", "# Add the output layer with one neuron \n", "model_1.add(layers.Dense(1, activation='linear'))\n", "\n", "# View the model summary\n", "model_1.summary()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Load with the weights already provided for the unregularized network\n", "\n", "model_1.load_weights('weights.h5')\n", "\n", "# Compile the model\n", "model_1.compile(loss='MSE',optimizer=optimizers.Adam(learning_rate=0.001)) " ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Use the model above to predict for x_b (used exclusively for plotting) \n", "y_pred = model_1.predict(x_b)\n", "\n", "# Use the model above to predict on the test data\n", "y_pred_test = model_1.predict(x_test)\n", "\n", "# Compute the MSE on the test data\n", "mse = mean_squared_error(y_test,y_pred_test)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "# Use the helper code to plot the predicted data\n", "plt.rcParams[\"figure.figsize\"] = (10,8)\n", "plt.plot(x_b, y_pred, label = 'Unregularized model', color='#5E5E5E', linewidth=3)\n", "plt.plot(x_train,y_train, '.', label='Train data', markersize=15, color='#FF9A98')\n", "plt.plot(x_test,y_test, '.', label='Test data', markersize=15, color='#75B594')\n", "plt.xlabel('X')\n", "plt.ylabel('Y')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Implement previous NN with early stopping \n", "For early stopping we build the same network but then we implement early stopping using callbacks. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Building an unregularized NN with early stopping. 
\n", "# Initialise the NN, give it an appropriate name for the ease of reading\n", "# The FCNN has 5 layers, each with 100 nodes\n", "model_2 = models.Sequential(name='EarlyStopping')\n", "\n", "# Add 5 hidden layers with 100 neurons each \n", "# tanh is the activation for the first layer\n", "# relu is the activation for all other layers\n", "model_2.add(layers.Dense(100, activation='tanh', input_shape=(1,)))\n", "model_2.add(layers.Dense(100, activation='relu'))\n", "model_2.add(layers.Dense(100, activation='relu'))\n", "model_2.add(layers.Dense(100, activation='relu'))\n", "model_2.add(layers.Dense(100, activation='relu'))\n", "\n", "# Add the output layer with one neuron \n", "model_2.add(layers.Dense(1, activation='linear'))\n", "\n", "# View the model summary\n", "model_2.summary()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Use the keras early stopping callback with patience=10 while monitoring the loss\n", "callback = ___\n", "\n", "# Compile the model with MSE as loss and Adam optimizer with learning rate as 0.001\n", "___\n", "\n", "# Save the history about the model after fitting on the train data\n", "# Use 0.2 validation split with 1500 epochs and batch size of 10\n", "# Use the callback for early stopping here\n", "history_2 = ___\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Helper function to plot the data\n", "# Plot the MSE of the model\n", "plt.rcParams[\"figure.figsize\"] = (10,8)\n", "plt.title(\"Early stop model\")\n", "plt.semilogy(history_2.history['loss'], label='Train Loss', color='#FF9A98', linewidth=2)\n", "plt.semilogy(history_2.history['val_loss'], label='Validation Loss', color='#75B594', linewidth=2)\n", "plt.legend()\n", "\n", "# Set the axes labels\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Log MSE Loss')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Use the early stop implemented model above to predict for x_b (used exclusively for plotting)\n", "y_early_stop_pred = ___\n", "\n", "# Use the model above to predict on the test data\n", "y_earl_stop_pred_test = ___\n", "\n", "# Compute the test MSE by predicting on the test data\n", "mse_es = ___" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Use the helper code to plot the predicted data\n", "\n", "# Plotting the predicted data using the L2 regularized model\n", "plt.rcParams[\"figure.figsize\"] = (10,8)\n", "plt.plot(x_b, y_early_stop_pred, label='Early stop regularized model', color='black', linewidth=2)\n", "\n", "# Plotting the predicted data using the unregularized model\n", "plt.plot(x_b, y_pred, label = 'Unregularized model', color='#005493', linewidth=2)\n", "\n", "# Plotting the training data\n", "plt.plot(x_train,y_train, '.', label='Train data', markersize=15, color='#FF9A98')\n", "\n", "# Plotting the testing data\n", "plt.plot(x_test,y_test, '.', label='Test data', markersize=15, color='#75B594')\n", "\n", "# Set the axes labels\n", "plt.xlabel('X')\n", "plt.ylabel('Y')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mindchow 🍲\n", "\n", "**After marking change the patience parameter once to 2 and once to 100 in the early stopping callback with the same data. Do you notice any change? 
{ "cell_type": "markdown", "metadata": {}, "source": [ " " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }