{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Title\n", "\n", "**Exercise: Back-propagation by hand**\n", "\n", "# Description\n", "\n", "The aim of this exercise is to perform back-propagation to update the weights of a simple neural network (shown below):" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Instructions:\n", "- Get the predictor and response variables from the file backprop.csv and assign them to variables `x` and `y`.\n", "- Build a forward pass of the above neural network with one hidden layer. You will build this neural network using **numpy** (no deep learning package allowed).\n", "- Initialize the weights randomly with the random seed as 310, and make a prediction.\n", "- Plot your neural net predictions with the true value." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Compute the mean_squared_error of your predictions with the actual values.\n", "- Find the derivative of the loss function with respect to $w_1$.\n", "- Find the derivative of the loss function with respect to $w_2$.\n", "- Use the derivatives to update $w_1$ and $w_2$ .\n", "- Use the updated weights to make a forward pass and compute new predictions.\n", "- Plot the new predictions with the actual data. This will look similar to the one given above.\n", "- Calculate your $MSE$ and compare with the earlier value.\n", "\n", "\n", "\n", "# Hints:\n", "\n", "## Loss function: \n", "\n", "$$L\\ =\\ \\frac{1}{n}\\sum_1^n\\left(y_{pred}-y_{true}\\right)^2$$\n", "\n", "## Activation function: \n", "\n", "$$f\\left(x\\right)=\\sin x$$\n", "\n", "ax.plot() : A scatter plot of y vs. x with varying marker size and/or colour.\n", "\n", "np.exp() : Calculates the exponential of all elements in the input array.\n", "\n", "plt.xlabel() : This is used to specify the text to be displayed as the label for the x-axis.\n", "\n", "plt.ylabel() : This is used to specify the text to be displayed as the label for the y-axis.\n", "\n", "**Note: This exercise is auto-graded and you can try multiple attempts.**" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# import required libraries\n", "\n", "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from sklearn.metrics import mean_squared_error\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# get the data from the file `backprop.csv`\n", "\n", "df = pd.read_csv('backprop.csv')\n", "\n", "# The input needs to be a 2D array so since we have a single\n", "# column (1000,) we need to reshape to a 2D array (1000,1)\n", "\n", "x = df.x.values.reshape(-1,1)\n", "\n", "y = df.y.values" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Designing the simple neural network \n", "\n", "def neural_network(W, x):\n", " # W is a list of the two weights (w1,w2) of your neural network\n", " # x is the input to the neural network\n", " '''\n", " Compute a1, a2, and y\n", " a1 is the dot product of the input and weight\n", " To compute a2, first use the activation function on a1, then multiply by w2\n", " Finally, use the activation function on a2 to compute y\n", " Return all three values which you will use to compute derivatives later\n", " '''\n", " a1 = np.dot(___, ___)\n", " fa1 = np.sin(__)\n", " a2 = np.dot(___,___)\n", " y = np.sin(__)\n", " \n", " return a1,a2,y" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Initialize the weights, but keep the random seed as 310 for reproducable results\n", "\n", "np.random.seed(310)\n", "W = [np.random.randn(1, 1), np.random.randn(1, 1)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot the predictor and response variables \n", "\n", "fig,ax = plt.subplots(1,1,figsize=(8,6))\n", "\n", "\n", "# plot the true x and y values\n", "ax.plot(x,y,label = 'True function',color='darkblue',linewidth=2)\n", "\n", "# plot the x values with the network predictions\n", "\n", "ax.plot(x,neural_network(W,x)[2],label = 'Neural net predictions',color='#9FC131FF',linewidth=2)\n", "\n", "# Set the x and y labels\n", "ax.set_xlabel('$x$',fontsize=14)\n", "ax.set_ylabel('$y$',fontsize=14)\n", "ax.legend(fontsize=14);" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "### edTest(test_nn_mse) ###\n", "\n", "# You can use the mean_squared_error function to find the MSE of your predictions with true function values\n", "y_pred = ___\n", "mse = mean_squared_error(y, y_pred)\n", "print(f'The MSE of the neural network predictions wrt true function is {mse:.2f}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Single update" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "ename": "NameError", "evalue": "name 'a1' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;31m# Ge the predicted response, and the two activations of the network\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0ma1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0my_pred\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mneural_network\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mW\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 6\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 7\u001b[0m \u001b[0;31m# Compute the gradient of the loss function with respect to weight 2\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m\u001b[0m in \u001b[0;36mneural_network\u001b[0;34m(W, x)\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0m___\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 14\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 15\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0ma1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mNameError\u001b[0m: name 'a1' is not defined" ] } ], "source": [ "# Here we will update the weights only once\n", "\n", "# Get the predicted response, and the two a's of the network\n", "\n", "a1,a2,y_pred = neural_network(W,x)\n", "\n", "# Compute the gradient of the loss function with respect to weight 2\n", "# Use pen and paper to calculate these derivatives before coding them\n", "\n", "dldw2 = ___\n", "\n", "# Now compute the gradient of the loss function with respect to weight 1\n", "\n", "dldw1 = ___\n", "\n", "# combine the two in a list\n", "dldw = [np.mean(dldw1),np.mean(dldw2)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# In the update step, make sure to update the weights with their gradients\n", "\n", "Wnew = [i - j for i,j in zip(W,dldw)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot the predictor and response variables \n", "\n", "fig,ax = plt.subplots(1,1,figsize=(8,6))\n", "ax.plot(x,y,label = 'True function',color='darkblue',linewidth=2)\n", "ax.plot(x,neural_network(Wnew,x)[2],label = 'Neural net predictions',color='#9FC131FF',linewidth=2)\n", "ax.set_xlabel('$x$',fontsize=14)\n", "ax.set_ylabel('$y$',fontsize=14)\n", "ax.legend(fontsize=14);" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "### edTest(test_one_update_mse) ###\n", "# Compute the new MSE after one update and print it\n", "y_pred = ___\n", "mse_update = ___\n", "print(f'The MSE of the new neural network predictions with true function is {mse_update:.2f} as compared to {mse:.2f} from before ')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Several updates\n", "\n", "In principle, only a single update will never be sufficient to improve model predictions.\n", "In the below segment, use the method from above, and update the weight 300 times before plotting predictions.\n", "\n", "Does your MSE decrease?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Reinitialize the weights to start again \n", "np.random.seed(310)\n", "W = [np.random.randn(1, 1), np.random.randn(1, 1)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Unlike the previous step, this time we will set a learning rate of 0.01 to avoid drastic updates and run the above code for 10000 loops\n", "\n", "lmb = 0.01\n", "for i in range(300):\n", " a1,a2,y_pred = ___\n", "\n", " # Remember to use np.mean\n", " dldw2 = ___\n", " dldw1 = ___\n", " \n", " W[0] = W[0] - lmb * dldw1\n", " W[1] = W[1] - lmb * dldw2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot your results and calculate the MSE \n", "\n", "# Plot the predictor and response variables \n", "fig,ax = plt.subplots(1,1,figsize=(8,6))\n", "ax.plot(x,y,label = 'True function',color='darkblue',linewidth=2)\n", "ax.plot(x,neural_network(W,x)[2],label = 'Neural net predictions',color='#9FC131FF',linewidth=2)\n", "ax.set_xlabel('$x$',fontsize=14)\n", "ax.set_ylabel('$y$',fontsize=14)\n", "ax.legend(fontsize=14);" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "### edTest(test_mse) ###\n", "# We again compute the MSE and compare it with the original predictions\n", "y_pred = ___\n", "mse_final = mean_squared_error(y,y_pred)\n", "print(f'The final MSE is {mse_final:.2f} as compared to {mse:.2f} from before ')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Mindchow 🍲\n", "\n", "If you notice, your predicted values are off by approximately 0.5, from the actual values.\n", "After marking, go back to your neural network and add a bias correction to your predictions of 0.5.\n", "i.e `y = np.sin(a2) + 0.5` and rerun your code.\n", "\n", "Does your code fit better? And does your $MSE$ reduce?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }