{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Title\n", "\n", "**Exercise A.1 - Build a Single Neuron by Hand**\n", "\n", "# Description\n", "\n", "The goal of this exercise is to predict the probability of a person having heart disease given their maximum heart rate using a **single neuron**." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Instructions:\n", "- Read the data file Heart.csv as a pandas dataframe.\n", "- Use the maximum heart rate as your predictor and the probability of a person having heart disease as your response variable.\n", "- Plot the data after replacing the response column's values 'Yes' and 'No' with 1 and 0, respectively. The graph will look like the one given below.\n", "- Construct a perceptron. This will need 3 functions:\n", " - The first function should return an **affine** transformation of the data for a single neuron. \n", " - The second function should return the sigmoid **activation** function. \n", " - We'll use the previous two functions to create a **predict** function to output predictions from our perceptron (aka neuron).\n", "- After making predictions using the perceptron we will plot our results." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Hints:\n", "- Remember you will need to tune the perceptron's parameters by hand. The following selections from the lecture may be helpful. \n", "\n", "(Note: $\\beta_0$ and $\\beta_1$ in the slides are referred to as b and w in the code)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "plt.scatter() : A scatter plot of y vs. 
x with varying marker size and/or colour.\n", "\n", "np.exp() : Calculates the exponential of all elements in the input array.\n", "\n", "plt.xlabel() : Specifies the text to be displayed as the label for the x-axis.\n", "\n", "plt.ylabel() : Specifies the text to be displayed as the label for the y-axis.\n", "\n", "**Note: This exercise is auto-graded and you can try multiple attempts.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Import the libraries\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Read the dataset and take a quick look\n", "heart_data = pd.read_csv('data/Heart.csv', index_col=0)\n", "heart_data.head()\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Assign the predictor and response variables\n", "x = heart_data.___.___\n", "\n", "#Remember to replace the string column values with 0 and 1\n", "y = heart_data.___.___.___\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Plot the predictor and response variables as a scatter plot with axis labels\n", "#Hint: give plt.scatter() a label so plt.legend() has an entry to display\n", "plt.scatter(___)\n", "plt.xlabel(___)\n", "plt.ylabel(___)\n", "plt.legend(loc='best')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Construct the components of our perceptron."
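, "\n", "\n", "In the lecture's notation (with $\\beta_0 = b$ and $\\beta_1 = w$), the three pieces compose as\n", "\n", "$$z = wx + b, \\qquad \\sigma(z) = \\frac{1}{1 + e^{-z}}, \\qquad h = \\sigma(wx + b).$$"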
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "### edTest(test_affine) ###\n", "def affine(x, w, b):\n", " \"\"\"Return affine transformation of x\n", " \n", " INPUTS\n", " ======\n", " x: A numpy array of points in x\n", " w: A float representing the weight of the perceptron\n", " b: A float representing the bias of the perceptron\n", " \n", " RETURN\n", " ======\n", " z: A numpy array of points after the affine transformation\n", " \"\"\"\n", " \n", " # your code here\n", " z = ___\n", " return z" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "### edTest(test_sigmoid) ###\n", "def sigmoid(z):\n", " # hint: numpy has an exponentiation function, np.exp()\n", " \n", " # your code here\n", " h = ___\n", " return h" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "### edTest(test_neuron_predict) ###\n", "def neuron_predict(x, w, b):\n", " #Call the previous functions\n", " \n", " # your code here\n", " h = ___\n", " return h" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Manually set the weight and bias parameters. \n", "\n", "Recall from lecture that the weight changes the slope of the sigmoid and the bias shifts the function to the left or right." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Hint: try values between -1 and 1\n", "w = ___ \n", "\n", "# Hint: try values between 50 and 100\n", "b = ___ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Use the perceptron to make predictions and plot our results." 
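, "\n", "\n", "As a sanity check for your hand tuning: the sigmoid crosses $h = 0.5$ exactly where the affine part is zero, $wx + b = 0$, i.e. at $x = -b/w$. With a negative weight and a positive bias this places the decision boundary at a plausible maximum heart rate (for example, $w = -0.5$ and $b = 75$ would give $x = 150$)."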
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The forward pass (predict) of a single neuron\n", "\n", "# Create evenly spaced values of x to predict on\n", "x_linspace = np.linspace(x.min(), x.max(), 500)\n", "h = neuron_predict(x_linspace, w, b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot Predictions\n", "fig, ax = plt.subplots(1, 1, figsize=(11, 7))\n", "ax.scatter(x, y, label='Heart Data', alpha=0.2)\n", "ax.plot(x_linspace, h, lw=2, c='orange', label='Single Neuron')\n", "\n", "# first value in x_linspace with a predicted probability < 0.5\n", "db = x_linspace[np.argmax(h < 0.5)]\n", "ax.axvline(x=db, alpha=0.3, linestyle='-.', c='r', label='Decision Boundary')\n", "\n", "# Proper plot labels are very important!\n", "\n", "# Make the tick labels big enough to read\n", "ax.tick_params(labelsize=16)\n", "plt.xlabel('MaxHR', fontsize=16)\n", "plt.ylabel('Heart Disease (AHD)', fontsize=16)\n", "\n", "# Create a legend and make it big enough to read\n", "ax.legend(fontsize=16, loc='best')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One way to assess our perceptron model's performance is to look at the binary cross-entropy loss."
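, "\n", "\n", "Concretely, for labels $y_i \\in \\{0, 1\\}$ and predicted probabilities $h_i$, the loss computed below is\n", "\n", "$$\\mathcal{L} = -\\sum_i \\left[ y_i \\log h_i + (1 - y_i) \\log(1 - h_i) \\right],$$\n", "\n", "with the $h_i$ clipped slightly away from 0 and 1 so the logarithms stay finite."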
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def loss(y_true, y_pred, eps=1e-15):\n", " assert y_true.shape[0] == y_pred.shape[0]\n", " \n", " # Clip predictions away from 0 and 1 to avoid log(0)\n", " y_pred = np.clip(y_pred, eps, 1 - eps)\n", " return -np.sum(y_true*np.log(y_pred) + (1-y_true)*np.log(1-y_pred))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Print the loss\n", "h = neuron_predict(x, w, b)\n", "print(loss(y, h))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To ensure our perceptron model is not trivial, we need to compare its accuracy to a naive baseline that predicts the same class for every person. Play with your weight and bias above and rerun the notebook until you can outperform the baseline." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def accuracy(y_true, y_pred):\n", " assert y_true.shape[0] == y_pred.shape[0]\n", " return np.sum(y_true == (y_pred >= 0.5).astype(int))/len(y_true)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "### edTest(test_performance) ###\n", "\n", "# For the baseline, predictions are all ones (always predict heart disease)\n", "baseline_acc = accuracy(y, np.ones(len(y)))\n", "perceptron_acc = accuracy(y, h)\n", "print(f'Baseline Accuracy: {baseline_acc:.2%}')\n", "print(f'Perceptron Accuracy: {perceptron_acc:.2%}')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }