Key Word(s): Neural Networks, Perceptron, MLP
Instructions:
- Read the data file Heart.csv into a pandas DataFrame.
- Use the maximum heart rate as your predictor and the probability of a person having heart disease as your response variable.
- Plot the data, replacing the response column's values 'Yes' and 'No' with 1 and 0, respectively. The graph will look like the one given below.
- Construct a perceptron. This will require three functions:
- The first function should return an affine transformation of the data for a single neuron.
- The second function should compute the sigmoid activation.
- The third function should use the previous two to output predictions from our perceptron (a.k.a. a single neuron).
- After making predictions with the perceptron, we will plot our results.
- After making predictions using the perceptron we will plot our results.
Hints:
- Remember you will need to tune the perceptron's parameters by hand. The following selections from the lecture may be helpful.
(Note: $\beta_0$ and $\beta_1$ in the slides are referred to as b and w in the code)
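For reference, the single neuron computes an affine transformation followed by a sigmoid: $h = \sigma(wx + b) = \dfrac{1}{1 + e^{-(wx + b)}}$, or $\sigma(\beta_1 x + \beta_0)$ in the slides' notation.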
plt.scatter() : A scatter plot of y vs. x with varying marker size and/or colour.
np.exp() : Calculates the exponential of all elements in the input array.
plt.xlabel() : This is used to specify the text to be displayed as the label for the x-axis.
plt.ylabel() : This is used to specify the text to be displayed as the label for the y-axis.
Note: This exercise is auto-graded and you can try multiple attempts.
#Import the libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
#Read the dataset and take a quick look
heart_data = pd.read_csv('data/Heart.csv', index_col=0)
heart_data.head()
#Assign the predictor and response variables
x = heart_data.___.___
#Remember to replace the string column values with 0 and 1
y = heart_data.___.___.___
#Plot the predictor and response variables as a scatter plot with axis labels
plt.scatter(___)
plt.xlabel(___)
plt.ylabel(___)
plt.legend(loc='best')
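If you get stuck, here is one possible fill-in. It is a sketch, not the only accepted answer: it assumes the predictor column is named MaxHR and the response column is named AHD, matching the axis labels used in the final plot of this notebook.
# Sketch of a possible fill-in; assumes columns named 'MaxHR' and 'AHD'
x = heart_data.MaxHR.values
y = heart_data.AHD.replace({'No': 0, 'Yes': 1}).values
plt.scatter(x, y, alpha=0.2, label='Heart Data')
plt.xlabel('MaxHR')
plt.ylabel('Heart Disease (AHD)')
plt.legend(loc='best')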
Construct the components of our perceptron.
### edTest(test_affine) ###
def affine(x, w, b):
    """Return affine transformation of x

    INPUTS
    ======
    x: A numpy array of points in x
    w: A float representing the weight of the perceptron
    b: A float representing the bias of the perceptron

    RETURN
    ======
    z: A numpy array of points after the affine transformation
    """
    # your code here
    z = ___
    return z
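For reference, the affine transformation of a single neuron is just $z = wx + b$, so the body of affine reduces to one line (a sketch, assuming scalar w and b broadcast over the array x):
# Sketch: scale the inputs by the weight and shift by the bias
z = w * x + b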
### edTest(test_sigmoid) ###
def sigmoid(z):
    """Return the sigmoid activation applied elementwise to z."""
    # hint: numpy has an exponentiation function, np.exp()
    # your code here
    h = ___
    return h
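The sigmoid itself is $\sigma(z) = \frac{1}{1 + e^{-z}}$, which with NumPy broadcasting is also a one-liner:
# Sketch: map any real-valued z to a probability in (0, 1)
h = 1 / (1 + np.exp(-z))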
### edTest(test_neuron_predict) ###
def neuron_predict(x, w, b):
    """Return the neuron's predictions by chaining the previous functions."""
    # Call the previous functions
    # your code here
    h = ___
    return h
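The predict step simply chains the two functions above:
# Sketch: affine transformation followed by sigmoid activation
h = sigmoid(affine(x, w, b))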
Manually set the weight and bias parameters.
Recall from lecture that the weight changes the slope of the sigmoid and the bias shifts the function to the left or right.
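Concretely, the neuron outputs $h = 0.5$ exactly where its affine input is zero, i.e. where $wx + b = 0$, so the decision boundary sits at $x = -b/w$. A negative weight makes the predicted probability decrease as MaxHR increases.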
# Hint: try values between -1 and 1
w = ___
# Hint: try values between 50 and 100
b = ___
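For example, one pair of values consistent with the hints above (a sketch; you will still want to tune by hand):
# Sketch: w = -0.5 and b = 72 satisfy the hinted ranges and place the
# decision boundary at x = -b/w = 144 beats per minute
w = -0.5
b = 72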
Use the perceptron to make predictions and plot our results.
# The forward mode or predict of a single neuron
# Create evenly spaced values of x to predict on
x_linspace = np.linspace(x.min(), x.max(), 500)
h = neuron_predict(x_linspace, w, b)
# Plot Predictions
fig, ax = plt.subplots(1,1, figsize=(11,7))
ax.scatter(x, y, label=r'Heart Data', alpha=0.2)
ax.plot(x_linspace, h, lw=2, c='orange', label=r'Single Neuron')
# Find the first value in x_linspace with a predicted probability < 0.5
# (this assumes a negative weight, so h decreases as x increases)
db = x_linspace[np.argmax(h < 0.5)]
ax.axvline(x=db, alpha=0.3, linestyle='-.', c='r', label='Decision Boundary')
# Proper plot labels are very important!
# Make the tick labels big enough to read
ax.tick_params(labelsize=16)
plt.xlabel('MaxHR', fontsize=16)
plt.ylabel('Heart Disease (AHD)', fontsize=16)
# Create a legend and make it big enough to read
ax.legend(fontsize=16, loc='best')
plt.show()
One way to assess our perceptron model's performance is to look at the binary cross entropy loss.
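For $n$ observations with true labels $y_i \in \{0, 1\}$ and predicted probabilities $\hat{y}_i$, the (summed) binary cross entropy is $\mathcal{L} = -\sum_{i=1}^{n} \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right]$. The function below implements it, clipping predictions away from 0 and 1 so the logarithms stay finite.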
def loss(y_true, y_pred, eps=1e-15):
    assert y_true.shape[0] == y_pred.shape[0]
    # Clip predictions away from 0 and 1 so the logs below stay finite
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.sum(y_true*np.log(y_pred) + (1-y_true)*np.log(1-y_pred))
## Print the loss
h = neuron_predict(x, w, b)
print(loss(y, h))
To ensure our perceptron model is not trivial, we need to compare its accuracy to a baseline that always predicts the majority class (i.e., no heart disease). Play with your weight and bias above and rerun the notebook until you can outperform the baseline.
def accuracy(y_true, y_pred):
    assert y_true.shape[0] == y_pred.shape[0]
    # Threshold the predicted probabilities at 0.5 and compare to the labels
    return sum(y_true == (y_pred >= 0.5).astype(int))/len(y_true)
### edTest(test_performance) ###
# The baseline predicts the majority class (0, i.e. no heart disease) for everyone
baseline_acc = accuracy(y, np.zeros(len(y)))
perceptron_acc = accuracy(y, h)
print(f'Baseline Accuracy: {baseline_acc:.2%}')
print(f'Perceptron Accuracy: {perceptron_acc:.2%}')