Pair Programming Exercises
Week 4 (September 24th - September 29th)
Preamble
The goal today is to build some components of a simple, fully connected neural network. For those unfamiliar with neural networks, you can think of them as functions with many fitting parameters. You want this function to represent some data. In its basic form, you pass your data through the neural network and get a prediction. You then compare this prediction against the true data with a loss function, which tells you how well your function approximates the data. Finally, you update the parameters of the neural network so that its prediction results in a smaller loss.
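For example, one common choice of loss function is the mean squared error. This particular loss is just an illustration; the exercise does not require a specific one:

import numpy as np

def mse_loss(prediction, target):
    # Average of the squared differences between the
    # network's predictions and the true data
    return np.mean((prediction - target) ** 2)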
The figure on the PP4 page illustrates some of the key components behind a basic neural network.
Your task will be to create a closure that represents a layer of the neural network (the orange, shaded, dashed rectangle in the figure). The user will instantiate a layer by passing in the size of the layer as well as the associated activation function. Later on, they can use this layer by calling the instantiated object. When they use the layer, they pass in the inputs to the layer as well as the weights and biases for each unit in the layer. The result of calling an instantiated layer object is an activated output of the appropriate size. The exact API is specified in the exercise statement.
In today's exercise, you will create a layer object using closures.
Exercise
Create a closure for a neural network layer with the following API:
- The outer function should return a layer object (the inner function). It should take in two arguments:
  - `shape`: A list of two numbers, where the first number is the number of inputs to the layer and the second number is the number of units in the layer.
  - `actv`: An activation function (remember, functions are first class in Python!).
- The inner function should return the layer outputs. Remember, the layer outputs are the outputs of each unit in the layer. This function should take in three arguments:
  - `inputs`: The inputs to the layer. This should be a `numpy` array.
  - `weights`: The weights for this layer. This should be a matrix of size `shape`.
  - `bias`: The bias for each unit in this layer. This should be a vector of size `shape[1]` (the number of units in the layer).
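A minimal sketch of one possible implementation is shown below. It assumes `weights` has size `shape` so that the matrix product `inputs @ weights` is defined; your solution may organize things differently:

import numpy as np

def layer(shape, actv):
    # Outer function: captures the layer's shape and activation function
    def inner(inputs, weights, bias):
        # Sanity check: the weights must match the instantiated shape
        assert weights.shape == tuple(shape)
        # inputs: (n_samples, shape[0]), weights: (shape[0], shape[1]),
        # bias: (shape[1],) -> activated output: (n_samples, shape[1])
        return actv(inputs @ weights + bias)
    return inner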
Here's how a user is expected to interact with this object.
import numpy as np

t = np.random.uniform(0.0, 1.0, 100).reshape(1,-1) # input to the network
layer1 = layer(shape1, np.tanh) # Define layer 1
layer2 = layer(shape2, np.tanh) # Define layer 2
# Initialize weights and biases
w1 = ...
w2 = ...
b1 = ...
b2 = ...
# Run through the network
h1 = layer1(t, w1, b1) # First layer
h2 = layer2(h1, w2, b2) # Last layer
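The exercise leaves the initialization up to you. As a purely hypothetical illustration (the sizes `shape1` and `shape2` below are made up for this snippet), small random weights and zero biases are a common starting point:

np.random.seed(0)                          # reproducibility
shape1 = [100, 10]                         # hypothetical: 100 inputs, 10 units
shape2 = [10, 1]                           # hypothetical: 10 inputs, 1 unit
w1 = np.random.normal(0.0, 0.1, shape1)    # small random weights
b1 = np.zeros(shape1[1])                   # one bias per unit
w2 = np.random.normal(0.0, 0.1, shape2)
b2 = np.zeros(shape2[1])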
This is, of course, a drastically simplified example. Here's how it relates to the real world:
- The input could be an image. The first thing we do is reshape that image into a vector.
- Now run that image through the network like we did above. This process can be automated beyond what is shown in the example.
- The last layer gives a prediction. In our little example it was just a scalar. In real life, it may be a vector representing classes. For example, in the MNIST dataset the output layer would have 10 units for the 10 possible digits.
- This prediction is passed to a loss function and compared with actual labeled data. Notice we're not thinking about the loss function here.
- Take the derivative of the loss function with respect to the weights and biases and use it to choose new values. Again, we're not doing this here. But this derivative is the motivation behind the entire course project!
- Run through the network again with these new values.
- Repeat this process until the loss function is as small as possible (a minimal sketch of this loop follows the list).
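To make the loop concrete, here is a minimal, self-contained sketch on a tiny one-unit network. The finite-difference derivatives are a crude stand-in for the automatic differentiation you will build in the course project, and all of the specific numbers (data, learning rate, step count) are made up for illustration:

import numpy as np

def layer(shape, actv):                    # same closure sketched above
    def inner(inputs, weights, bias):
        return actv(inputs @ weights + bias)
    return inner

def mse_loss(prediction, target):
    return np.mean((prediction - target) ** 2)

net = layer([1, 1], np.tanh)               # one input, one unit
t = np.linspace(0.0, 1.0, 100).reshape(-1, 1)  # 100 samples of a scalar input
target = np.sin(t)                         # made-up data to fit
w, b = np.array([[0.5]]), np.zeros(1)      # initial parameters
lr, eps = 0.5, 1e-6                        # learning rate, difference step

for step in range(500):
    loss = mse_loss(net(t, w, b), target)
    # Forward differences approximate dLoss/dw and dLoss/db
    dw = (mse_loss(net(t, w + eps, b), target) - loss) / eps
    db = (mse_loss(net(t, w, b + eps), target) - loss) / eps
    w, b = w - lr * dw, b - lr * db        # update and repeat

print(mse_loss(net(t, w, b), target))      # loss should now be much smaller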
The special thing about our closure is that we only had to define the basic characteristics of each layer once. This is called instantiating the layer. In our little example, the basic characteristics were the size of the layer and its activation function. The only things we need to pass to the layer during training of the network are the inputs to the layer and the updated weights and biases.
Deliverables
exercise_1.py
- Contains the closure definition
- A demonstration using the closure