Homework 4¶
Due Date: Thursday, October 24th at 11:59 PM¶
Problem 0. Homework Workflow [10 pts]
Problem 1. Motivating Automatic Differentiation [20 pts]
Problem 2. A Neural Network, Forward Mode [25 pts]
Problem 3. Visualizing Reverse Mode [10 pts]
Problem 4. A Toy AD Implementation [20 pts]
Problem 5. Continuous Integration and Coverage [15 pts]
IMPORTANT¶
Don't forget to work on Milestone 1: Milestone 1 Page.
Problem 0: Homework Workflow¶
Once you receive HW3 feedback (no later than Friday Oct. 18th), you will need to merge your `HW3-dev` branch into `master`.
You will earn points for following all stages of the git workflow, which involves:
- 3pts for merging `HW3-dev` into `master`
- 5pts for completing HW4 on `HW4-dev`
- 2pts for making a PR on `HW4-dev` to merge into `master`
Problem 1: Motivating Automatic Differentiation¶
For scalar functions of a single variable, the derivative is defined by $$ f'(x) = \lim_{h \rightarrow 0} \frac{f(x+h)-f(x)}{h}$$
We can approximate the derivative of a function using finite, but small, values of `h`. All code for this problem should be contained in `P1.py`.
Part A: Write a Numerical Differentiation Closure¶
Write a closure called `numerical_diff` which takes as inputs a function (of a single variable) `f` and a value of `h`, and returns a function which takes as input a value of `x` and computes the numerical approximation of the derivative of `f` with stepsize `h` at `x`.
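As a rough illustration of the closure pattern described above, here is a minimal sketch that applies the forward-difference quotient from the definition of the derivative. The inner function name `diff_at` and the test function `square` are illustrative choices, not part of the assignment.

```python
def numerical_diff(f, h):
    """Return a function that approximates f'(x) with step size h."""
    def diff_at(x):
        # Forward difference: (f(x + h) - f(x)) / h.
        # `f` and `h` are captured from the enclosing scope.
        return (f(x + h) - f(x)) / h
    return diff_at


def square(x):
    # Illustrative function whose exact derivative is 2 * x.
    return x * x


d_square = numerical_diff(square, 1e-6)
approx = d_square(3.0)  # should be close to the exact derivative, 6.0
print(approx)
```

The returned function remembers `f` and `h`, so the same closure can be evaluated at many values of `x` without passing them again.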
Part B: Compare the Closure to the True Derivative¶
Let $f(x) = \ln(x)$. For $0.2 \leq x \leq 0.4$, make a plot comparing the numerically estimated derivative for `h=1e-1`, `h=1e-7`, and `h=1e-15` to the analytic derivative. Save your plot as `P1_fig.png`.
Notes:
- You should use the analytic derivative explicitly.
- Your plot should be readable and interpretable (i.e., you should include axis labels and a legend). You may need to change the line style for all lines to be visible.
Part C: Why Automatic Differentiation?¶
Answer the following questions using two `print` statements to display your answers. These can be placed at the very end of `P1.py` (but before `plt.show()`). Each print statement should start with the string `"Answer to Q-i:"`, where `i` is either `a` or `b` to reference the questions below.
- Q-a: Which value of `h` most closely approximates the true derivative? What happens for values of `h` that are too small? What happens for values of `h` that are too large?
- Q-b: How does automatic differentiation address these problems?
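One ingredient worth keeping in mind when thinking about these questions is how double-precision floating point treats very small increments. The sketch below is only a hint, not an answer: the variable names are illustrative, and it assumes standard IEEE 754 doubles, which is what Python floats use on common platforms.

```python
import sys

# Machine epsilon: the gap between 1.0 and the next representable double.
eps = sys.float_info.epsilon

# An increment well below eps is lost entirely when added to 1.0.
tiny = 1e-16
absorbed = (1.0 + tiny) == 1.0

# An increment slightly above eps survives, but only approximately:
# adding it and subtracting 1.0 again does not return the same number.
small = 1e-15
recovered = (1.0 + small) - 1.0

print(eps, absorbed, recovered)
```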
Deliverables¶
P1.py
P1_fig.png
Problem 2: A Neural Network, Forward Mode¶
Artificial neural networks take as input the values of an input layer of neurons and combine these inputs in a series of layers to compute an output. A small network with a single hidden layer is drawn below.
This network can be expressed in matrix notation as $$f\left(x,y\right) = w_{\text{out}}^{T}z\left(W\begin{bmatrix}x \\ y \end{bmatrix} + \begin{bmatrix}b_{1} \\ b_{2}\end{bmatrix}\right) + b_{\text{out}}$$ where $$W = \begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22}\end{bmatrix}$$ is a (real) matrix of weights, $$w_{\text{out}} = \begin{bmatrix} w_{\text{out},1} \\ w_{\text{out},2} \end{bmatrix}$$ is a vector representing output weights, $b_i$ are bias terms, and $z$ is a nonlinear function that acts component-wise.
The above graph helps us visualize the computation in different layers. This visualization hides many of the underlying operations which occur in the computation of $f$ (e.g. it does not explicitly express the elementary operations).
Your Tasks¶
In this part, you will completely neglect the biases. The mathematical form is therefore $$f\left(x,y\right) = w_{\text{out}}^{T}z\left(W\begin{bmatrix}x \\ y \end{bmatrix}\right).$$ Note that in practical applications the biases play a key role. However, we have elected to neglect them in this problem so that your results are more readable. You will complete the two steps below while neglecting the bias terms.
- As we have done in lecture, draw the complete forward graph. You may treat $z$ as a single elementary operation. Submit a picture of the graph as `P2_graph.png`. This can be a photo of a graph drawn on a piece of paper or something that you draw electronically.
- Use your graph to write out the full forward mode table and submit a picture of the table as `P2_table.png`. This table can, once again, be done on a piece of paper or in markdown.
Submission Notes:¶
- For the graph, you should explicitly show the multiplications and additions that are masked in the schematic of the network above.
- You should relabel the nodes of the graph with traces (e.g. $x_{13}$) as we have done in class.
- Your table should include columns for the trace, elementary function, current function value, elementary function derivative, partial x derivative, and partial y derivative. Here is an example table with a row filled in (Note: Your table will not contain this exact row. Your table will have something else for $x_{4}$.)
Trace | Elementary Function | Current Value | Elementary Function Derivative | $\nabla_{x}$ Value | $\nabla_{y}$ Value |
---|---|---|---|---|---|
$x_{4}$ | $1/x_{1}$ | 1 | $-\dot{x}_{1}/x_{1}^{2}$ | $-1$ | $0$ |
- The values in your table should be in terms of $w_{\text{out},1}, w_{\text{out},2}, w_{11}, w_{12}, w_{21}, w_{22}, z,$ and $z^{\prime}$, where $\prime$ denotes a derivative.
- Pictures of handwritten graphs and tables are fine but make sure that these are legible.
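To make the columns of the example row concrete, the following sketch evaluates that single row numerically. It assumes $x_{1} = x = 1$ (consistent with the "Current Value" of 1 shown in the example) and the usual seed derivatives for $x$; the variable names are illustrative, and this is not a row from the neural network table itself.

```python
# Input trace x1 = x, evaluated at the assumed point x = 1.
x1 = 1.0
# Seed derivatives of x1 with respect to (x, y).
x1_dot = (1.0, 0.0)

# Elementary function for the example row: x4 = 1 / x1.
x4 = 1.0 / x1

# Elementary function derivative: x4_dot = -x1_dot / x1**2,
# applied once per seed direction to fill the two gradient columns.
x4_dot = tuple(-d / x1**2 for d in x1_dot)

print(x4, x4_dot)
```

Each row of a forward mode table follows the same pattern: evaluate the elementary function, then apply its derivative rule to the derivative values already computed for its inputs.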
Deliverables¶
P2_graph.png
P2_table.png
Problem 3: Visualizing Reverse Mode¶
This problem provides you with a tool to help visualize automatic differentiation. This automatic differentiation code base and GUI originated as the fall 2018 CS207 final project extension for Lindsey Brown, Xinyue Wang, and Kevin Yoon, and it has been further developed over the past year. It is meant to be a resource for you as you learn automatic differentiation.
Part A: Preview of Virtual Environments¶
In lecture, we will be learning about how to use virtual environments to create workspaces that use only certain packages. For this problem, we will walk through how to set up a virtual environment using `conda` to create a workspace containing a GUI that helps visualize the automatic differentiation functions.
- Set up a virtual environment. Use the command `conda create -n env_name python=3.6 anaconda`, where `env_name` is a name of your choosing for your virtual environment.
- Activate your virtual environment using `source activate env_name`.
  - Note: Some systems might give an error here and ask you to execute a different sequence of commands. If this happens, please follow the commands suggested in your terminal.
- Install the visualization tools for automatic differentiation: `pip install ADvis`.
- Get the homework files using `git clone https://github.com/CS207-AD20/CS207-2019-HW4`.
- Change into the new directory: `cd CS207-2019-HW4`.
- Install the dependencies using `pip install -r requirements.txt`.
Part B: Visualize Backward Mode for the Neural Network¶
For this part, we will use the simplified neural network model (no bias terms) from Problem 2 again. For this part only (Problem 3B), take $w_{out} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, $W = \begin{bmatrix} 1.1 & 1.2 \\ 2.1 & 2.2 \end{bmatrix}$, and $z$ to be the identity.
To do so, you'll be inputting $f(x,y)$ into either the command line (Option A) or the GUI (Option B), so you will need $f(x,y)$ in scalar form. (Hint: This should be the last line of your evaluation table.) Do not simplify by multiplying the weights together. Each of the 6 constants should appear in the function expression exactly once. This will look strange, but it is the way that you should do the problem.
Visualize the reverse mode either by running the `HW4vis.py` script (Option A) or by using the GUI (Option B).
Option A: Run `HW4vis.py`¶
- Run the file using `python HW4vis.py`.
- Follow the input instructions in the file. You are free to choose any $x$ and $y$ value at which to evaluate the function and its derivatives.
- Save the reverse mode graph as `P3_graph.png`.
- Deactivate your virtual environment using `conda deactivate`.
Option B: Use the GUI¶
Note: It is a known issue that Tkinter (the package used for the GUI development) and macOS Mojave have some compatibility issues. If you are running Mojave, please use Option A. In the worst case, Macs running Mojave will restart without any warning when running Tkinter-based GUIs. Other behaviors have been observed as well.
- Launch the GUI using `python ADGUI.py`.
- Input $f(x,y)$ into the GUI. Maximize the calculator window to be able to see all the options.
- Hit "Calculate" and choose an $x$ and $y$ value at which to evaluate the function and its derivatives. Again, maximize the graph window to see all the options.
- Use the GUI options to visualize the forward and reverse graphs. Save the reverse mode graph as `P3_graph.png`. (Hint: The graph structure will be easier to see if you maximize the graph window.)
- Deactivate your virtual environment using `conda deactivate`.
Deliverables¶
P3_graph.png
Problem 4: A Toy AD Implementation¶
You will write a toy forward automatic differentiation class. Write a class called `AutoDiffToy` that can return the derivative of functions of the form $$f = \alpha x + \beta$$ for constants $\alpha, \beta \in \mathbb{R}$.
Interface¶
- Must contain a constructor that sets the value of the function and derivative.
  - This would be like the first row in the evaluation trace tables that we've been making.
- Must overload functions where appropriate.
  - Note: Python's `__add__(self, other)` and `__mul__(self, other)` methods are meant to be defined for objects of the same type. Your implementation should not assume that `other` is a real number but be robust enough to handle the case when it is.
- Handle exceptions appropriately.
  - This is a good place to use (and practice) duck-typing. For example, rather than checking if an argument to a special method is an instance of the object, instead use a `try-except` block, catch an `AttributeError`, and do the appropriate calculation.
  - Hint: Asking Forgiveness
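The duck-typing pattern mentioned in the interface notes can be sketched on an unrelated class. `Meters` below is a hypothetical type invented purely for illustration; it is not the required `AutoDiffToy` class and carries no derivative information.

```python
class Meters:
    """Hypothetical length type used only to illustrate duck-typing."""

    def __init__(self, length):
        self.length = length

    def __add__(self, other):
        try:
            # Ask forgiveness: assume `other` has a `length` attribute.
            return Meters(self.length + other.length)
        except AttributeError:
            # `other` has no `length`, so treat it as a plain real number.
            return Meters(self.length + other)

    def __radd__(self, other):
        # Python calls this for `number + Meters(...)` after the number's
        # own __add__ returns NotImplemented. Addition is commutative here.
        return self.__add__(other)


total = Meters(2.0) + Meters(3.0)   # object + object
shifted = 1.5 + Meters(2.0)         # number + object, handled by __radd__
print(total.length, shifted.length)
```

The reflected method (`__radd__`, and similarly `__rmul__`) is what lets an expression with a number on the left-hand side work, which matters for forms such as `beta + alpha * x`.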
Use Case¶
a = 2.0 # Value to evaluate at
x = AutoDiffToy(a)
alpha = 2.0
beta = 3.0
f = alpha * x + beta
Output¶
print(f.val, f.der)
7.0 2.0
Requirements¶
- Implementation must be robust enough to handle functions written in the form
f = alpha * x + beta
f = x * alpha + beta
f = beta + alpha * x
f = beta + x * alpha
- You should demo your code with an example for each of these 4 cases.
Deliverables¶
`P4.py` containing your class and demo
Problem 5: Continuous Integration and Coverage¶
Note: You will not be able to start this problem until after lecture 13.
We discussed documentation and testing in lecture (or will very soon) and also briefly touched on code coverage. You must write tests for your code in your final project (and in life). There is a nice way to automate the testing process called continuous integration (CI). This problem will walk you through the basics of CI and show you how to get up and running with some CI software.
The idea behind continuous integration is to automate aspects of the testing process.
The basic workflow goes something like this:
- You work on your part of the code in your own branch or fork.
- On every commit you make and push to GitHub, your code is automatically tested by an external service (e.g. Travis CI). This ensures that there are no specific dependencies on the structure of your machine that your code needs to run and also ensures that your changes are sane.
- When you want to merge your changes with the master / production branch, you submit a pull request to `master` in the main repo (the one you're hoping to contribute to). The repo manager creates a branch off `master`.
- This branch is also set to run tests on Travis. If all tests pass, then the pull request is accepted and your code becomes part of `master`.
In this problem, we will use GitHub to integrate our roots library with Travis CI and CodeCov. (Note that this is not the only workflow people use.)
Part A:¶
Create a public GitHub repo called `cs207test` and clone it to your local machine. (Note: This should be done outside your course repo.)
Part B:¶
Use the example from lecture 13 to create a file called `roots.py`, which contains the `quad_roots` and `linear_roots` functions (along with their documentation). Now, also create a file called `test_roots.py`, which contains the tests from lecture.
All of these files should be in your newly created `cs207test` repo. Don't push yet!!!
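The lecture 13 code is not reproduced in this assignment, so the sketch below is only a guess at the general shape of such files. The function signatures, default arguments, error messages, and test names are all assumptions; use the actual lecture versions in your repo. For brevity the functions and tests are shown together, whereas in the assignment they live in `roots.py` and `test_roots.py` respectively.

```python
import cmath


def linear_roots(a=1.0, b=0.0):
    """Return the root of a*x + b = 0 (assumed signature)."""
    if a == 0:
        raise ValueError("The linear coefficient is zero.")
    return -b / a


def quad_roots(a=1.0, b=2.0, c=0.0):
    """Return both roots of a*x**2 + b*x + c = 0 (assumed signature)."""
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.")
    sqrt_disc = cmath.sqrt(b * b - 4.0 * a * c)
    return ((-b + sqrt_disc) / (2.0 * a), (-b - sqrt_disc) / (2.0 * a))


# pytest collects functions whose names start with `test_`.
def test_quad_roots_real():
    assert quad_roots(1.0, 1.0, -12.0) == (3 + 0j, -4 + 0j)


def test_linear_roots():
    assert linear_roots(2.0, -3.0) == 1.5


def test_linear_roots_rejects_zero_slope():
    try:
        linear_roots(0.0, 1.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError when a == 0")
```

With files laid out this way, running `pytest` from the repo root discovers and runs the `test_` functions.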
Part C: Create an account on Travis CI and Start Building¶
Create an account on Travis CI and set your `cs207test` repo up for continuous integration once this repo can be seen on Travis.
Part D:¶
Create an instruction to Travis to make sure that
- python 3.6 is installed
- pytest is installed
The file should be called `.travis.yml` and should have the following contents. (Note: The `yml` line should not be present in your file. This is a rendering issue with the html. Everything below `yml` should be in your file. Please download and open the Jupyter notebook to see the correct format with syntax highlighting.)
yml
language: python
python:
- "3.6"
before_install:
- pip install pytest pytest-cov
script:
- pytest
Part E:¶
Push the new changes to your `cs207test` repo.
At this point you should be able to see your build on Travis and if and how your tests pass.
Part F: CodeCov Integration¶
In class, we also discussed code coverage. Just like Travis CI runs tests automatically for you, CodeCov automatically checks your code coverage.
Create an account on CodeCov, connect your GitHub, and turn CodeCov integration on.
Part G:¶
Update your `.travis.yml` file as follows. (Note: The `yml` line should not be present in your file. This is a rendering issue with the html. Everything below `yml` should be in your file. Please download and open the Jupyter notebook to see the correct format with syntax highlighting.)
yml
language: python
python:
- "3.6"
before_install:
- pip install pytest pytest-cov
- pip install codecov
script:
- pytest --cov=./
after_success:
- codecov
Be sure to push the latest changes to your new repo.
Part H:¶
You can have your GitHub repo reflect the build status on Travis CI and the code coverage status from CodeCov. To do this, you should modify the `README.md` file in your repo to include some badges. Put the following at the top of your `README.md` file:
[![Build Status](https://travis-ci.org/dsondak/cs207testing.svg?branch=master)](https://travis-ci.org/dsondak/cs207testing)
[![Coverage Status](https://codecov.io/gh/dsondak/cs207testing/branch/master/graph/badge.svg)](https://codecov.io/gh/dsondak/cs207testing)
Of course, you need to make sure that the links are to your repo and not mine. You can find embed code on the CodeCov and Travis CI sites.
Deliverables¶
`P5.md`, which contains a link to your public `cs207test` repo. The course staff will grade the `README` in that repository.