Key Word(s): Documentation, Testing, Coverage



Lecture 13

Thursday, October 17th 2019

Software design, documentation, and testing

Design of a program

From The Practice of Programming:

The essence of design is to balance competing goals and constraints. Although there may be many tradeoffs when one is writing a small self-contained system, the ramifications of particular choices remain within the system and affect only the individual programmer. But when code is to be used by others, decisions have wider repercussions.

Software Design Desirables

  • Documentation
    • names (understandable names)
    • pre+post conditions or requirements
  • Maintainability
    • Extensibility
    • Modularity and Encapsulation
  • Portability
  • Installability

Software Design Desirables (continued)

  • Generality
    • Data Abstraction (change types, change data structures)
    • Functional Abstraction (the object model, overloading)
    • Robustness
      • Provability: Invariants, preconditions, postconditions
      • User Proofing, Adversarial Inputs
  • Efficiency
    • Use of appropriate algorithms and data structures
    • Optimization (but not premature optimization)

Issues to be aware of

  • Interfaces - Your program is being designed to be used by someone: either an end user, another programmer, or even yourself. This interface is a contract between you and the user.

  • Hiding Information - There is information hiding between layers (a higher up layer can be more abstract). Encapsulation, abstraction, and modularization are some of the techniques used here.

  • Resource Management - Who allocates storage for data structures? Generally we want resource allocation/deallocation to happen in the same layer.

  • How to Deal with Errors - Do we return special values? Do we throw exceptions? Who handles them?
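
The last question in the list above can be made concrete with a small sketch. The two helpers below, sqrt_or_none and sqrt_or_raise, are hypothetical names invented for this illustration (they are not part of the lecture code). They show the same failure reported first through a special return value and then through an exception.

```python
import math

def sqrt_or_none(x):
    """Return the real square root of x, or None if x is negative (special value)."""
    if x < 0:
        return None  # the caller must remember to check for None
    return math.sqrt(x)

def sqrt_or_raise(x):
    """Return the real square root of x; raise ValueError if x is negative."""
    if x < 0:
        raise ValueError(f"cannot take the real square root of {x}")
    return math.sqrt(x)

# Special value: the error is easy to ignore by accident.
result = sqrt_or_none(-4.0)
if result is None:
    print("special value: no real root")

# Exception: the error cannot be silently ignored; some layer must handle it.
try:
    sqrt_or_raise(-4.0)
except ValueError as err:
    print(f"exception: {err}")
```

Either choice can be reasonable. What matters is that the interface documents which one it uses and which layer is expected to handle the failure.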

Interface principles

  • hide implementation details
  • have a small set of operations exposed, the smallest possible, and these should be orthogonal. Be stingy with the user.
  • but be transparent with the user in what goes on behind the scenes
  • be consistent, both internally (library functions should have similar signatures, and classes similar methods) and externally (programs should use the same Command Line Interface (CLI) flags)

Testing should deal with ALL of the issues above, and each layer ought to be tested separately.

Testing

There are different kinds of tests inspired by the interface principles just described.

  • acceptance tests verify that a program meets a customer's expectations. In a sense, these are a test of the interface to the customer: does the program do everything you promised the customer it would do?
  • unit tests are tests which test a unit of the program for use by another unit. These could test the interface for a client, but they must also test the internal functions that you want to use.

Testing Continued

Exploratory testing, regression testing, and integration testing are done in both of these categories, with the latter trying to combine layers and subsystems, not necessarily at the level of an entire application.

One can also run performance tests and stress test a system (to create adversarial situations).

Documentation

Documentation is a contract between a user (client) and a developer (library writer).

Write good documentation

  • Follow standards of PEP 257
  • Clearly outline the inputs, outputs, default values, and expected behavior
  • Include basic usage examples when possible
In [1]:
def quad_roots(a=1.0, b=2.0, c=0.0):
    """Returns the roots of a quadratic equation: ax^2 + bx + c = 0.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    b: float, optional, default value is 2
       Coefficient of linear term
    c: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    roots: 2-tuple of complex floats
       Has the form (root1, root2) unless a = 0 
       in which case a ValueError exception is raised
    
    EXAMPLES
    =========
    >>> quad_roots(1.0, 1.0, -12.0)
    ((3+0j), (-4+0j))
    """
    import cmath # Can return complex numbers from square roots
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.  This is not a quadratic equation.")
    else:
        sqrt_disc = cmath.sqrt(b * b - 4.0 * a * c)
        r1 = -b + sqrt_disc
        r2 = -b - sqrt_disc
        two_a = 2.0 * a
        return (r1 / two_a, r2 / two_a)

Documenting Invariants

  • An invariant is something that is true at some point in the code.
  • Invariants and the contract are what we use to guide our implementation.
  • Pre-conditions and post-conditions are special cases of invariants.
  • Pre-conditions are true at function entry. They constrain the user.
  • Post-conditions are true at function exit. They constrain the implementation.

You can change implementations, stuff under the hood, etc, but once the software is in the wild you can't change the pre-conditions and post-conditions since the client user is depending upon them.
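
As a rough sketch of how such a contract could be checked in code, the function below enforces the pre-conditions explicitly and asserts one post-condition before returning. The name checked_quad_roots and the use of numbers.Number for the type check are assumptions made for this illustration; the lecture's quad_roots states its contract only in the docstring.

```python
import cmath
import numbers

def checked_quad_roots(a=1.0, b=2.0, c=0.0):
    """Hypothetical variant of quad_roots that checks its contract at run time."""
    # PRE: a, b, c have numeric type (this constrains the caller).
    for name, value in (("a", a), ("b", b), ("c", c)):
        if not isinstance(value, numbers.Number):
            raise TypeError(f"{name} must be numeric, got {type(value).__name__}")
    # PRE: the equation really is quadratic.
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.")

    sqrt_disc = cmath.sqrt(b * b - 4.0 * a * c)
    two_a = 2.0 * a
    roots = ((-b + sqrt_disc) / two_a, (-b - sqrt_disc) / two_a)

    # POST: a 2-tuple of roots is returned (this constrains the implementation).
    assert isinstance(roots, tuple) and len(roots) == 2
    return roots

print(checked_quad_roots(1.0, 1.0, -12.0))
```

The body between the checks can be rewritten freely, as long as the checks at entry and exit keep passing.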

In [2]:
def quad_roots(a=1.0, b=2.0, c=0.0):
    """Returns the roots of a quadratic equation: ax^2 + bx + c.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    b: float, optional, default value is 2
       Coefficient of linear term
    c: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    roots: 2-tuple of complex floats
       Has the form (root1, root2) unless a = 0 
       in which case a ValueError exception is raised

    NOTES
    =====
    PRE: 
         - a, b, c have numeric type
         - three or fewer inputs
    POST:
         - a, b, and c are not changed by this function
         - raises a ValueError exception if a = 0
         - returns a 2-tuple of roots

    EXAMPLES
    =========
    >>> quad_roots(1.0, 1.0, -12.0)
    ((3+0j), (-4+0j))
    """
    import cmath # Can return complex numbers from square roots
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.  This is not a quadratic equation.")
    else:
        sqrt_disc = cmath.sqrt(b * b - 4.0 * a * c)
        r1 = -b + sqrt_disc
        r2 = -b - sqrt_disc
        two_a = 2.0 * a
        return (r1 / two_a, r2 / two_a)

Accessing Documentation (1)

  • Documentation can be accessed through the __doc__ special attribute
  • Simply evaluating function_name.__doc__ gives pretty ugly output
  • You can make it cleaner by making use of splitlines()
In [3]:
quad_roots.__doc__.splitlines()
Out[3]:
['Returns the roots of a quadratic equation: ax^2 + bx + c.',
 '    ',
 '    INPUTS',
 '    =======',
 '    a: float, optional, default value is 1',
 '       Coefficient of quadratic term',
 '    b: float, optional, default value is 2',
 '       Coefficient of linear term',
 '    c: float, optional, default value is 0',
 '       Constant term',
 '    ',
 '    RETURNS',
 '    ========',
 '    roots: 2-tuple of complex floats',
 '       Has the form (root1, root2) unless a = 0 ',
 '       in which case a ValueError exception is raised',
 '',
 '    NOTES',
 '    =====',
 '    PRE: ',
 '         - a, b, c have numeric type',
 '         - three or fewer inputs',
 '    POST:',
 '         - a, b, and c are not changed by this function',
 '         - raises a ValueError exception if a = 0',
 '         - returns a 2-tuple of roots',
 '',
 '    EXAMPLES',
 '    =========',
 '    >>> quad_roots(1.0, 1.0, -12.0)',
 '    ((3+0j), (-4+0j))',
 '    ']
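
Another option, sketched here, is inspect.getdoc from the standard library, which returns the docstring with its indentation normalized. The function example below is a hypothetical stand-in so that the snippet runs on its own.

```python
import inspect

def example(x):
    """Return x unchanged.

    NOTES
    =====
    The lines of this docstring are indented in the source file.
    """
    return x

# inspect.getdoc strips the common leading indentation and the
# leading/trailing blank lines, so the text prints cleanly.
clean = inspect.getdoc(example)
print(clean)
```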

Accessing Documentation (2)

A nice way to access the documentation is to use the pydoc module.

In [4]:
import pydoc
pydoc.doc(quad_roots)
Python Library Documentation: function quad_roots in module __main__

quad_roots(a=1.0, b=2.0, c=0.0)
    Returns the roots of a quadratic equation: ax^2 + bx + c.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    b: float, optional, default value is 2
       Coefficient of linear term
    c: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    roots: 2-tuple of complex floats
       Has the form (root1, root2) unless a = 0 
       in which case a ValueError exception is raised
    
    NOTES
    =====
    PRE: 
         - a, b, c have numeric type
         - three or fewer inputs
    POST:
         - a, b, and c are not changed by this function
         - raises a ValueError exception if a = 0
         - returns a 2-tuple of roots
    
    EXAMPLES
    =========
    >>> quad_roots(1.0, 1.0, -12.0)
    ((3+0j), (-4+0j))

Testing of a program

Test as you write your program.

This is so important that I repeat it.

Test as you go.

Test-driven Development

Test Driven Development: what it is, and what it is not.

From The Practice of Programming:

The effort of testing as you go is minimal and pays off handsomely. Thinking about testing as you write a program will lead to better code, because that's when you know best what the code should do. If instead you wait until something breaks, you will probably have forgotten how the code works. Working under pressure, you will need to figure it out again, which takes time, and the fixes will be less thorough and more fragile because your refreshed understanding is likely to be incomplete.

doctest

The doctest module allows us to test pieces of code that we put into our doc string.

The doctests are a type of unit test, which document the interface of the function by example.

Doctests are an example of a test harness. We write some tests and execute them all at once. Individual tests can also be written and run one at a time in an ad hoc manner, but that is far less efficient.

doctest Continued

Of course, too many doctests clutter the documentation section.

The doctests should not cover every case; they should describe the various ways a class or function can be used. There are better ways to do more comprehensive testing.
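
As an illustration of describing usage by example, a doctest can document the error case next to the normal case. The sketch below uses a hypothetical function linear_root (not the lecture's code) and runs its docstring examples programmatically with DocTestParser and DocTestRunner from the doctest module.

```python
import doctest

def linear_root(a, b):
    """Return the root of ax + b = 0.

    >>> linear_root(2.0, -3.0)
    1.5
    >>> linear_root(0.0, 1.0)
    Traceback (most recent call last):
        ...
    ValueError: a must be nonzero
    """
    if a == 0:
        raise ValueError("a must be nonzero")
    return -b / a

# Parse the docstring into a DocTest object, run it, and summarize.
parser = doctest.DocTestParser()
test = parser.get_doctest(linear_root.__doc__, {"linear_root": linear_root},
                          "linear_root", None, 0)
runner = doctest.DocTestRunner(verbose=False)
runner.run(test)
results = runner.summarize(verbose=False)
print(results)
```

Two examples are enough here to show the normal call and the failure mode; exhaustive checks belong in a separate test file.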

In [5]:
import doctest
doctest.testmod(verbose=True)
Trying:
    quad_roots(1.0, 1.0, -12.0)
Expecting:
    ((3+0j), (-4+0j))
ok
1 items had no tests:
    __main__
1 items passed all tests:
   1 tests in __main__.quad_roots
1 tests in 2 items.
1 passed and 0 failed.
Test passed.
Out[5]:
TestResults(failed=0, attempted=1)

Principles of Testing

  • Test simple parts first
  • Test code at its boundaries
    • The idea is that most errors happen at data boundaries such as empty input, single input item, exactly full array, weird values, etc. If a piece of code works at the boundaries, it's likely to work elsewhere...
  • Automate using a test harness
  • Test incrementally

Principles of Testing Continued

  • Program defensively

    "Program defensively. A useful technique is to add code to handle 'can't happen' cases, situations where it is not logically possible for something to happen but (because of some failure elsewhere) it might anyway. Adding a test for zero or negative array lengths to avg was one example. As another example, a program processing grades might expect that there would be no negative or huge values but should check anyway."

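The two examples in the quotation might look as follows in Python. This is only a sketch: the function names and the grade range of 0 to 100 are assumptions made for this illustration, and the original avg is not Python code.

```python
def avg(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # "Can't happen" if callers behave, but check anyway instead of
    # letting a ZeroDivisionError surface from deep inside the program.
    if len(values) == 0:
        raise ValueError("avg() needs at least one value")
    return sum(values) / len(values)

def validate_grade(grade, lowest=0.0, highest=100.0):
    """Return grade unchanged if it lies in [lowest, highest]; otherwise raise."""
    if not (lowest <= grade <= highest):
        raise ValueError(f"grade {grade} is outside [{lowest}, {highest}]")
    return grade

grades = [88.0, 92.5, 79.5]
print(avg([validate_grade(g) for g in grades]))
```
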
Test simple parts first:

A test for the quad_roots function:

In [6]:
def test_quadroots():
    assert quad_roots(1.0, 1.0, -12.0) == ((3+0j), (-4+0j))

test_quadroots()
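
The test above compares floating point results with ==, which works here because the roots are small integers. For roots that binary floating point cannot represent exactly, an exact comparison may not hold, so a tolerance-based comparison is safer. The sketch below uses cmath.isclose from the standard library; the compact quad_roots is repeated only so that the snippet runs on its own, and the tolerance of 1e-9 is an arbitrary choice.

```python
import cmath

def quad_roots(a=1.0, b=2.0, c=0.0):
    """Compact copy of the lecture's quad_roots, repeated for self-containment."""
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.")
    sqrt_disc = cmath.sqrt(b * b - 4.0 * a * c)
    two_a = 2.0 * a
    return ((-b + sqrt_disc) / two_a, (-b - sqrt_disc) / two_a)

def test_quadroots_close():
    # x^2 - 0.3x + 0.02 = 0 has roots 0.2 and 0.1
    r1, r2 = quad_roots(1.0, -0.3, 0.02)
    assert cmath.isclose(r1, 0.2, rel_tol=1e-9)
    assert cmath.isclose(r2, 0.1, rel_tol=1e-9)

test_quadroots_close()
```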

Test at the boundaries

Here we write a test to handle the crazy case in which the user passes strings in as the coefficients.

In [7]:
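def test_quadroots_types():
    try:
        quad_roots("", "green", "hi")
    except TypeError:
        pass  # expected: strings are not valid coefficients
    else:
        # Without this branch the test would pass even if no exception was raised
        raise AssertionError("quad_roots accepted string coefficients")

test_quadroots_types()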

We can also check to make sure the $a=0$ case is handled okay:

In [8]:
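def test_quadroots_zerocoeff():
    try:
        quad_roots(a=0.0)
    except ValueError:
        pass  # expected: a = 0 is not a quadratic equation
    else:
        # Without this branch the test would pass even if no exception was raised
        raise AssertionError("quad_roots accepted a = 0")

test_quadroots_zerocoeff()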
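
Other boundaries worth probing for a quadratic solver are numerical rather than type-related. The sketch below adds three such cases: a zero discriminant, a negative discriminant, and the default arguments. The test names are hypothetical, and the compact quad_roots is repeated only so that the snippet runs on its own.

```python
import cmath

def quad_roots(a=1.0, b=2.0, c=0.0):
    """Compact copy of the lecture's quad_roots, repeated for self-containment."""
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.")
    sqrt_disc = cmath.sqrt(b * b - 4.0 * a * c)
    two_a = 2.0 * a
    return ((-b + sqrt_disc) / two_a, (-b - sqrt_disc) / two_a)

def test_quadroots_double_root():
    # Discriminant exactly zero: x^2 + 2x + 1 = (x + 1)^2
    assert quad_roots(1.0, 2.0, 1.0) == ((-1+0j), (-1+0j))

def test_quadroots_complex_roots():
    # Negative discriminant: x^2 + 1 = 0 has roots +i and -i
    assert quad_roots(1.0, 0.0, 1.0) == (1j, -1j)

def test_quadroots_defaults():
    # Defaults give x^2 + 2x = 0, with roots 0 and -2
    assert quad_roots() == (0j, (-2+0j))

test_quadroots_double_root()
test_quadroots_complex_roots()
test_quadroots_defaults()
```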

When you get an error

It could be that:

  • you messed up an implementation
  • you did not handle a case
  • your test was messed up (be careful of this)

If the error was not caught by an existing test, create a new test that reproduces the problem before you do anything else. The test should capture the essence of the problem: this process itself is useful in uncovering bugs. The error may even suggest further tests.

Automate Using a Test Harness

Great! So we've written some ad-hoc tests. It's pretty clunky. We should use a test harness.

As mentioned already, doctest is a type of test harness. It has its uses, but gets messy quickly.

We'll talk about pytest here.

Preliminaries

  1. The idea is that our code consists of several different pieces (or objects)
  2. The objects are grouped based on how they are related to each other
    • e.g. you may have a class that contains different statistical operations
  3. For now, we can think of having related functions all in one file
  4. We want to test each of those functions
    • Tests should include checking correctness of output, correctness of input, fringe cases, etc

Preliminaries Continued

I will work in the Jupyter notebook for demo purposes.

To create and save a file from within the Jupyter notebook, start a cell with %%file file_name.py.

For your own work, you should write your code using a text editor (like vim) or an IDE like Spyder.

The toy examples that we've been working with in the class so far can be done in Jupyter, but a real project should be done through other means.

In [9]:
%%file roots.py
def quad_roots(a=1.0, b=2.0, c=0.0):
    """Returns the roots of a quadratic equation: ax^2 + bx + c = 0.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    b: float, optional, default value is 2
       Coefficient of linear term
    c: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    roots: 2-tuple of complex floats
       Has the form (root1, root2) unless a = 0 
       in which case a ValueError exception is raised
    
    EXAMPLES
    =========
    >>> quad_roots(1.0, 1.0, -12.0)
    ((3+0j), (-4+0j))
    """
    import cmath # Can return complex numbers from square roots
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.  This is not a quadratic equation.")
    else:
        sqrtdisc = cmath.sqrt(b * b - 4.0 * a * c)
        r1 = -b + sqrtdisc
        r2 = -b - sqrtdisc
        twoa = 2.0 * a
        return (r1 / twoa, r2 / twoa)
Writing roots.py

Let's put our tests into one file.

In [10]:
%%file test_roots.py
import pytest
import roots

def test_quadroots_result():
    assert roots.quad_roots(1.0, 1.0, -12.0) == ((3+0j), (-4+0j))

def test_quadroots_types():
    with pytest.raises(TypeError):
        roots.quad_roots("", "green", "hi")

def test_quadroots_zerocoeff():
    with pytest.raises(ValueError):
        roots.quad_roots(a=0.0)
Writing test_roots.py
In [11]:
!pytest
============================= test session starts ==============================
platform darwin -- Python 3.6.7, pytest-3.8.2, py-1.6.0, pluggy-0.7.1
rootdir: /Users/dsondak/Teaching/Harvard/CS207/2019-CS207/content/lectures/lecture13/notebook, inifile:
plugins: remotedata-0.3.0, openfiles-0.3.0, doctestplus-0.1.3, cov-2.5.1, arraydiff-0.2
collected 3 items                                                              

test_roots.py ...                                                        [100%]

=========================== 3 passed in 0.03 seconds ===========================

Code Coverage

In some sense, it would be nice to check that every line in a program has been exercised by a test. If a line has been executed by passing tests, you have at least some evidence that it is not the source of a failure. In practice this is hard to achieve: it can be difficult to construct normal input data that forces a program through every statement. So we often settle for making sure that the important lines are tested. The pytest-cov plugin helps by reporting which statements the test suite executed and which it missed.

Coverage does not mean that every edge case has been tried; it only means that the covered statements were executed at least once by some test.
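
A small sketch of that limitation: the hypothetical function relative_difference below has a single test that executes every statement, so a coverage tool would report it as fully covered, yet an untested edge case still fails.

```python
def relative_difference(x, y):
    """Hypothetical helper: return (x - y) / (x + y)."""
    return (x - y) / (x + y)

def test_relative_difference():
    # This one test executes every statement in relative_difference.
    assert relative_difference(3.0, 1.0) == 0.5

test_relative_difference()

# The edge case x = -y was never tried, and it raises an exception.
try:
    relative_difference(1.0, -1.0)
except ZeroDivisionError:
    print("full statement coverage, but x = -y still fails")
```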

Let's add a new function to our roots.py file.

In [12]:
%%file roots.py
def linear_roots(a=1.0, b=0.0):
    """Returns the roots of a linear equation: ax + b = 0.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of linear term
    b: float, optional, default value is 0
       Coefficient of constant term
    
    RETURNS
    ========
    root: float
       The single root -b / a, unless a = 0 
       in which case a ValueError exception is raised
    
    EXAMPLES
    =========
    >>> linear_roots(1.0, 2.0)
    -2.0
    """
    if a == 0:
        raise ValueError("The linear coefficient is zero.  This is not a linear equation.")
    else:
        return -b / a

def quad_roots(a=1.0, b=2.0, c=0.0):
    """Returns the roots of a quadratic equation: ax^2 + bx + c = 0.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    b: float, optional, default value is 2
       Coefficient of linear term
    c: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    roots: 2-tuple of complex floats
       Has the form (root1, root2) unless a = 0 
       in which case a ValueError exception is raised
    
    EXAMPLES
    =========
    >>> quad_roots(1.0, 1.0, -12.0)
    ((3+0j), (-4+0j))
    """
    import cmath # Can return complex numbers from square roots
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.  This is not a quadratic equation.")
    else:
        sqrtdisc = cmath.sqrt(b * b - 4.0 * a * c)
        r1 = -b + sqrtdisc
        r2 = -b - sqrtdisc
        twoa = 2.0 * a
        return (r1 / twoa, r2 / twoa)
Overwriting roots.py

Run the tests and check code coverage

In [13]:
!pytest --cov
============================= test session starts ==============================
platform darwin -- Python 3.6.7, pytest-3.8.2, py-1.6.0, pluggy-0.7.1
rootdir: /Users/dsondak/Teaching/Harvard/CS207/2019-CS207/content/lectures/lecture13/notebook, inifile:
plugins: remotedata-0.3.0, openfiles-0.3.0, doctestplus-0.1.3, cov-2.5.1, arraydiff-0.2
collected 3 items                                                              

test_roots.py ...                                                        [100%]

---------- coverage: platform darwin, python 3.6.7-final-0 -----------
Name            Stmts   Miss  Cover
-----------------------------------
roots.py           13      3    77%
test_roots.py      10      0   100%
-----------------------------------
TOTAL              23      3    87%


=========================== 3 passed in 0.04 seconds ===========================

Run the tests, report code coverage, and report missing lines.

In [14]:
!pytest --cov --cov-report term-missing
============================= test session starts ==============================
platform darwin -- Python 3.6.7, pytest-3.8.2, py-1.6.0, pluggy-0.7.1
rootdir: /Users/dsondak/Teaching/Harvard/CS207/2019-CS207/content/lectures/lecture13/notebook, inifile:
plugins: remotedata-0.3.0, openfiles-0.3.0, doctestplus-0.1.3, cov-2.5.1, arraydiff-0.2
collected 3 items                                                              

test_roots.py ...                                                        [100%]

---------- coverage: platform darwin, python 3.6.7-final-0 -----------
Name            Stmts   Miss  Cover   Missing
---------------------------------------------
roots.py           13      3    77%   22-25
test_roots.py      10      0   100%
---------------------------------------------
TOTAL              23      3    87%


=========================== 3 passed in 0.04 seconds ===========================

Run tests, including the doctests, report code coverage, and report missing lines.

In [15]:
!pytest --doctest-modules --cov --cov-report term-missing
============================= test session starts ==============================
platform darwin -- Python 3.6.7, pytest-3.8.2, py-1.6.0, pluggy-0.7.1
rootdir: /Users/dsondak/Teaching/Harvard/CS207/2019-CS207/content/lectures/lecture13/notebook, inifile:
plugins: remotedata-0.3.0, openfiles-0.3.0, doctestplus-0.1.3, cov-2.5.1, arraydiff-0.2
collected 5 items                                                              

roots.py ..                                                              [ 40%]
test_roots.py ...                                                        [100%]

---------- coverage: platform darwin, python 3.6.7-final-0 -----------
Name            Stmts   Miss  Cover   Missing
---------------------------------------------
roots.py           13      1    92%   23
test_roots.py      10      0   100%
---------------------------------------------
TOTAL              23      1    96%


=========================== 5 passed in 0.05 seconds ===========================

Let's put some tests in for the linear roots function.

In [16]:
%%file test_roots.py
import pytest
import roots

def test_quadroots_result():
    assert roots.quad_roots(1.0, 1.0, -12.0) == ((3+0j), (-4+0j))

def test_quadroots_types():
    with pytest.raises(TypeError):
        roots.quad_roots("", "green", "hi")

def test_quadroots_zerocoeff():
    with pytest.raises(ValueError):
        roots.quad_roots(a=0.0)

def test_linearroots_result():
    assert roots.linear_roots(2.0, -3.0) == 1.5

def test_linearroots_types():
    with pytest.raises(TypeError):
        roots.linear_roots("ocean", 6.0)

def test_linearroots_zerocoeff():
    with pytest.raises(ValueError):
        roots.linear_roots(a=0.0)
Overwriting test_roots.py

Now run the tests and check code coverage.

In [17]:
!pytest --doctest-modules --cov --cov-report term-missing
============================= test session starts ==============================
platform darwin -- Python 3.6.7, pytest-3.8.2, py-1.6.0, pluggy-0.7.1
rootdir: /Users/dsondak/Teaching/Harvard/CS207/2019-CS207/content/lectures/lecture13/notebook, inifile:
plugins: remotedata-0.3.0, openfiles-0.3.0, doctestplus-0.1.3, cov-2.5.1, arraydiff-0.2
collected 8 items                                                              

roots.py ..                                                              [ 25%]
test_roots.py ......                                                     [100%]

---------- coverage: platform darwin, python 3.6.7-final-0 -----------
Name            Stmts   Miss  Cover   Missing
---------------------------------------------
roots.py           13      0   100%
test_roots.py      18      0   100%
---------------------------------------------
TOTAL              31      0   100%


=========================== 8 passed in 0.08 seconds ===========================

Exercise: Documentation and Testing

The little program L2.py needs some documentation and some tests. Since you didn't write it, I'll tell you what it's supposed to do. You'll need to document it. Feel free to test for additional exceptions if you have time but start with it as it is.

The point of the program is to compute the $L_{2}$ norm of a vector $v$. A second argument, if provided, will be interpreted as a vector of weights. The second argument must have the same length as the input vector.

NOTE: The input type of the vectors for this program should be a list of numbers.

As a reminder, the weighted $L_2$ norm of a vector $v$ is given by \begin{align*} \|v\|_{W} = \sqrt{\sum_{i=1}^{N}{\left(w_{i}v_{i}\right)^2}} \end{align*} where $N$ is the length of the vector $v$, $v_{i}$ is the i-th component of the vector $v$ and $w_{i}$ is the i-th component of the weight vector.
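
A couple of hand-checkable values can be useful as expected results in your tests. With $v = (3, 4)$ and unit weights the norm is $\sqrt{9 + 16} = 5$; with weights $w = (2, 1)$ it is $\sqrt{36 + 16} = \sqrt{52}$. The sketch below evaluates the formula directly. The helper weighted_l2 is a hypothetical name used only to check the arithmetic; it is not the interface of L2.py, and defaulting to unit weights is an assumption.

```python
import math

def weighted_l2(v, w=None):
    """Evaluate sqrt(sum((w_i * v_i)^2)) directly from the formula."""
    if w is None:
        w = [1.0] * len(v)  # assumption: no weights means unit weights
    if len(w) != len(v):
        raise ValueError("v and w must have the same length")
    return math.sqrt(sum((wi * vi) ** 2 for wi, vi in zip(w, v)))

print(weighted_l2([3.0, 4.0]))              # unit weights
print(weighted_l2([3.0, 4.0], [2.0, 1.0]))  # weights (2, 1)
```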

Requirements

  • You must write the documentation and a decent test suite.
    • Include some doctests as well!
  • Use the pytest module to run the doctests and unit tests and to assess the code coverage.

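If you don't already have pytest, you can install it using pip install pytest. The coverage reports also require the pytest-cov plugin (pip install pytest-cov). If you have trouble installing, consult the installation instructions on the pytest website.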

Deliverables

  • L2.py: The documented function.
  • test_L2.py: The test suite.