We will be using an autograder for this homework. Please don't change the specified function signature.
Problem 0: Homework Workflow¶
Once you receive HW4 feedback, you will need to merge your HW4-dev
branch into master
.
You will earn points for following all stages of the git workflow which involves:
- 3pts for merging
HW4-dev
intomaster
- 5pts for completing HW5 on
HW5-dev
- 2pts for making a PR on
HW5-dev
to merge intomaster
Problem 1: @property
Decorator¶
Suppose we have a simple Animal
class, which is initialized by its species:
class Animal:
def __init__(self, name, species):
self.name = name
self.species = species
def __repr__(self):
return f'{self.name} ({self.species})'
We have clients who build applications on top of our Animal
class and they want to directly modify the species
attribute. For example, they may implement functions to transform animals.
def transform(animal, into):
animal.species = into
return animal
However, one day it occurred to us that our Animal
class is designed to be used in an H. C. Andersen universe. We really don't want people to make an Animal
instance into a Cthulhu [Cthulhu is not part of the H. C. Anderson universe]. How should we enforce this restriction? The obvious solution is to create a method to set the species, which checks if the new species is valid inside the method:
class Animal:
# a class attribute of the valid species in our universe
valid_species = {
'cat',
'dog',
'duck',
'elf',
'goblin',
'horse',
'human',
'mermaid',
'nightingale',
'pig',
'swan',
'wolf'
}
def __init__(self, species):
self.species = species
def __repr__(self):
return f'{self.name} ({self.species})'
def set_species(self, into):
assert into in Animal.valid_species, Exception(f'invalid species: {into}')
self.species = into
There are two problems with this implementation:
- If our clients don't know they have to change their implementation of
transform
(useset_species
instead of directly assign thespecies
attribute), people can still do things likea = transform(ugly_duckling, into='Cthulhu')
- We really don't want to bother our clients to change their code because they may have used
animal.species = something
hundreds of times throughout their code.
Fortunately, we have the property
decorator. Read more about it here.
For this problem, you will utilize the property
decorator to limit what species users can set their Animal
object to. Please copy the following code into the file animal.py
(please create the file and copy below code).
class Animal:
# a class attribute of the valid species in our universe
valid_species = {
'cat',
'dog',
'duck',
'elf',
'goblin',
'horse',
'human',
'mermaid',
'nightingale',
'pig',
'swan',
'wolf'
}
def __init__(self, name, species):
self.name = name
self._species = species
def __repr__(self):
return f'{self.name} ({self._species})'
Part A [10 points]:¶
We have also now designated the species
attribute as "non-user-facing" by calling it _species
because we do not want the user to access it directly. However, this presents a problem because in our first version of the Animal
class, a client could access the species through the species
attribute. You can check that the attributes of the Animal
class are name
and _species
using the following steps (on Python shell):
>>> dog = Animal('Fido', 'dog')
>>> vars(dog)
{'name': 'Fido', '_species' : 'dog'}
To prevent the client's previous code from breaking, modify the Animal
class in animal.py
by adding a species(self)
method to the Animal
class. The species
method should return the _species
attribute of this Animal
. Apply the property
decorator to this method.
If this is done correctly, then a client should still be able to retrieve the species of an Animal
object as follows in the Python shell (you should use print()
in an editor to print out):
>>> dog = Animal('Fido', 'dog')
>>> dog.species
'dog'
Part B [10 points]:¶
What happens when the client tries to set the value of the species
attribute as in the following:
>>> dog.species = 'cat'
You don't need to write the answer to this question, but you should test it.
Thus to prevent the client's previous code from breaking, we need to create another method that:
- Allows the client to set the species of an
Animal
object using.species
- And checks that the species attribute is set to a species in our
valid_species
list.
Add another method species(self, into)
to the Animal
class. This method should set the _species
attribute of this Animal
to into
, but before that, it makes sure that into
is in the valid_species
list. If into
is not in the valid_species
list, raise an Exception
. Apply the species.setter
decorator to this method.
If this is done correctly, then a client should still be able to set the species of an Animal
object (within the restricted list) as follows, in a Python shell (in an editor, you should use print()
to print out):
>>> dog = Animal('Fido', 'dog')
>>> dog.species
'dog'
>>> dog.species = 'cat'
>>> dog.species
'cat'
>>> dog.species = 'TheThing'
...
AssertionError: invalid species: TheThing
Do our clients need to change the implementation of their transform
function now?
Nope, and hopefully they give you a bonus!
Deliverables¶
animal.py
Problem 2: Linked List¶
A singly linked list can be implemented by nodes. Each node has 2 properties: a head and a tail. The head is the element held by the first node in the list. The tail is the rest of the list, which is a reference to the next node.
The code below is a partially implemented singly linked list. It consists of 2 classes: LinkedList
is a node that holds an element and Nil
is a special node that indicates the end of the linked list. In this problem, you will finish the implementation of these two classes. To name a node LinkedList
is a bit confusing, but as we will see, although the LinkedList
is implemented as a node, it behaves as if it is a whole list.
We start by copying the following template code into the file linked_list.py
(please create the file and copy the template code).
Hints: Even though this problem seems long, the first 3 parts are solvable with a one-line statement and Part D is only slightly longer (depending your implementation choices). Also, you don't need any loops in this problem (you will not lose points if you do so, but you should find it quite unnatural to use loops here). Also please review the given dunder methods for Nill() class.
class LinkedList:
def __init__(self, head, tail):
assert isinstance(tail, LinkedList) or isinstance(tail, Nil), TypeError(
'tail should either be a LinkedList or a Nil')
self._head, self._tail = head, tail
@property
def head(self):
return self._head
@property
def tail(self):
return self._tail
def __str__(self):
return str(self._head) + ' -> ' + str(self._tail)
def __repr__(self):
return f'LinkedList({repr(self._head)}, {repr(self._tail)})'
def __len__(self):
return 1 + len(self._tail)
def __getitem__(self, i):
return self._head if i == 0 else self._tail[i-1]
def prepend(self, val):
pass # TODO
def append(self, val):
pass # TODO
def for_each(self, fun):
pass # TODO
def summation(self):
return self._head + self._tail.summation() if self._tail else self._head
def minimum(self):
def smaller(a, b):
return a if a < b else b
return smaller(self._head, self._tail.minimum()) if self._tail else self._head
def reduce_right(self, fun):
pass # TODO
class Nil():
def __str__(self):
return 'Nil'
def __repr__(self):
return 'Nil()'
def __len__(self):
return 0
def __getitem__(self, i):
raise IndexError('index out of range')
def __bool__(self):
return False
def prepend(self, val):
pass # TODO
def append(self, val):
return LinkedList(val, Nil())
def for_each(self, fun):
pass # TODO
Part A [15 points]:¶
In this part, we want to make our linked list immutable and we are not updating its head or tail after it's been created.
Please implement the append
method.¶
This method appends a node holding the object val
to a LinkedList
and returns the LinkedList
.
Hint: It's recursive. We recursively invoke the append
on the tail, the tail of the tail, and so on. While implementing, try to draw a schematic of what's going on step-by-step for each tail creation. This will give a clearer picture.
There is a price to pay for the immutability. If we want to append an element to this list, we will need to update the tail of the last LinkedList
node, which breaks the immutability. Therefore, we need to re-create the last LinkedList
node, but then we realize that we need to update the next-to-last LinkedList
node by re-creating it as well and so on. It is essentially re-creating the whole new list with one more node. This is not very efficient. Fortunately, instead of appending an element to the end of this list, we can also prepend an element to the start of this list, which will be much more efficient.
Please implement the prepend
method in both the LinkedList
and Nil
classes.¶
Note: Both append
and prepend
return a new object without mutating the object passed to it (which is our linked list, the self
). In this problem, rather than imagining append
and prepend
as operations done on the original linked list, we imagine append
and prepend
as pure functions that map some input (original linked list) to an output (a linked list extended by a new node).
Part B [5 points]:¶
We want to be able to conveniently apply a function to all elements in this list. Please implement the for_each
method (in both LinkedList
and Nil
), which takes a function fun
as an argument, and returns a LinkedList
(or a Nil
) in which the element e
in the list is mapped to fun(e)
. For example:
l = Nil().prepend(1).prepend(2).prepend(3).prepend(4)
def square(x):
return x**2
print(l)
print(l.for_each(square))
gives
4 -> 3 -> 2 -> 1 -> Nil
16 -> 9 -> 4 -> 1 -> Nil
Note: Again, in the above example, for_each
is not modifying the values in the LinkedList
l
. It is returning a new LinkedList
.
Part C [10 points]:¶
If we look at the logic of summation
and minimum
, we will notice that they are quite similar. Both are first reducing the tail of the list into an intermediate result and then combining the head with the intermediate result. The difference between summation
and minimum
lies in what this combining function is. For summation
, two elements are combined by the function:
def combine(a, b):
return a+b
(We don't really have this function in our code, but its logic is used in summation
.)
For minimum
, two elements are combined by the hypothetical function
def combine(a, b):
return a if a < b else b
(We don't really have this function in our code either, but its logic is used in minimum
.)
Ideally, we should be able to create any combine
function and apply it to the linked list.
Please implement the reduce_right
method¶
reduce_right
requires a function fun
as an argument. reduce_right
will first reduce the tail of the list into an intermediate result by applying fun
to the tail nodes. Then, fun
should be applied to the head and the intermediate result to combine them. For example:
l = Nil().prepend(1).prepend(2).prepend(3).prepend(4)
def smaller(a, b): # our "combine" function
return a if a < b else b
print(l)
print(l.reduce_right(smaller))
gives
4 -> 3 -> 2 -> 1 -> Nil
1
Part D [10 points]:¶
For this problem, copy the following into longest_sentence.py
:
def get_list_of_sentences(chapter1='swansway-chapter1.txt'):
def to_sentences(p):
for delimiter in '.?!': p = p.replace(delimiter, '|')
return [s.strip('\" ') for s in p.split('|')]
with open(chapter1, 'r', encoding='UTF-8') as f:
paragraphs = f.readlines()
sentences = [s for p in paragraphs for s in to_sentences(p) if len(s) > 1]
list_of_sentences = Nil()
for s in sentences[::-1]:
list_of_sentences = list_of_sentences.prepend(s)
return list_of_sentences
def longest_sentence():
list_of_sentences = get_list_of_sentences()
pass # TODO
Marcel Proust is known for writing long sentences. The swansway-chapter1.txt
contains the first chapter of Remembrance of Things Past. The function get_list_of_sentences
will preprocess this file and return a LinkedList
of all sentences in this chapter.
Your task is to implement longest_sentence
which returns the number of words in the longest sentence.¶
You can assume that anything delimited by a space is a word.
Note: get_list_of_sentences
reads text from swansway-chapter1.txt
. You should have the text file in your HW5-final/
directory.
Hint: Use the for_each
and reduce_right
methods that you implemented.
Deliverables¶
linked_list.py
longest_sentence.py
Problem 3: Binary Search Tree¶
A binary search tree (BST) is a binary tree in which:
- The left subtree of a node contains only nodes with keys less than the node’s key.
- The right subtree of a node contains only nodes with keys greater than the node’s key.
- The left and right subtree each must also be a binary search tree.
The code below is a partially implemented binary search tree. BSTNode
is a node in the binary search tree. It contains a key-value pair and has references to its child nodes. BSTTable
is the binary search tree itself. It holds a reference to the root node of the tree and contains methods that manipulate the nodes. Each child of the root node and its child should be a BSTNode
. A BSTTable
can be used as a table to store key-value pairs. Our implementation will only be able to handle the case when the keys are comparable to each other. Simple client code of BSTTable
looks like this:
greektoroman = BSTTable()
greektoroman.put('Athena', 'Minerva')
greektoroman.put('Eros', 'Cupid')
greektoroman.put('Aphrodite', 'Venus')
greektoroman.get('Eros')
and will return
'Cupid'
In this problem, we will implement the get
and put
method of BSTTable
.
We start by copying the template code below into the file binary_search_tree.py
.
Hint: You can print
a BSTTable
or BSTNode
to see its structure. It's not pretty, but should be enough for debugging purposes. If you get lost, please try to draw the schematic on a paper step-by-step for each node creation and try to understand what your implemented functions do. For example,
print(greektoroman)
The above statement will print out:
BSTNode(Athena, Minerva)
|
|-(L)->BSTNode(Aphrodite, Venus)
| |
| |-(L)->None
| |
| |-(R)->None
|
|-(R)->BSTNode(Eros,Cupid)
| |
| |-(L)->None
| |
| |-(R)->None
Note: You may assume that the keys are unique.
class BSTNode:
def __init__(self, key, val):
self.key, self.val = key, val
self.left = None
self.right = None
self.size = 1
def __str__(self):
return f'BSTNode({self.key}, {self.val})' + \
'\n|\n|-(L)->' + '\n| '.join(str(self.left ).split('\n')) + \
'\n|\n|-(R)->' + '\n| '.join(str(self.right).split('\n'))
class BSTTable:
def __init__(self):
self._root = None
def __str__(self):
return str(self._root)
def __len__(self):
return self._size(self._root)
def put(self, key, val):
self._root = self._put(self._root, key, val)
def get(self, key):
return self._get(self._root, key)
def _put(self, node, key, val):
pass # TODO
def _get(self, node, key):
pass # TODO
@staticmethod
def _size(node):
return node.size if node else 0
Part A [15 points]:¶
The put
method inserts a key-value pair into the BSTTable
. It delegates the function call to _put
. Please implement the non-public method _put
. It takes a node
, a key
, and a val
as arguments. node
is the root of the subtree into which the key-value pair is to be inserted. After the insertion, the root of this subtree may change. _put
should return the new root of this subtree. The difference of the input subtree and the returned subtree is that in the returned subtree, a new key-value pair has been inserted. If the key was already in the subtree, then we update the value of the corresponding node.
Note: We assume all the keys are of the same type and are comparable to each other with >
, <
and ==
. So if all your keys are strings (e.g. greektoroman
), they will be ordered lexicographically. If all your keys are numbers, they will be ordered from the smallest to the largest.
Hints:
- This is recursive.
- Think about the base case: what should be returned if the
node
passed in isNone
? - Which subtree (left/right) should we insert if the
node
is notNone
? - Remember to update the
size
of the nodes.- The
size
attribute of eachBSTNode
is the size of this subtree, i.e. the number of nodes including itself and all its descendants.
- The
Part B [15 points]:¶
The get
method returns the value with the corresponding key. It delegates the function call to _get
. Please implement the non-public method _get
. It takes a node
and a key
as an argument. It will search for a node with the same key under the subtree whose root is node
. If there is such a node, return the val
of this node. _get
should raise a KeyError
if no such node is found.
Deliverables¶
binary_search_tree.py