We will be using an autograder for this homework. Please don't change the specified function signature.
Problem 0: Homework Workflow¶
Once you receive HW4 feedback, you will need to merge your HW4-dev branch into master.
You will earn points for following all stages of the git workflow which involves:
- 3pts for merging
HW4-devintomaster - 5pts for completing HW5 on
HW5-dev - 2pts for making a PR on
HW5-devto merge intomaster
Problem 1: @property Decorator¶
Suppose we have a simple Animal class, which is initialized by its species:
class Animal:
def __init__(self, name, species):
self.name = name
self.species = species
def __repr__(self):
return f'{self.name} ({self.species})'
We have clients who build applications on top of our Animal class and they want to directly modify the species attribute. For example, they may implement functions to transform animals.
def transform(animal, into):
animal.species = into
return animal
However, one day it occurred to us that our Animal class is designed to be used in an H. C. Andersen universe. We really don't want people to make an Animal instance into a Cthulhu [Cthulhu is not part of the H. C. Anderson universe]. How should we enforce this restriction? The obvious solution is to create a method to set the species, which checks if the new species is valid inside the method:
class Animal:
# a class attribute of the valid species in our universe
valid_species = {
'cat',
'dog',
'duck',
'elf',
'goblin',
'horse',
'human',
'mermaid',
'nightingale',
'pig',
'swan',
'wolf'
}
def __init__(self, species):
self.species = species
def __repr__(self):
return f'{self.name} ({self.species})'
def set_species(self, into):
assert into in Animal.valid_species, Exception(f'invalid species: {into}')
self.species = into
There are two problems with this implementation:
- If our clients don't know they have to change their implementation of
transform(useset_speciesinstead of directly assign thespeciesattribute), people can still do things likea = transform(ugly_duckling, into='Cthulhu') - We really don't want to bother our clients to change their code because they may have used
animal.species = somethinghundreds of times throughout their code.
Fortunately, we have the property decorator. Read more about it here.
For this problem, you will utilize the property decorator to limit what species users can set their Animal object to. Please copy the following code into the file animal.py (please create the file and copy below code).
class Animal:
# a class attribute of the valid species in our universe
valid_species = {
'cat',
'dog',
'duck',
'elf',
'goblin',
'horse',
'human',
'mermaid',
'nightingale',
'pig',
'swan',
'wolf'
}
def __init__(self, name, species):
self.name = name
self._species = species
def __repr__(self):
return f'{self.name} ({self._species})'
Part A [10 points]:¶
We have also now designated the species attribute as "non-user-facing" by calling it _species because we do not want the user to access it directly. However, this presents a problem because in our first version of the Animal class, a client could access the species through the species attribute. You can check that the attributes of the Animal class are name and _species using the following steps (on Python shell):
>>> dog = Animal('Fido', 'dog')
>>> vars(dog)
{'name': 'Fido', '_species' : 'dog'}
To prevent the client's previous code from breaking, modify the Animal class in animal.py by adding a species(self) method to the Animal class. The species method should return the _species attribute of this Animal. Apply the property decorator to this method.
If this is done correctly, then a client should still be able to retrieve the species of an Animal object as follows in the Python shell (you should use print() in an editor to print out):
>>> dog = Animal('Fido', 'dog')
>>> dog.species
'dog'
Part B [10 points]:¶
What happens when the client tries to set the value of the species attribute as in the following:
>>> dog.species = 'cat'
You don't need to write the answer to this question, but you should test it.
Thus to prevent the client's previous code from breaking, we need to create another method that:
- Allows the client to set the species of an
Animalobject using.species - And checks that the species attribute is set to a species in our
valid_specieslist.
Add another method species(self, into) to the Animal class. This method should set the _species attribute of this Animal to into, but before that, it makes sure that into is in the valid_species list. If into is not in the valid_species list, raise an Exception. Apply the species.setter decorator to this method.
If this is done correctly, then a client should still be able to set the species of an Animal object (within the restricted list) as follows, in a Python shell (in an editor, you should use print() to print out):
>>> dog = Animal('Fido', 'dog')
>>> dog.species
'dog'
>>> dog.species = 'cat'
>>> dog.species
'cat'
>>> dog.species = 'TheThing'
...
AssertionError: invalid species: TheThing
Do our clients need to change the implementation of their transform function now?
Nope, and hopefully they give you a bonus!
Deliverables¶
animal.py
Problem 2: Linked List¶
A singly linked list can be implemented by nodes. Each node has 2 properties: a head and a tail. The head is the element held by the first node in the list. The tail is the rest of the list, which is a reference to the next node.
The code below is a partially implemented singly linked list. It consists of 2 classes: LinkedList is a node that holds an element and Nil is a special node that indicates the end of the linked list. In this problem, you will finish the implementation of these two classes. To name a node LinkedList is a bit confusing, but as we will see, although the LinkedList is implemented as a node, it behaves as if it is a whole list.
We start by copying the following template code into the file linked_list.py (please create the file and copy the template code).
Hints: Even though this problem seems long, the first 3 parts are solvable with a one-line statement and Part D is only slightly longer (depending your implementation choices). Also, you don't need any loops in this problem (you will not lose points if you do so, but you should find it quite unnatural to use loops here). Also please review the given dunder methods for Nill() class.
class LinkedList:
def __init__(self, head, tail):
assert isinstance(tail, LinkedList) or isinstance(tail, Nil), TypeError(
'tail should either be a LinkedList or a Nil')
self._head, self._tail = head, tail
@property
def head(self):
return self._head
@property
def tail(self):
return self._tail
def __str__(self):
return str(self._head) + ' -> ' + str(self._tail)
def __repr__(self):
return f'LinkedList({repr(self._head)}, {repr(self._tail)})'
def __len__(self):
return 1 + len(self._tail)
def __getitem__(self, i):
return self._head if i == 0 else self._tail[i-1]
def prepend(self, val):
pass # TODO
def append(self, val):
pass # TODO
def for_each(self, fun):
pass # TODO
def summation(self):
return self._head + self._tail.summation() if self._tail else self._head
def minimum(self):
def smaller(a, b):
return a if a < b else b
return smaller(self._head, self._tail.minimum()) if self._tail else self._head
def reduce_right(self, fun):
pass # TODO
class Nil():
def __str__(self):
return 'Nil'
def __repr__(self):
return 'Nil()'
def __len__(self):
return 0
def __getitem__(self, i):
raise IndexError('index out of range')
def __bool__(self):
return False
def prepend(self, val):
pass # TODO
def append(self, val):
return LinkedList(val, Nil())
def for_each(self, fun):
pass # TODO
Part A [15 points]:¶
In this part, we want to make our linked list immutable and we are not updating its head or tail after it's been created.
Please implement the append method.¶
This method appends a node holding the object val to a LinkedList and returns the LinkedList.
Hint: It's recursive. We recursively invoke the append on the tail, the tail of the tail, and so on. While implementing, try to draw a schematic of what's going on step-by-step for each tail creation. This will give a clearer picture.
There is a price to pay for the immutability. If we want to append an element to this list, we will need to update the tail of the last LinkedList node, which breaks the immutability. Therefore, we need to re-create the last LinkedList node, but then we realize that we need to update the next-to-last LinkedList node by re-creating it as well and so on. It is essentially re-creating the whole new list with one more node. This is not very efficient. Fortunately, instead of appending an element to the end of this list, we can also prepend an element to the start of this list, which will be much more efficient.
Please implement the prepend method in both the LinkedList and Nil classes.¶
Note: Both append and prepend return a new object without mutating the object passed to it (which is our linked list, the self). In this problem, rather than imagining append and prepend as operations done on the original linked list, we imagine append and prepend as pure functions that map some input (original linked list) to an output (a linked list extended by a new node).
Part B [5 points]:¶
We want to be able to conveniently apply a function to all elements in this list. Please implement the for_each method (in both LinkedList and Nil), which takes a function fun as an argument, and returns a LinkedList (or a Nil) in which the element e in the list is mapped to fun(e). For example:
l = Nil().prepend(1).prepend(2).prepend(3).prepend(4)
def square(x):
return x**2
print(l)
print(l.for_each(square))
gives
4 -> 3 -> 2 -> 1 -> Nil
16 -> 9 -> 4 -> 1 -> Nil
Note: Again, in the above example, for_each is not modifying the values in the LinkedList l. It is returning a new LinkedList.
Part C [10 points]:¶
If we look at the logic of summation and minimum, we will notice that they are quite similar. Both are first reducing the tail of the list into an intermediate result and then combining the head with the intermediate result. The difference between summation and minimum lies in what this combining function is. For summation, two elements are combined by the function:
def combine(a, b):
return a+b
(We don't really have this function in our code, but its logic is used in summation.)
For minimum, two elements are combined by the hypothetical function
def combine(a, b):
return a if a < b else b
(We don't really have this function in our code either, but its logic is used in minimum.)
Ideally, we should be able to create any combine function and apply it to the linked list.
Please implement the reduce_right method¶
reduce_right requires a function fun as an argument. reduce_right will first reduce the tail of the list into an intermediate result by applying fun to the tail nodes. Then, fun should be applied to the head and the intermediate result to combine them. For example:
l = Nil().prepend(1).prepend(2).prepend(3).prepend(4)
def smaller(a, b): # our "combine" function
return a if a < b else b
print(l)
print(l.reduce_right(smaller))
gives
4 -> 3 -> 2 -> 1 -> Nil
1
Part D [10 points]:¶
For this problem, copy the following into longest_sentence.py:
def get_list_of_sentences(chapter1='swansway-chapter1.txt'):
def to_sentences(p):
for delimiter in '.?!': p = p.replace(delimiter, '|')
return [s.strip('\" ') for s in p.split('|')]
with open(chapter1, 'r', encoding='UTF-8') as f:
paragraphs = f.readlines()
sentences = [s for p in paragraphs for s in to_sentences(p) if len(s) > 1]
list_of_sentences = Nil()
for s in sentences[::-1]:
list_of_sentences = list_of_sentences.prepend(s)
return list_of_sentences
def longest_sentence():
list_of_sentences = get_list_of_sentences()
pass # TODO
Marcel Proust is known for writing long sentences. The swansway-chapter1.txt contains the first chapter of Remembrance of Things Past. The function get_list_of_sentences will preprocess this file and return a LinkedList of all sentences in this chapter.
Your task is to implement longest_sentence which returns the number of words in the longest sentence.¶
You can assume that anything delimited by a space is a word.
Note: get_list_of_sentences reads text from swansway-chapter1.txt. You should have the text file in your HW5-final/ directory.
Hint: Use the for_each and reduce_right methods that you implemented.
Deliverables¶
linked_list.pylongest_sentence.py
Problem 3: Binary Search Tree¶
A binary search tree (BST) is a binary tree in which:
- The left subtree of a node contains only nodes with keys less than the node’s key.
- The right subtree of a node contains only nodes with keys greater than the node’s key.
- The left and right subtree each must also be a binary search tree.
The code below is a partially implemented binary search tree. BSTNode is a node in the binary search tree. It contains a key-value pair and has references to its child nodes. BSTTable is the binary search tree itself. It holds a reference to the root node of the tree and contains methods that manipulate the nodes. Each child of the root node and its child should be a BSTNode. A BSTTable can be used as a table to store key-value pairs. Our implementation will only be able to handle the case when the keys are comparable to each other. Simple client code of BSTTable looks like this:
greektoroman = BSTTable()
greektoroman.put('Athena', 'Minerva')
greektoroman.put('Eros', 'Cupid')
greektoroman.put('Aphrodite', 'Venus')
greektoroman.get('Eros')
and will return
'Cupid'
In this problem, we will implement the get and put method of BSTTable.
We start by copying the template code below into the file binary_search_tree.py.
Hint: You can print a BSTTable or BSTNode to see its structure. It's not pretty, but should be enough for debugging purposes. If you get lost, please try to draw the schematic on a paper step-by-step for each node creation and try to understand what your implemented functions do. For example,
print(greektoroman)
The above statement will print out:
BSTNode(Athena, Minerva)
|
|-(L)->BSTNode(Aphrodite, Venus)
| |
| |-(L)->None
| |
| |-(R)->None
|
|-(R)->BSTNode(Eros,Cupid)
| |
| |-(L)->None
| |
| |-(R)->None
Note: You may assume that the keys are unique.
class BSTNode:
def __init__(self, key, val):
self.key, self.val = key, val
self.left = None
self.right = None
self.size = 1
def __str__(self):
return f'BSTNode({self.key}, {self.val})' + \
'\n|\n|-(L)->' + '\n| '.join(str(self.left ).split('\n')) + \
'\n|\n|-(R)->' + '\n| '.join(str(self.right).split('\n'))
class BSTTable:
def __init__(self):
self._root = None
def __str__(self):
return str(self._root)
def __len__(self):
return self._size(self._root)
def put(self, key, val):
self._root = self._put(self._root, key, val)
def get(self, key):
return self._get(self._root, key)
def _put(self, node, key, val):
pass # TODO
def _get(self, node, key):
pass # TODO
@staticmethod
def _size(node):
return node.size if node else 0
Part A [15 points]:¶
The put method inserts a key-value pair into the BSTTable. It delegates the function call to _put. Please implement the non-public method _put. It takes a node, a key, and a val as arguments. node is the root of the subtree into which the key-value pair is to be inserted. After the insertion, the root of this subtree may change. _put should return the new root of this subtree. The difference of the input subtree and the returned subtree is that in the returned subtree, a new key-value pair has been inserted. If the key was already in the subtree, then we update the value of the corresponding node.
Note: We assume all the keys are of the same type and are comparable to each other with >, < and ==. So if all your keys are strings (e.g. greektoroman), they will be ordered lexicographically. If all your keys are numbers, they will be ordered from the smallest to the largest.
Hints:
- This is recursive.
- Think about the base case: what should be returned if the
nodepassed in isNone? - Which subtree (left/right) should we insert if the
nodeis notNone? - Remember to update the
sizeof the nodes.- The
sizeattribute of eachBSTNodeis the size of this subtree, i.e. the number of nodes including itself and all its descendants.
- The
Part B [15 points]:¶
The get method returns the value with the corresponding key. It delegates the function call to _get. Please implement the non-public method _get. It takes a node and a key as an argument. It will search for a node with the same key under the subtree whose root is node. If there is such a node, return the val of this node. _get should raise a KeyError if no such node is found.
Deliverables¶
binary_search_tree.py