Lecture 8

Object Oriented Programming IV

Tuesday, October 1st 2019

Last Time

  • Dunder methods
  • The Python Data Model

Today

  • Class methods, static methods, instance methods
  • Modules and packages

If we have time...

  • "Privacy" in Python
  • More details on Polymorphism

Building out our class: instances and classmethods

At this point, you should feel comfortable with classes, special methods, and the Python data model.

We will take a short excursion to enhance our classes using classmethods. We will also see staticmethods and regular instance methods.

A Favorite Example

In [1]:
class ComplexClass():
    def __init__(self, a, b):
        self.real = a
        self.imaginary = b
        
    @classmethod
    def make_complex(cls, a, b):
        return cls(a, b)
        
    def __repr__(self):
        class_name = type(self).__name__
        return "%s(real=%r, imaginary=%r)" % (class_name, self.real, self.imaginary)
        
    def __eq__(self, other):
        return (self.real == other.real) and (self.imaginary == other.imaginary)
In [2]:
c1 = ComplexClass(1,2)
c1
Out[2]:
ComplexClass(real=1, imaginary=2)

make_complex is a class method. Notice how its signature differs from that of __init__ above: its first parameter is cls (the class itself) rather than self. It acts as a factory that produces instances.

In [4]:
c2 = ComplexClass.make_complex(1,2)
c2
Out[4]:
ComplexClass(real=1, imaginary=2)
In [5]:
c1 == c2
Out[5]:
True

The take-away

  • A classmethod has access to the actual class, but not the instance of the class
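A minimal sketch of what "access to the class" means in practice. The subclass name Phasor is hypothetical and not part of the lecture's example. Because make_complex calls cls(...) rather than ComplexClass(...), a subclass inherits a factory that builds instances of the subclass.

```python
class ComplexClass:
    def __init__(self, a, b):
        self.real = a
        self.imaginary = b

    @classmethod
    def make_complex(cls, a, b):
        # cls is whichever class the method was called on
        return cls(a, b)


class Phasor(ComplexClass):  # hypothetical subclass, for illustration only
    pass


c = ComplexClass.make_complex(1, 2)
p = Phasor.make_complex(1, 2)

print(type(c).__name__)  # ComplexClass
print(type(p).__name__)  # Phasor
```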

Static Methods, Class Methods, Instance Methods

What's really going on under the hood here?

In [6]:
# From Fluent Python
class Demo():
    @classmethod
    def klassmeth(*args): # Class methods do not have to return an instance of the class
        return args
    
    @staticmethod
    def statmeth(*args): # This is just a regular function
        return args
    
    def instmeth(*args): # This is a true blue instance method
        return args
    
In [7]:
sm = Demo.statmeth(1,2)
print(type(sm))
sm
<class 'tuple'>
Out[7]:
(1, 2)
In [8]:
cm = Demo.klassmeth(1,2)
print(type(cm))
cm
<class 'tuple'>
Out[8]:
(__main__.Demo, 1, 2)
In [9]:
ademo = Demo()
Demo.instmeth(ademo, 1,2)
Out[9]:
(<__main__.Demo at 0x111affa90>, 1, 2)
In [10]:
ademo.instmeth(1,2)
Out[10]:
(<__main__.Demo at 0x111affa90>, 1, 2)
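One way to see what is going on is to inspect the attribute itself instead of calling it. The sketch below repeats the Demo class so that it runs on its own, and checks what (if anything) each kind of method is bound to through the __self__ attribute of bound methods.

```python
class Demo:
    @classmethod
    def klassmeth(*args):
        return args

    @staticmethod
    def statmeth(*args):
        return args

    def instmeth(*args):
        return args


ademo = Demo()

# A class method is bound to the class, even when reached through an instance.
print(Demo.klassmeth.__self__ is Demo)     # True
print(ademo.klassmeth.__self__ is Demo)    # True

# An instance method is bound to the instance it was looked up on.
print(ademo.instmeth.__self__ is ademo)    # True

# A static method is a plain function: nothing is bound to it.
print(hasattr(Demo.statmeth, "__self__"))  # False
print(ademo.statmeth(1, 2))                # (1, 2)
```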

Class variables and instance variables

In [11]:
class Demo2():
    classvar = 1
      
ademo2 = Demo2()
print(Demo2.classvar, ademo2.classvar)

ademo2.classvar = 2 # Different from the classvar above
print(Demo2.classvar, ademo2.classvar)
1 1
1 2
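To make the shadowing explicit, here is a small sketch that inspects the instance with vars. It repeats Demo2 so that it runs on its own; the extra instance named other is only there for illustration.

```python
class Demo2:
    classvar = 1


ademo2 = Demo2()
print(vars(ademo2))      # {}  (nothing is stored on the instance yet)

ademo2.classvar = 2      # creates an instance attribute that shadows the class one
print(vars(ademo2))      # {'classvar': 2}
print(Demo2.classvar)    # 1

Demo2.classvar = 10      # rebinding on the class is seen by instances without a shadow
other = Demo2()
print(other.classvar, ademo2.classvar)  # 10 2

del ademo2.classvar      # remove the shadow; lookup falls back to the class
print(ademo2.classvar)   # 10
```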

Practical Comments on class methods

  • Act as a factory to produce objects that are configured the way you want.
  • Pre-define commonly used objects.
  • These objects all still use the same constructor.
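As a sketch of these points, the class below adds two factories to a stripped-down ComplexClass. Both from_polar and imaginary_unit are hypothetical names invented for this illustration; each one ends up calling the same __init__.

```python
import math


class ComplexClass:
    def __init__(self, a, b):
        self.real = a
        self.imaginary = b

    @classmethod
    def from_polar(cls, r, theta):
        # hypothetical factory: configure the object from polar coordinates
        return cls(r * math.cos(theta), r * math.sin(theta))

    @classmethod
    def imaginary_unit(cls):
        # hypothetical factory: a pre-defined, commonly used object
        return cls(0, 1)

    def __repr__(self):
        class_name = type(self).__name__
        return "%s(real=%r, imaginary=%r)" % (class_name, self.real, self.imaginary)


print(ComplexClass.from_polar(2, 0))    # ComplexClass(real=2.0, imaginary=0.0)
print(ComplexClass.imaginary_unit())    # ComplexClass(real=0, imaginary=1)
```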

Practical Comments on static methods

  • Python doesn't need to instantiate a bound method for each object.
    • This avoids a small amount of overhead.
  • Might improve code readability.
    • You know right away that the method doesn't depend on the state.
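A short sketch of a static method used as a helper. The Temperature class and its method names are hypothetical; the point is only that c_to_f takes neither self nor cls, so a reader can tell at a glance that it does not touch any state.

```python
class Temperature:  # hypothetical class, for illustration only
    def __init__(self, celsius):
        self.celsius = celsius

    @staticmethod
    def c_to_f(celsius):
        # No self or cls: the result depends only on the argument.
        return celsius * 9 / 5 + 32

    def fahrenheit(self):
        # A static method can still be called through an instance.
        return self.c_to_f(self.celsius)


print(Temperature.c_to_f(100))       # 212.0
print(Temperature(0).fahrenheit())   # 32.0
```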

Code and Data for objects

In [12]:
class A():
    
    def __init__(self, x):
        self.x = x
        
    def doit(self, y):
        return self.x + y

Calling dir on a class lists the names of its attributes, including (recursively) the attributes of its base classes.

In [13]:
dir(A)
Out[13]:
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'doit']

vars on an object gets the contents of a special attribute called __dict__.

In [14]:
vars(A)
Out[14]:
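mappingproxy({'__module__': '__main__',
              '__init__': <function __main__.A.__init__(self, x)>,
              'doit': <function __main__.A.doit(self, y)>,
              '__dict__': <attribute '__dict__' of 'A' objects>,
              '__weakref__': <attribute '__weakref__' of 'A' objects>,
              '__doc__': None})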

Let's make an instance of A.

In [15]:
a = A(5)

dir again:

In [16]:
dir(a)
Out[16]:
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'doit',
 'x']

vars again:

In [17]:
vars(a)
Out[17]:
{'x': 5}

Under the hood, Python stores the attributes of an object in a table (in CPython, this is implemented in C).

This implementation allows us to look for attributes and methods, and if they are not found, to look elsewhere.

The exact details are complex, using descriptors and other lookups, and we'll tackle them in more detail later (hopefully).

For now, it suffices to know that lookup happens first in the instance table, then in the class table (where the methods live), and, if the name is still not found, further up the inheritance hierarchy.

In [18]:
A.__class__, a.__class__
Out[18]:
(type, __main__.A)
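The simplified lookup order (instance, then class, then base classes) can be sketched as follows. This version of A is given a hypothetical base class Base and a hypothetical attribute named where purely for illustration, and it ignores the descriptor details mentioned above.

```python
class Base:
    where = "Base"

    def only_in_base(self):
        return "found in Base"


class A(Base):
    where = "A"

    def __init__(self, x):
        self.x = x


a = A(5)

print(a.x)               # 5, found in the instance table
print(a.where)           # A, not on the instance, so found on the class
print(a.only_in_base())  # found in Base, further up the hierarchy

a.where = "instance"     # now the instance table has its own entry
print(a.where)           # instance

# The order in which classes are searched:
print([klass.__name__ for klass in A.__mro__])  # ['A', 'Base', 'object']
```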

Creating Packages from Python Code

Module Recap

  • Import a module with the import statement
    import mymod
    

Here's how Python searches for a module when it is imported:

  1. The interpreter searches for a built-in module with that name.
  2. If no built-in module exists with that name, then the interpreter searches for the name in the list of directories in the sys.path variable.
  3. If the requested name can't be found, an ImportError exception is raised (specifically its subclass ModuleNotFoundError in Python 3.6+).
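The three steps above can be poked at from the interpreter. This is only a sketch: the contents of sys.path depend on your installation, and no_such_module_for_demo is a made-up name that is assumed not to exist on your system.

```python
import sys

# Step 1: names of the modules compiled into the interpreter.
print("sys" in sys.builtin_module_names)  # True

# Step 2: the directories that are searched, in order (first few shown).
for entry in sys.path[:3]:
    print(repr(entry))

# Step 3: a failed search raises ImportError.
error_type = None
try:
    import no_such_module_for_demo  # hypothetical name, assumed not to exist
except ImportError as err:
    error_type = type(err).__name__
    print(error_type, "-", err)
```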

The Many Ways to Import

Suppose your module contains some functions called myf1, myf2, and so on.

There are a variety of ways to import the module and its functions. Here are a few along with their uses:

import mymod as new_name # rename mymod
new_name.myf1() # access the myf1() function in mymod via new_name
from mymod import myf1 # Just import myf1() from mymod
myf1() # Direct use
from mymod import myf1 as new_f # Import myf1 from mymod and rename
new_f() # Direct use
from mymod import * # Make all functions and objects in mymod directly accessible!
myf2()              # (Except for objects with leading underscores)

Comments on Importing

  • Generally a very bad idea to do from mymod import *. Can lead to name clashes!
  • from mymod import myf1 is also dangerous if you're not careful: it silently replaces any existing myf1 in your namespace.
  • Recommendation: Just do import mymod or import mymod as new_name unless you have a very good reason for doing otherwise.
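To see how a clash can happen, here is a sketch using two standard library modules that both define sqrt. Explicit from-imports are used so the rebinding is visible; with from math import * followed by from cmath import * the same thing happens without any name appearing in the source.

```python
import cmath
import math

from math import sqrt    # sqrt now refers to math.sqrt
from cmath import sqrt   # silently rebinds sqrt to cmath.sqrt

print(sqrt is cmath.sqrt)  # True
print(sqrt(-1))            # 1j (math.sqrt(-1) would raise ValueError instead)

# With qualified names there is no ambiguity about which one is meant.
print(math.sqrt(4))        # 2.0
print(cmath.sqrt(4))       # (2+0j)
```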

Where to put the import statements? A common convention is:

  • After the module's documentation.

What order to import libraries?

  • First import standard library modules.
  • Then import third-party library modules.
  • Then import your own modules.

Modules and Packages

  • So far, you made a toy module in your homework.
  • For larger projects, you will have multiple modules.
  • A collection of multiple modules is called a package.

Why multiple modules?

Having multiple modules helps with code organization.

physics_code/
             __init__.py  
             preprocessing/
                           __init__.py
                           parse_xml_inputs.py
                           parse_txt_inputs.py
                           ...
             solvers/
                     __init__.py
                     time_integrators.py
                     discretization.py
                     linear_solvers.py
                     ...
             postprocessing/
                            __init__.py
                            write_hdf5.py
                            write_txt.py
                            ...
                            stat_utils/
                                       __init__.py
                                       ...
                            viz/
                                __init__.py
                                line_plots.py
                                ...
             tests/
                   ...

What is __init__.py?

  • Used for package initialization-time actions.
  • Generates a module namespace for a directory.
    • In Python 3.3+, an __init__.py file is not strictly required: a directory without one can still be imported as a namespace package.
    • Still use it for package initialization.
  • Controls the from package import * behavior.
    • This is done using __all__ lists.
    • e.g. include the line __all__ = ["mod1", "mod2", ..., "modN"]
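The effect of __all__ can be sketched by building a throwaway package in a temporary directory. The names demo_pkg, mod1, and mod2 are hypothetical. The star import is run with exec into a separate dictionary only so that we can inspect exactly which names it created.

```python
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = os.path.join(tmp, "demo_pkg")  # hypothetical package name
    os.mkdir(pkg_dir)

    # The package lists only mod1 in __all__.
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write('__all__ = ["mod1"]\n')
    with open(os.path.join(pkg_dir, "mod1.py"), "w") as f:
        f.write("value = 1\n")
    with open(os.path.join(pkg_dir, "mod2.py"), "w") as f:
        f.write("value = 2\n")

    sys.path.insert(0, tmp)  # make the temporary package importable
    try:
        namespace = {}
        exec("from demo_pkg import *", namespace)
    finally:
        sys.path.remove(tmp)

# Ignore the __builtins__ entry that exec adds to the dictionary.
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)                 # ['mod1']
print(namespace["mod1"].value)  # 1
```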

More Practical Comments on __init__.py

  • Empty __init__.py files are no longer necessary.
  • They help prevent directories with common names from hiding true modules
  • The first time Python imports through a directory, it runs the code in __init__.py.

Working With Packages

  • Once you have your directory structure set up (with the __init__.py files), you are ready to use the package.
dir/
     driver.py
     package/
             __init__.py
             subdir1/
                     __init__.py
                     s1mod1.py
                     s1mod2.py
             subdir2/
                     __init__.py
                     s2mod1.py
                     s2mod2.py
# driver.py:  can make use of the package by simple imports.
# Note: the module name is used without the .py extension.
import package.subdir1.s1mod1 as s1mod1
s1mod1.method()
...

Notes on __name__

  • You may have seen the code snippet:
    if __name__ == "__main__":
      # Do some things
    
  • When a .py file is run as a script, the variable __name__ is set to the string "__main__".
  • However, when a module is imported, __name__ is set to the module's name.
  • Hence, if the module is imported rather than run as a Python script, the body of the if statement will not be executed.
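Both cases can be reproduced in one sketch. It writes a small file (the hypothetical name_demo.py) to a temporary directory, imports it as a module, and then executes it the way python name_demo.py would, using the standard library's runpy module.

```python
import importlib.util
import os
import runpy
import tempfile

source = (
    "seen_name = __name__\n"
    "ran_main_block = False\n"
    'if __name__ == "__main__":\n'
    "    ran_main_block = True\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "name_demo.py")  # hypothetical file name
    with open(path, "w") as f:
        f.write(source)

    # Case 1: import the file as a module called "name_demo".
    spec = importlib.util.spec_from_file_location("name_demo", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Case 2: execute the same file as if it were run as a script.
    script_globals = runpy.run_path(path, run_name="__main__")

print(module.seen_name, module.ran_main_block)
# name_demo False
print(script_globals["seen_name"], script_globals["ran_main_block"])
# __main__ True
```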

Additional Information

As with most things Python, you can simply consult the excellent documentation: Python Modules.

  • Absolute vs. Relative imports
  • Compiled Python files

Illustrative Example

Consider the following directory structure:

dir1/
     __init__.py
     dir2/
          __init__.py
          mymod.py

Here is what is in each file:

# dir1/__init__.py
print("Initializing dir1/")
# dir1/dir2/__init__.py
print("Initializing dir2/")
# dir1/dir2/mymod.py
my_name = "David"

Outputs

  • If I work from the command line in the directory that contains dir1, I can see various things happen.
>>> import dir1.dir2.mymod
Initializing dir1/
Initializing dir2/
>>> dir1.dir2.mymod.my_name
'David'
>>> import dir1.dir2.mymod as mod
>>> mod.my_name
'David'

Some Practical Comments

  • It's annoying to write all those paths manually.
  • You can import the functions and classes you want to expose inside __init__.py, and then access them directly from the package after a plain import.

Consider the directory structure:

example.py
dir1/
     __init__.py
     mymod.py

Now the import can be achieved with:

# dir1/__init__.py
from .mymod import myclass
from .mymod import myfunc
# example.py
import dir1
C = dir1.myclass()
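Here is a self-contained sketch of the same pattern. It builds the package in a temporary directory under the hypothetical name reexport_demo (to avoid clashing with any real dir1 on your machine), and the bodies of myclass and myfunc are placeholders.

```python
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = os.path.join(tmp, "reexport_demo")  # hypothetical package name
    os.mkdir(pkg_dir)

    with open(os.path.join(pkg_dir, "mymod.py"), "w") as f:
        f.write("class myclass:\n    pass\n\n\ndef myfunc():\n    return 'hello'\n")

    # __init__.py re-exports the names using relative imports.
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("from .mymod import myclass\nfrom .mymod import myfunc\n")

    sys.path.insert(0, tmp)  # make the temporary package importable
    try:
        import reexport_demo
    finally:
        sys.path.remove(tmp)

# No need to spell out reexport_demo.mymod.myclass.
C = reexport_demo.myclass()
print(reexport_demo.myfunc())  # hello
print(type(C).__module__)      # reexport_demo.mymod
```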

Creating and Distributing Packages

At this point, you know how to create packages in Python and the basics of how things fit together.

Ultimately, you want to be able to distribute your package to other people.

There are a number of ways to do this...brace yourself.

There Are So Many Options!

  • As you can see, you have many options on how to set up and distribute your package.
  • I will give you broad freedom in how you do this, but your project must be easily installable.

What does "easily installable" mean?

  • Using pip is great! This would be the easiest for the user.
  • You are also welcome to host your project on GitHub and have the user manually install and test with setup.py.
  • Either way, your package should be installable and the user should be able to run the tests.

Privacy in Python

  • Python does not have private names
  • It can "localize" some names in classes
  • This localization is handled by "name mangling"
  • Name mangling does not prevent access by code outside the class!
  • Name mangling is intended to help avoid namespace collisions

Therefore, we say that Python has the notion of pseudoprivate names.

Pseudoprivacy and Name Mangling

Names inside a class that begin with two underscores (and do not end with two underscores) are expanded to include the name of the enclosing class.

For example, suppose you have a class called Universes and a name in that class called __our_universe.

Python changes the name __our_universe to _Universes__our_universe.

Now if there is another class in the hierarchy that also defines an attribute named __our_universe, the two names will not clash, because each is expanded with the name of its own class.

If you know the name of the enclosing class, you can still access the "private" attributes.
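A minimal sketch of name mangling at work, using the Universes example from above. The subclass Multiverse, the method where, and the stored strings are hypothetical additions for illustration.

```python
class Universes:
    def __init__(self):
        self.__our_universe = "ours"   # stored as _Universes__our_universe

    def where(self):
        return self.__our_universe     # mangled the same way inside the class


class Multiverse(Universes):           # hypothetical subclass
    def __init__(self):
        super().__init__()
        self.__our_universe = "theirs"  # stored as _Multiverse__our_universe


m = Multiverse()

# Two separate attributes, so the subclass did not overwrite the base class's.
print(sorted(vars(m)))
# ['_Multiverse__our_universe', '_Universes__our_universe']
print(m.where())                      # ours

# The unmangled name does not exist on the instance...
print(hasattr(m, "__our_universe"))   # False

# ...but nothing prevents access if you know the enclosing class's name.
print(m._Universes__our_universe)     # ours
```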

Some details: Private Variables.

A note on single underscores:

a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member).