Reading 5: Classes and Interface Design #

Every object in Python has a type, such as int or list, that determines what you can do with that object. The types that you have learned so far in this course have been useful for a variety of programming tasks - you can store text as a str, count the frequency of each character in the text using a dict, or find the unique characters used in the text using a set.

In this reading, we will discuss how you can define your own types through classes. By using classes, you can create objects that are well-suited to the programming task at hand. These objects essentially bundle data along with functions to read, process, or modify this data for a particular purpose. We will also discuss how to design a good interface for a class, that is, the set of functions that the class provides.

Classes #

A class is essentially a collection of data and functions designed to be used for a specific purpose. Classes are used to define new types, and once you have declared a class, you can create new objects of that class called instances in your code.

Because the distinction between a class and an instance can be confusing, we will start by discussing this in more detail before covering how to define a class.

A class is a type, and an instance is an object #

Suppose you create an empty set object like this:

words = set()

After this line executes, words is a set, and specifically, it is an instance of a set. Its type is set, but words is not equal to the set type itself (words == set is False). If you want to check that words is of the set type, you use isinstance(words, set), checking that words is an instance of the set type.
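As a quick check of this distinction, the following minimal sketch compares an instance with its type using only built-in functions. The variable names other than words are chosen here for illustration.

```python
words = set()

# words is an instance of set, so both of these checks succeed.
is_set_instance = isinstance(words, set)
type_is_set = type(words) is set

# The instance is not the same thing as the type itself.
equals_type = words == set

print(is_set_instance, type_is_set, equals_type)  # True True False
```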

In the context of classes, this distinction is important, because some data can belong to the class as a whole while other data can belong to a specific instance of that class.

Classes have their own syntax and style #

Here is an (incomplete) example class that we will use throughout this section to illustrate the various features of classes:

class PlayerSpaceship:
    
    model = "Anscombe 4X"
    mass = 1000  # kg
    engine_span = 10  # m
    
    def __init__(self, pilot):
        self.pilot = pilot
        self.fuel = 100  # Liters
        self.angle = 0  # Radians
        self._left_engine_thrust = 0  # Newtons
        self._right_engine_thrust = 0  # Newtons
        
    def __repr__(self):
        return f"({self.pilot}, {self.fuel}, {self.angle}, " \
               f"{self._left_engine_thrust}, {self._right_engine_thrust})"
        
    def set_left_thrust(self, thrust):
        self._left_engine_thrust = thrust
        
    def set_right_thrust(self, thrust):
        self._right_engine_thrust = thrust
        
    def total_thrust(self):
        return self._left_engine_thrust + self._right_engine_thrust
    
    def rotational_velocity(self):
        clockwise_thrust = self._right_engine_thrust - self._left_engine_thrust
        return clockwise_thrust * self.engine_span / 2

You declare a class using the class keyword, followed by the name of the class. The body of the class, like the body of a function, is indented by four spaces. A class can contain variables and functions.

Unlike most variable names, class names should begin with a capital letter, and if the name consists of multiple words, it should be written in capitalized camel case (PlayerSpaceship) rather than snake case (player_spaceship).

Within classes, function definitions are separated by one blank line instead of two. Class definitions should be separated from one another by two blank lines. Automated style checkers such as pycodestyle will check for this spacing.

self refers to an instance of a class #

Except in a few rare cases that we will not discuss here, a function defined within a class is designed to be used by an instance of that class. This kind of function is called a method. Every method has to take self as its first parameter, which refers to the specific instance of the class. Data belonging to an instance of a class can be set or read using self as well. You can see this in the PlayerSpaceship class:

class PlayerSpaceship:
    
    # Only the relevant part of this class is shown for this example.

    def set_right_thrust(self, thrust):
        self._right_engine_thrust = thrust
        
    def total_thrust(self):
        return self._left_engine_thrust + self._right_engine_thrust

In this class, each PlayerSpaceship instance has its own engine thrust values. The set_right_thrust method takes self as its first parameter, and the reference to self._right_engine_thrust in this method refers to a variable called _right_engine_thrust that belongs to that particular instance. This kind of variable is called an attribute of the instance.

The self parameter does not appear when actually calling the method, though. If spaceship were a PlayerSpaceship instance, you would set its right engine thrust like this:

spaceship.set_right_thrust(50)

You can read an instance attribute with the same dot notation, such as spaceship.pilot, so for example, if you wanted to print the pilot’s name for this spaceship, you could call print(spaceship.pilot).
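To make the role of self concrete, here is a small sketch using a trimmed-down version of the class (shortened for illustration, not the full definition). It suggests that calling a method on an instance is equivalent to calling the function through the class and passing the instance explicitly as self.

```python
class PlayerSpaceship:
    # Trimmed-down version of the class, for illustration only.

    def __init__(self, pilot):
        self.pilot = pilot
        self._right_engine_thrust = 0  # Newtons

    def set_right_thrust(self, thrust):
        self._right_engine_thrust = thrust


spaceship = PlayerSpaceship("Major Tom")

# Python fills in self with spaceship automatically.
spaceship.set_right_thrust(50)

# Equivalent call with self passed explicitly (rarely written this way).
PlayerSpaceship.set_right_thrust(spaceship, 75)

print(spaceship.pilot)  # Major Tom
```

In everyday code you would use the first form. The second form is shown only to illustrate where the value of self comes from.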

Note that self is not available outside of methods, so you cannot do something like this:

# WARNING: This code will not work.
class InvalidSelf:
    self.name = ""
    # Define other things here.

As you see from the beginning of the PlayerSpaceship class, you can define variables without self:

class PlayerSpaceship:
    
    # Only the relevant part of this class is shown for this example.
    
    model = "Anscombe 4X"
    mass = 1000  # kg
    engine_span = 10  # m

However, this has a slightly different meaning, as we will explain later.

Use the class name to create an instance of a class #

Remember from Reading 2 that to create a new, empty set, you use the following syntax:

empty_set = set()

To create a new instance of a class, you can use a similar form. For the PlayerSpaceship class above, you could do the following:

spaceship = PlayerSpaceship("Major Tom")
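Each call to the class name produces a separate instance with its own attributes. The sketch below uses a trimmed-down version of the class, and the second pilot name is made up for illustration.

```python
class PlayerSpaceship:
    # Trimmed-down version of the class, for illustration only.

    def __init__(self, pilot):
        self.pilot = pilot
        self.fuel = 100  # Liters


ship_1 = PlayerSpaceship("Major Tom")
ship_2 = PlayerSpaceship("Ground Control")  # Hypothetical second pilot.

# Changing one instance leaves the other untouched.
ship_1.fuel = 60

print(ship_1.pilot, ship_1.fuel)  # Major Tom 60
print(ship_2.pilot, ship_2.fuel)  # Ground Control 100
```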

Use __init__ to set up a class instance #

The PlayerSpaceship class has a special method called __init__:

class PlayerSpaceship:
    
    # Only the relevant part of this class is shown for this example.
    
    def __init__(self, pilot):
        self.pilot = pilot
        self.fuel = 100  # Liters
        self.angle = 0  # Radians
        self._left_engine_thrust = 0  # Newtons
        self._right_engine_thrust = 0  # Newtons

You would still be able to create an instance of PlayerSpaceship without this method (by calling PlayerSpaceship() with no arguments). But if you do so, you may be surprised to learn that trying to access spaceship.pilot (where spaceship is a PlayerSpaceship instance) results in an error message:

AttributeError: 'PlayerSpaceship' object has no attribute 'pilot'

This is the case even for variables such as _left_engine_thrust that are set in other methods of PlayerSpaceship - trying to access the variable will result in an error until some method sets its value.

The __init__ method is usually not called directly, but executes automatically when creating a new class instance.
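The behavior described above can be reproduced with a small sketch. BareSpaceship is a hypothetical class written without an __init__ method purely to show when the attribute comes into existence.

```python
class BareSpaceship:
    # Hypothetical class with no __init__, for illustration only.

    def set_left_thrust(self, thrust):
        self._left_engine_thrust = thrust


spaceship = BareSpaceship()  # No arguments, since there is no __init__.

# Before any method sets the attribute, reading it raises AttributeError.
try:
    spaceship._left_engine_thrust
    had_attribute_before = True
except AttributeError:
    had_attribute_before = False

# Once a method assigns to self._left_engine_thrust, the attribute exists.
spaceship.set_left_thrust(10)
had_attribute_after = hasattr(spaceship, "_left_engine_thrust")

print(had_attribute_before, had_attribute_after)  # False True
```

Setting every attribute in __init__ avoids this situation, because each instance starts out with all of its attributes defined.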

Use __repr__ to represent a class as a string #

The PlayerSpaceship class also has a special method called __repr__:

class PlayerSpaceship:
    
    # Only the relevant part of this class is shown for this example.

    def __repr__(self):
        return f"({self.pilot}, {self.fuel}, {self.angle}, " \
               f"{self._left_engine_thrust}, {self._right_engine_thrust})"

Without this method, printing a PlayerSpaceship instance would produce a result like this:

<__main__.PlayerSpaceship object at 0x7efccdd24f10>

The __repr__ method allows you to define your own string representation of a class instance. It must take only self as a parameter and return a string. Ideally, the __repr__ method should return the information required to build a copy of the instance - in our case, making a copy of a PlayerSpaceship instance requires all of its attributes.

It is especially helpful to write a __repr__ method because it can help others as they use your code. If someone else is using your class and wants to print out an instance for debugging purposes, a well-implemented __repr__ method will be useful indeed.

Like the __init__ method, a class’s __repr__ method is usually not called directly. It is called automatically by repr, and also by print or str when the class does not define a separate __str__ method, as is the case here.
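As a minimal sketch of this behavior, consider the hypothetical Point class below, which is not part of the spaceship example. Its __repr__ returns text that looks like the call needed to rebuild the instance.

```python
class Point:
    # Hypothetical class, for illustration only.

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point({self.x}, {self.y})"


point = Point(1, 2)

print(point)  # Point(1, 2)

# str falls back to __repr__ because Point does not define __str__.
from_str = str(point)
from_repr = repr(point)
```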

Use a leading underscore to indicate internal attributes or methods #

You may have noticed that the PlayerSpaceship class has attributes named _left_engine_thrust and _right_engine_thrust, with a leading underscore in the name of each. This is done to indicate that these variables should not be accessed outside of the class’s implementation.

In practice, nothing stops you from actually accessing and changing such an attribute, so the following code will execute:

spaceship = PlayerSpaceship("Major Tom")
spaceship._left_engine_thrust = 50

Thus, naming attributes or methods in this way serves mostly as a reminder to you and to those using your code, much like the use of _ as a variable name. That being said, some automated checkers in IDEs like VS Code will generate a warning if your code accesses an internal attribute or method outside of a class.

Classes can also have attributes #

At the beginning of the PlayerSpaceship definition, we defined some variables like this:

class PlayerSpaceship:
    
    # Only the relevant part of this class is shown for this example.
    
    model = "Anscombe 4X"
    mass = 1000  # kg
    engine_span = 10  # m

These are called class attributes and are shared among all instances of the PlayerSpaceship class. You can access these attributes either through the instance or through the class name itself, so if spaceship is an instance of PlayerSpaceship, both spaceship.mass and PlayerSpaceship.mass will be 1000.

You can change PlayerSpaceship.mass by assigning directly to it, but this will have surprising results:

PlayerSpaceship.mass = 42
print(spaceship.mass)  # This will print 42 instead of 1000

Perhaps even more surprisingly, this does not apply in the reverse direction: if you try to assign a new value to mass through spaceship, it does not carry through to PlayerSpaceship or to other instances of the class:

# Assume that PlayerSpaceship.mass is currently 1000 as originally defined.
spaceship.mass = 42
print(PlayerSpaceship.mass)  # This will still print 1000

This is because the statement spaceship.mass = 42 creates a new instance attribute and sets its value to 42 instead. If an instance attribute and class attribute have the same name, the more specific one (the instance attribute) takes precedence.
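One way to see this shadowing is to inspect each instance's own attributes with the built-in vars function. The sketch below uses a trimmed-down class containing only the mass class attribute.

```python
class PlayerSpaceship:
    # Trimmed-down version of the class, for illustration only.
    mass = 1000  # kg


ship_1 = PlayerSpaceship()
ship_2 = PlayerSpaceship()

# Assigning through an instance creates an instance attribute on ship_1 only.
ship_1.mass = 42

print(vars(ship_1))          # {'mass': 42}
print(vars(ship_2))          # {}
print(ship_2.mass)           # 1000, looked up on the class
print(PlayerSpaceship.mass)  # 1000

# Deleting the instance attribute uncovers the class attribute again.
del ship_1.mass
print(ship_1.mass)           # 1000
```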

Be careful with class attributes! When in doubt, always change a class attribute's value through the class name, but in general, you should avoid changing class attributes at all if possible.

Classes should have docstrings #

Like functions, classes should have docstrings. The format is nearly identical to that of functions, except that you should list the attributes of the class:

class PlayerSpaceship:
    """
    Player-controlled spaceship used for all levels.
    
    Attributes:
        model: A string representing the spaceship model name.
        mass: An int representing the mass of the spaceship in kilograms.
        engine_span: An int representing the distance in meters between the
            spaceship's left and right engines.
        pilot: A string representing the name of the spaceship pilot.
        fuel: An int representing the fuel level of the spaceship.
        angle: A float representing the current angle of the spaceship in
            radians, with 0 being the spaceship nose pointing right.
    """
    
    # Only the relevant part of this class is shown for this example.

Internal attributes do not need to be listed in the docstring, but you should still document their type and what they represent in the class implementation.

Methods, like any other function, should have a docstring with the usual formatting.
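As a rough sketch of what method docstrings might look like, the example below assumes an Args/Returns layout for function docstrings. That layout is an assumption, since the exact format is not shown in this reading. Note that self is not listed among the arguments.

```python
class PlayerSpaceship:
    # Trimmed-down version of the class, for illustration only.

    def __init__(self, pilot):
        self.pilot = pilot
        self._left_engine_thrust = 0  # Newtons
        self._right_engine_thrust = 0  # Newtons

    def set_left_thrust(self, thrust):
        """
        Set the thrust of the left engine.

        Args:
            thrust: A number representing the new thrust in Newtons.
        """
        self._left_engine_thrust = thrust

    def total_thrust(self):
        """
        Compute the combined thrust of both engines.

        Returns:
            A number representing the total thrust in Newtons.
        """
        return self._left_engine_thrust + self._right_engine_thrust
```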

Interface Design #

So far, we have discussed several aspects of how you can design and implement classes, but we have not yet described how you should design and implement classes. As you design a class, it is particularly important to consider your class’s interface, that is, the set of public attributes and methods it offers for other code to interact with it.

Below, we describe a few guidelines that we encourage you to follow as you design your class’s interface.

Prefer private instance attributes in most cases #

In general, you should make instance attributes private, with one exception that we will describe below. By making an instance attribute private, you can have more confidence that it will be accessed and modified only in the way that you have implemented in the class methods.

As an example, suppose that you were writing a class to represent a character in a role playing game, like this:

class Character:
    
    # Only the relevant part of this class is shown for this example.
    
    def __init__(self, name):
        self.name = name
        self.max_health = 10
        self.health = self.max_health

While this might seem like a reasonable way to implement the class, the fact that all of the instance attributes are public can cause some issues. A minor problem is that the character’s name can be changed by any other function, including those outside of the Character implementation. A more serious problem is that the character’s health can be changed to beyond its maximum value.

To see an example of this, suppose that you can restore a character’s health by using a potion. However, if the potion simply does the following, a character at full health would now have a value that is technically not allowed:

character.health += 25  # This could result in a health greater than max_health

Because of this, it would be better to provide several functions that can increase, decrease, or view the character’s health, like this:

class Character:
    
    # Only the relevant part of this class is shown for this example.
    
    def __init__(self, name):
        self._name = name
        self._max_health = 10
        self._health = self._max_health
        
    def heal(self, amount):
        self._health = min(self._health + amount, self._max_health)
        
    def damage(self, amount):
        self._health = max(self._health - amount, 0)
        
    def get_health(self):
        return self._health

By using private instance attributes in this way, you can ensure that the “state” of your class instances always remains valid. Again, you cannot stop code from changing the values of these attributes, but a private instance attribute will make it a bit easier for you or a user of your code to realize that doing so is a mistake.
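To see how the methods keep the health value in range, here is a short usage sketch. The class is repeated so that the example runs on its own, and the character name is made up for illustration.

```python
class Character:
    # Same methods as described above, repeated so this example is runnable.

    def __init__(self, name):
        self._name = name
        self._max_health = 10
        self._health = self._max_health

    def heal(self, amount):
        self._health = min(self._health + amount, self._max_health)

    def damage(self, amount):
        self._health = max(self._health - amount, 0)

    def get_health(self):
        return self._health


hero = Character("Ayla")  # Hypothetical character name.

hero.heal(25)             # Already at full health, so nothing changes.
print(hero.get_health())  # 10

hero.damage(4)
print(hero.get_health())  # 6

hero.damage(100)          # Health stops at 0 rather than going negative.
print(hero.get_health())  # 0
```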

Use public instance attributes for data containers #

If your class is simply designed to “hold data” and not do anything with it, or if your class has no “invalid” set of instance attributes, it is fine to make the instance attributes public. For example, if you have a class designed to represent an object in a physics simulator, you might define it like this:

class Object2D:
    
    def __init__(self, x, y, velocity, label=""):
        self.x = x
        self.y = y
        self.velocity = velocity
        self.label = label

This class is essentially equivalent to a tuple of four elements, except that rather than accessing its elements like obj[0], you can do so using obj.x. This type of class is sometimes referred to as a “named tuple” for that reason. In this case, it is appropriate to define the instance attributes as public, since changing the attributes does not really affect the validity of the object.

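If you want to use a class like this, you may find it useful to use the collections.namedtuple function in the Python standard library, which creates such a class for you. Note, however, that instances of a class created with namedtuple are immutable, so their fields cannot be reassigned after creation.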
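As a minimal sketch, a data container similar to Object2D could be built with collections.namedtuple as shown below. The field values are made up for illustration, and the defaults argument requires Python 3.7 or later.

```python
from collections import namedtuple

# defaults applies to the rightmost fields, so only label is optional here.
Object2D = namedtuple("Object2D", ["x", "y", "velocity", "label"], defaults=[""])

ball = Object2D(1.0, 2.0, 0.5)

print(ball.x)   # 1.0
print(ball[0])  # 1.0, since a named tuple is still a tuple

# Named tuples are immutable, so "changing" a field builds a new instance.
moved = ball._replace(x=3.0)
print(moved.x)  # 3.0
print(ball.x)   # 1.0
```

If the fields need to be reassigned in place, an ordinary class like the one above may be the better fit.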

Methods should access private instance attributes or methods #

If you write a function that does not access any of a class’s private instance attributes or methods, you should consider implementing it as a function outside of the class instead.

For example, in the Object2D example above, suppose that we want to determine whether two objects are colliding (that is, they have the same position) or not. You could make it a part of the class, like this:

class Object2D:
    
    def colliding(self, other):
        return self.x == other.x and self.y == other.y

However, the Object2D class has no private attributes, so it would make more sense to write a function outside of the Object2D class, like this:

def colliding(obj_1, obj_2):
    return obj_1.x == obj_2.x and obj_1.y == obj_2.y

While it may be difficult to see why these two functions are significantly different, the key lies in what each function is expected to access. A method implemented as part of the Object2D class is free to use any method or attribute of the class, including private methods and attributes. By contrast, an external function should, by convention, use only those methods or attributes that are public, reducing the risk that the function will accidentally access a private method or attribute and thereby place a class instance in an invalid state.

Avoid exposing implementation details #

Finally, you should avoid exposing the internal details of a class’s private instance attributes in the interface of the class. Suppose that you create a class that keeps a list of integers, like this:

class AccountChecker:
    
    # Only the relevant part of this class is shown for this example.

    def __init__(self):
        self.accounts_to_check = []

Now suppose that accounts_to_check contains a series of integers representing the ID numbers of the accounts to check, and that accounts need to be checked in order of increasing ID.

If accounts_to_check is not kept in sorted order, then you would have to repeatedly find the minimum element of accounts_to_check, check it, and then remove it. If you did not make this a part of the AccountChecker class, you would have to write this code in a function external to the class.

However, this also means that if later you decided to change accounts_to_check to a different class that you had designed, you would likely have to change every program that accesses it. A better design would be to make accounts_to_check a private attribute and create a method to calculate the next account to check, like this:

class AccountChecker:
    
    # Only the relevant part of this class is shown for this example.

    def __init__(self):
        self._accounts_to_check = []
        
    def next_account(self):
        return min(self._accounts_to_check)

In this way, if you ever change the type of _accounts_to_check, you can simply change the implementation of next_account and existing code that uses AccountChecker will continue to operate as is.
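To illustrate, here is one possible alternative implementation that keeps the same next_account interface but stores the IDs in a min-heap using the standard heapq module. This is only a sketch: add_account is a hypothetical method introduced so that the example can be run, and the account IDs are made up.

```python
import heapq


class AccountChecker:
    # Sketch of an alternative implementation, for illustration only.

    def __init__(self):
        self._accounts_to_check = []  # Maintained as a min-heap.

    def add_account(self, account_id):
        # Hypothetical method, not part of the class shown above.
        heapq.heappush(self._accounts_to_check, account_id)

    def next_account(self):
        # The smallest element of a min-heap is always at index 0.
        return self._accounts_to_check[0]


checker = AccountChecker()
for account_id in [42, 7, 19]:
    checker.add_account(account_id)

print(checker.next_account())  # 7
```

Code that calls next_account does not need to know whether the IDs are stored in a plain list or a heap, which is the point of keeping the attribute private.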