Reading 2: Cleaner Code, Data Structures, and Functions #
The first part of this reading introduces a few new features that will make it easier to write some types of code in Python. In Reading 1, we avoided introducing these to let you get used to a simpler syntax, but now that you have some experience, we will explain how to simplify some of your common coding patterns.
The rest of this reading focuses on data structures and functions, two important building blocks that are part of nearly every Python program in common use. In short, data structures are a way of packaging data, while functions are a way of packaging code. Both of these make it easier to reason about and write programs.
Writing Cleaner Code #
Arithmetic Assignment Operators #
You may have noticed that most while loops you write need to update some variable on each iteration. Something like:
>>> i = 3
>>> while i > 0:
...     print(i)
...     i = i - 1
Instead of writing i = i - 1, you can write i -= 1. In fact, you can do this with any arithmetic operator on a numeric variable:
>>> x = 10
>>> x += 5
>>> x
15
>>> x //= 3
>>> x
5
>>> x **= 2
>>> x
25
The += operator also works with strings:
>>> s = "sp"
>>> s += "am"
>>> s
'spam'
Type Conversion #
You may have found it a bit frustrating that you cannot combine a string and an integer: writing something like "I was born in " + 2000 results in an error rather than the string "I was born in 2000". You can display this using print("I was born in", 2000), but you may want the actual string.
One way to do this is with type conversion. By using the type name you want to
convert to and parentheses (()
), you can turn a variable into a different
type:
>>> str(2000)
'2000'
>>> int("123")
123
>>> float(42)
42.0
This will not work for all types we will see in this course, but can be done for
all of the types you have seen so far (int
, float
, bool
, and str
).
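As a minimal sketch of how type conversion solves the problem described above, the following builds the string with str and converts in the other direction with int. The variable names are illustrative and not part of the reading.

```python
birth_year = 2000

# Convert the integer to a string before concatenating.
sentence = "I was born in " + str(birth_year)
print(sentence)

# Converting the other way lets you do arithmetic on text.
year_text = "1999"
next_year = int(year_text) + 1
print(next_year)
```

Note that int only accepts strings that look like whole numbers; something like int("hello") raises a ValueError.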
String Formatting #
While you can use type conversion to make printing easier, there is a much cleaner way to print strings like the above. You can use string interpolation, which allows you to use and format the values of variables in a string. The newest and recommended way to do string interpolation is called the f-string, for a reason that is clear in the example below:
>>> birth_year = 2000
>>> print(f"I was born in {birth_year}")
I was born in 2000
As you can see, by writing an f
before the starting quote in the string, you
can use curly braces ({}
) to surround a Python expression and use its value in
the string. This does not have to be a variable name - you could write something
like {birth_year // 2}
if you wanted to.
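To illustrate using an expression inside the braces, here is a small sketch. The value of current_year is an assumption made up for the example.

```python
birth_year = 2000
current_year = 2024  # Assumed value, chosen only for illustration

# Any expression can go inside the braces, not just a variable name.
age_message = f"Age in {current_year}: {current_year - birth_year}"
print(age_message)
```

The expression is evaluated when the string is created, so changing current_year afterwards does not change age_message.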
As a historical note, older ways to do string interpolation include the following:
"I was born in {}".format(birth_year)
"I was born in %d" % birth_year
You may see these older formats on Q&A sites, but we recommend avoiding using them, as they are more verbose and generally have slightly worse performance than f-strings.
Break & Continue #
You may have wondered if there are ways to end a for or while loop early. For example, let’s say you have a string of text and want to print only the first sentence. You aren’t able to use a for loop, since you don’t know the index where you have to stop. Instead, you could use a while loop like this:
>>> text = "I am Sam. Sam I am. That Sam-I-am!"
>>> i = 0
>>> first_sentence = ""
>>> while text[i] != ".":
...     first_sentence += text[i]
...     i += 1
...
>>> print(first_sentence + text[i])
I am Sam.
However, with the use of break, we can use a for loop. The break keyword is used to end the current loop and go on to the next section of code. Rewriting the above code:
>>> text = "I am Sam. Sam I am. That Sam-I-am!"
>>> first_sentence = ""
>>> for character in text:
...     first_sentence += character
...     if character == ".":
...         break
...
>>> print(first_sentence)
I am Sam.
Similarly, you may want to skip only a certain cycle, or iteration, of a loop. The keyword continue allows you to do just that. For example, if we wanted to remove all spaces from a string:
>>> text = "I am Sam. Sam I am. That Sam-I-am!"
>>> no_spaces = ""
>>> for character in text:
...     if character == " ":
...         continue
...     no_spaces += character
...
>>> print(no_spaces)
IamSam.SamIam.ThatSam-I-am!
Checking For Existence #
Up until now, you’ve probably been checking to see if a string is empty like this:
empty_string = ""
if empty_string == "":
    print("String is empty.")
While this is a completely valid way to write this check, Python officially recommends simply using if (or if not) and the name of the variable, like this:
empty_string = ""
if not empty_string:
    print("String is empty.")
In practice, you will most likely use this fact to execute certain code only if a string is not empty:
if example_string:
    # Do stuff with example_string, and you know it isn't empty
A Type for Nothing #
Sometimes, you may want to have a value that represents nothing at all. For
example, suppose you have a service where users can send password-protected
messages to each other, and blank passwords (simply hitting Enter, essentially)
are allowed. To differentiate the case of a message having a blank password
(""
) from one having no password at all, you can use a special value called
None
.
Another common example is if you are looking at a sequence of integer values and
you need to keep the highest one. The integer values you see may all be
negative, so setting a default value of 0 may not always work. In this case, you
should set the initial maximum value to None
.
In this case, though, you will need to check that a variable is equal to None
.
The syntax for checking equality to None
is slightly different - rather than
using ==
or !=
, you use is
or is not
:
if max_value is None:
# Set the maximum value
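The maximum-value scenario described above can be sketched as follows. The values list is sample data invented for the example (lists are covered later in this reading), and this is only one way to structure the check.

```python
values = [-7, -2, -9]  # Sample data: every value is negative

max_value = None
for value in values:
    # The first value always becomes the maximum; later ones must beat it.
    if max_value is None or value > max_value:
        max_value = value

print(max_value)
```

Starting from 0 instead of None would leave max_value at 0 here, which is not one of the values in the sequence.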
Data Structures #
You can think of a data structure as a way of organizing data for a particular purpose. In the real world, you can organize the exact same data in different ways or for different purposes. For example, how you organize the names Alice, Bob, Charlie, and David depends on whether you are creating a directory (alphabetically), ranking them by the number of points scored in a game (where you also need to keep track of those points), or creating a guest list (where you only want to check whether they are on the list or not).
In Python, data structures are also types, which means that they define a set of things you can do with the data they contain. As you might guess, the type of data structure you use will depend on how you intend to use the data.
Below, we will describe a few common data structures, what you can do with them, and how they are commonly used in programs.
Lists #
A list represents a sequence of items. You can define a list with square
brackets ([]
) and items separated by commas (,
):
sample_list = [3, 1, 4, 1, 5, 9] # Digits of pi
empty_list = [] # This has no items in it
It is easier to reason about lists in which each item has the same type, but you can mix and match items of different types within a list.
Like with strings, you can get the length of a list with len
and get items
within the list using indexing or slicing:
>>> len(sample_list)
6
>>> sample_list[2]
4
>>> sample_list[:3]
[3, 1, 4]
Also as with strings, you can concatenate and multiply lists:
>>> group_1 = ["Alice", "Bob"]
>>> group_2 = ["Charlie", "David"]
>>> group_1 + group_2
['Alice', 'Bob', 'Charlie', 'David']
>>> group_1 * 2
['Alice', 'Bob', 'Alice', 'Bob']
You can also iterate through each item of a list with a for
loop:
for digit in sample_list:
    print(digit)
Unlike strings, lists can be modified. For example, you can assign to an individual element in the list:
>>> passengers = ["Alice", "Bob", "Charlie", "David"]
>>> passengers[2] = "Eleanor"
>>> passengers
['Alice', 'Bob', 'Eleanor', 'David']
You can also append to an existing list, which adds a single item to the end of the list.
>>> sample_list = [3, 1, 4, 1, 5, 9]
>>> sample_list.append(2)
>>> sample_list
[3, 1, 4, 1, 5, 9, 2]
This is different from adding [2]
to sample_list
, which does not change
sample_list
:
>>> sample_list + [2]
[3, 1, 4, 1, 5, 9, 2]
>>> sample_list
[3, 1, 4, 1, 5, 9]
Here is an example program that takes a list of numbers and sorts them into two new lists of positive numbers and negative numbers:
>>> number_list = [1, 3.2, -4, -0.5, 1, -10, 42, -1, -7]
>>> positives = []
>>> negatives = []
>>> for num in number_list:
...     if num > 0:
...         positives.append(num)
...     else:
...         negatives.append(num)
...
>>> print(positives)
[1, 3.2, 1, 42]
>>> print(negatives)
[-4, -0.5, -10, -1, -7]
Ranges #
It is quite common in Python programs to do something for each number from 0 to n. Rather than defining a long list of integers, you can do this using ranges. The typical way of using ranges is like this:
# This prints all numbers from 0 to 9, each on its own line.
for i in range(10):
    print(i)
You can use range
with one integer as shown above, with two integers (such as
range(1, 11)
), or with three integers (such as range(1, 12, 2)
). The results
are very similar to how string slicing works:
range(5) # Essentially [0, 1, 2, 3, 4]
range(1, 6) # Essentially [1, 2, 3, 4, 5]
range(1, 6, 2) # Essentially [1, 3, 5]
range(5, 0, -1) # Essentially [5, 4, 3, 2, 1]
We say that these ranges are essentially equivalent to lists because if you
iterate through them with a for
loop, the effect will be the same. But for
reasons that we will not go into here, ranges are not lists, and you cannot use
most of the list operators (such as +
or append
) with ranges. (You can use
len
to get the length of a range, though.)
You can also use ranges to do something a certain number of times. For example,
the following code prints Hello!
ten times in a row:
for _ in range(10):
    print("Hello!")
You may find it strange that in this for
loop, we use _
as the variable.
Remember that _
is a valid variable name, and in most Python programs, it is
used for a variable whose value is ignored. Specifically, in this for
loop, we
are simply printing Hello!
and do not need the value of any of the integers
from the range. Using _
makes this intent clear.
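If you want to see exactly which numbers a range contains, one option is to convert it to a list using the type-conversion syntax from earlier in this reading. The variable names below are illustrative.

```python
odd_numbers = list(range(1, 12, 2))
print(odd_numbers)  # [1, 3, 5, 7, 9, 11]

countdown = list(range(5, 0, -1))
print(countdown)  # [5, 4, 3, 2, 1]

# len works directly on a range, without converting it to a list.
range_length = len(range(1, 12, 2))
print(range_length)  # 6
```

Once converted, the result is an ordinary list, so operations like append work on it.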
Dictionaries #
In its most basic form, an English dictionary allows you to look for a specific word and find its definition. Similarly, a dictionary in Python allows you to associate pairs of data. You can then “look up” one member of the pair to get the other member. As an example, here is a dictionary that maps integers to their English word:
number_words = {1: "one", 2: "two", 3: "three"}
As you can see, you use curly braces ({
and }
) to surround the pairs and use
the colon (:
) to connect a pair of items (like the integer 1
and the string
"one"
).
We would say that this dictionary maps 1
to "one"
. In this dictionary,
1
, 2
, and 3
are called keys, while "one"
, "two"
, and "three"
are
called values.
You can find the value corresponding to the key 1
like this:
>>> number_words[1]
'one'
This lookup only goes in one direction: you cannot get a key like 1
by running
number_words["one"]
. As with lists, you can mix and match the types of both
keys and values, but it is easier to reason about a dictionary where all the
keys and values are of the same type.
If you look up a key that is not in a dictionary, you will get an error (called
KeyError
):
>>> number_words[0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 0
You can avoid this error by first checking if the key is mapped to a value using
in
, like this:
>>> 0 in number_words
False
>>> 1 in number_words
True
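Putting the membership check and the lookup together, a minimal sketch might look like this. The list of numbers to look up is made up for the example.

```python
number_words = {1: "one", 2: "two", 3: "three"}

found_words = []
for number in [0, 1, 2]:
    # Only look up keys that are actually in the dictionary.
    if number in number_words:
        found_words.append(number_words[number])
    else:
        print(f"{number} is not in the dictionary")

print(found_words)
```

Because 0 is checked with in before it is looked up, the loop never triggers a KeyError.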
It is simple to add entries to a dictionary:
>>> number_words[0] = "zero"
>>> number_words
{1: 'one', 2: 'two', 3: 'three', 0: 'zero'}
You can only have one of any key in a dictionary. If you assign a value to a key already in a dictionary, you will overwrite its previous value:
>>> number_words[1] = "ONE"
>>> number_words
{1: 'ONE', 2: 'two', 3: 'three', 0: 'zero'}
As a quick example for how we can use dictionaries, here is a short program to count the occurrences of each letter that appears in a string:
sample_string = "hello world"
letter_counts = {} # Empty dictionary
for character in sample_string:
    if character in letter_counts:
        letter_counts[character] += 1
    else:
        letter_counts[character] = 1
# letter_counts is as follows:
# {'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1}
You can get the number of entries in a dictionary with len
. You can also use a
for
loop to go through all of the keys of a dictionary:
# This loop prints out "one", "two", "three" on separate lines.
number_words = {1: "one", 2: "two", 3: "three"}
for number in number_words:
    print(number_words[number])
You will loop through the keys in the order in which they were added to the dictionary.
If for some reason you need to delete a key from a dictionary, you can do that
with del
:
>>> number_words = {1: "one", 2: "two", 3: "three"}
>>> del number_words[2]
>>> number_words
{1: 'one', 3: 'three'}
Finally, note that there are some restrictions on what types you can use as keys in dictionaries. Of the types you have seen so far, lists and dictionaries are the only two types that you cannot use as dictionary keys:
>>> {[1, 2]: "one, two"}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
That being said, there is an alternative if you want to use something like a list as a dictionary key, which we will describe next.
Tuples #
Tuples are essentially lists that you cannot change (i.e., you cannot assign to
individual items or append
to them). They are written slightly differently
from lists, using parentheses (()
) around comma-separated items:
sample_tuple = (1, 2, 3, 4, 5)
Their behavior is nearly identical to that of lists - you can add tuples, multiply them by integers, loop through them, find their length, etc.
One hiccup with this syntax is that declaring a tuple with just one item in it requires a comma:
>>> (1) # This is an int
1
>>> (1,) # This is a tuple
(1,)
A big advantage of tuples is that they can be used as dictionary keys:
>>> tuple_dict = {(1, 2): "one, two"}
>>> tuple_dict[(1, 2)]
'one, two'
If you know that you are dealing with a sequence of a specific size (for example, if you know that your data represents a pair of x and y coordinates), it may be advantageous to use a tuple. Doing so prevents you from accidentally appending something to the coordinates.
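As a sketch of both ideas (fixed-size data and dictionary keys), the following uses coordinate pairs as keys. The landmarks dictionary and its contents are hypothetical.

```python
# Hypothetical map of labeled points, keyed by (x, y) coordinates.
landmarks = {(0, 0): "origin", (3, 4): "treasure"}

point = (3, 4)
if point in landmarks:
    label = landmarks[point]
    print(f"Found {label} at {point}")

# Tuples cannot be changed, so this line would raise a TypeError:
# point[0] = 5
```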
Sets #
A set is a data structure designed to test for membership (whether something is in the set or not). Creating a set is similar to creating a dictionary, except that you use single items instead of colon-separated pairs:
hubs = {"Chicago", "Denver", "Houston", "Los Angeles", "Newark",
"San Francisco", "Washington"}
Note that an empty set cannot be defined with {}
, since that represents an
empty dictionary. To define an empty set, use set()
instead.
In some ways, sets behave similarly to lists. You can find the size of a set
with len
and you can check whether something is in a set with in
.
However, sets do not keep track of the order of the items they contain. You cannot index or slice a set, and duplicate items are not allowed. You can loop through the items of a set, but you might be surprised by the order:
>>> shuffled_nums = {3, 2, 4, 5, 1}
>>> for num in shuffled_nums:
...     print(num)
...
1
2
3
4
5
On the other hand, you can do some operations on sets that you cannot on lists. You can combine sets to get all of the items that appear in both sets (“and”), in either set (“or”), or in just one of the sets (“xor”, pronounced “ex or”):
>>> first_nums = {1, 2, 3, 4, 5}
>>> first_primes = {2, 3, 5, 7, 11}
>>> first_nums & first_primes # Numbers in both sets (and)
{2, 3, 5}
>>> first_nums | first_primes # Numbers in at least one set (or)
{1, 2, 3, 4, 5, 7, 11}
>>> first_nums ^ first_primes # Numbers in only one set (xor)
{1, 4, 7, 11}
You can add and remove individual items from sets using add
, remove
, or
discard
:
>>> sample_set = {1, 2, 3}
>>> sample_set.add(4) # sample_set is now {1, 2, 3, 4}
>>> sample_set.remove(2) # sample_set is now {1, 3, 4}
>>> sample_set.remove(5) # This will result in a KeyError
>>> sample_set.discard(3) # sample_set is now {1, 4}
>>> sample_set.discard(5) # No error, but doesn't remove anything
You can also “subtract” two sets, which will delete any items in the first set that are in the second (but ignore anything that is only in the second set):
>>> first_nums = {1, 2, 3, 4, 5}
>>> first_primes = {2, 3, 5, 7, 11}
>>> first_nums - first_primes
{1, 4}
As you can see, 2
, 3
, and 5
were removed, but 7
and 11
were not in
first_nums
to begin with and are thus ignored.
Finally, here is an example of a short program that collects the unique characters in a string and prints the result:
hello = "Hello world!"
unique_chars = set()
for character in hello:
    unique_chars.add(character)
print(f"Characters used in string: {unique_chars}")
This prints something like the following (the order of the characters may vary from run to run):
Characters used in string: {'!', 'd', 'w', 'o', 'e', 'r', 'H', 'l', ' '}
Concise Data Structures with Comprehensions #
Comprehensions are not data structures of their own, but can make it easier to define certain kinds of data structures.
Suppose you wanted to create a list consisting of the numbers 0 to 99. You could
type out the list [0, 1, ..., 99]
, but this would be quite tedious. You could
instead use a for
loop and repeatedly add to a list:
counting_up = []
for i in range(100):
    counting_up.append(i)
But there is a much more efficient and clean way to write this:
counting_up = [i for i in range(100)]
This is called a comprehension (and in this case specifically, a list comprehension). You can also use them for dictionaries. Here is a dictionary comprehension that maps each integer from 0 to 99 to its square:
squares = {i: i ** 2 for i in range(100)}
As you can see, the syntax is similar - the left part of the comprehension
defines how to use i
, while the right part of the comprehension describes the
specific values of i
to use.
The right part of the comprehension does not need to be a range
- any data
type that supports for
looping, such as a list or dictionary, will also work.
You can combine a comprehension with an if statement to filter out certain values. For example, suppose you wanted to define a list of all non-negative integers under 1000 that are not divisible by 5. You can write the following to get that list:
[i for i in range(1000) if i % 5 != 0]
Comprehensions can be quite powerful, but it is also important to be careful when using them. They can make your code difficult for others to read and understand, so we recommend only using them for fairly simple conditions, like what you see above.
For example, suppose you had a long list of words called word_list and you wanted to get the first three characters of all of the words in this list starting with J, Q, or X. You could write this as a comprehension like this:
[word[:3] for word in word_list if word[0] == "J" or word[0] == "Q"
 or word[0] == "X"]
But this is a bit convoluted to read, and you would probably be better off writing this as a for loop, which is longer but easier to understand.
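For comparison, here is one way the for loop version might look. The contents of word_list are sample data, and the sketch assumes that no word in the list is an empty string.

```python
word_list = ["Jupiter", "Mars", "Quartz", "Xylophone", "Saturn"]  # Sample data

prefixes = []
for word in word_list:
    # A set makes the "starts with J, Q, or X" check easy to read.
    if word[0] in {"J", "Q", "X"}:
        prefixes.append(word[:3])

print(prefixes)
```

The loop takes more lines than the comprehension, but each step (filtering, then slicing, then appending) is visible on its own line.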
Functions #
You can think of functions as organizing code for a particular purpose. Using functions makes it easier to reuse code in different contexts and with different inputs. Functions are the first example you will see of the principle we call DRY: Don’t Repeat Yourself. If you can package and reuse code rather than writing it again, your code will be easier to understand, debug, and maintain.
Defining and Calling a Function #
Suppose that you wanted to check whether two lists of positive integers
(list_1
and list_2
) have the same maximum value. You could do this:
list_1_max = 0
for i in list_1:
    if i > list_1_max:
        list_1_max = i
list_2_max = 0
for i in list_2:
    if i > list_2_max:
        list_2_max = i
if list_1_max == list_2_max:
    print("The two lists have the same maximum value.")
However, you will notice that the structure of the code to find the maximum value of both lists is almost exactly the same (except for the variable names).
We can instead write this code much more simply using a function. You can define a function like this:
def list_max(int_list):
    max_value = 0
    for i in int_list:
        if i > max_value:
            max_value = i
    return max_value
The def
keyword (short for “define”) says that the block of code that follows
is what the function does. The name of the function is list_max
, and
int_list
represents data that is given to the function when it runs. This line
is usually called the declaration, whereas the indented lines of the function
are called the body.
You can then use the function by calling it. You call a function by providing
its name, followed by parentheses (()
) surrounding the value to assign to its
parameter(s). Here is an example of calling list_max
:
>>> list_max([3, 1, 4, 1, 5, 9])
9
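With list_max defined, the earlier comparison of two lists can be written without repeating the loop. The sketch below repeats the function definition so that it runs on its own; the contents of list_1 and list_2 are sample data.

```python
def list_max(int_list):
    max_value = 0
    for i in int_list:
        if i > max_value:
            max_value = i
    return max_value

list_1 = [3, 1, 4, 1, 5, 9]  # Sample data
list_2 = [2, 7, 1, 8, 2, 8]  # Sample data

# Call the same function once per list instead of writing two loops.
same_max = list_max(list_1) == list_max(list_2)
if same_max:
    print("The two lists have the same maximum value.")
else:
    print("The two lists have different maximum values.")
```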
When you run list_max([3, 1, 4, 1, 5, 9])
, what is actually happening?
Effectively, the function first assigns the list [3, 1, 4, 1, 5, 9]
to the
name int_list
, and then runs the code in the body of the function. The
return
keyword in the body of list_max
means that max_value
is the value
that results from running list_max
.
The expression that comes after return
(in this case, max_value
) is called
the return value, because if you tell Python to run the function, it will
compute that expression and return it to you. The list [3, 1, 4, 1, 5, 9]
is
called an argument to list_max
- while a parameter refers to the name of the
input, an argument refers to the input’s actual value when the function runs.
It is worth noting that the return
keyword will immediately stop running the
function after that line, no matter what else it might have left to do. Consider
this function:
def find_item(item_list, target):
    for index in range(len(item_list)):
        if item_list[index] == target:
            return index
        else:
            print(f"It's not at index {index}...")
Let’s run this function with a list and a string:
>>> find_item(["foo", "bar", "baz"], "bar")
It's not at index 0...
1
Notice that the function did not print anything for index 2 (where "baz"
is),
since it returned when it hit index 1.
Return Types #
Every function has a return type, which is simply the type of the value that the function returns. For example, the return type of the list_max function above is an integer because the function returns a variable that is an integer. Just as it is important to be able to work out what type a variable is, it is important to be able to work out what type a function returns.
As an example, if x
and y
are integers, what is the return type of the
function below?
def midpoint(x, y):
    return (x + y) / 2
The return type of midpoint
is a float, because x + y
is an integer, and
dividing two integers with /
always returns a float.
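One way to confirm this reasoning is to call the function and inspect the result with the built-in type function (described later in this section). The arguments 2 and 4 are arbitrary sample values.

```python
def midpoint(x, y):
    return (x + y) / 2

result = midpoint(2, 4)
print(result)        # 3.0
print(type(result))  # <class 'float'>

# Floor division (//) with two integers returns an integer instead.
floor_result = (2 + 4) // 2
print(type(floor_result))  # <class 'int'>
```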
Some functions don’t return anything at all. For example, consider this
function, which just appends a few items onto the end of a list and has no
return
statement:
def last_laugh(word_list):
    for _ in range(3):
        word_list.append("ha")
If you recall, appending something to a list produces no output:
>>> word_list = ["Hi", "there"]
>>> word_list.append("ha")
As described earlier, there is a special value that represents “nothing”, called None; its type is called NoneType. The append function for lists, as well as most functions that modify an existing variable (adding to or deleting from a set or dictionary, for example), returns None. Perhaps most surprisingly, the commonly-used print function returns None.
One more time, for emphasis: the print
function’s return type is None
.
This is because the output you see from the print
function is the string that
it is printing to the screen, not the return value of the function. Since they
look nearly identical, it can be confusing. A good way to remember this is to
look at the output of a string versus the output of print
:
>>> "Hello world!"
'Hello world!'
>>> print("Hello world!")
Hello world!
The quotes around the first 'Hello world!' indicate that this is a string being returned, while the message shown when running print does not have them.
You can check the type of a variable, expression, or piece of data by using the type function. This simply returns the type of whatever you give it:
>>> type("Hello world!")
<class 'str'>
>>> type(2 + 2 == 5)
<class 'bool'>
>>> type(print("Hello world!"))
Hello world!
<class 'NoneType'>
In general, we recommend that you avoid writing functions that both return a
value and modify an existing value. This is because it is more difficult to
reason about what such a function does in a larger program. For example, the
following program takes a list L
and an item x
, adding x
to the end of L
and returning the new length of L
.
def append_and_length(L, x):
    L.append(x)
    return len(L)
If you call this function repeatedly (for example, in a loop), then you have to keep track of its return value as well as the fact that it is adding something to a list each time it runs.
Scope #
Here is a rather pointless function - it takes a single parameter, and sets its value to 42, doing nothing else:
def set_to_42(x):
    x = 42
The question is, what is the value of x
after this code runs?
x = 0
set_to_42(x)
You might be tempted to say 42, but it turns out that this is not the case:
print(x) # This will print 0
Why is x
not set to 42 despite calling this function? The answer has to do
with a concept called scope.
When a function runs, any variables defined in its body are only valid within that function. So the statement x = 42 within set_to_42 sets the value of a variable x that exists only inside the function, and only until the function finishes executing; the x defined outside the function is left unchanged. Here is another example:
def define_y():
    y = 42

define_y()
print(y) # This will result in an error
Because y = 42
is only valid within the scope of define_y
, trying to access
the value of y
outside of the function results in an error (unless we have
defined y
outside of the function as well).
Note that this only affects the definition of entire variables, so you can redefine part of a variable and see that change outside of the function’s scope:
def change_middle(int_list):
    int_list[1] = 5

sample_list = [1, 2, 3]
change_middle(sample_list)
print(sample_list) # This prints [1, 5, 3]
Operations like appending also make changes that last outside of the function’s scope:
def append_42(int_list):
    int_list.append(42)

numbers = [1, 2, 3]
append_42(numbers)
print(numbers) # This prints [1, 2, 3, 42]
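If you do want a function to change the value of a variable like x from the first example in this section, one common approach is to return the new value and assign it outside the function. The name make_42 is hypothetical and chosen only for this sketch.

```python
def make_42():
    # Return the value instead of assigning to a variable inside the function.
    return 42

x = 0
x = make_42()  # The assignment happens outside the function's scope.
print(x)  # 42
```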
Docstrings #
Though we have left them out in previous functions for simplicity, all but the simplest functions you write should have a docstring. As its name might suggest, a docstring is a special string used to document a function’s use and purpose.
Before we get into the details of what information a docstring contains, here are a few principles to keep in mind:
- A docstring should explain what the function does. Readers of your docstring need to know what your function does so that they can use the function in their code. A docstring saves the reader the effort of figuring out what the function does by reading the code.
- A docstring should explain what the function’s inputs and outputs are. If a docstring describes what the function does and what its inputs and outputs are, the reader should have enough information to properly call the function in a program.
- A docstring should mention any assumptions that the function makes. For example, some functions that take an integer as input require the integer to be positive. If the function will crash or return an incorrect result if these assumptions are violated, the docstring should make that clear.
- A docstring should not explain how the function works. For example, if a function loops through every character of a string, it does not matter whether it does so using a for loop or a while loop.
With that being said, we can dive into what a docstring looks like. A docstring starts and ends with a set of three quotation marks ("""), like this:
def compare_lists(left_list, right_list, which_items):
    """
    Compare two sorted lists and return items belonging to one or both of them.

    Given two sorted lists of comparable items (i.e., every pair of items can
    be compared), called left_list and right_list, create three sets: one
    consisting of the items that only appear in left_list, one consisting of
    the items that only appear in right_list, and one consisting of items that
    appear in both lists. Duplicate items in lists are allowed. The value of
    which_items, which can be -1, 0, or 1, determines whether to return the
    list of items from only left_list, only right_list, or both lists,
    respectively.

    Args:
        left_list: A list of comparable items.
        right_list: A list of comparable items.
        which_items: An integer equal to -1, 0, or 1, indicating which items
            to return.

    Returns:
        A list of items from only left_list, only right_list, or both lists.
    """
    pass
The starting quote marks must be aligned with the function body - you will get a syntax error if they are not. It is good style to also align the ending quote marks with the function body, but you will not get an error if they are not aligned.
The first sentence of a docstring should summarize what the function does. It should be written in imperative style (i.e., “Compare two” rather than “Compares two” above).
Below the summary, you can write a longer description if necessary. Here, you can explain the function in more detail, including any assumptions made or any special cases. This part is optional if the one-sentence description sufficiently explains how to use the function. If your function makes changes to variables (such as appending to a list) that are not explained in the one-sentence summary above, you should explain the changes here.
If your function takes any parameters, you should explain each of them in a
section marked Args
. For each parameter, explain the type of the parameter and
what it represents. Similarly, if your function returns something, you should
include this in a section marked Returns
with an explanation of the function’s
return type and what the return value represents.
As with other lines of code, no part of a docstring should exceed 80 characters in length, including the indentation.
Finally, note that the body of the function is simply pass
. The pass
keyword
does nothing, but it can be written as a placeholder for code in an indented
code block, such as the body of an if
statement or for
loop. The pass
keyword can be useful if you are writing the overall structure of your code
blocks but plan on writing the content of the block later.
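As a shorter illustration of the same layout, here is a sketch of how the list_max function from earlier might be documented. The docstring wording is an assumption about how one could describe that function, not text from the reading.

```python
def list_max(int_list):
    """
    Return the largest value in a list of positive integers.

    Args:
        int_list: A list of positive integers.

    Returns:
        The largest integer in int_list, or 0 if the list is empty.
    """
    max_value = 0
    for i in int_list:
        if i > max_value:
            max_value = i
    return max_value

# Python stores the docstring on the function, so you can display it.
print(list_max.__doc__)
```

Notice that the docstring states the assumption about positive integers and says nothing about the loop used to find the maximum.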
Reasoning about Code #
A useful skill in developing and debugging code is learning how to reason about what a block or line of code is doing. This can include reasoning at a high level (what the code as a whole is accomplishing) or at a low level (what a specific line of code is doing with a variable). Below, we will describe techniques you can use to help you think about code at both levels.
Expected Behavior #
As you write code, it can be helpful to clarify what you expect the code to do, even before you write it. Usually, this involves thinking of a few simple cases and trying to predict what the result will be.
For example, suppose that you are writing a function called max_int(numbers)
to return the largest integer in a list. If you call this function with a list
containing a single integer (like [1]
), you would expect the function to
return that single integer.
This seems obvious - in general, you would expect to get the largest integer from a function designed to do so. But to thoroughly make your own expectations of the function clear, you should think adversarially, that is, like someone who is trying to make the function behave incorrectly. This can help you identify places in the code where potential problems might occur.
For example, what happens if you give max_int
a list of all negative integers?
What happens if all of the integers are the same? Perhaps most interestingly,
what should max_int
return if you give it an empty list?
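To make these questions concrete, here is one possible sketch of max_int together with the cases mentioned above. Returning None for an empty list is a design choice assumed for this example; other choices (such as treating an empty list as an error) are also reasonable.

```python
def max_int(numbers):
    # Assumed design: an empty list has no largest integer, so return None.
    largest = None
    for number in numbers:
        if largest is None or number > largest:
            largest = number
    return largest

print(max_int([1]))           # 1
print(max_int([-5, -2, -9]))  # -2
print(max_int([4, 4, 4]))     # 4
print(max_int([]))            # None
```

Writing down the expected result for each case before running the code makes it easier to notice when the function behaves differently from what you intended.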
In the next reading, we will talk about testing code to ensure that it behaves in the way you expect it to. But we recommend that you practice thinking through what you expect code to do, building that skill before diving into testing.
Stepping through Code #
You can also think about what individual lines of code do. Consider the following function:
def list_max(int_list):
    max_value = 0
    for i in int_list:
        if i > max_value:
            max_value = i
    return max_value
Suppose you call list_max([1, 3, 2])
. Let’s think through what this function
does at a line-by-line level.
- First, max_value is set to 0.
- We are looping through the list, so the first value of i we consider is 1. Right now, i is 1 and max_value is 0.
- We compare this to max_value, and since 1 is greater than 0, we set max_value to 1. Now, both i and max_value are 1.
- In our next iteration through the loop, i is 3. max_value is still 1.
- Since 3 is greater than 1, we set max_value to 3. Now, both i and max_value are 3.
- In our next iteration through the loop, i is 2. max_value is 3.
- Since 2 is not greater than 3, we do not change the value of max_value. So i is still 2 and max_value is still 3.
- Finally, we have gone through the entire list. We can return the value of max_value, which is 3.
Keeping track of the values of variables in this way can be extremely helpful in debugging code that is not behaving correctly. Unfortunately, it can also be a bit tedious to keep this information entirely in your head.
One way of keeping track of this is through careful use of the print
function.
We could write the function like this:
def list_max(int_list):
    max_value = 0
    for i in int_list:
        print(f"Now considering {i} (current max: {max_value})")
        if i > max_value:
            print(f"Changing max_value to {i}")
            max_value = i
    return max_value
Notice the calls to the print function that allow us to see information about the function as it runs. If we call list_max([1, 3, 2]), we get the following output:
Now considering 1 (current max: 0)
Changing max_value to 1
Now considering 3 (current max: 1)
Changing max_value to 3
Now considering 2 (current max: 3)
3
This is easy to follow, and we do not have to keep track of the values manually. Debugging code in this way is called print-based debugging, and is often used by software developers as a first step in diagnosing buggy code.
You can also step through code online, using a tool called Python Tutor. This allows you to enter code and go through its execution line by line, tracking every variable name (including the names of functions) and value. For the program above, the tool shows something like this:
[Figure: Python Tutor visualization of the program above]
We highly recommend using print-based debugging or Python Tutor as you think through what your code is doing.