Reading 1: Python Basics #

Note: parts of this chapter have been adapted from Chapters 1 and 2 of Think Python.

In this chapter, you will learn about the basics of writing code in Python. Even with the small amount of Python you learn here, you should be able to write short programs that will run on your computer.

Programs, Programming Languages, and Python #

A program is essentially a series of instructions that specifies how to perform a computational task. Programs are almost always executed by computers, but they do not have to be - you could execute simple programs with the help of some pen and paper. Every task that a computer performs, such as fetching and rendering a webpage, running a game, or calculating the result of a physics simulation, is carried out by a program.

As the term suggests, a programming language is a language for expressing programs. As with natural languages like English, programming languages have a set of rules that describe how to specify and interpret ideas (in this case, the instructions of a program). Python is a programming language that was first released in the early 1990s. It is a popular choice for introductory computing courses because it is easy to get started with, somewhat forgiving of common mistakes, and relatively easy to read. It is also quite powerful, and is used for a wide range of applications including data science, Web development, and game design. In this course, we use Python 3, which is the current and de facto standard version of the language. Python 2 reached its end of life in 2020, but code and documentation written for it are still easy to find, so if you look up documentation on the Web, you should ensure that it is for Python 3.

Running Python Code #

There are two main ways to run Python code: through the Python interpreter and through Jupyter notebooks. Several other ways are commonly used to run Python code, but they are in some way a variant of these two methods. In this section, we will show you how to run Python code using these two methods so that you can try the code we show for yourself.

The Python Interpreter #

One way to run Python code is to use the Python interpreter, which is a REPL (read-eval-print loop) for Python code: it reads a line of code, evaluates it, prints the result, and then waits for the next line. If you are using the computational environment we describe in the setup guide, then you can start the Python interpreter by opening a terminal window in WSL/Ubuntu and running the python3 command. You should see output that looks like this:

$ python3
Python 3.8.5 (default, Jul 27 2020, 08:42:51)
[GCC 10.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

On some machines, if you run python instead of python3, you will see similar output, except that the first line of output will say something like Python 2.7.18 instead. This means that you are running Python 2.

In the Python interpreter, you can write lines of Python code and press Enter to execute them. You can use the Up/Down arrows (or if you prefer, Ctrl-P/Ctrl-N) to access previous lines of code that you have run in the interpreter. To exit, you can run the command exit() from the interpreter or press Ctrl-D.

Because of the >>> used as the prompt in the Python interpreter, we will use these characters to indicate Python code you can run (similar to how we use $ to show a command in Bash that you can run). This does not mean you have to run the code in the Python interpreter; you can just as easily run the command in a Jupyter notebook, as we describe next.

Jupyter Notebooks #

Another way to run Python is through Jupyter notebooks, which you have already seen in Reading 0. Each Jupyter notebook has a kernel, which you can think of as a Python interpreter running as a Web service in the background. The kernel runs code from a notebook’s code cells.

Using Jupyter notebooks requires a Web server to be running on your computer, so to run or edit these notebooks, you will first need to start this server. To do so, run the command jupyter notebook in a new terminal window in WSL/Ubuntu. We recommend using a terminal window just for this command, since the server prints output as it runs. To shut down the server, press Ctrl-C (you may also have to confirm the shutdown).

On some machines, starting the Jupyter notebook server will automatically launch your Web browser with a window that allows you to select an existing file to open or create a new file, like this:

[Figure: File selection window of the Jupyter notebook Web interface]

If not, you can look in the notebook server output for a link that looks like the following (note that the token value is an example, and will not work on your machine):

http://localhost:8888/?token=a995013330f6bb9d0546af78778c19a99a6400397e23e1fc

Copy and paste this link into your Web browser to open the file selection window. Alternatively, you can simply visit the site http://localhost:8888 and paste in the token value (the characters following token= in the output above).

Jupyter notebook files have names ending in .ipynb (for interactive Python notebook). If you open a notebook file, you will see an interface that looks like this:

[Figure: Jupyter notebook editor with an example notebook]

The number next to the code cell indicates the order in which you have run the cells, while In [ ]: indicates that the cell has not been run yet. A cell can be run multiple times, and the numbers will count up accordingly. Sometimes the order in which you execute cells matters, so it can be helpful to double-check a notebook when you are done with it by selecting “Run all” in the Cell menu. Note that there is no way to “undo” the effect of running a cell, except to restart the kernel from the Kernel menu.

Variables #

The concept of variables is fundamental to computing, since it allows us to store and use data over the course of a program. This data can be simple values such as integers or can be code that executes as part of a program. In this section, we will provide an introduction to how variables work in Python.

Objects #

Objects are how Python represents data. An object consists of an identity, a type, and a value. An object’s identity essentially describes where on a computer it is stored, an object’s type describes what a program can do with the object, and an object’s value describes the object’s data that is used by programs.

As an example, consider the mathematical expression \(x = 42\). Since this is not part of a computer program, the identity of the variable \(x\) does not make much sense, but we can say that it is an integer (its type) with the value 42.

Knowing an object’s identity is typically not useful in the early stages of learning Python, so we will not cover it in this chapter. While there are a few cases in which reasoning about an object’s identity is important, for now, we will instead refer to objects by a name or identifier, which we describe below.

If you want to read more about objects in Python, the relevant page in the Python Language Reference covers the topic in great detail.

The Assignment Statement #

One of the simplest operations you can do in Python is something like this:

>>> x = 42

This line is called an assignment statement because it assigns the name x to the value 42. From this point, you can use the name x as shorthand for 42:

>>> x + 1
43

We refer to x + 1 as an expression, which is a statement or part of a statement that can be reduced to an object. In this case, the expression x + 1 reduces to an object with the value 43.

You can change the value that a name refers to with a new assignment statement:

>>> x
42
>>> x = 100
>>> x + 1
101

As a side note, sometimes Python refers to assignment as binding a name to a value. There is a subtle difference between assignment and binding, but we will not cover it here. If you are interested, you can read more about assignment statements or binding on the respective Python 3 documentation pages.

Naming Rules and Guidelines #

Usually called identifiers in Python, names are used to refer to objects. Here, we discuss important rules and guidelines for what names you can and should use in your code. The rules about what names you can give to variables are part of the Python language and apply to all Python users, while guidelines about what names you should give to your variables are stylistic decisions that can vary based on things like what company you work for or when you learned Python.

Valid Names #

You are probably familiar with variables in math having single-letter names like \(x\) or \(\sigma\) (possibly with subscripts like \(x_i\) ). In Python (and most other programming languages, for that matter), variables can have multi-character names such as url or compressed_size. However, you cannot just use any sequence of characters as an identifier:

>>> leading-digit = 1
  File "<stdin>", line 1
SyntaxError: cannot assign to operator

Even if you do not understand what this error message means, it is clear that something has gone wrong. The documentation on identifiers contains a technical description of what constitutes a valid identifier, but the following rules cover most of the common cases:

  • Identifiers can consist of letters, digits, and the underscore character (_).
  • Identifiers must begin with a letter, with one notable exception: they can also begin with an underscore.
  • Identifiers cannot be any of Python’s reserved keywords.
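If you are unsure whether a name follows these rules, you can ask Python itself. The sketch below uses the string method isidentifier and the standard library's keyword module, neither of which appears elsewhere in this reading; the names being checked are hypothetical and chosen only for illustration.

```python
import keyword

# Hypothetical names, chosen only for illustration.
good_name = "compressed_size"
hyphenated_name = "leading-digit"
digit_first_name = "2fast"
reserved_name = "for"

# str.isidentifier() checks only the character rules.
print(good_name.isidentifier())         # True
print(hyphenated_name.isidentifier())   # False
print(digit_first_name.isidentifier())  # False

# A keyword passes the character rules but is still off limits.
print(reserved_name.isidentifier())      # True
print(keyword.iskeyword(reserved_name))  # True
```

A name is usable as an identifier only if isidentifier gives True and iskeyword gives False.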

In the Python interpreter, the special name _ refers to the output of the previous command:

>>> 2 + 5
7
>>> _
7
>>> 5 - 1
4
>>> _
4

Note that in Jupyter notebooks, _ refers to the most recent output produced by a cell. If no cell has produced any output yet, for example if you try to get the value of _ in the first cell run, you will instead get '' (the empty string - see below for details). By contrast, in the Python interpreter, you will get an error in that situation.

Guidelines for Variable Names #

While any name that fits the above rules is one that you can use to identify a variable, we use a significantly stricter set of guidelines for what you should use to identify a variable. These guidelines are largely adapted from Google’s Python style guide.

The first and perhaps most important guideline for a variable name in Python is that it should precisely and readably communicate what its variable represents. Most, if not all, of the other guidelines stem from this one. It is also a hard guideline to follow, however. Many, many tutorials and forum posts use vague names and improper style, and it is easy to fall into the habit of not using proper style when quickly writing code. We strongly encourage you to build good habits for variable names while you are still learning so that you can write code that others (and your future self) can read and easily understand.

A straightforward application of this guideline is that you should avoid using single-letter variable names wherever possible. Some letters have been used so pervasively for so long that they are considered exceptions to this guideline. One exception you will see in this reading is i and j for counters, which are variables that count up to (or down from) a value in order to perform some computation a certain number of times. If you are referring to \((x, y)\) coordinates, you may use x and y. That being said, you should aim to use even these exceptions sparingly if possible.

You should also avoid using abbreviations or vague names in your code. When you first write your code, it may save you a few keystrokes to write xse for \(x\) standard error rather than x_standard_error. However, others who may read your code or you at a later time may have difficulty deciphering what xse originally stood for. Names like count may indicate that something is being counted, but it may take further effort to figure out what. It would be better to use a name like total_line_count or total_lines.

With a few exceptions that we will see in a future reading, variable names in Python should be written in snake case. This means that for variable names that consist of multiple words, the words should be separated by the underscore character (_). The previous paragraph shows examples of this; you should write variable names like total_lines instead of totalLines as you may in other programming languages.

Generally, variables should also be written in all lowercase. The only exception to this is if you are defining a constant, which is a variable whose value you intend to set once and never change. For example, acceleration in a vacuum due to gravity on Earth may be set with a line like this:

GRAV_ACCEL = 9.81

Using constants like this makes your code easier to read and maintain over time. For example, if you decide that you want to change this value to be in US customary units rather than SI units, you can simply change the value of GRAV_ACCEL rather than finding and replacing every instance of the value 9.81 in your code. It is worth noting that while naming constants in this way communicates your intent not to change the value of a variable, nothing stops you from actually changing the value.
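As a small sketch of how such a constant might be used, the code below computes the speed and distance of an object dropped from rest, ignoring air resistance. The names fall_time, fall_speed, and fall_distance, and the value 2.0, are assumptions made up for this illustration.

```python
GRAV_ACCEL = 9.81  # metres per second squared (SI units)

fall_time = 2.0  # seconds; an arbitrary example value

# Both formulas refer to the constant by name rather than repeating 9.81.
fall_speed = GRAV_ACCEL * fall_time
fall_distance = 0.5 * GRAV_ACCEL * fall_time ** 2

print(fall_speed)
print(fall_distance)
```

If the units ever change, only the line that defines GRAV_ACCEL needs to be edited.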

Basic Types and Operators #

As we mentioned previously, every object in Python has a type that defines what a program can do with that object. If a part of your program expects data of a certain type and instead gets data of a different type, a variety of things may happen. In rare cases, the program continues as expected, but you should not count on this happening. Sometimes, the program continues but may produce unexpected results or crash later. Much of the time, the program will simply crash with an error.

Because type mismatches can produce a wide range of errors in Python programs, it is important for you to develop a solid understanding of types in Python. Eventually, you should be able to take any variable in a Python program and work out what type it is.

Below, we describe a few common types that you will encounter in the vast majority of Python programs you read or write. We also describe commonly-used operators for these types that represent ways to use one or more variables as part of a computation.
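While you are building this understanding, you can check your guesses with the built-in type function, which is not otherwise covered in this reading. The following sketch applies it to one value of each type described below.

```python
print(type(42))       # <class 'int'>
print(type(2.5))      # <class 'float'>
print(type(True))     # <class 'bool'>
print(type("quack"))  # <class 'str'>

# It also works on expressions: dividing two integers gives a float.
print(type(5 / 1))    # <class 'float'>
```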

Integers and Arithmetic Operators #

As its name suggests, the integer type represents integer values (numbers with nothing past the decimal point). You can get the additive inverse (negative) of an integer by adding a - before the integer:

positive_int = 42
negative_int = -1
double_negative_int = -(-22)

You can do standard addition and subtraction with integers:

>>> 2 + 5
7
>>> 21 - 34
-13

You can also do multiplication and exponentiation, though the symbols look different from what you find in standard mathematical notation:

>>> 2 * 8
16
>>> 2 ** 8
256

As you might expect, the standard order of operations applies, so you may have to use parentheses to force operations to happen in a certain order:

>>> 3 * 7 + 2 ** 2
25
>>> 3 * (7 + 2) ** 2
243

In the first line above, the expression 2 ** 2 is evaluated to 4 first, and then 3 * 7 is evaluated to 21. These are then added together to produce 25. In the second line, the expression 7 + 2 is evaluated first due to the parentheses, which results in 9. Then, this value is used in 9 ** 2 to produce 81, and finally, the multiplication 3 * 81 produces 243.
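Two details of the order of operations tend to surprise newcomers, so here is a short sketch of them: exponentiation is applied before a leading minus sign, and a chain of ** operators is evaluated from right to left.

```python
# The exponent is applied first, then the result is negated.
print(-2 ** 2)      # -4

# Parentheses make the base negative instead.
print((-2) ** 2)    # 4

# Chained exponents group from the right: 2 ** (3 ** 2) is 2 ** 9.
print(2 ** 3 ** 2)  # 512
```

When in doubt, adding parentheses costs little and makes your intent clear to readers.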

Division is a bit odd since dividing two integers does not always produce an integer:

>>> 5 / 2
2.5

The result of this division is a floating-point number, which we will discuss later in this section. However, it is worth noting that dividing two integers always produces a floating-point number, even if the result would ordinarily be an integer:

>>> 5 / 1
5.0

This behavior might seem unintuitive, but can be helpful when reasoning about a program’s behavior. Because dividing two integers always produces a floating-point number, if you have integers a and b and write a line like c = a / b, you know that c is always a floating-point number, regardless of whether b divides a or not.

If you want the division of two integers to instead produce an integer, you can use the floor division operator, written //:

>>> 24 // 4
6

This operation is called floor division because it always rounds the result down to the nearest integer that is less than or equal to it:

>>> 5 // 4
1
>>> 6 // 4
1
>>> 7 // 4
1
>>> 8 // 4
2

The remainder operator, written as %, is related to the floor division operator, giving the remainder of dividing two integers:

>>> 5 % 4
1
>>> 6 % 4
2
>>> 7 % 4
3
>>> 8 % 4
0

The remainder operator is sometimes called the modulo operator or simply mod for short, but this term can be confusing because some people (and programming languages) assume that the modulo operation must produce a positive number, whereas in Python, it does not:

>>> 13 % 4
1
>>> -13 % 4
3
>>> 13 % -4
-3
>>> -13 % -4
-1

In Python, you can assume that a % b will always produce an integer between 0 and b, excluding b itself.
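The floor division and remainder operators fit together: multiplying the quotient by the divisor and adding the remainder gives back the original number. The sketch below checks this for a negative number, where the rounding-down behavior is easiest to see. The names a, b, quotient, and remainder are only illustrative.

```python
a = -13
b = 4

quotient = a // b   # -4: rounded down, not toward zero
remainder = a % b   # 3

print(quotient, remainder)
print(quotient * b + remainder == a)  # True
```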

Booleans and Comparison Operators #

The boolean type represents one of two values: True or False. These values are used to represent logical statements about variables or expressions. A logical statement can often be expressed using comparison operators. Below, we will examine a few common comparison operators.

The equals operator == is written between two expressions and evaluates to True if the two expressions produce the same value and False otherwise. By contrast, the not-equals operator != simply evaluates to True if the two expressions produce different values. Here are some simple examples using integers:

>>> 2 + 2 == 4
True
>>> 2 + 2 != 5
True
>>> 2 + 2 == 5
False

The less-than operator < and greater-than operator > check that the first expression is less than or greater than the second, respectively:

>>> 2 ** 10 > 1000
True
>>> (-2) ** 8 < 0
False

The less-than-or-equal-to operator <= and greater-than-or-equal-to operator >= behave similarly to < and >, but also evaluate to True if the expressions on both sides are equal:

>>> celsius = -40
>>> fahrenheit = (celsius * 1.8) + 32
>>> celsius <= fahrenheit
True
>>> celsius >= fahrenheit
True

You can write a single line with multiple comparison operators, like this:

1970 < current_year < 2020

Python will break this into two separate comparisons, 1970 < current_year and current_year < 2020. The overall expression will evaluate to True if both comparisons evaluate to True and False otherwise. While a line like the above is appropriate, we generally discourage using multiple comparisons in a line, as it can result in code that is difficult to read:

new_warnings == total_warnings > total_errors <= total_messages
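To make the splitting concrete, the sketch below compares a chained comparison with the same test written as two comparisons joined by the and keyword (described later in this reading). The value 1999 is an arbitrary example.

```python
current_year = 1999  # arbitrary example value

chained = 1970 < current_year < 2020
spelled_out = 1970 < current_year and current_year < 2020

print(chained)      # True
print(spelled_out)  # True
```

Writing out a long chain in this way, one comparison at a time, is one option for making a hard-to-read line clearer.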

Floating-Point Numbers #

Floating-point numbers, sometimes simply called floats, are a type that represents numbers with a decimal point. The term “floating point” refers to the fact that the decimal point can “float” among the digits, that is, the number is not always \(x\) digits with \(y\) digits past the decimal point.

Floats support all of the arithmetic operations that integers do, but note that floor division with two floats will produce another float:

>>> 5.0 // 2.0
2.0

Arithmetic with floats can sometimes have surprising results:

>>> 0.1 + 0.2
0.30000000000000004
>>> _ == 0.3
False

This is because of the way that Python internally represents floats. Virtually every computer represents data as binary bits - sequences of 1s and 0s. But just as the fraction \(3/11\) is represented in decimal as the repeating \(0.272727\ldots\) , the numbers 0.1, 0.2, and 0.3 all have repeating representations in binary. Thus their true values can only be approximated on a computer, producing this rather strange result.

Because of this, you should be careful when doing comparisons with floating-point numbers. If you are interested in reading more about common pitfalls in floating-point arithmetic, the Python official tutorial has a page that you may find helpful.
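One common way to be careful is to check whether two floats are close to each other rather than exactly equal. The sketch below shows two options: the isclose function from the standard library's math module, and a manual check against a small tolerance. The tolerance 1e-9 is an arbitrary choice for illustration, and neither approach is covered elsewhere in this reading.

```python
import math

total = 0.1 + 0.2

# Exact comparison fails because of the tiny representation error.
print(total == 0.3)              # False

# math.isclose allows for a small relative difference.
print(math.isclose(total, 0.3))  # True

# A manual check: is the difference smaller than a chosen tolerance?
print(abs(total - 0.3) < 1e-9)   # True
```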

Finally, note that you can compare integers and floats even though they are different types:

>>> 42 == 42.0
True
>>> 2.0 + 2.0 < 5
True

However, the pitfalls of floating-point arithmetic still apply:

>>> 3 < (0.1 + 0.2) * 10
True

Strings and the print Function #

The string type represents a sequence (or string) of characters. Generally, a string can be written simply as the sequence of characters it represents, surrounded by single quotes (') or double quotes ("), like 'quack' or "cellar door". There is no difference between strings written with single quotes or double quotes - the quote marks simply need to match.

Occasionally, you may find the need to use quotation marks inside a string. An example of this is if your string contains an apostrophe, such as "There's no place like home." In this case, you can use the other type of quote marks to surround the string as shown above, or you can use an escape character, which tells Python to temporarily stop looking for a quote mark to end the string. Python uses the backslash (\) as its escape character. To include a literal quote mark in a string, then, you can write something like this:

>>> "Dorothy said, \"There's no place like home.\""
'Dorothy said, "There\'s no place like home."'

The \" is called an escape sequence. As you can see from the output above, \' is also an escape sequence indicating a literal apostrophe. You may notice that in the example above, the Python interpreter shows string output as surrounded by single quotes (hence the apostrophe being preceded by the escape character). In this course, we use double quotes for all strings (mainly for reasons discussed in the documentation for the Black code formatter).

Generally, typing an escape sequence in a string will not cause it to appear as what it stands for when the interpreter displays the string. As an example, \n is an escape sequence representing a line break, but the interpreter will still show it as \n:

>>> "a\nb"
'a\nb'

To see how the string would be shown with these escape sequences taken into account, you can use the print function, which displays a string. We will talk more about what a function is in the next reading, but you can use the print function like this:

>>> print("a\nb")
a
b

As you can see, the \n is displayed as a line break, and no quotes are surrounding the string.

You can also print integers:

>>> print(42)
42

You can even print multiple items of different types. If you do this, print will helpfully insert a space between items:

>>> print("The answer to life is", 42)
The answer to life is 42

You can join two strings together with an operation called concatenation. The symbol for concatenation is +:

>>> "Hello " + "world!"
'Hello world!'

Note that when concatenating two strings, a space is not automatically added between them.
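Concatenation also only works between strings. As a minimal sketch, the code below joins a string and an integer by first converting the integer with the built-in str function, which is not otherwise covered in this reading. The names answer and sentence are only illustrative.

```python
answer = 42

# "The answer is " + answer would raise a TypeError, because + cannot
# join a string and an integer. Converting the integer first works.
sentence = "The answer is " + str(answer) + "."

print(sentence)  # The answer is 42.
```

Unlike print with multiple items, concatenation adds no spaces, so the space after "is" has to be part of the string.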

You can repeat a string by multiplying it by an integer:

>>> "ha" * 4
'hahahaha'

You can check whether a character or sequence of characters is in a string by using the in operator:

>>> "ah" in "hahahaha"
True
>>> "hello" in "Hello world!"
False
>>> "" in "everything"
True

As you can see from the above examples, in is case-sensitive. The string "" is called the empty string, and has no characters in it. However, the empty string is in every string, including itself.

You can get the length of a string using the len function, like this:

>>> len("hello")
5
>>> len("")
0

You can get an individual character of a string by using square brackets ([]) with an integer, like this:

>>> message = "hapnhoempaogltepnrprdypaiaitrm"
>>> message[1]
'a'
>>> message[0]
'h'
>>> message[-1]
'm'
>>> message[-2]
'r'

Notice that message[1] actually gives you the second character of the message. In Python, the characters of strings are counted starting from 0, so message[0] gives you the first character of a string, and so on. Using a negative integer in square brackets will give you a character counting from the end of the string, with message[-1] being the last character of message. Making a mistake with which character of the string you are getting (often called an off-by-one error) is quite common among those first learning Python, and we encourage you to be careful when getting characters in a string. It is worth noting that the positions of the characters in a string message go from 0 to len(message) - 1, so trying to get the character at position len(message) will result in an error:

>>> message[len(message)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range

You can slice strings to get a subset of the string’s characters:

>>> advice = "Use the Force"
>>> advice[:3]
'Use'
>>> advice[4:7]
'the'
>>> advice[-5:]
'Force'

As you can see, you can provide a starting and ending character position, but you do not have to. If you do not provide a starting character position, it will be assumed to be 0, and if you do not provide an ending character position, you will get the rest of the string from your start point.

You should also be aware that while the starting position you give is the first character in the slice, the ending position is the one after the last character in the slice. This is why advice[:3] gives the characters in positions 0 through 2, but not 3.
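One consequence of excluding the ending position is that slices fit together neatly. The sketch below, which uses an arbitrary split position of 7, shows that cutting a string at any position gives two pieces that join back into the original, and that the length of a slice is its ending position minus its starting position.

```python
advice = "Use the Force"
split_position = 7  # arbitrary example value

first_part = advice[:split_position]   # 'Use the'
second_part = advice[split_position:]  # ' Force'

print(first_part + second_part == advice)  # True
print(len(advice[4:7]) == 7 - 4)           # True
```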

If you give an invalid pair of starting and/or ending positions, you will get an empty string:

>>> advice[3:1]
''
>>> advice[100:]
''

You can also provide a second colon (:) along with another integer (called the step size) to take every \(n\)th character:

>>> advice[::2]
'UeteFre'

As you can see, this takes the entire string, but only every other character, starting from the one at position 0.

As with a regular string slice, you will get the subset of characters up to, but not including, the end position. The number of characters between the start and end position does not have to be a multiple of the step size - you will simply get the characters that you can:

>>> advice[1:7:3]
'st'
>>> advice[1:8:3]
'st '
>>> advice[1:10:3]
'st '

Finally, you can also use a negative step size:

>>> advice[::-1]
'ecroF eht esU'

This example is a handy way to reverse a string. You may notice that the start and end positions have been “switched” in the sense that they now run from right to left:

>>> advice[:-6:-1]
'ecroF'
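As one possible use of reversing a string, the sketch below checks whether a word is a palindrome, that is, whether it reads the same forwards and backwards. The word "racecar" and the variable names are chosen only for illustration.

```python
word = "racecar"
advice = "Use the Force"

# A palindrome is equal to its own reversal.
word_is_palindrome = word == word[::-1]
advice_is_palindrome = advice == advice[::-1]

print(word_is_palindrome)    # True
print(advice_is_palindrome)  # False
```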

With knowledge of string slices, you can find a hidden message in the message string from above:

>>> message = "hapnhoempaogltepnrprdypaiaitrm"
>>> message[::6]
'helpi'
>>> message[1::6]
'amtra'
>>> message[2::6]
'ppedi'
>>> message[3::6]
'napyt'
>>> message[4::6]
'honpr'
>>> message[5::6]
'ogram'

Or, to put that all together: “Help! I am trapped in a Python program.”

Blocks of Code #

A block of code is a set of multiple lines of code that are run as a sequence and accomplish a single task. A block of code might be used when writing a function (which we will see in Reading 2) or in conditionals and loops (which we will see below). In language, an analogous concept would be a paragraph.

In Python, blocks of code are denoted by whitespace, that is, blank characters such as spaces, line breaks, and tabs. Newlines and indentation separate code ideas both visually and syntactically. The general syntax is:

keyword argument:
    code to be executed
    code to be executed

keyword is a defined word in Python that has a specific meaning, such as if or for. argument is the expected arguments or inputs for that keyword. This is then followed by a colon (:) and newline.

Everything that then belongs under that block must be indented one level inward. This can be done using tab or spaces. It is safer to use spaces (4 for proper style) instead of tab because different machines may interpret the tab character as differing numbers of spaces; most text editors will automatically swap tabs for spaces. It is important to keep the level of indentation the same for each block of code, since otherwise you will get an error.

When running a block of code in the interpreter, you will see ... to denote a continuation of a block, since the interpreter is reading the code you type in line by line and is thus expecting the rest of the block.

>>> if 42 % 6 == 0:
...     print("42 is divisible by 6.")
...

Note that to end the block in the interpreter, you need to press Enter on a blank line.

Conditionals #

A conditional allows you to execute different blocks of Python code depending on certain conditions, such as a variable or expression having a certain value.

The if statement #

Much like English, Python uses the word if to denote a conditional block of code. The if keyword is followed by a boolean expression; if that expression evaluates to True, the code block below it is executed.

if 42 % 5 == 0:
    print("42 is divisible by 5.")

if 42 % 6 == 0:
    print("42 is divisible by 6.")

In the example above, only 42 is divisible by 6. will be printed.

We can also nest if statements inside one another.

if 42 % 3 == 0:
    print("42 is divisible by 3.")
    if 42 % 2 == 0:
        print("42 is divisible by 6.")

This will print both 42 is divisible by 3. and 42 is divisible by 6. on separate lines.

Joining conditions with and, or, and not #

Python also has the keywords and, or, and not, which let you build more complex conditions. Additionally, we can use the keywords True and False directly.

if 42 % 3 == 0 and 42 % 2 == 0:
    print("42 is divisible by 6.")

if False or True:
    print("True!")

if not "Dorothy" in "Kansas":
    print("We must be over the rainbow!")

Enhancing Conditionals with elif and else #

Consider the following code:

x = 10
if x < 0:
    print("negative")
# Check not x < 0 first so that the square root is only taken of
# non-negative numbers.
if not x < 0 and (x ** 0.5) % 1 == 0:
    print("square")
if not x < 0 and (x ** 0.5) % 1 != 0:
    print("not a square")

We can rewrite this using the keywords elif (else if) and else:

x = 10
if x < 0:
    print("negative")
elif (x ** 0.5) % 1 == 0:
    print("square")
else:
    print("not a square")

As in English, else (and elif) indicate that a block should be executed only if the previous conditions have not been met. An elif or else block must follow an if or elif block, and it is skipped if an earlier block in the same chain has been executed. Note that elif requires a boolean expression, while else does not.
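The skipping behavior is easiest to see when more than one condition in a chain is true. In the sketch below, both temperature > 20 and temperature > 10 hold, but only the first matching block runs. The name temperature, its value, and the thresholds are arbitrary choices for illustration.

```python
temperature = 25  # degrees Celsius; arbitrary example value

if temperature > 30:
    description = "hot"
elif temperature > 20:
    # This is the first true condition, so this block runs.
    description = "warm"
elif temperature > 10:
    # Also true, but skipped because an earlier block already ran.
    description = "mild"
else:
    description = "cold"

print(description)  # warm
```

Because of this, the order in which you write the conditions matters.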

Loops #

Loops are another type of code block, and they allow us to repeat similar tasks without retyping code. There are two types of loops: for and while.

For Loops #

For loops allow you to loop over an iterable, which is a collection of items that can be visited one at a time. Currently, the only iterable data type you know is the string. You will learn about other types in future weeks. The structure of a for loop is:

for variable in iterable:
    code to execute

  • for is the Python keyword denoting the loop.
  • variable is a variable name that will take on the value of each item in the iterable, in order, for the block of code.
  • iterable is the collection of things to loop over.

Here’s code that uses a for loop to count the number of vowels in a string.

text = "the quick brown fox jumped over the lazy dogs."
vowels = "aeiou"
vowel_count = 0

for letter in text:
    if letter in vowels:
        vowel_count += 1

print(vowel_count)
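As another sketch of the same pattern, the loop below builds a reversed copy of a string one character at a time. This is only an illustration of how the loop variable takes on each character in order; the names text, reversed_text, and character are made up for this example.

```python
text = "Use the Force"
reversed_text = ""

for character in text:
    # Put each new character in front of what has been built so far.
    reversed_text = character + reversed_text

print(reversed_text)  # ecroF eht esU
```

The result matches what the slice text[::-1] produces.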

While Loops #

While loops allow you to repeat a block of code for as long as a certain condition holds. The structure of a while loop is:

while condition:
    code to execute

One of the more important things to keep in mind is that the code will continue running for as long as the condition is True. If the block in the while loop never causes the condition to become False, the program will run until it is forcibly stopped. To do this, press Ctrl-C (in the interpreter) or Ctrl-M followed by I twice (in Jupyter). In general, if your while loop references a variable, you should ensure that the variable is modified somewhere in the block within the loop as well.

Here is an example while loop which prints a countdown to 0.

n = 10

while n >= 0:
    print(n, "!")
    # Count down so that the condition eventually becomes False.
    n -= 1
print("Happy New Year!")
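While loops are most useful when you do not know in advance how many repetitions you need. As a minimal sketch, the code below counts the digits of a positive integer by repeatedly dropping its last digit with floor division. The names number, remaining, and digit_count, and the value 2021, are arbitrary choices for illustration.

```python
number = 2021  # arbitrary positive example value
remaining = number
digit_count = 0

while remaining > 0:
    # Floor division by 10 drops the last digit, so remaining shrinks
    # on every pass and the loop is guaranteed to stop.
    remaining = remaining // 10
    digit_count += 1

print(digit_count)  # 4
```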

Coding Style and Readability #

It is important to write code that is not only correct, but also easy to read. Well-written, readable code makes it easier for others to understand what your code is doing and why, which in turn makes it easier for them to use or adapt your code. Over time, the Python community has developed a set of guidelines for coding style that programmers widely consider readable. Throughout these readings, we will include some of these guidelines to help you write clearer, more readable code.

We have already given you some style guidelines on choosing variable names. Below, we will cover two more aspects of style and readability.

Line Length #

Long lines of code and text are generally discouraged in programming because they can be difficult to read. On the Web, you may have encountered a page where you had to scroll far to the right to read a single line of text and had to repeat that process for every line. Unfortunately, many applications, especially those that deal with plain text (such as .txt files and code), still do not display long lines very well.

Most Python style guidelines thus limit the length of lines that can be used in code. In this course, we enforce a maximum line length of 80 characters. This is largely consistent with two widely-used sets of style guidelines: Google’s Python style guide, and Python’s PEP8 document (which recommends a maximum length of 79 characters).

There are two ways to cut down overly long lines. You can use a backslash (\) at the end of a line to indicate that the following line should be considered part of the current line:

x = 4 + \
    5

This is the same as the line x = 4 + 5. Alternatively, you could use parentheses:

x = (4 +
     5)

Python will see an open parenthesis and assume that everything until the matching parenthesis is part of one statement.

Whichever of these approaches you use, it is good practice to align the start of each of the lines, as shown above.
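
The examples above are short enough that they would not need splitting in practice. As a rough sketch of what this looks like with lines that are actually long, the code below wraps an arithmetic expression and a string in parentheses. The variable names are illustrative, and the string example relies on the fact that Python joins adjacent string literals into one string.

```python
# Parentheses let a long arithmetic expression span several lines.
total_seconds = (2 * 24 * 60 * 60
                 + 3 * 60 * 60
                 + 15 * 60
                 + 42)

# Adjacent string literals inside parentheses are joined into one string.
message = ("This sentence is too long to fit comfortably on one line, "
           "so it is split into two pieces.")

print(total_seconds)
print(message)
```

The continuation lines are aligned with the first item after the opening parenthesis, which makes it easy to see which lines belong together.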

Comments #

As you write single lines of code or short blocks of code, it may be relatively easy to keep track of what the code is doing and why you wrote it. However, as you write larger programs, it becomes more and more difficult to keep all of your reasoning in your mind.

Comments are lines of text that you can include within code that Python will simply ignore. Most comments in Python start with the number sign (#). They can thus be used to write information not captured by the code, like this:

# Swap the values of x and y. Because assigning to a variable deletes its old
# value, create a temporary variable to hold one of the values.
temp = x
x = y
y = temp

If you were to just see the three lines of code above, it might be difficult to work out why the variable temp is being defined and used in this way. The comments above explain what these three lines accomplish.

Commenting blocks of code in this way is helpful for outlining the broad steps of what you are trying to accomplish in a program. For example, you may have a program that reads a filename, reads the contents of the file, processes the contents in some way, and writes the processed data to another file. While you may find that simply splitting these steps into blocks of code is helpful enough, commenting each block of code can provide more insight into your design, particularly if a block is complex.

Generally, you should try to focus comments on why you wrote the code rather than what the code does. A helpful rule of thumb is to assume that the reader knows more Python than you. It is acceptable to describe what the code is doing at a high level, as shown in the example above. If you find yourself explaining the details of how the code is accomplishing its goals, your comment is likely too deep in the weeds.
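
To make the difference concrete, here is a small sketch that comments the same line of code in two ways. The scenario (a price and a quantity) is made up for illustration.

```python
price = 19.99
quantity = 3

# Less helpful: this comment only restates what the code already says.
# Multiply price by quantity and round to 2 places.
total = round(price * quantity, 2)

# More helpful: this comment explains why the rounding is there.
# Floating-point arithmetic can leave tiny errors in the last digits, so
# round to whole cents before showing the total to a customer.
total = round(price * quantity, 2)

print(total)
```

A reader who knows Python can already see that the line multiplies and rounds. What they cannot see without the second comment is the reason for rounding.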

Finally, you should note that comments are also subject to the 80-character line length limit as described above.

Bugs #

Initially, you will likely encounter two types of bugs in Python: syntax errors and runtime errors (also called exceptions).

Syntax Errors #

In programming languages, syntax refers to the set of rules about what constitutes a valid statement. This term is also used in linguistics - a valid (syntactically correct) English sentence needs to start with a capitalized word and end in a period, question mark, or exclamation point. Note that syntax has nothing to do with the correctness of the code itself, or whether the code will even run successfully. “Colorless green ideas sleep furiously.” is a valid sentence using the rules of English syntax but doesn’t have a widely agreed-upon meaning, and x += 42 on its own is a valid Python statement but won’t execute successfully if x has not already been given a value.

When you try to run code in Python, the interpreter checks the text of the code to try to split it into a series of statements that it can then execute. If this process fails, you get a syntax error. Try running the code below in a Jupyter notebook:

print("Hello world!")
x = (4, 5

As expected, we get a syntax error. Notice that when you get a syntax error, none of the code runs, even if the code before the error is fine (like the print function call above).

If we ran this code in the Python interpreter, we’d see this:

>>> print("Hello world!")
Hello world!
>>> x = (4, 5
...

The interpreter would wait for you to close the parentheses, using the ... prompt to denote an unfinished block of code. You would be able to add the closing parenthesis and have the code run correctly.

When you get an error, it’s also important to pay attention to two pieces of information: the location of the error, and the error message.

The location of the error consists of a file, line number, and position in that line. In a Jupyter notebook, the file is "<ipython-input-...>" (which is the format for cells in a Jupyter notebook) and in the Python interpreter, the file is "<stdin>". In a regular .py file, this would be the name of the file itself. The line number tells you where in the file the error occurred, and the caret (^) shows you where in the line the error occurred. It’s worth noting that the caret will point to the spot where Python first detects that there must be a syntax error, not necessarily to the location that needs to be fixed. Take a look at this example:

x = (1,
y = 5

If you ran this in a Jupyter notebook, you would get the following output:

  File "<ipython-input-1-a60693365e48>", line 2
    y = 5
      ^
SyntaxError: invalid syntax

Notice that it isn’t until = that the Python interpreter knows this must be a syntax error - for all it knows, y could have been defined and part of a larger expression.

The error message can sometimes give you a helpful clue as to what has gone wrong. In the first example above, you should have gotten the error message “unexpected EOF while parsing” (EOF stands for end of file). This message tells you that as the Python interpreter was scanning to find the end of the tuple, it hit the end of the file.

However, sometimes the error message is less helpful: in the second example above, you should have only gotten the message “invalid syntax”, which provides almost no additional information. As a helpful tip, the message “invalid syntax” almost always means that you forgot to close a paired delimiter (i.e., you left out a ), ], or }) or that you left out a colon (:) at the end of a statement starting with if, for, while, etc.

Finally, note that in addition to SyntaxError, there is another type of syntax error you may encounter: IndentationError. You have probably noticed that when you start an if or similar statement, end it with a colon, and press Enter, the next line will be indented by 4 spaces. (The actual number of spaces does not matter, as long as you are consistent.) An IndentationError indicates that a line of code is indented by an unexpected amount (too few or too many spaces).
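
As a hedged illustration, the sketch below triggers an IndentationError on purpose. Because a file containing a syntax error cannot run at all, the faulty code is stored in a string and checked with the built-in compile function. Both compile and try/except are beyond what this reading covers and are used here only to capture the error; the names bad_code, error_name, and error_line are illustrative.

```python
# The line after "if True:" should be indented, but it is not.
bad_code = "if True:\nprint('hello')\n"

error_name = "no error"
error_line = None

try:
    # compile() checks the syntax of the code without running it.
    compile(bad_code, "<example>", "exec")
except SyntaxError as error:
    # IndentationError is a kind of SyntaxError, so it is caught here too.
    error_name = type(error).__name__
    error_line = error.lineno

print(error_name, "reported at line", error_line)
```

The exact wording of the error message varies between Python versions, which is why the sketch prints only the name of the error and the line number.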

Exceptions #

Runtime errors or exceptions are errors that occur while code is running. Most non-syntax errors that cause the code to stop running will fall into this category.

Exceptions in Python can happen for a large number of possible reasons. When you encounter an exception, the most helpful step is usually to read the error message to understand exactly what happened. That being said, the most common errors you will see at this point are likely NameError (the variable or function that you are trying to use has not been defined yet) or TypeError (you mistakenly tried to do something like add an int to a string).
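
The sketch below deliberately triggers both of these exceptions so that you can see their messages. It uses try/except, which is not covered in this reading, only so that the example keeps running after each error. The name undefined_total is intentionally never defined.

```python
try:
    print(undefined_total)  # this name was never assigned a value
except NameError as error:
    name_error_message = str(error)
    print("NameError:", name_error_message)

try:
    result = 5 + "5"  # an int and a string cannot be added together
except TypeError as error:
    type_error_message = str(error)
    print("TypeError:", type_error_message)
```

In both cases, the message names the specific thing that went wrong: the undefined name in the first case, and the two incompatible types in the second.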

You can find a complete list of built-in exceptions in the official Python documentation. You do not have to be familiar with all of these, but it is helpful to look up exceptions as you run into them and familiarize yourself with what specifically these exceptions mean.

If you encounter an exception, there are a number of things you can check that may help you find the bug:

  • Make sure you didn’t make a typo. This is usually quite easy to find - a NameError for a name you thought you defined is probably a typo.
  • Check your types. Getting a TypeError means you are trying to call a function or operation on the wrong data type. You are either calling a function on the wrong variable, or mistakenly assigned the wrong data to a variable. Having good and clear variable names helps avoid potential confusion. Checking the places you are assigning to the variable can help track down the errant line of code.
  • Make sure you have updated your code correctly. For example, if at various points you strip out the first 4 characters of a string like foo[4:] and use this in multiple places, make sure that whenever you change this in one place, you change it in others as well. Better yet, define a function or variable for things like this so that you only have to change it in one place.
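
The last suggestion above can be sketched as follows. PREFIX_LENGTH and the example strings are hypothetical names and values chosen for illustration.

```python
# PREFIX_LENGTH is a hypothetical name for the number of characters to strip.
PREFIX_LENGTH = 4

course_code = "ENGR2510"
room_code = "ROOM0128"

# Both slices use the same variable, so changing PREFIX_LENGTH in one place
# updates every use.
course_number = course_code[PREFIX_LENGTH:]
room_number = room_code[PREFIX_LENGTH:]

print(course_number)
print(room_number)
```

If the prefix later grows to 5 characters, only the first line of code needs to change, and there is no risk of updating one slice while forgetting the other.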

Git #

If you completed Assignment 0, then you have already forked the repository for this course. The remaining worksheet and assignment files will be distributed via this repository. Here, we will describe the steps you need to take to receive the changes we make to the course repository, such as adding new files or modifying submitted files to reflect our feedback to you.

Get Remote Changes #

The changes that are made to the original repository are not automatically communicated to your fork on GitHub or to your local copy on your computer. The proper way to get these changes is to pull them into your local copy.

To do this, cd into your copy of the course repository and run the command git fetch upstream. You should see output like the following:

$ git fetch upstream
> remote: Counting objects: 75, done.
> remote: Compressing objects: 100% (53/53), done.
> remote: Total 62 (delta 27), reused 44 (delta 9)
> Unpacking objects: 100% (62/62), done.
> From https://github.com/olincollege/softdes-2020-03
>  * [new branch]      master     -> upstream/master

In a nutshell, this command has asked the olincollege/softdes-2020-03 repository for its latest changes and added them to your local copy. However, Git does not automatically overwrite your copy with these changes, instead writing them into another branch called upstream/master (shown in the last line of the output). A Git branch is essentially a different history that allows different sets of changes to a repository to coexist so that you can manage several versions of files at once. We will talk more about branches in future readings.

In the next step, you will merge these changes into your master branch.

Merge Remote Changes #

You can merge changes from another branch with the git merge command. Specifically, to merge the remote changes that are now in upstream/master into your local copy, run the following from the local copy of your repository:

$ git merge upstream/master

If you have followed the computational setup, this will open up a window in VS Code where you can edit a commit message. This commit message should be pre-filled with appropriate content, so you should be able to simply save and close the message to complete the merge operation.

Upon trying to merge, you may see a message in the output of git merge that says something like this:

> Automatic merge failed; fix conflicts and then commit the result

If this happens, it is likely because you have made changes in your local copy to a section of a file that has changed in upstream/master, and Git cannot automatically determine which version to keep. This is called a merge conflict. If this does happen, you can use git status to determine which file(s) have merge conflicts.

To fix a merge conflict, open each conflicted file in an editor such as VS Code and edit it so that it contains the content you want to commit. After making the appropriate changes, stage the resolved files with git add, then run git commit and write a commit message as you normally would.