Sets

A set is a data structure designed to test for membership (whether something is in the set or not). Creating a set is similar to creating a dictionary, except that you use single items instead of colon-separated pairs:

hubs = {"Chicago", "Denver", "Houston", "Los Angeles", "Newark",
        "San Francisco", "Washington"}

Note that an empty set cannot be defined with {}, since that represents an empty dictionary. To define an empty set, use set() instead.
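A quick check with type makes the difference visible. This is a minimal sketch; the variable names are chosen only for illustration:

```python
empty_braces = {}   # an empty dictionary, not a set
empty_set = set()   # an empty set

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>
print(len(empty_set))      # 0
```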

In some ways, sets behave similarly to lists. You can find the size of a set with len and you can check whether something is in a set with in.
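As a short illustration, here are len and in applied to the hubs set from above ("Boston" is just an example of a city that is not in the set):

```python
hubs = {"Chicago", "Denver", "Houston", "Los Angeles", "Newark",
        "San Francisco", "Washington"}

print(len(hubs))             # 7
print("Denver" in hubs)      # True
print("Boston" in hubs)      # False
print("Boston" not in hubs)  # True
```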

However, sets do not keep track of the order of the items they contain. You cannot index or slice a set, and duplicate items are not allowed. You can loop through the items of a set, but you might be surprised by the order. For example, if you run this code:

shuffled_nums = {3, 2, 4, 5, 1}
for num in shuffled_nums:
    print(num)

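In CPython, you will most likely see the following output, which does not match the order in which the items were written. The exact order is an implementation detail, so do not rely on it: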

1
2
3
4
5
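The other two restrictions, no duplicates and no indexing, can be sketched the same way. The names repeated_nums and ordered_nums are made up for this example:

```python
repeated_nums = {1, 2, 2, 3, 3, 3}  # duplicates are dropped when the set is created
print(len(repeated_nums))           # 3
print(repeated_nums == {1, 2, 3})   # True

try:
    repeated_nums[0]  # sets have no positions, so indexing fails
except TypeError as error:
    print(f"TypeError: {error}")

# If a predictable order is needed, sort the items into a list first
ordered_nums = sorted(repeated_nums)
print(ordered_nums[0])  # 1
```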

On the other hand, you can do some operations on sets that you cannot do on lists. You can combine two sets to get the items that appear in both sets (“and”), in at least one of the sets (“or”), or in exactly one of the sets (“xor”, pronounced “ex or”):

first_nums = {1, 2, 3, 4, 5}
first_primes = {2, 3, 5, 7, 11}

# Numbers in both sets (and): {2, 3, 5}
first_nums & first_primes

# Numbers in at least one set (or): {1, 2, 3, 4, 5, 7, 11}
first_nums | first_primes

# Numbers in only one set (xor): {1, 4, 7, 11}
first_nums ^ first_primes
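Each of these operators also has a named method on the set type, which some readers find easier to read. The sketch below checks that the two spellings give the same result; the variable names both, either, and only_one are chosen for illustration:

```python
first_nums = {1, 2, 3, 4, 5}
first_primes = {2, 3, 5, 7, 11}

both = first_nums.intersection(first_primes)              # same as &
either = first_nums.union(first_primes)                   # same as |
only_one = first_nums.symmetric_difference(first_primes)  # same as ^

print(both == (first_nums & first_primes))      # True
print(either == (first_nums | first_primes))    # True
print(only_one == (first_nums ^ first_primes))  # True
```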

You can add and remove individual items from sets using add, remove, or discard:

sample_set = {1, 2, 3}
sample_set.add(4)  # sample_set is now {1, 2, 3, 4}
sample_set.remove(2)  # sample_set is now {1, 3, 4}
# sample_set.remove(5) would raise a KeyError, because 5 is not in the set
sample_set.discard(3)  # sample_set is now {1, 4}
sample_set.discard(5)  # No error, but doesn't remove anything

You can also “subtract” one set from another. This returns a new set containing the items of the first set that are not in the second; anything that appears only in the second set is ignored, and neither original set is changed:

first_nums = {1, 2, 3, 4, 5}
first_primes = {2, 3, 5, 7, 11}

first_nums - first_primes  # This returns {1, 4}

As you can see, 2, 3, and 5 were removed, but 7 and 11 were not in first_nums to begin with and are thus ignored.
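Unlike the three operations above, subtraction depends on the order of the operands. A small sketch, comparing results with == so that the printed output does not depend on set ordering:

```python
first_nums = {1, 2, 3, 4, 5}
first_primes = {2, 3, 5, 7, 11}

print(first_nums - first_primes == {1, 4})    # True
print(first_primes - first_nums == {7, 11})   # True

# Subtraction builds a new set; the originals are left untouched
print(first_nums == {1, 2, 3, 4, 5})          # True
```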

Finally, here is an example of a short program that collects the unique characters in a string and prints the result:

hello = "Hello world!"
unique_chars = set()
for character in hello:
    unique_chars.add(character)
print(f"Characters used in string: {unique_chars}")

This prints something like the following (the order of the characters may differ from run to run):

Characters used in string: {'!', 'd', 'w', 'o', 'e', 'r', 'H', 'l', ' '}
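The same result can be reached more compactly, since set() accepts any iterable, including a string. The sketch below also uses len to get the actual count of unique characters and sorted to get a repeatable order for printing:

```python
hello = "Hello world!"
unique_chars = set(hello)  # a string is iterable, so set() can consume it directly

print(len(unique_chars))     # 9
print(sorted(unique_chars))  # [' ', '!', 'H', 'd', 'e', 'l', 'o', 'r', 'w']
```

Note that uppercase and lowercase letters count as different characters, and that the space and the exclamation mark are included.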