Python Basics #11 — Sets
Do you remember learning about sets in elementary school math class? If you’ve been out of school for a while it might be hazy, but you definitely covered things like union, intersection, complement, and difference. Python provides a data structure that mirrors these mathematical sets — appropriately called the set. Let’s dig in.
First, how to create a set object. There are two ways.
The first way is to use the set constructor. The constructor takes a single iterable as an argument (a list, tuple, string, range, and so on); with no arguments it returns an empty set.
Pass a list to the constructor, then check the type:
>>> set_1 = set([1, 2, 3])
>>> print(set_1)
{1, 2, 3}
>>> print(type(set_1))
<class 'set'>
A set object was created.
Now pass a tuple:
>>> set_2 = set((1, 2, 3))
>>> print(set_2)
{1, 2, 3}
>>> print(type(set_2))
<class 'set'>
A set was created, just like with the list.
Now try a string and a range:
>>> set_3 = set('string')
>>> print(set_3)
{'s', 't', 'n', 'r', 'i', 'g'}
>>> print(type(set_3))
<class 'set'>
>>> set_4 = set(range(10))
>>> print(set_4)
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> print(type(set_4))
<class 'set'>
Sets created from string and range objects.
Now create an empty set by calling the constructor with no arguments:
>>> set_5 = set()
>>> print(set_5)
set()
>>> print(type(set_5))
<class 'set'>
Empty set created.
As mentioned, the set constructor takes a single iterable, such as a list, tuple, string, or range. What happens if you pass something else?
Pass three integers as separate arguments:
>>> set(1, 2, 3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: set expected at most 1 argument, got 3
Error: set expects at most 1 argument, but got 3.
Pass a single int:
>>> set(1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
Error: int is not iterable.
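The two errors above suggest the real rule: the constructor wants exactly one argument, and that argument must be iterable. It does not have to be a sequence. A minimal sketch, using hypothetical sample data, with a dict and a generator expression:

```python
# Any iterable works, not only sequences such as list, tuple, str, and range.
prices = {'apple': 120, 'banana': 80}  # hypothetical sample data

# Iterating over a dict yields its keys, so the set holds the keys.
keys = set(prices)

# A generator expression is iterable but is not a sequence.
squares = set(n * n for n in range(4))

print(keys == {'apple', 'banana'})  # True
print(squares == {0, 1, 4, 9})      # True
```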
The second way to create a set is using curly braces. The syntax is similar to creating a list or tuple.
Create a set with curly braces:
>>> set_6 = {1, 2, 3}
>>> print(set_6)
{1, 2, 3}
>>> print(type(set_6))
<class 'set'>
A set was created. But what about empty curly braces? Do they create an empty set? As you learned in the dictionary lesson, empty curly braces create an empty dict, not an empty set:
>>> var = {}
>>> print(var)
{}
>>> print(type(var))
<class 'dict'>
You get an empty dict instead of an empty set. Remember this gotcha.
Now let’s look at the characteristics of sets.
Characteristic 1: sets are unordered.
Create a set:
>>> my_set = {'one', 'two', 'three'}
>>> my_set
{'three', 'one', 'two'}
The items are in a different order than they were defined. The exact order you see may differ from run to run.
Add a new item:
>>> my_set.add('four')
>>> my_set
{'four', 'three', 'one', 'two'}
The new item didn’t go to the end.
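One practical consequence of being unordered is that order plays no part when comparing sets. A small sketch, with variable names chosen only for illustration:

```python
# Two sets are equal when they hold the same items, however they were written.
a = {'one', 'two', 'three'}
b = {'three', 'one', 'two'}
print(a == b)  # True

# Lists, by contrast, compare item by item in order.
print(['one', 'two'] == ['two', 'one'])  # False
```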
Characteristic 2: no indexing.
Because sets are unordered, you can’t use indexes:
>>> my_set[0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'set' object is not subscriptable
Error: set objects don’t support indexing.
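Without indexes, you normally work with a set through membership tests and iteration. If you do need positional access, one option is to build a list first. The sketch below assumes that sorting the items is an acceptable way to get a stable order:

```python
my_set = {'one', 'two', 'three'}

# Membership test: the usual way to ask a set a question.
print('two' in my_set)   # True
print('four' in my_set)  # False

# Iteration visits every item, in no guaranteed order.
for item in my_set:
    print(item)

# For positional access, build a list first; sorted() gives a stable order.
ordered = sorted(my_set)
print(ordered[0])  # one (alphabetically first)
```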
Characteristic 3: only one item with a given value.
Create a set with duplicates:
>>> my_set = {1, 2, 2, 3}
>>> my_set
{1, 2, 3}
Only one 2 remains.
The set already contains 1. Add another 1 with add:
>>> my_set.add(1)
>>> my_set
{1, 2, 3}
Since 1 already exists in the set, it’s not added again. Sets eliminate duplicates, which makes them very useful for extracting unique values from data.
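As a rough illustration of extracting unique values, the sketch below removes duplicates from a list. The list contents are made-up sample data. Since converting to a set loses the original order, it also shows dict.fromkeys as one alternative when first-seen order matters:

```python
# Hypothetical sample data with repeated values.
visits = ['tokyo', 'osaka', 'tokyo', 'kyoto', 'osaka']

# Converting to a set drops duplicates, but the original order is lost.
unique = set(visits)
print(len(unique))  # 3

# If first-seen order matters, dict.fromkeys keeps it
# (dicts preserve insertion order in Python 3.7+).
unique_in_order = list(dict.fromkeys(visits))
print(unique_in_order)  # ['tokyo', 'osaka', 'kyoto']
```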
Characteristic 4: items must be immutable (hashable).
Sets only allow hashable objects as items. The built-in immutable types qualify: bool, int, float, str, frozenset, and tuples whose own items are all hashable. Mutable built-in types like list, set, and dict cannot be used as set items.
Define a set with tuples (immutable) as items:
>>> {(1, 2, 3), (2, 3, 4)}
{(1, 2, 3), (2, 3, 4)}
Created without issues.
Try defining a set with lists (mutable) as items:
>>> {[1, 2, 3], [2, 3, 4]}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
Error: list is unhashable. For the built-in types, you can think of “unhashable” as roughly equivalent to “mutable.”
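To see where the rule of thumb bends, here is a short sketch using the built-in hash function. It shows that a tuple holding a list is still unhashable, and that frozenset lets you put sets inside a set. The variable names are only for illustration:

```python
# hash() succeeds for hashable objects and raises TypeError otherwise.
print(isinstance(hash((1, 2, 3)), int))  # True

# A tuple is immutable, but it is only hashable if all of its items are.
try:
    hash((1, [2, 3]))
    tuple_is_hashable = True
except TypeError:
    tuple_is_hashable = False
print(tuple_is_hashable)  # False

# To nest sets, use frozenset, the immutable and hashable variant of set.
nested = {frozenset({1, 2}), frozenset({2, 3})}
print(frozenset({2, 1}) in nested)  # True
```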
Now let’s look at the methods provided by the set class.
Use help:
>>> help(set)
Help on class set in module builtins:
class set(object)
| set() -> new empty set object
| set(iterable) -> new set object
|
| Build an unordered collection of unique elements.
|
| Methods defined here:
|
| __and__(self, value, /)
| Return self&value.
|
| __contains__(...)
| x.__contains__(y) <==> y in x.
|
| __eq__(self, value, /)
| Return self==value.
|
| __ge__(self, value, /)
| Return self>=value.
|
| __getattribute__(self, name, /)
| Return getattr(self, name).
|
| __gt__(self, value, /)
| Return self>value.
|
| __iand__(self, value, /)
| Return self&=value.
|
| __init__(self, /, *args, **kwargs)
| Initialize self. See help(type(self)) for accurate signature.
|
| __ior__(self, value, /)
| Return self|=value.
|
| __isub__(self, value, /)
| Return self-=value.
|
| __iter__(self, /)
| Implement iter(self).
|
| __ixor__(self, value, /)
| Return self^=value.
|
| __le__(self, value, /)
| Return self<=value.
|
| __len__(self, /)
| Return len(self).
|
| __lt__(self, value, /)
| Return self<value.
|
| __ne__(self, value, /)
| Return self!=value.
|
| __or__(self, value, /)
| Return self|value.
|
| __rand__(self, value, /)
| Return value&self.
|
| __reduce__(...)
| Return state information for pickling.
|
| __repr__(self, /)
| Return repr(self).
|
| __ror__(self, value, /)
| Return value|self.
|
| __rsub__(self, value, /)
| Return value-self.
|
| __rxor__(self, value, /)
| Return value^self.
|
| __sizeof__(...)
| S.__sizeof__() -> size of S in memory, in bytes
|
| __sub__(self, value, /)
| Return self-value.
|
| __xor__(self, value, /)
| Return self^value.
|
| add(...)
| Add an element to a set.
|
| This has no effect if the element is already present.
|
| clear(...)
| Remove all elements from this set.
|
| copy(...)
| Return a shallow copy of a set.
|
| difference(...)
| Return the difference of two or more sets as a new set.
|
| (i.e. all elements that are in this set but not the others.)
|
| difference_update(...)
| Remove all elements of another set from this set.
|
| discard(...)
| Remove an element from a set if it is a member.
|
| If the element is not a member, do nothing.
|
| intersection(...)
| Return the intersection of two sets as a new set.
|
| (i.e. all elements that are in both sets.)
|
| intersection_update(...)
| Update a set with the intersection of itself and another.
|
| isdisjoint(...)
| Return True if two sets have a null intersection.
|
| issubset(...)
| Report whether another set contains this set.
|
| issuperset(...)
| Report whether this set contains another set.
|
| pop(...)
| Remove and return an arbitrary set element.
| Raises KeyError if the set is empty.
|
| remove(...)
| Remove an element from a set; it must be a member.
|
| If the element is not a member, raise a KeyError.
|
| symmetric_difference(...)
| Return the symmetric difference of two sets as a new set.
|
| (i.e. all elements that are in exactly one of the sets.)
|
| symmetric_difference_update(...)
| Update a set with the symmetric difference of itself and another.
|
| union(...)
| Return the union of sets as a new set.
|
| (i.e. all elements that are in either set.)
|
| update(...)
| Update a set with the union of itself and others.
|
| ----------------------------------------------------------------------
| Class methods defined here:
|
| __class_getitem__(...) from builtins.type
| See PEP 585
|
| ----------------------------------------------------------------------
| Static methods defined here:
|
| __new__(*args, **kwargs) from builtins.type
| Create and return a new object. See help(type) for accurate signature.
|
| ----------------------------------------------------------------------
| Data and other attributes defined here:
|
| __hash__ = None
The set class provides many methods. Let’s cover the most common ones.
add
>>> help(set.add)
Help on method_descriptor:
add(...)
Add an element to a set.
This has no effect if the element is already present.
add adds an item to a set. As mentioned, duplicates are not added.
Create an empty set and add items:
>>> myset = set()
>>> print(myset)
set()
>>>
>>> myset.add(1)
>>> print(myset)
{1}
>>>
>>> myset.add(2)
>>> print(myset)
{1, 2}
1 and 2 were added to the empty set.
pop
>>> help(set.pop)
Help on method_descriptor:
pop(...)
Remove and return an arbitrary set element.
Raises KeyError if the set is empty.
pop removes and returns an arbitrary item. If the set is empty, it raises KeyError.
Use pop:
>>> myset.pop()
1
>>> print(myset)
{2}
Item 1 was returned and removed.
Run again:
>>> myset.pop()
2
>>> print(myset)
set()
Item 2 was returned, leaving an empty set.
Try pop on an empty set:
>>> myset.pop()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'pop from an empty set'
KeyError is raised on an empty set.
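A common way to avoid that KeyError is to check whether the set is empty before calling pop. The sketch below drains a set of hypothetical work items. It relies on the fact that an empty set is falsy:

```python
pending = {'a', 'b', 'c'}  # hypothetical work items
processed = []

# An empty set is falsy, so the loop stops before pop() could raise KeyError.
while pending:
    processed.append(pending.pop())

# pop() returns items in arbitrary order, so sort before displaying.
print(sorted(processed))  # ['a', 'b', 'c']
print(pending)            # set()
```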
remove
>>> help(set.remove)
Help on method_descriptor:
remove(...)
Remove an element from a set; it must be a member.
If the element is not a member, raise a KeyError.
remove also deletes items, but unlike pop (which takes no argument and removes an arbitrary item), remove takes the value to delete and raises KeyError if that value isn’t a member.
Define a set and use remove:
>>> myset = {1, 2, 3}
>>> myset.remove(1)
>>> print(myset)
{2, 3}
Item 1 was removed.
Try removing a value that isn’t in the set:
>>> myset.remove(4)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 4
KeyError is raised.
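If a missing value should not count as an error, the discard method listed in the help output above is the quieter alternative. A minimal sketch comparing the two approaches:

```python
myset = {1, 2, 3}

# discard removes the value if present and does nothing otherwise.
myset.discard(1)
myset.discard(4)  # 4 is not a member; no exception is raised
print(myset == {2, 3})  # True

# With remove, guard the call with a membership test (or catch KeyError).
if 4 in myset:
    myset.remove(4)
print(myset == {2, 3})  # True
```

Which one to use is a design choice: remove is the better fit when a missing value signals a bug, discard when it is harmless.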
union
>>> help(set.union)
Help on method_descriptor:
union(...)
Return the union of sets as a new set.
(i.e. all elements that are in either set.)
Sets resemble math sets, so you can compute things like union, intersection, and difference between two or more sets. union returns the union: all elements in either set.

Create two sets:
>>> set_1 = {1, 2, 3}
>>> set_2 = {3, 4, 5}
Compute the union:
>>> set_1.union(set_2)
{1, 2, 3, 4, 5}
intersection
Returns only the items present in both sets.

>>> set_1.intersection(set_2)
{3}
Only the common item 3 is returned.
difference
Returns the items of the set you call it on that are not present in the other set.

>>> set_1.difference(set_2)
{1, 2}
Items 1 and 2 remain, with 3 (present in set_2) removed.
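The same operations are also available as operators, which correspond to the __or__, __and__, __sub__, and __xor__ entries in the help output. The sketch below reuses set_1 and set_2 and sorts the results only to make the printed order predictable:

```python
set_1 = {1, 2, 3}
set_2 = {3, 4, 5}

union = set_1 | set_2           # same result as set_1.union(set_2)
common = set_1 & set_2          # same result as set_1.intersection(set_2)
only_in_1 = set_1 - set_2       # same result as set_1.difference(set_2)
in_exactly_one = set_1 ^ set_2  # same result as set_1.symmetric_difference(set_2)

print(sorted(union))           # [1, 2, 3, 4, 5]
print(sorted(common))          # [3]
print(sorted(only_in_1))       # [1, 2]
print(sorted(in_exactly_one))  # [1, 2, 4, 5]

# Difference depends on which set comes first.
print(sorted(set_2 - set_1))   # [4, 5]

# The methods accept any iterable; the operators require sets on both sides.
print(sorted(set_1.union([3, 4, 5])))  # [1, 2, 3, 4, 5]
```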
The set class provides more useful methods beyond the ones above — try them out. Sets aren’t used as often as lists or dictionaries, but they’re very handy in the right situations. Get comfortable with them.
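As a starting point for experimenting, here is a short sketch of a few other methods from the help output: issubset, issuperset, isdisjoint, and update. The variable names and values are made up for illustration:

```python
small = {1, 2}
big = {1, 2, 3}

print(small.issubset(big))       # True: every item of small is in big
print(big.issuperset(small))     # True: big contains every item of small
print(small.isdisjoint({8, 9}))  # True: no items in common

# update adds items from an iterable in place; duplicates are ignored.
tags = {'python'}
tags.update(['sets', 'python'])
print(tags == {'python', 'sets'})  # True
```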
In the next lesson we’ll cover if statements, one of the most common control structures used to direct program flow.
Thanks for reading.