Sets#
A set is an unordered collection of unique elements, analogous to the mathematical concept of a set. The most common uses are deduplication and set operations such as union and intersection:
```python
# Create a set
s1 = {1, 2, 3, 4, 5}
s2 = {"apple", "banana", "cherry"}

# Duplicate elements are automatically removed
s3 = {1, 2, 2, 3, 3, 3}
print(s3) # {1, 2, 3} (order not guaranteed)

# Empty set must use set() (because {} is an empty dict)
empty_set = set()
print(type(empty_set)) # <class 'set'>
print(type({})) # <class 'dict'>

# Create a set from a list (deduplication)
nums = [1, 2, 2, 3, 3, 3, 4]
unique = set(nums)
print(unique) # {1, 2, 3, 4}
```
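One detail the examples above rely on without stating: set elements must be hashable. The sketch below assumes standard CPython behavior and uses illustrative variable names (`points`, `nested`) that are not part of the original examples.

```python
# Hashable values such as numbers, strings, and tuples can be set elements
points = {(0, 0), (1, 2), (1, 2)}
print(len(points))  # 2

# A list is mutable and unhashable, so using it as an element raises TypeError
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {error_message}")

# frozenset is an immutable set, so it can be nested inside another set
nested = {frozenset({1, 2}), frozenset({2, 1})}
print(len(nested))  # 1
```

If a collection of lists needs deduplicating, converting each list to a tuple first is one common workaround.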
Adding and Removing#
```python
s = {1, 2, 3}

# add(): add a single element
s.add(4)
print(s) # {1, 2, 3, 4}

# update(): add multiple elements from any iterable
s.update([5, 6])
print(s) # {1, 2, 3, 4, 5, 6}

# discard(): remove element, no error if not found
s.discard(6)
s.discard(99) # no error
print(s) # {1, 2, 3, 4, 5}

# remove(): remove element, raises KeyError if not found
s.remove(5)
print(s) # {1, 2, 3, 4}

# pop(): remove and return an arbitrary element (not a random one)
item = s.pop()
print(item) # which element is returned is not specified

# clear(): remove all elements
s.clear()
print(s) # set()
```
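The difference between remove() and discard() only shows up when the element is missing. A small sketch of the failure cases, with flag variables (`removed`, `popped`) introduced purely for illustration:

```python
s = {1, 2, 3}

# remove() raises KeyError for a missing element
try:
    s.remove(99)
    removed = True
except KeyError:
    removed = False
print(removed)  # False

# discard() silently does nothing in the same situation
s.discard(99)
print(s == {1, 2, 3})  # True

# pop() on an empty set also raises KeyError
empty = set()
try:
    empty.pop()
    popped = True
except KeyError:
    popped = False
print(popped)  # False
```

As a rule of thumb, discard() suits cases where a missing element is acceptable, and remove() suits cases where it would indicate a bug.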
Querying#
```python
s = {1, 2, 3, 4, 5}

# Check existence
print(3 in s) # True
print(9 in s) # False

# Length
print(len(s)) # 5
```
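The in operator on a set is much faster than on a list for large collections: a set uses a hash lookup that takes O(1) time on average, while a list must be scanned element by element in O(n) time. This makes sets ideal for large-scale membership tests: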
```python
# Prefer set for large-scale lookups
valid_users = {"alice", "bob", "charlie", "david"}

username = "alice"
if username in valid_users:
    print("User exists")
```
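To see the lookup difference in practice, a rough timing comparison can be sketched with the standard timeit module. The collection size and repetition count below are arbitrary assumptions, and absolute timings will vary by machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # last element: the worst case for a list scan

# Time 200 membership tests against each container
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.5f}s")
print(f"set:  {set_time:.5f}s")
```

For a handful of elements the difference is negligible, so converting a small list to a set purely for speed is rarely worthwhile.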
Set Operations#
```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7}
```
Union#
```python
# All elements in a or b
print(a | b) # {1, 2, 3, 4, 5, 6, 7}
print(a.union(b)) # same as above
```
Intersection#
```python
# Elements common to both a and b
print(a & b) # {3, 4, 5}
print(a.intersection(b)) # same as above
```
Difference#
```python
# Elements in a but not in b
print(a - b) # {1, 2}
print(a.difference(b)) # same as above

# Elements in b but not in a
print(b - a) # {6, 7}
```
Symmetric Difference#
```python
# Elements in a only or b only (excluding common elements)
print(a ^ b) # {1, 2, 6, 7}
print(a.symmetric_difference(b)) # same as above
```
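The operator and method forms shown for these four operations are not fully interchangeable: the methods accept any iterable, while the operators require both operands to be sets. The sketch below illustrates this, along with the in-place augmented operators; the names `from_method`, `operator_ok`, and `c` are illustrative.

```python
a = {1, 2, 3, 4, 5}

# The method form accepts any iterable, here a list
from_method = a.union([5, 6, 7])
print(from_method == {1, 2, 3, 4, 5, 6, 7})  # True

# The operator form requires both operands to be sets
try:
    a | [5, 6, 7]
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)  # False

# Augmented assignment modifies the set in place
c = {1, 2, 3}
c |= {4}        # same effect as c.update({4})
c &= {2, 3, 4}  # same effect as c.intersection_update({2, 3, 4})
print(c == {2, 3, 4})  # True
```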
Subsets and Supersets#
```python
a = {1, 2, 3}
b = {1, 2, 3, 4, 5}

print(a.issubset(b)) # True (a is a subset of b)
print(b.issuperset(a)) # True (b is a superset of a)

# Two sets with no common elements
print(a.isdisjoint({6, 7})) # True
print(a.isdisjoint({3, 6})) # False
```
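Subset and superset tests also have operator forms, which additionally distinguish proper subsets. A short sketch using the same `a` and `b` as above:

```python
a = {1, 2, 3}
b = {1, 2, 3, 4, 5}

print(a <= b)  # True, same as a.issubset(b)
print(b >= a)  # True, same as b.issuperset(a)

# < and > test for a proper subset or superset (the two sets must differ)
print(a < b)   # True
print(a < a)   # False, a set is not a proper subset of itself
print(a <= a)  # True
```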
Set Comprehension#
Set comprehension uses nearly the same syntax as list comprehension; the only difference is curly braces {} instead of square brackets []. The result is automatically deduplicated and unordered. The two general forms are:
{expression for variable in iterable}
{expression for variable in iterable if condition}
```python
numbers = [1, 2, 2, 3, 3, 4, 5]

# List comprehension → keeps duplicates, ordered
list_result = [x ** 2 for x in numbers]
print(list_result) # [1, 4, 4, 9, 9, 16, 25]

# Set comprehension → deduplicates automatically, unordered
set_result = {x ** 2 for x in numbers}
print(set_result) # {1, 4, 9, 16, 25}
```
With condition#
```python
# Squares of even numbers only
even_squares = {x ** 2 for x in range(1, 11) if x % 2 == 0}
print(even_squares) # {4, 16, 36, 64, 100}

# Collect unique first letters from a word list
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
initials = {w[0] for w in words}
print(initials) # {'a', 'b', 'c'} (order not guaranteed)
```
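Because the expression is evaluated before deduplication, a set comprehension can normalize values and remove duplicates in one step. A minimal sketch with made-up email addresses (the `emails` data is purely illustrative):

```python
emails = ["Alice@example.com", "alice@example.com ", "BOB@example.com"]

# Strip whitespace and lowercase each address, then deduplicate
normalized = {e.strip().lower() for e in emails}

print(len(normalized))     # 2
print(sorted(normalized))  # ['alice@example.com', 'bob@example.com']
```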
Comparing list vs set comprehension#
| | List comprehension [...] | Set comprehension {...} |
|---|---|---|
| Result type | list | set |
| Keeps duplicates | ✅ | ❌ (auto-deduped) |
| Preserves order | ✅ | ❌ (unordered) |
| Best for | when order or duplicates matter | when you only care which values exist |
Practical Examples#
Remove duplicate elements from a list#
```python
data = [1, 3, 2, 3, 1, 4, 2, 5]
unique = sorted(set(data))
print(unique) # [1, 2, 3, 4, 5]
```
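Note that sorted(set(data)) returns the values in sorted order, not in their original order. If first-occurrence order matters, one alternative (a different technique from the set-based one above) is dict.fromkeys, which relies on dictionaries preserving insertion order in Python 3.7 and later:

```python
data = [1, 3, 2, 3, 1, 4, 2, 5]

# dict keys are unique and keep insertion order, so duplicates are
# dropped while the first occurrence of each value stays in place
unique_in_order = list(dict.fromkeys(data))
print(unique_in_order)  # [1, 3, 2, 4, 5]
```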
Find common elements between two lists#
```python
list1 = ["Alice", "Bob", "Charlie", "David"]
list2 = ["Bob", "David", "Eve", "Frank"]

common = set(list1) & set(list2)
print(common) # {'Bob', 'David'}
```
Find elements unique to each list#
```python
# Reuses list1 and list2 from the previous example
only_in_1 = set(list1) - set(list2)
only_in_2 = set(list2) - set(list1)
print(only_in_1) # {'Alice', 'Charlie'}
print(only_in_2) # {'Eve', 'Frank'}
```
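When only the combined result is needed (elements that appear in exactly one of the two lists), the symmetric difference gives it in a single step. A short sketch that repeats the two lists so it runs on its own:

```python
list1 = ["Alice", "Bob", "Charlie", "David"]
list2 = ["Bob", "David", "Eve", "Frank"]

# Elements present in exactly one of the two lists
in_exactly_one = set(list1) ^ set(list2)
print(sorted(in_exactly_one))  # ['Alice', 'Charlie', 'Eve', 'Frank']
```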
Count unique words in text#
```python
text = "to be or not to be that is the question"
words = text.split()

unique_words = set(words)
print(f"Total words: {len(words)}") # 10
print(f"Unique words: {len(unique_words)}") # 8
print(unique_words)
```
Permission management#
```python
admin_permissions = {"read", "write", "delete", "manage_users"}
editor_permissions = {"read", "write"}

user_role = "editor"
user_permissions = editor_permissions if user_role == "editor" else admin_permissions

# Check for a specific permission
if "delete" in user_permissions:
    print("Can delete")
else:
    print("No delete permission") # No delete permission

# Find missing permissions
missing = admin_permissions - user_permissions
print(f"Missing permissions: {missing}") # {'delete', 'manage_users'} (order not guaranteed)
```
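Checking several permissions at once is a natural fit for a subset test. The helper below, `can_perform`, is a hypothetical function written for this illustration and not part of any library:

```python
editor_permissions = {"read", "write"}

def can_perform(user_permissions, required):
    """Return True if the user holds every required permission (hypothetical helper)."""
    return required <= user_permissions

print(can_perform(editor_permissions, {"read", "write"}))    # True
print(can_perform(editor_permissions, {"write", "delete"}))  # False
```

Using a subset test avoids writing one membership check per permission and keeps the required set easy to change.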