Sets#

A set is an unordered collection of unique, hashable elements, analogous to a mathematical set. Its most common uses are deduplication, fast membership tests, and set operations such as union and intersection:

# Create a set
s1 = {1, 2, 3, 4, 5}
s2 = {"apple", "banana", "cherry"}

# Duplicate elements are automatically removed
s3 = {1, 2, 2, 3, 3, 3}
print(s3)  # {1, 2, 3} (order not guaranteed)

# Empty set must use set() (because {} is an empty dict)
empty_set = set()
print(type(empty_set))   # <class 'set'>
print(type({}))          # <class 'dict'>

# Create a set from a list (deduplication)
nums = [1, 2, 2, 3, 3, 3, 4]
unique = set(nums)
print(unique)  # {1, 2, 3, 4}
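
One constraint the examples above do not show is that set elements must be hashable. The sketch below is illustrative only; the names `points`, `groups`, and `error_message` are made up for the example. It shows a tuple being accepted, a list being rejected, and `frozenset` serving as a hashable stand-in when a set of sets is needed.

```python
# Hashable values (numbers, strings, tuples of hashables) can be set elements.
points = {(0, 0), (1, 2), (1, 2)}
print(len(points))  # 2

# A list is mutable and therefore unhashable, so it cannot be an element.
error_message = None
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# frozenset is the immutable, hashable counterpart of set,
# so it can be nested inside another set.
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))  # 2
```

The exact wording of the `TypeError` message varies between Python versions, so the example prints it rather than relying on a fixed string.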

Adding and Removing#

s = {1, 2, 3}

# add(): add a single element
s.add(4)
print(s)  # {1, 2, 3, 4}

# update(): add multiple elements
s.update([5, 6])
print(s)  # {1, 2, 3, 4, 5, 6}

# discard(): remove element, no error if not found
s.discard(6)
s.discard(99)  # no error
print(s)  # {1, 2, 3, 4, 5}

# remove(): remove element, raises KeyError if not found
s.remove(5)
print(s)  # {1, 2, 3, 4}

# pop(): remove and return an arbitrary element (not a random one)
item = s.pop()
print(item)  # which element is returned is not specified

# clear(): clear the set
s.clear()
print(s)  # set()
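
The difference between remove() and discard() matters most when the element may be absent. The following is a minimal sketch of the two failure modes, using hypothetical variable names (`removed`, `popped`); it also covers pop() on an empty set, which raises KeyError as well.

```python
s = {1, 2, 3}

# remove() raises KeyError for a missing element, so guard it
# whenever absence is a real possibility.
try:
    s.remove(99)
except KeyError:
    removed = False
else:
    removed = True
print(removed)  # False

# discard() is the quieter choice when a missing element is not an error.
s.discard(99)
print(s)  # {1, 2, 3}

# pop() on an empty set also raises KeyError, so check for emptiness first.
empty = set()
popped = empty.pop() if empty else None
print(popped)  # None
```

As a rule of thumb, remove() suits cases where a missing element indicates a bug, and discard() suits cases where it is expected.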

Querying#

s = {1, 2, 3, 4, 5}

# Check existence
print(3 in s)     # True
print(9 in s)     # False

# Length
print(len(s))     # 5

Membership testing with the in operator takes O(1) time on average for a set, compared with O(n) for a list, which makes sets well suited to repeated lookups in large collections:

# Prefer set for large-scale lookups
valid_users = {"alice", "bob", "charlie", "david"}

username = "alice"
if username in valid_users:
    print("User exists")
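
To see the difference in practice, one option is a rough timing comparison with the standard-library timeit module. The collection size and repeat count below are arbitrary choices for illustration, and the absolute numbers depend on the machine, so no expected output is shown.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the element sits at the very end

# Run each membership test 100 times and report the total elapsed seconds.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.6f} s")
print(f"set:  {set_time:.6f} s")
```

Building the set has an O(n) cost of its own, so the conversion pays off when the same collection is queried many times.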

Set Operations#

a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7}

Union#

# All elements in a or b
print(a | b)           # {1, 2, 3, 4, 5, 6, 7}
print(a.union(b))      # same as above

Intersection#

# Elements common to both a and b
print(a & b)                   # {3, 4, 5}
print(a.intersection(b))       # same as above

Difference#

# Elements in a but not in b
print(a - b)                   # {1, 2}
print(a.difference(b))         # same as above

# Elements in b but not in a
print(b - a)                   # {6, 7}

Symmetric Difference#

# Elements in a only or b only (excluding common elements)
print(a ^ b)                          # {1, 2, 6, 7}
print(a.symmetric_difference(b))      # same as above
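
The operator and method forms are not fully interchangeable. As a brief sketch (variable names such as `merged` and `operator_accepts_list` are illustrative), the methods accept any iterable, the operators require a set on both sides, and the augmented operators modify the left-hand set in place.

```python
a = {1, 2, 3, 4, 5}

# Methods accept any iterable, such as a list or a range.
merged = a.union([6, 7])
shared = a.intersection(range(3))
print(merged)  # {1, 2, 3, 4, 5, 6, 7}
print(shared)  # {1, 2}

# Operators require a set on both sides; a list raises TypeError.
try:
    a | [6, 7]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)  # False

# Augmented operators update the left-hand set in place.
a |= {6}            # like a.update({6})
a &= {1, 2, 6, 9}   # like a.intersection_update({1, 2, 6, 9})
a -= {1}            # like a.difference_update({1})
print(a)  # {2, 6}
```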

Subsets and Supersets#

a = {1, 2, 3}
b = {1, 2, 3, 4, 5}

print(a.issubset(b))     # True (a is a subset of b)
print(b.issuperset(a))   # True (b is a superset of a)

# Two sets with no common elements
print(a.isdisjoint({6, 7}))  # True
print(a.isdisjoint({3, 6}))  # False
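
The subset and superset checks also have comparison-operator forms. This short sketch shows them, including the strict variants `<` and `>`, which additionally require the two sets to differ; the result names are hypothetical.

```python
a = {1, 2, 3}
b = {1, 2, 3, 4, 5}

is_subset = a <= b         # same as a.issubset(b)
is_superset = b >= a       # same as b.issuperset(a)
is_proper_subset = a < b   # subset and a != b
self_subset = a <= a       # every set is a subset of itself
self_proper = a < a        # but never a proper subset of itself

print(is_subset, is_superset, is_proper_subset)  # True True True
print(self_subset, self_proper)                  # True False
```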

Set Comprehension#

A set comprehension uses nearly the same syntax as a list comprehension; the only difference is curly braces {} instead of square brackets []. The result is a set, so it is automatically deduplicated and unordered.

{expression for variable in iterable}
{expression for variable in iterable if condition}

Basic form and automatic deduplication#

numbers = [1, 2, 2, 3, 3, 4, 5]

# List comprehension → keeps duplicates, ordered
list_result = [x ** 2 for x in numbers]
print(list_result)   # [1, 4, 4, 9, 9, 16, 25]

# Set comprehension → deduplicates automatically, unordered
set_result = {x ** 2 for x in numbers}
print(set_result)    # {1, 4, 9, 16, 25}

With condition#

# Squares of even numbers only
even_squares = {x ** 2 for x in range(1, 11) if x % 2 == 0}
print(even_squares)  # {4, 16, 36, 64, 100}

# Collect unique first letters from a word list
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
initials = {w[0] for w in words}
print(initials)   # {'a', 'b', 'c'} (order not guaranteed)
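
A set comprehension is also a convenient place to normalize values before deduplicating them. The sketch below uses made-up email addresses and collects the distinct domains case-insensitively; sorting the result is only there to make the printed order predictable.

```python
# Hypothetical sample data for illustration.
emails = ["Ann@Example.com", "bob@example.com", "cy@Test.org", "dee@test.org"]

# Lowercasing inside the comprehension makes "Example.com" and
# "example.com" collapse into a single element.
domains = {address.split("@")[1].lower() for address in emails}

print(sorted(domains))  # ['example.com', 'test.org']
```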

Comparing list vs set comprehension#

| | List comprehension [...] | Set comprehension {...} |
|---|---|---|
| Result type | list | set |
| Keeps duplicates | ✅ | ❌ (auto-deduped) |
| Preserves order | ✅ | ❌ (unordered) |
| Best for | when order or duplicates matter | when you only care which values exist |

Practical Examples#

Remove duplicate elements from a list#

data = [1, 3, 2, 3, 1, 4, 2, 5]
unique = sorted(set(data))
print(unique)  # [1, 2, 3, 4, 5]
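
Note that sorted(set(data)) returns the values in sorted order, not in the order they first appeared. When the original order matters, two common alternatives are sketched below; the helper name `dedupe` is hypothetical. The first relies on dicts preserving insertion order (Python 3.7+), and the second tracks seen values in a set.

```python
data = [1, 3, 2, 3, 1, 4, 2, 5]

# Option 1: dict keys are unique and keep insertion order.
in_order = list(dict.fromkeys(data))
print(in_order)  # [1, 3, 2, 4, 5]


# Option 2: an explicit loop with a set of already-seen values.
def dedupe(items):
    """Return items with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(dedupe(data))  # [1, 3, 2, 4, 5]
```

The loop version is longer, but it is easy to adapt, for example to deduplicate by a derived key rather than by the item itself.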

Find common elements between two lists#

list1 = ["Alice", "Bob", "Charlie", "David"]
list2 = ["Bob", "David", "Eve", "Frank"]

common = set(list1) & set(list2)
print(common)  # {'Bob', 'David'}

Find elements unique to each list#

only_in_1 = set(list1) - set(list2)
only_in_2 = set(list2) - set(list1)
print(only_in_1)  # {'Alice', 'Charlie'}
print(only_in_2)  # {'Eve', 'Frank'}

Count unique words in text#

text = "to be or not to be that is the question"
words = text.split()

unique_words = set(words)
print(f"Total words: {len(words)}")           # 10
print(f"Unique words: {len(unique_words)}")   # 8
print(unique_words)

Permission management#

admin_permissions = {"read", "write", "delete", "manage_users"}
editor_permissions = {"read", "write"}

user_role = "editor"
user_permissions = editor_permissions if user_role == "editor" else admin_permissions

# Check for a specific permission
if "delete" in user_permissions:
    print("Can delete")
else:
    print("No delete permission")  # No delete permission

# Find missing permissions
missing = admin_permissions - user_permissions
print(f"Missing permissions: {missing}")  # Missing permissions: {'delete', 'manage_users'} (order not guaranteed)
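
When an action needs several permissions at once, a subset test can replace a chain of individual in checks. The following is a minimal sketch under assumed names; `can_perform` and the `required` sets are hypothetical and not part of the example above.

```python
def can_perform(required, granted):
    """Return True if every required permission is in the granted set."""
    return required <= granted  # same as required.issubset(granted)


editor_permissions = {"read", "write"}

print(can_perform({"read", "write"}, editor_permissions))   # True
print(can_perform({"read", "delete"}, editor_permissions))  # False

# The set difference reports exactly which permissions are lacking.
lacking = {"read", "delete"} - editor_permissions
print(lacking)  # {'delete'}
```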