
3.4.1 Unique collections and set operations

Unique Collections:

In Python, a set is a collection that stores each value at most once. If you add a value that is already present, the set is left unchanged, so duplicates never accumulate. This makes sets a good fit whenever you need to guarantee that no value is stored twice.

Key Characteristics of Sets:
  • Unordered: Elements in a set have no guaranteed order, so the order you see when printing or iterating should not be relied on.
  • Mutable: You can add or remove elements from a set after it’s created.
  • Unique Elements: A set keeps only one copy of each value. Elements must be hashable, so mutable objects such as lists cannot be stored in a set.
  • No Indexing: Because there is no order, sets do not support indexing, slicing, or other sequence-like behavior.
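
The characteristics above can be checked with a short sketch. The set name colors and its contents are made up for illustration.

```python
# Hypothetical example set; the name `colors` is illustrative only.
colors = {"red", "green"}

# Mutable: elements can be added and removed after creation.
colors.add("blue")
colors.discard("red")   # discard() does nothing if the element is absent

# Unique: adding an element that is already present has no effect.
colors.add("green")
print(len(colors))      # 2

# No indexing: subscripting a set raises TypeError.
try:
    colors[0]
except TypeError as exc:
    indexing_error = str(exc)
    print("TypeError:", indexing_error)
```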

Creating a Set (Unique Collection):

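A non-empty set can be created by listing its elements inside curly braces, or by passing any iterable to the set() constructor. Empty braces {} create an empty dictionary, not a set, so an empty set must be written as set().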

# Example of a set with unique elements
unique_set = {1, 2, 3, 4, 5}
print(unique_set)  # Output: {1, 2, 3, 4, 5}

# A set does not allow duplicates
duplicate_set = {1, 2, 2, 3}
print(duplicate_set)  # Output: {1, 2, 3} (Duplicates removed)

In the second example, even though 2 appears twice in the literal, the set contains it only once, demonstrating how sets handle uniqueness automatically.
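
The set() constructor is worth a brief sketch of its own, since it is the only way to create an empty set and it accepts any iterable. The variable names here are illustrative.

```python
empty_set = set()    # an empty set
empty_dict = {}      # NOT a set: empty braces create a dict

print(type(empty_set).__name__)    # set
print(type(empty_dict).__name__)   # dict

# set() accepts any iterable and keeps one copy of each element.
letters = set("banana")
print(len(letters))  # 3 (the elements are 'a', 'b' and 'n')
```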

Set Operations:

Python sets support the standard operations of mathematical set theory. The most common are union, intersection, difference, and symmetric difference, along with tests for subset, superset, and disjointness. Each operator shown below returns a new set and leaves its operands unchanged.

1. Union (|)

The union of two sets contains every element that appears in either set. Elements present in both sets appear only once in the result.

set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Union of set1 and set2
union_set = set1 | set2
print(union_set)  # Output: {1, 2, 3, 4, 5}
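
As a complement to the operator form, here is a small sketch of the equivalent union() method and of the in-place variant. Unlike the | operator, the method accepts any iterable, and several at once.

```python
set1 = {1, 2, 3}

# union() accepts any iterables, and more than one at a time.
combined = set1.union([3, 4], (5, 6))
print(combined == {1, 2, 3, 4, 5, 6})  # True

# The | operator needs a set on both sides; a list raises TypeError.
operator_rejected_list = False
try:
    set1 | [3, 4]
except TypeError:
    operator_rejected_list = True
print(operator_rejected_list)          # True

# |= (or update()) modifies the set in place instead of building a new one.
set1 |= {7}
print(set1 == {1, 2, 3, 7})            # True
```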

2. Intersection (&)

The intersection of two sets results in a set containing only the elements that are present in both sets.

set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Intersection of set1 and set2
intersection_set = set1 & set2
print(intersection_set)  # Output: {3}
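
The same result is available through the intersection() method, sketched below. It takes any iterable and can intersect more than two collections in one call.

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

# intersection() accepts any iterable, not only sets.
with_list = set1.intersection([2, 3, 9])
print(with_list == {2, 3})          # True

# Several arguments: keep only elements present in all of them.
common_to_all = set1.intersection(set2, {3, 7})
print(common_to_all == {3})         # True

# No shared elements gives an empty set, which prints as set().
nothing_shared = set1 & {8, 9}
print(nothing_shared == set())      # True
```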

3. Difference (-)

The difference between two sets returns a set containing the elements that are in the first set but not in the second set.

set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Difference of set1 and set2
difference_set = set1 - set2
print(difference_set)  # Output: {1, 2}
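
Unlike union and intersection, difference depends on the order of its operands. A minimal sketch of that asymmetry:

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

left_only = set1 - set2    # elements of set1 that are not in set2
right_only = set2 - set1   # elements of set2 that are not in set1

print(left_only == {1, 2})     # True
print(right_only == {4, 5})    # True

# difference() can remove the elements of several iterables at once.
remaining = set1.difference(set2, [1])
print(remaining == {2})        # True
```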

4. Symmetric Difference (^)

The symmetric difference of two sets returns a set containing elements that are in either of the sets, but not in both.

set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Symmetric Difference of set1 and set2
symmetric_difference_set = set1 ^ set2
print(symmetric_difference_set)  # Output: {1, 2, 4, 5}
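
Symmetric difference can also be expressed through the three previous operations. The sketch below checks two equivalent formulations against the ^ operator; the variable names are illustrative.

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

via_operator = set1 ^ set2

# Everything in either set, minus what the two sets share.
via_union_minus_intersection = (set1 | set2) - (set1 & set2)

# What is only in set1, together with what is only in set2.
via_two_differences = (set1 - set2) | (set2 - set1)

print(via_operator == via_union_minus_intersection)  # True
print(via_operator == via_two_differences)           # True
```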

5. Subset (<=) and Superset (>=)

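A set is a subset of another set if every one of its elements is also in the other set. A superset is the reverse: it contains every element of the other set. Under the <= and >= operators, every set counts as both a subset and a superset of itself.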

set1 = {1, 2, 3}
set2 = {1, 2, 3, 4, 5}

# Checking if set1 is a subset of set2
print(set1 <= set2)  # Output: True (set1 is a subset of set2)

# Checking if set2 is a superset of set1
print(set2 >= set1)  # Output: True (set2 is a superset of set1)
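
Python also provides the strict operators < and >, which additionally require the two sets to differ. A short sketch of the distinction, using illustrative names:

```python
a = {1, 2, 3}
b = {1, 2, 3}
c = {1, 2, 3, 4}

print(a <= b)   # True  (equal sets are subsets of each other)
print(a < b)    # False (a proper subset must be strictly smaller)
print(a < c)    # True  (c contains everything in a, plus 4)

# The method forms accept any iterable, not only sets.
subset_of_list = a.issubset([1, 2, 3, 4])
superset_of_tuple = c.issuperset((1, 2))
print(subset_of_list, superset_of_tuple)  # True True
```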

6. Disjoint Sets (isdisjoint())

Disjoint sets are two sets that have no elements in common. The isdisjoint() method checks if two sets are disjoint.

set1 = {1, 2, 3}
set2 = {4, 5, 6}

# Checking if set1 and set2 are disjoint
print(set1.isdisjoint(set2))  # Output: True (sets have no common elements)
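
Two sets are disjoint exactly when their intersection is empty. The sketch below shows that equivalence, plus a case where the check fails; set3 is an extra set introduced only for this example.

```python
set1 = {1, 2, 3}
set2 = {4, 5, 6}
set3 = {3, 4}   # illustrative set that overlaps with set1

disjoint_1_2 = set1.isdisjoint(set2)
empty_intersection = len(set1 & set2) == 0
disjoint_1_3 = set1.isdisjoint(set3)

print(disjoint_1_2)        # True
print(empty_intersection)  # True (same condition, written with &)
print(disjoint_1_3)        # False (3 is in both sets)
```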

Practical Use Cases of Set Operations:

  1. Removing Duplicates: Sets are commonly used to remove duplicates from a collection. For example, if you have a list with duplicates, you can convert it to a set to eliminate them. Note that the original order of the list is not preserved.
    duplicate_list = [1, 2, 2, 3, 3, 4, 5]
    unique_elements = set(duplicate_list)
    print(unique_elements)  # Output: {1, 2, 3, 4, 5}
    
  2. Efficient Membership Testing: Sets provide a fast way to check whether an element exists in a collection. Because sets are hash-based, a lookup takes roughly constant time on average, whereas a list has to be scanned element by element.
    my_set = {1, 2, 3, 4, 5}
    
    # Checking if an element is in the set
    print(3 in my_set)  # Output: True
    print(10 in my_set)  # Output: False
    
  3. Set Operations for Data Analysis: Set operations are often used in data analysis to find overlaps, exclusions, or unique elements between different datasets. For example:
    • Union can be used to combine datasets.
    • Intersection can be used to find common elements.
    • Difference can be used to find elements exclusive to one dataset.
  4. Mathematical Set Theory: Sets in Python are closely related to mathematical set theory, allowing you to perform set operations like union, intersection, and difference.
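
The use cases above can be sketched together in one short example. The customer IDs and variable names are invented for illustration, and dict.fromkeys() is shown as one common way to remove duplicates when the original order matters.

```python
# Use case 1: set() removes duplicates but does not keep the original order.
# dict.fromkeys() deduplicates while preserving first-seen order.
duplicate_list = [3, 1, 3, 2, 1]
unique_in_order = list(dict.fromkeys(duplicate_list))
print(unique_in_order)          # [3, 1, 2]

# Use case 3: comparing two hypothetical datasets of customer IDs.
january = {"c01", "c02", "c03"}
february = {"c02", "c03", "c04"}

all_customers = january | february   # active in either month
returning = january & february       # active in both months
lost = january - february            # active in January only
changed = january ^ february         # active in exactly one month

# sorted() is used only to get a stable order for printing.
print(sorted(all_customers))    # ['c01', 'c02', 'c03', 'c04']
print(sorted(returning))        # ['c02', 'c03']
print(sorted(lost))             # ['c01']
print(sorted(changed))          # ['c01', 'c04']
```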

Summary of Unique Collections and Set Operations:

  • Unique Collections: Sets in Python store unique elements and automatically remove duplicates, making them ideal for tasks requiring uniqueness.
  • Set Operations:
    • Union combines all unique elements from two sets.
    • Intersection returns elements common to both sets.
    • Difference gives elements that are in the first set but not in the second.
    • Symmetric Difference provides elements that are in either set but not both.
    • Subset/Superset checks if one set is contained in another.
    • Disjoint determines if two sets have no common elements.

By utilizing sets and their operations, you can efficiently handle collections of unique elements and perform set-based calculations.
