Python Sets

Unleash the power of Python sets! These unordered collections store unique elements, making them ideal for removing duplicates from data and optimizing tasks that require fast lookups to check if an element belongs to the set. Sets are a fundamental data structure in Python, and their efficient membership testing capabilities prove valuable in various programming scenarios. Whether working with data analysis, building applications, or exploring more advanced topics like machine learning, understanding sets can significantly enhance your Python programming skills.

What is Set in Python?

Sets in Python store collections of unique items. Unlike lists or tuples, sets are unordered: elements have no position or index, and the order in which you add them is not preserved. More importantly, sets cannot contain duplicate values. Think of a set as a collection of colorful pebbles where no two are identical.

Python Set Syntax

my_set = {"item1", "item2", …}

  • my_set: Name you choose for your set.
  • { and }: Curly braces enclose the elements you want to include in the set.
  • item1, item2, etc.: The individual values you want to store in the set, separated by commas. They can be of any hashable type, which in practice means immutable values such as numbers, strings, or tuples whose items are themselves hashable. Mutable types like lists, dictionaries, and other sets cannot be elements.

Python Set Example

fruits = {"apple", "banana", "cherry", "apple"}
print(fruits)  # Output order may vary

Explanation

  • Line 1: Creates a set named fruits containing various fruits, including a duplicate “apple”.
  • Line 2: Prints the set. Notice that duplicates have been removed, and the order might not be identical to the one you entered.

Output

{'banana', 'cherry', 'apple'}


Key Properties of Python Set

Here’s a summary of the essential characteristics of Python sets:

  • Unordered: Elements in a set have no specific order.
  • Unique elements: A set cannot contain duplicate elements.
  • Mutable: You can add or remove elements from a set after its creation.
  • Hashable elements: Elements within a set must be hashable (e.g., numbers, strings, tuples).
  • Efficient membership testing: Checking if an element exists in a set is very fast.
  • Supports mathematical set operations: Sets support operations like union, intersection, difference, etc.
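
The properties listed above can be checked directly in a few lines. The following is a minimal sketch; the names tags and point are illustrative and do not come from the rest of this article.

```python
# Illustrative names; any hashable values would behave the same way.
tags = {"python", "sets", "python"}   # the duplicate literal is collapsed
print(len(tags))                      # 2: only unique elements are kept

tags.add("tutorial")                  # sets are mutable
print("tutorial" in tags)             # True

point = (1, 2)                        # a tuple of numbers is hashable
tags.add(point)

try:
    tags.add(["a", "list"])           # lists are mutable, hence unhashable
except TypeError as exc:
    print(f"TypeError: {exc}")
```

The failed add() leaves the set unchanged, so tags still holds four elements afterwards.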

Methods for Creating Python Set

Python offers two main ways to create sets:

  1. set() Function
  2. Curly Braces {}

Initializing Sets with the set() Constructor

The set() function is a convenient way to create sets in Python. It can take various iterables (like lists or strings) as input and produce a set containing unique elements.

Syntax

my_set = set(iterable)

  • set(): The set() function is called to construct the set.
  • iterable: Any iterable object (like a list, tuple, or string) that provides the elements for the set.

Example

fruits = ["apple", "banana", "orange", "apple"]
unique_fruits = set(fruits)
print(unique_fruits)  # Output: {'apple', 'banana', 'orange'} (Order may vary)

Explanation

  • Line 1: Creates a list fruits with some fruits, including a duplicate “apple”.
  • Line 2: Uses set(fruits) to convert the list fruits into a set named unique_fruits. The set() function removes duplicates while creating the set.
  • Line 3: Prints the unique_fruits set, demonstrating that the duplicate “apple” is gone.

Constructing Sets Using Curly Braces {}

While the set() function is a versatile tool, Python also allows you to create sets directly using curly braces {}. This approach is useful for explicitly defining the elements you want in your set.

Syntax

my_set = {element1, element2, element3, …}

  • {}: Curly braces enclose the elements you want to include in the set.
  • element1, element2, element3: These represent the elements you want in the set, separated by commas.

Example

colors = {"red", "green", "blue"}
print(colors)  # Output: {'red', 'green', 'blue'} (Order may vary)

Explanation

  • Line 1: Creates a set named colors directly using curly braces {}. Elements are enclosed within these braces and separated by commas.
  • Line 2: Prints the colors set. Notice that the order of elements in the output might differ since sets are unordered collections.

Creating Empty Python Set

Sometimes, you might need a set that starts empty and is ready to be populated later. Python provides exactly one way to create it: calling set() with no arguments.

Syntax: Using set()

empty_set = set()

  • set(): The set() function is called without arguments inside the parentheses.

Pitfall: Empty Curly Braces

empty_dict = {}

  • {}: Empty curly braces create an empty dictionary, not an empty set.

Note: Although non-empty sets can be written with curly braces, {} on its own always produces an empty dictionary. Use set() whenever you need an empty set.

Example

numbers = set()
print(numbers)  # Output: set() (Empty set representation)

Explanation

  • Line 1: Creates an empty set named numbers using set(). The resulting set is empty since no arguments are provided within the parentheses.
  • Line 2: Prints the numbers set. The output, set(), indicates an empty set.
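
The difference between set() and empty curly braces is easy to verify with type(). This is a small sketch; the variable names are chosen only for illustration.

```python
empty_set = set()
empty_braces = {}

print(type(empty_set))     # <class 'set'>
print(type(empty_braces))  # <class 'dict'>

# An empty set is falsy, so it can be tested directly in a condition.
if not empty_set:
    print("empty_set has no elements")
```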

Adding Elements to Python Set

Sets in Python are mutable, allowing you to add new elements even after they’ve been created. Here’s a breakdown of the primary ways to insert elements into a set:

  • add() Method
  • update() Method

Expanding Sets Using the add() Method

The add() method is the primary way to include new elements in an existing set in Python. It ensures uniqueness within the set, preventing the addition of duplicates.

Syntax

my_set.add(new_element)

  • .add(): The add() method is called on the set object.
  • new_element: The element you want to add to the set.

Example

colors = {"red", "green"}
colors.add("blue")
print(colors)  # Output: {'green', 'red', 'blue'} (Order might vary)

Explanation

  • Line 1: Creates a set colors with the colors “red” and “green”.
  • Line 2: Uses colors.add("blue") to add the new color “blue” to the colors set.
  • Line 3: Prints the updated colors set, which now includes the added color “blue”.

Adding Multiple Elements with update()

While add() is great for adding single elements, the update() method lets you add every element of an iterable in a single call.

Syntax: Using update()

my_set.update(iterable)

  • .update(): The update() method is called on the set object.
  • iterable: Any iterable object (like a list, tuple, or another set) containing the elements you want to add.

Example: Adding Multiple Colors

colors = {"red", "green"}
more_colors = ["blue", "yellow"]
colors.update(more_colors)
print(colors)  # Output: {'red', 'green', 'yellow', 'blue'} (Order might vary)

Explanation

  • Line 1: Creates a set colors with “red” and “green”.
  • Line 2: Creates a list more_colors containing the colors “blue” and “yellow” you want to add.
  • Line 3: Uses colors.update(more_colors) to add all elements from the more_colors list to the colors set.
  • Line 4: Prints the updated colors set, showcasing the addition of both new colors.
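
One detail worth seeing in action is how add() and update() treat a string differently: add() stores the whole string as one element, while update() iterates over it character by character. The sketch below uses illustrative variable names and sorted() only to make the printed order predictable.

```python
letters_add = {"x"}
letters_add.add("ab")          # adds the whole string as one element
print(sorted(letters_add))     # ['ab', 'x']

letters_update = {"x"}
letters_update.update("ab")    # iterates the string: adds 'a' and 'b'
print(sorted(letters_update))  # ['a', 'b', 'x']

# update() also accepts several iterables at once.
colors = {"red"}
colors.update(["green", "blue"], ("yellow",))
print(sorted(colors))          # ['blue', 'green', 'red', 'yellow']
```

To add a single string with update(), wrap it in a list or tuple first, for example update(["ab"]).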

Modifying Elements in Python Set (Understanding Immutability)

Set elements must be hashable, so an element cannot be changed in place, and because sets are unordered there is no position or index to assign to. To replace an element, remove the old value with remove() and add the new one with add().

Syntax

my_set.remove(element_to_remove)  # Remove the element
my_set.add(new_element)           # Add the new element

  • .remove(element_to_remove): Removes the specified element from the set. If the element doesn’t exist, it raises a KeyError.
  • .add(new_element): Adds a new element to the set.

Example: Replacing “banana” with “mango”

fruits = {"apple", "banana", "orange"}
fruits.remove("banana")
fruits.add("mango")
print(fruits)  # Output: {'apple', 'orange', 'mango'}

Explanation

  • Line 1: Creates a set fruits with some fruits.
  • Line 2: Uses fruits.remove("banana") to remove “banana” from the set. If “banana” were missing, this line would raise a KeyError.
  • Line 3: Uses fruits.add("mango") to include the new element “mango” in the set.
  • Line 4: Prints the updated fruits set, demonstrating the replacement effect.
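
If you replace elements often, the two steps can be wrapped in a small function. The helper below, replace_element, is a hypothetical name invented for this sketch and is not part of Python's set API.

```python
def replace_element(items, old, new):
    """Replace old with new in the set items, in place.

    Hypothetical helper: raises KeyError if old is missing,
    mirroring the behavior of set.remove().
    """
    items.remove(old)
    items.add(new)


fruits = {"apple", "banana", "orange"}
replace_element(fruits, "banana", "mango")
print(sorted(fruits))  # ['apple', 'mango', 'orange']
```

Because remove() runs first, a missing old value stops the function before anything is added, so the set is left untouched in that case.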

Techniques for Finding Elements in Python Set

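One of the strengths of sets is fast membership testing. Sets are hash-based, so a lookup takes roughly constant time on average regardless of the set's size, whereas a list has to be scanned element by element. You can check whether an element exists in a set using the in and not in operators.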

Syntax

# Checks if 'element' exists in 'my_set' (returns True or False)
element in my_set

# Checks if 'element' is not in 'my_set' (returns True or False)
element not in my_set

  • element: The element you want to check for in the set.
  • my_set: The set you want to search within.
  • in: The in operator checks for membership.
  • not in: The not in operator checks for the absence of an element.

Example: Checking for a Fruit

fruits = {"apple", "banana", "orange"}
has_mango = "mango" in fruits
print(has_mango)  # Output: False

Explanation

  • Line 1: Creates a set fruits with some fruits.
  • Line 2: Uses "mango" in fruits to check if the element “mango” exists within the fruits set. It assigns the result (True or False) to the variable has_mango.
  • Line 3: Prints the value of has_mango, which is False since “mango” is not present in the set.
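
To get a feel for how membership testing in a set compares with a list, you can time both with the standard timeit module. This is a rough sketch, not a rigorous benchmark: the collection size, the target value, and the repeat count are arbitrary choices, and the measured times depend on your machine.

```python
import timeit

items = list(range(100_000))
as_list = items
as_set = set(items)
target = 99_999  # last element: the slowest case for a list scan

# Run each membership test 100 times and report the total time.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.6f}s, set: {set_time:.6f}s")
```

On typical runs the set lookup should be far quicker, because the list is searched from the start each time while the set jumps to the element by its hash.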

Removing Elements from Python Set

Sets in Python offer several ways to delete elements, giving you flexibility:

  • remove() Method
  • discard() Method
  • clear() Method
  • pop() Method

Targeted Removal with remove()

The remove() method deletes one specific element from a set. Use it when the element is expected to be present and its absence should be treated as an error.

Syntax

my_set.remove(element_to_remove)

  • .remove(): The remove() method is called on the set object.
  • element_to_remove: The element you want to delete from the set.

Note: If the element you try to remove doesn’t exist in the set, remove() will raise a KeyError. To avoid the error, check that the element is present with the in operator first, or use discard() instead.

Example: Removing “banana” from a Fruit Set

fruits = {"apple", "banana", "orange"}
fruits.remove("banana") 
print(fruits)  # Output: {'orange', 'apple'}

Explanation

  • Line 1: Creates a set fruits with some fruits.
  • Line 2: Uses fruits.remove("banana") to attempt to remove “banana” from the fruits set. If “banana” is present, it gets deleted.
  • Line 3: Prints the updated fruits set, showing “banana” is gone (if it existed originally).

Safe Removal with discard()

While remove() is great for targeted deletion, sometimes you might not care if the element existed before removal. The discard() method offers a more relaxed approach to element deletion in sets.

Syntax

my_set.discard(element_to_remove)

  • .discard(): The discard() method is called on the set object.
  • element_to_remove: The element you want to try removing from the set.

Key Difference: Unlike remove(), discard() doesn’t raise an error if the element you try to remove isn’t present in the set. It simply does nothing in that case.

Example: Discarding “mango” from a Fruit Set

fruits = {"apple", "banana", "orange"}
fruits.discard("mango")  # Try removing "mango" (no error if missing)
print(fruits)  # Output: {'banana', 'orange', 'apple'} (Unchanged if "mango" wasn't there)

Explanation

  • Line 1: Creates a set fruits with some fruits.
  • Line 2: Uses fruits.discard("mango") to attempt to remove “mango” from the fruits set. If “mango” exists, it gets deleted. If not, there’s no error.
  • Line 3: Prints the fruits set, which is unchanged here because “mango” was never in it.
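
The contrast between the two methods is clearest side by side. The sketch below reuses the fruits set from the example and catches the KeyError that remove() raises for a missing element.

```python
fruits = {"apple", "banana", "orange"}

try:
    fruits.remove("mango")       # "mango" is absent, so this raises
except KeyError as exc:
    print(f"remove() raised KeyError: {exc}")

fruits.discard("mango")          # absent element: silently does nothing
fruits.discard("banana")         # present element: removed
print(sorted(fruits))            # ['apple', 'orange']
```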

Clearing Sets with clear()

If you need to remove all elements from a set at once, Python provides the clear() method. This empties the set, preparing it to be populated with new elements.

Syntax

my_set.clear()

  • .clear(): The clear() method is called on the set object.

Example (Clearing a Set of Numbers)

numbers = {1, 2, 3, 4}
numbers.clear()
print(numbers)  # Output: set() (Empty set representation)

Explanation

  • Line 1: Creates a set numbers with some numbers.
  • Line 2: Uses numbers.clear() to remove all existing elements from the numbers set.
  • Line 3: Prints the numbers set, which is now empty as indicated by the set() output.

Arbitrary Element Removal with pop()

While remove() and discard() target specific elements for deletion, the pop() method offers a slightly different approach. It removes and returns an arbitrary element from the set.

Syntax

removed_element = my_set.pop()

  • .pop(): The pop() method is called on the set object.
  • removed_element: The variable that will store the removed element (optional).

Note: pop() takes no argument, so you cannot choose which element is removed. It raises a KeyError if the set is empty, so use it only when you’re confident the set has at least one element, or check the set first.

Example: Removing and Printing an Arbitrary Fruit

fruits = {"apple", "banana", "orange"}
removed_fruit = fruits.pop()
print(f"Removed fruit: {removed_fruit}")
print(fruits)  # Output: the two remaining fruits, e.g. {'orange', 'apple'}

Explanation

  • Line 1: Creates a set fruits with some fruits.
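  • Line 2: Uses fruits.pop() to remove an arbitrary element from the fruits set and store it in the variable removed_fruit. Which element comes out is not specified, and it is not a random choice in the statistical sense.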
  • Line 3: Prints a message showing the removed fruit.
  • Line 4: Prints the updated fruits set, demonstrating the removal of one element.

Output (the removed fruit may differ when you run it)

Removed fruit: banana
{'orange', 'apple'}
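
A common use of pop() is to drain a set one element at a time. The sketch below loops while the set is non-empty, which avoids the KeyError described in the note above; the names pending and processed are illustrative.

```python
pending = {"apple", "banana", "orange"}
processed = []

# Looping on the set's truthiness avoids calling pop() on an empty set.
while pending:
    item = pending.pop()   # which item comes out first is not specified
    processed.append(item)

print(sorted(processed))   # ['apple', 'banana', 'orange']
print(pending)             # set()

try:
    pending.pop()          # the set is empty now
except KeyError:
    print("pop() on an empty set raises KeyError")
```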


Eliminating Duplicates with Python Set

Because sets store only unique elements, converting a sequence to a set is a quick way to remove its duplicates. Keep in mind that the result is a set, so the original order of the sequence is not preserved.

Syntax

unique_elements = set(original_sequence)

  • set(): The set() function is called to construct the set.
  • original_sequence: Any iterable object (like a list, tuple, or string) that might contain duplicate elements.

Example: Removing Duplicates from a List

fruits = ["apple", "banana", "orange", "apple", "pear", "banana"]
unique_fruits = set(fruits)
print(unique_fruits)  # Output: {'banana', 'pear', 'orange', 'apple'} (Order may vary)

Explanation

  • Line 1: Creates a list fruits with some fruits, including duplicates.
  • Line 2: Uses unique_fruits = set(fruits) to convert the list fruits into a set named unique_fruits. Since sets cannot have duplicates, this process automatically removes them while creating the new set.
  • Line 3: Prints the unique_fruits set, demonstrating that the duplicate elements have been removed.
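
Converting to a set discards the order of the original list. If first-seen order matters, one common alternative is dict.fromkeys(), which is a dictionary technique rather than a set operation. The sketch below applies it to the same fruits list.

```python
fruits = ["apple", "banana", "orange", "apple", "pear", "banana"]

# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates while preserving first-seen order.
unique_in_order = list(dict.fromkeys(fruits))
print(unique_in_order)  # ['apple', 'banana', 'orange', 'pear']
```

Use set(fruits) when order is irrelevant and you want set operations afterwards; use the dictionary approach when the result must stay a list in its original order.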

Iterating Through Python Set

Since sets are unordered collections, iterating over them doesn’t guarantee the order in which elements appear. However, you can still process each element using a for loop.

Syntax

for element in my_set:
    # Your code to process the element

  • for element in my_set: The for loop iterates through each element (element) in the set my_set.

Example: Printing Each Fruit

fruits = {"apple", "banana", "orange", "pear"}
for fruit in fruits:
    print(fruit)    # Print each fruit

Explanation

  • Line 1: Creates a set fruits with some fruits.
  • Line 2: Starts a for loop iterating through each fruit in the fruits set.
  • Line 3: The indented code within the loop (print(fruit)) prints the current fruit encountered during the iteration. The order of the printed fruits may vary since sets are unordered.

Output

banana
pear
orange
apple
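
When you need a predictable order, for example in reports or tests, you can iterate over sorted(my_set) instead of the set itself. A minimal sketch using the same fruits set:

```python
fruits = {"apple", "banana", "orange", "pear"}

# sorted() returns a new list; the set itself stays unordered.
for fruit in sorted(fruits):
    print(fruit)
# apple
# banana
# orange
# pear
```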


Determining Python Set Size with len()

Knowing how many elements are in a set is often useful. Python provides the built-in len() function to retrieve a set’s length (number of elements).

Syntax

number_of_elements = len(my_set)

  • len(): The len() function is called to get the length.
  • my_set: The set for which you want to find the length.

Example: Counting the Number of Fruits

fruits = {"apple", "banana", "orange", "pear"}
total_fruits = len(fruits)
print(f"You have {total_fruits} fruits in your set.")

Explanation

  • Line 1: Creates a set fruits with some fruits.
  • Line 2: Uses total_fruits = len(fruits) to calculate the length of the fruits set and stores the result (number of fruits) in the variable total_fruits.
  • Line 3: Prints a message displaying the total number of fruits retrieved from the len() function.

Introduction to Immutable Sets: frozenset

While regular sets are great for dynamic collections, Python offers frozenset for situations where you need a set that cannot be modified after creation. This immutability ensures data integrity and makes frozen sets suitable for use as dictionary keys or for representing fixed data sets.

Syntax

my_frozenset = frozenset(iterable)

  • frozenset(): The frozenset() function creates the frozenset.
  • iterable: Any iterable object (like a list, tuple, or another set) containing the elements you want to include in the frozenset.

Example: Creating a Frozen Set of Numbers

numbers = frozenset([1, 2, 3])  # Create a frozenset using a list
# numbers.add(4)  # This would cause an error (frozensets are immutable)
print(numbers)  # Output: frozenset({1, 2, 3})

Explanation

  • Line 1: Creates a frozenset named numbers using the frozenset() function and a list [1, 2, 3] as input.
  • Line 2: Commented out. Since frozenset is immutable, it has no add() method, so running this line would raise an AttributeError.
  • Line 3: Prints the numbers frozenset, showcasing the elements.
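
Because a frozenset is hashable, it can serve as a dictionary key or as an element of another set, which a regular set cannot. The sketch below uses made-up names and scores purely for illustration.

```python
# Made-up example data: a score stored per unordered pair of players.
pair_scores = {
    frozenset({"alice", "bob"}): 3,
    frozenset({"alice", "carol"}): 5,
}

# The pair is unordered, so either order finds the same key.
print(pair_scores[frozenset({"bob", "alice"})])  # 3

# A frozenset can also be an element of a regular set.
groups = {frozenset({1, 2}), frozenset({3})}
print(frozenset({2, 1}) in groups)  # True

try:
    frozenset({1, 2}).add(3)
except AttributeError:
    print("frozenset has no add() method")
```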

Unifying Sets with the Union Operator (|)

In Python, sets can be combined using the union operator (|) to create a new set containing elements in either or both original sets. Here’s a breakdown:

Syntax

combined_set = set1 | set2

  • set1: The first set you want to combine.
  • |: The union operator that merges the sets.
  • set2: The second set you want to combine.

Note: The union operator only keeps unique elements, so duplicates are automatically removed from the combined set.

Example: Combining Fruits and Vegetables

fruits = {"apple", "banana", "orange"}
vegetables = {"carrot", "pea", "lettuce"}
all_items = fruits | vegetables
print(all_items)  # Output: {'apple', 'banana', 'carrot', 'lettuce', 'orange', 'pea'} (Order might vary)

Explanation

  • Line 1: Creates a set fruits with some fruits.
  • Line 2: Creates a set vegetables with some vegetables.
  • Line 3: Uses all_items = fruits | vegetables to combine the fruits and vegetables sets using the union operator (|). The result is stored in all_items.
  • Line 4: Prints the all_items set, showcasing the combined elements from both original sets. The order may vary as sets are unordered collections.

The union() Method for Set Union

Another way to join sets in Python is the union() method. It produces the same result as the union operator (|), but it accepts several arguments in one call, and each argument can be any iterable (such as a list or tuple), not just a set.

Syntax

combined_set = set1.union(set2, …, setN)

  • .union(): The union() method is called on the first set.

Example: Combining Fruits, Vegetables, and Legumes

fruits = {"apple", "banana", "orange"}
vegetables = {"carrot", "pea", "lettuce"}
legumes = {"lentil", "bean"}
all_items = fruits.union(vegetables, legumes)  # Combine three sets
print(all_items)  # Output: {'apple', 'pea', 'banana', 'carrot', 'lettuce', 'orange', 'lentil', 'bean'} (Order might vary)

Explanation

  • Line 1: Creates a set fruits with some fruits.
  • Line 2: Creates a set vegetables with some vegetables.
  • Line 3: Creates a set legumes with some legumes.
  • Line 4: Uses all_items = fruits.union(vegetables, legumes) to combine the fruits, vegetables, and legumes sets using the union() method of the fruits set. The result is stored in all_items.
  • Line 5: Prints the all_items set, showcasing the elements from all three original sets. The order may vary since sets are unordered collections.
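
One practical difference between the method and the operator is the kind of argument each accepts. The sketch below, with illustrative names, shows union() taking a list while the | operator rejects it.

```python
fruits = {"apple", "banana"}
more = ["cherry", "apple"]        # a list, not a set

combined = fruits.union(more)     # union() accepts any iterable
print(sorted(combined))           # ['apple', 'banana', 'cherry']

try:
    fruits | more                 # the operator needs a set on both sides
    operator_failed = False
except TypeError:
    operator_failed = True
    print("| does not accept a list")

# Chaining the operator also combines several sets.
chained = fruits | {"kiwi"} | {"pear"}
print(sorted(chained))            # ['apple', 'banana', 'kiwi', 'pear']
```

Neither form modifies fruits; both return a new set.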

The Intersection Operator (&)

The intersection operator (&) helps you identify elements in both sets. It creates a new set containing only the common elements across the sets you compare.

Syntax

common_elements = set1 & set2

  • &: The intersection operator that finds common elements.

Example: Finding Common Fruits

fruits = {"apple", "banana", "orange"}
veggies = {"carrot", "pea", "lettuce", "banana"}
common_items = fruits & veggies  # Find items common to both sets
print(common_items)  # Output: {'banana'}

Explanation

  • Line 1: Creates a set fruits with some fruits.
  • Line 2: Creates a set veggies with vegetables, including “banana” which is also in fruits.
  • Line 3: Uses common_items = fruits & veggies to find the intersection between fruits and veggies. The common element (“banana”) is stored in common_items.
  • Line 4: Prints the common_items set, showing the element in both sets.

Using the intersection() Method

Like the intersection operator (&), the intersection() method efficiently finds elements in both sets. It returns a new set containing the shared elements.

Syntax

common_elements = set1.intersection(set2)

  • .intersection(): The intersection() method is called on the first set.

Example: Finding Shared Letters Between Sets

vowels = {"a", "e", "i", "o", "u"}
letters = {"a", "b", "c", "d", "e"}
shared_letters = vowels.intersection(letters)  # Find shared letters
print(shared_letters)  # Output: {'a', 'e'}

Explanation

  • Line 1: Creates a set vowels with vowel letters.
  • Line 2: Creates a set letters with various letters, including some vowels.
  • Line 3: Uses shared_letters = vowels.intersection(letters) to find the intersection between vowels and letters using the intersection() method of the vowels set. The result is stored in shared_letters.
  • Line 4: Prints the shared_letters set, showing the letters in both sets.
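
Intersection is not limited to two sets. As a sketch with illustrative names, intersection() can take several arguments, including non-set iterables, and an empty result simply means nothing is shared.

```python
a = {1, 2, 3, 4}
b = {2, 3, 4, 5}
c = [3, 4, 5, 6]                  # intersection() accepts any iterable

in_a_and_b = a & b
in_all_three = a.intersection(b, c)
nothing_shared = a & {10, 20}

print(sorted(in_a_and_b))         # [2, 3, 4]
print(sorted(in_all_three))       # [3, 4]
print(nothing_shared)             # set()
```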

The Difference Operator (-)

The subtraction operator (-) helps you identify elements in the first set but not in the second. It creates a new set containing the elements that are unique to the first set compared to the second.

Syntax

difference_elements = set1 - set2

  • difference_elements: The variable that will store the new set containing the difference.
  • -: The subtraction operator that finds elements in the first set but not the second.

Note: The order of the sets matters when using the subtraction operator. set1 - set2 finds the elements of set1 that are not in set2; set2 - set1 finds the reverse and generally gives a different result.

Example: Finding Fruits Not in Vegetables

fruits = {"apple", "banana", "orange", "kiwi"}
vegetables = {"carrot", "pea", "lettuce"}
diff_items = fruits - vegetables 
print(diff_items)  # Output: {'banana', 'kiwi', 'orange', 'apple'} (Order might vary)

Explanation

  • Line 1: Creates a set fruits with various fruits.
  • Line 2: Creates a set vegetables with some vegetables.
  • Line 3: Uses diff_items = fruits - vegetables to find the difference between fruits and vegetables using the subtraction operator. The elements unique to fruits are stored in diff_items.
  • Line 4: Prints the diff_items set, showing the fruits not present in the vegetables set.
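
Because the order of the operands matters, it helps to see both directions on the same pair of sets. The package names below are only sample strings chosen for this sketch.

```python
installed = {"numpy", "pandas", "requests"}
required = {"numpy", "requests", "flask"}

extra = installed - required      # in installed, not in required
missing = required - installed    # in required, not in installed

print(sorted(extra))    # ['pandas']
print(sorted(missing))  # ['flask']
```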

Using the difference() Method

Like the subtraction operator (-), the difference() method returns a new set containing the elements of the first set that do not appear in the second.

Syntax

difference_elements = set1.difference(set2)

  • .difference(): The difference() method is called on the first set.

Example: Finding Unique Numbers

numbers1 = {1, 2, 3, 4, 5}
numbers2 = {2, 4, 6, 8}
unique_numbers = numbers1.difference(numbers2)  # Find elements in numbers1 not in numbers2
print(unique_numbers)  # Output: {1, 3, 5} (Order might vary)

Explanation

  • Line 1: Creates a set numbers1 with some numbers.
  • Line 2: Creates a set numbers2 with some different numbers, including some shared with numbers1.
  • Line 3: Uses unique_numbers = numbers1.difference(numbers2) to find the difference between numbers1 and numbers2 using the difference() method of the numbers1 set. The elements unique to numbers1 are stored in unique_numbers.
  • Line 4: Prints the unique_numbers set, showing the numbers that exist only in numbers1 and not in numbers2.