Less popular but still fairly common is the dictionary comprehension, which works in a similar way. But there is another tool in the realm of Python comprehensions that is almost never talked about: the set comprehension. As you might have guessed, it is similar to the other types of Python comprehensions, but it generates sets. Here are the basics of what you need to know about them.

How it Works

The key aspect of set comprehension that makes it unique is that it returns a set, which means the elements inside will be unordered and cannot contain any duplicates. The rest is pretty much the same as list comprehension, except that it is written with curly braces instead of square brackets. The input can be any iterable, such as a list, a string, a tuple, or a range.
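As a minimal sketch of that idea, here is a set comprehension over a list that contains repeats. The names numbers and squares are illustrative only and are not used in the examples that follow.

```python
# Hypothetical input: a list with repeated values.
numbers = [1, 2, 2, 3, 3, 3]

# Curly braces around a single expression build a set (not a dictionary).
squares = {n * n for n in numbers}

# Duplicates collapse, so six inputs yield three elements.
print(squares == {1, 4, 9})    # True
print(type(squares).__name__)  # set
```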

Let’s look at some examples. What if I have some text and I want to pull out all the unique words in the form of a set?

sentence = "The cat in the hat had two sidekicks, thing one and thing two."

In order for this to work properly, I will make sentence entirely lowercase and then remove the comma and the period. This can be done with the lower() and replace() string methods. Then I can use the split() method to separate it into a list of words, and from there generate a set of all the unique words:

words = sentence.lower().replace('.', '').replace(',', '').split()
unique_words = {word for word in words}

This is what our unique_words set will now look like:

{'and', 'cat', 'had', 'hat', 'in', 'one', 'sidekicks', 'the', 'thing', 'two'}

As you can see, the original order of the words as they appeared in the sentence was not maintained in the result. In this case, they were displayed in alphabetical order, but that is not guaranteed to happen every time. When thinking about a set, the order doesn’t actually matter. What’s important is that the words that appeared multiple times in the sentence (“the”, “thing”, and “two”) now only appear once in the set.
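If you do need a predictable order, one option is to pass the set to the built-in sorted(), which returns a new list and leaves the set itself unordered. This is a small sketch that reuses the sentence from above; the name ordered is illustrative.

```python
sentence = "The cat in the hat had two sidekicks, thing one and thing two."
words = sentence.lower().replace('.', '').replace(',', '').split()
unique_words = {word for word in words}

# sorted() accepts any iterable and returns a list in ascending order.
ordered = sorted(unique_words)
print(ordered)
# ['and', 'cat', 'had', 'hat', 'in', 'one', 'sidekicks', 'the', 'thing', 'two']

# 13 words went in; 10 unique words came out.
print(len(words), len(unique_words))  # 13 10
```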

Conditionals Within Set Comprehensions

To expand on the above example, what if I wanted to filter out any of the words that have more than three letters? Here’s how I would do that:

unique_words = {word for word in words if len(word) <= 3}

Output:

{'and', 'cat', 'had', 'hat', 'in', 'one', 'the', 'two'}

Just to show that we can, let’s return all the unique words again, but this time capitalize the ones that start with the letter “h” and leave the rest the same:

unique_words = {word.capitalize() if word[0] == 'h' else word for word in words}

Output:

{'Had', 'Hat', 'and', 'cat', 'in', 'one', 'sidekicks', 'the', 'thing', 'two'}

We can even do both of the above examples together in one set comprehension:

unique_words = {word.capitalize() if word[0] == 'h' else word for word in words if len(word) <= 3}

Output:

{'Had', 'Hat', 'and', 'cat', 'in', 'one', 'the', 'two'}
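The two kinds of "if" in that last comprehension do different jobs: the trailing if filters which words are kept, while the if/else at the front chooses what value is stored for each kept word. One way to see this is to write out a roughly equivalent loop, sketched here with the same sentence:

```python
sentence = "The cat in the hat had two sidekicks, thing one and thing two."
words = sentence.lower().replace('.', '').replace(',', '').split()

unique_words = set()
for word in words:
    # Trailing "if" in the comprehension: skip words longer than three letters.
    if len(word) <= 3:
        # Leading "if/else" in the comprehension: pick the value to store.
        unique_words.add(word.capitalize() if word[0] == 'h' else word)

print(unique_words == {'Had', 'Hat', 'and', 'cat', 'in', 'one', 'the', 'two'})  # True
```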

Nested Set Comprehensions

As with list and dictionary comprehensions, you can nest one set comprehension within another, although there is one very important caveat. In general with Python, when it comes to sets inside sets, the inner sets have to be frozen sets, or you will get an error. A frozen set is just like a set, except that sets are mutable and frozen sets are not. The reason is that every element of a set must be hashable, because a set uses each element's hash to store it and look it up. Sets are mutable, which makes them unhashable, and therefore they cannot exist as elements within a larger set unless they are frozen. Fortunately, we can easily make sets into frozen sets by wrapping them in frozenset().
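A short sketch of that caveat, using made-up values (inner, failed, and outer are illustrative names): putting a plain set inside a set raises a TypeError, while the frozen version works.

```python
inner = {1, 2}

# A plain set is unhashable, so it cannot be an element of another set.
try:
    outer = {inner}
    failed = False
except TypeError:
    failed = True
print(failed)  # True

# Frozen sets are hashable. Equal frozen sets count as duplicates,
# so frozenset({1, 2}) and frozenset({2, 1}) collapse into one element.
outer = {frozenset(inner), frozenset({2, 1}), frozenset({3})}
print(len(outer))  # 2
```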

Let’s use the same sentence from before, but this time we will also iterate through each of the letters in each word and return only the consonants as a frozen set of unique letters.

sentence = "The cat in the hat had two sidekicks, thing one and thing two."
words = sentence.lower().replace('.', '').replace(',', '').split()
vowels = ['a', 'e', 'i', 'o', 'u']
consonants = {frozenset({letter for letter in word if letter not in vowels}) for word in words}

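Now consonants will be a set of frozen sets. There are only eight of them for ten unique words, because “the” and “hat” reduce to the same consonants, as do “in” and “one”. It looks like this: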

{frozenset({'t', 'w'}),
 frozenset({'c', 'd', 'k', 's'}),
 frozenset({'c', 't'}),
 frozenset({'n'}),
 frozenset({'g', 'h', 'n', 't'}),
 frozenset({'d', 'h'}),
 frozenset({'h', 't'}),
 frozenset({'d', 'n'})}

This is probably the most glaring distinction to make about set comprehension. Keep in mind that it only applies where hashable elements are required: a set within another set, or a set used as a dictionary key. Sets within lists, or used as dictionary values, do not require frozen sets.
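To illustrate where plain sets are fine, here is a small sketch with a hypothetical three-word list. Sets work as list items and as dictionary values, with the caveat that dictionary keys, like set elements, must be hashable and so need frozenset().

```python
words = ['cat', 'hat', 'had']  # hypothetical sample data

# A list of sets: no freezing needed.
letters_list = [{letter for letter in word} for word in words]

# Sets as dictionary values: no freezing needed.
letters_by_word = {word: {letter for letter in word} for word in words}
print(letters_by_word['cat'] == {'c', 'a', 't'})  # True

# Sets as dictionary keys must be frozen, because keys must be hashable.
words_by_letters = {frozenset(word): word for word in words}
print(words_by_letters[frozenset('tac')])  # cat
```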

This covers the unique aspects of set comprehension. The rest follows the same basic rules as other Python comprehensions, so you should now have what you need to explore all kinds of applications for this tool.
