Jump to content

Python Concepts/Sets

From Wikiversity


Objective

[edit | edit source]
  • Learn about Python sets.
  • Learn how to dynamically manipulate sets.
  • Learn about set math and comparison.
  • Learn about built-in set functions.
  • Learn when to use sets and when not to.

Lesson

[edit | edit source]

Python Sets

[edit | edit source]

Sets are mutable sequences, like lists. However, sets and lists differ. Unlike lists, you cannot use append() nor can you index or slice. Although the set has limitations, it has two advantages. The set can only contain unique items, so if there are two or more items the same, all but one will be removed. This can help get rid of duplicates. A set is an unordered collection with no duplicate elements. (The technical definition: A set object is an unordered collection of distinct hashable objects. ) Secondly, sets can perform set mathematics. This makes Python sets much like mathematical sets. To create a set, use curly braces ({}). To create an empty set you have to use set().

>>> spam = {1, 2, 3}
>>> spam
{1, 2, 3}
>>> eggs = {1, 2, 1, 3, 5, 2, 7, 3, 4}
>>> eggs
{1, 2, 3, 4, 5, 7}    # each object unique
>>> {True, False, True, False, True}
{False, True}
>>> {"hi", "hello", "hey", "hi", "hiya", "sup"}
{'hey', 'sup', 'hi', 'hello', 'hiya'}
>>>
>>> a = {} ; a
{}
>>> isinstance(a,set)
False
>>>
>>> a = set() ; a # to create empty set.
set()
>>> isinstance(a,set)
True
>>>


Operations on a single set

[edit | edit source]

Initialize the set:

[edit | edit source]
>>> b = set('alacazam') ; b
{'z', 'l', 'm', 'c', 'a'}
>>> 
>>> b = {'alacazam'} ; b
{'alacazam'}
>>> 
>>> b = {'pear','plum'} ; b
{'pear', 'plum'}
>>>
>>> d = ['apple', 'pear', 'plum', 'peach', 'pecan',  'plum', 'peach', 'pecan',  'plum', 'peach'] ; d
['apple', 'pear', 'plum', 'peach', 'pecan', 'plum', 'peach', 'pecan', 'plum', 'peach']
>>> b = set(d) ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>>

Familiar operations:

[edit | edit source]
>>> isinstance(b,set)
True
>>> len(b)
5
>>> 'apple' in b
True
>>> 'grape' in b
False
>>> 
>>> 'grape' not in b
True
>>> 
>>> for x in b : print ( x[0:3] ) # for x in set :
... 
pea
pec
pea
plu
app
>>> 
>>> f = b # A shallow copy.
>>> f == b
True
>>> f is b
True
>>> f = set(b) # A deep copy.
>>> f == b
True
>>> f is b
False
>>>

Operations available for set:

[edit | edit source]
>>> b = set() ; b.add('alacazam') ; b # add element 'alacazam' to set b.
{'alacazam'}
>>> 
>>> b = {'pear','plum'} ; b
{'pear', 'plum'}
>>>
>>> d = ['apple', 'pear', 'plum', 'peach', 'pecan'] ; d
['apple', 'pear', 'plum', 'peach', 'pecan']
>>> for c in d : b.add(c) ; b
... 
{'pear', 'apple', 'plum'} # 'apple' was added
{'pear', 'apple', 'plum'} # 'pear' was not added.
{'pear', 'apple', 'plum'} # 'plum' was not added.
{'peach', 'pear', 'apple', 'plum'} # 'peach' was added
{'peach', 'pecan', 'pear', 'plum', 'apple'} # 'pecan' was added. ordering not same as list d.
>>> 
>>> b = {'peach', 'pecan', 'pear', 'plum', 'apple'} ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>> b.clear() ; b # remove all elements from set b
set()
>>> 
>>> b = {'peach', 'pecan', 'pear', 'plum', 'apple'} ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>> a = b.pop() ; a ; b # Remove and return an arbitrary element from the set. Raises KeyError if the set is empty.
'peach'
{'pecan', 'pear', 'plum', 'apple'}
>>> 
>>> b.discard('grape') ; b # Remove element 'grape' from set b if element is present.
{'pecan', 'pear', 'plum', 'apple'}
>>> 
>>> b.discard('pear') ; b
{'pecan', 'plum', 'apple'}
>>> 
>>> b.remove('apple') ; b # Remove element 'apple' from set b. Raises KeyError if element is not contained in the set.
{'pecan', 'plum'}
>>>

Set comprehensions

[edit | edit source]

Similarly to list comprehensions, set comprehensions are also supported:

>>> {x*x%7   for x in range(-234,79)}
{0, 1, 2, 4}
>>>
>>> a = {x for x in 'abracadabra' if x in 'abcrmgz'} ; a
{'b', 'a', 'c', 'r'}
>>>

Operations on two sets

[edit | edit source]

set.isdisjoint(other)

[edit | edit source]

Return True if set set has no elements in common with other set. Sets are disjoint if and only if their intersection is the empty set.

>>> set1 = {'pecan', 'pear', 'plum', 'apple'}
>>> set2 = {'pecan', 'pear', 'orange', 'mandarin'}
>>> set3 = {'grape', 'watermelon', 'orange', 'mandarin'}
>>> set1.isdisjoint(set2)
False
>>> set1.isdisjoint(set3)
True
>>> 
>>> {'a', 'b', 'c'}.isdisjoint( {'a', 'd', 'e'} )
False
>>> {'a', 'b', 'c'}.isdisjoint( {'z', 'd', 'e'} )
True
>>>

set.issubset(other)

[edit | edit source]

Test whether every element in set set is in other. Equivalent to set <= other.

>>> {'a', 'b', 'c'}.issubset( {'a', 'b', 'c', 'd'} )
True
>>> {'a', 'b', 'c'}.issubset( ['a', 'b', 'c'] ) # this form accepts iterable for 'other'.
True
>>> {'a', 'b', 'c'}.issubset( ['a', 'b', 'd'] )
False
>>> {'a', 'b', 'c'}.issubset( 'abd' )
False
>>> {'a', 'b', 'c'}.issubset( 'abcdef' )
True
>>> 
>>> {'a', 'b', 'c'} <= {'a', 'b', 'c'} # In this form both arguments are sets.
True
>>> {'a', 'b', 'c'} <= {'a', 'b', 'c', 'd'}
True
>>> {'a', 'b', 'c'} <= {'a', 'b', 'd'}
False
>>> 
>>> {'a', 'b', 'c'} < {'a', 'b', 'c'}
False
>>> {'a', 'b', 'c'} < {'a', 'b', 'c', 'd'} # set is a proper subset of other
True
>>> {'a', 'b', 'c'} < {'a', 'b', 'd'}
False
>>>

Symmetric difference

[edit | edit source]

newSet = set.symmetric_difference(other).

Return a new set with elements in either set or other but not both.

>>> newSet = {'a', 'b', 'c', 'g'}.symmetric_difference( {'a', 'b', 'h', 'i'} ) ; newSet
{'i', 'c', 'h', 'g'}
>>> {'a', 'b', 'c', 'g'}.symmetric_difference( 'abcdef' ) # this form accepts iterable for 'other'.
{'e', 'f', 'd', 'g'}
>>> {'a', 'b', 'c', 'g'} ^ {'a', 'b', 'h', 'i'} # In this form both arguments are sets.
{'i', 'c', 'h', 'g'}
>>>

Operations on two or more sets

[edit | edit source]

Union

[edit | edit source]

newSet = set.union(*others).

Return a new set with elements from set and all others.

>>> newSet = {'a', 'b', 'c', 'g'}.union( {'b', 'h'},'abcz', [1,2,3] ) ; newSet # iterable as argument
{'b', 1, 'z', 'c', 2, 'h', 3, 'g', 'a'}
>>> 
>>> newSet = {'a', 'b', 'c', 'g'} |  {'b', 'h'} | set([1,2,3,4]) ; newSet # operands must be sets.
{'b', 1, 2, 'c', 3, 'h', 4, 'g', 'a'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} |  {'b', 'h'} | [1,2,3,4] ; newSet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'set' and 'list'
>>>

Intersection

[edit | edit source]

newSet = set.intersection(*others).

Return a new set with elements common to set and all others.

>>> newSet = {'a', 'b', 'c', 'g'}.intersection( {'g', 'b', 'h'},'abczg' ) ; newSet # iterable as argument
{'b', 'g'}
>>
>>> newSet = {'a', 'b', 'c', 'g'}.intersection( {'g', 'b', 'h'},'abczg','p', 'q', 's', 'b', 'g' ) ; newSet
set()
>>> 
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'}  ; newSet # operands must be sets
{'b', 'g'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'} & set(('g', 'b', 'z', 'm')) ; newSet
{'b', 'g'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'} & ('g', 'b', 'z', 'm') ; newSet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'set' and 'tuple'
>>>

Difference

[edit | edit source]

newSet = set.difference(*others).

Return a new set with elements in set that are not in others.

>>> newSet = {'a', 'b', 'c', 'g','q', 'x'}.difference( {'g', 'b', 'h'},'bczg' ) ; newSet # iterable as argument
{'q', 'x', 'a'}
>>> 
>>> newSet = {'a', 'b', 'c', 'g','q', 'x'} -  {'g', 'b', 'h'} - set('bczg')  ; newSet # operands are sets
{'a', 'q', 'x'}
>>> newSet = {'a', 'b', 'c', 'g','q', 'x'} - set( {'g', 'b', 'h'} | set('bczg') )  ; newSet
{'q', 'x', 'a'} # same as above
>>> 
>>> {'a', 'q', 'x'} == {'q', 'x', 'a'}
True
>>>

Assignments

[edit | edit source]

String str1 contains the names of all 50 states of the United States of America with some duplicates and extraneous white space. Use sets, including set comprehensions, to determine the one letter that does not appear in the name of any state.

str1 = '''                                                                                                                                       
  Indiana , Kentucky , Nebraska , California , Oregon , Washington , Hawaii , Alaska ,                                                           
  Arizona , Utah , Nevada , Idaho , New Mexico , Colorado , Wyoming , Montana , Texas                                                            
 , Oklahoma , Kansas , Nebraska , South Dakota , North D , Louisiana , Ark , Missouri ,                                                          
 Iowa , Illinois , Minnesota , Michigan , Mississippi , Tennessee , Alabama ,                                                                    
Ohio,West V , Virginia , Michigan , Florida , Georgia , S Carolina ,  ,    N C ,                                                                 
      , Maryland , Delaware , New Jersey , New York , Pennsylvania , Vermont , New  Hampshire                                                    
   , Maine , Connecticut ,         , Rhode Island , Massachussetts , Maine , Wisconsin ,                                                         
'''

Why are the letters "N C" sufficient for "North Carolina"?

One Possible Solution

Further Reading or Review

[edit | edit source]
Completion status: this resource is just getting off the ground. Please feel welcome to help!

References

[edit | edit source]

1. Python's documentation:

"Sets", "Set Types — set, frozenset", "Displays for lists, sets and dictionaries", "Set displays"


2. Python's methods:


3. Python's built-in functions:

"class set(....)"