Python Concepts/Dictionaries

From Wikiversity

Objective

  • Learn about Python dictionaries.
  • Learn about the dictionary's syntax.
  • Learn about built-in dictionary functions.
  • Work with loops using dictionary keys and values.
  • Learn when to use dictionaries and when not to.

Lesson


Python dictionaries are collections of keys in which each key is associated with exactly one value, hence the name "associative arrays" sometimes used in other languages. Dictionaries act like real-life dictionaries, where the key is the word and the value is the word's definition.

An English dictionary, for example, may contain hundreds of thousands of words. Because each word is put into the dictionary in a well-defined way, the word and its definition can be found very quickly. In any real-life dictionary, words can be added or deleted at any time and the definition of an existing word can be changed. Although an English dictionary may appear to contain several definitions of the same word, "house (noun)" and "house (verb)" are separate entries. Python's dictionaries are conceptually similar. Since Python 3.7 a dictionary preserves the order in which its keys were inserted (in earlier versions the ordering could appear random). Entries are not located by position, however: each key is hashed, so a key and its associated value can be retrieved very quickly however many entries the dictionary holds.

Dictionaries are written with curly braces ({}), like sets, but empty curly braces create an empty dictionary, not an empty set.

>>> {"blue": 0, "red": 1, "green": 2}
{'blue': 0, 'red': 1, 'green': 2}     # Insertion order is preserved (Python 3.7+).
# "blue" is a key. Its associated value is 0.
# "red" is a key. Its associated value is 1.
# "green" is a key. Its associated value is 2.
>>>
>>> {"blue": 0, "red": 1, "green": 2} == {"green": 2, "blue": 0, "red": 1} == {"green": 2, "red": 1, "blue": 0}
True # Ordering is ignored when dictionaries are compared.
>>> 
>>> isinstance( {"green": 2, "blue": 0, "red": 1}, dict )
True
>>> 
>>> {30: "0", 31: "1", "32": 2}
{30: '0', 31: '1', '32': 2}
>>>
>>> isinstance({}, dict)
True
>>> isinstance({}, set)
False


In technical terms the dictionary is an object that maps hashable values (keys) to arbitrary objects (values). A hashable value is a value that can be included as a member of a set. In practice most keys are numbers or strings, and these alone cover a wide range of uses.
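
As a minimal sketch of what "hashable" means in practice (the dictionary name grid and its contents are made up for this illustration): immutable built-in values such as strings, numbers and tuples of hashable items can serve as keys, while a mutable list cannot.

```python
# Hypothetical example data, chosen only to illustrate hashable keys.
grid = {}
grid[(0, 0)] = 'origin'          # a tuple of ints is hashable
grid['label'] = 'a string key'   # strings are hashable
grid[42] = 'an int key'          # ints are hashable

try:
    grid[[1, 2]] = 'a list key'  # lists are mutable, hence unhashable
except TypeError as error:
    message = str(error)
    print('TypeError:', message)

print(len(grid))                 # the failed assignment added nothing
```

The failed assignment leaves the dictionary untouched, so only the three valid keys remain.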

Basic operations on dictionaries



>>> # create dictionary phoneNumbers
>>> phoneNumbers = {}
>>>
>>> # add entries to phoneNumbers
>>> phoneNumbers['Jack'] = 1234
>>> phoneNumbers['Bill'] = 2234
>>> phoneNumbers['andy'] = 2235
>>>
>>> # display contents of phoneNumbers
>>> phoneNumbers
{'Jack': 1234, 'Bill': 2234, 'andy': 2235}
>>>
>>> # there is a typo in 'andy'
>>> del phoneNumbers['andy']
>>> phoneNumbers
{'Jack': 1234, 'Bill': 2234}
>>>
>>> # correct entry for 'Andy'
>>> phoneNumbers['Andy'] = 2235
>>> phoneNumbers
{'Jack': 1234, 'Bill': 2234, 'Andy': 2235}
>>>
>>> # query the dictionary
>>> 'George' in phoneNumbers
False
>>> 'Bill' in phoneNumbers
True
>>> 2235 in phoneNumbers
False # True is returned only for keys.
>>> 
>>> phoneNumbers['Bill']
2234
>>> phoneNumbers['George']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'George'
>>> 
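>>> # to avoid a possible KeyError, test for the key first
>>> # (print() returns None, so a successful lookup leaves nothing to display for status):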
>>> status = 'Bill' in phoneNumbers and print (phoneNumbers['Bill']) ; status
2234
>>> status = 'George' in phoneNumbers and print (phoneNumbers['George']) ; status
False
>>> 
>>> # better:
>>> phoneNumbers = {'Jack': 1234, 'George': 2235, 'Susie': 1231}
>>> for name in ('Bill', 'George') :
...     if name in phoneNumbers : print ("{}'s".format(name), 'phone number is', phoneNumbers[name])
...     else : print ("No entry for", name)
... 
No entry for Bill
George's phone number is 2235
>>>
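
Another common way to guard a lookup is to attempt it and catch the KeyError. The sketch below assumes the same phoneNumbers data as above; the helper name lookup is hypothetical and not part of the lesson.

```python
phoneNumbers = {'Jack': 1234, 'George': 2235, 'Susie': 1231}

def lookup(name):
    """Return a printable line for name, whether or not it has an entry."""
    try:
        return "{}'s phone number is {}".format(name, phoneNumbers[name])
    except KeyError:
        # Raised by phoneNumbers[name] when name is not a key.
        return 'No entry for {}'.format(name)

for name in ('Bill', 'George'):
    print(lookup(name))
```

Testing with `in` first and catching the exception give the same result here; which reads better is largely a matter of taste.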

Dictionary views


The objects returned by dict.keys(), dict.values() and dict.items() are view objects. They provide a dynamic view of the dictionary's entries, which means that when the dictionary changes, each view reflects these changes.

Values need not be unique:

>>> phoneNumbers
{'Jack': 1234, 'Bill': 2234, 'Andy': 2235}
>>> phoneNumbers['George'] = 2235 # same as Andy
>>> phoneNumbers
{'Jack': 1234, 'Bill': 2234, 'Andy': 2235, 'George': 2235}
>>>

Retrieve the keys:

>>> [*phoneNumbers]
['Jack', 'Bill', 'Andy', 'George']
>>> 
>>> names = list(phoneNumbers) ; names
['Jack', 'Bill', 'Andy', 'George']
>>> 
>>> names =  phoneNumbers.keys() ; names
dict_keys(['Jack', 'Bill', 'Andy', 'George'])
>>> 'Jack' in phoneNumbers
True
>>> 'Jack' in phoneNumbers.keys()
True
>>> name = 'Jack' ; (name in phoneNumbers) == (name in phoneNumbers.keys()) 
True
>>> name = 'Linda' ; (name in phoneNumbers) == (name in phoneNumbers.keys()) # False == False
True
>>> 
>>> # the keys are all unique, similar to a set.
>>> set(names)
{'Bill', 'George', 'Andy', 'Jack'}
>>> len(names) == len(set(names))
True # always
>>> names & {'Andy', 'George', 'Linda', 'Susie'}
{'Andy', 'George'} # keys behave like a set
>>>

Retrieve the values:

>>> numbers = phoneNumbers.values() ; numbers
dict_values([1234, 2234, 2235, 2235]) # not all unique
>>> set(numbers)
{2234, 1234, 2235}
>>> len(numbers) == len(set(numbers))
False # here, because two names share the number 2235
>>> 
>>> 2235 in phoneNumbers.values()
True
>>> 2335 in phoneNumbers.values()
False
>>>

Retrieve keys and values as a sequence of (key, value) pairs, where each pair is a tuple:

>>> pairs = phoneNumbers.items() ; pairs
dict_items([('Jack', 1234), ('Bill', 2234), ('Andy', 2235), ('George', 2235)])
>>> len(pairs)
4
>>>
>>> pairs = list(phoneNumbers.items()) ; pairs
[('Jack', 1234), ('Bill', 2234), ('Andy', 2235), ('George', 2235)]
>>> isinstance(pairs[0], tuple)
True
>>>
>>> pairs = tuple(phoneNumbers.items()) ; pairs
(('Jack', 1234), ('Bill', 2234), ('Andy', 2235), ('George', 2235))
>>> isinstance(pairs[2], tuple)
True
>>> 
>>> ('Andy', 2235) in phoneNumbers.items()
True
>>> ('Ted', 2235) in phoneNumbers.items()
False
>>> ('Andy', 2236) in phoneNumbers.items()
False
>>> ['Andy', 2235] in phoneNumbers.items()
False
>>> tuple(['Andy', 2235]) in phoneNumbers.items()
True
>>>

Views are dynamic:

>>> phoneNumbers
{'Jack': 1234, 'Bill': 2234, 'George': 2235, 'Ted': 1245, 'Susie': 1231}
>>> names = phoneNumbers.keys() ; names
dict_keys(['Jack', 'Bill', 'George', 'Ted', 'Susie'])
>>> numbers = phoneNumbers.values() ; numbers
dict_values([1234, 2234, 2235, 1245, 1231])
>>> pairs = phoneNumbers.items() ; pairs
dict_items([('Jack', 1234), ('Bill', 2234), ('George', 2235), ('Ted', 1245), ('Susie', 1231)])
>>>
>>> del phoneNumbers['Ted']
>>> names; numbers ; pairs # all views reflect the change (views are not copies)
dict_keys(['Jack', 'Bill', 'George', 'Susie'])
dict_values([1234, 2234, 2235, 1231])
dict_items([('Jack', 1234), ('Bill', 2234), ('George', 2235), ('Susie', 1231)])
>>>
>>> phoneNumbers['Frank'] = 2233
>>> names; numbers ; pairs # all views reflect the change (views are not copies)
dict_keys(['Jack', 'Bill', 'George', 'Susie', 'Frank'])
dict_values([1234, 2234, 2235, 1231, 2233])
dict_items([('Jack', 1234), ('Bill', 2234), ('George', 2235), ('Susie', 1231), ('Frank', 2233)])
>>> 
>>> # for an independent snapshot, copy the items into a tuple (or a list):
>>> pairs = tuple(phoneNumbers.items()) ; pairs
(('Jack', 1234), ('Bill', 2234), ('George', 2235), ('Susie', 1231), ('Frank', 2233))
>>>
>>> del phoneNumbers['Bill']
>>> phoneNumbers
{'Jack': 1234, 'George': 2235, 'Susie': 1231, 'Frank': 2233}
>>> 
>>> pairs
(('Jack', 1234), ('Bill', 2234), ('George', 2235), ('Susie', 1231), ('Frank', 2233)) # the snapshot is unchanged
>>>

Iterations over keys, values, items:

>>> phoneNumbers
{'Jack': 1234, 'George': 2235, 'Susie': 1231, 'Frank': 2233}
>>>
>>> names = []
>>> for name in phoneNumbers.keys() : names += [name]
... 
>>> names
['Jack', 'George', 'Susie', 'Frank']
>>>
>>> numbers = []
>>> for number in phoneNumbers.values() : numbers += [number]
... 
>>> numbers 
[1234, 2235, 1231, 2233] 
>>>
>>> data = []
>>> for (key, value) in phoneNumbers.items() : data += [(key, value)]
... 
>>> data
[('Jack', 1234), ('George', 2235), ('Susie', 1231), ('Frank', 2233)]
>>>

Summary:

>>> len(phoneNumbers.items())==len(phoneNumbers.keys())==len(phoneNumbers.values())
True # always
>>> 
>>> phoneNumbers 
{'Jack': 1234, 'Bill': 2234, 'Andy': 2235, 'George': 2235}
>>> data
(('Andy', 1244), ('Ted', 1245), ('Linda', 1230), ('Susie', 1231))
>>> for (name, number) in data : phoneNumbers[name] = number # update phoneNumbers
... 
>>> phoneNumbers # Andy's number has changed
{'Jack': 1234, 'Bill': 2234, 'Andy': 1244, 'George': 2235, 'Ted': 1245, 'Linda': 1230, 'Susie': 1231}
>>> 
>>> # iteration over keys, iteration over values, iteration over items
>>> # are always in same order.
>>> list(phoneNumbers.keys())
['Jack', 'Bill', 'Andy', 'George', 'Ted', 'Linda', 'Susie']
>>> list(phoneNumbers.items())
[('Jack', 1234), ('Bill', 2234), ('Andy', 1244), ('George', 2235), ('Ted', 1245), ('Linda', 1230), ('Susie', 1231)]
>>> list(phoneNumbers.values())
[1234, 2234, 1244, 2235, 1245, 1230, 1231]
>>>
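
Iterating over items() also allows a reverse lookup, from a value back to its keys. Because values need not be unique, the sketch below collects every matching name. The function name names_with_number is made up for this example.

```python
phoneNumbers = {'Jack': 1234, 'Bill': 2234, 'Andy': 2235, 'George': 2235}

def names_with_number(directory, number):
    """Return every key in directory whose value equals number."""
    return [name for (name, value) in directory.items() if value == number]

print(names_with_number(phoneNumbers, 2235))   # two names share this number
print(names_with_number(phoneNumbers, 9999))   # no match gives an empty list
```

Unlike a lookup by key, this scans every entry, so it becomes slower as the dictionary grows.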

Other operations on dictionaries



Introduction to zip():

>>> names
['Jack', 'George', 'Susie', 'Frank']
>>> numbers
[  1234,     2235,    1231,   2233] # each number corresponds to a name above:
>>> list(zip(names, numbers))
[('Jack', 1234), ('George', 2235), ('Susie', 1231), ('Frank', 2233)]
>>>

dictionary = dict()


Return a new dictionary initialized from an optional positional argument and a possibly empty set of keyword arguments.

Arguments may be supplied to the dict() constructor as follows:

dictionary = dict( )

dictionary = dict( **kwarg )

dictionary = dict( mapping )

dictionary = dict( mapping, **kwarg )

dictionary = dict( iterable )

dictionary = dict( iterable, **kwarg )

where:

**kwarg means a possibly empty set of keyword arguments such as: one=1, two=2, three=3

"mapping" means a mapping object such as a dictionary, and

"iterable" means an iterable object (such as list or tuple) each item of which must be an iterable with exactly two objects. The first object of each item becomes a key in the new dictionary, and the second object the corresponding value. If a key occurs more than once, the last value for that key silently becomes the corresponding value in the new dictionary.


To illustrate, consider the following Python code:

a = dict( four = 4 , one=1 , three= 3, two =2 )  # arguments are **kwarg 
b = { 'two': 2, 'one': 1, 'three': 3 , 'four' : 4  }
c = dict( zip(['one', 'four', 'two', 'three'], [1, 4, 2, 3]) )  # argument is output of zip().                     
d = dict( [('two', -2), ('three', 3), ('one', 1)] ,  four = 4 , two =2  )  # arguments are list and **kwarg        
e = dict( (('two', 2), ('three', -3), ('one', 1)) , four = 4 , three= 3 )  # arguments are tuple and **kwarg       
f = dict( {('two', 2), ('three', -3), ('one', 1)} , four = 4 , three= 3 )  # arguments are set and **kwarg         
g = dict( {'three': -3, 'one': -1, 'two': 2} , four = 4 , one=1 , three= 3 )  # arguments are dict and **kwarg     
h = dict( {'three': -3, 'one': -1, 'two': 2, 'four':4 , 'one':1 , 'three':3} )  # argument is dict                  
print(
'a =', a, '''
a == b == c == d == e == f == g == h:''', a == b == c == d == e == f == g == h
)

The above code produces:

a = {'four': 4, 'one': 1, 'three': 3, 'two': 2}
a == b == c == d == e == f == g == h: True

For a copy:

a = {'four': 4, 'one': 1, 'three': 3, 'two': 2}
b = dict(a) # b is a shallow copy of a: a new dictionary that refers to the same value objects.
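
Note that dict(a) copies only the top level. The difference matters once the values are themselves mutable. A minimal sketch, using the standard copy module and made-up data:

```python
import copy

original = {'Tim': ['parachuting', 'video games']}
shallow = dict(original)            # new dictionary, same inner list
deep = copy.deepcopy(original)      # new dictionary, new inner list

original['Tim'].append('athletics')    # mutate the shared inner list
original['Linda'] = ['old movies']     # add a key to the original only

print(shallow)   # sees the appended hobby, but not the new key
print(deep)      # sees neither change
```

For dictionaries whose values are all immutable (numbers, strings), a shallow copy behaves the same as a deep one.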

value = dictionary.get(key[, default])


Same as value = dictionary[key], except that a missing key returns default (None if default is omitted) instead of raising KeyError.

>>> phoneNumbers
{'Jack': 1234, 'George': 2235, 'Susie': 1231, 'Frank': 2233}
>>>
>>> phoneNumbers.get('Alfred') # returns None, so nothing is displayed
>>> phoneNumbers.get('Frank', 'no entry for Frank')
2233
>>> phoneNumbers.get('Alfred','no entry for Alfred')
'no entry for Alfred'
>>>
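
One frequent use of get() with a default is counting. The sketch below tallies words in a made-up sentence: get(word, 0) supplies 0 the first time a word is seen.

```python
text = 'the quick brown fox jumps over the lazy dog the end'   # made-up sample

counts = {}
for word in text.split():
    # A missing word counts as 0, so no KeyError and no prior test is needed.
    counts[word] = counts.get(word, 0) + 1

print(counts['the'])
print(counts.get('cat', 0))   # a word that never appeared
```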

value = dictionary.pop(key[, default])


If key is in the dictionary, remove it and return its value, else return default. If default is not given and key is not in the dictionary, a KeyError is raised.

>>> phoneNumbers
{'George': 2235, 'Susie': 1231, 'Frank': 2233}
>>>
>>> value = phoneNumbers.pop("Jimbo", 'no entry for Jimbo') ; value
'no entry for Jimbo'
>>> value = phoneNumbers.pop("Susie", 'no entry for Susie') ; value
1231
>>>
>>> phoneNumbers
{'George': 2235, 'Frank': 2233} # Susie has been removed.
>>> value = phoneNumbers.pop("Frank") ; value
2233
>>> phoneNumbers
{'George': 2235} # Frank has been removed.
>>> value = phoneNumbers.pop("Jimbo") ; value
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'Jimbo'
>>> phoneNumbers
{'George': 2235}
>>>

Good programming requires that we anticipate and process errors. (Unless you really want to produce KeyError.)

for name in ('Jimbo', 'George') :
    value = phoneNumbers.pop(name, None)
    if value:
        print ("{}'s phone number was {}".format(name, value))
    else:
        print ("No phone number for", name)
    if name in phoneNumbers : # It should have been removed.                                                       
        print ('Unknown internal error.') # You include this kind of code when you're learning python.
print ('phoneNumbers =', phoneNumbers)

The above code produces:

No phone number for Jimbo
George's phone number was 2235
phoneNumbers = {}

What if 'Jimbo' were in phoneNumbers with a value of None?

phoneNumbers = {'George' : 2235, 'Jimbo' : None}

The above code produces the same results, because the test "if value:" cannot tell a missing key from a key whose value is None (or 0).
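
One way to tell the two cases apart is to pass pop() a default that can never be a stored value. This is a sketch only; the sentinel name _MISSING is hypothetical.

```python
_MISSING = object()   # hypothetical sentinel: no stored value can be this object

phoneNumbers = {'George': 2235, 'Jimbo': None}
results = []
for name in ('Jimbo', 'George', 'Alfred'):
    value = phoneNumbers.pop(name, _MISSING)
    if value is _MISSING:
        results.append('No entry for {}'.format(name))
    elif value is None:
        results.append('{} had an entry, but no number was recorded'.format(name))
    else:
        results.append("{}'s phone number was {}".format(name, value))

for line in results:
    print(line)
print('phoneNumbers =', phoneNumbers)
```

Comparing with `is` rather than `==` ensures that only the sentinel object itself is treated as "missing".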

(key, value) = dictionary.popitem()


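Remove and return a (key, value) pair from the dictionary. Since Python 3.7 pairs are returned in last-in, first-out order; in earlier versions an arbitrary pair was returned. popitem() is useful to iterate destructively over a dictionary. If the dictionary is empty, calling popitem() raises a KeyError.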

>>> L1 = []
>>> phoneNumbers = {'Jack': 1234, 'George': 2235, 'Susie': 1231}
>>>
>>> while phoneNumbers : L1 += [phoneNumbers.popitem()] ; phoneNumbers ; L1
... 
{'Jack': 1234, 'George': 2235}
[('Susie', 1231)]
{'Jack': 1234}
[('Susie', 1231), ('George', 2235)]
{}
[('Susie', 1231), ('George', 2235), ('Jack', 1234)]
>>>
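
The same destructive loop can be written as a script that also shows what happens once the dictionary is empty. A minimal sketch; the variable names removed and raised are made up:

```python
phoneNumbers = {'Jack': 1234, 'George': 2235, 'Susie': 1231}

removed = []
while phoneNumbers:                        # an empty dictionary is falsy
    removed.append(phoneNumbers.popitem())

try:
    phoneNumbers.popitem()                 # nothing left to remove
    raised = False
except KeyError:
    raised = True

print(removed)
print('KeyError on empty dictionary:', raised)
```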

value = dictionary.setdefault(key[, default])


If key is in the dictionary, return its value. If not, insert key with a value of default and return default (which defaults to None).

>>> phoneNumbers = {'Jack': 1234, 'George': 2235, 'Susie': 1231}
>>> phoneNumbers.setdefault('George', 3456)
2235
>>> phoneNumbers
{'Jack': 1234, 'George': 2235, 'Susie': 1231}
>>> phoneNumbers.setdefault('Allen', 3456)
3456
>>> phoneNumbers
{'Jack': 1234, 'George': 2235, 'Susie': 1231, 'Allen': 3456}
>>> phoneNumbers.setdefault('Tim') # added to phoneNumbers with default None
>>> phoneNumbers
{'Jack': 1234, 'George': 2235, 'Susie': 1231, 'Allen': 3456, 'Tim': None}
>>>
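
setdefault() is often used to group items, since it can create an empty list the first time a key is seen and return the existing list afterwards. A sketch with made-up (name, street) pairs:

```python
# Made-up sample data for this sketch.
streets = [('Alan', 'Brown Street'), ('Tim', 'Green Street'), ('Linda', 'Brown Street')]

residents = {}
for name, street in streets:
    # Insert an empty list for a new street, then append to whichever list is stored.
    residents.setdefault(street, []).append(name)

print(residents)
```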

dictionary.update([other])


Update the dictionary with the key:value pairs from other, overwriting existing keys. Return None. update() accepts arguments like the arguments to dict() above.

>>> d = dict(one=1, two=2) ; d
{'one': 1, 'two': 2}
>>> d.update() ; d # empty argument list is OK.
{'one': 1, 'two': 2}
>>>

The following script applies each form of argument in turn:

a = dict(zero=0,one=-1) # note the mistake -1
b = dict(zero=0,one=-1)
c = dict(zero=0,one=-1)
d = dict(zero=0,one=-1)
e = dict(zero=0,one=-1)
f = dict(zero=0,one=-1)
g = dict(zero=0,one=-1)
h = dict(zero=0,one=-1)
k = dict(zero=0,one=-1)
m = dict(zero=0,one=-1)
n = dict(zero=0,one=-1) # no shallow copies

a.update( one=1, three=3, two=2 )                                   # **kwarg
b.update( { 'two': 2, 'one': 1, 'three': 3 } )                      # dict
c.update( { 'two': -2, 'one': 1, 'three': -3 }, two=2, three=3 )    # dict and **kwarg
d.update( zip(['one', 'two', 'three'], [1, 2, 3]) )                 # zip
e.update( zip(['one', 'two', 'three'], [1, -2, 3]), two=2, one=1 )  # zip and **kwarg
f.update( [('two', 2), ('three', 3), ('one', 1)] )                  # list
g.update( [('two', 2), ('three', 3)], one=1 )                       # list and **kwarg
h.update( (('two', 2), ('three', 3), ('one', 1)) )                  # tuple
k.update( (('two', -2), ('three', 3)), one=1, two=2, three=3 )      # tuple and **kwarg
m.update( {('two', 2), ('three', 3), ('one', 1)} )                  # set
n.update( {('two', -2), ('three', 3)}, one=1, two=2, three=3 )      # set and **kwarg

# mistake has been corrected:                                                                                      
(a == b == c == d == e == f == g == h == k == m == n == {'zero': 0, 'one': 1, 'three': 3, 'two': 2}) or exit(99)

exit(0)

The above code exits with status 0, confirming that every form of update() produced the same dictionary.
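
update() changes the dictionary in place. When the original should be left alone, a merged copy can be built instead. The sketch below uses dictionary unpacking (Python 3.5 and later); from Python 3.9 the expression defaults | fixes gives the same result. The names defaults and fixes are made up.

```python
defaults = {'zero': 0, 'one': -1}   # note the mistake -1
fixes = {'one': 1, 'two': 2}

# Later entries win, just as with update(), but a new dictionary is created.
merged = {**defaults, **fixes}

print(merged)
print(defaults)   # the original is unchanged
```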

An elementary database



Let's use the Python dictionary to build an elementary database called "friends":

friends = {}

The information we'd like to include is: name, age, address, phone number, hobbies. The name is the key; the remaining fields are stored in a list. Add entries to "friends":

friends['Bill']  = [22, 'Black Street', 1234, [None]]
friends['Alan']  = [20, 'Brown Street', 2345, ['cycling','stamp collecting']]
friends['Tim']   = [19, 'Green Street', 3456, ['parachuting', 'video games','athletics']]
friends['Linda'] = [19, 'Brown Street', 4567, ['old movies']]
friends['Jenny'] = [21, 'Grey Street',  4567, ['old movies', 'cycling', 'video games', 'genealogy']]

Access the data in the database:

print ("Bill's age is", friends['Bill'][0])
print ("Alan's address is", friends['Alan'][1])
print ("Tim's phone number is", friends['Tim'][2])
print ("Linda's hobbies are", friends['Linda'][3])
print ("Jenny's hobbies are", friends['Jenny'][3])

The above code produces:

Bill's age is 22
Alan's address is Brown Street
Tim's phone number is 3456
Linda's hobbies are ['old movies']
Jenny's hobbies are ['old movies', 'cycling', 'video games', 'genealogy']


Make the code more readable:

age = 0; address = 1; phoneNumber = 2; hobbies = 3

print ("Bill's age is", friends['Bill'][age])
print ("Alan's address is", friends['Alan'][address])
print ("Tim's phone number is", friends['Tim'][phoneNumber])
print ("Linda's hobbies are", friends['Linda'][hobbies])
print ("Jenny's hobbies are", friends['Jenny'][hobbies])

The output is the same as above. Make the output of 'hobbies' more readable:

for name in friends :
    d = friends[name][hobbies]
    if d :
        if len(d) == 1 :
            if d[0] is None :
                print ('Nothing for', name, "under 'hobbies'.")
            else :
                print ("{}'s hobby is: {}.".format( name, d[0] ))
        else :
            # join the hobbies with commas and finish with a full stop
            s = ', '.join( str(hobby) for hobby in d )
            print ("{}'s hobbies are: {}.".format( name, s ))
    else :
        print ('No info for', name, "under 'hobbies'.")

The above code produces:

Nothing for Bill under 'hobbies'.
Alan's hobbies are: cycling, stamp collecting.
Tim's hobbies are: parachuting, video games, athletics.
Linda's hobby is: old movies.
Jenny's hobbies are: old movies, cycling, video games, genealogy.


Add more information to the database:

friends['Bill']  += [[ ['Ford','classic', 1948, ['v8','ohv','water-cooled']], 'Ford' ]]
friends['Alan']  += [[]]
friends['Tim']   += [['Toyota','Chevy']]
friends['Linda'] += [['Volvo']]
friends['Jenny'] += [['Subaru']]

Check the additional data:

for name in friends : print (name, friends[name][-1])

The above code produces:

Bill [['Ford', 'classic', 1948, ['v8', 'ohv', 'water-cooled']], 'Ford']
Alan []
Tim ['Toyota', 'Chevy']
Linda ['Volvo']
Jenny ['Subaru']

This is a simple database and already it's becoming complicated:

cars = 4
print (friends['Bill'][cars][0][1])
print (friends['Bill'][cars][0][3][2])

The above code produces:

classic
water-cooled


What if you decide to expand the information in friends[name][phoneNumber]? For example (number and extension are placeholders for the actual values):

friends['Tim'][phoneNumber] = [ ['cell', number], ['home', number], ['business', number, extension] ]

Fortunately, Python lists can be nested to any depth, so the structure can grow as needed, although each added level makes the index expressions harder to read.
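
One way to keep such a structure readable is to store each record as a dictionary as well, so that fields are reached by name rather than by position. This is only a sketch: the field names ('age', 'address', 'phone', 'hobbies', 'cars', 'make', 'kind', 'year', 'engine') are assumptions made for this illustration, and only two of the friends are shown.

```python
# Field names are assumptions for this sketch; the lesson itself stores
# the same data in positional lists.
friends = {
    'Bill': {
        'age': 22,
        'address': 'Black Street',
        'phone': 1234,
        'hobbies': [],
        'cars': [
            {'make': 'Ford', 'kind': 'classic', 'year': 1948,
             'engine': ['v8', 'ohv', 'water-cooled']},
            {'make': 'Ford'},
        ],
    },
    'Linda': {
        'age': 19,
        'address': 'Brown Street',
        'phone': 4567,
        'hobbies': ['old movies'],
        'cars': [{'make': 'Volvo'}],
    },
}

# Compare with friends['Bill'][cars][0][1] in the list-based version.
print(friends['Bill']['cars'][0]['kind'])
print(friends['Bill']['cars'][0]['engine'][2])
print(friends['Linda']['address'])
```

With named fields, adding a new kind of information does not shift the position of any existing field.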


Some advanced operations on the database



Show the names of all your friends who live on Brown Street.

for name in list(friends) :
    if friends[name][address] == 'Brown Street' :
        print (name, 'lives on Brown Street.')

The above code produces:

Alan lives on Brown Street.
Linda lives on Brown Street.


Show the names of all your friends who enjoy video games.

for name in list(friends) :
    d = friends[name][hobbies]
    for hobby in d :
        if hobby == 'video games' :
            print (name, 'enjoys video games.')

The above code produces:

Tim enjoys video games.
Jenny enjoys video games.
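
The inner loop can be replaced by a membership test, and the whole query by a list comprehension. A sketch using three of the entries above, so that it runs on its own; the variable name gamers is made up:

```python
hobbies = 3   # index of the hobbies list, as defined earlier in the lesson

friends = {
    'Alan':  [20, 'Brown Street', 2345, ['cycling', 'stamp collecting']],
    'Tim':   [19, 'Green Street', 3456, ['parachuting', 'video games', 'athletics']],
    'Jenny': [21, 'Grey Street',  4567, ['old movies', 'cycling', 'video games', 'genealogy']],
}

# Keep each name whose hobbies list contains 'video games'.
gamers = [name for name in friends if 'video games' in friends[name][hobbies]]

for name in gamers:
    print(name, 'enjoys video games.')
```

Because `in` reports a match at most once per list, a hobby listed twice by mistake does not print the same name twice.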


Show the names of all your friends who own a classic car with an 'ohv' engine.

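for name in friends :
    autoInfo = friends[name][cars]
    for item in autoInfo :
        # only a detailed entry (a list with an engine list at index 3) can qualify
        if isinstance(item, list) and len(item) > 3 and isinstance(item[3], list) :
            if ('classic' in item) and ('ohv' in item[3]) :
                print (name, 'has one.')

The above code produces:

Bill has one.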

Integrity of the database



From time to time review the data in the database and verify that it makes sense and has not been corrupted.

If you see a friend's age as 112, would this make sense?

In a multi-item cell like "hobbies" remove duplicates: ['parachuting', 'video games', 'athletics', 'parachuting', 'video games'] should be ['parachuting', 'video games', 'athletics'].

Does a list like ['parachuting', None, 'video games', 'athletics'] make sense?

Would you keep a hobby like 'robbing banks' in your database?
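
Some of these checks can be automated. The sketch below removes duplicates and None entries from a hobbies list while keeping the original order, and flags an implausible age. The function name plausible_age and its bounds are assumptions chosen for illustration.

```python
hobbies = ['parachuting', None, 'video games', 'athletics', 'parachuting', 'video games']

# dict.fromkeys() keeps the first occurrence of each item, in order;
# the comprehension then drops the None placeholder.
cleaned = [hobby for hobby in dict.fromkeys(hobbies) if hobby is not None]
print(cleaned)

def plausible_age(age, low=0, high=110):
    """Return True if age is an int within the (assumed) plausible range."""
    return isinstance(age, int) and low <= age <= high

print(plausible_age(22))
print(plausible_age(112))
```

Whether an entry such as 'robbing banks' belongs in the database remains a question for the person maintaining it; code can only check form and range.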

Assignments

