Python Concepts/List Comprehension
Objective
Lesson
To create a list of squares of the numbers 1 through 5, a for loop can be used:
>>> a = []
>>> for x in range(1,6) : a += [x*x]
...
>>> a
[1, 4, 9, 16, 25]
This approach has the perhaps undesirable side-effect of creating or re-assigning both a and x. Putting the loop directly inside brackets does not work:
[for x in range(1,6) : [x*x]] # This produces SyntaxError: invalid syntax.
If we change the syntax slightly, List Comprehensions come to the rescue.
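As a minimal sketch of the side-effect described above, the following compares the two approaches. It assumes Python 3, where the loop variable of a listcomp is local to the listcomp; the names loop_x and squares are illustrative only.

```python
# The for loop leaves its loop variable behind in the enclosing scope.
a = []
for x in range(1, 6):
    a += [x * x]
loop_x = x  # x is still bound after the loop (to 5, the last value)

# Assumption: Python 3, where a listcomp's loop variable is local to the listcomp.
x = 'untouched'
squares = [x * x for x in range(1, 6)]

print(a == squares)  # True
print(loop_x)        # 5
print(x)             # untouched
```

The listcomp builds the same list without re-assigning the outer x.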
List Comprehensions
List comprehensions provide a concise way to create lists. Common applications are to make new lists where each element is the result of some operations applied to each member of another sequence or iterable, or to create a subsequence of those elements that satisfy a certain condition.
>>> [x*x for x in range(1,6)]
[1, 4, 9, 16, 25]
>>>
In the listcomp above,
>>> a = list(range(-5,7)) ; a
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6]
>>>
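The two applications named in this section (applying an operation to each member, and selecting the members that satisfy a condition) can be sketched with the list a created above. The names doubled, non_negative and even_squares are illustrative, not part of the lesson.

```python
a = list(range(-5, 7))

# Apply an operation to every member.
doubled = [2 * x for x in a]

# Keep only the members that satisfy a condition.
non_negative = [x for x in a if x >= 0]

# Both at once: squares of the even members only.
even_squares = [x * x for x in a if x % 2 == 0]

print(non_negative)  # [0, 1, 2, 3, 4, 5, 6]
print(even_squares)  # [16, 4, 0, 4, 16, 36]
```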
Create a deep copy of list
Nested List Comprehensions
Create a useful list:
matrix = [p for p in range (3,23)] # A listcomp does this nicely.
print ('len(matrix) =', len(matrix))
print ('matrix =', matrix)
len(matrix) = 20
matrix = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]
New list of 3 rows
Create a list containing three rows in which each row contains every third value of matrix:
b = [ [matrix[p] for p in range(q,len(matrix),3)] for q in range(3) ]
print ('''
b =
{}, length={}
{}, length={}
{}, length={}
'''.format( b[0],len(b[0]), b[1],len(b[1]), b[2],len(b[2]) ))
b =
[3, 6, 9, 12, 15, 18, 21], length=7
[4, 7, 10, 13, 16, 19, 22], length=7
[5, 8, 11, 14, 17, 20], length=6
The nested listcomp above is equivalent to:
b = [ [matrix[p] for p in range(0,len(matrix),3)] ,
[matrix[p] for p in range(1,len(matrix),3)] ,
[matrix[p] for p in range(2,len(matrix),3)] ]
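As an aside, the same three rows can be sketched with extended slicing instead of an inner listcomp. This is an alternative formulation, not the lesson's method; the name b_sliced is illustrative.

```python
matrix = list(range(3, 23))

# Nested listcomp, as in the lesson.
b = [[matrix[p] for p in range(q, len(matrix), 3)] for q in range(3)]

# Extended slicing: matrix[q::3] starts at index q and takes every third value.
b_sliced = [matrix[q::3] for q in range(3)]

print(b == b_sliced)  # True
```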
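Fill each row as necessary so that all rows have the same length:
c = [ b[p]+[None]*q for p in range(len(b)) for q in [ len(b[0]) - len(b[p]) ] ]
The syntax of the listcomp requires a for clause here; iterating over the one-element list [ len(b[0]) - len(b[p]) ] is equivalent to the assignment q = len(b[0]) - len(b[p]).
print ('''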
c =
{}, length={}
{}, length={}
{}, length={}
'''.format( c[0],len(c[0]), c[1],len(c[1]), c[2],len(c[2]) ))
c =
[3, 6, 9, 12, 15, 18, 21], length=7
[4, 7, 10, 13, 16, 19, 22], length=7
[5, 8, 11, 14, 17, 20, None], length=7
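The role of the one-element for clause in the padding step can be sketched on a smaller list. Like the lesson's code, this assumes the first row is the longest; the small b and the name c_direct are illustrative.

```python
b = [[3, 6, 9], [4, 7, 10], [5, 8]]

# 'for q in [expr]' runs exactly once per row and binds q to expr.
c = [row + [None] * q for row in b for q in [len(b[0]) - len(row)]]

# Without the helper name q, the expression is simply written in place.
c_direct = [row + [None] * (len(b[0]) - len(row)) for row in b]

print(c)              # [[3, 6, 9], [4, 7, 10], [5, 8, None]]
print(c == c_direct)  # True
```

The one-element form becomes useful when the computed value is needed more than once in the same listcomp.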
Create a dictionary
Create a dictionary from a list in which each value at an even position is a key and each value at an odd position is the associated value:
data = [p%q for p in range (3,10) for q in range (2,7)]
print ('data =', data)
data = [1, 0, 3, 3, 3, 0, 1, 0, 4, 4, 1, 2, 1, 0, 5, 0, 0, 2, 1, 0, 1, 1, 3, 2, 1, 0, 2, 0, 3, 2, 1, 0, 1, 4, 3]
Create a list containing two rows in which the first row contains keys and the second contains values:
b = [ [data[p] for p in range(q,len(data),2)] for q in range(2) ]
print ('''
b =
{}, length={}
{}, length={}
'''.format( b[0],len(b[0]), b[1],len(b[1]) ))
b =
[1, 3, 3, 1, 4, 1, 1, 5, 0, 1, 1, 3, 1, 2, 3, 1, 1, 3], length=18
[0, 3, 0, 0, 4, 2, 0, 0, 2, 0, 1, 2, 0, 0, 2, 0, 4], length=17
Fill as necessary:
c = [ b[p]+[None]*q for p in range(len(b)) for q in range(len(b[0]) - len(b[p]), 1+len(b[0]) - len(b[p])) ]
print ('''
c =
{}, length={}
{}, length={}
'''.format( c[0],len(c[0]), c[1],len(c[1]) ))
c =
[1, 3, 3, 1, 4, 1, 1, 5, 0, 1, 1, 3, 1, 2, 3, 1, 1, 3], length=18
[0, 3, 0, 0, 4, 2, 0, 0, 2, 0, 1, 2, 0, 0, 2, 0, 4, None], length=18
Transpose rows and columns and create input to dictionary:
d = [[row[i] for row in c] for i in range(len(c[0]))]
print (' d =', d)
d = [[1, 0], [3, 3], [3, 0], [1, 0], [4, 4], [1, 2], [1, 0], [5, 0], [0, 2], [1, 0], [1, 1], [3, 2], [1, 0], [2, 0], [3, 2], [1, 0], [1, 4], [3, None]]
List d is equivalent to:
d1 = [ [ c[0][i], c[1][i] ] for i in range(18) ]
print ('d1 =', d1)
d1 = [[1, 0], [3, 3], [3, 0], [1, 0], [4, 4], [1, 2], [1, 0], [5, 0], [0, 2], [1, 0], [1, 1], [3, 2], [1, 0], [2, 0], [3, 2], [1, 0], [1, 4], [3, None]]
List d1 is equivalent to:
d2 = [
[ c[0][ 0], c[1][ 0] ] ,
[ c[0][ 1], c[1][ 1] ] ,
[ c[0][ 2], c[1][ 2] ] ,
........................
[ c[0][15], c[1][15] ] ,
[ c[0][16], c[1][16] ] ,
[ c[0][17], c[1][17] ]
]
print ('d2 =', d2)
d2 = [[1, 0], [3, 3], [3, 0], [1, 0], [4, 4], [1, 2], [1, 0], [5, 0], [0, 2], [1, 0], [1, 1], [3, 2], [1, 0], [2, 0], [3, 2], [1, 0], [1, 4], [3, None]]
e = dict(d)
print ('e =', e)
e = {1: 4, 3: None, 4: 4, 5: 0, 0: 2, 2: 0}
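The transposition step can also be sketched with the built-in zip. This is an alternative to the nested listcomp, shown on a shortened c for readability; the name d_zip is illustrative.

```python
c = [[1, 3, 3], [0, 3, None]]

# Transpose with a nested listcomp, as in the lesson.
d = [[row[i] for row in c] for i in range(len(c[0]))]

# zip(*c) yields one tuple per column; convert each to a list to match d.
d_zip = [list(column) for column in zip(*c)]

print(d)            # [[1, 0], [3, 3], [3, None]]
print(d == d_zip)   # True
print(dict(d_zip))  # {1: 0, 3: None}
```

As in the lesson's output, a key that appears more than once keeps only its last value.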
Listcomps simplified
In practice this code will do the job:
data = [p%q for p in range (3,10) for q in range (2,7)]
print ('data =', data)
data = [1, 0, 3, 3, 3, 0, 1, 0, 4, 4, 1, 2, 1, 0, 5, 0, 0, 2, 1, 0, 1, 1, 3, 2, 1, 0, 2, 0, 3, 2, 1, 0, 1, 4, 3]
Check the input:
if isinstance(data,list) and len(data) >= 1: pass # Input must contain at least one key.
else : exit(99)
status = {
((isinstance(data[p], int)) or (isinstance(data[p], float)) or (isinstance(data[p], str)))
for p in range(0, len(data), 2)
} # A set comprehension. In this dictionary each key must be int or float or str.
print ('status =', status)
status = {True}
if False in status :
    print ("'data' contains unrecognized key.")
    exit (98)
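The same key check can be sketched with all() and a generator expression in place of the set comprehension. This is one possible variant, not the lesson's code; the sample lists and the name keys_ok are made up for illustration.

```python
data = [1, 0, 'b', 3, 2.5]

# isinstance() accepts a tuple of types, and all() stops at the first False.
keys_ok = all(isinstance(key, (int, float, str)) for key in data[0::2])
print(keys_ok)  # True

# A list is not an accepted key type, so this input is rejected.
bad = [1, 0, [2], 3]
bad_ok = all(isinstance(key, (int, float, str)) for key in bad[0::2])
print(bad_ok)  # False
```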
Create the dictionary
b = dict(
[
(data+[None])[p:p+2] for p in range(0, len(data), 2)
]
)
print (
'\nlen(data) =', len(data),
'\nInput to dict()\n =', [ (data+[None])[p:p+2] for p in range(0, len(data), 2) ],
'\nDictionary =', b)
len(data) = 35
Input to dict()
= [[1, 0], [3, 3], [3, 0], [1, 0], [4, 4], [1, 2], [1, 0], [5, 0], [0, 2], [1, 0], [1, 1], [3, 2], [1, 0], [2, 0], [3, 2], [1, 0], [1, 4], [3, None]]
Dictionary = {1: 4, 3: None, 4: 4, 5: 0, 0: 2, 2: 0}
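For comparison, the pairing of keys and values can be sketched with itertools.zip_longest from the standard library, which pads the shorter sequence with None by default. This is an alternative to the slicing listcomp above, not a replacement the lesson prescribes.

```python
from itertools import zip_longest

data = [p % q for p in range(3, 10) for q in range(2, 7)]

# Keys are at even positions, values at odd positions; zip_longest pads
# the shorter sequence with None (its default fillvalue).
b = dict(zip_longest(data[0::2], data[1::2]))
print(b)  # {1: 4, 3: None, 4: 4, 5: 0, 0: 2, 2: 0}
```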
List Comprehensions for free-format Python
Although a listcomp recognizes statements beginning with only for or if, assignments and if/else logic can be emulated inside one, as the code in this section shows.
A "Unix date" has format:
$ date
Wed Feb 14 08:24:24 CST 2018
The code in this section uses list comprehensions to recognize valid dates. All of the following are considered valid dates:
Wed Feb 14 08:24:24 CST 2018
Wednes Feb 14 08:24:24 CST 2018 # More than 3 letters in name of day.
Wed Febru 14 08:24:24 CST 2018 # More than 3 letters in name of month.
Wed Feb 14 8:24 : 24 CST 2018 # White space in hh:mm:ss.
wed FeB 14 8:24 : 24 cSt 2018 # Mixed capitalization.
Build dictionary:
mo = '''January February March April
May June July August September
October November December
'''
L1 = [
[ month[:3], {month[:p] for p in range (3,len(month)+1)} ]
for month in mo.title().split()
]
months = dict(L1)
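The same dictionary can be sketched with a dict comprehension, which skips the intermediate list L1. This is an equivalent formulation offered for comparison, not the lesson's code.

```python
mo = '''January February March April
May June July August September
October November December
'''

# Dict comprehension: the key is the 3-letter prefix, the value is a set
# comprehension holding every prefix of length 3 or more.
months = {
    month[:3]: {month[:p] for p in range(3, len(month) + 1)}
    for month in mo.title().split()
}

print(sorted(months['Mar']))  # ['Mar', 'Marc', 'March']
print(months['May'])          # {'May'}
```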
Display dictionary:
L1 = [
'''months['{}'] = {}'''.format(key, months[key])
for key in months
]
print ( '\n'.join(L1) )
months['Jan'] = {'Januar', 'Janua', 'Janu', 'Jan', 'January'}
months['Feb'] = {'Februa', 'Febru', 'Februar', 'Feb', 'Febr', 'February'}
months['Mar'] = {'Mar', 'March', 'Marc'}
months['Apr'] = {'April', 'Apri', 'Apr'}
months['May'] = {'May'}
months['Jun'] = {'June', 'Jun'}
months['Jul'] = {'Jul', 'July'}
months['Aug'] = {'Augus', 'Augu', 'Aug', 'August'}
months['Sep'] = {'Sep', 'Septemb', 'September', 'Septem', 'Septe', 'Septembe', 'Sept'}
months['Oct'] = {'Octo', 'Octobe', 'Oct', 'Octob', 'October'}
months['Nov'] = {'Nove', 'November', 'Novemb', 'Novem', 'Nov', 'Novembe'}
months['Dec'] = {'Decembe', 'Decemb', 'Dece', 'Decem', 'December', 'Dec'}
Build dictionary:
da='''
Sunday Monday Tuesday
Wednesday Thursday
Friday Saturday
'''
L1 = [
[ day[:3], {day[:p] for p in range (3,len(day)+1)} ]
for day in da.title().split()
]
days = dict(L1)
Display dictionary:
L1 = [
'''days['{}'] = {}'''.format(key, days[key])
for key in days
]
print ( '\n'.join(L1) )
days['Sun'] = {'Sund', 'Sunda', 'Sun', 'Sunday'}
days['Mon'] = {'Monday', 'Mon', 'Mond', 'Monda'}
days['Tue'] = {'Tuesda', 'Tues', 'Tuesday', 'Tue', 'Tuesd'}
days['Wed'] = {'Wednesda', 'Wednesd', 'Wedn', 'Wednes', 'Wed', 'Wedne', 'Wednesday'}
days['Thu'] = {'Thursday', 'Thur', 'Thu', 'Thursd', 'Thurs', 'Thursda'}
days['Fri'] = {'Friday', 'Fri', 'Frida', 'Frid'}
days['Sat'] = {'Saturday', 'Satu', 'Saturda', 'Sat', 'Saturd', 'Satur'}
The regular expression:
reg2 = (
r'''\b # Word boundary.
(?P<day>\w{3,}) # At least 3 word characters.
\s+
(?P<month>\w{3,}) # At least 3 word characters.
\s+
(?P<date>\d{1,2}) # 1 or 2 numbers.
\s+
(?P<hours>\d{1,2}) # 1 or 2 numbers.
\s*:\s*
(?P<minutes>\d{1,2}) # 1 or 2 numbers.
\s*:\s*
(?P<seconds>\d{1,2}) # 1 or 2 numbers.
\s+
(?P<time_zone>\w{3}) # 3 word characters
\s+
(?P<year>\d{4}) # 4 numbers
\b # Word boundary.'''
)
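How a verbose pattern with named groups behaves can be sketched on the time portion alone. The shortened pattern time_pattern below is an illustrative excerpt, not the lesson's reg2; with re.VERBOSE, whitespace and # comments inside the pattern are ignored.

```python
import re

# Illustrative excerpt: only the hh:mm:ss part of a date.
time_pattern = r'''
    (?P<hours>\d{1,2})     # 1 or 2 digits.
    \s*:\s*                # Colon, optionally surrounded by white space.
    (?P<minutes>\d{1,2})
    \s*:\s*
    (?P<seconds>\d{1,2})
'''

m = re.search(time_pattern, 'Wed Feb 14 8:24 : 24 CST 2018', re.VERBOSE)
print(m.groupdict())  # {'hours': '8', 'minutes': '24', 'seconds': '24'}
```

Each named group is available afterwards by name, either through groupdict() or as m['hours'].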
Dictionary that contains number of days per month:
d1 = dict ((
('Jan', 31), ('May', 31), ('Sep', 30),
('Feb', 28), ('Jun', 30), ('Oct', 31),
('Mar', 31), ('Jul', 31), ('Nov', 30),
('Apr', 30), ('Aug', 31), ('Dec', 31),
))
List all valid dates in string dates:
dates = '''
MON Februar 12 0:30 : 19 CST 2018
Tue Feb 33 00:30:19 CST 2018 # Invalid.
Wed Feb 29 00:30:19 CST 1900 # Invalid.
Thursda feb 29 00:30:19 CST 1944
'''
The list comprehension that does it all:
import re

L1 = [
'\n'.join(( str(m), m[0], str(m.groupdict()) ))
for m in re.finditer(reg2, dates, re.IGNORECASE|re.VERBOSE|re.ASCII)
for day in (m['day'].title(),) # Equivalent to assignment: day = m['day'].title()
if day[:3] in days
if day in days[day[:3]]
for month in ( m['month'].title() ,)
if month[:3] in months
if month in months[month[:3]]
for date in ( int(m['date']) ,) if date >= 1
for hours in ( int(m['hours']) ,) if hours <= 23
for minutes in ( int(m['minutes']) ,) if minutes <= 59
for seconds in ( int(m['seconds']) ,) if seconds <= 59
for zone in (m['time_zone'] ,) if zone.upper() in ('EST', 'EDT', 'CST', 'CDT', 'MST', 'MDT', 'PST', 'PDT' )
for year in ( int(m['year']) ,) if year >= 1900 and year <= 2020
for leap_year in ( # 'else' in a listcomp
( # equivalent to:
year % 4 == 0, # if year % 100 == 0:
year % 400 == 0 # leap_year = year % 400 == 0
)[year % 100 == 0] # else :
,) # leap_year = year % 4 == 0
for max_date in ( # if (month[:3] == 'Feb') and leap_year :
( # max_date = 29
d1[month[:3]], # else :
29 # max_date = d1[month[:3]]
)[(month[:3] == 'Feb') and leap_year] #
,)
if date <= max_date
]
print ( '\n\n'.join(L1) )
<_sre.SRE_Match object; span=(2, 35), match='MON Februar 12 0:30 : 19 CST 2018'>
MON Februar 12 0:30 : 19 CST 2018
{'day': 'MON', 'month': 'Februar', 'date': '12', 'hours': '0', 'minutes': '30', 'seconds': '19', 'time_zone': 'CST', 'year': '2018'}

<_sre.SRE_Match object; span=(159, 255), match='Thursda feb 29 >
Thursda feb 29 00:30:19 CST 1944
{'day': 'Thursda', 'month': 'feb', 'date': '29', 'hours': '00', 'minutes': '30', 'seconds': '19', 'time_zone': 'CST', 'year': '1944'}
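The tuple-indexing trick used for leap_year in the listcomp can be checked in isolation. The sketch below compares it with a conditional expression and with calendar.isleap from the standard library; the function names are illustrative.

```python
import calendar

def leap_by_indexing(year):
    # Both tuple members are evaluated; the boolean index (False=0, True=1) picks one.
    return (year % 4 == 0, year % 400 == 0)[year % 100 == 0]

def leap_by_conditional(year):
    # Conditional expression: only the selected branch is evaluated.
    return year % 400 == 0 if year % 100 == 0 else year % 4 == 0

years = range(1900, 2021)
agree = all(
    leap_by_indexing(y) == leap_by_conditional(y) == calendar.isleap(y)
    for y in years
)
print(agree)                                           # True
print(leap_by_indexing(1900), leap_by_indexing(1944))  # False True
```

Because a conditional expression is itself an expression, it can also appear inside a listcomp; the indexing form differs mainly in that it always evaluates both alternatives.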
Assignments
Further Reading or Review
References
1. Python's documentation: "5.1.3. List Comprehensions", "5.1.4. Nested List Comprehensions"