Python Programming/Strings

This lesson introduces Python string processing.

Objectives and Skills

Objectives and skills for this lesson include:

  • Standard Library
    • String operations

Readings

  1. Wikipedia: String (computer science)
  2. Python for Everyone: Strings

Multimedia

  1. YouTube: Python for Informatics - Chapter 6 - Strings
  2. YouTube: Python - Strings
  3. YouTube: Python - String Slicing
  4. YouTube: Python - String Formatting

Examples

len() function

The len() function returns the length (the number of items) of an object.[1]

string = "Test"

print("len(string):", len(string))

Output:

len(string): 4
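
As a small supplementary sketch, len() can also be applied to an empty string or to other sequences such as a list. The variable names below are illustrative and are not part of the lesson's own examples.

```python
# Illustrative values; these names are assumptions made for this sketch.
empty = ""
words = ["Python", "Programming", "Strings"]

print("len(empty):", len(empty))          # 0, an empty string has no characters
print("len(words):", len(words))          # 3, the number of items in the list
print("len(words[0]):", len(words[0]))    # 6, the characters in "Python"
```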

Strings

Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points.[2]

string = "Test"

print("Characters:")
for letter in string:
    print(letter)

print("\nCharacters by position:")
for i in range(len(string)):
    print("string[%d]: %c" % (i, string[i]))

Output:

Characters:
T
e
s
t

Characters by position:
string[0]: T
string[1]: e
string[2]: s
string[3]: t
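
Because strings are immutable, a character cannot be replaced by assigning to an index. The following sketch shows the error Python raises and one possible way to build a new string instead. The name changed is illustrative, and the slice syntax used here is covered under String Parsing.

```python
string = "Test"

try:
    string[0] = "B"    # str does not support item assignment
except TypeError as error:
    print("TypeError:", error)

# Build a new string rather than modifying the original one.
changed = "B" + string[1:]

print("string:", string)      # Test (unchanged)
print("changed:", changed)    # Best
```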

Membership Comparisons

The operators in and not in test for membership. x in s evaluates to true if x is a member of s, and false otherwise.[3]

alphabet = "abcdefghijklmnopqrstuvwxyz"
string = "Python Programming/Strings"

print("The string contains:")
for letter in alphabet:
    if letter in string.lower():
        print(letter)

Output:

The string contains:
a
g
h
i
m
n
o
p
r
s
t
y
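
For strings, membership is not limited to single characters: x in s is also true when x is a substring of s, and the comparison is case-sensitive. A brief sketch, reusing the string from the example above:

```python
string = "Python Programming/Strings"

print("'Python' in string:", "Python" in string)          # True
print("'python' in string:", "python" in string)          # False, case matters
print("'/' not in string:", "/" not in string)            # False
print("'Java' not in string:", "Java" not in string)      # True
```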

String Methods

Strings implement all of the common sequence operations, along with additional methods such as case validation and conversion.[4]

string = "Test"

print("string:", string)
print("string.isalpha():", string.isalpha())
print("string.islower():", string.islower())
print("string.isnumeric():", string.isnumeric())
print("string.isspace():", string.isspace())
print("string.istitle():", string.istitle())
print("string.isupper():", string.isupper())
print("string.lower():", string.lower())
print("string.strip():", string.strip())
print("string.swapcase():", string.swapcase())
print("string.title():", string.title())
print("string.upper():", string.upper())

Output:

string: Test
string.isalpha(): True
string.islower(): False
string.isnumeric(): False
string.isspace(): False
string.istitle(): True
string.isupper(): False
string.lower(): test
string.strip(): Test
string.swapcase(): tEST
string.title(): Test
string.upper(): TEST
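
The value "Test" has no surrounding whitespace, so strip() has no visible effect in the example above. The sketch below uses an illustrative input (raw is an assumed name) to show strip() removing leading and trailing whitespace, and shows that each method returns a new string that can be passed to another method.

```python
# Illustrative input with leading and trailing whitespace.
raw = "  hello WORLD \n"

cleaned = raw.strip()    # returns a new string; raw itself is unchanged

print("repr(raw):", repr(raw))
print("repr(cleaned):", repr(cleaned))                            # 'hello WORLD'
print("cleaned.title():", cleaned.title())                        # Hello World
print("cleaned.lower().islower():", cleaned.lower().islower())    # True
print("'42'.isnumeric():", "42".isnumeric())                      # True
print("'4.2'.isnumeric():", "4.2".isnumeric())                    # False, "." is not numeric
```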

String Parsing

Python substrings are referred to as slices. Slices are accessed using the syntax string[start:end], with the first character at index zero. The slice includes the characters from start up to, but not including, end. If end is omitted, the slice runs from start through the end of the string. Slices may also use negative indexes, which count backwards from the end of the string.[5] The find() method returns the lowest index in the string where a substring is found, optionally limiting the search to the slice given by its start and end arguments. It returns -1 if the substring is not found.[6]

string = "Python Programming/Strings"
index = string.find("/")

if index >= 0:
    project = string[0:index]
    page = string[index + 1:]

    print("Project:", project)
    print("Page:", page)

Output:

Project: Python Programming
Page: Strings
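
The example above uses only non-negative indexes. As a supplementary sketch, the following shows negative indexing, an omitted start or end, and the optional start argument of find(). The names first, second, and missing are illustrative.

```python
string = "Python Programming/Strings"

print("string[-1]:", string[-1])        # s, the last character
print("string[-7:]:", string[-7:])      # Strings, the last seven characters
print("string[:6]:", string[:6])        # Python, start omitted means index 0
print("string[7:-8]:", string[7:-8])    # Programming

first = string.find("P")                # 0
second = string.find("P", first + 1)    # 7, search begins after the first match
missing = string.find("?")              # -1, substring not found

print("first:", first)
print("second:", second)
print("missing:", missing)
```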

String Formatting

String objects have one unique built-in operation: the % operator (modulo). This is also known as the string formatting or interpolation operator. Given format % values (where format is a string), % conversion specifications in format are replaced with zero or more elements of values.[7]

value = 65.5

print("Value:", value)
print("Integer: %i" % value)
# In Python 3, %o, %x, and %c require an integer, so convert the float first.
print("Octal: %o" % int(value))
print("Hexadecimal: %x" % int(value))
print("Float: %.2f" % value)
print("Exponent: %.2e" % value)
print("Character: %c" % int(value))
print("String: %s" % value)
print("Multiple: %i, %o, %x, %.2f, %.2e, %c, %s" %
    (value, int(value), int(value), value, value, int(value), value))

Output:

Value: 65.5
Integer: 65
Octal: 101
Hexadecimal: 41
Float: 65.50
Exponent: 6.55e+01
Character: A
String: 65.5
Multiple: 65, 101, 41, 65.50, 6.55e+01, A, 65.5
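
The % operator produces a new string, so its result can be stored in a variable as well as printed. The sketch below, using illustrative names, shows a single value, a tuple of values, minimum field widths, and a literal percent sign written as %%.

```python
# Illustrative values; these names are assumptions made for this sketch.
name = "Test"
count = 3

single = "Name: %s" % name                                 # one value, no tuple needed
multiple = "%s has %d items" % (name, count)               # several values in a tuple
padded = "[%5d] [%-5d] [%05d]" % (count, count, count)     # width 5: right, left, zero-padded
percent = "%d%% complete" % 50                             # %% yields a literal percent sign

print(single)      # Name: Test
print(multiple)    # Test has 3 items
print(padded)      # [    3] [3    ] [00003]
print(percent)     # 50% complete
```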

str.format() Method

The str.format() method uses format strings that contain “replacement fields” surrounded by curly braces {}. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output.[8]

integer = 65
number = 65.5    # not named "float", which would shadow the built-in float type
string = "65.5"

print("Decimal: {:d}".format(integer))
print("Binary: {:b}".format(integer))
print("Octal: {:o}".format(integer))
print("Hexadecimal: {:x}".format(integer))
print("Character: {:c}".format(integer))
print("Float: {:.2f}".format(number))
print("Exponent: {:.2e}".format(number))
print("String: {:s}".format(string))
print("Multiple: {:d}, {:b}, {:o}, {:x}, {:c}, {:.2f}, {:.2e}, {:s}".format(
    integer, integer, integer, integer, integer, number, number, string))

Output:

Decimal: 65
Binary: 1000001
Octal: 101
Hexadecimal: 41
Character: A
Float: 65.50
Exponent: 6.55e+01
String: 65.5
Multiple: 65, 1000001, 101, 41, A, 65.50, 6.55e+01, 65.5
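
Replacement fields can also refer to arguments by position or by name, and doubled braces produce literal braces in the output. A minimal sketch with illustrative variable names:

```python
# Fields numbered by position may be reused.
by_position = "{0} and {1}, then {0} again".format("first", "second")

# Fields may also be filled from keyword arguments.
by_name = "{project}/{page}".format(project="Python Programming", page="Strings")

# > right-aligns and < left-aligns within a minimum width of 8.
aligned = "[{:>8}] [{:<8}]".format("right", "left")

# {{ and }} are copied to the output as literal braces.
literal = "{{}} marks a field; value = {}".format(42)

print(by_position)    # first and second, then first again
print(by_name)        # Python Programming/Strings
print(aligned)        # [   right] [left    ]
print(literal)        # {} marks a field; value = 42
```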

Activities

Tutorials

  1. Complete one or more of the following tutorials:

Practice

  1. Review Python.org: String methods. Create a Python program that asks the user for a line of text containing a first name and last name. Use string methods to parse the line and print out the name in the form last name, first initial, such as Lastname, F. Include a trailing period after the first initial. Ensure that the first letter of each name part is capitalized, and the rest of the last name is lower case. Include error handling in case the user does not enter exactly two name parts. Use a user-defined function for the actual string processing, separate from input and output.
  2. Review Python.org: String methods. Create a Python program that asks the user for a line of comma-separated values. It could be a sequence of test scores, names, or any other values. Use string methods to parse the line and print out each item on a separate line. Remove commas and any leading or trailing spaces from each item when printed. If the item is numeric, display it formatted as a floating point value with two decimal places. If the item is not numeric, display it as is.
  3. Review Python.org: String methods. Create a Python program that asks the user for a line of text that contains HTML tags, such as:
        <p><strong>This is a bold paragraph.</strong></p>
    Use string methods to search for and remove all HTML tags, and then print the remaining untagged text. Include error handling in case an HTML tag isn't entered correctly (an unmatched < or >). Use a user-defined function for the actual string processing, separate from input and output.
  4. Review Python.org: String methods. Create a Python program that asks the user for a line of text. Then ask the user for the number of characters to print in each line, the number of lines to be printed, and a scroll direction, right or left. Using the given line of text, duplicate the text as needed to fill the given number of characters per line. Then print the requested number of lines, shifting the entire line's content by one character, left or right, each time the line is printed. The character shifted off one end of the string is appended to the other end. For example, scrolling left:
        Repeat this. Repeat this. 
        epeat this. Repeat this. R
        peat this. Repeat this. Re

Lesson Summary

  • A string is traditionally a sequence of characters, either as a literal constant or as some kind of variable.[9]
  • A string variable may allow its elements to be mutated and the length changed, or it may be fixed (after creation).[10]
  • Python strings are immutable — they cannot be changed.[11]
  • The len() function returns the length (the number of items) of an object.[12]
  • Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points.[13]
  • The operators in and not in test for membership. x in s evaluates to true if x is a member of s, and false otherwise.[14]
  • String methods include: isalpha(), islower(), isnumeric(), isspace(), istitle(), isupper(), lower(), strip(), swapcase(), title(), and upper().[15]
  • Python substrings are referred to as slices. Slices are accessed using the syntax string[start:end], with the first character index starting at zero. The slice will include the characters from start up to but not including end. If end is omitted, the slice will include the characters from start through the end of the string.[16]
  • String slices may use negative indexing, in which case the index counts backwards from the end of the string.[17]
  • The find() method returns the lowest index in the string where a substring is found within the given slice, and returns -1 if the substring is not found.[18]
  • String objects have one unique built-in operation: the % operator (modulo). This is also known as the string formatting or interpolation operator. Given format % values (where format is a string), % conversion specifications in format are replaced with zero or more elements of values.[19]
  • The str.format() method uses format strings that contain “replacement fields” surrounded by curly braces {}. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output.[20]

Key Terms

counter
A variable used to count something, usually initialized to zero and then incremented.[21]
empty string
A string with no characters and length 0, represented by two quotation marks.[22]
format operator
An operator, %, that takes a format string and a tuple and generates a string that includes the elements of the tuple formatted as specified by the format string.[23]
format sequence
A sequence of characters in a format string, like %d, that specifies how a value should be formatted.[24]
format string
A string, used with the format operator, that contains format sequences.[25]
flag
A boolean variable used to indicate whether a condition is true.[26]
invocation
A statement that calls a method.[27]
immutable
The property of a sequence whose items cannot be assigned.[28]
index
An integer value used to select an item in a sequence, such as a character in a string.[29]
item
One of the values in a sequence.[30]
method
A function that is associated with an object and called using dot notation.[31]
object
Something a variable can refer to. For now, you can use “object” and “value” interchangeably.[32]
search
A pattern of traversal that stops when it finds what it is looking for.[33]
sequence
An ordered set; that is, a set of values where each value is identified by an integer index.[34]
slice
A part of a string specified by a range of indices.[35]
traverse
To iterate through the items in a sequence, performing a similar operation on each.[36]
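
Several of the terms above (traverse, counter, search, index) can be illustrated together. The following is a minimal sketch; the function names count_letter and search are hypothetical and are not taken from the lesson or from the Python standard library.

```python
def count_letter(text, letter):
    """Traverse text, using a counter to tally occurrences of letter."""
    count = 0                      # counter, initialized to zero
    for character in text:         # traversal of the sequence
        if character == letter:
            count += 1             # increment the counter
    return count


def search(text, letter):
    """Traverse text by index and stop as soon as letter is found."""
    for index in range(len(text)):
        if text[index] == letter:
            return index           # the search stops at the first match
    return -1                      # not found, mirroring str.find()


print(count_letter("Programming", "m"))    # 2
print(search("Programming", "g"))          # 3
print(search("Programming", "z"))          # -1
```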

Review Questions

  1. A string is _____.
    A string is traditionally a sequence of characters, either as a literal constant or as some kind of variable.
  2. A string variable may _____.
    A string variable may allow its elements to be mutated and the length changed, or it may be fixed (after creation).
  3. Python strings are _____ — they cannot be changed.
    Python strings are immutable — they cannot be changed.
  4. The len() function returns _____.
    The len() function returns the length (the number of items) of an object.
  5. Textual data in Python is handled with _____.
    Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points.
  6. The operators _____ and _____ test for membership. x _____ s evaluates to true if x is a member of s, and false otherwise.
    The operators in and not in test for membership. x in s evaluates to true if x is a member of s, and false otherwise.
  7. String methods include:
    String methods include: isalpha(), islower(), isnumeric(), isspace(), istitle(), isupper(), lower(), strip(), swapcase(), title(), and upper().
  8. Python substrings are referred to as _____.
    Python substrings are referred to as slices.
  9. Slices are accessed using the syntax _____.
    Slices are accessed using the syntax string[start:end].
  10. The first character index in a slice starts at _____. The slice will include _____. If end is omitted, the slice will include _____.
    The first character index in a slice starts at zero. The slice will include the characters from start up to but not including end. If end is omitted, the slice will include the characters from start through the end of the string.
  11. String slices may use negative indexing, in which case _____.
    String slices may use negative indexing, in which case the index counts backwards from the end of the string.
  12. The find() method returns _____.
    The find() method returns the lowest index in the string where a substring is found within the given slice, and returns -1 if the substring is not found.
  13. String objects have one unique built-in operation: the % operator (modulo). This is also known as _____.
    String objects have one unique built-in operation: the % operator (modulo). This is also known as the string formatting or interpolation operator. Given format % values (where format is a string), % conversion specifications in format are replaced with zero or more elements of values.
  14. The str.format() method uses format strings that contain “replacement fields” surrounded by _____.
    The str.format() method uses format strings that contain “replacement fields” surrounded by curly braces {}. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output.

Assessments

See Also

References