
Lesson 4 Strings

Pragmatic AI Labs


This notebook was produced by Pragmatic AI Labs.

4.1 Use string methods

String Quoting

Single quotes

'Here is a string'
'Here is a string'

Double quotes

"Here is a string" == 'Here is a string'
True

Triple-Quoted Strings

a_very_large_phrase = """
Wikipedia is hosted by the Wikimedia Foundation, 
a non-profit organization that also hosts a range of other projects.
"""

print(a_very_large_phrase)

Wikipedia is hosted by the Wikimedia Foundation, 
a non-profit organization that also hosts a range of other projects.
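
One detail the printed output above hints at: a triple-quoted string that starts on the line after the opening quotes begins with a newline character. The sketch below uses a made-up two-line phrase (not the Wikipedia text) to show this, and shows that a backslash right after the opening quotes suppresses that first newline.

```python
# A triple-quoted string keeps every line break, including the one
# immediately after the opening quotes.
phrase = """
line one
line two
"""
starts_with_newline = phrase.startswith("\n")

# A backslash directly after the opening quotes continues the line,
# so the string starts with the first line of text instead.
continued = """\
line one
line two
"""

print(repr(phrase))
print(repr(continued))
```

This is why the printed phrase above is preceded and followed by a blank line.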


Raw Strings

jon_jones = '...wrote on twitter he is the greatest "heavyw8e! \nfighter of all time'
print(jon_jones)
...wrote on twitter he is the greatest "heavyw8e! 
fighter of all time

jon_jones = r'...wrote on twitter he is the greatest "heavyw8e! \nfighter of all time'
print(jon_jones)
...wrote on twitter he is the greatest "heavyw8e! \nfighter of all time
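
To make the difference concrete, the sketch below compares the lengths of a regular and a raw string with the same source text. It also shows one common use of raw strings, regular expression patterns, using the standard `re` module. The sample text is invented for illustration.

```python
import re

# A regular string turns \n into a single newline character.
cooked = 'first\nsecond'
# A raw string keeps the backslash and the letter n as two characters.
raw = r'first\nsecond'

cooked_length = len(cooked)   # the newline counts as one character
raw_length = len(raw)         # backslash and n count separately

# Raw strings are often used for regular expression patterns,
# where backslashes are part of the pattern syntax.
numbers = re.findall(r'\d+', 'room 12, floor 3')

print(cooked_length, raw_length, numbers)
```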

Case Manipulation

captain = "Patrick Tayluer"

captain
'Patrick Tayluer'
captain.capitalize()
'Patrick tayluer'
captain.lower()
'patrick tayluer'
captain.upper()
'PATRICK TAYLUER'
captain.swapcase()
'pATRICK tAYLUER'
captain = 'patrick tayluer'
captain.title()
'Patrick Tayluer'

Interrogation

river = 'Mississippi'

len(river)
11
river.count('s')
4
river.index('pp')
8
river.index('r')

    ---------------------------------------------------------------------------

    ValueError                                Traceback (most recent call last)

    <ipython-input-67-fcd85454de2b> in <module>()
    ----> 1 river.index('r')
    

    ValueError: substring not found


river.find('r')
-1
river.startswith('M')
True
river.endswith('i')
True
'sip' in river
True
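
Because `index` raises `ValueError` for a missing substring while `find` returns -1, code that searches for text that may be absent usually picks one of two patterns. The sketch below shows both; `position_or_none` is a hypothetical helper name, not part of Python.

```python
river = 'Mississippi'

def position_or_none(text, substring):
    """Return the first index of substring in text, or None if absent."""
    position = text.find(substring)   # find() returns -1 instead of raising
    return None if position == -1 else position

found = position_or_none(river, 'pp')
missing = position_or_none(river, 'r')

# index() raises ValueError for a missing substring, so guard it.
try:
    river.index('r')
    raised = False
except ValueError:
    raised = True

print(found, missing, raised)
```

Returning `None` rather than -1 avoids a subtle bug: -1 is a valid index (the last character), so using it directly in a slice or subscript would silently give a wrong result.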

Content Type

'abc123'.isalpha()
False
'abc123'.isalnum()
True
'lowercase'.islower()
True
'lowercase'.isupper()
False
'The Good Ship'.istitle()
True
'The bad seed'.istitle()
False
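
These content checks are often combined to validate input. The sketch below assumes a made-up rule (letters and digits only, starting with a letter); `looks_like_username` is a hypothetical name used only for illustration.

```python
def looks_like_username(candidate):
    """Hypothetical rule: only letters and digits, and the first is a letter."""
    # isalnum() is False for the empty string, so candidate[0] is only
    # evaluated when there is at least one character.
    return candidate.isalnum() and candidate[0].isalpha()

results = {
    name: looks_like_username(name)
    for name in ['abc123', '123abc', 'abc 123', '']
}
print(results)
```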

More information: String Methods

4.2 Format strings

F-strings were introduced in Python 3.6. They are prefixed with either 'F' or 'f' before the opening quotation mark. Values are inserted into f-strings at runtime using replacement fields, which are delimited by curly braces.

Insert variable into replacement field

strings_count = 5
frets_count = 24
f"Noam Pikelny's banjo has {strings_count} strings and {frets_count} frets"
"Noam Pikelny's banjo has 5 strings and 24 frets"

Insert expression into replacement field

a = 12
b = 32
f"{a} times {b} equals {a*b}"
'12 times 32 equals 384'

Index list in string replacement fields

players = ["Tony Trischka", "Bill Evans", "Alan Munde"]
f"Performances will be held by {players[1]}, {players[0]}, and {players[2]}"
'Performances will be held by Bill Evans, Tony Trischka, and Alan Munde'

Conversion flags

A conversion flag can be specified to convert the value before it is formatted. The three available flags are '!s', '!r', and '!a', which call str(), repr(), and ascii() on the value, respectively.

Using str conversion

nuts = [1,2,3,4,5]
f"Calling str() on the list {nuts} produces {nuts!s}"
'Calling str() on the list [1, 2, 3, 4, 5] produces [1, 2, 3, 4, 5]'

Using repr conversion

nut = 'pistachio'
f"Calling repr on the string {nut} results in {nut!r}"
"Calling repr on the string pistachio results in 'pistachio'"

Using ascii conversion

check = "√"
f"The ascii version of {check} is {check!a}"
"The ascii version of √ is '\\u221a'"

Padding a number

lucky_num = 13
f"To pad the number {lucky_num} to 5 places:{lucky_num:5d}"
'To pad the number 13 to 5 places:   13'

Setting padding value at runtime

lucky_num = 13
padding = 5
f"To pad the number {lucky_num} to {padding} places:{lucky_num:{padding}d}"
'To pad the number 13 to 5 places:   13'
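
The part after the colon in a replacement field is a format specification, and padding is only one of its uses. The sketch below shows a few other common specifications; the variable names and values are made up for illustration.

```python
price = 1234.5678
name = "banjo"

two_decimals = f"{price:.2f}"     # fixed-point with two decimal places
with_commas = f"{price:,.2f}"     # thousands separator plus two decimals
left_aligned = f"{name:<8}|"      # pad on the right to width 8
right_aligned = f"{name:>8}|"     # pad on the left to width 8
centered = f"{name:*^9}"          # center in width 9, filling with '*'
zero_padded = f"{13:05d}"         # pad with leading zeros to width 5

for value in (two_decimals, with_commas, left_aligned,
              right_aligned, centered, zero_padded):
    print(value)
```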

More information: Format String Syntax

Other String Formatting: String Format Method

Old Style String Formatting
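
For comparison, the sketch below builds the same sentence with an f-string, with the `str.format` method, and with old-style `%` formatting. The sentence is a shortened variant of the banjo example, chosen only for illustration.

```python
strings_count = 5
frets_count = 24

# f-string: values are read from the surrounding scope.
f_string = f"The banjo has {strings_count} strings and {frets_count} frets"

# str.format: values are passed positionally or by keyword.
positional = "The banjo has {} strings and {} frets".format(strings_count, frets_count)
named = "The banjo has {s} strings and {f} frets".format(s=strings_count, f=frets_count)

# Old-style formatting: %d marks an integer slot, filled from a tuple.
old_style = "The banjo has %d strings and %d frets" % (strings_count, frets_count)

print(f_string == positional == named == old_style)
```

One practical difference: the `str.format` and `%` templates are ordinary strings, so they can be stored and filled in later, while an f-string is evaluated where it is written.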

4.3 Manipulate strings

Concatenation

"Bob" + "beroo"
'Bobberoo'
"AB" * 8
'ABABABABABABABAB'

Remove Whitespace

ship = " The Yankee Clipper "
ship
' The Yankee Clipper '
ship.strip()
'The Yankee Clipper'
ship.lstrip()
'The Yankee Clipper '
ship.rstrip()
' The Yankee Clipper'
ship.rstrip("per ")
' The Yankee Cli'
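
The last call is worth a closer look: the argument to `strip`, `lstrip`, and `rstrip` is treated as a set of characters to remove, not as a suffix. The sketch below contrasts that with removing an exact ending; `drop_suffix` is a hypothetical helper (Python 3.9 and later also provide `str.removesuffix` for this purpose).

```python
ship = " The Yankee Clipper "

# The argument is a set of characters, so its order does not matter.
same_result = ship.rstrip("per ") == ship.rstrip(" rep")

def drop_suffix(text, suffix):
    """Remove suffix once if text ends with it; otherwise return text unchanged."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

trimmed = drop_suffix("Clipper", "per")   # removes the exact ending 'per'
stripped = "Clipper".rstrip("per")        # strips every trailing p, e, or r

print(same_result, trimmed, stripped)
```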

Add padding

port = "Boston"
port.center(12, '*')
'***Boston***'
port.ljust(12, '*')
'Boston******'
port.rjust(12, '*')
'******Boston'
for port_city in ['Liverpool', 'Boston', 'New York', 'Philadelphia']:
  print(port_city.rjust(12))
   Liverpool
      Boston
    New York
Philadelphia

'-5'.zfill(4)
'-005'

Replace

"FILADELFIA".replace("F", "PH")
'PHILADELPHIA'

Splitting and Joining

words_string = "Here,Are,Some,Words"
words_string
'Here,Are,Some,Words'

Split on comma

words = words_string.split(',')
words
['Here', 'Are', 'Some', 'Words']

Joining

':'.join(words)
'Here:Are:Some:Words'

Split on newline

multiline = "Sometimes we are given\na multiline document\nas a single string"
multiline
'Sometimes we are given\na multiline document\nas a single string'
for line in multiline.splitlines():
  print(line)
Sometimes we are given
a multiline document
as a single string
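
A few variations on splitting and joining come up often. The sketch below shows splitting on any whitespace, limiting the number of splits, and joining values that are not strings. The sample strings are made up for illustration.

```python
messy = "  Here   are  some\twords \n"

# With no argument, split() breaks on runs of whitespace and drops empty pieces.
by_whitespace = messy.split()

# With an explicit separator, consecutive separators produce empty strings.
by_space = "a  b".split(" ")

# The second argument limits how many splits are made.
limited = "Here,Are,Some,Words".split(",", 1)

# join() reverses split() when the same separator is used.
round_trip = ",".join("Here,Are,Some,Words".split(","))

# join() accepts only strings, so convert other values first.
joined_numbers = "-".join(str(n) for n in [1, 2, 3])

print(by_whitespace, by_space, limited, round_trip, joined_numbers)
```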

Slicing

collector = "William Main Doerflinger"
collector[0]
'W'
collector[-1]
'r'
collector[13:18]
'Doerf'
collector[-7:]
'flinger'
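
Slices also accept a step, tolerate out-of-range bounds, and never modify the original string. The sketch below reuses the `collector` string to show these behaviors.

```python
collector = "William Main Doerflinger"

first_name = collector[:7]       # omitted start defaults to 0
every_second = collector[:7:2]   # step of 2 within the first seven characters
reversed_name = collector[::-1]  # negative step walks backwards
tail = collector[20:100]         # out-of-range slice bounds are clipped

# A single out-of-range index, unlike a slice, raises IndexError.
try:
    collector[100]
    index_failed = False
except IndexError:
    index_failed = True

# Strings are immutable, so item assignment raises TypeError.
try:
    collector[0] = "w"
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Build a new string from slices instead.
lowercase_first = "w" + collector[1:]

print(first_name, every_second, reversed_name, tail, lowercase_first)
```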

More information: common sequence operations

4.4 Learn to use Unicode

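There are multiple encodings possible for mapping characters to bytes. Python 3 strings are sequences of Unicode code points, and UTF-8 is the default encoding for source files and for str.encode(). Python 2 strings were byte strings, and ASCII was the default encoding.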

Encode

twice_pie = 'ππ'
twice_pie
'ππ'
twice_π = twice_pie
twice_π
'ππ'
pie = "\N{GREEK CAPITAL LETTER PI}"
pie
'Π'
ord(pie)
928
chr(928)
'Π'
u = chr(40960) + 'abcd' + chr(1972)
u.encode('utf-8')
b'\xea\x80\x80abcd\xde\xb4'
u
'ꀀabcd\u07b4'
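
Encoding turns a string into bytes, and decoding turns the bytes back into a string. The sketch below round-trips the `twice_pie` string through UTF-8, compares character and byte counts, and shows what happens with an encoding that cannot represent the characters.

```python
twice_pie = 'ππ'

encoded = twice_pie.encode('utf-8')   # str -> bytes
decoded = encoded.decode('utf-8')     # bytes -> str

char_count = len(twice_pie)   # number of code points
byte_count = len(encoded)     # each π takes two bytes in UTF-8

# ASCII has no representation for π, so strict encoding fails.
try:
    twice_pie.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# errors='replace' substitutes '?' for each character that cannot be encoded.
replaced = twice_pie.encode('ascii', errors='replace')

print(encoded, decoded, char_count, byte_count, ascii_failed, replaced)
```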

Saving File in Unicode

with open("new_file.txt", "w", encoding='utf-8') as opened_file:
  opened_file.write("Søme Unˆcode text")
  
!cat new_file.txt
  
Søme Unˆcode text
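
Reading the file back works the same way: pass the same `encoding` that was used for writing. The sketch below writes to a temporary directory (an assumption made here so the example cleans up after itself) and checks that the text survives the round trip.

```python
import os
import tempfile

text = "Søme Unˆcode text"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "new_file.txt")

    # Write with an explicit encoding rather than relying on the platform default.
    with open(path, "w", encoding="utf-8") as opened_file:
        opened_file.write(text)

    # Read with the same encoding to get the original characters back.
    with open(path, "r", encoding="utf-8") as opened_file:
        read_back = opened_file.read()

    # The size on disk is counted in bytes, not characters.
    size_in_bytes = os.path.getsize(path)

print(read_back == text, len(text), size_in_bytes)
```

The file is larger in bytes than the string is in characters because each non-ASCII character takes more than one byte in UTF-8.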

Unicode

Notes