
Lesson 4 Strings

Pragmatic AI Labs


This notebook was produced by Pragmatic AI Labs. You can continue learning about these topics by:

4.1 Use string methods

String Quoting

Single quotes

'Here is a string'
'Here is a string'

Double quotes

"Here is a string" == 'Here is a string'
True

Triple-Quoted Strings

a_very_large_phrase = """
Wikipedia is hosted by the Wikimedia Foundation, 
a non-profit organization that also hosts a range of other projects.
"""

print(a_very_large_phrase)

Wikipedia is hosted by the Wikimedia Foundation, 
a non-profit organization that also hosts a range of other projects.


Raw Strings

jon_jones = '...wrote on twitter he is the greatest "heavyw8e! \nfighter of all time'
print(jon_jones)
...wrote on twitter he is the greatest "heavyw8e! 
fighter of all time

jon_jones = r'...wrote on twitter he is the greatest "heavyw8e! \nfighter of all time'
print(jon_jones)
...wrote on twitter he is the greatest "heavyw8e! \nfighter of all time
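
Raw strings are most useful where backslashes are meant literally, for example in Windows paths and regular expression patterns. The following is a minimal sketch; the path, pattern, and sample text are made-up illustrations and are not part of the lesson.

```python
import re

# Hypothetical Windows path: without the r prefix, \n and \t would be
# interpreted as a newline and a tab.
windows_path = r'C:\new_folder\table.txt'
print(windows_path)

# In a raw string, backslash-n stays as two separate characters.
print(len(r'\n'), len('\n'))  # 2 1

# Hypothetical pattern: \b and \d reach the re module untouched.
pattern = r'\b\d{3}\b'
matches = re.findall(pattern, 'rooms 101, 7 and 2048')
print(matches)  # ['101']
```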

Case Manipulation

captain = "Patrick Tayluer"

captain
'Patrick Tayluer'
captain.capitalize()
'Patrick tayluer'
captain.lower()
'patrick tayluer'
captain.upper()
'PATRICK TAYLUER'
captain.swapcase()
'pATRICK tAYLUER'
captain = 'patrick tayluer'
captain.title()
'Patrick Tayluer'

Interrogation

river = 'Mississippi'

len(river)
11
river.count('s')
4
river.index('pp')
8
river.index('r')

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-67-fcd85454de2b> in <module>()
    ----> 1 river.index('r')

    ValueError: substring not found


river.find('r')
-1
river.startswith('M')
True
river.endswith('i')
True
'sip' in river
True
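
The cells above show that index() raises a ValueError for a missing substring while find() returns -1. A short sketch of how each behaviour might be handled in practice; the variable names position and last_s are illustrative only.

```python
river = 'Mississippi'

# find() returns -1 when the substring is absent, so compare against -1
# explicitly: index 0 is a valid match and would be falsy.
position = river.find('r')
if position == -1:
    print("'r' not found")

# index() raises ValueError instead, which suits try/except flows.
try:
    river.index('r')
except ValueError as error:
    print(f"index() raised: {error}")

# rfind() reports the position of the last occurrence.
last_s = river.rfind('s')
print(last_s)  # 6
```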

Content Type

'abc123'.isalpha()
False
'abc123'.isalnum()
True
'lowercase'.islower()
True
'lowercase'.isupper()
False
'The Good Ship'.istitle()
True
'The bad seed'.istitle()
False

More information: String Methods

4.2 Format strings

F-strings were introduced in Python 3.6. They are prefixed by either an ‘F’ or an ‘f’ before the opening quotation mark. Values can be inserted into f-strings at runtime using replacement fields, which are delimited by curly braces.

Insert variable into replacement field

strings_count = 5
frets_count = 24
f"Noam Pikelny's banjo has {strings_count} strings and {frets_count} frets"
"Noam Pikelny's banjo has 5 strings and 24 frets"

Insert expression into replacement field

a = 12
b = 32
f"{a} times {b} equals {a*b}"
'12 times 32 equals 384'

Index list in string replacement fields

players = ["Tony Trischka", "Bill Evans", "Alan Munde"]
f"Performances will be held by {players[1]}, {players[0]}, and {players[2]}"
'Performances will be held by Bill Evans, Tony Trischka, and Alan Munde'

Conversion flags

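A conversion flag can be specified to convert the value before formatting. It is written after an exclamation mark inside the replacement field. The three available flags are ‘s’, ‘r’ and ‘a’, which call str(), repr() and ascii() on the value, respectively.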

Using str conversion

nuts = [1, 2, 3, 4, 5]
f"Calling str() on the list {nuts} produces {nuts!s}"
'Calling str() on the list [1, 2, 3, 4, 5] produces [1, 2, 3, 4, 5]'

Using repr conversion

nut = 'pistachio'
f"Calling repr on the string {nut} results in {nut!r}"
"Calling repr on the string pistachio results in 'pistachio'"

Using ascii conversion

check = "√"
f"The ascii version of {check} is {check!a}"
"The ascii version of √ is '\\u221a'"

Padding a number

lucky_num = 13
f"To pad the number {lucky_num} to 5 places:{lucky_num:5d}"
'To pad the number 13 to 5 places:   13'

Setting padding value at runtime

lucky_num = 13
padding = 5
f"To pad the number {lucky_num} to {padding} places:{lucky_num:{padding}d}"
'To pad the number 13 to 5 places:   13'
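
Padding is only one use of the format specification that follows the colon in a replacement field. The sketch below shows a few other common specifications (precision, thousands separator, percentage, alignment); the values price, share, and label are invented for illustration.

```python
price = 1234.5678
share = 0.256
label = 'banjo'

print(f"{price:.2f}")     # two decimal places: 1234.57
print(f"{price:,.2f}")    # thousands separator: 1,234.57
print(f"{share:.1%}")     # percentage: 25.6%
print(f"[{label:<10}]")   # left-aligned in 10 columns
print(f"[{label:^10}]")   # centered in 10 columns
print(f"[{label:*>10}]")  # right-aligned, filled with '*': [*****banjo]
```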

More information: Format String Syntax

Other String Formatting: String Format Method

Old Style String Formatting
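
For comparison, here is a brief sketch of the same sentence fragment built with an f-string, the str.format() method, and old-style % formatting. It reuses the banjo variables from earlier in this section; the with_* names are illustrative only.

```python
strings_count = 5
frets_count = 24

# f-string (Python 3.6+)
with_fstring = f"{strings_count} strings and {frets_count} frets"

# str.format() with positional and with named replacement fields
with_format = "{} strings and {} frets".format(strings_count, frets_count)
with_named = "{s} strings and {f} frets".format(s=strings_count, f=frets_count)

# Old-style % formatting
with_percent = "%d strings and %d frets" % (strings_count, frets_count)

print(with_fstring)
print(with_fstring == with_format == with_named == with_percent)  # True
```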

4.3 Manipulate strings

Concatenation

"Bob" + "beroo"
'Bobberoo'
"AB" * 8
'ABABABABABABABAB'

Remove Whitespace

ship = " The Yankee Clipper "
ship
' The Yankee Clipper '
ship.strip()
'The Yankee Clipper'
ship.lstrip()
'The Yankee Clipper '
ship.rstrip()
' The Yankee Clipper'
ship.rstrip("per ")
' The Yankee Cli'
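
The argument to rstrip() in the last cell is treated as a set of characters to remove, not as a suffix. A small sketch of the difference follows; the filename example is hypothetical.

```python
ship = " The Yankee Clipper "

# Every trailing character in the set {'p', 'e', 'r', ' '} is removed,
# in any order, until a character outside the set is reached.
print(repr(ship.rstrip("per ")))  # ' The Yankee Cli'
print(repr(ship.rstrip(" rep")))  # same result: the order of the set is irrelevant

# A common surprise: this strips the characters '.', 't' and 'x',
# so the final 't' of 'report' is removed too.
filename = "report.txt"
stripped = filename.rstrip(".txt")
print(stripped)  # repor

# To remove an exact suffix, check for it and slice.
# (Python 3.9+ also offers str.removesuffix for this.)
suffix = ".txt"
without_suffix = filename[:-len(suffix)] if filename.endswith(suffix) else filename
print(without_suffix)  # report
```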

Add padding

port = "Boston"
port.center(12, '*')
'***Boston***'
port.ljust(12, '*')
'Boston******'
port.rjust(12, '*')
'******Boston'
for port_city in ['Liverpool', 'Boston', 'New York', 'Philadelphia']:
    print(port_city.rjust(12))
   Liverpool
      Boston
    New York
Philadelphia

'-5'.zfill(4)
'-005'

Replace

"FILADELFIA".replace("F", "PH")
'PHILADELPHIA'

Splitting and Joining

words_string = "Here,Are,Some,Words"
words_string
'Here,Are,Some,Words'

Split on comma

words = words_string.split(',')
words
['Here', 'Are', 'Some', 'Words']

Joining

':'.join(words)
'Here:Are:Some:Words'
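
A few variations on split() and join() that come up often, sketched with made-up data: limiting the number of splits, splitting on runs of whitespace, and joining values that are not already strings.

```python
# Hypothetical key=value record whose value itself contains '='.
record = "name=Noam Pikelny=banjo"
key, value = record.split('=', 1)  # split at most once
print(key)    # name
print(value)  # Noam Pikelny=banjo

# With no argument, split() breaks on any run of whitespace
# and ignores leading and trailing whitespace.
spaced = "  several   spaced   words "
print(spaced.split())  # ['several', 'spaced', 'words']

# join() accepts only strings, so convert other values first.
numbers = [1, 2, 3]
joined = ', '.join(str(n) for n in numbers)
print(joined)  # 1, 2, 3
```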

Split on newline

multiline = "Sometimes we are given\na multiline document\nas a single string"
multiline
'Sometimes we are given\na multiline document\nas a single string'
for line in multiline.splitlines():
    print(line)
Sometimes we are given
a multiline document
as a single string

Slicing

collector = "William Main Doerflinger"
collector[0]
'W'
collector[-1]
'r'
collector[13:18]
'Doerf'
collector[-7:]
'flinger'
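
Slices also accept an omitted start, a step, and out-of-range bounds. Because strings are immutable, a slice always produces a new string. A short sketch using the same collector string; the other variable names are illustrative.

```python
collector = "William Main Doerflinger"

first_name = collector[:7]       # omitted start means "from the beginning"
middle_name = collector[8:12]
reversed_name = collector[::-1]  # a step of -1 walks the string backwards
tail = collector[20:100]         # out-of-range slice ends are clipped, no error
print(first_name, middle_name, tail)  # William Main nger

# Strings are immutable: item assignment raises TypeError.
try:
    collector[0] = 'B'
except TypeError as error:
    print(f"cannot assign: {error}")

# Build a new string from slices instead.
renamed = 'B' + collector[1:]
print(renamed)  # Billiam Main Doerflinger
```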

More information: common sequence operations

4.4 Learn to use unicode

There are multiple encodings for mapping characters to bytes. Python 3 strings are sequences of Unicode code points, and UTF-8 is the default encoding for source files and for str.encode(). Python 2 defaulted to the more limited ASCII encoding.

Encode

twice_pie = 'ππ'
twice_pie
'ππ'
twice_π = twice_pie
twice_π
'ππ'
pie = "\N{GREEK CAPITAL LETTER PI}"
pie
'Π'
ord(pie)
928
chr(928)
'Π'
u = chr(40960) + 'abcd' + chr(1972)
u.encode('utf-8')
u
'ꀀabcd\u07b4'
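
In the cell above the result of u.encode('utf-8') is discarded, so only the string itself is displayed. The sketch below keeps the encoded bytes and shows the round trip back to a string; the names encoded, decoded, and ascii_fallback are illustrative.

```python
u = chr(40960) + 'abcd' + chr(1972)

# encode() turns the string into bytes; non-ASCII characters take
# more than one byte each in UTF-8.
encoded = u.encode('utf-8')
print(encoded)               # b'\xea\x80\x80abcd\xde\xb4'
print(len(u), len(encoded))  # 6 9

# decode() reverses the mapping and recovers the original string.
decoded = encoded.decode('utf-8')
print(decoded == u)          # True

# ASCII cannot represent these characters; the errors argument
# controls what happens instead of raising UnicodeEncodeError.
ascii_fallback = u.encode('ascii', errors='replace')
print(ascii_fallback)        # b'?abcd?'
```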

Saving File in Unicode

with open("new_file.txt", "w", encoding='utf-8') as opened_file:
    opened_file.write("Søme Unˆcode text")
  
!cat new_file.txt
  
Søme Unˆcode text

Unicode

Notes