Lesson 8: Functions
Pragmatic AI Labs
This notebook was produced by Pragmatic AI Labs. You can continue learning about these topics by:
- Buying a copy of Pragmatic AI: An Introduction to Cloud-Based Machine Learning
- Reading an online copy of Pragmatic AI: An Introduction to Cloud-Based Machine Learning
- Watching the video Essential Machine Learning and AI with Python and Jupyter Notebook on Safari Books Online
- Watching the video AWS Certified Machine Learning - Specialty
- Purchasing the video Essential Machine Learning and AI with Python and Jupyter Notebook
- Viewing more content at noahgift.com
8.1 Write and use functions
Building blocks of distributed computing
def work(input):
    """Processes input and returns output"""
    output = input + 1
    return output
work(1)
2
Key Components of Functions
Docstrings
def docstring():
    """Triple Quoted documentation!"""
docstring?
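The `docstring?` syntax only works in IPython and Jupyter. In plain Python the same text can be read from the function object. A minimal sketch, where `area` is a hypothetical function used only for illustration:

```python
import inspect

def area(width, height):
    """Return the area of a rectangle."""
    return width * height

# The docstring is stored on the function object itself
print(area.__doc__)          # Return the area of a rectangle.

# inspect.getdoc() returns the same text with indentation cleaned up
print(inspect.getdoc(area))  # Return the area of a rectangle.
```

Calling `help(area)` in a regular Python shell shows the signature together with this docstring.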
Arguments: Keyword and Positional
- Positional: Order based processing
- Keyword: Key/Value processing
Positional
def positional(first, second, third):
    """Processes arguments to function in order"""
    print(f"Processed first {first}")
    print(f"Processed second {second}")
    print(f"Processed third {third}")
positional(1, 2, 3)
Processed first 1
Processed second 2
Processed third 3
positional(2, 3, 1)
Processed first 2
Processed second 3
Processed third 1
Keyword
def keyword(first=1, second=2, third=3):
    """Processed in any order"""
    print(f"Processed first {first}")
    print(f"Processed second {second}")
    print(f"Processed third {third}")
keyword(1,2,3)
Processed first 1
Processed second 2
Processed third 3
keyword(second=2, third=3, first=1)
Processed first 1
Processed second 2
Processed third 3
keyword(second=2)
Processed first 1
Processed second 2
Processed third 3
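Positional and keyword arguments can be mixed in one call, as long as the positional ones come first. The sketch below uses a hypothetical `order` function (not part of the lesson) and also shows a keyword-only parameter, which is declared after a bare `*`:

```python
def order(item, quantity=1, *, gift_wrap=False):
    """item is positional, quantity has a default, gift_wrap is keyword-only."""
    return f"{quantity} x {item}, gift_wrap={gift_wrap}"

print(order("book"))                  # 1 x book, gift_wrap=False
print(order("book", 2))               # 2 x book, gift_wrap=False
print(order("book", gift_wrap=True))  # 1 x book, gift_wrap=True
print(order(quantity=3, item="pen"))  # 3 x pen, gift_wrap=False

# Passing gift_wrap positionally fails because of the bare * in the signature
try:
    order("book", 2, True)
except TypeError as err:
    print(f"TypeError: {err}")
```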
Return
Default is None
def bridge_to_nowhere():
    pass
bridge_to_nowhere() is None
True
type(bridge_to_nowhere())
NoneType
Most useful functions return something
def more_than_zero():
    return 1
more_than_zero() == 1
True
Functions can return functions
def inner_peace():
    """A deep function"""
    def peace():
        return "piece"
    return peace
inner = inner_peace()
print(f"Hey, I need that {inner()}")
Hey, I need that piece
inner2 = inner_peace()
type(inner2)
function
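Returning a function becomes more useful when the outer function takes an argument that configures the inner one. A minimal sketch, with the hypothetical names `make_multiplier`, `double`, and `triple`:

```python
def make_multiplier(factor):
    """Return a function that multiplies its input by factor."""
    def multiply(value):
        return value * factor
    return multiply

double = make_multiplier(2)
triple = make_multiplier(3)

print(double(5))  # 10
print(triple(5))  # 15
```

Each call to `make_multiplier` produces a separate function that remembers its own `factor`.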
8.2 Write and use decorators
Using Decorators
Decorators are commonly used to register or wrap a function for:
- Command-line tools
- Web Routes
- Speeding up Python code
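To see roughly what a dispatching decorator does, here is a standard-library-only sketch. It is not how click or Flask are implemented; `ROUTES`, `route`, and `dispatch` are hypothetical names chosen for illustration:

```python
ROUTES = {}  # maps a path string to the function registered for it

def route(path):
    """Return a decorator that registers a function under path."""
    def register(func):
        ROUTES[path] = func
        return func  # the function itself is returned unchanged
    return register

@route("/")
def index():
    return {"iron_man": -1}

@route("/health")
def health():
    return "ok"

def dispatch(path):
    """Look up the handler registered for path and call it."""
    handler = ROUTES[path]
    return handler()

print(dispatch("/"))        # {'iron_man': -1}
print(dispatch("/health"))  # ok
```

The decorator runs once, when the function is defined, which is why the registry is already filled before `dispatch` is called.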
Command-line Tools
%%python
import click
def less_than_zero():
    return {"iron_man": -1}
@click.command()
def run():
    rdj = less_than_zero()
    click.echo(f"Robert Downey Junior is versatile {rdj}")
if __name__ == "__main__":
    run()
Robert Downey Junior is versatile {'iron_man': -1}
Web App
%%writefile run.py
from flask import Flask
app = Flask(__name__)
def less_than_zero():
    return {"iron_man": -1}
@app.route('/')
def runit():
    return less_than_zero()
Overwriting run.py
curl localhost:5000/
{'iron_man': -1}
Using Numba
Numba's Just-in-Time (JIT) compiler can dramatically speed up numeric Python code, as the two timings below show
def crunchy_normal():
    count = 0
    num = 10000000
    for i in range(num):
        count += num
    return count
%%time
crunchy_normal()
CPU times: user 906 ms, sys: 581 µs, total: 907 ms
Wall time: 908 ms
100000000000000
from numba import jit
@jit(nopython=True)
def crunchy():
    count = 0
    num = 10000000
    for i in range(num):
        count += num
    return count
%%time
crunchy()
CPU times: user 113 ms, sys: 15.9 ms, total: 129 ms
Wall time: 194 ms
100000000000000
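Numba is a third-party package. The standard library also ships a decorator that can speed up code, `functools.lru_cache`, although it works by a different technique: it caches return values (memoization) rather than compiling the function. A minimal sketch, with `fib` as a hypothetical example function:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci; the cache avoids recomputing subproblems."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(50))           # 12586269025
print(fib.cache_info())  # hit and miss counts for the cache
```

Without the decorator, the same recursive definition would make billions of calls for `fib(50)`; with it, each value of `n` is computed only once.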
Writing Decorators
Instrumentation Decorator
Using a decorator to time, debug or instrument code is very common
from functools import wraps
import time
def instrument(f):
    @wraps(f)
    def wrap(*args, **kw):
        ts = time.time()
        result = f(*args, **kw)
        te = time.time()
        print(f"function: {f.__name__}, args: [{args}, {kw}] took: {te-ts} sec")
        return result
    return wrap
Using the decorator to time the execution of a function
from time import sleep
@instrument
def simulated_work(count, task):
    """simulates work"""
    print("Starting work")
    sleep(count)
    processed = f"one {task} leap"
    return processed
simulated_work(3, task="small")
Starting work
function: simulated_work, args: [(3,), {'task': 'small'}] took: 3.0027008056640625 sec
'one small leap'
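The `@wraps(f)` line in the decorator is what keeps the original function's name and docstring visible after decoration. The sketch below contrasts two hypothetical decorators, one without `wraps` and one with it:

```python
from functools import wraps

def without_wraps(f):
    def wrap(*args, **kw):
        return f(*args, **kw)
    return wrap

def with_wraps(f):
    @wraps(f)  # copies __name__, __doc__ and other metadata from f onto wrap
    def wrap(*args, **kw):
        return f(*args, **kw)
    return wrap

@without_wraps
def first():
    """First docstring"""

@with_wraps
def second():
    """Second docstring"""

print(first.__name__, first.__doc__)    # wrap None
print(second.__name__, second.__doc__)  # second Second docstring
```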
8.3 Compose closure functions
Functions with state
def calorie_counter():
    """Counts calories"""
    protein = 0
    fat = 0
    carbohydrate = 0
    def calorie_counter_inner(food):
        # nonlocal lets the inner function rebind the outer variables
        nonlocal protein
        nonlocal fat
        nonlocal carbohydrate
        if food == "protein":
            protein += 4
        elif food == "carbohydrate":
            carbohydrate += 4
        elif food == "fat":
            fat += 9
        total = protein + carbohydrate + fat
        print(f"Consumed {total} calories of protein: {protein}, carbohydrate: {carbohydrate}, fat: {fat}")
    return calorie_counter_inner
meal = calorie_counter()
type(meal)
function
meal("carbohydrate")
Consumed 4 calories of protein: 0, carbohydrate: 4, fat: 0
meal("fat")
Consumed 13 calories of protein: 0, carbohydrate: 4, fat: 9
meal("protein")
Consumed 17 calories of protein: 4, carbohydrate: 4, fat: 9
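Each call to the outer function creates a fresh set of enclosed variables, so two closures built from the same factory do not share state. A minimal sketch with hypothetical names (`make_counter`, `breakfast`, `lunch`):

```python
def make_counter():
    """Return a function that counts how many times it has been called."""
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

breakfast = make_counter()
lunch = make_counter()

breakfast()
breakfast()
print(breakfast())  # 3
print(lunch())      # 1

# The enclosed value lives in a cell attached to the function object
print(breakfast.__closure__[0].cell_contents)  # 3
```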
8.4 Use lambda
YAGNI
You Ain't Gonna Need It
import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
func = lambda x: x**x
func(4)
256
def expo(x):
    return x**x
expo(4)
256
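One place where a lambda tends to read well is as a short, throwaway `key` function. The sketch below sorts a few (product, protein) pairs taken from the food data used later in this lesson; the variable names are hypothetical:

```python
foods = [("Peanuts", 17.86), ("Organic Muesli", 14.06), ("Zen Party Mix", 16.67)]

# Sort by the second item of each tuple (protein per 100 g), highest first
by_protein = sorted(foods, key=lambda food: food[1], reverse=True)

print([name for name, _ in by_protein])
# ['Peanuts', 'Zen Party Mix', 'Organic Muesli']
```

If the key logic grows beyond a single expression, a named function with a docstring is usually the clearer choice.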
Close Encounters with Lambdas
Lambdas are often passed to apply on a pandas Series or DataFrame; a named function works the same way
import pandas as pd
series = pd.Series([1, 5, 10])
series.apply(lambda x: x**x)
0 1
1 3125
2 10000000000
dtype: int64
def expo(x):
    return x**x
expo(4)
import pandas as pd
series = pd.Series([1, 5, 10])
series.apply(expo)
0 1
1 3125
2 10000000000
dtype: int64
8.5 Advanced Use of Functions
Applying a Function to a Pandas DataFrame
import pandas as pd
df = pd.read_csv(
"https://raw.githubusercontent.com/noahgift/food/master/data/features.en.openfoodfacts.org.products.csv")
df.drop(["Unnamed: 0", "exceeded", "g_sum", "energy_100g"], axis=1, inplace=True)  # drop four columns we don't need
df.head()
| | fat_100g | carbohydrates_100g | sugars_100g | proteins_100g | salt_100g | reconstructed_energy | product |
---|---|---|---|---|---|---|---|
0 | 28.57 | 64.29 | 14.29 | 3.57 | 0.00000 | 2267.85 | Banana Chips Sweetened (Whole) |
1 | 17.86 | 60.71 | 17.86 | 17.86 | 0.63500 | 2032.23 | Peanuts |
2 | 57.14 | 17.86 | 3.57 | 17.86 | 1.22428 | 2835.70 | Organic Salted Nut Mix |
3 | 18.75 | 57.81 | 15.62 | 14.06 | 0.13970 | 1953.04 | Organic Muesli |
4 | 36.67 | 36.67 | 3.33 | 16.67 | 1.60782 | 2336.91 | Zen Party Mix |
def high_protein(row):
    """Creates a high or low protein category"""
    if row > 80:
        return "high_protein"
    return "low_protein"
df["high_protein"] = df["proteins_100g"].apply(high_protein)
df.head()
| | fat_100g | carbohydrates_100g | sugars_100g | proteins_100g | salt_100g | reconstructed_energy | product | high_protein |
---|---|---|---|---|---|---|---|---|
0 | 28.57 | 64.29 | 14.29 | 3.57 | 0.00000 | 2267.85 | Banana Chips Sweetened (Whole) | low_protein |
1 | 17.86 | 60.71 | 17.86 | 17.86 | 0.63500 | 2032.23 | Peanuts | low_protein |
2 | 57.14 | 17.86 | 3.57 | 17.86 | 1.22428 | 2835.70 | Organic Salted Nut Mix | low_protein |
3 | 18.75 | 57.81 | 15.62 | 14.06 | 0.13970 | 1953.04 | Organic Muesli | low_protein |
4 | 36.67 | 36.67 | 3.33 | 16.67 | 1.60782 | 2336.91 | Zen Party Mix | low_protein |
df.describe()
| | fat_100g | carbohydrates_100g | sugars_100g | proteins_100g | salt_100g | reconstructed_energy |
---|---|---|---|---|---|---|
count | 45028.000000 | 45028.000000 | 45028.000000 | 45028.000000 | 45028.000000 | 45028.000000 |
mean | 10.765910 | 34.054788 | 16.005614 | 6.619437 | 1.469631 | 1111.332304 |
std | 14.930087 | 29.557017 | 21.495512 | 7.936770 | 12.794943 | 791.621634 |
min | 0.000000 | 0.000000 | -1.200000 | -3.570000 | 0.000000 | 0.000000 |
25% | 0.000000 | 7.440000 | 1.570000 | 0.000000 | 0.063500 | 334.520000 |
50% | 3.170000 | 22.390000 | 5.880000 | 4.000000 | 0.635000 | 1121.540000 |
75% | 17.860000 | 61.540000 | 23.080000 | 9.520000 | 1.440180 | 1678.460000 |
max | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 2032.000000 | 4475.000000 |
Partial Functions
from functools import partial
def multiple_sort(column_one, column_two):
    """Performs multiple sort on a pandas DataFrame"""
    sorted_df = df.sort_values(by=[column_one, column_two],
                               ascending=[False, False])
    return sorted_df
multisort = partial(multiple_sort, "sugars_100g")
type(multisort)
functools.partial
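`partial` can pre-fill arguments by position, as above, or by keyword. A small standard-library sketch, using a hypothetical `power` function to keep the example independent of the DataFrame:

```python
from functools import partial

def power(base, exponent):
    return base ** exponent

square = partial(power, exponent=2)  # fix an argument by keyword
two_to_the = partial(power, 2)       # fix the first positional argument

print(square(5))       # 25
print(two_to_the(10))  # 1024

# A partial object remembers the wrapped function and the frozen arguments
print(two_to_the.func.__name__, two_to_the.args, square.keywords)
# power (2,) {'exponent': 2}
```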
Find sugary and fatty food
df = multisort("fat_100g")
df.head()
| | fat_100g | carbohydrates_100g | sugars_100g | proteins_100g | salt_100g | reconstructed_energy | product | high_protein |
---|---|---|---|---|---|---|---|---|
8254 | 25.00 | 100.00 | 100.0 | 0.00 | 0.00000 | 2675.00 | Princess Mix Decorations | low_protein |
8255 | 25.00 | 100.00 | 100.0 | 0.00 | 0.00000 | 2675.00 | Frosted Mix | low_protein |
8253 | 12.50 | 100.00 | 100.0 | 0.00 | 0.00000 | 2187.50 | Holiday Happiness Mix | low_protein |
9371 | 1.79 | 85.71 | 100.0 | 7.14 | 0.04572 | 1648.26 | Organic Just Cherries | low_protein |
222 | 0.00 | 100.00 | 100.0 | 0.00 | 0.00000 | 1700.00 | Tnt Exploding Candy | low_protein |
Find sugary and salty food
df = multisort("salt_100g")
df.head()
| | fat_100g | carbohydrates_100g | sugars_100g | proteins_100g | salt_100g | reconstructed_energy | product | high_protein |
---|---|---|---|---|---|---|---|---|
33151 | 0.0 | 0.0 | 100.0 | 0.0 | 71.120 | 0.0 | Turkey Brine Kit, Garlic & Herb | low_protein |
24783 | 0.0 | 100.0 | 100.0 | 0.0 | 24.130 | 1700.0 | Seasoning | low_protein |
4073 | 0.0 | 100.0 | 100.0 | 0.0 | 7.620 | 1700.0 | Seasoning Rub, Sweet & Spicy Seafood | low_protein |
10282 | 0.0 | 100.0 | 100.0 | 0.0 | 2.540 | 1700.0 | Instant Pectin | low_protein |
17880 | 0.0 | 100.0 | 100.0 | 0.0 | 0.635 | 1700.0 | Cranberry Cosmos Cocktail Rimming Sugar | low_protein |