Lesson 8: Functions

Pragmatic AI Labs

This notebook was produced by Pragmatic AI Labs.

8.1 Write and use functions

Building blocks of distributed computing

def work(input):
  """Processes input and returns output"""
  
  output = input + 1
  return output

work(1)
2

Key Components of Functions

Docstrings

def docstring():
  """Triple Quoted documentation!"""
  
docstring?
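
The trailing ? syntax only works in IPython and Jupyter. In plain Python the same text can be read from the function's __doc__ attribute or through inspect.getdoc. A minimal sketch, reusing the docstring function from above:

```python
import inspect

def docstring():
    """Triple Quoted documentation!"""

# The docstring is stored on the function object itself.
doc_attr = docstring.__doc__

# inspect.getdoc returns the same text with leading indentation cleaned up.
doc_clean = inspect.getdoc(docstring)

print(doc_attr)
print(doc_clean)
```

The built-in help(docstring) prints the same documentation together with the function signature.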

Arguments: Keyword and Positional

  • Positional: order-based processing
  • Keyword: key/value processing

Positional
def positional(first,second,third):
  """Processes arguments to function in order"""
  
  print(f"Processed first {first}")
  print(f"Processed second {second}")
  print(f"Processed third {third}")
  
  
positional(1, 2, 3)
Processed first 1
Processed second 2
Processed third 3

positional(2, 3, 1)
Processed first 2
Processed second 3
Processed third 1

Keyword
def keyword(first=1, second=2, third=3):
  """Processed in any order"""
  
  print(f"Processed first {first}")
  print(f"Processed second {second}")
  print(f"Processed third {third}")
keyword(1,2,3)
Processed first 1
Processed second 2
Processed third 3

keyword(second=2, third=3, first=1)
Processed first 1
Processed second 2
Processed third 3

keyword(second=2)
Processed first 1
Processed second 2
Processed third 3
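
Positional and keyword arguments can also be combined in one signature. The sketch below is illustrative only: the function name mixed is hypothetical, and the bare * marks everything after it as keyword-only.

```python
def mixed(first, second=2, *, third=3):
    """first is required, second has a default, third is keyword-only."""
    return (first, second, third)

# Positional arguments fill parameters from left to right.
by_position = mixed(1, 20)

# Keyword arguments can be given in any order.
by_keyword = mixed(second=20, first=1, third=30)

# Passing third positionally is rejected because of the bare * in the signature.
try:
    mixed(1, 2, 3)
except TypeError as err:
    error_message = str(err)

print(by_position)
print(by_keyword)
print(error_message)
```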

Return

Default is None

def bridge_to_nowhere():
  pass  # no return statement, so the call evaluates to None

bridge_to_nowhere() is None
True
type(bridge_to_nowhere())
NoneType

Most useful functions return something

def more_than_zero():
  
  return 1
more_than_zero() == 1
True

Functions can return functions

def inner_peace():
  """A deep function"""
  
  def peace():
    return "piece"
  
  return peace
inner = inner_peace()
print(f"Hey, I need that {inner()}")
Hey, I need that piece

inner2 = inner_peace()
type(inner2)
function
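
A returned function becomes more useful when the outer function takes a parameter, because each returned function remembers the value it was built with. A small sketch (make_greeter and its names are hypothetical, not part of the lesson code):

```python
def make_greeter(greeting):
    """Returns a new function that remembers the greeting it was built with."""

    def greet(name):
        return f"{greeting}, {name}"

    return greet

hello = make_greeter("Hello")
howdy = make_greeter("Howdy")

print(hello("Ada"))
print(howdy("Ada"))
```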

8.2 Write and use decorators

Using Decorators

Decorators are commonly used to register or wrap a function for:

  • Command-line tools
  • Web Routes
  • Speeding up Python code
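
The registration pattern behind the first two uses can be sketched with a plain dictionary. This is only an illustration of the idea, not how click or Flask are implemented internally; COMMANDS, command, and dispatch are hypothetical names.

```python
COMMANDS = {}  # hypothetical registry mapping a command name to a function

def command(name):
    """Decorator factory: registers the decorated function under name."""

    def register(func):
        COMMANDS[name] = func
        return func  # hand the function back unchanged

    return register

@command("hello")
def say_hello():
    return "hello"

@command("bye")
def say_bye():
    return "bye"

def dispatch(name):
    """Looks up a registered function by name and calls it."""
    return COMMANDS[name]()

print(dispatch("hello"))
print(sorted(COMMANDS))
```

Decorating a function is enough to make it reachable by name, which is the same shape as attaching a function to a URL route or a command-line entry point.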

Command-line Tools

%%python
import click

def less_than_zero():
  
  return {"iron_man": -1}

@click.command()
def run():
  
  rdj = less_than_zero()
  click.echo(f"Robert Downey Junior is versatile {rdj}")
  
if __name__ == "__main__":
  run()
Robert Downey Junior is versatile {'iron_man': -1}

Web App

%%writefile run.py
from flask import Flask
app = Flask(__name__)

def less_than_zero():
  
  return {"iron_man": -1}

@app.route('/')
def runit():
  return less_than_zero()
  

Overwriting run.py

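With the app running, request the route. Flask 1.1 and later serialize a returned dict to JSON (the exact whitespace depends on the Flask version and debug setting):

curl localhost:5000/
{"iron_man":-1}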

Using Numba

Numba's just-in-time (JIT) compiler can dramatically speed up numeric loops. Note that the first call to a jitted function also includes the compilation time.

def crunchy_normal():
  count = 0
  num = 10000000
  for i in range(num):
    count += num  
  return count
%%time
crunchy_normal()
CPU times: user 906 ms, sys: 581 µs, total: 907 ms
Wall time: 908 ms

100000000000000
from numba import jit

@jit(nopython=True)
def crunchy():
  count = 0
  num = 10000000
  for i in range(num):
    count += num  
  return count
%%time
crunchy()
CPU times: user 113 ms, sys: 15.9 ms, total: 129 ms
Wall time: 194 ms

100000000000000

Writing Decorators

Instrumentation Decorator

Using a decorator to time, debug or instrument code is very common

from functools import wraps
import time

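def instrument(f):
    """Prints the name, arguments, and elapsed time of each call to f"""
    @wraps(f)  # keep f's __name__ and docstring on the wrapper
    def wrap(*args, **kw):
        ts = time.perf_counter()  # monotonic clock intended for measuring intervals
        result = f(*args, **kw)
        te = time.perf_counter()
        print(f"function: {f.__name__}, args: [{args}, {kw}] took: {te-ts} sec")
        return result
    return wrap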

Using the decorator to time the execution of a function

from time import sleep

@instrument
def simulated_work(count, task):
  """simulates work"""
  
  print("Starting work")
  sleep(count)
  processed = f"one {task} leap"
  return processed
  
simulated_work(3, task="small")  
Starting work
function: simulated_work, args: [(3,), {'task': 'small'}] took: 3.0027008056640625 sec

'one small leap'
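
The @wraps(f) line in the decorator is what keeps the name and docstring of the original function visible. A short comparison, where plain and preserving are hypothetical decorators written only for this illustration:

```python
from functools import wraps

def plain(f):
    # No @wraps: the wrapper replaces the metadata of f with its own.
    def wrap(*args, **kw):
        return f(*args, **kw)
    return wrap

def preserving(f):
    @wraps(f)  # copies __name__, __doc__ and other metadata from f
    def wrap(*args, **kw):
        return f(*args, **kw)
    return wrap

@plain
def first():
    """first docstring"""

@preserving
def second():
    """second docstring"""

print(first.__name__, first.__doc__)
print(second.__name__, second.__doc__)
```

Without wraps, the instrument output above would report every decorated function under the name of the wrapper.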

8.3 Compose closure functions

Functions with state

def calorie_counter():
    """Counts calories"""

    protein = 0
    fat = 0
    carbohydrate = 0

    def calorie_counter_inner(food):
        # nonlocal lets the inner function rebind the counters defined above
        nonlocal protein, fat, carbohydrate
        if food == "protein":
            protein += 4
        elif food == "carbohydrate":
            carbohydrate += 4
        elif food == "fat":
            fat += 9
        total = protein + carbohydrate + fat
        print(f"Consumed {total} calories of protein: {protein}, carbohydrate: {carbohydrate}, fat: {fat}")
    return calorie_counter_inner
meal = calorie_counter()
type(meal)

function
meal("carbohydrate")
Consumed 4 calories of protein: 0, carbohydrate: 4, fat: 0

meal("fat")
Consumed 13 calories of protein: 0, carbohydrate: 4, fat: 9

meal("protein")
Consumed 17 calories of protein: 4, carbohydrate: 4, fat: 9
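
Each call to the outer function creates a fresh, independent set of enclosed variables. A minimal sketch with a simpler counter (make_counter is a hypothetical helper, not part of the lesson code):

```python
def make_counter():
    """Returns a function that counts how many times it has been called."""
    count = 0

    def increment():
        nonlocal count  # rebind the variable in the enclosing scope
        count += 1
        return count

    return increment

breakfast = make_counter()
lunch = make_counter()

breakfast_calls = [breakfast(), breakfast()]  # state persists between calls
lunch_calls = lunch()                         # a separate closure starts from zero

# The enclosed value lives in a cell attached to the function object.
stored = breakfast.__closure__[0].cell_contents

print(breakfast_calls, lunch_calls, stored)
```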

8.4 Use lambda

YAGNI

You Ain't Gonna Need It

import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

func = lambda x: x**x
func(4)
256
def expo(x):
  return x**x

expo(4)
256
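
The lambda and the named function are interchangeable. A lambda mostly earns its place as a short throwaway argument, for example as the key of sorted. In the sketch below the foods list is illustrative sample data and protein is a hypothetical helper:

```python
# (name, protein in grams per 100 g): illustrative sample data
foods = [("peanuts", 17.86), ("muesli", 14.06), ("banana chips", 3.57)]

# Lambda used inline as the sort key.
by_protein = sorted(foods, key=lambda item: item[1])

# The equivalent named function.
def protein(item):
    return item[1]

by_protein_named = sorted(foods, key=protein)

print(by_protein)
print(by_protein == by_protein_named)
```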

Close Encounters with Lambdas

Lambdas are often passed to apply on a pandas Series or DataFrame

import pandas as pd

series = pd.Series([1, 5, 10])
series.apply(lambda x: x**x)
0              1
1           3125
2    10000000000
dtype: int64
def expo(x):
  return x**x

expo(4)
import pandas as pd

series = pd.Series([1, 5, 10])
series.apply(expo)
0              1
1           3125
2    10000000000
dtype: int64

8.5 Advanced Use of Functions

Applying a Function to a Pandas DataFrame

import pandas as pd
df = pd.read_csv(
    "https://raw.githubusercontent.com/noahgift/food/master/data/features.en.openfoodfacts.org.products.csv")
df.drop(["Unnamed: 0", "exceeded", "g_sum", "energy_100g"], axis=1, inplace=True)  # drop four columns we don't need
df.head()
   fat_100g  carbohydrates_100g  sugars_100g  proteins_100g  salt_100g  reconstructed_energy  product
0     28.57               64.29        14.29           3.57    0.00000               2267.85  Banana Chips Sweetened (Whole)
1     17.86               60.71        17.86          17.86    0.63500               2032.23  Peanuts
2     57.14               17.86         3.57          17.86    1.22428               2835.70  Organic Salted Nut Mix
3     18.75               57.81        15.62          14.06    0.13970               1953.04  Organic Muesli
4     36.67               36.67         3.33          16.67    1.60782               2336.91  Zen Party Mix
def high_protein(grams):
  """Creates a high or low protein category from a proteins_100g value"""
  
  # Series.apply passes one value at a time, not a whole row
  if grams > 80:
    return "high_protein"
  return "low_protein"
df["high_protein"] = df["proteins_100g"].apply(high_protein)
df.head()
   fat_100g  carbohydrates_100g  sugars_100g  proteins_100g  salt_100g  reconstructed_energy  product                         high_protein
0     28.57               64.29        14.29           3.57    0.00000               2267.85  Banana Chips Sweetened (Whole)  low_protein
1     17.86               60.71        17.86          17.86    0.63500               2032.23  Peanuts                         low_protein
2     57.14               17.86         3.57          17.86    1.22428               2835.70  Organic Salted Nut Mix          low_protein
3     18.75               57.81        15.62          14.06    0.13970               1953.04  Organic Muesli                  low_protein
4     36.67               36.67         3.33          16.67    1.60782               2336.91  Zen Party Mix                   low_protein
df.describe()
           fat_100g  carbohydrates_100g   sugars_100g  proteins_100g     salt_100g  reconstructed_energy
count  45028.000000        45028.000000  45028.000000   45028.000000  45028.000000          45028.000000
mean      10.765910           34.054788     16.005614       6.619437      1.469631           1111.332304
std       14.930087           29.557017     21.495512       7.936770     12.794943            791.621634
min        0.000000            0.000000     -1.200000      -3.570000      0.000000              0.000000
25%        0.000000            7.440000      1.570000       0.000000      0.063500            334.520000
50%        3.170000           22.390000      5.880000       4.000000      0.635000           1121.540000
75%       17.860000           61.540000     23.080000       9.520000      1.440180           1678.460000
max      100.000000          100.000000    100.000000     100.000000   2032.000000           4475.000000

Partial Functions

from functools import partial

def multiple_sort(column_one, column_two):
  """Performs multiple sort on a pandas DataFrame"""
  
  sorted_df = df.sort_values(by=[column_one, column_two], 
                 ascending=[False, False])
  return sorted_df
  
multisort = partial(multiple_sort, "sugars_100g")
type(multisort)
functools.partial
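
partial freezes some arguments of a function and returns a new callable that expects the rest. A dependency-free sketch of the same mechanism (power, square, and two_to_the are hypothetical names used only for illustration):

```python
from functools import partial

def power(base, exponent):
    return base ** exponent

# Freeze a keyword argument: the caller supplies only base.
square = partial(power, exponent=2)

# Freeze the first positional argument: the caller supplies only exponent.
two_to_the = partial(power, 2)

print(square(5))
print(two_to_the(10))

# A partial object records what it wraps and which arguments were frozen.
print(two_to_the.func.__name__, two_to_the.args, square.keywords)
```

In the lesson code, multisort works the same way: column_one is frozen to "sugars_100g" and each call supplies column_two.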

Find sugary and fatty food

df = multisort("fat_100g")
df.head()
      fat_100g  carbohydrates_100g  sugars_100g  proteins_100g  salt_100g  reconstructed_energy  product                   high_protein
8254     25.00              100.00        100.0           0.00    0.00000               2675.00  Princess Mix Decorations  low_protein
8255     25.00              100.00        100.0           0.00    0.00000               2675.00  Frosted Mix               low_protein
8253     12.50              100.00        100.0           0.00    0.00000               2187.50  Holiday Happiness Mix     low_protein
9371      1.79               85.71        100.0           7.14    0.04572               1648.26  Organic Just Cherries     low_protein
 222      0.00              100.00        100.0           0.00    0.00000               1700.00  Tnt Exploding Candy       low_protein

Find sugary and salty food

df = multisort("salt_100g")
df.head()
       fat_100g  carbohydrates_100g  sugars_100g  proteins_100g  salt_100g  reconstructed_energy  product                                  high_protein
33151       0.0                 0.0        100.0            0.0     71.120                   0.0  Turkey Brine Kit, Garlic & Herb          low_protein
24783       0.0               100.0        100.0            0.0     24.130                1700.0  Seasoning                                low_protein
 4073       0.0               100.0        100.0            0.0      7.620                1700.0  Seasoning Rub, Sweet & Spicy Seafood     low_protein
10282       0.0               100.0        100.0            0.0      2.540                1700.0  Instant Pectin                           low_protein
17880       0.0               100.0        100.0            0.0      0.635                1700.0  Cranberry Cosmos Cocktail Rimming Sugar  low_protein