{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Lesson8-Python For Data Science-Functions.ipynb", "version": "0.3.2", "provenance": [], "collapsed_sections": [ "chwCKrPTeu0V", "9MMO5aLZe8GC", "HO4Gzz0Oje08", "heQ5djh0fppR", "dL-yPSgtfwHc", "jZZPzN9ofzqE", "k67ErjU7qkVU", "evHbw7C8f2y0", "ZrldedS-sFUS", "loWnoGY8OPLA", "RHsiAy8N3-Cj", "AMPU657F4JWF", "3Srgz3C65Vdk", "9KQ2Cmcelcy-", "qO-uoYUClvqr" ], "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "metadata": { "id": "spdivf2TMnGC", "colab_type": "text" }, "cell_type": "markdown", "source": [ "# Lesson 8: Functions" ] }, { "metadata": { "id": "c_Id55m6Jsbu", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## Pragmatic AI Labs\n", "\n" ] }, { "metadata": { "id": "e5p96AqpSDZa", "colab_type": "text" }, "cell_type": "markdown", "source": [ "![alt text](https://paiml.com/images/logo_with_slogan_white_background.png)\n", "\n", "This notebook was produced by [Pragmatic AI Labs](https://paiml.com/). 
You can continue learning about these topics by:\n", "\n", "* Buying a copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](http://www.informit.com/store/pragmatic-ai-an-introduction-to-cloud-based-machine-9780134863917)\n", "* Reading an online copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](https://www.safaribooksonline.com/library/view/pragmatic-ai-an/9780134863924/)\n", "* Watching video [Essential Machine Learning and AI with Python and Jupyter Notebook-Video-SafariOnline](https://www.safaribooksonline.com/videos/essential-machine-learning/9780135261118) on Safari Books Online.\n", "* Watching video [AWS Certified Machine Learning-Speciality](https://learning.oreilly.com/videos/aws-certified-machine/9780135556597)\n", "* Purchasing video [Essential Machine Learning and AI with Python and Jupyter Notebook- Purchase Video](http://www.informit.com/store/essential-machine-learning-and-ai-with-python-and-jupyter-9780135261095)\n", "* Viewing more content at [noahgift.com](https://noahgift.com/)\n" ] },
{ "metadata": { "id": "pBTeTbnRKG_k", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "chwCKrPTeu0V", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 8.1 Write and use functions" ] },
{ "metadata": { "id": "9MMO5aLZe8GC", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Building blocks of distributed computing" ] },
{ "metadata": { "id": "xcU--BqPgQKN", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def work(input):\n", "    \"\"\"Processes input and returns output\"\"\"\n", "\n", "    output = input + 1\n", "    return output\n" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "_yUpu-I9hhdf", "colab_type": "code", "outputId": "4b185ca1-8462-4092-cfb8-714006aafc5c", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "work(1)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "2" ] }, "metadata": { "tags": [] }, "execution_count": 4 } ] },
{ "metadata": { "id": "Kocgo6ZijVpi", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Key Components of Functions" ] },
{ "metadata": { "id": "HO4Gzz0Oje08", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Docstrings" ] },
{ "metadata": { "id": "InpDV6kejca8", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def docstring():\n", "    \"\"\"Triple-quoted documentation!\"\"\"" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "hM_gK54njn7r", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "docstring?" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "BdZ6O3lkkUkp", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Arguments: Keyword and Positional" ] },
{ "metadata": { "id": "n0ha-pEKmE8Q", "colab_type": "text" }, "cell_type": "markdown", "source": [ "* *Positional*: arguments are matched to parameters by order\n", "* *Keyword*: arguments are matched to parameters by name (key/value)\n" ] },
{ "metadata": { "id": "RpK99_2BpmgX", "colab_type": "text" }, "cell_type": "markdown", "source": [ "##### Positional" ] },
{ "metadata": { "id": "YJvR7-o7kaP3", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def positional(first, second, third):\n", "    \"\"\"Processes arguments to function in order\"\"\"\n", "\n", "    print(f\"Processed first {first}\")\n", "    print(f\"Processed second {second}\")\n", "    print(f\"Processed third {third}\")" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "9HJAF-v8lxAk", "colab_type": "code", "outputId": "6d56644d-31c7-4b80-d0a3-4c8ce4a33621", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "positional(1, 2, 3)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Processed first 1\n", "Processed second 2\n", "Processed third 3\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "XUghQwcSl69Q", "colab_type": "code", "outputId": "9b4d2886-af03-4944-e3a7-f899a0736445", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "positional(2, 3, 1)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Processed first 2\n", "Processed second 3\n", "Processed third 1\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "5gxfaOIIppTy", "colab_type": "text" }, "cell_type": "markdown", "source": [ "##### Keyword" ] },
{ "metadata": { "id": "z8Ud2enHpQiW", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def keyword(first=1, second=2, third=3):\n", "    \"\"\"Processed in any order\"\"\"\n", "\n", "    print(f\"Processed first {first}\")\n", "    print(f\"Processed second {second}\")\n", "    print(f\"Processed third {third}\")" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "Kf0afT68piRI", "colab_type": "code", "outputId": "a2d9c475-2b70-400d-807c-c6b827c66df8", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "keyword(1, 2, 3)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Processed first 1\n", "Processed second 2\n", "Processed third 3\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "BZEQ6NagqAMK", "colab_type": "code", "outputId": "cf83e515-0f3a-4a58-b780-9480c62f1edb", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "keyword(second=2, third=3, first=1)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Processed first 1\n", "Processed second 2\n", "Processed third 3\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "Cw4QflPLqgOs", "colab_type": "code", "outputId": "6e1de89f-3aad-4919-e231-4ce0d5a28d1b", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "keyword(second=2)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Processed first 1\n", "Processed second 2\n", "Processed third 3\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "t292FxCKri6_", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Return" ] },
{ "metadata": { "id": "JZpCuThdsehI", "colab_type": "text" }, "cell_type": "markdown", "source": [ "A function without an explicit `return` statement returns `None` by default" ] },
{ "metadata": { "id": "-lmlSwRHroMM", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def bridge_to_nowhere():\n", "    pass" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "FeW19iLCsXNa", "colab_type": "code", "outputId": "189e59fb-d84a-40e2-e32a-d1c1a8993c45", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "bridge_to_nowhere() is None" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, "execution_count": 18 } ] },
{ "metadata": { "id": "6E-n3TktN9vP", "colab_type": "code", "outputId": "1cbe907e-e5d3-4c8d-f2e5-15de311c749b", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "type(bridge_to_nowhere())" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "NoneType" ] }, "metadata": { "tags": [] }, "execution_count": 19 } ] },
{ "metadata": { "id": "kYqhX30VsxY6", "colab_type": "text" }, "cell_type": "markdown", "source": [ "Most useful functions return something" ] },
{ "metadata": { "id": "1mOKT9nOsg0y", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def more_than_zero():\n", "\n", "    return 1" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "CwMszHHpsok6", "colab_type": "code", "outputId": "0dfc577d-68e6-4c2c-a98d-ad5e08602753", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "more_than_zero() == 1" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, "execution_count": 21 } ] },
{ "metadata": { "id": "-RsSF-E3s1g2", "colab_type": "text" }, "cell_type": "markdown", "source": [ "Functions can return functions" ] },
{ "metadata": { "id": "yG9l-n73s5VA", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def inner_peace():\n", "    \"\"\"A deep function\"\"\"\n", "\n", "    def peace():\n", "        return \"piece\"\n", "\n", "    return peace" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "RGbMCsu6ta57", "colab_type": "code", "outputId": "759e51d0-382f-47f7-9a80-5cfa52cb4b3a", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "inner = inner_peace()\n", "print(f\"Hey, I need that {inner()}\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Hey, I need that piece\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "dkITkJLtOVg-", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "inner2 = inner_peace()" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "Mx01RIPWObVb", "colab_type": "code", "outputId": "dc4947a7-b9ce-4e6a-9930-12337f37f54e", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "type(inner2)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "function" ] }, "metadata": { "tags": [] }, "execution_count": 26 } ] },
{ "metadata": { "id": "heQ5djh0fppR", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 8.2 Write and use decorators" ] },
{ "metadata": { "id": "4rtpT5VKvS-E", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Using Decorators" ] },
{ "metadata": { "id": "Unf0PyHlzvl3", "colab_type": "text" }, "cell_type": "markdown", "source": [ "Very 
common uses of decorators include:\n", "\n", "* Command-line tools\n", "* Web routes\n", "* Speeding up Python code\n", "\n" ] },
{ "metadata": { "id": "IhJ4ugKaznwL", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Command-line Tools" ] },
{ "metadata": { "id": "8fH6Sw5gvZPU", "colab_type": "code", "outputId": "1ed57d27-ec3b-441f-d176-54e97353c3c3", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "%%python\n", "import click\n", "\n", "def less_than_zero():\n", "\n", "    return {\"iron_man\": -1}\n", "\n", "@click.command()\n", "def run():\n", "\n", "    rdj = less_than_zero()\n", "    click.echo(f\"Robert Downey Junior is versatile {rdj}\")\n", "\n", "if __name__ == \"__main__\":\n", "    run()" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Robert Downey Junior is versatile {'iron_man': -1}\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "YZBqLiy_1-9r", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Web App" ] },
{ "metadata": { "id": "cryR4v0lzZnn", "colab_type": "code", "outputId": "83214037-31be-4c94-c70b-4f8b1736f9ea", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "%%writefile run.py\n", "from flask import Flask\n", "app = Flask(__name__)\n", "\n", "def less_than_zero():\n", "\n", "    return {\"iron_man\": -1}\n", "\n", "@app.route('/')\n", "def runit():\n", "    return less_than_zero()\n" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Overwriting run.py\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "_Qela4-QPUy6", "colab_type": "text" }, "cell_type": "markdown", "source": [ "With the app running, `curl localhost:5000/` returns `{'iron_man': -1}`\n", "\n" ] },
{ "metadata": { "id": "e4j-RCLj7qVa", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Using Numba" ] },
{ "metadata": { "id": "zyVxYYr8dPZ9", "colab_type": "text" }, "cell_type": "markdown", "source": [ "Numba's just-in-time (JIT) compiler can dramatically speed up numeric code; the first call also pays a one-time compilation cost" ] },
{ "metadata": { "id": "XpHg_ZXm_VFs", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def crunchy_normal():\n", "    count = 0\n", "    num = 10000000\n", "    for i in range(num):\n", "        count += num\n", "    return count" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "KGdobU-Z_al4", "colab_type": "code", "outputId": "39da257e-6e9e-42b6-bd5c-2950d53d4308", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "%%time\n", "crunchy_normal()" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "CPU times: user 906 ms, sys: 581 µs, total: 907 ms\n", "Wall time: 908 ms\n" ], "name": "stdout" }, { "output_type": "execute_result", "data": { "text/plain": [ "100000000000000" ] }, "metadata": { "tags": [] }, "execution_count": 31 } ] },
{ "metadata": { "id": "Ml9C38hhzixJ", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "from numba import jit\n", "\n", "@jit(nopython=True)\n", "def crunchy():\n", "    count = 0\n", "    num = 10000000\n", "    for i in range(num):\n", "        count += num\n", "    return count" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "Vi8MvQH-8b2F", "colab_type": "code", "outputId": "1bc1d290-934f-4bc5-9224-992323a35d55", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "%%time\n", "crunchy()" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "CPU times: user 113 ms, sys: 15.9 ms, total: 129 ms\n", "Wall time: 194 ms\n" ], "name": "stdout" }, { "output_type": "execute_result", "data": { "text/plain": [ "100000000000000" ] }, "metadata": { "tags": [] }, "execution_count": 33 } ] },
{ "metadata": { "id": "cvnTUxZYvfKR", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Writing Decorators" ] },
{ "metadata": { "colab_type": "text", "id": "AxG-DkX2gOA_" 
}, "cell_type": "markdown", "source": [ "#### Instrumentation Decorator" ] },
{ "metadata": { "colab_type": "text", "id": "Lu_1OjESgOBA" }, "cell_type": "markdown", "source": [ "Using a decorator to time, debug, or instrument code is very common" ] },
{ "metadata": { "colab_type": "code", "id": "416ZT8wsgOBB", "colab": {} }, "cell_type": "code", "source": [ "from functools import wraps\n", "import time\n", "\n", "def instrument(f):\n", "    @wraps(f)  # keep the wrapped function's name and docstring\n", "    def wrap(*args, **kw):\n", "        ts = time.time()\n", "        result = f(*args, **kw)\n", "        te = time.time()\n", "        print(f\"function: {f.__name__}, args: [{args}, {kw}] took: {te-ts} sec\")\n", "        return result\n", "    return wrap" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "colab_type": "text", "id": "spX1jAORgOBD" }, "cell_type": "markdown", "source": [ "Using the decorator to time the execution of a function" ] },
{ "metadata": { "id": "nhV7RoBLxGdY", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "from time import sleep\n", "\n", "@instrument\n", "def simulated_work(count, task):\n", "    \"\"\"simulates work\"\"\"\n", "\n", "    print(\"Starting work\")\n", "    sleep(count)\n", "    processed = f\"one {task} leap\"\n", "    return processed" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "colab_type": "code", "outputId": "83e6646e-6307-425e-83cb-cfe901895aed", "id": "nEeQPk_sgOBD", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "simulated_work(3, task=\"small\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Starting work\n", "function: simulated_work, args: [(3,), {'task': 'small'}] took: 3.0027008056640625 sec\n" ], "name": "stdout" }, { "output_type": "execute_result", "data": { "text/plain": [ "'one small leap'" ] }, "metadata": { "tags": [] }, "execution_count": 36 } ] },
{ "metadata": { "id": "LjmVGMm0fthG", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 8.3 Compose closure 
functions" ] },
{ "metadata": { "id": "dL-yPSgtfwHc", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Functions with state" ] },
{ "metadata": { "id": "yIKSkPO8vI9Y", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def calorie_counter():\n", "    \"\"\"Counts calories\"\"\"\n", "\n", "    protein = 0\n", "    fat = 0\n", "    carbohydrate = 0\n", "\n", "    def calorie_counter_inner(food):\n", "        # nonlocal lets the inner function update the enclosing state\n", "        nonlocal protein\n", "        nonlocal fat\n", "        nonlocal carbohydrate\n", "        if food == \"protein\":\n", "            protein += 4\n", "        elif food == \"carbohydrate\":\n", "            carbohydrate += 4\n", "        elif food == \"fat\":\n", "            fat += 9\n", "        total = protein + carbohydrate + fat\n", "        print(f\"Consumed {total} calories of protein: {protein}, carbohydrate: {carbohydrate}, fat: {fat}\")\n", "    return calorie_counter_inner" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "rhDStaUJxfG0", "colab_type": "code", "outputId": "bd6db3c5-7a7e-480b-906b-3c288923df63", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "meal = calorie_counter()\n", "type(meal)\n" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "function" ] }, "metadata": { "tags": [] }, "execution_count": 42 } ] },
{ "metadata": { "id": "A_qbn0kfxsuv", "colab_type": "code", "outputId": "6d3e1779-6f81-48b3-8128-61a9a4169ac3", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "meal(\"carbohydrate\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Consumed 4 calories of protein: 0, carbohydrate: 4, fat: 0\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "_rswW-D-xtth", "colab_type": "code", "outputId": "a896f942-ca6d-4034-f7d4-acc3d0aadaa8", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "meal(\"fat\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Consumed 13 calories of protein: 0, carbohydrate: 4, fat: 9\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "OVZZFhewx41w", "colab_type": "code", "outputId": "25eeca4a-f07b-49f5-ec49-f44932725ee5", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "meal(\"protein\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Consumed 17 calories of protein: 4, carbohydrate: 4, fat: 9\n" ], "name": "stdout" } ] },
{ "metadata": { "id": "Xox6IlosfwnH", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 8.4 Use lambda" ] },
{ "metadata": { "id": "jZZPzN9ofzqE", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### YAGNI\n" ] },
{ "metadata": { "id": "ZQl421JrpKJK", "colab_type": "text" }, "cell_type": "markdown", "source": [ "**Y**ou **A**in't **G**onna **N**eed **I**t\n" ] },
{ "metadata": { "id": "y0lfEYGTR7Eo", "colab_type": "code", "outputId": "604ec6e7-0613-4019-fe8c-01c3035970a0", "colab": { "base_uri": "https://localhost:8080/", "height": 374 } }, "cell_type": "code", "source": [ "import this" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "The Zen of Python, by Tim Peters\n", "\n", "Beautiful is better than ugly.\n", "Explicit is better than implicit.\n", "Simple is better than complex.\n", "Complex is better than complicated.\n", "Flat is better than nested.\n", "Sparse is better than dense.\n", "Readability counts.\n", "Special cases aren't special enough to break the rules.\n", "Although practicality beats purity.\n", "Errors should never pass silently.\n", "Unless explicitly silenced.\n", "In the face of ambiguity, refuse the temptation to guess.\n", "There should be one-- and preferably only one --obvious way to do it.\n", "Although that way may not be obvious at first unless you're Dutch.\n", "Now is better than never.\n", "Although never is often better than *right* now.\n", "If the implementation is hard to 
explain, it's a bad idea.\n", "If the implementation is easy to explain, it may be a good idea.\n", "Namespaces are one honking great idea -- let's do more of those!\n" ], "name": "stdout" } ] },
{ "metadata": { "colab_type": "code", "outputId": "5885a57c-da47-42b4-e4a7-96d1923dfab0", "id": "dc1YjODApmo3", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "func = lambda x: x**x\n", "func(4)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "256" ] }, "metadata": { "tags": [] }, "execution_count": 47 } ] },
{ "metadata": { "colab_type": "code", "outputId": "f4cc5b17-6bfd-4172-da43-f800bd265086", "id": "NcsUPsvYpmo6", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "def expo(x):\n", "    return x**x\n", "\n", "expo(4)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "256" ] }, "metadata": { "tags": [] }, "execution_count": 48 } ] },
{ "metadata": { "id": "k67ErjU7qkVU", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Close Encounters with Lambdas" ] },
{ "metadata": { "id": "aShV74ZSrZsm", "colab_type": "text" }, "cell_type": "markdown", "source": [ "Lambdas are often passed to `apply` on a pandas Series or DataFrame" ] },
{ "metadata": { "id": "7hzx6YXMqsAr", "colab_type": "code", "outputId": "d88ec4ed-7f82-477b-df3c-0e19a23dd762", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "cell_type": "code", "source": [ "import pandas as pd\n", "\n", "series = pd.Series([1, 5, 10])\n", "series.apply(lambda x: x**x)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0 1\n", "1 3125\n", "2 10000000000\n", "dtype: int64" ] }, "metadata": { "tags": [] }, "execution_count": 50 } ] },
{ "metadata": { "id": "CYeSRDx2SSdE", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def expo(x):\n", "    return x**x\n", "\n", "expo(4)" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "GUlDqiwGSOtK", "colab_type": "code", "outputId": "0b0c533d-3c6c-4bf2-ecf6-67944a5832f0", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "cell_type": "code", "source": [ "import pandas as pd\n", "\n", "series = pd.Series([1, 5, 10])\n", "series.apply(expo)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0 1\n", "1 3125\n", "2 10000000000\n", "dtype: int64" ] }, "metadata": { "tags": [] }, "execution_count": 51 } ] },
{ "metadata": { "id": "JiNag1pMfz2E", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 8.5 Advanced Use of Functions" ] },
{ "metadata": { "id": "evHbw7C8f2y0", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Applying a Function to a Pandas DataFrame" ] },
{ "metadata": { "id": "RXxygaSEeFqn", "colab_type": "code", "outputId": "c563ff1b-1729-4e72-ed02-073eebcf784a", "colab": { "base_uri": "https://localhost:8080/", "height": 204 } }, "cell_type": "code", "source": [ "import pandas as pd\n", "df = pd.read_csv(\n", "    \"https://raw.githubusercontent.com/noahgift/food/master/data/features.en.openfoodfacts.org.products.csv\")\n", "df.drop([\"Unnamed: 0\", \"exceeded\", \"g_sum\", \"energy_100g\"], axis=1, inplace=True)  # drop four columns we don't need\n", "df.head()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>fat_100g</th><th>carbohydrates_100g</th><th>sugars_100g</th><th>proteins_100g</th><th>salt_100g</th><th>reconstructed_energy</th><th>product</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>28.57</td><td>64.29</td><td>14.29</td><td>3.57</td><td>0.00000</td><td>2267.85</td><td>Banana Chips Sweetened (Whole)</td></tr>\n",
 "    <tr><th>1</th><td>17.86</td><td>60.71</td><td>17.86</td><td>17.86</td><td>0.63500</td><td>2032.23</td><td>Peanuts</td></tr>\n",
 "    <tr><th>2</th><td>57.14</td><td>17.86</td><td>3.57</td><td>17.86</td><td>1.22428</td><td>2835.70</td><td>Organic Salted Nut Mix</td></tr>\n",
 "    <tr><th>3</th><td>18.75</td><td>57.81</td><td>15.62</td><td>14.06</td><td>0.13970</td><td>1953.04</td><td>Organic Muesli</td></tr>\n",
 "    <tr><th>4</th><td>36.67</td><td>36.67</td><td>3.33</td><td>16.67</td><td>1.60782</td><td>2336.91</td><td>Zen Party Mix</td></tr>\n",
 "  </tbody>\n",
 "</table>\n" ], "text/plain": [ " fat_100g carbohydrates_100g sugars_100g proteins_100g salt_100g \\\n", "0 28.57 64.29 14.29 3.57 0.00000 \n", "1 17.86 60.71 17.86 17.86 0.63500 \n", "2 57.14 17.86 3.57 17.86 1.22428 \n", "3 18.75 57.81 15.62 14.06 0.13970 \n", "4 36.67 36.67 3.33 16.67 1.60782 \n", "\n", " reconstructed_energy product \n", "0 2267.85 Banana Chips Sweetened (Whole) \n", "1 2032.23 Peanuts \n", "2 2835.70 Organic Salted Nut Mix \n", "3 1953.04 Organic Muesli \n", "4 2336.91 Zen Party Mix " ] }, "metadata": { "tags": [] }, "execution_count": 52 } ] },
{ "metadata": { "id": "HYvmsSyBeU-6", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def high_protein(row):\n", "    \"\"\"Labels a proteins_100g value as high or low protein (threshold: 80)\"\"\"\n", "\n", "    if row > 80:\n", "        return \"high_protein\"\n", "    return \"low_protein\"" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "H9Qhe8YtfM-t", "colab_type": "code", "outputId": "492f4580-5046-451c-c6c7-027754801915", "colab": { "base_uri": "https://localhost:8080/", "height": 204 } }, "cell_type": "code", "source": [ "df[\"high_protein\"] = df[\"proteins_100g\"].apply(high_protein)\n", "df.head()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>fat_100g</th><th>carbohydrates_100g</th><th>sugars_100g</th><th>proteins_100g</th><th>salt_100g</th><th>reconstructed_energy</th><th>product</th><th>high_protein</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>28.57</td><td>64.29</td><td>14.29</td><td>3.57</td><td>0.00000</td><td>2267.85</td><td>Banana Chips Sweetened (Whole)</td><td>low_protein</td></tr>\n",
 "    <tr><th>1</th><td>17.86</td><td>60.71</td><td>17.86</td><td>17.86</td><td>0.63500</td><td>2032.23</td><td>Peanuts</td><td>low_protein</td></tr>\n",
 "    <tr><th>2</th><td>57.14</td><td>17.86</td><td>3.57</td><td>17.86</td><td>1.22428</td><td>2835.70</td><td>Organic Salted Nut Mix</td><td>low_protein</td></tr>\n",
 "    <tr><th>3</th><td>18.75</td><td>57.81</td><td>15.62</td><td>14.06</td><td>0.13970</td><td>1953.04</td><td>Organic Muesli</td><td>low_protein</td></tr>\n",
 "    <tr><th>4</th><td>36.67</td><td>36.67</td><td>3.33</td><td>16.67</td><td>1.60782</td><td>2336.91</td><td>Zen Party Mix</td><td>low_protein</td></tr>\n",
 "  </tbody>\n",
 "</table>\n", "
" ], "text/plain": [ " fat_100g carbohydrates_100g sugars_100g proteins_100g salt_100g \\\n", "0 28.57 64.29 14.29 3.57 0.00000 \n", "1 17.86 60.71 17.86 17.86 0.63500 \n", "2 57.14 17.86 3.57 17.86 1.22428 \n", "3 18.75 57.81 15.62 14.06 0.13970 \n", "4 36.67 36.67 3.33 16.67 1.60782 \n", "\n", " reconstructed_energy product high_protein \n", "0 2267.85 Banana Chips Sweetened (Whole) low_protein \n", "1 2032.23 Peanuts low_protein \n", "2 2835.70 Organic Salted Nut Mix low_protein \n", "3 1953.04 Organic Muesli low_protein \n", "4 2336.91 Zen Party Mix low_protein " ] }, "metadata": { "tags": [] }, "execution_count": 54 } ] }, { "metadata": { "id": "1QSzM3o3Thyt", "colab_type": "code", "outputId": "c6bbcd19-322b-4991-aa78-03a37c4004e5", "colab": { "base_uri": "https://localhost:8080/", "height": 297 } }, "cell_type": "code", "source": [ "df.describe()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>fat_100g</th><th>carbohydrates_100g</th><th>sugars_100g</th><th>proteins_100g</th><th>salt_100g</th><th>reconstructed_energy</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>count</th><td>45028.000000</td><td>45028.000000</td><td>45028.000000</td><td>45028.000000</td><td>45028.000000</td><td>45028.000000</td></tr>\n",
 "    <tr><th>mean</th><td>10.765910</td><td>34.054788</td><td>16.005614</td><td>6.619437</td><td>1.469631</td><td>1111.332304</td></tr>\n",
 "    <tr><th>std</th><td>14.930087</td><td>29.557017</td><td>21.495512</td><td>7.936770</td><td>12.794943</td><td>791.621634</td></tr>\n",
 "    <tr><th>min</th><td>0.000000</td><td>0.000000</td><td>-1.200000</td><td>-3.570000</td><td>0.000000</td><td>0.000000</td></tr>\n",
 "    <tr><th>25%</th><td>0.000000</td><td>7.440000</td><td>1.570000</td><td>0.000000</td><td>0.063500</td><td>334.520000</td></tr>\n",
 "    <tr><th>50%</th><td>3.170000</td><td>22.390000</td><td>5.880000</td><td>4.000000</td><td>0.635000</td><td>1121.540000</td></tr>\n",
 "    <tr><th>75%</th><td>17.860000</td><td>61.540000</td><td>23.080000</td><td>9.520000</td><td>1.440180</td><td>1678.460000</td></tr>\n",
 "    <tr><th>max</th><td>100.000000</td><td>100.000000</td><td>100.000000</td><td>100.000000</td><td>2032.000000</td><td>4475.000000</td></tr>\n",
 "  </tbody>\n",
 "</table>\n", "
" ], "text/plain": [ " fat_100g carbohydrates_100g sugars_100g proteins_100g \\\n", "count 45028.000000 45028.000000 45028.000000 45028.000000 \n", "mean 10.765910 34.054788 16.005614 6.619437 \n", "std 14.930087 29.557017 21.495512 7.936770 \n", "min 0.000000 0.000000 -1.200000 -3.570000 \n", "25% 0.000000 7.440000 1.570000 0.000000 \n", "50% 3.170000 22.390000 5.880000 4.000000 \n", "75% 17.860000 61.540000 23.080000 9.520000 \n", "max 100.000000 100.000000 100.000000 100.000000 \n", "\n", " salt_100g reconstructed_energy \n", "count 45028.000000 45028.000000 \n", "mean 1.469631 1111.332304 \n", "std 12.794943 791.621634 \n", "min 0.000000 0.000000 \n", "25% 0.063500 334.520000 \n", "50% 0.635000 1121.540000 \n", "75% 1.440180 1678.460000 \n", "max 2032.000000 4475.000000 " ] }, "metadata": { "tags": [] }, "execution_count": 55 } ] }, { "metadata": { "id": "ZrldedS-sFUS", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Partial Functions" ] }, { "metadata": { "id": "drjwAPlRsWLe", "colab_type": "code", "outputId": "a2abd261-507f-45ab-dbb7-75964acb05b1", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "from functools import partial\n", "\n", "def multiple_sort(column_one, column_two):\n", " \"\"\"Performs multiple sort on a pandas DataFrame\"\"\"\n", " \n", " sorted_df = df.sort_values(by=[column_one, column_two], \n", " ascending=[False, False])\n", " return sorted_df\n", " \n", "multisort = partial(multiple_sort, \"sugars_100g\")\n", "type(multisort)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "functools.partial" ] }, "metadata": { "tags": [] }, "execution_count": 56 } ] }, { "metadata": { "id": "ZFdOEsXMt-Yj", "colab_type": "text" }, "cell_type": "markdown", "source": [ "Find sugary and fatty food" ] }, { "metadata": { "id": "NdmaOWEitzOj", "colab_type": "code", "outputId": "4aadd03a-6c25-47e2-8dfb-3e47b56ff492", "colab": { "base_uri": 
"https://localhost:8080/", "height": 204 } }, "cell_type": "code", "source": [ "df = multisort(\"fat_100g\")\n", "df.head()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>fat_100g</th><th>carbohydrates_100g</th><th>sugars_100g</th><th>proteins_100g</th><th>salt_100g</th><th>reconstructed_energy</th><th>product</th><th>high_protein</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>8254</th><td>25.00</td><td>100.00</td><td>100.0</td><td>0.00</td><td>0.00000</td><td>2675.00</td><td>Princess Mix Decorations</td><td>low_protein</td></tr>\n",
 "    <tr><th>8255</th><td>25.00</td><td>100.00</td><td>100.0</td><td>0.00</td><td>0.00000</td><td>2675.00</td><td>Frosted Mix</td><td>low_protein</td></tr>\n",
 "    <tr><th>8253</th><td>12.50</td><td>100.00</td><td>100.0</td><td>0.00</td><td>0.00000</td><td>2187.50</td><td>Holiday Happiness Mix</td><td>low_protein</td></tr>\n",
 "    <tr><th>9371</th><td>1.79</td><td>85.71</td><td>100.0</td><td>7.14</td><td>0.04572</td><td>1648.26</td><td>Organic Just Cherries</td><td>low_protein</td></tr>\n",
 "    <tr><th>222</th><td>0.00</td><td>100.00</td><td>100.0</td><td>0.00</td><td>0.00000</td><td>1700.00</td><td>Tnt Exploding Candy</td><td>low_protein</td></tr>\n",
 "  </tbody>\n",
 "</table>\n", "
" ], "text/plain": [ " fat_100g carbohydrates_100g sugars_100g proteins_100g salt_100g \\\n", "8254 25.00 100.00 100.0 0.00 0.00000 \n", "8255 25.00 100.00 100.0 0.00 0.00000 \n", "8253 12.50 100.00 100.0 0.00 0.00000 \n", "9371 1.79 85.71 100.0 7.14 0.04572 \n", "222 0.00 100.00 100.0 0.00 0.00000 \n", "\n", " reconstructed_energy product high_protein \n", "8254 2675.00 Princess Mix Decorations low_protein \n", "8255 2675.00 Frosted Mix low_protein \n", "8253 2187.50 Holiday Happiness Mix low_protein \n", "9371 1648.26 Organic Just Cherries low_protein \n", "222 1700.00 Tnt Exploding Candy low_protein " ] }, "metadata": { "tags": [] }, "execution_count": 57 } ] }, { "metadata": { "id": "yV-V_lB_uCIe", "colab_type": "text" }, "cell_type": "markdown", "source": [ "Find sugary and salty food" ] }, { "metadata": { "id": "cb7_LvMAuXnL", "colab_type": "code", "outputId": "175e1e94-4a85-4e65-ea4a-8524e238b068", "colab": { "base_uri": "https://localhost:8080/", "height": 238 } }, "cell_type": "code", "source": [ "df = multisort(\"salt_100g\")\n", "df.head()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>fat_100g</th><th>carbohydrates_100g</th><th>sugars_100g</th><th>proteins_100g</th><th>salt_100g</th><th>reconstructed_energy</th><th>product</th><th>high_protein</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>33151</th><td>0.0</td><td>0.0</td><td>100.0</td><td>0.0</td><td>71.120</td><td>0.0</td><td>Turkey Brine Kit, Garlic &amp; Herb</td><td>low_protein</td></tr>\n",
 "    <tr><th>24783</th><td>0.0</td><td>100.0</td><td>100.0</td><td>0.0</td><td>24.130</td><td>1700.0</td><td>Seasoning</td><td>low_protein</td></tr>\n",
 "    <tr><th>4073</th><td>0.0</td><td>100.0</td><td>100.0</td><td>0.0</td><td>7.620</td><td>1700.0</td><td>Seasoning Rub, Sweet &amp; Spicy Seafood</td><td>low_protein</td></tr>\n",
 "    <tr><th>10282</th><td>0.0</td><td>100.0</td><td>100.0</td><td>0.0</td><td>2.540</td><td>1700.0</td><td>Instant Pectin</td><td>low_protein</td></tr>\n",
 "    <tr><th>17880</th><td>0.0</td><td>100.0</td><td>100.0</td><td>0.0</td><td>0.635</td><td>1700.0</td><td>Cranberry Cosmos Cocktail Rimming Sugar</td><td>low_protein</td></tr>\n",
 "  </tbody>\n",
 "</table>\n", "
" ], "text/plain": [ " fat_100g carbohydrates_100g sugars_100g proteins_100g salt_100g \\\n", "33151 0.0 0.0 100.0 0.0 71.120 \n", "24783 0.0 100.0 100.0 0.0 24.130 \n", "4073 0.0 100.0 100.0 0.0 7.620 \n", "10282 0.0 100.0 100.0 0.0 2.540 \n", "17880 0.0 100.0 100.0 0.0 0.635 \n", "\n", " reconstructed_energy product \\\n", "33151 0.0 Turkey Brine Kit, Garlic & Herb \n", "24783 1700.0 Seasoning \n", "4073 1700.0 Seasoning Rub, Sweet & Spicy Seafood \n", "10282 1700.0 Instant Pectin \n", "17880 1700.0 Cranberry Cosmos Cocktail Rimming Sugar \n", "\n", " high_protein \n", "33151 low_protein \n", "24783 low_protein \n", "4073 low_protein \n", "10282 low_protein \n", "17880 low_protein " ] }, "metadata": { "tags": [] }, "execution_count": 58 } ] } ] }