{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Lesson11-Python For Data Science-Lazy-Evaluation.ipynb", "version": "0.3.2", "provenance": [], "collapsed_sections": [], "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "metadata": { "id": "Crf8Kd8NfbPU", "colab_type": "text" }, "cell_type": "markdown", "source": [ "# Lesson 11: Lazy Evaluation" ] }, { "metadata": { "id": "c_Id55m6Jsbu", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## Pragmatic AI Labs\n", "\n" ] }, { "metadata": { "id": "e5p96AqpSDZa", "colab_type": "text" }, "cell_type": "markdown", "source": [ "![alt text](https://paiml.com/images/logo_with_slogan_white_background.png)\n", "\n", "This notebook was produced by [Pragmatic AI Labs](https://paiml.com/). You can continue learning about these topics by:\n", "\n", "* Buying a copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](http://www.informit.com/store/pragmatic-ai-an-introduction-to-cloud-based-machine-9780134863917)\n", "* Reading an online copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](https://www.safaribooksonline.com/library/view/pragmatic-ai-an/9780134863924/)\n", "* Watching video [Essential Machine Learning and AI with Python and Jupyter Notebook-Video-SafariOnline](https://www.safaribooksonline.com/videos/essential-machine-learning/9780135261118) on Safari Books Online.\n", "* Watching video [AWS Certified Machine Learning-Speciality](https://learning.oreilly.com/videos/aws-certified-machine/9780135556597)\n", "* Purchasing video [Essential Machine Learning and AI with Python and Jupyter Notebook- Purchase Video](http://www.informit.com/store/essential-machine-learning-and-ai-with-python-and-jupyter-9780135261095)\n", "* Viewing more content at 
[noahgift.com](https://noahgift.com/)\n" ] }, { "metadata": { "id": "pBTeTbnRKG_k", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "YImvqAwAgAvQ", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 11.1 Use generators" ] }, { "metadata": { "id": "5DdCcggCx_mr", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Lists and Generators" ] }, { "metadata": { "id": "CI49Xf1Zpv34", "colab_type": "code", "outputId": "d03f3a87-8e38-4553-9991-7aaeaecc7287", "colab": { "base_uri": "https://localhost:8080/", "height": 51 } }, "cell_type": "code", "source": [ "# A list comprehension builds every element immediately.\n", "l_ten = [x for x in range(10)]\n", "# A generator expression produces each element only when it is requested.\n", "g_ten = (x for x in range(10))\n", "\n", "print(f\"l_ten is a {type(l_ten)} and prints as: {l_ten}\")\n", "print(f\"g_ten is a {type(g_ten)} and prints as: {g_ten}\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "l_ten is a <class 'list'> and prints as: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n", "g_ten is a <class 'generator'> and prints as: <generator object <genexpr> at 0x7fec69bbfc50>\n" ], "name": "stdout" } ] }, { "metadata": { "id": "rkYiVtcLGqtp", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Next" ] }, { "metadata": { "id": "-wNqIpQ3Givl", "colab_type": "code", "outputId": "d1ddcfed-ad94-45cf-dcf7-4c1bc567100a", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "next(g_ten)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "1" ] }, "metadata": { "tags": [] }, "execution_count": 3 } ] }, { "metadata": { "id": "YR7wQDaCyE2Q", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Iteration" ] }, { "metadata": { "id": "0k0L73YjvAxz", "colab_type": "code", "outputId": "089fa83e-c3d8-4c4a-a428-7941cd35d338", "colab": { "base_uri": "https://localhost:8080/", "height": 187 } }, "cell_type": "code", "source": [ "for x in g_ten:\n", "    print(x)\n" ], "execution_count": 0, 
"outputs": [ { "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n", "5\n", "6\n", "7\n", "8\n", "9\n" ], "name": "stdout" } ] }, { "metadata": { "id": "rowGF4XOf8Ka", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Indexing" ] }, { "metadata": { "id": "i6jk8wqS_Uuw", "colab_type": "code", "outputId": "9373abdd-37ec-487f-c7fb-8573ef2c36bd", "colab": { "base_uri": "https://localhost:8080/", "height": 181 } }, "cell_type": "code", "source": [ "\n", "g_ten[3]" ], "execution_count": 0, "outputs": [ { "output_type": "error", "ename": "TypeError", "evalue": "ignored", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mg_ten\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: 'generator' object is not subscriptable" ] } ] }, { "metadata": { "id": "cPELitYxgKCV", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Size" ] }, { "metadata": { "id": "PDL0ne2AvGwr", "colab_type": "code", "outputId": "c4d8625d-20c5-4e33-e3ee-1d0fbbaff264", "colab": { "base_uri": "https://localhost:8080/", "height": 51 } }, "cell_type": "code", "source": [ "import sys\n", "x = 100000000\n", "l_big = [x for x in range(x)]\n", "g_big = (x for x in range(x))\n", "\n", "print( f\"l_big is {sys.getsizeof(l_big)} bytes\")\n", "print( f\"g_big is {sys.getsizeof(g_big)} bytes\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "l_big is 859724472 bytes\n", "g_big is 88 bytes\n" ], "name": "stdout" } ] }, { "metadata": { "id": "f0hPWewPhEqs", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 11.2 Design 
generator pipelines" ] }, { "metadata": { "id": "bcf4O0eeG_SH", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Stringing generators together" ] }, { "metadata": { "id": "Upc6d_95p3nW", "colab_type": "code", "outputId": "9f50257a-2ce7-4c28-b4bd-5233ed8ab1cf", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "# Each stage wraps the previous one; nothing runs until next() is called.\n", "evens = (x*2 for x in range(5000000))\n", "three_factors = (x//3 for x in evens if x%3 == 0)\n", "titles = (f\"this number is {x}\" for x in three_factors)\n", "capped = (x.title() for x in titles)\n", "\n", "print(f\"The first call to capped: {next(capped)}\")\n", "print(f\"The second call to capped: {next(capped)}\")\n", "print(f\"The third call to capped: {next(capped)}\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "The first call to capped: This Number Is 0\n", "The second call to capped: This Number Is 2\n", "The third call to capped: This Number Is 4\n" ], "name": "stdout" } ] }, { "metadata": { "id": "Uw-d7YLvHFJ9", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Why use lazy evaluation\n", "Lazy evaluation lets you process a large dataset in small pieces instead of loading it all into memory at once.\n", "\n", "Example: extracting the salt and protein values of organic foods." ] }, { "metadata": { "id": "DQM-57LbN8xN", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Define generator to read file line by line" ] }, { "metadata": { "id": "9e-PhHF5I0Y5", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def row_reader(file_path):\n", "    # Yield one line at a time; the file is closed once the generator is exhausted or closed.\n", "    with open(file_path, 'r') as f:\n", "        for line in f:\n", "            yield line" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "41Sunl1DOhxb", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Read the file with the generator" ] }, { "metadata": { "id": "InaYYM8uHJgJ", "colab_type": "code", "outputId": "1888f219-d8d1-4240-a617-85fdd6fdc8f3", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "file_path = 
'./features.en.openfoodfacts.org.products.csv'\n", "\n", "rows = row_reader(file_path)\n", "rows" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "" ] }, "metadata": { "tags": [] }, "execution_count": 100 } ] }, { "metadata": { "id": "uhIjFYv-P7HL", "colab_type": "code", "outputId": "71c72516-8cc2-4968-dae4-6df87f694e95", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "next(rows)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'3,57.14,17.86,3.57,17.86,1.22428,2540,2835.7,92.86,0,Organic Salted Nut Mix\\n'" ] }, "metadata": { "tags": [] }, "execution_count": 104 } ] }, { "metadata": { "id": "u7XEQ_IxPnom", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Generator pipeline to process one line at a time" ] }, { "metadata": { "id": "MQKlrhuxJCAH", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def row_reader(file_path):\n", "    # Stage 1: read the file lazily, one line at a time.\n", "    line_reader = (x for x in open(file_path, 'r'))\n", "    # Stage 2: split each line once and keep rows whose name starts with 'Organic'.\n", "    split_rows = (x.split(',') for x in line_reader)\n", "    organics_only = (x for x in split_rows if x[-1].startswith('Organic'))\n", "    # Stage 3: keep only the name, salt, and protein fields.\n", "    name_salt_protein = ((x[-1], x[-6], x[-7]) for x in organics_only)\n", "    return name_salt_protein\n", "\n", "\n", "rows = row_reader(file_path)" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "de771jMOQFoX", "colab_type": "code", "outputId": "bea9f0ad-fac3-46b8-8eff-8ab8281d1c87", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "next(rows)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "('Organic Oat Groats\\n', '0.0254', '16.67')" ] }, "metadata": { "tags": [] }, "execution_count": 109 } ] }, { "metadata": { "id": "a3MlETh6Lxp8", "colab_type": "code", "outputId": "8245617e-7d28-4a5b-8400-232362b98adb", "colab": { "base_uri": "https://localhost:8080/", 
"height": 1969 } }, "cell_type": "code", "source": [ "import pandas\n", "organics = pandas.DataFrame(columns=['Name', 'Salt', 'Protein'])\n", "\n", "rows = row_reader(file_path)\n", "\n", "# Pull rows through the pipeline one at a time and append each to the DataFrame.\n", "for new_row in rows:\n", "    organics.loc[len(organics)] = new_row\n", "\n", "organics" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
" ], "text/plain": [ " Name Salt Protein\n", "0 Organic Salted Nut Mix\\n 1.22428 17.86\n", "1 Organic Muesli\\n 0.1397 14.06\n", "2 Organic Hazelnuts\\n 0.01016 14.29\n", "3 Organic Oat Groats\\n 0.0254 16.67\n", "4 Organic Quinoa Coconut Granola With Mango\\n 0.02286 10.91\n", "5 Organic Unswt Berry Coconut Granola\\n 0.28194 12.96\n", "6 Organic Red Quinoa\\n 0.01016 13.33\n", "7 Organic Blueberry Almond Granola\\n 0.04572 10.91\n", "8 Organic Coconut Chips\\n 0.09398 6\n", "9 Organic Garbanzo Beans\\n 0.05334 17.02\n", "10 Organic Yellow Split Peas\\n 0.05588 28.89\n", "11 Organic Trail Mix\\n 0.127 13.33\n", "12 Organic Raw Pumpkin Seeds\\n 0.04318 30\n", "13 Organic Tamari Pumpkin Seed\\n 0.97028 26.47\n", "14 Organic Harvest Pilaf\\n 0.02794 15.56\n", "15 Organic Salted Pistachios\\n 1.45034 21.43\n", "16 Organic Medjool Dates\\n 0.0127 2.2\n", "17 Organic Whole Cashews\\n 0.0381 14.71\n", "18 Organic Flourless Sprouted 7-Grain Bread\\n 0.6731 11.76\n", "19 Organic Sunny Days Snack Bars\\n 0.60198 5.26\n", "20 Organic Nine Grain All Natural Bread\\n 1.0033 11.63\n", "21 Organic 100% Whole Wheat\\n 0.82804 9.3\n", "22 Organic Great Seed\\n 0.88646 11.63\n", "23 Organic Tortellini Pasta\\n 0.381 10\n", "24 Organic Ravioli\\n 0.5588 9\n", "25 Organic Broccoli Florets\\n 0.07366 3.53\n", "26 Organic Creamy Tomato Bisque\\n 0.75692 1.22\n", "27 Organic Green Peas\\n 0.5715 5.62\n", "28 Organic Mixed Vegetable\\n 0.19304 2.35\n", "29 Organic Beef Burger\\n 0.14224 17.88\n", ".. ... ... 
...\n", "855 Organic Gummy Bears & Worms\\n 0 2.38\n", "856 Organic Fruit Flavored Snacks\\n 0.10922 4.35\n", "857 Organic Gummy Bears\\n 0 4.35\n", "858 Organic Lollipops\\n 0 0\n", "859 Organic Sour Head\\n 0 0\n", "860 Organic Buttermilk Pancake Mix\\n 2.27838 7.69\n", "861 Organic Yellow Cake Mix\\n 1.905 4.55\n", "862 Organic Double Chocolate Brownie Mix\\n 1.08966 3.57\n", "863 Organic Whole Grain Muffin Mix\\n 2.04724 5.56\n", "864 Organic 1% Lowfat Milk\\n 0.14478 3.39\n", "865 Organic Whole Milk\\n 0.1397 3.39\n", "866 Organic 2% Reduced Fat Milk\\n 0.1397 3.39\n", "867 Organic Fat Free Skim Milk\\n 0.14478 3.81\n", "868 Organic Fruit\\n 0 1.43\n", "869 Organic Vegetable Chili\\n 0.63246 2.45\n", "870 Organic Salted Butter Made With Organic Sweet ... 1.63322 0\n", "871 Organic Whole Milk\\n 0.127 3.33\n", "872 Organic 2% Reduced Fat Milk\\n 0.13208 3.33\n", "873 Organic Lowfat Milk\\n 0.13208 3.33\n", "874 Organic Milk\\n 0.13208 3.33\n", "875 Organic Milk\\n 0.13462 3.38\n", "876 Organic Brown Flax Seeds\\n 0.07874 15.38\n", "877 Organic Raw Shelled Pumpkin Seed\\n 0.04572 25\n", "878 Organic Raw Sunflower Meat\\n 0.02794 21.43\n", "879 Organic Maple Syrup\\n 0.02032 0\n", "880 Organic Super Sweet Whole Kernel Corn\\n 0.4064 1.6\n", "881 Organic Sweet Peas\\n 0.6096 3.2\n", "882 Organic Cut Green Beans\\n 0.61468 0.83\n", "883 Organic Black Beans\\n 0.254 6.15\n", "884 Organic Dark Red Kidney Beans\\n 0.27432 6.92\n", "\n", "[885 rows x 3 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 110 } ] }, { "metadata": { "id": "BR6eLrbzhGPb", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 11.3 Implement lazy evaluation functions" ] }, { "metadata": { "id": "n_kqQG7usG8e", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Generator functions" ] }, { "metadata": { "id": "kimMnJ_KusGM", "colab_type": "code", "outputId": "a978f830-746d-40ea-95d3-954b1cf47d86", "colab": { "base_uri": "https://localhost:8080/", "height": 
85 } }, "cell_type": "code", "source": [ "def square_them(numbers):\n", "    # Yield each square on demand instead of building a full list.\n", "    for number in numbers:\n", "        yield number * number\n", "\n", "\n", "s = square_them(range(10000))\n", "\n", "print(next(s))\n", "print(next(s))\n", "print(next(s))\n", "print(next(s))" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "0\n", "1\n", "4\n", "9\n" ], "name": "stdout" } ] }, { "metadata": { "id": "HW-lco4AuoQj", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Infinite generators" ] }, { "metadata": { "id": "vv-uU-F_fapM", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "def counter(d):\n", "    # Never terminates on its own; the caller decides when to stop asking.\n", "    while True:\n", "        d += 1\n", "        yield d" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "dgJGyw-UsEng", "colab_type": "code", "outputId": "14a5d37a-2af1-4bb2-853c-6509286a90f8", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "c = counter(10)\n", "\n", "print(next(c))\n", "print(next(c))\n", "print(next(c))" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "11\n", "12\n", "13\n" ], "name": "stdout" } ] }, { "metadata": { "id": "v0VTQqKNtQ4k", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Other forms of lazy evaluation" ] }, { "metadata": { "id": "PyYJRYHNtW9_", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "import time\n", "\n", "def some_expensive_connection():\n", "    # Stand-in for a slow setup step, such as opening a database connection.\n", "    time.sleep(10)\n", "    return {}\n", "\n", "_DB = None\n", "\n", "def DB():\n", "    # Create the connection on first use, then reuse the cached object.\n", "    global _DB\n", "    if _DB is None:\n", "        _DB = some_expensive_connection()\n", "    return _DB" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "5NGAfQcrVIBx", "colab_type": "text" }, "cell_type": "markdown", "source": [ "# File setup" ] }, { "metadata": { "id": "vy1L87IJJ962", "colab_type": "code", "outputId": "7b118d1d-8884-443a-fd9e-b19ce4a1b55e", "colab": { "resources": { 
"http://localhost:8080/nbextensions/google.colab/files.js": { "data": "", "ok": true, "headers": [ [ "content-type", "application/javascript" ] ], "status": 200, "status_text": "" } }, "base_uri": "https://localhost:8080/", "height": 105 } }, "cell_type": "code", "source": [ "from google.colab import files\n", "# /Users/kbehrman/Google-Drive/projects/pragailabs/python-for-data-science/food/data\n", "files.upload()\n", "!ls" ], "execution_count": 0, "outputs": [ { "output_type": "display_data", "data": { "text/html": [ "\n", " \n", " \n", " Upload widget is only available when the cell has been executed in the\n", " current browser session. 
Please rerun this cell to enable.\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": { "tags": [] } }, { "output_type": "stream", "text": [ "'features.en.openfoodfacts.org.products (1).csv'\n", "'features.en.openfoodfacts.org.products (2).csv'\n", " features.en.openfoodfacts.org.products.csv\n", " sample_data\n" ], "name": "stdout" } ] } ] }