{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Lesson5-Python For Data Science-Python-Data-structure.ipynb", "version": "0.3.2", "provenance": [], "collapsed_sections": [ "_rehYX145nEk", "g4-XjayC8KO7" ], "include_colab_link": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "metadata": { "colab_type": "text", "id": "s39SnnxrZxsx" }, "cell_type": "markdown", "source": [ "# Lesson 5: Python Data Structures\n" ] }, { "metadata": { "id": "c_Id55m6Jsbu", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## Pragmatic AI Labs\n", "\n" ] }, { "metadata": { "id": "e5p96AqpSDZa", "colab_type": "text" }, "cell_type": "markdown", "source": [ "![alt text](https://paiml.com/images/logo_with_slogan_white_background.png)\n", "\n", "This notebook was produced by [Pragmatic AI Labs](https://paiml.com/). 
You can continue learning about these topics by:\n", "\n", "* Buying a copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](http://www.informit.com/store/pragmatic-ai-an-introduction-to-cloud-based-machine-9780134863917)\n", "* Reading an online copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](https://www.safaribooksonline.com/library/view/pragmatic-ai-an/9780134863924/)\n", "* Watching video [Essential Machine Learning and AI with Python and Jupyter Notebook-Video-SafariOnline](https://www.safaribooksonline.com/videos/essential-machine-learning/9780135261118) on Safari Books Online.\n", "* Watching video [AWS Certified Machine Learning-Speciality](https://learning.oreilly.com/videos/aws-certified-machine/9780135556597)\n", "* Purchasing video [Essential Machine Learning and AI with Python and Jupyter Notebook- Purchase Video](http://www.informit.com/store/essential-machine-learning-and-ai-with-python-and-jupyter-9780135261095)\n", "* Viewing more content at [noahgift.com](https://noahgift.com/)\n" ] }, { "metadata": { "colab_type": "text", "id": "0mnOHaZpZ1CU" }, "cell_type": "markdown", "source": [ "## 5.1 Use lists and tuples" ] }, { "metadata": { "colab_type": "text", "id": "aCBZvOZn3V6D" }, "cell_type": "markdown", "source": [ "### Sequences\n", "Lists, tuples, and strings are all Python sequences and share many of the same methods."
] }, { "metadata": { "colab_type": "text", "id": "MgQzM2FbzhpW" }, "cell_type": "markdown", "source": [ "### Creating an empty list" ] }, { "metadata": { "colab_type": "code", "id": "1gUdm3jfzlCB", "outputId": "a84844d5-a2f6-43f5-f587-8e00535ea56f", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "empty = []\n", "empty" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "[]" ] }, "metadata": { "tags": [] }, "execution_count": 4 } ] }, { "metadata": { "colab_type": "text", "id": "fLx7Rtstz3Pn" }, "cell_type": "markdown", "source": [ "### Using square brackets with initial values" ] }, { "metadata": { "colab_type": "code", "id": "X6_VTC9moTAM", "outputId": "c218cea9-dbf7-4b5d-d8b2-4915070562a0", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "numbers = [1, 2, 3]\n", "numbers\n" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "[1, 2, 3]" ] }, "metadata": { "tags": [] }, "execution_count": 5 } ] }, { "metadata": { "colab_type": "text", "id": "RN5dAwcv4EwQ" }, "cell_type": "markdown", "source": [ "### Casting an iterable\n", "Any iterable can be cast to a list" ] }, { "metadata": { "colab_type": "code", "id": "IWCOdiiJ4Iv5", "outputId": "63d56640-a0c9-4f64-8882-b61950a8a059", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "numbers = list(range(10))\n", "numbers" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]" ] }, "metadata": { "tags": [] }, "execution_count": 6 } ] }, { "metadata": { "colab_type": "text", "id": "m7FQdEVcAMJl" }, "cell_type": "markdown", "source": [ "### Creating using multiplication" ] }, { "metadata": { "colab_type": "code", "id": "lanCHsNZATqK", "outputId": "afdc49d3-928d-4854-9ff4-4ef2709d10d7", "colab": { 
"base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "num_players = 10\n", "scores = [0] * num_players\n", "scores" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]" ] }, "metadata": { "tags": [] }, "execution_count": 7 } ] }, { "metadata": { "colab_type": "text", "id": "h9l7kUOC43iL" }, "cell_type": "markdown", "source": [ "### Mixing data types\n", "Lists can contain multple data types" ] }, { "metadata": { "colab_type": "code", "id": "jBqfcq6Q4-Yl", "outputId": "38f14f53-4a84-4c4a-d56b-3254d72bd0a7", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "mixed = ['a', 1, 2.0, [13], {}]\n", "mixed" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['a', 1, 2.0, [13], {}]" ] }, "metadata": { "tags": [] }, "execution_count": 8 } ] }, { "metadata": { "colab_type": "text", "id": "_rehYX145nEk" }, "cell_type": "markdown", "source": [ "### Indexing\n", "Items in lists can be accessed using indices in a similar fashion to strings." 
] }, { "metadata": { "colab_type": "text", "id": "PuGNKkIV5_64" }, "cell_type": "markdown", "source": [ "#### Access first item" ] }, { "metadata": { "colab_type": "code", "id": "98QVzpN_ogFQ", "outputId": "93224fef-7b66-492e-851c-62c71a4d3eb5", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "numbers[0]\n" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0" ] }, "metadata": { "tags": [] }, "execution_count": 9 } ] }, { "metadata": { "colab_type": "text", "id": "wYljuMmX6FDo" }, "cell_type": "markdown", "source": [ "#### Access last item" ] }, { "metadata": { "colab_type": "code", "id": "j5XB0hVZ6S5E", "outputId": "702efb97-d245-4b06-c028-fe8ae61288e6", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "numbers[-1]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "9" ] }, "metadata": { "tags": [] }, "execution_count": 11 } ] }, { "metadata": { "colab_type": "text", "id": "tZJYdQW87vwk" }, "cell_type": "markdown", "source": [ "#### Access any item" ] }, { "metadata": { "colab_type": "code", "id": "7EJZyQUl7y_5", "outputId": "433c5b3d-51f9-44ee-bb1b-d8d162c744bf", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "numbers[4]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "4" ] }, "metadata": { "tags": [] }, "execution_count": 12 } ] }, { "metadata": { "colab_type": "text", "id": "g4-XjayC8KO7" }, "cell_type": "markdown", "source": [ "### Adding to a list" ] }, { "metadata": { "colab_type": "text", "id": "VVoxc0Co81iD" }, "cell_type": "markdown", "source": [ "#### Append to the end of a list" ] }, { "metadata": { "colab_type": "code", "id": "7l9O1BOz89Sg", "outputId": "f5ed7734-afd3-4dcd-8311-ac809622b29b", "colab": { "base_uri": "https://localhost:8080/",
"height": 34 } }, "cell_type": "code", "source": [ "letters = ['a']\n", "letters.append('c')\n", "letters" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['a', 'c']" ] }, "metadata": { "tags": [] }, "execution_count": 14 } ] }, { "metadata": { "colab_type": "text", "id": "GOWYij2p9bwL" }, "cell_type": "markdown", "source": [ "#### Insert at beginning of list" ] }, { "metadata": { "colab_type": "code", "id": "KgMcKp5W9fI7", "outputId": "5f5acb75-5715-456a-f63b-76d675676766", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters.insert(0, 'b')\n", "letters" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['b', 'a', 'c']" ] }, "metadata": { "tags": [] }, "execution_count": 15 } ] }, { "metadata": { "colab_type": "text", "id": "z2pfGnq7-PHc" }, "cell_type": "markdown", "source": [ "#### Insert at arbitrary position" ] }, { "metadata": { "colab_type": "code", "id": "SgovUUMS-TxT", "outputId": "2d639b6d-eaa7-4f91-b837-88136a2b403a", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters.insert(2, 'c')\n", "letters" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['b', 'a', 'c', 'c']" ] }, "metadata": { "tags": [] }, "execution_count": 16 } ] }, { "metadata": { "colab_type": "text", "id": "W2WMgepZAjkO" }, "cell_type": "markdown", "source": [ "#### Extending with another list" ] }, { "metadata": { "colab_type": "code", "id": "UYn06yndAoNH", "outputId": "a1c34ae5-d500-44b0-98a3-c4db3b47c910", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "more_letters = ['e', 'f', 'g']\n", "letters.extend(more_letters)\n", "letters" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['b', 'a', 'c', 'c', 'e', 'f', 'g']" ] }, 
"metadata": { "tags": [] }, "execution_count": 17 } ] }, { "metadata": { "colab_type": "text", "id": "wzPgAr_s_CrC" }, "cell_type": "markdown", "source": [ "### Change item at some position" ] }, { "metadata": { "colab_type": "code", "id": "BZGy8c8bov2q", "outputId": "4fc4fd41-0e1e-4324-8302-b0ce19558908", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters[3] = 'd'\n", "letters" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['b', 'a', 'c', 'd', 'e', 'f', 'g']" ] }, "metadata": { "tags": [] }, "execution_count": 18 } ] }, { "metadata": { "colab_type": "text", "id": "PGcA5_5__RZm" }, "cell_type": "markdown", "source": [ "### Swap two items" ] }, { "metadata": { "colab_type": "code", "id": "egbXdmQ__UB4", "outputId": "439400af-99d2-4d6c-aaac-f16c822bf73e", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters[0], letters[1] = letters[1], letters[0]\n", "letters" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['a', 'b', 'c', 'd', 'e', 'f', 'g']" ] }, "metadata": { "tags": [] }, "execution_count": 19 } ] }, { "metadata": { "colab_type": "text", "id": "RNirVSMaHOp4" }, "cell_type": "markdown", "source": [ "### Removing items from a list" ] }, { "metadata": { "colab_type": "text", "id": "K0ecop0OHXo_" }, "cell_type": "markdown", "source": [ "#### Pop from the end" ] }, { "metadata": { "colab_type": "code", "id": "ZfFG3MZ7HdXa", "outputId": "fe842c28-5813-4286-df92-908864af1f53", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters = ['a', 'b', 'c', 'd', 'e', 'f']\n", "letters.pop()\n", "letters" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['a', 'b', 'c', 'd', 'e']" ] }, "metadata": { "tags": [] }, "execution_count": 20 } ] }, { "metadata": { 
"colab_type": "text", "id": "WSm1EhxBH8a2" }, "cell_type": "markdown", "source": [ "#### Pop by index" ] }, { "metadata": { "colab_type": "code", "id": "R62Fg9l4IAYV", "outputId": "a6bc87d5-46e8-4383-ced0-e53c67026e43", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters.pop(2)\n", "letters" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['a', 'b', 'd', 'e']" ] }, "metadata": { "tags": [] }, "execution_count": 21 } ] }, { "metadata": { "colab_type": "text", "id": "wDhR89qxIUEh" }, "cell_type": "markdown", "source": [ "#### Remove specific item" ] }, { "metadata": { "colab_type": "code", "id": "cAyAJIeOpYrU", "outputId": "60762469-653a-4e89-8ffc-4082273b728e", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters.remove('d')\n", "letters" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['a', 'b', 'e']" ] }, "metadata": { "tags": [] }, "execution_count": 22 } ] }, { "metadata": { "colab_type": "text", "id": "D9_M_6cwUAaX" }, "cell_type": "markdown", "source": [ "### Create tuple using brackets" ] }, { "metadata": { "colab_type": "code", "id": "7Zqb_MU2UEJa", "outputId": "0b125537-00c5-4bd2-85d4-3dc130fcf6b7", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "tup = (1, 2, 3)\n", "tup" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(1, 2, 3)" ] }, "metadata": { "tags": [] }, "execution_count": 23 } ] }, { "metadata": { "colab_type": "text", "id": "ep3XqjhqUIS1" }, "cell_type": "markdown", "source": [ "### Create tuple with commas" ] }, { "metadata": { "colab_type": "code", "id": "Oz8dkzlzUNEe", "outputId": "c8b25da2-8291-406f-b3e3-614cb7c06bba", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "tup = 
1, 2, 3\n", "tup" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(1, 2, 3)" ] }, "metadata": { "tags": [] }, "execution_count": 24 } ] }, { "metadata": { "colab_type": "text", "id": "p5Ie15xNUvK-" }, "cell_type": "markdown", "source": [ "### Create empty tuple" ] }, { "metadata": { "colab_type": "code", "id": "6M3eLXfXUxLX", "outputId": "83081a8c-9e5b-4a5a-bd3e-248f805e6da7", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "tup = ()\n", "tup" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "()" ] }, "metadata": { "tags": [] }, "execution_count": 25 } ] }, { "metadata": { "colab_type": "text", "id": "0YKr2HU6UzgT" }, "cell_type": "markdown", "source": [ "### Create tuple with single item" ] }, { "metadata": { "colab_type": "code", "id": "9ib336cLU3iu", "outputId": "1f47d8eb-ed90-4c0e-d759-e7aa76f4b11d", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "tup = 1,\n", "tup" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(1,)" ] }, "metadata": { "tags": [] }, "execution_count": 28 } ] }, { "metadata": { "colab_type": "text", "id": "a_eqCY6XTy1x" }, "cell_type": "markdown", "source": [ "### Behaviours shared by lists and tuples\n", "The following sequence behaviors are shared by lists and tuples" ] }, { "metadata": { "colab_type": "text", "id": "HpNqX6QFLmdK" }, "cell_type": "markdown", "source": [ "### Check item in sequence" ] }, { "metadata": { "colab_type": "code", "id": "NpIdl5Cfp7-f", "outputId": "1f78c614-29a9-42d7-d2da-300df5698dfc", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "3 in (1, 2, 3, 4, 5)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, 
"execution_count": 29 } ] }, { "metadata": { "colab_type": "text", "id": "Ms_bF8BjL79W" }, "cell_type": "markdown", "source": [ "### Check item not in sequence" ] }, { "metadata": { "colab_type": "code", "id": "gLXaJE6EMGyb", "outputId": "786985a3-a630-4617-a4d3-babd7fca9e77", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'a' not in [1, 2, 3, 4, 5]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, "execution_count": 30 } ] }, { "metadata": { "colab_type": "text", "id": "UNKJxMb6Mipn" }, "cell_type": "markdown", "source": [ "### Slicing" ] }, { "metadata": { "colab_type": "text", "id": "r_fLVxRhNWLA" }, "cell_type": "markdown", "source": [ "#### Setting start, slice to the end" ] }, { "metadata": { "colab_type": "code", "id": "098SVCIvsibb", "outputId": "8a737bd9-05af-4f95-fdc4-aa882fef47c9", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters = 'a', 'b', 'c', 'd', 'e', 'f'\n", "letters[3:4]\n" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "('d',)" ] }, "metadata": { "tags": [] }, "execution_count": 32 } ] }, { "metadata": { "colab_type": "text", "id": "dsNvabgmNeqw" }, "cell_type": "markdown", "source": [ "#### Set end, slice from beginning" ] }, { "metadata": { "colab_type": "code", "id": "QVNJeYVtNh56", "outputId": "249d9c60-e20c-4d78-dc02-67b74b17fe17", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters[:4]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "('a', 'b', 'c', 'd')" ] }, "metadata": { "tags": [] }, "execution_count": 33 } ] }, { "metadata": { "colab_type": "text", "id": "Lrd7HuHnNvIi" }, "cell_type": "markdown", "source": [ "#### Index from end of sequence" ] }, { "metadata": { 
"colab_type": "code", "id": "9XiynpN9M9V_", "outputId": "3b8497e3-1219-496f-ec24-6c99dbd3ff8c", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters[-4:]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "('c', 'd', 'e', 'f')" ] }, "metadata": { "tags": [] }, "execution_count": 34 } ] }, { "metadata": { "colab_type": "text", "id": "Lzkr-snEOEE9" }, "cell_type": "markdown", "source": [ "#### Setting step" ] }, { "metadata": { "colab_type": "code", "id": "fCzpKpbSOGy-", "outputId": "e0ef8366-b712-45ca-fcb7-d3db915adae6", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters[1::-2]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "('b',)" ] }, "metadata": { "tags": [] }, "execution_count": 36 } ] }, { "metadata": { "colab_type": "text", "id": "IR6wtY_oJlSv" }, "cell_type": "markdown", "source": [ "### Unpacking" ] }, { "metadata": { "colab_type": "code", "id": "iUY-WFVvP82h", "outputId": "928e9082-4ac2-4a18-a20f-d369c9e7e713", "colab": { "base_uri": "https://localhost:8080/", "height": 198 } }, "cell_type": "code", "source": [ "first, middle = [1, 2, 3]\n", "\n", "f\"first = {first}, middle = {middle}, last = {last}\"" ], "execution_count": 0, "outputs": [ { "output_type": "error", "ename": "ValueError", "evalue": "ignored", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mfirst\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmiddle\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0;36m3\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;34mf\"first = {first}, middle = {middle}, last = {last}\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mValueError\u001b[0m: too many values to unpack (expected 2)" ] } ] }, { "metadata": { "colab_type": "text", "id": "Ryn60MRRQLhE" }, "cell_type": "markdown", "source": [ "### Extended unpacking" ] }, { "metadata": { "colab_type": "code", "id": "5olgXZcwQOwY", "outputId": "1c3babb9-a0b5-4018-cbeb-b5ea287d2e04", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "first, *middle, last = (1, 2, 3, 4, 5)\n", "\n", "f\"first = {first}, middle = {middle}, last = {last}\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'first = 1, middle = [2, 3, 4], last = 5'" ] }, "metadata": { "tags": [] }, "execution_count": 42 } ] }, { "metadata": { "colab_type": "text", "id": "uRwRq8FkSMxF" }, "cell_type": "markdown", "source": [ "### Using list as Stack\n", "A stack is a LIFO (last in, first out) data structure which can be simulated using a list" ] }, { "metadata": { "colab_type": "text", "id": "UMhxPw8tV2ot" }, "cell_type": "markdown", "source": [ "#### Push onto the stack using append" ] }, { "metadata": { "colab_type": "code", "id": "47_iDnO6V6ut", "outputId": "18c3ccf8-0d66-4681-e576-387985c81f82", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "stack = []\n", "stack.append('first on')\n", "stack.append('second on')\n", "stack.append('third on')\n", "stack" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['first on', 'second on', 'third on']" ] }, "metadata": { "tags": [] }, "execution_count": 43 } ] }, { "metadata": { "colab_type": "text", "id": 
"M_wrgcjcWUij" }, "cell_type": "markdown", "source": [ "#### Retrieve items, last one first using **pop**" ] }, { "metadata": { "colab_type": "code", "id": "mY_Wbh9-WZq_", "outputId": "b5859dbe-52fb-47b8-b9ed-2f799e45be69", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "f\"Retrieved first: {stack.pop()!r}, retrieved second: {stack.pop()!r}, retrieved last: {stack.pop()!r}\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"Retrieved first: 'third on', retrieved second: 'second on', retrieved last: 'first on'\"" ] }, "metadata": { "tags": [] }, "execution_count": 44 } ] }, { "metadata": { "colab_type": "text", "id": "SxpzGoM_Z_RU" }, "cell_type": "markdown", "source": [ "## 5.2 Explore dictionaries \n", "Dictionaries are mappings of key value pairs." ] }, { "metadata": { "colab_type": "text", "id": "l9nWQuW1oLCE" }, "cell_type": "markdown", "source": [ "### Create an empty dict using constructor" ] }, { "metadata": { "colab_type": "code", "id": "45C_FS-eoR-3", "outputId": "bd814a64-d7e0-4577-fbe0-46c721330ab1", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "dictionary = {}\n", "dictionary" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{}" ] }, "metadata": { "tags": [] }, "execution_count": 46 } ] }, { "metadata": { "colab_type": "text", "id": "uwNpFQFGo0C_" }, "cell_type": "markdown", "source": [ "### Create a dictionary based on key/value pairs" ] }, { "metadata": { "colab_type": "code", "id": "VWYvp8peo5ok", "outputId": "ef583270-e5dc-4902-a524-696345ff0db1", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "key_values = [['key-1','value-1'], ['key-2', 'value-2']]\n", "dictionary = dict(key_values)\n", "dictionary" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { 
"text/plain": [ "{'key-1': 'value-1', 'key-2': 'value-2'}" ] }, "metadata": { "tags": [] }, "execution_count": 47 } ] }, { "metadata": { "colab_type": "text", "id": "oLd8LV02ofza" }, "cell_type": "markdown", "source": [ "### Create an empty dict using curley braces" ] }, { "metadata": { "colab_type": "code", "id": "El6HcTagolLw", "outputId": "696a4405-368b-4113-a0d1-99566e249446", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "dictionary = {}\n", "dictionary" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{}" ] }, "metadata": { "tags": [] }, "execution_count": 48 } ] }, { "metadata": { "colab_type": "text", "id": "yXNaZRMdpQoK" }, "cell_type": "markdown", "source": [ "### Use curley braces to create a dictionary with initial key/values" ] }, { "metadata": { "colab_type": "code", "id": "oRTWGBtvpYcb", "outputId": "922c363f-d506-4eb0-81bb-a7301155bdf7", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "dictionary = {'key-1': 'value-1',\n", " 'key-2': 'value-2'}\n", "\n", "dictionary" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{'key-1': 'value-1', 'key-2': 'value-2'}" ] }, "metadata": { "tags": [] }, "execution_count": 49 } ] }, { "metadata": { "colab_type": "text", "id": "vb9aP6o5pv_B" }, "cell_type": "markdown", "source": [ "### Access value using key" ] }, { "metadata": { "colab_type": "code", "id": "3-jz1H8Apzgm", "outputId": "3970a9a3-a15c-4a80-a2fc-81ad55cc1717", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "dictionary['key-1']" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'value-1'" ] }, "metadata": { "tags": [] }, "execution_count": 51 } ] }, { "metadata": { "colab_type": "text", "id": "UBSLRWEeqE-W" }, "cell_type": "markdown", "source": 
[ "### Add a key/value pair to an existing dictionary" ] }, { "metadata": { "colab_type": "code", "id": "J98co3mWqJ5I", "outputId": "4ec4724f-aeca-48db-d6c9-84fef9a8c849", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "dictionary['key-3'] = 'value-3'\n", "\n", "dictionary" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{'key-1': 'value-1', 'key-2': 'value-2', 'key-3': 'value-3'}" ] }, "metadata": { "tags": [] }, "execution_count": 52 } ] }, { "metadata": { "colab_type": "text", "id": "xmHWK2hHq7c_" }, "cell_type": "markdown", "source": [ "### Update value for existing key" ] }, { "metadata": { "colab_type": "code", "id": "VrV2r-vUq-JV", "outputId": "c7960849-de81-461f-eed5-1806f9a01ceb", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "dictionary['key-2'] = 'new-value-2'\n", "dictionary['key-2']" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'new-value-2'" ] }, "metadata": { "tags": [] }, "execution_count": 53 } ] }, { "metadata": { "colab_type": "text", "id": "_ot60Snlra6K" }, "cell_type": "markdown", "source": [ "### Get keys" ] }, { "metadata": { "colab_type": "code", "id": "Lv726tMhrYnh", "outputId": "23089819-c637-4e89-a48e-ee5507b4b550", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "list(dictionary.keys())" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['key-1', 'key-2', 'key-3']" ] }, "metadata": { "tags": [] }, "execution_count": 55 } ] }, { "metadata": { "colab_type": "text", "id": "WGJyKhKgrf11" }, "cell_type": "markdown", "source": [ "### Get values" ] }, { "metadata": { "colab_type": "code", "id": "F7F-fNMMrhT5", "outputId": "f9ace37c-d4de-450d-b4e3-c6b06790fbd0", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, 
"cell_type": "code", "source": [ "dictionary.values()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "dict_values(['value-1', 'new-value-2', 'value-3'])" ] }, "metadata": { "tags": [] }, "execution_count": 56 } ] }, { "metadata": { "colab_type": "text", "id": "tCpdFg8JrqKM" }, "cell_type": "markdown", "source": [ "### Get iterable keys and items" ] }, { "metadata": { "colab_type": "code", "id": "0dZRJamArlLg", "outputId": "1d43a628-2f5e-4d4e-c179-b7cbc2373c37", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "dictionary.items()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "dict_items([('key-1', 'value-1'), ('key-2', 'new-value-2'), ('key-3', 'value-3')])" ] }, "metadata": { "tags": [] }, "execution_count": 57 } ] }, { "metadata": { "colab_type": "text", "id": "Cvk5OCmQrvkd" }, "cell_type": "markdown", "source": [ "### Use items in for loop" ] }, { "metadata": { "colab_type": "code", "id": "bzYEfTEWrxno", "outputId": "b8b1427b-7eb0-4f4c-c36c-3b526d9a09dd", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "for key, value in dictionary.items():\n", " print(f\"{key}: {value}\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "key-1: value-1\n", "key-2: new-value-2\n", "key-3: value-3\n" ], "name": "stdout" } ] }, { "metadata": { "colab_type": "text", "id": "9fqITS30sJC6" }, "cell_type": "markdown", "source": [ "### Check if dictionary has key\n", "The 'in' syntax we used with sequences checks the dicts keys for membership." 
] }, { "metadata": { "colab_type": "code", "id": "c1XdFLXNsWVq", "outputId": "fc5b766b-be93-4d8f-95e0-fd17f981bee0", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'key-5' in dictionary" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "False" ] }, "metadata": { "tags": [] }, "execution_count": 60 } ] }, { "metadata": { "colab_type": "text", "id": "G4Z2Pk3VsyUJ" }, "cell_type": "markdown", "source": [ "### Get method" ] }, { "metadata": { "colab_type": "code", "id": "FnHIgV8_s1eQ", "outputId": "50b29063-b0c2-44b8-bd97-4b1da21282a7", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "dictionary.get(\"bad key\", \"default value\")" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'default value'" ] }, "metadata": { "tags": [] }, "execution_count": 64 } ] }, { "metadata": { "colab_type": "text", "id": "YmWqelbJthVB" }, "cell_type": "markdown", "source": [ "### Remove item" ] }, { "metadata": { "colab_type": "code", "id": "tOLuWnHEtkjT", "outputId": "e90cf788-09a9-4e0a-9231-b29f8c070fab", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "del(dictionary['key-1'])\n", "dictionary" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{'key-2': 'new-value-2', 'key-3': 'value-3'}" ] }, "metadata": { "tags": [] }, "execution_count": 65 } ] }, { "metadata": { "id": "AnFGTgxCiZnk", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Keys must be immutable" ] }, { "metadata": { "id": "rnKNsxuZinvy", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### List as key\n", "Lists are mutable and not hashable" ] }, { "metadata": { "id": "16t9F7PPipo4", "colab_type": "code", "outputId": "ad113919-80cc-44cd-cc76-ad1d252dde27", "colab": { "base_uri": 
"https://localhost:8080/", "height": 198 } }, "cell_type": "code", "source": [ "items = ['item-1', 'item-2', 'item-3']\n", "\n", "map = {}\n", "\n", "map[items] = \"some-value\"" ], "execution_count": 0, "outputs": [ { "output_type": "error", "ename": "TypeError", "evalue": "ignored", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mmap\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m{\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0mmap\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mitems\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"some-value\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: unhashable type: 'list'" ] } ] }, { "metadata": { "id": "_9vXIamIjLER", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Tuple as a key\n", "Tuples are immutable and hence hashable" ] }, { "metadata": { "id": "K2pFCFu3jNRM", "colab_type": "code", "outputId": "5e1fc13f-b76f-45d5-9fff-359f6199972e", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "items = 'item-1', 'item-2', 'item-3'\n", "map = {}\n", "map[items] = \"some-value\"\n", "\n", "map" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{('item-1', 'item-2', 'item-3'): 'some-value'}" ] }, "metadata": { "tags": [] }, "execution_count": 67 } ] }, { "metadata": { "colab_type": "text", "id": "Y1rOvrTSaCnA" }, "cell_type": "markdown", "source": [ "## 5.3 Dive into sets" ] }, { "metadata": { "colab_type": "text", "id": "_Bu_DUont1Ks" }, "cell_type": "markdown", "source": [ "### Create 
set from tuple or list" ] }, { "metadata": { "colab_type": "code", "id": "epPCLrckt4Zy", "outputId": "df768aae-5c29-492b-a444-d60735fb0f36", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "letters = 'a', 'a', 'a', 'b', 'c'\n", "unique_letters = set(letters)\n", "unique_letters" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{'a', 'b', 'c'}" ] }, "metadata": { "tags": [] }, "execution_count": 68 } ] }, { "metadata": { "colab_type": "text", "id": "qnQHMNCRuebg" }, "cell_type": "markdown", "source": [ "### Create set from a string" ] }, { "metadata": { "colab_type": "code", "id": "IRsuJBVC_ORB", "outputId": "ffca987d-0534-4e4f-d660-24318b4c2795", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "unique_chars = set('mississippi')\n", "unique_chars" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{'i', 'm', 'p', 's'}" ] }, "metadata": { "tags": [] }, "execution_count": 69 } ] }, { "metadata": { "colab_type": "text", "id": "LUJyjEh1uzrv" }, "cell_type": "markdown", "source": [ "### Create set using curly braces" ] }, { "metadata": { "colab_type": "code", "id": "afYk3yfTu3Pt", "outputId": "8fc66ae0-c16c-45e0-d684-41757f68ee81", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "unique_num = {1, 1, 2, 3, 4, 5, 5}\n", "unique_num" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{1, 2, 3, 4, 5}" ] }, "metadata": { "tags": [] }, "execution_count": 70 } ] }, { "metadata": { "colab_type": "text", "id": "jDMYvC0avu_C" }, "cell_type": "markdown", "source": [ "### Adding to a set" ] }, { "metadata": { "colab_type": "code", "id": "hoZ7hcrBvwtc", "outputId": "1585f6f0-3122-480e-a69a-022460b0fbc7", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, 
"cell_type": "code", "source": [ "unique_num.add(6)\n", "unique_num" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{1, 2, 3, 4, 5, 6}" ] }, "metadata": { "tags": [] }, "execution_count": 72 } ] }, { "metadata": { "colab_type": "text", "id": "o-UWnVB7v8yb" }, "cell_type": "markdown", "source": [ "### Popping from a set\n", "The pop method removes and returns an arbitrary element of the set; which element is removed is not defined, so do not rely on it" ] }, { "metadata": { "colab_type": "code", "id": "e9xWJPXnwF4s", "outputId": "3d0bc7f2-3dda-4738-975b-011d09f8ec7f", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "unique_num.pop()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "2" ] }, "metadata": { "tags": [] }, "execution_count": 74 } ] }, { "metadata": { "colab_type": "text", "id": "VJjl2Fv9vLOl" }, "cell_type": "markdown", "source": [ "### Indexing\n", "Sets are unordered, so their elements cannot be accessed by index" ] }, { "metadata": { "colab_type": "code", "id": "vxPCJWJOvShU", "outputId": "8bd70eb2-1c1d-4b64-f23b-dcf278c4a29f", "colab": { "base_uri": "https://localhost:8080/", "height": 164 } }, "cell_type": "code", "source": [ "unique_num[4]" ], "execution_count": 0, "outputs": [ { "output_type": "error", "ename": "TypeError", "evalue": "ignored", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0munique_num\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: 'set' object does not support indexing" ] } ] }, { "metadata": { "colab_type": "text", "id": "TQyoEjgivjA3" }, "cell_type": "markdown", "source": [ "### Checking 
membership" ] }, { "metadata": { "colab_type": "code", "id": "3zOvkWX6vlf2", "outputId": "020d8aef-4a5a-45fc-da0c-8e5a74240487", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "3 in unique_num" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, "execution_count": 76 } ] }, { "metadata": { "colab_type": "text", "id": "6ph_k8RUweYA" }, "cell_type": "markdown", "source": [ "### Set operations" ] }, { "metadata": { "colab_type": "code", "id": "7EfN4R5xwmXM", "colab": {} }, "cell_type": "code", "source": [ "s1 = { 1 ,2 ,3 ,4, 5, 6, 7}\n", "s2 = { 0, 2, 4, 6, 8 }" ], "execution_count": 0, "outputs": [] }, { "metadata": { "colab_type": "text", "id": "RAG3ATbtwtwm" }, "cell_type": "markdown", "source": [ "#### Items in first set, but not in the second" ] }, { "metadata": { "colab_type": "code", "id": "kCfxlj9Lwz1x", "outputId": "3c3b92ad-959b-46bf-ecb0-2513886923b9", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "s1 - s2" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{1, 3, 5, 7}" ] }, "metadata": { "tags": [] }, "execution_count": 78 } ] }, { "metadata": { "colab_type": "text", "id": "KDA6bO1Vw8Bk" }, "cell_type": "markdown", "source": [ "#### Items in either or both sets" ] }, { "metadata": { "colab_type": "code", "id": "a-1CBUnRw-y-", "outputId": "584219f7-80ba-4e24-f3ec-b833428bfc87", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "s1 | s2" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{0, 1, 2, 3, 4, 5, 6, 7, 8}" ] }, "metadata": { "tags": [] }, "execution_count": 79 } ] }, { "metadata": { "colab_type": "text", "id": "ADEQBlFrxIN7" }, "cell_type": "markdown", "source": [ "#### Items in both sets" ] }, { "metadata": 
{ "colab_type": "code", "id": "IIudbePtxRkR", "outputId": "ad5fcea8-8b15-467d-a895-c11e4caa8609", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "s1 & s2" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{2, 4, 6}" ] }, "metadata": { "tags": [] }, "execution_count": 80 } ] }, { "metadata": { "colab_type": "text", "id": "742awPXgxat5" }, "cell_type": "markdown", "source": [ "#### Items in either set, but not both" ] }, { "metadata": { "colab_type": "code", "id": "03UonZ9G_jDc", "outputId": "b06f2369-fd8f-42a3-befb-d9af998f7076", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "s1 ^ s2" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{0, 1, 3, 5, 7, 8}" ] }, "metadata": { "tags": [] }, "execution_count": 81 } ] }, { "metadata": { "colab_type": "text", "id": "y-xMLy31aHaZ" }, "cell_type": "markdown", "source": [ "## 5.4 Work with the numpy array\n" ] }, { "metadata": { "colab_type": "text", "id": "0yTnd14-HasX" }, "cell_type": "markdown", "source": [ "NumPy is an open source numerical computing library for Python. The NumPy array is a data structure representing multidimensional arrays, optimized for both memory use and performance." 
] }, { "metadata": { "colab_type": "text", "id": "W7mqP9swyh5d" }, "cell_type": "markdown", "source": [ "### Create a numpy array from a list of lists" ] }, { "metadata": { "colab_type": "code", "id": "NOAicBI_ynGz", "outputId": "188d1747-f144-47ea-b9d6-887b3b1e88d8", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "cell_type": "code", "source": [ "import numpy as np\n", "list_of_lists = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]\n", "\n", "np_array = np.array(list_of_lists)\n", "\n", "np_array" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 1, 2, 3, 4],\n", " [ 5, 6, 7, 8],\n", " [ 9, 10, 11, 12],\n", " [13, 14, 15, 16]])" ] }, "metadata": { "tags": [] }, "execution_count": 82 } ] }, { "metadata": { "colab_type": "text", "id": "0eHeRfdAyzTY" }, "cell_type": "markdown", "source": [ "### Initialize an array of zeros" ] }, { "metadata": { "colab_type": "code", "id": "1BoLJ2XYy2op", "outputId": "9945a4a2-28ed-4775-860d-999f4a1c4d54", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "cell_type": "code", "source": [ "zeros_array = np.zeros( (4, 5) )\n", "zeros_array" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0.]])" ] }, "metadata": { "tags": [] }, "execution_count": 83 } ] }, { "metadata": { "colab_type": "text", "id": "92DsgsVSzC2T" }, "cell_type": "markdown", "source": [ "### Initialize an array of ones" ] }, { "metadata": { "colab_type": "code", "id": "F_TWC3z6zYA2", "outputId": "2d7317b1-5b89-44b5-844d-ceec8b50ff86", "colab": { "base_uri": "https://localhost:8080/", "height": 119 } }, "cell_type": "code", "source": [ "ones_array = np.ones( (6, 6) )\n", "ones_array" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[1., 1., 1., 
1., 1., 1.],\n", " [1., 1., 1., 1., 1., 1.],\n", " [1., 1., 1., 1., 1., 1.],\n", " [1., 1., 1., 1., 1., 1.],\n", " [1., 1., 1., 1., 1., 1.],\n", " [1., 1., 1., 1., 1., 1.]])" ] }, "metadata": { "tags": [] }, "execution_count": 84 } ] }, { "metadata": { "colab_type": "text", "id": "aHaTMcOIzk1_" }, "cell_type": "markdown", "source": [ "### Using arange" ] }, { "metadata": { "colab_type": "code", "id": "s2W1mGqkzt7O", "outputId": "ee90b952-d931-453c-9f60-d44c8801feed", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "nine = np.arange( 9 )\n", "nine" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([0, 1, 2, 3, 4, 5, 6, 7, 8])" ] }, "metadata": { "tags": [] }, "execution_count": 85 } ] }, { "metadata": { "colab_type": "text", "id": "Ejoa6qiszx-O" }, "cell_type": "markdown", "source": [ "### Using reshape" ] }, { "metadata": { "colab_type": "code", "id": "RgJ6ZfVfOm6R", "outputId": "67f363e4-a9cd-477f-9ed1-6165d9fd3ac5", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "nine.reshape(3,3)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[0, 1, 2],\n", " [3, 4, 5],\n", " [6, 7, 8]])" ] }, "metadata": { "tags": [] }, "execution_count": 86 } ] }, { "metadata": { "colab_type": "text", "id": "lB_e_dbx03bL" }, "cell_type": "markdown", "source": [ "### Introspection" ] }, { "metadata": { "colab_type": "text", "id": "___j8Jrc06k8" }, "cell_type": "markdown", "source": [ "#### Get the data type" ] }, { "metadata": { "colab_type": "code", "id": "IEDq1Hcu0-z_", "outputId": "45152487-40ba-4f23-d1c0-11bb969e21a4", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "np_array.dtype" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "dtype('int64')" ] }, "metadata": { 
"tags": [] }, "execution_count": 87 } ] }, { "metadata": { "colab_type": "text", "id": "OTREEzKu1CU1" }, "cell_type": "markdown", "source": [ "#### Get the array's shape" ] }, { "metadata": { "colab_type": "code", "id": "JZ-1Vxj41GKC", "outputId": "54b58943-55aa-4b94-ee19-81cc036512ef", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "np_array.shape" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(4, 4)" ] }, "metadata": { "tags": [] }, "execution_count": 88 } ] }, { "metadata": { "colab_type": "text", "id": "OeY1bxyJ1Idv" }, "cell_type": "markdown", "source": [ "#### Get the number of items in the array" ] }, { "metadata": { "colab_type": "code", "id": "9-MK_t881LVg", "outputId": "557d0f24-8814-4378-8926-144d8aacbe6e", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "np_array.size" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "16" ] }, "metadata": { "tags": [] }, "execution_count": 89 } ] }, { "metadata": { "colab_type": "text", "id": "SifJ6bra1TQ0" }, "cell_type": "markdown", "source": [ "#### Get the size of the array in bytes" ] }, { "metadata": { "colab_type": "code", "id": "5YzTBEid1YsX", "outputId": "9b09f32c-a717-40c3-d048-8c7559f95a17", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "np_array.nbytes" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "128" ] }, "metadata": { "tags": [] }, "execution_count": 90 } ] }, { "metadata": { "colab_type": "text", "id": "toHDZHec1f9F" }, "cell_type": "markdown", "source": [ "### Setting the data type" ] }, { "metadata": { "colab_type": "text", "id": "7bdGC31e1pxo" }, "cell_type": "markdown", "source": [ "#### dtype parameter" ] }, { "metadata": { "colab_type": "code", "id": "3quViOL01sy4", "outputId": 
"ea7f0ea3-7b16-4b23-fb7b-ea457ee8ea4b", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "cell_type": "code", "source": [ "np_array = np.array(list_of_lists, dtype=np.int8)\n", "np_array" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 1, 2, 3, 4],\n", " [ 5, 6, 7, 8],\n", " [ 9, 10, 11, 12],\n", " [13, 14, 15, 16]], dtype=int8)" ] }, "metadata": { "tags": [] }, "execution_count": 91 } ] }, { "metadata": { "colab_type": "text", "id": "xd-_Qgqo1yyu" }, "cell_type": "markdown", "source": [ "#### Size reduction" ] }, { "metadata": { "colab_type": "code", "id": "pFk4n7d312LH", "outputId": "7c84121e-c3b4-4145-8fd7-7cd065ee7240", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "np_array.nbytes" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "16" ] }, "metadata": { "tags": [] }, "execution_count": 92 } ] }, { "metadata": { "colab_type": "text", "id": "YOzWWnLF1-m2" }, "cell_type": "markdown", "source": [ "#### The data type setting is immutable\n", "Values assigned later are cast to the array's data type, so data may be truncated if the type is restrictive." ] }, { "metadata": { "colab_type": "code", "id": "2MPe95vlLpwb", "outputId": "7c158bcc-71b5-45c3-a678-132b2c0c19bd", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "np_array[0][0] = 1.7344567\n", "np_array[0][0]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "1" ] }, "metadata": { "tags": [] }, "execution_count": 93 } ] }, { "metadata": { "colab_type": "text", "id": "vvPhjEXoWYnT" }, "cell_type": "markdown", "source": [ "### Array Slicing\n", "\n", "\n", "* Slicing can be used to get a view representing a sub-array. 
\n", "* The slice is a view of the original array; the data is not copied to a new data structure\n", "* The slice is taken in the form: array[ rows, columns ]\n", "\n", "\n", "\n", "\n" ] }, { "metadata": { "colab_type": "code", "id": "tDjuERnX2hUp", "outputId": "74052c3f-0c04-45e3-dc68-633903d0e939", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "cell_type": "code", "source": [ "np_array" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 1, 2, 3, 4],\n", " [ 5, 6, 7, 8],\n", " [ 9, 10, 11, 12],\n", " [13, 14, 15, 16]], dtype=int8)" ] }, "metadata": { "tags": [] }, "execution_count": 94 } ] }, { "metadata": { "colab_type": "code", "id": "JBEp4F7uWMmY", "outputId": "eb720ba3-9a3d-41c4-8089-7f62ca88d854", "colab": { "base_uri": "https://localhost:8080/", "height": 51 } }, "cell_type": "code", "source": [ "np_array[2:, :3]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 9, 10, 11],\n", " [13, 14, 15]], dtype=int8)" ] }, "metadata": { "tags": [] }, "execution_count": 95 } ] }, { "metadata": { "colab_type": "text", "id": "DZw_vVRta7q8" }, "cell_type": "markdown", "source": [ "### Math operations\n", "\n", "\n", "* Unlike nested lists, NumPy arrays support mathematical operators, which are applied element-wise to the data\n", "\n" ] }, { "metadata": { "colab_type": "text", "id": "DcuhDzog3k7b" }, "cell_type": "markdown", "source": [ "#### Create two 3 x 3 arrays" ] }, { "metadata": { "colab_type": "code", "id": "Hk4TSZMp3ueg", "outputId": "f9c0a3be-9eaa-4d19-df19-53de928ffde4", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "np_array_1 = np.arange(9).reshape(3,3)\n", "np_array_1\n" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[0, 1, 2],\n", " [3, 4, 5],\n", " [6, 7, 8]])" ] }, "metadata": { "tags": [] }, "execution_count": 96 } ] }, { 
"metadata": { "colab_type": "code", "id": "BGUAuzGW32MZ", "outputId": "f84c69d5-cf39-40f3-8716-1b2950a8200a", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "np_array_2 = np.arange(10, 19).reshape(3,3)\n", "np_array_2" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[10, 11, 12],\n", " [13, 14, 15],\n", " [16, 17, 18]])" ] }, "metadata": { "tags": [] }, "execution_count": 97 } ] }, { "metadata": { "colab_type": "text", "id": "JtgSgziO3-zG" }, "cell_type": "markdown", "source": [ "#### Multiply the arrays" ] }, { "metadata": { "colab_type": "code", "id": "DnsoTaV64B6s", "outputId": "1ff2e5fe-8c02-4a14-e447-489a91ae7dab", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "np_array_1 * np_array_2" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 0, 11, 24],\n", " [ 39, 56, 75],\n", " [ 96, 119, 144]])" ] }, "metadata": { "tags": [] }, "execution_count": 98 } ] }, { "metadata": { "colab_type": "text", "id": "bLZEFu6F4RZ9" }, "cell_type": "markdown", "source": [ "#### Add the arrays" ] }, { "metadata": { "colab_type": "code", "id": "gtImxWZW4TIK", "outputId": "0549bcbe-a4ec-4563-d8e3-a9b5faa12f1c", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "np_array_1 + np_array_2" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[10, 12, 14],\n", " [16, 18, 20],\n", " [22, 24, 26]])" ] }, "metadata": { "tags": [] }, "execution_count": 99 } ] }, { "metadata": { "colab_type": "text", "id": "7R8mt4RU4yMz" }, "cell_type": "markdown", "source": [ "### Matrix operations" ] }, { "metadata": { "colab_type": "text", "id": "1OU4oaWt40Jp" }, "cell_type": "markdown", "source": [ "#### Transpose" ] }, { "metadata": { "colab_type": "code", "id": "EXIwtxD942UO", 
"outputId": "ee5697f7-97bd-4b89-d053-771afa6fa69b", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "cell_type": "code", "source": [ "np_array.T" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 1, 5, 9, 13],\n", " [ 2, 6, 10, 14],\n", " [ 3, 7, 11, 15],\n", " [ 4, 8, 12, 16]], dtype=int8)" ] }, "metadata": { "tags": [] }, "execution_count": 100 } ] }, { "metadata": { "colab_type": "text", "id": "oi0b6o6o46E9" }, "cell_type": "markdown", "source": [ "#### Dot product" ] }, { "metadata": { "colab_type": "code", "id": "aQgYwiPpbxIG", "outputId": "7fd1509e-d41a-40f0-f543-eebe891d7157", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "np_array_1.dot(np_array_2)\n" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[ 45, 48, 51],\n", " [162, 174, 186],\n", " [279, 300, 321]])" ] }, "metadata": { "tags": [] }, "execution_count": 101 } ] }, { "metadata": { "colab_type": "text", "id": "z-hjI2NsaK4w" }, "cell_type": "markdown", "source": [ "## 5.5 Use the Pandas DataFrame\n", "* One of the most highly leveraged data structures for data science\n", "* A table-like two dimensional data structure. 
\n" ] }, { "metadata": { "colab_type": "text", "id": "e8Dxdc4V6wlV" }, "cell_type": "markdown", "source": [ "### Create a DataFrame" ] }, { "metadata": { "colab_type": "code", "id": "73JaHcb261eb", "outputId": "95ec9725-3f0e-472d-c9e1-6439bfeac6ff", "colab": { "base_uri": "https://localhost:8080/", "height": 297 } }, "cell_type": "code", "source": [ "import pandas as pd\n", "first_names = ['henry', 'rolly', 'molly', 'frank', 'david', 'steven', 'gwen', 'arthur']\n", "last_names = ['smith', 'brocker', 'stein', 'bach', 'spencer', 'de wilde', 'mason', 'davis']\n", "ages = [43, 23, 78, 56, 26, 14, 46, 92]\n", "\n", "df = pd.DataFrame({ 'first': first_names, 'last': last_names, 'age': ages})\n", "df" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>age</th><th>first</th><th>last</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>43</td><td>henry</td><td>smith</td></tr>\n",
"    <tr><th>1</th><td>23</td><td>rolly</td><td>brocker</td></tr>\n",
"    <tr><th>2</th><td>78</td><td>molly</td><td>stein</td></tr>\n",
"    <tr><th>3</th><td>56</td><td>frank</td><td>bach</td></tr>\n",
"    <tr><th>4</th><td>26</td><td>david</td><td>spencer</td></tr>\n",
"    <tr><th>5</th><td>14</td><td>steven</td><td>de wilde</td></tr>\n",
"    <tr><th>6</th><td>46</td><td>gwen</td><td>mason</td></tr>\n",
"    <tr><th>7</th><td>92</td><td>arthur</td><td>davis</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " age first last\n", "0 43 henry smith\n", "1 23 rolly brocker\n", "2 78 molly stein\n", "3 56 frank bach\n", "4 26 david spencer\n", "5 14 steven de wilde\n", "6 46 gwen mason\n", "7 92 arthur davis" ] }, "metadata": { "tags": [] }, "execution_count": 103 } ] }, { "metadata": { "colab_type": "text", "id": "ut_QqgQi7CvX" }, "cell_type": "markdown", "source": [ "### Head - looking at the top" ] }, { "metadata": { "colab_type": "code", "id": "FN7tXlFV7FiE", "outputId": "f48221e8-bcee-4d3c-e2e0-e3bd130b202d", "colab": { "base_uri": "https://localhost:8080/", "height": 297 } }, "cell_type": "code", "source": [ "df.head(10)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>age</th><th>first</th><th>last</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>43</td><td>henry</td><td>smith</td></tr>\n",
"    <tr><th>1</th><td>23</td><td>rolly</td><td>brocker</td></tr>\n",
"    <tr><th>2</th><td>78</td><td>molly</td><td>stein</td></tr>\n",
"    <tr><th>3</th><td>56</td><td>frank</td><td>bach</td></tr>\n",
"    <tr><th>4</th><td>26</td><td>david</td><td>spencer</td></tr>\n",
"    <tr><th>5</th><td>14</td><td>steven</td><td>de wilde</td></tr>\n",
"    <tr><th>6</th><td>46</td><td>gwen</td><td>mason</td></tr>\n",
"    <tr><th>7</th><td>92</td><td>arthur</td><td>davis</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " age first last\n", "0 43 henry smith\n", "1 23 rolly brocker\n", "2 78 molly stein\n", "3 56 frank bach\n", "4 26 david spencer\n", "5 14 steven de wilde\n", "6 46 gwen mason\n", "7 92 arthur davis" ] }, "metadata": { "tags": [] }, "execution_count": 106 } ] }, { "metadata": { "colab_type": "text", "id": "2lZEPhBd7PGN" }, "cell_type": "markdown", "source": [ "### Setting number of rows returned with head" ] }, { "metadata": { "colab_type": "code", "id": "mO583bRm7J9y", "colab": {} }, "cell_type": "code", "source": [ "df.head(3)" ], "execution_count": 0, "outputs": [] }, { "metadata": { "colab_type": "text", "id": "fP0Szs_k7ZwM" }, "cell_type": "markdown", "source": [ "### Tail - looking at the bottom" ] }, { "metadata": { "colab_type": "code", "id": "lWpCA6lh7dIZ", "outputId": "6ad75973-5eba-4e1f-e70f-709f20036be7", "colab": { "base_uri": "https://localhost:8080/", "height": 111 } }, "cell_type": "code", "source": [ "df.tail(2)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
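<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>age</th><th>first</th><th>last</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>6</th><td>46</td><td>gwen</td><td>mason</td></tr>\n",
"    <tr><th>7</th><td>92</td><td>arthur</td><td>davis</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>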
" ], "text/plain": [ " age first last\n", "6 46 gwen mason\n", "7 92 arthur davis" ] }, "metadata": { "tags": [] }, "execution_count": 108 } ] }, { "metadata": { "colab_type": "text", "id": "aMcpAbpW7sKB" }, "cell_type": "markdown", "source": [ "### Describe - descriptive statistics" ] }, { "metadata": { "colab_type": "code", "id": "c1SCEPeB7xIi", "outputId": "ad35757e-f072-4208-e853-1f25dc24614e", "colab": { "base_uri": "https://localhost:8080/", "height": 297 } }, "cell_type": "code", "source": [ "df.describe()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>age</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>count</th><td>8.000000</td></tr>\n",
"    <tr><th>mean</th><td>47.250000</td></tr>\n",
"    <tr><th>std</th><td>27.227874</td></tr>\n",
"    <tr><th>min</th><td>14.000000</td></tr>\n",
"    <tr><th>25%</th><td>25.250000</td></tr>\n",
"    <tr><th>50%</th><td>44.500000</td></tr>\n",
"    <tr><th>75%</th><td>61.500000</td></tr>\n",
"    <tr><th>max</th><td>92.000000</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " age\n", "count 8.000000\n", "mean 47.250000\n", "std 27.227874\n", "min 14.000000\n", "25% 25.250000\n", "50% 44.500000\n", "75% 61.500000\n", "max 92.000000" ] }, "metadata": { "tags": [] }, "execution_count": 109 } ] }, { "metadata": { "colab_type": "text", "id": "Of3owwjI71cr" }, "cell_type": "markdown", "source": [ "### Access one column" ] }, { "metadata": { "colab_type": "code", "id": "siMCagaq74bO", "outputId": "99907925-afaf-459c-bb9f-63a7b9bf25e2", "colab": { "base_uri": "https://localhost:8080/", "height": 170 } }, "cell_type": "code", "source": [ "df['first']" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0 henry\n", "1 rolly\n", "2 molly\n", "3 frank\n", "4 david\n", "5 steven\n", "6 gwen\n", "7 arthur\n", "Name: first, dtype: object" ] }, "metadata": { "tags": [] }, "execution_count": 110 } ] }, { "metadata": { "colab_type": "text", "id": "KdOsmAk67-FA" }, "cell_type": "markdown", "source": [ "### Slice a column" ] }, { "metadata": { "colab_type": "code", "id": "dNl_CTuk8Bip", "outputId": "df4f7aa6-4457-4c77-f2cb-5c77c227d072", "colab": { "base_uri": "https://localhost:8080/", "height": 102 } }, "cell_type": "code", "source": [ "df['first'][4:]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "4 david\n", "5 steven\n", "6 gwen\n", "7 arthur\n", "Name: first, dtype: object" ] }, "metadata": { "tags": [] }, "execution_count": 111 } ] }, { "metadata": { "colab_type": "text", "id": "H3iUwAI-8TFp" }, "cell_type": "markdown", "source": [ "### Use conditions to filter" ] }, { "metadata": { "colab_type": "code", "id": "pjYmR1d0fbUh", "outputId": "0add25f2-e4c6-4a21-98d8-6c77cdacea4e", "colab": { "base_uri": "https://localhost:8080/", "height": 142 } }, "cell_type": "code", "source": [ "df[df['age'] > 50]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>age</th><th>first</th><th>last</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2</th><td>78</td><td>molly</td><td>stein</td></tr>\n",
"    <tr><th>3</th><td>56</td><td>frank</td><td>bach</td></tr>\n",
"    <tr><th>7</th><td>92</td><td>arthur</td><td>davis</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " age first last\n", "2 78 molly stein\n", "3 56 frank bach\n", "7 92 arthur davis" ] }, "metadata": { "tags": [] }, "execution_count": 112 } ] }, { "metadata": { "colab_type": "text", "id": "mU_PWc0zaO0X" }, "cell_type": "markdown", "source": [ "## 5.6 Use the pandas Series\n", "\n", "\n", "* A one-dimensional labeled array\n", "* Contains data of only one type\n", "* Similar to a column in a spreadsheet\n", "\n", "\n" ] }, { "metadata": { "colab_type": "text", "id": "8IJJSLsN8kAq" }, "cell_type": "markdown", "source": [ "### Create a series" ] }, { "metadata": { "colab_type": "code", "id": "cdGYE9LdnSwU", "outputId": "c4a7433b-e15c-4f27-d727-72cf26e9de70", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "cell_type": "code", "source": [ "pd_series = pd.Series( [1, 2, 3 ] )\n", "pd_series" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0 1\n", "1 2\n", "2 3\n", "dtype: int64" ] }, "metadata": { "tags": [] }, "execution_count": 114 } ] }, { "metadata": { "colab_type": "text", "id": "GzZyCmanpu2p" }, "cell_type": "markdown", "source": [ "### Series introspection methods" ] }, { "metadata": { "colab_type": "code", "id": "RlsP8h-KpxxE", "outputId": "5eb734b2-d44e-422c-c53b-fb0fdca4b1ba", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "f\"This series is made up of {pd_series.size} items whose data type is {pd_series.dtype}\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'This series is made up of 3 items whose data type is int64'" ] }, "metadata": { "tags": [] }, "execution_count": 115 } ] }, { "metadata": { "colab_type": "text", "id": "OgKF9C9an8iB" }, "cell_type": "markdown", "source": [ "### A pandas DataFrame is composed of pandas Series" ] }, { "metadata": { "colab_type": "code", "id": "wAmSud2ToHuh", "outputId": "9afc9fb8-bef8-4f3f-95ac-8d26f9375233", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "age = df.age\n", "type( age )" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "pandas.core.series.Series" ] }, "metadata": { "tags": [] }, "execution_count": 116 } ] }, { "metadata": { "colab_type": "text", "id": "DooLFZH1ovCV" }, "cell_type": "markdown", "source": [ "### Some useful helper methods of a Series" ] }, { "metadata": { "colab_type": "text", "id": "iyDtUDT3JRuY" }, "cell_type": "markdown", "source": [ "#### Mean" ] }, { "metadata": { "colab_type": "code", "id": "9gi2tOSTJTrI", "outputId": "1fbee477-4a35-422d-949b-1fe0b180af04", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "pd_series = pd.Series([ 1, 2, 3, 5, 6, 6, 6, 7, 8])\n", "pd_series.mean()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "4.888888888888889" ] }, "metadata": { "tags": [] }, "execution_count": 117 } ] }, { "metadata": { "colab_type": "text", "id": "bILvFviLJbok" }, "cell_type": "markdown", "source": [ "#### Unique" ] }, { "metadata": { "colab_type": "code", "id": "azsZ0xzcJdN2", "outputId": "f8371eb1-959f-4726-acd1-e00729f9f2d6", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "pd_series.unique()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([1, 2, 3, 5, 6, 7, 8])" ] }, "metadata": { "tags": [] }, "execution_count": 118 } ] }, { "metadata": { "colab_type": "text", "id": "GT2CGOucJgjz" }, "cell_type": "markdown", "source": [ "#### Min" ] }, { "metadata": { "colab_type": "code", "id": "s-PPBB7Co385", "outputId": "27c23504-0120-489f-b52b-0c6fbd49b8d9", "colab": { "base_uri": "https://localhost:8080/", 
"height": 34 } }, "cell_type": "code", "source": [ "pd_series.min()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "1" ] }, "metadata": { "tags": [] }, "execution_count": 120 } ] }, { "metadata": { "colab_type": "text", "id": "g2UU5cr6rna6" }, "cell_type": "markdown", "source": [ "# Notes:\n", "[Lists](https://docs.python.org/3/tutorial/datastructures.html)\n", "\n", "[Tuples and sequences](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences)\n", "\n", "[Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries)\n", "\n", "[Numpy arrays](https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html)\n", "\n", "[Pandas DataFrame](https://pandas.pydata.org/pandas-docs/version/0.21/generated/pandas.DataFrame.html)\n", "\n", "[Pandas Series](https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.Series.html)\n", "\n" ] } ] }