{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Lesson5-Python For Data Science-Python-Data-structure.ipynb",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [
"_rehYX145nEk",
"g4-XjayC8KO7"
],
"include_colab_link": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
""
]
},
{
"metadata": {
"colab_type": "text",
"id": "s39SnnxrZxsx"
},
"cell_type": "markdown",
"source": [
"# Lesson 5: Python Data Structures\n"
]
},
{
"metadata": {
"id": "c_Id55m6Jsbu",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"## Pragmatic AI Labs\n",
"\n"
]
},
{
"metadata": {
"id": "e5p96AqpSDZa",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"\n",
"\n",
"This notebook was produced by [Pragmatic AI Labs](https://paiml.com/). You can continue learning about these topics by:\n",
"\n",
"* Buying a copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](http://www.informit.com/store/pragmatic-ai-an-introduction-to-cloud-based-machine-9780134863917)\n",
"* Reading an online copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](https://www.safaribooksonline.com/library/view/pragmatic-ai-an/9780134863924/)\n",
"* Watching video [Essential Machine Learning and AI with Python and Jupyter Notebook-Video-SafariOnline](https://www.safaribooksonline.com/videos/essential-machine-learning/9780135261118) on Safari Books Online.\n",
"* Watching video [AWS Certified Machine Learning-Specialty](https://learning.oreilly.com/videos/aws-certified-machine/9780135556597)\n",
"* Purchasing video [Essential Machine Learning and AI with Python and Jupyter Notebook- Purchase Video](http://www.informit.com/store/essential-machine-learning-and-ai-with-python-and-jupyter-9780135261095)\n",
"* Viewing more content at [noahgift.com](https://noahgift.com/)\n"
]
},
{
"metadata": {
"colab_type": "text",
"id": "0mnOHaZpZ1CU"
},
"cell_type": "markdown",
"source": [
"## 5.1 Use lists and tuples"
]
},
{
"metadata": {
"colab_type": "text",
"id": "aCBZvOZn3V6D"
},
"cell_type": "markdown",
"source": [
"### Sequences\n",
"Lists, tuples, and strings are all Python sequences and share many of the same operations, such as indexing, slicing, and membership tests."
]
},
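{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"As a minimal sketch of what sharing operations means, the cell below applies `len`, indexing, slicing, and `in` to a list, a tuple, and a string. The names `samples` and `seq` are illustrative assumptions and are not used elsewhere in this notebook."
]
},
{
"metadata": {
"colab_type": "code"
},
"cell_type": "code",
"source": [
"# Illustrative names; not used elsewhere in this notebook\n",
"samples = [['a', 'b', 'c'], ('a', 'b', 'c'), 'abc']\n",
"for seq in samples:\n",
"    # The same operations work on each sequence type\n",
"    print(type(seq).__name__, len(seq), seq[0], seq[-1], seq[1:], 'b' in seq)"
],
"execution_count": null,
"outputs": []
},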
{
"metadata": {
"colab_type": "text",
"id": "MgQzM2FbzhpW"
},
"cell_type": "markdown",
"source": [
"### Creating an empty list"
]
},
{
"metadata": {
"colab_type": "code",
"id": "1gUdm3jfzlCB",
"outputId": "a84844d5-a2f6-43f5-f587-8e00535ea56f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"empty = []\n",
"empty"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[]"
]
},
"metadata": {
"tags": []
},
"execution_count": 4
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "fLx7Rtstz3Pn"
},
"cell_type": "markdown",
"source": [
"### Using square brackets with initial values"
]
},
{
"metadata": {
"colab_type": "code",
"id": "X6_VTC9moTAM",
"outputId": "c218cea9-dbf7-4b5d-d8b2-4915070562a0",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"numbers = [1, 2, 3]\n",
"numbers\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[1, 2, 3]"
]
},
"metadata": {
"tags": []
},
"execution_count": 5
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "RN5dAwcv4EwQ"
},
"cell_type": "markdown",
"source": [
"### Casting an iterable\n",
"Any iterable can be converted to a list by passing it to the `list` constructor."
]
},
{
"metadata": {
"colab_type": "code",
"id": "IWCOdiiJ4Iv5",
"outputId": "63d56640-a0c9-4f64-8882-b61950a8a059",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"numbers = list(range(10))\n",
"numbers"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
]
},
"metadata": {
"tags": []
},
"execution_count": 6
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "m7FQdEVcAMJl"
},
"cell_type": "markdown",
"source": [
"### Creating using multiplication"
]
},
{
"metadata": {
"colab_type": "code",
"id": "lanCHsNZATqK",
"outputId": "afdc49d3-928d-4854-9ff4-4ef2709d10d7",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"num_players = 10\n",
"scores = [0] * num_players\n",
"scores"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]"
]
},
"metadata": {
"tags": []
},
"execution_count": 7
}
]
},
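{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"One caveat with multiplication, sketched here under the assumption that you want a grid of independent rows: multiplying a list that contains another list repeats a reference to the same inner list. The names `shared_rows` and `independent_rows` are hypothetical and exist only for this illustration."
]
},
{
"metadata": {
"colab_type": "code"
},
"cell_type": "code",
"source": [
"shared_rows = [[0] * 3] * 2  # both rows refer to the same list object\n",
"independent_rows = [[0] * 3 for _ in range(2)]  # a new list per row\n",
"\n",
"shared_rows[0][0] = 1       # shows up in every row\n",
"independent_rows[0][0] = 1  # changes only the first row\n",
"shared_rows, independent_rows"
],
"execution_count": null,
"outputs": []
},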
{
"metadata": {
"colab_type": "text",
"id": "h9l7kUOC43iL"
},
"cell_type": "markdown",
"source": [
"### Mixing data types\n",
"Lists can contain multiple data types."
]
},
{
"metadata": {
"colab_type": "code",
"id": "jBqfcq6Q4-Yl",
"outputId": "38f14f53-4a84-4c4a-d56b-3254d72bd0a7",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"mixed = ['a', 1, 2.0, [13], {}]\n",
"mixed"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['a', 1, 2.0, [13], {}]"
]
},
"metadata": {
"tags": []
},
"execution_count": 8
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "_rehYX145nEk"
},
"cell_type": "markdown",
"source": [
"### Indexing\n",
"Items in lists can be accessed using indices in a similar fashion to strings."
]
},
{
"metadata": {
"colab_type": "text",
"id": "PuGNKkIV5_64"
},
"cell_type": "markdown",
"source": [
"#### Access first item"
]
},
{
"metadata": {
"colab_type": "code",
"id": "98QVzpN_ogFQ",
"outputId": "93224fef-7b66-492e-851c-62c71a4d3eb5",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"numbers[0]\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0"
]
},
"metadata": {
"tags": []
},
"execution_count": 9
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "wYljuMmX6FDo"
},
"cell_type": "markdown",
"source": [
"#### Access last item"
]
},
{
"metadata": {
"colab_type": "code",
"id": "j5XB0hVZ6S5E",
"outputId": "702efb97-d245-4b06-c028-fe8ae61288e6",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"numbers[-1]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"9"
]
},
"metadata": {
"tags": []
},
"execution_count": 11
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "tZJYdQW87vwk"
},
"cell_type": "markdown",
"source": [
"#### Access any item"
]
},
{
"metadata": {
"colab_type": "code",
"id": "7EJZyQUl7y_5",
"outputId": "433c5b3d-51f9-44ee-bb1b-d8d162c744bf",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"numbers[4]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"4"
]
},
"metadata": {
"tags": []
},
"execution_count": 12
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "g4-XjayC8KO7"
},
"cell_type": "markdown",
"source": [
"### Adding to a list"
]
},
{
"metadata": {
"colab_type": "text",
"id": "VVoxc0Co81iD"
},
"cell_type": "markdown",
"source": [
"#### Append to the end of a list"
]
},
{
"metadata": {
"colab_type": "code",
"id": "7l9O1BOz89Sg",
"outputId": "f5ed7734-afd3-4dcd-8311-ac809622b29b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters = ['a']\n",
"letters.append('c')\n",
"letters"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['a', 'c']"
]
},
"metadata": {
"tags": []
},
"execution_count": 14
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "GOWYij2p9bwL"
},
"cell_type": "markdown",
"source": [
"#### Insert at beginning of list"
]
},
{
"metadata": {
"colab_type": "code",
"id": "KgMcKp5W9fI7",
"outputId": "5f5acb75-5715-456a-f63b-76d675676766",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters.insert(0, 'b')\n",
"letters"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['b', 'a', 'c']"
]
},
"metadata": {
"tags": []
},
"execution_count": 15
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "z2pfGnq7-PHc"
},
"cell_type": "markdown",
"source": [
"#### Insert at arbitrary position"
]
},
{
"metadata": {
"colab_type": "code",
"id": "SgovUUMS-TxT",
"outputId": "2d639b6d-eaa7-4f91-b837-88136a2b403a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters.insert(2, 'c')\n",
"letters"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['b', 'a', 'c', 'c']"
]
},
"metadata": {
"tags": []
},
"execution_count": 16
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "W2WMgepZAjkO"
},
"cell_type": "markdown",
"source": [
"#### Extending with another list"
]
},
{
"metadata": {
"colab_type": "code",
"id": "UYn06yndAoNH",
"outputId": "a1c34ae5-d500-44b0-98a3-c4db3b47c910",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"more_letters = ['e', 'f', 'g']\n",
"letters.extend(more_letters)\n",
"letters"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['b', 'a', 'c', 'c', 'e', 'f', 'g']"
]
},
"metadata": {
"tags": []
},
"execution_count": 17
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "wzPgAr_s_CrC"
},
"cell_type": "markdown",
"source": [
"### Change item at some position"
]
},
{
"metadata": {
"colab_type": "code",
"id": "BZGy8c8bov2q",
"outputId": "4fc4fd41-0e1e-4324-8302-b0ce19558908",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters[3] = 'd'\n",
"letters"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['b', 'a', 'c', 'd', 'e', 'f', 'g']"
]
},
"metadata": {
"tags": []
},
"execution_count": 18
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "PGcA5_5__RZm"
},
"cell_type": "markdown",
"source": [
"### Swap two items"
]
},
{
"metadata": {
"colab_type": "code",
"id": "egbXdmQ__UB4",
"outputId": "439400af-99d2-4d6c-aaac-f16c822bf73e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters[0], letters[1] = letters[1], letters[0]\n",
"letters"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['a', 'b', 'c', 'd', 'e', 'f', 'g']"
]
},
"metadata": {
"tags": []
},
"execution_count": 19
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "RNirVSMaHOp4"
},
"cell_type": "markdown",
"source": [
"### Removing items from a list"
]
},
{
"metadata": {
"colab_type": "text",
"id": "K0ecop0OHXo_"
},
"cell_type": "markdown",
"source": [
"#### Pop from the end"
]
},
{
"metadata": {
"colab_type": "code",
"id": "ZfFG3MZ7HdXa",
"outputId": "fe842c28-5813-4286-df92-908864af1f53",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters = ['a', 'b', 'c', 'd', 'e', 'f']\n",
"letters.pop()\n",
"letters"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['a', 'b', 'c', 'd', 'e']"
]
},
"metadata": {
"tags": []
},
"execution_count": 20
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "WSm1EhxBH8a2"
},
"cell_type": "markdown",
"source": [
"#### Pop by index"
]
},
{
"metadata": {
"colab_type": "code",
"id": "R62Fg9l4IAYV",
"outputId": "a6bc87d5-46e8-4383-ced0-e53c67026e43",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters.pop(2)\n",
"letters"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['a', 'b', 'd', 'e']"
]
},
"metadata": {
"tags": []
},
"execution_count": 21
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "wDhR89qxIUEh"
},
"cell_type": "markdown",
"source": [
"#### Remove specific item"
]
},
{
"metadata": {
"colab_type": "code",
"id": "cAyAJIeOpYrU",
"outputId": "60762469-653a-4e89-8ffc-4082273b728e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters.remove('d')\n",
"letters"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['a', 'b', 'e']"
]
},
"metadata": {
"tags": []
},
"execution_count": 22
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "D9_M_6cwUAaX"
},
"cell_type": "markdown",
"source": [
"### Create tuple using parentheses"
]
},
{
"metadata": {
"colab_type": "code",
"id": "7Zqb_MU2UEJa",
"outputId": "0b125537-00c5-4bd2-85d4-3dc130fcf6b7",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"tup = (1, 2, 3)\n",
"tup"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"(1, 2, 3)"
]
},
"metadata": {
"tags": []
},
"execution_count": 23
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "ep3XqjhqUIS1"
},
"cell_type": "markdown",
"source": [
"### Create tuple with commas"
]
},
{
"metadata": {
"colab_type": "code",
"id": "Oz8dkzlzUNEe",
"outputId": "c8b25da2-8291-406f-b3e3-614cb7c06bba",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"tup = 1, 2, 3\n",
"tup"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"(1, 2, 3)"
]
},
"metadata": {
"tags": []
},
"execution_count": 24
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "p5Ie15xNUvK-"
},
"cell_type": "markdown",
"source": [
"### Create empty tuple"
]
},
{
"metadata": {
"colab_type": "code",
"id": "6M3eLXfXUxLX",
"outputId": "83081a8c-9e5b-4a5a-bd3e-248f805e6da7",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"tup = ()\n",
"tup"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"()"
]
},
"metadata": {
"tags": []
},
"execution_count": 25
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "0YKr2HU6UzgT"
},
"cell_type": "markdown",
"source": [
"### Create tuple with single item"
]
},
{
"metadata": {
"colab_type": "code",
"id": "9ib336cLU3iu",
"outputId": "1f47d8eb-ed90-4c0e-d759-e7aa76f4b11d",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"tup = 1,\n",
"tup"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"(1,)"
]
},
"metadata": {
"tags": []
},
"execution_count": 28
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "a_eqCY6XTy1x"
},
"cell_type": "markdown",
"source": [
"### Behaviors shared by lists and tuples\n",
"The following sequence behaviors are shared by lists and tuples."
]
},
{
"metadata": {
"colab_type": "text",
"id": "HpNqX6QFLmdK"
},
"cell_type": "markdown",
"source": [
"### Check item in sequence"
]
},
{
"metadata": {
"colab_type": "code",
"id": "NpIdl5Cfp7-f",
"outputId": "1f78c614-29a9-42d7-d2da-300df5698dfc",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"3 in (1, 2, 3, 4, 5)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"True"
]
},
"metadata": {
"tags": []
},
"execution_count": 29
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "Ms_bF8BjL79W"
},
"cell_type": "markdown",
"source": [
"### Check item not in sequence"
]
},
{
"metadata": {
"colab_type": "code",
"id": "gLXaJE6EMGyb",
"outputId": "786985a3-a630-4617-a4d3-babd7fca9e77",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"'a' not in [1, 2, 3, 4, 5]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"True"
]
},
"metadata": {
"tags": []
},
"execution_count": 30
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "UNKJxMb6Mipn"
},
"cell_type": "markdown",
"source": [
"### Slicing"
]
},
{
"metadata": {
"colab_type": "text",
"id": "r_fLVxRhNWLA"
},
"cell_type": "markdown",
"source": [
"#### Setting start, slice to the end"
]
},
{
"metadata": {
"colab_type": "code",
"id": "098SVCIvsibb",
"outputId": "8a737bd9-05af-4f95-fdc4-aa882fef47c9",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters = 'a', 'b', 'c', 'd', 'e', 'f'\n",
"letters[3:]\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"('d', 'e', 'f')"
]
},
"metadata": {
"tags": []
},
"execution_count": 32
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "dsNvabgmNeqw"
},
"cell_type": "markdown",
"source": [
"#### Set end, slice from beginning"
]
},
{
"metadata": {
"colab_type": "code",
"id": "QVNJeYVtNh56",
"outputId": "249d9c60-e20c-4d78-dc02-67b74b17fe17",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters[:4]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"('a', 'b', 'c', 'd')"
]
},
"metadata": {
"tags": []
},
"execution_count": 33
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "Lrd7HuHnNvIi"
},
"cell_type": "markdown",
"source": [
"#### Index from end of sequence"
]
},
{
"metadata": {
"colab_type": "code",
"id": "9XiynpN9M9V_",
"outputId": "3b8497e3-1219-496f-ec24-6c99dbd3ff8c",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters[-4:]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"('c', 'd', 'e', 'f')"
]
},
"metadata": {
"tags": []
},
"execution_count": 34
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "Lzkr-snEOEE9"
},
"cell_type": "markdown",
"source": [
"#### Setting step"
]
},
{
"metadata": {
"colab_type": "code",
"id": "fCzpKpbSOGy-",
"outputId": "e0ef8366-b712-45ca-fcb7-d3db915adae6",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters[1::-2]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"('b',)"
]
},
"metadata": {
"tags": []
},
"execution_count": 36
}
]
},
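{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"The slice above starts at index 1 and steps backwards by two, so it yields only `'b'` before running off the front of the tuple. As a hedged illustration of more common cases, the sketch below uses a separate, hypothetical tuple named `step_demo` to show a positive step, a reversed copy, and a negative step from a later start."
]
},
{
"metadata": {
"colab_type": "code"
},
"cell_type": "code",
"source": [
"step_demo = ('a', 'b', 'c', 'd', 'e', 'f')  # hypothetical example data\n",
"\n",
"every_other = step_demo[::2]     # ('a', 'c', 'e')\n",
"reversed_copy = step_demo[::-1]  # ('f', 'e', 'd', 'c', 'b', 'a')\n",
"backwards = step_demo[4::-2]     # ('e', 'c', 'a')\n",
"every_other, reversed_copy, backwards"
],
"execution_count": null,
"outputs": []
},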
{
"metadata": {
"colab_type": "text",
"id": "IR6wtY_oJlSv"
},
"cell_type": "markdown",
"source": [
"### Unpacking"
]
},
{
"metadata": {
"colab_type": "code",
"id": "iUY-WFVvP82h",
"outputId": "928e9082-4ac2-4a18-a20f-d369c9e7e713",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 198
}
},
"cell_type": "code",
"source": [
"first, middle, last = [1, 2, 3]\n",
"\n",
"f\"first = {first}, middle = {middle}, last = {last}\""
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"'first = 1, middle = 2, last = 3'"
]
},
"metadata": {
"tags": []
},
"execution_count": null
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "Ryn60MRRQLhE"
},
"cell_type": "markdown",
"source": [
"### Extended unpacking"
]
},
{
"metadata": {
"colab_type": "code",
"id": "5olgXZcwQOwY",
"outputId": "1c3babb9-a0b5-4018-cbeb-b5ea287d2e04",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"first, *middle, last = (1, 2, 3, 4, 5)\n",
"\n",
"f\"first = {first}, middle = {middle}, last = {last}\""
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"'first = 1, middle = [2, 3, 4], last = 5'"
]
},
"metadata": {
"tags": []
},
"execution_count": 42
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "uRwRq8FkSMxF"
},
"cell_type": "markdown",
"source": [
"### Using a list as a stack\n",
"A stack is a LIFO (last in, first out) data structure, which can be simulated using a list."
]
},
{
"metadata": {
"colab_type": "text",
"id": "UMhxPw8tV2ot"
},
"cell_type": "markdown",
"source": [
"#### Push onto the stack using append"
]
},
{
"metadata": {
"colab_type": "code",
"id": "47_iDnO6V6ut",
"outputId": "18c3ccf8-0d66-4681-e576-387985c81f82",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"stack = []\n",
"stack.append('first on')\n",
"stack.append('second on')\n",
"stack.append('third on')\n",
"stack"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['first on', 'second on', 'third on']"
]
},
"metadata": {
"tags": []
},
"execution_count": 43
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "M_wrgcjcWUij"
},
"cell_type": "markdown",
"source": [
"#### Retrieve items, last one first using **pop**"
]
},
{
"metadata": {
"colab_type": "code",
"id": "mY_Wbh9-WZq_",
"outputId": "b5859dbe-52fb-47b8-b9ed-2f799e45be69",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"f\"Retrieved first: {stack.pop()!r}, retrieved second: {stack.pop()!r}, retrieved last: {stack.pop()!r}\""
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"\"Retrieved first: 'third on', retrieved second: 'second on', retrieved last: 'first on'\""
]
},
"metadata": {
"tags": []
},
"execution_count": 44
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "SxpzGoM_Z_RU"
},
"cell_type": "markdown",
"source": [
"## 5.2 Explore dictionaries\n",
"Dictionaries map keys to values; each entry is a key/value pair."
]
},
{
"metadata": {
"colab_type": "text",
"id": "l9nWQuW1oLCE"
},
"cell_type": "markdown",
"source": [
"### Create an empty dict using constructor"
]
},
{
"metadata": {
"colab_type": "code",
"id": "45C_FS-eoR-3",
"outputId": "bd814a64-d7e0-4577-fbe0-46c721330ab1",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"dictionary = dict()\n",
"dictionary"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{}"
]
},
"metadata": {
"tags": []
},
"execution_count": 46
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "uwNpFQFGo0C_"
},
"cell_type": "markdown",
"source": [
"### Create a dictionary based on key/value pairs"
]
},
{
"metadata": {
"colab_type": "code",
"id": "VWYvp8peo5ok",
"outputId": "ef583270-e5dc-4902-a524-696345ff0db1",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"key_values = [['key-1','value-1'], ['key-2', 'value-2']]\n",
"dictionary = dict(key_values)\n",
"dictionary"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'key-1': 'value-1', 'key-2': 'value-2'}"
]
},
"metadata": {
"tags": []
},
"execution_count": 47
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "oLd8LV02ofza"
},
"cell_type": "markdown",
"source": [
"### Create an empty dict using curly braces"
]
},
{
"metadata": {
"colab_type": "code",
"id": "El6HcTagolLw",
"outputId": "696a4405-368b-4113-a0d1-99566e249446",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"dictionary = {}\n",
"dictionary"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{}"
]
},
"metadata": {
"tags": []
},
"execution_count": 48
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "yXNaZRMdpQoK"
},
"cell_type": "markdown",
"source": [
"### Use curly braces to create a dictionary with initial key/values"
]
},
{
"metadata": {
"colab_type": "code",
"id": "oRTWGBtvpYcb",
"outputId": "922c363f-d506-4eb0-81bb-a7301155bdf7",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"dictionary = {'key-1': 'value-1',\n",
" 'key-2': 'value-2'}\n",
"\n",
"dictionary"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'key-1': 'value-1', 'key-2': 'value-2'}"
]
},
"metadata": {
"tags": []
},
"execution_count": 49
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "vb9aP6o5pv_B"
},
"cell_type": "markdown",
"source": [
"### Access value using key"
]
},
{
"metadata": {
"colab_type": "code",
"id": "3-jz1H8Apzgm",
"outputId": "3970a9a3-a15c-4a80-a2fc-81ad55cc1717",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"dictionary['key-1']"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"'value-1'"
]
},
"metadata": {
"tags": []
},
"execution_count": 51
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "UBSLRWEeqE-W"
},
"cell_type": "markdown",
"source": [
"### Add a key/value pair to an existing dictionary"
]
},
{
"metadata": {
"colab_type": "code",
"id": "J98co3mWqJ5I",
"outputId": "4ec4724f-aeca-48db-d6c9-84fef9a8c849",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"dictionary['key-3'] = 'value-3'\n",
"\n",
"dictionary"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'key-1': 'value-1', 'key-2': 'value-2', 'key-3': 'value-3'}"
]
},
"metadata": {
"tags": []
},
"execution_count": 52
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "xmHWK2hHq7c_"
},
"cell_type": "markdown",
"source": [
"### Update value for existing key"
]
},
{
"metadata": {
"colab_type": "code",
"id": "VrV2r-vUq-JV",
"outputId": "c7960849-de81-461f-eed5-1806f9a01ceb",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"dictionary['key-2'] = 'new-value-2'\n",
"dictionary['key-2']"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"'new-value-2'"
]
},
"metadata": {
"tags": []
},
"execution_count": 53
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "_ot60Snlra6K"
},
"cell_type": "markdown",
"source": [
"### Get keys"
]
},
{
"metadata": {
"colab_type": "code",
"id": "Lv726tMhrYnh",
"outputId": "23089819-c637-4e89-a48e-ee5507b4b550",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"list(dictionary.keys())"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['key-1', 'key-2', 'key-3']"
]
},
"metadata": {
"tags": []
},
"execution_count": 55
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "WGJyKhKgrf11"
},
"cell_type": "markdown",
"source": [
"### Get values"
]
},
{
"metadata": {
"colab_type": "code",
"id": "F7F-fNMMrhT5",
"outputId": "f9ace37c-d4de-450d-b4e3-c6b06790fbd0",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"dictionary.values()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"dict_values(['value-1', 'new-value-2', 'value-3'])"
]
},
"metadata": {
"tags": []
},
"execution_count": 56
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "tCpdFg8JrqKM"
},
"cell_type": "markdown",
"source": [
"### Get iterable of key/value pairs"
]
},
{
"metadata": {
"colab_type": "code",
"id": "0dZRJamArlLg",
"outputId": "1d43a628-2f5e-4d4e-c179-b7cbc2373c37",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"dictionary.items()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"dict_items([('key-1', 'value-1'), ('key-2', 'new-value-2'), ('key-3', 'value-3')])"
]
},
"metadata": {
"tags": []
},
"execution_count": 57
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "Cvk5OCmQrvkd"
},
"cell_type": "markdown",
"source": [
"### Use items in for loop"
]
},
{
"metadata": {
"colab_type": "code",
"id": "bzYEfTEWrxno",
"outputId": "b8b1427b-7eb0-4f4c-c36c-3b526d9a09dd",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"cell_type": "code",
"source": [
"for key, value in dictionary.items():\n",
"    print(f\"{key}: {value}\")"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"key-1: value-1\n",
"key-2: new-value-2\n",
"key-3: value-3\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "9fqITS30sJC6"
},
"cell_type": "markdown",
"source": [
"### Check if dictionary has key\n",
"The `in` operator we used with sequences checks a dict's keys for membership."
]
},
{
"metadata": {
"colab_type": "code",
"id": "c1XdFLXNsWVq",
"outputId": "fc5b766b-be93-4d8f-95e0-fd17f981bee0",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"'key-5' in dictionary"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"False"
]
},
"metadata": {
"tags": []
},
"execution_count": 60
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "G4Z2Pk3VsyUJ"
},
"cell_type": "markdown",
"source": [
"### Get method"
]
},
{
"metadata": {
"colab_type": "code",
"id": "FnHIgV8_s1eQ",
"outputId": "50b29063-b0c2-44b8-bd97-4b1da21282a7",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"dictionary.get(\"bad key\", \"default value\")"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"'default value'"
]
},
"metadata": {
"tags": []
},
"execution_count": 64
}
]
},
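{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"For comparison, indexing with a missing key raises `KeyError`, while `get` without a second argument returns `None`. The sketch below uses its own hypothetical dict, `lookup_demo`, so it does not depend on the current state of `dictionary`."
]
},
{
"metadata": {
"colab_type": "code"
},
"cell_type": "code",
"source": [
"lookup_demo = {'key-1': 'value-1'}  # hypothetical example data\n",
"\n",
"try:\n",
"    lookup_demo['bad key']\n",
"except KeyError as error:\n",
"    print(f\"KeyError raised for {error}\")\n",
"\n",
"# get returns None when the key is missing and no default is supplied\n",
"print(lookup_demo.get('bad key'))"
],
"execution_count": null,
"outputs": []
},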
{
"metadata": {
"colab_type": "text",
"id": "YmWqelbJthVB"
},
"cell_type": "markdown",
"source": [
"### Remove item"
]
},
{
"metadata": {
"colab_type": "code",
"id": "tOLuWnHEtkjT",
"outputId": "e90cf788-09a9-4e0a-9231-b29f8c070fab",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"del dictionary['key-1']\n",
"dictionary"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'key-2': 'new-value-2', 'key-3': 'value-3'}"
]
},
"metadata": {
"tags": []
},
"execution_count": 65
}
]
},
{
"metadata": {
"id": "AnFGTgxCiZnk",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### Keys must be hashable"
]
},
{
"metadata": {
"id": "rnKNsxuZinvy",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"#### List as key\n",
"Lists are mutable and not hashable, so they cannot be used as keys."
]
},
{
"metadata": {
"id": "16t9F7PPipo4",
"colab_type": "code",
"outputId": "ad113919-80cc-44cd-cc76-ad1d252dde27",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 198
}
},
"cell_type": "code",
"source": [
"items = ['item-1', 'item-2', 'item-3']\n",
"\n",
"map = {}\n",
"\n",
"map[items] = \"some-value\""
],
"execution_count": 0,
"outputs": [
{
"output_type": "error",
"ename": "TypeError",
"evalue": "ignored",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mmap\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m{\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0mmap\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mitems\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"some-value\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m: unhashable type: 'list'"
]
}
]
},
{
"metadata": {
"id": "_9vXIamIjLER",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"#### Tuple as a key\n",
"Tuples are immutable and hence hashable"
]
},
{
"metadata": {
"id": "K2pFCFu3jNRM",
"colab_type": "code",
"outputId": "5e1fc13f-b76f-45d5-9fff-359f6199972e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"items = 'item-1', 'item-2', 'item-3'\n",
"map = {}\n",
"map[items] = \"some-value\"\n",
"\n",
"map"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{('item-1', 'item-2', 'item-3'): 'some-value'}"
]
},
"metadata": {
"tags": []
},
"execution_count": 67
}
]
},
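{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"The rule behind the two cells above can be sketched as follows. The names `lookup`, `plain_key` and `nested_key` are assumptions made up for this illustration. A tuple works as a key only when everything inside it is hashable, and `frozenset` is the hashable counterpart of `set`.\n",
"\n",
"```python\n",
"plain_key = ('item-1', 'item-2')\n",
"nested_key = ('item-1', ['item-2'])  # holds a mutable list\n",
"\n",
"lookup = {plain_key: 'some-value'}\n",
"\n",
"# A tuple that contains a list cannot be hashed, so it is rejected as a key.\n",
"try:\n",
"    lookup[nested_key] = 'other-value'\n",
"    nested_ok = True\n",
"except TypeError:\n",
"    nested_ok = False\n",
"\n",
"# frozenset is immutable and hashable, so it is accepted as a key.\n",
"lookup[frozenset(['a', 'b'])] = 'frozen-value'\n",
"```"
]
},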
{
"metadata": {
"colab_type": "text",
"id": "Y1rOvrTSaCnA"
},
"cell_type": "markdown",
"source": [
"## 5.3 Dive into sets"
]
},
{
"metadata": {
"colab_type": "text",
"id": "_Bu_DUont1Ks"
},
"cell_type": "markdown",
"source": [
"### Create set from tuple or list"
]
},
{
"metadata": {
"colab_type": "code",
"id": "epPCLrckt4Zy",
"outputId": "df768aae-5c29-492b-a444-d60735fb0f36",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"letters = 'a', 'a', 'a', 'b', 'c'\n",
"unique_letters = set(letters)\n",
"unique_letters"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'a', 'b', 'c'}"
]
},
"metadata": {
"tags": []
},
"execution_count": 68
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "qnQHMNCRuebg"
},
"cell_type": "markdown",
"source": [
"### Create set from a string"
]
},
{
"metadata": {
"colab_type": "code",
"id": "IRsuJBVC_ORB",
"outputId": "ffca987d-0534-4e4f-d660-24318b4c2795",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"unique_chars = set('mississippi')\n",
"unique_chars"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'i', 'm', 'p', 's'}"
]
},
"metadata": {
"tags": []
},
"execution_count": 69
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "LUJyjEh1uzrv"
},
"cell_type": "markdown",
"source": [
"### Create set using curley braces"
]
},
{
"metadata": {
"colab_type": "code",
"id": "afYk3yfTu3Pt",
"outputId": "8fc66ae0-c16c-45e0-d684-41757f68ee81",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"unique_num = {1, 1, 2, 3, 4, 5, 5}\n",
"unique_num"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{1, 2, 3, 4, 5}"
]
},
"metadata": {
"tags": []
},
"execution_count": 70
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "jDMYvC0avu_C"
},
"cell_type": "markdown",
"source": [
"### Adding to a set"
]
},
{
"metadata": {
"colab_type": "code",
"id": "hoZ7hcrBvwtc",
"outputId": "1585f6f0-3122-480e-a69a-022460b0fbc7",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"unique_num.add(6)\n",
"unique_num"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{1, 2, 3, 4, 5, 6}"
]
},
"metadata": {
"tags": []
},
"execution_count": 72
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "o-UWnVB7v8yb"
},
"cell_type": "markdown",
"source": [
"### Popping from a set\n",
"Pop method removes and returns a random element of the set"
]
},
{
"metadata": {
"colab_type": "code",
"id": "e9xWJPXnwF4s",
"outputId": "3d0bc7f2-3dda-4738-975b-011d09f8ec7f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"unique_num.pop()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"2"
]
},
"metadata": {
"tags": []
},
"execution_count": 74
}
]
},
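{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"A short sketch of removing elements from a set, on a fresh set named `numbers` (an assumed name, so the lesson's `unique_num` is left untouched). To remove a specific element rather than an arbitrary one, `remove` and `discard` are available; they differ only in how they treat a missing element.\n",
"\n",
"```python\n",
"numbers = {1, 2, 3, 4, 5, 6}\n",
"\n",
"# pop() removes and returns some element; which one is unspecified.\n",
"popped = numbers.pop()\n",
"\n",
"# discard() silently ignores an element that is not in the set.\n",
"numbers.discard(100)\n",
"\n",
"# remove() raises KeyError for an element that is not in the set.\n",
"try:\n",
"    numbers.remove(100)\n",
"    remove_failed = False\n",
"except KeyError:\n",
"    remove_failed = True\n",
"```"
]
},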
{
"metadata": {
"colab_type": "text",
"id": "VJjl2Fv9vLOl"
},
"cell_type": "markdown",
"source": [
"### Indexing\n",
"Sets have no order, and hence cannot be accessed via indexing"
]
},
{
"metadata": {
"colab_type": "code",
"id": "vxPCJWJOvShU",
"outputId": "8bd70eb2-1c1d-4b64-f23b-dcf278c4a29f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 164
}
},
"cell_type": "code",
"source": [
"unique_num[4]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "error",
"ename": "TypeError",
"evalue": "ignored",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0munique_num\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m: 'set' object does not support indexing"
]
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "TQyoEjgivjA3"
},
"cell_type": "markdown",
"source": [
"### Checking membership"
]
},
{
"metadata": {
"colab_type": "code",
"id": "3zOvkWX6vlf2",
"outputId": "020d8aef-4a5a-45fc-da0c-8e5a74240487",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"3 in unique_num"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"True"
]
},
"metadata": {
"tags": []
},
"execution_count": 76
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "6ph_k8RUweYA"
},
"cell_type": "markdown",
"source": [
"### Set operations"
]
},
{
"metadata": {
"colab_type": "code",
"id": "7EfN4R5xwmXM",
"colab": {}
},
"cell_type": "code",
"source": [
"s1 = { 1 ,2 ,3 ,4, 5, 6, 7}\n",
"s2 = { 0, 2, 4, 6, 8 }"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"colab_type": "text",
"id": "RAG3ATbtwtwm"
},
"cell_type": "markdown",
"source": [
"#### Items in first set, but not in the second"
]
},
{
"metadata": {
"colab_type": "code",
"id": "kCfxlj9Lwz1x",
"outputId": "3c3b92ad-959b-46bf-ecb0-2513886923b9",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"s1 - s2"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{1, 3, 5, 7}"
]
},
"metadata": {
"tags": []
},
"execution_count": 78
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "KDA6bO1Vw8Bk"
},
"cell_type": "markdown",
"source": [
"#### Items in either or both sets"
]
},
{
"metadata": {
"colab_type": "code",
"id": "a-1CBUnRw-y-",
"outputId": "584219f7-80ba-4e24-f3ec-b833428bfc87",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"s1 | s2"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{0, 1, 2, 3, 4, 5, 6, 7, 8}"
]
},
"metadata": {
"tags": []
},
"execution_count": 79
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "ADEQBlFrxIN7"
},
"cell_type": "markdown",
"source": [
"#### Items in both sets"
]
},
{
"metadata": {
"colab_type": "code",
"id": "IIudbePtxRkR",
"outputId": "ad5fcea8-8b15-467d-a895-c11e4caa8609",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"s1 & s2"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{2, 4, 6}"
]
},
"metadata": {
"tags": []
},
"execution_count": 80
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "742awPXgxat5"
},
"cell_type": "markdown",
"source": [
"#### Items in either set, but not both"
]
},
{
"metadata": {
"colab_type": "code",
"id": "03UonZ9G_jDc",
"outputId": "b06f2369-fd8f-42a3-befb-d9af998f7076",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"s1 ^ s2"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{0, 1, 3, 5, 7, 8}"
]
},
"metadata": {
"tags": []
},
"execution_count": 81
}
]
},
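{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Each of the four operators above has a method equivalent. The sketch below repeats the same two sets and stores the results under assumed names. One practical difference is that the methods accept any iterable, while the operators require both operands to be sets.\n",
"\n",
"```python\n",
"s1 = {1, 2, 3, 4, 5, 6, 7}\n",
"s2 = {0, 2, 4, 6, 8}\n",
"\n",
"only_in_first = s1.difference(s2)             # same as s1 - s2\n",
"in_either = s1.union(s2)                      # same as s1 | s2\n",
"in_both = s1.intersection(s2)                 # same as s1 & s2\n",
"in_one_only = s1.symmetric_difference(s2)     # same as s1 ^ s2\n",
"\n",
"# The method form also works with a list argument.\n",
"from_list = s1.intersection([2, 4, 100])\n",
"```"
]
},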
{
"metadata": {
"colab_type": "text",
"id": "y-xMLy31aHaZ"
},
"cell_type": "markdown",
"source": [
"## 5.4 Work with the numpy array\n"
]
},
{
"metadata": {
"colab_type": "text",
"id": "0yTnd14-HasX"
},
"cell_type": "markdown",
"source": [
"Numpy is an opened source numerical computing libary for python. The numpy array is a datastructure representing multidimension arrays which is optimized for both memory and performance."
]
},
{
"metadata": {
"colab_type": "text",
"id": "W7mqP9swyh5d"
},
"cell_type": "markdown",
"source": [
"### Create a numpy array from a list of lists"
]
},
{
"metadata": {
"colab_type": "code",
"id": "NOAicBI_ynGz",
"outputId": "188d1747-f144-47ea-b9d6-887b3b1e88d8",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
}
},
"cell_type": "code",
"source": [
"import numpy as np\n",
"list_of_lists = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]\n",
"\n",
"np_array = np.array(list_of_lists)\n",
"\n",
"np_array"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[ 1, 2, 3, 4],\n",
" [ 5, 6, 7, 8],\n",
" [ 9, 10, 11, 12],\n",
" [13, 14, 15, 16]])"
]
},
"metadata": {
"tags": []
},
"execution_count": 82
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "0eHeRfdAyzTY"
},
"cell_type": "markdown",
"source": [
"### Initialize an array of zeros"
]
},
{
"metadata": {
"colab_type": "code",
"id": "1BoLJ2XYy2op",
"outputId": "9945a4a2-28ed-4775-860d-999f4a1c4d54",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
}
},
"cell_type": "code",
"source": [
"zeros_array = np.zeros( (4, 5) )\n",
"zeros_array"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[0., 0., 0., 0., 0.],\n",
" [0., 0., 0., 0., 0.],\n",
" [0., 0., 0., 0., 0.],\n",
" [0., 0., 0., 0., 0.]])"
]
},
"metadata": {
"tags": []
},
"execution_count": 83
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "92DsgsVSzC2T"
},
"cell_type": "markdown",
"source": [
"### Initialize and array of ones"
]
},
{
"metadata": {
"colab_type": "code",
"id": "F_TWC3z6zYA2",
"outputId": "2d7317b1-5b89-44b5-844d-ceec8b50ff86",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 119
}
},
"cell_type": "code",
"source": [
"ones_array = np.ones( (6, 6) )\n",
"ones_array"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[1., 1., 1., 1., 1., 1.],\n",
" [1., 1., 1., 1., 1., 1.],\n",
" [1., 1., 1., 1., 1., 1.],\n",
" [1., 1., 1., 1., 1., 1.],\n",
" [1., 1., 1., 1., 1., 1.],\n",
" [1., 1., 1., 1., 1., 1.]])"
]
},
"metadata": {
"tags": []
},
"execution_count": 84
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "aHaTMcOIzk1_"
},
"cell_type": "markdown",
"source": [
"### Using arrange"
]
},
{
"metadata": {
"colab_type": "code",
"id": "s2W1mGqkzt7O",
"outputId": "ee90b952-d931-453c-9f60-d44c8801feed",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"nine = np.arange( 9 )\n",
"nine"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 5, 6, 7, 8])"
]
},
"metadata": {
"tags": []
},
"execution_count": 85
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "Ejoa6qiszx-O"
},
"cell_type": "markdown",
"source": [
"### Using reshape"
]
},
{
"metadata": {
"colab_type": "code",
"id": "RgJ6ZfVfOm6R",
"outputId": "67f363e4-a9cd-477f-9ed1-6165d9fd3ac5",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"cell_type": "code",
"source": [
"nine.reshape(3,3)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[0, 1, 2],\n",
" [3, 4, 5],\n",
" [6, 7, 8]])"
]
},
"metadata": {
"tags": []
},
"execution_count": 86
}
]
},
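{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"A small sketch of two details of `reshape`, using an assumed 12-element array named `twelve`. The new shape must hold exactly the same number of items, and one dimension may be given as `-1` so that NumPy infers it.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"twelve = np.arange(12)\n",
"\n",
"# -1 asks NumPy to work out the missing dimension (here 4 columns).\n",
"grid = twelve.reshape(3, -1)\n",
"\n",
"# A shape whose item count differs from the original raises ValueError.\n",
"try:\n",
"    twelve.reshape(5, 2)\n",
"    mismatch_raised = False\n",
"except ValueError:\n",
"    mismatch_raised = True\n",
"```"
]
},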
{
"metadata": {
"colab_type": "text",
"id": "lB_e_dbx03bL"
},
"cell_type": "markdown",
"source": [
"### Introspection"
]
},
{
"metadata": {
"colab_type": "text",
"id": "___j8Jrc06k8"
},
"cell_type": "markdown",
"source": [
"#### Get the data type"
]
},
{
"metadata": {
"colab_type": "code",
"id": "IEDq1Hcu0-z_",
"outputId": "45152487-40ba-4f23-d1c0-11bb969e21a4",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"np_array.dtype"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"dtype('int64')"
]
},
"metadata": {
"tags": []
},
"execution_count": 87
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "OTREEzKu1CU1"
},
"cell_type": "markdown",
"source": [
"#### Get the array's shape"
]
},
{
"metadata": {
"colab_type": "code",
"id": "JZ-1Vxj41GKC",
"outputId": "54b58943-55aa-4b94-ee19-81cc036512ef",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"np_array.shape"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"(4, 4)"
]
},
"metadata": {
"tags": []
},
"execution_count": 88
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "OeY1bxyJ1Idv"
},
"cell_type": "markdown",
"source": [
"#### Get the number of items in the array"
]
},
{
"metadata": {
"colab_type": "code",
"id": "9-MK_t881LVg",
"outputId": "557d0f24-8814-4378-8926-144d8aacbe6e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"np_array.size"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"16"
]
},
"metadata": {
"tags": []
},
"execution_count": 89
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "SifJ6bra1TQ0"
},
"cell_type": "markdown",
"source": [
"#### Get the size of the array in bytes"
]
},
{
"metadata": {
"colab_type": "code",
"id": "5YzTBEid1YsX",
"outputId": "9b09f32c-a717-40c3-d048-8c7559f95a17",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"np_array.nbytes"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"128"
]
},
"metadata": {
"tags": []
},
"execution_count": 90
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "toHDZHec1f9F"
},
"cell_type": "markdown",
"source": [
"### Setting the data type"
]
},
{
"metadata": {
"colab_type": "text",
"id": "7bdGC31e1pxo"
},
"cell_type": "markdown",
"source": [
"#### dtype parameter"
]
},
{
"metadata": {
"colab_type": "code",
"id": "3quViOL01sy4",
"outputId": "ea7f0ea3-7b16-4b23-fb7b-ea457ee8ea4b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
}
},
"cell_type": "code",
"source": [
"np_array = np.array(list_of_lists, dtype=np.int8)\n",
"np_array"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[ 1, 2, 3, 4],\n",
" [ 5, 6, 7, 8],\n",
" [ 9, 10, 11, 12],\n",
" [13, 14, 15, 16]], dtype=int8)"
]
},
"metadata": {
"tags": []
},
"execution_count": 91
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "xd-_Qgqo1yyu"
},
"cell_type": "markdown",
"source": [
"#### Size reduction"
]
},
{
"metadata": {
"colab_type": "code",
"id": "pFk4n7d312LH",
"outputId": "7c84121e-c3b4-4145-8fd7-7cd065ee7240",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"np_array.nbytes"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"16"
]
},
"metadata": {
"tags": []
},
"execution_count": 92
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "YOzWWnLF1-m2"
},
"cell_type": "markdown",
"source": [
"#### The data type setting is immutible \n",
"Data may be truncated if the data type is restrictive."
]
},
{
"metadata": {
"colab_type": "code",
"id": "2MPe95vlLpwb",
"outputId": "7c158bcc-71b5-45c3-a678-132b2c0c19bd",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"np_array[0][0] = 1.7344567\n",
"np_array[0][0]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"1"
]
},
"metadata": {
"tags": []
},
"execution_count": 93
}
]
},
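{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"The truncation above can be avoided by converting to a wider type first. The sketch below uses an assumed 2 x 2 array named `small`; `astype` returns a new array with the requested data type and leaves the original unchanged.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"small = np.array([[1, 2], [3, 4]], dtype=np.int8)\n",
"\n",
"# Assigning a float does not change the dtype; the value is cast to int8.\n",
"small[0, 0] = 1.73\n",
"stored = small[0, 0]\n",
"\n",
"# astype() builds a new float64 array that can hold the fraction.\n",
"as_float = small.astype(np.float64)\n",
"as_float[0, 0] = 1.73\n",
"```\n",
"\n",
"The wider type costs memory: here `small.nbytes` is 4 while `as_float.nbytes` is 32."
]
},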
{
"metadata": {
"colab_type": "text",
"id": "vvPhjEXoWYnT"
},
"cell_type": "markdown",
"source": [
"### Array Slicing\n",
"\n",
"\n",
"* Slicing can be used to get a view reprsenting a sub-array. \n",
"* The slice is a view to the original array, the data is not copied to a new data structure\n",
"* The slice is taken in the form: array[ rows, columns ]\n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"metadata": {
"colab_type": "code",
"id": "tDjuERnX2hUp",
"outputId": "74052c3f-0c04-45e3-dc68-633903d0e939",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
}
},
"cell_type": "code",
"source": [
"np_array"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[ 1, 2, 3, 4],\n",
" [ 5, 6, 7, 8],\n",
" [ 9, 10, 11, 12],\n",
" [13, 14, 15, 16]], dtype=int8)"
]
},
"metadata": {
"tags": []
},
"execution_count": 94
}
]
},
{
"metadata": {
"colab_type": "code",
"id": "JBEp4F7uWMmY",
"outputId": "eb720ba3-9a3d-41c4-8089-7f62ca88d854",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
}
},
"cell_type": "code",
"source": [
"np_array[2:, :3]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[ 9, 10, 11],\n",
" [13, 14, 15]], dtype=int8)"
]
},
"metadata": {
"tags": []
},
"execution_count": 95
}
]
},
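{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Because a slice is a view, writing through it changes the original array. The sketch below demonstrates this on an assumed 4 x 4 array named `grid`, and shows `copy()` as the way to get independent data.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"grid = np.arange(1, 17).reshape(4, 4)\n",
"\n",
"# The slice shares memory with grid, so this write reaches grid[2, 0].\n",
"view = grid[2:, :3]\n",
"view[0, 0] = -1\n",
"\n",
"# copy() detaches the data, so this write leaves grid[2, 1] unchanged.\n",
"independent = grid[2:, :3].copy()\n",
"independent[0, 1] = -2\n",
"\n",
"shares = np.shares_memory(grid, view)\n",
"```"
]
},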
{
"metadata": {
"colab_type": "text",
"id": "DZw_vVRta7q8"
},
"cell_type": "markdown",
"source": [
"### Math operations\n",
"\n",
"\n",
"* Unlike a unlike nested lists, matrix operations perform mathimatical operations on data\n",
"\n"
]
},
{
"metadata": {
"colab_type": "text",
"id": "DcuhDzog3k7b"
},
"cell_type": "markdown",
"source": [
"#### Create two 3 x 3 arrays"
]
},
{
"metadata": {
"colab_type": "code",
"id": "Hk4TSZMp3ueg",
"outputId": "f9c0a3be-9eaa-4d19-df19-53de928ffde4",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"cell_type": "code",
"source": [
"np_array_1 = np.arange(9).reshape(3,3)\n",
"np_array_1\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[0, 1, 2],\n",
" [3, 4, 5],\n",
" [6, 7, 8]])"
]
},
"metadata": {
"tags": []
},
"execution_count": 96
}
]
},
{
"metadata": {
"colab_type": "code",
"id": "BGUAuzGW32MZ",
"outputId": "f84c69d5-cf39-40f3-8716-1b2950a8200a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"cell_type": "code",
"source": [
"np_array_2 = np.arange(10, 19).reshape(3,3)\n",
"np_array_2"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[10, 11, 12],\n",
" [13, 14, 15],\n",
" [16, 17, 18]])"
]
},
"metadata": {
"tags": []
},
"execution_count": 97
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "JtgSgziO3-zG"
},
"cell_type": "markdown",
"source": [
"#### Multiply the arrays"
]
},
{
"metadata": {
"colab_type": "code",
"id": "DnsoTaV64B6s",
"outputId": "1ff2e5fe-8c02-4a14-e447-489a91ae7dab",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"cell_type": "code",
"source": [
"np_array_1 * np_array_2"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[ 0, 11, 24],\n",
" [ 39, 56, 75],\n",
" [ 96, 119, 144]])"
]
},
"metadata": {
"tags": []
},
"execution_count": 98
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "bLZEFu6F4RZ9"
},
"cell_type": "markdown",
"source": [
"#### Add the arrays"
]
},
{
"metadata": {
"colab_type": "code",
"id": "gtImxWZW4TIK",
"outputId": "0549bcbe-a4ec-4563-d8e3-a9b5faa12f1c",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"cell_type": "code",
"source": [
"np_array_1 + np_array_2"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[10, 12, 14],\n",
" [16, 18, 20],\n",
" [22, 24, 26]])"
]
},
"metadata": {
"tags": []
},
"execution_count": 99
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "7R8mt4RU4yMz"
},
"cell_type": "markdown",
"source": [
"### Matrix operations"
]
},
{
"metadata": {
"colab_type": "text",
"id": "1OU4oaWt40Jp"
},
"cell_type": "markdown",
"source": [
"#### Transpose"
]
},
{
"metadata": {
"colab_type": "code",
"id": "EXIwtxD942UO",
"outputId": "ee5697f7-97bd-4b89-d053-771afa6fa69b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
}
},
"cell_type": "code",
"source": [
"np_array.T"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[ 1, 5, 9, 13],\n",
" [ 2, 6, 10, 14],\n",
" [ 3, 7, 11, 15],\n",
" [ 4, 8, 12, 16]], dtype=int8)"
]
},
"metadata": {
"tags": []
},
"execution_count": 100
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "oi0b6o6o46E9"
},
"cell_type": "markdown",
"source": [
"#### Dot product"
]
},
{
"metadata": {
"colab_type": "code",
"id": "aQgYwiPpbxIG",
"outputId": "7fd1509e-d41a-40f0-f543-eebe891d7157",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"cell_type": "code",
"source": [
"np_array_1.dot(np_array_2)\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[ 45, 48, 51],\n",
" [162, 174, 186],\n",
" [279, 300, 321]])"
]
},
"metadata": {
"tags": []
},
"execution_count": 101
}
]
},
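{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"A brief sketch contrasting the two kinds of multiplication on the same 3 x 3 arrays, under the assumed names `a` and `b`. The `*` operator multiplies matching elements, while the `@` operator performs matrix multiplication and, for 2-D arrays, gives the same result as `dot`.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"a = np.arange(9).reshape(3, 3)\n",
"b = np.arange(10, 19).reshape(3, 3)\n",
"\n",
"# Element-wise: each entry is a[i, j] * b[i, j].\n",
"elementwise = a * b\n",
"\n",
"# Matrix product: rows of a combined with columns of b.\n",
"matrix_product = a @ b\n",
"```"
]
},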
{
"metadata": {
"colab_type": "text",
"id": "z-hjI2NsaK4w"
},
"cell_type": "markdown",
"source": [
"## 5.5 Use the Pandas DataFrame\n",
"* One of the most highly leveraged data structures for data science\n",
"* A table-like two dimensional data structure. \n"
]
},
{
"metadata": {
"colab_type": "text",
"id": "e8Dxdc4V6wlV"
},
"cell_type": "markdown",
"source": [
"### Create a DataFrame"
]
},
{
"metadata": {
"colab_type": "code",
"id": "73JaHcb261eb",
"outputId": "95ec9725-3f0e-472d-c9e1-6439bfeac6ff",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
}
},
"cell_type": "code",
"source": [
"import pandas as pd\n",
"first_names = ['henry', 'rolly', 'molly', 'frank', 'david', 'steven', 'gwen', 'arthur']\n",
"last_names = ['smith', 'brocker', 'stein', 'bach', 'spencer', 'de wilde', 'mason', 'davis']\n",
"ages = [43, 23, 78, 56, 26, 14, 46, 92]\n",
"\n",
"df = pd.DataFrame({ 'first': first_names, 'last': last_names, 'age': ages})\n",
"df"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" age | \n",
" first | \n",
" last | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 43 | \n",
" henry | \n",
" smith | \n",
"
\n",
" \n",
" 1 | \n",
" 23 | \n",
" rolly | \n",
" brocker | \n",
"
\n",
" \n",
" 2 | \n",
" 78 | \n",
" molly | \n",
" stein | \n",
"
\n",
" \n",
" 3 | \n",
" 56 | \n",
" frank | \n",
" bach | \n",
"
\n",
" \n",
" 4 | \n",
" 26 | \n",
" david | \n",
" spencer | \n",
"
\n",
" \n",
" 5 | \n",
" 14 | \n",
" steven | \n",
" de wilde | \n",
"
\n",
" \n",
" 6 | \n",
" 46 | \n",
" gwen | \n",
" mason | \n",
"
\n",
" \n",
" 7 | \n",
" 92 | \n",
" arthur | \n",
" davis | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" age first last\n",
"0 43 henry smith\n",
"1 23 rolly brocker\n",
"2 78 molly stein\n",
"3 56 frank bach\n",
"4 26 david spencer\n",
"5 14 steven de wilde\n",
"6 46 gwen mason\n",
"7 92 arthur davis"
]
},
"metadata": {
"tags": []
},
"execution_count": 103
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "ut_QqgQi7CvX"
},
"cell_type": "markdown",
"source": [
"### Head - looking at the top"
]
},
{
"metadata": {
"colab_type": "code",
"id": "FN7tXlFV7FiE",
"outputId": "f48221e8-bcee-4d3c-e2e0-e3bd130b202d",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
}
},
"cell_type": "code",
"source": [
"df.head(10)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" age | \n",
" first | \n",
" last | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 43 | \n",
" henry | \n",
" smith | \n",
"
\n",
" \n",
" 1 | \n",
" 23 | \n",
" rolly | \n",
" brocker | \n",
"
\n",
" \n",
" 2 | \n",
" 78 | \n",
" molly | \n",
" stein | \n",
"
\n",
" \n",
" 3 | \n",
" 56 | \n",
" frank | \n",
" bach | \n",
"
\n",
" \n",
" 4 | \n",
" 26 | \n",
" david | \n",
" spencer | \n",
"
\n",
" \n",
" 5 | \n",
" 14 | \n",
" steven | \n",
" de wilde | \n",
"
\n",
" \n",
" 6 | \n",
" 46 | \n",
" gwen | \n",
" mason | \n",
"
\n",
" \n",
" 7 | \n",
" 92 | \n",
" arthur | \n",
" davis | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" age first last\n",
"0 43 henry smith\n",
"1 23 rolly brocker\n",
"2 78 molly stein\n",
"3 56 frank bach\n",
"4 26 david spencer\n",
"5 14 steven de wilde\n",
"6 46 gwen mason\n",
"7 92 arthur davis"
]
},
"metadata": {
"tags": []
},
"execution_count": 106
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "2lZEPhBd7PGN"
},
"cell_type": "markdown",
"source": [
"### Setting number of rows returned with head"
]
},
{
"metadata": {
"colab_type": "code",
"id": "mO583bRm7J9y",
"colab": {}
},
"cell_type": "code",
"source": [
"df.head(3)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"colab_type": "text",
"id": "fP0Szs_k7ZwM"
},
"cell_type": "markdown",
"source": [
"### Tail - looking at the bottom"
]
},
{
"metadata": {
"colab_type": "code",
"id": "lWpCA6lh7dIZ",
"outputId": "6ad75973-5eba-4e1f-e70f-709f20036be7",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 111
}
},
"cell_type": "code",
"source": [
"df.tail(2)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" age | \n",
" first | \n",
" last | \n",
"
\n",
" \n",
" \n",
" \n",
" 6 | \n",
" 46 | \n",
" gwen | \n",
" mason | \n",
"
\n",
" \n",
" 7 | \n",
" 92 | \n",
" arthur | \n",
" davis | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" age first last\n",
"6 46 gwen mason\n",
"7 92 arthur davis"
]
},
"metadata": {
"tags": []
},
"execution_count": 108
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "aMcpAbpW7sKB"
},
"cell_type": "markdown",
"source": [
"### Describe - descriptive statistics"
]
},
{
"metadata": {
"colab_type": "code",
"id": "c1SCEPeB7xIi",
"outputId": "ad35757e-f072-4208-e853-1f25dc24614e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
}
},
"cell_type": "code",
"source": [
"df.describe()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" age | \n",
"
\n",
" \n",
" \n",
" \n",
" count | \n",
" 8.000000 | \n",
"
\n",
" \n",
" mean | \n",
" 47.250000 | \n",
"
\n",
" \n",
" std | \n",
" 27.227874 | \n",
"
\n",
" \n",
" min | \n",
" 14.000000 | \n",
"
\n",
" \n",
" 25% | \n",
" 25.250000 | \n",
"
\n",
" \n",
" 50% | \n",
" 44.500000 | \n",
"
\n",
" \n",
" 75% | \n",
" 61.500000 | \n",
"
\n",
" \n",
" max | \n",
" 92.000000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" age\n",
"count 8.000000\n",
"mean 47.250000\n",
"std 27.227874\n",
"min 14.000000\n",
"25% 25.250000\n",
"50% 44.500000\n",
"75% 61.500000\n",
"max 92.000000"
]
},
"metadata": {
"tags": []
},
"execution_count": 109
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "Of3owwjI71cr"
},
"cell_type": "markdown",
"source": [
"### Access one column"
]
},
{
"metadata": {
"colab_type": "code",
"id": "siMCagaq74bO",
"outputId": "99907925-afaf-459c-bb9f-63a7b9bf25e2",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 170
}
},
"cell_type": "code",
"source": [
"df['first']"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0 henry\n",
"1 rolly\n",
"2 molly\n",
"3 frank\n",
"4 david\n",
"5 steven\n",
"6 gwen\n",
"7 arthur\n",
"Name: first, dtype: object"
]
},
"metadata": {
"tags": []
},
"execution_count": 110
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "KdOsmAk67-FA"
},
"cell_type": "markdown",
"source": [
"### Slice a column"
]
},
{
"metadata": {
"colab_type": "code",
"id": "dNl_CTuk8Bip",
"outputId": "df4f7aa6-4457-4c77-f2cb-5c77c227d072",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 102
}
},
"cell_type": "code",
"source": [
"df['first'][4:]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"4 david\n",
"5 steven\n",
"6 gwen\n",
"7 arthur\n",
"Name: first, dtype: object"
]
},
"metadata": {
"tags": []
},
"execution_count": 111
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "H3iUwAI-8TFp"
},
"cell_type": "markdown",
"source": [
"### Use conditions to filter"
]
},
{
"metadata": {
"colab_type": "code",
"id": "pjYmR1d0fbUh",
"outputId": "0add25f2-e4c6-4a21-98d8-6c77cdacea4e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 142
}
},
"cell_type": "code",
"source": [
"df[df['age'] > 50]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" age | \n",
" first | \n",
" last | \n",
"
\n",
" \n",
" \n",
" \n",
" 2 | \n",
" 78 | \n",
" molly | \n",
" stein | \n",
"
\n",
" \n",
" 3 | \n",
" 56 | \n",
" frank | \n",
" bach | \n",
"
\n",
" \n",
" 7 | \n",
" 92 | \n",
" arthur | \n",
" davis | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" age first last\n",
"2 78 molly stein\n",
"3 56 frank bach\n",
"7 92 arthur davis"
]
},
"metadata": {
"tags": []
},
"execution_count": 112
}
]
},
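{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Conditions can also be combined. The sketch below rebuilds a shortened, four-row version of the table under the assumed name `people` so that it runs on its own. Each condition needs its own parentheses, and `&` and `|` are used in place of `and` and `or`.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"people = pd.DataFrame({\n",
"    'first': ['henry', 'rolly', 'molly', 'frank'],\n",
"    'last': ['smith', 'brocker', 'stein', 'bach'],\n",
"    'age': [43, 23, 78, 56],\n",
"})\n",
"\n",
"# Rows where both conditions hold.\n",
"middle = people[(people['age'] > 30) & (people['age'] < 60)]\n",
"\n",
"# .loc filters rows by condition and picks columns by label in one step.\n",
"names = people.loc[people['age'] > 50, ['first', 'last']]\n",
"```"
]
},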
{
"metadata": {
"colab_type": "text",
"id": "mU_PWc0zaO0X"
},
"cell_type": "markdown",
"source": [
"## 5.6 Use the pandas Series\n",
"\n",
"\n",
"* A one dimensional labeled array\n",
"* Contains data of only one type\n",
"* Similar to a column in a spreedsheet\n",
"\n",
"\n"
]
},
{
"metadata": {
"colab_type": "text",
"id": "8IJJSLsN8kAq"
},
"cell_type": "markdown",
"source": [
"### Create a series"
]
},
{
"metadata": {
"colab_type": "code",
"id": "cdGYE9LdnSwU",
"outputId": "c4a7433b-e15c-4f27-d727-72cf26e9de70",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
}
},
"cell_type": "code",
"source": [
"pd_series = pd.Series( [1, 2, 3 ] )\n",
"pd_series"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0 1\n",
"1 2\n",
"2 3\n",
"dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 114
}
]
},
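{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"The labels of a Series do not have to be the default integers. A minimal sketch, with made-up labels and the assumed name `ages`: passing `index` sets the labels, after which values can be read by label or, through `iloc`, by position.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"ages = pd.Series([43, 23, 78], index=['henry', 'rolly', 'molly'])\n",
"\n",
"# Look up by label.\n",
"molly_age = ages['molly']\n",
"\n",
"# Look up by position, regardless of the labels.\n",
"first_age = ages.iloc[0]\n",
"```"
]
},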
{
"metadata": {
"colab_type": "text",
"id": "GzZyCmanpu2p"
},
"cell_type": "markdown",
"source": [
"### Series introspection methods"
]
},
{
"metadata": {
"colab_type": "code",
"id": "RlsP8h-KpxxE",
"outputId": "5eb734b2-d44e-422c-c53b-fb0fdca4b1ba",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"f\"This series is made up of {pd_series.size} items whose data type is {pd_series.dtype}\""
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"'This series is made up of 3 items whose data type is int64'"
]
},
"metadata": {
"tags": []
},
"execution_count": 115
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "OgKF9C9an8iB"
},
"cell_type": "markdown",
"source": [
"### A Pandas DataFrame is composed of Pandas Series. "
]
},
{
"metadata": {
"colab_type": "code",
"id": "wAmSud2ToHuh",
"outputId": "9afc9fb8-bef8-4f3f-95ac-8d26f9375233",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"age = df.age\n",
"type( age )"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"pandas.core.series.Series"
]
},
"metadata": {
"tags": []
},
"execution_count": 116
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "DooLFZH1ovCV"
},
"cell_type": "markdown",
"source": [
"### Some useful helper methods of a Series"
]
},
{
"metadata": {
"colab_type": "text",
"id": "iyDtUDT3JRuY"
},
"cell_type": "markdown",
"source": [
"#### mean"
]
},
{
"metadata": {
"colab_type": "code",
"id": "9gi2tOSTJTrI",
"outputId": "1fbee477-4a35-422d-949b-1fe0b180af04",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"pd_series = pd.Series([ 1, 2, 3, 5, 6, 6, 6, 7, 8])\n",
"pd_series.mean()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"4.888888888888889"
]
},
"metadata": {
"tags": []
},
"execution_count": 117
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "bILvFviLJbok"
},
"cell_type": "markdown",
"source": [
"#### Unique"
]
},
{
"metadata": {
"colab_type": "code",
"id": "azsZ0xzcJdN2",
"outputId": "f8371eb1-959f-4726-acd1-e00729f9f2d6",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"pd_series.unique()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([1, 2, 3, 5, 6, 7, 8])"
]
},
"metadata": {
"tags": []
},
"execution_count": 118
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "GT2CGOucJgjz"
},
"cell_type": "markdown",
"source": [
"#### Max"
]
},
{
"metadata": {
"colab_type": "code",
"id": "s-PPBB7Co385",
"outputId": "27c23504-0120-489f-b52b-0c6fbd49b8d9",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"pd_series.min()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"1"
]
},
"metadata": {
"tags": []
},
"execution_count": 120
}
]
},
{
"metadata": {
"colab_type": "text",
"id": "g2UU5cr6rna6"
},
"cell_type": "markdown",
"source": [
"# Notes:\n",
"[Lists](https://docs.python.org/3/tutorial/datastructures.html)\n",
"\n",
"[Tuples and sequences](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences)\n",
"\n",
"[Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries)\n",
"\n",
"[Numpy arrays](https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html)\n",
"\n",
"[Pandas DataFrame](https://pandas.pydata.org/pandas-docs/version/0.21/generated/pandas.DataFrame.html)\n",
"\n",
"[Pandas Series](https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.Series.html)\n",
"\n"
]
}
]
}