{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Lesson4-Python For Data Science-Strings.ipynb", "version": "0.3.2", "provenance": [], "collapsed_sections": [], "include_colab_link": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" } }, "cells": [ { "metadata": { "colab_type": "text", "id": "cTNA1xgJSTZl" }, "cell_type": "markdown", "source": [ "# Lesson 4 Strings" ] }, { "metadata": { "id": "c_Id55m6Jsbu", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## Pragmatic AI Labs\n", "\n" ] }, { "metadata": { "id": "e5p96AqpSDZa", "colab_type": "text" }, "cell_type": "markdown", "source": [ "![alt text](https://paiml.com/images/logo_with_slogan_white_background.png)\n", "\n", "This notebook was produced by [Pragmatic AI Labs](https://paiml.com/). You can continue learning about these topics by:\n", "\n", "* Buying a copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](http://www.informit.com/store/pragmatic-ai-an-introduction-to-cloud-based-machine-9780134863917)\n", "* Reading an online copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](https://www.safaribooksonline.com/library/view/pragmatic-ai-an/9780134863924/)\n", "* Watching video [Essential Machine Learning and AI with Python and Jupyter Notebook-Video-SafariOnline](https://www.safaribooksonline.com/videos/essential-machine-learning/9780135261118) on Safari Books Online.\n", "* Watching video [AWS Certified Machine Learning-Speciality](https://learning.oreilly.com/videos/aws-certified-machine/9780135556597)\n", "* Purchasing video [Essential Machine Learning and AI with Python and Jupyter Notebook- Purchase Video](http://www.informit.com/store/essential-machine-learning-and-ai-with-python-and-jupyter-9780135261095)\n", "* Viewing more content at 
[noahgift.com](https://noahgift.com/)\n" ] }, { "metadata": { "id": "pBTeTbnRKG_k", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "" ], "execution_count": 0, "outputs": [] }, { "metadata": { "colab_type": "text", "id": "GbAj3ReBSbc6" }, "cell_type": "markdown", "source": [ "## 4.1 Use string methods\n", "\n" ] }, { "metadata": { "colab_type": "text", "id": "Gby8qdr8Tqkh" }, "cell_type": "markdown", "source": [ "### String Quoting" ] }, { "metadata": { "id": "D6asCVcEwTaV", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Single quotes" ] }, { "metadata": { "id": "1lgW3xygwZG7", "colab_type": "code", "outputId": "9454a4be-5203-46e8-b457-7136571f76f0", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'Here is a string'" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Here is a string'" ] }, "metadata": { "tags": [] }, "execution_count": 52 } ] }, { "metadata": { "id": "mG-QrYoOKmyh", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Double quotes" ] }, { "metadata": { "id": "OFKfbG4_wd0k", "colab_type": "code", "outputId": "14ac0f9c-1ee8-4657-e028-f97c2ae6fc65", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "\"Here is a string\" == 'Here is a string'" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, "execution_count": 53 } ] }, { "metadata": { "colab_type": "text", "id": "YYUhAN5NTuu3" }, "cell_type": "markdown", "source": [ "#### Triple Strings" ] }, { "metadata": { "colab_type": "code", "id": "spMcAUfWTxB7", "outputId": "2518b205-fdcc-4db3-dd84-630b00165403", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "cell_type": "code", "source": [ "a_very_large_phrase = \"\"\"\n", "Wikipedia is hosted by the Wikimedia Foundation, \n", "a non-profit 
organization that also hosts a range of other projects.\n", "\"\"\"\n", "\n", "print(a_very_large_phrase)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "\n", "Wikipedia is hosted by the Wikimedia Foundation, \n", "a non-profit organization that also hosts a range of other projects.\n", "\n" ], "name": "stdout" } ] }, { "metadata": { "colab_type": "text", "id": "XAMwuppiVn22" }, "cell_type": "markdown", "source": [ "#### Raw Strings" ] }, { "metadata": { "colab_type": "code", "id": "txx8Ps8hVqf5", "outputId": "65ca268a-5181-463e-d41f-c5f86f26a007", "colab": { "base_uri": "https://localhost:8080/", "height": 51 } }, "cell_type": "code", "source": [ "jon_jones = '...wrote on twitter he is the greatest \"heavyw8e! \\nfighter of all time'\n", "print(jon_jones)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "...wrote on twitter he is the greatest \"heavyw8e! \n", "fighter of all time\n" ], "name": "stdout" } ] }, { "metadata": { "id": "eKuBaGBex7M9", "colab_type": "code", "outputId": "6a6722a0-6693-459f-ecd8-262232c6bc76", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "jon_jones = r'...wrote on twitter he is the greatest \"heavyw8e! \\nfighter of all time'\n", "print(jon_jones)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "...wrote on twitter he is the greatest \"heavyw8e! 
\\nfighter of all time\n" ], "name": "stdout" } ] }, { "metadata": { "colab_type": "text", "id": "7JN8KRAruJdb" }, "cell_type": "markdown", "source": [ "### Case Manipulation" ] }, { "metadata": { "colab_type": "code", "id": "1ZnmUYK8fI9V", "outputId": "aff21c20-23e3-4e32-d66f-82ddb28250d1", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "captain = \"Patrick Tayluer\"\n", "\n", "captain" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Patrick Tayluer'" ] }, "metadata": { "tags": [] }, "execution_count": 57 } ] }, { "metadata": { "id": "zpgza2QByRP1", "colab_type": "code", "outputId": "468e3f71-87c5-41dc-d312-a704bff2713f", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "captain.capitalize()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Patrick tayluer'" ] }, "metadata": { "tags": [] }, "execution_count": 58 } ] }, { "metadata": { "colab_type": "code", "id": "sPgdoiu-tWrp", "outputId": "f1763b08-f21e-4e94-aa61-186dc7a36b32", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "captain.lower()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'patrick tayluer'" ] }, "metadata": { "tags": [] }, "execution_count": 59 } ] }, { "metadata": { "colab_type": "code", "id": "-K13_7fmtxjw", "outputId": "e3569755-809f-4979-d4a5-af40b821f7b4", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "captain.upper()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'PATRICK TAYLUER'" ] }, "metadata": { "tags": [] }, "execution_count": 60 } ] }, { "metadata": { "colab_type": "code", "id": "6jlry9S4tcEj", "outputId": "3c9d4cf5-470d-425a-d7b5-0c6583fda388", "colab": { 
"base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "captain.swapcase()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'pATRICK tAYLUER'" ] }, "metadata": { "tags": [] }, "execution_count": 61 } ] }, { "metadata": { "colab_type": "code", "id": "IoCZQxuutvYA", "outputId": "c9a0ac3a-9260-4986-8a97-d7fc5a387e01", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "captain = 'patrick tayluer'\n", "captain.title()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Patrick Tayluer'" ] }, "metadata": { "tags": [] }, "execution_count": 62 } ] }, { "metadata": { "colab_type": "text", "id": "8CjgNkacldql" }, "cell_type": "markdown", "source": [ "### Interrogation" ] }, { "metadata": { "colab_type": "code", "id": "TjQozbkOlnmI", "colab": {} }, "cell_type": "code", "source": [ "river = 'Mississippi'\n" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "AgbQmY7KCWJN", "colab_type": "code", "outputId": "470f7c59-828b-4111-c00a-80be613422cb", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "len(river)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "11" ] }, "metadata": { "tags": [] }, "execution_count": 64 } ] }, { "metadata": { "id": "QFOItL32CTzZ", "colab_type": "code", "outputId": "63e4217c-08f0-4b24-b0f2-b80aa37f5c52", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "river.count('s')" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "4" ] }, "metadata": { "tags": [] }, "execution_count": 65 } ] }, { "metadata": { "colab_type": "code", "id": "JiicVyEgmEZK", "outputId": "281e2797-1349-4f69-9db8-f8d6cc50f9ec", "colab": { "base_uri": "https://localhost:8080/", "height": 
34 } }, "cell_type": "code", "source": [ "river.index('pp')" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "8" ] }, "metadata": { "tags": [] }, "execution_count": 66 } ] }, { "metadata": { "colab_type": "code", "id": "H3RK0w0NmZwg", "outputId": "078f9c14-25f8-4c51-b45f-aaebf46e9a7f", "colab": { "base_uri": "https://localhost:8080/", "height": 164 } }, "cell_type": "code", "source": [ "river.index('r')" ], "execution_count": 0, "outputs": [ { "output_type": "error", "ename": "ValueError", "evalue": "ignored", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mriver\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'r'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mValueError\u001b[0m: substring not found" ] } ] }, { "metadata": { "colab_type": "code", "id": "YFW7wFi5mzRw", "outputId": "7aa1d8d2-13f3-40a7-9467-823630d316a8", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "river.find('r')" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "-1" ] }, "metadata": { "tags": [] }, "execution_count": 68 } ] }, { "metadata": { "colab_type": "code", "id": "qk70XczYm1x4", "outputId": "1a2cedc1-096d-46c1-dafc-fc0c74618994", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "river.startswith('M')" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, "execution_count": 69 } ] }, { "metadata": { "colab_type": "code", "id": "BOh9jjnwm7GQ", "outputId": 
"1f31ca0a-23d6-416b-ac08-07f8ae2ea29f", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "river.endswith('i')" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, "execution_count": 70 } ] }, { "metadata": { "colab_type": "code", "id": "fJ-HzGr_NmAn", "outputId": "d4c16da5-ffc4-4f77-d4f8-5af0207d7050", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'sip' in river" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, "execution_count": 71 } ] }, { "metadata": { "colab_type": "text", "id": "NHcQp9irnPdB" }, "cell_type": "markdown", "source": [ "### Content Type" ] }, { "metadata": { "colab_type": "code", "id": "UJcGL9BgnSzJ", "outputId": "8b6ec7c0-f236-45f6-ed87-41067483fa31", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'abc123'.isalpha()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "False" ] }, "metadata": { "tags": [] }, "execution_count": 73 } ] }, { "metadata": { "colab_type": "code", "id": "lEEfmohingDN", "outputId": "86982e7b-ab38-47e8-f350-df46689a16f9", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'abc123'.isalnum()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, "execution_count": 74 } ] }, { "metadata": { "colab_type": "code", "id": "YUhkecIbnmGd", "outputId": "46a41050-4d51-4f33-d0fa-eda38417fc5c", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'lowercase'.islower()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" 
] }, "metadata": { "tags": [] }, "execution_count": 75 } ] }, { "metadata": { "colab_type": "code", "id": "C0CduPhknrUs", "outputId": "4e7d9886-0b2f-4f5b-829b-d28c8ad48f08", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'lowercase'.isupper()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "False" ] }, "metadata": { "tags": [] }, "execution_count": 76 } ] }, { "metadata": { "colab_type": "code", "id": "EFS9jh8IntrM", "outputId": "8ec24fab-a286-4152-a1b2-6882b1458498", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'The Good Ship'.istitle()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "True" ] }, "metadata": { "tags": [] }, "execution_count": 77 } ] }, { "metadata": { "colab_type": "code", "id": "GxHy9RU_UEgA", "outputId": "b49d0ed6-49a7-4f29-c092-a30990662aa2", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'The bad seed'.istitle()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "False" ] }, "metadata": { "tags": [] }, "execution_count": 78 } ] }, { "metadata": { "colab_type": "text", "id": "X1AJ18VJoD6J" }, "cell_type": "markdown", "source": [ "More information: [String Methods](https://docs.python.org/3/library/stdtypes.html#string-methods)" ] }, { "metadata": { "colab_type": "text", "id": "ITOkUfVNSoHC" }, "cell_type": "markdown", "source": [ "## 4.2 Format strings\n", "F-strings were introduced in Python 3.6. They are prefixed by either an 'F' or an 'f' before the opening quotation mark. Values can be inserted into f-strings at runtime using replacement fields, which are delimited by curly braces." 
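, "\n",
"\n",
"A minimal sketch of the prefix and replacement-field rules, assuming Python 3.6 or later. The names `instrument` and `strings_count` are illustrative assumptions, not part of the lesson's data.\n",
"\n",
"```python\n",
"# Illustrative values; any name in scope can appear in a replacement field\n",
"instrument = \"banjo\"\n",
"strings_count = 5\n",
"\n",
"# The lowercase and uppercase prefixes produce the same string\n",
"lower_prefix = f\"A {instrument} has {strings_count} strings\"\n",
"upper_prefix = F\"A {instrument} has {strings_count} strings\"\n",
"print(lower_prefix)                  # A banjo has 5 strings\n",
"print(lower_prefix == upper_prefix)  # True\n",
"\n",
"# Doubled braces produce literal braces instead of a replacement field\n",
"print(f\"{{strings_count}} is replaced by {strings_count}\")  # {strings_count} is replaced by 5\n",
"```"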
] }, { "metadata": { "id": "8s8hh-Z6Cp_K", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Insert variable into replacement field" ] }, { "metadata": { "colab_type": "code", "id": "LZDftZDKT0eI", "outputId": "9b120b76-dd90-445f-a63e-25511a0db0e3", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "strings_count = 5\n", "frets_count = 24\n", "f\"Noam Pikelny's banjo has {strings_count} strings and {frets_count} frets\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"Noam Pikelny's banjo has 5 strings and 24 frets\"" ] }, "metadata": { "tags": [] }, "execution_count": 79 } ] }, { "metadata": { "id": "1TE5TEr8Cu6j", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Insert expression into replacement field" ] }, { "metadata": { "colab_type": "code", "id": "Vp4gLnKIUn8d", "outputId": "080de539-5f0d-4312-c698-516300166403", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "a = 12\n", "b = 32\n", "f\"{a} times {b} equals {a*b}\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'12 times 32 equals 384'" ] }, "metadata": { "tags": [] }, "execution_count": 80 } ] }, { "metadata": { "id": "D5Nwsc3uDGEA", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Index list in string replacement fields" ] }, { "metadata": { "colab_type": "code", "id": "0SINPS8VUGES", "outputId": "caae1706-29dd-4ab6-ac45-e77d63204abf", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "players = [\"Tony Trischka\", \"Bill Evans\", \"Alan Munde\"]\n", "f\"Performances will be held by {players[1]}, {players[0]}, and {players[2]}\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Performances will be held by Bill Evans, Tony Trischka, and Alan 
Munde'" ] }, "metadata": { "tags": [] }, "execution_count": 81 } ] }, { "metadata": { "id": "_dLRgBigEWxy", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Conversion flags\n", "A conversion flag can be specified to convert the value before it is formatted. The three available flags are 's', 'r', and 'a', which call str(), repr(), and ascii() on the value, respectively." ] }, { "metadata": { "id": "LEaUFbNPG3q3", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Using str conversion" ] }, { "metadata": { "id": "XMkyQGaOGF43", "colab_type": "code", "outputId": "0ccadcd7-e709-4a68-a6fd-fc6b2d5e0cab", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "nuts = [1,2,3,4,5]\n", "f\"Calling str() on the list {nuts} produces {nuts!s}\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Calling str() on the list [1, 2, 3, 4, 5] produces [1, 2, 3, 4, 5]'" ] }, "metadata": { "tags": [] }, "execution_count": 82 } ] }, { "metadata": { "id": "7yKhXIB9HHAG", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Using repr conversion" ] }, { "metadata": { "colab_type": "code", "id": "3KHOPzFpUiYt", "outputId": "4b01bf79-ddf5-455e-ff99-f270143d0422", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "nut = 'pistachio'\n", "f\"Calling repr on the string {nut} results in {nut!r}\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"Calling repr on the string pistachio results in 'pistachio'\"" ] }, "metadata": { "tags": [] }, "execution_count": 83 } ] }, { "metadata": { "id": "QGL3jaPRHLL6", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Using ascii conversion" ] }, { "metadata": { "colab_type": "code", "id": "h8HvErGWVaRm", "outputId": "5e5e3bc8-c733-4ad6-8f77-2457aac0f5a2", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ 
"check = \"√\"\n", "f\"The ascii version of {check} is {check!a}\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"The ascii version of √ is '\\\\u221a'\"" ] }, "metadata": { "tags": [] }, "execution_count": 84 } ] }, { "metadata": { "id": "pppmK_qKIzMf", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Padding a number" ] }, { "metadata": { "id": "L9OiDcEVI2Ck", "colab_type": "code", "outputId": "29f065bb-59a7-41dc-da18-c55de26add65", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "lucky_num = 13\n", "f\"To pad the number {lucky_num} to 5 places:{lucky_num:5d}\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'To pad the number 13 to 5 places: 13'" ] }, "metadata": { "tags": [] }, "execution_count": 85 } ] }, { "metadata": { "id": "vj9FpG5DH2hr", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Setting padding value at runtime" ] }, { "metadata": { "colab_type": "code", "id": "RPuBc-tZVhRN", "outputId": "a7430ca0-10ee-44cb-e80a-10e429a96939", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "luckey_num = 13\n", "padding = 5\n", "f\"To pad the number {lucky_num} to {padding} places:{lucky_num:{padding}d}\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'To pad the number 13 to 5 places: 13'" ] }, "metadata": { "tags": [] }, "execution_count": 86 } ] }, { "metadata": { "colab_type": "text", "id": "kwXBDbXRLHhK" }, "cell_type": "markdown", "source": [ "More information: [Format String Syntax](https://docs.python.org/2/library/string.html#format-string-syntax)\n", "\n", "Other String Formatting: \n", " [String Format Method](https://docs.python.org/2/library/string.html#custom-string-formatting)\n", " \n", " [Old Style String 
Formatting](https://docs.python.org/3/library/stdtypes.html#old-string-formatting)" ] }, { "metadata": { "colab_type": "text", "id": "7d0IbqU8Sslw" }, "cell_type": "markdown", "source": [ "## 4.3 Manipulate strings " ] }, { "metadata": { "colab_type": "text", "id": "dglY8AFpUSyO" }, "cell_type": "markdown", "source": [ "### Concatenation" ] }, { "metadata": { "colab_type": "code", "id": "LQCXMZpyN22h", "outputId": "5db6201a-67ac-4728-cd4a-b3d68e6f75e8", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "\"Bob\" + \"beroo\"" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Bobberoo'" ] }, "metadata": { "tags": [] }, "execution_count": 87 } ] }, { "metadata": { "colab_type": "code", "id": "mxjlBRMEN_zB", "outputId": "d0f146f8-e038-4088-affc-48a126ca961a", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "\"AB\" * 8" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'ABABABABABABABAB'" ] }, "metadata": { "tags": [] }, "execution_count": 88 } ] }, { "metadata": { "colab_type": "text", "id": "7V3tlaGeumsn" }, "cell_type": "markdown", "source": [ "### Remove Whitespace" ] }, { "metadata": { "colab_type": "code", "id": "S7s2L8NRTyYK", "outputId": "452bfb50-67a5-46a7-989b-736ba7c74313", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "ship = \" The Yankee Clipper \"\n", "ship" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "' The Yankee Clipper '" ] }, "metadata": { "tags": [] }, "execution_count": 89 } ] }, { "metadata": { "colab_type": "code", "id": "PM4r28sTT86o", "outputId": "b639f416-5256-4ce8-cfbc-153f0e898443", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "ship.strip()" ], "execution_count": 0, "outputs": [ 
{ "output_type": "execute_result", "data": { "text/plain": [ "'The Yankee Clipper'" ] }, "metadata": { "tags": [] }, "execution_count": 91 } ] }, { "metadata": { "colab_type": "code", "id": "wKtcldcRUCoo", "outputId": "c7f73af2-c7bf-4457-df1a-5e8a2b413efb", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "ship.lstrip()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'The Yankee Clipper '" ] }, "metadata": { "tags": [] }, "execution_count": 92 } ] }, { "metadata": { "colab_type": "code", "id": "ILdKPDCGT_Tw", "outputId": "2d87f1dc-75e2-4957-d7aa-c59ab0967598", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "ship.rstrip()" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "' The Yankee Clipper'" ] }, "metadata": { "tags": [] }, "execution_count": 93 } ] }, { "metadata": { "colab_type": "code", "id": "eEkEIy1iZDBP", "outputId": "c6954cca-3090-494c-fbc3-c5820c86c07c", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "ship.rstrip(\"per \")" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "' The Yankee Cli'" ] }, "metadata": { "tags": [] }, "execution_count": 94 } ] }, { "metadata": { "id": "7rwU5mm9Aoju", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Add padding" ] }, { "metadata": { "colab_type": "code", "id": "Z_RBTJMDdCJY", "colab": {} }, "cell_type": "code", "source": [ "port = \"Boston\"" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "dYFeX5XTBLtx", "colab_type": "code", "outputId": "f94312c4-9c2f-4047-b5f0-b193f29ca58a", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "port.center(12, '*')" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", 
"data": { "text/plain": [ "'***Boston***'" ] }, "metadata": { "tags": [] }, "execution_count": 96 } ] }, { "metadata": { "colab_type": "code", "id": "UIvvLZxNdysE", "outputId": "80b9af6a-0ff1-4cce-84c3-4cbaf351d052", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "port.ljust(12, '*')" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Boston******'" ] }, "metadata": { "tags": [] }, "execution_count": 97 } ] }, { "metadata": { "colab_type": "code", "id": "mjcsTvfXd4eL", "outputId": "23f067c1-fec5-4a7e-d5a6-c9884a56873a", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "port.rjust(12, '*')" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'******Boston'" ] }, "metadata": { "tags": [] }, "execution_count": 98 } ] }, { "metadata": { "colab_type": "code", "id": "pSr3PaxHd6hd", "outputId": "56f21f11-a183-4140-c312-e56cd0b4d22d", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "cell_type": "code", "source": [ "for port_city in ['Liverpool', 'Boston', 'New York', 'Philadelphia']:\n", " print(port_city.rjust(12))" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ " Liverpool\n", " Boston\n", " New York\n", "Philadelphia\n" ], "name": "stdout" } ] }, { "metadata": { "colab_type": "code", "id": "9GoV_hGxfXQn", "outputId": "6a57402c-0bfb-49fd-c398-ee4dc9174033", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "'-5'.zfill(4)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'-005'" ] }, "metadata": { "tags": [] }, "execution_count": 100 } ] }, { "metadata": { "id": "E8ShVcW0oa7h", "colab_type": "text" }, "cell_type": "markdown", "source": [ "### Replace" ] }, { "metadata": { "colab_type": "code", "id": "b3ovfxR0eSHg", 
"outputId": "7ca615a7-acdc-4a54-b182-a76f50292c2f", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "\"FILADELFIA\".replace(\"F\", \"PH\")" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'PHILADELPHIA'" ] }, "metadata": { "tags": [] }, "execution_count": 101 } ] }, { "metadata": { "colab_type": "text", "id": "rlkG5_YmfxII" }, "cell_type": "markdown", "source": [ "### Spitting and Joining" ] }, { "metadata": { "colab_type": "code", "id": "BELRTH31fVWn", "outputId": "de9d54b6-dd27-47a8-9b91-13f71d0f814e", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "words_string = \"Here,Are,Some,Words\"\n", "words_string" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Here,Are,Some,Words'" ] }, "metadata": { "tags": [] }, "execution_count": 102 } ] }, { "metadata": { "id": "jSIUiLHuBnoF", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Split on comma" ] }, { "metadata": { "colab_type": "code", "id": "azqJSUvbjT3L", "outputId": "603c1993-4c10-4297-af57-926b73e0a8cf", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "words = words_string.split(',')\n", "words" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['Here', 'Are', 'Some', 'Words']" ] }, "metadata": { "tags": [] }, "execution_count": 103 } ] }, { "metadata": { "id": "D2-yh4ZCB6DC", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Joining" ] }, { "metadata": { "colab_type": "code", "id": "B26tHfU9jWID", "outputId": "4fba1170-e5b1-45d5-d588-9280d60fb167", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "':'.join(words)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ 
"'Here:Are:Some:Words'" ] }, "metadata": { "tags": [] }, "execution_count": 104 } ] }, { "metadata": { "id": "9pGK5ua-Burk", "colab_type": "text" }, "cell_type": "markdown", "source": [ "#### Split on newline" ] }, { "metadata": { "colab_type": "code", "id": "6GrBwbwokddj", "outputId": "9f94e18b-fe0b-447e-bbba-2d57288d8a69", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "multiline = \"Sometimes we are given\\na multiline document\\nas a single string\"\n", "multiline" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Sometimes we are given\\na multiline document\\nas a single string'" ] }, "metadata": { "tags": [] }, "execution_count": 105 } ] }, { "metadata": { "colab_type": "code", "id": "WJ0MDyfmkyLI", "outputId": "f0a90afd-dd19-4195-a52d-c3c3fd7ede8e", "colab": { "base_uri": "https://localhost:8080/", "height": 68 } }, "cell_type": "code", "source": [ "for line in multiline.splitlines():\n", " print(line)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Sometimes we are given\n", "a multiline document\n", "as a single string\n" ], "name": "stdout" } ] }, { "metadata": { "colab_type": "text", "id": "lSGwfbgjUnGX" }, "cell_type": "markdown", "source": [ "### Slicing" ] }, { "metadata": { "colab_type": "code", "id": "jWDDLCi7OEgA", "outputId": "0b85997d-1c97-448a-9b62-6dce9e0a16c5", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "collector = \"William Main Doerflinger\"\n", "collector[0]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'W'" ] }, "metadata": { "tags": [] }, "execution_count": 107 } ] }, { "metadata": { "colab_type": "code", "id": "ENIzTZ5ZO-9p", "outputId": "14a7017b-79dd-4b24-f10e-5321dfec02c3", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "collector[-1]" 
], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'r'" ] }, "metadata": { "tags": [] }, "execution_count": 108 } ] }, { "metadata": { "colab_type": "code", "id": "e-higk4NPBqn", "outputId": "637c1ac5-d322-4b0c-e82e-be3585428e96", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "collector[13:18]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Doerf'" ] }, "metadata": { "tags": [] }, "execution_count": 109 } ] }, { "metadata": { "colab_type": "code", "id": "kruilXhaPQ81", "outputId": "37aae7a5-2e87-4460-9cdb-a2bca712b788", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "collector[-7:]" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'flinger'" ] }, "metadata": { "tags": [] }, "execution_count": 110 } ] }, { "metadata": { "colab_type": "text", "id": "MdBlKNdUMcQQ" }, "cell_type": "markdown", "source": [ "More information: [common sequence operations](https://docs.python.org/3/library/stdtypes.html#typesseq-common)\n" ] }, { "metadata": { "colab_type": "text", "id": "BQpIi5ZASzd8" }, "cell_type": "markdown", "source": [ "## 4.4 Learn to use unicode\n", "There are multiple encodings available for mapping characters to bytes. Python 3 strings are sequences of Unicode code points, and UTF-8 is the default encoding when they are converted to bytes. Python 2 strings were byte strings, with ASCII as the default encoding." 
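, "\n",
"\n",
"A minimal sketch, assuming Python 3, of how one character maps to bytes under different encodings. The names `text` and `utf8_bytes` are illustrative and do not appear elsewhere in the lesson.\n",
"\n",
"```python\n",
"# One Greek small letter pi: a single code point in a str\n",
"text = \"π\"\n",
"\n",
"# Encoding converts code points to bytes; UTF-8 needs two bytes for this character\n",
"utf8_bytes = text.encode(\"utf-8\")\n",
"print(len(text))        # 1\n",
"print(len(utf8_bytes))  # 2\n",
"\n",
"# Decoding with the same encoding recovers the original string\n",
"print(utf8_bytes.decode(\"utf-8\") == text)  # True\n",
"\n",
"# ASCII cannot represent the character, so it is replaced with a question mark\n",
"print(text.encode(\"ascii\", errors=\"replace\"))  # b'?'\n",
"```"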
] }, { "metadata": { "colab_type": "text", "id": "fC1n_VfJVT7e" }, "cell_type": "markdown", "source": [ "### Encode" ] }, { "metadata": { "colab_type": "code", "id": "MbxFwOd1iCcF", "outputId": "929ec8fe-73e6-41ab-d7bc-e06aac1a7e5f", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "twice_pie = 'ππ'\n", "twice_pie" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'ππ'" ] }, "metadata": { "tags": [] }, "execution_count": 111 } ] }, { "metadata": { "colab_type": "code", "id": "KdKciMP1BV_1", "outputId": "659e3a1c-8ba8-444b-d66d-60bbf837b677", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "twice_π = twice_pie\n", "twice_π" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'ππ'" ] }, "metadata": { "tags": [] }, "execution_count": 112 } ] }, { "metadata": { "colab_type": "code", "id": "akJBiL6mBV_1", "outputId": "1216ea92-56db-4af7-e86c-2ee2231f4afd", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "pie = \"\\N{GREEK CAPITAL LETTER PI}\"\n", "pie" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Π'" ] }, "metadata": { "tags": [] }, "execution_count": 113 } ] }, { "metadata": { "colab_type": "code", "id": "PpTRKo5-BV_2", "outputId": "4c41ce73-1b85-4744-b01b-58214956cb15", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "ord(pie)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "928" ] }, "metadata": { "tags": [] }, "execution_count": 114 } ] }, { "metadata": { "colab_type": "code", "id": "5GycnVteBV_3", "outputId": "3d66828a-b1bd-46ff-8360-c4ddc2931d35", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ 
"chr(928)" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'Π'" ] }, "metadata": { "tags": [] }, "execution_count": 115 } ] }, { "metadata": { "colab_type": "code", "id": "zvK_zexPBV_4", "outputId": "6cb56544-d129-44bb-ec2a-6c9481101cc4", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "u = chr(40960) + 'abcd' + chr(1972)\n", "u.encode('utf-8')\n", "u" ], "execution_count": 0, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'ꀀabcd\\u07b4'" ] }, "metadata": { "tags": [] }, "execution_count": 116 } ] }, { "metadata": { "colab_type": "text", "id": "c2a2TylyVYOS" }, "cell_type": "markdown", "source": [ "### Saving File in Unicode " ] }, { "metadata": { "colab_type": "code", "id": "MY8ElZT0BV_8", "outputId": "76f7ffc5-faae-4d9f-c001-70ef3769568f", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "cell_type": "code", "source": [ "with open(\"new_file.txt\", \"w\", encoding='utf-8') as opened_file:\n", " opened_file.write(\"Søme Unˆcode text\")\n", " \n", "!cat new_file.txt\n", " " ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Søme Unˆcode text" ], "name": "stdout" } ] }, { "metadata": { "colab_type": "text", "id": "mrlk--OrXZDc" }, "cell_type": "markdown", "source": [ "[Unicode](https://docs.python.org/3/howto/unicode.html)" ] }, { "metadata": { "colab_type": "text", "id": "leSOwe1iTSkX" }, "cell_type": "markdown", "source": [ "## Notes\n", "\n" ] }, { "metadata": { "id": "Qsd28lDPBH6t", "colab_type": "code", "colab": {} }, "cell_type": "code", "source": [ "" ], "execution_count": 0, "outputs": [] } ] }