{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "Exploring and understanding Python through surprising snippets.
\n", "\n", "Translations: [Chinese \u4e2d\u6587](https://github.com/leisurelicht/wtfpython-cn) | [Add translation](https://github.com/satwikkansal/wtfpython/issues/new?title=Add%20translation%20for%20[LANGUAGE]&body=Expected%20time%20to%20finish:%20[X]%20weeks.%20I%27ll%20start%20working%20on%20it%20from%20[Y].)\n", "\n", "Other modes: [Interactive](https://colab.research.google.com/github/satwikkansal/wtfpython/blob/3.0/irrelevant/wtf.ipynb) | [CLI](https://pypi.python.org/pypi/wtfpython)\n", "\n", "Python, being a beautifully designed high-level and interpreter-based programming language, provides us with many features for the programmer's comfort. But sometimes, the outcomes of a Python snippet may not seem obvious at first sight.\n", "\n", "Here's a fun project attempting to explain what exactly is happening under the hood for some counter-intuitive snippets and lesser-known features in Python.\n", "\n", "While some of the examples you see below may not be WTFs in the truest sense, they'll reveal some of the interesting parts of Python that you might be unaware of. I find it a nice way to learn the internals of a programming language, and I believe that you'll find it interesting too!\n", "\n", "If you're an experienced Python programmer, you can take it as a challenge to get most of them right on the first attempt. You may have already experienced some of them before, and I might be able to revive sweet old memories of yours! 
:sweat_smile:\n", "\n", "PS: If you're a returning reader, you can learn about the new modifications [here](https://github.com/satwikkansal/wtfpython/releases/).\n", "\n", "So, here we go...\n", "\n", "\n", "# Structure of the Examples\n", "\n", "All the examples are structured as follows:\n", "\n", "> ### \u25b6 Some fancy Title\n", ">\n", "> ```py\n", "> # Set up the code.\n", "> # Preparation for the magic...\n", "> ```\n", ">\n", "> **Output (Python version(s)):**\n", ">\n", "> ```py\n", "> >>> triggering_statement\n", "> Some unexpected output\n", "> ```\n", "> (Optional): One line describing the unexpected output.\n", ">\n", ">\n", "> #### \ud83d\udca1 Explanation:\n", ">\n", "> * Brief explanation of what's happening and why it is happening.\n", "> ```py\n", "> # Set up code\n", "> # More examples for further clarification (if necessary)\n", "> ```\n", "> **Output (Python version(s)):**\n", ">\n", "> ```py\n", "> >>> trigger # some example that makes it easy to unveil the magic\n", "> # some justified output\n", "> ```\n", "\n", "**Note:** All the examples are tested on the Python 3.5.2 interactive interpreter, and they should work for all Python versions unless explicitly specified before the output.\n", "\n", "# Usage\n", "\n", "A nice way to get the most out of these examples, in my opinion, is to read them in sequential order, and for every example:\n", "- Carefully read the initial code for setting up the example. If you're an experienced Python programmer, you'll successfully anticipate what's going to happen next most of the time.\n", "- Read the output snippets and,\n", " + Check if the outputs are the same as you'd expect.\n", " + Make sure you know the exact reason behind the output being the way it is.\n", " - If the answer is no (which is perfectly okay), take a deep breath, and read the explanation (and if you still don't understand, shout out! 
and create an issue [here](https://github.com/satwikkansal/wtfPython)).\n", " - If yes, give yourself a gentle pat on the back, and you may skip to the next example.\n", "\n", "PS: You can also read WTFPython at the command line using the [pypi package](https://pypi.python.org/pypi/wtfpython),\n", "```sh\n", "$ pip install wtfpython -U\n", "$ wtfpython\n", "```\n", "---\n", "\n", "# \ud83d\udc40 Examples\n", "\n", "\n\n## Hosted notebook instructions\n\nThis is just an experimental attempt at browsing wtfpython through Jupyter notebooks. Some examples are read-only because\n- they either require a version of Python that's not supported in the hosted runtime,\n- or they can't be reproduced in the notebook environment.\n\nThe expected outputs are already present in collapsed cells following the code cells. Google Colab provides Python2 (2.7) and Python3 (3.6, default) runtimes. You can switch between these for Python2-specific examples. For examples specific to other minor versions, you can simply refer to the collapsed outputs (it's not possible to control the minor version in hosted notebooks as of now). You can check the active version using\n\n```py\n>>> import sys\n>>> sys.version\n# Prints out Python version here.\n```\n\nThat being said, most of the examples do work as expected. If you face any trouble, feel free to consult the original content on wtfpython and create an issue in the repo. 
Have fun!\n\n---\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### \u25b6 Strings can be tricky sometimes\n", "1\\.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "140420665652016\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "a = \"some_string\"\n", "id(a)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "140420665652016\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "id(\"some\" + \"_\" + \"string\") # Notice that both the ids are same.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "2\\.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "True\n", "\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "a = \"wtf\"\n", "b = \"wtf\"\n", "a is b\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "False\n", "\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "a = \"wtf!\"\n", "b = \"wtf!\"\n", "a is b\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "3\\.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "True\n", "\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "a, b = \"wtf!\", \"wtf!\"\n", "a is b # All versions except 3.7.x\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "False\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], 
"source": [ "a = \"wtf!\"; b = \"wtf!\"\n", "a is b # This will print True or False depending on where you're invoking it (python shell / ipython / as a script)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n" ] }, { "cell_type": "code", "metadata": { "collapsed": true }, "execution_count": null, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [] } ], "source": [ "# This time in file some_file.py\n", "a = \"wtf!\"\n", "b = \"wtf!\"\n", "print(a is b)\n", "\n", "# prints True when the module is invoked!\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "4\\.\n", "\n", "**Output (< Python3.7 )**\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "True\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "'a' * 20 is 'aaaaaaaaaaaaaaaaaaaa'\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "False\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "'a' * 21 is 'aaaaaaaaaaaaaaaaaaaaa'\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Makes sense, right?\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### \ud83d\udca1 Explanation:\n", "+ The behavior in first and second snippets is due to a CPython optimization (called string interning) that tries to use existing immutable objects in some cases rather than creating a new object every time.\n", "+ After being \"interned,\" many variables may reference the same string object in memory (saving memory thereby).\n", "+ In the snippets above, strings are implicitly interned. The decision of when to implicitly intern a string is implementation-dependent. 
There are some rules that can be used to guess if a string will be interned or not:\n", " * All length 0 and length 1 strings are interned.\n", " * Strings are interned at compile time (`'wtf'` will be interned but `''.join(['w', 't', 'f'])` will not be interned)\n", " * Strings that are not composed of ASCII letters, digits or underscores, are not interned. This explains why `'wtf!'` was not interned: it contains `!`. The CPython implementation of this rule can be found [here](https://github.com/python/cpython/blob/3.6/Objects/codeobject.c#L19)\n", " ![image](https://raw.githubusercontent.com/satwikkansal/wtfpython/master/images/string-intern/string_intern.png)\n", "+ When `a` and `b` are set to `\"wtf!\"` in the same line, the Python interpreter creates a new object, then makes the second variable reference that same object. If you do it on separate lines, it doesn't \"know\" that there's already `wtf!` as an object (because `\"wtf!\"` is not implicitly interned as per the facts mentioned above). It's a compile-time optimization. This optimization doesn't apply to 3.7.x versions of CPython (check this [issue](https://github.com/satwikkansal/wtfpython/issues/100) for more discussion).\n", "+ A compile unit in an interactive environment like IPython consists of a single statement, whereas it consists of the entire module in case of modules. `a, b = \"wtf!\", \"wtf!\"` is a single statement, whereas `a = \"wtf!\"; b = \"wtf!\"` are two statements on a single line. This explains why the identities are different in `a = \"wtf!\"; b = \"wtf!\"`, and also explains why they are the same when invoked in `some_file.py`\n", "+ The abrupt change in the output of the fourth snippet is due to a [peephole optimization](https://en.wikipedia.org/wiki/Peephole_optimization) technique known as Constant folding. This means the expression `'a'*20` is replaced by `'aaaaaaaaaaaaaaaaaaaa'` during compilation to save a few clock cycles during runtime. 
Constant folding only occurs for strings having a length of at most 20. (Why? Imagine the size of the `.pyc` file generated as a result of the expression `'a'*10**10`). [Here's](https://github.com/python/cpython/blob/3.6/Python/peephole.c#L288) the implementation source for the same.\n", "+ Note: In Python 3.7, Constant folding was moved out from the peephole optimizer to the new AST optimizer with some change in logic as well, so the fourth snippet doesn't work for Python 3.7. You can read more about the change [here](https://bugs.python.org/issue11549). \n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### \u25b6 Hash brownies\n", "1\\.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "some_dict = {}\n", "some_dict[5.5] = \"JavaScript\"\n", "some_dict[5.0] = \"Ruby\"\n", "some_dict[5] = \"Python\"\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "**Output:**\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "\"JavaScript\"\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "some_dict[5.5]\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "\"Python\"\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "some_dict[5.0] # \"Python\" destroyed the existence of \"Ruby\"?\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "\"Python\"\n", "\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "some_dict[5] \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { 
"collapsed": true }, "outputs": [ { "data": { "text/plain": [ "complex\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "complex_five = 5 + 0j\n", "type(complex_five)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "\"Python\"\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "some_dict[complex_five]\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "So, why is Python all over the place?\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### \ud83d\udca1 Explanation\n", "\n", "* Python dictionaries check for equality and compare the hash value to determine if two keys are the same.\n", "* Immutable objects with the same value always have the same hash in Python.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ " True\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ " 5 == 5.0 == 5 + 0j\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ " True\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ " hash(5) == hash(5.0) == hash(5 + 0j)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " **Note:** Objects with different values may also have same hash (known as [hash collision](https://en.wikipedia.org/wiki/Collision_(computer_science))).\n", "* When the statement `some_dict[5] = \"Python\"` is executed, the existing value \"Ruby\" is overwritten with \"Python\" because Python recognizes `5` and `5.0` as the same keys of the dictionary `some_dict`.\n", "* This StackOverflow [answer](https://stackoverflow.com/a/32211042/4354153) explains the rationale behind it.\n", "\n" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "### \u25b6 Deep down, we're all the same.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "class WTF:\n", " pass\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "**Output:**\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "False\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "WTF() == WTF() # two different instances can't be equal\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "False\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "WTF() is WTF() # identities are also different\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "True\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "hash(WTF()) == hash(WTF()) # hashes _should_ be different as well\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "True\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "id(WTF()) == id(WTF())\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### \ud83d\udca1 Explanation:\n", "\n", "* When `id` was called, Python created a `WTF` class object and passed it to the `id` function. The `id` function takes its `id` (its memory location), and throws away the object. 
The object is destroyed.\n", "* When we do this twice in succession, CPython may reuse the same memory location for the second object. Since (in CPython) `id` uses the memory location as the object id, the id of the two objects is the same.\n", "* So, the object's id is unique only for the lifetime of the object. After the object is destroyed, or before it is created, something else can have the same id.\n", "* But why did the `is` operator evaluate to `False`? Let's see with this snippet.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "class WTF(object):\n", "    def __init__(self): print(\"I\")\n", "    def __del__(self): print(\"D\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", " **Output:**\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ " I\n", " I\n", " D\n", " D\n", " False\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "WTF() is WTF()\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ " I\n", " D\n", " I\n", " D\n", " True\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "id(WTF()) == id(WTF())\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " As you may observe, the order in which the objects are destroyed is what made all the difference here.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### \u25b6 Disorder within order *\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "from 
collections import OrderedDict\n", "\n", "dictionary = dict()\n", "dictionary[1] = 'a'; dictionary[2] = 'b';\n", "\n", "ordered_dict = OrderedDict()\n", "ordered_dict[1] = 'a'; ordered_dict[2] = 'b';\n", "\n", "another_ordered_dict = OrderedDict()\n", "another_ordered_dict[2] = 'b'; another_ordered_dict[1] = 'a';\n", "\n", "class DictWithHash(dict):\n", "    \"\"\"\n", "    A dict that also implements __hash__ magic.\n", "    \"\"\"\n", "    __hash__ = lambda self: 0\n", "\n", "class OrderedDictWithHash(OrderedDict):\n", "    \"\"\"\n", "    An OrderedDict that also implements __hash__ magic.\n", "    \"\"\"\n", "    __hash__ = lambda self: 0\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "**Output**\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "True\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "dictionary == ordered_dict # If a == b\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "True\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "dictionary == another_ordered_dict # and b == c\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "False\n", "\n", "# We all know that a set consists of only unique elements,\n", "# let's try making a set of these dictionaries and see what happens...\n", "\n" ] }, "output_type": "execute_result", "metadata": {}, "execution_count": null } ], "source": [ "ordered_dict == another_ordered_dict # then why isn't c == a ??\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "Traceback (most recent call last):\n", " File \"