Luke Plant's home page (Posts about Django)https://lukeplant.me.uk/blog/categories/django.xml2024-03-11T13:22:14ZLuke PlantNikolaRe-using CSS for the wrong HTML with Sasshttps://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/2023-06-01T20:44:15Z2023-06-01T20:44:15ZLuke Plant<p>A trick I learned for using someone else’s CSS without changing your HTML, or their CSS</p><p>Recently, while writing up <a class="reference external" href="https://github.com/spookylukey/django-htmx-patterns/blob/master/form_validation.rst">some examples and patterns for using htmx with Django for form validation</a>, I discovered a new trick for using externally defined CSS without having to change the HTML you are working with.</p>
<p>To make it concrete, an example might be that you are using some CSS from a CSS library or framework that requires your HTML to look a certain way. In the <a class="reference external" href="https://bulma.io/">Bulma</a> framework, for instance, you have to add the right <code class="docutils literal">class</code> attribute directly on an element that needs styling.</p>
<p>At the same time, you might be working with another system that is generating the HTML for you, and modifying that output might be hard or impossible or just tedious and a potential maintenance burden going forward. For instance, in <a class="reference external" href="https://docs.djangoproject.com/en/stable/ref/forms/api/">Django forms</a>, there is an <a class="reference external" href="https://docs.djangoproject.com/en/stable/ref/forms/api/#customizing-the-error-list-format">ErrorList class</a> whose output can be overridden, but by default renders like this:</p>
<div class="code"><pre class="code html"><a id="rest_code_f81b57b1de764ba9bb65fceedcc33bc7-1" name="rest_code_f81b57b1de764ba9bb65fceedcc33bc7-1" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_f81b57b1de764ba9bb65fceedcc33bc7-1"></a><span class="p"><</span><span class="nt">ul</span> <span class="na">class</span><span class="o">=</span><span class="s">"errorlist"</span><span class="p">></span>
<a id="rest_code_f81b57b1de764ba9bb65fceedcc33bc7-2" name="rest_code_f81b57b1de764ba9bb65fceedcc33bc7-2" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_f81b57b1de764ba9bb65fceedcc33bc7-2"></a> <span class="p"><</span><span class="nt">li</span><span class="p">></span>Enter a valid email address.<span class="p"></</span><span class="nt">li</span><span class="p">></span>
<a id="rest_code_f81b57b1de764ba9bb65fceedcc33bc7-3" name="rest_code_f81b57b1de764ba9bb65fceedcc33bc7-3" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_f81b57b1de764ba9bb65fceedcc33bc7-3"></a><span class="p"></</span><span class="nt">ul</span><span class="p">></span>
</pre></div>
<p>Now I have these requirements:</p>
<ul class="simple">
<li><p>I want this error list to be coloured using a Bulma <a class="reference external" href="https://bulma.io/documentation/helpers/color-helpers/#text-color">colour utility</a> as if it had <code class="docutils literal"><span class="pre">class="has-text-danger"</span></code> when it appears within a field row (which are <code class="docutils literal"><div <span class="pre">class="field"></span></code> elements).</p></li>
<li><p>When it appears at the top of the form where it has an extra <code class="docutils literal">nonfield</code> class, I want it to instead be styled like a Bulma <a class="reference external" href="https://bulma.io/documentation/elements/notification/">notification</a> as if it had <code class="docutils literal"><span class="pre">class="notification</span> <span class="pre">is-danger</span> <span class="pre">is-light"</span></code>.</p></li>
</ul>
<p>But I want to do these without changing the HTML we’re given by Django, or changing existing CSS – only by adding some CSS rules.</p>
<p>The “best” way to do this is if your CSS framework provides its styles as a set of <a class="reference external" href="https://sass-lang.com/documentation/at-rules/mixin">Sass mixins</a>, or something equivalent. Bulma, as it happens, usually does this, but sometimes we’re not so lucky, and we just have CSS.</p>
<p>The trick I learnt requires you to use Sass/SCSS and the <a class="reference external" href="https://sass-lang.com/documentation/at-rules/extend">@extend directive</a>. This powerful directive takes rules relating to one selector, and pulls them into whatever rule you are writing.</p>
<p>(If you are, like me, put off using things like CSS pre-processors because of the need for a separate build step, or needing to use Node.js/npm, see my post on <a class="reference external" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step">How to use Sass/SCSS in a Django project without needing Node.js/npm or running a build process</a>)</p>
<p>The one thing you have to do is rename the base CSS file you want to re-use from <code class="docutils literal">.css</code> to <code class="docutils literal">.scss</code>. This works because SCSS is a CSS superset. Then, for the example above, you can write your own SCSS file like this:</p>
<div class="code"><pre class="code scss"><a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-1" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-1" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-1"></a><span class="k">@import</span><span class="w"> </span><span class="s2">"path/to/bulma.scss"</span><span class="p">;</span><span class="w"></span>
<a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-2" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-2" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-2"></a>
<a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-3" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-3" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-3"></a><span class="nc">.field</span><span class="w"> </span><span class="nt">ul</span><span class="nc">.errorlist</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-4" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-4" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-4"></a><span class="w"> </span><span class="k">@extend</span><span class="w"> </span><span class="nc">.has-text-danger</span><span class="o">;</span><span class="w"></span>
<a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-5" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-5" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-5"></a><span class="p">}</span><span class="w"></span>
<a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-6" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-6" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-6"></a>
<a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-7" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-7" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-7"></a><span class="nt">ul</span><span class="nc">.errorlist.nonfield</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-8" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-8" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-8"></a><span class="w"> </span><span class="k">@extend</span><span class="w"> </span><span class="nc">.notification</span><span class="o">;</span><span class="w"></span>
<a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-9" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-9" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-9"></a><span class="w"> </span><span class="o">@</span><span class="nt">extend</span><span class="w"> </span><span class="nc">.is-danger</span><span class="o">;</span><span class="w"></span>
<a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-10" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-10" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-10"></a><span class="w"> </span><span class="o">@</span><span class="nt">extend</span><span class="w"> </span><span class="nc">.is-light</span><span class="o">;</span><span class="w"></span>
<a id="rest_code_6b63bf06f22b47c493ebf0e719ee5006-11" name="rest_code_6b63bf06f22b47c493ebf0e719ee5006-11" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_6b63bf06f22b47c493ebf0e719ee5006-11"></a><span class="p">}</span><span class="w"></span>
</pre></div>
<p>This technique can be very powerful. For example, you can make all <code class="docutils literal">input[type=text]</code> inside a <code class="docutils literal"><form <span class="pre">class="bulma"></span></code> have the normal Bulma <a class="reference external" href="https://bulma.io/documentation/form/input/">input</a> appearance:</p>
<div class="code"><pre class="code scss"><a id="rest_code_8db622129b294e5990f93f5ca013a621-1" name="rest_code_8db622129b294e5990f93f5ca013a621-1" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_8db622129b294e5990f93f5ca013a621-1"></a><span class="nt">form</span><span class="nc">.bulma</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<a id="rest_code_8db622129b294e5990f93f5ca013a621-2" name="rest_code_8db622129b294e5990f93f5ca013a621-2" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_8db622129b294e5990f93f5ca013a621-2"></a><span class="w"> </span><span class="nt">input</span><span class="o">[</span><span class="nt">type</span><span class="o">=</span><span class="nt">text</span><span class="o">]</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<a id="rest_code_8db622129b294e5990f93f5ca013a621-3" name="rest_code_8db622129b294e5990f93f5ca013a621-3" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_8db622129b294e5990f93f5ca013a621-3"></a><span class="w"> </span><span class="k">@extend</span><span class="w"> </span><span class="nc">.input</span><span class="o">;</span><span class="w"></span>
<a id="rest_code_8db622129b294e5990f93f5ca013a621-4" name="rest_code_8db622129b294e5990f93f5ca013a621-4" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_8db622129b294e5990f93f5ca013a621-4"></a><span class="w"> </span><span class="p">}</span><span class="w"></span>
<a id="rest_code_8db622129b294e5990f93f5ca013a621-5" name="rest_code_8db622129b294e5990f93f5ca013a621-5" href="https://lukeplant.me.uk/blog/posts/reusing-css-for-the-wrong-html-with-sass/#rest_code_8db622129b294e5990f93f5ca013a621-5"></a><span class="p">}</span><span class="w"></span>
</pre></div>
<p>This will include all related rules like <code class="docutils literal">.input:focus</code> etc.</p>
<p>As mentioned, it may not always be the best technique, but it’s a great one to have in your toolbox.</p>Django and Sass/SCSS without Node.js or a build stephttps://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/2023-06-01T19:54:15Z2023-06-01T19:54:15ZLuke Plant<p>How to use Sass/SCSS in a Django project, without needing Node.js/npm or running a build process</p><p>Although they are less necessary than in the past, I like to use a <a class="reference external" href="https://developer.mozilla.org/en-US/docs/Glossary/CSS_preprocessor">CSS pre-processor</a> when doing web development. I used to use <a class="reference external" href="https://lesscss.org/">LessCSS</a>, but recently I’ve found that I can use <a class="reference external" href="https://sass-lang.com/">Sass</a> without needing either a separate build step, or a package that requires Node.js and npm to install it. The heart of the functionality is provided by <a class="reference external" href="https://sass-lang.com/libsass">libsass</a>, an implementation of Sass as a C++ library.</p>
<p>On Linux systems, this can be installed as a package <code class="docutils literal">libsass</code> or similar, but even better is that you can pip install it as a Python package, <a class="reference external" href="https://pypi.org/project/libsass/">libsass</a>.</p>
<p>When it comes to using it from a Django project, the first step is to <a class="reference external" href="https://django-compressor.readthedocs.io/en/stable/quickstart.html">install
django-compressor</a>.</p>
<p>Then, you need to add <a class="reference external" href="https://pypi.org/project/django-libsass/">django-libsass</a> as per its instructions.</p>
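<p>For reference, the two packages are wired together in your settings roughly as follows. This is a minimal sketch based on the django-compressor and django-libsass installation notes as I understand them; the app list is trimmed to the relevant entries, so check both projects’ current docs rather than treating this as definitive.</p>

```python
# settings.py (fragment): only the entries relevant to SCSS compilation.
INSTALLED_APPS = [
    "django.contrib.staticfiles",
    "compressor",  # django-compressor
]

# django-compressor needs its own finder so that compiled output can be served.
STATICFILES_FINDERS = [
    "django.contrib.staticfiles.finders.FileSystemFinder",
    "django.contrib.staticfiles.finders.AppDirectoriesFinder",
    "compressor.finders.CompressorFinder",
]

# Hand anything declared as type="text/x-scss" to django-libsass.
COMPRESS_PRECOMPILERS = [
    ("text/x-scss", "django_libsass.SassCompiler"),
]
```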
<p>That’s about it. As per the django-libsass instructions, somewhere in your base HTML templates you’ll have something like this:</p>
<div class="code"><pre class="code html+django"><a id="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-1" name="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-1" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-1"></a><span class="c">{# at the top #}</span>
<a id="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-2" name="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-2" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-2"></a><span class="cp">{%</span> <span class="k">load</span> <span class="nv">compress</span> <span class="cp">%}</span>
<a id="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-3" name="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-3" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-3"></a><span class="cp">{%</span> <span class="k">load</span> <span class="nv">static</span> <span class="cp">%}</span>
<a id="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-4" name="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-4" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-4"></a>
<a id="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-5" name="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-5" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-5"></a>{# in the <span class="p"><</span><span class="nt">head</span><span class="p">></span> element #}
<a id="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-6" name="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-6" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-6"></a><span class="cp">{%</span> <span class="k">compress</span> <span class="nv">css</span> <span class="cp">%}</span>
<a id="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-7" name="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-7" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-7"></a> <span class="p"><</span><span class="nt">link</span> <span class="na">rel</span><span class="o">=</span><span class="s">"stylesheet"</span> <span class="na">type</span><span class="o">=</span><span class="s">"text/x-scss"</span> <span class="na">href</span><span class="o">=</span><span class="s">"</span><span class="cp">{%</span> <span class="k">static</span> <span class="s2">"myapp/css/main.scss"</span> <span class="cp">%}</span><span class="s">"</span> <span class="p">/></span>
<a id="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-8" name="rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-8" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_dcd45d0f50a64dcbac57eaccef03cc8b-8"></a><span class="cp">{%</span> <span class="k">endcompress</span> <span class="cp">%}</span>
</pre></div>
<p>You write your SCSS in that <code class="docutils literal">main.scss</code> file (it doesn’t have to be called that), and it can <code class="docutils literal">@import</code> other SCSS files of course.</p>
<p>Then, when you load a page, django-compressor will take care of running the SCSS files through libsass, saving the output CSS to a file and inserting the appropriate HTML that references that CSS file into your template output. It caches things very well so that you don’t incur any penalty if files haven’t changed — and libsass is a very fast implementation for when the processing does need to happen.</p>
<p>What this means is that you have eliminated both the need for Node.js/npm, and the need for a build step/process, if you only needed these things for CSS pre-processing.</p>
<p>Of course, the SCSS → CSS compilation still has to happen, but it happens on demand in the same process that runs the web app, and it’s both fast enough and reliable enough that you simply never have to think about it again. So this is “build-less” in the same way that “server-less” means you don’t have to think about servers, and the same way that Python “doesn’t have a compilation step”.</p>
<section id="future-proofing">
<h2>Future proofing</h2>
<p>On the Sass-lang page about libsass, they say it is “deprecated”, and on the <a class="reference external" href="https://github.com/sass/libsass">project page</a> it says:</p>
<blockquote>
<p>While it will continue to receive maintenance releases indefinitely, there are no plans to add additional features or compatibility with any new CSS or Sass features.</p>
</blockquote>
<p>In other words, this is what I prefer to call “mature software” 😉. libsass already has everything I need. If it does eventually fail to be maintained or I need new features, it’s not a problem:</p>
<ul>
<li><p>Switch to Dart Sass, which can be installed as a <a class="reference external" href="https://github.com/sass/dart-sass/releases/">standalone binary</a>.</p></li>
<li><p>Set your django-compressor settings like this:</p>
<div class="code"><pre class="code python"><a id="rest_code_803ac20f483f4fa083bfe21dcd4829c7-1" name="rest_code_803ac20f483f4fa083bfe21dcd4829c7-1" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_803ac20f483f4fa083bfe21dcd4829c7-1"></a><span class="n">COMPRESS_PRECOMPILERS</span> <span class="o">=</span> <span class="p">[</span>
<a id="rest_code_803ac20f483f4fa083bfe21dcd4829c7-2" name="rest_code_803ac20f483f4fa083bfe21dcd4829c7-2" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_803ac20f483f4fa083bfe21dcd4829c7-2"></a> <span class="p">(</span><span class="s2">"text/x-scss"</span><span class="p">,</span> <span class="s2">"sass </span><span class="si">{infile}</span><span class="s2"> </span><span class="si">{outfile}</span><span class="s2">"</span><span class="p">),</span>
<a id="rest_code_803ac20f483f4fa083bfe21dcd4829c7-3" name="rest_code_803ac20f483f4fa083bfe21dcd4829c7-3" href="https://lukeplant.me.uk/blog/posts/django-sass-scss-without-nodejs-or-build-step/#rest_code_803ac20f483f4fa083bfe21dcd4829c7-3"></a><span class="p">]</span>
</pre></div>
</li>
</ul>
<p>This covers the basic case. If you want all the features of django-libsass, which includes looking in your other static file folders for SCSS, you’ll probably need to fork <a class="reference external" href="https://github.com/torchbox/django-libsass/blob/main/django_libsass.py">the code</a> and make it work by calling Dart Sass using <a class="reference external" href="https://docs.python.org/3/library/subprocess.html">subprocess</a> — a small amount of work, and nothing that will fundamentally break this approach.</p>
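<p>To give a rough idea of the size of that job, here is a minimal sketch of the subprocess part. It assumes a Dart Sass executable called <code class="docutils literal">sass</code> on your <code class="docutils literal">PATH</code> and uses its <code class="docutils literal"><span class="pre">--load-path</span></code> and <code class="docutils literal"><span class="pre">--no-source-map</span></code> options; the function names are hypothetical, and this is not how django-libsass itself is implemented.</p>

```python
import subprocess


def build_sass_command(infile, outfile, load_paths=(), executable="sass"):
    """Build the Dart Sass command line for compiling one SCSS file.

    Each entry in load_paths becomes a --load-path option, so @import can
    find SCSS files that live in other static file folders.
    """
    command = [executable, "--no-source-map"]
    for path in load_paths:
        command.append(f"--load-path={path}")
    command += [str(infile), str(outfile)]
    return command


def compile_scss(infile, outfile, load_paths=()):
    """Run Dart Sass, raising CalledProcessError if compilation fails."""
    subprocess.run(build_sass_command(infile, outfile, load_paths), check=True)
```

<p>The list of load paths would come from wherever your static files live; collecting it is the part that django-libsass currently does for you.</p>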
</section>Python’s “Disappointing” Superpowershttps://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/2023-02-01T13:44:15Z2023-02-01T13:44:15ZLuke Plant<p>A response to Hillel Wayne’s “I am disappointed by dynamic typing”</p><p>In Hillel Wayne’s post <a class="reference external" href="https://buttondown.email/hillelwayne/archive/i-am-disappointed-by-dynamic-typing/">“I am disappointed by dynamic typing”</a>, he expresses his sense that the Python ecosystem doesn’t really make the most of the possibilities that Python provides as a dynamically typed language. This is an important subject, since every Python program pays a very substantial set of costs for Python’s highly dynamic nature, such as poor run-time performance, and maintainability issues. Are we getting anything out of this tradeoff?</p>
<p>I think Hillel makes some fair points, and this post is intended as a response rather than a rebuttal. Recently there has been a significant influence of static type systems which I think might be harmful. The static type system we have in the form of mypy/pyright (which is partly codified in <a class="reference external" href="https://peps.python.org/pep-0484/">PEP 484</a> and following) seems to be much too heavily inspired by what is possible to map to other languages, rather than the features that Python provides.</p>
<p>(As a simple example to support that claim, consider the fact that Python has had support for keyword arguments for as long as I can remember, and for keyword-only arguments since Python 3.0. But <code class="docutils literal">typing.Callable</code> has zero support for them, meaning they can’t be typed in a higher-order context. This is bad, since they are a key part of Python’s excellent reputation for readability, and <a class="reference external" href="https://lukeplant.me.uk/blog/posts/keyword-only-arguments-in-python/">we want more keyword-only arguments, not fewer</a>.
[<strong>EDIT:</strong> it looks like there is <a class="reference external" href="https://mypy.readthedocs.io/en/stable/protocols.html#callback-protocols">another way to do it</a>, it’s just about 10 times more work, so the point kind of stands.]
I can give more examples, but that will have to wait for another blog post).</p>
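<p>For the curious, the callback protocol route looks something like the following. It is a minimal sketch with made-up names (<code class="docutils literal">Formatter</code>, <code class="docutils literal">format_price</code>, <code class="docutils literal">render_all</code>), and it assumes Python 3.9+ for the <code class="docutils literal">list[float]</code> annotations.</p>

```python
from typing import Protocol


class Formatter(Protocol):
    """Describes any callable taking a value plus a keyword-only precision."""

    def __call__(self, value: float, *, precision: int) -> str: ...


def format_price(value: float, *, precision: int) -> str:
    return f"{value:.{precision}f}"


def render_all(values: list[float], formatter: Formatter) -> list[str]:
    # The keyword-only argument is part of the declared type, so a type
    # checker can verify both this call and whatever callback is passed in.
    return [formatter(value, precision=2) for value in values]
```

<p>A whole class is needed just to describe one function signature, which is the extra work I was complaining about.</p>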
<p>I’m worried that a de-facto move away from dynamic stuff in the Python ecosystem, possibly motivated by those who use Python only because they have to, and just want to make it more like the C# or Java they are comfortable with, could leave us with the very worst of all worlds.</p>
<p>However, I also think there are plenty of counter-examples to Hillel’s claim, and that’s what this post will explore.</p>
<p>Hillel was specifically thinking about, in his own words:</p>
<ul class="simple">
<li><p>“runtime program manipulation”</p></li>
<li><p>“programs that take programs and output other programs”</p></li>
<li><p>“thinking of the whole runtime environment in the same way, where everything is a runtime construct”</p></li>
</ul>
<p>…and he gave some examples that included things like:</p>
<ul class="simple">
<li><p>run-time type modification</p></li>
<li><p>introspection/manipulation of the stack</p></li>
<li><p>passing very differently typed objects through normal code to collect information about it.</p></li>
</ul>
<p>I’m going to give lots of examples of this kind of thing in Python, and they will all be <strong>real world</strong> examples. This means that either I have used them myself to solve real problems, or I’m aware that other people are using them in significant numbers.</p>
<p>Before I get going, there are some things to point out.</p>
<p>First, I don’t have the exact examples Hillel is looking for – but that’s because the kind of problems I’ve needed to solve have not been exactly the same as his. My examples are all necessarily limited in scope: since Python allows unrestricted side-effects in any function, including IO and being able to modify other code, there are obviously limits to how well these techniques can work across large amounts of code.</p>
<p>I do think, however, that my examples are in the same general region, and some of them very close. On both sides we’ve got to avoid semantic hair-splitting – you can argue that every time you use the <code class="docutils literal">class</code> keyword in Python you are doing “run-time type creation”, rather than “compile-time type creation”, because that’s how Python’s classes work. But that’s not what Hillel meant.</p>
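<p>To make the hair-splitting concrete: a <code class="docutils literal">class</code> statement is executed at run time and does roughly what a call to the builtin <code class="docutils literal">type</code> does. The sketch below uses made-up class names and glosses over details such as metaclasses and <code class="docutils literal">__qualname__</code>.</p>

```python
class Greeter:
    greeting = "Hello"

    def greet(self, name):
        return f"{self.greeting}, {name}!"


def greet(self, name):
    return f"{self.greeting}, {name}!"


# Roughly the same class, built by calling type(name, bases, namespace).
DynamicGreeter = type("DynamicGreeter", (), {"greeting": "Hello", "greet": greet})
```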
<p>Second, many of these more magical techniques involve what is called monkey patching. People are often confused about the difference between monkey patching and “dynamic meta-programming”, so I’ve prepared a handy flow chart for you:</p>
<img alt="Flow chart: Is this code I found a hacky monkey patch, or cool dynamic meta-programming? Question: who wrote it? If “Me” - it’s “Dynamic meta-programming”, if “someone else”, it’s “hacky monkey patch”" class="align-center" src="https://lukeplant.me.uk/blogmedia/monkey_patch_or_dynamic_meta_programming.png">
<p>There are, however, many instances of advanced, dynamic techniques that never get to the point of the chart above, and that’s because you never know about them. What you know is that the code does something useful, and it does so reliably enough that you don’t need to know what techniques contributed to it. And this is, I think, the biggest problem in what Hillel is asking for. The best examples of these techniques will be reliable enough that they don’t draw attention to themselves, and you immediately take them for granted.</p>
<p>Which is also to say that you cannot discount something I mention below just because it is so widely used that you, too, have taken it for granted – that would effectively be saying that the only examples that count are the ones that have proved to be so wild and wacky that everyone has decided they are a bad idea.</p>
<p>Third, you might also discount these examples as “just using features the language provides”, rather than “hyper programming” or something exotic. On the one hand, it would be true, but also unfair in the context of this debate. The most obvious example is <code class="docutils literal">eval</code>. This is clearly a very powerful technique not available to many statically typed languages, and exactly the kind that Hillel is looking for – you are literally creating more of your program as your program is running. On the other hand, it’s nothing more than a builtin function.</p>
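<p>As a tiny illustration of what that means, here is a sketch that builds the source code for a function as a string and then turns it into a real function with <code class="docutils literal">eval</code>. The <code class="docutils literal">make_polynomial</code> helper is made up for this post, and it should only ever be fed trusted input.</p>

```python
def make_polynomial(coefficients):
    """Return a function computing c0 + c1*x + c2*x**2 + ...

    The function body is generated as Python source and compiled with eval,
    so the returned function did not exist anywhere in the program text.
    """
    terms = [f"{c!r} * x ** {i}" for i, c in enumerate(coefficients)] or ["0"]
    source = "lambda x: " + " + ".join(terms)
    return eval(source)


poly = make_polynomial([1, 2, 3])  # 1 + 2x + 3x^2
```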
<p>Finally, a number of these examples don’t involve “production” code i.e. the code is typically run only on developer machines or in CI. These still count, however – just like many of Hillel’s examples are in the area of testing. The reasons they still count are 1) developers are humans too, and solving their problems is still important and 2) the techniques used by developers on their own machines are useful in creating high quality code for running on other people’s machines, where we don’t want to incur the performance or robustness penalties of the techniques used.</p>
<p>So, here are my examples. The majority are not my own code, but I’ve also taken the opportunity to do some fairly obvious bragging about cool things I’ve done in Python.</p>
<nav class="contents" id="examples" role="doc-toc">
<p class="topic-title">Examples</p>
<ul class="simple">
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#gooey" id="toc-entry-1">Gooey</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#werkzeugs-interactive-debugger" id="toc-entry-2">Werkzeug’s interactive debugger</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#hybrid-attributes-in-sqlalchemy" id="toc-entry-3">Hybrid attributes in SQLAlchemy</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#pony-orm" id="toc-entry-4">Pony ORM</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#django" id="toc-entry-5">Django</a></p>
<ul>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#foreignkey" id="toc-entry-6">ForeignKey</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#relatedmanager" id="toc-entry-7">RelatedManager</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#manytomany-models" id="toc-entry-8">ManyToMany models</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#consequences" id="toc-entry-9">Consequences</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#baserow" id="toc-entry-10">Baserow</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#cciw-data-retention-policy" id="toc-entry-11">CCiW data retention policy</a></p></li>
</ul>
</li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#query-tracing" id="toc-entry-12">Query tracing</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#time-machine-and-pyfakefs" id="toc-entry-13">time-machine and pyfakefs</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#environment-detection" id="toc-entry-14">Environment detection</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#fluent-compiler" id="toc-entry-15">fluent-compiler</a></p>
<ul>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#compile-to-python" id="toc-entry-16">Compile-to-Python</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#dynamic-test-methods" id="toc-entry-17">Dynamic test methods</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#morph-into" id="toc-entry-18"><code class="docutils literal">morph_into</code></a></p></li>
</ul>
</li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#pytest" id="toc-entry-19">Pytest</a></p>
<ul>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#assert-rewriting" id="toc-entry-20">Assert rewriting</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#automatic-dependency-injection-of-fixtures" id="toc-entry-21">Automatic dependency injection of fixtures</a></p></li>
</ul>
</li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#others" id="toc-entry-22">Others</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#conclusion" id="toc-entry-23">Conclusion</a></p></li>
<li><p><a class="reference internal" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#links" id="toc-entry-24">Links</a></p></li>
</ul>
</nav>
<section id="gooey">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-1" role="doc-backlink">Gooey</a></h2>
<p><a class="reference external" href="https://github.com/chriskiehl/Gooey">Gooey</a> is a library that will re-interpret <a class="reference external" href="https://docs.python.org/3/library/argparse.html">argparse</a> entry points as if they were specifying a GUI. In other words, you do “import gooey”, add a decorator and it transforms your CLI program into a GUI program. Apparently it does this by <a class="reference external" href="https://github.com/chriskiehl/Gooey#how-does-it-work">re-parsing your entry point module</a>, for reasons I don’t know and don’t need to know. I do know that it works for programs I’ve tried it with, when I wanted to make something that I was using as a CLI, but also needed to be usable by other family members. A pretty cool tool that solves real problems.</p>
</section>
<section id="werkzeugs-interactive-debugger">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-2" role="doc-backlink">Werkzeug’s interactive debugger</a></h2>
<p>Werkzeug provides a <a class="reference external" href="https://werkzeug.palletsprojects.com/en/2.2.x/debug/">debugger middleware</a> that works with any WSGI-compliant Python web framework (which is most of them), with the following extremely useful behaviour:</p>
<ul class="simple">
<li><p>Crashing errors are automatically intercepted and an error page is shown with a stack trace instead of a generic 500 error.</p></li>
<li><p>For any and every frame of the stack trace, you can, right from your web browser, start a Python REPL at that frame – i.e. you can effectively continue execution of the crashed program at any point in the stack, or from multiple points simultaneously.</p></li>
</ul>
<p>This is extremely useful, to say the least.</p>
<img alt="Screenshot of Werkzeug debugger in action" class="align-center" src="https://lukeplant.me.uk/blogmedia/werkzeug_debugger_example.png">
<p>(For Django users – you can use this most easily using <a class="reference external" href="https://django-extensions.readthedocs.io/en/latest/">django-extensions</a>)</p>
</section>
<section id="hybrid-attributes-in-sqlalchemy">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-3" role="doc-backlink">Hybrid attributes in SQLAlchemy</a></h2>
<p>I’m sure there are <strong>many</strong> examples of advanced dynamic techniques in SQLAlchemy, and I’m not the best qualified to talk about them, but here is a cool one I came across that helps explain the kind of thing you can do in Python.</p>
<p>Suppose you have an ORM object with some attributes that come straight from the database, along with some calculated properties. In the example below we’ve got a model representing an account that might have payments against it:</p>
<div class="code"><pre class="code python"><a id="rest_code_417660358bd94b8f8291526aadc85be5-1" name="rest_code_417660358bd94b8f8291526aadc85be5-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-1"></a><span class="k">class</span> <span class="nc">Account</span><span class="p">(</span><span class="n">Base</span><span class="p">):</span>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-2" name="rest_code_417660358bd94b8f8291526aadc85be5-2" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-2"></a> <span class="c1"># DB columns:</span>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-3" name="rest_code_417660358bd94b8f8291526aadc85be5-3" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-3"></a> <span class="n">amount_paid</span><span class="p">:</span> <span class="n">Mapped</span><span class="p">[</span><span class="n">Decimal</span><span class="p">]</span>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-4" name="rest_code_417660358bd94b8f8291526aadc85be5-4" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-4"></a> <span class="n">total_purchased</span><span class="p">:</span> <span class="n">Mapped</span><span class="p">[</span><span class="n">Decimal</span><span class="p">]</span>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-5" name="rest_code_417660358bd94b8f8291526aadc85be5-5" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-5"></a>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-6" name="rest_code_417660358bd94b8f8291526aadc85be5-6" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-6"></a> <span class="c1"># Calculated properties:</span>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-7" name="rest_code_417660358bd94b8f8291526aadc85be5-7" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-7"></a> <span class="nd">@property</span>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-8" name="rest_code_417660358bd94b8f8291526aadc85be5-8" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-8"></a> <span class="k">def</span> <span class="nf">balance_due</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="n">Decimal</span><span class="p">:</span>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-9" name="rest_code_417660358bd94b8f8291526aadc85be5-9" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-9"></a> <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">total_purchased</span> <span class="o">-</span> <span class="bp">self</span><span class="o">.</span><span class="n">amount_paid</span>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-10" name="rest_code_417660358bd94b8f8291526aadc85be5-10" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-10"></a>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-11" name="rest_code_417660358bd94b8f8291526aadc85be5-11" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-11"></a> <span class="nd">@property</span>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-12" name="rest_code_417660358bd94b8f8291526aadc85be5-12" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-12"></a> <span class="k">def</span> <span class="nf">has_payment_outstanding</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">bool</span><span class="p">:</span>
<a id="rest_code_417660358bd94b8f8291526aadc85be5-13" name="rest_code_417660358bd94b8f8291526aadc85be5-13" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_417660358bd94b8f8291526aadc85be5-13"></a> <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">balance_due</span> <span class="o">></span> <span class="mi">0</span>
</pre></div>
<p>Very often you find yourself in a situation like this:</p>
<ul class="simple">
<li><p>Sometimes you have already loaded an object from the DB, and want to know a calculated value like “does this account have an outstanding payment?”. This shouldn’t execute any more database queries, since you’ve already loaded everything you need to answer that question.</p></li>
<li><p>But sometimes, you want to re-use this logic to do something like “get me all the accounts that have outstanding payments”, and it is vital for efficiency that we do the filtering in the database as a SQL <code class="docutils literal">WHERE</code> clause, rather than loading all the records into a Python process and filtering there.</p></li>
</ul>
<p>How could we do this in SQLAlchemy <strong>without duplicating the logic</strong> regarding <code class="docutils literal">balance_due</code> and <code class="docutils literal">has_payment_outstanding</code>?</p>
<p>The answer is <a class="reference external" href="https://docs.sqlalchemy.org/en/20/orm/extensions/hybrid.html">hybrid attributes</a>:</p>
<ul class="simple">
<li><p><code class="docutils literal">from sqlalchemy.ext.hybrid import hybrid_property</code></p></li>
<li><p>replace <code class="docutils literal">property</code> with <code class="docutils literal">hybrid_property</code> on the two properties.</p></li>
</ul>
<p><strong>That is all</strong>. Then you can do:</p>
<div class="code"><pre class="code python"><a id="rest_code_f7082a943d034cafbfca71e8c6489996-1" name="rest_code_f7082a943d034cafbfca71e8c6489996-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_f7082a943d034cafbfca71e8c6489996-1"></a><span class="n">select</span><span class="p">(</span><span class="n">Account</span><span class="p">)</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">Account</span><span class="o">.</span><span class="n">has_payment_outstanding</span> <span class="o">==</span> <span class="kc">True</span><span class="p">)</span>
</pre></div>
<p>This will generate a SQL query that looks like this:</p>
<div class="code"><pre class="code SQL"><a id="rest_code_6e503af4b3354dd3ad9d4394563f28a0-1" name="rest_code_6e503af4b3354dd3ad9d4394563f28a0-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_6e503af4b3354dd3ad9d4394563f28a0-1"></a><span class="k">SELECT</span><span class="w"> </span><span class="n">account</span><span class="p">.</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">account</span><span class="p">.</span><span class="n">amount_paid</span><span class="p">,</span><span class="w"> </span><span class="n">account</span><span class="p">.</span><span class="n">total_purchased</span><span class="w"></span>
<a id="rest_code_6e503af4b3354dd3ad9d4394563f28a0-2" name="rest_code_6e503af4b3354dd3ad9d4394563f28a0-2" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_6e503af4b3354dd3ad9d4394563f28a0-2"></a><span class="k">FROM</span><span class="w"> </span><span class="n">account</span><span class="w"></span>
<a id="rest_code_6e503af4b3354dd3ad9d4394563f28a0-3" name="rest_code_6e503af4b3354dd3ad9d4394563f28a0-3" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_6e503af4b3354dd3ad9d4394563f28a0-3"></a><span class="k">WHERE</span><span class="w"> </span><span class="p">(</span><span class="n">account</span><span class="p">.</span><span class="n">total_purchased</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">account</span><span class="p">.</span><span class="n">amount_paid</span><span class="w"> </span><span class="o">></span><span class="w"> </span><span class="mi">0</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">1</span><span class="w"></span>
</pre></div>
<p>What’s going on here? If you have a normal model instance <code class="docutils literal">an_account</code>, retrieved from a database query, and you do <code class="docutils literal">an_account.has_payment_outstanding</code>, then in the <code class="docutils literal">has_payment_outstanding</code> function body above, everything is normal: <code class="docutils literal">self</code> is bound to <code class="docutils literal">an_account</code>, attributes like <code class="docutils literal">total_purchased</code> will be <code class="docutils literal">Decimal</code> objects that have been loaded from the database.</p>
<p>However, if you use <code class="docutils literal">Account.has_payment_outstanding</code>, the <code class="docutils literal">self</code> variable gets bound to a different type of object (the <code class="docutils literal">Account</code> class or some proxy), and so things like <code class="docutils literal">self.total_purchased</code> instead resolve to objects representing columns/fields. Their classes have appropriate “dunder” methods defined (<code class="docutils literal">__add__</code>, <code class="docutils literal">__gt__</code> etc.), so that operations done on them, such as maths and comparisons, return new expression objects that track what operations were done, instead of returning values immediately. These can then be compiled to SQL later on, so we can execute the filtering as a <code class="docutils literal">WHERE</code> clause in the DB.</p>
<p>The point here is: we are passing both “normal” and “instrumented” types through the same code in order to completely change our execution strategy. This allows us to effectively compile our Python code into SQL on the fly. This is essentially identical to Hillel’s example of passing instrumented objects (“Replacer” class) through normal code to extract certain information about what operations were done.</p>
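<p>The mechanism can be sketched without SQLAlchemy at all. The following toy is not how SQLAlchemy implements hybrid attributes: the names <code class="docutils literal">Expr</code>, <code class="docutils literal">Column</code> and <code class="docutils literal">hybrid_sketch</code> are made up for illustration, and the "SQL" is just a string. It only shows the same function body running once with real values and once with operation-recording objects:</p>

```python
from decimal import Decimal


class Expr:
    """Records operations as SQL text instead of computing a value."""

    def __init__(self, sql):
        self.sql = sql

    def __sub__(self, other):
        return Expr(f"{self.sql} - {to_sql(other)}")

    def __gt__(self, other):
        return Expr(f"{self.sql} > {to_sql(other)}")


def to_sql(value):
    # Expressions carry their own SQL; plain Python literals use repr().
    return value.sql if isinstance(value, Expr) else repr(value)


class Column:
    """Toy column: class-level access yields an Expr naming the column."""

    def __set_name__(self, owner, name):
        self.sql = f"{owner.__name__.lower()}.{name}"

    def __get__(self, instance, owner):
        if instance is None:
            return Expr(self.sql)
        # Only reached if the instance never had a value assigned.
        raise AttributeError(self.sql)


class hybrid_sketch:
    """Hypothetical stand-in for hybrid_property."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        # Instance access binds `self` to the instance, class access to the class.
        return self.func(owner if instance is None else instance)


class Account:
    amount_paid = Column()
    total_purchased = Column()

    def __init__(self, amount_paid, total_purchased):
        self.amount_paid = amount_paid
        self.total_purchased = total_purchased

    @hybrid_sketch
    def balance_due(self):
        return self.total_purchased - self.amount_paid

    @hybrid_sketch
    def has_payment_outstanding(self):
        return self.balance_due > 0


an_account = Account(amount_paid=Decimal("30"), total_purchased=Decimal("100"))
instance_result = an_account.has_payment_outstanding  # a plain bool
class_result = Account.has_payment_outstanding.sql    # a SQL fragment
```

<p>Here <code class="docutils literal">balance_due</code> and <code class="docutils literal">has_payment_outstanding</code> are written once, and the only thing that changes between the two calls is what <code class="docutils literal">self</code> is bound to.</p>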
<p>This is a very neat feature in SQLAlchemy that I’m rather jealous of as a Django user. If you want the same efficiency in Django, you have to define the instance properties and the database filtering separately, and usually physically not next to each other in the code. The closest we have is <a class="reference external" href="https://docs.djangoproject.com/en/4.1/ref/models/expressions/#query-expressions">Query expressions</a> but they don’t work quite the same.</p>
</section>
<section id="pony-orm">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-4" role="doc-backlink">Pony ORM</a></h2>
<p>This ORM has a way of writing SQL select queries that appears even more magical. Using an example from <a class="reference external" href="https://ponyorm.org/">their home page</a>, you write code like this:</p>
<div class="code"><pre class="code python"><a id="rest_code_56c1ef670f0d4a29b02ebf928483a2ea-1" name="rest_code_56c1ef670f0d4a29b02ebf928483a2ea-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_56c1ef670f0d4a29b02ebf928483a2ea-1"></a><span class="n">select</span><span class="p">(</span><span class="n">c</span> <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">Customer</span> <span class="k">if</span> <span class="nb">sum</span><span class="p">(</span><span class="n">c</span><span class="o">.</span><span class="n">orders</span><span class="o">.</span><span class="n">price</span><span class="p">)</span> <span class="o">></span> <span class="mi">1000</span><span class="p">)</span>
</pre></div>
<p>The result of this is a SQL query that looks like this:</p>
<div class="code"><pre class="code SQL"><a id="rest_code_45902b2eed014a7da13893377fdb27a5-1" name="rest_code_45902b2eed014a7da13893377fdb27a5-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_45902b2eed014a7da13893377fdb27a5-1"></a><span class="k">SELECT</span><span class="w"> </span><span class="ss">"c"</span><span class="p">.</span><span class="ss">"id"</span><span class="w"></span>
<a id="rest_code_45902b2eed014a7da13893377fdb27a5-2" name="rest_code_45902b2eed014a7da13893377fdb27a5-2" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_45902b2eed014a7da13893377fdb27a5-2"></a><span class="k">FROM</span><span class="w"> </span><span class="ss">"customer"</span><span class="w"> </span><span class="ss">"c"</span><span class="w"></span>
<a id="rest_code_45902b2eed014a7da13893377fdb27a5-3" name="rest_code_45902b2eed014a7da13893377fdb27a5-3" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_45902b2eed014a7da13893377fdb27a5-3"></a><span class="w"> </span><span class="k">LEFT</span><span class="w"> </span><span class="k">JOIN</span><span class="w"> </span><span class="ss">"order"</span><span class="w"> </span><span class="ss">"order-1"</span><span class="w"></span>
<a id="rest_code_45902b2eed014a7da13893377fdb27a5-4" name="rest_code_45902b2eed014a7da13893377fdb27a5-4" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_45902b2eed014a7da13893377fdb27a5-4"></a><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="ss">"c"</span><span class="p">.</span><span class="ss">"id"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"order-1"</span><span class="p">.</span><span class="ss">"customer"</span><span class="w"></span>
<a id="rest_code_45902b2eed014a7da13893377fdb27a5-5" name="rest_code_45902b2eed014a7da13893377fdb27a5-5" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_45902b2eed014a7da13893377fdb27a5-5"></a><span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="ss">"c"</span><span class="p">.</span><span class="ss">"id"</span><span class="w"></span>
<a id="rest_code_45902b2eed014a7da13893377fdb27a5-6" name="rest_code_45902b2eed014a7da13893377fdb27a5-6" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_45902b2eed014a7da13893377fdb27a5-6"></a><span class="k">HAVING</span><span class="w"> </span><span class="k">coalesce</span><span class="p">(</span><span class="k">SUM</span><span class="p">(</span><span class="ss">"order-1"</span><span class="p">.</span><span class="ss">"total_price"</span><span class="p">),</span><span class="w"> </span><span class="mi">0</span><span class="p">)</span><span class="w"> </span><span class="o">></span><span class="w"> </span><span class="mi">1000</span><span class="w"></span>
</pre></div>
<p>A normal understanding of generator expressions suggests that the <code class="docutils literal">select</code> function is consuming a generator. But that couldn’t explain the behaviour here. Instead, it actually <a class="reference external" href="https://github.com/ponyorm/pony/blob/27593ffc74184bc334dd301a86fc5f40fdd3ad87/pony/orm/core.py#L5542">introspects the frame object of the calling code</a>, then <a class="reference external" href="https://github.com/ponyorm/pony/blob/27593ffc74184bc334dd301a86fc5f40fdd3ad87/pony/orm/decompiling.py#L22">decompiles the byte code of the generator expression object it finds</a>, and builds a <a class="reference external" href="https://github.com/ponyorm/pony/blob/27593ffc74184bc334dd301a86fc5f40fdd3ad87/pony/orm/core.py#L5669">Query</a> based on the <a class="reference external" href="https://docs.python.org/3/library/ast.html">AST</a> objects.</p>
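<p>The starting point for this kind of trick is that a generator expression is an ordinary object that carries its own compiled code, which can be examined without ever iterating it. The sketch below is not Pony's implementation, and <code class="docutils literal">inspect_genexpr</code> and <code class="docutils literal">customers</code> are hypothetical names; it uses only the standard library to show what information is available:</p>

```python
import dis


def inspect_genexpr(gen):
    """Report what a generator expression refers to, without iterating it."""
    code = gen.gi_code  # the compiled body of the generator expression
    return {
        "names": set(code.co_names),
        "constants": set(const for const in code.co_consts if const is not None),
        "opnames": [ins.opname for ins in dis.get_instructions(code)],
    }


customers = []  # stand-in for an entity class; its items are never consumed here
info = inspect_genexpr(c for c in customers if sum(c.orders.price) > 1000)
```

<p>The attribute names and the constant from the filter condition are all recoverable from <code class="docutils literal">info</code>, which is the raw material a library would need in order to rebuild the expression as a query.</p>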
<p>PonyORM doesn’t advertise all that, of course. It advertises a “beautiful” syntax for writing ORM code, because that’s what matters.</p>
</section>
<section id="django">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-5" role="doc-backlink">Django</a></h2>
<p>This is the web framework I know well, as I used to contribute to it significantly. I’ll pick just a few important examples from the ORM, and then from the broader ecosystem.</p>
<section id="foreignkey">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-6" role="doc-backlink">ForeignKey</a></h3>
<p>Suppose, to pick one example of many, you are writing <a class="reference external" href="https://github.com/django-otp/django-otp/">django-otp</a>, a third party library that provides a <a class="reference external" href="https://en.wikipedia.org/wiki/One-time_password">One Time Password</a> implementation for <a class="reference external" href="https://en.wikipedia.org/wiki/Multi-factor_authentication">2FA requirements</a>. You want to create a table of TOTP devices that are linked to user accounts, and so you have something like this:</p>
<div class="code"><pre class="code python"><a id="rest_code_329c9295a4804c208d653734ffd2ca36-1" name="rest_code_329c9295a4804c208d653734ffd2ca36-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_329c9295a4804c208d653734ffd2ca36-1"></a><span class="k">class</span> <span class="nc">TOTPDevice</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Model</span><span class="p">):</span>
<a id="rest_code_329c9295a4804c208d653734ffd2ca36-2" name="rest_code_329c9295a4804c208d653734ffd2ca36-2" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_329c9295a4804c208d653734ffd2ca36-2"></a>    <span class="n">user</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">ForeignKey</span><span class="p">(</span><span class="s1">'auth.User'</span><span class="p">,</span> <span class="n">related_name</span><span class="o">=</span><span class="s1">'totp_devices'</span><span class="p">,</span> <span class="n">on_delete</span><span class="o">=</span><span class="n">models</span><span class="o">.</span><span class="n">CASCADE</span><span class="p">)</span>
</pre></div>
<p>Later on, you have code that starts with a <code class="docutils literal">User</code> object and retrieves their TOTP devices, and it looks something like this:</p>
<div class="code"><pre class="code python"><a id="rest_code_2cead7e41fb54c83bfbe5bd6cf57152b-1" name="rest_code_2cead7e41fb54c83bfbe5bd6cf57152b-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_2cead7e41fb54c83bfbe5bd6cf57152b-1"></a><span class="n">devices</span> <span class="o">=</span> <span class="n">user</span><span class="o">.</span><span class="n">totp_devices</span><span class="o">.</span><span class="n">all</span><span class="p">()</span>
</pre></div>
<p>This is interesting, because my <code class="docutils literal">user</code> variable is an instance of a <code class="docutils literal">User</code> model that was provided by core Django, which has no knowledge of the third party project that provides the <code class="docutils literal">TOTPDevice</code> model.</p>
<p>In fact it goes further: I may not be using Django’s <code class="docutils literal">User</code> at all, but my own custom <code class="docutils literal">User</code> class, and the <code class="docutils literal">TOTPDevice</code> model can easily support that too just by doing this:</p>
<div class="code"><pre class="code python"><a id="rest_code_305672d98d074859b838f6b37782fd15-1" name="rest_code_305672d98d074859b838f6b37782fd15-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_305672d98d074859b838f6b37782fd15-1"></a><span class="n">user</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">ForeignKey</span><span class="p">(</span><span class="nb">getattr</span><span class="p">(</span><span class="n">settings</span><span class="p">,</span> <span class="s2">"AUTH_USER_MODEL"</span><span class="p">,</span> <span class="s2">"auth.User"</span><span class="p">))</span>
</pre></div>
<p>This means that my <code class="docutils literal">User</code> model has no knowledge of the <code class="docutils literal">TOTPDevice</code> class, nor the other way around, yet instances of these classes both get wired up to refer to each other.</p>
<p>What is actually going on to enable this?</p>
<p>When you import Django and call <code class="docutils literal">setup()</code>, it imports all the apps in your
project. When it comes to the <code class="docutils literal">TOTPDevice</code> class, it sees the <code class="docutils literal">"auth.User"</code>
reference and finds the class it refers to. It then <strong>modifies that
class</strong>, adding a <code class="docutils literal">totp_devices</code> <a class="reference external" href="https://docs.python.org/3/glossary.html#term-descriptor">descriptor</a> object to the class attributes.</p>
<p>This is <strong>run-time type modification</strong>.</p>
<p>The result is that when you do <code class="docutils literal">user.totp_devices</code>, you get a <code class="docutils literal">Manager</code> instance that does queries against the <code class="docutils literal">TOTPDevice</code> table. It is a specific kind of manager, known as a <code class="docutils literal">RelatedManager</code>, with the special property that it automatically does the correct <code class="docutils literal">filter()</code> calls to limit returned values to those related to the model instance, among other things.</p>
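<p>As a rough illustration of the idea, here is a standard-library toy in which defining one class adds a descriptor to another. It is a sketch under simplifying assumptions, not Django's code: <code class="docutils literal">ForeignKeySketch</code> and <code class="docutils literal">ReverseRelation</code> are invented names, the target class is passed directly rather than resolved from a string, and a list stands in for the database table.</p>

```python
class ReverseRelation:
    """Descriptor added to the target class; finds rows pointing back at an instance."""

    def __init__(self, source_cls, field_name):
        self.source_cls = source_cls
        self.field_name = field_name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return [
            row for row in self.source_cls.rows
            if getattr(row, self.field_name) is instance
        ]


class ForeignKeySketch:
    def __init__(self, target_cls, related_name):
        self.target_cls = target_cls
        self.related_name = related_name

    def __set_name__(self, owner, name):
        # Run-time type modification: the *target* class gains a new attribute
        # as a side effect of defining the class that holds the foreign key.
        setattr(self.target_cls, self.related_name, ReverseRelation(owner, name))


class User:
    def __init__(self, name):
        self.name = name


class TOTPDevice:
    rows = []  # stand-in for a database table
    user = ForeignKeySketch(User, related_name="totp_devices")

    def __init__(self, user, label):
        self.user = user
        self.label = label
        TOTPDevice.rows.append(self)
```

<p>Nothing in the body of <code class="docutils literal">User</code> mentions devices, yet after <code class="docutils literal">TOTPDevice</code> is defined, every <code class="docutils literal">User</code> instance has a working <code class="docutils literal">totp_devices</code> attribute.</p>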
</section>
<section id="relatedmanager">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-7" role="doc-backlink">RelatedManager</a></h3>
<p>The <code class="docutils literal">RelatedManager</code> class is interesting in a number of ways. First, it is <a class="reference external" href="https://github.com/django/django/blob/d54717118360e8679aa2bd0c5a1625f3e84712ba/django/db/models/fields/related_descriptors.py#L632">created as a closure</a> – meaning the class itself is created inside a function that is called for each relationship. This is <strong>run-time type creation</strong>.</p>
<p>Second, there are some additional things it needs to support. In Django, projects often override <code class="docutils literal">Manager</code> classes, and the related <code class="docutils literal">QuerySet</code> classes, to provide a lot of model layer functionality. This is Django’s answer to the “Repository” pattern – popular in some languages, but laborious to create and painful to use <a class="reference external" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/">compared to what we have</a>.</p>
<p>Because of this, it’s important that the <code class="docutils literal">RelatedManager</code> preserves any custom <code class="docutils literal">Manager</code> and <code class="docutils literal">QuerySet</code> behaviour defined on the target model. So, the solution is simply that Django makes the created <code class="docutils literal">RelatedManager</code> class inherit from your custom <code class="docutils literal">Manager</code>. This is <strong>dynamic sub-classing</strong> – sub-classing of a class that is discovered at run-time. In other OOP languages, you inherit from framework classes. In Python, framework inherits you!</p>
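<p>Both techniques, a class created inside a function and a base class chosen at run time, fit in a few lines. This is a simplified sketch rather than Django's implementation: <code class="docutils literal">make_related_manager</code>, <code class="docutils literal">DeviceManager</code> and the dictionary rows are assumptions made for the example.</p>

```python
class Manager:
    def __init__(self, rows):
        self.rows = rows

    def all(self):
        return list(self.rows)


class DeviceManager(Manager):
    """A project's custom manager with its own query helper."""

    def confirmed(self):
        return [row for row in self.all() if row["confirmed"]]


def make_related_manager(superclass, field_name):
    # The class statement runs on every call, so each relationship gets its
    # own class, closing over `superclass` and `field_name`.
    class RelatedManager(superclass):
        def __init__(self, rows, instance):
            super().__init__(rows)
            self.instance = instance

        def all(self):
            # Narrow every query to rows related to this instance.
            return [row for row in super().all() if row[field_name] == self.instance]

    return RelatedManager


rows = [
    {"user": "alice", "confirmed": True},
    {"user": "alice", "confirmed": False},
    {"user": "bob", "confirmed": True},
]
related_cls = make_related_manager(DeviceManager, "user")
alice_devices = related_cls(rows, "alice")
```

<p>Because the generated class inherits from whatever manager it is given, the custom <code class="docutils literal">confirmed()</code> helper keeps working and is automatically restricted to the related rows.</p>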
</section>
<section id="manytomany-models">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-8" role="doc-backlink">ManyToMany models</a></h3>
<p>One common need in database applications is to have <a class="reference external" href="https://en.wikipedia.org/wiki/Many-to-many_(data_model)">many-to-many relationships</a> between two models. Typically this can be modelled with a separate table that has foreign keys to the two related tables.</p>
<p>To make this easy, Django provides a <code class="docutils literal">ManyToManyField</code>. For simple cases, it’s
tedious to have to create a model for the intermediate table yourself, so of
course Django just <a class="reference external" href="https://github.com/django/django/blob/d54717118360e8679aa2bd0c5a1625f3e84712ba/django/db/models/fields/related.py#L1279">creates it for you</a>
if you don’t provide your own, using <a class="reference external" href="https://docs.python.org/3/library/functions.html#type">type() with 3 arguments</a>. This is again <strong>run-time type creation</strong>.</p>
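<p>For readers who have not used the three-argument form, here is a minimal sketch of building an intermediate class from two existing classes. The helper <code class="docutils literal">make_through_class</code> and the <code class="docutils literal">Article</code> and <code class="docutils literal">Tag</code> classes are hypothetical, and the result is a plain Python class rather than a Django model:</p>

```python
def make_through_class(from_cls, to_cls):
    """Build an intermediate 'join' class for two classes at run time."""
    name = f"{from_cls.__name__}_{to_cls.__name__}"

    def __init__(self, source, target):
        self.source = source
        self.target = target

    attrs = {
        "__init__": __init__,
        "__module__": from_cls.__module__,
        "from_model": from_cls,
        "to_model": to_cls,
    }
    # type(name, bases, namespace) builds a class from data, much as a
    # `class` statement would from source code.
    return type(name, (object,), attrs)


class Article:
    pass


class Tag:
    pass


ArticleTag = make_through_class(Article, Tag)
link = ArticleTag(Article(), Tag())
```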
</section>
<section id="consequences">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-9" role="doc-backlink">Consequences</a></h3>
<p>These examples of run-time type modification or creation are perhaps not the most extreme or mind-bending. But they are something even better: useful. It’s these features, and things like them, that enable an ecosystem of third party Django libraries that can integrate with your own code without any problems.</p>
<p>Python also always gives us enough flexibility to have a good backwards compatibility story – so that, for example, the swappable User model was introduced with an absolute minimum of fuss for both projects and pluggable Django apps.</p>
<p>I’m interested in functional programming, Haskell in particular – this blog even ran on Haskell for a time – so I always take interest in developments in the Haskell web framework world. I see lots of cool things, but it always seems that the ecosystems around Haskell web frameworks are at least 10 years behind Django. One key issue is that in contrast to Django or other Python frameworks, Haskell web frameworks almost always have some kind of code generation layer. This can be made to work well for the purposes envisaged by the framework authors, but it never seems to enable the ecosystem of external packages that highly dynamic typing supports.</p>
<p>Please note that I’m not claiming here that Python is better than Haskell or anything so grand. I’m simply claiming that Python does enable very useful things to be built, and those things are made possible and easy <strong>because of Python’s design, rather than despite it</strong>.</p>
<p>I think this is important to say. Python has become massively more popular than it was when I first started to use it, and there are increasing numbers of people who use it only because of network effects, and don’t understand why it got so popular in the first place. These people can sometimes assume that it’s fundamentally a poorly designed language that we are just lumped with – today’s PHP – whose best trajectory is to make it more like Java or C# or Rust etc. I think that would be a big mistake.</p>
</section>
<section id="baserow">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-10" role="doc-backlink">Baserow</a></h3>
<p>One project that takes Python’s run-time type creation further is <a class="reference external" href="https://baserow.io/">Baserow</a>. In their case, their customers create database applications, and the metadata for those tables is stored in … tables. They like Django and want to use it as much as possible. But they also want their customers’ actual data tables to be normal RDBMS tables, and therefore benefit from all the typical RDBMS features to make their tables fast and compact etc. (I’ve seen and worked on systems that took the opposite approach, where customer schema was relegated to second class storage – essentially a key-value table – and the performance was predictably awful).</p>
<p>And they want plug-in authors to be able to use Django too! Some people are just greedy! They <a class="reference external" href="https://baserow.io/blog/how-baserow-lets-users-generate-django-models">have a nice article describing how they achieved all this</a>: in short, they use <code class="docutils literal">type()</code> for run-time type creation and then leverage everything Django gives them.</p>
<p>This has the interesting effect that the metadata tables, along with their own business tables, live <strong>at the same level</strong> as their customers’ tables which are described by those metadata tables. This “meta and non-meta living at the same level” is a neat illustration of what Python’s type system gives you:</p>
<p>When you first discover the mind-bending relationships around <code class="docutils literal">type(type) == type</code>, you might think of an infinitely-recursive relationship. But actually, an infinite relationship has been flattened to being just 3 layers deep – instance, class, metaclass. The last layer just recurses onto itself. The infinity has been tamed, and brought into the same structures that you can already deal with, and without changing language or switching to code generation techniques. This is one reason why many examples of the “hyper-programming” that Hillel talks about can just be dismissed as normal programming – but they are simply hyper-programming that you are now taking for granted.</p>
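<p>The three layers are easy to see directly. In this small sketch the <code class="docutils literal">field_rows</code> metadata is invented for illustration and stands in for rows read from a metadata table; the class built from it sits alongside any hand-written class:</p>

```python
# Hypothetical metadata, as it might be read from a "fields" table.
field_rows = [
    {"name": "title", "default": ""},
    {"name": "done", "default": False},
]

# A class created from data at run time.
Task = type("Task", (object,), {row["name"]: row["default"] for row in field_rows})
task = Task()

# instance -> class -> metaclass, and the last layer points at itself.
layers = [type(task), type(Task), type(type)]
```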
</section>
<section id="cciw-data-retention-policy">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-11" role="doc-backlink">CCiW data retention policy</a></h3>
<p><a class="reference external" href="https://www.cciw.co.uk/">CCiW</a> is a small charity I’ve been involved with for a long time. When I came to implement its <a class="reference external" href="https://gdpr-info.eu/">GDPR</a> and data retention policies, I found another example of how useful it is having access to Django’s meta-layer (generic framework code) on the same level as my normal data layer (business specific classes and tables), in ways that often aren’t the case for statically typed languages that resort to code-generation techniques for some of these things.</p>
<p>I wanted to have a data retention policy that was both human readable and machine readable, so that:</p>
<ul class="simple">
<li><p>we don’t have to keep two separate documents in sync.</p></li>
<li><p>the CCiW committee and other interested parties would be able to read the policy that actually gets applied, rather than merely what the policy was supposed to be.</p></li>
<li><p>I could have machine level checking of the exhaustiveness of the policy.</p></li>
</ul>
<p>My solution was to split the data retention policy into two parts:</p>
<ul class="simple">
<li><p>a heavily commented, human-and-machine readable <a class="reference external" href="https://github.com/cciw-uk/cciw.co.uk/blob/master/config/data_retention.yaml">Literate YAML file</a> with a <a class="reference external" href="https://www.cciw.co.uk/data-retention-policy/">nicely formatted version</a>, that I can genuinely claim <strong>is</strong> our data retention policy, and that it is automatically applied,</p></li>
<li><p>and a <a class="reference external" href="https://github.com/cciw-uk/cciw.co.uk/blob/master/cciw/data_retention/applying.py">Python implementation</a> that reads this file and applies it, along with some additional logic.</p></li>
</ul>
<p>A key part of the neatness of this solution is that the generic, higher level code (which reads in a YAML file, and therefore has to treat field names and table names as strings), and the business/domain specific logic can sit right next to each other. The end result is something that’s both efficient and elegant, with great separation of concerns, and virtually self-maintaining – it complains at me automatically if I fail to update it when adding new fields or tables.</p>
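<p>The “complains at me automatically” part can be sketched roughly as follows. This is not the CCiW implementation: dataclasses stand in for Django models, a plain dict stands in for the parsed YAML file, and <code class="docutils literal">Booking</code>, <code class="docutils literal">POLICY</code> and <code class="docutils literal">missing_from_policy</code> are all made-up names.</p>

```python
from dataclasses import dataclass, fields


@dataclass
class Booking:
    # Hypothetical stand-in for a Django model.
    name: str
    address: str
    amount_due: int


MODELS = {"Booking": Booking}

# Stand-in for the parsed YAML policy: table name -> field name -> action.
POLICY = {"Booking": {"name": "keep", "address": "erase"}}


def missing_from_policy(models, policy):
    """Return sorted 'Table.field' names that the policy does not mention."""
    missing = []
    for table_name, model in models.items():
        covered = policy.get(table_name, {})
        for field in fields(model):
            if field.name not in covered:
                missing.append(f"{table_name}.{field.name}")
    return sorted(missing)


print(missing_from_policy(MODELS, POLICY))  # ['Booking.amount_due']
```

<p>Because the generic code only sees field and table names as strings, the same check covers any model added later, which is what makes the policy close to self-maintaining.</p>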
<p>In terms of performance, applying the data retention policy to the entire database requires, at the moment, just 5 UPDATE and 3 DELETE queries, run once a day. This is made possible by:</p>
<ul class="simple">
<li><p>using the power of an ORM,</p></li>
<li><p>using generic code to build up <code class="docutils literal">**kwargs</code> to <a class="reference external" href="https://github.com/cciw-uk/cciw.co.uk/blob/37e6d69064c9a5d1372809fa2d723a0e203d21c3/cciw/data_retention/applying.py#L86">pass</a> to <a class="reference external" href="https://docs.djangoproject.com/en/4.1/ref/models/querysets/#django.db.models.query.QuerySet.update">QuerySet.update()</a>,</p></li>
<li><p>seamlessly integrating these two with <a class="reference external" href="https://github.com/cciw-uk/cciw.co.uk/blob/37e6d69064c9a5d1372809fa2d723a0e203d21c3/cciw/data_retention/applying.py#L229">business specific logic</a>.</p></li>
</ul>
<p><a class="reference external" href="https://gist.github.com/spookylukey/eeafa220b61e479694e2acf44902b6e1">Here is one of the queries the ORM generates</a>, which is complex enough that I wouldn’t attempt to write this by hand, but it correctly applies business logic like not erasing any data of people who still owe us money, and combines all the erasure that needs to be done into a single query.</p>
</section>
</section>
<section id="query-tracing">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-12" role="doc-backlink">Query tracing</a></h2>
<p>A common need in database web applications is development tools that monitor what database queries your code is generating and where they are coming from in the code. In Python this is made very easy thanks to <a class="reference external" href="https://docs.python.org/3/library/sys.html?highlight=_getframe#sys._getframe">sys._getframe</a> which gives you frame objects of the currently running program.</p>
<p>For Django, the go-to tool that uses this is <a class="reference external" href="https://github.com/jazzband/django-debug-toolbar">django-debug-toolbar</a>, which does an excellent job of pinpointing where queries are coming from.</p>
<p>There have been times when it has failed me, however. In particular, when you are working with generic code, such as the Django admin or <a class="reference external" href="https://www.django-rest-framework.org/">Django REST framework</a>, in which the fields and properties that will be fetched may be defined as strings in declarative code, a stack trace alone is not enough to work out what is triggering the queries. For example, you might have an admin class defined like this:</p>
<div class="code"><pre class="code python"><a id="rest_code_ce953786759e4a2cac950c3dc319cf48-1" name="rest_code_ce953786759e4a2cac950c3dc319cf48-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_ce953786759e4a2cac950c3dc319cf48-1"></a><span class="k">class</span> <span class="nc">MyModelAdmin</span><span class="p">(</span><span class="n">admin</span><span class="o">.</span><span class="n">ModelAdmin</span><span class="p">):</span>
<a id="rest_code_ce953786759e4a2cac950c3dc319cf48-2" name="rest_code_ce953786759e4a2cac950c3dc319cf48-2" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_ce953786759e4a2cac950c3dc319cf48-2"></a> <span class="n">list_display</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"field1"</span><span class="p">,</span> <span class="s2">"field2"</span><span class="p">,</span> <span class="s2">"field3"</span><span class="p">,</span> <span class="o">...</span><span class="p">]</span>
</pre></div>
<p>And the stack trace points you to <a class="reference external" href="https://github.com/django/django/blob/4470c2405c8dbb529501f9d78753e2aa4e9653a2/django/contrib/admin/templatetags/admin_list.py#L212">this code</a>:</p>
<div class="code"><pre class="code python"><a id="rest_code_e9d42b47cd754134816dd91c35dad923-1" name="rest_code_e9d42b47cd754134816dd91c35dad923-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_e9d42b47cd754134816dd91c35dad923-1"></a><span class="k">for</span> <span class="n">field_index</span><span class="p">,</span> <span class="n">field_name</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">cl</span><span class="o">.</span><span class="n">list_display</span><span class="p">):</span>
<a id="rest_code_e9d42b47cd754134816dd91c35dad923-2" name="rest_code_e9d42b47cd754134816dd91c35dad923-2" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_e9d42b47cd754134816dd91c35dad923-2"></a> <span class="n">f</span><span class="p">,</span> <span class="n">attr</span><span class="p">,</span> <span class="n">value</span> <span class="o">=</span> <span class="n">lookup_field</span><span class="p">(</span><span class="n">field_name</span><span class="p">,</span> <span class="n">result</span><span class="p">,</span> <span class="n">cl</span><span class="o">.</span><span class="n">model_admin</span><span class="p">)</span>
</pre></div>
<p>It’s correct, but not helpful. I need to know what the value of the local variable <code class="docutils literal">field_name</code> is in that loop to work out what is actually causing these queries.</p>
<p>In addition, in one case I was actually working with DRF endpoints, not the HTML endpoints the debug toolbar is designed for.</p>
<p>So, I wrote <a class="reference external" href="https://gist.github.com/spookylukey/cafeadfbe776ace223e5520bb0a93652#file-db_debug-py-L313">my own utilities</a> that, in addition to extracting the stack, would also include certain local variables for specified functions/methods. I then needed to add some aggregation functionality and pretty-printing for the SQL queries too. Also, I wrote a version of <a class="reference external" href="https://docs.djangoproject.com/en/4.1/topics/testing/tools/#django.test.TransactionTestCase.assertNumQueries">assertNumQueries</a> that used this better reporting.</p>
<p>This was highly effective, and enabled me and members of my team to tackle these DRF endpoints that had got entirely out of hand, often taking them from 10,000+ database queries (!) to 10 or 20.</p>
<p>This is relatively advanced stuff, but not actually all that hard, and it’s within reach of many developers. It doesn’t require learning a whole new language or deep black magic. You can call <code class="docutils literal">sys._getframe</code> interactively from a REPL and find out what it does. The biggest hurdle is actually making the mental leap that says “I need to build this, and with Python, I probably can”.</p>
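<p>As a rough illustration of the idea (not the code from my gist), here is a sketch that walks up the stack and picks out chosen local variables from chosen functions. <code class="docutils literal">describe_stack</code>, <code class="docutils literal">run_query</code> and <code class="docutils literal">render_row</code> are hypothetical names, and <code class="docutils literal">sys._getframe</code> is a CPython implementation detail that is not guaranteed to exist elsewhere.</p>

```python
import sys


def describe_stack(interesting_locals):
    """Walk up the stack from the caller, collecting selected local variables.

    interesting_locals maps a function name to the local variable names
    that should be reported for frames running that function.
    """
    report = []
    frame = sys._getframe(1)  # 1 = the caller of describe_stack
    while frame is not None:
        func_name = frame.f_code.co_name
        wanted = interesting_locals.get(func_name, ())
        captured = {
            name: frame.f_locals[name] for name in wanted if name in frame.f_locals
        }
        report.append((func_name, frame.f_lineno, captured))
        frame = frame.f_back
    return report


def run_query(sql):
    # Stand-in for the place where a database query is actually executed.
    return describe_stack({"render_row": ["field_name"]})


def render_row(field_names):
    # Stand-in for generic code that loops over names defined elsewhere as strings.
    reports = []
    for field_name in field_names:
        reports.append(run_query(f"SELECT {field_name} FROM t"))
    return reports
```

<p>Each report entry is a <code class="docutils literal">(function name, line number, captured locals)</code> tuple, so the entry for <code class="docutils literal">render_row</code> tells you which <code class="docutils literal">field_name</code> triggered the query.</p>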
</section>
<section id="time-machine-and-pyfakefs">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-13" role="doc-backlink">time-machine and pyfakefs</a></h2>
<p>As an example of “entire program transformation”, <a class="reference external" href="https://github.com/adamchainz/time-machine">time-machine</a> is an extremely useful library that mocks out date/time functions across your entire program, and <a class="reference external" href="https://github.com/pytest-dev/pyfakefs">pyfakefs</a> is one that does the same thing for file-system calls.</p>
<p>These contrast with libraries like <a class="reference external" href="https://docs.python.org/3/library/unittest.mock.html">unittest.mock</a> that do monkey patching on a more limited, module-by-module basis.</p>
<p>This technique is primarily useful in automated test suites, but it has a profound impact on the rest of your code base. In other languages, if you want to mock out “all date/time access” or “all filesystem access”, you may end up with a lot of tedious and noisy code to pass these dependencies through layers of code, or complex automatic dependency injection frameworks to avoid that. In Python, those things are rarely necessary, precisely because of things like time-machine and pyfakefs – that is, because your entire program can be manipulated at run-time. Your code base then has the massive benefit of a direct and simple style.</p>
</section>
<section id="environment-detection">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-14" role="doc-backlink">Environment detection</a></h2>
<p>My current employer is <a class="reference external" href="https://datapane.com/">Datapane</a>, who make tools for data apps. Many of our customers use <a class="reference external" href="https://jupyter.org/">Jupyter</a> or similar environments. To make things work really smoothly, our library code detects the environment it is running in and responds, and in some cases interacts with this environment (courtesy of <a class="reference external" href="https://datacrayon.com/">Shahin</a>, our Jupyter guy). This is an application of Python’s great support for introspection of the running program. There are a bunch of ways you can do this kind of thing:</p>
<ul class="simple">
<li><p>checking the system environment in <code class="docutils literal">os.environ</code></p></li>
<li><p>checking the contents of <code class="docutils literal">sys.modules</code></p></li>
<li><p>using <code class="docutils literal">sys._getframe</code> to examine how you are being called.</p></li>
<li><p>attempting to use <a class="reference external" href="https://ipython.readthedocs.io/en/stable/api/generated/IPython.core.getipython.html#IPython.core.getipython.get_ipython">get_ipython</a> and seeing if it works etc.</p></li>
</ul>
<p>This is an example of the “whole runtime environment” being dynamic and introspectable, and Jupyter Notebook and its huge ecosystem make great use of this.</p>
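<p>A minimal sketch of the first two checks in that list might look like the following. It is not Datapane’s code: the function name, the returned labels, and the specific heuristics (an imported <code class="docutils literal">ipykernel</code> or <code class="docutils literal">IPython</code> module, a <code class="docutils literal">CI</code> environment variable) are assumptions chosen for illustration, and real detection usually needs more cases.</p>

```python
import os
import sys


def detect_environment(environ=None, modules=None):
    """Guess the running environment from os.environ and sys.modules.

    Both arguments default to the real process state; they are parameters
    only so that the heuristics are easy to try out.
    """
    environ = os.environ if environ is None else environ
    modules = sys.modules if modules is None else modules

    # Heuristic: a Jupyter kernel process has normally imported ipykernel.
    if "ipykernel" in modules:
        return "jupyter-kernel"
    # Heuristic: a terminal IPython session has imported IPython but not ipykernel.
    if "IPython" in modules:
        return "ipython"
    # Heuristic: many CI services set a CI environment variable.
    if environ.get("CI"):
        return "ci"
    return "plain-python"
```

<p>Calling <code class="docutils literal">detect_environment()</code> with no arguments inspects the live process, which is the whole point: nothing has to be configured or passed in by the user.</p>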
<p>With some of the bigger features we’re working on at the moment at Datapane, we’re needing more advanced ways of adjusting to the running environment. Of course, as long as it works, none of the implementation matters to our customers, so we don’t advertise any of that. Our marketing tagline for this is “Jupyter notebook to a shareable data app in 10 seconds”, not “we’re in your Python process, looking at your sys.modules”.</p>
<p>After doing a grep through my <code class="docutils literal"><span class="pre">site-packages</span></code>, I found that using <code class="docutils literal">sys._getframe</code> for different kinds of environment detection is relatively common – often used for things like “raise a deprecation warning, but not if we are being called from these specific callers, like our own code”. Here’s just one more example:</p>
<p><a class="reference external" href="https://boltons.readthedocs.io/">boltons</a> provides a <a class="reference external" href="https://boltons.readthedocs.io/en/latest/typeutils.html?highlight=sentinel#boltons.typeutils.make_sentinel">make_sentinel</a> function. The docs state that if you want “pickleability”, the sentinel must be stored in a module-level constant. But the implementation goes further and <a class="reference external" href="https://boltons.readthedocs.io/en/latest/_modules/boltons/typeutils.html#make_sentinel">checks</a> you are doing that using a <code class="docutils literal">sys._getframe</code> trick. This is just a simple usability enhancement in which code checks that it is being used correctly, made possible by Python’s deep introspection support, but this kind of thing adds up. You will find many similar things in small amounts scattered across different libraries.</p>
</section>
<section id="fluent-compiler">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-15" role="doc-backlink">fluent-compiler</a></h2>
<p><a class="reference external" href="https://projectfluent.org/">Fluent</a> is a localisation system by Mozilla. I wrote and contributed the initial version of the official <a class="reference external" href="https://github.com/projectfluent/python-fluent">fluent.runtime</a> Python implementation, which is an interpreter for the Fluent language, and I also wrote a second implementation, <a class="reference external" href="https://github.com/django-ftl/fluent-compiler">fluent-compiler</a>.</p>
<p>Of all the libraries I’ve written, this was the one I enjoyed most, and it’s also the least popular it seems – not surprising, since GNU gettext provides a great 90% solution, which is enough for just about everyone, apart from Mozilla and, for some reason I can’t quite remember, me. However, I do know that Mozilla are actually using my second implementation in some of their web projects, via <a class="reference external" href="https://github.com/django-ftl/django-ftl">django-ftl</a>, and I’m using it, and it has a few GitHub stars, so that counts as real world!</p>
<p>Here are some of the Hillel-worthy Python techniques I used:</p>
<section id="compile-to-python">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-16" role="doc-backlink">Compile-to-Python</a></h3>
<p>In fluent-compiler, the implementation strategy I took was to compile the parsed Fluent AST to Python code, and <code class="docutils literal">exec</code> it. I actually use Python AST nodes rather than strings, for various security reasons, but this is basically the same as doing <code class="docutils literal">eval</code>, and that same technique is used by various other projects like <a class="reference external" href="https://jinja.palletsprojects.com/">Jinja</a> and <a class="reference external" href="https://www.makotemplates.org/">Mako</a>.</p>
<p>If anything qualifies as “programs that create programs on the fly”, then using <a class="reference external" href="https://docs.python.org/3/library/functions.html#exec">exec</a>, <a class="reference external" href="https://docs.python.org/3/library/functions.html#eval">eval</a> or <a class="reference external" href="https://docs.python.org/3/library/functions.html#compile">compile</a> must do so! The main advantage of this technique here is speed. It works particularly well with Fluent, because with a bit of static analysis, we can often completely eliminate the overhead that would otherwise be caused by its more advanced features, like <a class="reference external" href="https://projectfluent.org/fluent/guide/terms.html">terms and parameterized terms</a>, so that at run-time they cost us nothing.</p>
<p>This works even better when combined with PyPy. For the simple and common cases, under CPython 3.11 my benchmarks show a solution using fluent-compiler is about 15% faster than GNU gettext, while under PyPy it’s more than twice as fast. You should take these numbers with a pinch of salt, but I am confident that the result is not slow, despite having far more advanced capabilities than GNU gettext, which is not true for the first implementation – the compiler is about 5-10x faster than the interpreter for common cases on CPython.</p>
<p>Additionally, there are some neat tricks you can do when implementing a compiler using the same language that you are compiling to, like <a class="reference external" href="https://github.com/django-ftl/fluent-compiler/blob/6b262af7ce7c5608516aa24aff868ff66f95e0af/src/fluent_compiler/compiler.py#L1366">evaluating some things ahead of time that you know are constants</a>.</p>
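<p>To give a flavour of the compile-to-Python approach, here is a heavily simplified sketch. It is not fluent-compiler’s code: the <code class="docutils literal">compile_message</code> function and its <code class="docutils literal">parts</code> input format are invented for this example, and it assumes Python 3.9 or later (for the way <code class="docutils literal">ast.Subscript</code> is built). It builds Python AST nodes rather than source strings, then compiles and <code class="docutils literal">exec</code>s them.</p>

```python
import ast


def compile_message(name, parts):
    """Compile a message into a Python function via the ast module.

    parts is a list of ("text", literal) or ("var", argument_name) pairs.
    """
    elements = []
    for kind, value in parts:
        if kind == "text":
            node = ast.Constant(value=value)
        else:
            # Equivalent to the source code: str(args[<value>])
            node = ast.Call(
                func=ast.Name(id="str", ctx=ast.Load()),
                args=[
                    ast.Subscript(
                        value=ast.Name(id="args", ctx=ast.Load()),
                        slice=ast.Constant(value=value),
                        ctx=ast.Load(),
                    )
                ],
                keywords=[],
            )
        elements.append(node)

    # Equivalent to the source code: "".join([...])
    joined = ast.Call(
        func=ast.Attribute(value=ast.Constant(value=""), attr="join", ctx=ast.Load()),
        args=[ast.List(elts=elements, ctx=ast.Load())],
        keywords=[],
    )

    # Start from a parsed skeleton and swap in the generated expression.
    module = ast.parse("def placeholder(args):\n    return None\n")
    function = module.body[0]
    function.name = name
    function.body[0].value = joined
    ast.fix_missing_locations(module)

    namespace = {}
    exec(compile(module, filename="<compiled message>", mode="exec"), namespace)
    return namespace[name]


hello = compile_message("hello", [("text", "Hello, "), ("var", "name"), ("text", "!")])
print(hello({"name": "World"}))  # Hello, World!
```

<p>Since the placeholder name is only ever used as a dictionary key inside a constant node, untrusted message content never becomes Python syntax, which is one of the reasons for preferring AST nodes over string concatenation.</p>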
</section>
<section id="dynamic-test-methods">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-17" role="doc-backlink">Dynamic test methods</a></h3>
<p>While developing the second implementation, I used the first implementation as a reference. I didn’t want to duplicate every test, or really do anything manually to every test, I just wanted a large sub-set of the test suite to automatically test both implementations. I also wanted failures to clearly indicate which implementation had failed, i.e. I wanted them to run as separate test cases, because the reference implementation could potentially be at fault in some corner cases.</p>
<p>As I was using unittest, my solution was this: I <a class="reference external" href="https://github.com/django-ftl/fluent-compiler/blob/d1481d61e0bc1a28a228a4b6d5258350d436e765/fluent.runtime/tests/__init__.py#L12">added a class decorator</a> that modified the test classes by removing every method that started with <code class="docutils literal">test_</code>, replacing it with two methods, one for each implementation.</p>
<p>This provided almost exactly the same functionality as one of Hillel’s wished-for examples:</p>
<blockquote>
<p>Add an output assertion to an optimized function in dev/testing, checking that on all invocations it matches the result of an unoptimized function</p>
</blockquote>
<p>I just used a slightly different technique that better suited my needs, but also made great use of run-time program manipulation.</p>
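<p>A simplified sketch of that kind of class decorator is below. It is not the decorator from the linked repository: the two <code class="docutils literal">format_with_*</code> functions are trivial stand-ins for the two implementations, and the way the implementation is handed to the test (as <code class="docutils literal">self.format</code>) is an assumption made for this example.</p>

```python
import functools
import unittest


# Hypothetical stand-ins for the two implementations under test.
def format_with_interpreter(name):
    return "Hello, " + name


def format_with_compiler(name):
    return f"Hello, {name}"


IMPLEMENTATIONS = {
    "interpreter": format_with_interpreter,
    "compiler": format_with_compiler,
}


def _bind(method, implementation):
    """Wrap a test method so that it runs against one implementation."""

    @functools.wraps(method)
    def test(self):
        self.format = implementation
        return method(self)

    return test


def all_implementations(cls):
    """Replace each test_* method with one copy per implementation."""
    for attr_name, method in list(vars(cls).items()):
        if not (attr_name.startswith("test_") and callable(method)):
            continue
        delattr(cls, attr_name)
        for impl_name, implementation in IMPLEMENTATIONS.items():
            setattr(cls, f"{attr_name}_{impl_name}", _bind(method, implementation))
    return cls


@all_implementations
class FormatTests(unittest.TestCase):
    def test_greeting(self):
        self.assertEqual(self.format("World"), "Hello, World")
```

<p>The test runner now sees <code class="docutils literal">test_greeting_interpreter</code> and <code class="docutils literal">test_greeting_compiler</code> as separate test cases, so a failure names the implementation that caused it.</p>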
</section>
<section id="morph-into">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-18" role="doc-backlink"><code class="docutils literal">morph_into</code></a></h3>
<p>As part of the Fluent-to-Python compilation process, I have a tree of AST objects that I want to simplify. Simplifications include things like replacing a “string join” operation that has just one string, with that single string – so we need completely different types of objects. Even in a language that has mutation this can be a bit of a pain, because we’ve got to update the parent object and tell it to replace this child with a different child, and there are many different types of parent object with very different shapes. So my solution was <code class="docutils literal">morph_into</code>:</p>
<div class="code"><pre class="code python"><a id="rest_code_31ecff3114ad4381bf1274dff2c67ba8-1" name="rest_code_31ecff3114ad4381bf1274dff2c67ba8-1" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_31ecff3114ad4381bf1274dff2c67ba8-1"></a><span class="k">def</span> <span class="nf">morph_into</span><span class="p">(</span><span class="n">item</span><span class="p">,</span> <span class="n">new_item</span><span class="p">):</span>
<a id="rest_code_31ecff3114ad4381bf1274dff2c67ba8-2" name="rest_code_31ecff3114ad4381bf1274dff2c67ba8-2" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_31ecff3114ad4381bf1274dff2c67ba8-2"></a> <span class="sd">"""</span>
<a id="rest_code_31ecff3114ad4381bf1274dff2c67ba8-3" name="rest_code_31ecff3114ad4381bf1274dff2c67ba8-3" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_31ecff3114ad4381bf1274dff2c67ba8-3"></a><span class="sd"> Change `item` into `new_item` without changing its identity</span>
<a id="rest_code_31ecff3114ad4381bf1274dff2c67ba8-4" name="rest_code_31ecff3114ad4381bf1274dff2c67ba8-4" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_31ecff3114ad4381bf1274dff2c67ba8-4"></a><span class="sd"> """</span>
<a id="rest_code_31ecff3114ad4381bf1274dff2c67ba8-5" name="rest_code_31ecff3114ad4381bf1274dff2c67ba8-5" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_31ecff3114ad4381bf1274dff2c67ba8-5"></a> <span class="n">item</span><span class="o">.</span><span class="vm">__class__</span> <span class="o">=</span> <span class="n">new_item</span><span class="o">.</span><span class="vm">__class__</span>
<a id="rest_code_31ecff3114ad4381bf1274dff2c67ba8-6" name="rest_code_31ecff3114ad4381bf1274dff2c67ba8-6" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#rest_code_31ecff3114ad4381bf1274dff2c67ba8-6"></a> <span class="n">item</span><span class="o">.</span><span class="vm">__dict__</span> <span class="o">=</span> <span class="n">new_item</span><span class="o">.</span><span class="vm">__dict__</span>
</pre></div>
<p>With this solution, we leave the identity of the object the same, so none of the pointers to it need to be updated. But its type and all associated data is changed into something else, so that, other than <a class="reference external" href="https://docs.python.org/3/library/functions.html#id">id()</a>, the behaviour of <code class="docutils literal">item</code> will now be indistinguishable from <code class="docutils literal">new_item</code>. Not many languages allow you to do this!</p>
<p>I spent quite a lot of time wondering if I should be ashamed or proud of this code. But it turned out there was nothing to be ashamed of – it saved me writing a bunch of code and has had really no downsides.</p>
<p>Now, this technique won’t work for some things, like builtin primitives, so it can’t be completely generalised. But it doesn’t need that in order to be useful – all the objects I want to do this on are my own custom AST classes that share an interface, so it works and is “type safe” in its own way.</p>
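<p>Here is a small usage sketch of <code class="docutils literal">morph_into</code>, with two toy node classes. <code class="docutils literal">String</code>, <code class="docutils literal">StringJoin</code> and <code class="docutils literal">simplify</code> are simplified, hypothetical versions written for this example rather than the real fluent-compiler classes.</p>

```python
class String:
    def __init__(self, value):
        self.value = value

    def to_source(self):
        return repr(self.value)


class StringJoin:
    def __init__(self, parts):
        self.parts = parts

    def to_source(self):
        return " + ".join(part.to_source() for part in self.parts)


def morph_into(item, new_item):
    """Change `item` into `new_item` without changing its identity."""
    item.__class__ = new_item.__class__
    item.__dict__ = new_item.__dict__


def simplify(node):
    # A join of a single string is just that string.
    if isinstance(node, StringJoin) and len(node.parts) == 1:
        morph_into(node, node.parts[0])


join = StringJoin([String("hello")])
parent_children = [join]  # stands in for any parent that points at the node

simplify(join)

# The parent was never touched, yet it now holds a String.
assert parent_children[0] is join
assert type(join) is String
assert join.to_source() == "'hello'"
```

<p>Note that after morphing, the node and the original child share one <code class="docutils literal">__dict__</code>, which is harmless here because the child is no longer referenced from the tree.</p>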
<p>I’m far from the first person to discover this kind of trick when implementing compilers. In <a class="reference external" href="https://thume.ca/2019/04/29/comparing-compilers-in-rust-haskell-c-and-python/">this comparison of several groups of people working on a compiler project</a>, one of the most impressive results came from a single-person team who chose Python. She used way less code, and implemented way more features than the other groups, who all had multiple people on their teams and were using C++/Rust/Haskell etc. Fancy metaprogramming and dynamic typing were a big part of the difference, and by the sounds of it she used exactly the same kinds of things I used:</p>
<blockquote>
<p>Another example of the power of metaprogramming and dynamic typing is that we have a 400 line file called <code class="docutils literal">visit.rs</code> that is mostly repetitive boilerplate code implementing a visitor on a bunch of AST structures. In Python this could be a short ~10 line function that recursively introspects on the fields of the AST node and visits them (using the <code class="docutils literal">__dict__</code> attribute).</p>
</blockquote>
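<p>The kind of short function that quote describes might look like the sketch below. This is my own illustration, not code from the project being discussed, and the <code class="docutils literal">Number</code>, <code class="docutils literal">Add</code> and <code class="docutils literal">Call</code> classes are hypothetical AST nodes. It assumes every node stores its fields in <code class="docutils literal">__dict__</code> (so no <code class="docutils literal">__slots__</code>).</p>

```python
def visit(node, callback):
    """Call `callback` on `node` and, recursively, on every child node."""
    callback(node)
    for value in vars(node).values():
        children = value if isinstance(value, (list, tuple)) else [value]
        for child in children:
            # Plain data such as str and int has no __dict__ and is skipped.
            if hasattr(child, "__dict__"):
                visit(child, callback)


class Number:
    def __init__(self, value):
        self.value = value


class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right


class Call:
    def __init__(self, name, args):
        self.name = name
        self.args = args


tree = Call("max", [Add(Number(1), Number(2)), Number(3)])
seen = []
visit(tree, lambda node: seen.append(type(node).__name__))
print(seen)  # ['Call', 'Add', 'Number', 'Number', 'Number']
```

<p>Adding a new node class requires no change to <code class="docutils literal">visit</code> at all, which is where the saving over hand-written visitor boilerplate comes from.</p>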
<p>Again, I’m not claiming “dynamic typing is better than static typing” – <a class="reference external" href="https://lukeplant.me.uk/blog/posts/you-cant-compare-language-features-only-languages/">I don’t think it’s even meaningful to do that comparison</a>. I’m claiming that highly dynamic meta-programming tricks are indeed a significant part of real Python code, and really do make a big difference.</p>
</section>
</section>
<section id="pytest">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-19" role="doc-backlink">Pytest</a></h2>
<p>Pytest does quite a few dynamic tricks. Hillel wishes that pytest’s functionality was more easily usable elsewhere, such as from a REPL. I’ve no doubt this is a legitimate complaint – as it happens, my own use cases involve <a class="reference external" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/">sticking a REPL in my test</a>, rather than sticking a pytest in my REPL. However, you can’t claim that pytest isn’t a valid example, or isn’t making use of Python’s dynamism – it does, and it provides a lot of useful functionality as a result, including:</p>
<section id="assert-rewriting">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-20" role="doc-backlink">Assert rewriting</a></h3>
<p>The most obvious is perhaps their <a class="reference external" href="https://docs.pytest.org/en/6.2.x/assert.html">assert rewriting</a>, which relies on modifying the AST of test modules to inject sub-expression information for when asserts fail. It often makes test assertions much more immediately useful.</p>
</section>
<section id="automatic-dependency-injection-of-fixtures">
<h3><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-21" role="doc-backlink">Automatic dependency injection of fixtures</a></h3>
<p>Pytest provides one of the few cases of automatic dependency injection in Python where I’ve thought it was a good idea. It also makes use of Python’s dynamism to make this dependency injection extremely low ceremony. All you need to do is add a parameter to your test function, giving the parameter the name of the <a class="reference external" href="https://docs.pytest.org/en/6.2.x/fixture.html">fixture</a> you want, and pytest will find that fixture in its registry and pass it to your function.</p>
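<p>The core of name-based injection can be sketched in a few lines with <code class="docutils literal">inspect.signature</code>. This is a toy, not how pytest is actually implemented (pytest also handles scopes, teardown, parametrization and much more), and <code class="docutils literal">FIXTURES</code>, <code class="docutils literal">fixture</code> and <code class="docutils literal">call_with_fixtures</code> are names made up for this example.</p>

```python
import inspect

FIXTURES = {}


def fixture(func):
    """Register a function in the fixture registry under its own name."""
    FIXTURES[func.__name__] = func
    return func


def call_with_fixtures(func):
    """Call `func`, supplying each parameter from the fixture of the same name."""
    kwargs = {}
    for name in inspect.signature(func).parameters:
        # Fixtures may themselves ask for other fixtures, hence the recursion.
        kwargs[name] = call_with_fixtures(FIXTURES[name])
    return func(**kwargs)


@fixture
def database_url():
    return "sqlite://:memory:"


@fixture
def connection(database_url):
    return {"url": database_url, "open": True}


def check_connection(connection):
    assert connection["open"]
    return connection["url"]


print(call_with_fixtures(check_connection))  # sqlite://:memory:
```

<p>The test function never imports or names the registry; the parameter name alone is the request, which is what keeps the ceremony so low.</p>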
</section>
</section>
<section id="others">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-22" role="doc-backlink">Others</a></h2>
<p>This post is way too long already, and I’ve done very little actual searching for this stuff – almost all my examples are things that I’ve heard about in the past or done myself, so there must be far more than these in the real world out there. Here are a bunch more I thought of but didn’t have time to expand on:</p>
<ul>
<li><p>PyTorch <a class="reference external" href="https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html">automatic differentiation</a> which uses instrumented objects (similar to the SQLAlchemy example I presume), plus <a class="reference external" href="https://github.com/pytorch/pytorch/blob/master/tools/autograd/derivatives.yaml">some kind of pattern matching on function calls</a> that I haven’t had time to investigate.</p>
<p><a class="reference external" href="https://vmartin.fr/understanding-automatic-differentiation-in-30-lines-of-python.html">Understanding Automatic Differentiation in 30 lines of Python</a> is a great article on how you can build this kind of thing. Crucially, Python’s dynamism makes this kind of thing very accessible to mere mortals.</p>
</li>
<li><p><a class="reference external" href="https://vcrpy.readthedocs.io/en/latest/index.html">VCR.py</a>: monkey patch all HTTP functions and record interactions, so that the second time we run a test we can use canned responses and avoid the network.</p></li>
<li><p>CCiW email tests: <a class="reference external" href="https://github.com/cciw-uk/cciw.co.uk/blob/37e6d69064c9a5d1372809fa2d723a0e203d21c3/cciw/utils/tests/base.py#L55">monkey patch Django’s Atomic decorator and mail sending functions</a> to ensure we are using “queued email” appropriately inside transactions.</p></li>
<li><p>Lots of tricks in <a class="reference external" href="https://github.com/radiac/django-tagulous">django-tagulous</a> to improve usability for developers.</p></li>
<li><p><a class="reference external" href="https://numba.pydata.org/">numba</a>: JIT compile and run your Python code on a GPU with a single decorator.</p></li>
<li><p><a class="reference external" href="https://drf-spectacular.readthedocs.io/en/latest/readme.html">drf-spectacular</a>: iterate over all endpoints in a DRF project, introspecting serializers and calling methods with dummy request objects where necessary, to produce an OpenAPI schema.</p></li>
<li><p>In the stdlib, <a class="reference external" href="https://docs.python.org/3/library/functools.html#functools.total_ordering">@total_ordering</a> will look at your class and add missing rich comparison methods.</p></li>
<li><p>depending on an environment flag, <a class="reference external" href="https://github.com/learnscripture/learnscripture.net/blob/3063de7bd364ccf6105b39485830e75b19f902d9/learnscripture/tests/base.py#L232">automatically wrap all UI test cases in a decorator that takes a screenshot if the test fails</a>.</p></li>
</ul>
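<p>Of the items above, <code class="docutils literal">@total_ordering</code> is the easiest to show in a few lines. In this sketch the <code class="docutils literal">Version</code> class is a made-up example; only <code class="docutils literal">__eq__</code> and <code class="docutils literal">__lt__</code> are written by hand, and the decorator inspects the class and adds the remaining comparison methods.</p>

```python
from functools import total_ordering


@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


# __le__, __gt__ and __ge__ were never written, but they now exist on the class.
assert Version(1, 2) <= Version(1, 3)
assert Version(2, 0) > Version(1, 9)
assert Version(1, 0) >= Version(1, 0)
```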
<p>EDIT: And some more I discovered after publishing this post, which look interesting:</p>
<ul class="simple">
<li><p><a class="reference external" href="https://github.com/amakelov/mandala">mandala</a> - “Computations that save, query and version themselves”</p></li>
<li><p><a class="reference external" href="https://github.com/google/latexify_py">latexify</a> - pretty print Python functions using LaTeX</p></li>
<li><p><a class="reference external" href="https://jax.readthedocs.io/en/latest/index.html">JAX</a> which has a JIT compiler of Python code and <a class="reference external" href="https://github.com/hips/autograd">Autograd</a> which implements automatic differentiation of Python code, possibly similar to PyTorch.</p></li>
</ul>
</section>
<section id="conclusion">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-23" role="doc-backlink">Conclusion</a></h2>
<p>Why don’t we talk about these much? I think a large part of the answer is that the Python community cares about solving problems, and not about how clever your code is. Clever code, in fact, is looked down on, which is the right attitude – cleverness for the sake of it is always bad. Problem solving is good though. So libraries and projects that do these things tend not to brag about their clever techniques, but to talk about the problem that they solve.</p>
<p>Also, many libraries that use these things wrap them up so that you don’t have to know what’s going on – It Just Works. As a newbie, everything about computers is magical and you have to just accept that that’s how they work. Then you take it for granted, and just get on with using it.</p>
<p>On the other hand, for the implementer, once you understand the magic, it stops being magic, it’s just a feature that the language has.</p>
<p>Either way, pretty soon none of these things count as “hyper programming” any more – in one sense, they are just normal Python programming, and that’s the whole point: <strong>Python gives you super powers which are not super powers, they are normal powers</strong>. Everyone gets to use them, and you don’t need to learn a different language to do so.</p>
<p>Perhaps we do need to talk about them more, though. At the very least, I hope my examples have sparked some ideas about the kinds of things that are possible in Python.</p>
<p>Happy hacking!</p>
</section>
<section id="links">
<h2><a class="toc-backref" href="https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/#toc-entry-24" role="doc-backlink">Links</a></h2>
<ul class="simple">
<li><p><a class="reference external" href="https://lobste.rs/s/9w7ylg/python_s_disappointing_superpowers">Discussion of this post on Lobsters</a></p></li>
<li><p><a class="reference external" href="https://twitter.com/spookylukey/status/1620851142849863680">Discussion of this post on Twitter</a></p></li>
<li><p><a class="reference external" href="https://news.ycombinator.com/item?id=34611969">Discussion of this post on Hacker News</a></p></li>
</ul>
</section>Raising exceptions or returning error objects in Pythonhttps://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/2022-06-06T11:29:35+01:002022-06-06T11:29:35+01:00Luke Plant<p>How returning error objects can provide some advantages over raising exceptions in Python, such as for static type checking tools.</p><p>The other day I got a question about some old code I had written which, instead
of raising an exception for an error condition as the reader expected, returned
an error object:</p>
<blockquote>
<p>With your EmailVerifyTokenGenerator class, why do you return error classes
instead of raising custom errors? You could still pass the email to a custom
VerifyExpired exception.</p>
<p><a class="reference external" href="https://github.com/cciw-uk/cciw.co.uk/blob/eae8005feb95a5383663e69e92d80e11effe5ee6/cciw/bookings/email.py#L41">https://github.com/cciw-uk/cciw.co.uk/blob/eae8005feb95a5383663e69e92d80e11effe5ee6/cciw/bookings/email.py#L41</a></p>
<p>I think I'm too eager to raise errors but maybe there's something I'm missing with classes 😁!</p>
</blockquote>
<p>The code in question is below (slightly modified and with several uninteresting
methods removed). It is part of a system for doing email address verification
via magic links in emails.</p>
<div class="code"><pre class="code python"><a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-1" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-1" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-1"></a><span class="kn">from</span> <span class="nn">dataclasses</span> <span class="kn">import</span> <span class="n">dataclass</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-2" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-2" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-2"></a>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-3" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-3" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-3"></a><span class="k">class</span> <span class="nc">VerifyFailed</span><span class="p">:</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-4" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-4" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-4"></a> <span class="k">pass</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-5" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-5" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-5"></a>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-6" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-6" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-6"></a>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-7" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-7" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-7"></a><span class="n">VerifyFailed</span> <span class="o">=</span> <span class="n">VerifyFailed</span><span class="p">()</span> <span class="c1"># singleton sentinel value</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-8" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-8" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-8"></a>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-9" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-9" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-9"></a>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-10" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-10" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-10"></a><span class="nd">@dataclass</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-11" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-11" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-11"></a><span class="k">class</span> <span class="nc">VerifyExpired</span><span class="p">:</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-12" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-12" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-12"></a> <span class="n">email</span><span class="p">:</span> <span class="nb">str</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-13" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-13" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-13"></a>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-14" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-14" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-14"></a>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-15" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-15" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-15"></a><span class="k">class</span> <span class="nc">EmailVerifyTokenGenerator</span><span class="p">:</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-16" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-16" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-16"></a> <span class="k">def</span> <span class="nf">token_for_email</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">email</span><span class="p">):</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-17" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-17" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-17"></a> <span class="o">...</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-18" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-18" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-18"></a>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-19" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-19" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-19"></a> <span class="k">def</span> <span class="nf">email_from_token</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">token</span><span class="p">):</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-20" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-20" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-20"></a> <span class="sd">"""</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-21" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-21" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-21"></a><span class="sd"> Extracts the verified email address from the token, or a VerifyFailed</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-22" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-22" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-22"></a><span class="sd"> constant if verification failed, or VerifyExpired if the link expired.</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-23" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-23" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-23"></a><span class="sd"> """</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-24" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-24" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-24"></a> <span class="n">max_age</span> <span class="o">=</span> <span class="n">settings</span><span class="o">.</span><span class="n">EMAIL_VERIFY_TIMEOUT</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-25" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-25" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-25"></a> <span class="k">try</span><span class="p">:</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-26" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-26" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-26"></a> <span class="n">unencoded_token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">url_safe_decode</span><span class="p">(</span><span class="n">token</span><span class="p">)</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-27" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-27" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-27"></a> <span class="k">except</span> <span class="p">(</span><span class="ne">UnicodeDecodeError</span><span class="p">,</span> <span class="n">binascii</span><span class="o">.</span><span class="n">Error</span><span class="p">):</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-28" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-28" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-28"></a> <span class="k">return</span> <span class="n">VerifyFailed</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-29" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-29" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-29"></a> <span class="k">try</span><span class="p">:</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-30" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-30" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-30"></a> <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">signer</span><span class="o">.</span><span class="n">unsign</span><span class="p">(</span><span class="n">unencoded_token</span><span class="p">,</span> <span class="n">max_age</span><span class="o">=</span><span class="n">max_age</span><span class="p">)</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-31" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-31" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-31"></a> <span class="k">except</span> <span class="p">(</span><span class="n">SignatureExpired</span><span class="p">,):</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-32" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-32" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-32"></a> <span class="k">return</span> <span class="n">VerifyExpired</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">signer</span><span class="o">.</span><span class="n">unsign</span><span class="p">(</span><span class="n">unencoded_token</span><span class="p">))</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-33" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-33" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-33"></a> <span class="k">except</span> <span class="p">(</span><span class="n">BadSignature</span><span class="p">,):</span>
<a id="rest_code_968c47b7cbd149d584a89da29fe4c56e-34" name="rest_code_968c47b7cbd149d584a89da29fe4c56e-34" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_968c47b7cbd149d584a89da29fe4c56e-34"></a> <span class="k">return</span> <span class="n">VerifyFailed</span>
</pre></div>
<p>To sum up, we have a function that extracts an email address from a token,
checking the HMAC signature that it is bundled with. There are 3 possibilities
we want to deal with:</p>
<ol class="arabic simple">
<li><p>The happy case – we’ve got a valid HMAC code, we just need the email address
returned.</p></li>
<li><p>We’ve got an invalid signature.</p></li>
<li><p>We’ve got a valid but expired signature. We want to handle this separately,
because we’d like to streamline the user experience for getting a new token
generated and sent to them, which means we need to return the email address.</p></li>
</ol>
<p>It’s using <a class="reference external" href="https://docs.djangoproject.com/en/stable/topics/signing/">Django’s signer functions</a> to do the heavy
lifting, but that doesn’t matter for our purposes, because we are wrapping it
up.</p>
<p>To get going on designing our API for this bit of code, here are some bad
options:</p>
<ol class="arabic">
<li><p>We could have a pair of methods or functions: <code class="docutils literal">extract_email_from_token</code>
and <code class="docutils literal">check_signature</code>, which can be used independently. This is bad because
you could easily use <code class="docutils literal">extract_email_from_token</code> and completely forget to
use <code class="docutils literal">check_signature</code>.</p>
<p>The principle here is that we want the developer using this API to fall into
<a class="reference external" href="https://blog.codinghorror.com/falling-into-the-pit-of-success/">the pit of success</a>. Either
the developer gets their code correct, or, if they don’t, the code should be
obviously broken and not work at all, rather than being subtly flawed with some
nasty bug, like a security issue.</p>
</li>
<li><p>We could have an <code class="docutils literal">email_from_token()</code> method or function with a return value
of a tuple containing <code class="docutils literal">(email_address: str, valid_and_not_expired_signature:
bool)</code>.</p>
<p>This has a similar issue to above – the calling code could use
<code class="docutils literal">email_address</code> and forget to check the validity boolean.</p>
</li>
</ol>
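<p>To make the second bad option concrete, here is a minimal sketch of how the tuple-returning design goes wrong. The function names and the toy <code class="docutils literal">"email:status"</code> token format are made up for illustration, and the check is a stand-in, not a real signature scheme.</p>

```python
def email_from_token_unsafe(token: str) -> tuple[str, bool]:
    """Hypothetical 'bad' API: returns (email_address, signature_ok)."""
    email, _, status = token.partition(":")
    # Stand-in check for illustration only; NOT a real signature scheme.
    signature_ok = status == "valid"
    return email, signature_ok


def naive_caller(token: str) -> str:
    # Bug: the boolean is thrown away, so a forged token is silently accepted.
    email, _ = email_from_token_unsafe(token)
    return email


accepted = naive_caller("attacker@example.com:forged")
```

<p>Nothing crashes and nothing looks wrong in the calling code, which is exactly the kind of subtle flaw we want the API to make impossible.</p>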
<p>Having ruled those out, we’ve got two main contenders for how to design
<code class="docutils literal">email_from_token()</code>:</p>
<ol class="arabic simple">
<li><p>We could make it raise exceptions for the “invalid” or “expired” cases. We need
to pass extra data for the latter, but we can put it inside the exception
object – as noted by the original questioner.</p></li>
<li><p>We could make it return error objects for the error cases, as coded above.</p></li>
</ol>
<p><strong>Both</strong> of these satisfy the “pit of success” criterion. If the developer
accidentally does not handle the error cases, they won’t have a bug where we
verified an email address that should not be verified. We will instead probably
have a crasher of some kind, which in the case of a web app, like this one,
means a 500 error page being seen, and something in our logs that makes it
pretty clear what happened.</p>
<p>If we choose to raise exceptions, naive code which doesn’t check for the
exceptions will simply get no further – the exception will propagate up and
terminate the handler. With the second option where we return error objects,
those objects can’t be accidentally converted into success values – the
<code class="docutils literal">VerifyExpired</code> object <strong>contains</strong> the email address, but it is a completely
different shape of value from the happy case.</p>
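<p>As a rough illustration of that last point, here is what happens when naive code treats an error object as if it were the email address. <code class="docutils literal">mark_email_verified</code> is a hypothetical success handler, not part of the real code base.</p>

```python
from dataclasses import dataclass


@dataclass
class VerifyExpired:
    email: str


def mark_email_verified(email: str) -> str:
    # Hypothetical success handler that expects a plain string.
    return email.strip().lower()


# What a caller that forgot to check for errors might pass along.
result = VerifyExpired("user@example.com")

try:
    mark_email_verified(result)
except AttributeError as exc:
    outcome = f"crashed: {exc}"
```

<p>The failure is loud and lands in the logs. This relies on the success path actually using the value as a string; code that only formats it into a message would not crash, so it is not a complete guarantee.</p>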
<p>Both of these approaches, to some degree, respect the principle that can be
summed up as <a class="reference external" href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/">Parse Don’t Validate</a>. Instead
of merely validating a token and extracting an email address as two independent
things, we are parsing a token, and encoding the result of the validation in the
type of objects that will then flow through the program.</p>
<p>But which is better?</p>
<p>One of the influences on my thinking is the way types work in Haskell and other
similar languages that make it very easy to create types and constructors. In
Haskell, the following is <strong>all</strong> the code you need to define a return type for
this kind of function, and the 3 different data constructors you need, which
then do double duty for <a class="reference external" href="https://en.m.wikibooks.org/wiki/Haskell/Pattern_matching">pattern matching</a>:</p>
<div class="code"><pre class="code haskell"><a id="rest_code_6404aa9a9eee4ea59e465608e3bf3963-1" name="rest_code_6404aa9a9eee4ea59e465608e3bf3963-1" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_6404aa9a9eee4ea59e465608e3bf3963-1"></a><span class="kr">data</span><span class="w"> </span><span class="kt">EmailVerificationResult</span><span class="w"> </span><span class="ow">=</span><span class="w"> </span><span class="kt">EmailVerified</span><span class="w"> </span><span class="kt">String</span><span class="w"></span>
<a id="rest_code_6404aa9a9eee4ea59e465608e3bf3963-2" name="rest_code_6404aa9a9eee4ea59e465608e3bf3963-2" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_6404aa9a9eee4ea59e465608e3bf3963-2"></a><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="kt">VerifyFailed</span><span class="w"></span>
<a id="rest_code_6404aa9a9eee4ea59e465608e3bf3963-3" name="rest_code_6404aa9a9eee4ea59e465608e3bf3963-3" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_6404aa9a9eee4ea59e465608e3bf3963-3"></a><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="kt">VerifyExpired</span><span class="w"> </span><span class="kt">String</span><span class="w"></span>
</pre></div>
<p>Now, Python is not nearly as succinct, but <a class="reference external" href="https://docs.python.org/3/library/dataclasses.html">dataclasses</a> were a big improvement
for defining things like <code class="docutils literal">VerifyExpired</code>.</p>
<p>In Haskell, due to static type checking, this pattern makes it pretty much
impossible for the calling code to accidentally fail to handle the return value
correctly. But even in Python, which doesn’t have that built in, I think there
are some compelling advantages:</p>
<ol class="arabic">
<li><p>We expect the calling code to handle all the different return values at some
point, and <strong>at the same point</strong>. (This is unlike some code where we can
raise an exception that we never expect the calling code to specifically
handle – it will be handled by more generic methods at a different layer). It
therefore makes sense that we treat all 3 values as the same kind of thing —
they are just different return values.</p></li>
<li><p>If you instead raise exceptions, you are immediately forcing the calling
code into a special control flow structure, namely the <code class="docutils literal">try/except</code> dance,
which can be inconvenient.</p></li>
<li><p>In particular, if you want to hand off processing of the value to some other
function or code for handling, you can’t do it easily. For example, code like
this would be fine with the “return error object” method, but significantly
complicated by the “raise exception” method:</p>
<div class="code"><pre class="code python"><a id="rest_code_01bee2e5325d49249bd3b125bab79f8f-1" name="rest_code_01bee2e5325d49249bd3b125bab79f8f-1" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_01bee2e5325d49249bd3b125bab79f8f-1"></a><span class="n">verify_result</span> <span class="o">=</span> <span class="n">verifier</span><span class="o">.</span><span class="n">email_from_token</span><span class="p">(</span><span class="n">token</span><span class="p">)</span>
<a id="rest_code_01bee2e5325d49249bd3b125bab79f8f-2" name="rest_code_01bee2e5325d49249bd3b125bab79f8f-2" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_01bee2e5325d49249bd3b125bab79f8f-2"></a><span class="n">log_verify_result</span><span class="p">(</span><span class="n">request</span><span class="o">.</span><span class="n">ip_address</span><span class="p">,</span> <span class="n">verify_result</span><span class="p">)</span>
<a id="rest_code_01bee2e5325d49249bd3b125bab79f8f-3" name="rest_code_01bee2e5325d49249bd3b125bab79f8f-3" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_01bee2e5325d49249bd3b125bab79f8f-3"></a><span class="c1"># etc.</span>
</pre></div>
</li>
</ol>
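<p>For comparison, here is a sketch of what the logging hand-off might look like with the “raise exception” method. The exception classes, the toy <code class="docutils literal">"email:status"</code> token format and <code class="docutils literal">verify_and_log</code> are all hypothetical stand-ins, not the real code.</p>

```python
class VerifyFailedError(Exception):
    """Hypothetical exception for an invalid token."""


class VerifyExpiredError(Exception):
    """Hypothetical exception for an expired token; carries the email."""

    def __init__(self, email: str):
        super().__init__(email)
        self.email = email


def email_from_token(token: str) -> str:
    # Stand-in for real signature checking, for illustration only.
    email, _, status = token.partition(":")
    if status == "expired":
        raise VerifyExpiredError(email)
    if status != "valid":
        raise VerifyFailedError()
    return email


log = []


def log_verify_result(ip_address: str, result: object) -> None:
    log.append((ip_address, result))


def verify_and_log(ip_address: str, token: str) -> str:
    try:
        result = email_from_token(token)
    except (VerifyFailedError, VerifyExpiredError) as exc:
        # The outcome only exists as an in-flight exception, so it has to be
        # caught, logged, and re-raised for the caller to handle again.
        log_verify_result(ip_address, exc)
        raise
    log_verify_result(ip_address, result)
    return result
```

<p>The two-line version has turned into a wrapper with two logging call sites, and the caller still needs its own <code class="docutils literal">try/except</code> to act on the outcome.</p>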
<p>In the years since I wrote the code, however, some perhaps more compelling
arguments have come along for the error object method.</p>
<p>First, with some small changes (specifically, removing the sentinel singleton
value), we can now add a type signature for <code class="docutils literal">email_from_token</code>:</p>
<div class="code"><pre class="code python"><a id="rest_code_86cd4bf44b394bbf951809b34a582048-1" name="rest_code_86cd4bf44b394bbf951809b34a582048-1" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_86cd4bf44b394bbf951809b34a582048-1"></a><span class="k">def</span> <span class="nf">email_from_token</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">token</span><span class="p">,</span> <span class="n">max_age</span><span class="o">=</span><span class="kc">None</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span> <span class="o">|</span> <span class="n">VerifyFailed</span> <span class="o">|</span> <span class="n">VerifyExpired</span><span class="p">:</span>
<a id="rest_code_86cd4bf44b394bbf951809b34a582048-2" name="rest_code_86cd4bf44b394bbf951809b34a582048-2" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_86cd4bf44b394bbf951809b34a582048-2"></a> <span class="o">...</span>
</pre></div>
<p>(On Python versions before 3.10, you may need <a class="reference external" href="https://docs.python.org/3/library/typing.html#typing.Union">typing.Union</a> instead of the
<code class="docutils literal">|</code> syntax.)</p>
<p>This is a benefit in itself from a documentation point of view, and for better
IDE/editor help.</p>
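<p>The “small changes” could look something like the sketch below. In the original code the name <code class="docutils literal">VerifyFailed</code> ends up bound to an instance rather than a class, so it cannot serve as a type in a signature or in an <code class="docutils literal">isinstance</code> check. Here it stays a class and the function returns a fresh instance. The <code class="docutils literal">VerifyResult</code> alias and the toy token parsing are my own illustration, not the real implementation.</p>

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class VerifyFailed:
    """Returned when the token or its signature is invalid."""


@dataclass
class VerifyExpired:
    """Returned when the signature is valid but too old."""

    email: str


# Hypothetical alias; the Union spelling also works before Python 3.10.
VerifyResult = Union[str, VerifyFailed, VerifyExpired]


def email_from_token(token: str) -> VerifyResult:
    # Stand-in logic for illustration; the real method checks an HMAC signature.
    email, _, status = token.partition(":")
    if status == "valid":
        return email
    if status == "expired":
        return VerifyExpired(email)
    return VerifyFailed()  # a fresh instance rather than a shared sentinel
```

<p>Because <code class="docutils literal">@dataclass</code> generates <code class="docutils literal">__eq__</code>, any two <code class="docutils literal">VerifyFailed()</code> instances still compare equal, so little is lost by dropping the singleton.</p>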
<p>We can go further with mypy. We can structure our calling code as follows to make
use of <a class="reference external" href="https://hakibenita.com/python-mypy-exhaustive-checking">mypy exhaustiveness checking</a>:</p>
<div class="code"><pre class="code python"><a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-1" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-1" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-1"></a><span class="kn">from</span> <span class="nn">typing_extensions</span> <span class="kn">import</span> <span class="n">assert_never</span>
<a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-2" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-2" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-2"></a>
<a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-3" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-3" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-3"></a><span class="n">verified_email</span> <span class="o">=</span> <span class="n">EmailVerifyTokenGenerator</span><span class="p">()</span><span class="o">.</span><span class="n">email_from_token</span><span class="p">(</span><span class="n">token</span><span class="p">)</span>
<a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-4" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-4" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-4"></a><span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">verified_email</span><span class="p">,</span> <span class="n">VerifyFailed</span><span class="p">):</span>
<a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-5" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-5" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-5"></a> <span class="o">...</span>
<a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-6" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-6" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-6"></a><span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">verified_email</span><span class="p">,</span> <span class="n">VerifyExpired</span><span class="p">):</span>
<a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-7" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-7" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-7"></a> <span class="o">...</span>
<a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-8" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-8" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-8"></a><span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">verified_email</span><span class="p">,</span> <span class="nb">str</span><span class="p">):</span>
<a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-9" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-9" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-9"></a> <span class="o">...</span>
<a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-10" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-10" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-10"></a><span class="k">else</span><span class="p">:</span>
<a id="rest_code_18f3770dba5141e5bd6cc156537a22e4-11" name="rest_code_18f3770dba5141e5bd6cc156537a22e4-11" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_18f3770dba5141e5bd6cc156537a22e4-11"></a> <span class="n">assert_never</span><span class="p">(</span><span class="n">verified_email</span><span class="p">)</span>
</pre></div>
<p>Now, if we remove one of these blocks, let’s say the <code class="docutils literal">VerifyExpired</code> one (or
if we added another option to <code class="docutils literal">email_from_token</code>), mypy will catch it for us:</p>
<div class="code"><pre class="code shell"><a id="rest_code_254f3e6c0cef4168b0211bac38c0f5eb-1" name="rest_code_254f3e6c0cef4168b0211bac38c0f5eb-1" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_254f3e6c0cef4168b0211bac38c0f5eb-1"></a>error: Argument <span class="m">1</span> to <span class="s2">"assert_never"</span> has incompatible <span class="nb">type</span> <span class="s2">"VerifyExpired"</span><span class="p">;</span> expected <span class="s2">"NoReturn"</span>
</pre></div>
<p>With the error object method, we could also write our handling code using
<a class="reference external" href="https://peps.python.org/pep-0636/">structural pattern matching</a>. The
equivalent code, including our mypy exhaustiveness check, now looks like this:</p>
<div class="code"><pre class="code python"><a id="rest_code_8e037f35c6fa45efb26ad7553c38086e-1" name="rest_code_8e037f35c6fa45efb26ad7553c38086e-1" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_8e037f35c6fa45efb26ad7553c38086e-1"></a><span class="n">verified_email</span> <span class="o">=</span> <span class="n">EmailVerifyTokenGenerator</span><span class="p">()</span><span class="o">.</span><span class="n">email_from_token</span><span class="p">(</span><span class="n">token</span><span class="p">)</span>
<a id="rest_code_8e037f35c6fa45efb26ad7553c38086e-2" name="rest_code_8e037f35c6fa45efb26ad7553c38086e-2" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_8e037f35c6fa45efb26ad7553c38086e-2"></a><span class="k">match</span> <span class="n">verified_email</span><span class="p">:</span>
<a id="rest_code_8e037f35c6fa45efb26ad7553c38086e-3" name="rest_code_8e037f35c6fa45efb26ad7553c38086e-3" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_8e037f35c6fa45efb26ad7553c38086e-3"></a>    <span class="k">case</span> <span class="n">VerifyFailed</span><span class="p">():</span>
<a id="rest_code_8e037f35c6fa45efb26ad7553c38086e-4" name="rest_code_8e037f35c6fa45efb26ad7553c38086e-4" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_8e037f35c6fa45efb26ad7553c38086e-4"></a>        <span class="o">...</span>
<a id="rest_code_8e037f35c6fa45efb26ad7553c38086e-5" name="rest_code_8e037f35c6fa45efb26ad7553c38086e-5" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_8e037f35c6fa45efb26ad7553c38086e-5"></a>    <span class="k">case</span> <span class="n">VerifyExpired</span><span class="p">(</span><span class="n">expired_token_email</span><span class="p">):</span>
<a id="rest_code_8e037f35c6fa45efb26ad7553c38086e-6" name="rest_code_8e037f35c6fa45efb26ad7553c38086e-6" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_8e037f35c6fa45efb26ad7553c38086e-6"></a>        <span class="o">...</span>
<a id="rest_code_8e037f35c6fa45efb26ad7553c38086e-7" name="rest_code_8e037f35c6fa45efb26ad7553c38086e-7" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_8e037f35c6fa45efb26ad7553c38086e-7"></a>    <span class="k">case</span> <span class="nb">str</span><span class="p">():</span>
<a id="rest_code_8e037f35c6fa45efb26ad7553c38086e-8" name="rest_code_8e037f35c6fa45efb26ad7553c38086e-8" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_8e037f35c6fa45efb26ad7553c38086e-8"></a>        <span class="o">...</span>
<a id="rest_code_8e037f35c6fa45efb26ad7553c38086e-9" name="rest_code_8e037f35c6fa45efb26ad7553c38086e-9" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_8e037f35c6fa45efb26ad7553c38086e-9"></a>    <span class="k">case</span> <span class="k">_</span><span class="p">:</span>
<a id="rest_code_8e037f35c6fa45efb26ad7553c38086e-10" name="rest_code_8e037f35c6fa45efb26ad7553c38086e-10" href="https://lukeplant.me.uk/blog/posts/raising-exceptions-or-returning-error-objects-in-python/#rest_code_8e037f35c6fa45efb26ad7553c38086e-10"></a>        <span class="n">assert_never</span><span class="p">(</span><span class="n">verified_email</span><span class="p">)</span>
</pre></div>
<p>This has destructuring of the email address in <code class="docutils literal">VerifyExpired</code> built in – it
is bound to the name <code class="docutils literal">expired_token_email</code> in that branch.</p>
<p>Hopefully this gives a good justification for the approach I took with this
code. There are times when exceptions are better – generally when the things
mentioned above don’t apply, or the opposite applies – but I think error objects
also have their place, and sometimes are a much better solution.</p>
<section id="links">
<h2>Links</h2>
<ul class="simple">
<li><p><a class="reference external" href="https://twitter.com/spookylukey/status/1533831216536997892">Discussion on Twitter</a></p></li>
</ul>
</section>REPL Python programming and debugging with IPythonhttps://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/2022-05-04T07:26:56+01:002022-05-04T07:26:56+01:00Luke Plant<p>The flows I use for exploratory programming using a REPL and their advantages.</p><p>When programming in Python, I spend a large amount of time using <a class="reference external" href="https://ipython.org/">IPython</a> and its <a class="reference external" href="https://ipython.readthedocs.io/en/stable/">powerful interactive prompt</a>, not just for some one-off
calculations, but for significant chunks of actual programming and debugging. I
use it especially for exploratory programming where I’m unsure of the APIs
available to me, or what the state of the system will be at a particular point
in the code.</p>
<p>While it looks like I’ve been doing this <a class="reference external" href="https://lukeplant.me.uk/blog/posts/exploratory-programming-with-ipython/">for 12 years now</a>, I’m not sure how
widespread this method of working is, as I rarely hear other people talk about
it. So I thought it would be worth sharing in some detail.</p>
<p>If you like videos and want to see this method in action for writing a test, you
could have a look at the <a class="reference external" href="https://www.youtube.com/watch?v=nEr6T2pL8Es&t=248s">django-functest video about writing tests
interactively</a>, or <a class="reference external" href="https://www.youtube.com/watch?v=nEr6T2pL8Es&t=248s">skip
to the bit where I start using the REPL</a>.</p>
<section id="setup">
<h2>Setup</h2>
<p>You normally need IPython installed into your current virtualenv for it to work properly:</p>
<div class="code"><pre class="code shell"><a id="rest_code_a09afdfb28d8442db97b442e151e116c-1" name="rest_code_a09afdfb28d8442db97b442e151e116c-1" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_a09afdfb28d8442db97b442e151e116c-1"></a>pip install ipython
</pre></div>
<p>(See the “Tips and gotchas” section below if installing IPython is not possible.)</p>
</section>
<section id="methods">
<h2>Methods</h2>
<p>There are basically two ways I open an IPython prompt. The first is by running
it directly from a terminal:</p>
<div class="code"><pre class="code shell"><a id="rest_code_31d4c3d8a46749c0938d9794a8d52978-1" name="rest_code_31d4c3d8a46749c0938d9794a8d52978-1" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_31d4c3d8a46749c0938d9794a8d52978-1"></a>$ ipython
<a id="rest_code_31d4c3d8a46749c0938d9794a8d52978-2" name="rest_code_31d4c3d8a46749c0938d9794a8d52978-2" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_31d4c3d8a46749c0938d9794a8d52978-2"></a>Python <span class="m">3</span>.9.5 <span class="o">(</span>default, Jul <span class="m">1</span> <span class="m">2021</span>, <span class="m">11</span>:45:58<span class="o">)</span>
<a id="rest_code_31d4c3d8a46749c0938d9794a8d52978-3" name="rest_code_31d4c3d8a46749c0938d9794a8d52978-3" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_31d4c3d8a46749c0938d9794a8d52978-3"></a>Type <span class="s1">'copyright'</span>, <span class="s1">'credits'</span> or <span class="s1">'license'</span> <span class="k">for</span> more information
<a id="rest_code_31d4c3d8a46749c0938d9794a8d52978-4" name="rest_code_31d4c3d8a46749c0938d9794a8d52978-4" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_31d4c3d8a46749c0938d9794a8d52978-4"></a>IPython <span class="m">8</span>.3.0 -- An enhanced Interactive Python. Type <span class="s1">'?'</span> <span class="k">for</span> help.
<a id="rest_code_31d4c3d8a46749c0938d9794a8d52978-5" name="rest_code_31d4c3d8a46749c0938d9794a8d52978-5" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_31d4c3d8a46749c0938d9794a8d52978-5"></a>
<a id="rest_code_31d4c3d8a46749c0938d9794a8d52978-6" name="rest_code_31d4c3d8a46749c0938d9794a8d52978-6" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_31d4c3d8a46749c0938d9794a8d52978-6"></a>In <span class="o">[</span><span class="m">1</span><span class="o">]</span>:
</pre></div>
<p>In a Django project, <code class="docutils literal">./manage.py shell</code> can also be used if you have
IPython installed, with the advantage that it will properly initialise Django
for you.</p>
<p>This works fine if you want to explore writing some “top level” code – for
example, a new bit of functionality where the entry points have not been created
yet. However, most code I write is not like that. Most of the time I find
myself wanting to write code when I am already 10 levels of function calls
down – for example:</p>
<ul class="simple">
<li><p>I’m writing some view code in a Django application, which has a request
object – an object you could not easily recreate if you started from scratch
at an IPython prompt.</p></li>
<li><p>or, model layer code such as inside a <code class="docutils literal">save()</code> method that is itself being
called by some other code you have not written, like the Django admin or some
signal.</p></li>
<li><p>or, inside a test, where the setup code has already created a whole bunch of
things that are not available to you when you open IPython.</p></li>
</ul>
<p>For these cases, I use the second method:</p>
<ul>
<li><p>Find the bit of code I want to modify, explore or debug. This will often be my
own code, but could equally be a third party library. I’m always working in a
virtualenv, so even with third party libraries, “go to definition” in my
editor will take me straight to a writable copy of the code (apart from code
not written in Python).</p></li>
<li><p>Insert the code for an IPython prompt and save the file:</p>
<div class="code"><pre class="code python"><a id="rest_code_14d1708c210e41cebc15e1655a6c6be1-1" name="rest_code_14d1708c210e41cebc15e1655a6c6be1-1" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_14d1708c210e41cebc15e1655a6c6be1-1"></a><span class="kn">import</span> <span class="nn">IPython</span><span class="p">;</span> <span class="n">IPython</span><span class="o">.</span><span class="n">embed</span><span class="p">()</span>
</pre></div>
<p>I have this bound to a function key in my editor.</p>
<p>So the code might end up looking like this, if it was a Django view for example:</p>
<div class="code"><pre class="code python"><a id="rest_code_835632436da04103b820547d884470e0-1" name="rest_code_835632436da04103b820547d884470e0-1" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_835632436da04103b820547d884470e0-1"></a><span class="k">def</span> <span class="nf">contact_us</span><span class="p">(</span><span class="n">request</span><span class="p">):</span>
<a id="rest_code_835632436da04103b820547d884470e0-2" name="rest_code_835632436da04103b820547d884470e0-2" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_835632436da04103b820547d884470e0-2"></a>    <span class="k">if</span> <span class="n">request</span><span class="o">.</span><span class="n">method</span> <span class="o">==</span> <span class="s2">"POST"</span><span class="p">:</span>
<a id="rest_code_835632436da04103b820547d884470e0-3" name="rest_code_835632436da04103b820547d884470e0-3" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_835632436da04103b820547d884470e0-3"></a>        <span class="n">form</span> <span class="o">=</span> <span class="n">ContactUsForm</span><span class="p">(</span><span class="n">request</span><span class="o">.</span><span class="n">POST</span><span class="p">)</span>
<a id="rest_code_835632436da04103b820547d884470e0-4" name="rest_code_835632436da04103b820547d884470e0-4" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_835632436da04103b820547d884470e0-4"></a>        <span class="k">if</span> <span class="n">form</span><span class="o">.</span><span class="n">is_valid</span><span class="p">():</span>
<a id="rest_code_835632436da04103b820547d884470e0-5" name="rest_code_835632436da04103b820547d884470e0-5" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_835632436da04103b820547d884470e0-5"></a>            <span class="kn">import</span> <span class="nn">IPython</span><span class="p">;</span> <span class="n">IPython</span><span class="o">.</span><span class="n">embed</span><span class="p">()</span>
<a id="rest_code_835632436da04103b820547d884470e0-6" name="rest_code_835632436da04103b820547d884470e0-6" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_835632436da04103b820547d884470e0-6"></a>
<a id="rest_code_835632436da04103b820547d884470e0-7" name="rest_code_835632436da04103b820547d884470e0-7" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_835632436da04103b820547d884470e0-7"></a> <span class="c1"># …</span>
</pre></div>
<p>I sometimes put the snippet inside a new <code class="docutils literal">if</code> clause that I
add to catch a particular condition, especially when using this for debugging.</p>
</li>
<li><p>Trigger the code in the appropriate way. For the above case, it would involve
first running the Django development server in a terminal, then opening the
web page, filling out the form and pressing submit. For a test, it would be
running the specific test from a terminal. For command line apps it would be
running the app directly.</p></li>
<li><p>In the terminal, I will now find myself in the IPython REPL, and I can go
ahead and:</p>
<ul class="simple">
<li><p>work out what code I need to write</p></li>
<li><p>or debug the code that I’m confused about.</p></li>
</ul>
</li>
</ul>
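<p>The idea of guarding the prompt with a temporary <code class="docutils literal">if</code> clause can be sketched as follows. The function <code class="docutils literal">order_totals</code>, its data and the <code class="docutils literal">debug</code> flag are all hypothetical, and the sketch uses the standard library's <code class="docutils literal">code.interact</code> so that it runs without IPython installed; with IPython you would use the <code class="docutils literal">IPython.embed()</code> line shown above instead.</p>

```python
import code


def order_totals(orders, debug=False):
    """Return the total for each order, optionally pausing on suspicious ones."""
    totals = []
    for order in orders:
        total = order["quantity"] * order["unit_price"]
        if debug and total < 0:
            # Temporary debugging aid: only open a REPL for the odd case,
            # with `order`, `total` and `totals` available to inspect.
            code.interact(local=locals())
        totals.append(total)
    return totals


print(order_totals([{"quantity": 2, "unit_price": 5}, {"quantity": 1, "unit_price": 3}]))
```

<p>With <code class="docutils literal">debug=False</code>, or when no total is negative, the function runs straight through, so the prompt only interrupts you when the condition you care about actually occurs.</p>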
<p>Note that you can write and edit multi-line code at this REPL – it’s not quite as
comfortable as an editor, but it’s OK, and has good history support. There’s
much more to say about IPython and its features than I will cover here; you can
learn about them in <a class="reference external" href="https://ipython.readthedocs.io/en/stable/">the docs</a>.</p>
<p>For those with a background in other languages, it might also be worth pointing
out that a Python REPL is not a different thing from normal Python. Everything
you can do in normal Python, like defining functions and classes, is possible
right there in the REPL.</p>
<p>Once I’m done with my exploring, I can copy any useful snippets back from the
REPL into my real code, using the history to scan back through what I typed.</p>
</section>
<section id="advantages">
<h2>Advantages</h2>
<p>The advantages of this method are:</p>
<ol class="arabic">
<li><p>You can explore APIs and objects much more easily when you actually have the
object, rather than docs about the object, or what your editor’s
auto-complete tools believe to be true about the object. For example, what
attributes and methods are available on Django’s <a class="reference external" href="https://docs.djangoproject.com/en/stable/ref/request-response/#django.http.HttpRequest">HttpRequest</a>?
You don’t have to ensure you’ve got correct type annotations, and hope they
are complete, or make assumptions about what the values are – you’ve got the
object right there, you can inspect it, with extensive and correct tab
completion. You can actually call functions and see what they do.</p>
<p>For example, Django’s request object typically has a <code class="docutils literal">user</code> attribute which
is not part of the <code class="docutils literal">HttpRequest</code> definition, because of how it is added
later. It’s visible in a REPL though.</p>
</li>
<li><p>You can directly explore the state of the system. This can be a huge
advantage for both exploratory programming and debugging.</p>
<p>For debugging, <a class="reference external" href="https://docs.python.org/3/library/pdb.html">pdb</a> and
similar debugging tools and environments will often provide you with “the
state of the system”, and they are much better at being able to step through
multiple layers of code. But I often find that the power and comfort of an
IPython prompt is much nicer for exploring and finding solutions.</p>
</li>
</ol>
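<p>To illustrate the first point, here is a small sketch of why the live object can show more than the class definition does. <code class="docutils literal">FakeRequest</code>, <code class="docutils literal">attach_user</code> and <code class="docutils literal">public_attributes</code> are invented stand-ins, not Django APIs; the real <code class="docutils literal">user</code> attribute is added by Django's own machinery rather than by a function like this.</p>

```python
class FakeRequest:
    """Hypothetical stand-in for a request class; not Django's HttpRequest."""

    def __init__(self, method, path):
        self.method = method
        self.path = path


def attach_user(request, username):
    """Mimics code that bolts an extra attribute onto the request at runtime."""
    request.user = username
    return request


def public_attributes(obj):
    """List the non-underscore names that inspecting the live object reveals."""
    return sorted(name for name in dir(obj) if not name.startswith("_"))


request = attach_user(FakeRequest("POST", "/contact-us/"), "alice")
print(public_attributes(request))
```

<p>Nothing in the class body mentions <code class="docutils literal">user</code>, so a tool that only reads the definition cannot know about it, while <code class="docutils literal">dir()</code> on the actual object at a REPL can.</p>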
<p>The feel of this kind of environment is not quite as smooth as <a class="reference external" href="https://mikelevins.github.io/posts/2020-12-18-repl-driven/">REPL-driven
programming in Lisp</a>, but I still find
it hugely enjoyable and productive. Compared to many other methods, like
iterating on your code followed by manual or automated testing, it cuts the
latency of the feedback loop from seconds or minutes to milliseconds, and that
is huge.</p>
</section>
<section id="tips-and-gotchas">
<h2>Tips and gotchas</h2>
<ul>
<li><p>IPython has tons of cool features that will help you in a REPL environment,
like <a class="reference external" href="https://ipython.org/ipython-doc/3/config/extensions/autoreload.html">%autoreload</a>
(thanks <a class="reference external" href="https://twitter.com/be_haki">haki</a>), and many other cool <a class="reference external" href="https://ipython.readthedocs.io/en/stable/interactive/magics.html">magics</a>. You
should spend the time getting to know them!</p></li>
<li><p>In a multi-threaded (or multi-process) environment, IPython prompts won’t play
nice. Turn off multi-threading if possible, or otherwise ensure that you don’t
hit that gotcha.</p></li>
<li><p>If you do get messed up in a terminal, you may need to manually find the
processes to <a class="reference external" href="https://linuxconfig.org/how-to-kill-a-running-process-on-linux">kill</a> and do
<code class="docutils literal">reset</code> in your terminal.</p></li>
<li><p>With the Django development server:</p>
<ul class="simple">
<li><p>It’s multi-threaded by default, so either ensure that you don’t hit the view
code multiple times, or use <code class="docutils literal"><span class="pre">--nothreading</span></code>.</p></li>
<li><p>Beware of auto-reloading, which will mess you up if you are still in an
IPython prompt when it kicks in. Either use <code class="docutils literal"><span class="pre">--noreload</span></code> or just ensure
you exit IPython cleanly before doing anything that will trigger a reload.</p></li>
</ul>
</li>
<li><p>Beware of environments that capture standard input/output, because that
breaks this technique.</p></li>
<li><p>pytest captures standard input and breaks things by default. You can turn it
off using <code class="docutils literal"><span class="pre">-s</span></code>. Also if you are using <a class="reference external" href="https://pypi.org/project/pytest-xdist/">pytest-xdist</a> you should remember to do <code class="docutils literal"><span class="pre">-n0</span></code>
to turn off multiple processes.</p></li>
<li><p>When using <code class="docutils literal">IPython.embed()</code> there’s an <a class="reference external" href="https://github.com/ipython/ipython/issues/62">annoying bug involving closures and
undefined names</a> due to Python
limitations. It often shows itself when using generator expressions, but at
other times too. It can often be worked around by doing:</p>
<div class="code"><pre class="code python"><a id="rest_code_767e8bff824b4e3387d7f8d715a31258-1" name="rest_code_767e8bff824b4e3387d7f8d715a31258-1" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_767e8bff824b4e3387d7f8d715a31258-1"></a><span class="nb">globals</span><span class="p">()</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="nb">locals</span><span class="p">())</span>
</pre></div>
</li>
<li><p>If for some reason you can’t use IPython, but only have access to the standard
library, the one-liner you need to run a (basic) REPL at any point in your
code is this:</p>
<div class="code"><pre class="code python"><a id="rest_code_3dc9642855e5471f82def4e9c074bfbe-1" name="rest_code_3dc9642855e5471f82def4e9c074bfbe-1" href="https://lukeplant.me.uk/blog/posts/repl-python-programming-and-debugging-with-ipython/#rest_code_3dc9642855e5471f82def4e9c074bfbe-1"></a><span class="kn">import</span> <span class="nn">code</span><span class="p">;</span> <span class="n">code</span><span class="o">.</span><span class="n">interact</span><span class="p">(</span><span class="n">local</span><span class="o">=</span><span class="nb">locals</span><span class="p">())</span>
</pre></div>
</li>
</ul>
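<p>For the closures gotcha above, the following sketch shows what I understand to be the underlying Python behaviour, using plain <code class="docutils literal">exec</code> with separate globals and locals rather than IPython itself. The variable names are made up for the example, and whether this matches exactly what happens inside <code class="docutils literal">IPython.embed()</code> is an assumption.</p>

```python
source = "doubled = list(n * factor for n in range(3))"

# Separate globals and locals, similar to running code against a function's frame.
global_ns = {}
local_ns = {"factor": 2}

try:
    # The generator expression is its own scope, so it looks up `factor`
    # in the globals only and does not see the entry in local_ns.
    exec(source, global_ns, local_ns)
    failed = False
except NameError:
    failed = True

# The workaround from the tip above: copy the locals into the globals.
global_ns.update(local_ns)
exec(source, global_ns, local_ns)
result = local_ns["doubled"]

print(failed, result)
```

<p>The first <code class="docutils literal">exec</code> call fails with a <code class="docutils literal">NameError</code>, and the second succeeds once <code class="docutils literal">factor</code> is also visible as a global.</p>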
</section>
<section id="end">
<h2>End</h2>
<p>That’s it, I hope you found it useful. Do you have any other tips for using this
technique?</p>
</section>
<section id="links">
<h2>Links</h2>
<ul class="simple">
<li><p><a class="reference external" href="https://twitter.com/spookylukey/status/1521776101760057345">Discussion on Twitter</a></p></li>
</ul>
</section>A Django PAGNI: efficient bulk propertieshttps://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/2021-08-25T16:54:17+01:002021-08-25T16:54:17+01:00Luke Plant<p>When using Django database models and adding a calculated property of some kind, you should probably ensure it will be efficient in bulk even if that isn’t needed yet.</p><p>Adding to my “Probably Are Gonna Need It” list (started by my <a class="reference external" href="https://lukeplant.me.uk/blog/posts/yagni-exceptions/">YAGNI exceptions</a> post a few months back,
with follow ups by <a class="reference external" href="https://simonwillison.net/2021/Jul/1/pagnis/">Simon Willison</a> and <a class="reference external" href="https://jacobian.org/2021/jul/8/appsec-pagnis/">Jacob Kaplan-Moss</a>), this post is about a
pattern that often crops up in Django applications. It probably applies to many
other database-driven applications too, but I’m more confident about saying it
is a PAGNI in Django – namely, a situation where you are usually better off
taking the risk of doing the extra work up front.</p>
<p>To state it briefly:</p>
<blockquote>
<p>If you have a calculated property that relates to a Django model and
requires a database query (or other expensive work), consider making it
efficient in bulk even if you don't need it in bulk right now.</p>
</blockquote>
<section id="example-initial-requirement">
<h2>Example – initial requirement</h2>
<p>Suppose we are writing an internal task management app for our team. A
requirement comes along: the user’s dashboard page should have some text that
indicates how many “in progress” tasks they have.</p>
<p>We already have a <code class="docutils literal">Task</code> model associated with our <code class="docutils literal">User</code> model:</p>
<div class="code"><pre class="code python"><a id="rest_code_6839f8d517614cec8f763f9765bce317-1" name="rest_code_6839f8d517614cec8f763f9765bce317-1" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_6839f8d517614cec8f763f9765bce317-1"></a><span class="k">class</span> <span class="nc">User</span><span class="p">(</span><span class="n">AbstractBaseUser</span><span class="p">):</span>
<a id="rest_code_6839f8d517614cec8f763f9765bce317-2" name="rest_code_6839f8d517614cec8f763f9765bce317-2" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_6839f8d517614cec8f763f9765bce317-2"></a>    <span class="k">pass</span>
<a id="rest_code_6839f8d517614cec8f763f9765bce317-3" name="rest_code_6839f8d517614cec8f763f9765bce317-3" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_6839f8d517614cec8f763f9765bce317-3"></a>
<a id="rest_code_6839f8d517614cec8f763f9765bce317-4" name="rest_code_6839f8d517614cec8f763f9765bce317-4" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_6839f8d517614cec8f763f9765bce317-4"></a><span class="k">class</span> <span class="nc">Task</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Model</span><span class="p">):</span>
<a id="rest_code_6839f8d517614cec8f763f9765bce317-5" name="rest_code_6839f8d517614cec8f763f9765bce317-5" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_6839f8d517614cec8f763f9765bce317-5"></a>    <span class="n">assigned_to</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">ForeignKey</span><span class="p">(</span><span class="s1">'myproject.User'</span><span class="p">,</span> <span class="n">on_delete</span><span class="o">=</span><span class="n">models</span><span class="o">.</span><span class="n">CASCADE</span><span class="p">,</span> <span class="n">related_name</span><span class="o">=</span><span class="s1">'assigned_tasks'</span><span class="p">)</span>
<a id="rest_code_6839f8d517614cec8f763f9765bce317-6" name="rest_code_6839f8d517614cec8f763f9765bce317-6" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_6839f8d517614cec8f763f9765bce317-6"></a>    <span class="n">state</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">max_length</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">choices</span><span class="o">=</span><span class="p">[(</span><span class="s2">"ready_for_work"</span><span class="p">,</span> <span class="s2">"Ready for work"</span><span class="p">),</span> <span class="p">(</span><span class="s2">"in_progress"</span><span class="p">,</span> <span class="s2">"In progress"</span><span class="p">),</span> <span class="p">(</span><span class="s2">"done"</span><span class="p">,</span> <span class="s2">"Done"</span><span class="p">)])</span>
</pre></div>
<p>We might already have a custom QuerySet with an <code class="docutils literal">in_progress</code> method that does
the appropriate filtering:</p>
<div class="code"><pre class="code python"><a id="rest_code_69cd51cbdd7b4be3a35cb652acfb7ad2-1" name="rest_code_69cd51cbdd7b4be3a35cb652acfb7ad2-1" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_69cd51cbdd7b4be3a35cb652acfb7ad2-1"></a><span class="k">class</span> <span class="nc">TaskQuerySet</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">QuerySet</span><span class="p">):</span>
<a id="rest_code_69cd51cbdd7b4be3a35cb652acfb7ad2-2" name="rest_code_69cd51cbdd7b4be3a35cb652acfb7ad2-2" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_69cd51cbdd7b4be3a35cb652acfb7ad2-2"></a>    <span class="k">def</span> <span class="nf">in_progress</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<a id="rest_code_69cd51cbdd7b4be3a35cb652acfb7ad2-3" name="rest_code_69cd51cbdd7b4be3a35cb652acfb7ad2-3" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_69cd51cbdd7b4be3a35cb652acfb7ad2-3"></a>        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">state</span><span class="o">=</span><span class="s2">"in_progress"</span><span class="p">)</span> <span class="c1"># or perhaps something more complex</span>
</pre></div>
<p>We could then do the calculation with just the following code:</p>
<div class="code"><pre class="code python"><a id="rest_code_e1c89de5231d45b29d932d464de01715-1" name="rest_code_e1c89de5231d45b29d932d464de01715-1" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_e1c89de5231d45b29d932d464de01715-1"></a><span class="n">Task</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">in_progress</span><span class="p">()</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">assigned_to</span><span class="o">=</span><span class="n">user</span><span class="p">)</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
<a id="rest_code_e1c89de5231d45b29d932d464de01715-2" name="rest_code_e1c89de5231d45b29d932d464de01715-2" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_e1c89de5231d45b29d932d464de01715-2"></a>
<a id="rest_code_e1c89de5231d45b29d932d464de01715-3" name="rest_code_e1c89de5231d45b29d932d464de01715-3" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_e1c89de5231d45b29d932d464de01715-3"></a><span class="c1"># or, from a ``User`` instance it might be:</span>
<a id="rest_code_e1c89de5231d45b29d932d464de01715-4" name="rest_code_e1c89de5231d45b29d932d464de01715-4" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_e1c89de5231d45b29d932d464de01715-4"></a>
<a id="rest_code_e1c89de5231d45b29d932d464de01715-5" name="rest_code_e1c89de5231d45b29d932d464de01715-5" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_e1c89de5231d45b29d932d464de01715-5"></a><span class="n">user</span><span class="o">.</span><span class="n">assigned_tasks</span><span class="o">.</span><span class="n">in_progress</span><span class="p">()</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
</pre></div>
<p>At a SQL level, this is doing a simple <code class="docutils literal">SELECT COUNT()</code> with some filtering e.g.:</p>
<div class="code"><pre class="code sql"><a id="rest_code_bddc1639be0a4dd39ff6c408fe901ca2-1" name="rest_code_bddc1639be0a4dd39ff6c408fe901ca2-1" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_bddc1639be0a4dd39ff6c408fe901ca2-1"></a><span class="k">SELECT</span><span class="w"> </span><span class="k">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">myproject_tasks</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">user_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">123</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="k">state</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'in_progress'</span><span class="w"></span>
</pre></div>
<p>We should probably wrap that up in a method or property, which we could easily
add to our <code class="docutils literal">User</code> model like this:</p>
<div class="code"><pre class="code python"><a id="rest_code_dce3e983afb54dd48073ab8d843a1e0f-1" name="rest_code_dce3e983afb54dd48073ab8d843a1e0f-1" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_dce3e983afb54dd48073ab8d843a1e0f-1"></a><span class="k">class</span> <span class="nc">User</span><span class="p">(</span><span class="n">AbstractBaseUser</span><span class="p">):</span>
<a id="rest_code_dce3e983afb54dd48073ab8d843a1e0f-2" name="rest_code_dce3e983afb54dd48073ab8d843a1e0f-2" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_dce3e983afb54dd48073ab8d843a1e0f-2"></a>    <span class="o">...</span>
<a id="rest_code_dce3e983afb54dd48073ab8d843a1e0f-3" name="rest_code_dce3e983afb54dd48073ab8d843a1e0f-3" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_dce3e983afb54dd48073ab8d843a1e0f-3"></a>
<a id="rest_code_dce3e983afb54dd48073ab8d843a1e0f-4" name="rest_code_dce3e983afb54dd48073ab8d843a1e0f-4" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_dce3e983afb54dd48073ab8d843a1e0f-4"></a>    <span class="nd">@cached_property</span>
<a id="rest_code_dce3e983afb54dd48073ab8d843a1e0f-5" name="rest_code_dce3e983afb54dd48073ab8d843a1e0f-5" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_dce3e983afb54dd48073ab8d843a1e0f-5"></a>    <span class="k">def</span> <span class="nf">in_progress_tasks_count</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<a id="rest_code_dce3e983afb54dd48073ab8d843a1e0f-6" name="rest_code_dce3e983afb54dd48073ab8d843a1e0f-6" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_dce3e983afb54dd48073ab8d843a1e0f-6"></a>        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">assigned_tasks</span><span class="o">.</span><span class="n">in_progress</span><span class="p">()</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
</pre></div>
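<p>The snippet above presumably takes <code class="docutils literal">cached_property</code> from <code class="docutils literal">django.utils.functional</code>. As a rough sketch of what the decorator buys you, here is a plain-Python stand-in using the standard library's <code class="docutils literal">functools.cached_property</code>; this <code class="docutils literal">User</code> class, its <code class="docutils literal">count_calls</code> counter and the list of task states are hypothetical and only simulate the database query.</p>

```python
from functools import cached_property


class User:
    """Hypothetical plain-Python stand-in for the Django model."""

    def __init__(self, task_states):
        self._task_states = task_states
        self.count_calls = 0  # how many times the simulated query ran

    @cached_property
    def in_progress_tasks_count(self):
        self.count_calls += 1  # stands in for one SELECT COUNT(*) query
        return sum(1 for state in self._task_states if state == "in_progress")


user = User(["in_progress", "done", "in_progress"])
first = user.in_progress_tasks_count
second = user.in_progress_tasks_count  # served from the per-instance cache
print(first, second, user.count_calls)
```

<p>The value is cached per instance, so repeated access on one user costs a single query, but every separate user instance still pays for its own.</p>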
<p>You can now use this property in a template very easily:</p>
<div class="code"><pre class="code html+django"><a id="rest_code_2771e99d002f4db0b6f8aee8fe72f8c1-1" name="rest_code_2771e99d002f4db0b6f8aee8fe72f8c1-1" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_2771e99d002f4db0b6f8aee8fe72f8c1-1"></a><span class="p"><</span><span class="nt">p</span><span class="p">></span>Tasks in progress: <span class="cp">{{</span> <span class="nv">request.user.in_progress_tasks_count</span> <span class="cp">}}</span><span class="p"></</span><span class="nt">p</span><span class="p">></span>
</pre></div>
<p>You may or may not like putting properties on the user model like that, but the
code is simple and gets the job done, and seems to be fine. An alternative
structure would put this code in a utility function in the model layer
somewhere, which would require a bit more work to make the data available in our
template, but either way it doesn’t affect how this example will unfold.</p>
</section>
<section id="example-new-requirement">
<h2>Example – new requirement</h2>
<p>Some time later, a request comes up like this:</p>
<blockquote>
<p>You know that "in progress tasks count" on the user dashboard? Can we add
that as a column to the admin screen that shows the list of users?</p>
</blockquote>
<p>This sounds very simple – they just want a piece of information we already know
how to calculate to appear in another place. What could be easier?</p>
<p>If you are using the Django admin for the admin screen, and you coded the first
part as above, the solution could indeed be very simple to execute – as simple
as adding <code class="docutils literal">"in_progress_tasks_count"</code> to the <a class="reference external" href="https://docs.djangoproject.com/en/stable/ref/contrib/admin/#django.contrib.admin.ModelAdmin.list_display">list_display</a>
property – a five minute job maximum.</p>
<p>It will work fine. But you've hit the <a class="reference external" href="https://adamj.eu/tech/2020/09/01/django-and-the-n-plus-one-queries-problem/">dreaded N+1 queries problem</a>.</p>
<p>For each user instance displayed in the list, we will end up executing a
separate SQL query to get the task count. This is completely unnecessary – there
are multiple ways to do this much more efficiently in SQL:</p>
<ul>
<li><p>Using a SQL <code class="docutils literal">COUNT</code> with a sub-query added to the main user query.</p></li>
<li><p>Using a SQL <code class="docutils literal">COUNT FILTER</code> and a join added to the main user query.</p></li>
<li><p>Using a second query like this:</p>
<div class="code"><pre class="code sql"><a id="rest_code_0c4ba0c77f4f4e228f1ddc8b185eef07-1" name="rest_code_0c4ba0c77f4f4e228f1ddc8b185eef07-1" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_0c4ba0c77f4f4e228f1ddc8b185eef07-1"></a><span class="k">SELECT</span><span class="w"> </span><span class="n">assigned_to_id</span><span class="p">,</span><span class="w"> </span><span class="k">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">myproject_task</span><span class="w"></span>
<a id="rest_code_0c4ba0c77f4f4e228f1ddc8b185eef07-2" name="rest_code_0c4ba0c77f4f4e228f1ddc8b185eef07-2" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_0c4ba0c77f4f4e228f1ddc8b185eef07-2"></a><span class="k">WHERE</span><span class="w"> </span><span class="n">assigned_to_id</span><span class="w"> </span><span class="k">IN</span><span class="w"> </span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w"> </span><span class="mi">3</span><span class="p">)</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="k">state</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'in_progress'</span><span class="w"></span>
<a id="rest_code_0c4ba0c77f4f4e228f1ddc8b185eef07-3" name="rest_code_0c4ba0c77f4f4e228f1ddc8b185eef07-3" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_0c4ba0c77f4f4e228f1ddc8b185eef07-3"></a><span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">assigned_to_id</span><span class="p">;</span><span class="w"></span>
</pre></div>
<p>where our list of user ID values comes from the first query we executed.</p>
</li>
</ul>
<p>Instead of those, we’ve ended up with a very slow method that is going to hurt
us quite quickly in terms of performance. We might not notice the problem on our
development machines, but it will quickly add up in production.</p>
<aside class="admonition admonition-note">
<p class="admonition-title">Note</p>
<p>If you are using SQLite, which is <a class="reference external" href="https://sqlite.org/np1queryprob.html">pretty good at lots of small
queries</a>, this might not actually be
a problem.</p>
</aside>
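<p>To make the difference concrete, here is a minimal sketch using Python's built-in <code class="docutils literal">sqlite3</code> module rather than the Django ORM. The <code class="docutils literal">user</code> and <code class="docutils literal">task</code> tables, their sample rows and the <code class="docutils literal">run</code> helper are all made up for illustration. The helper records every statement, so the two strategies are compared by number of queries rather than by timing:</p>

```python
import sqlite3

# Hypothetical schema and data, standing in for the User and Task models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        assigned_to_id INTEGER REFERENCES user(id),
        state TEXT
    );
    INSERT INTO user (id, name) VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO task (assigned_to_id, state) VALUES
        (1, 'in_progress'), (1, 'in_progress'), (1, 'done'),
        (2, 'in_progress');
""")

executed = []  # every SQL statement issued through run()


def run(sql, params=()):
    """Execute a query, remembering that we did so."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def counts_n_plus_one():
    """One query for the users, then one more query per user."""
    users = run("SELECT id FROM user ORDER BY id")
    return {
        user_id: run(
            "SELECT COUNT(*) FROM task "
            "WHERE assigned_to_id = ? AND state = 'in_progress'",
            (user_id,),
        )[0][0]
        for (user_id,) in users
    }


def counts_bulk():
    """One query for the users, then a single grouped query for all counts."""
    users = run("SELECT id FROM user ORDER BY id")
    user_ids = [user_id for (user_id,) in users]
    placeholders = ", ".join("?" for _ in user_ids)
    rows = run(
        "SELECT assigned_to_id, COUNT(*) FROM task "
        f"WHERE assigned_to_id IN ({placeholders}) AND state = 'in_progress' "
        "GROUP BY assigned_to_id",
        user_ids,
    )
    counts = dict(rows)
    # Users with no matching tasks are missing from the grouped result.
    return {user_id: counts.get(user_id, 0) for user_id in user_ids}
```

<p>With the three sample users, <code class="docutils literal">counts_n_plus_one()</code> issues four queries and <code class="docutils literal">counts_bulk()</code> issues two, and both return the same counts. The first number grows with every extra row on the page; the second does not.</p>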
<p>Even if we do notice it, we’re going to have a problem doing it the right way at
this point:</p>
<ul>
<li><p>The weight of the existing code pushes us in the wrong direction. The easy
thing is slow.</p>
<p>Also remember that by the time we go to do this, in addition to
<code class="docutils literal">in_progress_tasks_count</code>, we might also have <code class="docutils literal">completed_tasks_count</code>,
<code class="docutils literal">deferred_tasks_count</code> etc.</p>
</li>
<li><p>The low estimate we probably made for this new requirement, either explicitly
or just internally in the time we’ve allowed for it, pushes us to find a quick
solution.</p></li>
<li><p>“One way to do it”, and “Once And Only Once” push against us implementing a
second way to do the same calculation.</p></li>
</ul>
<p>So the result will be:</p>
<ul class="simple">
<li><p>either the wrong way, which will be slow and contribute further to patterns
that will make us even slower in the future,</p></li>
<li><p>or, doing rework which will be an unexpected and unwelcome cost at this point.</p></li>
</ul>
</section>
<section id="implementation-tips">
<h2>Implementation tips</h2>
<p>So, if we want to do this in a bulk-efficient way, what are our options?</p>
<ol class="arabic">
<li><p>We can load our <code class="docutils literal">User</code> objects in a query with an annotation that does the
calculation in the database, as part of the main query.</p>
<p>For the case above, we could use a custom User QuerySet method something like:</p>
<div class="code"><pre class="code python"><a id="rest_code_94e5f667a8ab4fed8503693faa6676ad-1" name="rest_code_94e5f667a8ab4fed8503693faa6676ad-1" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_94e5f667a8ab4fed8503693faa6676ad-1"></a><span class="k">class</span> <span class="nc">UserQuerySet</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">QuerySet</span><span class="p">):</span>
<a id="rest_code_94e5f667a8ab4fed8503693faa6676ad-2" name="rest_code_94e5f667a8ab4fed8503693faa6676ad-2" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_94e5f667a8ab4fed8503693faa6676ad-2"></a>    <span class="k">def</span> <span class="nf">with_in_progress_tasks_count</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<a id="rest_code_94e5f667a8ab4fed8503693faa6676ad-3" name="rest_code_94e5f667a8ab4fed8503693faa6676ad-3" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_94e5f667a8ab4fed8503693faa6676ad-3"></a>        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">annotate</span><span class="p">(</span><span class="n">in_progress_tasks_count</span><span class="o">=</span><span class="n">models</span><span class="o">.</span><span class="n">Count</span><span class="p">(</span>
<a id="rest_code_94e5f667a8ab4fed8503693faa6676ad-4" name="rest_code_94e5f667a8ab4fed8503693faa6676ad-4" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_94e5f667a8ab4fed8503693faa6676ad-4"></a>            <span class="s1">'assigned_tasks'</span><span class="p">,</span>
<a id="rest_code_94e5f667a8ab4fed8503693faa6676ad-5" name="rest_code_94e5f667a8ab4fed8503693faa6676ad-5" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_94e5f667a8ab4fed8503693faa6676ad-5"></a>            <span class="nb">filter</span><span class="o">=</span><span class="n">models</span><span class="o">.</span><span class="n">Q</span><span class="p">(</span><span class="n">assigned_tasks__state</span><span class="o">=</span><span class="s1">'in_progress'</span><span class="p">)</span>
<a id="rest_code_94e5f667a8ab4fed8503693faa6676ad-6" name="rest_code_94e5f667a8ab4fed8503693faa6676ad-6" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_94e5f667a8ab4fed8503693faa6676ad-6"></a>        <span class="p">))</span>
</pre></div>
<p>This produces pretty nice SQL, although one downside of this code is that we
have duplicated some logic from our <code class="docutils literal">TaskQuerySet.in_progress</code> filter.</p>
<p>You then need to load your user objects with
<code class="docutils literal">User.objects.with_in_progress_tasks_count()</code>, which would require
overriding <code class="docutils literal">ModelAdmin.get_queryset</code> if you are using the Django admin for
example.</p>
<p>To keep everything happy, it is often useful to still define a property on
the model like this:</p>
<div class="code"><pre class="code python"><a id="rest_code_deb94ffd0f504398998a714f338e6330-1" name="rest_code_deb94ffd0f504398998a714f338e6330-1" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_deb94ffd0f504398998a714f338e6330-1"></a><span class="k">class</span> <span class="nc">User</span><span class="p">(</span><span class="n">AbstractBaseUser</span><span class="p">):</span>
<a id="rest_code_deb94ffd0f504398998a714f338e6330-2" name="rest_code_deb94ffd0f504398998a714f338e6330-2" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_deb94ffd0f504398998a714f338e6330-2"></a>
<a id="rest_code_deb94ffd0f504398998a714f338e6330-3" name="rest_code_deb94ffd0f504398998a714f338e6330-3" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_deb94ffd0f504398998a714f338e6330-3"></a>    <span class="nd">@cached_property</span>
<a id="rest_code_deb94ffd0f504398998a714f338e6330-4" name="rest_code_deb94ffd0f504398998a714f338e6330-4" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_deb94ffd0f504398998a714f338e6330-4"></a>    <span class="k">def</span> <span class="nf">in_progress_tasks_count</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">int</span><span class="p">:</span>
<a id="rest_code_deb94ffd0f504398998a714f338e6330-5" name="rest_code_deb94ffd0f504398998a714f338e6330-5" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_deb94ffd0f504398998a714f338e6330-5"></a>        <span class="k">raise</span> <span class="ne">AssertionError</span><span class="p">(</span><span class="s2">"Use User.objects.with_in_progress_tasks_count() if you want to use this"</span><span class="p">)</span>
</pre></div>
</li>
<li><p>We can have a separate query which does the count, and then decorates the
list of user objects with the value of <code class="docutils literal">in_progress_tasks_count</code>.</p>
<p>One of the advantages of this method is that the queries can be easier to
write, whether you are using the ORM or dropping down to raw SQL, and they
don’t complicate the main query at all, which can be a big bonus. Also,
sometimes it can be easier to re-use existing custom QuerySet methods this way.</p>
<p>One of the disadvantages is that it can be difficult to insert this extra bit
of work at the right point, especially if you are in the context of framework
code (like the Django admin or DRF), where the full QuerySet is built up and
evaluated outside of your control.</p>
<p>For this situation, in a number of projects I’ve started using a <a class="reference external" href="https://gist.github.com/spookylukey/8d1a4c73845d1ec86a875fd44b6bdc32">mechanism
for adding callbacks that run immediately after a QuerySet is evaluated</a>.</p>
<p>Usage in this case would look like this:</p>
<div class="code"><pre class="code python"><a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-1" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-1" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-1"></a><span class="k">class</span> <span class="nc">UserQuerySet</span><span class="p">(</span><span class="n">AfterFetchQuerySetMixin</span><span class="p">,</span> <span class="n">models</span><span class="o">.</span><span class="n">QuerySet</span><span class="p">):</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-2" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-2" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-2"></a>    <span class="k">def</span> <span class="nf">with_in_progress_tasks_count</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-3" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-3" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-3"></a>        <span class="k">def</span> <span class="nf">add_in_progress_tasks_count</span><span class="p">(</span><span class="n">user_list</span><span class="p">):</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-4" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-4" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-4"></a>            <span class="n">user_ids</span> <span class="o">=</span> <span class="p">[</span><span class="n">u</span><span class="o">.</span><span class="n">id</span> <span class="k">for</span> <span class="n">u</span> <span class="ow">in</span> <span class="n">user_list</span><span class="p">]</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-5" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-5" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-5"></a>            <span class="n">counts</span> <span class="o">=</span> <span class="p">(</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-6" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-6" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-6"></a>                <span class="n">Task</span><span class="o">.</span><span class="n">objects</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-7" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-7" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-7"></a>                <span class="o">.</span><span class="n">in_progress</span><span class="p">()</span><span class="o">.</span><span class="n">order_by</span><span class="p">()</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-8" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-8" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-8"></a>                <span class="o">.</span><span class="n">filter</span><span class="p">(</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-9" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-9" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-9"></a>                    <span class="n">assigned_to__id__in</span><span class="o">=</span><span class="n">user_ids</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-10" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-10" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-10"></a>                <span class="p">)</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-11" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-11" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-11"></a>                <span class="o">.</span><span class="n">values</span><span class="p">(</span><span class="s1">'assigned_to_id'</span><span class="p">)</span><span class="o">.</span><span class="n">annotate</span><span class="p">(</span><span class="n">count</span><span class="o">=</span><span class="n">models</span><span class="o">.</span><span class="n">Count</span><span class="p">(</span><span class="s1">'id'</span><span class="p">))</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-12" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-12" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-12"></a>                <span class="o">.</span><span class="n">values_list</span><span class="p">(</span><span class="s1">'assigned_to_id'</span><span class="p">,</span> <span class="s1">'count'</span><span class="p">)</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-13" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-13" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-13"></a>            <span class="p">)</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-14" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-14" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-14"></a>            <span class="n">counts_dict</span> <span class="o">=</span> <span class="p">{</span><span class="n">user_id</span><span class="p">:</span> <span class="n">c</span> <span class="k">for</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">counts</span><span class="p">}</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-15" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-15" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-15"></a>            <span class="c1"># Decorate user list</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-16" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-16" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-16"></a>            <span class="k">for</span> <span class="n">user</span> <span class="ow">in</span> <span class="n">user_list</span><span class="p">:</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-17" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-17" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-17"></a>                <span class="n">user</span><span class="o">.</span><span class="n">in_progress_tasks_count</span> <span class="o">=</span> <span class="n">counts_dict</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">user</span><span class="o">.</span><span class="n">id</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<a id="rest_code_a3907ecca43840dabbfbd30a1a9e5138-18" name="rest_code_a3907ecca43840dabbfbd30a1a9e5138-18" href="https://lukeplant.me.uk/blog/posts/django-pagni-efficient-bulk-properties/#rest_code_a3907ecca43840dabbfbd30a1a9e5138-18"></a>        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">register_after_fetch_callback</span><span class="p">(</span><span class="n">add_in_progress_tasks_count</span><span class="p">)</span>
</pre></div>
<p>This pattern has come up often enough for me that I’m wondering whether
something like this should be included in Django itself. As it stands, <code class="docutils literal">AfterFetchQuerySetMixin</code> has
to depend on some Django internals to work.</p>
</li>
</ol>
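<p>Both options rely on the same Python detail: a cached property defines no setter, so a value assigned to the instance under the same name takes precedence over it. The following is a minimal sketch of that behaviour, assuming plain Python objects in place of Django models, <code class="docutils literal">functools.cached_property</code> standing in for Django's <code class="docutils literal">cached_property</code>, and hard-coded <code class="docutils literal">(user_id, count)</code> rows standing in for the result of the grouped query:</p>

```python
from functools import cached_property


class User:
    """Plain-Python stand-in for the Django model (an assumption for this sketch)."""

    def __init__(self, id):
        self.id = id

    @cached_property
    def in_progress_tasks_count(self) -> int:
        # Only reached if nothing has set the value on the instance first.
        raise AssertionError(
            "Use User.objects.with_in_progress_tasks_count() if you want to use this"
        )


def add_in_progress_tasks_count(user_list, count_rows):
    """Decorate each user with its count, defaulting to 0 when no row exists."""
    counts_dict = dict(count_rows)
    for user in user_list:
        # Plain assignment stores the value in the instance __dict__,
        # which shadows the cached_property defined on the class.
        user.in_progress_tasks_count = counts_dict.get(user.id, 0)


users = [User(1), User(2), User(3)]
rows = [(1, 2), (2, 1)]  # hypothetical (assigned_to_id, count) query result
add_in_progress_tasks_count(users, rows)

for user in users:
    print(user.id, user.in_progress_tasks_count)
```

<p>A <code class="docutils literal">User</code> that has not been decorated still raises the <code class="docutils literal">AssertionError</code> on access, so forgetting to load objects the bulk-efficient way fails loudly instead of silently falling back to a query per row.</p>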
<p>For this kind of work, in general you might need to get good at using <a class="reference external" href="https://docs.djangoproject.com/en/stable/topics/db/aggregation/">Django’s
aggregation features</a>, which are not
the easiest in my opinion. <a class="reference external" href="https://hakibenita.com/django-group-by-sql">Haki Benita’s guide to Group By in Django</a> is invaluable!</p>
</section>
<section id="what-about-bulk-write-operations">
<h2>What about bulk write operations?</h2>
<p>Above we’ve addressed bulk reads, but what about doing bulk writes in a
similarly efficient manner? In my experience, this depends much more on the task
in hand. In many cases, going from operating on a single row to needing
operations to be efficient in bulk is less common. Often, even if we do need
bulk operations, we can afford to do it more slowly in a background task. Your
mileage may vary etc.</p>
</section>
<section id="discussion-why-yagni-fails">
<h2>Discussion – why YAGNI fails</h2>
<p><a class="reference external" href="https://martinfowler.com/bliki/Yagni.html">YAGNI</a> is based on a few main
observations, which I think are normally true:</p>
<ul class="simple">
<li><p>The time to develop a feature later is (approximately) the same as the time to
develop it earlier.</p></li>
<li><p>Life is full of surprises, and you might not need the feature later even when
you suspect you will.</p></li>
<li><p>Even if you can correctly guess what features will be needed eventually, there
is always an opportunity cost of delivering something before you need to
(“cost of delay” as Martin Fowler describes it). There is almost certainly an
endless list of features/improvements you do need, so if you have implemented
a feature that is not used (yet), then that’s a planning mistake that has cost
you in terms of features needed more urgently.</p></li>
<li><p>In addition to the cost of delay, Martin Fowler also points out the <strong>cost of
carry</strong> – the complexity burden you carry for having added something unneeded.</p></li>
</ul>
<p>Something counts as a YAGNI exception not if you just correctly predict the
future, but if implementing it before you need to ends up with lower costs
overall.</p>
<p>I’m claiming these arguments fail in this case, but why?</p>
<p>First, there is always <strong>some</strong> cost associated with re-work, and for a
relatively small feature like this the overheads are significant. In particular,
implementing the bulk-inefficient way then re-working and implementing the
bulk-efficient way is always going to take longer than just implementing the
bulk-efficient way, especially once you’ve added the desire to not repeat logic.</p>
<p>In this case, it’s true that you may not need the efficiency later on, but often
you do, and the bulk-efficient way also works fine for non-bulk usage, without
introducing that much complexity, and without a large opportunity cost because
it doesn’t take that long, especially if you set up the patterns at the
beginning.</p>
<p>In addition, the attitude of “I’ll just re-design when I need it” fails to take
into account some powerful forces:</p>
<ul>
<li><p>The disproportionate effect that existing code structure has on code that
follows it. Overwhelmingly, coders will try to make new code fit the existing
pattern, <a class="reference external" href="https://wiki.lesswrong.com/wiki/Chesterton%27s_Fence">which is not a bad instinct</a>, but isn’t always the
right thing.</p>
<p>Every bit of code you write is actually setting a fairly powerful precedent
that requires significant work to overcome.</p>
<p>This actually means that the “cost of carry” argument may go the other way – if
you establish a pattern of bulk-efficient operations, it makes it easier to
implement every other (unrelated) feature that might also need bulk-efficient
operations, whether from the beginning or later.</p>
</li>
<li><p>The pressures of time-management and estimates you’ve already made. It can be
very hard to revisit an estimate of 30 minutes for a simple task and say
“actually it’s going to take 2 days”.</p></li>
</ul>
<p>Up-front thinking about performance like this is not premature optimization –
these are not small nanosecond or microsecond differences, but milliseconds
that often add up to seconds, and the patterns you choose at the beginning will
have big consequences for your performance down the line.</p>
</section>Evolution of a Django Repository patternhttps://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/2020-11-20T19:56:31Z2020-11-20T19:56:31ZLuke Plant<ol class="arabic">
<li><p>First attempt - get product by primary key:</p>
<div class="code"><pre class="code python"><a id="rest_code_8d4d52bbff2e4f5cbf12419f29f4d235-1" name="rest_code_8d4d52bbff2e4f5cbf12419f29f4d235-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_8d4d52bbff2e4f5cbf12419f29f4d235-1"></a><span class="k">class</span> <span class="nc">ProductRepository</span><span class="p">:</span>
<a id="rest_code_8d4d52bbff2e4f5cbf12419f29f4d235-2" name="rest_code_8d4d52bbff2e4f5cbf12419f29f4d235-2" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_8d4d52bbff2e4f5cbf12419f29f4d235-2"></a>    <span class="k">def</span> <span class="nf">get_by_pk</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pk</span><span class="p">):</span>
<a id="rest_code_8d4d52bbff2e4f5cbf12419f29f4d235-3" name="rest_code_8d4d52bbff2e4f5cbf12419f29f4d235-3" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_8d4d52bbff2e4f5cbf12419f29f4d235-3"></a>        <span class="k">return</span> <span class="n">Product</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">pk</span><span class="o">=</span><span class="n">pk</span><span class="p">)</span>
</pre></div>
</li>
<li><p><code class="docutils literal">ProductRepository</code> is stateless, so use static methods. Usage now looks like:</p>
<div class="code"><pre class="code python"><a id="rest_code_e477873117ae4144b1ee126e6c335974-1" name="rest_code_e477873117ae4144b1ee126e6c335974-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_e477873117ae4144b1ee126e6c335974-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_pk</span><span class="p">(</span><span class="n">pk</span><span class="p">)</span>
</pre></div>
</li>
<li><p>It turns out I need a 'get by slug' too:</p>
<div class="code"><pre class="code python"><a id="rest_code_f7a8e12527f940ebb12919e0433ac0ba-1" name="rest_code_f7a8e12527f940ebb12919e0433ac0ba-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_f7a8e12527f940ebb12919e0433ac0ba-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_pk</span><span class="p">(</span><span class="n">pk</span><span class="p">)</span>
<a id="rest_code_f7a8e12527f940ebb12919e0433ac0ba-2" name="rest_code_f7a8e12527f940ebb12919e0433ac0ba-2" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_f7a8e12527f940ebb12919e0433ac0ba-2"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_slug</span><span class="p">(</span><span class="n">slug</span><span class="p">)</span>
</pre></div>
</li>
<li><p>In a web context, I need to limit according to user because
not all products are public yet:</p>
<div class="code"><pre class="code python"><a id="rest_code_a3a33587f133483d86be02e258c6f108-1" name="rest_code_a3a33587f133483d86be02e258c6f108-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_a3a33587f133483d86be02e258c6f108-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_pk</span><span class="p">(</span><span class="n">pk</span><span class="p">)</span>
<a id="rest_code_a3a33587f133483d86be02e258c6f108-2" name="rest_code_a3a33587f133483d86be02e258c6f108-2" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_a3a33587f133483d86be02e258c6f108-2"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_slug</span><span class="p">(</span><span class="n">slug</span><span class="p">)</span>
<a id="rest_code_a3a33587f133483d86be02e258c6f108-3" name="rest_code_a3a33587f133483d86be02e258c6f108-3" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_a3a33587f133483d86be02e258c6f108-3"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_pk_for_user</span><span class="p">(</span><span class="n">pk</span><span class="p">,</span> <span class="n">request</span><span class="o">.</span><span class="n">user</span><span class="p">)</span>
<a id="rest_code_a3a33587f133483d86be02e258c6f108-4" name="rest_code_a3a33587f133483d86be02e258c6f108-4" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_a3a33587f133483d86be02e258c6f108-4"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_slug_for_user</span><span class="p">(</span><span class="n">slug</span><span class="p">,</span> <span class="n">request</span><span class="o">.</span><span class="n">user</span><span class="p">)</span>
</pre></div>
</li>
<li><p>Need some list APIs as well as individual:</p>
<div class="code"><pre class="code python"><a id="rest_code_5ca2d569a7a0482c8160be4e867889b8-1" name="rest_code_5ca2d569a7a0482c8160be4e867889b8-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_5ca2d569a7a0482c8160be4e867889b8-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_all</span><span class="p">()</span>
</pre></div>
</li>
<li><p>And to limit by user sometimes:</p>
<div class="code"><pre class="code python"><a id="rest_code_864895ee642d41fd8cdd4e3730766b19-1" name="rest_code_864895ee642d41fd8cdd4e3730766b19-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_864895ee642d41fd8cdd4e3730766b19-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_all</span><span class="p">()</span>
<a id="rest_code_864895ee642d41fd8cdd4e3730766b19-2" name="rest_code_864895ee642d41fd8cdd4e3730766b19-2" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_864895ee642d41fd8cdd4e3730766b19-2"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_all_for_user</span><span class="p">(</span><span class="n">user</span><span class="p">)</span>
</pre></div>
</li>
<li><p>Need to limit to certain brands, for both list and individual. Now I've got:</p>
<div class="code"><pre class="code python"><a id="rest_code_3596111b07654a4685f4930bc61a053b-1" name="rest_code_3596111b07654a4685f4930bc61a053b-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_pk</span><span class="p">(</span><span class="n">pk</span><span class="p">)</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-2" name="rest_code_3596111b07654a4685f4930bc61a053b-2" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-2"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_slug</span><span class="p">(</span><span class="n">slug</span><span class="p">)</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-3" name="rest_code_3596111b07654a4685f4930bc61a053b-3" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-3"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_pk_for_user</span><span class="p">(</span><span class="n">pk</span><span class="p">,</span> <span class="n">user</span><span class="p">)</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-4" name="rest_code_3596111b07654a4685f4930bc61a053b-4" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-4"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_slug_for_user</span><span class="p">(</span><span class="n">slug</span><span class="p">,</span> <span class="n">user</span><span class="p">)</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-5" name="rest_code_3596111b07654a4685f4930bc61a053b-5" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-5"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_pk_for_brand</span><span class="p">(</span><span class="n">pk</span><span class="p">,</span> <span class="n">brand</span><span class="p">)</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-6" name="rest_code_3596111b07654a4685f4930bc61a053b-6" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-6"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_slug_for_brand</span><span class="p">(</span><span class="n">slug</span><span class="p">,</span> <span class="n">brand</span><span class="p">)</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-7" name="rest_code_3596111b07654a4685f4930bc61a053b-7" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-7"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_pk_for_user_for_brand</span><span class="p">(</span><span class="n">pk</span><span class="p">,</span> <span class="n">user</span><span class="p">,</span> <span class="n">brand</span><span class="p">)</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-8" name="rest_code_3596111b07654a4685f4930bc61a053b-8" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-8"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_by_slug_for_user_for_brand</span><span class="p">(</span><span class="n">slug</span><span class="p">,</span> <span class="n">user</span><span class="p">,</span> <span class="n">brand</span><span class="p">)</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-9" name="rest_code_3596111b07654a4685f4930bc61a053b-9" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-9"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_all</span><span class="p">()</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-10" name="rest_code_3596111b07654a4685f4930bc61a053b-10" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-10"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_all_for_user</span><span class="p">(</span><span class="n">user</span><span class="p">)</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-11" name="rest_code_3596111b07654a4685f4930bc61a053b-11" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-11"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_all_for_brand</span><span class="p">(</span><span class="n">brand</span><span class="p">)</span>
<a id="rest_code_3596111b07654a4685f4930bc61a053b-12" name="rest_code_3596111b07654a4685f4930bc61a053b-12" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3596111b07654a4685f4930bc61a053b-12"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_all_for_user_for_brand</span><span class="p">(</span><span class="n">user</span><span class="p">,</span> <span class="n">brand</span><span class="p">)</span>
</pre></div>
</li>
<li><p>Aargh! Refactor:</p>
<div class="code"><pre class="code python"><a id="rest_code_f47f819fa32e427db8138d23e25653d0-1" name="rest_code_f47f819fa32e427db8138d23e25653d0-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_f47f819fa32e427db8138d23e25653d0-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_one</span><span class="p">(</span><span class="n">pk</span><span class="o">=</span><span class="n">pk</span><span class="p">,</span> <span class="n">for_user</span><span class="o">=</span><span class="n">user</span><span class="p">,</span> <span class="n">brand</span><span class="o">=</span><span class="n">brand</span><span class="p">)</span> <span class="c1"># slug=slug also allowed</span>
<a id="rest_code_f47f819fa32e427db8138d23e25653d0-2" name="rest_code_f47f819fa32e427db8138d23e25653d0-2" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_f47f819fa32e427db8138d23e25653d0-2"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_many</span><span class="p">(</span><span class="n">for_user</span><span class="o">=</span><span class="n">user</span><span class="p">,</span> <span class="n">brand</span><span class="o">=</span><span class="n">brand</span><span class="p">)</span>
</pre></div>
</li>
<li><p>Need paging:</p>
<div class="code"><pre class="code python"><a id="rest_code_91e40a0104cc4d11855bd5477e36c8e2-1" name="rest_code_91e40a0104cc4d11855bd5477e36c8e2-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_91e40a0104cc4d11855bd5477e36c8e2-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_many</span><span class="p">(</span><span class="n">page</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">page_size</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span>
</pre></div>
</li>
<li><p>But have to specify ordering if paging is to work:</p>
<div class="code"><pre class="code python"><a id="rest_code_3c1b5eb4459f4305ba1c3ac394b28750-1" name="rest_code_3c1b5eb4459f4305ba1c3ac394b28750-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_3c1b5eb4459f4305ba1c3ac394b28750-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_many</span><span class="p">(</span><span class="n">ordering</span><span class="o">=</span><span class="s1">'name'</span><span class="p">,</span> <span class="n">page</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">page_size</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span>
</pre></div>
</li>
<li><p>Hmm, performance - sometimes I need to fetch other things at the same time:</p>
<div class="code"><pre class="code python"><a id="rest_code_79c808422e014824b23ad13f581b9819-1" name="rest_code_79c808422e014824b23ad13f581b9819-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_79c808422e014824b23ad13f581b9819-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_many</span><span class="p">(</span><span class="n">fetch_related</span><span class="o">=</span><span class="p">[</span><span class="s1">'brand'</span><span class="p">,</span> <span class="s1">'stock_info'</span><span class="p">])</span>
</pre></div>
</li>
<li><p>Hmm, my related things also need related things at the same time:</p>
<div class="code"><pre class="code python"><a id="rest_code_2a3d87cdff97499b85c97dcf7fc0bba4-1" name="rest_code_2a3d87cdff97499b85c97dcf7fc0bba4-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_2a3d87cdff97499b85c97dcf7fc0bba4-1"></a><span class="c1"># TODO fix this performance problem in the next release, honest!</span>
</pre></div>
</li>
<li><p>Extra flag needed to only show products that are in stock:</p>
<div class="code"><pre class="code python"><a id="rest_code_ea9e3b90d5d441fe82c19cae0688b34d-1" name="rest_code_ea9e3b90d5d441fe82c19cae0688b34d-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_ea9e3b90d5d441fe82c19cae0688b34d-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_many</span><span class="p">(</span><span class="n">in_stock</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</pre></div>
</li>
<li><p>Fetch the products in user's basket only:</p>
<div class="code"><pre class="code python"><a id="rest_code_59ee00a84a4249cf892bfe445398c3e7-1" name="rest_code_59ee00a84a4249cf892bfe445398c3e7-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_59ee00a84a4249cf892bfe445398c3e7-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_many</span><span class="p">(</span><span class="n">for_user</span><span class="o">=</span><span class="n">user</span><span class="p">,</span> <span class="n">in_basket_for</span><span class="o">=</span><span class="n">user</span><span class="p">)</span>
</pre></div>
</li>
<li><p>Hmm, I have a lot of parameters now:</p>
<div class="code"><pre class="code python"><a id="rest_code_c53f86a3d55741178549bdab426955fd-1" name="rest_code_c53f86a3d55741178549bdab426955fd-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-1"></a><span class="k">class</span> <span class="nc">ProductRepository</span><span class="p">:</span>
<a id="rest_code_c53f86a3d55741178549bdab426955fd-2" name="rest_code_c53f86a3d55741178549bdab426955fd-2" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-2"></a>    <span class="k">def</span> <span class="nf">get_many</span><span class="p">(</span>
<a id="rest_code_c53f86a3d55741178549bdab426955fd-3" name="rest_code_c53f86a3d55741178549bdab426955fd-3" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-3"></a>        <span class="n">for_user</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
<a id="rest_code_c53f86a3d55741178549bdab426955fd-4" name="rest_code_c53f86a3d55741178549bdab426955fd-4" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-4"></a>        <span class="n">fetch_related</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
<a id="rest_code_c53f86a3d55741178549bdab426955fd-5" name="rest_code_c53f86a3d55741178549bdab426955fd-5" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-5"></a>        <span class="n">ordering</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
<a id="rest_code_c53f86a3d55741178549bdab426955fd-6" name="rest_code_c53f86a3d55741178549bdab426955fd-6" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-6"></a>        <span class="n">page_size</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
<a id="rest_code_c53f86a3d55741178549bdab426955fd-7" name="rest_code_c53f86a3d55741178549bdab426955fd-7" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-7"></a>        <span class="n">page</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
<a id="rest_code_c53f86a3d55741178549bdab426955fd-8" name="rest_code_c53f86a3d55741178549bdab426955fd-8" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-8"></a>        <span class="n">brand</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
<a id="rest_code_c53f86a3d55741178549bdab426955fd-9" name="rest_code_c53f86a3d55741178549bdab426955fd-9" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-9"></a>        <span class="n">in_stock</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
<a id="rest_code_c53f86a3d55741178549bdab426955fd-10" name="rest_code_c53f86a3d55741178549bdab426955fd-10" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-10"></a>        <span class="n">in_basket_for</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
<a id="rest_code_c53f86a3d55741178549bdab426955fd-11" name="rest_code_c53f86a3d55741178549bdab426955fd-11" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_c53f86a3d55741178549bdab426955fd-11"></a>    <span class="p">):</span> <span class="o">...</span>  <span class="c1"># body omitted</span>
</pre></div>
</li>
<li><p>Idea 1 - <code class="docutils literal">Filter</code> object:</p>
<div class="code"><pre class="code python"><a id="rest_code_8a72ee6bad68424b95a06b02730567b1-1" name="rest_code_8a72ee6bad68424b95a06b02730567b1-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_8a72ee6bad68424b95a06b02730567b1-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_many</span><span class="p">(</span><span class="nb">filter</span><span class="o">=</span><span class="n">InStock</span><span class="p">())</span>
<a id="rest_code_8a72ee6bad68424b95a06b02730567b1-2" name="rest_code_8a72ee6bad68424b95a06b02730567b1-2" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_8a72ee6bad68424b95a06b02730567b1-2"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">get_many</span><span class="p">(</span><span class="nb">filter</span><span class="o">=</span><span class="n">InBasket</span><span class="p">(</span><span class="n">user</span><span class="p">))</span>
</pre></div>
</li>
</ol>
<ol class="arabic" start="16">
<li><p>Idea 2 - switch to a <a class="reference external" href="https://en.wikipedia.org/wiki/Fluent_interface">Fluent interface</a>:</p>
<div class="code"><pre class="code python"><a id="rest_code_691e28c7e12846ed826aa1b0933c0f5c-1" name="rest_code_691e28c7e12846ed826aa1b0933c0f5c-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_691e28c7e12846ed826aa1b0933c0f5c-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">for_user</span><span class="p">(</span><span class="n">user</span><span class="p">)</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">InStock</span><span class="p">())</span><span class="o">.</span><span class="n">fetch_related</span><span class="p">(</span><span class="s1">'brand'</span><span class="p">,</span> <span class="s1">'stock_info'</span><span class="p">)</span>
</pre></div>
</li>
<li><p>Advanced ordering:</p>
<div class="code"><pre class="code python"><a id="rest_code_179ca8e4876e4163b82d71192c2cdd1a-1" name="rest_code_179ca8e4876e4163b82d71192c2cdd1a-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_179ca8e4876e4163b82d71192c2cdd1a-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">for_user</span><span class="p">(</span><span class="n">user</span><span class="p">)</span><span class="o">.</span><span class="n">order</span><span class="p">(</span><span class="n">OrderBy</span><span class="p">(</span><span class="s1">'price'</span><span class="p">,</span> <span class="s1">'product.name'</span><span class="p">))</span>
</pre></div>
</li>
</ol>
<ol class="arabic" start="18">
<li><p>Finishing touches - <code class="docutils literal">[x:y]</code> slicing:</p>
<div class="code"><pre class="code python"><a id="rest_code_05f43a05fb01487e8987ee6df030e35e-1" name="rest_code_05f43a05fb01487e8987ee6df030e35e-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_05f43a05fb01487e8987ee6df030e35e-1"></a><span class="n">ProductRepository</span><span class="o">.</span><span class="n">for_user</span><span class="p">(</span><span class="n">user</span><span class="p">)[</span><span class="mi">0</span><span class="p">:</span><span class="mi">10</span><span class="p">]</span>
</pre></div>
</li>
<li><p>Enlightenment:</p>
<div class="code"><pre class="code python"><a id="rest_code_390a626e2ebe47e3896f1772d96c6c6a-1" name="rest_code_390a626e2ebe47e3896f1772d96c6c6a-1" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_390a626e2ebe47e3896f1772d96c6c6a-1"></a><span class="n">Product</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">for_user</span><span class="p">(</span><span class="n">user</span><span class="p">)</span>
<a id="rest_code_390a626e2ebe47e3896f1772d96c6c6a-2" name="rest_code_390a626e2ebe47e3896f1772d96c6c6a-2" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_390a626e2ebe47e3896f1772d96c6c6a-2"></a> <span class="o">.</span><span class="n">in_stock</span><span class="p">()</span>
<a id="rest_code_390a626e2ebe47e3896f1772d96c6c6a-3" name="rest_code_390a626e2ebe47e3896f1772d96c6c6a-3" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_390a626e2ebe47e3896f1772d96c6c6a-3"></a> <span class="o">.</span><span class="n">by_brand</span><span class="p">(</span><span class="n">brand</span><span class="p">)</span>
<a id="rest_code_390a626e2ebe47e3896f1772d96c6c6a-4" name="rest_code_390a626e2ebe47e3896f1772d96c6c6a-4" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_390a626e2ebe47e3896f1772d96c6c6a-4"></a> <span class="o">.</span><span class="n">order_by</span><span class="p">(</span><span class="s1">'price'</span><span class="p">,</span> <span class="s1">'product__name'</span><span class="p">)</span>
<a id="rest_code_390a626e2ebe47e3896f1772d96c6c6a-5" name="rest_code_390a626e2ebe47e3896f1772d96c6c6a-5" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_390a626e2ebe47e3896f1772d96c6c6a-5"></a> <span class="o">.</span><span class="n">select_related</span><span class="p">(</span><span class="s1">'brand'</span><span class="p">)</span>
<a id="rest_code_390a626e2ebe47e3896f1772d96c6c6a-6" name="rest_code_390a626e2ebe47e3896f1772d96c6c6a-6" href="https://lukeplant.me.uk/blog/posts/evolution-of-a-django-repository-pattern/#rest_code_390a626e2ebe47e3896f1772d96c6c6a-6"></a> <span class="p">[</span><span class="mi">0</span><span class="p">:</span><span class="mi">10</span><span class="p">]</span>
</pre></div>
</li>
</ol>
<section id="postscript">
<h2>Postscript</h2>
<p>For those who don't know the context, I'm suggesting you should just use <a class="reference external" href="https://spookylukey.github.io/django-views-the-right-way/thin-views.html#example-push-filtering-to-the-model-layer">Custom
QuerySets</a>
as your “service layer”, instead of a hand-coded repository pattern. See also
<a class="reference external" href="https://www.b-list.org/weblog/2020/mar/16/no-service/">Against service layers in Django</a>.</p>
<p>Also, it's worth noting that the evolution of QuerySets in Django itself wasn't
so different from some of these steps.</p>
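<p>As a rough, framework-free sketch of why the chainable style wins: each method returns a new query object, so filters, ordering and slicing compose without a growing parameter list. The <code class="docutils literal">Product</code> dataclass, the <code class="docutils literal">ProductQuery</code> class and the sample data are hypothetical names for illustration only. This is an eager, in-memory stand-in, not how Django implements QuerySets (which are lazy and backed by SQL); in Django the equivalent methods would live on a <code class="docutils literal">QuerySet</code> subclass, typically exposed with <code class="docutils literal">as_manager()</code>.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    brand: str
    price: float
    stock: int


class ProductQuery:
    """Hypothetical in-memory stand-in for a chainable query object."""

    def __init__(self, items):
        self._items = tuple(items)

    def _clone(self, items):
        # Every step returns a new object, so partial queries can be reused.
        return ProductQuery(items)

    def in_stock(self):
        return self._clone(p for p in self._items if p.stock > 0)

    def by_brand(self, brand):
        return self._clone(p for p in self._items if p.brand == brand)

    def order_by(self, *fields):
        return self._clone(
            sorted(self._items, key=lambda p: tuple(getattr(p, f) for f in fields))
        )

    def __getitem__(self, index):
        # Slices stay chainable; a plain index returns a single product.
        if isinstance(index, slice):
            return self._clone(self._items[index])
        return self._items[index]

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


# Made-up sample data for the sketch.
catalogue = ProductQuery([
    Product("Kettle", "Acme", 25.0, 3),
    Product("Toaster", "Acme", 20.0, 0),
    Product("Blender", "Acme", 40.0, 5),
    Product("Mixer", "Zenith", 15.0, 2),
])

cheapest_acme = catalogue.in_stock().by_brand("Acme").order_by("price", "name")[0:10]
names = [p.name for p in cheapest_acme]
```

<p>Adding a new kind of filter here means adding one small method, rather than another keyword argument that every caller of <code class="docutils literal">get_many</code> has to know about.</p>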
</section>
<h1>Test smarter, not harder</h1>
<p>Luke Plant, 2020-09-04T19:46:50+01:00, <a class="reference external" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/">https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/</a></p>
<p>Tips for winning the automated testing battle.</p>
<p>“Smarter, not harder” is a saying used in many contexts, but rowing is the
context I think I first heard it in, and I still associate it with rowing many
years later.</p>
<p>When you look at novice and more experienced rowing crews, it seems particularly
appropriate, because the primary difference is not the amount of effort that
goes in, nor even the strength of the rowers, but technique. Poor rowers still
finish a race absolutely exhausted, but they've moved at a fraction of the speed
of better crews. Sometimes the effort they put in actually slows the boat down.
They tend to make a lot of noise, splash a huge amount of water in every
direction, and pull a lot of faces. (I did a lot of all those things when I tried
rowing!)</p>
<p><a class="reference external" href="https://youtu.be/6V6va2RIdeE?t=4327">Expert crews, however, do none of these things</a>, because they don't make you go faster.
These rowers do a huge amount of training, and exercise massive amounts of
concentration, to ensure that every bit of the (very large) effort they put in
is actually contributing to speed.</p>
<p>The “smarter not harder” mindset is also essential for writing good automated
software tests.</p>
<p>It's in this context that religious devotion to things like <a class="reference external" href="https://en.wikipedia.org/wiki/Test-driven_development">TDD</a> can be really
unhelpful. For many religions, the more painful an activity, and the more you do
it, the more meritorious it is – and it may even atone for past misdeeds. If you
take that mindset with you into writing tests, you will do a rather bad job.</p>
<p>If writing tests is extremely painful, it may be a sign that something is wrong.
Huge and unnecessary quantities of tests are not meritorious; they are a massive
maintenance burden. Many of the things that make tests hard to write are also
going to make them hard (and therefore expensive) to maintain. I've seen far too
many examples where it looks like people have just sat back and accepted their
painful fate.</p>
<p>For example, good ol' Uncle Bob seems to have this attitude. He <a class="reference external" href="https://blog.cleancoder.com/uncle-bob/2017/01/11/TheDarkPath.html">wrote</a>:</p>
<blockquote>
<p>you’d better get used to writing lots and lots of tests, no matter what
language you are using!</p>
</blockquote>
<p><a class="reference external" href="https://www.hillelwayne.com/post/uncle-bob/">Don't listen to Uncle Bob!</a> (at
least, not on this subject).</p>
<p>“Test smarter, not harder” means:</p>
<ul>
<li><p>Only write necessary tests – specifically, tests whose estimated value is
greater than their estimated cost. This is a hard judgement call, of course,
but it does mean that at least some of the time you should be saying “it's not
worth it”. Some of the costs associated with tests are:</p>
<ul class="simple">
<li><p>the time taken to write them.</p></li>
<li><p>the time they add to the test suite on every run.</p></li>
<li><p>the time to maintain them - understand them, debug them, change them when
other things change.</p></li>
<li><p>the time lost every time they fail incorrectly - when the functionality works, but the
test fails.</p></li>
</ul>
<p>The value, on the other hand, is found in:</p>
<ul class="simple">
<li><p>catching regressions, and doing so at low cost with a quick feedback loop.</p></li>
<li><p>enabling fearless refactoring (which is a consequence of the above, but
distinct from it).</p></li>
<li><p>providing a starting point for making changes, including a form of
documentation for the existing desirable behaviour.</p></li>
</ul>
</li>
<li><p>Write your test code with the functions/methods/classes you wish existed, not
the ones you've been given. For example, don't write this:</p>
<div class="code"><pre class="code python"><a id="rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-1" name="rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-1" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-1"></a><span class="bp">self</span><span class="o">.</span><span class="n">driver</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">live_server_url</span> <span class="o">+</span> <span class="n">reverse</span><span class="p">(</span><span class="s2">"contact_form"</span><span class="p">))</span>
<a id="rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-2" name="rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-2" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-2"></a><span class="bp">self</span><span class="o">.</span><span class="n">driver</span><span class="o">.</span><span class="n">find_element_by_css_selector</span><span class="p">(</span><span class="s2">"#id_email"</span><span class="p">)</span><span class="o">.</span><span class="n">send_keys</span><span class="p">(</span><span class="s2">"my@email.com"</span><span class="p">)</span>
<a id="rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-3" name="rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-3" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-3"></a><span class="bp">self</span><span class="o">.</span><span class="n">driver</span><span class="o">.</span><span class="n">find_element_by_css_selector</span><span class="p">(</span><span class="s2">"#id_message"</span><span class="p">)</span><span class="o">.</span><span class="n">send_keys</span><span class="p">(</span><span class="s2">"Hello"</span><span class="p">)</span>
<a id="rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-4" name="rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-4" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-4"></a><span class="bp">self</span><span class="o">.</span><span class="n">driver</span><span class="o">.</span><span class="n">find_element_by_css_selector</span><span class="p">(</span><span class="s2">"input[type=submit]"</span><span class="p">)</span><span class="o">.</span><span class="n">click</span><span class="p">()</span>
<a id="rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-5" name="rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-5" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_6b39ba138bfe4e8aa6cc1891a24d3566-5"></a><span class="n">WebDriverWait</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">driver</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span><span class="o">.</span><span class="n">until</span><span class="p">(</span><span class="k">lambda</span> <span class="n">driver</span><span class="p">:</span> <span class="n">driver</span><span class="o">.</span><span class="n">find_element_by_css_selector</span><span class="p">(</span><span class="s2">"body"</span><span class="p">))</span>
</pre></div>
<p>That looks very tedious! Write this instead:</p>
<div class="code"><pre class="code python"><a id="rest_code_5f3b6ce2322d4588a3a0480918eb6072-1" name="rest_code_5f3b6ce2322d4588a3a0480918eb6072-1" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_5f3b6ce2322d4588a3a0480918eb6072-1"></a><span class="bp">self</span><span class="o">.</span><span class="n">get_url</span><span class="p">(</span><span class="s2">"contact_form"</span><span class="p">)</span>
<a id="rest_code_5f3b6ce2322d4588a3a0480918eb6072-2" name="rest_code_5f3b6ce2322d4588a3a0480918eb6072-2" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_5f3b6ce2322d4588a3a0480918eb6072-2"></a><span class="bp">self</span><span class="o">.</span><span class="n">fill</span><span class="p">({</span>
<a id="rest_code_5f3b6ce2322d4588a3a0480918eb6072-3" name="rest_code_5f3b6ce2322d4588a3a0480918eb6072-3" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_5f3b6ce2322d4588a3a0480918eb6072-3"></a> <span class="s2">"#id_email"</span><span class="p">:</span> <span class="s2">"my@email.com"</span><span class="p">,</span>
<a id="rest_code_5f3b6ce2322d4588a3a0480918eb6072-4" name="rest_code_5f3b6ce2322d4588a3a0480918eb6072-4" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_5f3b6ce2322d4588a3a0480918eb6072-4"></a> <span class="s2">"#id_message"</span><span class="p">:</span> <span class="s2">"Hello"</span><span class="p">,</span>
<a id="rest_code_5f3b6ce2322d4588a3a0480918eb6072-5" name="rest_code_5f3b6ce2322d4588a3a0480918eb6072-5" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_5f3b6ce2322d4588a3a0480918eb6072-5"></a><span class="p">})</span>
<a id="rest_code_5f3b6ce2322d4588a3a0480918eb6072-6" name="rest_code_5f3b6ce2322d4588a3a0480918eb6072-6" href="https://lukeplant.me.uk/blog/posts/test-smarter-not-harder/#rest_code_5f3b6ce2322d4588a3a0480918eb6072-6"></a><span class="bp">self</span><span class="o">.</span><span class="n">submit</span><span class="p">(</span><span class="s2">"input[type=submit]"</span><span class="p">)</span>
</pre></div>
<p>(You can do exactly this with <a class="reference external" href="https://django-functest.readthedocs.io/en/latest/">django-functest</a>, but it's the principle,
not the library, that's important. If the API you want to use doesn't exist
yet, use it anyway, and then make it exist.)</p>
</li>
<li><p>Don't write tests for things that can be more effectively tested in other
ways, and lean on other correctness methodologies as much as possible. These
include:</p>
<ul class="simple">
<li><p>code review</p></li>
<li><p>static type checking (especially in languages with sound and powerful type
systems, with type inference everywhere, giving you a very good cost-benefit
ratio)</p></li>
<li><p>linters like <a class="reference external" href="https://github.com/pycqa/flake8">flake8</a> and <a class="reference external" href="https://semgrep.dev/">Semgrep</a>.</p></li>
<li><p><a class="reference external" href="https://www.hillelwayne.com/post/business-case-formal-methods/">formal methods</a></p></li>
<li><p>introspection (like <a class="reference external" href="https://docs.djangoproject.com/en/stable/topics/checks/#module-django.core.checks">Django's checks framework</a>)</p></li>
<li><p>property-based testing with tools like <a class="reference external" href="https://hypothesis.readthedocs.io/en/latest/">hypothesis</a>.</p></li>
</ul>
</li>
<li><p>Move the burden onto the computer. “Push the loop in”.</p>
<p>Take, for example, a requirement that every entry point to your web app (i.e.
a page or HTTP API), apart from a few exceptions like login and reset
password, should require authentication.</p>
<p>The “test harder” religion interprets this as:</p>
<ul class="simple">
<li><p><em>For every entry point</em></p>
<ul>
<li><p>Write a test that</p>
<ul>
<li><p>Ensures non-authenticated requests return 403</p></li>
</ul>
</li>
</ul>
</li>
</ul>
<p>That's a lot of tests, and even worse is that you have to remember to write
them.</p>
<p>“Test smarter” says:</p>
<ul class="simple">
<li><p>Write a test that</p>
<ul>
<li><p><em>For every entry point</em></p>
<ul>
<li><p>Ensures non-authenticated requests return 403</p></li>
</ul>
</li>
</ul>
</li>
</ul>
<p>That's one test. “Write a test” is executed in developer time, so in the first
example the loop ("For every entry point") is also executed in developer time.
Push the loop inside the test, and it gets executed in computer time instead.</p>
<p>Already mentioned, but <a class="reference external" href="https://hypothesis.readthedocs.io/en/latest/">hypothesis</a> is a great way to push the
loop in. Also, the implementation of the requirements can benefit from the
same techniques that the tests do.</p>
</li>
<li><p>Cheat on your homework. It's smart to get help, and hard work is for suckers.
If you have a good idea, but don't know the techniques or tools you need to
implement it, or whether it is even possible (for example, in the example
above you don't know how to introspect your system to get a list of all entry
points), there are a lot of smart people on <a class="reference external" href="https://stackoverflow.com/">StackOverflow</a> who will revel in the challenge.</p>
<p>(Level up: loudly claim on Twitter that "it appears to be impossible to X with
tool Y" and know-it-alls like me will magically appear with solutions).</p>
</li>
</ul>
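<p>As a rough illustration of “pushing the loop in” for the authentication example above, here is a minimal sketch in plain Python. Everything in it is hypothetical: the <code class="docutils literal">ENTRY_POINTS</code> registry, the <code class="docutils literal">PUBLIC</code> exception list and the toy handlers stand in for whatever your framework gives you, and in a real project the registry would come from introspecting your URL configuration rather than being written by hand.</p>

```python
# Toy handlers: each takes a user (or None) and returns an HTTP status code.
def login_page(user):
    return 200


def reset_password(user):
    return 200


def dashboard(user):
    return 200 if user else 403


def api_orders(user):
    return 200 if user else 403


# Hypothetical registry of every entry point in the app.
ENTRY_POINTS = {
    "/login/": login_page,
    "/reset-password/": reset_password,
    "/dashboard/": dashboard,
    "/api/orders/": api_orders,
}

# The few documented exceptions that may be reached anonymously.
PUBLIC = {"/login/", "/reset-password/"}


def test_all_entry_points_require_authentication():
    # One test; the loop over entry points runs in computer time.
    for path, handler in ENTRY_POINTS.items():
        if path in PUBLIC:
            continue
        status = handler(user=None)
        assert status == 403, f"{path} is reachable without authentication"
```

<p>The point of the design is that a newly added entry point is covered automatically: if someone registers a handler that forgets the check, this one test fails without anyone having to remember to write a new one.</p>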
<p>Of course, there are still times when hard work is required for writing tests —
times when it will be tedious, and times when our instincts to skimp are
actually misplaced laziness that will cost more in the long run. But you should
hustle and cheat your way out of unnecessary effort as much as you possibly can.
Your overall testing strategy should feel like “I get that computer to do so
much work for me!”, not “My RSI and bleeding fingers have hopefully appeased the
testing gods and atoned for my previous omissions”.</p>
<section id="links">
<h2>Links</h2>
<ul class="simple">
<li><p><a class="reference external" href="https://www.reddit.com/r/programming/comments/imzawj/test_smarter_not_harder/">Discussion on this post on Reddit</a></p></li>
<li><p><a class="reference external" href="https://lobste.rs/s/hit4t9/test_smarter_not_harder">Discussion of this post on Lobsters</a></p></li>
</ul>
</section>Announcement: Django Views - The Right Wayhttps://lukeplant.me.uk/blog/posts/announcement-django-views-the-right-way/2020-08-19T21:51:36+01:002020-08-19T21:51:36+01:00Luke Plant<p>Announcement of my guide to writing Django Views.</p><p>I announced this a few days back on Twitter, this is just a quick additional
blog post to announce <a class="reference external" href="https://spookylukey.github.io/django-views-the-right-way/">Django Views - The Right Way</a>. It's an
opinionated guide to writing views in Django that I've been working on for a few
months.</p>
<p>This project turned out to be much bigger than I expected. And in the end, more
about general programming and Python principles than just Django – so you may
enjoy it even if you're not into Django.</p>Double-checked locking with Django ORMhttps://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/2020-02-28T16:14:28Z2020-02-28T16:14:28ZLuke Plant<p>How to implement the classic double-checked locking pattern with Django ORM/PostgreSQL.</p><p>The <a class="reference external" href="https://en.wikipedia.org/wiki/Double-checked_locking">double-checked locking</a> pattern is one that is
useful when:</p>
<ol class="arabic simple">
<li><p>You need to restrict access to a certain resource to stop simultaneous
processes from working on it at the same time.</p></li>
<li><p>The locking patterns available to you have significant costs.</p></li>
</ol>
<p>This post is about how we can implement this pattern in Django, using the ORM
and database-level locking features. The pattern also applies if you are not using
Django, or indeed any ORM, but I have only checked it for Django, and in fact
only really verified it works as expected using PostgreSQL.</p>
<section id="the-situation">
<h2>The situation</h2>
<p>You have some database records that require 'processing' of some kind, but need
to ensure that they are only processed once. For example, in your e-commerce
system, you might want to send an email to your users when their order is
shipped, but make sure you only send one. You've got something like this:</p>
<div class="code"><pre class="code python"><a id="rest_code_62cc32a6794945eea61326fddb3b5cd4-1" name="rest_code_62cc32a6794945eea61326fddb3b5cd4-1" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_62cc32a6794945eea61326fddb3b5cd4-1"></a><span class="k">class</span> <span class="nc">Order</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Model</span><span class="p">):</span>
<a id="rest_code_62cc32a6794945eea61326fddb3b5cd4-2" name="rest_code_62cc32a6794945eea61326fddb3b5cd4-2" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_62cc32a6794945eea61326fddb3b5cd4-2"></a>    <span class="n">shipped_at</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">DateTimeField</span><span class="p">(</span><span class="n">null</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<a id="rest_code_62cc32a6794945eea61326fddb3b5cd4-3" name="rest_code_62cc32a6794945eea61326fddb3b5cd4-3" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_62cc32a6794945eea61326fddb3b5cd4-3"></a>    <span class="n">shipped_email_sent</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">BooleanField</span><span class="p">(</span><span class="n">default</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</pre></div>
<p>Email sending may take a little time, or fail, so we have some kind of
background job to do this. We could use a job queue like Celery to do this and
ensure that only one process at a time attempts to do the sending, but maybe we
don't have a queue like that, or maybe we don't trust it to have been configured
correctly etc. Or, maybe we know we only have one process doing this task, but
need to ensure that other, different concurrent tasks that might lock the same
rows are not interfered with.</p>
<p>We're using <code class="docutils literal">shipped_email_sent</code> to track whether we've sent the emails or
not, but even if we filter on that, simultaneous processes launched at the same
moment could end up sending emails twice, due to the delay between querying and
updating the records. We could use <code class="docutils literal">select_for_update()</code>, but want to avoid
locking this important table any more than absolutely necessary. What should we
do?</p>
</section>
<section id="solution">
<h2>Solution</h2>
<p>I'll present my solution, and then explain afterwards. But the notes are
important – don't just copy-paste this!</p>
<div class="code"><pre class="code python"><span class="kn">from</span> <span class="nn">django.db</span> <span class="kn">import</span> <span class="n">transaction</span>


<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-1" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-1" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-1"></a><span class="k">def</span> <span class="nf">send_pending_order_shipped_emails</span><span class="p">():</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-2" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-2" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-2"></a>    <span class="n">orders_to_email</span> <span class="o">=</span> <span class="n">Order</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-3" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-3" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-3"></a>        <span class="n">shipped_at__isnull</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-4" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-4" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-4"></a>        <span class="n">shipped_email_sent</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-5" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-5" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-5"></a>    <span class="p">)</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-6" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-6" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-6"></a>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-7" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-7" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-7"></a>    <span class="k">for</span> <span class="n">order</span> <span class="ow">in</span> <span class="n">orders_to_email</span><span class="p">:</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-8" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-8" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-8"></a>        <span class="k">with</span> <span class="n">transaction</span><span class="o">.</span><span class="n">atomic</span><span class="p">():</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-9" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-9" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-9"></a>            <span class="k">for</span> <span class="n">order</span> <span class="ow">in</span> <span class="p">(</span><span class="n">orders_to_email</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-10" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-10" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-10"></a>                          <span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="nb">id</span><span class="o">=</span><span class="n">order</span><span class="o">.</span><span class="n">id</span><span class="p">)</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-11" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-11" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-11"></a>                          <span class="o">.</span><span class="n">select_for_update</span><span class="p">(</span><span class="n">of</span><span class="o">=</span><span class="p">(</span><span class="s1">'self'</span><span class="p">,),</span> <span class="n">skip_locked</span><span class="o">=</span><span class="kc">True</span><span class="p">)):</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-12" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-12" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-12"></a>                <span class="n">send_shipped_email</span><span class="p">(</span><span class="n">order</span><span class="p">)</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-13" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-13" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-13"></a>                <span class="n">order</span><span class="o">.</span><span class="n">shipped_email_sent</span> <span class="o">=</span> <span class="kc">True</span>
<a id="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-14" name="rest_code_b5479219e0dd42ed9773ea7816a3c1b8-14" href="https://lukeplant.me.uk/blog/posts/double-checked-locking-with-django-orm/#rest_code_b5479219e0dd42ed9773ea7816a3c1b8-14"></a>                <span class="n">order</span><span class="o">.</span><span class="n">save</span><span class="p">()</span>
</pre></div>
</section>
<section id="explanation">
<h2>Explanation</h2>
<ol class="arabic simple">
<li><p>This whole block of code should be executed <strong>outside</strong> an atomic block.</p></li>
<li><p>Note that the outer <code class="docutils literal">orders_to_email</code> block is therefore outside a
transaction, and uses no locking. So, if this query returns no results, the
whole process does just a single read query, without locking, and exits. This
is great for performance and avoiding contention on the DB.</p></li>
<li><p>If there are items to process, we <strong>then</strong> start an atomic transaction block.</p></li>
<li><p>Since the first query was outside a transaction, another process may have
beaten us to it, so we have to do another query to make sure the record still
matches the criteria – the inner query.</p></li>
<li><p>We add a <code class="docutils literal">.filter(id=order.id)</code> to the inner query, so the inner loop
will run at most once for each outer iteration.</p></li>
<li><p>The inner query also uses <code class="docutils literal">SELECT FOR UPDATE</code>, so that no other process
will be able to enter this block for the filtered record at the same time.</p></li>
<li><p>We use <code class="docutils literal">skip_locked</code> so that if someone else does beat us to it, we just
skip the record and try the next one.</p></li>
<li><p>After processing we set a flag that ensures that this record will no longer
be found by the <code class="docutils literal">orders_to_email</code> query.</p></li>
</ol>
<p>The result is that we guarantee each record gets processed at most once.</p>
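<p>If it helps to see the shape of the pattern without a database, here is a rough in-memory analogue using only the standard library. It is a sketch under loud assumptions, not the Django code: the <code class="docutils literal">Order</code> class, its <code class="docutils literal">row_lock</code> and the <code class="docutils literal">outbox</code> list are all hypothetical stand-ins, a <code class="docutils literal">threading.Lock</code> plays the role of the row lock, and a non-blocking acquire plays the role of <code class="docutils literal">skip_locked</code>.</p>

```python
import threading


class Order:
    """Hypothetical in-memory stand-in for a database row."""

    def __init__(self, order_id):
        self.id = order_id
        self.shipped_email_sent = False
        self.row_lock = threading.Lock()  # plays the role of the row lock


def send_pending_order_shipped_emails(orders, outbox):
    # First check: cheap and lock-free. If nothing is pending, we stop here.
    pending = [o for o in orders if not o.shipped_email_sent]
    for order in pending:
        # Non-blocking acquire is the analogue of skip_locked=True:
        # if someone else holds the lock, move on to the next order.
        if not order.row_lock.acquire(blocking=False):
            continue
        try:
            # Second check, now under the lock: another worker may have
            # processed this order since the first check.
            if not order.shipped_email_sent:
                outbox.append(order.id)  # stands in for send_shipped_email()
                order.shipped_email_sent = True
        finally:
            order.row_lock.release()


def run_workers(n_orders=20, n_workers=8):
    # Launch several competing workers over the same set of orders.
    orders = [Order(i) for i in range(n_orders)]
    outbox = []
    workers = [
        threading.Thread(
            target=send_pending_order_shipped_emails, args=(orders, outbox)
        )
        for _ in range(n_workers)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return outbox
```

<p>Calling <code class="docutils literal">run_workers()</code> should never put the same order id in the outbox twice, however the threads interleave. The analogy is loose: real row locks are tied to transactions and isolation levels, so this only shows the check, lock, re-check structure and says nothing about how PostgreSQL behaves.</p>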
</section>
<section id="notes">
<h2>Notes</h2>
<ul>
<li><p>I've only checked this with PostgreSQL and the default isolation level of
<code class="docutils literal">READ COMMITTED</code>.</p></li>
<li><p>Note the use of Django QuerySets: We define the correctly filtered query once,
and then we re-use it with chaining to execute it multiple times. We are relying
on the fact that the additional <code class="docutils literal">filter</code> etc. are creating a new,
unevaluated <code class="docutils literal">QuerySet</code>, which executes a new query when we loop over it with
the second for loop.</p></li>
<li><p>Make sure you read the notes for <a class="reference external" href="https://docs.djangoproject.com/en/stable/ref/models/querysets/#select-for-update">select_for_update</a>
and use the <code class="docutils literal">of</code> parameter as appropriate.</p></li>
<li><p>We guarantee “at most once” – <strong>but that allows the possibility of zero
times</strong>. If you have other, different processes that are also locking those
rows in the table (not just multiple copies of this process executing this same
code), then the <code class="docutils literal">skip_locked=True</code> flag means this process could exit
without having processed all the rows, and without any errors. If you don't
mind having to try multiple times to be sure, that should be OK. In other
words, this code is assuming “everyone else is more important than me.”</p>
<p>I think you could change this by using <code class="docutils literal">select_for_update(nowait=True)</code>
instead, combined with appropriate try/except/looping.</p>
<p>For trying multiple times, we could:</p>
<ol class="arabic simple">
<li><p>Leave that to the next time our background job attempts this, or</p></li>
<li><p>Do some counting inside the two loops, and if the inner loop comes up
short, we know that for some reason we skipped some rows (it could have
been because someone else already processed the row, or because someone
else locked the rows for a different reason). If so, recursively call
<code class="docutils literal">send_pending_order_shipped_emails</code>. This recursion will definitely
terminate when the <code class="docutils literal">orders_to_email</code> query comes up empty, or when we
succeed in processing everything in it.</p></li>
</ol>
</li>
<li><p>Performance note: we are doing N+1 read queries to process all the pending
records, plus N writes. You might need to be aware of that, compared to doing
1 read and 1 write if we did them all together and used some other mechanism
to ensure we didn't have multiple competing processes.</p></li>
<li><p>If you have multiple processes racing to process the pending records, the
above code will naturally distribute the work between them approximately
equally – you get work distribution for free.</p></li>
<li><p>I've tried to find ways to encapsulate this pattern more neatly in
Django/Python, like <code class="docutils literal">with double_checked_locking(queryset)</code> but so far have had
no luck at producing something materially better (like <a class="reference external" href="https://github.com/pinax/django-mailer/blob/80f3d3d18d19010e95d306c32532cf045060b801/src/mailer/engine.py#L43">this in django-mailer</a>,
which works OK but has an <a class="reference external" href="https://github.com/pinax/django-mailer/blob/80f3d3d18d19010e95d306c32532cf045060b801/src/mailer/engine.py#L181">awkward usage pattern</a>).
I think this pattern is better just written out every time, especially given some of the
issues above.</p></li>
<li><p>If your processing is idempotent, or you can arrange for that, then you may be
able to get away without any locking at all, and may not need this pattern.
You may need to be careful to use <a class="reference external" href="https://docs.djangoproject.com/en/stable/ref/models/querysets/#django.db.models.query.QuerySet.update">QuerySet.update</a>
rather than <code class="docutils literal">Model.save()</code>, to avoid race conditions with overwriting data
etc. (thanks <a class="reference external" href="https://twitter.com/be_haki">Haki</a>)</p></li>
</ul>
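<p>To make the last note a bit more concrete, here is a small sketch of the kind of overwrite it warns about. It uses <code class="docutils literal">sqlite3</code> from the standard library rather than Django, and the table, the <code class="docutils literal">address</code> column and the helper names are all made up for illustration. <code class="docutils literal">save_all_fields</code> mimics a plain <code class="docutils literal">Model.save()</code>, which by default writes every field from the in-memory copy, while <code class="docutils literal">update_flag_only</code> mimics a targeted <code class="docutils literal">QuerySet.update</code>.</p>

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, address TEXT, shipped_email_sent INTEGER)"
    )
    conn.execute("INSERT INTO orders VALUES (1, 'old address', 0)")
    return conn


def load(conn, order_id):
    row = conn.execute(
        "SELECT id, address, shipped_email_sent FROM orders WHERE id = ?",
        (order_id,),
    ).fetchone()
    return {"id": row[0], "address": row[1], "shipped_email_sent": row[2]}


def save_all_fields(conn, order):
    # Like a plain save(): writes every column from the in-memory copy.
    conn.execute(
        "UPDATE orders SET address = ?, shipped_email_sent = ? WHERE id = ?",
        (order["address"], order["shipped_email_sent"], order["id"]),
    )


def update_flag_only(conn, order):
    # Like a targeted update(): touches only the column we mean to change.
    conn.execute(
        "UPDATE orders SET shipped_email_sent = 1 WHERE id = ?", (order["id"],)
    )


def scenario(write_back):
    conn = make_db()
    order = load(conn, 1)  # our copy of the row, soon to be stale
    # Meanwhile, a different task changes the address.
    conn.execute("UPDATE orders SET address = 'new address' WHERE id = 1")
    order["shipped_email_sent"] = 1
    write_back(conn, order)
    return load(conn, 1)
```

<p>With <code class="docutils literal">scenario(save_all_fields)</code> the other task's address change is silently lost, because the stale value is written back along with the flag. With <code class="docutils literal">scenario(update_flag_only)</code> the new address survives and only the flag changes.</p>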
<p>Anything else? My understanding of how PostgreSQL isolation levels and
transactions work, along with my experiments (using good ol' <code class="docutils literal">time.sleep()</code>)
seem to confirm this works as per the description and notes above, but if I've
missed something please add a comment!</p>
</section>
<section id="updates">
<h2>Updates</h2>
<ul class="simple">
<li><p><a class="reference external" href="https://twitter.com/AdamChainz">Adam Johnson</a> has attempted a <a class="reference external" href="https://gist.github.com/adamchainz/51dad7990c073978f27a7e372cfb49db">refactor of
this idea</a>. It's
untested, but looks like it will work. I would still recommend reading all the
caveats above and adapting for your own usage rather than just dropping it
in - there are plenty of subtleties here, as noted.</p></li>
<li><p>Other <a class="reference external" href="https://twitter.com/spookylukey/status/1233409765474209792">discussion on Twitter</a>.</p></li>
<li><p>You might want to consider <a class="reference external" href="https://www.postgresql.org/docs/16/explicit-locking.html#ADVISORY-LOCKS">PostgreSQL advisory locks</a> as a different locking mechanism.</p></li>
</ul>
</section>