Luke Plant's home page (Posts about Web development)

Help my website is too small

2025-12-19T13:45:33Z

A jobs web site I belong to just emailed me, telling me that some of the links in my public profile on their site are “broken” and “thus have been removed”.

The evidence that these sites are broken? They are too small:

https://www.djangoproject.com/: response body too small (6220 bytes)

https://www.cciw.co.uk/: response body too small (3033 bytes)

The first is the home page of the Django web framework, and is, unsurprisingly, implemented using Django (see the djangoproject.com source code). The second is one of my own projects, and also implemented using Django (source also available for anyone who cares).

Checking in webdev tools on these sites gives very similar numbers to the above for the over-the-wire size of the initial HTML (though I get slightly higher figures), so this wasn’t a blip caused by downtime, as far as I can see.

Apparently, if your HTML is less than 7k, that obviously can’t be a real website, let alone something as ridiculously small as 3k. Even with compression turned up all the way, it’s clearly impossible to return more than an error message with less than at least 4k, right?

So please can Django get it sorted and add some bloat to their home page, and to their framework, and can someone also send me tips on bloating my own sites, so that my profile links can be counted as real websites? Thanks!

Keeping things in sync: derive vs test

2024-06-28T10:15:00+01:00

An extremely common problem in programming is that multiple parts of a program need to be kept in sync – they need to do exactly the same thing or behave in a consistent way. It is in response to this problem that we have mantras like “DRY” (Don’t Repeat Yourself), or, as I prefer it, OAOO, “Each and every declaration of behaviour should appear Once And Only Once”.

For both of these mantras, if you are faced with possible duplication of any kind, the answer is simply “just say no”. However, since programming mantras are to be understood as proverbs, not absolute laws, there are times that obeying this mantra can hurt more than it helps, so in this post I’m going to discuss other approaches.

Most of what I say is fairly language agnostic I think, but I’ve got specific tips for Python and web development.

The essential problem

To step back for a second, the essential problem that we are addressing here is that if making a change to a certain behaviour requires changing more than one place in the code, we have the risk that one will be forgotten. This results in bugs, which can be of various degrees of seriousness depending on the code in question.

To pick a concrete example, suppose we have a rule that says that items in a deleted folder get stored for 30 days, then expunged. We’re going to need some code that does the actual expunging after 30 days, but we’re also going to need to tell the user about the limit somewhere in the user interface. “Once And Only Once” says that the 30 days limit needs to be defined in a single place somewhere, and then reused.

There is a second kind of motivating example, which I think often crops up when people quote “Don’t Repeat Yourself”, and it’s really about avoiding tedious things from a developer perspective. Suppose you need to add an item to a menu, and you find out that first you’ve got to edit the MENU_ITEMS file to add an entry, then you’ve got to edit the MAIN_MENU constant to refer to the new entry, then you’ve got to define a keyboard shortcut in the MENU_SHORTCUTS file, then a menu icon somewhere else etc. All of these different places are in some way repeating things about how menus work. I think this is less important in general, but it is certainly life-draining as a developer if code is structured in this way, especially if it is difficult to discover or remember all the things that have to be done.

The ideal solution: derive

OAOO and DRY say that we aim to have a single place that defines the rule or logic, and any other place should be derived from this.

Regarding the simple example of a time limit displayed in the UI and used in the backend, this might be as simple as defining a constant e.g. in Python:

from datetime import timedelta

EXPUNGE_TIME_LIMIT = timedelta(days=30)

We then import and use this constant in both our UI and backend.

An important part of this approach is that the “deriving” process should be entirely automatic, not something that you can forget to do. In the case of a Python import statement, that is very easy to achieve, and relatively hard to get wrong – if you change the constant where it is defined in one module, any other code that uses it will pick up the change the next time the Python process is restarted.
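As a minimal sketch of what that might look like, both the backend rule and the UI text below are derived from the one constant. The function names `is_due_for_expunge` and `deleted_folder_notice` are hypothetical, made up for illustration:

```python
from datetime import datetime, timedelta

EXPUNGE_TIME_LIMIT = timedelta(days=30)


def is_due_for_expunge(deleted_at: datetime, now: datetime) -> bool:
    """Backend rule (hypothetical): expunge once the limit has passed."""
    return now - deleted_at >= EXPUNGE_TIME_LIMIT


def deleted_folder_notice() -> str:
    """UI text (hypothetical), built from the same constant."""
    return f"Items in this folder are expunged after {EXPUNGE_TIME_LIMIT.days} days."
```

Changing the constant to `timedelta(days=60)` would change both the expunging behaviour and the displayed text, with nothing else to remember.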

Alternative solution: test

By “test”, I mean ideally an automated test, but manual tests may also work if they are properly scripted. The idea is that you write a test that checks that the behaviour of the code is in sync. Often, one (or more) of the places that need the behaviour will define it using a constant as above, let’s say the “backend” code. Then, for another place, e.g. the UI, you would hard code “30 days” without using the constant, but have a test that uses the backend constant to build a string, and checks the UI for that string.
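A rough sketch of such a test, assuming a hypothetical `DELETED_FOLDER_HELP_TEXT` string standing in for wherever the UI text really lives (a template, a translation file and so on):

```python
from datetime import timedelta

EXPUNGE_TIME_LIMIT = timedelta(days=30)  # the backend constant

# Hypothetical UI text, hard coded on purpose rather than derived.
DELETED_FOLDER_HELP_TEXT = "Items in this folder are expunged after 30 days."


def test_ui_mentions_expunge_limit():
    # Build the expected string from the backend constant...
    expected = f"{EXPUNGE_TIME_LIMIT.days} days"
    # ...and check that the hard-coded UI text contains it.
    assert expected in DELETED_FOLDER_HELP_TEXT, (
        f"UI text should mention '{expected}' to match EXPUNGE_TIME_LIMIT"
    )
```

If someone changes the constant without updating the text (or the other way round), the test fails and says what to fix.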

Examples

In the example above, it might be hard to see why you want to use the fundamentally less reliable, less automatic method I’m suggesting. So I now have to show some motivating examples where the “derive” method ends up losing to the cruder, simpler alternative of “test”.

Example 1 - external data sources

My first example comes from the project I’m currently working on, which involves creating CAM files from input data. Most of the logic for that is driven using code, but there are some dimensions that are specified as data tables by the engineers of the physical product.

These data tables look something like below. The details here aren’t important, and I’ve changed them – it’s enough to know that we are creating some physical “widgets” which need to have specific dimensions specified:

Widgets have length 150mm unless specified below
Widget id	Location	Length (mm)
A	start	100
A	end	120
F	start	105
F	end	110

These tables are supplied at design-time rather than run-time i.e. they are bundled with the software and can’t be changed after the code is shipped. But it is still convenient to read them in automatically rather than simply duplicate the tables in my code by some process. So, for the body of the table, that’s exactly what my code does on startup – it reads the bundled XLSX/CSV files.

So we are obeying “derive” here — there is a single, canonical source of data, and anywhere that needs it derives it by an entirely automatic process.

But what about that “150mm” default value specified in the header of that table?

It would be possible to “derive” it by having a parser. Writing such a parser is not hard to do – for this kind of thing in Python I like parsy, and it is as simple as:

import parsy as P

default_length_parser = (
  P.string("Widgets have length ") >>
  P.regex(r"\d+").map(int)
  << P.string("mm unless specified below")
)

In fact I do something similar in some cases. But in reality, the “parser” here is pretty simplistic – it can’t deal with the real variety of English text that might be put into the sentence, and to claim I’m “deriving” it from the table is a bit of a stretch – I’m just matching a specific, known pattern. In addition, it’s probably not the case that any value for the default length would work – most likely if it was 10 times larger, there would be some other problem, and I’d want to do some manual checking.

So, let’s admit that we are really just checking for something expected, using the “test” approach. You can still define a constant that you use in most of the code:

DEFAULT_LENGTH_MM = 150

And then you test it is what you expect when you load the data file:

assert worksheets[0].cell(1, 1).value == f"Widgets have length {DEFAULT_LENGTH_MM}mm unless specified below"

So, I’ve achieved my aim: a guard against the original problem of having multiple sources of information that could potentially be out of sync. But I’ve done it using a simple test, rather than a more complex and fragile “derive” that wouldn’t have worked well anyway.

By the way, for this specific project – we’re looking for another contract developer! It’s a very worthwhile project, and one I’m really enjoying – a small flexible team, with plenty of problem solving and fun challenges, so if you’re a talented developer and interested give me a shout.

Example 2 - defining UI behaviour for domain objects

Suppose you have a database that stores information about some kind of entity, like customers say, and you have different types of customer, represented using an enum of some kind, perhaps a string enum like this in Python:

from enum import StrEnum


class CustomerType(StrEnum):
    ENTERPRISE = "Enterprise"
    SMALL_FRY = "Small fry"  # Let’s be honest! Try not to let the name leak…
    LEGACY = "Legacy"

We need a way to edit the different customer types, and they are sufficiently different that we want quite different interfaces. So, we might have a dictionary mapping the customer type to a function or class that defines the UI. If this were a Django project, it might be a different Form class for each type:

CUSTOMER_EDIT_FORMS = {
    CustomerType.ENTERPRISE: EnterpriseCustomerForm,
    CustomerType.SMALL_FRY: SmallFryCustomerForm,
    CustomerType.LEGACY: LegacyCustomerForm,
}

Now, the DRY instinct kicks in and we notice that we now have two things we have to remember to keep in sync — any addition to the customer enum requires a corresponding addition to the UI definition dictionary. Maybe there are multiple dictionaries like this.

We could attempt to solve this by “deriving”, or some “correct by construction” mechanism that puts the creation of a new customer type all in one place.

For example, maybe we’ll have a base Customer class with get_edit_form_class() as an abstractmethod, which means it is required to be implemented. If I fail to implement it in a subclass, I can’t even construct an instance of the new customer subclass – it will throw an error.

from abc import ABC, abstractmethod

# Inheriting from ABC is what makes @abstractmethod enforced at instantiation
class Customer(ABC):
    @abstractmethod
    def get_edit_form_class(self):
        pass


class EnterpriseCustomer(Customer):
    def get_edit_form_class(self):
        return EnterpriseCustomerForm

class LegacyCustomer(Customer):
    ...  # etc.

I still need my enum value, or at least a list of valid values that I can use for my database field. Maybe I could derive that automatically by looking at all the subclasses?

CUSTOMER_TYPES = [
    cls.__name__.upper().replace("CUSTOMER", "")
    for cls in Customer.__subclasses__()
]

Or maybe an __init_subclass__ trick, and I can perhaps also set up the various mappings I’ll need that way?
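Purely to illustrate where this road leads, such a trick might look something like the sketch below. The `edit_form` class keyword and the `edit_forms` registry are hypothetical names, and the form class is an empty stand-in:

```python
class Customer:
    # Hypothetical registry, filled in automatically as subclasses are defined.
    edit_forms: dict[str, type] = {}

    def __init_subclass__(cls, *, edit_form: type, **kwargs):
        super().__init_subclass__(**kwargs)
        # Derive the type name from the class name, e.g. "ENTERPRISE"
        type_name = cls.__name__.removesuffix("Customer").upper()
        Customer.edit_forms[type_name] = edit_form


class EnterpriseCustomerForm:  # stand-in for a real form class
    pass


# Leaving out edit_form= here raises TypeError when the class is defined.
class EnterpriseCustomer(Customer, edit_form=EnterpriseCustomerForm):
    pass
```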

It’s at this point you should stop and think. In addition to requiring you to mix UI concerns into the Customer class definitions, it’s getting complex and magical.

The alternative I’m suggesting is this: require manual syncing of the two parts of the code base, but add a test to ensure that you did it. All you need is a few lines after your CUSTOMER_EDIT_FORMS definition:

CUSTOMER_EDIT_FORMS = {
    # etc as before
}

for c_type in CustomerType:
    assert (
        c_type in CUSTOMER_EDIT_FORMS
    ), f"You've defined a new customer type {c_type}, you need to add an entry in CUSTOMER_EDIT_FORMS"

You could do this as a more traditional unit test in a separate file, but for simple things like this, I think an assertion right next to the code works much better. It really helps local reasoning to be able to look and immediately conclude “yes, I can see that this dictionary must be exhaustive because the assertion tells me so.” Plus you get really early failure – as soon as you import the code.

This kind of thing crops up a lot – if you create a class here, you’ve got to create another one over there, or add a dictionary entry etc. In these cases, I’m finding simple tests and assertions have a ton of advantages when compared to clever architectural contortions (or other things like advanced static typing gymnastics):

they are massively simpler to create and understand.
you can write your own error message in the assertion. If you make a habit of using really clear error messages, like the one above, your code base will literally tell you how to maintain it.
you can easily add things like exceptions. “Every Customer type needs an edit UI defined, except Legacy because they are read only” is an easy, small change to the above.
- This contrasts with cleverer mechanisms, which might require relaxing other constraints to the point where you defeat the whole point of the mechanism, or create more difficulties for yourself.
the rule about how the code works is very explicit, rather than implicit in some complicated code structure, and typically needs no comment other than what you write in the assertion message.
you express and enforce the rule, with any complexities it gains, in just one place. Ironically, if you try to enforce this kind of constraint using type systems or hierarchies to eliminate repetition or the need for any kind of code syncing, you may find that when you come to change the constraint it actually requires touching far more places.
temporarily silencing the assertion while developing is easy and doesn’t have far reaching consequences.

Of course, there are many times when being able to automatically derive things at the code level, including some complex relationships between parts of the code, can be a win, and it’s the kind of thing you can do in Python with its many powerful techniques.

But my point is that you should remember the alternative: “synchronise manually, and have a test to check you did it.” Being able to add any kind of executable code at module level – the same level as class/function/constant definitions – is a Python super-power that you should use.

Example 3 - external polymorphism and static typing

A variant of the above problem is when, instead of an enum defining different types, I’ve got a set of classes that all need some behaviour defined.

Often we just use polymorphism where a base class defines the methods or interfaces needed and sub-classes provide the implementation. However, as in the previous case, this can involve mixing concerns e.g. user interface code, possibly of several types, is mixed up with the base domain objects. It also imposes constraints on class hierarchies.

Recently for these kinds of cases, I’m more likely to prefer external polymorphism to avoid these problems. To give an example, in my current project I’m using the Command pattern or plan-execute pattern extensively, and it involves manipulating CAM objects using a series of command objects that look something like this:

from dataclasses import dataclass
from typing import TypeAlias


@dataclass
class DeleteFeature:
    feature_name: str


@dataclass
class SetParameter:
    param_name: str
    value: float


@dataclass
class SetTextSegment:
    text_name: str
    segment: int
    value: str


Command: TypeAlias = DeleteFeature | SetParameter | SetTextSegment

Note that none of them share a base class, but I do have a union type that gives me the complete set.

It’s much more convenient to define the behaviour associated with these separately from these definitions, and so I have multiple other places that deal with Command, such as the place that executes these commands and several others. One example that requires very little code to show is where I’m generating user-presentable tables that show groups of commands. I convert each of these Command objects into key-value pairs that are used for column headings and values:

def get_command_display(command: Command) -> tuple[str, str | float | bool]:
    match command:
        case DeleteFeature(feature_name=feature_name):
            return (f"Delete {feature_name}", True)
        case SetParameter(param_name=param_name, value=value):
            return (param_name, value)
        case SetTextSegment(text_name=text_name, segment=segment, value=value):
            return (f"{text_name}[{segment}]", value)

This is giving me a similar problem to the one I had before: if I add a new Command, I have to remember to add the new branch to get_command_display.

I could split out get_command_display into a dictionary of functions, and apply the same technique as in the previous example, but it’s more work, a less natural fit for the problem and potentially less flexible.

Instead, all I need to do is add exhaustiveness checking with one more branch:

match command:
    ...  # etc
    case _:
        assert_never(command)

Now, pyright will check that I didn’t forget to add branches here for any new Command. The error message is not controllable, in contrast to hand-written asserts, but it is clear enough.

The theme here is that additions in one part of the code require synchronised additions in other parts of the code, rather than being automatically correct “by construction”, but you have something that tests you didn’t forget.

Example 4 - generated code

In web development, ensuring consistent design and keeping different things in sync is a significant problem. There are many approaches, but let’s start with the simple case of using a single CSS stylesheet to define all the styles.

We may want a bunch of components to have a consistent border colour, and a first attempt might look like this (ignoring the many issues of naming conventions here):

.card-component, .bordered-heading {
   border-color: #800;
}

This often becomes impractical when we want to organise by component, rather than by property, which introduces duplication:

.card-component {
   border-color: #800;
}

/* somewhere far away … */

.bordered-heading {
   border-color: #800;
}

Thankfully, CSS has variables, so the first application of “derive” is straightforward – we define a variable which we can use in multiple places:

:root {
    --primary-border-color: #800;
}

/* elsewhere */

.bordered-heading {
    border-bottom: 1px solid var(--primary-border-color);
}

However, as the project grows, we may find that we want to use the same variables in different contexts where CSS isn’t applicable. So the next step at this point is typically to move to Design Tokens.

Practically speaking, this might mean that we now have our variables defined in a separate JSON file. Maybe something like this (using a W3C draft spec):

{
  "primary-border-color": {
    "$value": "#800000",
    "$type": "color"
  },
  "primary-highlight-color": {
    "$value": "#FBC100",
    "$type": "color"
  }
}

From this, we can automatically generate CSS fragments that contain the same variables quite easily – for simple cases, this isn’t more than a 50 line Python script.
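A minimal sketch of such a script, assuming a flat tokens file like the one above. It ignores token groups, aliases and anything other than plain `$value` strings, and the function name `tokens_to_css` is made up for illustration:

```python
import json


def tokens_to_css(tokens_json: str) -> str:
    """Turn a flat design tokens document into a CSS :root block."""
    tokens = json.loads(tokens_json)
    lines = [":root {"]
    for name, token in tokens.items():
        # Each token becomes one CSS custom property
        lines.append(f"    --{name}: {token['$value']};")
    lines.append("}")
    return "\n".join(lines)


EXAMPLE_TOKENS = """
{
  "primary-border-color": {"$value": "#800000", "$type": "color"},
  "primary-highlight-color": {"$value": "#FBC100", "$type": "color"}
}
"""

print(tokens_to_css(EXAMPLE_TOKENS))
```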

However, we’ve got some choices when it comes to how we put everything together. I think the general assumption in web development world is that a fully automatic “derive” is the only acceptable answer. This typically means you have to put your own CSS in a separate file, and then you have a build tool that watches for changes, and compiles your CSS plus the generated CSS into the final output that gets sent to the browser.

In addition, once you’ve bought into these kinds of tools you’ll find they want to make extensive changes to the output, and define more and more extensions to the underlying languages. For example, postcss-design-tokens wants you to write things like:

.foo {
    color: design-token('color.background.primary');
}

And instead of using CSS variables in the output, it puts the value of the token right in to every place in your code that uses it.

This approach has various problems, in particular that you become more and more dependent on the build process, and the output gets further from your input. You can no longer use the Dev Tools built in to your browser to do editing – the flow of using Dev Tools to experiment with changing a single spacing or colour CSS variable for global changes is broken, you need your build tool. And you can’t easily copy changes from Dev Tools back into the source, because of the transformation step, and debugging can be similarly difficult. And then, you’ll probably want special IDE support for the special CSS extensions, rather than being able to lean on your editor simply understanding CSS, and any other tools that want to look at your CSS now need support etc.

It’s also a lot of extra infrastructure and complexity to solve this one problem, especially when our design tokens JSON file is probably not going to change that often, or is going to have long periods of high stability. There are good reasons to want to be essentially build free. The current state of the art in this space is that to get your build tool to compile your CSS you add import './styles.css' in your entry point Javascript file! What if I don’t even have a Javascript file? I think I understand how this sort of thing came about, but don’t try to tell me that it’s anything less than completely bonkers.

Do we have an alternative to the fully automatic derive?

Using the “test” approach, we do. We can even stick with our single CSS file – we just write it like this:

/* DESIGN TOKENS START */
/* auto-created block - do not edit */
:root {
    --primary-border-color: #800000;
    --primary-highlight-color: #FBC100;
}
/* DESIGN TOKENS END */

/* the rest of our CSS here */

The contents of this block will almost certainly be auto-generated. We won’t have a process that fully automatically updates it, however, because this is the same file where we are putting our custom CSS, and we don’t want any possibility of lost work due to the file being overwritten as we are editing it.

On the other hand we don’t want things to get out of sync, so we’ll add a test that checks whether the current styles.css contains the block of design tokens that we expect to be there, based on the JSON. For actually updating the block, we’ll need some kind of manual step – maybe a script that can find and update the DESIGN TOKENS START block, maybe cog, which is a perfect little tool for this use case, or we could just copy-paste.
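The check itself could be sketched roughly as follows. This assumes flat tokens and exactly the block layout shown above; `check_tokens_in_sync` and `expected_block` are hypothetical names:

```python
import json
import re

# Captures everything between the START and END marker comments.
BLOCK_RE = re.compile(
    r"/\* DESIGN TOKENS START \*/\n(.*?)/\* DESIGN TOKENS END \*/", re.DOTALL
)


def expected_block(tokens: dict) -> str:
    """Build the block contents we expect to find, from the parsed JSON."""
    lines = ["/* auto-created block - do not edit */", ":root {"]
    lines += [f"    --{name}: {token['$value']};" for name, token in tokens.items()]
    lines.append("}")
    return "\n".join(lines) + "\n"


def check_tokens_in_sync(css_text: str, tokens_json: str) -> None:
    found = BLOCK_RE.search(css_text)
    assert found, "styles.css has no DESIGN TOKENS block"
    assert found.group(1) == expected_block(json.loads(tokens_json)), (
        "Design tokens block in styles.css is out of date, "
        "regenerate it from the JSON file"
    )
```

In a real test you would pass in the contents of styles.css and of the tokens JSON file, read from wherever they live in your project.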

There are also slightly simpler solutions in this case, like using a CSS import if you don’t mind having multiple CSS files.

Conclusion

For all the examples above, the solutions I’ve presented might not work perfectly for your context. You might also want to draw the line at a different place to me. But my main point is that we don’t have to go all the way with a fully automatic derive solution to eliminate any manual syncing. Having some manual work plus a mechanism to test that two things are in sync is a perfectly legitimate solution, and it can avoid some of the large costs that come with structuring everything around “derive”.

Enforcing conventions in Django projects with introspection

2024-04-01T16:05:03+01:00

Naming conventions can make a big difference to maintenance issues in software projects. This post is about how we can use the great introspection capabilities in Python to help enforce naming conventions in Django projects.

Let’s start with an example problem and the naming convention we’re going to use to solve it. There are many other applications of the techniques here, but it helps to have something concrete.

The problem: DateField and DateTimeField confusion

Over several projects I’ve found that inconsistent or bad naming of DateField and DateTimeField fields can cause various problems.

First, poor naming means that you can confuse them for each other, and this can easily trip you up. In Python, datetime is a subclass of date, so if you use a field called created_date assuming it holds a date when it actually holds a datetime, it might not be obvious initially that you are mishandling the value, but you’ll often have subtle problems down the line.
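A small illustration of the subclass relationship, using only the standard library. The helper name `kind_of` is made up for this sketch:

```python
from datetime import date, datetime

# Misleadingly named: this is really a datetime, not a date.
created_date = datetime(2024, 4, 1, 16, 5)

# This check passes, because datetime is a subclass of date,
# so it tells you nothing about which of the two you have.
assert isinstance(created_date, date)


def kind_of(value: date) -> str:
    # To tell them apart, the datetime check has to come first.
    if isinstance(value, datetime):
        return "datetime"
    return "date"
```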

Second, sometimes you have a field named like expired which is actually the timestamp of when the record expired, but it could easily be confused for a boolean field.

Third, not having a strong convention, or having multiple conventions, leads to unnecessary time wasted on decisions that could have been made once.

Finally, inconsistency in naming is just confusing and ugly for developers, and often for users further down the line, because names tend to leak.

Even if you do have an established convention, it’s possible for people not to know. It’s also very easy for people to change a field’s type between date and datetime without also changing the name. So merely having the convention is not enough, it needs to be enforced.

For this specific example, the convention I quite like is:

field names should end with _at for timestamp fields that use DateTimeField, like expires_at or deleted_at.
field names should end with _on or _date for fields that use DateField, like issued_on or birth_date.

This is based on the English grammar rule that we use “on” for dates but “at” for times – “on the 25th March”, but “at 7:00 pm” – and conveniently it also needs very few letters and tends to read well in code. The _date suffix is also helpful in various contexts where _on seems very unnatural. You might want different conventions, of course.

To get our convention to be enforced with automated checks we need a few tools.

The tools

Introspection

Introspection means the ability to use code to inspect code, and typically we’re talking about doing this when our code is already running, from within the same program and using the same programming language.

In Python, this starts from simple things like isinstance() and type() to check the type of an object, to things like hasattr() to check for the presence of attributes and many other more advanced techniques, including the inspect module and many of the metaprogramming dunder methods.

Django app and model introspection

Django is just Python, so you can use all normal Python introspection techniques. In addition, there is a formally documented and supported set of functions and methods for introspecting Django apps and models, such as the apps module and the Model _meta API.

Django checks framework

The third main tool we’re going to use in this solution is Django’s system checks framework, which allows us to run certain kinds of checks, at both “warning” and “error” level. This is the least important tool, and we could in fact switch it out for something else like a unit test.

The solution

It’s easiest to present the code, and then discuss it:

from django.apps import AppConfig, apps
from django.conf import settings
from django.core.checks import Tags, Warning, register


@register()
def check_date_fields(app_configs, **kwargs):
    exceptions = [
        # This field is provided by Django's AbstractBaseUser, we don't control it
        # and we’ll break things if we change it:
        "accounts.User.last_login",
    ]
    from django.db.models import DateField, DateTimeField

    errors = []
    for field in get_first_party_fields():
        field_name = field.name
        model = field.model

        if f"{model._meta.app_label}.{model.__name__}.{field_name}" in exceptions:
            continue

        # Order of checks here is important, because DateTimeField inherits from DateField

        if isinstance(field, DateTimeField):
            if not field_name.endswith("_at"):
                errors.append(
                    Warning(
                        f"{model.__name__}.{field_name} field expected to end with `_at`, "
                        + "or be added to the exceptions in this check.",
                        obj=field,
                        id="conventions.E001",
                    )
                )
        elif isinstance(field, DateField):
            if not (field_name.endswith("_date") or field_name.endswith("_on")):
                errors.append(
                    Warning(
                        f"{model.__name__}.{field_name} field expected to end with `_date` or `_on`, "
                        + "or be added to the exceptions in this check.",
                        obj=field,
                        id="conventions.E002",
                    )
                )
    return errors


def get_first_party_fields():
    for app_config in get_first_party_apps():
        for model in app_config.get_models():
            yield from model._meta.get_fields()


def get_first_party_apps() -> list[AppConfig]:
    return [app_config for app_config in apps.get_app_configs() if is_first_party_app(app_config)]


def is_first_party_app(app_config: AppConfig) -> bool:
    if app_config.module.__name__ in settings.FIRST_PARTY_APPS:
        return True
    app_config_class = app_config.__class__
    if f"{app_config_class.__module__}.{app_config_class.__name__}" in settings.FIRST_PARTY_APPS:
        return True
    return False

We start here with some imports and registration, as documented in the “System checks” docs. You’ll need to place this code somewhere that will be loaded when your application is loaded.

Our checking function defines some allowed exceptions, because there are some things out of our control, or there might be other reasons. It also mentions the exceptions mechanism in the warning message. You might want a different mechanism here, but I think having some way of dealing with exceptions, and advertising its existence in the warnings, is often pretty important. Otherwise, you can end up with worse consequences when people just slavishly follow rules. Notice how in the exception list above I’ve given a comment detailing why the exception is there though – this helps to establish a precedent that exceptions should be justified, and the justification should be there in the code.

We then loop through all “first party” model fields, looking for DateTimeField and DateField instances. This is done using our get_first_party_fields() utility, which is defined in terms of get_first_party_apps(), which in turn depends on:

the get_app_configs() function.
the AppConfig.get_models() method
the _meta get_fields() method

a custom setting FIRST_PARTY_APPS which I’ve created in my settings.py like this:

FIRST_PARTY_APPS = ["myapp", "myotherapp"]

INSTALLED_APPS = [
 "django.contrib.auth",
 "django.contrib.sessions",
] + FIRST_PARTY_APPS + [ ... ]

You may have a different way of recognising your own apps.

The id values passed to Warning here are examples – you should change them according to your needs. You might also choose to use Error instead of Warning.

Output

When you run manage.py check, you’ll then get output like:

 System check identified some issues:

 WARNINGS:
 myapp.MyModel.created: (conventions.E001) MyModel.created field expected to end with `_at`,
 or be added to the exceptions in this check.

 System check identified 1 issue (0 silenced).

As mentioned, you might instead want to run this kind of check as a unit test.

Conclusion

There are many variations on this technique that can be used to great effect in Django or other Python projects. Very often you will be able to play around with a REPL to do the introspection you need.

Where it is possible, I find doing this far more effective than attempting to document things and relying on people reading and remembering those docs. Every time I’m tripped up by bad names, or when good names or a strong convention could have helped me, I try to think about how I could push people towards a good convention automatically – while also giving a thought to unintended bad consequences of doing that prematurely or too forcefully.

Super-fast Sphinx docs, and SNOB driven development

2023-09-27T15:05:48+01:00

If you are using static HTML files for your docs, such as with Sphinx or many other doc generators, here is a chunk of code that will speed up loading of pages after the first one. If you’re using some other docs generator, the instructions will probably work with minimal adaptation.

Create a custom.js file inside your _static directory, with the following contents:

var script = document.createElement('script');
script.src = "https://unpkg.com/htmx.org@1.9.5";
script.integrity = "sha384-xcuj3WpfgjlKF+FXhSQFQ0ZNr39ln+hwjN3npfM9VBnUskLolQAcN80McRIVOPuO";
script.crossOrigin = 'anonymous';
script.onload = function() {
    var body = document.querySelector("body");
    body.setAttribute('hx-boost', "true");
    htmx.process(body);
}
document.head.appendChild(script);

Add an item to your html_js_files setting in your Sphinx conf.py:
```
html_js_files = [
    'custom.js',
]
```

Rebuild and you’re done.

What this script does is:

Load the htmx library.
If it successfully loads, adds the hx-boost attribute to the body element.
Initialises htmx on the page.

This means that htmx will intercept all internal links on the page, and instead of letting the browser load them the normal way, it sends an AJAX request and swaps in the content of the page. This means that the whole page doesn’t need to be reloaded by the browser, saving precious milliseconds.

Actually, please don’t

I will provide reasons why you really shouldn’t use the code above, although it works almost perfectly. But first, a rant.

This post was inspired by Mux’s blog post on migrating 50,000 lines of React Server Components. It contains a nice overview of the history of web site architecture, including this quote:

Then, we started wondering: What if we wanted faster responses and more interactivity? Every time a user takes an action, do we really want to send cookies back to the server and make the server generate a whole new page? What if we made the client do that work instead? We can just send all the rendering code to the client as JavaScript!

This was called client-side rendering (CSR) or single-page applications (SPA) and was widely considered a bad move

However, instead of then suggesting that perhaps we should retrace our steps, the article just plunges on and on, deeper and deeper into the jungle.

Now, this might all make sense if we are talking about a highly interactive site that has the highest possible needs in terms of user interactivity. But I realised the article was about just their documentation site, not the main application.

Now, some docs sites are really fancy and do very clever interactive things. Mux’s, however, is not like that. The only interactive things I could find were:

tabs – like you can get with something like sphinx-code-tabs, powered by a tiny bit of Javascript.
their changelog page – which is more complicated, but whose essential functionality could again be implemented by a really small amount of Javascript added to a static page. I should also note that their page is really pretty sluggish when you change the filters, much slower than you would get by an approach that just selectively hides parts of the page using DOM manipulation.
search. Search is definitely important, but I can’t see why it means the whole site needs to be implemented in React.
A “Was this helpful” component – this could have been a small web component or something similar.
A few fancy transitions in the side bar.

These are not the highly stateful pages that React was designed for. Maybe there are a few other things I didn’t find, but 95% of it could be handled using entirely static HTML, built by any number of simple docs generators, with tiny amounts of Javascript.

The only other thing I noticed is that page transitions generally had that instant feel an SPA can give you, and were noticeably faster than you would get with the static HTML solution I’m suggesting.

So, not to be beaten, I came up with the above solution on htmx so I could match the speed.

Now, here’s why you shouldn’t use it:

A typical docs page with Sphinx loads in a few hundred milliseconds, which is fine. Do you really need to shave that down to less than 50 so it feels “instant”? Do your users care?
While it is truly a tiny fraction of the complexity of the React docs site Mux described in their post, you are still adding some significant complexity. Is it worth it?
Are you sure it’s not going to interact badly with some Javascript on some page, maybe some future Javascript you will add?
Have you considered all use cases – like the person who downloads your whole docs site using wget --recursive so they can browse offline? Answer: if they have no internet connection when they view the docs, it will actually work fine, because the htmx library won’t load at all. But if they are online, the htmx library will load, and then every internal link will break due to CORS errors. You just broke offline viewing. You could fix this very easily with an extra conditional in the script above, but I’m making a point. Is there anything else that’s broken?

No prizes for guessing that while Sphinx-generated sites normally work perfectly with wget --recursive for offline viewing, docs.mux.com does not work well, to put it mildly. I also wasted hundreds of MB finding out, due to the vast amount of boilerplate every single HTML file has. Don’t be like them.

This is what you should actually do:

recognise that you know exactly how to make your documentation pages load instantly, like an SPA, and could absolutely do it if you wanted to, still with a tiny fraction of the complexity of an actual SPA architecture, and with fixes for the issues I’ve mentioned, in about 15 minutes, then,
don’t.

As protection against the FOMO and fashion that drives so much of web development, this attitude needs a catchy slogan, which is the kind of thing I’m not very good at. But as a first attempt, how about: SNOB driven development. SNOB means “Smugly kNOwing Better”. Or maybe that could be “Smugly NO-ing Better”.

Join me. Be an arrogant SNOB and just say No.

No one actually wants simplicity

2023-08-22T18:49:31+01:00

The reason that modern web development is swamped with complexity is that no one really wants things to be simple. We just think we do, while our choices prove otherwise.

A lot of developers want simplicity in the same way that a lot of clients claim they want a fast website. You respond “OK, so we can remove some of these 17 Javascript trackers and other bloat that’s making your website horribly slow?” – no, apparently those are all critical business functionality.

In other words, they prioritise everything over speed. And then they wonder why using their website is like rowing a boat through a lake of molasses on a cold day using nothing but a small plastic spoon.

The same is often true of complexity. The real test is the question “what are you willing to sacrifice to achieve simplicity?” If the answer is “nothing”, then you don’t actually love simplicity at all, it’s your lowest priority.

When I say “sacrifice”, I don’t mean that choosing simplicity will mean you are worse off overall – simplicity brings massive benefits. But it does mean that there will be some things that tempt you to believe you are missing out.

For every developer, it might be something different. For one, the tedium of having to spend half an hour a month ensuring that two different things are kept in sync easily justifies the adoption of a bulky framework that solves that particular problem. For another, the ability to control how a checkbox animates when you check it is of course a valid reason to add another 50 packages and 3 layers of frameworks to their product. For another, adding an abstraction with thousands of lines of code, dozens of classes and page after page of documentation in order to avoid manually writing a tiny factory function for a test is a great trade-off.

Of course we all claim to hate complexity, but it’s actually just complexity added by other people that we hate — our own bugbears are always exempted, and for things we understand we quickly become unable to even see there is a potential problem for other people. Certainly there are frameworks and dependencies that justify their existence and adoption, but working out which ones they are is hard.

I think a good test of whether you truly love simplicity is whether you are able to remove things you have added, especially code you’ve written, even when it is still providing some value, because you realise it is not providing enough value.

Another test is what you are tempted to do when a problem arises with some of the complexity you’ve added. Is your first instinct to add even more stuff to fix it, or is it to remove and live with the loss?

The only path I can see through all this is to cultivate an almost obsessive suspicion of FOMO. I think that’s probably key to learning to say no.

You can stop using user-scalable=no and maximum-scale=1 in viewport meta tags now

2023-06-10T15:18:08+01:00

Many websites are still using a viewport meta tag like one of the following:

<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">

<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">

These days, you can almost certainly remove the maximum-scale or user-scalable properties, to leave:

<meta name="viewport" content="width=device-width, initial-scale=1">

This is the same as suggested by HTML5 boilerplate, so it should be a pretty good default for most people.
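
If you have many templates to clean up, the removal can be scripted. The following is only a sketch: `clean_viewport_content` and `ZOOM_BLOCKING_KEYS` are hypothetical names, and it assumes the properties in the `content` value are separated by commas, as in the tags shown above.

```python
# Properties that restrict user zooming and so hurt accessibility.
ZOOM_BLOCKING_KEYS = {"maximum-scale", "user-scalable"}


def clean_viewport_content(content: str) -> str:
    """Drop zoom-restricting properties from a viewport `content` value."""
    kept = []
    for part in content.split(","):
        part = part.strip()
        if not part:
            continue  # ignore empty pieces such as a trailing comma
        key = part.split("=", 1)[0].strip().lower()
        if key not in ZOOM_BLOCKING_KEYS:
            kept.append(part)
    return ", ".join(kept)


# Both of the problematic examples above reduce to the recommended value.
print(clean_viewport_content("width=device-width, initial-scale=1, maximum-scale=1"))
print(clean_viewport_content("width=device-width, initial-scale=1, user-scalable=no"))
```

Both calls print `width=device-width, initial-scale=1`.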

Why should you remove these properties? Because they’re bad for accessibility — they stop users on many mobile devices (mostly Android) from being able to zoom in and view things that would be too small otherwise. This doesn’t just affect people with impaired vision — as a fully sighted person I often find web pages where there are graphics with text and other details that are too small when using a mobile phone, and then I find I can’t zoom in either.

Who says so? The A11Y Project says “Never use maximum-scale=1”, and MDN also agree:

maximum-scale: Any value less than 3 fails accessibility

…

user-scalable: Setting the value to 0, which is the same as no, is against Web Content Accessibility Guidelines (WCAG).

The final question, then, is “Can you?”.

If you are like me you don’t want to remove something that was clearly added for some reason, which is a good instinct — see Chesterton’s fence. As far as I can tell, the practice of adding user-scalable=no or maximum-scale=1 became widespread because of several browser bugs which are now irrelevant or best addressed with other fixes:

Using CSS position:fixed only works in Android 2.1 thru 2.3 by using the following meta tag: <meta name="viewport" content="width=device-width, user-scalable=no"> (from caniuse.com)

This should not be relevant to most users these days.
Safari on iOS, at least in the past, would “zoom in” when flipping from portrait to landscape, unless you added maximum-scale=1

From what I can tell, this bug is probably fixed, and you get good behaviour when adding just initial-scale=1
Safari on iOS has unhelpful zooming behaviour when you click on a text box and the keyboard pops up, which some people fix using maximum-scale=1.

Rick Strahl has a comprehensive post on better fixes to this, which are basically:
- selectively add maximum-scale=1 to the viewport tag, only on iOS Safari, using a small bit of Javascript. This works without breaking accessibility, because iOS Safari apparently ignores maximum-scale=1 when it comes to user-initiated zooming
- setting font-size: 16px or higher for form inputs.

There are a couple of final cases to address:

Some people want pages to behave more like native apps, where zooming wouldn’t even be possible. Before you do this, consider that you are making problems for many people across your site for the sake of your own aesthetic preference. And, it doesn’t work for recent iOS anyway because it deliberately ignores the properties for the sake of accessibility.
You may need to control how zooming gestures work for certain components on the page. I believe the correct solution in this case is touch-action.

That’s all, thanks!

Re-using CSS for the wrong HTML with Sass

2023-06-01T20:44:15Z

Recently, while writing up some examples and patterns for using htmx with Django for form validation, I discovered a new trick for using externally defined CSS without having to change the HTML you are working with.

To make it concrete, an example might be that you are using some CSS from a CSS library or framework that requires your HTML to look a certain way. In the Bulma framework, for instance, you have to add the right class attribute directly on an element that needs styling.

At the same time, you might be working with another system that is generating the HTML for you, and modifying that output might be hard or impossible or just tedious and a potential maintenance burden going forward. For instance, in Django forms, there is an ErrorList class whose output can be overridden, but by default renders like this:

<ul class="errorlist">
  <li>Enter a valid email address.</li>
</ul>

Now I have these requirements:

I want this error list to be coloured using a Bulma colour utility as if it had class="has-text-danger" when it appears within a field row (which are <div class="field"> elements).
When it appears at the top of the form where it has an extra nonfield class, I want it to instead be styled like a Bulma notification as if it had class="notification is-danger is-light".

But I want to do these without changing the HTML we’re given by Django, or changing existing CSS – only by adding some CSS rules.

The “best” way to do this is if your CSS framework provides its styles as a set of Sass mixins, or something equivalent. Bulma, as it happens, usually does this, but sometimes we’re not so lucky, and we just have CSS.

The trick I learnt requires you to use Sass/SCSS and the @extend directive. This powerful directive takes rules relating to one selector, and pulls them into whatever rule you are writing.

(If you are, like me, put off using things like CSS pre-processors because of the need for a separate build step, or needing to use Node.js/npm, see my post on How to use Sass/SCSS in a Django project without needing Node.js/npm or running a build process)

The one thing you have to do is rename the base CSS file you want to re-use from .css to .scss. This works because SCSS is a CSS superset. Then, for the example above, you can write your own SCSS file like this:

@import "path/to/bulma.scss";

.field ul.errorlist {
    @extend .has-text-danger;
}

ul.errorlist.nonfield {
    @extend .notification;
    @extend .is-danger;
    @extend .is-light;
}

This technique can be very powerful. For example, you can make every input[type=text] inside a <form class="bulma"> have the normal Bulma input appearance:

form.bulma {
    input[type=text] {
        @extend .input;
    }
}

This will include all related rules like .input:focus etc.

As mentioned, it may not always be the best technique, but it’s a great one to have in your toolbox.

Django and Sass/SCSS without Node.js or a build step

2023-06-01T19:54:15Z

Although they are less necessary than in the past, I like to use a CSS pre-processor when doing web development. I used to use LessCSS, but recently I’ve found that I can use Sass without needing either a separate build step, or a package that requires Node.js and npm to install it. The heart of the functionality is provided by libsass, an implementation of Sass as a C++ library.

On Linux systems, this can be installed as a package libsass or similar, but even better is that you can pip install it as a Python package, libsass.

When it comes to using it from a Django project, the first step is to install django-compressor.

Then, you need to add django-libsass as per its instructions.

That’s about it. As per the django-libsass instructions, somewhere in your base HTML templates you’ll have something like this:

{# at the top #}
{% load compress %}
{% load static %}

{# in the <head> element #}
{% compress css %}
  <link rel="stylesheet" type="text/x-scss" href="{% static "myapp/css/main.scss" %}" />
{% endcompress %}

You write your SCSS in that main.scss file (it doesn’t have to be called that), and it can @import other SCSS files of course.

Then, when you load a page, django-compressor will take care of running the SCSS files through libsass, saving the output CSS to a file and inserting the appropriate HTML that references that CSS file into your template output. It caches things very well so that you don’t incur any penalty if files haven’t changed — and libsass is a very fast implementation for when the processing does need to happen.

What this means is that you have eliminated both the need for Node.js/npm, and the need for a build step/process, if you only needed these things for CSS pre-processing.

Of course, the SCSS → CSS compilation still has to happen, but it happens on demand in the same process that runs the web app, and it’s both fast enough and reliable enough that you simply never have to think about it again. So this is “build-less” in the same way that “server-less” means you don’t have to think about servers, and the same way that Python “doesn’t have a compilation step”.

Future proofing

On the Sass-lang page about libsass, they say it is “deprecated”, and on the project page it says:

While it will continue to receive maintenance releases indefinitely, there are no plans to add additional features or compatibility with any new CSS or Sass features.

In other words, this is what I prefer to call “mature software” 😉. libsass already has everything I need. If it does eventually fail to be maintained or I need new features, it’s not a problem:

Switch to Dart Sass, which can be installed as a standalone binary.

Set your django-compressor settings like this:

COMPRESS_PRECOMPILERS = [
    ("text/x-scss", "sass {infile} {outfile}"),
]

This covers the basic case. If you want all the features of django-libsass, which includes looking in your other static file folders for SCSS, you’ll probably need to fork the code and make it work by calling Dart Sass using subprocess — a small amount of work, and nothing that will fundamentally break this approach.
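
To give a rough idea of the subprocess approach, here is a minimal sketch. It is not code from django-libsass: `build_sass_command` and `compile_scss` are hypothetical helpers, the `sass` executable is assumed to be Dart Sass on your PATH, and working out which static folders to pass as load paths is left out entirely.

```python
import subprocess
from pathlib import Path


def build_sass_command(infile, outfile, load_paths=()):
    """Build the argument list for the Dart Sass `sass` executable."""
    command = ["sass"]
    for path in load_paths:
        # --load-path adds a directory that Sass searches when resolving imports.
        command.append(f"--load-path={path}")
    command += [str(infile), str(outfile)]
    return command


def compile_scss(infile, outfile, load_paths=()):
    """Run Dart Sass and return the generated CSS.

    Raises subprocess.CalledProcessError if compilation fails.
    """
    subprocess.run(build_sass_command(infile, outfile, load_paths), check=True)
    return Path(outfile).read_text(encoding="utf-8")
```

Keeping the command construction separate from the call that runs it makes the first part easy to test without Dart Sass installed.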

Test smarter, not harder

2020-09-04T19:46:50+01:00

“Smarter, not harder” is a saying used in many contexts, but rowing is the context I think I first heard it in, and I still associate it with rowing many years later.

When you look at novice and more experienced rowing crews, it seems particularly appropriate, because the primary difference is not the amount of effort that goes in, nor even the strength of the rowers, but technique. Poor rowers still finish a race absolutely exhausted, but they've moved at a fraction of the speed of better crews. Sometimes the effort they put in actually slows the boat down. They tend to make a lot of noise, splash a huge amount of water in every direction, and pull a lot of faces. (I did a lot of all those things when I tried rowing!).

Expert crews, however, do none of these things, because they don't make you go faster. These rowers do a huge amount of training, and exercise massive amounts of concentration, to ensure that every bit of the (very large) effort they put in is actually contributing to speed.

The “smarter not harder” mindset is also essential for writing good automated software tests.

It's in this context that religious devotion to things like TDD can be really unhelpful. For many religions, the more painful an activity, and the more you do it, the more meritorious it is – and it may even atone for past misdeeds. If you take that mindset with you into writing tests, you will do a rather bad job.

If writing tests is extremely painful, it may be a sign that something is wrong. Huge and unnecessary quantities of tests are not meritorious, they are a massive maintenance burden. Many of the things that make tests hard to write are also going to make them hard (and therefore expensive) to maintain. I've seen far too many examples where it looks like people have just sat back and accepted their painful fate.

For example, good ol' Uncle Bob seems to have this attitude. He wrote:

you’d better get used to writing lots and lots of tests, no matter what language you are using!

Don't listen to Uncle Bob! (at least, not on this subject).

“Test smarter, not harder” means:

Only write necessary tests – specifically, tests whose estimated value is greater than their estimated cost. This is a hard judgement call, of course, but it does mean that at least some of the time you should be saying “it's not worth it”. Some of the costs associated with tests are:
- the time taken to write them.
- the time they add to the test suite on every run.
- the time to maintain them - understand them, debug them, change them when other things change.
- every time they fail incorrectly - when the functionality works, but the test fails.
The value on the other hand, is found in:
- catching regressions, and doing so at low cost with a quick feedback loop.
- enabling fearless refactoring (which is a consequence of the above, but distinct from it).
- providing a starting point for making changes, including a form of documentation for the existing desirable behaviour.

Write your test code with the functions/methods/classes you wish existed, not the ones you've been given. For example, don't write this:

self.driver.get(self.live_server_url + reverse("contact_form"))
self.driver.find_element_by_css_selector("#id_email").send_keys("my@email.com")
self.driver.find_element_by_css_selector("#id_message").send_keys("Hello")
self.driver.find_element_by_css_selector("input[type=submit]").click()
WebDriverWait(self.driver, 10).until(lambda driver: driver.find_element_by_css_selector("body"))

That looks very tedious! Write this instead:

self.get_url("contact_form")
self.fill({
    "#id_email": "my@email.com",
    "#id_message": "Hello",
})
self.submit("input[type=submit]")

(Like you can with django-functest, but it's the principle, not the library, that's important. If the API you want to use doesn't exist yet, you still use it, and then make it exist.)
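
As a hedged sketch of what "making it exist" might look like for two of those helpers, the mixin below wraps the same driver calls used in the tedious version. `BrowserHelpersMixin` is a hypothetical name, this is not how django-functest is implemented, and newer Selenium releases spell the element lookup differently. `get_url` and the wait are omitted because they depend on Django's `reverse` and the live server.

```python
class BrowserHelpersMixin:
    """Hypothetical helpers; expects `self.driver` to be a Selenium-style driver."""

    def fill(self, fields):
        """Type each value into the element matched by its CSS selector."""
        for selector, value in fields.items():
            self.driver.find_element_by_css_selector(selector).send_keys(value)

    def submit(self, selector):
        """Click the element matched by the CSS selector."""
        self.driver.find_element_by_css_selector(selector).click()
```

A test class that mixes this in can then call `self.fill({...})` and `self.submit(...)` exactly as in the shorter version above.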

Don't write tests for things that can be more effectively tested in other ways, and lean on other correctness methodologies as much as possible. These include:
- code review
- static type checking (especially in languages with sound and powerful type systems, with type inference everywhere, giving you a very good cost-benefit ratio)
- linters like flake8 and Semgrep.
- formal methods
- introspection (like Django's checks framework)
- property based testing like hypothesis.
Move the burden onto the computer. “Push the loop in”.

Take, for example, a requirement that every entry point to your web app (i.e. a page or HTTP API), apart from a few exceptions like login and reset password, should require authentication.

The “test harder” religion interprets this as:
- For every entry point
  - Write a test that
    - Ensures non-authenticated requests return 403
That's a lot of tests, and even worse is that you have to remember to write them.

“Test smarter” says:
- Write a test that
  - For every entry point
    - Ensures non-authenticated requests return 403
That's one test. “Write a test” is executed in developer time, so in the first example the loop ("For every entry point") is also executed in developer time. Push the loop inside the test, and it gets executed in computer time instead.
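
A minimal, self-contained sketch of the "test smarter" shape follows. Everything in it is hypothetical: the handlers, the `ENTRY_POINTS` registry and the `PUBLIC_ENTRY_POINTS` set stand in for whatever you get by introspecting your real application, and each handler simply returns a status code for a given user.

```python
import unittest


# Hypothetical entry points: each takes a user (or None) and returns a status.
def login(user):
    return 200


def dashboard(user):
    return 200 if user is not None else 403


def api_orders(user):
    return 200 if user is not None else 403


# In a real project this would be built by introspection, not by hand.
ENTRY_POINTS = {"login": login, "dashboard": dashboard, "api_orders": api_orders}

# The few exceptions that are allowed to be public.
PUBLIC_ENTRY_POINTS = {"login"}


class AuthenticationRequiredTest(unittest.TestCase):
    def test_all_entry_points_require_authentication(self):
        # The loop lives inside the test, so a new entry point is covered
        # as soon as it appears in the registry.
        for name, handler in ENTRY_POINTS.items():
            if name in PUBLIC_ENTRY_POINTS:
                continue
            with self.subTest(entry_point=name):
                self.assertEqual(handler(None), 403)
```

Using `subTest` means a failure names the offending entry point while the loop still checks the rest.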

Already mentioned, but hypothesis is a great way to push the loop in. Also, the implementation of the requirements can benefit from the same techniques that the tests do.
Cheat on your homework. It's smart to get help, and hard work is for suckers. If you have a good idea, but don't know the techniques or tools you need to implement it, or whether it is even possible (for example, in the example above you don't know how to introspect your system to get a list of all entry points), there are a lot of smart people on StackOverflow who will revel in the challenge.

(Level up: loudly claim on Twitter that "it appears to be impossible to X with tool Y" and know-it-alls like me will magically appear with solutions).

Of course, there are still times when hard work is required for writing tests – times when it will be tedious, and times when our instincts to skimp are actually misplaced laziness that will cost more in the long run. But you should hustle and cheat your way out of unnecessary effort as much as you possibly can. Your overall testing strategy should feel like “I get that computer to do so much work for me!”, not “My RSI and bleeding fingers have hopefully appeased the testing gods and atoned for my previous omissions”.

Announcement: Django Views - The Right Way

2020-08-19T21:51:36+01:00

I announced this a few days back on Twitter; this is just a quick additional blog post to announce Django Views - The Right Way. It's an opinionated guide to writing views in Django that I've been working on for a few months.

This project turned out to be much bigger than I expected. And in the end, more about general programming and Python principles than just Django – so you may enjoy it even if you're not into Django.
