Keeping things in sync: derive vs test

by Luke Plant

— June 28, 2024 10:15

An extremely common problem in programming is that multiple parts of a program need to be kept in sync – they need to do exactly the same thing or behave in a consistent way. It is in response to this problem that we have mantras like “DRY” (Don’t Repeat Yourself), or, as I prefer it, OAOO, “Each and every declaration of behaviour should appear Once And Only Once”.

For both of these mantras, if you are faced with possible duplication of any kind, the answer is simply “just say no”. However, since programming mantras are to be understood as proverbs, not absolute laws, there are times when obeying them can hurt more than it helps, so in this post I’m going to discuss other approaches.

Most of what I say is fairly language agnostic I think, but I’ve got specific tips for Python and web development.

The essential problem

To step back for a second, the essential problem that we are addressing here is that if making a change to a certain behaviour requires changing more than one place in the code, we have the risk that one will be forgotten. This results in bugs, which can be of various degrees of seriousness depending on the code in question.

To pick a concrete example, suppose we have a rule that says that items in a deleted folder get stored for 30 days, then expunged. We’re going to need some code that does the actual expunging after 30 days, but we’re also going to need to tell the user about the limit somewhere in the user interface. “Once And Only Once” says that the 30 days limit needs to be defined in a single place somewhere, and then reused.

There is a second kind of motivating example, which I think often crops up when people quote “Don’t Repeat Yourself”, and it’s really about avoiding tedious things from a developer perspective. Suppose you need to add an item to a menu, and you find out that first you’ve got to edit the MENU_ITEMS file to add an entry, then you’ve got to edit the MAIN_MENU constant to refer to the new entry, then you’ve got to define a keyboard shortcut in the MENU_SHORTCUTS file, then a menu icon somewhere else etc. All of these different places are in some way repeating things about how menus work. I think this is less important in general, but it is certainly life-draining as a developer if code is structured in this way, especially if it is difficult to discover or remember all the things that have to be done.

The ideal solution: derive

OAOO and DRY say that we aim to have a single place that defines the rule or logic, and any other place should be derived from this.

Regarding the simple example of a time limit displayed in the UI and used in the backend, this might be as simple as defining a constant e.g. in Python:

from datetime import timedelta

EXPUNGE_TIME_LIMIT = timedelta(days=30)

We then import and use this constant in both our UI and backend.

An important part of this approach is that the “deriving” process should be entirely automatic, not something that you can forget to do. In the case of a Python import statement, that is very easy to achieve, and relatively hard to get wrong – if you change the constant where it is defined in one module, any other code that uses it will pick up the change the next time the Python process is restarted.
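As a minimal sketch of what “import and use” might look like, the two functions below stand in for backend and UI code. The names `should_expunge` and `deleted_folder_notice` are hypothetical, chosen only for illustration, and in a real project they would live in separate modules that import the constant:

```python
from datetime import datetime, timedelta

EXPUNGE_TIME_LIMIT = timedelta(days=30)


def should_expunge(deleted_at: datetime, now: datetime) -> bool:
    # Backend (hypothetical): expunge once the item has been deleted for the full limit.
    return now - deleted_at >= EXPUNGE_TIME_LIMIT


def deleted_folder_notice() -> str:
    # UI (hypothetical): the message is built from the same constant, so it cannot drift.
    return (
        "Items in this folder are permanently removed "
        f"after {EXPUNGE_TIME_LIMIT.days} days."
    )
```

Changing `EXPUNGE_TIME_LIMIT` is then the only edit needed to change both the behaviour and the message.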

Alternative solution: test

By “test”, I mean ideally an automated test, but manual tests may also work if they are properly scripted. The idea is that you write a test that checks that the behaviour of the code is in sync. Often, one or more of the places that need the behaviour will define it using a constant as above – let’s say the “backend” code. Then, for another place, e.g. the UI, you would hard code “30 days” without using the constant, but have a test that uses the backend constant to build a string, and checks the UI for that string.
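A rough sketch of that idea, assuming the UI text is available to the test as a string. Here `DELETED_FOLDER_NOTICE` is a hypothetical stand-in for rendered template output or a translation file entry:

```python
from datetime import timedelta

# Backend: the single constant that the expunging code uses.
EXPUNGE_TIME_LIMIT = timedelta(days=30)

# UI: hard coded text (hypothetical stand-in for real rendered UI output).
DELETED_FOLDER_NOTICE = "Items in this folder are permanently removed after 30 days."


def test_deleted_folder_notice_matches_backend_limit() -> None:
    # Build the expected fragment from the backend constant, then look for it in the UI.
    expected = f"after {EXPUNGE_TIME_LIMIT.days} days"
    assert expected in DELETED_FOLDER_NOTICE, (
        f"UI text should mention {expected!r}; update it to match EXPUNGE_TIME_LIMIT"
    )
```

If someone changes the constant to 60 days and forgets the UI text, the test fails with a message that says what to fix.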

Examples

In the example above, it might be hard to see why you want to use the fundamentally less reliable, less automatic method I’m suggesting. So I now have to show some motivating examples where the “derive” method ends up losing to the cruder, simpler alternative of “test”.

Example 1 - external data sources

My first example comes from the project I’m currently working on, which involves creating CAM files from input data. Most of the logic for that is driven using code, but there are some dimensions that are specified as data tables by the engineers of the physical product.

These data tables look something like below. The details here aren’t important, and I’ve changed them – it’s enough to know that we are creating some physical “widgets” which need to have specific dimensions specified:

Widgets have length 150mm unless specified below
Widget id	Location	Length (mm)
A	start	100
A	end	120
F	start	105
F	end	110

These tables are supplied at design-time rather than run-time, i.e. they are bundled with the software and can’t be changed after the code is shipped. But it is still convenient to read them in automatically rather than duplicating the tables in my code by some other process. So, for the body of the table, that’s exactly what my code does on startup – it reads the bundled XLSX/CSV files.

So we are obeying “derive” here — there is a single, canonical source of data, and anywhere that needs it derives it by an entirely automatic process.

But what about that “150mm” default value specified in the header of that table?

It would be possible to “derive” it by having a parser. Writing such a parser is not hard to do – for this kind of thing in Python I like parsy, and it is as simple as:

import parsy as P

default_length_parser = (
  P.string("Widgets have length ") >>
  P.regex(r"\d+").map(int)
  << P.string("mm unless specified below")
)

In fact I do something similar in some cases. But in reality, the “parser” here is pretty simplistic – it can’t deal with the real variety of English text that might be put into the sentence, and to claim I’m “deriving” it from the table is a bit of a stretch – I’m just matching a specific, known pattern. In addition, it’s probably not the case that any value for the default length would work – most likely if it was 10 times larger, there would be some other problem, and I’d want to do some manual checking.

So, let’s admit that we are really just checking for something expected, using the “test” approach. You can still define a constant that you use in most of the code:

DEFAULT_LENGTH_MM = 150

And then you test it is what you expect when you load the data file:

assert worksheets[0].cell(1, 1).value == f"Widgets have length {DEFAULT_LENGTH_MM}mm unless specified below"

So, I’ve achieved my aim: a guard against the original problem of having multiple sources of information that could potentially be out of sync. But I’ve done it using a simple test, rather than a more complex and fragile “derive” that wouldn’t have worked well anyway.
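For readers who want to try this without a spreadsheet library, here is a self-contained sketch of the same check working on a plain string. The function name `check_default_length_header` is hypothetical, and how the header text is obtained (XLSX cell, first CSV line) is left as an assumption:

```python
DEFAULT_LENGTH_MM = 150


def check_default_length_header(header_text: str) -> None:
    """Fail loudly if the table header no longer matches the hard coded default."""
    expected = f"Widgets have length {DEFAULT_LENGTH_MM}mm unless specified below"
    if header_text.strip() != expected:
        raise ValueError(
            f"Data table header is {header_text!r} but the code expects {expected!r}. "
            "Check whether DEFAULT_LENGTH_MM needs updating."
        )


# Passes silently when the bundled data and the constant agree.
check_default_length_header("Widgets have length 150mm unless specified below")
```

One reason to raise an exception here rather than use a bare `assert` is that Python drops `assert` statements when run with `-O`; whether that matters depends on how the software is deployed.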

By the way, for this specific project – we’re looking for another contract developer! It’s a very worthwhile project, and one I’m really enjoying – a small flexible team, with plenty of problem solving and fun challenges, so if you’re a talented developer and interested give me a shout.

Example 2 - defining UI behaviour for domain objects

Suppose you have a database that stores information about some kind of entity, like customers say, and you have different types of customer, represented using an enum of some kind, perhaps a string enum like this in Python:

from enum import StrEnum


class CustomerType(StrEnum):
    ENTERPRISE = "Enterprise"
    SMALL_FRY = "Small fry"  # Let’s be honest! Try not to let the name leak…
    LEGACY = "Legacy"

We need a way to edit the different customer types, and they are sufficiently different that we want quite different interfaces. So, we might have a dictionary mapping the customer type to a function or class that defines the UI. If this were a Django project, it might be a different Form class for each type:

CUSTOMER_EDIT_FORMS = {
    CustomerType.ENTERPRISE: EnterpriseCustomerForm,
    CustomerType.SMALL_FRY: SmallFryCustomerForm,
    CustomerType.LEGACY: LegacyCustomerForm,
}

Now, the DRY instinct kicks in and we notice that we now have two things we have to remember to keep in sync — any addition to the customer enum requires a corresponding addition to the UI definition dictionary. Maybe there are multiple dictionaries like this.

We could attempt to solve this by “deriving”, or some “correct by construction” mechanism that puts the creation of a new customer type all in one place.

For example, maybe we’ll have a base Customer class with get_edit_form_class() as an abstractmethod, which means it is required to be implemented. If I fail to implement it in a subclass, I can’t even construct an instance of the new customer subclass – it will throw an error.

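from abc import ABC, abstractmethod

# The base class must inherit from ABC (or use ABCMeta),
# otherwise @abstractmethod is not enforced at instantiation.
class Customer(ABC):
    @abstractmethod
    def get_edit_form_class(self):
        pass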


class EnterpriseCustomer(Customer):
    def get_edit_form_class(self):
        return EnterpriseCustomerForm

class LegacyCustomer(Customer):
    ...  # etc.

I still need my enum value, or at least a list of valid values that I can use for my database field. Maybe I could derive that automatically by looking at all the subclasses?

CUSTOMER_TYPES = [
    cls.__name__.upper().replace("CUSTOMER", "")
    for cls in Customer.__subclasses__()
]

Or maybe an __init_subclass__ trick, and I can perhaps also set up the various mappings I’ll need that way?
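To make that idea concrete, here is a rough sketch of what such a trick might look like. It is one possible shape among many: the `edit_form` class keyword and the `edit_forms` registry are assumptions made up for this illustration, and the form class is an empty placeholder:

```python
class Customer:
    # Registry filled in automatically as subclasses are defined.
    edit_forms: dict[str, type] = {}

    def __init_subclass__(cls, *, edit_form: type, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        # Derive the type name from the class name, e.g. EnterpriseCustomer -> ENTERPRISE.
        type_name = cls.__name__.removesuffix("Customer").upper()
        Customer.edit_forms[type_name] = edit_form


class EnterpriseCustomerForm:  # placeholder for a real form class
    pass


class EnterpriseCustomer(Customer, edit_form=EnterpriseCustomerForm):
    pass


# Defining a subclass without edit_form=... raises TypeError at class creation time.
```

It works, but the mapping is now built as a side effect of class definition, and the naming convention has quietly become load-bearing.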

It’s at this point you should stop and think. In addition to requiring you to mix UI concerns into the Customer class definitions, it’s getting complex and magical.

The alternative I’m suggesting is this: require manual syncing of the two parts of the code base, but add a test to ensure that you did it. All you need is a few lines after your CUSTOMER_EDIT_FORMS definition:

CUSTOMER_EDIT_FORMS = {
    # etc as before
}

for c_type in CustomerType:
    assert (
        c_type in CUSTOMER_EDIT_FORMS
    ), f"You've defined a new customer type {c_type}, you need to add an entry in CUSTOMER_EDIT_FORMS"

You could do this as a more traditional unit test in a separate file, but for simple things like this, I think an assertion right next to the code works much better. It really helps local reasoning to be able to look and immediately conclude “yes, I can see that this dictionary must be exhaustive because the assertion tells me so.” Plus you get really early failure – as soon as you import the code.

This kind of thing crops up a lot – if you create a class here, you’ve got to create another one over there, or add a dictionary entry etc. In these cases, I’m finding simple tests and assertions have a ton of advantages when compared to clever architectural contortions (or other things like advanced static typing gymnastics):

- they are massively simpler to create and understand.
- you can write your own error message in the assertion. If you make a habit of using really clear error messages, like the one above, your code base will literally tell you how to maintain it.
- you can easily add things like exceptions. “Every Customer type needs an edit UI defined, except Legacy because they are read only” is an easy, small change to the above.
  - This contrasts with cleverer mechanisms, which might require relaxing other constraints to the point where you defeat the whole point of the mechanism, or create more difficulties for yourself.
- the rule about how the code works is very explicit, rather than implicit in some complicated code structure, and typically needs no comment other than what you write in the assertion message.
- you express and enforce the rule, with any complexities it gains, in just one place. Ironically, if you try to enforce this kind of constraint using type systems or hierarchies to eliminate repetition or the need for any kind of code syncing, you may find that when you come to change the constraint it actually requires touching far more places.
- temporarily silencing the assertion while developing is easy and doesn’t have far reaching consequences.
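The “except Legacy” variation mentioned in the list above might look something like the following sketch. The `READ_ONLY_CUSTOMER_TYPES` name is an assumption for illustration, the form classes are empty placeholders, and a `str`/`Enum` mixin stands in for `StrEnum` so the snippet also runs on Python versions before 3.11:

```python
from enum import Enum


class CustomerType(str, Enum):  # stand-in for the StrEnum defined earlier
    ENTERPRISE = "Enterprise"
    SMALL_FRY = "Small fry"
    LEGACY = "Legacy"


class EnterpriseCustomerForm:  # placeholder form classes
    pass


class SmallFryCustomerForm:
    pass


CUSTOMER_EDIT_FORMS = {
    CustomerType.ENTERPRISE: EnterpriseCustomerForm,
    CustomerType.SMALL_FRY: SmallFryCustomerForm,
}

# Legacy customers are read only, so they deliberately have no edit form.
READ_ONLY_CUSTOMER_TYPES = {CustomerType.LEGACY}

for c_type in CustomerType:
    if c_type in READ_ONLY_CUSTOMER_TYPES:
        assert (
            c_type not in CUSTOMER_EDIT_FORMS
        ), f"{c_type.name} is read only, it should not have an entry in CUSTOMER_EDIT_FORMS"
    else:
        assert (
            c_type in CUSTOMER_EDIT_FORMS
        ), f"You've defined a new customer type {c_type.name}, you need to add an entry in CUSTOMER_EDIT_FORMS"
```

The rule and its exception sit together in one place, and each failure message says what to do about it.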

Of course, there are many times when being able to automatically derive things at the code level, including some complex relationships between parts of the code, can be a win, and it’s the kind of thing you can do in Python with its many powerful techniques.

But my point is that you should remember the alternative: “synchronise manually, and have a test to check you did it.” Being able to add any kind of executable code at module level – the same level as class/function/constant definitions – is a Python super-power that you should use.

Example 3 - external polymorphism and static typing

A variant of the above problem is when, instead of an enum defining different types, I’ve got a set of classes that all need some behaviour defined.

Often we just use polymorphism where a base class defines the methods or interfaces needed and sub-classes provide the implementation. However, as in the previous case, this can involve mixing concerns e.g. user interface code, possibly of several types, is mixed up with the base domain objects. It also imposes constraints on class hierarchies.

Recently, for these kinds of cases, I’m more likely to prefer external polymorphism to avoid these problems. To give an example, in my current project I’m using the Command pattern or plan-execute pattern extensively, and it involves manipulating CAM objects using a series of command objects that look something like this:

from dataclasses import dataclass
from typing import TypeAlias


@dataclass
class DeleteFeature:
    feature_name: str


@dataclass
class SetParameter:
    param_name: str
    value: float


@dataclass
class SetTextSegment:
    text_name: str
    segment: int
    value: str


Command: TypeAlias = DeleteFeature | SetParameter | SetTextSegment

Note that none of them share a base class, but I do have a union type that gives me the complete set.

It’s much more convenient to define the behaviour associated with these separately from these definitions, and so I have multiple other places that deal with Command, such as the place that executes these commands and several others. One example that requires very little code to show is where I’m generating user-presentable tables that show groups of commands. I convert each of these Command objects into key-value pairs that are used for column headings and values:

def get_command_display(command: Command) -> tuple[str, str | float | bool]:
    match command:
        case DeleteFeature(feature_name=feature_name):
            return (f"Delete {feature_name}", True)
        case SetParameter(param_name=param_name, value=value):
            return (param_name, value)
        case SetTextSegment(text_name=text_name, segment=segment, value=value):
            return (f"{text_name}[{segment}]", value)

This is giving me a similar problem to the one I had before: if I add a new Command, I have to remember to add the new branch to get_command_display.

I could split out get_command_display into a dictionary of functions, and apply the same technique as in the previous example, but it’s more work, a less natural fit for the problem and potentially less flexible.

Instead, all I need to do is add exhaustiveness checking with one more branch:

match command:
    ...  # etc
    case _:
        assert_never(command)

Now, pyright will check that I didn’t forget to add branches here for any new Command. The error message is not controllable, in contrast to hand-written asserts, but it is clear enough.

The theme here is that additions in one part of the code require synchronised additions in other parts of the code, rather than being automatically correct “by construction”, but you have something that tests you didn’t forget.

Example 4 - generated code

In web development, ensuring consistent design and keeping different things in sync is a significant problem. There are many approaches, but let’s start with the simple case of using a single CSS stylesheet to define all the styles.

We may want a bunch of components to have a consistent border colour, and a first attempt might look like this (ignoring the many issues of naming conventions here):

.card-component, .bordered-heading {
   border-color: #800;
}

This often becomes impractical when we want to organise by component, rather than by property, which introduces duplication:

.card-component {
   border-color: #800;
}

/* somewhere far away … */

.bordered-heading {
   border-color: #800;
}

Thankfully, CSS has variables, so the first application of “derive” is straightforward – we define a variable which we can use in multiple places:

:root {
    --primary-border-color: #800;
}

/* elsewhere */

.bordered-heading {
    border-bottom: 1px solid var(--primary-border-color);
}

However, as the project grows, we may find that we want to use the same variables in different contexts where CSS isn’t applicable. So the next step at this point is typically to move to Design Tokens.

Practically speaking, this might mean that we now have our variables defined in a separate JSON file. Maybe something like this (using a W3C draft spec):

{
  "primary-border-color": {
    "$value": "#800000",
    "$type": "color"
  },
  "primary-highlight-color": {
    "$value": "#FBC100",
    "$type": "color"
  }
}

From this, we can automatically generate CSS fragments that contain the same variables quite easily – for simple cases, this isn’t more than a 50 line Python script.
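As a sketch of the simple case, the function below turns a flat token file like the one above into a `:root` block. It is deliberately minimal: nested token groups, aliases and non-colour types from the draft spec are not handled, and `tokens_to_css` is a hypothetical name:

```python
import json

# Inline stand-in for the design tokens JSON file.
TOKENS_JSON = """
{
  "primary-border-color": {"$value": "#800000", "$type": "color"},
  "primary-highlight-color": {"$value": "#FBC100", "$type": "color"}
}
"""


def tokens_to_css(tokens: dict) -> str:
    """Render flat design tokens as CSS custom properties on :root."""
    lines = [":root {"]
    for name, token in tokens.items():
        lines.append(f"    --{name}: {token['$value']};")
    lines.append("}")
    return "\n".join(lines)


css = tokens_to_css(json.loads(TOKENS_JSON))
print(css)
```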

However, we’ve got some choices when it comes to how we put everything together. I think the general assumption in web development world is that a fully automatic “derive” is the only acceptable answer. This typically means you have to put your own CSS in a separate file, and then you have a build tool that watches for changes, and compiles your CSS plus the generated CSS into the final output that gets sent to the browser.

In addition, once you’ve bought into these kinds of tools you’ll find they want to make extensive changes to the output, and define more and more extensions to the underlying languages. For example, postcss-design-tokens wants you to write things like:

.foo {
    color: design-token('color.background.primary');
}

And instead of using CSS variables in the output, it puts the value of the token right into every place in your code that uses it.

This approach has various problems, in particular that you become more and more dependent on the build process, and the output gets further from your input. You can no longer use the Dev Tools built in to your browser to do editing – the flow of using Dev Tools to experiment with changing a single spacing or colour CSS variable for global changes is broken, you need your build tool. And you can’t easily copy changes from Dev Tools back into the source, because of the transformation step, and debugging can be similarly difficult. And then, you’ll probably want special IDE support for the special CSS extensions, rather than being able to lean on your editor simply understanding CSS, and any other tools that want to look at your CSS now need support etc.

It’s also a lot of extra infrastructure and complexity to solve this one problem, especially when our design tokens JSON file is probably not going to change that often, or is going to have long periods of high stability. There are good reasons to want to be essentially build free. The current state of the art in this space is that to get your build tool to compile your CSS you add import './styles.css' in your entry point JavaScript file! What if I don’t even have a JavaScript file? I think I understand how this sort of thing came about, but don’t try to tell me that it’s anything less than completely bonkers.

Do we have an alternative to the fully automatic derive?

Using the “test” approach, we do. We can even stick with our single CSS file – we just write it like this:

/* DESIGN TOKENS START */
/* auto-created block - do not edit */
:root {
    --primary-border-color: #800000;
    --primary-highlight-color: #FBC100;
}
/* DESIGN TOKENS END */

/* the rest of our CSS here */

The contents of this block will almost certainly be auto-generated. We won’t have a process that fully automatically updates it, however, because this is the same file where we are putting our custom CSS, and we don’t want any possibility of lost work due to the file being overwritten as we are editing it.

On the other hand, we don’t want things to get out of sync, so we’ll add a test that checks whether the current styles.css contains the block of design tokens that we expect to be there, based on the JSON. For actually updating the block, we’ll need some kind of manual step – maybe a script that can find and update the DESIGN TOKENS START block, maybe cog (which is a perfect little tool for this use case), or we could just copy-paste.
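Such a test could be sketched roughly as follows. It assumes a flat tokens file and the exact block layout shown above, and it compares against inline strings rather than reading styles.css and the JSON file from disk. The function names are hypothetical:

```python
START = "/* DESIGN TOKENS START */"
END = "/* DESIGN TOKENS END */"


def expected_block(tokens: dict) -> str:
    """Build the design tokens block exactly as it should appear in the CSS."""
    lines = [START, "/* auto-created block - do not edit */", ":root {"]
    lines += [f"    --{name}: {token['$value']};" for name, token in tokens.items()]
    lines += ["}", END]
    return "\n".join(lines)


def check_tokens_in_sync(css_text: str, tokens: dict) -> None:
    assert expected_block(tokens) in css_text, (
        "styles.css is out of sync with the design tokens JSON; "
        "regenerate the DESIGN TOKENS block"
    )


# Inline stand-ins for the JSON file and for styles.css.
tokens = {
    "primary-border-color": {"$value": "#800000", "$type": "color"},
    "primary-highlight-color": {"$value": "#FBC100", "$type": "color"},
}
styles_css = expected_block(tokens) + "\n\n/* the rest of our CSS here */\n"

check_tokens_in_sync(styles_css, tokens)  # passes: the block matches the tokens
```

The comparison is an exact string match, so it is sensitive to whitespace. That seems acceptable here because the block is never edited by hand.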

There are also slightly simpler solutions in this case, like using a CSS import if you don’t mind having multiple CSS files.

Conclusion

For all the examples above, the solutions I’ve presented might not work perfectly for your context. You might also want to draw the line at a different place to me. But my main point is that we don’t have to go all the way with a fully automatic derive solution to eliminate any manual syncing. Having some manual work plus a mechanism to test that two things are in sync is a perfectly legitimate solution, and it can avoid some of the large costs that come with structuring everything around “derive”.
