Life on the diagonal – adventures in 2-D time

by Luke Plant

Posted in:

Software development

— July 19, 2021 12:59

In his post on Temporal Patterns, Martin Fowler writes:

We've all learned, if only from bad Science Fiction books, that time is the fourth dimension. The trouble is that this is wrong… Time isn't the fourth dimension, it's the fourth and fifth dimensions!

When I first read this, I bookmarked the page as interesting, but had no practical need for the pattern described, until 5 years later, when it became indispensable in some work for a client. If you have a system that needs 2-D time but doesn't have it, you will become as lost and confused as someone who attempts to navigate a city with a 1-D map.

My aim in this post is to make this excellent pattern more well known, and to give further examples of the pattern in action.

The setup

There are many ways this pattern can manifest itself or be used, and Martin Fowler's article has some examples where it is essential. I'm going to use a slightly different example.

Let's say we are a business with customers, and we track the amount of money that they have spent on various services in an account for each customer. In the examples that follow I'll assume it's an email hosting company, that provides several plans and other add-on services.

The ledger solution

We might have a simple log of every time money gets spent. But problems arise when we need to make corrections of some kind.

Every bank account I've come across, at least from an external perspective, handles this by not really handling it. Corrections to existing entries are not allowed. So if you challenge a bank fee, and you win, the result will be a reversal that adds a new item, cancelling out part or whole of the fee, and restoring the correct balance (hopefully). I'll call this the “ledger” approach.

The immutable ledger has some compelling advantages, especially for the implementer, but sometimes it just doesn't cut it. In particular, it means that the balance is actually the only thing you can correct. You cannot correct the date or amount of any item. So, if you are trying to present to your customer an account that shows you haven't fiddled them, it's not hugely useful. They cannot simply check that you charged them the right amount for each item – they have to do their own list of all item amounts, then do some sums to check that the listed item amounts plus the adjustment amounts add up to what they expect. Or you will have to add those calculations for them in a plain text description field, which is hard to check.

Can we do better?

We could allow destructive updates in order to make corrections. However, we may have already sent notification emails or done automatic charges to a credit card based on the old amounts. If we simply change the amounts, things are not going to add up, and we'll get ourselves into hot water pretty quickly. For that situation, we'd like to know both what we thought the value was, and what we think it now is – and possibly any number of corrections.

Instead of adding hack after hack to get closer to the solution, I'll present the answer, which is 2-D time. Then we'll see how this one conceptual jump, which at first is a bit mind-boggling, suddenly makes almost impossibly complicated sums extremely straight-forward.

The calendar solution

Instead of an immutable ledger, we are going to have a fully mutable calendar that we can add and remove events from and alter as we wish. However, the calendar is also fully versioned, so that we don't have to do destructive updates and throw away any information.

This system will give us a bi-temporal view of the changing amount of money in each customer's account:

The first dimension is “actual” time, or “event” time. This is like the time that would appear in a ledger, if we had one – the time at which money is spent.
The second dimension is “record” time – the time at which we make a change to the calendar. We can also call it “knowledge” time.

Now, it is possible in some cases that we will hide one or even both of these times in normal operations. When you charge a customer for a service, you might have an interface that asks simply for the amount, which you charge immediately — meaning that you set both the event time and the record time to “right now”. But it is still important to understand that these are two distinct times that are in fact in completely different dimensions, and are completely incomparable to each other.

Data storage

This section is strictly optional – the remainder of the post doesn't depend on understanding how we store and query the data – but it can help some people to have a more concrete understanding.

There are multiple ways that we can store this 2-D calendar, but I'll present a simple, flexible one that can also be made fairly efficient:

Every event that goes on the calendar needs an ID of some kind. It mustn't clash with any other event, obviously, but the same ID must be used for each different version of the event if you make changes.

In our example, charges in our calendar could relate to different tables – for example the payments table and the subscriptions table – so we're going to use a string ID with different forms for different related tables. For example:
- For incoming payments from the customer, I'm going to use payment-$P where $P is the ID from the payments table.
- For deductions against “subscriptions”, I'm going to use subscription-$S-month-$N for the event ID, where:
  - $S is the ID in the subscription table, which is a table of all the services all users have subscribed to.
  - $N is an increasing integer, 1 for the first month etc.
We need columns to record the amount, the charged_at timestamp of when the deduction was made for the service, and the recorded_at timestamp of when we put the data into the database.
We will use the recorded_at timestamp to distinguish between different versions of the same event.
We will store this data in an append-only table with no destructive updates (UPDATE or DELETE). This gives us the guarantee of being able to retrieve earlier versions of the calendar.

We achieve “edit” by adding a new entry with the same event ID, but different details. The row with the most recent recorded_at “wins” in terms of representing current data.

We can do deletions by adding a new record with the same event_id as an existing one, with the charged_at timestamp set to NULL, indicating the event is no longer on our calendar. (An alternative would be to just set the amount to zero, but that would disallow zero amount items from being visible, which we might want).

Here is an example in which we add:

a payment
the first month's deduction for “Basic email plan”
an amendment in which we reduce the amount to give them the discounted price which wasn't applied in the first month for some reason.
the second month's deduction.

`id`	`customer`	`event_id`	`recorded_at`	`charged_at`	`amount`	`description`
1	1	payment-1	2021-01-09	2021-01-09	100	Credit card payment
2	1	subscription-123-month-1	2021-01-10	2021-01-10	-10	Basic email plan
3	1	subscription-123-month-1	2021-01-25	2021-01-10	-8	Basic email plan (discounted)
4	1	subscription-123-month-2	2021-02-10	2021-02-10	-8	Basic email plan (discounted)

Note:

Typically I'd use a full timestamp, not just date, for both time columns above.
Event ID subscription-123-month-1 appears twice, distinguished by different recorded_at timestamps. The row with id 3 is an amendment to the row with id 2, and supersedes it.
The charged_at timestamp value might be the same as recorded_at, but it might not be.
The id column here serves no purposes, other than a simple primary key for items in the table. We should be able to add a unique constraint on (event_id, recorded_at)

We might also want some additional columns:

nullable foreign key fields relating to the subscriptions/payments table etc., to allow easier filtering if needed. This can be important for performance and scalability too.
some auditing fields which will include the user or higher-level actions that triggered the addition or change to the calendar.

Edit: as an alternative implementation strategy, you might consider "Temporal Tables", a SQL 2011 extension that is available in some database systems such Maria DB, SQL Server, as a PostgreSQL extension and probably others. Thanks to Adam Johnson for the pointer. There are also databases like XTDB with built-in support for bitemporality.

Querying

There are different things we might want to do with this data, but here are some common things that come up:

We need to get a single version for each event. We can do this easily with window functions and probably other techniques. Here is a simple query that will get the most recent version of the calendar, which is typically what you care most about. This code should work in SQLite (tested), PostgreSQL and probably others, but may not be the most efficient way.

SELECT DISTINCT
  event_id,
  first_value(amount) OVER w AS amount,
  first_value(charged_at) OVER w AS charged_at,
  first_value(description) OVER w AS description
FROM calendar
WINDOW w AS (PARTITION BY event_id ORDER BY recorded_at DESC)
ORDER BY charged_at;

Results:

`event_id`	`amount`	`charged_at`	`description`
payment-1	100	2021-01-09	Credit card payment
subscription-123-month-1	-8	2021-01-10	Basic email plan (discounted)
subscription-123-month-2	-8	2021-02-10	Basic email plan (discounted)

This query doesn't exclude null events, however – that would need to be done by wrapping in an outer query, or with application level filtering.

We might want to get old versions of the calendar. This can be done by adding a WHERE recorded_at < ... clause to the above query

A 2-D calendar

Let's put the example I detailed above onto a 2-D visualisation. To re-iterate for those that skipped that section:

we had an incoming payment of $100 on 2021-01-09
we had a deduction of $10 on the next day, 2021-01-10
on 2021-01-25, a few weeks later, we adjusted the previous deduction to $8
we had a further deduction of $8 on 2021-02-10

We'll represent the calendar itself vertically. Successive new versions of the calendar, corresponding to changes in the knowledge dimension of time, are represented horizontally.

Vertical time

When a customer comes onto their account page and wants to see a list of charges, we'll present something like the last column of the above diagram – a simple list of charges ordered by date, just as you would have for the ledger system. The dates shown are the ones down the left hand side, in “event time” — we have just a simple vertical slice at a single horizontal coordinate.

Horizontal time

Horizontal movements in time provide a solid basis for all auditing features. In auditing, we are asking “when did the calendar change?”. We might be interested in all changes, changes to a specific event, or changes across a specific horizontal band. The dates that appear in our auditing report are the dates across the top of the above diagram, only some of which correspond to the dates of charges made.

Diagonal time

Things get more interesting here!

Suppose we have a monthly statement email that shows the change in balance over last month, possibly itemised.

In the ledger system, this is simple. Amendments to previous months are not allowed, so we just report on what happened over the last month, and everything will add up relative to the previous month's statement.

For the calendar solution, however, it's not quite so simple.

Suppose the customer has challenged something we charged in a previous month, and we agreed to reduce or remove the charge. But, for the sake of a fuller example, let's make things a bit harder.

Let's say we offer a “pay up front discount” if you pay a year in advance for an email plan. This feature is itself further complicated by the fact that customers have other services being charged from the same account, and the possibilities of debts etc. The upshot is that we calculate the discount on a monthly basis using a method that depends on the current amount in credit.

So, if we amend a previous entry, later amounts also need to be changed. We might have a more or less sophisticated or automatic method for handling the dependencies and updates, but whatever we do, we now have several entries that have been amended.

If we base our monthly statement email just on the most recent version of the calendar at the time we send it, the result will be confusing. Let's see an example:

The email statement for Month 1 said:

initial balance: $0
payment received: $100
service X: -$50
email plan: -$10
final balance: $40

Then the phone call happened where we agreed to cancel the charge for service X, refunding them $50, plus reducing the email plan to $9 due to our pay up front discount which they now qualify for, for a total of $51 refund.

For Month 2, if we just create the statement based on month 2 in the most recent version of the calendar, it will read:

initial balance: $91
email plan: -$9
final balance: $82

Here, the “final balance” is now correct, but the “initial balance” is confusing, since it isn't the same as the “final balance” of the previous email, and there is no record here at all of the refund, which in some sense “happened” in month 2.

Have we made things harder for ourselves? Slightly, but we've opened up more possibilities too. The trick is to answer the question of “when” the refund happened using 2-D time, not 1-D time.

The refund was done as a set of amendments to existing entries, which means it was a “refinement of knowledge” that happened in “knowledge time”. For this reason, the refund doesn't appear as an event on the calendar itself. It affected two events on the calendar, so the refund was actually done at two different points in “event time”.

Let's see the two dimensions on a diagram:

When we send out the email for month 2, the key point to understand is that since we last sent the statement out, for the end of month 1, we have moved in two dimensions of time, not one:

The “event time” on the calendar (vertical), has moved on by one month, introducing a new charge to be included in the balance.
But knowledge time (horizontal) has also moved on by one month. We know things that we didn't know the previous month, or know things more accurately (e.g. that we shouldn't have charged for service X).

In other words we have made a diagonal jump. Once we have this understanding, constructing a correct email to send out becomes actually very straightforward. We have to compare the state of affairs at (end of month 2, end of month 2) with (end of month 1, end of month 1).

Once we do this:

we can easily calculate the “initial balance” of $40 – it's just the balance at (end of month 1, end of month 1).
we calculate the final balance at (end of month 2, end of month 2), giving $82 as expected.
we can give an itemised breakdown of the difference between the initial balance and the final balance. It is composed of 2 sections:
- new entries (which will be for month 2).
- any amendments for previous months, for which we can give complete details of the changes (one charge completely refunded, one charge partly refunded in this case).

Now, of course, for the simple and more common case of “no amendments", we don't display the amendments section at all, and everything appears just as simple as if we were using a ledger approach.

Understanding diagonal time

A more fine-grained diagonal view is possible. Suppose a customer comes onto their account page every day, or even every hour, and makes a note of their balance. As charges and amendments are made, they will see the balance go up and down accordingly. They can make a plot of this over time, and the time axis here will in fact be a diagonal slice through our two time dimensions – they are sampling the 2-D calendar at points where “event time” = “record time”.

Alternatively, they could make a plot of the balance by looking once at the end of the year, using the list of charge amounts and dates. This will show only the amended data, and the plot they make will correspond to a vertical cut, going down “event time”, with the “record time” coordinate set to “most recent knowledge”. This plot will be different from the first if there have been any amendments, but will always end at the same balance.

The diagonal movement in 2-D time actually corresponds to the way we live our lives – it could be called “experiential time”.

Our experience of life consists of two sets of changes:

things that happen to us as time progresses,
changes in knowledge about things that have happened to us.

So, suppose I ask you what kind of week you've had. Your response might include:

things that have happened to you this last week.
things that actually happened to you before, but you only found out about this week. Perhaps:
- news about a nice payout from an investment that matured several months ago, but you only got notified of this week.
- new information about a childhood experience that sheds a completely different light on it, for good or bad, and causes you to reassess some important relationships etc.
These things could legitimately be thought of as things that have happened to you ”this week”, because of their effects on your experience this week. We live life on the diagonal.

Handling the future

We can use the same calendar to track expected or planned expenditure in the future, and so we can use it to do forecasts, for example.

In the ledger approach this would be very tricky – since you can't amend anything, you want to be careful not to put anything on that you are unsure of. But with the calendar, amendments are no problem at all. We can start with very incomplete and uncertain knowledge, and refine it over time, and that is no problem. If you are summing up expenditure to produce a balance, you will presumably want to exclude anything with a future date as “speculative”, but having it in the same calendar database table is not an issue.

Other tips and pointers

Once you have this system in place, in many cases we can actually model bug fixes as “refinements in knowledge”, which means that rolling out a change which corrects many incorrectly calculated charges is not a headache – it's just updating the calendar in the normal way.
Having an explicit understanding of the two dimensions can be helpful for requirements analysis and eliminating the impossible.

For example, when implementing refund requests, or certain kinds of refund requests, instead of amending existing entries, we could have new ones. This will make our calendar work more like a ledger (while still having our calendar super-powers available for some kinds of amendments). This may be a legitimate decision. However, a single type of refund cannot be both an amendment to existing events and an event in its own right. You have to choose!
You may need to think about whether other tables in your system need the full 2-D treatment, or whether it will be localised to just the calendar table.

Conclusion

Personally, I found this temporal pattern both mind-bending (in an enjoyable way), and indispensably useful in some situations, once I got my head around it. In the same way I hope you'll find 2-D time both fun and profitable!

Credits

This post brought to you by:

ditaa for SVG diagrams.
Nikola's nice plugin system that enabled me to make use of ditaa without leaving my beloved Emacs!

Footnotes

To be absolutely clear – neither Martin Fowler nor I am saying that time is really two dimensional, but that it can be helpful to view it like that. We might go further and ask, “Can you have more that two time dimensions?”.

To answer that, let's consider the base case – a database with no time dimensions. It might be a table of facts about the colours of different things:

Thing	Colour
Grass	Green
Sky	Blue

This is entirely non-temporal. I could then add a recorded_timestamp column:

Thing	Colour	Timestamp
Grass	Green	2021-07-19
Sky	Blue	2021-07-20

This reads as, "On Monday, I learnt that grass is green. On Tuesday, I learnt that the sky is blue”. This schema also allows us to have contradictory facts if they are interpreted as storing what I believed to be true on different dates. This is a temporal recording of non-temporal data.

I could then go 2-D by having a temporal recording of temporal data - for example, if I recorded the weather on certain days:

Day	Weather	Timestamp
2021-07-19	Rainy	2021-07-19
2021-07-20	Sunny	2021-07-20
2021-07-19	Mostly rainy	2021-07-20

Here, the information itself relates to certain times, and the timestamp at which I learn about it is a different dimension to the timestamp in the data itself. In the example above, on Monday I recorded the weather that day was rainy, but on Tuesday I found out that wasn't entirely true, and recorded that the weather on Monday was mostly rainy.

Can we go further? Well, suppose we have another system that takes snapshots of the table above, recording each against a timestamp. This would be a temporal recording of 2-D temporal data, so it would be 3-D temporal data. And you can keep going of course. However, it might be difficult to find a use for that kind of thing...

You may also like: §

Comments §