The curse of scalable technology

by Luke Plant

Posted in:

Software development

— December 2, 2021 13:55

The scalability and breadth of usefulness of some tools in the software development world are huge. Relational databases spring to mind as one example: at the low end, a database as small as one table with a few dozen rows can be useful, while at the higher end millions or billions of rows in thousands of tables is not unusual, and you might choose the same technology for both.

A more obvious example, perhaps, is programming languages. Code bases can be anything from a handful of lines for simple scripts, to millions of lines, and many languages scale to both these extremes. There are many more examples in the programming world of libraries and tools that scale like this.

This is an amazing success, but it brings with it some issues. In the software development world, you can be rubbing shoulders online with people working at very different scales to you, and in very different contexts. That can lead to:

endless debates where we talk past each other, each convinced that our experience and expertise (which may be real) qualify us to tell other people they are all doing it wrong.
copying inappropriate solutions, resulting in over-engineering (or under-engineering, but I suspect that happens less), due to being unaware of the context in which solutions really make sense.
projects that get side-tracked trying to please everyone, where a more focused approach would have been better. The same applies when choosing how to develop our own expertise.
becoming arrogant and dismissive of others when you see what looks like obvious incompetence, due to being unaware of the context in which someone else’s code or decisions do in fact make sense. (That’s not to say that there is no incompetence; there might be plenty. But you may or may not be competent to judge that!)

Jamie Brandon touched on the subject of “Context Matters” in a recent post. On that theme, I thought it might be useful to list some of the different ways in which your context might be different from that of other people you might be interacting with on the internet.

In no particular order, important aspects of development context include:

Size of code base.
Size of development team.
Size of the larger organisation that your software development team exists within or relates to.
Nature of hierarchies within the organisation running your software development.
Organisation type – e.g. just for fun, charity, government, non-profit, for profit, open source community (of many different kinds) etc.
Business domain e.g. medicine, law, finance, physics, gaming etc.
Speed at which you must react to market changes. Especially, the speed at which you might need to scale operations, or develop new features.
Need for or use of specialists vs generalists e.g. for web development, you have “full stack” developers at one end of the scale, vs specialists in databases, backend application code (which may have multiple layers), frontend application code, design, UI/UX etc.
Priority given to avoiding or fixing faults. Consider critical medical systems, for example, compared with an experimental game you are developing for fun in public.
Priority given to avoiding regressions. This can be in direct tension with the above, due to Hyrum's Law.
Priority of getting the best possible performance. Performance could be measured in terms of memory usage, CPU time, network latency, network bandwidth etc. For some projects, a 1% improvement is considered a very worthy goal; for others, solutions that are thousands of times slower than optimal could be absolutely fine, and a factor of 10 improvement might not even be worth it.
Kind of users your software has. They could be other developers like you, or end users with a massive range of ability with computers.
Nature of your relationship with the software’s user. This can range from:
- you are the user
- you work for the user and are paid by them
- you work on a team with the user, and are both paid by the same people
- you don’t know the user directly at all, but are nonetheless paid by them (indirectly).
- you don’t know the user and they don’t pay for the software even indirectly.
- probably many other variations.
Number of users of your software – this could easily range over 9 orders of magnitude.
Amount of data you process – again, a massive range.
Relative value to your “business” of each customer.
To what degree you know what hardware your software will run on.
How close you are to hardware limits (compare embedded systems to most desktop software, for example).
How many different environments (e.g. operating systems) your software needs to work in.
Extensibility requirements – from “none”, through “extendable by developers”, to “extendable by end users” (for example with embedded scripting engines).
Long-term maintenance needs. Some projects are kind of “throw-away”, and some see few changes after a “release”, e.g. some (but not all) games. On the other hand, some software will be maintained for decades, sometimes being heavily changed after initial development, others less so.
Need for new developers and maintainers. In some projects, you’ll have a pretty static team; others will see a huge amount of turnover.
Ability to find more developers and maintainers. Some projects have no money at all, while for others money is no object. Some may attract new developers by virtue of being “fun”, others may not be able to.
Backwards compatibility needs with external dependencies. Differences here will make a large difference in how much design you need to do up front.

All of the above are dimensions in which software projects live. Many of them, on their own, make profound differences to technology decisions. So considered together, the space is truly vast. You could be “next door” to someone in 10 of these dimensions, but miles away in some others.

(Obviously some of these dimensions have strong correlations with each other, e.g. large code bases are usually managed by larger teams, which reduces the populated possibility space, but there are probably many exceptions to whatever correlations you assume.)

And there are probably many more dimensions I’m not aware of.
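To make the “next door in some dimensions, miles away in others” idea a little more concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the `ProjectContext` type, the four dimensions chosen, the example numbers, and the arbitrary threshold of two orders of magnitude for counting as “far apart”. Most of the dimensions listed above are not even numeric, so this only covers the easy cases.

```python
import math
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ProjectContext:
    """Hypothetical, heavily simplified context: four rough counts."""
    lines_of_code: int
    team_size: int
    users: int
    data_bytes: int


def gaps_in_orders_of_magnitude(a: ProjectContext, b: ProjectContext) -> dict[str, float]:
    """Per-dimension distance on a log10 scale (1.0 means a factor of ten)."""
    return {
        f.name: abs(math.log10(getattr(a, f.name)) - math.log10(getattr(b, f.name)))
        for f in fields(a)
    }


# Two made-up projects: similar code size and team size, very different reach.
mine = ProjectContext(lines_of_code=50_000, team_size=3, users=2_000, data_bytes=100_000_000)
theirs = ProjectContext(lines_of_code=80_000, team_size=4, users=200_000_000, data_bytes=10**15)

gaps = gaps_in_orders_of_magnitude(mine, theirs)

# Arbitrary cut-off: a factor of 100 or more counts as "miles away".
far_apart = [name for name, gap in gaps.items() if gap >= 2]
print(far_apart)
```

The two projects look like neighbours if you only compare code size and team size, yet they differ by several orders of magnitude in users and data, which is exactly the situation where advice from one may not transfer to the other.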

Trying to work out how much advice can be generalised is extremely hard. This is compounded by the fact that people who have experience in a lot of different projects often do not have in-depth knowledge, or knowledge spanning a really long time period. I know from experience that the conclusions I’ve come to after 2 or 3 years on a project are different from those I had after 1 year, and they might change again after 5 or 10 years. So it may be that the most experienced people (judging by breadth) are actually the least qualified to advise others, due to lack of depth – but also the least aware of that!

And then you have the problem that many people with a lot of experience are pretty silent about it, and you have no idea how many of them there are (because they are not vocal about their existence either!). Further, the most vocal might not be the best qualified to help with your situation. For example, I know from at least 2 data points that it’s entirely possible to run a multi-million dollar business that has a main database containing much less than 100 MB of data. But I don’t know how common that is, and I suspect you will probably hear a lot more from companies that have a massively different profit-to-data ratio.

When I think too much about this, I feel I am stuck between two extremes: wild and unjustified extrapolations from the tiny bit of experience I have gained so far on the one hand, and failing to learn from anything on the other. The latter seems much worse – none of us would be alive today if we reasoned “well just because a lion ate my friend, it would be unscientific and unjustified to jump to the conclusion that this lion might eat me”.

So I think the right path is something like this:

Try to generalise from your experiences, but don’t hold your opinions too strongly.
Listen to other people’s conclusions, but try to learn as much as you can about the context that formed them.
See the value in expertise and approaches that have a limited scope of application.

Any other ideas?
