Luke Plant's home page (Posts about Haskell)

We need less powerful languages

2015-11-14T11:46:01Z

Translations of this post (I can't vouch for their accuracy):

Japanese

Many systems boast of being ‘powerful’, and it sounds difficult to argue that this is a bad thing. Almost everyone who uses the word assumes that it is always a good thing.

The thesis of this post is that in many cases we need less powerful languages and systems.

Before I get going, there is very little original insight in this post. The train of thought behind it was set off by reading Hofstadter’s book Gödel, Escher, Bach – an Eternal Golden Braid which helped me pull together various things in my own thinking where I’ve seen the principle in action. Philip Wadler’s post on the rule of least power was also formative, and most of all I’ve also taken a lot from the content of this video from a Scala conference about everything that is wrong with Scala, which makes the following fairly central point:

Every increase in expressiveness brings an increased burden on all who care to understand the message.

My aim is simply to illustrate this point using examples that might be more accessible to the Python community than the internals of a Scala compiler.

I also need a word about definitions. What do we mean by “more powerful” or “less powerful” languages? In this article, I mean something roughly like this: “the freedom and ability to do whatever you want to do”, seen mainly from the perspective of the human author entering data or code into the system. This roughly aligns with the concept of “expressiveness”, though not perhaps with a formal definition. (More formally, many languages have equivalent expressiveness in that they are all Turing complete, but we still recognise that some are more powerful in that they allow a certain outcome to be produced with fewer words or in multiple ways, with greater freedoms for the author).

The problem with this kind of freedom is that every bit of power you insist on having when writing in the language corresponds to power you must give up at other points of the process – when ‘consuming’ what you have written. I’ll illustrate this with various examples which range beyond what might be described as programming, but have the same principle at heart.

We’ll also need to ask “Does this matter?” It matters, of course, to the extent that you need to be able to ‘consume’ a message you have put in. Different players who might ‘consume’ the message are software maintainers, compilers and other development tools, which means you almost always care – this has implications both for performance and correctness as well as human concerns.

Databases and schema

Starting at the low end of the scale in terms of expressiveness, there is what you might call data rather than language. But both “data” and “language” can be thought of as “messages to be received by someone”, and the principle applies to both.

In my years of software development, I’ve found that clients and users often ask for “free text” fields, often for “notes”. A free text field is maximally powerful as far as the end user is concerned – they can put whatever they like in. In this sense, this is the “most useful” field – you can use it for anything.

But precisely because of this, it is also the least useful, because it is the least structured. Even search doesn’t work reliably because of typos and alternative ways of expressing the same thing. The longer I do software development involving databases, the more I want to tightly constrain everything as much as possible. When I do so, the data I end up with is massively more useful. I can do powerful things when consuming the data only when I severely limit the power (i.e. the freedom) of the agents putting data into the system.

In terms of database technologies, the same point can be made. Databases that are “schema-less” give you great flexibility and power when putting data in, and are extremely unhelpful when getting it out. A key-value store is a more technical version of “free text”, with the same drawbacks – it is pretty unhelpful when you want to extract info or do anything with the data, since you cannot guarantee that any specific keys will be there.
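As a small illustration of this trade-off, here is a sketch using invented data: the same information recorded once as free-text notes and once in a constrained field. The names `Status` and `Order` and the sample notes are assumptions made up for the example, not taken from any real system.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # The only values an author is allowed to enter (hypothetical).
    PAID = "paid"
    REFUNDED = "refunded"


@dataclass(frozen=True)
class Order:
    order_id: int
    status: Status


# Free text: the author can write anything, so the consumer can rely on nothing.
free_text_notes = ["Paid", "payed in full", "refund issued", "REFUNDED (partial?)"]

# Constrained field: less freedom when writing, reliable answers when reading.
orders = [
    Order(1, Status.PAID),
    Order(2, Status.PAID),
    Order(3, Status.REFUNDED),
    Order(4, Status.REFUNDED),
]

# A naive search over the free text misses the misspelt entry.
naive_paid_count = sum("paid" in note.lower() for note in free_text_notes)

# The same question asked of the structured data cannot miss anything.
status_counts = Counter(order.status for order in orders)
```

Here the search over the notes finds only one of the two paid orders, while the count over the constrained field is exact by construction.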

HTML

The success of the web has been partly due to the fact that some of the core technologies, HTML and CSS, have been deliberately limited in power. Indeed, you probably wouldn’t call them programming languages, but markup languages. This, however, was not an accident, but a deliberate design principle on the part of Tim Berners-Lee. I can’t do better than to quote him at length:

Computer Science in the 1960s to 80s spent a lot of effort making languages which were as powerful as possible. Nowadays we have to appreciate the reasons for picking not the most powerful solution but the least powerful. The reason for this is that the less powerful the language, the more you can do with the data stored in that language. If you write it in a simple declarative form, anyone can write a program to analyze it in many ways. The Semantic Web is an attempt, largely, to map large quantities of existing data onto a common language so that the data can be analyzed in ways never dreamed of by its creators. If, for example, a web page with weather data has RDF describing that data, a user can retrieve it as a table, perhaps average it, plot it, deduce things from it in combination with other information. At the other end of the scale is the weather information portrayed by the cunning Java applet. While this might allow a very cool user interface, it cannot be analyzed at all. The search engine finding the page will have no idea of what the data is or what it is about. This the only way to find out what a Java applet means is to set it running in front of a person.

This has become a W3C principle:

Good Practice: Use the least powerful language suitable for expressing information, constraints or programs on the World Wide Web.

Note that this is almost exactly the opposite of Paul Graham’s advice (with the caveat that ‘power’ is often too informally defined to compare):

if you have a choice of several languages, it is, all other things being equal, a mistake to program in anything but the most powerful one.

Python setup.py MANIFEST.in file

Moving up towards ‘proper’ programming languages, I came across this example – the MANIFEST.in file format used by distutils/setuptools. If you have had to create a package for a Python library, you may well have used it.

The file format is essentially a very small language for defining what files should be included in your Python package (relative to the MANIFEST.in file, which we’ll call the working directory from now on). It might look something like this:

include README.rst
recursive-include foo *.py
recursive-include tests *
global-exclude *~
global-exclude *.pyc
prune .DS_Store

There are two types of directive: include type directives (include, recursive-include, global-include and graft), and exclude type directives (exclude, recursive-exclude, global-exclude and prune).

There comes a question – how are these directives to be interpreted (i.e. what are the semantics)?

You could interpret them in this way:

A file from the working directory (or sub-directories) should be included in the package if it matches at least one include type directive, and does not match any exclude type directive.

This would make it a declarative language.
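To make the declarative reading concrete, here is a minimal sketch. It assumes a simplification of my own: directives are reduced to two lists of plain `fnmatch` glob patterns rather than the real directive syntax, and `declarative_manifest` is a hypothetical name.

```python
from fnmatch import fnmatch


def declarative_manifest(all_files, include_patterns, exclude_patterns):
    """Return the files matching at least one include and no exclude pattern.

    Each file is judged on its own, so the order of the patterns is irrelevant.
    """
    return sorted(
        path
        for path in all_files
        if any(fnmatch(path, pat) for pat in include_patterns)
        and not any(fnmatch(path, pat) for pat in exclude_patterns)
    )


files = ["README.rst", "foo/a.py", "foo/a.pyc", "tests/test_a.py", "tests/test_a.py~"]
selected = declarative_manifest(
    files,
    include_patterns=["README.rst", "foo/*", "tests/*"],
    exclude_patterns=["*~", "*.pyc"],
)
# selected == ["README.rst", "foo/a.py", "tests/test_a.py"]
```

Because the result is defined per file, shuffling either list of patterns cannot change it.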

Unfortunately, that is not how the language is defined. The distutils docs for MANIFEST.in are specific about this – the directives are to be understood as follows (my paraphrase):

Start with an empty list of files to include in the package (or technically, a default list of files).
Go down the directives in the MANIFEST.in in order.
For every include type directive, copy any matching files from the working directory to the list for the package.
For every exclude type directive, remove any matching files from the list for the package.

As you can see, this interpretation defines a language that is imperative in nature – each line of MANIFEST.in is a command that implies an action with side effects.

The point to note is that this makes the language more powerful than my speculative declarative version above. For example, consider the following:

recursive-include foo *
recursive-exclude foo/bar *
recursive-include foo *.png

The end result of the above commands is that .png files that are below foo/bar are included, but all other files below foo/bar are not. If I’m thinking straight, to replicate the same result using the declarative language is harder – you would have to do something like the following, which is obviously sub-optimal:

recursive-include foo *
recursive-exclude foo/bar *.txt *.rst *.gif *.jpeg *.py ...
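The order dependence of the imperative semantics can be sketched in a few lines. This is an illustration under simplifying assumptions, not the distutils implementation: directives are reduced to hypothetical `("include", pattern)` and `("exclude", pattern)` pairs matched with `fnmatch`.

```python
from fnmatch import fnmatch


def imperative_manifest(all_files, directives):
    """Apply simplified (action, pattern) directives strictly in order."""
    selected = []
    for action, pattern in directives:
        if action == "include":
            # Copy matching files from the full list to the output list.
            for path in all_files:
                if fnmatch(path, pattern) and path not in selected:
                    selected.append(path)
        elif action == "exclude":
            # Remove matching files from the output list built so far.
            selected = [path for path in selected if not fnmatch(path, pattern)]
        else:
            raise ValueError(f"unknown action: {action}")
    return sorted(selected)


files = ["foo/a.py", "foo/bar/notes.txt", "foo/bar/logo.png"]

# Include everything, drop foo/bar, then add the .png files back.
result = imperative_manifest(
    files,
    [("include", "foo/*"), ("exclude", "foo/bar/*"), ("include", "foo/*.png")],
)
# result == ["foo/a.py", "foo/bar/logo.png"]

# The same three directives with the exclude moved to the end.
reordered = imperative_manifest(
    files,
    [("include", "foo/*"), ("include", "foo/*.png"), ("exclude", "foo/bar/*")],
)
# reordered == ["foo/a.py"]
```

The same set of directives gives two different packages depending purely on their order.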

So, because the imperative language is more powerful, there is a temptation to prefer that one. However, the imperative version comes with significant drawbacks:

It is much harder to optimise.

When it comes to interpreting the MANIFEST.in and building a list of files to include in the package, one fairly efficient solution for a typical case is to first build an immutable list of all files in the directory and its sub-directories, and then apply the rules: addition rules involve copying from the full list to an output list, and subtraction rules involve removing from the output list. This is how the Python implementation currently does it.

This works OK, unless you have many thousands of files in the full list, most of which are going to get pruned or not included, in which case you can spend a lot of time building up the full list, only to ignore most of it.

An obvious shortcut is to not recurse into directories that would be excluded by some exclude directive. However, you can only do that if the exclude directives come after all include directives.
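A sketch of what that shortcut could look like, assuming the excluded directories are already known to be safe to skip (that is, every exclude directive comes after every include directive). The function name `walk_with_pruning` and the choice to match on bare directory names are my own assumptions for illustration.

```python
import os
from fnmatch import fnmatch


def walk_with_pruning(root, pruned_dir_patterns):
    """Yield file paths under root, never descending into pruned directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to recurse into them.
        dirnames[:] = [
            name
            for name in dirnames
            if not any(fnmatch(name, pat) for pat in pruned_dir_patterns)
        ]
        for filename in filenames:
            yield os.path.relpath(os.path.join(dirpath, filename), root)
```

Calling `walk_with_pruning(".", [".tox"])`, for example, would never list the contents of a `.tox` directory at all, rather than listing thousands of files only to discard them afterwards.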

This is not a theoretical problem – I’ve found that doing setup.py sdist and other commands can take 10 minutes to run, due to a large number of files in the working directory if you use the tool tox for instance. This means that runs of tox itself (which uses setup.py) become very slow. I am currently attempting to fix this issue, but it is looking like it will be really hard.

Adding the optimised case might not look that hard (you can shortcut the file system traversal using any exclude directives that come after all include directives), but it adds sufficiently to the complexity that a patch is unlikely to be accepted – it increases the number of code paths and the chances of mistakes, to the point of it not being worth it.

It might be that the only practical solution is to avoid MANIFEST.in altogether and optimise only the case when it is completely empty.

The power has a second cost – MANIFEST.in files are harder to understand.

First, in understanding how the language works – the docs for this are considerably longer than for the declarative version I imagined.

Second, in analysing a specific MANIFEST.in file – you have to execute the commands in your head in order to work out what the result will be, rather than being able to take each line on its own, or in any order that makes sense to you.

This actually results in packaging bugs. For instance, it would be easy to believe that a directive like:
```
global-exclude *~
```
at the top of a MANIFEST.in file would result in any file name ending in ~ (temporary files created by some editors) being excluded from the package. In reality it does nothing at all, and the files will be erroneously included if other commands include them.

Examples I’ve found of this mistake (exclude directives that don’t function as intended or are useless) include:
- hgview (exclude directives at the top do nothing)
- django-mailer (global-exclude at the top does nothing)

Another result is that you cannot group lines in the MANIFEST.in file in any way you please, for clarity, since re-ordering changes the meaning of the file.

In addition, virtually no-one will actually use the additional power. I’m willing to bet that 99.99% of MANIFEST.in files do not make use of the additional power of the imperative language (I downloaded 250 and haven’t found any that do). So we could have been served much better by a declarative language here instead of an imperative one. But backwards compatibility forces us to stick with this. That highlights another point – it is often possible to add features to a language to make it more powerful, but compatibility concerns usually don’t allow you to make it less powerful, for example by removing features or adding constraints.

URL reversing

One core piece of the Django web framework is URL routing. This is the component that parses URLs and dispatches them to the handler for that URL, possibly passing some components extracted from the URL.

In Django, this is done using regular expressions. For an app that displays information about kittens, you might have a kittens/urls.py with the following:

from django.conf.urls import url

from kittens import views

urlpatterns = [
    url(r'^kittens/$', views.list_kittens, name="kittens_list_kittens"),
    url(r'^kittens/(?P<id>\d+)/$', views.show_kitten, name="kittens_show_kitten"),
]

The corresponding views.py file looks like:

def list_kittens(request):
    # ...

def show_kitten(request, id):
    # ...

Regular expressions have a capture facility built in, which is used to capture parameters that are passed to the view functions. So, for example, if this app were running on cuteness.com, a URL like http://www.cuteness.com/kittens/23/ results in calling the Python code show_kitten(request, id="23").

Now, as well as being able to route URLs to specific functions, web apps almost always need to generate URLs. For example, the kitten list page will need to include links to the individual kitten page i.e. show_kitten. Obviously we would like to do this in a DRY way, re-using the URL routing configuration.

However, we would be using the URL routing configuration in the opposite direction. When doing URL routing, we are doing:

URL path -> (handler function, arguments)

In URL generation, we know the handler function and arguments we want the user to arrive at, and want to generate a URL that will take the user there, after going through the URL routing:

(handler function, arguments) -> URL path

In order to do this, we essentially have to predict the behaviour of the URL routing mechanism. We are asking “given a certain output, what is the input?”

In the very early days Django did not include this facility, but it was found that with most URLs, it was possible to 'reverse' the URL pattern. The regex can be parsed looking for the static elements and the capture elements.
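To give a feel for what reversing involves, here is a deliberately simplified sketch. It is not Django's implementation: it assumes the pattern consists only of static text and named capture groups without nested parentheses, and `reverse_pattern` is a hypothetical helper.

```python
import re

# Matches a named capture group such as (?P<id>\d+), assuming the group body
# contains no nested parentheses.
NAMED_GROUP = re.compile(r"\(\?P<(\w+)>[^()]*\)")


def reverse_pattern(pattern, **kwargs):
    """Build a URL path from a simple route regex and capture values."""
    body = pattern
    if body.startswith("^"):
        body = body[1:]
    if body.endswith("$"):
        body = body[:-1]
    # Replace each capture group with the value supplied for its name.
    path = NAMED_GROUP.sub(lambda m: str(kwargs[m.group(1)]), body)
    # Check the result by running it back through the original pattern.
    if re.match(pattern, path) is None:
        raise ValueError(f"cannot reverse {pattern!r} with {kwargs!r}")
    return path


url_path = reverse_pattern(r"^kittens/(?P<id>\d+)/$", id=23)
# url_path == "kittens/23/"
```

The sketch only works because everything outside the capture groups is static text, which is exactly the restriction discussed in the rest of this section.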

Note, first of all, that this is only possible at all because the language being used to define URL routes is a limited one – regular expressions. We could easily have defined URL routes using a more powerful language. For example, we could have defined them using functions that:

take a URL path as input
raise NoMatch if they do not match
return a truncated URL and an optional set of captures if they do match.

Our kittens urls.py would look something like this:

from django.conf.urls import url, NoMatch

def match_kitten(path):
    KITTEN = 'kitten/'
    if path.startswith(KITTEN):
        return path[len(KITTEN):], {}
    raise NoMatch()

def capture_id(path):
    part = path.split('/')[0]
    try:
        id = int(part)
    except ValueError:
        raise NoMatch()
    return path[len(part)+1:], {'id': id}

urlpatterns = [
    url([match_kitten], views.list_kittens, name='kittens_list_kittens'),
    url([match_kitten, capture_id], views.show_kitten, name="kittens_show_kitten"),
]

Of course, we could provide helpers that make things like match_kitten and capture_id much more concise:

from django.conf.urls import url, m, c

urlpatterns = [
    url([m('kitten/')], views.list_kittens, name='kittens_list_kittens'),
    url([m('kitten/'), c(id=int)], views.show_kitten, name="kittens_show_kitten"),
]

Now, this language for URL routing is actually a lot more powerful than our regex based one, assuming that m and c are returning functions as above. The interface for matching and capturing is not limited to the capabilities of regexes – for instance, we could do database lookups for the IDs, or many other things.

The downside, however, is that URL reversing would be entirely impossible. For general, Turing complete languages, you cannot ask “given this output, what is the input?”. We could potentially inspect the source code of the function and look for known patterns, but it quickly becomes totally impractical.

With regular expressions, however, the limited nature of the language gives us more options. In general, URL configuration based on regexes is not reversible — a regex as simple as . cannot be reversed uniquely. (Since we want to generate canonical URLs normally, a unique solution is important. As it happens, for this wild card, Django currently picks an arbitrary character, but other wild cards are not supported). But as long as wild cards of any sort are only found within capture groups (and possibly some other constraints), the regex can be reversed.

So, if we want to be able to reliably reverse the URL routes, we actually want a language less powerful than regular expressions. Regular expressions were presumably chosen because they were powerful enough, without realising that they were too powerful.

Additionally, in Python defining mini-languages for this kind of thing is quite hard, and requires a fair amount of boilerplate and verbosity both for implementation and usage – much more than when using a string based language like regexes. In languages like Haskell, relatively simple features like easy definitions of algebraic data types and pattern matching make these things much easier.
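For comparison, here is a sketch of a route language that is weaker than regexes: a route is just a list of literal pieces and integer captures, described as data. Everything here (`Literal`, `IntCapture`, `match`, `reverse`) is hypothetical and only meant to show that matching and reversing can both be derived from the same description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    text: str


@dataclass(frozen=True)
class IntCapture:
    name: str


def match(route, path):
    """Return the captured values if path matches route, else None."""
    captures = {}
    for part in route:
        if isinstance(part, Literal):
            if not path.startswith(part.text):
                return None
            path = path[len(part.text):]
        else:
            # Consume a run of ASCII digits for an integer capture.
            digits = ""
            while path and path[0] in "0123456789":
                digits, path = digits + path[0], path[1:]
            if not digits:
                return None
            captures[part.name] = int(digits)
    return captures if path == "" else None


def reverse(route, **values):
    """Build the canonical path that match() maps back to values."""
    return "".join(
        part.text if isinstance(part, Literal) else str(int(values[part.name]))
        for part in route
    )


show_kitten_route = [Literal("kittens/"), IntCapture("id"), Literal("/")]
```

With this description, `match(show_kitten_route, "kittens/23/")` gives `{"id": 23}` and `reverse(show_kitten_route, id=23)` gives `"kittens/23/"`. The amount of boilerplate needed even for this tiny language is some evidence for the point about Python above.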

Regular expressions

The mention of regexes as used in Django’s URL routing reminds me of another problem:

Many usages of regexes are relatively simple, but whenever you invoke a regex, you get the full power whether you need it or not. One consequence is that for some regular expressions, the need to do backtracking to find all possible matches means that it is possible to construct malicious input that takes a huge amount of time to be processed by the regex implementation.
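The effect is easy to reproduce with a textbook 'evil' pattern. The sketch below uses nested quantifiers with Python's `re` module; the pattern and input sizes are my own choices for illustration, and the timings will vary by machine, so none are quoted here.

```python
import re
import time

# Nested quantifiers: there are exponentially many ways to split a run of
# "a"s between the inner and outer "+", and a failed match tries them all.
EVIL = re.compile(r"^(a+)+$")


def time_failed_match(n):
    """Return (match result, seconds taken) for n "a"s followed by a "b"."""
    text = "a" * n + "b"
    start = time.perf_counter()
    result = EVIL.match(text)
    return result, time.perf_counter() - start


for n in (12, 16, 20):
    result, seconds = time_failed_match(n)
    print(f"n={n:2d} matched={result is not None} seconds={seconds:.4f}")
```

Each extra character in the input roughly doubles the work for this pattern, so an input only a few dozen characters long is enough to tie up a process.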

This has been the cause of a whole class of Denial Of Service vulnerabilities in many web sites and services, including one in Django due to an accidentally 'evil' regex in the URL validator — CVE-2015-5145 (and one that took down the whole of Stack Exchange - update 2016-07-22).

A less powerful string matching language wouldn’t have these problems.

Django templates vs Jinja templates

The Jinja template engine was inspired by the Django template language, but with some differences in philosophy and syntax.

One major advantage of Jinja2 over Django is that of performance. Jinja2 has an implementation strategy which is to compile to Python code, rather than run an interpreter written in Python, which is how Django works, and this results in a big performance increase – often 5 to 20 times. (YMMV etc.)

Armin Ronacher, the author of Jinja, attempted to use the same strategy to speed up Django template rendering. There were problems, however.

The first he knew about when he proposed the project – namely that the extension API in Django makes the approach taken in Jinja very difficult. Django allows custom template tags that have almost complete control over the compilation and rendering steps. This allows some powerful custom template tags like addtoblock in django-sekizai that seem impossible at first glance. However, if a slower fallback was provided for these less common situations, a fast implementation might still have been useful.

However, there is another key difference that affects a lot of templates, which is that the context object that is passed in (which holds the data needed by the template) is writable within the template rendering process in Django. Template tags are able to assign to the context, and in fact some built-in template tags like url do just that.

The result of this is that a key part of the compilation to Python that happens in Jinja is impossible in Django.

Notice that in both of these, it is the power of Django’s template engine that is the problem – it allows code authors to do things that are not possible in Jinja2. However, the result is that a very large obstacle is placed in the way of attempts to compile to fast code.

This is not a theoretical consideration. At some point, performance of template rendering becomes an issue for many projects, and a number have been forced to switch to Jinja because of that. This is far from an optimal situation!

Often the issues that make optimisation difficult are only clear with the benefit of hindsight, and it isn’t true to say that simply adding restrictions to a language is necessarily going to make it easier to optimise. There are certainly languages which somehow manage to hit a “sour spot” of providing little useful power to either the authors or the consumers!

You might also say that for the Django template designers, allowing the context object to be writable was the obvious choice because Python data structures are typically mutable by default. Which brings us to Python...

Python

There are many ways that we could think about the power of the Python language, and how it makes life hard for every person and program that wants to make sense of Python code.

Compilation and performance of Python is an obvious one. The unrestricted effects that are possible at any point, including writable classes and modules etc., not only allow authors to do some very useful things, they make it extremely difficult to execute Python code quickly. PyPy has made some impressive progress, but looking at the curve from PyPy 1.3 onward, which shows diminishing returns, makes it clear that they are unlikely to make much bigger gains in the future. And the gains that have been made in terms of run time have often been at the expense of memory usage. There is simply a limit to how well you can optimise Python code.

(Please note, to all who continue reading this – I’m not a Python basher, or a Django basher for that matter. I’m a core developer of Django, and I use Python and Django in almost all my professional programming work. The point of this post is to illustrate the problems caused by powerful languages).

However, rather than focus on the performance problems of Python, I’m going to talk about refactoring and maintenance. If you do any serious work in a language, you find yourself doing a lot of maintenance, and being able to do it quickly and correctly often becomes very important.

So, for example, in Python, and with typical VCS tools (Git or Mercurial, for instance), if you re-order functions in a module e.g. move a 10 line function to a different place, you get a 20 line diff, despite the fact that nothing changed in terms of the meaning of the program. And if something did change (the function was both moved and modified), it’s going to be very difficult to spot.
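The effect can be seen with the standard library's `difflib`, which does the same kind of line-based comparison. The two-function module below is invented for the example.

```python
import difflib

before = """\
def first():
    return 1

def second():
    total = 0
    for i in range(10):
        total += i
    return total
""".splitlines()

# Exactly the same functions, with their order swapped.
after = """\
def second():
    total = 0
    for i in range(10):
        total += i
    return total

def first():
    return 1
""".splitlines()

diff = list(difflib.unified_diff(before, after, "before.py", "after.py", lineterm=""))

# Count changed lines, skipping the "---" and "+++" file headers.
removed = [line for line in diff if line.startswith("-") and not line.startswith("---")]
added = [line for line in diff if line.startswith("+") and not line.startswith("+++")]
```

Both versions contain exactly the same lines, yet the diff reports a block of removals and a matching block of additions, with nothing to say that they are the same code.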

This happened to me recently, and set me off thinking just how ridiculously bad our toolsets are. Why on earth are we treating our highly structured code as a bunch of lines of text? I can’t believe that we are still programming like this, it's insane!

At first, you might think that this could be solved with a more intelligent diff tool. But the problem is that in Python, the order in which functions are defined can in fact change the meaning of a program (i.e. change what happens when you execute it).

Here are a few examples:

Using a previously defined function as a default argument:

def foo():
    pass

def bar(a, callback=foo):
    pass

These functions can’t be re-ordered or you’ll get a NameError for foo in the definition of bar.
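This is easy to check, because default arguments are evaluated when the `def` statement runs. A minimal sketch that executes the re-ordered source and captures the error:

```python
reordered_source = """
def bar(a, callback=foo):
    pass

def foo():
    pass
"""

try:
    # The default value "foo" is looked up as soon as "def bar" executes,
    # which is before "def foo" has run.
    exec(reordered_source, {})
except NameError as exc:
    error = exc
else:
    error = None
```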

Using a decorator:

@decorateit
def foo():
    pass

@decorateit
def bar():
    pass

Due to unrestricted effects that are possible in @decorateit, you can’t safely re-order these functions and be sure the program will do the same thing afterwards.

Similarly, calling some code in the function argument list:

def foo(x=Something()):
    pass

def bar(x=Something()):
    pass
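The decorator case can be made concrete with a sketch. Here `decorateit` is given a hypothetical body that records each function in a module-level list, which is a fairly common pattern for plugin or handler registration.

```python
registry = []


def decorateit(func):
    # A side effect at definition time: record the function, in order.
    registry.append(func.__name__)
    return func


@decorateit
def foo():
    pass


@decorateit
def bar():
    pass

# registry == ["foo", "bar"]
```

Swapping the two definitions would leave every function unchanged but produce `["bar", "foo"]`, and any code that depends on the order of the registry would behave differently.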

Similarly, class level attributes can’t be re-ordered safely:

class Foo():
    a = Bar()
    b = Bar()

Due to unrestricted effects possible inside the Bar constructor, the definitions of a and b cannot be re-ordered safely. (This might seem theoretical, but Django, for instance, actually uses this ability inside Model and Form definitions to provide a default order for the fields, using a cunning class level counter inside the base Field constructor).
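The counter trick can be sketched as follows. This is loosely modelled on the idea only and is not Django's code; `Field`, `KittenForm` and `ordered_field_names` are simplified, hypothetical stand-ins.

```python
import itertools


class Field:
    # Shared by all instances; each construction takes the next number.
    _creation_counter = itertools.count()

    def __init__(self):
        self.creation_order = next(Field._creation_counter)


class KittenForm:
    name = Field()
    age = Field()


def ordered_field_names(cls):
    """Return Field attribute names sorted by when each Field was created."""
    fields = [
        (value.creation_order, name)
        for name, value in vars(cls).items()
        if isinstance(value, Field)
    ]
    return [name for _, name in sorted(fields)]

# ordered_field_names(KittenForm) == ["name", "age"]
```

Swapping the `name` and `age` lines changes the reported field order, so a tool that re-ordered class attributes would silently change the program's behaviour.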

Ultimately, you have to accept that a sequence of function statements in Python is a sequence of actions in which objects (functions and default arguments) are created, possibly manipulated, etc. It is not a re-orderable set of function declarations as it might be in other languages.

This gives Python an amazing power when it comes to writing it, but imposes massive restrictions on what you can do in any automated way to manipulate Python source code.

Above I used the simple example of re-ordering two functions or class attributes. But every single type of refactoring that you might do in Python becomes virtually impossible to do safely because of the power of the language e.g. duck typing means you can’t do method renames, the possibility of reflection/dynamic attribute access (getattr and friends) means you can’t in fact do any kind of automated renames (safely).
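A small sketch of why renames are unsafe, using an invented `Kitten` class: the method names below only ever exist as strings assembled at run time.

```python
class Kitten:
    def purr(self):
        return "purr"

    def meow(self):
        return "meow"


def make_sound(kitten, mood):
    # The method name only exists as a string built at run time, so a tool
    # renaming Kitten.purr has no reliable way to find this use of it.
    method_name = {"happy": "pu" + "rr", "hungry": "me" + "ow"}[mood]
    return getattr(kitten, method_name)()
```

If a refactoring tool renamed `purr` to `rumble`, `make_sound(Kitten(), "happy")` would keep passing any static check and then fail with an `AttributeError` when it is eventually called.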

So, if we are tempted to blame our crude VCS or refactoring tools, we actually have to blame the power of Python – despite the huge amount of structure in correct Python source code, there is very little that any software tool can do with it when it comes to manipulating it, and the line-based diffing that got me so mad is actually a reasonable approach.

Now, 99.99% of the time, we don’t write Python decorators which mean that the order of function definitions makes a difference, or silly things like that – we are responsible “adults”, as Guido put it, and this makes life easier for human consumers. But the fact remains that our tools are limited by what we do in the 0.01% of cases. For some consumers, we can optimise on the basis of the common case, and detect when that fails e.g. a JIT compiler using guards. But with others e.g. VCS or refactoring tools, the “runtime” information that you hit the unlucky case comes far too late – you might have released your subtly-broken code by the time you find out, so you have to be safe rather than sorry.

In an ideal world, with my dream language, when you rename a function, the entire “diff” in your VCS should simply be “Function foo renamed to bar”. (And, this should be exportable/importable, so that when you upgrade a dependency to a version in which foo is renamed to bar, it should be exactly zero work to deal with this). In a “less powerful” language, this would be possible, but the power given to the program author in Python has taken power from all the other tools in the environment.

Does this matter? It depends on how much time you spend manipulating your code, compared to using code to manipulate data.

At the beginning of a project, you may be tempted to desire the most powerful language possible, because it gives you the most help and freedom in terms of manipulating data. But later on, you spend a huge amount of time manipulating code, and often using an extremely basic tool to do so – a text editor. This treats your highly structured code as one of the least structured forms of data — a string of text – exactly the kind of manipulation you would avoid at all costs inside your code. But all the practices you would choose and rely on inside your program (manipulating all data inside appropriate containers) are no longer available to you when it comes to manipulating the program itself.

Some popular languages make automated refactoring easier, but more is needed: to actually make use of the structure of your code, you need an editor and VCS that understand your code properly. Projects like Lamdu, Unison and isomorf.io are in the right direction, but still in their infancy, and unfortunately involve re-thinking the entire software development stack :-(

Summary

When you consider the total system and all the players (whether software or human), including the need to produce efficient code, and long term maintainability, less powerful languages are actually more powerful – “slavery is freedom”. There is a balance between expressiveness and reasonability.

The more powerful a language, the greater the burden on software tools, which either need to be more complicated in order to work, or are forced to do less than they could. This includes:

compilers – with big implications for performance.
automated refactoring and VCS tools – with big implications for maintenance.

Similarly, the burden also increases for humans – for anyone attempting to understand the code or modify it.

A natural instinct is to go for the most powerful solution, or a solution that is much more powerful than is actually needed. We should try to do the opposite — find the least powerful solution that will do the job.

This won’t happen if creating new languages (which might involve parsers etc.) is hard work. We should prefer software ecosystems that make it easy to create very small and weak languages.

Addendum

Similar posts I've found since writing this:

Maximally powerful, minimally useful

You can’t compare language features, only languages

2014-11-11T09:51:43Z

A lot of programming language debate is of the form “feature X is really good, every language needs it”, or “feature X is much better than its opposite feature Y”. The classic example is static vs dynamic typing, but there are many others, such as different types of meta-programming etc.

I often find myself pulled in both directions by these debates, as I’m rather partial to both Haskell and Python. But I’d like to suggest that doing this kind of comparison in the abstract, without talking about specific languages, is misguided, for the following reasons:

Language features can take extremely different forms in different languages

In my experience, static typing in Haskell is almost entirely unlike static typing in C, and different again from C# 1.0, and, from what I can tell, very different from static typing in C# 5.0. Does it really make sense to lump all these together?

Similarly, the dynamic type systems of Shell script, PHP, Python and Lisp are perhaps more different than they are alike. You can’t even put them on a spectrum – for example, Python does not simply have a ‘tighter’ type system than PHP (in not treating strings as numbers etc.), because it also has features that allow far greater flexibility and power (such as dynamic subclassing due to first class classes).
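For readers who have not met it, dynamic subclassing looks something like this sketch. The helper `with_logging` and the `Kitten` class are invented for the example; `type()` called with three arguments is the standard way to create a class at run time.

```python
def with_logging(base):
    """Return a new subclass of base, created at run time."""

    def describe(self):
        return f"logged {base.__name__}"

    # type(name, bases, namespace) builds a class like a class statement does.
    return type(f"Logged{base.__name__}", (base,), {"describe": describe})


class Kitten:
    pass


LoggedKitten = with_logging(Kitten)
```

Here `LoggedKitten` is an ordinary subclass of `Kitten`, even though no `class LoggedKitten` statement appears anywhere in the source.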

Combination of features is what matters

One of my favourite features of Python, for example, is keyword arguments. They often increase the clarity of calling code, and give functions the ability to grow new features in a backwards compatible way. However, this feature only makes sense in combination with other features. If you had keyword arguments but without the **kwargs syntax for passing and receiving an unknown set of keyword arguments, it would make decorators extremely difficult.
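A sketch of the combination in action: the decorator below knows nothing about the parameters of the function it wraps, and relies on `*args` and `**kwargs` to forward them. The names `logged` and `adopt` are invented for the example.

```python
import functools


def logged(func):
    """Wrap func without knowing anything about its parameters."""
    calls = []

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append((args, kwargs))
        # Forward whatever was received, unchanged.
        return func(*args, **kwargs)

    wrapper.calls = calls
    return wrapper


@logged
def adopt(kitten, *, owner="nobody", vaccinated=False):
    return f"{kitten} -> {owner} (vaccinated={vaccinated})"


message = adopt("Tom", owner="Ann")
# message == "Tom -> Ann (vaccinated=False)"
```

If `adopt` later grows another keyword argument, `logged` keeps working without any change, which is the backwards-compatible growth mentioned above.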

If you are thinking of how great Python is, I don’t think it helps to talk about keyword arguments in general as a killer feature. It is keyword arguments in Python that work particularly well.

Comparing language features opens up lots of opportunities for bad arguments

For example:

Attacking the worst implementation

So, a dynamic typing advocate might say that static typing means lots of repetitive and verbose boilerplate to indicate types. That criticism might apply to Java, but it doesn’t apply to Haskell and many other modern languages, where type inference handles 95% of the times where you might need to specify types.

Defending the best implementation

The corollary to the above fallacy is that if you are only debating language features in the abstract, you can pick whichever implementation you want in order to refute a claim. Someone claims that dynamic typing makes IDE support for refactoring very difficult, and a dynamic typing advocate retorts that this isn’t the case with Smalltalk – ignoring the fact that they don’t use Smalltalk, they have never used Smalltalk, and their dynamically-typed language of choice does indeed present much greater or even insurmountable problems to automated refactoring.

Defending a hypothetical implementation

Defending the best implementation goes further when you actually defend one that doesn’t exist yet.

The mythical “smart enough compiler” is an example of this; another is when dynamic typing advocates talk about “improving” dynamic analysis.

Hypothetical implementations are always great for winning arguments, especially as they can combine all the best features of all the languages, without worrying about whether those features will actually fit together, and produce something that people would actually want to use. Sometimes a hybrid turns out like Hercules, and sometimes like the Africanized bee.

Ignoring everything else

In choosing a programming language, it’s not only the features of the language that you have to consider – there is a long list of other factors, such as the maturity of the language, the community, the libraries, the documentation, the tooling, the availability (and quality) of programmers etc.

Sometimes the quality of these things is dominated by accidents of history (which language became popular and when), and sometimes it can be traced back to features of the language design.

Many language-war debates ignore all these things. But it’s even easier if you are not actually comparing real languages – just language features, abstracted from everything else.

I understand that comparing everything at once is difficult, and we will always attempt to break things down into smaller pieces for analysis. But I doubt that this goes very far with programming languages, because of the way the different features interact with each other, and also exert huge influence on the way that everything else develops e.g. libraries.

Conclusion

Language features exist within the context of a language and everything surrounding that language. It seems to me that attempts to analyse them outside that context simply lead to false generalisations.

Of course, being really concrete and talking about specific languages often ends up even more personal, which has its own pitfalls! Is there a good way forward?

Translating sentences with substitutions

2013-01-24T00:14:46Z

The problem

Many programs build up sentences using bits - often a template into which different things might be substituted. However, the things you substitute into a sentence can change the sentence, and vice-versa, in ways that are not anticipated by the programmer.

For example, plurals. In English, you might try code like this:

if n == 1:
    return "I have 1 pig"
else:
    return "I have %s pigs" % n

Localising these strings gives problems, because the rules for how plural forms work are different in every language. This specific problem is generally considered 'solved' by the use of gettext, but many more exist.
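For reference, this is roughly what the gettext approach to plurals looks like from Python. It is a sketch only: NullTranslations stands in for a real translation catalogue, and the function name i_have_n_pigs is made up.

```python
import gettext

# NullTranslations stands in for a real catalogue: it holds no translations and
# falls back to the English-style rule, i.e. singular only when n == 1.
catalogue = gettext.NullTranslations()


def i_have_n_pigs(n):
    # The programmer supplies both English forms; a real catalogue would pick
    # among as many plural forms as the target language defines.
    template = catalogue.ngettext("I have %d pig", "I have %d pigs", n)
    return template % n


for n in (0, 1, 3):
    print(i_have_n_pigs(n))
```

The calling code no longer contains the if/else, and the rule for choosing a form lives with the translation rather than with the programmer.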

For example, we have another problem as soon as we start substituting nouns:

"Delete selected %s?" % object_name

Various attributes of the noun could affect the sentence. In French, the adjective "selected" needs to agree in gender with the noun being substituted in. So you cannot look up the translations for "Delete selected %s" and for object_name separately. (This is a real example picked from Django source code).

Further, depending on how the sentence uses the noun, the form of the noun might need to change. For example, the noun might appear in the accusative position for a given sentence and language, which requires a different form of the noun to be used compared to the nominative form.

Several other examples of this appeared in Django ticket 11688. One proposed solution on that ticket would require a huge amount of knowledge and effort on the part of Django programmers, and almost certainly would not work anyway.

This post is an attempt to come up with a better solution, or at least kick start discussion. I haven't been able to find any solutions to this problem online, and most people seem to be just using gettext, which is a 95% solution — and maybe that is good enough for most people.

[Update 2013-02-19 - ‘Richard’ pointed me to the Locale::Maketext article, which has in essence a similar approach to what I've done here]

[Update 2015-08-10/2018-03-29 - Mozilla’s l20n and fluent projects seem to cover most of the use cases here. The strategies are quite similar to the one described below. The language has iterated, and the one now used is essentially similar to what I envisaged below, but much more practical and having the substitution syntax that you would want. You should look there if you just want a working solution. It's not clear how active or widespread these projects are, though.]

Assumptions and simplifications

We will assume that a sentence is a composable unit of meaning, such that sentences can be translated independently. So, if in language A we have sentences 1 and 2, in that order, we can translate these into language B by translating sentence 1 and sentence 2 independently, and putting them together in the same order.

This is, no doubt, a simplification. In some languages, the two sentences might make more sense if re-ordered, or combined, or split in various ways. Indeed, some languages may not have a truly equivalent concept of 'sentence' at all.

However, we have to do something, and this is a reasonable approximation.

Requirements

We need a powerful way of defining sentences in a given human language. It must be powerful enough that the person doing the translation can do anything they need, without the programmer needing to be aware of all the things in the language that will cause difficulty.

So, we'll start with a full programming language, and chop out the things we shouldn't need.

We shouldn't need side effects - translation should be a pure function. So we'll use a purely functional programming language without side effects.

We need something fairly readable, because translators are going to have to use it. It should be as close as possible to declarative in style.

Pattern matching seems like a great fit for some of our needs.

Possible solution

Given the above requirements, let's start with a Haskell-like pure functional language, whose pattern matching will be extremely helpful. It will obviously have IO removed, and no type signatures (but that won't stop us inferring them and being able to statically type-check the code). Everything else will be borrowed directly from Haskell, so that I can avoid having to make up my own syntax and semantics.

If the concept works, we can argue about better or simpler syntax for some constructs, or helper functions that aren't part of the Haskell prelude.

Hopefully, we will find a relatively small subset of Haskell that is needed to give us all the power we need to solve this problem - ideally a subset small enough that we could guarantee termination, to avoid problems with translations created by malicious agents.

This will be an example based exploration.

Let's assume that every sentence can be generated by a function. The function will take as parameters all the substitutions that are needed, and return the translated string.

So, suppose we have the English sentence "I have some pigs". For every different language we need, we would have a translation file which contains the function iHaveSomePigs, which in this case takes zero parameters. So for French:

iHaveSomePigs = "J'ai des cochons"

(The mapping between the English sentence "I have some pigs" and the function name iHaveSomePigs hasn't been defined, and we'll skate over that detail for now).

If we have a variable number of pigs, for French we might have:

iHaveNPigs 0 = "Je n'ai pas de cochon"
iHaveNPigs 1 = "J'ai un cochon"
iHaveNPigs n = "J'ai " ++ show n ++ " cochons"

For English we could do this:

iHaveNPigs 1 = "I have 1 pig"
iHaveNPigs n = "I have " ++ show n ++ " pigs"

(For those unfamiliar with Haskell, the way that pattern matching works is that the first definition that matches the arguments is used. Since n is not a literal, but a variable, it can match any argument.)

We can cope with more complicated rules, such as those used in Polish, perhaps something like this:

iHaveNFiles n = "Mam " ++ show n ++ " " ++ pluralize n "file"

plurals "file" = [ "plik"
                 , "pliki"
                 , "plików"
                 ]

pluralize n word = plurals word !! pluralForm n

pluralForm n
  | n == 1                                                                        = 0
  | n `mod` 10 >= 2 && n `mod` 10 <= 4 && (n `mod` 100 < 10 || n `mod` 100 >= 20) = 1
  | otherwise                                                                     = 2

Note that the complex logic in pluralForm and pluralize only has to be defined once. Adding more words simply requires additional plurals lines. It's not the nicest syntax, but could probably be improved, and it's pretty easy to copy.
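For readers more at home in Python, the same Polish rule can be transliterated and spot-checked on a few values. This is only a sketch for checking the arithmetic, not part of the proposed translation language; the names plural_form, PLURALS and i_have_n_files are made up.

```python
def plural_form(n):
    """Index into a [singular, few, many] list, following the Polish rule above."""
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


# One entry per word, mirroring the `plurals` definitions.
PLURALS = {"file": ["plik", "pliki", "plików"]}


def i_have_n_files(n):
    return "Mam %d %s" % (n, PLURALS["file"][plural_form(n)])


for n in (1, 2, 5, 12, 22, 25):
    print(i_have_n_files(n))
```

The values 12 and 22 are the interesting ones: both end in 2, but only 22 takes the "few" form, which is what the test on n mod 100 is for.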

Let's add in gender, using the sentences "Delete this %s?" (singular) and "Delete selected %s?" (plural). We can use guards:

deleteThisThing thing
    | isMasculine thing = "Supprimer ce " ++ singular thing ++ "?"
    | otherwise         = "Supprimer cette " ++ singular thing ++ "?"

    -- (Ignoring the problem with 'ce' followed by vowel for now...)

deleteSelectedThings thing
    | isMasculine thing = "Supprimer les " ++ plural thing ++ " sélectionnés?"
    | otherwise         = "Supprimer les " ++ plural thing ++ " sélectionnées?"

isMasculine thing = elem thing [ "pig"
                               , "man"
                               -- anything else masculine
                               ]

singular thing = pluralForm 1 thing
plural   thing = pluralForm 2 thing

pluralForm 1 "pig" = "cochon"
pluralForm 2 "pig" = "cochons"

pluralForm 1 "man" = "homme"
pluralForm 2 "man" = "hommes"

Note that the only thing required by this system is that the functions deleteThisThing and deleteSelectedThings exist. Everything else is at the freedom of the translator, and better ways of defining any of these functions are possible.

Of course, it isn't expected that a translator would be able to produce this by himself/herself. However, once the basic logic has been set up, this syntax is readable enough that a translator could easily add more of the same. Lines like:

pluralForm 1 "pig" = "cochon"

are actually pretty readable. The lack of parentheses in Haskell function calls is also a bonus (though, as I said earlier, exact syntax could be debated). This is not really that much harder than editing a .po file if you are just wanting to add more of the same.

Also, we've got flexibility. If we really don't care about getting the gender right, we can just do "sélectionné(e)s" and be done with it.

Let's make it harder - we'll add case. I'll use NT Greek as an example, because it has nouns that decline with case (and I don't know any similar modern languages well enough). I'm going to introduce an enum for the different cases, using data for now, and for the different genders. I could also do the same for number ("Singular" and "Plural"), but just using 1 and 2 seems easier.

Our sentence will be "You like the %s.". For this in Greek, we need to choose the accusative singular form of the thing we pass in. We also need to pick the word for "the" (the definite article) which matches the gender and number of the noun, and it has to match the accusative case too. So, if we pass in a masculine word, we need the singular accusative masculine definite article (having fun yet?):

data Case = Nominative | Accusative | Genitive | Dative
data Gender = Masculine | Feminine | Neuter

youLikeTheThing thing = "φιλεις "
                        ++ definiteArticle 1 Accusative (genderOf thing)
                        ++ " "
                        ++ accusativeSingular thing ++ "."

accusativeSingular thing = nounForm 1 Accusative thing

nounForm 1 Nominative "book" = "βιβλιον"
nounForm 1 Accusative "book" = "βιβλιον"
nounForm 1 Genitive   "book" = "βιβλιου"
nounForm 1 Dative     "book" = "βιβλιω"

nounForm 2 Nominative "book" = "βιβλια"
nounForm 2 Accusative "book" = "βιβλια"
nounForm 2 Genitive   "book" = "βιβλιων"
nounForm 2 Dative     "book" = "βιβλιοις"

genderOf "book" = Neuter
genderOf "man"  = Masculine
-- etc

definiteArticle 1 Nominative Masculine = "ο"
definiteArticle 1 Accusative Masculine = "τον"
definiteArticle 1 Genitive   Masculine = "του"
definiteArticle 1 Dative     Masculine = "τω"

definiteArticle 1 Nominative Neuter    = "το"
definiteArticle 1 Accusative Neuter    = "το"
definiteArticle 1 Genitive   Neuter    = "του"
definiteArticle 1 Dative     Neuter    = "τω"

-- feminine etc

-- definiteArticle 2 (plurals) etc.

Of course, you can easily define shorter aliases to avoid some typing here, and there may be better ways to generate the tables, though as written above they are pretty readable, and should be familiar to anyone who knows Greek.

The function youLikeTheThing here is no longer very readable, although it could be much worse. Some kind of substitution syntax/function could be used.

The code above actually works, BTW, and it actually ran the first time I tried - the only correction I needed to make its output correct was to add a space after the definite article. You just need to put it in a file test.hs, add the following line:

main = putStrLn $ youLikeTheThing "book"

and do:

$ runhaskell test.hs

There is not a type signature in sight, but you have compile time guarantees. This is all a testimony to the clarity of Haskell's syntax.

The features of Haskell we've used are:

functions
simple pattern matching on numbers and strings
guards
data statements, limited to union types of nullary constructors i.e. effectively enumerated values. We could use a keyword enum for clarity.
string concatenation
lists
a few arithmetic and logical operators

We haven't used recursion. I can imagine circumstances where it might be useful, but if deemed too risky, you could add some rules that would disallow it (e.g. by requiring that a function must not call itself directly, and must only call functions that are defined before it in the source code, to avoid mutual recursion). This would be helpful to ensure termination.
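The "only call earlier functions" rule is simple enough to check mechanically. Below is a minimal sketch of such a check in Python, assuming the call graph has already been extracted from the translation file; the function check_no_recursion and its input format are hypothetical.

```python
def check_no_recursion(definitions, builtins=()):
    """Check the 'only call earlier functions' rule.

    definitions: list of (name, names_it_calls) pairs, in source order.
    builtins: names that are always allowed (e.g. library functions).
    Returns a list of error messages; an empty list means every function calls
    only functions defined strictly earlier, so no call cycle can exist.
    """
    seen = set(builtins)
    errors = []
    for name, calls in definitions:
        for callee in calls:
            if callee == name:
                errors.append("%s calls itself" % name)
            elif callee not in seen:
                errors.append("%s calls %s before it is defined" % (name, callee))
        seen.add(name)
    return errors


ok = [
    ("plurals", []),
    ("pluralForm", []),
    ("pluralize", ["plurals", "pluralForm"]),
    ("iHaveNFiles", ["pluralize", "show"]),
]
mutual = [("isEven", ["isOdd"]), ("isOdd", ["isEven"])]

print(check_no_recursion(ok, builtins=["show"]))  # []
print(check_no_recursion(mutual))
```

Note that under this rule the Polish example would need its helper functions listed before iHaveNFiles, since as written it calls pluralize before pluralize is defined.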

You might also want a module system, to be able to pull in some common definitions and functions for a given language, for consistency across different projects.

This whole approach has the advantage of being able to refine and special case as much as you want. Take the sentence "you like the %s": suppose that if the thing is a human being e.g. "man" or "woman", you need to use a completely different verb. Then you just add a special case first:

isAPerson "man"   = True
isAPerson "woman" = True
isAPerson n       = False

youLikeTheThing thing
    | isAPerson thing = ...
-- fall through to the normal case here

In the other direction, if you just don't have the time to care about any of this, you can just use a really simple (and often wrong) formula:

youLikeTheThing thing =  "φιλεις τον " ++ greek thing

greek "book" = "βιβλιον"

Notice that the programmer of the main project does not know anything about plural forms, gender, case etc., or put any of that into the source code. The only thing he/she would do is call a function with all the things to be substituted. We could have some mapping from English strings to function names, or we could just use the function name as a string, e.g. from a Python project we might call the function like so:

prompt = translate("doYouWantToDelete", n, object_name)

This would call the translation function doYouWantToDelete with the parameters n and object_name.

As a refinement, we can provide a version which will work when the whole localisation machinery is turned off i.e. we allow the programmer to provide their own version of the translation function which returns the default language:

prompt = translate("doYouWantToDelete", n, object_name,
                   lambda n, object_name: "Do you want to delete these %s %s(s)" %
                                          (n, object_name))

As before, the provided function can be correct or simplistic as desired for English.
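To show how the calling side might hang together, here is a rough Python sketch of such a translate function. Everything in it is an assumption made for illustration: the TRANSLATIONS registry, the LOCALISATION_ENABLED switch, and the convention that a trailing callable is treated as the default-language version.

```python
# Hypothetical registry standing in for a loaded translation file.
TRANSLATIONS = {}
LOCALISATION_ENABLED = True


def translate(name, *args):
    """Call the translation function `name`; a trailing callable is the fallback."""
    fallback = None
    if args and callable(args[-1]):
        *args, fallback = args
    func = TRANSLATIONS.get(name) if LOCALISATION_ENABLED else None
    if func is None:
        if fallback is None:
            raise LookupError("no translation or fallback for %r" % name)
        func = fallback
    return func(*args)


def do_you_want_to_delete_fr(n, object_name):
    # Stand-in for a function from the French translation file.
    nouns = {"pig": "cochons"}
    return "Voulez-vous supprimer ces %d %s ?" % (n, nouns[object_name])


def english_default(n, object_name):
    return "Do you want to delete these %s %s(s)" % (n, object_name)


TRANSLATIONS["doYouWantToDelete"] = do_you_want_to_delete_fr

print(translate("doYouWantToDelete", 2, "pig", english_default))
print(translate("notTranslatedYet", 2, "pig", english_default))
```

Treating the last positional argument as the fallback matches the call shown above, but it is ambiguous if a substituted value is itself callable; a keyword-only parameter would avoid that.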

Feedback

There are a few questions in my mind:

Would a solution like this work for the languages you know? What additional features would be needed to cope with other human languages?
Is this vaguely practical? Could you get translators to be able to edit code like this? If not, and only programmers would be able to do this, are there enough programmer-translators to make it a viable solution, at least for some big projects?

I'm aware that the string concatenation gets ugly fairly quickly, and some kind of interpolation might be needed (including the ability to call functions within that interpolation). With that in place, I think you could achieve a reasonable level of readability.

A translation tool could also have language-specific templates to quickly insert the code for common forms.
Is it possible to have a simpler language that would still be able to cope with the examples here?

The examples I've come up with suggest to me that you need a full programming language, and that attempting to start from the other direction (e.g. build up from the current gettext approach) will produce a monstrosity.

gettext already does a 95% job, and we are at the point of diminishing returns. So if we are going to try to tackle the final bit, we need to err on the side of enough power to get all of that 5%, rather than put a lot of effort in and discover we've only arrived at 96%.

You also cover the case of having a client who insists that the program should output "cet homme" and not "ce homme" - while it might make your translation file ugly, you've got the power to be able to do it if you want.

Comments?

Dynamic typing in a statically typed language

2012-11-14T00:23:22Z

A recent question on programmers.stackexchange.com asked What functionality does dynamic typing allow?

I thought one of the best short answers to this was from Mark Ransom:

Theoretically there's nothing you can't do in either, as long as the languages are Turing Complete. The more interesting question to me is what's easy or natural in one vs. the other.

This post is about providing an example to back that up, and to respond to people who claim that, since you can implement dynamic types in a statically typed language, statically typed languages give you all the benefits of dynamically typed languages.

[Edit: to those who think I'm being a language or dynamic typing advocate or engaging in any kind of bashing, please read that last paragraph again, and note especially the use of the word 'all'.]

Let's set up a problem. It's made up, but it illustrates the point I want to make:

Given a file, 'invoices.yaml', take the first document in it, extract the 'bill-to' field, and save the data in it as JSON in an output file 'address.json'. You can take it for granted that the contents of that field can be serialised as JSON (e.g. doesn't contain dates), although that might not be true for the rest of the document. To keep the example focussed and simple, everything will be ASCII.

The particular YAML file I used was taken from an example YAML document I found on the web, and then expanded for the sake of illustration:

---
invoice: 34843
date   : 2001-01-23
bill-to:
    given  : Chris
    family : Dumars
    address:
        lines: |
            458 Walkman Dr.
            Suite #292
        city    : Royal Oak
        state   : MI
        postal  : 48046
---
invoice: 34844
date   : 2001-01-24
bill-to:
    given  : Pete
    family : Smith
    address:
        lines: |
            3 Amian Rd
        city    : Royal Oak
        state   : MI
        postal  : 48047

I'll use Python and Haskell as representatives of dynamic typing and static typing, because I know them and many would consider them to be very good representatives of their camps, and I'm a big fan of both languages.

I also think that examining any programming problem in the abstract, or with respect to ideas like ‘dynamic typing’ or ‘static typing’, is not very relevant, because in the real world we have to use real, concrete languages, and they come with a whole set of properties (in terms of the language definition, tool sets, communities and libraries) that make a massive impact on how you actually use them.

So I'm going to try to use real libraries that actually exist, ignore solutions that could theoretically exist but don't, and ignore problems that could theoretically exist but don't.

Python

Here is my Python solution:

import json
import yaml

# safe_load_all is used because PyYAML 6 and later require an explicit Loader for load_all
json.dump(list(yaml.safe_load_all(open('invoices.yaml')))[0]['bill-to'],
          open('address.json', 'w'))

Notes: I didn't have to consult docs once. This isn't just due to my familiarity with Python – it's also the fact that I can fire up IPython and go:

In [1]: import yaml
In [2]: yaml.<TAB>

and get a list of likely functions. I can then go:

In [3]: yaml.load_all?

and get help, or go:

In [4]: yaml.load_all??

and get the complete source code of the function/method/class/module, in case I need it.
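The caveat in the problem statement about dates can be seen with the standard library alone. The sketch below uses a hand-built dictionary as a stand-in for what a YAML loader would return for the first document, on the assumption that the unquoted date arrives as a datetime.date; the variable names are made up.

```python
import datetime
import json

# Hand-built stand-in for the first parsed document (the address lines are omitted).
invoice = {
    "invoice": 34843,
    "date": datetime.date(2001, 1, 23),
    "bill-to": {
        "given": "Chris",
        "family": "Dumars",
        "address": {"city": "Royal Oak", "state": "MI", "postal": 48046},
    },
}

# The 'bill-to' subtree holds only strings, numbers and dicts, so it dumps cleanly.
address_json = json.dumps(invoice["bill-to"], sort_keys=True)
print(address_json)

# The whole document does not: the json module has no default encoding for dates.
try:
    json.dumps(invoice)
    whole_document_ok = True
except TypeError:
    whole_document_ok = False
print(whole_document_ok)  # False
```

So the Python solution only works because the requirements promise that the 'bill-to' field is JSON-friendly; nothing checks that promise until the program runs.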

Haskell

Now for the Haskell version. First, a disclaimer: I'm much less experienced in Haskell than in Python. I did manage to write my blog software in Haskell at one point, but I don't use Haskell on anything like a daily basis, and I do use Python that much.

I first need to parse YAML. I've got a choice of packages. Unlike in Python, for a library like this, the choice you make is likely to have a big impact on the code you write – switching to a different (perhaps faster) package won't be just a case of changing an import, as we will see. The choice of packages represents the fact that even designing how this thing should work in terms of API and data structures is not straightforward in Haskell, and represents a much bigger commitment, and therefore problem, for the library user. In Python, while there are a few API choices (like supporting streaming or not, potentially), mostly it's pretty obvious how the library should work.

Looking on Hackage, I first find the 'yaml' package. The first line of the Data.Yaml API docs reads:

A JSON value represented as a Haskell value.

(Yes, you read that right). This doesn't look good. The whole file has stuff about JSON, not YAML, with no indication why I want to be using JSON values, not YAML. But I have a go anyway; perhaps it was deliberate.

When trying to use the decodeFile function, I get an error about needing a type signature, due to the way decodeFile is defined:

decodeFile :: FromJSON a => FilePath -> IO (Maybe a)

There are lots of instances of FromJSON to choose from, but I have to know in advance the type of data. And it looks like I've got data that isn't going to fit into any of those types, because it involves heterogeneous collections. [Correction in comments, see below].

I gave up and tried another package - Data.Yaml.Syck.

First try:

import Data.Yaml.Syck

main = do
  d <- parseYamlFile "invoices.yaml"
  print d

This works - well, I've got some kind of parsing going on, at least. It looks like I've got some YamlNode data structure, and the top thing is an EMap (it looks like it has only parsed the first document, which is worrying, but doesn't matter given my requirements, so I'll ignore that). But how do I get data out?

OK, let's try yaml-light - it wraps HsSyck and has some easier utility functions, like lookupYL:

lookupYL :: YamlLight -> YamlLight -> Maybe YamlLight

That expects the lookup key to be a YamlLight, so I need to create one from a string, somehow. The docs show how to turn a ByteString into a YamlLight node, and I need to pass in a String, which from previous experience requires doing something like pack from Data.ByteString.

My program so far:

import Data.Yaml.YamlLight
import Data.ByteString.Char8 (pack)
import Data.Maybe

main = do
  d <- parseYamlFile "invoices.yaml"
  print $ fromJust $ lookupYL (YStr $ pack "bill-to") d

Which gives this output:

YMap (fromList [(YStr "bill-to",YMap (fromList [(YStr "address",YMap (fromList [(YStr "city",YStr "Royal Oak"),(YStr "lines",YStr "458 Walkman Dr.\nSuite #292\n"),(YStr "postal",YStr "48046"),(YStr "state",YStr "MI")])),(YStr "family",YStr "Dumars"),(YStr "given",YStr "Chris")])),(YStr "date",YStr "2001-01-23"),(YStr "invoice",YStr "34843")])

Now I have to dump to JSON. From a Python perspective, all I want is a function that can take some ‘native values’ and dump them to JSON, like the Python json.dump function. But every piece of data in my data structure is wrapped in things like YStr and YMap.

In addition, though I can see the structure of my data in front of me, the requirements I've been given don't make guarantees that it will stay the same, just that it can be converted to JSON. I need a routine that will convert anything YAML to the equivalent in JSON, where that is possible.

It looks like I could create a JSON instance for YamlLight, so that the encode function I want to use (which dumps JSON to a string) could take YamlLight as an input directly. I end up with this:

import Data.Yaml.YamlLight (parseYamlFile, lookupYL, YamlLight(..), unStr)
import Data.ByteString.Char8 (pack, unpack)
import Text.JSON (JSON(..), encode, JSValue(..), toJSString, toJSObject)
import Data.Maybe (fromJust)
import Data.Map (toList)

instance JSON YamlLight where
  showJSON yml =
    case yml of
      YStr bs -> JSString $ toJSString $ unpack bs
      YMap m -> JSObject $ toJSObject $
                map (\(y1, y2) -> (unpack $ fromJust $ unStr y1, showJSON y2)) $
                toList m
      YSeq ymls -> JSArray $ map showJSON ymls
      YNil -> JSNull

main = do
  d <- parseYamlFile "invoices.yaml"
  writeFile "address.json" $ encode $ fromJust $ lookupYL (YStr $ pack "bill-to") d

This works, and I'm sure there are other solutions. If I were cleverer, and knew Haskell better, I could perhaps write a cleverer, shorter solution, which would also be proportionately more difficult for someone else to understand, so I'm not particularly interested in making this code shorter, as it does the job.

But this illustrates why some people like dynamically typed languages. The fact that you can implement a variant data type in Haskell (such as YamlLight or JSValue) doesn't mean much, because these data types are not used everywhere, and therefore you have multiple competing ones that you've got to convert between. If you did have a single variant datatype that was used everywhere... you'd have a dynamically typed language, in effect.

The strictness of the type system gave rise to a choice of libraries and APIs that made my life harder, not easier. I then had to write glue code to marshall between the dynamic types used by the two libraries I needed. [Edit: or, as it turned out, I need to know where to find it, possibly in the form of already written type class instances, or how to get the compiler to write it for me]

Some people might still prefer the Haskell version. It has some nice properties, like the fact that the compiler has checked that it can indeed convert any YAML object into JSON – you'd get a warning if you missed a case. One response to that might be that if the two types didn't happen to match so well – for instance if the YAML library started supporting date/time objects – this benefit would disappear. If you need to avoid all possible problems up front, Haskell will help you out more. Python, on the other hand, will allow you to avoid spending time thinking about theoretical problems which may never happen in reality.

But there are always runtime errors that you could come across, even in Haskell — for example, if you want to convert this to cope with non-ASCII documents, the compiler can't point out all the places you need to fix, and if you forget one you could still get a runtime exception, or worse, silent data corruption.

So, in my opinion, this is a case where dynamic typing shines, and the ability to implement dynamic typing on top of static typing simply doesn't give you the benefits you get in a language that embraces dynamic typing to its core.

There are, incidentally, some interesting developments in Haskell that might allow the possibility of running programs that aren't quite typed correctly, as long as you don't encounter the type errors in practice. This could counter some of the points I've raised – see this interview with Simon Peyton Jones, from 27:45 onwards.

Is static type checking a redundant testing mechanism?

2009-11-09T15:45:16Z

As there has been discussion about not writing unit tests recently, I thought I'd use my recent experience in finishing a non-trivial Haskell program to comment on the issue of writing tests (unit tests and other automated tests) in the context of real code.

I'm especially prompted by this comment by Ned Batchelder that I came across a few weeks ago:

Since static type checking can't cover all possibilities, you will need automated testing. Once you have automated testing, static type checking is redundant.

(that's in a comment on his own blog post)

To some extent I agree with this, but I want to give some reasons why a strong and powerful static type checker really does eliminate the need for automated tests in some cases – that is to say, there are instances when the static type checking makes the automated (unit/integration) tests redundant and not the other way around, and does a better job.

I have very few tests in my Haskell blog software. There are significantly more in the Ella library which I wrote alongside it, but still far from complete coverage. While I like test driven development, and did it for some parts of this project, many times it felt like a waste of time. In some cases it was perhaps misdirected laziness, but I'm not convinced it always was. So what are the characteristics of code that doesn't benefit from automated/unit tests?

Trivial code

If code is extremely simple, it can actually be worse to have tests than to not have them.

In defending that statement, the first thing to remember is that tests can have bugs in them too. Now, many bugs in the tests will be caught, as long as you follow the rule of making sure the test fails, then writing the code, then making sure it passes. However, many bugs of omission, which are also very common, will not be caught i.e. when the test fails to test something it ought to.

Second, there is always a cost to writing tests. So, as the probability of making a mistake in your code tends to zero, the usefulness of tests against that code also tends to zero—and not just to zero, it can go negative. You spent x minutes writing a test for something that didn't need testing, which is lost time and money already, and lost opportunities, and you also have extra (test) code to maintain in the future, and a longer test suite to run.

Third, you can write an infinite number of tests, and still have bugs. You can have 100% code coverage, and still have bugs. (I'll leave you to do the research on code coverage if you don't believe me). So, you have to stop somewhere, and therefore you need to know when to stop.
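A small made-up example of full coverage that still leaves a bug: the single test below executes every line of average, so a line-coverage tool would report 100%, yet an untested input fails. The function and test names are hypothetical.

```python
def average(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def test_average():
    # Executes every line of average(), so line coverage reports 100%.
    assert average([1, 2, 3]) == 2


test_average()  # passes

# The empty input was never tested, and it fails.
try:
    average([])
    empty_input_ok = True
except ZeroDivisionError:
    empty_input_ok = False
print(empty_input_ok)  # False
```

Coverage measures which lines ran, not which inputs were considered.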

It is always a bad idea to write a test whose cost outweighs its value. That is, there is no neutral code – it is always positive or negative, because merely by existing it has a maintenance cost – not even counting the cost of producing it in the first place.

So suppose you write a utility function that is used to sanitise phone numbers that people might enter. It removes '-' and ' ' characters. (The result will of course be validated separately, but we want to allow people to enter phone numbers in a convenient way). In Python:

def sanitise_phone_number(s):
    return s.replace("-", "").replace(" ", "")

The testing fanatics might stop to write a unit test, but not the rest of us, because:

You would mainly be testing that the built-in string library works.
If you think of the ways that the function is likely to be wrong, the test is just as likely to fail to catch it. For example, the function above might really need to strip newline chars as well, but that's not going to be tested unless I think to write a test for that.
If there actually is a bug here, or the implementation gets more complex so that it merits a test, I can cross that bridge when I come to it, and it won't cost me extra.
It's more likely that I'll forget to use this function than that I get it wrong. Therefore, an integration test would be far more useful. But in some cases, integration tests can be extremely expensive, both to write and to run, especially when testing javascript based web frontends, or GUIs that are not very testable. I'm almost certainly going to test this code by at least one manual integration test, and after that, do I really need to write an automatic one?
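To make the second point concrete, here is a minimal sketch of the test most of us would actually write, and the newline case it quietly misses. The test name and the sample numbers are made up for illustration.

```python
def sanitise_phone_number(s):
    return s.replace("-", "").replace(" ", "")


def test_sanitise_phone_number():
    # The obvious cases: exactly the ones the implementation already handles.
    assert sanitise_phone_number("01234 567-890") == "01234567890"
    assert sanitise_phone_number("01234567890") == "01234567890"


test_sanitise_phone_number()  # passes

# A bug of omission: nobody thought about trailing newlines, so neither
# the function nor the test deals with them.
pasted = sanitise_phone_number("01234 567890\n")
```

The test is green, yet pasted still ends in a newline. The test added cost without adding the protection that was actually needed.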

However, if I was writing the function in a language that was less capable than Python, I might well write a test for the above.

Declarative code

(You could argue that this is an extension of trivial code, but it feels slightly different, and the case is even stronger).

Imagine your spec says that you should have 5 news items on the front page of your web site. You are using a library that has utility code for getting the first n items, or page x of n items each. And of course you are going to use a constant for that 5, rather than code it right in. So somewhere you are going to write (assuming Python):

NEWS_ITEMS_ON_HOME_PAGE = 5

Are you going to write a test that ensures that this value stays at 5, and doesn't accidentally get changed? Then your code base violates DRY—you now have two places where you are specifying the number of news items on the home page. That is, to some extent, the nature of all tests, but it's worse in this case. With non-declarative code and tests, one instance specifies behaviour, the other implementation, and it's usually obvious which is correct. But with declarative code, if one instance is different, how do you know which is correct?
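Spelled out, that first kind of test would be something like the sketch below (the test name is hypothetical). It can only fail when someone changes the number on purpose, at which point they have to change it in two places.

```python
NEWS_ITEMS_ON_HOME_PAGE = 5


def test_news_items_on_home_page():
    # Restates the constant; if the two ever differ, nothing says which is right.
    assert NEWS_ITEMS_ON_HOME_PAGE == 5


test_news_items_on_home_page()
```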

Or are you going to write a test for the actual home page having 5 items? That would be pointless, because it's just testing that you are capable of calling a trivial API, which itself belongs to thoroughly tested code. You might want a sanity check that you haven’t made a typo, but checking that the page returns anything with a 200 code will often be enough.

What about something like a Django model? Your spec says that a 'restaurant' needs to have a 'name' which is a maximum of 100 chars. You write the following code:

class Restaurant(models.Model):
    name = models.CharField("Name", max_length=100)
    # ...

Are you going to write code to test that you've typed this in correctly? It would again be violating DRY. Are you going to check that this interfaces with the database correctly? There are already hundreds of tests in Django which cover this. Are you going to write tests that are effectively checking for typos? Well, if you use this model at all, it's going to be very obvious if you've made a mistake, and some other simple integration test is going to catch it.

Haskell

Now, coming to Haskell. You can guess the point I'm going to make.

In Haskell, a lot of code is either trivial or declarative.

Further, many of the types of errors you could make are caught by the compiler. Typos and missing imports etc. are always caught, and many other errors beside.

Functional programming languages, especially pure ones, eliminate a lot of the kind of mistakes that are easy in imperative languages. Everything being an expression helps a lot—it forces you to think about every branch and return a value. In monadic code it becomes possible to avoid this, but a lot of your code is pure functional.

Example 1

Imagine a more complex function than our sanitise_phone_number above. It's going to take a list of 'transformation' functions and an input value and apply each function to the value in turn, returning the final value. In some languages, that would be just about worth writing a test for. You might have to worry about iterating over the list, boundary conditions, etc. But in Haskell it looks like this:

apply = foldl' (flip ($))

In the above definition, there is basically nothing that can go wrong. We already know that foldl' works, and isn't going to miss anything, or fail with an empty list. You can't forget to return the return value, like you can in Python. The compiler will catch any type errors. If the function doesn't do anything approaching what it's supposed to then you'll know as soon as you try to use it. I've used point-free style, so there isn't any chance of doing something silly with the input variables, because they don't even appear in the function definition!

For something like the above, you would often write your type signature first:

apply :: a -> [a -> a] -> a

Once you've done that, it's even harder to make a mistake. It's almost possible to try vaguely relevant code at random and see if it compiles. For something like this, if it compiles, and it looks very simple, it's probably correct. (There are obviously times when that will fail you, but it's amazing how often it doesn't. You often feel like you just have to keep doing what the compiler tells you and you'll get working code.)

Is the above code 'trivial' or 'declarative'? Well, that's a tough call. A lot of code in Haskell quickly becomes very declarative in style, especially when written point free.
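For comparison, here is a rough Python rendering of the same function, sketched with functools.reduce. The name apply_all and the example transformations are chosen only for illustration.

```python
from functools import reduce


def apply_all(value, functions):
    # Feed the value through each function in turn, left to right.
    return reduce(lambda acc, f: f(acc), functions, value)


transformations = [
    str.strip,
    lambda s: s.replace("-", ""),
    lambda s: s.replace(" ", ""),
]
result = apply_all(" 01234 567-890 ", transformations)
unchanged = apply_all(3, [])  # an empty list leaves the value untouched
```

This is nearly as short, but Python would equally accept a version where the arguments to reduce were passed in the wrong order, and you would only find out by running it.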

Example 2

But what about something much bigger—say the generation of an Atom feed? With a library that makes use of a strong static type system, this can be actually quite hard to get wrong.

In my blog software, I use the feed library for Atom feeds. The code I've had to write is extremely simple—a matter of creating some data structures corresponding to Atom feeds. The data structures are defined to force you to supply all required elements. Where there is a choice of data type, it forces you to choose – for example the 'content' field has to be set with either HTMLContent "<h1>your content</h1>" or TextContent "Your content". (For those who don't know Haskell, it should also be pointed out that there is no equivalent to 'null'. Optional values are made explicit using the Maybe type).
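To give Python readers a feel for this, here is a loose analogue using dataclasses. It is not the feed library's API: TextContent and HTMLContent mirror the constructor names mentioned above, but render_content and its output format are assumptions made for the sketch. Note that Python only checks the choice when the code runs, whereas the Haskell compiler checks it before the program exists.

```python
from dataclasses import dataclass
from html import escape
from typing import Union


@dataclass(frozen=True)
class TextContent:
    value: str


@dataclass(frozen=True)
class HTMLContent:
    value: str


Content = Union[TextContent, HTMLContent]


def render_content(content: Content) -> str:
    # The caller must say which kind of content it is; a bare str is rejected.
    if isinstance(content, TextContent):
        kind = "text"
    elif isinstance(content, HTMLContent):
        kind = "html"
    else:
        raise TypeError("content must be TextContent or HTMLContent")
    # Escaping happens in one place, so callers cannot forget it.
    return '<content type="%s">%s</content>' % (kind, escape(content.value))


entry = render_content(HTMLContent("<h1>your content</h1>"))
```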

After filling in all the values for these feeds, I wrote some very simple 'glue' functions that fed in the data and returned the result as an HTTP response. I created 4 different feeds, all of which worked perfectly first time, as soon as I got them to compile. I cannot see any value, and only cost, in adding tests for this. A check for a 200 response code and non empty content might be worth it, but would be much easier to write as a bash script that uses 'curl' on a few known URLs.

Had I written this in Python, I might have wanted tests to ensure that the HTML in the Atom feed content was escaped properly and various other things, in addition to a simple check for status 200. But the API of the feed library, combined with the type checking that the compiler has done, has made that redundant, and has tested it far more easily and thoroughly than I could have done with tests.

Nor is it true in general that a simple functional test will catch type errors. Often it exercises only one route through the code, while in many places dynamically typed code can return values of different types, and the routes the test never takes can still fail with type errors.
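A small sketch of the problem, with made-up function names. The single functional test takes the route where a number comes back; the other route returns a string, and the type failure only shows up later.

```python
def parse_quantity(raw):
    if raw.isdigit():
        return int(raw)
    # Oops: falls through and hands back the unparsed string.
    return raw


def total_price(raw_quantity, unit_price):
    return parse_quantity(raw_quantity) * unit_price


def test_total_price():
    assert total_price("3", 2.5) == 7.5


test_total_price()  # passes, exercising only the int-returning route

try:
    total_price("three", 2.5)
    failure = None
except TypeError as e:
    failure = e  # a str cannot be multiplied by a float
```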

Example 3

One final example of reducing the need for automated tests is the routing system I've used in Ella. OK, it's really a chance to show off the only slightly clever bit of code that I wrote, but hopefully it will explain something of the power of a strong type system :-)

Consider the following bits of code/configuration in a Django project, which are responsible for matching a URL, pulling out some bits from it and dispatching it to a view function.

### myproject/urls.py

urlpatterns = patterns('',
   (r'^members/(\d+)/$', 'myproject.views.member_detail'),
   # etc...
)

### myproject/views.py

def member_detail(request, memberid):
    memberid = int(memberid)
    member = get_member(memberid)
    # etc...

Now, there are a number of possible failure points in this code that you might want some regression tests for. For example, if in the future we change it so that the URL uses a string such as a user name, rather than an integer, we will need to change the URLconf, the line in member_detail that calls int, and the definition of get_member (or use a different function).

There is a DRY or OAOO failure here – the fact that we are expecting an integer is specified multiple times, either implicitly or explicitly. This is one of the causes of fragility in this chunk of code – if one is changed, the others might not be updated, introducing bugs of different kinds. Now, there are things you can do about this, with some small or large changes to how URLconfs work. But they are not complete solutions, and one solution not open to Python developers is the one I coded in Ella.
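Here is a stripped-down simulation of that fragility. It uses only the re module, not Django; ROUTES, dispatch and the stubbed view body are stand-ins invented for the sketch. The URL pattern has been loosened to accept names, but the view still calls int.

```python
import re


def member_detail(request, memberid):
    memberid = int(memberid)  # still assumes the URL contains digits
    return "member %d" % memberid


# The pattern used to be r'^members/(\d+)/$'; someone changed it to allow names.
ROUTES = [
    (re.compile(r"^members/(\w+)/$"), member_detail),
]


def dispatch(path, request=None):
    for pattern, view in ROUTES:
        match = pattern.match(path)
        if match:
            return view(request, *match.groups())
    return None  # no route matched: a 404


ok = dispatch("members/42/")  # the old happy-path test still passes

try:
    dispatch("members/luke/")
    error = None
except ValueError as e:
    error = e  # int("luke") blows up inside the view
```

Nothing complains until a request with a name in it actually arrives.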

The equivalent bits of code, with type signatures and explanations of them for those who don't know any Haskell, would look like this in my system.

----- MyProject/Routes.hs

import MyProject.Views

routes = [
   "members/" <+/> intParam //-> memberDetail $ []
   -- etc...
]

----- MyProject/Views.hs

-- memberDetail takes an 'Int' and an HTTP 'Request' object, and returns an
--  HTTP 'Response' (or 'Nothing' to indicate a 404), doing some IO on the
--  way.
memberDetail :: Int -> Request -> IO (Maybe Response)
memberDetail memberId request = do
   member <- getMember memberId
   -- etc...

You should read <+/> as ‘followed by’ and //-> as ‘routes to’. Just ignore the $ [] bit for now (it exists to allow decorators to be applied easily in the routing configuration, but we are applying no decorators, hence the empty list).

intParam is a ‘matcher’: it attempts to pull off the next chunk of the URL (ending in a '/'), match it and parse it as an integer. If it can do so, it passes the parsed value on to memberDetail as a parameter i.e. it partially applies memberDetail with an integer.
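For those who don't read Haskell, the behaviour of a matcher can be imitated in Python roughly as follows. This is only an analogue of the idea, not how Ella is implemented, and all the names (int_param, route, member_detail) are hypothetical. In particular, Python checks nothing ahead of time here.

```python
from functools import partial


def int_param(path, view):
    # Pull off the next chunk of the path, up to and including the '/'.
    chunk, sep, rest = path.partition("/")
    if not sep or not (chunk.isascii() and chunk.isdigit()):
        return None  # no match
    # Partially apply the view with the parsed integer.
    return rest, partial(view, int(chunk))


def route(prefix, matcher, view):
    def handler(path, request):
        if not path.startswith(prefix):
            return None
        matched = matcher(path[len(prefix):], view)
        if matched is None:
            return None
        rest, bound_view = matched
        # Only dispatch if the whole path has been consumed.
        return bound_view(request) if rest == "" else None
    return handler


def member_detail(member_id, request):
    return "member %d" % member_id


members = route("members/", int_param, member_detail)
```

Calling members("members/42/", None) reaches member_detail with the integer 42, while a path such as "members/luke/" simply fails to match.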

The beauty of this system is that nothing can go wrong any more. We still have DRY violations at the moment, but it doesn't cause a problem, because the compiler checks for consistency.

In fact, we can even remove the DRY violation. We could change the code like this:

----- MyProject/Routes.hs

import MyProject.Views

routes = [
   "members/" <+/> anyParam //-> memberDetail $ []
   -- etc...
]

----- MyProject/Views.hs

memberDetail memberId request = do
   member <- getMember memberId
   -- etc...

We've replaced intParam with anyParam, which is a polymorphic version that can match any parameter of type class Param. You can define your own Param instances, so this is completely extensible (and you can also define your own matchers, for complete power). We've also removed the type signature from memberDetail. So how can anyParam know what type of thing to match?

This is where type inference comes in. The function getMember will probably have a type signature, or it will use its parameter in such a way that its type signature can be inferred. From that, the type of memberId can be inferred. From that, the type of value that anyParam must return can be inferred. And from that, finally, the instance of Param can be chosen. The compiler is using the type system to pick which method should be used to match and parse the URL parameters based on how those parameters are eventually used.
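The nearest thing that can be sketched in Python is to choose the parser at run time from the view's annotation. To be clear, this is not type inference: it relies on an explicit annotation and on inspect.signature, and any_param, PARSERS and the two views are invented names. It does show the shape of the idea, though: the way the parameter is declared decides how the URL chunk is parsed.

```python
import inspect
from functools import partial

# Hypothetical registry, playing the role of the Param instances.
PARSERS = {
    int: lambda chunk: int(chunk) if chunk.isascii() and chunk.isdigit() else None,
    str: lambda chunk: chunk or None,
}


def any_param(path, view):
    # Look at the annotation on the view's first parameter to pick a parser.
    first = next(iter(inspect.signature(view).parameters.values()))
    parser = PARSERS[first.annotation]
    chunk, sep, rest = path.partition("/")
    value = parser(chunk) if sep else None
    if value is None:
        return None  # no match
    return rest, partial(view, value)


def member_by_id(member_id: int, request):
    return "member #%d" % member_id


def member_by_name(name: str, request):
    return "member %s" % name
```

The same any_param parses "42/" as an integer for member_by_id and "luke/" as a string for member_by_name, and refuses "luke/" for member_by_id.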

This is very nice. (At least I think so :-). We've removed the DRY violation, or, if we choose to use type signatures or explicitly specify types in routes, DRY violations don't matter because the compiler will catch them for us.

Would unit or functional tests have caught any problems? Well, they might. If they check the happy case, they will show whether that still works. But they're unlikely to check whether the URLconf is too permissive, whereas the compiler can do that kind of consistency check.

The end result is that there are just fewer things that can possibly go wrong. I'm not saying that you wouldn't bother to write any tests. But in this case, if memberDetail was really just glue, you might decide to only test its component parts (for example, by testing the template that it relies on). Since most of the glue has been constructed so that it can't go wrong, you can focus tests on what can go wrong. And some sections of the code sink below the threshold at which tests provide positive value.

There are many other ways in which static type checking can make automated tests redundant. Parsers are a great example – a spec might define a syntax in BNF notation. In Haskell, you might well implement that using parsec. But if you look at the code, it will have pretty much a one-to-one correspondence with the BNF definitions. Any tests you write will simply check that a few examples happen to be parsed correctly, as you cannot begin to cover the input space. It's therefore far better to spend your time manually checking that the code matches the BNF spec than writing lots of tests.
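The same one-to-one correspondence can be seen even in a hand-written parser. The sketch below is in Python rather than parsec, for a toy grammar made up for the purpose; the point is only that each function body can be checked against its grammar rule by eye.

```python
# Grammar (BNF-style):
#   expr   ::= number ("+" number)*
#   number ::= digit+


def parse_number(s, pos):
    # number ::= digit+
    end = pos
    while end < len(s) and s[end] in "0123456789":
        end += 1
    if end == pos:
        raise ValueError("expected digit at position %d" % pos)
    return int(s[pos:end]), end


def parse_expr(s, pos=0):
    # expr ::= number ("+" number)*
    value, pos = parse_number(s, pos)
    while pos < len(s) and s[pos] == "+":
        rhs, pos = parse_number(s, pos + 1)
        value += rhs
    return value, pos


def parse(s):
    value, pos = parse_expr(s)
    if pos != len(s):
        raise ValueError("unexpected input at position %d" % pos)
    return value


total = parse("1+22+3")
```

A handful of example-based tests like the one on the last line say very little about the infinite input space; reading the two functions against the two grammar rules says a lot more.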

It's also often argued that integration/unit tests that achieve 100% coverage will catch all type related errors, making static type checking redundant, since even with static type checking we'll need tests to catch the value related errors. But this is a myth. In Python, it's easy to have code with type errors that have 100% test coverage. A simple example:

from datetime import date

from django.db import models

class Discount(models.Model):
    # nullable if there is no expiry:
    expires_on = models.DateField(null=True)

    @property
    def has_expired(self):
        return date.today() > self.expires_on

def test_has_expired():
    d = Discount(expires_on=date(2000, 1, 1))
    assert d.has_expired is True

I omitted the negative case for has_expired for brevity, but we already have 100% coverage. However, we didn't check the None case and we'll get a TypeError at runtime for some legitimate values. In dynamically typed languages (or even all languages which allow nullable values), unit testing is extremely unhelpful for this situation. A powerful static type system like that found in Haskell, on the other hand, will find the error at compile time, and require that the signature of has_expired changes, along with all the related code. The changes needed to get it to compile are almost impossible to get wrong, so for the case of having no expiry date, you have trivial code that does not need hand-written automated tests (that is to say, the test you would have written would have so little value, and such relatively high cost, that writing one would be a failure of judgement and a waste of your current and future resources).
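If you want to see this happen without setting up Django, here is a self-contained sketch using a plain class in place of the model. The safe_has_expired variant is only an illustration of the kind of change a type checker would force; it assumes a discount with no expiry date never expires.

```python
from datetime import date


class Discount:
    def __init__(self, expires_on=None):
        # None means there is no expiry date.
        self.expires_on = expires_on

    @property
    def has_expired(self):
        return date.today() > self.expires_on

    @property
    def safe_has_expired(self):
        # Handle the 'no expiry' case explicitly.
        if self.expires_on is None:
            return False
        return date.today() > self.expires_on


def test_has_expired():
    d = Discount(expires_on=date(2000, 1, 1))
    assert d.has_expired is True


test_has_expired()  # passes, and every line of has_expired has been run

try:
    Discount().has_expired
    error = None
except TypeError as e:
    error = e  # comparing a date with None is not supported
```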

In general, unit tests often will not catch the type of errors that a compiler can if there is any polymorphism in the code paths. And in dynamically typed code, almost every code path can have polymorphism, because you can usually pass in None (and very often this is reasonable and legitimate), or any duck-typed object, and in the code itself you simply cannot tell how it will be called.

Conclusion

Before you flame me, don't think that I'm attacking other languages. This experience with Haskell has actually proved to me that Python is still easily my favourite language for web development, especially in combination with Django. (I could do a follow up on why that is—I have a growing list of things I dislike about Haskell, some of which are fixable). But I often hear the Python crowd saying things about static typing and testing that come from ignorance, and the way you would imagine things to be (often based on experience of Java/C++/C#), and not from experience of something like Haskell.

Notes

2017-05-15 - Added examples about tests not catching type errors, and opportunity cost.

Haskell blog software

2009-11-07T15:57:15Z

I finally finished the Haskell blog project that I've been doing for a long time! You're looking at it now (unless you are reading this a few months/years after I wrote it, in which case I will probably have again re-implemented my blog software in my new language-du-jour...) [EDIT: I switched to blogofile in June 2012]

The blog software itself is not particularly interesting – fairly standard features, Atom feeds etc. It uses HDBC Sqlite for storage, and HStringTemplate for rendering (a nice library, BTW). For framework stuff, it uses my own Ella library. I didn't find a forms/validation library I could use, and ended up just using a few adhoc bits and pieces. I've used the lovely pandoc to allow reStructuredText both for my own posts and for comments, which is a nice feature IMO.

The main interest for me has been the learning process. You get a much better, rounded understanding of a language from a project like this than you do from the small code samples that people knock around.

The project nearly failed at the last hurdle. Everything was working, but when I uploaded to my server, it failed on some URLs. I realised it was a memory problem – the CGI program must have been killed for using too much memory.

At first, I thought the limits on the server must be unreasonably small. The output of +RTS -s -RTS is kind of difficult to understand. When I eventually found out that GHC compiled programs never release any memory back to the operating system, I realised that it was the first figure (the total amount of memory allocated in the heap) that was killing me. On the bigger pages, this was over 160 MB. At that point I stopped complaining to my web host!

By changing to ByteString instead of Data.Text for StringTemplate, and using ByteString in a few other places, I achieved a 4-5 fold reduction in memory usage, along with a significant speed up. Most pages now only use about 10-15 MB to render, which is OK for a short running process I think. It's not ideal, especially when an additional 1k comment on a page seems to require at least 300k extra memory to render, but it's good enough for now. Profiling further will be very hard, as I suspect it will mainly be to do with the guts of HStringTemplate.

I'll be blogging about the experience of developing this over the next few days/weeks, and what I've learnt. It's certainly been enjoyable overall, although it's definitely had its pain points too!

I've put redirection in for all the old, crufty URLs, so there shouldn't be any broken links. Feed readers will likely be confused, sorry!

If you have problems getting through my spam protection, please let me know. It enforces a 10 second wait before it accepts submissions, which serves to prevent thoughtless comments as well as spam :-)

Building GHC is fun...

2009-11-04T11:09:41Z

So, I rewrote my blog software in Haskell, for kicks. I've finally finished, after a long time developing, trying out different ideas, learning Haskell etc.

I had already confirmed that I could build a binary for my target machine. That was a long process, which involved installing GHC 6.4 from binaries, and using that to build GHC 6.8.3. I have to build from source because of bug #2211.

However, in the process of developing, things have moved on, and it was much easier to develop with GHC 6.10 and newer libraries than the 6.8.* series. Which means that I now need GHC 6.10.* on the VM that I'm using to build binaries.

I tried 6.10.4, but due to bug #3179, I found I had to downgrade to 6.10.1.

Trying to build that, however, produced bug #3639 – it won't build with GHC 6.10.4. I switched to using my GHC 6.8.3 install to try to build it, but it still isn't happy:

Configuring ghc-6.10.1...
cabal-bin: At least the following dependencies are missing:
Cabal -any,
base <3,
filepath >=1 && <1.2,
haskell98 -any,
hpc -any,
template-haskell -any,
unix -any
make[2]: *** [boot.stage.2] Error 1
make[2]: Leaving directory `/home/build/build/ghc-6.10.1/compiler'
make[1]: *** [stage2] Error 2
make[1]: Leaving directory `/home/build/build/ghc-6.10.1'
make: *** [bootstrap2] Error 2

Now, GHC 6.8.3 comes with base = 3.0.2.0, which might be the problem here. If that's right, then you can't build GHC 6.10.1 with 6.8.3. So, it sounds like I'm going to have to build GHC 6.6.1 in order to build 6.10.1.

This seems pretty crazy! It wouldn't be so bad if GHC was quick to build, but every build takes many hours.

Anyway, here goes, wish me luck!

Haskell string support

2009-08-03T21:58:59+01:00

This is my suggestion about what needs to go into the Haskell Platform.

Consider the following extremely simple program:

s = "λ"

main = do
    writeFile "test.txt" s
    s2 <- readFile "test.txt"
    print (s == s2)

No prizes for guessing that the output of this program is not "True". It highlights an essential problem with the Haskell standard library – many of the functions provided by the Prelude, System.IO, System.Posix and many others are completely broken (by design) and silently corrupt your data, unless it is composed only of ASCII characters.

The problem is that these APIs use Strings for operating system calls (such as reading/writing files, reading environment variables etc). A String is a list of Unicode Chars, but none of the operating system calls have a clue what Unicode chars are – they work entirely with bytes, which are a completely different kind of thing. Result: your program breaks without warning if you don't happen to be using ASCII.
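The distinction is easy to demonstrate in Python 3, which keeps text and bytes apart. The sketch below assumes, purely for the sake of the example, that the broken write keeps only the low 8 bits of each character; the variable names are just for illustration.

```python
s = "λ"  # U+03BB

# Assumed behaviour of the broken write: each character is truncated to one byte.
written = bytes(ord(c) & 0xFF for c in s)
read_back = written.decode("latin-1")
corrupted = (s != read_back)

# Doing it properly: pick an encoding explicitly when crossing into bytes.
encoded = s.encode("utf-8")
restored = encoded.decode("utf-8")
```

Under that assumption the first round trip loses the character, while the second one, with the encoding made explicit, gives back exactly what went in.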

And even worse, many libraries are built on the use of Strings and standard library functions, and they inherit these same problems, so as a user of those libraries, you can end up with problems that you can't even work around. For the library developer, too, it can be a very nasty problem – you start developing code using Strings, which works fine for ages, but a long time later you realise you can't support just ASCII, and really you need Data.ByteString, which requires changing function signatures or duplicating existing code if you don't want to break compatibility.

This is a rather embarrassing situation for the standard library of a modern language. What's worse is that even if you include the Haskell Platform as it currently stands, as far as I can see there is no solution to this bug – no correct way to simply write a string out to disk and read it back! I presume this is because there is no universally accepted library for dealing with encodings. Personally, I'd like to see the standard library change to remove the pretence that you can talk Unicode to the operating system, but at the very least we need a standardised way of doing the right thing, so that developers (of both programs and libraries) don't have to use those broken functions, and know what the correct alternatives are.

GHC bug?

2009-07-04T17:38:44+01:00

What do you do when you are dealing with what seems like a bizarre compiler bug, with the compiler being nothing less than GHC? First, pinch yourself — check; then try again, 3 times to be sure – check; clear out 'dist/' and any temporary build files – check; sleep on it – check.

And it's still happening.

I'm trying to use HStringTemplate for my personal blog software, in particular the renderf function. I was getting tricky compilation errors, and in the course of messing around I found the following:

GHC cannot compile a certain function, call it func1 for now, which uses renderf. But it compiles and works just fine if another function func2 (which doesn't use renderf, but does use a related HStringTemplate function render) is present in the module, even though func2 is not used anywhere in the project. Changing some of the details of what func2 does causes compilation to fail again, though other details can be changed.

That has to be impossible, right? Am I losing my mind?

Ideally I'd create a nice simple test case, but that might take hours, and changing small things about the voodoo function func2 seems to destroy its magical properties, and I'm suspecting the problem is in me. So I'll just post all my code. The bad news is there are lots of dependencies. The good news is I have used cabal, so the following instructions should suffice if you have cabal installed.

Download and install 'ella' (CGI web framework I'm writing) and dependencies:

git clone https://github.com/spookylukey/ella/
cd ella/
git checkout ba74e25b6275f27c3832ac9529d05e076e5f4f43
cabal configure --user && cabal build && cabal install --user
cd ..

Download and build the blog software:

git clone https://github.com/spookylukey/haskellblog/
cd haskellblog/
git checkout e9dfecbcdd14b297d818c224707499cb75b0e24e
cabal configure --user && cabal build

The build should succeed. Now, the voodoo function is at the end of src/Blog/Views.hs. Comment it out:

perl -pi -e 's/this_is_not_used/-- this_is_not_used/' src/Blog/Views.hs

Build again:

cabal build

Result - this compilation error.

I don't know whether that compilation error is correct or not, but either way, it seems crazy that it could depend on the existence and implementation of a completely unused function.

For reference, I'm using GHC 6.10.1.

Any ideas?

2017 Update - looks like I was bitten by the monomorphism restriction ‘feature’.

Haskell regex problem - help needed!

2008-11-21T19:14:32Z

I need some help! I did someone a good deed on a blog the other day, so I'm swallowing my pride and asking for a random kind deed from someone who knows something.

While trying to compile this snippet, I get this compilation error - a complaint about a missing instance.

This occurs with:

GHC 6.8.2
Various packages installed system wide (ubuntu 8.04 packages) and locally (using cabal --user --prefix=$HOME/local)
bytestring-0.9.0.2 or anything later

If I revert to the bytestring that comes with my system, 0.9.0.1, the error goes away. Having finally looked at the differences between 0.9.0.1 and 0.9.0.2, which are tiny and do not include any changes to the definition of typeclass instances, it seems clear that this isn't really the problem, and that something else very funny is going on. But I do not have the first idea what.

I was just coping with it by sticking with bytestring-0.9.0.1, but I won't be able to do that forever...

Do I have to rebuild all the packages in my system or something evil? Any ideas?

Thanks in advance!