In Haskell, there is a datastructure called
Maybe is used where
you would often use null pointers in languages such as C# or Python. Python
doesn't have null pointers as such, but it does have the
None object, which
in many cases has the same kind of patterns and problems associated with it.
Maybe because no data types are nullable – unless you are
writing code that interfaces to C, you never have the possibility of a null
pointer. Here is a brief explanation why Haskell's
Maybe is infinitely
better than null pointers.
First, a word about
Maybe. It is defined, at its most basic level, something
like this in Haskell:
On the left hand side,
a stands for any type. This definition means that
a variable x which is, for example, of the type
Maybe String is either
Nothing or a
Just y, where y is the actual string. To work out which
it is, we normally use pattern matching, or a utility function that uses
pattern matching – but either way, we can't use variable x as if it was a
string, we have to explicitly unpack it. This might seem laborious,
especially when in other languages you get 'nullability' for free for most
variables. A few examples will highlight why Haskell's way is much better.
(By the way, the
deriving clause tells the compiler to automatically
generate methods for
Maybe for testing equality, ordering, printing as a
string and reading from a string).
Now, onto our examples. Suppose you have a function that is like the lookup function of a dictionary – it takes a key, and may or may not return a value. I'll use 'K' or 'k' for the type of the key. In C# 2.0, using generics, the function signature might look something like this:
The idea is that
GetItem() will return a null if the item isn't found —
this is a typical in languages like C#, whereas Python would probably throw a
KeyError. (In both languages, we might do
foo[key] rather than
foo.GetItem(key), which can be done easily in either). A more functional
implementation (but less idiomatic) would be to use delegates. We would
require a declaration like this:
In Haskell, the function would be typed like this:
Haskell already beats both Python and C# for the following reasons:
The fact that the nullability is explicit in the signature means that you can't forget that there is a possibility the object is 'null'. Even if you do forget, the compiler won't. So all those
NullPointerExceptions? You can say goodbye to them forever. If you have ever seen a
NoneType object has no attribute 'blah'in Python), and I'm sure we all have, then you have all the evidence you need that programmers, even good ones, do forget to do these checks.
Even better than not producing
NullPointerExceptionsetc. is the fact that you can forget about whether a certain function can return null or not. The compiler will always catch it for you, so you are free to concentrate on the programming. This is a different point – the first means you aren't producing programs that will ever crash with this error, the second means that as you are actually programming, you have less things in your mind to distract you, so you can keep your registers full with things you need.
But let's make things more interesting. First, let's say our dictionary is
caching results of potentially expensive database queries. So, the key might
be the SQL query string, and the value could be objects of various kinds. In
Nothing is a valid value that the dictionary might hold – it
means we have done the search already and found nothing. The 'lookup'
function in this case would return
Just Nothing. But what about the
C#/Python versions? There is nothing to stop us putting a null value in
against a certain key, but when we do the lookup, a null pointer or
object could now either mean 'we already did the search and we found
nothing', or 'we haven't done the search'.
Let's add another level. Suppose we are searching for a value in the database that is held in a column that can actually be NULL. For example, we might be doing a search for Mrs Jones's age, and come back with the result that, although we do actually have a record for Mrs Jones, we don't have a value for her age.
In the C# version, we are really in trouble now. We have 3 possible meanings for 'null', with no way to tell them apart. In Python, we could solve this either by throwing different types of exceptions (as we could in C#), or by additional null objects – in Python there is nothing to stop you creating singleton null objects and assigning them to any variable. We can then check for identity against these null objects to cope with our various errors.
Our Python solution suffers some limitations – both exceptions and
additional null objects will give us headaches if we try to serialise and
resurrect them (
None will resurrect OK, but for other singleton null
objects you need a significantly more sophisticated mechanism than the ones
normally used – try it). But we also have to remember to document and handle
all the possibilities. And when it comes to writing generic code that can
easily be reused and combined in novel ways, our hacks will get in the way.
We could decide to use a tuple to return (success_boolean,
actual_value_or_None) at one or more levels, but this would be seen as
unpythonic when the vast majority of code out there will either throw
exceptions or return None.
By contrast, Haskell's
Maybe mechanism is a breath of fresh air. You can
just wrap as many layers of
Maybe as you need. Our 'lookup' function
would not need modifying, but would in fact be returning an object of type
Just (Just (Just Int)). This could be:
Nothing, meaning we haven't got a cached result for the search,
Just Nothingto mean we have done the search, but not found a record for Mrs Jones at all,
Just (Just Nothing)to indicate we have already done the search, and the result was the database doesn't know the age of Mrs Jones,
Just (Just (Just 43)), if Mrs Jones is 43 years old.
We won't get confused with all these levels – for a start, each bit of code
that adds or removes a
Maybe wrapper is likely to only deal with one
layer. But the compiler will catch it for us anyway if we make a mistake.
Furthermore, we don't have to learn various different ways to cope with null
— we only have one, and we can be sure that our code won't need hacks to
keep it working in different scenarios.
And, to add some icing to the cake,
Maybe has been written as an instance
of a Monad. We all know the common pattern in code that is basically "if the
value I get at this point is null, then return null from this function".
Using Maybe as a monad, we can eliminate a lot of that kind of stuff – the
monad encapsulates this pattern for us. Sometimes it will be clearer to
explicitly check for
Just x, but often you will be able
avoid doing that. Using monad notation what you effectively do is write a
number of functions that are bound together using the 'bind' method of Monads
(though these functions look more like imperative statements in monad
Maybe version of 'bind' knows to abandon ship as soon as
it gets a
Nothing, and just return
Nothing. Of course, this way of
doing things is just as type-safe, but it just allows us to avoid some of the
explicit null handling.
In conclusion: this is one instance of Haskell's wonderfully designed type system. If languages like C#, C++, Java etc have put you off static typing, Haskell might be able to change your mind.