All Unkept
Posted in: Haskell, Python, Software development  —  7 May 2007

Null pointers vs None vs Maybe

In Haskell, there is a datastructure called Maybe. Maybe is used where you would often use null pointers in languages such as C# or Python. Python doesn't have null pointers as such, but it does have the None object, which in many cases has the same kind of patterns and problems associated with it. Haskell needs Maybe because no data types are nullable -- unless you are writing code that interfaces to C, you never have the possibility of a null pointer. Here is a brief explanation why Haskell's Maybe is infinitely better than null pointers.

First, a word about Maybe. It is defined, at its most basical level, something like this in Haskell:

data  Maybe a  =  Nothing | Just a
  deriving (Eq, Ord, Show, Read)

On the left hand side, a stands for any type. This definition means that a variable x which is, for example, of the type Maybe String is either Nothing or a Just y, where y is the actual string. To work out which it is, we normally use pattern matching, or a utility function that uses pattern matching -- but either way, we can't use variable x as if it was a string, we have to explicitly unpack it. This might seem laborious, especially when in other languages you get 'nullability' for free for most variables. A few examples will highlight why Haskell's way is much better. (By the way, the deriving clause tells the compiler to automatically generate methods for Maybe for testing equality, ordering, printing as a string and reading from a string).

Now, onto our examples. Suppose you have a function that is like the lookup function of a dictionary -- it takes a key, and may or may not return a value. I'll use 'K' or 'k' for the type of the key. In C# 2.0, using generics, the function signature might look something like this:

interface ILookup<K, V>
{
   V GetItem(K key);
}

The idea is that GetItem() will return a null if the item isn't found -- this is a typical in languages like C#, whereas Python would probably throw a KeyError. (In both languages, we might do foo[key] rather than foo.GetItem(key), which can be done easily in either). A more functional implementation (but less idiomatic) would be to use delegates. We would require a declaration like this:

delegate V Lookup<K, V>(K key);

In Haskell, the function would be typed like this:

lookup :: (Eq k) => k -> Maybe v

Haskell already beats both Python and C# for the following reasons:

  1. The fact that the nullability is explicit in the signature means that you can't forget that there is a possibility the object is 'null'. Even if you do forget, the compiler won't. So all those NullPointerExceptions? You can say goodbye to them forever. If you have ever seen a NullPointerException (or NoneType object has no attribute 'blah' in python), and I'm sure we all have, then you have all the evidence you need that programmers, even good ones, do forget to do these checks.
  2. Even better than not producing NullPointerExceptions etc. is the fact that you can forget about whether a certain function can return null or not. The compiler will always catch it for you, so you are free to concentrate on the programming. This is a different point -- the first means you aren't producing programs that will ever crash with this error, the second means that as you are actually programming, you have less things in your mind to distract you, so you can keep your registers full with things you need.

But let's make things more interesting. First, let's say our dictionary is caching results of potentially expensive database queries. So, the key might be the SQL query string, and the value could be objects of various kinds. In this case, Nothing is a valid value that the dictionary might hold -- it means we have done the search already and found nothing. The 'lookup' function in this case would return Just Nothing. But what about the C#/Python versions? There is nothing to stop us putting a null value in against a certain key, but when we do the lookup, a null pointer or None object could now either mean 'we already did the search and we found nothing', or 'we haven't done the search'.

Let's add another level. Suppose we are searching for a value in the database that is held in a column that can actually be NULL. For example, we might be doing a search for Mrs Jones's age, and come back with the result that, although we do actually have a record for Mrs Jones, we don't have a value for her age.

In the C# version, we are really in trouble now. We have 3 possible meanings for 'null', with no way to tell them apart. In Python, we could solve this either by throwing different types of exceptions (as we could in C#), or by additional null objects -- in Python there is nothing to stop you creating singleton null objects and assigning them to any variable. We can then check for identity against these null objects to cope with our various errors.

Our Python solution suffers some limitations -- both exceptions and additional null objects will give us headaches if we try to serialise and resurrect them (None will resurrect OK, but for other singleton null objects you need a significantly more sophisticated mechanism than the ones normally used -- try it). But we also have to remember to document and handle all the possibilities. And when it comes to writing generic code that can easily be reused and combined in novel ways, our hacks will get in the way. We could decide to use a tuple to return (success_boolean, actual_value_or_None) at one or more levels, but this would be seen as unpythonic when the vast majority of code out there will either throw exceptions or return None.

By contrast, Haskell's Maybe mechanism is a breath of fresh air. You can just wrap as many layers of Maybe as you need. Our 'lookup' function would not need modifying, but would in fact be returning an object of type Just (Just (Just Int)). This could be:

  1. Nothing, meaning we haven't got a cached result for the search,
  2. Just Nothing to mean we have done the search, but not found a record for Mrs Jones at all,
  3. Just (Just Nothing) to indicate we have already done the search, and the result was the database doesn't know the age of Mrs Jones,
  4. or Just (Just (Just 43)), if Mrs Jones is 43 years old.

We won't get confused with all these levels -- for a start, each bit of code that adds or removes a Maybe wrapper is likely to only deal with one layer. But the compiler will catch it for us anyway if we make a mistake. Furthermore, we don't have to learn various different ways to cope with null -- we only have one, and we can be sure that our code won't need hacks to keep it working in different scenarios.

And, to add some icing to the cake, Maybe has been written as an instance of a Monad. We all know the common pattern in code that is basically "if the value I get at this point is null, then return null from this function". Using Maybe as a monad, we can eliminate a lot of that kind of stuff -- the monad encapsulates this pattern for us. Sometimes it will be clearer to explicitly check for Nothing or Just x, but often you will be able avoid doing that. Using monad notation what you effectively do is write a number of functions that are bound together using the 'bind' method of Monads (though these functions look more like imperative statements in monad notation). The Maybe version of 'bind' knows to abandon ship as soon as it gets a Nothing, and just return Nothing. Of course, this way of doing things is just as type-safe, but it just allows us to avoid some of the explicit null handling.

In conclusion: this is one instance of Haskell's wonderfully designed type system. If languages like C#, C++, Java etc have put you off static typing, Haskell might be able to change your mind.

Comments §

§ On 7 May 2007, Laurie Cheers wrote:
196 > We won't get confused with all these levels -- for a start, each bit of
> code that adds or removes a Maybe wrapper is likely to only deal with
> one layer...

Yeah, right. If you call a database lookup function from a library, and it returns you Just Just Just Nothing, what are you going to do with that?

You're right that C# needs a way to signify whether a given value is allowed to be NULL. But if you're dealing with a complicated case like the one you've described, then C# (and Haskell too, I assume) will let you construct something much more self-documenting than a series of Justs.

Instead of returning Null plus an error number (which is basically what the Justs are encoding), I would much prefer the function to return a DatabaseLookupResult object, with members that I can query such as IsValueCached, IsValidRow, IsValidColumn, GetFoundValue and so on. Much nicer.

§ On 7 May 2007, Ricky Clarkson wrote:
197 I've been using Maybe from within Java for a while, after discovering it when learning Haskell, and I find it problematic, because I cannot easily change something from returning Maybe x to x, or vice versa. I have to go through and change all the call sites.

That is, Maybe seems to work best for monolithic software, software where all the call sites are in the same codebase. It also, for Java at least, pretty much requires some IDE support, or an extensible IDE (including a developer with the knowledge necessary to extend his IDE).

I find it much more valuable to get rid of possibly-null things than to make them into Maybes, but at least I still have Maybe on hand for those possible nulls that I can't yet get rid of.

§ On 7 May 2007, Phillip J. Eby wrote:
198 If you want serializable symbols in Python, just use a class or a function. Or if you want something fancier than that, use the SymbolType package from the CheeseShop.

§ On 7 May 2007, dmwit wrote:
199 Ricky: In Haskell, there is a function "liftM" which allows you to use non-Maybe functions on Maybe values. So, for example, if you have the function
foo x = x + 1 -- doesn't deal with Maybe correctly
And the value
val = Just 3
Then you can get Just 4 by doing
liftM foo val
This still handles "Nothing" correctly, as liftM foo Nothing will evaluate to Nothing. Cool!
~d

§ On 7 May 2007, Brian wrote:
200 So like when you are looking for a bug and you see something is a Maybe, does that mean you only might have found the bug?

C# has nullable value types, just so you know.

Null != 0

The null object pattern combats problems with null behavior if you constantly have to check for nulls.


Imagine trying to get an object out of a hashtable and getting a Maybe back... wth you mean mabye? Either you got something for that key or you don't. Why return some custom representation of "I Ain't g ot nothin yo", when you can just return nothing... a null.

§ On 7 May 2007, luke wrote:
201 Laurie: In the example I gave, you really would only have one layer dealing with each layer of Maybe. For example, the bit that handles the cache, would be (assuming this is in the IO monad to be able to do the db query and update the cache):

getWithCache cache query =
                           let cached = lookup cache query
                           in case cached of
                                 Nothing -> doQueryAndUpdateCache cache query
                                Just val -> return val
 
The point I was making is that I don't have to go back and change 'getWithCache' when I discover that the value it returns might itself be nullable, nor do I have to change the 'lookup' function on my cache object.

§ On 7 May 2007, luke wrote:
202 Phillip: I'm honoured to get a comment from you! Thanks for the link. I probably should have expanded my point about serializing. I've seen these two idioms for singleton null values:

class MyNullValue:
    pass
MyNullValue = MyNullValue()

and

MyNullValue = object()

I was just pointing out that neither of these work if you want to pickle, unpickle, and still expect 'is' to work correctly. But you are right, adding the 'MyNullValue = MyNullValue()' to stop people constructing the object themselves is unnecessary, and if you don't do that it works fine.

§ On 7 May 2007, chuck wrote:
203 > Imagine trying to get an object out of a hashtable and getting a Maybe back

You don't get Maybe back when you run it, you get Just x or Nothing. Maybe is the type, not a value. Because anything can return null, Java and C# and many other languages effectively make *every* function return a Maybe type, without the benefit of enforcing an exhaustive pattern match. Only Nice (a language for the JVM) appears to really address this at all, while C# has gone the opposite direction with nullable types. Now you can't even depend on primitives existing when they're supposed to. How is this considered to be in the neighborhood of sane programming practice?


I don't particularly agree with deeply nesting Maybe types either, especially attempting to get a property on a nonexistent row -- it should fail informatively instead. Of course, the 'fail' function for the Maybe monad is simply to return Nothing, so that remains generic monadic goodness, but the consumers of the database API are going to have to pattern-match the Maybe type because the result is constructed all at once. HaXML and HaskellDB provide richer interfaces, but of course are beyond a simple tutorial on the uses of a Maybe type.

§ On 7 May 2007, Tony Morris wrote:
204 > returns you Just Just Just Nothing, what are you going to do with that?

The article makes reference to the fact that Maybe is a monad. The bind (>>=) function allows you to chain computations exactly as you describe.

§ On 8 May 2007, Brian Donovan wrote:
205 While I appreciate the point of this discussion (Maybe/Just), I'd just like to point out that the situation for the value/null croud isn't as dire as you say.

For example, in Ruby, you might check to see whether a query had been cached not by checking to see if the value is nil, but by asking the Hash whether it has that key (the has_key? method).

You're still left with the nil vs. DbNull problem, but that's one layer removed.

§ On 8 May 2007, chuck wrote:
206 You're still with the ubiquitous possibility of "null pointer exceptions" in general. Haskell won't let a program compile that doesn't handle the None case, either through bind (which just propagates None rather like a thrown exception) or through explicit pattern-matching.

Using Maybe for handling error cases is informative, but it's a bit of a disservice to teach people to use it that way. The "fetch from nonexistent record" thing should throw an exception. Real world code can always use Control.Exception, which conveniently catches the IO monad's 'error' values.

§ On 10 May 2007, Miles Gould wrote:
207 chuck: alas, no, or at least, not with my copy of GHC. Compiling a file containing only

fred (Just x) = x

gives the following:

Loading package base-1.0 ... linking ... done.
Compiling Main ( maybe.hs, interpreted )
Ok, modules loaded: Main.
*Main> fred Nothing
*** Exception: maybe.hs:1:0-16: Non-exhaustive patterns in function fred

I believe ndm's Catch tool is meant to solve this problem, though.

Luke: the database example doesn't help your case at all. That's a *horrible* way of dealing with that problem. Using null pointers for every possible error condition is horrible, too, but I don't think anyone would actually do that. In general, Maybe's nice, but I'm not convinced it's the unalloyed win you seem to think it is - all the Justs and Nothings do clutter up your code rather. Does having to type out "Just x" everywhere make it *easier* to forget about? Maybe this can be avoided if you're good enough at monadic code, but I'm not there yet :-(

§ On 10 May 2007, luke wrote:
208 Miles:

thanks for your feedback. You're right, the database example isn't great. But I don't think this is necessarily a situation where you'd use exception handling. I don't consider it an error condition if a user-initiated query on a database returns no result, and you really don't need -- and can't provide -- any more information about the problem. An error condition is when the database isn't there, or it doesn't contain the information you would expect to be there. Also, for database NULLs, Maybe does seem a pretty good fit, though you would probably be dealing with large record structures rather than single columns from a table.

The point I should have majored on was *orthogonality* -- I should be able to write my cache system or dictionary etc just once, and it either shouldn't go wrong if I use it in an unexpected way, or shouldn't allow me to use it incorrectly.

With regards to regards to 'forgetting' about it, what I meant was that the only things you have to deal with are things that are on the surface. The function you are using returns Maybe, so you have to deal with Nothing. You never have to worry about other possibilities -- the type system has ruled them out.

§ On 14 April 2009, Anonymous Coward wrote:
422 Using the Maybe Monad is simple - the syntax helps a lot. Look at the following code:
alpha renames a variable (from name to name') within an abstract syntax tree, unless there's a name collision:

alpha :: Name -> Name -> Term -> Maybe Term

alpha name name' (Variable var) = if var == name'
    then Nothing -- collision; name already used
    else if var == name
        then Just (Variable name') -- rename
        else Just (Variable name) -- keep old name

alpha name name' (Branch t1 t2) = do
    {
        t1' <- alpha name name' t1;
        t2' <- alpha name name' t2;
        return (Branch t1' t2');
    }

alpha _ _ t = Just t

In the Variable case, I'm explicitly checking whether the names collide; if so, I'll have to manually state failure (yield Nothing).
In the Branch case, renaming fails if either t1 or t2 cannot be renamed. Using do-notation, this is handled automatically.
t1' and t2' are of type Term (not Maybe Term). If, for example, t1' cannot be extracted (alpha ... t1 evaluated to Nothing), the whole expression evaluates to Nothing.
return simply wraps a value into a Just (at least in the Maybe monad).
The last line is a catch-all, simply wrapping other nodes that need not be changed.

§ On 8 March 2010, Bob Foster wrote:
863 Good article. You may have got more intelligent responses from the unwashed masses - who think that whatever their language does is obviously the right thing - if you'd gone into a little more detail about the number of bugs caused by null, and the cost of these bugs. C.A.R. Hoare, who added null to Algol W in 1965, calls it the "billion dollar mistake".

http://tinyurl.com/8wmgtt

Why don't we know what bugs-waiting-to-happen like null actually cost? Good question. There seems to be no culture of empiricism in computing.

§ On 15 June 2010, Luke Worth wrote:
893 Here's (untested) Maybe in C++, just so you know you can get all those Maybe advantages in any language with generics:

template <typename T>
class Maybe
{
public:
   Maybe() : yes_(false), a_()
   {
   }

   Maybe(T const & a) : yes_(true), a_(a)
   {
   }

   void Set() { yes_ = false; }
   void Set(T const & a) { yes_ = true; a_ = a; }
   T & Get() { assert(yes_); return a_; }
   bool Yes() { return yes_; }
   bool No() { return !yes_; }

private:
   bool yes_;
   T a_;
};

Add comment

Format:

  • Javascript has to be on to get past my spam protection, and cookies, and there is a delay, sorry for any inconvenience!
  • I reserve the right to moderate comments.