In Haskell, there is a datastructure called Maybe. Maybe is used where you would often use null pointers in languages such as C# or Python. Python doesn't have null pointers as such, but it does have the None object, which in many cases has the same kind of patterns and problems associated with it. Haskell needs Maybe because no data types are nullable -- unless you are writing code that interfaces to C, you never have the possibility of a null pointer. Here is a brief explanation why Haskell's Maybe is infinitely better than null pointers.
First, a word about Maybe. It is defined, at its most basical level, something like this in Haskell:
data Maybe a = Nothing | Just a deriving (Eq, Ord, Show, Read)
On the left hand side, a stands for any type. This definition means that a variable x which is, for example, of the type Maybe String is either Nothing or a Just y, where y is the actual string. To work out which it is, we normally use pattern matching, or a utility function that uses pattern matching -- but either way, we can't use variable x as if it was a string, we have to explicitly unpack it. This might seem laborious, especially when in other languages you get 'nullability' for free for most variables. A few examples will highlight why Haskell's way is much better. (By the way, the deriving clause tells the compiler to automatically generate methods for Maybe for testing equality, ordering, printing as a string and reading from a string).
Now, onto our examples. Suppose you have a function that is like the lookup function of a dictionary -- it takes a key, and may or may not return a value. I'll use 'K' or 'k' for the type of the key. In C# 2.0, using generics, the function signature might look something like this:
interface ILookup<K, V>
{
V GetItem(K key);
}
The idea is that GetItem() will return a null if the item isn't found -- this is a typical in languages like C#, whereas Python would probably throw a KeyError. (In both languages, we might do foo[key] rather than foo.GetItem(key), which can be done easily in either). A more functional implementation (but less idiomatic) would be to use delegates. We would require a declaration like this:
delegate V Lookup<K, V>(K key);
In Haskell, the function would be typed like this:
lookup :: (Eq k) => k -> Maybe v
Haskell already beats both Python and C# for the following reasons:
- The fact that the nullability is explicit in the signature means that you can't forget that there is a possibility the object is 'null'. Even if you do forget, the compiler won't. So all those NullPointerExceptions? You can say goodbye to them forever. If you have ever seen a NullPointerException (or NoneType object has no attribute 'blah' in python), and I'm sure we all have, then you have all the evidence you need that programmers, even good ones, do forget to do these checks.
- Even better than not producing NullPointerExceptions etc. is the fact that you can forget about whether a certain function can return null or not. The compiler will always catch it for you, so you are free to concentrate on the programming. This is a different point -- the first means you aren't producing programs that will ever crash with this error, the second means that as you are actually programming, you have less things in your mind to distract you, so you can keep your registers full with things you need.
But let's make things more interesting. First, let's say our dictionary is caching results of potentially expensive database queries. So, the key might be the SQL query string, and the value could be objects of various kinds. In this case, Nothing is a valid value that the dictionary might hold -- it means we have done the search already and found nothing. The 'lookup' function in this case would return Just Nothing. But what about the C#/Python versions? There is nothing to stop us putting a null value in against a certain key, but when we do the lookup, a null pointer or None object could now either mean 'we already did the search and we found nothing', or 'we haven't done the search'.
Let's add another level. Suppose we are searching for a value in the database that is held in a column that can actually be NULL. For example, we might be doing a search for Mrs Jones's age, and come back with the result that, although we do actually have a record for Mrs Jones, we don't have a value for her age.
In the C# version, we are really in trouble now. We have 3 possible meanings for 'null', with no way to tell them apart. In Python, we could solve this either by throwing different types of exceptions (as we could in C#), or by additional null objects -- in Python there is nothing to stop you creating singleton null objects and assigning them to any variable. We can then check for identity against these null objects to cope with our various errors.
Our Python solution suffers some limitations -- both exceptions and additional null objects will give us headaches if we try to serialise and resurrect them (None will resurrect OK, but for other singleton null objects you need a significantly more sophisticated mechanism than the ones normally used -- try it). But we also have to remember to document and handle all the possibilities. And when it comes to writing generic code that can easily be reused and combined in novel ways, our hacks will get in the way. We could decide to use a tuple to return (success_boolean, actual_value_or_None) at one or more levels, but this would be seen as unpythonic when the vast majority of code out there will either throw exceptions or return None.
By contrast, Haskell's Maybe mechanism is a breath of fresh air. You can just wrap as many layers of Maybe as you need. Our 'lookup' function would not need modifying, but would in fact be returning an object of type Just (Just (Just Int)). This could be:
- Nothing, meaning we haven't got a cached result for the search,
- Just Nothing to mean we have done the search, but not found a record for Mrs Jones at all,
- Just (Just Nothing) to indicate we have already done the search, and the result was the database doesn't know the age of Mrs Jones,
- or Just (Just (Just 43)), if Mrs Jones is 43 years old.
We won't get confused with all these levels -- for a start, each bit of code that adds or removes a Maybe wrapper is likely to only deal with one layer. But the compiler will catch it for us anyway if we make a mistake. Furthermore, we don't have to learn various different ways to cope with null -- we only have one, and we can be sure that our code won't need hacks to keep it working in different scenarios.
And, to add some icing to the cake, Maybe has been written as an instance of a Monad. We all know the common pattern in code that is basically "if the value I get at this point is null, then return null from this function". Using Maybe as a monad, we can eliminate a lot of that kind of stuff -- the monad encapsulates this pattern for us. Sometimes it will be clearer to explicitly check for Nothing or Just x, but often you will be able avoid doing that. Using monad notation what you effectively do is write a number of functions that are bound together using the 'bind' method of Monads (though these functions look more like imperative statements in monad notation). The Maybe version of 'bind' knows to abandon ship as soon as it gets a Nothing, and just return Nothing. Of course, this way of doing things is just as type-safe, but it just allows us to avoid some of the explicit null handling.
In conclusion: this is one instance of Haskell's wonderfully designed type system. If languages like C#, C++, Java etc have put you off static typing, Haskell might be able to change your mind.
Comments §
> code that adds or removes a Maybe wrapper is likely to only deal with
> one layer...
Yeah, right. If you call a database lookup function from a library, and it returns you Just Just Just Nothing, what are you going to do with that?
You're right that C# needs a way to signify whether a given value is allowed to be NULL. But if you're dealing with a complicated case like the one you've described, then C# (and Haskell too, I assume) will let you construct something much more self-documenting than a series of Justs.
Instead of returning Null plus an error number (which is basically what the Justs are encoding), I would much prefer the function to return a DatabaseLookupResult object, with members that I can query such as IsValueCached, IsValidRow, IsValidColumn, GetFoundValue and so on. Much nicer.
That is, Maybe seems to work best for monolithic software, software where all the call sites are in the same codebase. It also, for Java at least, pretty much requires some IDE support, or an extensible IDE (including a developer with the knowledge necessary to extend his IDE).
I find it much more valuable to get rid of possibly-null things than to make them into Maybes, but at least I still have Maybe on hand for those possible nulls that I can't yet get rid of.
foo x = x + 1 -- doesn't deal with Maybe correctly
And the value
val = Just 3
Then you can get Just 4 by doing
liftM foo val
This still handles "Nothing" correctly, as liftM foo Nothing will evaluate to Nothing. Cool!
~d
C# has nullable value types, just so you know.
Null != 0
The null object pattern combats problems with null behavior if you constantly have to check for nulls.
Imagine trying to get an object out of a hashtable and getting a Maybe back... wth you mean mabye? Either you got something for that key or you don't. Why return some custom representation of "I Ain't g ot nothin yo", when you can just return nothing... a null.
getWithCache cache query =
let cached = lookup cache query
in case cached of
Nothing -> doQueryAndUpdateCache cache query
Just val -> return val
The point I was making is that I don't have to go back and change 'getWithCache' when I discover that the value it returns might itself be nullable, nor do I have to change the 'lookup' function on my cache object.
class MyNullValue:
pass
MyNullValue = MyNullValue()
and
MyNullValue = object()
I was just pointing out that neither of these work if you want to pickle, unpickle, and still expect 'is' to work correctly. But you are right, adding the 'MyNullValue = MyNullValue()' to stop people constructing the object themselves is unnecessary, and if you don't do that it works fine.
You don't get Maybe back when you run it, you get Just x or Nothing. Maybe is the type, not a value. Because anything can return null, Java and C# and many other languages effectively make *every* function return a Maybe type, without the benefit of enforcing an exhaustive pattern match. Only Nice (a language for the JVM) appears to really address this at all, while C# has gone the opposite direction with nullable types. Now you can't even depend on primitives existing when they're supposed to. How is this considered to be in the neighborhood of sane programming practice?
I don't particularly agree with deeply nesting Maybe types either, especially attempting to get a property on a nonexistent row -- it should fail informatively instead. Of course, the 'fail' function for the Maybe monad is simply to return Nothing, so that remains generic monadic goodness, but the consumers of the database API are going to have to pattern-match the Maybe type because the result is constructed all at once. HaXML and HaskellDB provide richer interfaces, but of course are beyond a simple tutorial on the uses of a Maybe type.
The article makes reference to the fact that Maybe is a monad. The bind (>>=) function allows you to chain computations exactly as you describe.
For example, in Ruby, you might check to see whether a query had been cached not by checking to see if the value is nil, but by asking the Hash whether it has that key (the has_key? method).
You're still left with the nil vs. DbNull problem, but that's one layer removed.
Using Maybe for handling error cases is informative, but it's a bit of a disservice to teach people to use it that way. The "fetch from nonexistent record" thing should throw an exception. Real world code can always use Control.Exception, which conveniently catches the IO monad's 'error' values.
fred (Just x) = x
gives the following:
Loading package base-1.0 ... linking ... done.
Compiling Main ( maybe.hs, interpreted )
Ok, modules loaded: Main.
*Main> fred Nothing
*** Exception: maybe.hs:1:0-16: Non-exhaustive patterns in function fred
I believe ndm's Catch tool is meant to solve this problem, though.
Luke: the database example doesn't help your case at all. That's a *horrible* way of dealing with that problem. Using null pointers for every possible error condition is horrible, too, but I don't think anyone would actually do that. In general, Maybe's nice, but I'm not convinced it's the unalloyed win you seem to think it is - all the Justs and Nothings do clutter up your code rather. Does having to type out "Just x" everywhere make it *easier* to forget about? Maybe this can be avoided if you're good enough at monadic code, but I'm not there yet :-(
thanks for your feedback. You're right, the database example isn't great. But I don't think this is necessarily a situation where you'd use exception handling. I don't consider it an error condition if a user-initiated query on a database returns no result, and you really don't need -- and can't provide -- any more information about the problem. An error condition is when the database isn't there, or it doesn't contain the information you would expect to be there. Also, for database NULLs, Maybe does seem a pretty good fit, though you would probably be dealing with large record structures rather than single columns from a table.
The point I should have majored on was *orthogonality* -- I should be able to write my cache system or dictionary etc just once, and it either shouldn't go wrong if I use it in an unexpected way, or shouldn't allow me to use it incorrectly.
With regards to regards to 'forgetting' about it, what I meant was that the only things you have to deal with are things that are on the surface. The function you are using returns Maybe, so you have to deal with Nothing. You never have to worry about other possibilities -- the type system has ruled them out.
alpha renames a variable (from name to name') within an abstract syntax tree, unless there's a name collision:
alpha :: Name -> Name -> Term -> Maybe Term
alpha name name' (Variable var) = if var == name'
then Nothing -- collision; name already used
else if var == name
then Just (Variable name') -- rename
else Just (Variable name) -- keep old name
alpha name name' (Branch t1 t2) = do
{
t1' <- alpha name name' t1;
t2' <- alpha name name' t2;
return (Branch t1' t2');
}
alpha _ _ t = Just t
In the Variable case, I'm explicitly checking whether the names collide; if so, I'll have to manually state failure (yield Nothing).
In the Branch case, renaming fails if either t1 or t2 cannot be renamed. Using do-notation, this is handled automatically.
t1' and t2' are of type Term (not Maybe Term). If, for example, t1' cannot be extracted (alpha ... t1 evaluated to Nothing), the whole expression evaluates to Nothing.
return simply wraps a value into a Just (at least in the Maybe monad).
The last line is a catch-all, simply wrapping other nodes that need not be changed.
http://tinyurl.com/8wmgtt
Why don't we know what bugs-waiting-to-happen like null actually cost? Good question. There seems to be no culture of empiricism in computing.
template <typename T>
class Maybe
{
public:
Maybe() : yes_(false), a_()
{
}
Maybe(T const & a) : yes_(true), a_(a)
{
}
void Set() { yes_ = false; }
void Set(T const & a) { yes_ = true; a_ = a; }
T & Get() { assert(yes_); return a_; }
bool Yes() { return yes_; }
bool No() { return !yes_; }
private:
bool yes_;
T a_;
};