All Unkept

Implementing a dictionary using first class functions

Posted in: Haskell  — 

I'm going to demonstrate using functions to build dictionaries, requiring only normal functions and the Maybe data type. This is a literate programming article (you can copy, paste, save it as 'something.lhs' and run it using 'runhaskell something.lhs'). The principles apply to any language which can use functions as first class values, but I'm using Haskell, as it is ideally suited to this sort of thing.

Disclaimer: The 'dictionary' implementation is very general and therefore very limited — the only concept it supports is "what is the value for this key (if any)?".

First, some imports that will be needed to demonstrate uses of our dictionary:

> import Data.List
> import qualified Data.Map as Map

A 'functional dictionary' (as I shall call them) is defined simply as a lookup function — a function that takes an argument, and sometimes returns a Maybe value — that is, it returns Nothing if no match is found, or Just something if a match is found. The type in Haskell would be: a -> Maybe b

An empty dictionary always returns nothing:

> emptyDict = const Nothing

To add an item to our dictionary, we need to wrap it and return a new function. To start off with, we will add a key-value pair (a is the key, b is the value). In this case our key type will have to support testing for equality.

> addKeyValue :: Eq a => a -> b -> (a -> Maybe b) -> (a -> Maybe b)

That is, this function will take a key, a value, and a functional dictionary, and return a new function that will try to match the key, return the value (wrapped in Just) if it matches, but otherwise use the original dictionary.

> addKeyValue key val dict = \k -> if k == key
>                                     then Just val
>                                     else dict k

Here is a test — I'll create is as an action (IO function), so that we can run all the tests at the end. If I write the last paragraph of this article now, I can then incrementally add code and run the tests, by running the literate Haskell file that contains this article. The first argument to assertEquals is a name that we can use to identify tests that fail.

> d0 = emptyDict :: (String -> Maybe String) -- GHC complains at us without this type declaration
> d1 = addKeyValue "testKey" "test value 1" d0
> d2 = addKeyValue "another key" "value 2" d1
> test1 = do
>         assertEquals "addKeyValue 1" (d2 "testKey") (Just "test value 1")
>         assertEquals "addKeyValue 2" (d2 "another key") (Just "value 2")
>         assertEquals "addKeyValue 3" (d2 "testBadKey") Nothing

Now we can also remove items in a similar way:

> removeKey key dict = \k -> if k == key
>                               then Nothing
>                               else dict k

And some tests:

> d3 = removeKey "testKey" d2
> test2 = do
>         assertEquals "Test removeKey 1" (d3 "testKey") Nothing
>         assertEquals "Test removeKey 2" (d3 "another key") (Just "value 2")

Now, what is the point in all this? Why not use a data structure that is optimised for dictionary lookups? Well, first I wanted to demonstrate how useful it can be to have functions as first class values — we have built a flexible, working 'lookup' implementation with very little code.

But more importantly, this kind of thing can actually be an efficient way of doing things for some uses. Sometimes you require a structure that acts like a dictionary, but occasionally you will be able to do massive memory and/or speed optimisations using some inside knowledge e.g. you know that everything that starts with "x" returns the same answer.

> addPrefixMatcher prefix val dict = \k -> if prefix `isPrefixOf` k
>                                             then Just val
>                                             else dict k
> d4 = addPrefixMatcher "x" "A value for things starting with 'x'" d3
> test3 = do
>         assertEquals "prefix 1" (d4 "xenophobic") (Just "A value for things starting with 'x'")
>         assertEquals "prefix 2" (d4 "zebra") Nothing

The pattern of 'if match then Just val else dict k' is getting tedious, so lets abstract it:

> addMatcher :: (a -> Bool) -> b -> (a -> Maybe b) -> (a -> Maybe b)
> addMatcher tester val dict = \k -> if tester k
>                                       then Just val
>                                       else dict k

We can now rewrite our other functions if we want:

> addPrefixMatcher' p = addMatcher (p `isPrefixOf`)
> d4' = addPrefixMatcher' "x" "A value for things starting with 'x'" d3
> test3' = do
>          assertEquals "prefix 1 (v2)" (d4 "xenophobic") (Just "A value for things starting with 'x'")
>          assertEquals "prefix 2 (v2)" (d4 "zebra") Nothing

Or we can demonstrate some other matchers:

> d5 = addMatcher ((<= 3) . length) "the value for short keys" d4
> test4 = do
>         assertEquals "length matcher 1" (d5 "abc") (Just "the value for short keys")
>         assertEquals "length matcher 2" (d5 "xool") (Just "A value for things starting with 'x'")

We can combine several dictionaries into a larger one which will try each in order until an answer is found:

> combineDicts :: [(a -> Maybe b)] -> (a -> Maybe b)
> combineDicts [] = emptyDict
> combineDicts (d:ds) = \k -> let v = d k
>                             in case v of
>                                       Nothing -> combineDicts (ds) k
>                                       Just _  -> v
> d6 = combineDicts [d5, addPrefixMatcher "z" "value for z" emptyDict]
> test5 = do
>         assertEquals "combineDicts 1" (d6 "z") (Just "the value for short keys")
>         assertEquals "combineDicts 2" (d6 "zoology") (Just "value for z")
>         assertEquals "combineDicts 3" (d6 "ridiculous") Nothing

We can even wrap the standard library Data.Map with our implementation. This way, you could combine a structure that is efficient at searching for some answers, and a functional dictionary that is more flexible but only knows the 'exceptional' cases.

> mapToFuncDict map = \k -> Map.lookup k map

To test it:

> map1 = Map.singleton "map key 1" "map val 1"
> d7 = combineDicts [d6, mapToFuncDict map1]
> test6 = do
>         assertEquals "mapToFuncDict 1" (d7 "map key 1") (Just "map val 1")
>         assertEquals "mapToFuncDict 2" (d7 "zoology") (Just "value for z")

That's all for this article, but hopefully it demonstrates something of how useful it is to have functions as first class values, and something of the functional mindset.

Of course, finally, we need to define assertEquals and run all the tests.

> assertEquals testname realVal expectedVal =
>     putStrLn (testname ++ ": " ++
>                    if expectedVal == realVal
>                    then "passed"
>                    else "failed: \n expected " ++ show expectedVal ++ "\n received " ++ show realVal)
> tests = [test1, test2, test3, test3', test4, test5, test6]
> main = sequence_ tests

Comments §

blog comments powered by Disqus