Haskell API documentation is very lacking for newbies. For instance, I want to understand how to create and use regexes. If you start at Text.Regex.Posix documentation, it tells you that =~ and =~~ are the high level API, and the hyperlinks for those functions go to Text.Regex.Posix.Wrap, where the main functions are not actually documented at all!
So we look at the type signatures -- here is the first:
(=~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target) => source1 -> source -> target
So, that leads me to the class declarations for these things. But trying to understand them is rather intimidating:
class RegexOptions regex compOpt execOpt | regex -> compOpt execOpt, compOpt -> regex execOpt, execOpt -> regex compOpt where
Or how about this?
class RegexOptions regex compOpt execOpt => RegexMaker regex compOpt execOpt source | regex -> compOpt execOpt, compOpt -> regex execOpt, execOpt -> regex compOpt where
They are using multi-parameter type classes and functional dependencies. Having read bits of Haskell for a while, I happen to know what they are (vaguely), but I don't really understand them, nor does the above really give me any clue to how to actually use this API.
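For anyone else who is hazy on these two extensions, here is a minimal sketch of what a functional dependency buys you. Everything in it (EngineOptions, ToyEngine, ToyOpts, defaultOpts) is a hypothetical toy invented for illustration, not part of the regex packages, and it uses a single dependency where the real classes declare three.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

-- Hypothetical toy types; nothing here comes from regex-base or regex-posix.
data ToyEngine = ToyEngine

newtype ToyOpts = ToyOpts { caseSensitive :: Bool }
  deriving (Show, Eq)

-- A class relating two types. The dependency "engine -> opts" promises
-- that each engine type has exactly one option type, so the compiler can
-- work out 'opts' as soon as it knows 'engine'.
class EngineOptions engine opts | engine -> opts where
  defaultOpts :: engine -> opts

instance EngineOptions ToyEngine ToyOpts where
  defaultOpts _ = ToyOpts True

main :: IO ()
main =
  -- No annotation is needed on the result: the dependency fixes it to ToyOpts.
  print (defaultOpts ToyEngine)
```

Without the `| engine -> opts` part, the `print` call would be rejected as ambiguous, because nothing would tie the result type to ToyEngine. The long dependency lists in the regex classes play the same role: picking Regex pins down CompOption and ExecOption.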
Google to the rescue. (This is bad: I shouldn't have to google for documentation when I'm already looking at the obvious place for something to be documented.) The first result for "haskell regex" is a completely useless and hopelessly out of date page, but there is a Haskell regex tutorial on a blog that shows us how to do it, and it is astonishingly simple:
> "bar" =~ "(foo|bar)" :: String
"bar"
So what is going on? It looks like the library has been designed extremely cleverly so that in the simple case (regex with default options etc), you can use it very easily, but you don't need to use different functions if you want to add regex options. Furthermore, it is polymorphic in its return type, so we can also do this:
> "bar" =~ "(foo|bar)" :: Bool
True
In fact you can get lists of matches, or lists of match offsets etc -- almost anything you can think of, just by specifying (directly or using type inference) the type of the result you want. This is beautifully elegant and clever and I'm sure it gave the designer a warm fuzzy feeling inside (well, it gives me one, and I'm just looking at it). The downside is that if you try to use =~ at a GHCi prompt without a type annotation, you just get a ridiculously unhelpful error.
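To make the return-type polymorphism concrete, here is a small sketch. It assumes the regex-posix package is installed and that your version provides these result instances; the names (subject, regex, anyMatch and so on) are made up for the example.

```haskell
import Text.Regex.Posix ((=~))

subject, regex :: String
subject = "foo bar baz"
regex   = "ba[rz]"

-- The same expression, five different result types.
anyMatch :: Bool
anyMatch = subject =~ regex            -- True

firstMatch :: String
firstMatch = subject =~ regex          -- "bar"

beforeMatchAfter :: (String, String, String)
beforeMatchAfter = subject =~ regex    -- ("foo ","bar"," baz")

matchCount :: Int
matchCount = subject =~ regex          -- 2

offsetAndLength :: (Int, Int)
offsetAndLength = subject =~ regex     -- (4,3)

main :: IO ()
main = do
  print anyMatch
  print firstMatch
  print beforeMatchAfter
  print matchCount
  print offsetAndLength
```

Each binding carries its own type signature, which is what tells =~ what to return. Delete a signature and the compiler has nothing to go on, which is where the unhelpful GHCi error comes from.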
The problem here is that making the library so clever has also made it utterly impenetrable to the beginner. The main functions are not even documented, and there is no explanation of the crazy type signature. You might say that it is simply a documentation problem, but it is actually a combination of the two -- if the type signature had been something simple, it would have been easy to deduce how to use it. It seems to me that the documentation of a library has got to be proportional to the cleverness of its type signatures, or people are going to be absolutely lost. Since Haskell libraries are almost always implemented by Haskell gurus, and they implement them with themselves in mind (I have no objection to this, they are enthusiasts working for free), they use lots of clever code and advanced Haskell techniques. But this means that if you want people to actually use these libraries (and by consequence Haskell itself), the documentation for Haskell libraries has to be about an order of magnitude better than anything you'd find anywhere else. I suspect it is at least an order of magnitude worse than for something like .NET APIs, which means that relatively speaking the documentation of Haskell is currently in an absolutely dire state.
Sorry, I'm just saying it like it is. These libraries are great when you can get them to work, and I'm really grateful to the authors for their fantastic work, and the effort that has gone into packaging and distributing them (so that installation is literally one short command-line away), but the hurdles are still currently far too great compared to any other language for Haskell to become popular.
Moving forward, I guess one problem is contributing to a library's documentation. There is nothing on the API doc pages that shows you how to do this. I suspect you need to check out the source with darcs (not something I do normally, I just use cabal) and then start emailing patches or something. Even then, I don't know if I would contribute any documentation -- 'howto' style documentation seems out of place on the API pages, but it is desperately needed.
Comments §
One short command-line? Hah! cabal-install has 13 dependencies, some of which aren't packaged in most distros. In order to avoid having to learn how to install packages from hackage the hard way, you first have to learn how to install packages from hackage the hard way.
Perhaps this is merely a ploy to get people to realise how much they value cabal-install?
But is it being hard to install really such a bad thing? It'll be a front-end to Haskell packages, used by any number of people who may want to do minimal to no Haskell development (perhaps they just want XMonad), so it behooves the Haskell community to make sure it gets a lot of testing and usage before it goes into, say, Debian Stable.
As it is, cabal-install isn't entirely done. Witness the latest arguments over how it should handle the case where it installs executables inside its ~/.cabal/ (which is obviously not in one's $PATH by default).
Besides, the prerequisites are all cabalized. If the distros don't have them it's their fault (especially given the various cabal-to-package programs on Hackage).
(Where the repo is depends; regex-posix isn't a base library, so its repo could be anywhere.)
The issue is just that no one has written and submitted docs, not that Haskellers won't - I think the XMonad Haddocks are really good, and other packages by Don have good documentation as well (like ByteString's).
@ gwern: Yes, there are some libraries which are much better. I just started using 'template', a small library which has just the right sort of docs, and ByteString is generally very good, as you say.
I guess I am implicitly comparing to something like Python, which has a 'batteries included' standard library, and everything in it has to have documentation that is absolutely up to scratch. There are then other Python libraries for which the documentation is very variable. The problem with Haskell is that the standard libraries are much narrower in scope, so many common tasks are in the category of having unreliable documentation.
I am the author of the regex-* packages. Sorry about the lack of tutorial-level documentation; I write all this as a hobby and hardly ever use any of it. Documentation patches are welcome...
As for where the darcs repository is, the Package-URL in the cabal file points to
http://darcs.haskell.org/packages/regex-unstable/regex-base/
which is responsible for the high-level type machinery you are quoting. Other packages are in neighboring directories; I suggest using the code under "regex-unstable" regardless of its name.
That high-level API is the fusion of two medium level APIs, class RegexMaker and class RegexLike.
The first compiles the source (byte)string into a mostly opaque regex and the second uses that regex to match against some to-be-searched (byte)string. Using these two classes makes for less type-complicated code and allows the compiled Regex to be cached and reused.
The high-level API of class RegexContext builds on RegexLike to create all the nifty dependence on the requested type. One still uses RegexMaker with RegexContext. Then =~ and =~~ are merely fusions of RegexMaker and RegexContext.
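A minimal sketch of the two-step usage described above, assuming regex-posix is installed. makeRegex, matchTest and match are the class methods from regex-base; wordRegex, hasWord and firstWord are names invented for this example.

```haskell
import Text.Regex.Posix

-- Step 1 (RegexMaker): compile the pattern once into an opaque Regex.
wordRegex :: Regex
wordRegex = makeRegex "[a-z]+"

-- Step 2 (RegexLike): reuse the compiled Regex for a plain yes/no test.
hasWord :: String -> Bool
hasWord = matchTest wordRegex

-- RegexContext: 'match' picks its result from the requested type,
-- here the text of the first match.
firstWord :: String -> String
firstWord s = match wordRegex s

main :: IO ()
main = do
  print (map hasWord ["123 abc def", "12345"])  -- [True,False]
  putStrLn (firstWord "123 abc def")            -- abc
```

Because wordRegex is a top-level value, the pattern is compiled once and shared by every call, which is the caching benefit mentioned above.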
That being said, the API documentation for Haskell libraries is often quite sparse. It depends a lot on who originally wrote the code.
Besides, Python gets you spoiled. When you get some third-party code without introspectable docstrings, even with good external documentation, you’re not happy.
import Text.Regex.Posix

n :: [String]
n = "hello dfadf hello" =~ "hello"
main = print n
However a type of AllTextMatches [] String worked. You just have to use getAllTextMatches to get the actual list.
import Text.Regex.Posix

n :: AllTextMatches [] String
n = "hello dfadf hello" =~ "hello"
main = print $ getAllTextMatches n
Another problem I had was =~ won't work on precompiled regexes. It fails with different error messages depending on what order you put the string and regex in.
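For what it's worth, =~ appears to run its right-hand argument through makeRegex, so an already compiled Regex has no matching instance there. A hedged sketch of the alternative is to call match directly, which takes the Regex first and the text second (the reverse of =~). It assumes regex-posix is installed and that CompOption supports `.|.` from Data.Bits; helloNoCase and greetings are names invented for the example.

```haskell
import Data.Bits ((.|.))
import Text.Regex.Posix

-- Compile once, with a non-default option (case-insensitive matching).
helloNoCase :: Regex
helloNoCase =
  makeRegexOpts (defaultCompOpt .|. compIgnoreCase) defaultExecOpt "hello"

-- Use 'match' instead of =~ for a precompiled Regex.
-- Argument order is regex first, then the text to search.
greetings :: String -> [String]
greetings s = getAllTextMatches (match helloNoCase s)

main :: IO ()
main = print (greetings "Hello dfadf HELLO hello")
-- ["Hello","HELLO","hello"]
```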
Probably I'm just dumb but I found this interface to be way too complex. I still don't fully understand it and I've been working with it for a couple hours. It's a nice idea but needs more work to make it thoughtless to use.