<?xml version="1.0" encoding="UTF-8"?>
<feed
  xmlns="http://www.w3.org/2005/Atom"
  xmlns:thr="http://purl.org/syndication/thread/1.0"
  xml:lang="en"
   >
  <title type="text">All Unkept</title>
  <subtitle type="text"></subtitle>

  <updated>2013-05-24T11:16:23Z</updated>
  <generator uri="http://blogofile.com/">Blogofile</generator>

  <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog" />
  <id>http://lukeplant.me.uk/blog/feed/atom/</id>
  <link rel="self" type="application/atom+xml" href="http://lukeplant.me.uk/blog/feed/atom/" />
  <entry>
    <author>
      <name></name>
      <uri>http://lukeplant.me.uk/blog</uri>
    </author>
    <title type="html"><![CDATA[Bundling dependencies]]></title>
    <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog/posts/bundling-dependencies" />
    <id>http://lukeplant.me.uk/blog/posts/bundling-dependencies</id>
    <updated>2013-04-15T11:08:10Z</updated>
    <published>2013-04-15T11:08:10Z</published>
    <category scheme="http://lukeplant.me.uk/blog" term="Python" />
    <category scheme="http://lukeplant.me.uk/blog" term="Software development" />
    <category scheme="http://lukeplant.me.uk/blog" term="Django" />
    <summary type="html"><![CDATA[Bundling dependencies]]></summary>
    <content type="html" xml:base="http://lukeplant.me.uk/blog/posts/bundling-dependencies"><![CDATA[<div class="document">
<p>This post is about maintenance programming and the issue of Open Source
dependencies that may need customising. It compiles some of my current thoughts,
but I'm also eager to find out what other people do.</p>
<div class="section" id="approaches-to-dependencies">
<h1>3 approaches to dependencies</h1>
<ol class="arabic">
<li><p class="first">Pure dependency</p>
<p>The source code of the dependency does not become a part of your project in
any way. For a web project with Python and virtualenv/pip, you would just
list the project name and version in requirements.txt, and it will be
installed when you deploy your project.</p>
<p>This is by far the easiest approach to dependencies.</p>
</li>
<li><p class="first">Forked dependency</p>
<p>You create a fork of the library (usually hosted publicly, but not
necessarily) and add to it the changes you need. You then use this fork from
your main project.</p>
<p>This is done either in the hope that bug fixes and feature additions that you
make will be merged into the original, so that you won't have to maintain
your fork forever, or with the aim of keeping your changes small enough
that it will always be easy to merge in fixes from upstream.</p>
</li>
<li><p class="first">Bundled dependency</p>
<p>You take a copy of the library and include it directly in your own source
code, so that you can make whatever modifications you need. The code becomes a
part of your source code forever.</p>
</li>
</ol>
<p>This post is about number 3 — the bundled dependency.</p>
<p>(There are, of course, variants and mixtures of these — for example, <a class="reference external" href="https://www.djangoproject.com/">Django</a> has often bundled dependencies, but this was
purely because of the confusing state of packaging, and the code was never
modified for use in Django. These libraries have been or will be un-bundled as
soon as possible.)</p>
</div>
<div class="section" id="avoid-it-if-you-possibly-can">
<h1>Avoid it if you possibly can</h1>
<p>The first thing to say about bundling dependencies is that you should avoid doing
so if at all possible:</p>
<ul class="simple">
<li>It can result in a large increase in the size of your code base.</li>
<li>You won't get critical fixes from upstream, and it can be hard to merge them
in.</li>
</ul>
<p>Bundling a dependency can be a drastic decision — you are taking on all the
technical debt and maintenance burden of the code you are adding. Some
developers look at Open Source libraries and think “wow, all this free source
code I can just add to my project”. Your attitude needs to be exactly the
opposite: “Wow, look at all that code I'm going to have to maintain and debug”.</p>
<p>An external dependency is often much worse from a maintenance point of view than
code you have written yourself:</p>
<ul class="simple">
<li>You may not understand the code very well at all, and you may not have access
to the original reasons for the way it is.</li>
<li>When you add it to your project, you typically lose its history, making it
harder to track down reasons for its current state.</li>
<li>Library code can often be over-generalised and complex. It copes with all
kinds of situations that you don't need, but you will have to understand and
maintain that complexity.</li>
<li>The code will not ‘fit’ into your project well — there may be all kinds of
conventions and decisions that make it alien to your project, but now it is
part of your project and needs to fit.</li>
</ul>
</div>
<div class="section" id="alternatives">
<h1>Alternatives</h1>
<p>To avoid bundling a dependency, you can go for the ‘forked dependency’ approach
above. For the missing features you need, attempt to add extension points that
will give you the flexibility you need, rather than simply hard-coding something
very specific to your project that will never get merged upstream.</p>
<p>Another alternative is to build what you need yourself, or very selectively add
parts of the dependency into your own source. This may seem more work, but could
be easier to maintain long term.</p>
<p>Finally, you could consider a <a class="reference external" href="http://en.wikipedia.org/wiki/Monkey_patch">monkey patch</a>. But be very careful, and make
sure you know all the places where you are doing that kind of thing, so that you
can assess at what point you should be switching strategy.</p>
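<p>As a rough sketch of how you might keep track of monkey patches, you could route every patch through one function that records what was replaced and why. The <tt class="docutils literal">monkey_patch</tt> helper, the <tt class="docutils literal">APPLIED_PATCHES</tt> list and the <tt class="docutils literal">ThirdPartyCart</tt> class are all made up for illustration, and the code assumes Python 3:</p>

```python
# Central record of every patch applied, so they can be audited in one place.
APPLIED_PATCHES = []


def monkey_patch(target, name, replacement, reason):
    """Replace target.name with replacement, recording what was done and why."""
    original = getattr(target, name)
    setattr(target, name, replacement)
    APPLIED_PATCHES.append((target.__name__, name, reason))
    return original


class ThirdPartyCart:
    """Stand-in for a class that lives in an external library (hypothetical)."""

    def total(self):
        return 100


def total_with_surcharge(self):
    # Call through to the library's original behaviour, then adjust it.
    return _original_total(self) + 5


_original_total = monkey_patch(
    ThirdPartyCart, "total", total_with_surcharge,
    "fixed surcharge until upstream supports it",
)
```

<p>When <tt class="docutils literal">APPLIED_PATCHES</tt> starts to grow, that is a hint that it may be time to switch to a fork or a bundled copy.</p>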
</div>
<div class="section" id="when-you-should-consider-it">
<h1>When you should consider it</h1>
<p>However, there are times when you should consider bundling the dependency:</p>
<ul class="simple">
<li>When the changes you want to make are more than bug fixes.</li>
<li>When the changes can't be easily made by adding extension points to the original.</li>
<li>When the number/size of extensions is going to severely inhibit a developer's
ability to understand the code.</li>
</ul>
<p>I recently took on a project that had bundled a copy of <a class="reference external" href="http://www.satchmoproject.com/">Satchmo</a>. It was a bit of a shock, because
requirements.txt also listed Satchmo as a dependency, making me think I was in
situation 1, when actually I was in situation 3, which is much worse.</p>
<p>Sometimes, however, it is unavoidable: for example, you need to add multiple
fields to different DB models, or you need to make invasive changes in other ways. As I
looked at the number of modifications made to the bundled Satchmo, I realised
there was no way that strategies 1 or 2 would be any good. Strategy 3 had
already been chosen, it was impossible to turn back the clock, and with
hindsight it was probably the right decision.</p>
<p>But implementation of that decision was lacking in lots of ways.</p>
<p>So how do you cope when you are forced to bundle? Here are my hints so far.</p>
<ul>
<li><p class="first">Recognise that you have done a really bad thing, and you need to take equally
drastic action to cope with it. The bigger the dependency you've bundled, the
more likely it is that you have seriously damaged your ability to maintain the
project long term.</p>
</li>
<li><p class="first">Make sure you include the tests of the original dependency, and integrate them
as part of your test suite.</p>
<p>Sounds obvious, but in the project that inspired this post, the opposite had
been done — they had copied all the source code, with the exception of every
file called 'tests.py' or directory called 'tests'. I do not know what
possessed them to do this, but this decision was an extremely expensive one
for their client, and has caused massive damage to the project.</p>
</li>
<li><p class="first">Maintain the test suite properly.</p>
<p>Again, sounds obvious, but tests are extremely valuable to a project, and in
this situation it is vital that you keep them maintained.</p>
<p>It is acceptable to delete tests if they are checking requirements that you no
longer have. But you should be deleting the code that supports those tests as
well.</p>
</li>
<li><p class="first">Take complete ownership of the code.</p>
<p>Having made the decision to bundle, don't treat the code like an external
dependency. It is your code now; only you can fix it. Don't pretend you are
going to merge in upstream changes.</p>
<p>The code should live at the same ‘level’ as the rest of your code — for
example, it should be in the same directory, not off in some 'libs' directory
that makes it harder to find. You need to embrace the fact that it is part of
your maintenance burden.</p>
<p>On the other hand, it is your code now, and you can do what you want with it. So
don't be afraid of making changes. A tentative approach will leave you with
the worst of both worlds — a library that doesn't really do what you want, but
that you have to maintain. Make it do what you want.</p>
<p>Obviously, there can be some value in maintaining a separation between &quot;your
stuff&quot; and the &quot;framework stuff&quot; or &quot;library stuff&quot;, but this is just good
coding practice — you wouldn't hard code something very specific into a
function that is supposed to be generic.</p>
</li>
<li><p class="first">Delete, delete, delete.</p>
<p>If there is code that you don't need, just delete it. The more code you can
remove, the better. There can be a case for keeping some code around if:</p>
<ol class="arabic simple">
<li>It is causing very little nuisance to maintenance efforts.</li>
<li>It is fairly likely to be needed in the <strong>near</strong> future.</li>
<li>It is not causing runtime weaknesses (e.g. security problems),
because there is no entrance point to the code.</li>
</ol>
<p>But note that just the existence of code is a maintenance problem. If, for
example, you need to change the signature of a function, you will do a search
for sites that call it. Every hit you get is something you have to
investigate, which takes time. If, in the process of this kind of
investigation, you find some code that might be unused, find out if it is, and
delete aggressively where appropriate.</p>
<p>And code that <strong>might</strong> be needed <strong>one day</strong> is better deleted. By the time
you come to need it, it might be horribly broken, or broken in subtle ways
that will take you longer to debug than to write, or too complex or badly
performing for the context of your evolved application.</p>
<p>This applies to all kinds of code, including templates etc.</p>
</li>
<li><p class="first">Clean aggressively.</p>
<p>If you delete unused code, you'll find that you may well end up with code that
has essentially unused generality, or various other things that no longer make
sense for your specific project.</p>
<p>This is my golden rule for maintenance:</p>
<blockquote>
<p>Leave the code looking as if it had always been designed that way.</p>
</blockquote>
<p>This is a general maintenance principle, but it is especially important for
the situation where you are trying to go from a larger code base to a smaller
one.</p>
<p>Ideally, there should never be artefacts that can only be explained by talking
about the history of the project. This applies to every detail, including:</p>
<ul class="simple">
<li>names of models</li>
<li>names of fields</li>
<li>names of variables and functions</li>
</ul>
<p>Altering models is not hard if you have a good database migration tool,
e.g. South for Django.</p>
<p>This principle may seem like it adds to the load of the maintenance
programming, but long term it reduces the load, and reduces the likelihood
that a project will collapse under its own weight. Even <em>with</em> this principle,
projects tend to become unmaintainable — the natural tendency of a project is
towards chaos, and you have to be very proactive about reversing that.</p>
<p>Example 1: after deleting some classes, you end up with a class hierarchy
where each base class is only used once. This adds a lot of overhead when
reading the code. You should clean aggressively — fold the classes together
(unless keeping them separate increases the clarity of the code).</p>
<p>Example 2: The code I'm maintaining uses livesettings (and uses it far too
much in my opinion, for things that ought to be in settings.py). It includes
some options that are unlikely to change for a given project, or are likely to
become ignored easily. For example, there is an &quot;Only authenticated users can
check out&quot; setting. In a project with an overridden login form or login view
(which can easily happen), it's very easy for this switch to become (at least
partly) broken. When you are working on some code that branches on the value
of this switch, there is no point fixing both branches — you won't have decent
tests to ensure that the unused branch is really working.</p>
<p>Instead, find out what the current value is, and just delete the other
branch. Then find all instances of the setting being used, and clean up
similarly. Finally, delete the code that defines the switch in the first
place. Remove every trace — you always have the history if you really need to
see how something was done before.</p>
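<p>A minimal sketch of that clean-up, in Python 3, might look like this. The function names and the <tt class="docutils literal">ONLY_AUTHENTICATED_CHECKOUT</tt> key are invented for illustration, and a plain dict stands in for the real settings lookup, which is not shown here:</p>

```python
# Before: branches on a switch that this project never turns off.
def can_check_out_before(user, settings):
    if settings["ONLY_AUTHENTICATED_CHECKOUT"]:
        return user.is_authenticated
    return True  # unused branch with no tests covering it


# After: the switch is gone and the one behaviour in use is stated directly.
def can_check_out(user):
    return user.is_authenticated
```

<p>Once every caller uses the second form, the definition of the switch itself can be deleted too.</p>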
</li>
<li><p class="first">Lather, rinse, repeat.</p>
<p>Each round of aggressive deleting and cleaning exposes more code that can go, and
you should follow this up. You may not have the time to do it all right now, but you
should be doing it as you go: whenever some coding has turned up something that can be
cleaned/deleted, first do the necessary commit for whatever you were working
on. Then do a round of cleaning/deleting, finding all the code paths that are
now dead or can be simplified, commit the change, and repeat.</p>
</li>
</ul>
<p>These things have to go together. Aggressive deleting and cleaning can be made a
lot easier if you have a good test suite. Of course, when deleting code, you
will do a search for sites that might call it. But it ought to be possible to
check if you can delete code simply by running the test suite with it absent.</p>
<p>What other approaches or hints do you have for dealing with this situation?</p>
</div>
</div>
]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://lukeplant.me.uk/blog</uri>
    </author>
    <title type="html"><![CDATA[Never fix a bug twice]]></title>
    <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog/posts/never-fix-a-bug-twice/" />
    <id>http://lukeplant.me.uk/blog/posts/never-fix-a-bug-twice/</id>
    <updated>2012-06-19T13:42:56Z</updated>
    <published>2012-06-19T13:42:56Z</published>
    <category scheme="http://lukeplant.me.uk/blog" term="Python" />
    <category scheme="http://lukeplant.me.uk/blog" term="Software development" />
    <category scheme="http://lukeplant.me.uk/blog" term="Django" />
    <summary type="html"><![CDATA[Never fix a bug twice]]></summary>
    <content type="html" xml:base="http://lukeplant.me.uk/blog/posts/never-fix-a-bug-twice/"><![CDATA[<div class="document">
<p>Noah Sussman wrote a post on <a class="reference external" href="http://infiniteundo.com/post/25230828820/things-you-should-test">Things you should test</a>, “A checklist of things that are worth testing in pretty much any software system.”</p>
<p>Many of the things on the list are helpful reminders. However, I think the mindset it encourages is essentially wrong.</p>
<p>The mindset is basically this:</p>
<blockquote>
Identify common mistakes that developers make, and ensure you are writing tests that check you haven't made them.</blockquote>
<p>The problem with this approach is that it is essentially <a class="reference external" href="http://www.youtube.com/watch?v=kbyekup6i6U">whack-a-mole</a> debugging. There is a never-ending supply of bugs to kill.</p>
<p>A much more helpful approach is found in <a class="reference external" href="http://www.red-sweater.com/blog/125/easy-programming">this post on easy programming</a> that advocates “Never fix a bug twice” (about one-third of the way down).</p>
<p>If you come across a bug or class of bugs that often occur, you should <strong>not</strong> be thinking first of all “better add that to my list of classes of bugs that need testing against”. You should rather be thinking “how can I change the system so that this class of bugs disappears entirely?”.</p>
<p>So, to take some of the items listed for testing:</p>
<ul>
<li><p class="first">Input handling bugs, such as SQL injection and XSS attacks.</p>
<p>In Django apps, I never write tests for SQL injection attacks or XSS attacks. The reason for this is that these types of bugs are very rare. The reason for that is that the APIs that are easiest to use for executing SQL or generating HTML are secure by default. If there are any errors of this type in my programs, they will be very rare and obscure, and trying to catch bugs by a black box scatter-gun approach will be a waste of time.</p>
<p>With XSS, there are times when I am slightly more tempted to generate HTML in ways that might be vulnerable to XSS — for example, using Python string interpolation. At this point, however, you still shouldn't reach for a test to ensure you did it correctly. Rather: <strong>stop writing HTML generation the stupid way!</strong></p>
<p>To do that, you first have to ask “why?”. “Why am I tempted to write HTML this way?”. The answer is usually “there isn't a convenient API for the particular way I want to write this”. The correct solution now becomes obvious: create the API that removes the temptation to introduce potentially buggy code.</p>
<p>So, here is some code in a project I'm writing, that uses string interpolation to implement a template tag for formatting a link in a standard way:</p>
<pre class="code python literal-block">
<span class="kn">from</span> <span class="nn">django.core.urlresolvers</span> <span class="kn">import</span> <span class="n">reverse</span>
<span class="kn">from</span> <span class="nn">django.template</span> <span class="kn">import</span> <span class="n">Library</span>
<span class="kn">from</span> <span class="nn">django.utils.html</span> <span class="kn">import</span> <span class="n">escape</span>
<span class="kn">from</span> <span class="nn">django.utils.safestring</span> <span class="kn">import</span> <span class="n">mark_safe</span>

<span class="n">register</span> <span class="o">=</span> <span class="n">Library</span><span class="p">()</span>

<span class="nd">&#64;register.filter</span>
<span class="k">def</span> <span class="nf">account_link</span><span class="p">(</span><span class="n">account</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">mark_safe</span><span class="p">(</span><span class="s">u'&lt;a href=&quot;</span><span class="si">%s</span><span class="s">&quot; title=&quot;</span><span class="si">%s</span><span class="s"> </span><span class="si">%s</span><span class="s">&quot;&gt;</span><span class="si">%s</span><span class="s">&lt;/a&gt;'</span> <span class="o">%</span> <span class="p">(</span>
            <span class="n">escape</span><span class="p">(</span><span class="n">reverse</span><span class="p">(</span><span class="s">'account_stats'</span><span class="p">,</span> <span class="n">args</span><span class="o">=</span><span class="p">(</span><span class="n">account</span><span class="o">.</span><span class="n">username</span><span class="p">,))),</span>
            <span class="n">escape</span><span class="p">(</span><span class="n">account</span><span class="o">.</span><span class="n">first_name</span><span class="p">),</span>
            <span class="n">escape</span><span class="p">(</span><span class="n">account</span><span class="o">.</span><span class="n">last_name</span><span class="p">),</span>
            <span class="n">escape</span><span class="p">(</span><span class="n">account</span><span class="o">.</span><span class="n">username</span><span class="p">),</span>
            <span class="p">))</span>
</pre>
<p>The problem with this is that I have to remember to use <tt class="docutils literal">escape</tt> on each variable. The reason I wrote it this way is that Django's <tt class="docutils literal">Template</tt> API is kind of bulky for this use case. So, I should instead have written this:</p>
<pre class="code python literal-block">
<span class="kn">from</span> <span class="nn">django.core.urlresolvers</span> <span class="kn">import</span> <span class="n">reverse</span>
<span class="kn">from</span> <span class="nn">django.template</span> <span class="kn">import</span> <span class="n">Library</span>

<span class="kn">from</span> <span class="nn">somewhere</span> <span class="kn">import</span> <span class="n">html_fragment</span>

<span class="n">register</span> <span class="o">=</span> <span class="n">Library</span><span class="p">()</span>

<span class="nd">&#64;register.filter</span>
<span class="k">def</span> <span class="nf">account_link</span><span class="p">(</span><span class="n">account</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">html_fragment</span><span class="p">(</span><span class="s">u'&lt;a href=&quot;</span><span class="si">%s</span><span class="s">&quot; title=&quot;</span><span class="si">%s</span><span class="s"> </span><span class="si">%s</span><span class="s">&quot;&gt;</span><span class="si">%s</span><span class="s">&lt;/a&gt;'</span><span class="p">,</span>
            <span class="n">reverse</span><span class="p">(</span><span class="s">'account_stats'</span><span class="p">,</span> <span class="n">args</span><span class="o">=</span><span class="p">(</span><span class="n">account</span><span class="o">.</span><span class="n">username</span><span class="p">,)),</span>
            <span class="n">account</span><span class="o">.</span><span class="n">first_name</span><span class="p">,</span>
            <span class="n">account</span><span class="o">.</span><span class="n">last_name</span><span class="p">,</span>
            <span class="n">account</span><span class="o">.</span><span class="n">username</span><span class="p">,</span>
            <span class="p">)</span>
</pre>
<p>And then write the API, <tt class="docutils literal">html_fragment</tt>, that makes this work, which is just:</p>
<pre class="code python literal-block">
<span class="kn">from</span> <span class="nn">django.utils.html</span> <span class="kn">import</span> <span class="n">conditional_escape</span>
<span class="kn">from</span> <span class="nn">django.utils.safestring</span> <span class="kn">import</span> <span class="n">mark_safe</span>

<span class="k">def</span> <span class="nf">html_fragment</span><span class="p">(</span><span class="n">template</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">mark_safe</span><span class="p">(</span><span class="n">template</span> <span class="o">%</span> <span class="nb">tuple</span><span class="p">(</span><span class="nb">map</span><span class="p">(</span><span class="n">conditional_escape</span><span class="p">,</span> <span class="n">args</span><span class="p">)))</span>
</pre>
<p><em>EDIT: After some discussion on django-devs and subsequent modification, this
is now in Django 1.5 as 'django.utils.html.format_html', so use that instead
of the above.</em></p>
<p>I'm now no longer tempted to write it the vulnerable way, and I don't need a test (though I may want a test or two for <tt class="docutils literal">html_fragment</tt>). I have two ways of doing it - <tt class="docutils literal">html_fragment</tt> for very small snippets, and Django templates for bigger chunks, both of them secure by default.</p>
<p>So, if you find yourself needing tests for specific SQL injection or XSS attacks in your code, you are probably doing it wrong. Fix the underlying API that makes the mistake likely in the first place.</p>
</li>
<li><p class="first">Input handling type checking - “Invalid values such as null and NaN. Strings instead of integers, arrays instead of strings.”</p>
<p>Input handling should generally only occur on trust boundaries, but it should occur systematically on such boundaries. If you have a problem with the wrong type of value being passed to an internal function, due to an external input being passed in, you shouldn't be dealing with it in the function, you should be asking “how was this even possible?”.</p>
<p>One solution, of course, is a static type system. That might not be possible or practical in your situation, but you should be thinking about it.</p>
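<p>Where a static type system is not available, the same idea can be approximated by converting external input exactly once, at the boundary. The sketch below is Python 3, and the names <tt class="docutils literal">parse_quantity</tt>, <tt class="docutils literal">line_total</tt> and <tt class="docutils literal">handle_request</tt>, the range limits and the fixed price are all assumptions made up for the example:</p>

```python
def parse_quantity(raw):
    """Turn an untrusted form value into an int from 1 to 99, or raise ValueError."""
    if not isinstance(raw, str):
        raise ValueError("expected text, got %r" % (raw,))
    value = int(raw.strip())  # int() raises ValueError for "", "abc" or "NaN"
    if value not in range(1, 100):
        raise ValueError("quantity out of range: %d" % value)
    return value


def line_total(unit_price_pence, quantity):
    # Internal code: quantity is already known to be a valid int.
    return unit_price_pence * quantity


def handle_request(form_data):
    # The only place where raw external input is converted.
    quantity = parse_quantity(form_data.get("quantity"))
    return line_total(250, quantity)
```

<p>Internal functions such as <tt class="docutils literal">line_total</tt> then need no defensive checks, because a string or a null can no longer reach them.</p>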
</li>
<li><p class="first">Math-related bugs: in many of the listed instances in Noah's post, the underlying bug is that the language does funny things.</p>
<p>So you should be asking “why am I using this language? Is it the right tool for the job?”.</p>
<p>Or at least “will a library solve this problem?”.</p>
<p>If you are handling decimal values, the solution is a library that does it well, and doesn't make it easy to make mistakes. For example, the Python 'decimal' library does not allow multiplication of floats and decimals - which sounds inconvenient, but stops you from making all kinds of mistakes.</p>
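<p>A short illustration of that behaviour, using only the standard library under Python 3 (the variable names and the prices are invented for the example):</p>

```python
from decimal import Decimal

price = Decimal("19.99")

# Mixing in a float is refused outright rather than silently losing precision.
try:
    price * 1.2
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Being explicit about the other operand works, and the result stays exact.
total = price * Decimal("1.2")
```

<p>The <tt class="docutils literal">TypeError</tt> forces you to decide how the float should become a decimal, which is where the rounding mistakes would otherwise creep in.</p>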
</li>
<li><p class="first">Units of measurement.</p>
<p>If you have potential bugs with this, you should be thinking “why are these possible? How can I eliminate the bug entirely?”.</p>
<p>Different languages have different solutions, often by including the unit of measurement in the value itself. Haskell has <a class="reference external" href="http://www.haskell.org/haskellwiki/Applications_and_libraries/Mathematics#Physical_units">various solutions</a> that make it impossible to add &quot;3 years&quot; to &quot;2 meters&quot;, for example, and the compiler will stop you at compile time.</p>
<p>For Python, there are things like <a class="reference external" href="http://pypi.python.org/pypi/magnitude">magnitude</a> and <a class="reference external" href="http://pypi.python.org/pypi/units/">units</a> that work at run time.</p>
<p>Of course, even if you use these internally, you'll still have boundaries where you need to do some conversions from inputs etc., and at this point you might need to know what units the inputs are in. But:</p>
<ol class="arabic simple">
<li>You should try to insist that the external system sends the unit information along with the numerical value, so that you <strong>never</strong> have to rely on common knowledge, and you will know immediately if something changed. And your internal APIs have helped you do this by forcing the issue.</li>
<li>Even if the external systems can't be changed, you'll still have very few places to check - just your input and output boundaries, each of which should be managed in a single place.  And writing tests probably won't help you very much — you should write at most one, as a single integration test. Your time will be better spent manually checking the boundaries.</li>
</ol>
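<p>To make the run-time approach concrete, here is a deliberately tiny sketch in Python 3. It does not use the API of either library linked above. The <tt class="docutils literal">Quantity</tt> class and the <tt class="docutils literal">parse_measurement</tt> function are hypothetical, and only addition is handled:</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    unit: str

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        if other.unit != self.unit:
            raise TypeError("cannot add %s to %s" % (other.unit, self.unit))
        return Quantity(self.value + other.value, self.unit)


def parse_measurement(text):
    """Parse boundary input such as '2 meters'; a bare number is rejected."""
    number, unit = text.split()  # ValueError if the unit is missing
    return Quantity(float(number), unit)
```

<p>Because the internal type cannot be built without a unit, the input boundary is forced to demand one from the external system.</p>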
</li>
<li><p class="first">Text - “Are Unicode inputs treated differently than ASCII?”</p>
<p>Again, you should be using proper internal datatypes — unicode everywhere — to eliminate potential bugs.</p>
<p>Python 2.x is notorious here - its ‘forgiving’ conversion between bytestrings and unicode causes endless bugs that are found in production, not in development.</p>
<p>And the correct solution is not just to write tests — which you may have to do for now — but rather to fix the underlying cause. This is what has been done in Python 3.</p>
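<p>For instance, under Python 3 the following mix of text and bytes fails immediately, on the first run, instead of working for ASCII data and breaking later in production (the variable names are only for illustration):</p>

```python
name = "Zoë"                    # text (str)
encoded = name.encode("utf-8")  # bytes, as they would arrive from a socket or file

# Python 3 refuses to guess an encoding when text and bytes are combined.
try:
    greeting = "Hello " + encoded
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Decoding explicitly, once, at the boundary is the supported route.
decoded = encoded.decode("utf-8")
```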
</li>
</ul>
<p>The same kind of thinking applies to almost everything on the list, and in the cases where I can't see how it applies, I imagine that this is due to my own lack of imagination :-) If I can't think of a way of completely eliminating a class of bug, then I should think harder, rather than assume it is impossible.</p>
<p>Notice that from the perspective of an older programmer, there are some notable omissions from the list - things like buffer overflows, for example. Presumably that's because the author uses languages/frameworks where those bugs are very unlikely. And that has happened not because previous generations of programmers wrote lots of tests, but because they wrote languages and systems where those bugs are impossible or almost impossible.</p>
<p>So, in all this I'm not saying that you won't ever have to write tests for these kinds of bugs. But every time you identify a class of bugs, tests are a very poor answer. You should try to ensure that you never fix the same bug twice, and so never write the same test twice.  If you are writing essentially the same test more than once, you are proving that you haven't fixed the underlying issue yet.</p>
<p>If a bug is worth adding to a list of common bugs, then you have identified a systemic problem with your platform, and it is worth <strong>eliminating entirely</strong>. We should be striving for libraries/languages/systems which do that.</p>
</div>
]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://lukeplant.me.uk/blog</uri>
    </author>
    <title type="html"><![CDATA[A prayer to the programming gods]]></title>
    <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog/posts/a-prayer-to-the-programming-gods/" />
    <id>http://lukeplant.me.uk/blog/posts/a-prayer-to-the-programming-gods/</id>
    <updated>2011-09-19T12:52:12Z</updated>
    <published>2011-09-19T12:52:12Z</published>
    <category scheme="http://lukeplant.me.uk/blog" term="Python" />
    <category scheme="http://lukeplant.me.uk/blog" term="Software development" />
    <category scheme="http://lukeplant.me.uk/blog" term="Web development" />
    <category scheme="http://lukeplant.me.uk/blog" term="Django" />
    <summary type="html"><![CDATA[A prayer to the programming gods]]></summary>
    <content type="html" xml:base="http://lukeplant.me.uk/blog/posts/a-prayer-to-the-programming-gods/"><![CDATA[<div class="document">
<div class="line-block">
<div class="line">O gods of software development and operations, I have sinned.</div>
<div class="line"><br /></div>
<div class="line">Your anger falls on me, and I feel your wrath.</div>
<div class="line"><br /></div>
<div class="line">The web site I have inherited has no unit tests.</div>
<div class="line">It has no deployment script, and no README.</div>
<div class="line">Or database migration tool.</div>
<div class="line">It makes no use of virtualenv or requirements.txt or buildout,</div>
<div class="line">&nbsp;&nbsp;&nbsp;&nbsp;nor has any description of dependencies.</div>
<div class="line">It has most of the VCS history missing.</div>
<div class="line">Source dependencies are in random folders,</div>
<div class="line">&nbsp;&nbsp;&nbsp;&nbsp;clearly checked out from private SVN clones of</div>
<div class="line">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;proprietary and open source projects,</div>
<div class="line">&nbsp;&nbsp;&nbsp;&nbsp;but forked at unknown date with no history.</div>
<div class="line"><br /></div>
<div class="line">And I cry, “Why me?”</div>
<div class="line"><br /></div>
<div class="line">Have I not used a fabfile for projects I have started?</div>
<div class="line">Have I not included a setup.py for my open source apps?</div>
<div class="line">Have I not written helpful docs, or at least a README.rst?</div>
<div class="line">Have I not written correct commit messages, with carefully</div>
<div class="line">&nbsp;&nbsp;&nbsp;&nbsp;constructed patches that didn't mix features and fixes?</div>
<div class="line">Are the projects I hand on not covered by automated tests,</div>
<div class="line">&nbsp;&nbsp;&nbsp;&nbsp;at least for the critical functions?</div>
<div class="line"><br /></div>
<div class="line">But then I consider the sins of my youth,</div>
<div class="line">&nbsp;&nbsp;&nbsp;&nbsp;and I confess: You are just.</div>
<div class="line"><br /></div>
<div class="line">You could have given me the VBA project I wrote when I was 18,</div>
<div class="line">&nbsp;&nbsp;&nbsp;&nbsp;or some of the web apps I have written since.</div>
<div class="line">It could have been the thousands of ASP.NET lines I cranked out in two short years,</div>
<div class="line">&nbsp;&nbsp;&nbsp;&nbsp;like the proverbial monkeys trying to produce Shakespeare.</div>
<div class="line">It could be raw SQL in the frontend code,</div>
<div class="line">&nbsp;&nbsp;&nbsp;&nbsp;and HTML mixed with business logic.</div>
<div class="line">You could have given me a PHP project.</div>
<div class="line"><br /></div>
<div class="line">But it is Python, and Django at that, and it is easily fixed.</div>
<div class="line">Your chastisement is light indeed.</div>
</div>
</div>
]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://lukeplant.me.uk/blog</uri>
    </author>
    <title type="html"><![CDATA[My Bash prompt]]></title>
    <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog/posts/my-bash-prompt/" />
    <id>http://lukeplant.me.uk/blog/posts/my-bash-prompt/</id>
    <updated>2011-02-28T23:32:42Z</updated>
    <published>2011-02-28T23:32:42Z</published>
    <category scheme="http://lukeplant.me.uk/blog" term="Software development" />
    <category scheme="http://lukeplant.me.uk/blog" term="Linux" />
    <summary type="html"><![CDATA[My Bash prompt]]></summary>
    <content type="html" xml:base="http://lukeplant.me.uk/blog/posts/my-bash-prompt/"><![CDATA[<div class="document">
<p>I like my <a class="reference external" href="http://tldp.org/HOWTO/Bash-Prompt-HOWTO/">bash prompt</a>:</p>
<ol class="arabic simple">
<li>to include the current <a class="reference external" href="http://mercurial.selenic.com/">Mercurial</a>/<a class="reference external" href="http://git-scm.com/">Git</a> branch if applicable; otherwise I would quickly get into big trouble by committing things to the wrong branch in <a class="reference external" href="http://www.djangoproject.com/">Django</a> etc.</li>
<li>to show the <strong>full</strong> current working directory - for similar reasons to above, as I often end up lost in a maze of twisty little directories, all alike.</li>
<li>to be always instantaneous (tricky, when 'hg branch' can take up to a second on a bad day due to startup time)</li>
<li>to work well with virtualenv (which adds the environment name to the beginning of the prompt)</li>
<li>to show the time (so that if I have a long running job that I didn't realise was going to be long running, all the info is there to find out how long it took).</li>
<li>to do all the above without confusing me!</li>
</ol>
<p>All of these together mean I need two lines, but this has advantages - it means the commands I type always start at the same column. I achieve (1) by a bash function that looks for .hg or .git in the current directory and then recursively in each parent directory, and looks for '.hg/branch' where applicable. I achieve (6) by using some colours to distinguish the different bits.</p>
<p>The end product looks like this - first without a virtualenv, then with:</p>
<img alt="/blogmedia/custom_bash_prompt.png" src="/blogmedia/custom_bash_prompt.png" />
<p>The code looks like this (in ~/.bashrc):</p>
<pre class="code bash literal-block">
<span class="c"># Prints &quot; $branchname&quot; if in a hg or git repo, otherwise nothing.
</span>print_branch_name<span class="o">()</span> <span class="o">{</span>
    <span class="k">if</span> <span class="o">[</span> -z <span class="s2">&quot;$1&quot;</span> <span class="o">]</span>
    <span class="k">then
        </span><span class="nv">curdir</span><span class="o">=</span><span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>
    <span class="k">else
        </span><span class="nv">curdir</span><span class="o">=</span><span class="nv">$1</span>
    <span class="k">fi
    if</span> <span class="o">[</span> -d <span class="s2">&quot;$curdir/.hg&quot;</span> <span class="o">]</span>
    <span class="k">then
        </span><span class="nb">echo</span> -n <span class="s2">&quot; &quot;</span>
        <span class="k">if</span> <span class="o">[</span> -f  <span class="s2">&quot;$curdir/.hg/branch&quot;</span> <span class="o">]</span>
        <span class="k">then
            </span>cat <span class="s2">&quot;$curdir/.hg/branch&quot;</span>
        <span class="k">else
            </span><span class="nb">echo</span> <span class="s2">&quot;default&quot;</span>
        <span class="k">fi
        return </span>0
    <span class="k">elif</span> <span class="o">[</span> -d <span class="s2">&quot;$curdir/.git&quot;</span> <span class="o">]</span>
    <span class="k">then
        </span><span class="nb">echo</span> -n <span class="s2">&quot; &quot;</span>
        git branch --no-color 2&gt; /dev/null | sed -e <span class="s1">'/^[^*]/d'</span> -e <span class="s1">'s/* \(.*\)/\1/'</span>
        <span class="k">return </span>0
    <span class="k">fi</span>
    <span class="c"># Recurse upwards
</span>    <span class="k">if</span> <span class="o">[</span> <span class="s2">&quot;$curdir&quot;</span> <span class="o">==</span> <span class="s1">'/'</span> <span class="o">]</span>
    <span class="k">then
        return </span>1
    <span class="k">else
        </span>print_branch_name <span class="sb">`</span>dirname <span class="s2">&quot;$curdir&quot;</span><span class="sb">`</span>
    <span class="k">fi</span>
<span class="o">}</span>

<span class="nv">e</span><span class="o">=</span><span class="se">\\\0</span>33
<span class="nb">export </span><span class="nv">PS1</span><span class="o">=</span><span class="s2">&quot;\[$e[1;36m\][\u&#64;\h \t]\[$e[1;33m\]\$(print_branch_name) \[$e[0m\]\w\n\[$e[1;37m\]——&gt; \[$e[0m\]&quot;</span>
</pre>
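<p>For comparison, here is a minimal Python sketch of the same upward search. It is an illustration rather than part of the prompt itself, and it makes one change that is an assumption on top of the bash version: instead of running <tt class="docutils literal">git branch</tt>, it reads <tt class="docutils literal">.git/HEAD</tt> directly, which avoids starting a process. The function name <tt class="docutils literal">branch_name</tt> is made up for this example.</p>

```python
from pathlib import Path


def branch_name(start):
    """Return the hg or git branch for `start`, or None if not in a repo."""
    current = Path(start).resolve()
    # Check the starting directory, then each parent up to the root.
    for directory in [current, *current.parents]:
        hg_dir = directory / ".hg"
        if hg_dir.is_dir():
            branch_file = hg_dir / "branch"
            if branch_file.is_file():
                return branch_file.read_text().strip()
            # Mercurial does not write .hg/branch for the default branch.
            return "default"
        head_file = directory / ".git" / "HEAD"
        if head_file.is_file():
            head = head_file.read_text().strip()
            prefix = "ref: refs/heads/"
            if head.startswith(prefix):
                return head[len(prefix):]
            # Detached HEAD: the file holds a commit hash instead of a ref.
            return head[:7]
    return None
```

<p>This sketch only handles the case where <tt class="docutils literal">.git</tt> is a directory; checkouts where <tt class="docutils literal">.git</tt> is a file are skipped.</p>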
</div>
]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://lukeplant.me.uk/blog</uri>
    </author>
    <title type="html"><![CDATA[Docs or it doesn't exist]]></title>
    <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog/posts/docs-or-it-doesnt-exist/" />
    <id>http://lukeplant.me.uk/blog/posts/docs-or-it-doesnt-exist/</id>
    <updated>2011-01-10T16:43:18Z</updated>
    <published>2011-01-10T16:43:18Z</published>
    <category scheme="http://lukeplant.me.uk/blog" term="Python" />
    <category scheme="http://lukeplant.me.uk/blog" term="Software development" />
    <category scheme="http://lukeplant.me.uk/blog" term="Django" />
    <summary type="html"><![CDATA[Docs or it doesn't exist]]></summary>
    <content type="html" xml:base="http://lukeplant.me.uk/blog/posts/docs-or-it-doesnt-exist/"><![CDATA[<div class="document">
<div class="section" id="tl-dr">
<h1>tl;dr</h1>
<p>Writing good quality documentation for the software libraries you publish <strong>always</strong> matters. Otherwise, you are doing the world a <strong>disservice</strong> by publishing.</p>
</div>
<div class="section" id="definitions">
<h1>Definitions</h1>
<p>I realise I am talking mainly in the context of libraries published for public consumption, rather than in-house libraries. Some of the reasons given below don't apply so much for the in-house library that people may be forced to use.</p>
<p>By publishing, I specifically mean package repositories like <a class="reference external" href="http://pypi.python.org/pypi">PyPI</a>, <a class="reference external" href="http://djangopackages.com/">djangopackages</a>, <a class="reference external" href="http://hackage.haskell.org/packages/hackage.html">Hackage</a>, <a class="reference external" href="http://www.cpan.org/">CPAN</a> — the places where developers will go looking for your library.</p>
<p>By ‘good quality documentation’ I do not mean ‘lots of documentation’. Often documentation can be ruined by thoughtless quantity. I remember some API docs produced by the NDoc documentation tool, which, when I last used it (admittedly about 5 years ago), flagged as warnings any methods or properties that were not documented. This led to developers jumping through the hoops, documenting the <tt class="docutils literal">identity</tt> property of the <tt class="docutils literal">Person</tt> class with such epiphany-inducing revelations as “Gets/sets the identity of the Person object”.  The tool, however, doesn't flag in red the missing parts of the documentation that you <strong>really need</strong>, like how you would go about populating a <tt class="docutils literal">Person</tt> object from the database or why you might want to use it etc., nor does it flag poor quality docs.</p>
<p>The result is documentation that is <strong>worse than useless</strong> — it promises usefulness by the fact of its existence, but instead wastes hours of your time, as you search it in vain to find anything which could not be deduced from looking at a class browser. Even if there <strong>is</strong> some documentation that is worth having, I will give up looking, and never find it, due to all the almost-auto-generated dross that surrounds it. Developers need to be persuaded of the importance of documentation, rather than simply instructed to eradicate build warnings.</p>
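<p>To make the contrast concrete, here is a small Python sketch. The <tt class="docutils literal">Person</tt> class, the <tt class="docutils literal">people</tt> table and the <tt class="docutils literal">from_row</tt> method are all hypothetical, invented for illustration. The property docstring is the kind that satisfies a warning-driven tool; the class and <tt class="docutils literal">from_row</tt> docstrings answer the questions a user actually has.</p>

```python
class Person:
    """A person record loaded from the ``people`` table (hypothetical schema).

    Build instances with ``Person.from_row`` rather than calling the
    constructor directly, so that column names stay in one place.
    """

    def __init__(self, identity, name):
        self._identity = identity
        self.name = name

    @property
    def identity(self):
        """Gets the identity of the Person object."""  # tells the reader nothing new
        return self._identity

    @classmethod
    def from_row(cls, row):
        """Create a Person from a database row given as a mapping.

        ``row`` must have the keys ``"id"`` and ``"name"``; a missing key
        raises ``KeyError``.
        """
        return cls(identity=row["id"], name=row["name"])
```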
<p>Also, blog posts are not documentation. Blog posts are adverts. Some blog posts contain ‘tutorial style’ documentation. This is a valid type of documentation, but it belongs in one place, with your software, not on a blog that might disappear one day. If your documentation isn't part of your software package, I have no idea it exists. Googling for it is not a solution — docs found in blog posts are typically fragmented in content, scattered over several blogs, frustratingly incorrect because they are out of date, and always leave you wondering &quot;is that it, or should I look somewhere else?&quot;</p>
</div>
<div class="section" id="intro">
<h1>Intro</h1>
<p>I'm prompted to write this rant by the astonishing number of packages on PyPI and Hackage that have virtually no documentation or very poor documentation — not even a README sometimes. I presume that people put libraries on these repositories because they would like to think that they might be useful to other people. Often they like to think they are doing ‘Open Source’ development by publishing in this way.</p>
<p>However, I would argue that releasing software libraries without documentation is like dumping all your old junk on the street “because someone might want it”. You are not, in reality, being helpful or contributing to society — you are just littering. The same is true of libraries without documentation.</p>
</div>
<div class="section" id="arguments">
<h1>Arguments</h1>
<div class="section" id="a-library-is-a-risk">
<h2>A library is a risk</h2>
<p>The problem with the person who dumps their junk on the street is that he is not considering his ‘contribution’ from the perspective of the people who might be looking for such things — or from the perspective of the people who might want to use the street.</p>
<p>People who are looking for old fridges etc. have places that they will go for that — junk yards, <a class="reference external" href="http://www.uk.freecycle.org/">freecycle</a>, ebay, etc. And people walking down streets have expectations of what they will find on those streets. Once you take those things into account, you won't leave your old fridge on the street if you actually care about other people.</p>
<p>Now, people looking for software libraries have a specific task that they need to achieve. They are looking for a library because they believe the task can be abstracted into a library somehow, and they don't want to have to write the code themselves. They then go to Google or a framework/programming language specific repository with keywords for that task.</p>
<p>Every result they find is an avenue that might need exploring. And everything they find which is not suitable is junk that is just getting in their way. The developer has to do 3 things:</p>
<ol class="arabic simple">
<li>Evaluate whether a certain package could possibly do what they need.</li>
<li>Install the package, possibly including some kind of configuration.</li>
<li>Write the code to get the package to do what they need.</li>
</ol>
<p>And then, possibly, the developer may need to patch/fork your code to produce a version that does what they need.</p>
<p>Every step represents increasing commitment in terms of time. At each step, every second that I have to spend to find something out is time that I am spending, and therefore potentially wasting, because of your library. I will not know until I have actually <strong>finished</strong> the last step whether your library will help me — it could easily have some flaw that makes it useless for me.</p>
<p>So your library represents <strong>risk</strong> to me, especially as I always have alternatives — another library, or just writing the code myself. At each point, I've got to assess whether it is likely I will succeed with this package. I've made an investment in this library already, but will it reward continuing investment, or will it turn out to be another dead end? Should I get out now?</p>
<p>So, at the very least I need an overview that tells me what a library does. Without that, every second I spend looking at a library is probably time wasted. It seems ridiculous that someone would publish a library like <a class="reference external" href="http://hackage.haskell.org/package/classify">this</a>, where there is simply nothing to help you know how you are supposed to use it, but many instances exist (I picked that one at random). Thinking that you must be missing something obvious, you actually spend time searching for any docs, before concluding that <strong>there is no documentation whatsoever, not even a single source code comment that might give you a clue</strong>.</p>
<p>I hate your package already, and wish it didn't exist, just like the fridge on the street. You have probably not helped anyone, and you have certainly hindered me.</p>
<p>But I need to know more than what is covered by the overview, since it is very unlikely that the default, simple case will cover all my needs. I also need to know what customisation or extension points there are. If none are documented, I can only presume that none exist.</p>
<p>Even if they do exist, I cannot know from reading the source whether they are essentially accidental or not, and this leads to another point:</p>
</div>
<div class="section" id="documentation-is-an-api-contract">
<h2>Documentation is an API contract</h2>
<p>With small libraries in particular, the author is not going to write a document explaining exactly how they see the code evolving. The only thing that people can rely on is what has been documented to work. Anything else may be something that is considered an implementation detail, and relying on it is increasing the risk all the time. Even with languages that clearly distinguish between public and private, often these distinctions are not fine-grained enough for the current purpose, and I still want some assurance from the author that something isn't public by accident.</p>
<p>And most people will not want to fork the code to get a version they know works — they want to know they will get bug fixes.</p>
<p>In addition, the existence of documentation tells you not only what the author is thinking about what is really public/private, it shows that the author actually cares about this library. It shows the kind of pride in your work that helps other people to know that this library is likely to get bug fixes.</p>
</div>
</div>
<div class="section" id="responses">
<h1>Responses</h1>
<div class="section" id="it-s-open-source-and-so-people-can-just-read-the-source">
<h2>It's open source, and so people can just read the source</h2>
<p>If you only have the source, it can take a huge investment of reading and understanding before you can conclude whether the library is even attempting to solve the problem you need to solve. I have to fit all the code inside my head and mentally execute it — or I have to download it and try to use it. Both of these represent a huge investment.</p>
</div>
<div class="section" id="my-software-is-too-small-to-be-worth-documenting">
<h2>My software is too small to be worth documenting</h2>
<p>On the contrary, the smaller the software library, the <strong>more</strong> important it is to have good documentation.</p>
<p>If there is a relatively small task that I want to find a library for, I am not going to spend a huge amount of time researching the libraries, simply because it is not worth it. Any library will likely have all kinds of disadvantages compared to writing my own code — overhead for features I don't need, bugs in features I do need but the author obviously didn't, the need to add customisations (like subclassing, which serves to make things harder for a maintenance programmer) etc. So for small features, I'm very tempted to write my own solution anyway, since I know that it will fit my needs.</p>
<p>Therefore, I need to know <strong>even faster</strong> whether the library will do what I want it to do, or whether it can be easily extended to do so. I am much less likely to get as far as step 2 or 3 above, and I am more annoyed with every library that doesn't make step 1 easy — because the time wasted represents an even bigger fraction of my time allowance for this task.</p>
<p>So, if your software is too small to be worth documenting, it is too small to be wasting other people's time with it, and putting it in any public package repositories is doing the world a disservice.</p>
</div>
<div class="section" id="i-have-only-published-as-a-backup-of-my-own-code-or-to-share-with-one-friend">
<h2>I have only published as a backup of my own code or to share with one friend</h2>
<p>I think this is just about the only valid reason to not write docs. If you have put some highly experimental code on github or bitbucket etc. simply as a way to distribute to someone else (perhaps someone else who is going to write the docs for it), you can justify not writing any documentation.  But there is no reason to add it to any package index like PyPI or Hackage, where it will only make the useful packages harder to find. Even on bitbucket/github, which often come up as search results from Google, it might help if you label the code as not intended for public use.</p>
<p>If the intended way for people to ‘re-use’ your software is for them to fork it and actually make it into something useful, supporting it entirely themselves, that is fine — but it would be helpful if you <strong>don't</strong> pretend it is a ‘package’, and just say so somewhere.</p>
</div>
</div>
<div class="section" id="conclusion">
<h1>Conclusion</h1>
<p>If you don't have time to write documentation, don't bother publishing on some package repository — you are simply fooling yourself into thinking you might have done someone a favour, when all you have done is waste resources.</p>
<p>I think we need a new mantra for software libraries, especially in the Open Source world, that applies to entire projects or features of individual libraries:</p>
<p><strong>Docs or it doesn't exist.</strong></p>
<p>Perhaps, in the context of public package repositories, we should go further:</p>
<p><strong>Docs or delete.</strong></p>
<hr class="docutils" />
<p><strong>UPDATE:</strong></p>
<p>Of course, I should have also mentioned that with projects like <a class="reference external" href="http://readthedocs.org/">Read The Docs</a> and <a class="reference external" href="http://sphinx.pocoo.org/">Sphinx</a> there is even less excuse for not creating great quality docs.</p>
</div>
</div>
]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://lukeplant.me.uk/blog</uri>
    </author>
    <title type="html"><![CDATA[Class Based Views and Dry Ravioli]]></title>
    <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog/posts/class-based-views-and-dry-ravioli/" />
    <id>http://lukeplant.me.uk/blog/posts/class-based-views-and-dry-ravioli/</id>
    <updated>2010-11-20T04:29:53Z</updated>
    <published>2010-11-20T04:29:53Z</published>
    <category scheme="http://lukeplant.me.uk/blog" term="Python" />
    <category scheme="http://lukeplant.me.uk/blog" term="Software development" />
    <category scheme="http://lukeplant.me.uk/blog" term="Django" />
    <summary type="html"><![CDATA[Class Based Views and Dry Ravioli]]></summary>
    <content type="html" xml:base="http://lukeplant.me.uk/blog/posts/class-based-views-and-dry-ravioli/"><![CDATA[<div class="document">
<p><a class="reference external" href="http://www.djangoproject.com/">Django's</a> new <a class="reference external" href="http://docs.djangoproject.com/en/dev/topics/class-based-views/">class based views</a> are very cool. I am starting to clean up an existing project using them, and lots of existing views are turning into declarative code.  But it makes me worried about the ravioli effect.</p>
<p>Take my feedback form as an example.  It is a simple view, and typical of code in my project: a couple of calls to some standard utilities.  One of them, <tt class="docutils literal">standard_extra_context</tt>, adds some standard context variables depending on what is passed in (like <tt class="docutils literal">title</tt>, <tt class="docutils literal">description</tt> etc.) which are used in the HTML metadata and in the page itself. Another, <tt class="docutils literal">json_validation_request</tt>, allows a form view to be re-used for AJAX validation. The form declaration and view look like this:</p>
<pre class="code python literal-block">
<span class="k">class</span> <span class="nc">FeedbackForm</span><span class="p">(</span><span class="n">CciwFormMixin</span><span class="p">,</span> <span class="n">forms</span><span class="o">.</span><span class="n">Form</span><span class="p">):</span>
    <span class="n">email</span> <span class="o">=</span> <span class="n">forms</span><span class="o">.</span><span class="n">EmailField</span><span class="p">(</span><span class="n">label</span><span class="o">=</span><span class="s">&quot;Email address&quot;</span><span class="p">,</span> <span class="n">max_length</span><span class="o">=</span><span class="mi">320</span><span class="p">)</span>
    <span class="n">name</span> <span class="o">=</span> <span class="n">forms</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">label</span><span class="o">=</span><span class="s">&quot;Name&quot;</span><span class="p">,</span> <span class="n">max_length</span><span class="o">=</span><span class="mi">200</span><span class="p">,</span> <span class="n">required</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">message</span> <span class="o">=</span> <span class="n">forms</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">label</span><span class="o">=</span><span class="s">&quot;Message&quot;</span><span class="p">,</span> <span class="n">widget</span><span class="o">=</span><span class="n">forms</span><span class="o">.</span><span class="n">Textarea</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">feedback</span><span class="p">(</span><span class="n">request</span><span class="p">):</span>
    <span class="n">c</span> <span class="o">=</span> <span class="n">standard_extra_context</span><span class="p">(</span><span class="n">title</span><span class="o">=</span><span class="s">u&quot;Contact us&quot;</span><span class="p">)</span>

    <span class="k">if</span> <span class="n">request</span><span class="o">.</span><span class="n">method</span> <span class="o">==</span> <span class="s">'POST'</span><span class="p">:</span>
        <span class="n">form</span> <span class="o">=</span> <span class="n">FeedbackForm</span><span class="p">(</span><span class="n">request</span><span class="o">.</span><span class="n">POST</span><span class="p">)</span>

        <span class="n">json</span> <span class="o">=</span> <span class="n">utils</span><span class="o">.</span><span class="n">json_validation_request</span><span class="p">(</span><span class="n">request</span><span class="p">,</span> <span class="n">form</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">json</span><span class="p">:</span> <span class="k">return</span> <span class="n">json</span>

        <span class="k">if</span> <span class="n">form</span><span class="o">.</span><span class="n">is_valid</span><span class="p">():</span>
            <span class="n">send_feedback</span><span class="p">(</span><span class="n">form</span><span class="o">.</span><span class="n">cleaned_data</span><span class="p">[</span><span class="s">'email'</span><span class="p">],</span>
                          <span class="n">form</span><span class="o">.</span><span class="n">cleaned_data</span><span class="p">[</span><span class="s">'name'</span><span class="p">],</span>
                          <span class="n">form</span><span class="o">.</span><span class="n">cleaned_data</span><span class="p">[</span><span class="s">'message'</span><span class="p">])</span>
            <span class="n">c</span><span class="p">[</span><span class="s">'message'</span><span class="p">]</span> <span class="o">=</span> <span class="s">u&quot;Thank you, your message has been sent.&quot;</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">form</span> <span class="o">=</span> <span class="n">FeedbackForm</span><span class="p">()</span>

    <span class="n">c</span><span class="p">[</span><span class="s">'form'</span><span class="p">]</span> <span class="o">=</span> <span class="n">form</span>
    <span class="k">return</span> <span class="n">shortcuts</span><span class="o">.</span><span class="n">render_to_response</span><span class="p">(</span><span class="s">'cciw/feedback.html'</span><span class="p">,</span>
                <span class="n">context_instance</span><span class="o">=</span><span class="n">template</span><span class="o">.</span><span class="n">RequestContext</span><span class="p">(</span><span class="n">request</span><span class="p">,</span> <span class="n">c</span><span class="p">))</span>
</pre>
<p>It does one bad thing, namely it doesn't redirect on success, but apart from that it is fine. But there is this annoying boilerplate of flow control. So, very quickly, after creating some standard mixins and base classes that are equivalent to my standard utilities, I now have the following (which now does the redirect-on-success):</p>
<pre class="code python literal-block">
<span class="k">class</span> <span class="nc">FeedbackForm</span><span class="p">(</span><span class="n">CciwFormMixin</span><span class="p">,</span> <span class="n">forms</span><span class="o">.</span><span class="n">Form</span><span class="p">):</span>
    <span class="n">email</span> <span class="o">=</span> <span class="n">forms</span><span class="o">.</span><span class="n">EmailField</span><span class="p">(</span><span class="n">label</span><span class="o">=</span><span class="s">&quot;Email address&quot;</span><span class="p">,</span> <span class="n">max_length</span><span class="o">=</span><span class="mi">320</span><span class="p">)</span>
    <span class="n">name</span> <span class="o">=</span> <span class="n">forms</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">label</span><span class="o">=</span><span class="s">&quot;Name&quot;</span><span class="p">,</span> <span class="n">max_length</span><span class="o">=</span><span class="mi">200</span><span class="p">,</span> <span class="n">required</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">message</span> <span class="o">=</span> <span class="n">forms</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">label</span><span class="o">=</span><span class="s">&quot;Message&quot;</span><span class="p">,</span> <span class="n">widget</span><span class="o">=</span><span class="n">forms</span><span class="o">.</span><span class="n">Textarea</span><span class="p">)</span>

<span class="k">class</span> <span class="nc">FeedbackBase</span><span class="p">(</span><span class="n">DefaultMetaData</span><span class="p">):</span>
    <span class="n">metadata_title</span> <span class="o">=</span> <span class="s">u&quot;Contact us&quot;</span>

<span class="k">class</span> <span class="nc">FeedbackFormView</span><span class="p">(</span><span class="n">FeedbackBase</span><span class="p">,</span> <span class="n">AjaxyFormView</span><span class="p">):</span>
    <span class="n">form_class</span> <span class="o">=</span> <span class="n">FeedbackForm</span>
    <span class="n">template_name</span> <span class="o">=</span> <span class="s">'cciw/feedback.html'</span>

    <span class="k">def</span> <span class="nf">get_success_url</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">reverse</span><span class="p">(</span><span class="s">'cciwmain.misc.feedback_done'</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">form_valid</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">form</span><span class="p">):</span>
        <span class="n">send_feedback</span><span class="p">(</span><span class="n">form</span><span class="o">.</span><span class="n">cleaned_data</span><span class="p">[</span><span class="s">'email'</span><span class="p">],</span>
                      <span class="n">form</span><span class="o">.</span><span class="n">cleaned_data</span><span class="p">[</span><span class="s">'name'</span><span class="p">],</span>
                      <span class="n">form</span><span class="o">.</span><span class="n">cleaned_data</span><span class="p">[</span><span class="s">'message'</span><span class="p">])</span>
        <span class="k">return</span> <span class="nb">super</span><span class="p">(</span><span class="n">FeedbackFormView</span><span class="p">,</span> <span class="bp">self</span><span class="p">)</span><span class="o">.</span><span class="n">form_valid</span><span class="p">(</span><span class="n">form</span><span class="p">)</span>

<span class="k">class</span> <span class="nc">FeedbackDone</span><span class="p">(</span><span class="n">FeedbackBase</span><span class="p">,</span> <span class="n">TemplateView</span><span class="p">):</span>
    <span class="n">template_name</span> <span class="o">=</span> <span class="s">'cciw/feedback_done.html'</span>

<span class="n">feedback</span> <span class="o">=</span> <span class="n">FeedbackFormView</span><span class="o">.</span><span class="n">as_view</span><span class="p">()</span>
<span class="n">feedback_done</span> <span class="o">=</span> <span class="n">FeedbackDone</span><span class="o">.</span><span class="n">as_view</span><span class="p">()</span>
</pre>
<p>(<tt class="docutils literal">DefaultMetaData</tt> is a <tt class="docutils literal">TemplateResponseMixin</tt> subclass that replaces <tt class="docutils literal">standard_extra_context</tt>. <tt class="docutils literal">FeedbackBase</tt> defines common attributes for <tt class="docutils literal">FeedbackFormView</tt> and <tt class="docutils literal">FeedbackDone</tt>. <tt class="docutils literal">AjaxyFormView</tt> is a <tt class="docutils literal">FormView</tt> subclass and replaces <tt class="docutils literal">utils.json_validation_request</tt>. The last two lines are simply there for the sake of not adding imports to my urls.py.)</p>
<p>Now, this is a big improvement in one way.  All standard flow control has been removed, so it is very <a class="reference external" href="http://c2.com/cgi/wiki?DontRepeatYourself">DRY</a> in that sense, and the details shown are the details which are relevant to the one task of handling a feedback form. The way to read this view is as declarative code, similar to how you read models and forms:</p>
<p><tt class="docutils literal">FeedbackFormView</tt>:</p>
<ul class="simple">
<li>is a feedback view<ul>
<li>so it has the title 'Contact us'<ul>
<li>and other default metadata</li>
</ul>
</li>
</ul>
</li>
<li>is a form processing view (which has been ajaxified)</li>
<li>has <tt class="docutils literal">FeedbackForm</tt> as the form</li>
<li>has <tt class="docutils literal">&quot;cciw/feedback.html&quot;</tt> as the template.</li>
<li>has the 'success url' equal to the 'feedback_done' URL.</li>
<li><em>does</em> the following with a valid form:<ul>
<li>sends a feedback message.</li>
</ul>
</li>
</ul>
<p>Only the last of these is really imperative, and this is quite a nice way to think about a chunk of code.</p>
<p>However, suppose we have a maintenance programmer who needs, let's say, to make the template display a certain message on the basis of a certain URL parameter.</p>
<p>In the first case, it is pretty obvious what to do — a couple of lines inserted before the <tt class="docutils literal">render_to_response</tt> call will work fine, with a corresponding change to the template.</p>
<p>In the second, it is also fairly easy to do: you would just have to define an appropriate <tt class="docutils literal">get_context_data()</tt> method. But to find that out, you've got a fairly intimidating stack of mixins and base classes to investigate. In fact, it's more like a tree than a stack (and I think I may already have some diamond inheritance in there by accident).</p>
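<p>To make the second case concrete, here is a minimal sketch of that override. It deliberately does not import Django: <tt class="docutils literal">ContextBase</tt> is a hypothetical stand-in for whatever base class ends up supplying <tt class="docutils literal">get_context_data()</tt>, the shape of <tt class="docutils literal">FeedbackBase</tt> is assumed, and the <tt class="docutils literal">from</tt> parameter, the <tt class="docutils literal">query_params</tt> attribute (standing in for <tt class="docutils literal">request.GET</tt>) and the message text are all invented for illustration.</p>

```python
class ContextBase(object):
    """Hypothetical stand-in for the framework class that builds the context."""

    def get_context_data(self, **kwargs):
        return dict(kwargs)


class FeedbackBase(ContextBase):
    """Adds the metadata shared by the feedback views (assumed behaviour)."""

    def get_context_data(self, **kwargs):
        context = super(FeedbackBase, self).get_context_data(**kwargs)
        context['title'] = 'Contact us'
        return context


class FeedbackFormView(FeedbackBase):
    def __init__(self, query_params):
        # Stand-in for request.GET in a real view.
        self.query_params = query_params

    def get_context_data(self, **kwargs):
        # Let the base classes build their part first, then add ours.
        context = super(FeedbackFormView, self).get_context_data(**kwargs)
        if self.query_params.get('from') == 'booking':
            context['message'] = 'Questions about your booking? Ask them here.'
        return context
```

<p>The override itself is only a few lines. The cost lies in knowing that this is the hook to override, and in trusting that every class above it passes the context along.</p>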
<p>This is classic <a class="reference external" href="http://c2.com/cgi/wiki?RavioliCode">ravioli code</a> —
“thousands of little classes everywhere and no one knows how to find the places
where things really happen”. As one person on that c2 page says, “this style
maximizes maintainability by old hands at the expense of comprehensibility by
newcomer”.</p>
<p>Put it this way: suppose you want to go from the current version to the previous version, in a precise and controlled way. You could trace through all the functions that would be called, starting with <cite>as_view()</cite> and working through till you collected all the code paths that were being used, but it would take you a lot longer than the forward transformation just took me. And this is exactly the transformation you might need to do if, for example, you needed to make a fundamental change to the flow control while keeping most of the code intact. It is faster to start from scratch with the declarative reading above — but that will only work if the view really is declarative, and there are no sneaky surprises in the base classes, and even then it may take some digging.</p>
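<p>One way to shorten that tracing, sketched here with plain Python introspection rather than anything Django-specific, is to ask the class which of its ancestors define a given method. The classes below are nearly empty, hypothetical stand-ins; the exact shape of the hierarchy is a guess, and <tt class="docutils literal">defining_classes</tt> is a made-up helper name.</p>

```python
def defining_classes(cls, method_name):
    """Names of the classes in cls's MRO that define method_name themselves."""
    return [klass.__name__ for klass in cls.__mro__
            if method_name in vars(klass)]


# Hypothetical stand-ins for the view hierarchy; the real classes do much more.
class TemplateResponseMixin(object):
    def render_to_response(self, context):
        return context


class DefaultMetaData(TemplateResponseMixin):
    def get_context_data(self, **kwargs):
        return kwargs


class FeedbackBase(DefaultMetaData):
    pass


class FormView(object):
    def form_valid(self, form):
        return 'redirect'


class AjaxyFormView(FormView):
    def form_valid(self, form):
        return super(AjaxyFormView, self).form_valid(form)


class FeedbackFormView(FeedbackBase, AjaxyFormView):
    def form_valid(self, form):
        return super(FeedbackFormView, self).form_valid(form)
```

<p>With these stand-ins, <tt class="docutils literal">defining_classes(FeedbackFormView, 'form_valid')</tt> lists <tt class="docutils literal">FeedbackFormView</tt>, <tt class="docutils literal">AjaxyFormView</tt> and <tt class="docutils literal">FormView</tt>, in the order the calls will chain. That tells you which files to open, though not what each method does when you get there.</p>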
<p>So, what am I saying? Don't do this? I guess I'm just saying “be aware of this trade-off”. As <a class="reference external" href="http://jjinux.blogspot.com/2010/11/software-engineering-coping-when-you.html">Shannon Behrens recently blogged</a>, there are times when fixing DRY violations is not worth it.</p>
</div>
]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://lukeplant.me.uk/blog</uri>
    </author>
    <title type="html"><![CDATA[Mercurial tree visualisation]]></title>
    <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog/posts/mercurial-tree-visualisation/" />
    <id>http://lukeplant.me.uk/blog/posts/mercurial-tree-visualisation/</id>
    <updated>2010-02-12T15:51:23Z</updated>
    <published>2010-02-12T15:51:23Z</published>
    <category scheme="http://lukeplant.me.uk/blog" term="Python" />
    <category scheme="http://lukeplant.me.uk/blog" term="Software development" />
    <summary type="html"><![CDATA[Mercurial tree visualisation]]></summary>
    <content type="html" xml:base="http://lukeplant.me.uk/blog/posts/mercurial-tree-visualisation/"><![CDATA[
<div class="document">


<p>I discovered <a class="reference external" href="http://www.logilab.org/project/hgview">hgview</a>, and wondered whether it could make better sense of my Django repository than <a class="reference external" href="http://mercurial.selenic.com/wiki/HgkExtension">hgk</a>.</p>
<p>Which of these two would you rather look at?</p>
<p>hgk:</p>
<a class="reference external image-reference" href="/blogmedia/hgk-shot1.png"><img alt="hgk output" src="/blogmedia/hgk-shot1-th.png" style="width: 200px;"></a>
<p>hgview:</p>
<a class="reference external image-reference" href="/blogmedia/hgview-shot1.png"><img alt="hgview output" src="/blogmedia/hgview-shot1-th.png" style="width: 200px;"></a>
</div>

]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://lukeplant.me.uk/blog</uri>
    </author>
    <title type="html"><![CDATA[Eclipse and Mercurial with existing checkout]]></title>
    <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog/posts/eclipse-and-mercurial-with-existing-checkout/" />
    <id>http://lukeplant.me.uk/blog/posts/eclipse-and-mercurial-with-existing-checkout/</id>
    <updated>2010-02-12T14:53:00Z</updated>
    <published>2010-02-12T14:53:00Z</published>
    <category scheme="http://lukeplant.me.uk/blog" term="Python" />
    <category scheme="http://lukeplant.me.uk/blog" term="Software development" />
    <summary type="html"><![CDATA[Eclipse and Mercurial with existing checkout]]></summary>
    <content type="html" xml:base="http://lukeplant.me.uk/blog/posts/eclipse-and-mercurial-with-existing-checkout/"><![CDATA[<div class="document">
<p>One very nice thing about Emacs — that you don't appreciate until you try alternatives — is that it is very unobtrusive.  It doesn't try to take control of any of your projects.  It takes your code as it finds it, and works from there.</p>
<p>I'm trying to get MercurialEclipse or HgEclipse to work with my existing project.  In switching to Eclipse, it was actually relatively easy to persuade it that my code can stay where it is on my disk, and doesn't need to live inside the 'workspace' folder — you can use symlinks or the builtin 'linked resources' feature.</p>
<p>But now I want to persuade it to recognise one of these source folders as a Mercurial checkout.  It seems the only way to do this is to use the &quot;import&quot; feature, or &quot;share&quot; the project - both of which do the wrong thing.</p>
<p>My setup is that inside a workspace folder (call it workspace1), I have a project defined (e.g. 'Django').  This project has a .project file with a 'linkedResources' entry that pulls in the actual source, which lives elsewhere:</p>
<pre class="literal-block">
devel/
 django/
   hg/
     trunk/
 workspaces/
   workspace1/
     Django/
       src/ (defined in .project as a link to devel/django/hg/trunk)
     python2.5
</pre>
<p>I simply want the Mercurial plugin to recognise that 'src' folder as a Mercurial checkout, which it is already.</p>
<p>All attempts at reverse engineering settings files etc. have failed.  Has anyone achieved this?</p>
<p>It would certainly make IDEs much more attractive if they didn't insist on trying to manage everything for you.  I'm a big boy, I don't want them to hold my hand as if this was the first time I had used Mercurial or worked on a project.  And I'm certainly not going to move my code about on the disk to satisfy some IDE - I might change my mind and use something else next month, or I might want to use more than one IDE on the same project.  IDE metadata needs to be completely separate from my source code.</p>
</div>
]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://lukeplant.me.uk/blog</uri>
    </author>
    <title type="html"><![CDATA[Python and copyright]]></title>
    <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog/posts/python-and-copyright/" />
    <id>http://lukeplant.me.uk/blog/posts/python-and-copyright/</id>
    <updated>2009-07-15T22:04:42Z</updated>
    <published>2009-07-15T22:04:42Z</published>
    <category scheme="http://lukeplant.me.uk/blog" term="Software development" />
    <category scheme="http://lukeplant.me.uk/blog" term="Python" />
    <category scheme="http://lukeplant.me.uk/blog" term="Linux" />
    <summary type="html"><![CDATA[Python and copyright]]></summary>
    <content type="html" xml:base="http://lukeplant.me.uk/blog/posts/python-and-copyright/"><![CDATA[
<div class="document">


<p>There has been lots of <a class="reference external" href="http://jacobian.org/writing/gpl-questions/">interesting discussion</a> about dynamic languages and the
GPL, but much of it seems to miss the main point: with many dynamic languages,
the terms of the GPL are an irrelevance.</p>
<p>In the comments on the above page, Peter Gerdes seems to be one of the few who
does get this:</p>
<blockquote>
<p>@Morty Yidel</p>
<p>It doesn't matter what the GPL says in situations 2 &amp; 3. What matters here is
copyright law.</p>
<p>The GPL is a license that lets you create and distribute copies of certain
software. You are only bound by the terms of that license when you need the
license as a matter of copyright law.</p>
<p>In 2 &amp; 3 you aren't distributing any GPLed software thus you need only abide
by the GPL if it is a derivitive work of a GPLed program. As I said that's a
complex question but it's one that can only be answered by looking at
copyright law and precedent not the GPL. Moreover, if merely using an API
makes your code a derivitive work of the library servicing the API it's hard
to see how programs like WINE aren't infringing.</p>
</blockquote>
<p>If someone wants to apply the terms of the GPL, they <strong>first</strong> have to demonstrate
something that would otherwise be copyright infringement. As the GPL (v2) says:</p>
<blockquote>
5. You are not required to accept this License, since you have not signed it.
However, nothing else grants you permission to modify or distribute the
Program or its derivative works.  These actions are prohibited by law if you
do not accept this License.  Therefore, by modifying or distributing the
Program (or any work based on the Program), you indicate your acceptance of
this License to do so, and all its terms and conditions for copying,
distributing or modifying the Program or works based on it.</blockquote>
<p>So, if I sit down at my computer and type (for the sake of argument):</p>
<pre class="literal-block">
echo 'import qt' &gt; test.py
</pre>
<p>and then distribute test.py, have I created a derivative work of Qt (or any
other GPL library)?  Since I could type all of that without having a drop of Qt
source code on my computer, I'd argue it's pretty difficult to say that this is
a derivative work.  Of course, once I go further and actually start using more
of the API than just the name 'qt', there is the chance that someone important
might decide that by virtue of my use of an API, the program constitutes a
derivative work of the library.  (If they did, that would have huge implications
— every Linux program, not to mention Linux itself, would become a derivative of
Unix itself, for example, because it uses Unix APIs, and every Win32 program
would be a Windows derivative...)</p>
<p>This, of course, is completely different from C/C++, where in order to compile
an executable, the compiler #includes parts of the actual source code of the
library into the executable, thereby automatically triggering the need for an
exemption from copyright laws.</p>
<p>With Python and JavaScript, nothing of the sort has to happen for the user to
get the developer's software.  If the developer decides to test their program before
distributing, they may well need copies of the library code, either in source or
object form, on their machine, but that has nothing to do with their
distribution of their own code.</p>
<p>Now, the GPL (or whatever), may have terms explicitly forbidding you from
finding some sneaky way around the licence, such as dynamic languages or web
interfaces etc.  But it doesn't matter.  The licence could require you to
sacrifice your first-born child to the gods of Jupiter for even thinking of
doing such a thing, but the licence is irrelevant, since you haven't done
anything that would require you to accept its terms.</p>
<p>So, unless judges are going to rule that use of APIs constitutes copyright
infringement, <strong>all</strong> software licenses of this kind are simply irrelevant in
this case.</p>
<p>This is one of the things that makes me shy away from the GPL.  I'm a really
grateful beneficiary of all that it has helped to safeguard, but as it relies on
copyright law which may well be <em>completely irrelevant</em> for dynamic languages,
it seems somewhat unwise to lean on it for any of my own financial security.
More and more it seems that being paid for <em>writing</em> code, and not for the <em>code
itself</em>, is the way programmers should make their living.  Just because
copyright law may have supported certain practices in the past doesn't mean we
have a right to that continuing.</p>
</div>

]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://lukeplant.me.uk/blog</uri>
    </author>
    <title type="html"><![CDATA[Managing Django patches using Mercurial]]></title>
    <link rel="alternate" type="text/html" href="http://lukeplant.me.uk/blog/posts/managing-django-patches-using-mercurial/" />
    <id>http://lukeplant.me.uk/blog/posts/managing-django-patches-using-mercurial/</id>
    <updated>2008-07-31T21:51:37Z</updated>
    <published>2008-07-31T21:51:37Z</published>
    <category scheme="http://lukeplant.me.uk/blog" term="Python" />
    <category scheme="http://lukeplant.me.uk/blog" term="Software development" />
    <category scheme="http://lukeplant.me.uk/blog" term="Django" />
    <summary type="html"><![CDATA[Managing Django patches using Mercurial]]></summary>
    <content type="html" xml:base="http://lukeplant.me.uk/blog/posts/managing-django-patches-using-mercurial/"><![CDATA[
<p>Update: this is probably superseded by: <a href="http://code.djangoproject.com/wiki/MercurialBranches">http://code.djangoproject.com/wiki/MercurialBranches</a>.</p>

<hr>

<P>
For any slightly long-lived Django patch, for instance if I want to work on something in stages before committing, or if it might never be committed to trunk, here is the pattern I started using today:
</P>
<H2 id="Setup">Setup<A class="anchor" href="#Setup" title="Link to this section"> §</A></H2>
<P>
Do this once:
</P>
<P>
In a subversion checkout:
</P>
<DIV class="code"><PRE><SPAN class="nv">$ </SPAN><SPAN class="">hg init
</SPAN><SPAN class="nv">$ </SPAN><SPAN class="">cat &gt; .hgignore
</SPAN><SPAN class="se">\.</SPAN><SPAN class="">svn/
</SPAN><SPAN class="se">\.</SPAN><SPAN class="">hgignore</SPAN><SPAN class="err">$</SPAN><SPAN class="">
</SPAN><SPAN class="se">\.</SPAN><SPAN class="">pyc
~</SPAN><SPAN class="err">$</SPAN><SPAN class="">
^D
</SPAN></PRE></DIV><P>
(<TT>^D</TT> == Control-D, EOF)
</P>
<DIV class="code"><PRE><SPAN class="nv">$ </SPAN><SPAN class="">hg ci -Am </SPAN><SPAN class="s2">"Initializing from subversion checkout"</SPAN><SPAN class="">
</SPAN><SPAN class="nv">$ </SPAN><SPAN class="">hg qinit -c
</SPAN></PRE></DIV><P>
NB: make sure you do 'hg qinit -c'.  Your patches live in .hg/patches, and this command makes that directory into a Mercurial repository, so all your changes to the patches are versioned.  You have to remember to do <TT>hg qcommit</TT> after <TT>hg qrefresh</TT>.
</P>
<H2 id="Usage">Usage<A class="anchor" href="#Usage" title="Link to this section"> §</A></H2>
<P>
Commit all work to mercurial queues, not to mercurial directly.
</P>
<P>
New patch:
</P>
<DIV class="code"><PRE><SPAN class="nv">$ </SPAN><SPAN class="">hg qnew name_of_patch
</SPAN></PRE></DIV><P>
(Hack away, using hg diff etc. to see changes)
</P>
<P>
Commit changes to the current patch:
</P>
<DIV class="code"><PRE><SPAN class="nv">$ </SPAN><SPAN class="">hg qrefresh &amp;&amp; hg qcommit
</SPAN></PRE></DIV><P>
Pull in new subversion changes:
</P>
<DIV class="code"><PRE><SPAN class="nv">$ </SPAN><SPAN class="">hg qpop -a
</SPAN><SPAN class="nv">$ </SPAN><SPAN class="">svn update
</SPAN><SPAN class="nv">$ </SPAN><SPAN class="">hg addremove -s 75
</SPAN><SPAN class="nv">$ </SPAN><SPAN class="">hg commit -m upstream
</SPAN><SPAN class="nv">$ </SPAN><SPAN class="">hg qpush -a
</SPAN></PRE></DIV><P>
(Of course, you could put that lot in a little script)
</P>
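<P>
A minimal sketch of such a script, written in Python. It runs the five commands above in order and stops at the first unexpected failure. The exit-code handling is an assumption worth checking against your Mercurial version: <TT>hg commit</TT> is documented to exit with 1 when nothing changed, which is treated as harmless here. The names <TT>UPDATE_STEPS</TT> and <TT>pull_upstream</TT> and the <TT>run</TT> parameter are invented for illustration.
</P>

```python
import subprocess

# Each step is (command, acceptable exit codes).
# Assumption: "hg commit" exits with 1 when there is nothing to commit,
# which just means upstream had no new changes.
UPDATE_STEPS = [
    (['hg', 'qpop', '-a'], (0,)),
    (['svn', 'update'], (0,)),
    (['hg', 'addremove', '-s', '75'], (0,)),
    (['hg', 'commit', '-m', 'upstream'], (0, 1)),
    (['hg', 'qpush', '-a'], (0,)),
]


def pull_upstream(run=subprocess.call):
    """Run the update steps in order, raising on the first unexpected exit code.

    run is any callable that takes a command list and returns its exit code;
    it defaults to subprocess.call and can be swapped out for a dry run.
    """
    for command, ok_codes in UPDATE_STEPS:
        code = run(command)
        if code not in ok_codes:
            raise RuntimeError('%s exited with %d' % (' '.join(command), code))
```

<P>
Call <TT>pull_upstream()</TT> from the top of the checkout. If a step fails part way through, the patches may be left popped, so check <TT>hg qapplied</TT> before carrying on.
</P>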
<P>
Once done, commit to subversion, and delete patches:
</P>
<DIV class="code"><PRE><SPAN class="nv">$ </SPAN><SPAN class="">svn commit
</SPAN><SPAN class="nv">$ </SPAN><SPAN class="">hg qdel -r qbase:qtip
</SPAN><SPAN class="nv">$ </SPAN><SPAN class="">hg qcommit -m </SPAN><SPAN class="s2">"Pushed patches upstream"</SPAN><SPAN class="">
</SPAN></PRE></DIV><P>
(Your patches are still there, of course, in the mercurial repository in .hg/patches, should you need the history)
</P>
<H2 id="Evaluation">Evaluation<A class="anchor" href="#Evaluation" title="Link to this section"> §</A></H2>
<P>
I'm not yet sure if this is the best way to do this (see the other methods on the <A class="ext-link" href="http://www.selenic.com/mercurial/wiki/index.cgi/WorkingWithSubversion"><SPAN class="icon">Mercurial Working with Subversion</SPAN></A> page), but it is a lot better than Subversion on its own, and it isn't too complicated.
</P>
<P>
One advantage of the system is that it is very easy to ignore it or use it as required.  Most of my Django work I will just hack directly and commit.  If I realise part way through that it is going to take a while, or that I'm doing two sets of changes that I want to split up, <TT>hg qrecord</TT> is just the ticket -- interactively record changes to a new patch.  If I then need to work on something else, I can just do <TT>hg qnew</TT>, doing <TT>hg qpop</TT> first if necessary/desired.
</P>
<P>
It is quite nice that the patches are just simple files in .hg/patches -- you can just go and look at them with normal tools etc.  If your patch has got messy (i.e. accidental whitespace changes), you can just <TT>hg qpop</TT> it, fix it up manually, <TT>hg qcommit</TT> and <TT>hg qpush</TT> it again.
</P>
<P>
One of the downsides is that patches are managed as a stack -- if you want to add a new patch, you have to put it somewhere in the series of patches (not necessarily at the 'top' i.e. most recent end, but you do need to think about where to insert it).  I imagine things can get tricky if you put it in the wrong order initially, though you could always just delete it and put it somewhere else.  As long as you have versioned everything, you can get at it easily.
</P>
<P>
On the other hand, given that your patches directory is a full hg repository, I imagine you can actually do very powerful things -- you can branch your patches etc. very easily.  I've discovered recently that Mercurial branching is actually very powerful -- you don't have to create clones, you can do everything inside one working directory.
</P>
<P>
Anyway, hope that is useful to someone else.
</P>

]]></content>
  </entry>
</feed>
