MVC is not a helpful analogy for Django

by Luke Plant

— June 25, 2013 13:22

Sometimes Django is described as MVC — Model-View-Controller. The problem with that is that people will either:

come with baggage from existing MVC frameworks, which might be nothing like Django,
or end up at something like the Wikipedia page on MVC, which describes an architecture that is very unlike Django’s.

The classic MVC architecture is about managing state. Suppose you have a GUI that allows you to, say, view and edit a drawing:

You’ve got to store the drawing in memory somewhere.
You’ve got to display the drawing on the screen.
You have controls that allow you to modify the drawing, e.g. change the colour of a shape.
And you’ve got to display the changes when that happens.

The controller tells the model to change, and the model notifies the view in some way (preferably by some kind of pub/sub mechanism that allows the view to be fairly decoupled from the model).

MVC is primarily about managing the changes in state so that everything is kept in sync. Model, View and Controller are all things that exist at the same time in memory (possibly running in different threads or processes), for extended periods, with their own state, and have to interact with each other.
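As a rough illustration of the classic arrangement described above, here is a minimal sketch in plain Python. The class and method names (`DrawingModel`, `DrawingView`, `DrawingController`, `set_colour`, `on_colour_picked`) are invented for this example and do not come from any particular framework. The point is only that all three objects live in memory together, and the view is kept in sync by notifications from the model.

```python
class DrawingModel:
    """Holds the drawing's state and notifies subscribers when it changes."""

    def __init__(self):
        self.shapes = {}        # shape name -> colour
        self._subscribers = []  # callables invoked after every change

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def set_colour(self, shape, colour):
        self.shapes[shape] = colour
        # Simple pub/sub: the model knows nothing about who is listening.
        for callback in self._subscribers:
            callback(self)


class DrawingView:
    """Long-lived view that keeps its own rendered copy in sync with the model."""

    def __init__(self, model):
        self.rendered = ""
        model.subscribe(self.refresh)
        self.refresh(model)

    def refresh(self, model):
        self.rendered = ", ".join(
            f"{shape}={colour}" for shape, colour in sorted(model.shapes.items())
        )


class DrawingController:
    """Translates user input into changes on the model."""

    def __init__(self, model):
        self.model = model

    def on_colour_picked(self, shape, colour):
        self.model.set_colour(shape, colour)


model = DrawingModel()
view = DrawingView(model)
controller = DrawingController(model)

controller.on_colour_picked("circle", "red")
controller.on_colour_picked("square", "blue")
print(view.rendered)  # circle=red, square=blue
```

Nothing here is ever thrown away and rebuilt: the view object persists across changes and is updated incrementally.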

Django’s Model-View-Template is quite different from this.

In MVT, there is no state. There is only data. For the purposes of most HTTP requests (GET requests), the data in the database is treated as an immutable data input, not state. It could be said that the name ‘view’ is misleading, since it implies reading, not writing, i.e. GET requests not POST requests. A better name might be ‘handler’, because it handles an HTTP request, and that is the terminology used by most Django REST frameworks.

In HTTP, stateful interactions can be built up by modifying data on the server, and modifying data on the client (cookies). But these interactions are bigger than the scope of a single page view. The browser holds one half of the state – the state of the current page, and cookies – and the session database holds the other half.

But when it comes to responding to an HTTP request, Django’s MVT has a complete lack of state. Many web pages are essentially pure functions of the inputs – an HTTP request and the data in the database – so it is clear that MVT is not intrinsically about state.
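To make the "pure function" idea concrete, here is a framework-free sketch. It does not use Django's real request or response classes: the dictionaries standing in for the request, the database and the response, and the name `article_view`, are all assumptions made for illustration.

```python
from html import escape
from types import MappingProxyType


def article_view(request, database):
    """Compute a response from the request and the data, and nothing else."""
    article = database["articles"].get(request["path"])
    if article is None:
        return {"status": 404, "body": "Not found"}
    body = f"<h1>{escape(article['title'])}</h1><p>{escape(article['text'])}</p>"
    return {"status": 200, "body": body}


# A read-only mapping stands in for "the data in the database" during a GET.
database = MappingProxyType({
    "articles": MappingProxyType({
        "/mvt/": {"title": "MVT", "text": "No state, only data."},
    }),
})

request = {"method": "GET", "path": "/mvt/"}
first = article_view(request, database)
second = article_view(request, database)
print(first == second)  # True: same inputs, same output
```

The function keeps nothing between calls, so there is nothing to notify and nothing to keep in sync.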

Of course, there is data modification. POST requests are about that. But these do not result in views being ‘notified’. Classic web apps handle state by side-stepping it completely. If a user knows that the ‘state’ of the system has changed (i.e. new data in the database), the user presses ‘Refresh’, which:

throws away all the state (i.e. the current state of the browser), with the exception of the pieces of state that identify what the user was looking at - the URL, and the site’s cookies.
causes a brand new HTTP request asking for the document. The server responds completely from scratch: it doesn’t notify the view function or the template, it runs everything again from the beginning.

So, if anything changed, the approach is “cancel everything, start again from the beginning”.

And for the actual handling of POST requests within Django, you have a similar approach. Once the data has been updated (typically, a SQL INSERT or UPDATE), you send back an HTTP redirect to do a GET – “something changed, let’s start again from the beginning”. This is why Django’s ORM does not have an identity mapper. The model for handling state is to ignore it altogether, and just start again whenever you know that something has changed.
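The POST-then-redirect flow can be sketched without any framework. In the example below, the `handle` function, the `comments` table and the dictionary-shaped requests and responses are all hypothetical, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3


def handle(request, db):
    """Handle one request from scratch; nothing is remembered between calls."""
    if request["method"] == "POST":
        db.execute(
            "INSERT INTO comments (text) VALUES (?)",
            (request["data"]["text"],),
        )
        db.commit()
        # Something changed: tell the client to start again with a GET.
        return {"status": 302, "location": "/comments/"}
    rows = db.execute("SELECT text FROM comments ORDER BY id").fetchall()
    return {"status": 200, "body": "\n".join(text for (text,) in rows)}


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, text TEXT NOT NULL)")

# The client posts, receives a redirect, then issues a fresh GET.
redirect = handle(
    {"method": "POST", "path": "/comments/", "data": {"text": "First!"}}, db
)
page = handle({"method": "GET", "path": redirect["location"]}, db)
print(redirect["status"], page["body"])  # 302 First!
```

The GET branch re-reads everything from the database. No object created while handling the POST survives to be updated.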

This is exactly the opposite of the way classic MVC apps work (including client-side JavaScript MVC frameworks) – they are all about avoiding starting again, and having live systems that can be informed about updates and keep everything in sync by sending messages between them.

There is a second aspect to MVC, which is separation of concerns. If you think of MVC as meaning “separation of code that stores data, code that displays data, and code that handles requests for data to be changed or displayed”, then Django does indeed fit that pattern.

But I don’t think the classic description of MVC is a helpful starting point at all. HTTP is different, and has its own needs which have given birth to its own architectures.

What difference does this make?

First, we can avoid unhelpful comparisons that just confuse. The best way to understand how MVT works is to try it. You have to learn something about HTTP - a view function is a bit of code that handles an HTTP request and returns an HTTP response. Analogies to systems that are not like HTTP are not going to help that much.

Second, we can avoid trying to shoe-horn Django applications into a mold created by a different architecture. I believe MVC apps will provide very little guidance about how to structure code in Django apps.

In particular, I kind of disagreed with the post Know Your Models by Hynek Schlawack. It starts with the assumption of classic MVC, and because Django models don’t fit that ideal well, because they are fundamentally tied to their underlying relational storage, he feels the need for a separate set of ‘pure’ models, and the ability to change the components of M, V, and C separately from each other, because that’s what MVC is supposed to enable you to do.

I do agree with the approach of creating an API on your models that you want to use from view functions. So, for example, I tend to eschew all direct .filter() calls in view functions, preferring methods on models that can be tested independently. But I think the analogy with MVC can lead you in an unhelpful direction for many Django apps.

So, to contradict Hynek:

Many applications are fairly thin, simple wrappers around the database. In fact, these are Django’s sweet spot, with the admin being designed specifically for the case where it makes sense to be editing database tables with a layer so thin that it can be 95% autogenerated.

Further, if you put business logic in separate classes, rather than on your Django Model, it will be hard to re-use it in the admin and other ModelForms.

In my experience, the data model usually has been constructed with your application in mind. If it hasn’t, you are going to have an extremely painful time.

If your database allows only one email address per customer, then your application is going to reflect that. If your schema changes so that now you have multiple, the change will ripple right the way through your application (in most cases). MVT is not supposed to insulate you from that.

You can’t really build an application on top of a database that wasn’t designed for it, unless you are using it as a key-value store or something.

The database isn’t global state, it is global data. It doesn’t vary over the lifetime of an HTTP request being handled. As soon as it has changed, you send a redirect and start again.

Hynek’s approach can be necessary sometimes, but it adds a layer of complexity and indirection that itself can be deadly. Sometimes this is even worse for bigger projects – you are adding more complexity and indirection to a project that was already large.

When I write Django apps, I’m regularly making changes to the database schema, constantly making it match the exact and changing needs of the application. It is then easy to use the Django Model as a great basis for an API, and keep things as simple as possible.

There are other differences of approach in this, of course. Do you regard the database as merely a persistence mechanism for your application, or do you regard the database as an integral part of your application – probably the most important part?

I actually tend towards the latter – to call an RDBMS like Postgres a persistence mechanism is pretty insulting. That means I lean far more heavily on the database, and I’m not embarrassed to make use of an extremely powerful RDBMS that can do all kinds of things, like constraint checking, transactions, triggers etc. With that mindset, testing your application by necessity involves testing what happens to your database for different HTTP inputs, rather than simply checking a ‘model layer’.
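Here is a small sketch of what leaning on the database, and testing through it, might look like. SQLite from the Python standard library stands in for Postgres, and the `customers` table, the `signup` handler and the dictionary-shaped responses are hypothetical names chosen for the example.

```python
import sqlite3


def make_db():
    db = sqlite3.connect(":memory:")
    # The rules live in the schema, not in a separate 'model layer'.
    db.execute(
        "CREATE TABLE customers ("
        " id INTEGER PRIMARY KEY,"
        " email TEXT NOT NULL UNIQUE CHECK (email LIKE '%@%'))"
    )
    return db


def signup(post_data, db):
    """Handle a hypothetical signup POST, letting the database enforce the rules."""
    try:
        with db:  # transaction: commits on success, rolls back on error
            db.execute(
                "INSERT INTO customers (email) VALUES (?)",
                (post_data["email"],),
            )
    except sqlite3.IntegrityError:
        return {"status": 400}
    return {"status": 302, "location": "/welcome/"}


db = make_db()
responses = [
    signup({"email": "a@example.com"}, db),
    signup({"email": "a@example.com"}, db),  # duplicate: rejected by UNIQUE
    signup({"email": "not-an-email"}, db),   # malformed: rejected by CHECK
]
stored = db.execute("SELECT email FROM customers").fetchall()
print([r["status"] for r in responses], stored)
```

A test in this style feeds in different POST data and then checks both the response and what actually ended up in the table.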

If there are bits of logic that can be separated out easily from the database, and tested more easily, by all means do that. But I wouldn’t make that the goal of my architecture.
