This is the third in a seris of posts in which my upcoming wedding has exercised my programming skills.
There are a lot of ways to seat approx 130 guests in tables of 10.
(Digression: how many ways?
There are n! ways to order a list of n people.
However, we then group that list of n people into j groups (tables) of size k each (assuming an exact division is possible for simplicity).
In each group, there are k! different ways to order the people, but all these are considered equivalent for our purposes (we don't care what order people are listed within a table). So we divide by k!, and do it j times, once for each table.
We could then shuffle all the groups, and all the results of such shuffling are also considered equivalent. So we divide by j! to count for these permutations.
So, I make it: n!/((k!^j)*j!)
(You can get the same formula by considering 'n choose k' to choose the people for the first table, multiplied by '(n-k) choose k', for the second, etc., with n reducing by k each time as the pool of available people shrinks, dividing by j! to account for permutations of tables, and simplifying).
Or, in my case:
54, 890, 022, 949, 532, 268, 126, 852, 483, 022, 955, 182, 591, 896, 498, 645, 394, 651, 998, 355, 750, 163, 025, 911, 301, 922, 700, 044, 395, 961, 856, 125, 066, 251, 303, 921, 134, 714,366, 269, 440, 000
But my maths is shockingly bad — corrections in comments are welcome! Whether I've got it right or not, there are still definitely lots of potential arrangements...
[Edit 2014-02-28: previous incorrect formula got me a larger figure, 2.201437978814401e+202]
End of digression...)
So, we haven't got into the business of doing the table plan yet, but this is such a nice programming puzzle that I couldn't resist starting now (and we probably need to get a move on if we are to get this done...).
I found one commercial solution, Perfect Table Plan, but:
- It doesn't run on Linux :-(
- It's not free
- It's already written, which is half the fun!
I found a paper tackling the problem, but:
- No source code!
- It required a commercial software package, GAMS with the CPLEX solver.
- On some pretty hefty hardware, running the solver still took 36 hours.
So I wrote my own. I used a model very similar to the one given in the paper, with some tweaks.
The idea is that you define a matrix of connections, with a zero indicating that the people don't know each other, 1 that they do know each other, plus 50 indicating that they must be together, and negative numbers for people who should be kept apart!
You then use this matrix to rate different potential seating plans — essentially trying to maximise the strength of connections on tables, summed over all people. In order to explore the possible solutions, I'm using simulated annealing, and a simulating annealing Python module that I found.
Defining a matrix for how 130 people know each other is a challenge. I've created a UI with a grid, and added controls to make it easy to enter connections for groups of people together. This provides a reasonable solution, but a grid 130x130 is a bit of a pain.
The UI is all in a web browser, with the main algorithmic code running server side, written in Python. I've used Flask as the web framework, instead of Django which I almost always use, since it is nice and light, and I don't need an ORM or admin etc.
I also discovered a few tricks along the way.
I don't want this web app to be storing any data on the server at all, not even temporarily, but people still need to be able to save their data (the connection matrix that they have defined, and also any generated seating plans they want to save) in a convenient way, and be able to upload the data that they saved previously. Copy-paste to a text file works, but it is far from convenient.
For downloading data, I've found a trick using an iframe with an auto-submitting POST form. The server simply echoes back the POSTed data with “Content-Disposition: attachment”, and the data will be saved — in Chrome and Firefox this works very well, since they can be instructed very easily not to prompt you for every file that is downloaded, so this becomes a one-click process.
For uploading data, within a single page app, there is now a way to do this using XHR, at least for modern browsers. I then have a simple view which again just echoes the data back client-side, allowing me to use a browser’s normal file upload mechanism to get the data back into the client-side app.
From my tests so far, I don't think the algorithm is going to be up to the job, especially with the machine time I have available. It worked quite well for 25 people, but 130 is another matter. I'm leaving it to run for a few hours, using a high number of "exploration steps", and fairly high "annealing time", but I'm not that hopeful.
First, I think there is a flaw in the model: it tries to maximises the total strength of connections between people on a table, summed across all people and tables. However, this has some bad implications: if someone is moved to a table where they know lots of people, the score will increase a lot, due to everyone on the destination table being happier. If they have left a table where they knew, say one person, that will be a penalty, but not a big one. But it is possible that the person they abandoned is now left friendless on that table. Overall, the score will still increase — the misery of the one friendless person having been outweighed by the happiness of the larger group.
So, basically, it is too utilitarian. Or it doesn't weigh the misery of being alone strongly enough.
Second, I suspect that the solution space is just too large to explore. Simulated annealing requires you to define a 'move' function, to change the system from one state to a nearby one. I just swap two random people on two random tables.
Now, in many cases you have a strong requirement that two or more people sit together e.g. a connection coefficient of 50. There is first of all the problem of random swaps getting these people together in the first place. Second, once they are together, you get a local maxima, but because each move only moves one person, not the group, they are basically stuck at that table, barring a small miracle in which they are both swapped to the same table by chance within a few moves.
So I'm thinking that a different physical metaphor is needed — something more like gravity or springs, which kind of guarantees that couples and their children will find each other, and stick together. I can't think how to continue that model so that you satisfy the strict constraints of so many people to a table.
I have very little experience with this kind of code, so it is fun to be stretched in this direction!
Since this is highly algorithmic code, I experimented with PyPy.
Initially I found PyPy was actually a bit slower. However, my code uses lists of integers to represent people, and originally I used ‘None’ as a sentinel value to represent an empty seat. I realised this might be killing PyPy's JIT. Switching to a sentinel value of ‘-1’ gave me approximately 5x to 10x improvement with PyPy. So it is now worth using PyPy.
Other than that one optimisation, I've done very little, so there are probably big ways the code could be improved.
It's all free, and on BitBucket:
I'm kind of hoping that someone will turn this into something that actually works. It is quite a fun problem, but it has taken up far too much of my time this week, given that sitting on the floor with bits of paper is probably going to be much more effective...
[Update: having let it work for a longer run of a few hours, the results are surprisingly good. It has at least managed to satisfy all of the '50' connections, grouping families of up to 6 together, and managing to get other '50' couples onto the same table. And so far, the only examples I can find of people without any friends on a table are people for whom I haven't put in any connection info yet. I'm impressed! But it still lacks that human touch...]