A CP Solution for XKCD NP Complete Restaurant Order

I’ve been messing around with Constraint Programming (CP) the past week. A few people at work have tried it out on some real world problems lately but it didn’t seem to stand up when given a lot of data and variables. This seemed sad as the declarative nature of CP attracts me and it strikes me there must be a set problems that it could be used for and it deserved a look.

The first CP model I wrote by myself is stupidly simple but since my non techie fiance understood the code I figure it is a good example. The problem is as described by XKCD below, select some appetizers from the menu so that the total cost adds up to $15.05.

NP Complete

I modeled the problem in Minizinc, a declarative language for modeling CP problems. I am just learning so if you know Minizinc and I’ve done something dumb don’t judge me too harshly.

Firstly we declare a bunch of variables that the solver needs to find values for. We provide the solver a domain in which the variables must lie, zero to ten for all variables in this case. Each of these variables represents the number of times as part of a solution we buy an item to add up to $15.05.

var 0..10: fruit;
var 0..10: fries;
var 0..10: salad;
var 0..10: wings;
var 0..10: sticks;
var 0..10: sampler;

Then we declare a constraint, something that the solver must meet to solve the problem. And in this case we say that a sum of the cost of the items (converted to cents) multiplied by the number of items in the solution must equal the required 1505 cents (that English version could be taken a couple of ways, the maths below makes better sense).

constraint fruit*215 + fries*275 + salad*335 + wings*355 + sticks*420 + sampler*580 == 1505;

We tell the solver to solve to satisfy.

solve satisfy;

And provide a format in which to output the solution.

output ["fruit=", show(fruit), "\t fries=", show(fries), 
        "\t salad=", show(salad), "\t wings=", show(wings),
        "\t sticks=", show(sticks), "\t sampler=", show(sampler)];

Running the model gives us:

$ minizinc --all-solutions xkcd.mzn
fruit=7     fries=0     salad=0     wings=0     sticks=0     sampler=0
fruit=1     fries=0     salad=0     wings=2     sticks=0     sampler=1

So there we go, all the possible solutions to the poor waiters problem! We know it is all of the solutions because of the “==========” minizinc cryptically places at the end of its output. Of course this problem is easy to brute force with a couple of for loops, there aren’t that many combinations.

But it is a start along what I hope will be a fruitful path.

Update: as people have noted on HN and Reddit I originally screwed up transcribing the price for salad which produced a couple of extra solutions. Fixed that now.

WTForms and Cherrypy 3.1

I have been trialling WTForms, a HTML form input and validation library for Python with a project I am working on. Much to my irritation however WTForms and Cherrypy don’t play nicely in one small area. Using wtforms.FileField with the validator wtforms.validators.Required will always fail.

Cherrypy in 3.1 (but not 3.2 interestingly) uses the Python built in cgi.FieldStorage to handle file uploads. In Python 2.6 and 2.7 at least this is beause the code for cgi.FieldStorage.__nonzero__ [1] only checks self.list and ignores self.file which is where the data is (at least for Cherrypy 3.1). No idea why this is the case, google gives no love on the why of this issue.

There has been an issue raised with the WTForms guys about the same case with Pylons but the long and the short of it is the developers don’t want to add special cases for the various frameworks. Special cases are the bane of a coder’s existence, they bloat otherwise lean and understandable code and cause maintenance nightmares so I do understand.

So what about me? Moving to Cherrypy 3.2 is an option but I don’t want to deal with the worry of that migration right now. An easy fix on my end is to write a custom validator and use it in place of the built in one. But what about the next project and the dozens of other people on my team who have to remember to use the custom validator for file upload? I might need to continue looking at form libraries, at least there is a lot of them!

[1] http://hg.python.org/cpython/file/9f8771e09052/Lib/cgi.py line 602

Github, Rails and Interface Design

All the excitement over the Rails and Github hack reminded me about Scott Meyers (author of Effective C++ et al.) who has talked about the subject of interface design many times (including in the aforementioned book) and his perspective puts the ball squarely in the Rails team’s court.

Let’s make the reasonable assumption that your clients—the people using your interfaces— are trying to do a good job. They’re smart, they’re motivated, they’re conscientious. They’re willing to read some documentation to help them understand the system they’re using. They want things to behave correctly.

That being the case, if they make a mistake when using your interface, it’s your fault. We’re assuming they’re doing their best—they want to succeed. If they fail, it’s because you let them. So, if somebody uses your interface incorrectly, either they’re working hard at it (less likely) or your interface allowed them to do something easy that was not correct (more likely). This puts the shoe on the foot not used to wearing it: it means that responsibility for interface usage errors belongs to the interface designer, not the interface user.

Source: Scott Meyers: The Most Important Design Guideline?

Slow CREATE DATABASE with PostgreSQL on Ubuntu

My new project (Relishment) is an opportunity for me to experiment with different technology. In this case I decided to switch to PostgreSQL from the RDMS I am familiar with MySQL.

So after struggling to learn the user management system in PostgreSQL, ”Ident authentication failed” was stumping me for a while, I finally got Django connected to a database. The next step was to run the test suite, and then wait.

And wait I did, what was a two or three second run on MySQL was a fifteen second run on PostgreSQL. Hardly desirable when I can run my tests up to several times a minute when debugging an issue. A quick poke around established it was “CREATE DATABASE” at the beginning of the test run that was eating all time.

Googling around the suggestions were that my shiny new Ubuntu 11.04 install, which uses the ext4 file system, is less than optimal for how PostgreSQL goes about things. It also turned up a solution, turning fsync off, which is dangerous on a live server but for development it doesn’t matter if you lose data in a crash.

The fix works splendidly, speeds at least matching MySQL, if not beating it.


So first you want to edit your PostgreSQL file:

sudo vi /etc/postgresql/8.4/main/postgresql.conf

Find this line:

#fsync = on

And change it to this:

fsync = off

Save and close the config file and restart PostgreSQL:

sudo service postgresql restart

And you should get a nice speed improvement when running your test cases.

How I learned to stop worrying and love Apache

Okay, not quite. Those of you looking at my HTTP headers will see nginx, but who I am to stand in the way of a catchy title and the chance to rip off Stanley Kubrick?

The title does have a point, however: that all too often we hackers worry about optimising our sites rather than addressing the real issues. There are endless discussions about which configuration better serves our site. We pour over the latest benchmarks, drilling down to the last decimal like a spice seller counting saffron threads. Pondering the pros and cons of various servers/proxies/CGIs/languages is enormously enjoyable to hackers, especially those of an entrepreneurial bent. It tickles our technical bone while allowing fantasies of wads of traffic and overheating servers. But it is all a distraction from the real issues which we all too easily allow to slip from prominence.

If we are to accomplish our goals, site optimisation should be low on the priority list. What we should be working on prior to optimisation is endless. I don’t need to list them; your own guilty conscience is doing that for me right now. It is always useful to maintain a perspective on this as more often than not the work we produce isn’t the most elegant or the fastest way of getting what we want. So if you are like me, your first impulse is to refactor, recode and tweak.

Don’t. There will always time later to improve things once you have your first customer, first investor or your mum has had a look. (Seriously, if your mum doesn’t understand your site what are you doing worrying about SQL vs NOSQL?)

There are of course caveats to all of this. This advice mainly applies to early in the life of a project/feature. At some point traffic on that 500ms query is going to take your site down, but won’t that be a nice problem to have? The other obvious exception is bad design. Keep future optimisation in mind so you don’t paint yourself into a corner and nothing but a major rewrite is going to fix things.

With this in mind go and get something done!

Django and Dreampie

As would be obvious by now I am a dedicated djangonaut (cheesy name I know) and I’ve been using django long enough now that using the ORM (Object Relational Mapping) is becoming second nature. Using SQL for anything but database gymnastics is somewhat annoying as the data isn’t as accessible to python as it is with an ORM. So when I need to get a quick insight into some data or prototype some code I want to be able to use the ORM instead of SQL.

I’ve also fallen in love with dreampie, which is an interactive python shell on steroids. It allows you to type and edit multi-line pieces of python code which you can then run in the inbuilt python shell. It has a fast clean interface and autocompletes attributes from the interpreter so you can get some hacking done quick.

Of course I want to be able to do quick django hacking in dreampie and I worked out a simple way of doing this.

import sys
import settings
from django.core.management import setup_environ
from a_django_app.models import AModel

Pasting this into dreampie (or loading it from the nice history functionality dreampie provides) sets up the shell with everything it needs to access the ORM (or anything else you use in django for that matter. The first two lines add the project’s path to the list of paths python uses to search for modules and packages. The next imports the project’s settings and activates those settings using the django function setup_environ. After that you can import app models and query away.

You can also use the same snippet of code at the top of standalone scripts to allow them to access the project’s django environment which is useful though it is normally advisable to add a management command instead.

Django Site Admin Emails

There is a neat feature in default behavior of Django. If while rendering a view it throws an uncaught exception it emails all the sysadmins in it’s settings file. This simple little feature is great on several levels.

  1. Repeatedly getting emails about an issue with often enough information to diagnose and fix the problem means you’re less likely to put up with broken windows.
  2. When something goes wrong you’re much more likely to find out about it, as monitoring is often one of the last things to be done on a project.

So it is a warm and fuzzy safety net that reassures you with an empty inbox, I really like it when I come in on Monday morning and I don’t get a single email telling me something barfed over the weekend.

And as an added bonus Django has this wonderful philosophy about exposing the pixie dust it uses to do it’s magic. All it takes is a call to mail_admins():

mail_admins("Made a sale","We just sold a pony for $%d" % pony_price)

And you’re alerting everyone when something good happens. This is great with new low volume projects where every sale is a momentous event. Of course you can use this for alerting the team about errors as well. There is also mail_managers() so you can setup the non technicals to receive relevant alerts with the same amount of effort.

Adding Composite Primary Keys to Django

Composite primary keys are great and relating two records together is best done using them. Adding another field just to identify the relationship makes the table un-normalized (that isn’t a word is it?) and we all want our tables normalized don’t we? So it is common practice for (good) database designs to use them anywhere two records are being related.

Unfortunately Django doesn’t support them right now, at least not without some nasty hacks. This means every time you get your hands on a legacy database there is a high likelihood that there will be composite primary keys are in use and you will be in for a world of pain if you wish to integrate it with Django. I’ve been there and the hacks are nasty, and also mean you can’t use a lot of components such as the admin which makes me sad as the admin is a beautiful thing.

The ticket (http://code.djangoproject.com/ticket/373) has been around for 5 years with only one person taking a decent crack at it and he seems to have stopped working on it around the same time the core team was working on 1.0. So I have started on the path of getting Django to support them. And this doesn’t just mean working on the ORM part of Django, it means getting every part of Django to understand that a primary key doesn’t just mean one column/field. Not a small project by any means but it will show some pay-offs in the short term as it replaces the hacks I currently have in place.

Once I have some code to show it will be up on my github http://github.com/hartror

My 4 favourite development sites

1. Hacker News

I habitually go to hacker news a few times a day, the people are intelligent and the discussion is the right mixture of coding, entrepreneurial and geek. It is also the first site I’ve consistently commented on, the karma feedback from writing an insightful comment is quite effective.

2. Startup Lessons Learned

I work as the head of development at a startup and I refer to this site regularly. The author, Eric Ries, has an excellent writing style that combines the best of narrative and technical writing to highlight the lessons he draws from his experiences in the trenches.

3. Joel on Software

I want to work with this guy, but I am afraid that the only way I will ever get job satisfaction again will be to run my own company, which probably isn’t a bad thing. His articles inform and inspire developers the world over to question how and what they do and generally do an all round better job.

4. Scott Berkun

Scott Berkun’s book Making Things Happen has been my bible in the project managerial baptism of fire I experienced. His website has a varied focus but I always seem to find it applicable to my situation, weather it is dealing with deadlines, people or public speaking.

Why do I seek to be a better developer?

I am continually reading about programming, software engineering and project management, both on the net and in dead tree form, my latest dead tree being Effective C++ by Scott Meyers. This reading is out of a desire to be a better developer, and today I though: but why do I seek to be a better developer? A little bit of the drive to better myself comes from the usual suspects, monetary gain, respect from my peers and suchlike. But for the most part I want to feel like I am doing a good job at something that I love, that how well I perform is important to me for its own sake rather than from some external pressure.

This means that if I hate what I am working on or am unable to complete it to my own satisfaction I have to take care not to fall into a funk, professional pride has its drawbacks (though feel and hope are overwhelmed by its advantages). So to manage this what do I do? Well I am fortunate to have an innumerable number of hobbies that I can draw professional pride from, programming being a major one, cooking being another. And then I always have learning, the discovery of new techniques and coming to new understandings is often the best way to for me to create that feeling of self worth we all seek.