Slow CREATE DATABASE with PostgreSQL on Ubuntu

My new project (Relishment) is an opportunity for me to experiment with different technology. In this case I decided to switch to PostgreSQL from the RDMS I am familiar with MySQL.

So after struggling to learn the user management system in PostgreSQL, ”Ident authentication failed” was stumping me for a while, I finally got Django connected to a database. The next step was to run the test suite, and then wait.

And wait I did, what was a two or three second run on MySQL was a fifteen second run on PostgreSQL. Hardly desirable when I can run my tests up to several times a minute when debugging an issue. A quick poke around established it was “CREATE DATABASE” at the beginning of the test run that was eating all time.

Googling around the suggestions were that my shiny new Ubuntu 11.04 install, which uses the ext4 file system, is less than optimal for how PostgreSQL goes about things. It also turned up a solution, turning fsync off, which is dangerous on a live server but for development it doesn’t matter if you lose data in a crash.

The fix works splendidly, speeds at least matching MySQL, if not beating it.

DO NOT DO THIS ON A LIVE SERVER!!

So first you want to edit your PostgreSQL file:

sudo vi /etc/postgresql/8.4/main/postgresql.conf

Find this line:

#fsync = on

And change it to this:

fsync = off

Save and close the config file and restart PostgreSQL:

sudo service postgresql restart

And you should get a nice speed improvement when running your test cases.

How I learned to stop worrying and love Apache

Okay, not quite. Those of you looking at my HTTP headers will see nginx, but who I am to stand in the way of a catchy title and the chance to rip off Stanley Kubrick?

The title does have a point, however: that all too often we hackers worry about optimising our sites rather than addressing the real issues. There are endless discussions about which configuration better serves our site. We pour over the latest benchmarks, drilling down to the last decimal like a spice seller counting saffron threads. Pondering the pros and cons of various servers/proxies/CGIs/languages is enormously enjoyable to hackers, especially those of an entrepreneurial bent. It tickles our technical bone while allowing fantasies of wads of traffic and overheating servers. But it is all a distraction from the real issues which we all too easily allow to slip from prominence.

If we are to accomplish our goals, site optimisation should be low on the priority list. What we should be working on prior to optimisation is endless. I don’t need to list them; your own guilty conscience is doing that for me right now. It is always useful to maintain a perspective on this as more often than not the work we produce isn’t the most elegant or the fastest way of getting what we want. So if you are like me, your first impulse is to refactor, recode and tweak.

Don’t. There will always time later to improve things once you have your first customer, first investor or your mum has had a look. (Seriously, if your mum doesn’t understand your site what are you doing worrying about SQL vs NOSQL?)

There are of course caveats to all of this. This advice mainly applies to early in the life of a project/feature. At some point traffic on that 500ms query is going to take your site down, but won’t that be a nice problem to have? The other obvious exception is bad design. Keep future optimisation in mind so you don’t paint yourself into a corner and nothing but a major rewrite is going to fix things.

With this in mind go and get something done!

Django and Dreampie

As would be obvious by now I am a dedicated djangonaut (cheesy name I know) and I’ve been using django long enough now that using the ORM (Object Relational Mapping) is becoming second nature. Using SQL for anything but database gymnastics is somewhat annoying as the data isn’t as accessible to python as it is with an ORM. So when I need to get a quick insight into some data or prototype some code I want to be able to use the ORM instead of SQL.

I’ve also fallen in love with dreampie, which is an interactive python shell on steroids. It allows you to type and edit multi-line pieces of python code which you can then run in the inbuilt python shell. It has a fast clean interface and autocompletes attributes from the interpreter so you can get some hacking done quick.

Of course I want to be able to do quick django hacking in dreampie and I worked out a simple way of doing this.

import sys
sys.path.append('/path/to/your/project/')
import settings
from django.core.management import setup_environ
setup_environ(settings)
from a_django_app.models import AModel

Pasting this into dreampie (or loading it from the nice history functionality dreampie provides) sets up the shell with everything it needs to access the ORM (or anything else you use in django for that matter. The first two lines add the project’s path to the list of paths python uses to search for modules and packages. The next imports the project’s settings and activates those settings using the django function setup_environ. After that you can import app models and query away.

You can also use the same snippet of code at the top of standalone scripts to allow them to access the project’s django environment which is useful though it is normally advisable to add a management command instead.

Django Site Admin Emails

There is a neat feature in default behavior of Django. If while rendering a view it throws an uncaught exception it emails all the sysadmins in it’s settings file. This simple little feature is great on several levels.

  1. Repeatedly getting emails about an issue with often enough information to diagnose and fix the problem means you’re less likely to put up with broken windows.
  2. When something goes wrong you’re much more likely to find out about it, as monitoring is often one of the last things to be done on a project.

So it is a warm and fuzzy safety net that reassures you with an empty inbox, I really like it when I come in on Monday morning and I don’t get a single email telling me something barfed over the weekend.

And as an added bonus Django has this wonderful philosophy about exposing the pixie dust it uses to do it’s magic. All it takes is a call to mail_admins():

mail_admins("Made a sale","We just sold a pony for $%d" % pony_price)

And you’re alerting everyone when something good happens. This is great with new low volume projects where every sale is a momentous event. Of course you can use this for alerting the team about errors as well. There is also mail_managers() so you can setup the non technicals to receive relevant alerts with the same amount of effort.

Adding Composite Primary Keys to Django

Composite primary keys are great and relating two records together is best done using them. Adding another field just to identify the relationship makes the table un-normalized (that isn’t a word is it?) and we all want our tables normalized don’t we? So it is common practice for (good) database designs to use them anywhere two records are being related.

Unfortunately Django doesn’t support them right now, at least not without some nasty hacks. This means every time you get your hands on a legacy database there is a high likelihood that there will be composite primary keys are in use and you will be in for a world of pain if you wish to integrate it with Django. I’ve been there and the hacks are nasty, and also mean you can’t use a lot of components such as the admin which makes me sad as the admin is a beautiful thing.

The ticket (http://code.djangoproject.com/ticket/373) has been around for 5 years with only one person taking a decent crack at it and he seems to have stopped working on it around the same time the core team was working on 1.0. So I have started on the path of getting Django to support them. And this doesn’t just mean working on the ORM part of Django, it means getting every part of Django to understand that a primary key doesn’t just mean one column/field. Not a small project by any means but it will show some pay-offs in the short term as it replaces the hacks I currently have in place.

Once I have some code to show it will be up on my github http://github.com/hartror

My 4 favourite development sites

1. Hacker News

I habitually go to hacker news a few times a day, the people are intelligent and the discussion is the right mixture of coding, entrepreneurial and geek. It is also the first site I’ve consistently commented on, the karma feedback from writing an insightful comment is quite effective.

2. Startup Lessons Learned

I work as the head of development at a startup and I refer to this site regularly. The author, Eric Ries, has an excellent writing style that combines the best of narrative and technical writing to highlight the lessons he draws from his experiences in the trenches.

3. Joel on Software

I want to work with this guy, but I am afraid that the only way I will ever get job satisfaction again will be to run my own company, which probably isn’t a bad thing. His articles inform and inspire developers the world over to question how and what they do and generally do an all round better job.

4. Scott Berkun

Scott Berkun’s book Making Things Happen has been my bible in the project managerial baptism of fire I experienced. His website has a varied focus but I always seem to find it applicable to my situation, weather it is dealing with deadlines, people or public speaking.

Why do I seek to be a better developer?

I am continually reading about programming, software engineering and project management, both on the net and in dead tree form, my latest dead tree being Effective C++ by Scott Meyers. This reading is out of a desire to be a better developer, and today I though: but why do I seek to be a better developer? A little bit of the drive to better myself comes from the usual suspects, monetary gain, respect from my peers and suchlike. But for the most part I want to feel like I am doing a good job at something that I love, that how well I perform is important to me for its own sake rather than from some external pressure.

This means that if I hate what I am working on or am unable to complete it to my own satisfaction I have to take care not to fall into a funk, professional pride has its drawbacks (though feel and hope are overwhelmed by its advantages). So to manage this what do I do? Well I am fortunate to have an innumerable number of hobbies that I can draw professional pride from, programming being a major one, cooking being another. And then I always have learning, the discovery of new techniques and coming to new understandings is often the best way to for me to create that feeling of self worth we all seek.