Microservices: But What About Foreign Keys & Transactions?

One of the most impressively sized books on my bookshelf is Introduction to Database Systems, an artefact of my university years. It had an equally “impressive” price tag, or should that be “punishing” given my student wage? I regarded this expensive and weighty tome to contain unquestionable wisdom from on high, and in the very early 2000s it practically was. I had taught myself SQL in the past, but concepts such as normalisation, ACID and referential integrity were new and became to me immutable aspects of a good database.

From where we sit now, basking in the light of dozens of NoSQL database technologies (a terrible name, but we’re stuck with it), this is obviously not true. The hegemony of the RDBMS is over; document databases, graph databases and wide column stores are all widely used and acceptable options. With this revolution came the discarding of many of these “immutable” aspects, the argument being that making this trade off confers an advantage. People make these same trade offs in RDBMS schemas all the time. De-normalisation, for example, might be considered a cowboy move, but its read performance advantage is indisputable.

So what does all this have to do with microservices? Well, trade offs have to be made, and this becomes obvious fairly early on, often when a developer is first introduced to the concept. With a distributed, finely grained architecture spread across different databases and technologies, transactions won’t always be an option, and neither will foreign keys. This is the cost of the speed and agility that a microservice architecture provides.

This is a scary idea, especially for those of us weaned on SQL. What will become of our data? First of all, transactions. Is there a reason your whole system needs to be able to run in a single transaction? More often a web call will generate several data calls, but many of them are read only, and most of them touch only a few tables or rows. The read only calls can probably run outside the transaction, and the others likely centre around some domain concept. That domain concept in turn probably makes sense to collect into a microservice, which can then run a transaction. This won’t always be the case and hard decisions are inevitable somewhere along the line.
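As a rough illustration of that idea, here is a minimal Python sketch of a service that owns one domain concept and keeps its writes inside a local transaction. The orders service, its tables and the place_order function are all hypothetical, and sqlite3 is only standing in for whatever database such a service would really own.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical tables owned solely by an imaginary "orders" service.
    # Note customer_id is a plain integer, not a foreign key.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE order_lines ("
        "order_id INTEGER NOT NULL, sku TEXT NOT NULL, "
        "quantity INTEGER NOT NULL CHECK (quantity > 0))"
    )


def place_order(conn, customer_id, lines):
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the order and its lines are written together or not at all.
    with conn:
        cursor = conn.execute(
            "INSERT INTO orders (customer_id) VALUES (?)", (customer_id,)
        )
        order_id = cursor.lastrowid
        conn.executemany(
            "INSERT INTO order_lines (order_id, sku, quantity) VALUES (?, ?, ?)",
            [(order_id, sku, quantity) for sku, quantity in lines],
        )
    return order_id


def count_orders(conn):
    # A read only call, which does not need to sit inside the write transaction.
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The transaction only ever spans tables this one service owns, which is the trade off being described: anything that crosses a service boundary has to be handled some other way.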

Giving up referential integrity is an easier task as it comes with a big reward. Removing foreign keys and replacing them with API calls means the owner of the data is free to change their internals. As long as the contract with the consumer of the API is obeyed, the owner can change as fast as requirements change, without the consumer also having to be updated. Databases aren’t the only line of defence for referential integrity. Most applications we write already deal with this, and checks often happen in a few layers as data travels through our systems. Without the database enforcing referential integrity we’re relying on our services and applications behaving correctly, something we already do to prevent errors in any case.
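To make that concrete, here is a small Python sketch of a service doing the check a foreign key would otherwise have done, by asking the owner of the data through its API. Everything in it is made up for illustration: the class names, the exists method, and the in-memory client that stands in for a real HTTP call to a customer service.

```python
class UnknownCustomer(Exception):
    """Raised when the owning service does not recognise a customer id."""


class InMemoryCustomerClient(object):
    # Stand-in for an HTTP client talking to a hypothetical customer service.
    def __init__(self, known_ids):
        self._known_ids = set(known_ids)

    def exists(self, customer_id):
        return customer_id in self._known_ids


class OrderService(object):
    def __init__(self, customer_client):
        self._customers = customer_client
        self._orders = []

    def create_order(self, customer_id, sku):
        # The check a foreign key would have made, done through the contract.
        if not self._customers.exists(customer_id):
            raise UnknownCustomer(customer_id)
        order = {"customer_id": customer_id, "sku": sku}
        self._orders.append(order)
        return order
```

The order service only depends on the exists contract, so the customer service could swap its storage without the order service noticing. What this does not give you is protection against a customer being removed after the order is created, which is part of the trade off.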

Everything old is new again. We aren’t dealing with new concepts; distributed systems have always had to face these trade offs. A micro-service architecture makes these trade offs more visible and explicit, creating a tension developers must address. Even if a team chooses a more coarse grained approach, they’ve evaluated what is going to work best for their project, and this can only be a good thing.

Design and Implementation of Microservices Workshop

Today I attended a full day workshop presented by Sam Newman and Scott Shaw on micro-services. Most people seemed pretty up with the why of micro-services; if you’re not, James Lewis and Martin Fowler have talked at length on the subject. The workshop covered a wide range of topics and gave an excellent overview of the how and what of micro-services, something which is still lacking in the literature (but is coming, see below).

I have seen Sam talk before, in fact his talk at Yow! last year was my first introduction to micro-services and I have read an early release copy of his book Building Microservices. A lot of the concepts were very familiar to me, however being a workshop meant that there was significant discussion around those ideas. This gave me a great deal of insight into the thinking around micro-services and how people have dealt with the trade offs and choices they present.

It was especially nice to have confirmation that some of the ways I am heading with Biarri’s architecture are tried and tested paths. This goes doubly for the contract testing library Pact (and Pact.rb) which a good friend of mine has been encouraging me to port to Python for Biarri’s use.

A few other books, videos and tools popped up in the talk which I have noted for further investigation:

  • Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans
  • Implementing Domain-Driven Design by Vaughn Vernon
  • Apache Zookeeper, useful for managing services, something like consul.io which I discovered last week.
  • Postel’s law: Be conservative in what you do, be liberal in what you accept from others (often reworded as “Be conservative in what you send, be liberal in what you accept”). A useful principle for micro-service communication.
  • Hystrix, a “latency and fault tolerance library” by Netflix. Probably overkill for Biarri’s architecture but it could come in handy in the future.
  • 12factor.net, SaaS app principles which I also discovered last week but which re-affirms its usefulness.
  • A video by Stuart Halloway (I think) dealing with real time data and versioning. I did a google and couldn’t find a link so I am going to chase up Sam about it.
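Postel’s law from the list above is easier to see in code than in words, so here is a minimal Python sketch of how it might look for two services exchanging JSON. The message shape and the function names are my own invention, not something from the workshop.

```python
import json


def build_event(customer_id, email):
    # Conservative in what we send: fixed field names, fixed types, tidy values.
    return json.dumps(
        {"customer_id": int(customer_id), "email": str(email).strip().lower()}
    )


def read_event(payload):
    # Liberal in what we accept: ignore fields we don't know about, tolerate
    # ids sent as strings, and cope with a missing or untidy email.
    data = json.loads(payload)
    return {
        "customer_id": int(data["customer_id"]),
        "email": str(data.get("email", "")).strip().lower(),
    }
```

A reader written this way keeps working when the sender adds a new field, which is what lets services evolve independently.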

In all, well worth my time.

Infracoders June Meetup

I’ve set myself the goal of attending one software meetup per week. I enjoy attending conferences, and meetups are like micro-conferences. You meet people, you learn something and often there is beer and food at the end like this evening.

This evening I attended Infracoders (http://www.meetup.com/Infrastructure-Coders/), a devops group focused on tools that make devops easier and more fun. Given my current focus on Biarri’s overhaul of infrastructure and development workflows these sorts of tools are on my mind a fair bit.

The presentations were interesting. First up were Alexey Kotlyarov and Ross Williamson from Infoxchange, talking about some Docker tools called Pallet and Forklift (https://github.com/infoxchange/docker-forklift). Their talk was rather information dense and I found it a little hard to follow. From what I understood with my small knowledge of Docker, the tools automate some common development and deployment tasks in a platform/language agnostic manner. Infoxchange look like they have a similar stack and similar problems to Biarri in many ways, with lots of projects and a fragmented environment, so I will be looking into the tools more deeply.

They also linked to some interesting things in their talk which are worth looking at:

  • Zato (https://zato.io/), a Python ESB and application server. I am not clear on what its value proposition is yet but it has something to do with managing SOA, which is something I am spending a lot of time thinking about right now.
  • The twelve-factor app (http://12factor.net/), a set of guidelines or principles for building SaaS applications. I’d not seen this before and at first glance looks like good reading.
  • Serf (http://serfdom.io), cluster management.

The second presentation was by Colin Panisset from REA Group about Credulous (http://credulous.io/), an AWS credential management system. He was an excellent and amusing speaker, but Credulous solves a problem we don’t have and are unlikely to have in the medium term at Biarri. It does look like a good solution if you have a large team with access to your AWS infrastructure. His decision not to use GPG to solve the key sharing and encryption part of Credulous bothered me a little, but his criticisms of the installation and setup of GPG weren’t without merit. He did highlight some aspects of AWS security that I had not considered and will discuss with my co-workers.

The evening concluded with free beer and dumplings which was nice. I will certainly consider attending again, there seems to be significant overlap with the Devops Melbourne meetup, though perhaps their focuses are different.

2013 Year in Review

A pretty landmark year was 2013.

April 6 first of all, as it was the day I got engaged to the spectacular Li atop the almost as spectacular Mt Sturgeon.

Taken atop the nearby Mt William

The other major change was that I donned Lycra and took up the sport of cycling. This happened later in the year but I have already attempted and completed a ride through the Strathbogie Ranges with Orica-GreenEDGE [GPS record of my ride]. I was majorly stuffed by the end of the ride, I can tell you.

Some good friends got married, or engaged, or had babies. There is a lot of that going around at the moment, so I don’t think I’m alone in thinking it was a pretty big year.

Some Stats/Favourites/Lists

  • Favourite Book: Leviathan Wakes by James S.A. Corey
  • Books read: 24
  • Favourite Movie: I don’t know! Need to use Goodfilms more.
  • Favourite TV: Game of Thrones, Modern Family, Almost Human.
  • Kilometres cycled: 1,100.2 km (4 months of owning a bicycle)
  • Online courses started: 2
  • Online courses finished: 0
  • Conferences attended: 2 (PyconAU and Yow! Melbourne)
  • Conferences talked at: 0
  • New programming languages learned: 0
  • Favourite programming library: Angularjs
  • Girls of my dreams got engaged to: 1

So 2014?

Well the obvious one that will be dominating the first half of the year is getting married in April and then honeymooning in North Africa and Europe. That will likely consume the first four months of the year.

Beyond that I have some aspirations:

  • Read more books: 26
  • Talk at a conference: PyconAU?
  • Properly learn a pure functional language: Scala? Clojure? Haskell? I will leave the term properly vague.
  • Complete a couple of online courses.
  • Finish one of my hobby programming projects.
  • Blog once a week.

The Lost Fleet by Jack Campbell

I don’t think I have read a more cheesy science fiction book since I was a teenager reading early Heinlein novels. The Lost Fleet series is pretty simple: a legendary war hero returns from the dead (found in an escape pod after 100 years) to lead a fleet trapped behind enemy lines.

The setting is well put together, two large human space faring civilisations in a drawn out war where neither side can hold sway. Each book is a series of battles in various solar systems as the fleet fights to escape largely intact.

These fleet actions are some of the best I’ve read, realistic, using believable technology and science. They’re sweeping battles spread across light hours of space employing clever tactics and it makes for some good clean fun.

However, these epic battles are broken up by some weak intrigue with characters who for the most part are little more than cardboard cut outs. Apart from the main character you never get a feel for the other characters. The few there are, and what characterization there is, can be stilted and painful at times. It really holds the series back from greatness, which is a pity as the kernel is solid and enjoyable.

3/5 stars

Attack on Titan

I haven’t watched much anime lately, Planetes and Death Note were the last two I watched through. However a few friends have been raving about Attack on Titan and one of them posted an AMV trailer which made it look really fun so I gave it a try.

I was both happy and disappointed, first the happiness. It is a beautiful production, the setting is gorgeous, the character design is solid and the art work is world class. The plotting is well paced, able to introduce a lot of characters without my head spinning. However the plot itself doesn’t leave the familiar rut anime has trodden in the whole 20 years I’ve been watching it. It doesn’t stifle my enjoyment but it does make the diamonds in the rough, Planetes I am looking at you, shine just that much brighter.

Board games I’ve played lately

I’ve played a bunch of new games lately that are worth sharing.

Race for the Galaxy

Economic euro game in a space card game disguise. Played a few 2 player games today and I really like it, now I just have to work out how to win a game.

http://boardgamegeek.com/boardgame/28143/race-for-the-galaxy

Small World

Simple territory building game with some neat mechanics. You conquer territory to get victory points as you cycle through randomly generated races to maximise your points.

http://boardgamegeek.com/boardgame/40692/small-world

San Juan

Another economic euro game very similar in play to Race for the Galaxy but with a renaissance city building theme.

http://boardgamegeek.com/boardgame/8217/san-juan

Dominion: Guilds

The newest of 8 expansions for my favourite board game. The cards centre around little gold pieces you save to spend later.

http://boardgamegeek.com/boardgameexpansion/137166/dominion-guilds

A CP Solution for XKCD NP Complete Restaurant Order

I’ve been messing around with Constraint Programming (CP) the past week. A few people at work have tried it out on some real world problems lately but it didn’t seem to stand up when given a lot of data and variables. This seemed sad, as the declarative nature of CP attracts me, and it strikes me there must be a set of problems that it could be used for, so it deserved a look.

The first CP model I wrote by myself is stupidly simple, but since my non techie fiancée understood the code I figure it is a good example. The problem is as described by XKCD below: select some appetizers from the menu so that the total cost adds up to $15.05.

NP Complete

I modeled the problem in Minizinc, a declarative language for modeling CP problems. I am just learning so if you know Minizinc and I’ve done something dumb don’t judge me too harshly.

Firstly we declare a bunch of variables that the solver needs to find values for. We provide the solver a domain in which the variables must lie, zero to ten for all variables in this case. Each of these variables represents the number of times as part of a solution we buy an item to add up to $15.05.

var 0..10: fruit;
var 0..10: fries;
var 0..10: salad;
var 0..10: wings;
var 0..10: sticks;
var 0..10: sampler;

Then we declare a constraint, something that the solver must meet to solve the problem. And in this case we say that a sum of the cost of the items (converted to cents) multiplied by the number of items in the solution must equal the required 1505 cents (that English version could be taken a couple of ways, the maths below makes better sense).

constraint fruit*215 + fries*275 + salad*335 + wings*355 + sticks*420 + sampler*580 == 1505;

We tell the solver to solve to satisfy.

solve satisfy;

And provide a format in which to output the solution.

output ["fruit=", show(fruit), "\t fries=", show(fries), 
        "\t salad=", show(salad), "\t wings=", show(wings),
        "\t sticks=", show(sticks), "\t sampler=", show(sampler)];

Running the model gives us:

$ minizinc --all-solutions xkcd.mzn
fruit=7     fries=0     salad=0     wings=0     sticks=0     sampler=0
----------
fruit=1     fries=0     salad=0     wings=2     sticks=0     sampler=1
----------
==========

So there we go, all the possible solutions to the poor waiter’s problem! We know it is all of the solutions because of the “==========” minizinc cryptically places at the end of its output. Of course this problem is easy to brute force with a couple of for loops; there aren’t that many combinations.

But it is a start along what I hope will be a fruitful path.
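For comparison, the brute force approach mentioned above might look something like this in Python. This is my own sketch rather than anything generated from the Minizinc model; the prices are the same ones in cents, and instead of a fixed 0..10 domain each item is capped at the number of times it could fit into the total on its own, which never exceeds 10 here.

```python
from itertools import product

# Menu prices in cents, the same figures used in the Minizinc constraint.
PRICES = {
    "fruit": 215,
    "fries": 275,
    "salad": 335,
    "wings": 355,
    "sticks": 420,
    "sampler": 580,
}
TARGET = 1505


def solve(prices=PRICES, target=TARGET):
    names = list(prices)
    # An item can never appear more often than the target allows on its own.
    ranges = [range(target // prices[name] + 1) for name in names]
    solutions = []
    # product() plays the part of the nested for loops, one loop per menu item.
    for counts in product(*ranges):
        total = sum(count * prices[name] for count, name in zip(counts, names))
        if total == target:
            solutions.append(dict(zip(names, counts)))
    return solutions


if __name__ == "__main__":
    for solution in solve():
        print(solution)
```

It should find the same two combinations as the solver: seven mixed fruit, or one mixed fruit, two hot wings and a sampler plate. The difference is that here I had to write the search myself, whereas in the CP model I only described the problem.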

Update: as people have noted on HN and Reddit I originally screwed up transcribing the price for salad which produced a couple of extra solutions. Fixed that now.

WTForms and Cherrypy 3.1

I have been trialling WTForms, an HTML form input and validation library for Python, with a project I am working on. Much to my irritation, however, WTForms and Cherrypy don’t play nicely in one small area. Using wtforms.FileField with the validator wtforms.validators.Required will always fail.

Cherrypy 3.1 (but not 3.2, interestingly) uses the Python built in cgi.FieldStorage to handle file uploads. The validator fails because, in Python 2.6 and 2.7 at least, the code for cgi.FieldStorage.__nonzero__ [1] only checks self.list and ignores self.file, which is where the data is (at least for Cherrypy 3.1), so an uploaded file evaluates as false. No idea why this is the case; Google gives no love on the why of this issue.

There has been an issue raised with the WTForms guys about the same case with Pylons but the long and the short of it is the developers don’t want to add special cases for the various frameworks. Special cases are the bane of a coder’s existence, they bloat otherwise lean and understandable code and cause maintenance nightmares so I do understand.

So what about me? Moving to Cherrypy 3.2 is an option but I don’t want to deal with the worry of that migration right now. An easy fix on my end is to write a custom validator and use it in place of the built in one. But what about the next project and the dozens of other people on my team who have to remember to use the custom validator for file upload? I might need to continue looking at form libraries; at least there are a lot of them!
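For what it is worth, the custom validator mentioned above could be sketched along these lines. WTForms validators are just callables that take the form and the field and raise ValidationError, so that part is standard. The FileRequired name is hypothetical, and the test for an empty upload (no file attribute, or an empty filename) is my assumption about how an empty file input arrives, not something I have confirmed against Cherrypy 3.1.

```python
try:
    from wtforms.validators import ValidationError
except ImportError:
    # Fallback so the sketch still runs where WTForms is not installed.
    class ValidationError(ValueError):
        pass


class FileRequired(object):
    """Hypothetical stand-in for Required on file upload fields."""

    def __init__(self, message="This field is required."):
        self.message = message

    def __call__(self, form, field):
        upload = field.data
        # Deliberately avoid "if not upload": a FieldStorage can evaluate
        # as false even when a file was sent.
        # Assumption: an empty upload has no file object or a blank filename.
        has_file = getattr(upload, "file", None) is not None
        has_name = bool(getattr(upload, "filename", ""))
        if upload is None or not has_file or not has_name:
            raise ValidationError(self.message)
```

It would be passed in the field's validators list in place of Required. That still leaves the real problem, which is everyone having to remember to use it.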

[1] http://hg.python.org/cpython/file/9f8771e09052/Lib/cgi.py line 602

Cardinal Pell

Poor Cardinal Pell you’re right. Your church is unfairly being singled out for criticism and we should take your word that decades of systemic failures by your organisation have been rectified.