Tuesday, July 15, 2008

AMQP's Semantic Model and Mismatching

My last post with real technical content (on AMQP and standards) attracted some notice, so I wanted to follow up with some specific issues that I find with AMQP as a programming and application semantic model, as opposed to a low-level description of the protocol itself.

A naive question might be "why in the world is there some type of semantic model in an application protocol? Surely it's just specifying how stuff goes back and forth?" But in almost all protocols there is some type of semantic model embedded: IMAP involves Messages and Folders and connections and IDLE and all that; HTTP has Headers and Cookies and all that. There's no way to provide a completely implementation-neutral protocol without some type of semantic model, because if you strive for it, you're striving for a protocol that doesn't do anything.

So that being said, we'll focus on AMQP as an application semantic model. I'll be comparing and contrasting it with the two systems/semantic models that I've used the most in a production environment: Tibco Rendezvous and JMS (personified by commercial examples from Tibco EMS and Progress SonicMQ). As I haven't used any AMQP system in anger, I won't be discussing anything specific about any particular AMQP system (e.g. Qpid, RabbitMQ). Everything's based on the 0-10 AMQP specification.

First off, let's address Virtual Hosts. Why are these useful at all? The only thing that I see is that they allow you to have the same port open for multiple uses with the same authentication system. But this is really only particularly useful because AMQP recommends a particular listening port. Systems like SonicMQ allow you to have multiple brokers share their administration domain (though on the same machine you'd have to have multiple TCP ports open), which is a much cleaner concept. Virtual Hosts look like a solution to a problem nobody has. Just listen to different ports, and job's done.

Now there's the matter of the Exchange/Queue model. I had a conversation with the Progress people about this, and I don't think they fully got it, because the use of the term Queue here throws people from JMS-land off: they immediately think JMS Queue, which is a really horrible concept, rather than generic queues.

For clarity for those not familiar with AMQP, an Exchange is an inbound message router, and a Queue is a message delivery vehicle. Messages come into an Exchange, which directs them to Queues. Each Exchange is responsible for routing messages to the consumers that want them, and uses the logic embedded in it to support this. To keep them distinct from JMS Queues, I'm going to refer to AMQP Queues as Consumption Queues from here on.

AMQP defines a number of Exchange types, including:
  • Direct, where a message flows through to every Consumption Queue whose binding key exactly matches the message's routing key
  • Fanout, where all messages flow through to all bound Consumption Queues regardless of any other criteria
  • Topic, where messages flow through based on globbed Topic hierarchies (like foo.bar.*.baz)
  • Headers, where header properties determine the Consumption Queues.
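As a rough illustration of those four matching rules, here is a minimal in-memory sketch in Python. It is not taken from any AMQP client library: the function names are made up for this post, and it assumes the usual topic wildcards ('*' for exactly one word, '#' for zero or more) and the 'x-match' binding argument ('all' or 'any') for Headers exchanges.

```python
def direct_match(binding_key, routing_key):
    """Direct: deliver only when the two keys are identical."""
    return binding_key == routing_key


def fanout_match(binding_key, routing_key):
    """Fanout: deliver to every bound queue, ignoring the keys."""
    return True


def topic_match(pattern, routing_key):
    """Topic: keys are dot-separated words; '*' matches exactly one
    word and '#' matches zero or more words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may swallow zero or more of the remaining words.
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return p[0] in ("*", k[0]) and match(p[1:], k[1:])
    return match(pattern.split("."), routing_key.split("."))


def headers_match(binding_args, headers):
    """Headers: compare message headers with the binding arguments.
    'x-match' of 'any' needs one pair to match; otherwise all must."""
    wanted = {k: v for k, v in binding_args.items() if k != "x-match"}
    hits = [headers.get(k) == v for k, v in wanted.items()]
    return any(hits) if binding_args.get("x-match") == "any" else all(hits)
```

Note that each function looks at either the routing key or the headers, never both, which matters for the JMS comparison.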

So to start out with, let's look at how we'd provide JMS-style semantics. Oh, wait, we kinda can't: JMS Topics are routed via a combination of Topic and Headers, because a consumer can arbitrarily decide what to select based on any combination of the two, and there's no Topic And Headers exchange type defined in the spec.
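To make the mismatch concrete, here is a small Python sketch of what a JMS-style subscription effectively asks for: a topic plus a predicate over the message headers, both evaluated on the consumer's behalf. The Subscription class and its wants method are hypothetical names for illustration, and the Python callable is only a stand-in for a real JMS message selector, which is an SQL-like string.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Subscription:
    topic: str                                  # topic the consumer subscribed to
    selector: Callable[[Dict[str, Any]], bool]  # stand-in for a JMS selector

    def wants(self, topic: str, headers: Dict[str, Any]) -> bool:
        # Both the topic and the header predicate must hold,
        # and both are chosen by the consumer.
        return topic == self.topic and self.selector(headers)


# A consumer that only wants large EMEA equity orders.
sub = Subscription(
    topic="orders.equity",
    selector=lambda h: h.get("region") == "EMEA" and h.get("qty", 0) > 1000,
)
```

A single Topic exchange can express the first half of that test and a single Headers exchange the second, but neither standard type expresses the conjunction.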

However, the problem goes much deeper, in that the choice of exchange is determined by the publisher. That's just wrong. As a publisher, I shouldn't be deciding on the routing rules for my messages except in the Direct case. I should only be deciding that I'm going to send messages and it should be 100% in the control of the consumers how they're filtered. Those semantics are exactly what JMS requires, and are exactly what all distributed MOM systems require (remember: those rely on publishers just throwing bits onto the wire; publishers don't even necessarily know if there are any consumers listening). Consumers can declare Bindings between their Consumption Queues and the Exchanges, but they can't control the type of exchange to which a publisher is publishing.

This Matters.

It may seem like a small semantic issue, but most of the cases where I really care about the semantics of my MOM system are when things are going wrong and I'm live in production and have traders screaming down the phone to the front-line support people. Then, I need the flexibility of controlling things at a purely consumptive level, because when AMQP gets going, I will have no control over many publishers in my enterprise.

Let me elaborate slightly on that point. Assume that AMQP goes gangbusters. Then you'll start to see many systems that currently support things like Tibco Rendezvous (think Reuters data feeds, Sybase Replication Server, stuff like that) supporting publishing out to AMQP. And they'll be commercial systems, and I won't be able to control them in any way. Therefore, if they do something wonky with their Exchanges for their publishers, I won't be able to change them in any way. And I'll be royally screwed if they choose an Exchange type I don't like.

Oh, and I also don't like that clients specify their own temp queue names. Why not have the broker do it? You may say "choose a UUID" but honestly, I've had enough run-ins with less skilled developers to actually want to limit their ability to screw things up. Again, a solution seeking a problem.
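What I'd rather see is something along these lines, sketched in Python. This is a hypothetical in-memory broker, not an AMQP API: the Broker class, the declare_temp_queue method, and the "tmp." prefix are all invented for illustration.

```python
import uuid


class Broker:
    """Hypothetical broker that owns temp queue naming."""

    def __init__(self):
        self.queues = {}

    def declare_temp_queue(self):
        # The broker picks the name, so a client can't collide with
        # another client's queue or reuse a name by accident.
        name = "tmp." + uuid.uuid4().hex
        self.queues[name] = []
        return name
```

The client just asks for a temp queue and uses whatever name comes back, which leaves far less room for a developer to get it wrong.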

There are a lot of really nice things involved in the spec, but these two things are massive semantic problems to me that should be addressed.