Monday, July 28, 2008

Real MOM is Hard; Let's Use XMPP!

Dare Obasanjo brings up an argument that I've actually heard in the past: "You don't need a real message oriented middleware system, just use Jabber/XMPP!" Sigh.

As near as I can tell, this comes down to one of four possible rationales, all of which are equally likely:
  1. Many current web developers have never worked with enterprise tools sufficiently to understand that Message Oriented Middleware actually exists, and is worth using. Considering the number of times I've seen, even in Java applications, people going through massive coding gymnastics to avoid JMS or any similar solution, this would not surprise me. The LAMP stack defines itself almost directly by opposition to traditional "enterprise" tools, and while that's a good thing in a lot of respects, it means that users most comfortable with that stack are pretty ignorant of the good stuff that's out there.
  2. Microsoft has never made a reasonable MOM system, and anyone primarily familiar with Microsoft tools has thus never used one, given that the Microsoft development ecosystem is really defined by slavish acceptance of strict vendor lock-in.
  3. XMPP is more web-friendly (ooh! Well-defined XML!) and thus a better match for web developers, in their opinion. Again, much like #1, I just don't buy this.
  4. AMQP isn't done yet, and isn't a real standard yet. Therefore, people who are at least partial to avoiding vendor lock-in can't provide it as a protocol that can be used. XMPP is done and available today.
Fundamentally what you want is a proper Message Oriented Middleware solution here.

While you can use XMPP for this, why not just use AMQP or JMS or some other similar system designed for precisely this type of many-to-many matching?

Saturday, July 26, 2008

Nice UI feature in WebKit nightlies (Updated With My Stupidity)

Update: I am an idiot. Turns out this is the default behavior in all versions of Safari, which, although I've used a Mac for years, I never used enough to notice (I just used Firefox). I am a loser. Also, this functionality rocks.

Original, Misguided Post:
I've been using WebKit nightlies as my primary browser on my Mac at home for a while now, and it works out pretty darn well.

Sometime in the last couple of revisions (yes, I do download whenever my launch page tells me to) they've changed the incremental search functionality, very much for the better. When it's on, the page you're looking at dims, and as you type, the incremental search is highlighted.

It's a little more elegant than FF 3 I think, in that it's pretty clear that you're in the incremental search mode (by dimming the window) and not in normal navigation mode, and the highlighting of the search term was very well done.

Honestly, I didn't think something so simple could make me go Wow, but in this case it really did work out quite well.

Tuesday, July 22, 2008

Open Source Business Strategies

Or, Find A Way To Get My CTO To Pay You

A couple of months ago I was meeting with the CTO of a tech startup and one of his sales guys trying to sell me on their latest and greatest Silver Bullet (and yes, I'm going to post about it, but not for this particular entry). I had already done some background evaluation on the technology and the founding team (Silicon Valley is a remarkably small place, even 4 years divorced from it; I managed to get two opinions from two people who had worked directly with said CTO in a matter of hours), and had already kicked the tires a little bit. Not enough to sign up to use it in production, but enough to feel that it actually did something useful, and enough to warrant my actually doing some technical POCs.

So we went through quite a bit of technical detail, and the CTO and I got along well, because he was an actual technologist (and had written much of the code for their go-live technology), not just a marketing droid with a technical title. Good sign. The sales guy shut up, quite rightly realizing that his speaking would be counter-productive for this part of the sale for this type of company. Another good sign.

[I don't like dealing with sales people: I don't have signing authority, so I'm useless to them; they have no technical knowledge whatsoever, so they're useless to me. A good software sales guy, trying to sell to me, does nothing but sit there and make sure that nobody's saying the wrong thing at the wrong time.]

So then the sales guy pipes up. "Wow, sounds like you like this, why don't you write me a big cheque?"

Uhm, for what?

Look, the technology may be fundamentally radical, it may change the world, in the immortal words of JWZ, it might even get me laid, but you haven't convinced me to pay you a single pound yet.

"We're selling insurance. Surely you want insurance on your production systems, right? You wouldn't fail to insure your home, would you?"

Sorry, wrong analogy. We don't insure things like that. That's not how we roll.

[And for the record, you're not selling insurance. Insurance is a contract where, if your technology fails, you pay us a monstrous amount of money; that's called Indemnification, and you'll be lying if you tell me your lawyers haven't written in a clause against implicit or explicit Indemnification in your contracts. What you're selling is Pre-Paid Technology Phone Relations.]

My firm hires as near to super-star status as we can. Yes, I know many firms think that, but there's a lot of dreck getting by in software engineering in general, and in the city in particular. I don't think we have a 100% success factor (we don't; nobody does; if you think you do, you're wrong), but we do pretty well.

As part of that, if you support a production system, you support that system. That means that you have to understand the technologies that you're working with to a really low depth, enough to solve really tricky problems when they arise. Because the simple fact is that no matter how good your SLA is, getting someone to remote diagnose a problem over a telephone line during a production outage with traders yelling at you is impossible. From thorough, deep understanding of your technology stack and how it's rolled out, you can solve most problems far faster than you ever could if you just said, "I don't need to know how X works, if it goes wrong in production, I can just call the vendor and they'll hold my hand." I suppose you could do that if you suck, but I don't suck.

This means that support is mostly useful pre-production, and post-outage post-mortem. In the middle of the crisis, it's less than useless. If you can't fix it, you shouldn't be running it.

Plus, this intentionally limits the scope of technologies that you can reasonably roll out into our production environments. You want to use Technology X? First, you have to understand it well enough to know that if you're the only person using it, if it ever goes wrong, you get woken up. Doesn't matter where you are, what time it is, you get woken up. And then someone really yells at you. So now you have to convince at least one other senior technical staff member that this technology is so good that they should use it on their projects so that there's a bigger base of knowledge. Good luck.

So only the cream rises (mostly: we've ended up as well with a lot of "I have a hammer and only a hammer, so I'm gonna hit stuff" syndrome; we also miss out on some really exceptional technologies because if I get to add a screwdriver, you get to add an adze, and Bob over there gets to add a hack saw, and pretty soon we all wish we just had hammers again).

Anyway, back to the vendor.

I explain this to them, and say, "look, I come from an open source background. Heck, I even founded an open source company [IP locked in purgatory forever, don't bother looking for it]; I believe in the tragedy of the commons. But I'm not signing contracts. I make technical recommendations. My CTO signs contracts. Give me something to go to him with other than charity, which is what a production support agreement fundamentally is." They really didn't have a good answer for that at the time.

And that's the thing. Assume I were to go to him; he's a smart guy (which is why he shits more than my salary). Imaginary Conversation:
Kirk: "Here's a really cool Open Source technology. Let's give the authors money."
CTO: "Why?"
Kirk: "Tragedy of the Commons, yo!"
CTO: "Huh? Dude, quit being a commie. We're heartless financial bastards here."
Kirk: "Uhm, the tragedy of the commons is at its heart one of the most positively capitalist fables ever, particularly since it speaks of the result of costless negative externalities..."
CTO: "Save it for the economists. You're a geek. Why should I sign a cheque?" [Ed. Note: he speaks in Bri'ish, notwithstanding his use of the Californianism "dude"]
Kirk: "Uhm, insurance?"
CTO: "I pay you far more than you're worth if you + source code can't figure out a production outage faster than some support muppet on the phone can."
Kirk: "Righto. Please allow me to exit before you question the size of my next bonus."

So here's the deal. Let's say you're an open source company, and you're trying to figure out how to sell yourself to me in the Money (rather than the "hey, Kirk, try out Technology X") way. What you have to do is give me something tangible iff I give you money. It's just that simple. Split licensing doesn't work for us (and don't give me any crap about how there are multiple companies in any financial services organization and tech crossing financial boundaries and blah blah blah; it gets old for anything other than the most insane organizations). Support doesn't work for us.

Give me a cookie.

Seriously. Give me something I only get if I pay you.

[And, seriously, give me a cookie. I really like cookies. Millies down at Liverpool Street station are my favorite, but you can also go to Bens in Leadenhall Market. Some of my less American coworkers like them, but they're like muffintops and not cookies. Next vendor to meet me with a couple of packs of Millies Cookies gets massive props.]

It might be development tools (which I'm not going to use anyway, but I can pitch to grads or somebody who cares about that stuff). It might be support tools (which I really really like; see note above that our technologists also do a lot of support). It might be domain-specific functionality relating to the Insane Technology Stack I have to run at work and optimizations for that. Heck, figure out how to leverage GPGPUs or Infiniband or whatever. Just find something that I'm going to have to pay you for, and make sure it's of merit to me. Then sell me that with my support contract.

Now, you're not going to get the sale first-off, because now it's going to run like this:
Kirk: "Here's a really cool technology. Let's give them money."
CTO: "Why?"
Kirk: "Support, and they give us Really Cool Stuff if we pay them. I mean, we can use them no matter what, but Really Cool Stuff is really cool and I'd like it. Plus, the vendor gave the whole FOTech team cookies, so I'm going to pimp whatever they're selling."
CTO: "Use it in anger first, then tell me if you still like it. Don't talk to me until you've had your first production crisis with it and still like it after that."
Kirk: "Yessir, I'm off to code!"

See? There's the option for a sale. You ain't getting it day-one (it's a long sales process), but by the time we're ready for the sale, we're already in production with multiple projects. You can't even possibly hope for a better qualified lead than that. And if you can't convert someone with multiple production systems relying on your technology, then you're hopeless, and as an organization you should die, and thank Jebus you're Open Source.

[By Production Crisis I mean some production crisis, and that may not mean full loss of service, which is touching your technology. In that event, you're either causing it (at which point you're out the door most likely), or you're helping deal with it, or you're neutral. But I need to know what you're like to work with as a technology when traders are yelling, and they're only yelling when things are down or sub-optimal, and you can never simulate that kind of adrenaline rush.]

Oh, yeah, and we still haven't covered a Collaborative Negotiating Environment (our CTO is notorious for getting everything we want for much less than the software company wants to sell it for; this is rubbish and useless in an Open Source environment, and I don't have a solution to this yet).

Golf is the new Wimbledon

For any of you following my Wimbledon IPTV Saga, I've gotten the ability using SCP to view our TV signals at my desk (man, that's some crazy high-noise signal!) and have been given permission to experiment/PoC with some VBrick stuff.

In the meantime, our Systems team have sent out a new email indicating that Golf is the new Wimbledon, and not only must we not watch live internet Wimbledon coverage, but we also must not watch Golf at our desks over the intertubes.

They really aren't ready for the Olympics. Not by a long mile.

Monday, July 21, 2008

iPhone Software I'd Love

Dear Lazyweb,

Here are the pieces of software I'd love to pay someone for in order to make my JesusPhone experience much better:

  • A ToDo system which doesn't suck
    • Sync with something (anything!) on my Mac or the InterTubes
    • Allow me priorities
    • Allow me notes, deadlines
    • Give me alerts
  • A Notes system which doesn't suck
    • Sync with something (anything!) on my Mac or the InterTubes
    • Comic Sans is the bane of my life
  • A Weather app written by someone who's seen the Tufte criticism of the iPhone.
    • And while I'm at it, when I go to the Weather App, why in the world if it can't make a connection does it refuse to show me what it saw last time? Stupid app.
    • And I'm in the UK. Give me Met Office data, and show me down to the weather station level, and show me the 3-day every-four-hour data as well.
  • A semi-offline Google Reader app
    • Allow me to sync up a bunch of my feeds that are full-text and don't do summaries in the feeds, and let me read them sans network, and then sync up again when I'm back in network range
    • This can't be Yet Another RSS Reader: it must synchronize my state with Google Reader.
  • An SMS application that allows me to queue multiple SMS messages while I'm underground in the Tube and sends them all when I get a network again. Heck, my Treo did this years ago. Why can't the JebusPhone do it?

Also, I really don't want to see any more BeerPhone shit. Man, that got old quickly.

Thank you,


Wednesday, July 16, 2008

AMQP Exchanges as Routers

Since I'm in a posting and commenting flurry, allow me to elaborate just a little bit on my post about the AMQP semantic model.

First of all, let me give a big shout out to the AMQP people in general. Although I've been told I've been a little harsh, my original issue here was frustration that precisely what I want wasn't already done. I like the design of the protocol itself, and I really feel the need to have a vendor-neutral wire protocol available, to make the industry better and further drive asynchronous systems. So mad props. See? I'm not just full of hate.

But something that "Matthew" said in a comment on my last post is that I'm thinking in Software terms rather than Networking terms, and that really the Exchanges are all about routing. I fully agree with that, and in fact, I think a super-powerful extension to the Exchange model is to have a fully programmable Exchange system where developers/administrators can upload custom programs which run in the process/memory/machine space of the broker itself (with a limited virtual machine of course) to give me complete and custom routing rules, all occurring within the confines of the broker.

That would give me the most amazing flexibility to implement any type of routing rules that my heart could desire, and honestly, I think done right it would be a Massive Win for whoever does it well.
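To make the idea a bit more concrete, here is a minimal sketch of what a programmable Exchange might look like. Everything in it is an assumption for illustration: the `ProgrammableExchange` class, the `upload_rule` method, and the message-as-dict shape are hypothetical and not part of any AMQP specification or broker. The "limited virtual machine" is not modeled at all; rules here are plain, unsandboxed Python callables.

```python
class ProgrammableExchange:
    """Hypothetical exchange whose routing logic is supplied at runtime."""

    def __init__(self):
        # Ordered list of (rule name, callable) pairs uploaded by administrators.
        self._rules = []

    def upload_rule(self, name, rule):
        """Register a rule: a callable taking a message dict and
        returning a list of queue names that should receive it."""
        self._rules.append((name, rule))

    def route(self, message):
        """Run every uploaded rule and collect the destination queues,
        keeping first-seen order and dropping duplicates."""
        queues = []
        for _name, rule in self._rules:
            for queue in rule(message):
                if queue not in queues:
                    queues.append(queue)
        return queues


exchange = ProgrammableExchange()
# Custom rule: large trades also go to a (hypothetical) "risk" queue.
exchange.upload_rule("large-trades",
                     lambda m: ["risk"] if m["notional"] > 1_000_000 else [])
# Custom rule: everything is copied to a (hypothetical) "audit" queue.
exchange.upload_rule("audit-everything", lambda m: ["audit"])
```

Calling `exchange.route({"notional": 5_000_000})` runs both rules inside the exchange; a real implementation would need the sandboxing and resource limits that this sketch leaves out.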

So I do understand the terminology here, but let me state a bit more clearly why I think the "Exchange Is A Router" analogy is a little spurious.

If you go to a low-level networking analogy (Matthew's comment mixed in NICs and Routers), an Exchange can't be a Router, because packet publishers don't get to choose which Router they're using. They only get to send bits out onto the wire, and the bits get picked up. They might get picked up by a switch and sent to another switch. They might get picked up by a router. They might disappear into the ether (well, let's hope not in the case of persistent messaging). You publish and hope.

And that is where I believe AMQP should be going.

You've got an easy (semantically) way to rectify this (though not the issues with not having a pre-built Exchange specification that can handle JMS Topic messaging, which really should be specified as a "should" feature in the 1.0 specification): In classic software terminology, add in a layer of indirection.

Publishers simply have a potentially named Source for messages, and there would be a separate binding (though optionally linking to an Exchange by default, and allowing sources to specify their default, un-overridden Exchange declaration) for Sources. That insulates the input of messages into the system from their routing, and allows for the runtime redirection of messages to different types of exchanges for different types of routing.

Yes, it adds in another layer of indirection, but it resolves this problem entirely and gives a clean approach that more accurately separates out inputs of messages/packets, routing, and destinations for messages/packets.
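A rough sketch of that extra layer, under loudly stated assumptions: the `Broker`, `Exchange`, `bind_source`, and `publish` names below are invented for illustration and do not come from the AMQP specification. Publishers only name a Source; which Exchange handles the Source is a separate binding that can be changed at runtime without touching the publisher.

```python
class Exchange:
    """Hypothetical exchange: a name plus a routing function."""

    def __init__(self, name, router):
        self.name = name
        self.router = router  # callable: message dict -> list of queue names

    def route(self, message):
        return self.router(message)


class Broker:
    """Hypothetical broker that separates message input (Sources) from routing."""

    def __init__(self):
        self.exchanges = {}        # exchange name -> Exchange
        self.source_bindings = {}  # source name -> exchange name

    def declare_exchange(self, exchange):
        self.exchanges[exchange.name] = exchange

    def bind_source(self, source, exchange_name):
        """Point a Source at an Exchange; calling again rebinds it."""
        if exchange_name not in self.exchanges:
            raise KeyError(f"unknown exchange: {exchange_name}")
        self.source_bindings[source] = exchange_name

    def publish(self, source, message):
        """The publisher names only the Source; routing is looked up here."""
        exchange = self.exchanges[self.source_bindings[source]]
        return exchange.route(message)


broker = Broker()
broker.declare_exchange(Exchange("everyone", lambda m: ["q1", "q2"]))
broker.declare_exchange(Exchange("by-desk", lambda m: [m["desk"]]))

broker.bind_source("prices", "everyone")
before = broker.publish("prices", {"desk": "fx"})

# An administrator or consumer redirects the Source; the publisher code is unchanged.
broker.bind_source("prices", "by-desk")
after = broker.publish("prices", {"desk": "fx"})
```

The point of the design is that the second `bind_source` call is the only thing that changes between the two publishes.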

Then again, I'm not on the AMQP working group, so my opinions are worth roughly the amount of time I've committed to the project.

I'm a Relentless Shill for Atlassian

I've been told in the past by work colleagues that I'm a relentless shill for Atlassian products. Currently, we're using most of the suite (JIRA for issue tracking, Clover for Java code coverage, Bamboo for continuous integration, FishEye as a Perforce enhancement). The only piece of shelf-ware we've paid for but haven't rolled out yet is Crowd, because of security issues in our company's IT policies.

Sometimes it seems that I do nothing but try to sell their products, so I wanted to justify quite why I like them so much.

First of all, they produce good products. Each of those products is particularly good in some way, and the fact that they focus on plugin development as well as a single product means that they have an ecosystem which does more than a single product ever could.

Second of all, they integrate well, and so you get network effects by using them together. Also a plus side.

But the main reason why I'm confident in recommending them is that they have an extremely good attitude towards support, and that really matters when you're trying to work with this stuff in a real environment.

Other Companies: Read This. It Matters.

They have an approach to support that I call Extreme Openness. Essentially, as a user (or even prospective user) of their software, you have the level of insight into their internal development processes that you wish you had from even open source projects (where, unless you're trawling through the mailing list archives, you often can't get it).

For example, the JIRA system that they use to track their own work is publicly accessible. Not only that, you can freely post things in it. For example, I've been quite active in providing bugs and feature enhancement requests. They make all that information available, and I can easily help myself in seeing whether things I've encountered or only dreamed of are on their roadmap, and vote for what I think is important. This gives me an insight into roughly what their priorities are. I may not always agree with them, but at least I'm aware.

It also means that I can see what's forthcoming, up to several releases in advance. This actually meant the difference between our buying a product and not buying it, because we saw that the roadmap involved some features that we wanted. For example, we were evaluating our Continuous Integration technology, and came up with the standard set of contenders (Bamboo, Hudson (which we already used), CruiseControl (which we already used), CruiseControl.NET (which we already used), and TeamCity). We ended up not buying anything, in part because we saw that Bamboo was working on the key feature we needed (remote build agents) and we wanted to wait for its implementation. As soon as they implemented it, we bought the product because it was better than the alternatives. Now we're almost done putting every single project firm-wide into our Bamboo installation.

Contrast this with your general Classic Big Software Company. Maybe you'll drip out bits and pieces of what's forthcoming to key customers, but you definitely won't let all your bugs out into the open like that, and you definitely wouldn't let customers and competitors see what you're working on (and sometimes more importantly, what you're not working on). I like the Atlassian approach quite a bit, because it gives me insight. And if I'm making a commitment to a platform like this, I need to have long-term confidence in it as a platform, and that means insight into development.

To give a contrasting example, SonicMQ comes out with a new release. Did I even know they were working on it? No (and at this point I have a direct line to the head of SonicMQ support for EMEA). Could I easily find out what was in it? No. There're reference notes, but they don't really tell you what's going on. As a customer, did they even email me? No. I found out meeting with the EMEA support manager on a standard meeting that it had come out as he asked me what we thought of it.

Next, you can download all their products and use them in some limited way. No contacting a sales person, no having to hunt for them, you just download them and use them. With SonicMQ, although I'm a customer, in order to get an "official" download of a new release, I have to have my account manager approve our account getting access to it, and then I have to go through the worst download manager in the world (and that's counting the new Sun JavaSE download web system) and hunt for the downloads that I need. That's rubbish. The Atlassian ones are the same download no matter how you're going to use them, whether eval or paid. I like that a lot.

And they make it really easy to work with their support system, and they're really responsive. They aren't using some archaic system, they aren't making me work with something that makes me rip my hair out. Their system works really well (it's built on JIRA, so they're eating their own dog food), they're fast to respond (although the time zone differences between Sydney and London make for high latency sometimes), and they have proven themselves very intelligent over support channels, and I have no doubt that very often I'm communicating directly with a developer on the product.

Could you scale this up? I think so. Even when I was at BEA working on WebLogic Server, and we had over 50 core developers on the product, we all still did some type of support, we had public downloads of the major products, and while we didn't have public JIRA access, we did announce what features were forthcoming quite a bit and communicated on particular features on public newsgroups. Perhaps that's one reason why at the time people liked us.

But you have to support this if you're trying to sell to developers. You have to think of the fact that many of your customers are extremely intelligent (some of them are lesser-skilled, and you have to support them too), and they won't put up with dealing with standard support drones. If your support engineers ever see a script they have to follow, and I sense that, I'm going to do everything in my power to never work with you again.

So if you're planning on starting a software company, let me give you some lessons that you can learn from Atlassian:
  • Make it easy (super duper extra easy) for me to download and use your products. Whether I've paid you or not.
  • Don't ever stop me from downloading your manuals. Before I even think of installing your software, I'm going to read your manuals. Cover to cover.
  • Open up your processes and let me see what you're working on. I really want to know, and I really want to know in quite some depth and detail. Your competitors may find out, but quite frankly, if you're that worried about individual features, you can hide just those features. And if your business model requires that nobody knows what you're working on, you're going to fail anyway.
  • Have good support. Have excellent support. Make me think that my annual support contract is a bargain. I'm going to pay you, so make me happy I'm paying you.
  • Make Good Software.

Tuesday, July 15, 2008

AMQP's Semantic Model and Mismatching

My last post with real technical content (on AMQP and standards) attracted some notice, so I wanted to follow-up with some specific issues that I find with AMQP as a programming and application semantic model, as opposed to a low-level description of the protocol itself.

A naive question might be "why in the world is there some type of semantic model in an application protocol? Surely it's just specifying how stuff goes back and forth?" But in almost all protocols there is some type of semantic model embedded: IMAP involves Messages and Folders and connections and IDLE and all that; HTTP has Headers and Cookies and all that. There's no way to provide a completely implementation-neutral protocol without some type of semantic model, because if you strive for it, you're striving for a protocol that doesn't do anything.

So that being said, we'll focus on AMQP as an application semantic model. I'll be comparing and contrasting it with the two systems/semantic models that I've used the most in a production environment: Tibco Rendezvous and JMS (personified by commercial examples from Tibco EMS and Progress SonicMQ). As I haven't used any AMQP system in anger, I won't be discussing anything specific about any particular AMQP system (e.g. Qpid, RabbitMQ). Everything's based on the 0-10 AMQP specification.

First off, let's address Virtual Hosts. Why are these useful at all? The only thing that I see is that they allow you to have the same port open for multiple uses with the same authentication system. But this is really only particularly useful because AMQP recommends a particular listening port. Systems like SonicMQ allow you to have multiple brokers share their administration domain (though on the same machine you'd have to have multiple TCP ports open), which is a much cleaner concept. Virtual Hosts look like a solution to a problem nobody has. Just listen to different ports, and job's done.

Now there's the matter of the Exchange/Queue model. I had a conversation with the Progress people about this, and I don't think they fully got it, because the use of the term Queue here throws people from JMS-land off: they immediately think JMS Queue, which is a really horrible concept, rather than generic queues.

For clarity for those not familiar with AMQP, an Exchange is an inbound message router, and a Queue is a message delivery vehicle. Messages come into an Exchange, which directs them to Queues. Each Exchange is responsible for routing messages to the consumers that want them, and uses the logic embedded in it to support this. Therefore, I'm going to refer to these as Consumption Queues from here on for clarity.

AMQP defines a number of Exchange types, including
  • Direct, where all messages flow through directly to all Consumption Queues which match exactly
  • Fanout, where all messages flow through directly to all Consumption Queues regardless of any other criteria
  • Topic, where all messages flow through based on globbed Topic hierarchies (like *.baz)
  • Headers, where header properties determine the Consumption Queues.
So to start out with, let's look for how we'd provide JMS-style semantics. Oh, wait, we kinda can't: JMS Topics are routed via a combination of Topic and Headers, because you can arbitrarily decide what to select based on a combination: there's no Topic And Headers exchange type defined in the spec.
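To illustrate the mismatch, here is a small in-memory sketch of the four matching styles, plus the combined "Topic And Headers" match that JMS-style consumption would need. It is a simplification under stated assumptions: the function names and the binding tuples are hypothetical, and Python's `fnmatchcase` globbing stands in for the spec's topic pattern grammar, which it does not reproduce exactly.

```python
from fnmatch import fnmatchcase


def route_direct(bindings, routing_key):
    """Direct: queues whose binding key equals the routing key exactly."""
    return sorted({queue for queue, key in bindings if key == routing_key})


def route_fanout(bindings):
    """Fanout: every bound queue, regardless of any key."""
    return sorted({queue for queue, _key in bindings})


def route_topic(bindings, routing_key):
    """Topic: queues whose glob pattern matches the routing key."""
    return sorted({queue for queue, pattern in bindings
                   if fnmatchcase(routing_key, pattern)})


def headers_match(required, headers):
    """True when every required header is present with the required value."""
    return all(headers.get(name) == value for name, value in required.items())


def route_headers(bindings, headers):
    """Headers: queues whose required header values are all present."""
    return sorted({queue for queue, required in bindings
                   if headers_match(required, headers)})


def route_topic_and_headers(bindings, routing_key, headers):
    """Hypothetical combined type (NOT in the spec): the topic pattern
    and the header requirements must both match."""
    return sorted({queue for queue, pattern, required in bindings
                   if fnmatchcase(routing_key, pattern)
                   and headers_match(required, headers)})


# Hypothetical consumer bindings: (queue, topic pattern, required headers).
combined = [
    ("emea_baz", "*.baz", {"region": "EMEA"}),
    ("us_baz", "*.baz", {"region": "US"}),
]
matched = route_topic_and_headers(combined, "foo.baz", {"region": "EMEA"})
```

The last function is the interesting one: each consumer expresses both a topic pattern and a header filter in a single binding, which is roughly what a JMS Topic subscription with a selector does.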

However, the problem goes much deeper, in that the choice of exchange is determined by the publisher. That's just wrong. As a publisher, I shouldn't be deciding on the routing rules for my messages except in the Direct case. I should only be deciding that I'm going to send messages and it should be 100% in the control of the consumers how they're filtered. Those semantics are exactly what JMS requires, and are exactly what all distributed MOM systems require (remember: those rely on publishers just throwing bits onto the wire; publishers don't even necessarily know if there are any consumers listening). Consumers can declare Bindings between their Consumption Queues and the Exchanges, but they can't control the type of exchange to which a publisher is publishing.

This Matters.

It may seem like a small semantic issue, but most of the cases where I really care about the semantics of my MOM system are when things are going wrong and I'm live in production and have traders screaming down the phone to the front-line support people. Then, I need the flexibility of controlling things at a purely consumptive level, because when AMQP gets going, I will have no control over many publishers in my enterprise.

Let me elaborate slightly on that point. Assume that AMQP goes gangbusters. Then you'll start to see many systems that currently support things like Tibco Rendezvous (think Reuters data feeds, Sybase Replication Server, stuff like that) supporting publishing out to AMQP. And they'll be commercial systems, and I won't be able to control them in any way. Therefore, if they do something wonky with their Exchanges for their publishers, I won't be able to change them in any way. And I'll be royally screwed if they choose an Exchange type I don't like.

Oh, and I also don't like that clients specify their own temp queue names. Why not have the broker do it? You may say "choose a UUID" but honestly, I've had enough run-ins with lesser skilled developers to actually want to limit their ability to screw things up. Again, solution seeking a problem.
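For what it's worth, broker-side naming is not much code. The sketch below is an assumption-laden illustration: `TempQueueRegistry`, `create_temp_queue`, and the `tmp.` prefix are all hypothetical names, not anything from the AMQP spec. The client asks for a temporary queue and gets a name back, rather than inventing one.

```python
import uuid


class TempQueueRegistry:
    """Hypothetical broker-side registry that names temp queues itself."""

    def __init__(self):
        self._owners = {}  # queue name -> owning client id

    def create_temp_queue(self, client_id):
        """Generate a unique name on the broker and record who owns it."""
        name = f"tmp.{uuid.uuid4().hex}"
        self._owners[name] = client_id
        return name

    def owner_of(self, name):
        return self._owners[name]


registry = TempQueueRegistry()
first = registry.create_temp_queue("client-a")
second = registry.create_temp_queue("client-a")
```

Because the client never supplies the name, there is nothing for a careless developer to collide on or hard-code.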

There are a lot of really nice things involved in the spec, but these two things are massive semantic problems to me that should be addressed.

Monday, July 14, 2008

Broker-based versus Distributed MOM

One thing that some people have commented to me in email is the difference between broker-based and distributed message oriented middleware. In general, broker-based involves a central machine (or machines) that all clients (publishers and subscribers) connect to over TCP (or something equivalent) and route all traffic through. Distributed MOM systems involve all machines simply listening to raw Ethernet and using some type of bit distribution system (Ethernet broadcast or Multicast) to get the required messages to all subscribers.

The major advantages of a distributed system are:
  • No Central Point of Failure. If you have a centralized broker getting in between all your publishers and subscribers, you have one central point where failure will really mess you up, and you have to engineer around that.
  • Low Latency. Because you're not relying on things like TCP, you can engineer a much lower-latency solution overall. In particular, very modern distributed MOM systems should be capable of leveraging Infiniband which would give you incredibly fast and low latency distribution (though I'm not aware of any who are; if you are, please let me know!). Even IP multicast is lower latency than publisher to broker to consumer TCP distribution.
  • No Central Bottlenecks. If everything's going through one chokepoint, and it has performance problems, everything slows down.
  • Slow Consumer Problem Avoided. The default behavior of centralized systems is that if you have a publisher and multiple subscribers, one slow or misconfigured subscriber will throttle the publisher down to the rate that the one slow subscriber can handle. This is the cause of many operational issues as you have to track down the bad code and kill it. This is particularly commonly seen if you have someone lazy enough to run a debugger against a production subscription feed.
Conversely, Broker-based MOM systems offer:
  • Single Central Point of Resiliency. While you have a single point of failure, that single point of failure is one that you can engineer the heck out of to avoid any single point of failure (for example, my firm's production infrastructure allows for a number of failures, across disks, network connections, power supplies, power feeds, cables, switches, RAM cards, etc.). Engineering that level of resiliency into multiple points can be very difficult to achieve. In the Broker-based model you only have one point to harden, so you can focus all your efforts on it.
  • Never Losing A Message. In a distributed system, there are far too many places where you can end up losing a message, or, more difficult to guard against, losing the order of a message in a stream where the order might matter (for example, delivering a "trade" order after the "cancel" order has already been processed). That leads to:
  • In Order Delivery. The central broker can and does (sometimes) act as an ordering mechanism, which means that you can impose different ordering models onto your messages (for example, global order, where every message is guaranteed to be delivered in the order in which the single central broker receives it; or publisher order, where every message is guaranteed to be delivered in the order in which the publisher produced it).
  • Statistics, Management, Advanced Features. There's a lot of stuff you can do if you can put a single man in the middle of your messaging stream, and broker-based approaches tend to do it.
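
As a rough illustration of the In Order Delivery point, here is a minimal Python sketch of global ordering. `SequencingBroker` and `InOrderConsumer` are hypothetical names, not any product's API, and the sketch assumes the broker stamps each message with a sequence number on receipt.

```python
import itertools


class SequencingBroker:
    """Hypothetical broker that stamps each message on arrival."""

    def __init__(self):
        self._next_seq = itertools.count(1)

    def stamp(self, payload):
        # Global order: the sequence number reflects broker receipt order.
        return (next(self._next_seq), payload)


class InOrderConsumer:
    """Releases messages strictly by sequence number, buffering gaps."""

    def __init__(self):
        self._expected = 1
        self._pending = {}
        self.processed = []

    def on_message(self, stamped):
        seq, payload = stamped
        self._pending[seq] = payload
        # Drain every message that is now contiguous with what we have seen.
        while self._expected in self._pending:
            self.processed.append(self._pending.pop(self._expected))
            self._expected += 1


broker = SequencingBroker()
trade = broker.stamp("trade 42")
cancel = broker.stamp("cancel 42")

consumer = InOrderConsumer()
consumer.on_message(cancel)   # arrives early, so it is held back
consumer.on_message(trade)
print(consumer.processed)     # ['trade 42', 'cancel 42']
```

Without a single party handing out the sequence numbers, the consumer has nothing to reorder against, which is the hard part in the distributed case.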
To give you an example of where Broker-based messaging is a pretty clean win, assume a case where a single message absolutely, positively, MUST be delivered to two services, no matter what. To do that in a broker-based system using JMS terminology, you have two subscribers, each with their own Durable Subscription (which will store messages on the broker if the client process is disconnected), and the publisher does Persistent message publishing. Job's done, the broker does the rest.
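
For readers who haven't used durable subscriptions, the behaviour can be sketched in a few lines of Python. This is a toy model, not the JMS API: `DurableBroker` and its method names are hypothetical, and it keeps the backlog in memory, whereas a real broker would write persistent messages to disk.

```python
from collections import deque


class DurableBroker:
    """Toy stand-in for a JMS-style broker; all names are hypothetical."""

    def __init__(self):
        self._backlog = {}     # subscription name -> messages held for it
        self._connected = {}   # subscription name -> delivery callback

    def create_durable_subscription(self, name):
        self._backlog.setdefault(name, deque())

    def connect(self, name, callback):
        self._connected[name] = callback
        # Replay everything stored while the subscriber was away.
        backlog = self._backlog[name]
        while backlog:
            callback(backlog.popleft())

    def disconnect(self, name):
        self._connected.pop(name, None)

    def publish(self, message):
        for name, backlog in self._backlog.items():
            callback = self._connected.get(name)
            if callback is not None:
                callback(message)
            else:
                backlog.append(message)   # held until the client returns


broker = DurableBroker()
billing, audit = [], []
broker.create_durable_subscription("billing")
broker.create_durable_subscription("audit")

broker.connect("billing", billing.append)
broker.publish("order-1")                  # audit is offline; broker holds its copy
broker.connect("audit", audit.append)
print(billing, audit)                      # ['order-1'] ['order-1']
```

Neither the publisher nor the subscribers carry any storage or retry logic here; all of it sits in the one component you were going to harden anyway.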

In a distributed case, you have to have disk storage (redundant of course) on the publisher, and a distributed logging mechanism, and some type of handshaking between the logging mechanism and the publisher to notify the publisher that it can wipe the message from the perm storage, and some type of equivalent handshaking between the publisher and the clients and ... Yes, you can do it, but architecturally it's a lot more complicated, and there are a lot more points that you have to harden to make sure that nothing failing can kill a message from being delivered.
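
Here is a minimal Python sketch of just one piece of that handshaking: the publisher holding each message until every subscriber has acknowledged it. `ReliablePublisher` is a hypothetical name, the store is an in-memory dict standing in for redundant disk, and the network, the logging mechanism, and failure detection are all left out.

```python
class ReliablePublisher:
    """Hypothetical publisher-side message store for a brokerless setup."""

    def __init__(self, subscribers):
        self._subscribers = frozenset(subscribers)
        self._unacked = {}   # message id -> (payload, subscribers still owing an ack)
        self._next_id = 1

    def publish(self, payload):
        msg_id = self._next_id
        self._next_id += 1
        # In a real system this entry would hit redundant disk before sending.
        self._unacked[msg_id] = (payload, set(self._subscribers))
        return msg_id

    def ack(self, msg_id, subscriber):
        entry = self._unacked.get(msg_id)
        if entry is None:
            return   # duplicate or late ack; message already wiped
        _, waiting = entry
        waiting.discard(subscriber)
        if not waiting:
            # Every subscriber has confirmed, so the stored copy can go.
            del self._unacked[msg_id]

    def pending_for(self, subscriber):
        """Messages that still need (re)sending to this subscriber."""
        return [
            (msg_id, payload)
            for msg_id, (payload, waiting) in sorted(self._unacked.items())
            if subscriber in waiting
        ]


publisher = ReliablePublisher(["service-a", "service-b"])
msg_id = publisher.publish("settle trade 42")
publisher.ack(msg_id, "service-a")
print(publisher.pending_for("service-b"))
```

Even this stripped-down version leaves every publisher owning state that a broker would otherwise own once, and it says nothing yet about what happens when the publisher itself dies.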

There's one other thing that also factors in here, which is cost and licensing models. Distributed MOM systems tend to cost per node, while Broker-based MOM systems tend to cost per broker CPU. The cost of Broker-based systems is usually much better when you're scaling out to your entire company's desktop infrastructure; distributed MOM systems tend to work out better if your application is limited to a few machines. And it doesn't usually take very many machines for the Broker-based systems to win.

So what do I recommend? In general, these days, I recommend coding against a Broker-based model unless you know you need the latency and performance characteristics that you can get out of a fully distributed system. If you've coded your system well, you can always retrofit in a distributed MOM system later on.
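
One reading of "coded your system well" is that application code only ever talks to a narrow messaging interface. The Python sketch below is an assumption about what that might look like, not a prescription: `MessageBus`, `InProcessBus`, and `run_pricing_service` are all hypothetical names, and the in-process implementation merely stands in for a broker-backed or multicast-backed one.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class MessageBus(ABC):
    """The only messaging surface the application is allowed to see."""

    @abstractmethod
    def publish(self, topic, message):
        ...

    @abstractmethod
    def subscribe(self, topic, callback):
        ...


class InProcessBus(MessageBus):
    """Stand-in transport; a broker or multicast client would replace it."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)


def run_pricing_service(bus):
    # Application code depends on MessageBus only, never on a vendor client.
    received = []
    bus.subscribe("prices", received.append)
    bus.publish("prices", {"symbol": "XYZ", "bid": 10.0})
    return received


print(run_pricing_service(InProcessBus()))
```

Swapping transports then means writing another `MessageBus` subclass rather than touching every publisher and subscriber.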

Anybody else have any thoughts?

Monday, July 07, 2008

Wimbledon Redux

An All-Williams ladies final, Nadal finally winning at the All England Club. What an amazing fortnight it's been.

Didn't get my IPTV work done, but I'm definitely still pushing for it in the run-up to the Olympics. Wish me luck, though: I'm still going to meet with our head of Systems (I'm a lowly Front Office Technologist) when he's next free to discuss things. But I think I have a plan.

If anybody has worked with VBrick systems, please let me know!

Message Oriented Middleware and Open Protocols

Something that I was recently reminded to post on (due to a comment on my post on Progress and Iona) was the work that I had done into AMQP recently (encompassing RabbitMQ, Qpid, et al).

As some of you might have gathered, I'm a SonicMQ user. I'm a big fan of SonicMQ as a core product. The work that they've done on the core message broker has been phenomenal, and compared to anything else in the market, it's amazing. I'm quite pleased with many parts of the commercial decision that my company made (based on my technical input) to make a very substantial investment in SonicMQ. So don't let anything I'm saying act as SonicMQ bashing: if you are in the market for a commercial, centralized-broker-based MOM system, buy SonicMQ. With the caveats you'll get below.

The one thing that I will pretty openly criticize SonicMQ for (aside from their installers, which are openly toxic and should be approached with nothing other than open hostility) is their non-Java client libraries.

Quite simply put, their C++ client support is precisely what I would expect from a Java-based company: it's sporadic, not well engineered, looks exactly like Java translated to C++ (rather than idiomatic C++ using things like Smart Pointers and the like), and the porting is pretty abysmal (they seem to think that "Solaris" is one unified thing, rather than properly understanding the difference between hardware platforms, OS versions, compilers and versions). Their C# support ain't much better to be fair.

But it shouldn't have to be that way. Quite frankly, I'm shocked at this point that there isn't some type of IETF-supported protocol for MOM in its various forms.

So when I was alerted to AMQP, I was quite excited. The basic premise of AMQP is to provide a vendor neutral wire-level binary protocol for asynchronous, broker-based message oriented middleware. Huzzah! It's all good! We're saved! SonicMQ supports this, C++ domain experts write the client libraries, I can get Python native libraries, and I get the best of both worlds: client libraries in all the languages and environments I have to support, and the best broker-based middleware I can find.

Alas, it's not to be. While it might be time for a vendor-neutral middleware protocol, it doesn't appear that AMQP is it.

To start with, a little birdie who knows many of the LShift people (who are partially responsible for RabbitMQ) indicates that the protocol discussions are degenerating into SQL-99-level vendor arguments, before 1.0 of the protocol has even been ratified. All the vendors of pre-1.0 products are already jockeying for position. And I thought SQL was bad; but at least they all had commercial products out when they started fighting in committee! So the protocol itself seems stillborn by its own vendor quibbles.

But then there's the fact that JPMorgan (one of the major client drivers of the protocol) pushed for something that doesn't really make sense. It's almost as though the protocol requires an implementation that's half bog-standard JMS, and half Tibco EMS, and all confusing. SonicMQ, for example, has, IMHO, the best approach to overcoming the insanity of the JMS specification: Shared Subscriptions. But rather than allowing for vendor differentiation through features like this (and if you're evaluating MOM, you have to fully understand the power this gives you, and that it is unique to SonicMQ), AMQP defines a MOM model that nobody currently implements, and that is highly unintuitive.

Why, oh, why, couldn't they have just taken something like JMS, taken the edges off it (introspectible queues with online reordering? W-T-F?), and done that as a 1.0? Then at least they'd have a basis for something that would be commonly understood by MOM professionals, and would have had a scope for a vendor community to build implementation support.

Instead we face a situation where we have, simultaneously:

  • A protocol nobody supports; based on

  • A programming model nobody is used to; based on

  • A conceptual model nobody has worked with; with

  • No existing functioning reference implementations

This is not the way to build a community! It's classic chicken-and-egg. Give some existing stakeholder some reason to work with it, whether it's a vendor or an MOM professional or just an existing J2EE developer. You can't go from nothing to functioning ecosystem by redefining the world around you.

For example, I grabbed RabbitMQ and the Qpid JMS client (because it was the only one I could find). You'd think I'd be able to point the Qpid JMS client (built just on AMQP) at RabbitMQ and all would be well, right? Nope. Didn't work at all, never figured out why, nobody in the LazyWeb had ever tried such an insane thing. (You use the Qpid clients against Qpid! Not another AMQP product! That would be crazy!)

So my point is that while I think that open protocols for things like this are critical (I'll have a post on open protocols for RDBMS connectivity), this isn't it, for a variety of reasons. I wish it was, and I've pushed for Progress Software to support AMQP as an in-broker connector (entirely possible with their internal architecture), but I doubt it's going to happen.

Do I think SonicMQ is the end-all, be-all of MOM? Heck no. There's a lot of stuff they do badly. But I wish I could evaluate them as a broker rather than having to keep thinking in the back of my mind "no native cPython support; no GCC 4 Intel OpenSolaris x86 64-bit support; ...". Crack THAT, and you've got something.

UPDATED to clarify which other Financial Industry company was originally driving AMQP.