Monday, July 21, 2008

iPhone Software I'd Love

Dear Lazyweb,

Here are the pieces of software I'd love to pay someone for in order to make my JesusPhone experience much better:

  • A ToDo system which doesn't suck
    • Sync with something (anything!) on my Mac or the InterTubes
    • Allow me priorities
    • Allow me notes, deadlines
    • Give me alerts
  • A Notes system which doesn't suck
    • Sync with something (anything!) on my Mac or the InterTubes
    • Comic Sans is the bane of my life
  • A Weather app written by someone who's seen the Tufte criticism of the iPhone.
    • And while I'm at it, when I go to the Weather App, why in the world if it can't make a connection does it refuse to show me what it saw last time? Stupid app.
    • And I'm in the UK. Give me Met Office data, and show me down to the weather station level, and show me the 3-day every-four-hour data as well.
  • A semi-offline Google Reader app
    • Allow me to sync up a bunch of my feeds that are full-text and don't do summaries in the feeds, and let me read them sans network, and then sync up again when I'm back in network range
    • This can't be Yet Another RSS Reader: it must synchronize my state with Google Reader.
  • An SMS application that allows me to queue multiple SMS messages while I'm underground in the Tube and sends them all when I get a network again. Heck, my Treo did this years ago. Why can't the JebusPhone do it?

Also, I really don't want to see any more BeerPhone shit. Man, that got old quickly.

Thank you,

Kirk

Wednesday, July 16, 2008

AMQP Exchanges as Routers

Since I'm in a posting and commenting flurry, allow me to elaborate just a little bit on my post about the AMQP semantic model.

First of all, let me give a big shout out to the AMQP people in general. Although I've been told I've been a little harsh, my original issue here was frustration that precisely what I want wasn't already done. I like the design of the protocol itself, and I really feel the need to have a vendor-neutral wire protocol available, to make the industry better and further drive asynchronous systems. So mad props. See? I'm not just full of hate.

But something that "Matthew" said in a comment on my last post is that I'm thinking in Software terms rather than Networking terms, and that really the Exchanges are all about routing. I fully agree with that, and in fact, I think a super-powerful extension to the Exchange model is to have a fully programmable Exchange system where developers/administrators can upload custom programs which run in the process/memory/machine space of the broker itself (with a limited virtual machine of course) to give me complete and custom routing rules, all occurring within the confines of the broker.

That would give me the most amazing flexibility to implement any type of routing rules that my heart could desire, and honestly, I think done right it would be a Massive Win for whoever does it well.
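To make the idea a bit more concrete, here is a minimal sketch of what a programmable Exchange could look like. Everything in it (ProgrammableExchange, the rule signature, the queue names) is hypothetical and not part of AMQP or any broker I know of, and it makes no attempt at the sandboxing a real in-broker virtual machine would need.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical types: a message is a plain dict, and a routing rule is any
# function that maps (message, known queue names) to the queues to deliver to.
Message = Dict[str, Any]
RoutingRule = Callable[[Message, Iterable[str]], List[str]]


class ProgrammableExchange:
    """An exchange whose routing decision is an uploaded function."""

    def __init__(self, rule: RoutingRule) -> None:
        self.rule = rule
        self.queues: Dict[str, List[Message]] = defaultdict(list)

    def declare_queue(self, name: str) -> None:
        self.queues[name]  # touching the defaultdict creates the queue

    def publish(self, message: Message) -> List[str]:
        known = list(self.queues)
        # Ignore any queue name the rule returns that was never declared.
        targets = [q for q in self.rule(message, known) if q in self.queues]
        for q in targets:
            self.queues[q].append(message)
        return targets


def large_orders_to_risk(message: Message, queues: Iterable[str]) -> List[str]:
    """Example rule: everything goes to 'audit', big orders also go to 'risk'."""
    targets = ["audit"]
    if message.get("quantity", 0) > 1000:
        targets.append("risk")
    return targets


exchange = ProgrammableExchange(large_orders_to_risk)
exchange.declare_queue("audit")
exchange.declare_queue("risk")
small = exchange.publish({"symbol": "VOD.L", "quantity": 10})
large = exchange.publish({"symbol": "VOD.L", "quantity": 5000})
```

The point of the sketch is only that the rule is data handed to the broker, so changing the routing means swapping a function rather than changing publishers.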

So I do understand the terminology here, but let me state a bit more clearly why I think the "Exchange Is A Router" analogy is a little spurious.

If you go to a low-level networking analogy (Matthew's comment mixed in NICs and Routers), an Exchange can't be a Router, because packet publishers don't get to choose which Router they're using. They only get to send bits out onto the wire, and the bits get picked up. They might get picked up by a switch and sent to another switch. They might get picked up by a router. They might disappear into the ether (well, let's hope not in the case of persistent messaging). You publish and hope.

And that is where I believe AMQP should be going.

You've got an easy (semantically) way to rectify this (though not the issues with not having a pre-built Exchange specification that can handle JMS Topic messaging, which really should be specified as a "should" feature in the 1.0 specification): In classic software terminology, add in a layer of indirection.

Publishers simply have a potentially named Source for messages, and there would be a separate binding (though optionally linking to an Exchange by default, and allowing sources to specify their default, un-overridden Exchange declaration) for Sources. That insulates the input of messages into the system from their routing, and allows for the runtime redirection of messages to different types of exchanges for different types of routing.

Yes, it adds in another layer of indirection, but it resolves this problem entirely and gives a clean approach that more accurately separates out inputs of messages/packets, routing, and destinations for messages/packets.
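A rough sketch of that indirection, under my own assumptions: the Broker class, bind_source, and the idea of modelling an exchange as a function returning queue names are all invented for illustration and appear in no AMQP specification.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
# Hypothetical: an exchange is modelled as a function returning queue names.
Exchange = Callable[[Message], List[str]]


class Broker:
    def __init__(self) -> None:
        self.exchanges: Dict[str, Exchange] = {}
        self.source_bindings: Dict[str, str] = {}  # source name -> exchange name

    def declare_exchange(self, name: str, exchange: Exchange) -> None:
        self.exchanges[name] = exchange

    def bind_source(self, source: str, exchange_name: str) -> None:
        """Administrative step: decide which exchange a source feeds."""
        if exchange_name not in self.exchanges:
            raise KeyError(f"unknown exchange: {exchange_name}")
        self.source_bindings[source] = exchange_name

    def publish(self, source: str, message: Message) -> List[str]:
        """Publishers name only the source; they never see the exchange."""
        exchange = self.exchanges[self.source_bindings[source]]
        return exchange(message)


broker = Broker()
broker.declare_exchange("fanout", lambda m: ["london", "new_york"])
broker.declare_exchange("by_region", lambda m: [m["region"]])

broker.bind_source("prices", "fanout")
before = broker.publish("prices", {"region": "london", "px": 101.5})

# Re-route at runtime without touching the publisher's code.
broker.bind_source("prices", "by_region")
after = broker.publish("prices", {"region": "london", "px": 101.6})
```

The publisher calls publish("prices", ...) both times; only the binding changed.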

Then again, I'm not on the AMQP working group, so my opinions are worth roughly the amount of time I've committed to the project.

I'm a Relentless Shill for Atlassian

I've been told in the past by work colleagues that I'm a relentless shill for Atlassian products. Currently, we're using most of the suite (JIRA for issue tracking, Clover for Java code coverage, Bamboo for continuous integration, FishEye as a Perforce enhancement). The only piece of shelf-ware we've paid for but haven't rolled out yet is Crowd, because of security issues in our company's IT policies.

Sometimes it seems that I do nothing but try to sell their products, so I wanted to justify quite why I like them so much.

First of all, they produce good products. Each of those products is particularly good in some way, and the fact that they focus on plugin development as well as a single product means that they have an ecosystem which does more than a single product ever could.

Second of all, they integrate well, and so you get network effects by using them together. Also a plus side.

But the main reason why I'm confident in recommending them is that they have an extremely good attitude towards support, and that really matters when you're trying to work with this stuff in a real environment.

Other Companies: Read This. It Matters.

They have an approach to support that I call Extreme Openness. Essentially, as a user (or even prospective user) of their software, you have the level of insight into their internal development processes that you wish you had from even open source projects (where, unless you're trawling through the mailing list archives, you often can't establish what's going on).

For example, the JIRA system that they use to track their own work is publicly accessible. Not only that, you can freely post things in it. For example, I've been quite active in providing bugs and feature enhancement requests. They make all that information available, and I can easily help myself in seeing whether things I've encountered or only dreamed of are on their roadmap, and vote for what I think is important. This gives me an insight into roughly what their priorities are. I may not always agree with them, but at least I'm aware.

It also means that I can see what's forthcoming, up to several releases in advance. This actually meant the difference between our buying a product and not buying it, because we saw that the roadmap involved some features that we wanted. For example, we were evaluating our Continuous Integration technology, and came up with the standard set of contenders (Bamboo, Hudson (which we already used), CruiseControl (which we already used), CruiseControl.NET (which we already used), and TeamCity). We ended up not buying anything, in part because we saw that Bamboo was working on the key feature we needed (remote build agents) and we wanted to wait for its implementation. As soon as they implemented it, we bought the product because it was better than the alternatives. Now we're almost done putting every single project firm-wide into our Bamboo installation.

Contrast this with your general Classic Big Software Company. Maybe you'll drip out bits and pieces of what's forthcoming to key customers, but you definitely won't let all your bugs out into the open like that, and you definitely wouldn't let customers and competitors see what you're working on (and sometimes more importantly, what you're not working on). I like the Atlassian approach quite a bit, because it gives me insight. And if I'm making a commitment to a platform like this, I need to have long-term confidence in it as a platform, and that means insight into development.

To give a contrasting example, SonicMQ comes out with a new release. Did I even know they were working on it? No (and at this point I have a direct line to the head of SonicMQ support for EMEA). Could I easily find out what was in it? No. There're reference notes, but they don't really tell you what's going on. As a customer, did they even email me? No. I found out it had come out in a routine meeting with the EMEA support manager, when he asked me what we thought of it.

Next, you can download all their products and use them in some limited way. No contacting a sales person, no having to hunt for them: you just download them and use them. With SonicMQ, although I'm a customer, in order to get an "official" download of a new release, I have to have my account manager approve our account getting access to it, and then I have to go through the worst download manager in the world (and that's including the new Sun JavaSE download web system) and hunt for the downloads that I need. That's rubbish. Atlassian's downloads are the same no matter how you're going to use them, whether eval or paid. I like that a lot.

And they make it really easy to work with their support system, and they're really responsive. They aren't using some archaic system, they aren't making me work with something that makes me rip my hair out. Their system works really well (it's built on JIRA, so they're eating their own dog food), they're fast to respond (although the time zone differences between Sydney and London make for high latency sometimes), and they have proven themselves very intelligent over support channels, and I have no doubt that very often I'm communicating directly with a developer on the product.

Could you scale this up? I think so. Even when I was at BEA working on WebLogic Server, and we had over 50 core developers on the product, we all still did some type of support, we had public downloads of the major products, and while we didn't have public JIRA access, we did announce what features were forthcoming quite a bit and communicated on particular features on public newsgroups. Perhaps that's one reason why at the time people liked us.

But you have to support this if you're trying to sell to developers. You have to think of the fact that many of your customers are extremely intelligent (some of them are lesser-skilled, and you have to support them too), and they won't put up with dealing with standard support drones. If your support engineers ever see a script they have to follow, and I sense that, I'm going to do everything in my power to never work with you again.

So if you're planning on starting a software company, let me give you some lessons that you can learn from Atlassian:
  • Make it easy (super duper extra easy) for me to download and use your products. Whether I've paid you or not.
  • Don't ever stop me from downloading your manuals. Before I even think of installing your software, I'm going to read your manuals. Cover to cover.
  • Open up your processes and let me see what you're working on. I really want to know, and I really want to know in quite some depth and detail. Your competitors may find out, but quite frankly, if you're that worried about individual features, you can hide those features themselves. And if your business model requires that nobody knows what you're working on, you're going to fail anyway.
  • Have good support. Have excellent support. Make me think that my annual support contract is a bargain. I'm going to pay you, so make me happy I'm paying you.
  • Make Good Software.

Tuesday, July 15, 2008

AMQP's Semantic Model and Mismatching

My last post with real technical content (on AMQP and standards) attracted some notice, so I wanted to follow up with some specific issues that I find with AMQP as a programming and application semantic model, as opposed to a low-level description of the protocol itself.

A naive question might be "why in the world is there some type of semantic model in an application protocol? Surely it's just specifying how stuff goes back and forth?" But in almost all protocols there is some type of semantic model embedded: IMAP involves Messages and Folders and connections and IDLE and all that; HTTP has Headers and Cookies and all that. There's no way to provide a completely implementation-neutral protocol without some type of semantic model, because if you strive for it, you're striving for a protocol that doesn't do anything.

So that being said, we'll focus on AMQP as an application semantic model. I'll be comparing and contrasting it with the two systems/semantic models that I've used the most in a production environment: Tibco Rendezvous and JMS (personified by commercial examples from Tibco EMS and Progress SonicMQ). As I haven't used any AMQP system in anger, I won't be discussing anything specific about any particular AMQP system (e.g. Qpid, RabbitMQ). Everything's based on the 0-10 AMQP specification.

First off, let's address Virtual Hosts. Why are these useful at all? The only thing that I see is that they allow you to have the same port open for multiple uses with the same authentication system. But this is really only particularly useful because AMQP recommends a particular listening port. Systems like SonicMQ allow you to have multiple brokers share their administration domain (though on the same machine you'd have to have multiple TCP ports open), which is a much cleaner concept. Virtual Hosts look like a solution to a problem nobody has. Just listen to different ports, and job's done.

Now there's the matter of the Exchange/Queue model. I had a conversation with the Progress people about this, and I don't think they fully got it, because the use of the term Queue here throws people from JMS-land off: they immediately think JMS Queue, which is a really horrible concept, rather than generic queues.

For clarity for those not familiar with AMQP, an Exchange is an inbound message router, and a Queue is a message delivery vehicle. Messages come into an Exchange, which directs them to Queues. Each Exchange is responsible for routing messages to the consumers that want them, and uses the logic embedded in it to support this. Therefore, I'm going to refer to these as Consumption Queues from here on for clarity.

AMQP defines a number of Exchange types, including
  • Direct, where each message flows through directly to all Consumption Queues whose binding key matches the message's routing key exactly
  • Fanout, where all messages flow through directly to all Consumption Queues regardless of any other criteria
  • Topic, where all messages flow through based on globbed Topic hierarchies (like foo.bar.*.baz)
  • Headers, where header properties determine the Consumption Queues.
So to start out with, let's look for how we'd provide JMS-style semantics. Oh, wait, we kinda can't: JMS Topics are routed via a combination of Topic and Headers, because you can arbitrarily decide what to select based on a combination: there's no Topic And Headers exchange type defined in the spec.
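For readers who think better in code, here is a rough sketch of the matching rule behind each of the four exchange types. It is my own simplification rather than anything lifted from the spec text: the function names are invented, the topic wildcards follow the usual AMQP convention as I understand it ('*' for exactly one word, '#' for zero or more), and the match_all flag stands in for the headers exchange's all/any option.

```python
from typing import Any, Dict, List


def direct_matches(binding_key: str, routing_key: str) -> bool:
    """Direct: the binding key must equal the routing key."""
    return binding_key == routing_key


def fanout_matches(binding_key: str, routing_key: str) -> bool:
    """Fanout: every bound queue gets every message."""
    return True


def topic_matches(pattern: str, routing_key: str) -> bool:
    """Topic: dot-separated words, '*' = one word, '#' = zero or more words."""
    return _match(pattern.split("."), routing_key.split("."))


def _match(pattern: List[str], words: List[str]) -> bool:
    if not pattern:
        return not words
    head, rest = pattern[0], pattern[1:]
    if head == "#":
        # Try letting '#' absorb 0, 1, 2, ... of the remaining words.
        return any(_match(rest, words[i:]) for i in range(len(words) + 1))
    if not words:
        return False
    if head == "*" or head == words[0]:
        return _match(rest, words[1:])
    return False


def headers_match(required: Dict[str, Any], headers: Dict[str, Any],
                  match_all: bool = True) -> bool:
    """Headers: compare required key/value pairs against message headers."""
    hits = [headers.get(key) == value for key, value in required.items()]
    return all(hits) if match_all else any(hits)
```

Note that each function looks at either the routing key or the headers, never both, which is exactly the gap for JMS-style Topic-plus-selector routing.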

However, the problem goes much deeper, in that the choice of exchange is determined by the publisher. That's just wrong. As a publisher, I shouldn't be deciding on the routing rules for my messages except in the Direct case. I should only be deciding that I'm going to send messages and it should be 100% in the control of the consumers how they're filtered. Those semantics are exactly what JMS requires, and are exactly what all distributed MOM systems require (remember: those rely on publishers just throwing bits onto the wire; publishers don't even necessarily know if there are any consumers listening). Consumers can declare Bindings between their Consumption Queues and the Exchanges, but they can't control the type of exchange which a publisher is publishing.
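Here is a small sketch of the semantics I'm asking for, where the publisher only names a topic and attaches headers, and each consumer supplies its own filter over both. The Subscription class, the publish function, and the topic and header names are all hypothetical; the selector is a plain Python predicate standing in for a JMS-style message selector.

```python
from typing import Any, Callable, Dict, List, Optional

Headers = Dict[str, Any]
Selector = Callable[[Headers], bool]


class Subscription:
    """A consumer-owned filter: topic pattern plus an optional header selector."""

    def __init__(self, name: str, topic_pattern: str,
                 selector: Optional[Selector] = None) -> None:
        self.name = name
        self.pattern = topic_pattern.split(".")
        self.selector = selector or (lambda headers: True)
        self.inbox: List[Any] = []

    def wants(self, topic: str, headers: Headers) -> bool:
        words = topic.split(".")
        # '*' matches exactly one word; anything else must match literally.
        topic_ok = len(words) == len(self.pattern) and all(
            p in ("*", w) for p, w in zip(self.pattern, words))
        return topic_ok and self.selector(headers)


def publish(subscriptions: List[Subscription], topic: str,
            headers: Headers, body: Any) -> List[str]:
    """The publisher states topic and headers; consumers decide the rest."""
    delivered = []
    for sub in subscriptions:
        if sub.wants(topic, headers):
            sub.inbox.append(body)
            delivered.append(sub.name)
    return delivered


subs = [
    Subscription("all_fx", "prices.fx.*"),
    Subscription("big_fx", "prices.fx.*",
                 lambda h: h.get("notional", 0) >= 1_000_000),
    Subscription("equities", "prices.equity.*"),
]
small = publish(subs, "prices.fx.eurusd", {"notional": 5_000}, "tick-1")
large = publish(subs, "prices.fx.eurusd", {"notional": 2_000_000}, "tick-2")
```

Nothing on the publishing side knows or cares how many subscriptions exist or what they filter on.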

This Matters.

It may seem like a small semantic issue, but most of the cases where I really care about the semantics of my MOM system are when things are going wrong and I'm live in production and have traders screaming down the phone to the front-line support people. Then, I need the flexibility of controlling things at a purely consumptive level, because when AMQP gets going, I will have no control over many publishers in my enterprise.

Let me elaborate slightly on that point. Assume that AMQP goes gangbusters. Then you'll start to see many systems that currently support things like Tibco Rendezvous (think Reuters data feeds, Sybase Replication Server, stuff like that) supporting publishing out to AMQP. And they'll be commercial systems, and I won't be able to control them in any way. Therefore, if they do something wonky with their Exchanges for their publishers, I won't be able to change them in any way. And I'll be royally screwed if they choose an Exchange type I don't like.

Oh, and I also don't like that clients specify their own temp queue names. Why not have the broker do it? You may say "choose a UUID" but honestly, I've had enough run-ins with lesser skilled developers to actually want to limit their ability to screw things up. Again, solution seeking a problem.
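What I'd rather see is something like the following sketch, where the client asks for a temporary queue and the broker hands back the name. The Broker class, the declare_temp_queue method, and the "tmp." naming scheme are assumptions of mine for illustration, not anything from the AMQP spec.

```python
import uuid
from typing import Dict, List


class Broker:
    def __init__(self) -> None:
        self.queues: Dict[str, List[object]] = {}

    def declare_temp_queue(self, owner: str) -> str:
        """The broker, not the client, picks a collision-free name."""
        name = f"tmp.{owner}.{uuid.uuid4().hex}"
        self.queues[name] = []
        return name


broker = Broker()
first = broker.declare_temp_queue("pricing")
second = broker.declare_temp_queue("pricing")
```

The client never gets the chance to pick "temp1" and collide with the other forty clients that also picked "temp1".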

There are a lot of really nice things involved in the spec, but these two things are massive semantic problems to me that should be addressed.

Monday, July 14, 2008

Broker-based versus Distributed MOM

One thing that some people have commented to me in email is the difference between broker-based and distributed message oriented middleware. In general, broker-based involves a central machine (or machines) that all clients (publishers and subscribers) connect to over TCP (or something equivalent) and route all traffic through. Distributed MOM systems involve all machines simply listening to raw Ethernet and using some type of bit distribution system (Ethernet broadcast or Multicast) to get the required messages to all subscribers.

The major advantages of a distributed system are:
  • No Central Point of Failure. If you have a centralized broker getting in between all your publishers and subscribers, you have one central point where failure will really mess you up, and you have to engineer around that.
  • Low Latency. Because you're not relying on things like TCP, you can engineer a much lower-latency solution overall. In particular, very modern distributed MOM systems should be capable of leveraging Infiniband which would give you incredibly fast and low latency distribution (though I'm not aware of any who are; if you are, please let me know!). Even IP multicast is lower latency than publisher to broker to consumer TCP distribution.
  • No Central Bottlenecks. If everything's going through one chokepoint, and it has performance problems, everything slows down.
  • Slow Consumer Problem Avoided. The default behavior of centralized systems is that if you have a publisher and multiple subscribers, one slow or misconfigured subscriber will throttle the publisher down to the rate that the one slow subscriber can handle. This is the cause of many operational issues as you have to track down the bad code and kill it. This is particularly commonly seen if you have someone lazy enough to run a debugger against a production subscription feed.
On the converse, Broker-based MOM systems offer:
  • Single Central Point of Resiliency. While you have a single point of failure, that single point of failure is one that you can engineer the heck out of to avoid any single point of failure (for example, my firm's production infrastructure allows for a number of failures, across disks, network connections, power supplies, power feeds, cables, switches, RAM cards, etc.). Engineering that level of resiliency into multiple points can be very difficult to achieve; in the Broker-based model you only have one point to harden, so you can focus all your efforts against it.
  • Never Losing A Message. In a distributed system, there are far too many places where you can end up losing a message, or, more difficult to guard against, losing the order of a message in a stream where the order might matter (for example, delivering a "trade" order after the "cancel" order has already been processed). That leads to:
  • In Order Delivery. The central broker can and does (sometimes) act as an ordering mechanism, which means that you can impose different ordering models onto your messages (for example, global order, where every message is guaranteed to be delivered in the order in which the single central broker receives it; or publisher order, where every message is guaranteed to be delivered in the order in which the publisher produced it).
  • Statistics, Management, Advanced Features. There's a lot of stuff you can do if you can put a single man in the middle of your messaging stream, and broker-based approaches tend to do it.
To give you an example of where Broker-based messaging is a pretty clean win, assume a case where a single message absolutely, positively, MUST be delivered to two services, no matter what. To do that in a broker-based system using JMS terminology, you have two subscribers, each with their own Durable Subscription (which will store messages on the broker if the client process is disconnected), and the publisher does Persistent message publishing. Job's done, the broker does the rest.
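As a rough illustration of why the broker makes this easy, here is a toy in-memory sketch of durable subscriptions. The Broker class and its method names are mine, not the JMS API, and a real broker would write pending messages to disk rather than hold them in a deque.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict

Callback = Callable[[Any], None]


class Broker:
    def __init__(self) -> None:
        self.pending: Dict[str, Deque[Any]] = {}  # durable subscription -> backlog
        self.connected: Dict[str, Callback] = {}  # currently attached consumers

    def subscribe_durable(self, name: str) -> None:
        self.pending.setdefault(name, deque())

    def connect(self, name: str, callback: Callback) -> None:
        self.subscribe_durable(name)
        self.connected[name] = callback
        self._drain(name)  # deliver anything stored while the client was away

    def disconnect(self, name: str) -> None:
        self.connected.pop(name, None)

    def publish_persistent(self, message: Any) -> None:
        # Every durable subscription gets its own copy, connected or not.
        for name, backlog in self.pending.items():
            backlog.append(message)
            self._drain(name)

    def _drain(self, name: str) -> None:
        callback = self.connected.get(name)
        backlog = self.pending[name]
        while callback is not None and backlog:
            callback(backlog.popleft())


broker = Broker()
settlement_seen, risk_seen = [], []
broker.connect("settlement", settlement_seen.append)
broker.subscribe_durable("risk")          # registered, but the process is down
broker.publish_persistent("trade-42")
risk_backlog = len(broker.pending["risk"])
broker.connect("risk", risk_seen.append)  # comes back up and catches up
```

The publisher makes one call and neither subscriber has to know about the other.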

In a distributed case, you have to have disk storage (redundant of course) on the publisher, and a distributed logging mechanism, and some type of handshaking between the logging mechanism and the publisher to notify the publisher that it can wipe the message from the perm storage, and some type of equivalent handshaking between the publisher and the clients and ... Yes, you can do it, but architecturally it's a lot more complicated, and there are a lot more points that you have to harden to make sure that nothing failing can kill a message from being delivered.
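To give a feel for just one of those pieces, here is a sketch of the publisher-side bookkeeping: hold each message until every required consumer has acknowledged it. ReliablePublisher and its methods are hypothetical, the store is an in-memory dict where a real one would be redundant disk, and the network transport, retries, and logging mechanism are left out entirely.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple


class ReliablePublisher:
    def __init__(self, required_consumers: Iterable[str]) -> None:
        self.required = frozenset(required_consumers)
        # message id -> (message, consumers that have not yet acknowledged)
        self.store: Dict[int, Tuple[Any, Set[str]]] = {}
        self.next_id = 0

    def publish(self, message: Any) -> int:
        msg_id = self.next_id
        self.next_id += 1
        self.store[msg_id] = (message, set(self.required))
        return msg_id

    def ack(self, msg_id: int, consumer: str) -> None:
        entry = self.store.get(msg_id)
        if entry is None:
            return  # already fully acknowledged, or never published
        _, waiting = entry
        waiting.discard(consumer)
        if not waiting:
            del self.store[msg_id]  # safe to wipe from permanent storage

    def pending_for(self, consumer: str) -> List[int]:
        """Message ids that would need re-sending to this consumer."""
        return [mid for mid, (_, waiting) in self.store.items()
                if consumer in waiting]


publisher = ReliablePublisher(["settlement", "risk"])
trade = publisher.publish("trade-42")
publisher.ack(trade, "settlement")
still_owed = publisher.pending_for("risk")
publisher.ack(trade, "risk")
```

And that is only the publisher; every consumer needs the mirror image of it, which is where the extra hardening points come from.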

There's one other thing that also factors in here, which is cost and licensing models. Distributed MOM systems tend to cost per node; Broker-based MOM systems tend to cost per broker CPU. The cost of Broker-based systems is usually much better for a case where you're scaling out to your entire company's desktop infrastructure, while distributed MOM systems tend to work out better if your application is limited to a few machines. And it doesn't usually take very many machines for the Broker-based systems to win.

So what do I recommend? In general, these days, I recommend coding against a Broker-based model unless you know you need the latency and performance characteristics that you can get out of a fully distributed system. If you've coded your system well, you can always retrofit in a distributed MOM system later on.

Anybody else have any thoughts?

Monday, July 07, 2008

Wimbledon Redux

An All-Williams ladies final, Nadal finally winning at the All England club. What an amazing fortnight it's been.

Didn't get my IPTV work done, but I'm definitely still pushing for it in the run-up to the Olympics. Wish me luck: I'm still going to meet with our head of Systems (I'm a lowly Front Office Technologist) when he's next free to try to discuss things. But I think I have a plan.

If anybody has worked with VBrick systems, please let me know!

Message Oriented Middleware and Open Protocols

Something that I was recently reminded to post on (due to a comment on my post on Progress and Iona) was the work that I had done into AMQP recently (encompassing RabbitMQ, Qpid, et al).

As some of you might have gathered, I'm a SonicMQ user. I'm a big fan of SonicMQ as a core product. The work that they've done on the core message broker has been phenomenal, and compared to anything else in the market, it's amazing. I'm quite pleased with many parts of the commercial decision that my company made (based on my technical input) to make a very substantial investment in SonicMQ. So don't let anything I'm saying act as SonicMQ bashing: if you are in the market for a commercial, centralized-broker-based MOM system, buy SonicMQ. With the caveats you'll get below.

The one thing that I will pretty openly criticize SonicMQ for (aside from their installers, which are openly toxic and should be approached with nothing other than open hostility) is their non-Java client libraries.

Quite simply put, their C++ client support is precisely what I would expect from a Java-based company: it's sporadic, not well engineered, looks exactly like Java translated to C++ (rather than idiomatic C++ using things like Smart Pointers and the like), and the porting is pretty abysmal (they seem to think that "Solaris" is one unified thing, rather than properly understanding the difference between hardware platforms, OS versions, compilers and versions). Their C# support ain't much better to be fair.

But it shouldn't have to be that way. Quite frankly, I'm shocked at this point that there isn't some type of IETF-supported protocol for MOM in its various forms.

So when I was alerted to AMQP, I was quite excited. The basic premise of AMQP is to provide a vendor neutral wire-level binary protocol for asynchronous, broker-based message oriented middleware. Huzzah! It's all good! We're saved! SonicMQ supports this, C++ domain experts write the client libraries, I can get Python native libraries, and I get the best of both worlds: client libraries in all the languages and environments I have to support, and the best broker-based middleware I can find.

Alas, it's not to be. While it might be time for a vendor-neutral middleware protocol, it doesn't appear that AMQP is it.

To start with, a little birdie who knows many of the LShift people (who are partially responsible for RabbitMQ) indicates that the protocol discussions are degenerating into SQL-99-level vendor arguments, before 1.0 of the protocol has even been ratified. All the vendors of pre-1.0 products are already jockeying for position. And I thought SQL was bad; but at least they all had commercial products out when they started fighting in committee! So the protocol itself seems stillborn by its own vendor quibbles.

But then there's the fact that JPMorgan (one of the major client drivers of the protocol) pushed for something that doesn't really make sense. It's almost as though the protocol requires an implementation that's half bog-standard JMS, and half Tibco EMS, and all confusing. SonicMQ, for example, has, IMHO, the best approach to overcoming the insanity of the JMS specification: Shared Subscriptions. But rather than allowing for vendor differentiation through features like this (and if you're evaluating MOM, you have to fully understand the power this gives you, and that it is unique to SonicMQ), AMQP defines a MOM model that nobody currently implements, and is highly unintuitive.

Why, oh, why, couldn't they have just taken something like JMS, taken the edges off it (introspectible queues with online reordering? W-T-F?), and done that as a 1.0? Then at least they'd have a basis for something that would be commonly understood by MOM professionals, and would have had a scope for a vendor community to build implementation support.

Instead we face a situation where we have, simultaneously:

  • A protocol nobody supports; based on

  • A programming model nobody is used to; based on

  • A conceptual model nobody has worked with; with

  • No existing functioning reference implementations


This is not the way to build a community! It's classic chicken-and-egg. Give some existing stakeholder some reason to work with it, whether it's a vendor or an MOM professional or just an existing J2EE developer. You can't go from nothing to functioning ecosystem by redefining the world around you.

For example, I grabbed RabbitMQ and the Qpid JMS client (because it was the only one I could find). You'd think I'd be able to point the Qpid JMS client (built just on AMQP) at RabbitMQ and all would be well, right? Nope. Didn't work at all, never figured out why, nobody in the LazyWeb had ever tried such an insane thing. (You use the Qpid clients against Qpid! Not another AMQP product! That would be crazy!)

So my point is that while I think that open protocols for things like this are critical (I'll have a post on open protocols for RDBMS connectivity), this isn't it, for a variety of reasons. I wish it was, and I've pushed for Progress Software to support AMQP as an in-broker connector (entirely possible with their internal architecture), but I doubt it's going to happen.

Do I think SonicMQ is the end-all, be-all of MOM? Heck no. There's a lot of stuff they do badly. But I wish I could evaluate them as a broker rather than having to keep thinking in the back of my mind "no native cPython support; no GCC 4 Intel OpenSolaris x86 64-bit support; ...". Crack THAT, and you've got something.

UPDATED to clarify which other Financial Industry company was originally driving AMQP.