Monday, November 10, 2008

Sell Me An AMQP Appliance

I've been doing some thinking recently about the bevy of AMQP broker software being created, and I think it's a really positive thing that we're seeing a new swathe of innovation in broker-based middleware. The space seemed to many to be fossilized just a year or two ago, but with AMQP and XMPP PubSub it's now getting a lot of new development momentum. Ultimately, though, I'm starting to think that much of the innovation will be wasted on a large category of users.

I believe that we're going to see the broker-based MOM space divided into three categories:
  • Edge Distribution. Imagine that you have 10 processes on the same machine, all receiving the same data. The optimal way to distribute that data is for the machine to run its own broker, which connects to the hub broker so that each message crosses the wire once and is then redelivered to the 10 processes on the edge system (see the sketch after this list).
  • Embedded. In this case, an application is written against the MOM protocols in question, but everything happens inside the application's own configuration, so there are no external dependencies. Atlassian's Bamboo is a good example: it embeds ActiveMQ so that customers don't need an existing JMS infrastructure in place, and you don't necessarily even know you're running JMS at all.
  • Central Distribution. This is every other case, and generally means a node that is configured purely to act as a broker and isn't really doing anything else.
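
To make the edge-distribution case concrete (and it doubles as the Embedded pattern), here's a rough sketch in Java using ActiveMQ's embedded-broker API. Treat it as a sketch only: the hub address, broker name, and topic name are all made up for illustration, not taken from any real deployment.

    import org.apache.activemq.ActiveMQConnectionFactory;
    import org.apache.activemq.broker.BrokerService;
    import org.apache.activemq.network.NetworkConnector;

    import javax.jms.Connection;
    import javax.jms.MessageConsumer;
    import javax.jms.Session;
    import javax.jms.Topic;

    public class EdgeBroker {
        public static void main(String[] args) throws Exception {
            // An embedded broker running inside the edge machine's JVM:
            // no external broker installation required.
            BrokerService broker = new BrokerService();
            broker.setBrokerName("edge");
            broker.setPersistent(false);

            // Bridge to the hub. Duplex means the hub learns about our
            // local subscriptions and sends each matching message across
            // the wire exactly once; the edge broker does the fan-out.
            NetworkConnector bridge =
                    broker.addNetworkConnector("static:(tcp://hub.example.com:61616)");
            bridge.setDuplex(true);
            broker.start();

            // Ten local processes (modeled here as ten consumers in one
            // JVM) all subscribe over the in-JVM vm:// transport.
            ActiveMQConnectionFactory factory =
                    new ActiveMQConnectionFactory("vm://edge?create=false");
            for (int i = 0; i < 10; i++) {
                Connection conn = factory.createConnection();
                conn.start();
                Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
                Topic topic = session.createTopic("ticks.EURUSD"); // illustrative name
                MessageConsumer consumer = session.createConsumer(topic);
                final int id = i;
                consumer.setMessageListener(msg ->
                        System.out.println("consumer " + id + ": " + msg));
            }
        }
    }

The property that matters is that the tcp:// hop from the hub carries each message once, while the fan-out to the ten consumers stays on the local vm:// transport.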

The thing I've come to realize is that for the third case, Central Distribution or Dedicated Broker, I don't want to buy or run "software" at all. I want to buy an appliance.

Why An Appliance?
I want to buy an appliance because I don't want to have to deal with the complexity of optimizing my installation (and by that I mean the entire hardware/OS/software stack) for the purposes of running a MOM broker. Let's start anecdotizing.

When my company was planning on rolling out SonicMQ across our entire infrastructure, we spent a lot of time working through the proposed deployment. That work included, but was not limited to:
  • Base Hardware Configuration. What processor? How fast? How many?
  • Failover Network Connections. For our High Availability Pairs, what network connection(s) did we want to use for the replication traffic? How precisely were they to be configured?
  • The OS and Patch Level. This got particularly hairy with some of the hardware options for the failover network connections, and we had to do a lot of patch wrangling to test out various hardware options.
  • Filesystem Layouts. For the various types of filesystem storage (WAL for persistent message storage, message database for persistent topic and queue storage), how did various options of disk partitioning, disk speed, filesystem layout and options affect performance?
  • RAM. How much? What GC settings for the JVM? Swap configuration?
  • SonicMQ Configuration. How did any of the above affect SonicMQ? Were there any things we had to tweak to get it to perform better?
How much of that do I think was really part of our jobs? None. None at all. I think it was completely wasted time. And that time costs money: I was working with our systems staff on testing and configuration, and we don't work for cookies (cookies are merely a bonus that endears vendors to us).

If Progress had sold a 1U SonicMQ Appliance, which had 4 external NICs and a pair of HA connectors, we would have bought it. Even at a premium over the software+hardware, because that premium couldn't have cost more than our time did.

Things Are Getting Worse
Now, things are starting to move really quickly on a lot of fronts that would affect a MOM appliance:
  • Storage is changing rapidly. An ideal platform here would have some flash in it somewhere. But where? A Fusion-io-style PCIe card? Sun-style ZFS log acceleration? I have no idea, and I don't want to have to know precisely how to make your software go faster with flash added. That's your job, Mr. Vendor.
  • Networks are also changing rapidly. What to use for an interconnect? 10 GbE? Infiniband? Something else? You figure it out for me.
  • Chips are opening up. Do you work best with Xeon? Opteron? Niagara? Does it matter? How fast do they have to be to saturate all network connections? I don't know. That's your job.
  • Can you go faster with a TCP offload engine? Or does it bog you down? What about custom FPGAs for routing tables? Any way to leverage GPGPUs? Again, you're the vendor; figure out the most cost-effective way for me to go as fast as possible.
  • Can you move into the kernel? If so, what parts? Remember, on a general purpose system my systems team will never allow a vendor to muck with the kernel. Once you're selling an "appliance", you can do what you want.
Just to pick on one of these that has caused me no end of woe: even today, the choice of 10GbE NICs matters. Different cards have different bandwidth potential per-socket vs. per-NIC, so this type of thing really does matter more than it did before, and I don't want to have to test it all out for you. Which card do I use for an HA interconnect? Which for my connections to the outside world? I don't want to have to make that decision. I want you to make it for me, based on what's optimal for your broker software.

Why Hasn't This Happened Yet?
I think the primary reason this hasn't yet happened for things like MOM providers is that we, customers, just plain don't trust vendors. At all.

We don't trust them because:
  • They don't cycle hardware quickly. Our SonicMQ boxes are running on dual dual-core Opterons. That's monstrous overkill, and they're mostly idle. But in general, customers don't trust vendors, particularly of second-tier appliances, to keep up with industry trends, and that costs you in overall performance. Vendors have to buy in bulk to get discounts from the chip vendors, which means they're going to push their existing stock for as long as they possibly can.
  • They monstrously overcharge. Anyone who buys storage (and a persistent MOM system is as much a storage beast as a network beast) knows this. Buy a disk drive for $X. Put it in a slide which costs $Y. Sell it for $Z, where Z ~= 5(X+Y).
    Look, we get it, you need to charge a premium. But don't sell me ordinary equipment and charge a monstrous premium over what I can get myself from my reseller, or by going directly to the OEM. Clue Phone's Ringing: We're Not Idiots.
    You can charge a premium where it's warranted. But being an "appliance" vendor is not an excuse to jack prices up to a ridiculous level just because you can. Do that, and we'll go back to asking for generic software and configuring it ourselves.
  • They're Another Hardware Vendor. Each hardware vendor we have means another stupid type of disk slide and another "certified" RAM module which is the same as your standard PC RAM. It's all extra stock we have to hold, it's all extra work our purchasing department has to do, it's all extra validation we have to perform, it's all extra overhead.
  • They Don't Manage Well. Each appliance vendor seems to think that Their Approach is the Right One. So instead of saying "we're going to support getting machine statistics the way Linux boxes do" and just deploying that, we customers end up with a myriad of different SNMP MIBs and other little tweaks, which means that for each new appliance we have to change a whole lot of our management infrastructure. Blech.
    (Anecdote Time! NetApp provide SNMP MIBs for their network adapters, but for total traffic they don't provide a 64-bit counter. Meaning that on a saturated 1Gbps network connection, a 32-bit octet counter wraps in about half a minute (2^32 bytes is only about 4.3GB; see the arithmetic just below). So unless your stat gathering (MRTG or whatever) polls very frequently indeed, you have to factor in, or simply lose track of, rollovers of the 32-bit counter. I will assume that you, as an "appliance" vendor, will have similar levels of Stupid.)
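
For the record, the arithmetic behind that claim (my back-of-the-envelope numbers, not anything from NetApp's documentation):

    public class CounterWrap {
        public static void main(String[] args) {
            long wrapAt = 1L << 32;        // a 32-bit octet counter wraps after 2^32 bytes (~4.3 GB)
            double bytesPerSec = 1e9 / 8;  // a saturated 1Gbps link moves 125 MB/s
            System.out.printf("wraps every %.1f s at line rate%n", wrapAt / bytesPerSec);
            // Prints "wraps every 34.4 s at line rate", far shorter than
            // any sane polling interval.
        }
    }
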
Solution Study: LeftHand
LeftHand Networks had an extremely interesting approach to this: they had software (SAN/iQ) that really benefited from close hardware integration, but they partnered with Tier-1 system vendors (Dell, HP) to certify the LeftHand software on those vendors' hardware. Brilliant option. They're now part of HP, but the idea was really sound. But take it a step further, and do what a lot of companies are now doing.

OEM the platform of a Tier-1 system vendor (Dell, HP, IBM, Sun). Sell us that platform as is. Don't bloody pretend it's something magical (for example, don't even bloody think of not allowing me to swap out one of your Special Magic Drives for a stock Dell/HP/IBM/Sun drive). Just sell it as an appliance with your software.

But even more than that, be quite open that what you're selling is:
  • Tier-1 Hardware, running
  • A customized Linux/OpenSolaris/*BSD operating system, with
  • Your MOM software, with
  • Perfect configuration, and
  • Special Sauce (optional).
Don't stop me from going in and poking around, in a read-only way, at what you're doing; make sure that I can replace your hardware with stock parts from the Tier-1 vendor itself; and make sure standard Linux software (like the standard SNMP MIBs, SSH, and top) is all available to me in at least some minimal way.

This covers all the hardware-vendor objections (even if you choose just one Tier-1 vendor, they all have programs for exactly this, and we all have experience with all of them), and it still allows you to optimize for the OS and hardware like mad.

The Open Source Problem
But what if you want to run things all open-source on your own hardware? Then have the project, rather than distributing an RPM, distribute a full bootable image that you load. Or have your favorite Linux/OpenSolaris distribution come up with an installer that, rather than building a general-purpose system, builds a system dedicated to running just that particular application (e.g. Anaconda, but with custom RPMs and boot images).

I think this is probably going to become a more common paradigm going forward, and the move to cloud computing will likely take things farther, as people use something like CohesiveFT or some other mechanism to grab prebuilt VM images, or instances from their favorite cloud computing provider, for infrastructure software.

The reason I don't think you're going to want a cloud computing solution to this in particular (e.g. an EC2 cluster) is that MOM done right in Central Distribution mode is pretty sensitive to network topology: it doesn't make sense to virtualize it or run it as a VMware image. It wants to be close to the hardware. Run another type of workload that way, but don't try to push tens of thousands of persistent messages per second through a VM; it's never going to work. But the same general approach has a lot of merit for other types of projects.

Conclusion
I think there's a big opportunity for someone to provide an AMQP appliance for the centralized distribution case and push it heavily. There's no reason why I should be running what essentially amounts to a persistent network appliance on generic off-the-shelf hardware and operating systems, and there's no reason at all why I should have to do your performance optimization for you.

But again, AMQP is the secret ingredient: I'll trust an appliance running custom software to provide a standard network service. I wouldn't trust an appliance running custom software speaking a custom protocol.