Wednesday, October 01, 2008

IPC: Where My Queues At, Yo?

Updated 2008-10-20: Adrian Found Where My Queues At. Unbeknownst to me, they were there (somewhat all along) in POSIX Message Queues, more in the comments. I've changed this to be part of LazyWeb, because I really didn't know that they existed and haven't actually seen anybody using them. M4D props to Adrian.

Updated 2008-10-20 #2: POSIX Message Queues are close, but not quite it. I'll be doing another post on this, but turns out POSIX Message Queues aren't quite what I think mailbox semantics are, and some differences turn out to be key. Also, they've only been around since Linux kernel 2.6.6, which would explain why I've not seen them in more widespread use.

Not that long ago, my firm was interviewing someone from the Czech Republic who had been working in Frankfurt for the Deutsche Boerse, and some interesting stuff came out of that. Not the least interesting thing was that their entire infrastructure was still based on OpenVMS, and it was working extremely well. The secret? Queues. Or, rather, Mailboxes.

What a lot of people know about VMS, aside from the fact that it's Old and Not UNIX, is that it spawned Windows NT (WNT == VMS+1). What they don't realize is that it's actually quite a reasonable operating system that's kinda gotten a bad name, partly because of the WNT thing and partly because it's Not UNIX. I want to address the thing that it really got right: queuing as an intrinsic form of IPC.

As I understood from the interviews, Deutsche Boerse uses this pretty extensively to distribute work: work is pulled from mailboxes, posted to other mailboxes, in a pretty standard modern MOM pattern. The recipients of messages may be on the same machine, or on another machine. They don't know; they don't care. Sounds like what you'd use an MOM system for, right? Only they don't use an MOM system, they use their operating system.

A Mailbox in VMS essentially forms the basis of a fully asynchronous queuing system. Applications can define mailboxes, to which you can publish messages, and the recipient will be notified at some point in the future that a message is available. And because VMS has a lot of clustering facilities, mailboxes can either be on the local machine or on another machine, and because it's all fully guaranteed delivery (within reason, of course), you have full fire-and-forget semantics.
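To make those semantics concrete, here's a rough in-process sketch in Python. It is emphatically not the VMS API: the `Mailbox` class, its `post` and `drain` methods, and the `orders` example are all names I've made up for illustration, and it covers only the fire-and-forget, notify-later, single-consumer parts. There's no persistence, no guaranteed delivery, and nothing crosses a machine boundary.

```python
import queue
import threading


class Mailbox:
    """Hypothetical in-process stand-in for a mailbox (not a real OS API)."""

    def __init__(self, on_message):
        self._queue = queue.Queue()      # unbounded, so post() never blocks
        self._on_message = on_message    # "notification" callback for the recipient
        worker = threading.Thread(target=self._deliver, daemon=True)
        worker.start()

    def post(self, message):
        """Fire and forget: enqueue the message and return immediately."""
        self._queue.put(message)

    def _deliver(self):
        # A single consumer thread, so each message is handed over once, in order.
        while True:
            message = self._queue.get()
            self._on_message(message)
            self._queue.task_done()

    def drain(self):
        """Demo helper: block until everything posted so far has been handled."""
        self._queue.join()


received = []
orders = Mailbox(received.append)
for n in range(3):
    orders.post({"order_id": n})   # the sender never waits on the recipient
orders.drain()
print(received)
```

The sender only ever calls `post` and moves on; when the recipient hears about the message is the mailbox's problem. That's the bit I'd like the operating system to own.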

Contrast all this with your UNIX forms of IPC: Pipes, Files, Signals. What do the first two have in common? They're either synchronous or near-enough in that they involve polling. Signals may be asynchronous, but they can't really carry a payload: they're just a number. These are enough to form the basis for almost all forms of common IPC, except for fully asynchronous ones. What do you do in a UNIX IPC model if you want to send a message and at some point in the future make sure that something else picks it up? Put it in a spool file? That seems pretty clunky, even today.
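A small sketch of that contrast, using nothing but Python's standard `os` and `signal` modules on a UNIX-like system (`SIGUSR1` doesn't exist on Windows). The handler name, the `seen` list, and the message text are just illustrative choices of mine.

```python
import os
import signal

seen = []


def handler(signum, frame):
    # The handler learns which signal fired, and that's all: no payload.
    seen.append(signum)


signal.signal(signal.SIGUSR1, handler)
os.kill(os.getpid(), signal.SIGUSR1)   # asynchronous, but it's just a number

# A pipe does carry data, but as a byte stream that the reader must go and fetch.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"order 42")
os.close(write_fd)
payload = os.read(read_fd, 1024)       # the reader blocks here (or has to poll)
os.close(read_fd)

print(seen, payload)
```

So you get to pick: asynchronous notification with no message, or a message with no asynchronous notification. Gluing the two together is exactly the work I'd rather not do by hand.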

Sure, you can expand the UNIX model to involve asynchronous IO, but that's really just an async veneer on top of traditional synchronous IPC. It doesn't change the fact that your basic units of system-system IPC are fundamentally synchronous. This all smells funny.

Given that there are a lot of AMQP people who read my blog apparently, where's the topicality? How about this: why is it necessary that I even have to use something like broker-based MOM when the operating system could realistically do this for me? Why is this all so resolutely in user space when operating systems were doing this decades ago? Why do I need to communicate using sockets (another system-system synchronous IPC system) to another process to do basic message-based IPC?

Why can't I take all the various implementations of this pretty basic concept (fire-and-forget, guaranteed messaging, asynchronous notification, single consumer delivery) and pull them out of their various programming-language (Scala, Erlang) and application-semantic specific models (JMS, WCF), and pull them up a level? Why can't I get this out of my operating system? Why can't we just break out of the UNIX model of decades ago?

Maybe we've accepted that the micro-kernel model of operating systems has really won. And by that, I don't mean the Mach-model of operating systems, but, rather, the model that system-level APIs are pretty much fixed in stone by history, and won't be expanded (yes, they might be expanded in the particulars, but they're not going to be significantly expanded in the general concepts that people are willing to defer to the kernel to do). People do everything they have to in user-space.

But something to me is wanting. I think this really simple form of IPC, which is so insanely beautiful that people are willing to program in actor-based concurrency systems like Erlang just to get it, really should be given a second thought. Maybe not in kernel space, but definitely as a standard that you can assume will be around on any system on which you program.

And the interviewee? A (hopefully) happy employee of my firm. We hire quality when we can find it.