Monday, October 20, 2008

POSIX Message Queues: Useful, but Limited

Since my previous post on looking for VMS-style Mailbox semantics in more modern operating systems, I've done some research into POSIX Message Queues, prompted by a comment on that entry. I think they're pretty darn cool, but they seem to be targeting a different audience than the one I had in mind.

It also appears that the reason I've not really seen much (any?) actual code using POSIX or SysV message queues is that support, at least for POSIX Message Queues, is relatively recent in Linux kernel terms (2.6.6). They haven't been around long enough as a system primitive to work their way into much other software.

Disclaimer: It's entirely possible that at this point I'm largely inventing what I'd love from an internal queuing system, and it's quite probable that VMS Mailboxes don't do any of this. From here on, assume this is Kirk's Dream Land rather than "VMS Rulez".

Message Size
The default Linux implementation has the maximum message size set to 8192 bytes. That's not a lot of data, to be honest, although it would be more than enough for a bridge between Erlang processes for most data. Unfortunately, this is a kernel option, so to increase it you need to muck with your runtime kernel parameters (/proc/sys/fs/mqueue/msgsize_max).

Maximum Queue Size
By default, a queue holds at most 10 messages before it starts to block the sender. Again, this is kernel configurable (/proc/sys/fs/mqueue/msg_max), but 10 messages is pretty darn small.
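To check what a particular machine is actually configured with, the two files mentioned above can simply be read. Here is a minimal sketch in Python; the function name `read_mqueue_limits` and its `base` parameter are my own inventions for illustration, and the files only exist on Linux kernels built with POSIX message queue support, so the sketch reports `None` when a value can't be read.

```python
from pathlib import Path

# Directory named in the post; only present on Linux kernels built with
# POSIX message queue support.
MQUEUE_SYSCTL_DIR = Path("/proc/sys/fs/mqueue")


def read_mqueue_limits(base=MQUEUE_SYSCTL_DIR):
    """Return {name: int or None} for the two limits discussed above.

    `base` is a hypothetical parameter so the function can be pointed at
    another directory (for example, in a test).
    """
    limits = {}
    for name in ("msg_max", "msgsize_max"):
        try:
            limits[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # Missing file, unreadable file, or unexpected contents.
            limits[name] = None
    return limits


if __name__ == "__main__":
    for name, value in read_mqueue_limits().items():
        shown = value if value is not None else "unavailable"
        print(f"{name}: {shown}")
```

Raising either limit means writing a new value to the same file as root, which is exactly the "muck with your runtime kernel parameters" step.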

I think these two combine to give a pretty clear indication of the type of message semantics they're going for: very fast, small messages. I could see a pretty clear analogue to most tick data in a financial services environment, or audio packets, or Erlang tuples. But this is really clearly designed for super-latency-critical scenarios, and the 10 message maximum is surprisingly small.

In doing some research into this, I came across this comment on a KernelTrap post, which I think gives some pretty clear insight into what's going on here: this is designed for ultra-fast communication between processes (trying to keep everything inside the same core and L1 cache if at all possible, I would presume).

Persistence
This all makes sense once you realize that POSIX message queues aren't persistent at all. They're like a priority-queue channel between currently running processes on the same machine, and they don't appear to survive machine restarts in any way. This is great as a low-level IPC mechanism, but doesn't really seem to apply to more failsafe, long-running messaging scenarios.

When the kernel bounces, all messages and queues disappear.
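To make the "priority-queue channel" idea concrete, here is a small in-memory model in Python. It is not the POSIX API and not how the kernel implements it; the class and exception names are hypothetical. It only mimics the semantics as I understand them: a bounded number of messages, highest priority delivered first, and oldest first within the same priority. Where a real blocking sender or receiver would wait, the model raises an exception instead.

```python
import heapq
import itertools


class QueueFull(Exception):
    """Raised where a real blocking sender would wait."""


class QueueEmpty(Exception):
    """Raised where a real blocking receiver would wait."""


class BoundedPriorityQueue:
    """In-memory model of the delivery order; not a real POSIX queue."""

    def __init__(self, max_messages=10):
        self.max_messages = max_messages  # mirrors the default of 10
        self._heap = []
        self._arrival = itertools.count()

    def send(self, payload, priority=0):
        if len(self._heap) >= self.max_messages:
            raise QueueFull("queue already holds max_messages messages")
        # Negate the priority so the largest value is popped first; the
        # arrival counter keeps first-in, first-out order within a priority.
        heapq.heappush(self._heap, (-priority, next(self._arrival), payload))

    def receive(self):
        if not self._heap:
            raise QueueEmpty("no messages waiting")
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    q = BoundedPriorityQueue()
    q.send(b"tick 1", priority=0)
    q.send(b"urgent", priority=5)
    q.send(b"tick 2", priority=0)
    print(q.receive(), q.receive(), q.receive())
```

Like the real thing, the model keeps everything in memory only: once the object is gone, so are the messages.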

Initial Reaction
After all this, it appears to me that POSIX Message Queues are really intended for IPC at a very low level of interaction within a particular system, rather than for longer-running processes with more data. I'd love to see them used as an interaction channel around which a disk-based queue system could be built, though.

For that reason, I probably wouldn't attempt to bridge them using AMQP: it would break the semantic model of super-low-latency messaging far too much to try to bridge them over anything slower/higher-latency than Infiniband (though that would be an excellent non-MPI way to use IB, particularly if you have RDMA for the message contents). A persistent queue implementation using disk leveraging POSIX Message Queues for control? That would probably be a much better candidate for AMQP bridging to me.