Friday, September 19, 2008

Solaris 10: A Terrible Choice For Java Continuous Integration

As some of you might know, I'm not permitted to run Linux at work. Rather, I have my choice of Windows or Solaris 10 x86. (And no, I'm not allowed to run OpenSolaris; it's boring old Sol 10.) I do a lot of Java. I love Continuous Integration. I started pushing Bamboo here in house, and lo and behold, I converted the masses, and we ended up merging projects from CruiseControl.NET, CruiseControl, and Hudson all onto one lovely Bamboo instance. We ended up with 110 build plans in Bamboo, all but about 10 of which (dependent and overnight plans) were hitting Perforce all the time to determine whether they should build.

And then things went truly, truly, horribly wrong, to the point of my almost rescinding my Atlassian Shill status. But it turns out that it's only partially their fault. It's actually Sun's fault. Follow me here down a joyful path of process-forking details.

What a Java-based Continuous Integration server does when dealing with Perforce (since Chris won't allow them to open up the protocol or provide a usable Java interface) is:
  1. Check to see if there have been any changes in Perforce
  2. If there have, sync code and run a build
  3. Go to 1
For step #1, you've got two choices. You can either say "give me all changes, and I'll figure out whether each change applies to this build plan," or you can say "give me any changes that apply to each of my build plans." The former involves far fewer Perforce interactions, but the latter is directly supported by Perforce itself, which makes it easy to implement, and thus, I would argue, more correct, since Perforce already has a lot of logic about exactly this type of operation (I want a CI system, not another SCM system).
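
Concretely, the per-plan check boils down to asking Perforce for the most recent change under the plan's view and comparing its change number against the last one you built; something like this (the depot path is hypothetical):

    p4 changes -m 1 //depot/project/...

One invocation like that per plan, per polling interval, and with 100-odd plans it adds up fast.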

Here's where the whole thing turns to Fail.

When we started adding more and more projects to our Bamboo installation, it ran more and more slowly, to the point where, on a 4-core box (dual dual-core Opterons), it was sustaining a load average of 7, with up to 80% of CPU time spent in the kernel. That load was so extreme that all sorts of things started going wrong, completely mysteriously.

Because of the whole Java+P4 issue, you have to shell out and run the p4 binary to interact with the Perforce server. Under Unix implementations of the JVM, that involves essentially a fork + exec (under Windows it does not, since CreateProcess builds a fresh process rather than cloning the parent, which is arguably superior behavior here).
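
For the curious, here's a minimal C sketch of what a Runtime.exec() of p4 effectively amounts to on Solaris (stream plumbing omitted; the p4 arguments are hypothetical):

    /* fork+exec, roughly the way a Unix JVM launches p4 */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();   /* duplicates the whole parent; swap gets reserved here */
        if (pid == -1) {
            perror("fork");   /* fails if Solaris can't reserve the swap */
            return 1;
        }
        if (pid == 0) {
            /* child: throw the copied image away immediately and become p4 */
            execlp("p4", "p4", "changes", "-m", "1", "//depot/project/...", (char *)0);
            _exit(127);       /* only reached if exec failed */
        }
        int status;
        waitpid(pid, &status, 0);
        return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
    }

Note the comment on that first line of main: the expensive part happens before p4 even starts. Here's where things get sticky.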

Working with Atlassian, we figured out that, in part, this was because Bamboo was being too aggressive in hitting Perforce and not aggressive enough in caching things that seldom or never change. Moreover, it was the cost of actually performing the fork, far more than the cost of the subprocess itself, that was killing you. Bamboo 2.1.2 (to be released on Tuesday) resolves this, so checking whether there are changes goes from three p4 invocations to one. A factor of three, which in their artificial test suite results in a 76% performance improvement.

Now, the one thing that ties everything together is that the parent process here is a JVM. But not just any JVM: a JVM tuned for running big web applications with a lot of users (the same Tomcat instance hosted our JIRA instance as well). And what's the generic rule of thumb for sizing a JVM for a Servlet container? More heap (-Xmx cranked way up). More heap, more heap, more heap.

More heap, more betta, right?

Wrong.

Turns out that JVMs on Solaris 10 blow (funny that, given that they're both from the same company; you might think Sun would have an interest in making sure Java ran best on Solaris, but whatever).

Although a traditional fork has a copy-on-write optimization to avoid actually duplicating memory that the child is just going to abandon at exec time, Solaris, in order to comply with the POSIX semantics of fork, still has to reserve enough swap space to back the entire address space of the forked child. You won't see it in top, because that swap isn't actually in use, just reserved: nobody else can have it, but nothing is using it. Great.

This is terrible, and is precisely why things like clone() were invented on Linux and used to great success. My best recollection (can someone clue me in here, Lazyweb?) is that this is what Runtime.exec() and ProcessBuilder.start() build on in Linux-based JVMs. Solaris traditionally didn't have such a beast, so you're stuck with old POSIX fork() behavior. Which sucks for this.

So what Bamboo was doing was forking a 2GB-heap JVM three times for every Perforce interaction: three separate ~2GB swap reservations per poll, which is really, really, really bad.

When one of my colleagues ran a fork test here to see how fast he could do a fork+exec in C with a 2GB memory allocation (forking and then exec'ing a really small subprocess) on the same hardware and OS configuration, it turned out you're limited to about 3 per second, and that alone consumes about 50% of the machine's total CPU, all of it in the kernel. Given that Bamboo was attempting to do the same thing in multiple threads, the fact that it was hitting 80% kernel utilization makes complete sense.
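
I didn't keep his code, but a reconstruction would look something like this (the allocation size and iteration count are my guesses; the mechanism is the point):

    /* crude fork+exec benchmark with a ~2GB resident allocation */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <time.h>
    #include <unistd.h>

    int main(void) {
        size_t sz = 2UL * 1024 * 1024 * 1024;    /* ~2GB, like a -Xmx2g heap */
        char *heap = malloc(sz);
        if (heap == NULL) { perror("malloc"); return 1; }
        memset(heap, 1, sz);                     /* touch every page so it's really ours */

        int iterations = 30;
        time_t start = time(NULL);
        for (int i = 0; i < iterations; i++) {
            pid_t pid = fork();                  /* ~2GB of swap reserved each time around */
            if (pid == -1) { perror("fork"); return 1; }
            if (pid == 0) {
                execl("/bin/true", "true", (char *)0);
                _exit(127);
            }
            waitpid(pid, NULL, 0);
        }
        printf("%d fork+execs in %ld seconds\n",
               iterations, (long)(time(NULL) - start));
        return 0;
    }

Run something like that on a Solaris 10 box and watch mpstat while it goes; the system time is where your CPU went.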

Hence, when I reduced the heap to a mere 512MB (-Xmx512m), Bamboo actually ran faster. Much faster. Back into the realm of usable, even with my 110 plans. Once 2.1.2 comes out we should be really rolling, and we might be able to achieve our ultimate goal, which is to have every single build in our group managed from one metadata server with build agents scattered to the winds. And that would be sweet.

Oh, and don't think Sun doesn't know about this. It turns out that in Solaris 10 they added a library call specifically to solve this problem. It's called posix_spawn(3C), and it does precisely what you'd want: it creates the child without duplicating (or reserving swap for) the parent's address space.
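
Using it is no harder than fork+exec; a sketch (same hypothetical p4 invocation as before):

    /* posix_spawn: launch p4 without cloning the 2GB parent */
    #include <spawn.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    extern char **environ;

    int main(void) {
        char *argv[] = { "p4", "changes", "-m", "1", "//depot/project/...", NULL };
        pid_t pid;
        int err = posix_spawnp(&pid, "p4", NULL, NULL, argv, environ);
        if (err != 0) {
            fprintf(stderr, "posix_spawnp: %s\n", strerror(err));
            return 1;
        }
        int status;
        waitpid(pid, &status, 0);   /* no giant swap reservation happened here */
        return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
    }

And yet Sun hasn't changed their JVM on Solaris to use it, probably because Sun, like everybody else in Solaris land, targets Solaris 8 for all those people who refuse to upgrade.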

And I think that says a lot about the Solaris community.