Wednesday, August 25, 2010

Java Initialization Barrier Pattern With AtomicBoolean

I've found myself starting to use this mini-pattern. You might find it useful.

private final AtomicBoolean _hasBeenInitialized = new AtomicBoolean(false);
public void expensiveInitialization() {
  if (_hasBeenInitialized.getAndSet(true)) {
    // Someone else has already done the initialization,
    // or is currently doing it.
    return;
  }
  // Do the initialization that I want to be done only once.
}

Much simpler than any of the other mutual exclusion patterns that I've found myself using.

The caveat here is that with multiple threads, a thread that got to the party late may return to the caller before the initialization is done, because another thread is still doing it. This pattern therefore isn't suitable in a multi-threaded environment where the caller needs a guarantee that initialization has completed before continuing.
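If callers do need that guarantee, one option is to pair the AtomicBoolean with a java.util.concurrent.CountDownLatch, so that latecomers block until the winning thread finishes. This is only a minimal sketch: the class name BlockingInitializer, the field names, and the counter are hypothetical and exist purely for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of the pattern: latecomers wait instead of returning early.
class BlockingInitializer {
  private final AtomicBoolean _hasBeenClaimed = new AtomicBoolean(false);
  private final CountDownLatch _initializationDone = new CountDownLatch(1);
  // Counts how often the initialization body ran, only so the sketch can be checked.
  final AtomicInteger _initializationCount = new AtomicInteger();

  public void expensiveInitialization() throws InterruptedException {
    if (_hasBeenClaimed.getAndSet(true)) {
      // Someone else has claimed the work; block until they have finished it.
      _initializationDone.await();
      return;
    }
    try {
      // Do the initialization that I want to be done only once.
      _initializationCount.incrementAndGet();
    } finally {
      // Release any waiters even if initialization throws, so nobody hangs forever.
      _initializationDone.countDown();
    }
  }
}
```

This gives up some of the simplicity that made the original attractive. It also has sharp edges of its own: if initialization throws, the waiters are released without the work having been done, and a re-entrant call from inside the initialization body would wait on itself.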

I primarily use it where there's an initialization method that many parts of the code are going to call as a defensive measure before continuing. Think of it as a simple solution to the initializeIfYouHaventBeenButDoNothingOtherwise problem.

Of course, it could well be that everybody else in the world is already doing single-initialization this way, and I'm too addled from being outside the day-to-day coding world to have caught up.

UPDATE 2010-08-25 : I changed the name of the control AtomicBoolean to make it clearer that this is a multiple-execution-over-time pattern, and not a general purpose synchronization barrier.

Wednesday, August 18, 2010

Code in an expert programming language

Provided for your pleasure, anonymously, from an Expert Programming Language. Scala fans, you're about 2 steps removed from this.

redacted: {[str]
   map: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
   pad: #str _ss "="
   var1: 2 _vs/: map?/:str@&64>map?/:str
   var1[-1+#var1]: 6#(*|var1),6#0
   var1: ,/(-6#/:(6#0),/:var1),(pad*6)#0
   : _ci 2 _sv/: -1 8#((#var1) - pad * 8)#var1
}

Bonus prizes if:

  • You can recognize the hideous abomination of a language
  • You can figure out the common algorithm it's doing. Hint: What should redacted be named?

Tuesday, August 17, 2010

I Want A New Programming Language

Preface: This has nothing to do with the Oracle/Google Java lawsuit. Read this first.

Dear Lazyweb and Programming Language Inventors:

I want a new programming language. Although I seldom code these days for OpenGamma, I've wanted a new programming language for quite some time. I don't want an extreme language (in syntax or constraints); I don't want a purely experimental language; I don't want a faddish language. What I want is what Stephen Colebourne calls a "journeyman language."

What Is A Journeyman Language

Quite simply, a journeyman language is a programming language designed for journeyman programmers. And those guys are the hundreds of thousands of men and women working on business applications and systems programming every day.

Although there is wide variation in the quality of journeyman programmers, in general, very few of them are at the infamous Top 10% Of All Programmers level. But that's actually okay, and self-selecting: the rockstar programmers simply wouldn't do the jobs that the journeyman programmer has to do every day; the job is too boring and lacks enough challenge and intellectual satisfaction/achievement. I am familiar with firms in the financial services space that explicitly don't even interview high achievers because the employer knows the employee would hate the job and leave.

So what does a Journeyman Programming Language need? It needs to have a few general characteristics:

  • It has to be simple enough in syntax and conceptual framework that people not in the top 10% of the profession can feel comfortable working with it.
  • It has to be flexible enough to cover the amount of back-end, rich GUI, web app, and systems programming that goes on in the industry.
  • It has to make simple, common programming errors (memory allocation, array bounds, etc.) difficult or impossible.
  • It has to make it easy for Journeyman Programmers to change projects or jobs on a regular basis.

But it's a mistake to think that a completely dumbed down language can appeal and make a lasting impact in this space. Journeyman Programmers aren't idiots or morons: they often are just as good as rockstars, just not as passionate. That means that they're not investigating programming language features and Open Source libraries in their spare time, they're not going to meetups and blogging and tweeting and everything else. Raw talent often, passion seldom.

The other reason it's a mistake to dumb down the Journeyman Programming Language is that in any sufficiently large firm or project, there is often at least one rockstar programmer. And he needs to be comfortable with the tools that he uses to set the architecture and framework for other developers. Give him BASIC and he'll recoil in horror.

So here's my test: if you could write a reasonably high-performance RDBMS in the language, it has enough features. If you couldn't, it's not good enough. I like this particular test because I've done it several times, and also because there are a bunch of fiddly things, to do with getting primitives in and out of large byte blocks, that languages like Java are particularly terrible at for no good reason that I can see (which is why LucidDB does query compilation in Java and execution in C++). An RDBMS involves everything you need: functions in the scalar expression system; objects in the query validator, compiler, and optimizer; efficient memory work in the executor. You do all of those in a simple language and system, and you've got my vote.

Positive Features

These are all characteristics that New Programming Language should have.

C-Style Syntax
Like many developers, I was born and raised on C lineage programming languages: C, C++, Java, C#. I've dabbled in many other programming languages (Pascal, Perl, Python, Scheme, Assembler), but nothing to me has the simplicity of syntactical expression that the C lineage of languages has. Let's keep that.
Garbage Collection
Freeing the developer from memory allocation duties has probably done more to get more people writing better-quality code than anything else that has come out of programming language and environment development in the last 40 years.
Unambiguous Syntax
This is both a feature and an anti-feature. I want the language to be unambiguous, so that I can look at a chunk of code and, with experience with the language, know what it does. That means that any type of DSL service needs to be confined to creating DSLs for other files, not intermingling DSL syntax with New Programming Language syntax.
Objects and Functions
I want an object oriented language, but one which recognizes that I will often have things that are better expressed with functional blocks. The Java approach of hanging static methods on a final class with a private constructor and import static is ridiculous and everybody knows it.
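As a rough illustration of the Java workaround being complained about here (the class name, method name, and the com.example package in the comment are all made up):

```java
// Hypothetical utility class: Java has no free-standing functions, so they
// end up as static methods on a class that is never meant to be instantiated.
final class MathFunctions {
  private MathFunctions() {
    // Private constructor: prevents anyone outside this class from creating an instance.
  }

  static double square(double x) {
    return x * x;
  }
}

// A caller in another file would then pull the "function" in by name, e.g.
// (assuming MathFunctions lived in a hypothetical package com.example):
//
//   import static com.example.MathFunctions.square;
//   ...
//   double area = square(3.0);
```

Three pieces of ceremony (final, a private constructor, import static) just to call one function.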
Closures
I want closures with my functions. With a nice syntax. That doesn't look like line noise. I won't want to use them for everything, but I want them to be present.
Useful Concurrency System
I want a number of low-level concurrency systems, as well as convenience operations for actor-style concurrency.
Object Immutability Services
I want some facility to mark that an object instance, or a particular call on it, will not and can not mutate the state. In other words, I want a const style system that actually works. Doesn't have to replicate the exact nature of the const keyword in C++, but just give me something that will lock down an object instance and thus allow it to be shared in a thread-safe environment without limiting me to constructor-only value injection.
Properties
I want C#-style properties. And I want a serialization system that allows me to cleanly map transport-representation objects to hierarchical data representations like Avro and FudgeMsg and JSON and XML (and will bork if I create a DAG or cyclical graph rather than a tree). Not all objects of course, but just ones that I'm going to use for transport. And I want the ability to do fast metadata-style operations on properties (think: for (Property prop : anyObject) { doStuff(prop.getName(), prop.getValue()); }). The way JSON flows for free out of JavaScript is an excellent example. Encapsulation? Pshaw. Journeyman apps are about data as much as objects, and everybody bloody well knows it at this point.
Real Generics
I want Real Generics. With runtime type information. That allow partial specialization.
VM Operation
It better run cleanly on any major OS and hardware combination. And having a VM gives me a lot of runtime management facilities for back-end processes built in (think jconsole or jvisualvm for the JVM). In particular, I'd probably like the JVM if you can shoehorn much of this into the JVM. If not, please look into making the bytecodes register-based like LLVM (even if the binary bytecode format ends up having both stack and register implementations potentially).
Stack Allocation/Memory Packing
If I really know the lifecycle of an object, and I want to bunch several similar objects together to exploit cache line efficiency, don't get in my way. Don't make it the default, but support it.
Fast Compilation and Runtime Linking
In my mind, these are one and the same thing, because each requires and mutually enables the other. The programming language needs to be able to support compile-on-save functionality (a la Eclipse) and complete runtime linkage (i.e. classloading/modules/whatever). These are massive productivity wins for a journeyman programming language.
Convenient Native Code Integration
While the core of the language should be based on bytecodes, sometimes for performance I really want/need to be able to go down to low-level coding in C. Please make that easy (JNI, for example, is horrific). Even better, if I could mark something as "this is in New Programming Language, but if there are native versions of this method/function about, use those instead" that would solve many of the JNI-style problems.
Modules
I like OSGi. I would actually use something like OSGi if it wasn't, well, OSGi. It would have to be baked into the language and compiler stack to be useful.
Partial Classes
I personally love this for combining code generation with hand-written code; C# nailed it again. And Journeyman projects tend to have a lot of places that code generation can/does assist with, particularly given the amount of XML and RDBMS and rich GUI work that goes on in large organizations.
Traits/Shared Code Blocks
It's a great feature to have a block of code that I can mix in without all the standard multiple inheritance issues coming in. At compile time, of course.
Random Stuff
While I'm at it, a bunch of random stuff I'd like.
  • Duck Typing
  • switch on anything. Strings, integers, enums, whatever.
  • Fixed precision decimals as a first-class type.
  • Static typing, with var lvalue determination.
  • Matching between module/type/class/whatever name and file name, to aid in automatic refactorings.
  • A good standard library, like the Python one, that gives me both a least-common-denominator and a greatest-specific-feature ability to access the OS's services.

Negative/Indifferent Features

These are all features that I don't care about, or actively want not in the language.

Reuse of existing libraries
If you're targeting an existing VM like the JVM, I really don't care if I can use all the same libraries. Would be nice, but not really necessary. To be honest, the only ones I'd care about are integration systems and libraries like JDBC drivers.
Operator Overloading/Invention
Remember what I said about understandable code? Don't make it easy/possible to create a horrific mess of a language that looks like APL. The only time you ever want operator overloading for the right reasons is for the [] characters. Gosling was right on this one.
Monads
Gack. Yes, I'm very impressed that you found a way to have side effects in your pure functional programming language. No, it's not actually that useful in practice. Imperative programming languages can easily incorporate functional language features, and should do, without incorporating all the baggage that made your Functional Languages Professor at school giggle with delight.
Obsessive Terseness
Verbosity, when applied correctly, makes unfamiliar code easy to understand by someone who wasn't the author. An obsession with achieving the tersest possible language, or smallest possible number of syntactic features, makes code harder to understand. Remember, I'm talking about journeyman programmers here.
Pointers
No, the language can't expose a bloody pointer. There are only a few valid uses for them these days, and I'd hope that they're handled through other facilities. If you really really REALLY need one, just drop down to C and thrash with the system to your heart's content.
Abuse of your compiler
No lambda calculus in the generics system, please. No template metaprogramming, no hiding everything behind private typedefs, let's keep it simple.
Faddish Language Features
XML in my programming language? Yeah, in 20 years that'll totally seem relevant. Keep it in the libraries, thank you very much. Same thing with HTML, CSS, whatever.
Checked Exceptions
File this one under Seemed Like A Good Idea At The Time.
Throw Your Mom
Very clever that C++ allows you to throw anything. Null? A string? Your mom? No, thank you. I'll limit it to proper exceptions, thank you very much.

Why I Don't Have It

So why don't I have this language yet? Well, partially because programming language craftsmanship is hard. I'm pretty sure I'm not good enough to do it, which is usually my default criterion for saying something is Really Hard.

But I also think the k3wl languages coming out are driven by the language requirements of the Top 10% crowd. They're the ones good enough to actually write the languages, and they're going to write a language that makes them happy. But then you end up with Scala, and then you end up with this monstrosity, and then you make me cry. A language in which that thing is even possible will never be a candidate as a Journeyman Programming Language.

You know who's going to do it? Someone like Gosling, who set out to meet the needs of the journeyman programmer with Java. But the state of the art has moved on, and Java just isn't suitable anymore.

Who I would really like to do it is Anders Hejlsberg. I am a very big fan of C#-the-Language. It's just that .Net-the-Ecosystem is so Microsoft-specific and horrific it'll never catch on in the wider world, no matter what Miguel de Icaza thinks.

So how's about this:

  1. IBM, please hire Anders Hejlsberg away from Microsoft. You know the Oracle/Google suit is scaring the crap out of you right now given how much you've invested in Java. It's not the suit itself, but the sign that Oracle, a major competitor to you, is going to leverage whatever muscle it can around Java against you eventually.
  2. IBM, please let Anders build this, which I'll call C-Prime, with smart people from the Java and LLVM communities, who all have a lot to add here.
  3. Open Source friendly licensing abounds, and the runtime works on a whole lot of interesting platforms. And if you want to pull a Larry and not support the Solaris port yourself, we'd all totally understand. You support Linux, Windows, and Mac, and 99% of developers are happy.
  4. Developers rejoice as they have the New Programming Language.
  5. Kirk is happy.

Oh, yeah, and Microsoft? If you could have broken the near-pathological obsession with platform lock-in that surrounds all your interesting technology, you would have had a really good shot with the CLR. C# could have been a contender. But at this point, your organization is so broken internally, and your reputation with the types of journeymen who work at large organizations is so tainted, that nothing you produce will get traction. Which is why I want Anders to leave you.

A little revolution every now and then isn't a bad thing. And at this point, I think it's time. Java-the-language will never advance in a standard way going forward; the collapse of the JSR has seen to that. We, as a community that has worked on Java, need to move forward and onto the next language designed for the types of people who currently code in Java.

The Oracle/Google Java/Android suit and a forthcoming blog post

For those of you who aren't aware, I don't just spit out blog posts stream-of-consciousness. I mean, it might seem that way based on my terrible writing style, but I actually work at this stuff.

Many of you will know that I'm a long-time Java developer. I've professionally done a lot of other stuff, but the vast majority of my experience has been in Java. I like the ecosystem, the toolchains, the JVM; I find it a productive environment, and OpenGamma's software is predominantly written in Java.

There's a blog post that I've been working on mentally for months, and in text form for about 2 weeks. I'm planning on publishing it tomorrow. It has nothing to do with the ORCL/GOOG suit. Nothing at all. I've felt the things in the post for ages, well before Sun imploded, well before Oracle bought them.

For various reasons, I'm not going to say what I think about the suit itself. If you want to know, read Charles Nutter's analysis of the suit. Also read Stephen Colebourne's analysis. Both of these will make you smarter. Me? I have nothing to add.

When my post comes out, though, I wanted a vehicle to point all this out and a simple link in case people thought I was talking about the ORCL/GOOG situation. It's messy and complicated and wrapped in ego and profit and law and policy, and Charles and Stephen put the points across better than I could.