Tuesday, March 23, 2010

Looking For a WordPress Theme Developer

In case you hadn't noticed, the OpenGamma web site doesn't really have a lot of content on it. We intend to change that.

We've got content broadly ready to go.

We've got a designer working on HTML templates for the content.

We're looking for a freelance designer to convert everything to a WordPress theme for a (mostly static) site that will definitely include a blog.

Nope, you don't need to be in London. You could be virtually anywhere in the world. Send an email to jobs or info or even kirk, all at opengamma.com.

Sunday, March 21, 2010

My MP's View On The Digital Economy Bill

Using the tools provided by 38 Degrees, I contacted my MP to urge proper debate on the Digital Economy Bill facing the current parliament here in the UK.

Apparently I wasn't the only one, as his office had a full form response prepared and ready to go. Here's what he sent.

Full Text:

Thank you for contacting me about the Digital Economy Bill.

For nearly twelve years, the Government has neglected this crucial area of our economy. We believe a huge amount needs to be done to give the UK a modern regulatory environment for the digital and creative industries. Whilst we welcome aspects of the Bill, there are other areas of great concern to us.

We want to make sure that Britain has the most favourable intellectual framework in the world for innovators, digital content creators and high tech businesses. We recognise the need to tackle digital piracy and make it possible for people to buy and sell digital intellectual property online. However, it is vital that any anti-piracy measures promote new business models rather than holding innovation back. This must not be about propping up existing business models but creating an environment that allows new ones to develop. That is why we were opposed to the original Clause 17 and are still opposed to Clause 29, which props up ITV regional news with License-Fee-payer's money.

The Government's failure to introduce the bill until the eleventh hour of this Parliament has given rise to considerable concern that we no longer have the time to scrutinise many controversial measures it contains. We believe they should be debated in the House of Commons, and only if we are confident that they have been given the scrutiny that they deserve will we support them. My colleagues in the Shadow Culture, Media and Sport and Shadow Business, Innovation and Skills teams will do everything in their power to work towards legislation that strengthens our digital sector and provides the security that our businesses and consumers need.

Once again, thank you for taking the time to write to me.

Thursday, March 18, 2010

OpenGamma is Looking for a Browser-Based Software Engineer

We've already put this up on our jobs page, but I wanted to highlight the job posting to my readers.

OpenGamma is now looking for someone to build up our browser-based software engineering efforts. We've got a super-strong set of server-side software engineers who are well versed in building the back-ends of applications (and delivering data to front-ends in browser-friendly ways). We've got people who are very familiar with extending well-defined applications to support new functionality. What we don't have is someone who lives and breathes the browser.

That's where you'd come in.

We want someone who can come in, and make sure that we present the platform that we're building in the best way possible to end users, delivered through browser-based mechanisms. You'd get a clean slate to work with, and the chance to work against what we know will be the limits of what browsers can do. And at no point ever will we ask you to support IE 6.

While we're building financial technology, we don't think you need to know a single thing about the financial services industry to take on this role; in fact, we think our ideal candidate isn't coming from finance at all (judging by the quality of web applications we've all seen in finance). Anything you need to know you'll pick up on pretty darn quickly.

We'd prefer someone local to London (our new offices are in the Bankside area, with views of the Tate Modern). If you're based outside the M25 and need accommodation for telecommuting, we don't need you in the office every single day.

We're a well-funded startup, we're building technology that has the potential to disrupt an entire industry, we have exceptional people to work with, we have a no-bullshit stance on bureaucracy, and we all have an equity stake in the firm.

Take a look at the more comprehensive spec on our web site, and if you think you or someone you know would be perfect, contact us (jobs at opengamma dot com).

Recruitment Agencies: We are not open to unsolicited profiles or CVs for this role without an existing MSA signed by OpenGamma.

Monday, March 15, 2010

QCon London 2010

The second half of last week the entire OpenGamma team (with the exception of our Head Quant) attended QCon London 2010. Last year I was only able to attend one day (the one in which I was presenting), and it was great to see the whole conference rather than just one part of it.

QCon to me is a great experience, because it's a technology conference designed for technologists who already get it rather than as an excuse to spend their company's training budget on a jolly to get out of the office for a few days. As such, the needs and desires of the attendees factor way more into the makeup of the conference than the needs of the corporate shills who provide much of the content for what passes for a "conference" these days. You get to see trends going on in near-real-time by attending (both through the presentations and the comments and questions), making it nearly invaluable to any type of software professional.

General Comments on QCon 2010

Here are some generic comments on the conference this year.

What Happened To The Finance Track?

I've been quite vocal about this, and I'll be even more so: where in the hell did the Financial IT track go? Looking around the QCon London crowd, at least 25% of the attendees worked in finance. Every single presentation with even a hint of financial content was Standing Room Only. And yet the organizers nixed the Finance track this year. What were they smoking?

Seriously, bring it back. It reflects the target crowd for the target city in a way that no other track does, and is one of the things that used to make QCon London special and location specific. And, apparently, in 2009 it was the only track not to receive a single red card for any presentation. So it's not like the audience didn't appreciate it.

Speakers: Know Your Audience

There were three types of presentations that I didn't feel went over particularly well:
  • Preaching to the Converted: The first keynote was the definition of trying to preach to the converted, and there were other presentations that had the same problem. You want to talk about how great agile development methodologies are? You want to talk about how great it is to continuously test? You want to talk about how we should be concerned with performance? It's over. QCon audiences already know that and already believe. And unless you're in a particular track, rah-rah talks don't go over well. We already get that stuff, so don't try to "sell us" stuff we've already bought.
  • Over-Specific Talks: A QCon audience is very very diverse. In the same crowd you might have people from games, online betting, finance, and consumer internet. We might develop in Java, C#, C++, Flex, JavaScript, Ruby, and Python. Pitching a general abstract but not showing any details outside of one domain won't work. If you're going to be a specialist presentation just say it, because saying that you're general and then talking exclusively about one technology won't work.
  • Targeting General Developers: At least one presentation I went to was quite obviously a canned talk written for generic in-house developers who don't work with advanced technologies as a regular part of their job. While talking to a crowd of Visual Basic programmers might require extensive background in modern software techniques, the QCon crowd doesn't. You just lose the crowd by wading through background the audience doesn't need. We want meat on the bones.

Big Trends

All that being said, I think that there were several trends that really bubbled up to the foreground this year that we'll all be hearing about over the next year.

CAP Theorem

Eric Brewer's CAP Theorem states that any distributed system faces a tradeoff between Consistency, Availability, and Partition tolerance: you can have at most two of the three.

It's a familiar scenario for developers who know the mantra of Features, Quality, Deadline, Resources (pick two). Software and systems engineering is a game of tradeoffs in desires and resources, and you have to have variable inputs to the system. The CAP theorem simply applies the general notion to the particular problem of distributed systems (which these days means all systems, as the days of 2-tier or 1-tier systems are long gone).

So why did I pick this as a Big Trend of QCon 2010? Because it came up over and over again:

  • All the NoSQL databases are designed around the requirements of the CAP Theorem and how it plays out in web systems;
  • Any distributed system designed for scale and availability has to be aware of the particulars of the CAP Theorem;
  • Performance and availability testing has to be aware of how the particular system has been optimized for the constraints of the CAP Theorem.

In essence, this one theorem (originally put forward in 2000) ended up appearing in at least 6 presentations I went to. That's really big.

Imagine that for the first eight years after Binary Search was invented nobody really used it or talked about it. Then the next year all of a sudden you started to see a whole bunch of innovation around searching ordered collections of stuff. Then the next year all anybody could talk about was how binary searching of stuff was going to change the industry. It's a tipping point thing, and we've hit it.

So my #1 trend from QCon London 2010 is the widespread knowledge and comprehension of Brewer's CAP Theorem driving systems architecture and development. Whether it's the adoption of NoSQL technologies, or just better development of distributed systems, people now get it and there's no turning back.

Performance Testing

The needs of any modern system to consider performance as a first-level requirement are pretty well understood by a QCon-style audience. But the presentations I attended had quite a few anecdotes that show how far this need is being addressed by technology-forward audiences:
  • An online games company that spent a full man-year up-front on developing a flexible performance testing framework before they were anywhere near feature complete.
  • A trading system that was optimizing cache hits in their software written in Java before they launched.
  • A web site spending a third of their research budget making sure their underlying technology stack could support the new functionality they had to support on that stack.

What became incredibly clear from a number of presentations is that the era of "write it and performance test if performance isn't good enough" is over.

Development teams are considering performance of the system from day one, and are discussing that. Whether they're spending engineering effort on a custom scripting system to allow them to model user behavior and refine their scripts based on actual user interactions, or whether they're building elaborate tools to capture actual system performance at runtime regardless of the user interaction, teams are considering the performance of their system to be a critical requirement that has to have tools to support it from the start.

So my #2 trend from QCon London 2010 is that performance management of distributed systems has to be a pre-launch requirement. Personally, I love the capturing and processing of actual system behavior as the system naturally runs. But no matter which approach you choose, you have to bake that into the system before it goes live.

Note that this statement isn't contradicting my point above about how we all already care about performance. The point here isn't that we haven't cared about performance before, but that we weren't always structuring performance management and optimization into the cores of our systems. The trend to me is to start to think about performance just as seriously and up-front as we do about storage and templating and everything else that goes into our distributed systems.


Will I go back in 2011? Probably not unless I'm presenting, and then probably just for the day. This isn't to say that I didn't find the conference useful, because I did. It's far more to say that, as a startup CEO, I can't really afford to take three days out of the running of the company.

Will I send the rest of the OpenGamma team to QCon 2011? Hells yes, particularly if they reinstate the Financial IT track. The crowds, the conversations in the networking breaks, the presentations; it's a brilliant conference. It's pricey, but it's worth every penny.


Did I mention that Eric Brewer is a professor at the University of California at Berkeley? No? Oh. Go Bears!

Sunday, March 14, 2010

London Startup Office Financials

Astute readers will know that I'm the CEO of OpenGamma, a (semi-)stealth mode financial technology firm funded by Accel Partners. Longer-term readers will know that I'm a huge backer of London as a startup hub for all of Europe. A big part of navigating the London working environment is handling our pretty unique property market; I think our most recent past will be useful for someone else out there.

OpenGamma: Pre-Closing

We got our term sheet in the last half of July. Now while some people will argue that VCs take all of August off, our team didn't, and we knew that we had to work aggressively to make for a fast, clean closing.

The OpenGamma founders are a motley crew:

  • Elaine, our chief quant, had been unemployed while intentionally working through a quite enforceable non-compete.
  • Jim, our head of software engineering, arranged his end-of-contract to be the end of July, with lots of solicitor advice to ensure that there wasn't a binding and enforceable non-compete.
  • I was just ending a contract at a major international Investment Bank (astute readers will have heard of this as Big Bank B).

I argued that we needed a space where we could all be together just to make sure that all the various legal moves that resulted in the incorporation/Series A dance were pulled off okay. Although Jim and Elaine weren't sure, as CEO, I won.

Because up until then we were still operating out-of-pocket, we went with the cheapest serviced office that we could get away with. That basically meant one of the myriad serviced offices along Borough High Street, and came down to 400 pounds per desk per month. That, to me, is the lowest you can possibly go for a private space in Zone 1 London.

In Which We Run Out Of Space

In the same serviced office block, we were able to expand to an adjoining room, giving us room for 8 people (with one space allocated for servers, scanners, and my Administrative Paperwork Overflow). That seemed fine, until we got to 6 people and had an offer accepted by Stephen Colebourne. Now we had a problem.

Up until then we had been paying for 7 potential workers (building management threw in the 8th for free), at 400/desk/month. But the problem is that the serviced office building that we moved into was full. When we took the initial space it was pretty much empty, but now we had no ability to expand at all. Worst of all, when it filled up, the space that seemed quite fine became completely unworkable.

It was clearly time to move.

Offices, Glorious Offices

We started to look at Proper Offices. We talked with two different tenant's agents, Devono and Carter Jonas, and we chose Carter Jonas as our agents. And we started to look properly.

Things looked very promising early. We found an amazing property right on Bermondsey Street that we liked, but couldn't really afford. We found other properties which were downright miserable compared to the best one, but were in our price range.

And then on a lark we saw a property that had just come on the market earlier that day. Perfect in every way: high ceilings; Victorian warehouse conversion; Geek-friendly museum on the ground floor; great location; cheap rent. This was the place. 2,055 square feet (subject to survey of course).

We set our agents on negotiating the best rent possible, and once we agreed on all that, we set our lawyers on the property, started talking to build-out teams, and mentally prepared ourselves to move.

For those of you familiar with the UK property market, we negotiated a 5-year lease with a 2-year break clause. The standard in London is a 5-year lease with a 3-year break clause, but if OpenGamma is doing well, we'll run out of space after 2 years, so the option to break early is worth the slightly worse terms. For those of you counting, this cost us about 2 months of rent-free period at the outset of the lease. We also negotiated a 6-month deposit based on a bank guarantee, where the industry norm is 12 months' cash to the landlord.

The structure of the deal for the location that it's in is:

  • A discount on the asked rent over the 5-year duration of the lease;
  • 3 months rent-free at the onset of the lease;
  • 4 additional months rent-free if we choose not to exercise our break option.
If you're looking at Mayfair or the City today, you'd be looking at 5-7 months rent-free at the outset of a 5/3 lease, and 5-7 months after the break option. The South Bank area is a slightly different market, so we had to look at smaller rent-free periods.

A Tangent on Suitability of Space

Whenever I see Valley or New York firms that have successfully moved into decent space relatively quickly, I'm quite jealous, because their property markets don't work the same way as London's.

In the States, a general rule of thumb is:

  • You move into the office as-is
  • If the office isn't in move-in condition, you negotiate what are known as Tenant Improvements to make it move-in condition, the precise nature of which is a negotiation between tenant and landlord
  • When you move out, you move out of the space in as-is condition (meaning that you just leave all the "structural" stuff in place, but things like cubicles and desks you remove).

What this means in general is that there are a lot of spaces that are pretty much move-in condition if you're not particularly fussy about particulars: they have cabling running to a comms room in the major areas; they have a few offices or meeting rooms; they have connections to the outside world ready to light up.

London is a completely different market. Landlords force a completely vacant turnover; anything you've done since you took possession you have to undo. That means that if you're looking at a 2,000 square foot space, and want to build out two conference rooms and a comms room, you have to remove all those walls when you move out. You have to remove all cabling, all lighting, all electricals, all walls and doors, everything. Whether the landlord would find it easier to rent the space with all that intact is irrelevant: you have to remove the lot of it.

This means that when you're looking for office space you can't actually find, at all, any "move-in-condition" space unless the previous tenant went bankrupt in the middle of tenancy. And given the different treatment of bankruptcy here, that's actually quite uncommon.

It sucks.

In Which We Get Quotations

We used Carter Jonas' expertise and asked two teams to do a full build-out proposal.

The first company, which seemed to really "get" what we were going for, and which came back with proposals first, finally came back with a quotation for building out our 2,055 square foot converted Victorian warehouse space.

110,000 pounds. Just for the build-out; furniture was another 35,000 pounds.

Our hearts sank. This idea, which initially seemed reasonable, turned into a complete flight of fantasy. There's no way I could justify as a CEO putting 110,000 into a build-out, 35,000 into furniture, and another 30,000 into a deposit on the property. No way at all.

We worked with the same firm on doing a multi-stage buildout, where we got the most necessary things right off the bat (comms room and one meeting room, and everything else in 6 months after we had a couple of sales), and that initial build-out was still coming in at 70,000 pounds. Plus deposit and furniture. Just to move in.

We had to look at other options.

So we started talking to the guys from Causata, another Accel-backed startup. They've been going for a year longer than we have, and they've spent their entire time in a better class of serviced office than we've been in.

Serviced Offices FTW?

So we started talking to higher quality serviced office providers than we were going with in the Borough High Street area. This would involve moving to second-tier space in the City (i.e. not a convenient commute to Bank or Liverpool Street) or second-tier space in the posh parts of the West End (i.e. not convenient to Berkeley Square or St. James's).

These buildings were much more suitable to growth companies that don't require bootstrap-levels of expenditure:

  • Chairs weren't crippling
  • The buildings had space in server rooms you could throw a few machines into
  • Phones were reasonable in quality

The thing was that these spaces were coming in at roughly 750 pounds/desk/month (9,000/year). For 12 employees, that ends up being 108,000 pounds per year.

And the best/worst part is that these numbers are based on potential employees in the space. The space in any serviced office is charged at a price per room, with a certain number of desks in it. If the space can support 20 employees and you only have 10 in the space, you pay the 750/month for 20 employees, not 10. So your entire cost savings are based on finding the right space for the right number of employees, and being able to move quickly to minimize overspend.
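To put numbers on that capacity-based pricing, here is a minimal sketch. The function names and the 20-desk room are illustrative assumptions; the 750 pounds/desk/month rate and the 12-desk figure come from the numbers above.

```python
def serviced_office_annual_cost(desk_capacity, rate_per_desk_per_month=750):
    """Annual cost in pounds, billed on the desks in the room, not on headcount."""
    return desk_capacity * rate_per_desk_per_month * 12


def overspend_per_year(desk_capacity, headcount, rate_per_desk_per_month=750):
    """Pounds per year paid for desks that nobody is sitting at."""
    empty_desks = max(desk_capacity - headcount, 0)
    return empty_desks * rate_per_desk_per_month * 12


# A room sized for 12 people at 750/desk/month.
print(serviced_office_annual_cost(12))   # 108000

# A hypothetical 20-desk room with only 10 people in it.
print(serviced_office_annual_cost(20))   # 180000
print(overspend_per_year(20, 10))        # 90000
```

The second case is the one that hurts: half the bill is for empty desks, which is why being able to move quickly matters so much.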

Capital Preservation Equals Startup Victory?

If you ask any startup, whether well-funded with a major VC backer or not, whether they're happy putting 110,000 into a capital cost on an office, they'll say no. It's a ridiculous predicament to be in.

If you run down the numbers, a Proper Office (even with 110k in capital expenditure) has an advantage (for our space) at 12 desks over the full run of the 2-year initial period, but it's all front-loaded: you have to spend capital to save at the end of the lease. If you stay in the space, in years 3-5 you're laughing at the savings.

But for a startup like ours, 110k works out at 2 months of burn. Now you're in a difficult quandary:

  • If you really believe the Next Round is going to happen at a good valuation, you JFDI; otherwise, you're pissing money down the drain.
  • If you want to preserve your options to drive for a harder bargain on the Next Round, you want to conserve capital at all costs, even if it means you're spending more per month.

It's a tough decision. Clearly (having seen the plans) the right choice for us is the full build-out, but are we willing to sacrifice the additional two months of burn for the optimal working environment over 2-5 years?

What We Did

What we ended up doing was a bit anti-climactic.

First of all, our second build-out firm finally came through with a quotation: 55,000 pounds for everything (from a requirement perspective) that the first firm did. They also came in at 21,000 for furniture, including pretty darn good chairs. That changed everything.

So we ran the numbers. Turns out that if we assumed that a second round of funding was coming, paying all the fixed costs, no matter how noisome to me as a founding CEO, made sense as of 18 months into a two-year lease. And once the board told me "do what's best for the company over the next two years," that was a done deal.
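For anyone wanting to run the same kind of numbers, here is a rough sketch of the break-even arithmetic. It is a simplification and not our actual model: it treats the build-out and furniture quotes as the only upfront cost, ignores the deposit and rent-free months, and the 4,500/month saving in the last line is a purely hypothetical figure.

```python
import math


def break_even_month(upfront_cost, monthly_saving):
    """First whole month at which cumulative savings cover the upfront spend."""
    if monthly_saving <= 0:
        return None  # the option with upfront cost never pays back
    return math.ceil(upfront_cost / monthly_saving)


def required_monthly_saving(upfront_cost, target_month):
    """Monthly saving needed to break even by the given month."""
    return upfront_cost / target_month


# Second firm's quotes: 55,000 build-out plus 21,000 furniture.
upfront = 55_000 + 21_000
print(upfront)                                      # 76000

# Saving per month (vs. a serviced office) implied by an 18-month break-even.
print(round(required_monthly_saving(upfront, 18)))  # 4222

# Hypothetical example: a 4,500/month saving pays back in month 17.
print(break_even_month(upfront, 4_500))             # 17
```

With the first firm's 110,000 + 35,000 quote the same function shows why the answer flipped: the upfront figure nearly doubles while the monthly saving stays the same.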

It was tough though, because we knew all along that from a qualitative perspective, the space in Bankside was the right choice. What we needed was to be able to justify to ourselves and our board that we were sinking the right amount of capital into the right space to put the company on a trajectory where we wouldn't have to mess about with offices again. That point is quite important, because if you're growing rapidly you don't have the time to deal with this stuff.


If you want to boil all this down to a series of sound-bites, here they are:
  • The UK property market is dysfunctional in requiring tenants to restore the space to an uninhabitable shell on move-out;
  • This makes finding suitable startup office space in London a PITA;
  • ALWAYS get competitive quotes on EVERYTHING because you never know just how far off different vendors can be;
  • Serviced offices in London work out well if you feel the need to be in the City or Mayfair, but less well if you like places like the South Bank [Silicon Bridge] or Old Street [Silicon Roundabout]. Crossover point is about 10 employees;
  • Ask Your Board about the relative value apportioned to flexibility over the short term vs. cost savings over the long term. This is what your board is there for.


We move in in the next 5 weeks come hell or high water (because we've already given notice on our current space, and they've already let it). Wish us luck.

Tuesday, March 02, 2010

Whence Fudge? Why Not Just Use/Extend Avro/GPB/Thrift?

Or, In Which I Make The Case For Fudge Existing At All

tl;dr: Fudge is binary for speed and self describing to do interesting things when you don't have schema support at the time of processing.

After I announced the Fudge Messaging 0.2 release I realized that we had a back-link from @evan on Twitter with a quite pithy characterization of the situation. I've followed up with Evan quite a bit over Twitter, and I think this requires a more detailed description of why Fudge is different from other encoding systems.

What Do You Transmit On The Wire?

Let's start with a very simple question: if you're trying to transmit data between machines, whether temporally concurrent or not, how do you encode said data? Broadly speaking, you have a number of options:
  • Hand-Rolled Binary Protocol: Yep, you could directly emit the individual bytes in the appropriate byte order and hand-roll the encoders and decoders, and it'll probably be pretty fast. It'd be a complete and utter waste of development resources, but fast it would undoubtedly be.
  • Compact Schema-Based Representation: The next option is that you use a system based on a schema definition, and use that schema definition and the underlying encoding system to automatically do your encoding/decoding. Examples of this type of system are Google Protocol Buffers, Thrift, and Avro. This is going to be fast, and an efficient use of development resources, but the messages are useless without access to the schema.
  • Object Serialization: Every major Object-Oriented language has some built-in serialization system for object graphs, and it's usually dead easy to use. However, they all suck in a number of ways, and don't work at all if you don't have matching code objects on both sides of the communications channel.
  • Text-Based Representation: You could just say "I don't care about performance at all" and just use a text-based encoding like XML, JSON, or YAML. The messages are at least marginally human readable, and you can do stuff with them without any schema (which may or may not be used at all), but it's going to be slow, slow, slow.
  • Compact Schema-Free Representation: The final option is to say "I like having my data in a compact binary representation, but I don't want to have to have access to any type of schema in order to work with messages." The most well known example of this is TibrvMsg from Tibco Rendezvous, but it's not a very good implementation, and it's proprietary. This is what we created Fudge to solve.

Hey, Isn't Avro Schema-Free?

No, no it's not. The Avro encoding system requires access to a schema in order to be able to decode messages at all, because all metadata is lost during encoding. However, it has two characteristics which are similar to schema-free encoding systems:
  • It decodes into a schema-free representation. What I mean by this is that if you do have the schema for a message, you can decode it into an arbitrary data structure which isn't tightly bound to the schema (think generic structure vs. schema-generated code).
  • The communications protocols involve out-of-band schema transmission. When you start communicating using Avro RPC semantics, you exchange the schema of the messages that you're going to be transmitting using a JSON-encoded schema representation. But once you've done that handshake, the actual messages are schema-bound.

Thus you can view "Avro the communications protocol" as schema-free, as you don't have to have a compile-time-bound schema to be able to communicate effectively, but "Avro the encoding system" as schema-bound.

Why Is Schema-Free Encoding Useful?

If you're looking at temporally concurrent point-to-point communications (think: RPC over a socket), a schema-bound system is pretty useful. You can squeeze every single unnecessary byte out of the communications traffic, and that's pretty darn useful from an efficiency perspective. But not all communications are temporally concurrent and point-to-point.

Let's consider the other extreme: temporally separate publish/subscribe communications (think: market data ticks, log messages, entity update messages). In that case, you have to make sure that every single point where you want to process the message has access to the schema used at the point of message serialization; if you don't have it, all you have is an opaque byte array. That's a logistical nightmare to keep everything in sync over time, particularly when you start doing logging and replay of messages for debugging and auditing.
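The difference is easy to see in a toy example. The sketch below is an illustration only: the schema string, the type IDs, and the byte layout are all made up for this post and are not the Fudge, Avro, or GPB wire formats. The schema-bound message is smaller, but without `TICK_SCHEMA` it's just 20 opaque bytes; the self-describing one can be decoded by any process that knows the general encoding rules.

```python
import struct

# Schema-bound toy: field order and types live only in the schema.
TICK_SCHEMA = "<4sdd"  # hypothetical schema: 4-byte ticker, bid, ask


def encode_schema_bound(ticker, bid, ask):
    return struct.pack(TICK_SCHEMA, ticker.encode("ascii"), bid, ask)


def decode_schema_bound(payload, schema):
    ticker, bid, ask = struct.unpack(schema, payload)
    return {"Ticker": ticker.decode("ascii"), "Bid": bid, "Ask": ask}


# Self-describing toy: every field carries a type ID and its own name.
TYPE_STR, TYPE_F64 = 1, 2  # made-up type IDs


def encode_self_describing(fields):
    out = bytearray()
    for name, value in fields.items():
        name_bytes = name.encode("utf-8")
        if isinstance(value, float):
            type_id, body = TYPE_F64, struct.pack("<d", value)
        else:
            type_id, body = TYPE_STR, str(value).encode("utf-8")
        out += struct.pack("<BB", type_id, len(name_bytes)) + name_bytes
        out += struct.pack("<H", len(body)) + body
    return bytes(out)


def decode_self_describing(payload):
    fields, pos = {}, 0
    while pos < len(payload):
        type_id, name_len = struct.unpack_from("<BB", payload, pos)
        pos += 2
        name = payload[pos:pos + name_len].decode("utf-8")
        pos += name_len
        (body_len,) = struct.unpack_from("<H", payload, pos)
        pos += 2
        body = payload[pos:pos + body_len]
        pos += body_len
        if type_id == TYPE_F64:
            fields[name] = struct.unpack("<d", body)[0]
        else:
            fields[name] = body.decode("utf-8")
    return fields


tick = {"Ticker": "AAPL", "Bid": 187.25, "Ask": 187.5}
compact = encode_schema_bound("AAPL", 187.25, 187.5)
verbose = encode_self_describing(tick)
print(len(compact), len(verbose))        # 20 44
print(decode_self_describing(verbose))   # no schema needed
```

The self-describing form pays for the field names and type IDs on every message; that's exactly the overhead question addressed further down.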

Moreover, there's a whole host of things that you can do in between producer and consumer if you have a schema-free encoding:

  • You can visually inspect the contents of messages. Not in a text editor unless you're using a text-based encoding system, but in some type of general purpose tool. This is an unbelievably useful development, debugging, and system support tool.
  • You can do content-based routing on them. It's quite easy using XML, for example, to create XPath rules that route messages through the system ("send messages where /Company/Ticker = AAPL to Node 5; send other messages to Node 7"). You don't have to compile each little rule into native code, you can do it based purely on configuration. Heck, you could even build a custom routing system for ActiveMQ or RabbitMQ and put the functionality right in your message oriented middleware brokers.
  • You can automatically convert them to other representations. Have data in XML but need it in JSON? No problem; you can auto-convert it. Have data in Fudge but need it in XML? No problem; you can auto-convert it.
  • You can store and query them efficiently. You can take any arbitrary document and put it into a semi-structured data store (think: XML Database, MongoDB), and then query it in an efficient way ("find me all documents where /System/Hostname is node-5 and /Log/Type is AUDIT or SECURITY").
  • You can transform the content using configuration. You can take an arbitrary message, and just using configuration (so no more custom code) change the contents to take into consideration renames, schema changes, add stuff, remove stuff, do whatever you want.
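As a rough illustration of the content-based routing point above, here is a minimal sketch in which the routing rules are pure data. The message is modelled as nested dictionaries standing in for any decoded self-describing message; the rule format, the `lookup` and `route` helpers, and the node names are assumptions made up for this example, not part of Fudge or any broker's API.

```python
# Routing rules as configuration: (field path, expected value, destination).
ROUTES = [
    ("Company/Ticker", "AAPL", "node-5"),
]
DEFAULT_DESTINATION = "node-7"


def lookup(message, path):
    """Walk a slash-separated path through nested dicts; None if absent."""
    current = message
    for key in path.split("/"):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def route(message, routes=ROUTES, default=DEFAULT_DESTINATION):
    """Return the destination of the first matching rule, else the default."""
    for path, expected, destination in routes:
        if lookup(message, path) == expected:
            return destination
    return default


print(route({"Company": {"Ticker": "AAPL"}}))  # node-5
print(route({"Company": {"Ticker": "MSFT"}}))  # node-7
```

Adding a new route means adding a tuple to `ROUTES`, not compiling and deploying new code, and none of it needs to know the schema the producer used.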

The SOA guys have known about these advantages for years, which is why they use XML for everything. The only real flaw there is that XML is, quite frankly, rubbish in every other way.

That's Why We Created Fudge

Most of the network traffic in the OpenGamma code isn't actually point-to-point RPC; it's distributed one-to-(one or more), and we use message oriented middleware extensively for communications. We needed a compact representation where given an arbitrary block of data, we could "do stuff" with it, no matter when it was produced.

None of the compact schema-bound systems support this type of operation. If someone can point me to an Open Source system which covers these types of use cases, and it's better than ours, we'll gladly merge projects. But Avro ain't it. Avro's good for what it does, but what it does isn't what we need.

But Your Messages Must Be Huge

The Fudge Encoding Specification was very carefully designed to scale both down and up in terms of functionality. At the most verbose, for each field in a message/stream (of course Fudge supports streaming operations; if it didn't, do you think we could efficiently do all those auto-transformations?), the Fudge overhead consists of:
  • 1-byte Field Prefix: This has the processing directives to say what else is in the stream for that field.
  • 1-byte Type ID: We have to know at the very minimum the type of data being transmitted for that field.
  • The Field Name: The UTF-8 encoded name ("Bid", "Ask", "System", "UserName", etc.)
  • A 2-byte Field Ordinal: A numeric code for the field (nope, not the index into the message, just a numeric ID like the name is a textual ID)

Here's the thing though: only the field prefix and type ID are required. Although right now Fudge-Proto doesn't do so, you could easily use Fudge in a form where you rely on the ordering of fields in the message to determine the contents (which is exactly what Avro does). In that case, you pay a 2-byte penalty per field. Yes, if you're transmitting a lot of very small data fields that's a relatively high penalty, but if you're transmitting 8-byte numbers and text, it's really not a whole heck of a lot.
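The per-field arithmetic can be sketched as follows. This is a back-of-the-envelope calculation based only on the components listed above, not a reading of the encoding specification: it counts just the encoded name bytes, so any length prefix or other framing the real encoding may add is not modelled, and the `field_overhead` helper is a made-up name.

```python
FIELD_PREFIX_BYTES = 1  # always present
TYPE_ID_BYTES = 1       # always present
ORDINAL_BYTES = 2       # optional


def field_overhead(name=None, has_ordinal=False):
    """Metadata bytes for one field, excluding the value itself."""
    overhead = FIELD_PREFIX_BYTES + TYPE_ID_BYTES
    if name is not None:
        overhead += len(name.encode("utf-8"))  # name bytes only (assumption)
    if has_ordinal:
        overhead += ORDINAL_BYTES
    return overhead


print(field_overhead())                              # 2: prefix and type ID only
print(field_overhead(has_ordinal=True))              # 4: ordinal instead of a name
print(field_overhead(name="Bid"))                    # 5
print(field_overhead(name="UserName", has_ordinal=True))  # 12: the verbose end
```

For an 8-byte number, the terse form adds 2 bytes of metadata to 8 bytes of payload.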

You want names so that you can have something like Map<String, Object> in your code, but you don't want to transmit those on the wire? That's what we invented Fudge Taxonomies for. A 2-byte ordinal gives you full name data at runtime, without each message having to include the name.
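Conceptually, a taxonomy is just a shared lookup from ordinal to name. The sketch below shows the idea only; the ordinals, the dictionary representation, and the `resolve_names` helper are hypothetical and are not the Fudge taxonomy API.

```python
# Hypothetical taxonomy shared by producer and consumer: ordinal -> field name.
TAXONOMY = {1: "Bid", 2: "Ask", 3: "UserName"}


def resolve_names(wire_fields, taxonomy):
    """Turn {ordinal: value} from the wire into {name: value} for application code."""
    resolved = {}
    for ordinal, value in wire_fields.items():
        # Fall back to the ordinal itself when the taxonomy has no entry.
        resolved[taxonomy.get(ordinal, ordinal)] = value
    return resolved


# Only ordinals travel on the wire; names are restored at runtime.
on_the_wire = {1: 187.25, 2: 187.5}
print(resolve_names(on_the_wire, TAXONOMY))  # {'Bid': 187.25, 'Ask': 187.5}
```

The message stays self-describing in structure (types and ordinals are all there), and a consumer without the taxonomy can still process it; it just sees numbers instead of names.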

So in one encoding specification, we can support very terse metadata as well as very verbose metadata.


So ultimately, we created Fudge to be binary (for speed and efficiency), self-describing (for schema-free operation even when a schema exists), and flexible (so you can use only the bits you need).

Yes, we could have started with an Avro or a GPB or a Thrift, but there's no way that you could have done what we've done without breaking existing implementations. And I think if you started with an Avro and worked down all the use cases we've run into in the past, you'd end up with Fudge in the end anyway; we just skipped a few steps.

The binary schema-bound systems are great, and they define things like RPC semantics which Fudge doesn't. But hopefully it's clear that our creating Fudge wasn't just a case of NIH; we had very specific use cases that they can't support naturally.