Monday, April 27, 2009

Sybase JDBC Drivers Ignore Your Database Selection

A very common strategy in doing database-driven functional tests is for each user to have their own database instance on a shared test machine. This means that they can do whatever they want on that database instance without impacting other users, and that multiple people can run potentially destructive test cases simultaneously.

A common approach to handling that is to append the user name to the database name, and use a system property to choose the correct database instance for that test run (for example, naming your DataSource bean in your Spring context.xml something like db_${user.name}). You then have the JDBC connection declared to have the user name in the connection string (a la jdbc:foo:bar:blahblahblah/myapp_${user.name}). All works great.
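To make the naming convention concrete, here is a minimal sketch of deriving the per-user database name and connection string. The class and method names are hypothetical, and the base URL is just the placeholder from above, not a real driver URL.

```java
/** Hypothetical helper illustrating the per-user naming convention; not from any real library. */
final class PerUserDatabase {
    private PerUserDatabase() {}

    /** Returns e.g. "myapp_alice" for prefix "myapp" and user "alice". */
    static String databaseName(String prefix, String userName) {
        if (userName == null || userName.isEmpty()) {
            throw new IllegalArgumentException("user name must not be empty");
        }
        return prefix + "_" + userName;
    }

    /** Appends the per-user database name to a base URL (placeholder format, an assumption). */
    static String jdbcUrl(String baseUrl, String prefix, String userName) {
        return baseUrl + "/" + databaseName(prefix, userName);
    }
}
```

In a test run you would call something like `PerUserDatabase.jdbcUrl("jdbc:foo:bar:blahblahblah", "myapp", System.getProperty("user.name"))`, which is what the `${user.name}` placeholder does for you in the Spring configuration.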

Except with Sybase.

In its voyage of annoying developers who have to use their managed language drivers (the C#/ADO.Net ones are particularly noxious), Sybase have decided to screw you by ignoring this parameter when it's not valid.

The Sybase low-level protocol for this scenario largely consists of the following steps (completely simplified):

  1. Connect as user Foo
  2. Connection is now bound to the default database defined for user Foo
  3. Issue a use correct_database statement
  4. Connection is now bound to the database specified in step #3

Here's the problem: if step 3 fails because the desired database doesn't exist, Sybase silently swallows the failure and leaves you in the user's default database without telling you (seriously, there's no log, no exception, no sign at all of which database you're in). Even better, the Sybase driver doesn't store the actual database in any field that you can inspect in the debugger, so the Sybase client library thinks it's in the database instance you asked for when in fact it's not.
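Since the driver won't tell you, one way a test harness could catch this is to ask the server itself which database the connection ended up in and fail fast on a mismatch. The following is only a sketch under assumptions: `DatabaseGuard` and its methods are hypothetical names, and it relies on the `db_name()` built-in that Sybase ASE provides for returning the current database.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical guard: fails fast if a connection is not bound to the expected database. */
final class DatabaseGuard {
    private DatabaseGuard() {}

    /** Asks the server (not the driver) which database this connection is currently using. */
    static String currentDatabase(Connection connection) throws SQLException {
        // db_name() with no arguments returns the name of the current database.
        try (Statement statement = connection.createStatement();
             ResultSet rs = statement.executeQuery("select db_name()")) {
            if (!rs.next()) {
                throw new SQLException("select db_name() returned no rows");
            }
            return rs.getString(1);
        }
    }

    /** Throws if the database the server reports differs from the one we asked for. */
    static void requireDatabase(String expected, String actual) {
        if (actual == null || !expected.equals(actual.trim())) {
            throw new IllegalStateException(
                "Expected database '" + expected + "' but connection is bound to '" + actual + "'");
        }
    }

    /** Convenience wrapper: query the server, then compare against the expected name. */
    static void verify(Connection connection, String expected) throws SQLException {
        requireDatabase(expected, currentDatabase(connection));
    }
}
```

Calling `DatabaseGuard.verify(connection, "myapp_" + System.getProperty("user.name"))` before any destructive test setup would turn the silent fallback into a loud failure. The comparison is deliberately exact, on the assumption that database names are case-sensitive on your server.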

If you have a shared test database (with some real data in it for doing manual/gui-directed testing), and surrogate databases for each user's "blow-away-the-world" style testing, this is Very Bad Behavior Indeed.

Solution: Stop using Sybase. Seriously. Yes, it may have been in vogue 10 years ago, but it sucks.

Tuesday, April 21, 2009

Oracle + Sun : A Java Perspective

Smarter people than I have written about this, but having worked at several of the major players here (a summer at Oracle proper, 12 months at BEA working on WebLogic Server, 2 years working at M7, which got bought by BEA, which got bought by Oracle), I figured I'd pipe in my $0.02.

My #1 concern here is that Larry is going to attempt to use everything in the IP arsenal that he's just acquired to screw IBM. He's done it before, and he'll probably continue to do it. Considering the amount of investment that IBM has made into the Java ecosystem (at least as much as BEA, particularly when you consider Eclipse), and the amount of hostility between Oracle and IBM, this wouldn't surprise me one whit. Towards that end, here's what I'd like to see clarification on:

  • Eclipse/SWT. For too long Sun's ridiculous love-fest with NetBeans and Swing has blocked any reasonable approach towards dealing with Eclipse and SWT. The worry that I have here is that since Eclipse == IBM in many people's minds, and both JDeveloper and NetBeans are now under the same company umbrella, Oracle may decide to let commercial considerations (sticking it to IBM) and staff considerations (keeping Sun developers who have a thing for NetBeans, plus everybody internally who's committed to JDeveloper) override ecosystem considerations (we all like Eclipse way more than NetBeans or JDeveloper). My request: please properly support SWT. Not to the exclusion of Swing, but don't fight against it.
  • OSGi. OSGi is probably tainted in Sun's eyes as an Eclipse technology, and therefore as something that hurts NetBeans and helps IBM, or some similar ridiculousness. Support it, please, and kill off anything whose sole raison d'etre is to replace it with something lamer. Modularizing the JRE is one thing; providing something completely useless except as a replacement for OSGi is something altogether different.
  • Open Java Implementations. Stephen Colebourne has been talking about this at length, and blogged about the handover. Please don't be such jerks on the JCP.
  • Open Source Projects. Whither Glassfish and Metro and all the rest of it? They compete with existing BEA/Oracle assets, but are extremely valuable in making the ecosystem valuable enough to allow Oracle to extract value from their proprietary assets.
  • API Neutrality. One thing Sun has been good at, because they've never had any world-beating middleware or application infrastructure technologies, is helping to craft APIs that are by and large vendor neutral (such as JDBC and JMS). Oracle, however, does have such technologies. Is Oracle going to follow a path like Microsoft has with the .Net APIs, where there's enhanced support for whatever Microsoft is shilling and second-class support for everything else (ADO.Net anyone?), or is it going to realize that supporting the ecosystem means vendor neutrality as much as possible? Sun had no choice as all their app infrastructure is second-rate at best, but Oracle has a choice.
  • The Whole JCP Itself. What's going to happen to it? How is Oracle going to behave in general? Will we see projects falling out from under the JCP umbrella going forward?
  • ZFS [1]. ZFS rocks. Massively. But Oracle's been working on Btrfs in part because Sun refuses to allow ZFS to be licensed in such a way that it can be included in the Linux kernel. Given Oracle's investment in Linux, we can has ZFS in Linux?

Mostly, I think we still have yet to see whether Oracle is going to behave like the BEA side, or the Oracle side: is Oracle going to help build the ecosystem, or is it going to use its new IP assets for proprietary advantage against IBM and Microsoft? I hope it's the former, but only time will tell.

Footnotes

[1]: Yes, it's not Java. But really, I can has? Plz?

Sunday, April 19, 2009

My Ideal Social Network

I'm relatively plugged in. I blog (you're soaking in it now!); I have a FriendFeed; I tweet (and those of you who need to know who I am probably already do); I'm on Google Reader; I used to be on Orkut until I started getting befriended by Brazilians I had never met; I'm on LinkedIn (spoiler alert! it discloses who Derivatives Company A and Big Bank B are). I've never gotten onto Facebook, largely because I've never seen any attraction to it (and if I want to be super-poked, I'd just as soon meet you first, thank you very much).

I don't actually like any of them as a full aggregation basis. I know that FriendFeed is attempting to aggregate everything together, but it's not quite what I need or want.

So all you Web 2.0 guys, here's what I want:

  • Subscriber Control: I want to have control over who subscribes to my feeds. In particular, I need to be able to divide my life into at least four categories:
    • Technical Contacts. People who follow me because periodically I write about something someone might find potentially meaningful from a technical perspective.
    • Personal Friends. People who I know personally and might be interested in zoos and what train I got home.
    • Current Coworkers. The main crux here is that there are things that I might want to share with ex-coworkers and personal friends, but not the current ones I have to see every day.
    • Family/Near-Family [1]. These are people who might be interested in stuff going on in my life that I'm not ready to share with other categories.
    Note that nowhere in there is "random gits wot I don't know who want to voyeuristically follow my life." [2] When I do this, I need to be able to either pre-approve a subscriber to a feed, or silently dump a subscriber (see later).
  • Publication Buckets: I need to be able to publish any particular item into a bucket, or a whole feed into that bucket.
  • Single Republication: I need to be able to take a URL from anywhere (e.g. treat something from Google Reader the same way as something I type directly into a text box) and have it appear to consumers the same way no matter how they choose to subscribe. [3]
  • Single Inbox: Sometimes I consume stuff on my phone, sometimes I consume stuff on my laptop, sometimes I consume stuff from Random Web Application. I want "I read this" to mean the same thing on everything.
  • Selective Subscribes: I might want to read stuff by Zack Urlocker on MySQL and Sun, but really not care how many miles he ran that day [4]. I should be able to do that, combined with publication buckets.
  • API Access: Anything I do with your web app I need to be able to do from any arbitrary app, potentially outside your control.
  • Silent Unsubscribes: I need to be able to unsubscribe to someone without them knowing that I've done it. (More on why this is important anon).

Here's the thing about silent unsubscribes. Social Networks largely thrive on number of connections, because that is Very Important to them. That's fine. But I need to be able to structure the feeds that I follow without feeling like I've slighted someone by not re-"friend"ing them (following/friending/whatever), or by dropping them later on. In particular, I may have personal friends who produce drivel I don't have the time or energy to keep up with, but I don't want to slight them by making my personal subscription/publication decisions transparent to them. Fixing this is the heart of the "social" part of networking, and failing to fix it is why eventually every single social network fails as the number of connections grows beyond the level of contact that humans actually want.

The core thing here is that this isn't about whether I'm friends with anybody. Many of the people that are my closest friends in the world don't participate in any of this stuff at all. It's about my ability to manage the flow of information in and out of myself that I deem relevant. That's a completely different matter, and the whole focus on "Social" Networks as being about friends and connections is rubbish: it's about my ability to selectively publish and subscribe to categorized feeds that are relevant to my interests at any given point in time.

Someone who can actually do web development should do this [5]. It would rock.

Footnotes

[1]: Note to readers: If I've ever camped out in your house, or you have offspring who refer to me as "Uncle Kirk", you're in here whether I share DNA segments with you or not.
[2]: Seriously, my life isn't actually that interesting. Except that I play with Pumas and Sun Bears and Ocelots and you don't.
[3]: Note that this may mean either compliance on the part of the republishing services (e.g. Google Reader), or it may mean that I have a queue of pending stuff that I have to process before it gets republished; I'm fine with both approaches working together.
[4]: I can say this. M7 Alumni In Da Hizzy!
[5]: Not it. I'm secure enough in my 5k1LLz that I can say that these days, you want me to stay as far the heck away from the browser in a day-to-day coding perspective as possible.

Thursday, April 16, 2009

Rise Of The Appliances

I've posted before [1] on my desire for someone to sell me an AMQP Appliance, and while nobody has yet (probably because the 1.0 spec is still being finalized), I'm encouraged that somebody will.

Memcached Appliances Ahoy!

I'm particularly encouraged by last week's frenzy of Appliance Startup Announcements:

I've read more about the Schooner appliances than the other two [2], but they appear to be roughly what I said my ideal AMQP appliance would be: Tier-1 hardware, Linux kernel, custom software, heavily tuned to the particular bits of hardware in the box itself. I'm a little surprised at the $45k price tag, but I suppose it's good to hit the high end of the market first and work your way down over time. $45k would be justified with the amounts of RAM and SSDs in the Schooner box, but I'm sure they can scale things down to less Flash and less RAM and still have a reasonable margin.

Open Protocols Make It Possible

What makes this possible (particularly that everybody's targeting the currently very-much-in-vogue Memcached) is that there is a standard protocol. While the memcached protocol isn't a de jure standard, it is a de facto standard, and one that at this point is unlikely to change given the amount of software doing interesting things with it as a protocol.
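To show how small that de facto standard is on the wire, here is a sketch of framing the two most basic commands of the memcached text protocol. The class name is hypothetical and this is not any particular client library, just the byte layout: `set <key> <flags> <exptime> <bytes>` followed by the data block, and `get <key>`, each line terminated by CRLF.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: frames memcached text-protocol commands as raw bytes. */
final class MemcachedText {
    private MemcachedText() {}

    /** Builds "set <key> <flags> <exptime> <bytes>\r\n<data>\r\n". */
    static byte[] setCommand(String key, int flags, int exptimeSeconds, byte[] value) {
        String header = "set " + key + " " + flags + " " + exptimeSeconds + " " + value.length + "\r\n";
        byte[] head = header.getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[head.length + value.length + 2];
        System.arraycopy(head, 0, out, 0, head.length);
        System.arraycopy(value, 0, out, head.length, value.length);
        // The data block is always terminated by CRLF, regardless of its contents.
        out[out.length - 2] = '\r';
        out[out.length - 1] = '\n';
        return out;
    }

    /** Builds "get <key>\r\n"; a hit comes back as a VALUE line, the data, then END. */
    static byte[] getCommand(String key) {
        return ("get " + key + "\r\n").getBytes(StandardCharsets.US_ASCII);
    }
}
```

Anything that accepts these bytes on a TCP socket and answers `STORED` or `VALUE ... END` looks like memcached to the application, whether it is the stock daemon on a commodity box or a tuned appliance.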

It's only a matter of time before we see the same types of innovation in messaging or anything else for that matter if you can define a stable network interface to it. If you can define an open protocol (whether de facto or de jure), you can build an appliance to serve it. And it would probably be a better quality of service than rolling your own on top of the same hardware and OS combination. Systems which are heavily I/O bound (messaging, databases, network caches) benefit greatly from advanced tuning of the entire stack from hardware through to application code. Thus, they make sense to optimize holistically, whether you're using general purpose hardware or specialized hardware. Appliance vendors can do this. Generic software vendors have to rely on teaching their customers how to do it.

Cloud Hardware Combination

The one problem with this move going forward is that you can't do it with Cloud Computing, because you don't have access to the physical environment and can't put your own hardware in place. And there's a lot of stuff you can't do as a result:
  • Appliances like Schooner, much less ultra-high performance messaging brokers like Tervela or Solace
  • Direct network connections to the proprietary or specific network of your choice
  • Storage optimization through the use of any type of Flash acceleration
  • Anything requiring significant local bandwidth on a single box

That leads me to question whether the current vogue of completely ignoring any type of CapEx whatsoever is actually optimal for your core infrastructure. For example, if you can spend $5k on a machine to handle your MOM load perfectly, or to run your memcached big and fast, or whatever, how much extra are you going to have to spend in overall VM slice hours to equal that performance (for example, are you going to have to run 10 VM slices running memcached and an extra 5 app server slices because you can't hit the performance of a single box)? This is entirely possible for something as memory bandwidth intensive as memcached, or as network bandwidth intensive as an AMQP broker. Where's the crossover point?
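As a back-of-the-envelope way to frame that question, the crossover is simply the box price divided by what the equivalent slices cost per hour. The sketch below is hypothetical throughout: the class name is made up, and the per-slice hourly price is a number you would have to plug in from your own provider. It also ignores power, hosting, and operations costs on the hardware side.

```java
/** Hypothetical back-of-the-envelope calculator for the CapEx vs. VM-slice crossover. */
final class Crossover {
    private Crossover() {}

    /**
     * Hours of runtime after which buying the box is cheaper than renting slices.
     * boxCost: one-off hardware price; slices: VM slices needed to match the box;
     * pricePerSliceHour: assumed rental price of one slice for one hour.
     */
    static double breakEvenHours(double boxCost, int slices, double pricePerSliceHour) {
        if (slices <= 0 || pricePerSliceHour <= 0) {
            throw new IllegalArgumentException("slices and price must be positive");
        }
        return boxCost / (slices * pricePerSliceHour);
    }
}
```

With the $5k box and 15 slices from the example above, and a purely illustrative price of $0.10 per slice-hour, `Crossover.breakEvenHours(5000, 15, 0.10)` comes out at roughly 3,333 hours, or a bit under five months of continuous running. Change the assumed hourly price and the answer moves proportionally.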

The bright spot here, though, is Open Protocols. I know I keep banging on about it, but where this all merges together brilliantly is that you don't have to have your own appliances or hardware in your Cloud Computing vendor: you can just talk open protocols and allow the utility provider to sell it to you. You want memcached? They'll sell you an IP/port that has access to X gigabytes and a guaranteed response time of Y for $Z/[request|month|core time]. Open network protocols thus act as the disconnected version of open APIs in the software development world: you can swap out the underlying implementation without substantial changes to the rest of your environment.

Furthermore, working with Open Protocols in this case insulates you from PaaS lock-in. If your utility provider is providing Caching As A Service using their own custom, proprietary protocol, in order to move to a different PaaS vendor you'll have to change your app stack in some pretty substantial ways; if both vendors provide an equivalent service (meaning protocol and semantics), you can migrate with minimal effort (particularly if it's a transient service).

Footnotes

[1]: Meta-note: haven't been posting recently, even with the AMQP F2F and Solace announcements, because I've been busy working on some big stuff that I can't talk about at all yet. If it happens, I'll let my adoring public know.
[2]: Because Red Point, which backed Radik in the past, funded them in their Series A. I like to see what they're investing in.

Wednesday, April 01, 2009

Google Launches VC Arm

This has been all over teh interwebs. It's not actually that unusual for a large, cash-rich technology firm to have its own VC arm: Intel and Cisco are notable players in the Corporate VC space already, and Microsoft used to be.

Large tech firms do this for a number of reasons:

  • To find a use for the mountains of cash they accumulate and never distribute back to shareholders
  • To fund firms that employees want to leave to start, to keep them in the corporate fold (and potentially compensate the founders better than they could as employees)
  • To fund technologies they want outside the stultifying corporate culture that stops true innovation (Cisco is famous for this)
  • To ensure that there are new firms using their technology and thus growing their revenue base

I'd say the biggest common factor in those is a recognition that once you reach a certain size and scale you've hit the innovator's dilemma (in that you can't work on things that would cannibalize your existing revenue base), and also that big firms are terrible at doing pure innovation (more management, more rules, less innovation). In that, it's a sign that you're no longer a young, hip, cool company, but a big one. If I were a Google employee or investor, I'd take that as the main message here: Google accepting that they're a Big Company, with all that entails.