Saturday, May 22, 2004

Why Unix Hackers Prefer the Microsoft Solution 

I must say, I'm kind of shocked that we find ourselves in this discussion at all. Java is a language technology created by a company with roots in Unix, and given away for free five years before C# showed up on the scene. Why are programmers on Linux seriously considering adopting a technology that is rooted in the totally alien design philosophies of the Microsoft backwards-compatibility stack? I have some theories, but the common theme is that .Net actually fits the free software development model better.

How can that be? Well, let's look at some of the things .Net lets you do. For one thing, .Net solves a bigger problem than Java does, in that it is a common language environment. Many languages can run on .Net and call each other's code trivially. For any single project, this doesn't seem like a huge draw (despite Microsoft's initial pie-in-the-sky proclamations about writing programs in five different languages to suit the preferences of individual developers, and whatnot). But take a look at the actual Gnome CVS repository, and look at how many languages are represented there. The core platform is all C, but there are bindings for C++, Java, C#, Perl, Python, and tons of other languages. Many peripheral components of Gnome are written in languages other than C as well; I believe some of their games are written in Lisp (with bindings), for example. Being able to tie all that code together into a single coherent framework would be wonderful for a free software project like Gnome. The need to mix languages comes up far more often in open source development than in proprietary development.

Now technically, other languages can be compiled to the Java virtual machine, but this is rarely a clean fit, and often involves major compromises in speed or functionality. .Net doesn't always fit perfectly either, but it certainly gets a lot farther, due to being designed with that in mind. (Besides, Sun seemed to say, why would you want to write any part of your program in something other than Java?)

Also, going along with the bindings, .Net makes it much easier for developers to invoke native code through its P/Invoke mechanism. It's trivial. The comparable mechanism in Java, JNI, is a huge pain in the ass, and involves writing stub functions in C for everything. There's no reason (in my understanding) that a similar mechanism couldn't be written for Java, but it hasn't happened as far as I can tell. Think about this from the perspective of the Gnome project: they want to leverage as much existing code as possible by binding it to Java or C#. It's much easier for them to create bindings for C# given this situation.

What's even worse is that Sun appears to have wanted this to be a hard thing to do. The original plan was for Java to be the universal language: "write once, run anywhere." Ten years later, it doesn't seem we are anywhere near that. It was probably not a realistic goal to begin with. It's not too late to back away from this, but Sun still seems to make decisions based on this attitude, and it costs them developers. The most seamlessly cross-platform language I've ever seen is Python, and Python makes it easy for you to invoke native code, and easy for you to write platform-specific code. Python says, "OK, this stuff isn't cross-platform, but use it if you still want to do that." This is a reasonable thing for most developers to want to do at some point. It's especially reasonable for open source programmers to want to do this.
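
To make that concrete, here is a minimal sketch of calling straight into the C library from Python with the ctypes module. It assumes a Unix-like system where the C library can be found or is already loaded into the process; the helper name native_strlen is made up for illustration. The point is only that no C stub gets written, which is roughly the convenience P/Invoke gives you in C#.

```python
import ctypes
import ctypes.util

# Locate the C runtime. find_library may return None, in which case
# CDLL(None) falls back to the symbols already loaded into the process
# (this works on Linux and macOS, but not on Windows).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def native_strlen(text: bytes) -> int:
    """Hypothetical helper: call the C library's strlen directly."""
    return libc.strlen(text)


if __name__ == "__main__":
    print(native_strlen(b"gnome"))
```

Declaring argtypes and restype is the whole "binding": the signature lives next to the call site instead of in a separately compiled stub.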

This "Java only, everywhere" attitude has also extended to the virtual machine. The virtual machine hasn't changed at all since Java first came out. This is not a bad thing in itself, but it has created problems. For example, Sun finally has an implementation of generics in Java, but it is a hack: the virtual machine itself isn't aware of them, because Sun isn't willing to make incompatible changes to the virtual machine. That would be fine if it didn't work so much better in C#, where the virtual machine is aware of them.

As I said before, many languages do target the JVM, but it's not like Sun has welcomed them or made anything easier for them. This is too bad, since it turns out there's a lot of value to be had here. And I'm convinced that if Java compiled seamlessly to either of two JVMs (old or new), there wouldn't be much of a problem with making incompatible changes. It's easy to see how this could be done so it was a simple recompile for developers. Yet Sun won't change the JVM. I don't understand it; the value is in the source code, not the compiled code.

Furthermore, I can see yet another JVM-related issue. Microsoft's .NET virtual machine was designed specifically with just-in-time compilation as the main use case. I haven't looked into the specifics, but it's not hard to imagine how this could result in performance gains (theoretical and actual). On top of that, the machine has supported the caching of compiled code, which Sun has only now gotten around to implementing in Java 1.5. This obviously isn't something that Java's main developer base (server-side programmers) seems to need, since their programs are invoked rarely and run forever. But for desktop software, this results in huge wins for startup time and runtime performance. It's not like people want to be chugging along with Swing, now is it?

I think the last problem is the most obvious one for the open source world: licensing. I think Sun would probably be way less hostile to Gnome than Microsoft, but Sun hasn't exactly been the most open. The source for Java is available, but it isn't free in the way that open source programmers tend to think of it. You can reimplement it freely, but you need to pass Sun's expensive compatibility tests to be fully kosher. This is obviously unworkable for free software. Microsoft, on the other hand, submitted the core of .Net to a rubber-stamp standards body, giving the appearance of greater openness. (In reality, no one else seems to participate in steering .Net.)

I think open source developers are being a little too prickly on this issue. Yes, it's a shame that Sun won't let you modify Java freely, but they do give very liberal terms. If they would just create an exemption so that open source groups could get their compatibility tests for free, I think people should try to get over it. It's very open, but has some restrictions on what you can do with it (like the GPL, but different actual restrictions). As it is, Sun has created a huge problem for adoption by free software groups (the testing requirements), and they don't seem willing to tear it down. If it weren't for that, I imagine there still would be objections, but I think that's the only really valid one.

Looking back at the things I think make .Net more attractive than Java, I don't see anything insurmountable. It's not too late to make Java a great platform for free software. A P/Invoke-like system could definitely be implemented; Sun just won't adopt it. That sort of thing hasn't stopped Mono. An entire open-source-oriented stack could be built, as Mono has built its alternative stack. Some effort could be spent fixing up GCJ or some other virtual machine to be viable, or God forbid, Sun could help this process along by easing some of their restrictions for their allies against Microsoft (unlikely, I know).

Shame on Sun for getting this wrong for so long. Their users have spoken, for years, and Sun has stubbornly refused to hear them.

Running to Mono to get around this, of course, is jumping out of the frying pan and into the fire.

Uh... this blog doesn't have enough pictures??!?
It's funny reading how .Net is touted as supporting multiple languages when in reality VB and C# compilers are nearly identical except for the parser: even the abstract syntax trees would be the same. Alas, Microsoft did a neat marketing trick, and now papers are being published on the grounds of "this prior solution was good, but only supported one language... this is better, because it works with .Net and therefore multiple languages"--and the kicker is some people think that merely replicating results but putting the Microsoft spin on multiple languages is actual language research.

Originally, Sun was trying to design the JVM to support multiple languages. The best of the C++/Java/C# genre of languages out there (Dylan) was heavily discussed. Dylan has by far the slickest object model, and made cases like multiple inheritance conceptually and pragmatically simple: it was just a better way of doing things all around. Sun did this in hopes that more people would use the JVM: but when it appeared everywhere already, Sun's goal changed to make the JVM only support Java. Indeed, unless your language is higher-level than Java, compiling to Java bytecodes involves a fair number of tricks and hacks.

Bravo for mentioning Python as the best platform independent language. As I read it I thought "wait, Smalltalk is by far better; being the only language to give you bit-per-bit compatibility" but then I remembered yet again that Smalltalk *is* a platform; it just runs on top of other platforms.

BTW, in Java 1.4 the addition of asserts changed the JVM.

Anyway, after reading this I looked into the C# language features. Some parts are not so bad, like getting rid of checked exceptions, which was a failed experiment on the part of the original language developers. In some situations checked exceptions seem like a good idea, but the costs far outweigh the benefits. Also, the ref/out parameters are good. But then when looking into it all, it seems any cute language feature they could come up with they welcomed with flowers. Not to be critical about bells and whistles... but if they are going to throw control out the window, they should at least get closures for real (Java is better there; but Java is really Scheme in disguise--now if only it had TCO; though that sucks for debugging stack traces).

So, anyway, when Java 1.5 came out, I searched for what people were saying about it, and I was surprised so many people were focusing on dumb stuff like syntax. I think largely the programming masses, and OSS developers included, just really like surface issues. It's important for some social reasons I can't understand. Look at Python discussions, and people complain about the tabs more than they do real problems, like when you misspell a variable name, you inadvertently create a new variable. (Some slots would be nice, guys.)
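
For readers who haven't hit the misspelling problem, here is a small sketch of it, plus one possible reading of "slots": Python's `__slots__` declaration, which restricts the attributes an instance may have. That reading is an assumption, and `__slots__` only covers instance attributes, not plain local variables. The class and function names (Loose, Strict, bump_with_typo) are invented for illustration.

```python
class Loose:
    """Ordinary class: any attribute name can be assigned."""

    def __init__(self):
        self.count = 0


class Strict:
    """Same class, but __slots__ fixes the set of allowed attributes."""

    __slots__ = ("count",)

    def __init__(self):
        self.count = 0


def bump_with_typo(obj):
    """Hypothetical buggy helper: 'cuont' is a deliberate misspelling."""
    obj.cuont = obj.count + 1


if __name__ == "__main__":
    loose = Loose()
    bump_with_typo(loose)  # silently creates a new attribute
    print(loose.count, loose.cuont)

    strict = Strict()
    try:
        bump_with_typo(strict)  # rejected: 'cuont' is not a declared slot
    except AttributeError as exc:
        print("caught:", exc)
```

With the ordinary class the typo goes unnoticed and count never changes; with the slotted class the same typo fails loudly at the assignment.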
See, that's the thing. Who cares how similar the VB and C# compilers are? That doesn't even have any practical effect on programmers. Sure, those two languages are similar. But to say that it doesn't actually support multiple languages is wrong. Yes, it does impose a specific model on the languages that it supports, and most pre-existing languages need to be tweaked a bit to run on it. However, this can still be a very useful thing to have. If you look at the list of languages that have been made to work practically on .Net, it includes Python, ML (and variants), Eiffel, and some others that have been created specifically for .Net, like Nemerle, another functional language. That's a lot of variety, and even if tweaks are required, that doesn't mean it's not multi-language, or that this capability isn't useful to people.

Adding asserts to 1.4 was purely additive, and doesn't address any of the other shortcomings I brought up. Plus, big deal. Asserts.

I remember originally thinking the same way you did about C#'s kitchen-sink set of language features. But now I think that it's just practicality, and something Java would benefit from. In fact, Java has recently added many of the desirable features of C#, like autoboxing (long overdue!), and a foreach construct (much uglier than C#'s). So hey.

Those surface issues you put down ARE what's important. Perhaps you look at them and go "Well, fundamentally it doesn't change what you can do." But to coders, they can have actual effects like "Saving me lots of time," or "Helping me express things in a cleaner way," or "Opening things up to the point where people will make huge errors in judgement when using this feature, seriously hurting the reliability and maintainability of this software."

Full disclosure: You used to work at Sun. :)
(Previous comment was from me, actually).

Also, let me not forget. How the hell is Java "Scheme in disguise?" That's a stretch.