SourceForge for the 21st Century

Monday, December 21st, 2009

Lately I’ve been thinking a lot about continuous deployment for reasons I’m not quite yet at liberty to disclose. This has inspired me to improve the XOM release process, to make it more of a one-click process, or, to be more accurate, a one-Ant-target process. I can now release a new version simply by typing:

$ ant -Dpassword=secret -Dwebpassword=other_secret release

This not only builds the entire project; it also tags the release in CVS, uploads the zip and tar.gz files to IBiblio, and uploads the documentation to my web host. It doesn’t yet file a bug to upload the Maven files, but I’m working on that.

During the process of setting this up, I realized that my organization is a little backwards. In particular, I’m pushing all the artifacts from my local system. Instead, I should merely be committing everything to the source code control repository; tagging a release; and then having the further downstream artifacts like the zip and tar.gz files and documentation pulled from source code control onto the Web servers.

There are some commercial products that are organized like this, including ThoughtWorks’s Cruise, but none of the major open source hosting sites such as SourceForge work like this. Certainly, SourceForge and similar sites have been major contributors to the open source revolution. They have enabled hobbyist developers working in their garages to use tools and techniques of software development that were previously limited to corporations. They have enabled far-flung developers around the world to collaborate with each other far more effectively than they could by e-mailing each other tar files. They have removed the burden of system administration from many programmers, thus enabling them to devote more time to writing code. Make no mistake: SourceForge et al. are a real force for good in the community.

That said, the state of the art in software development has moved forward significantly since these sites were founded. CVS has mostly been replaced by Subversion. On some projects, Subversion has been replaced by distributed version control systems such as git and Mercurial. Unit testing and test-driven development have moved from extreme practices to standard operating procedure. Continuous integration using products like Hudson and CruiseControl is routine. Nonetheless, most project hosting sites still offer little beyond a source code repository, a bug tracker, and some webspace. Not that that’s not important, but we can do so much more.

It’s time to think about what a modern project hosting site might want to offer and what it might look like.

A Square Is Not a Rectangle

Friday, September 11th, 2009

The following example, taken from an introductory text on object-oriented programming, demonstrates a common flaw in object-oriented design. Can you spot it?

public class Rectangle {

  private double width;
  private double height;

  public void setWidth(double width) {
    this.width = width;
  }

  public void setHeight(double height) {
    this.height = height;
  }

  public double getHeight() {
    return this.height;
  }

  public double getWidth() {
    return this.width;
  }

  public double getPerimeter() {
    return 2*width + 2*height;
  }

  public double getArea() {
    return width * height;
  }

}

public class Square extends Rectangle {

  public void setSide(double size) {
    // Body missing from this excerpt; presumably it sets both dimensions.
    setWidth(size);
    setHeight(size);
  }

}

(I’ve changed the language and rewritten the code to protect the guilty.)
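One way to see the trouble is to hand a Square to code that only knows about Rectangle. The following is a minimal sketch, not the textbook's code: the class SquareDemo and the method stretch are hypothetical names, the two classes are condensed copies so the example compiles on its own, and the body of setSide is assumed to set both dimensions.

```java
class SquareDemo {

  // Condensed copy of Rectangle so this sketch is self-contained.
  static class Rectangle {
    private double width;
    private double height;

    public void setWidth(double width) { this.width = width; }
    public void setHeight(double height) { this.height = height; }
    public double getWidth() { return this.width; }
    public double getHeight() { return this.height; }
    public double getArea() { return width * height; }
  }

  static class Square extends Rectangle {
    // Assumed body: keep both dimensions equal.
    public void setSide(double size) {
      setWidth(size);
      setHeight(size);
    }
  }

  // Hypothetical client code written against Rectangle only.
  static void stretch(Rectangle r) {
    r.setWidth(r.getWidth() * 2);
  }

  public static void main(String[] args) {
    Square square = new Square();
    square.setSide(3);
    stretch(square); // compiles, because a Square is a Rectangle to the compiler
    // Prints "6.0 x 3.0": the "square" no longer has equal sides.
    System.out.println(square.getWidth() + " x " + square.getHeight());
  }
}
```

Nothing in Square stops a caller from using the inherited setWidth and setHeight, so the subclass cannot guarantee the one property that makes it a square.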

Why Pair Programming Works

Tuesday, June 30th, 2009

Pair programming is like magic in more ways than one. It dramatically improves programmer productivity and reduces bug count, and yet it does so through a technique that’s completely counter-intuitive. You can’t help but think that there’s some trick yet to be exposed; that pair programming is just sleight of hand. In this article, I will endeavor to pull back the curtain and reveal the secrets of the pair programming magicians.

Specifically, I identify six reasons pair programming succeeds:

  • Continuous code review
  • Fewer blockages
  • Masking distractions
  • Guaranteed focus
  • Multiple points of view
  • Reduced training cost and time


Imagine There’s No Null

Wednesday, May 27th, 2009

A couple of weeks ago I spent a considerable amount of time chasing down bugs involving null in a large code base: null checks after a variable had already been dereferenced, nulls passed to methods that would immediately dereference them, equals() methods that didn’t check for null, and more. Using FindBugs, I identified literally hundreds of bugs involving null handling; and that got me thinking: Could we just eliminate null completely? Should we?
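To make those bug patterns concrete, here is a small sketch of the defensive code each one calls for. It is not taken from the code base in question; NullPitfalls, Name, and lengthOrZero are hypothetical names invented for illustration.

```java
import java.util.Objects;

class NullPitfalls {

  // Hypothetical value class used only for this illustration.
  static final class Name {
    private final String value;

    Name(String value) {
      // Reject null at the boundary instead of failing later at a dereference.
      this.value = Objects.requireNonNull(value, "value");
    }

    @Override
    public boolean equals(Object other) {
      // instanceof is false for null, so equals(null) returns false
      // rather than throwing NullPointerException.
      if (!(other instanceof Name)) {
        return false;
      }
      return value.equals(((Name) other).value);
    }

    @Override
    public int hashCode() {
      return value.hashCode();
    }
  }

  // The null check comes before the first dereference, not after it.
  static int lengthOrZero(String s) {
    if (s == null) {
      return 0;
    }
    return s.length();
  }

  public static void main(String[] args) {
    Name name = new Name("XOM");
    System.out.println(name.equals(null));  // prints "false"
    System.out.println(lengthOrZero(null)); // prints "0"
  }
}
```

Every one of these checks is boilerplate that exists only because any reference might be null, which is what makes the question above worth asking.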

What follows is a thought experiment, not a serious proposal. Still, it might be informative to think about it; and perhaps it will catch the eye of the designer of the next great language.

In Praise of Draconian Error Handling, Part 1

Monday, January 12th, 2009

I’m doing a bit of work on XOM, trying to optimize and improve some of the Unicode normalization code. A lot of this is autogenerated from the Unicode data files, and I’m actually working on the meta-code that parses those files and then generates the actual shipping code. In this code, I’m setting up a switch statement like this one:

       switch(i) {
          case 0:
            return result + "NOT_REORDERED";
          case 1:
            return result + "OVERLAY";
          case 7:
            return result + "NUKTA";
          case 8:
            return result + "KANA_VOICING";
          case 9:
            return result + "VIRAMA";
          case 202:
            return result + "ATTACHED_BELOW";
          case 216:
            return result + "ATTACHED_ABOVE_RIGHT";
          case 218:
            return result + "BELOW_LEFT";
          case 220:
            return result + "BELOW";
          case 222:
            return result + "BELOW_RIGHT";
          case 224:
            return result + "LEFT";
          case 226:
            return result + "RIGHT";
          case 228:
            return result + "ABOVE_LEFT";
          case 230:
            return result + "ABOVE";
          case 232:
            return result + "ABOVE_RIGHT";
          case 233:
            return result + "DOUBLE_BELOW";
          case 234:
            return result + "DOUBLE_ABOVE";
          case 240:
            return result + "IOTA_SUBSCRIPT";
          default:
            return result + "NOT_REORDERED";
       }

And then I stop myself. Do you see the bug? Actually it’s a meta bug that leads to the true bug.
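This excerpt stops before naming the bug. One plausible reading, given the post's title, is that the final catch-all return of NOT_REORDERED silently accepts any value that isn't listed, so bad input is mislabeled instead of reported. A stricter variant might look like the following sketch. The class StrictSwitch and the method name are hypothetical, only three of the cases are repeated, and the result prefix is omitted for brevity.

```java
class StrictSwitch {

  // Hypothetical, shortened version of the method containing the switch.
  static String name(int i) {
    switch (i) {
      case 0:
        return "NOT_REORDERED";
      case 1:
        return "OVERLAY";
      case 230:
        return "ABOVE";
      default:
        // Fail loudly instead of silently labeling unknown input.
        throw new IllegalArgumentException("Unexpected value: " + i);
    }
  }

  public static void main(String[] args) {
    System.out.println(name(230)); // prints "ABOVE"
    try {
      name(3);
    } catch (IllegalArgumentException ex) {
      System.out.println(ex.getMessage()); // prints "Unexpected value: 3"
    }
  }
}
```

With the exception in place, a value that the generator did not anticipate stops the run immediately rather than flowing into the shipping code under the wrong name.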