Overloading int considered harmful: part 2

After I published Overloading int considered harmful, readers raised three concerns:

  • How does this affect object serialization?
  • What about ClassLoaders?
  • How about the new enum keyword in Java 1.5?

I’d like to address those now.

Serialization

Design pattern aficionados will have recognized the type-safe enum pattern I’ve advocated here as just a generalization of the Singleton pattern. Instead of having one instance of a class, we can have two, three, ten, or ten thousand. The main idea is that the set of instances is fixed: programs cannot create new instances not provided for in the source code. Or at least they can’t if Java’s object orientation, in particular its data encapsulation, is complete. However, it turns out there are a couple of holes that allow programs to sneak around the usual constraints of privacy.

The first such hole is object serialization. If a type-safe enum class is not serializable and does not need to be a serializable field of other classes, there’s no problem. However, sometimes it does need to be serializable. For instance, the java.awt.Font class that started this whole discussion is indeed serializable. If we were to replace its style information with a type-safe enum class as I’ve recommended, then we would want the style information to be serialized as well. There are two ways to solve this:

  1. Define custom readObject and writeObject methods in Font that know how to serialize and deserialize these objects.
  2. Make the FontStyle class implement Serializable.

Defining custom readObject and writeObject methods in Font is the preferred solution. It’s much more stable and robust over the long term than relying on Java’s default serialization. If this is your approach, the code is straightforward:

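import java.io.*;

public class Font implements Serializable {

  // transient: FontStyle itself is not Serializable here,
  // so default serialization has to skip this field
  private transient FontStyle style;

  // other code...

  private void writeObject(ObjectOutputStream out) throws IOException {

    out.defaultWriteObject();
    // write the style as a private int code
    if (this.style == FontStyle.PLAIN) {
      out.writeInt(1);
    }
    else if (this.style == FontStyle.BOLD) {
      out.writeInt(2);
    }
    else if (this.style == FontStyle.ITALIC) {
      out.writeInt(3);
    }
    else {
      out.writeInt(4);
    }

  }


  private void readObject(ObjectInputStream in)
   throws IOException, ClassNotFoundException {

    in.defaultReadObject();
    // map the int code back to one of the fixed instances
    int state = in.readInt();
    switch (state) {
      case 1:
        this.style = FontStyle.PLAIN;
        break;
      case 2:
        this.style = FontStyle.BOLD;
        break;
      case 3:
        this.style = FontStyle.ITALIC;
        break;
      case 4:
        this.style = FontStyle.BOLD_ITALIC;
        break;
      default:
        throw new InvalidObjectException("Unrecognized font style");
    }

  }

}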

This isn’t an ideal solution. Some extra FontStyle objects will still be created, but they should never escape from the private scope of the Font class. All that will be exposed to client classes will be the four fixed objects.

The approach that trips people up is the second one. It seems easier. Simply declare that FontStyle implements Serializable like so:

import java.io.Serializable;

public class FontStyle implements Serializable {

  public static final FontStyle PLAIN       = new FontStyle();
  public static final FontStyle BOLD        = new FontStyle();
  public static final FontStyle ITALIC      = new FontStyle();
  public static final FontStyle BOLD_ITALIC = new FontStyle();

  private FontStyle() {}

}

However, there’s an unexpected bug here, lying in wait to trap the unwary. Deserialization does not rely on calling the constructor. Therefore, if you deserialize an object, you have not recreated the original object. Instead you’ve created a new object, even though the constructor was private, and even though no constructor was ever called. Therefore all the nice singleton properties fail. Note that this is a general problem with any singleton style class in Java that implements Serializable. It is not unique to type-safe enum classes.
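The effect is easy to observe with a round trip through a byte array. The following is only a minimal sketch: NaiveSerializationDemo, Style, roundTrip, and the constructorCalls counter are hypothetical names introduced for illustration, and the code uses try-with-resources, so it assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NaiveSerializationDemo {

    // Hypothetical stand-in for a serializable type-safe enum with no readResolve()
    static class Style implements Serializable {
        private static final long serialVersionUID = 1L;

        static int constructorCalls;  // counts how often the private constructor runs

        static final Style PLAIN = new Style();
        static final Style BOLD  = new Style();

        private Style() {
            constructorCalls++;
        }
    }

    // Serializes an object to a byte array and reads it straight back
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(Style.BOLD);
        System.out.println("same instance? " + (copy == Style.BOLD));          // false
        System.out.println("constructor calls: " + Style.constructorCalls);    // 2
    }
}
```

The counter stays at two, one call per declared constant, even though a third Style object now exists.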

Fortunately, the designers of object serialization anticipated this problem. Each serializable class can have a readResolve() method that replaces the object read from the stream with another one. To use it, we need to include a little extra state in each object so we can figure out which of the four constants we’re dealing with:

import java.io.*;

public class FontStyle implements Serializable {

  private final int value;

  public static final FontStyle PLAIN       = new FontStyle(1);
  public static final FontStyle BOLD        = new FontStyle(2);
  public static final FontStyle ITALIC      = new FontStyle(3);
  public static final FontStyle BOLD_ITALIC = new FontStyle(4);

  private FontStyle(int value) {
    this.value = value;
  }
  
  private Object readResolve() throws ObjectStreamException {
    
    switch (value) {
      case 1:
        return PLAIN;
      case 2:
        return BOLD;
      case 3:
        return ITALIC;
      case 4:
        return BOLD_ITALIC;
      default:
        throw new InvalidObjectException("Unrecognized font style");
    }
    
  }

}

Internally, I’ve effectively reintroduced an enumerated int constant, but that’s purely private. It’s completely invisible from outside this class. It is still impossible for clients to screw up by using the wrong int in the wrong place. They still have to use one of the four instances of the FontStyle class defined here.
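A round trip through a byte array can confirm that readResolve() hands back the original instance. This is a minimal sketch rather than the FontStyle class itself: ReadResolveDemo, Style, and roundTrip are hypothetical names, only two constants are declared to keep it short, and it assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

class ReadResolveDemo {

    // Hypothetical type-safe enum that carries a private int code
    static class Style implements Serializable {
        private static final long serialVersionUID = 1L;

        static final Style PLAIN = new Style(1);
        static final Style BOLD  = new Style(2);

        private final int value;

        private Style(int value) {
            this.value = value;
        }

        // Called by the serialization machinery after the fields are read;
        // whatever it returns replaces the freshly created object
        private Object readResolve() throws ObjectStreamException {
            switch (value) {
                case 1:  return PLAIN;
                case 2:  return BOLD;
                default: throw new InvalidObjectException("Unrecognized style " + value);
            }
        }
    }

    // Serializes an object to a byte array and reads it straight back
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Style.BOLD) == Style.BOLD);    // true
        System.out.println(roundTrip(Style.PLAIN) == Style.PLAIN);  // true
    }
}
```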

Multiple ClassLoaders

The next issue is trickier, and again is really a problem with any singleton style class, not merely type-safe enum classes. If you’re operating in an environment with more than one class loader, each separate class loader will load the class again. (This is more common than you might think: it includes servlets, applets, and Ant builds, among others.) If there are two class loaders, there will be two copies of each type-safe enum value. If there are three class loaders, there will be three copies of each one. If there are four class loaders, there will be four copies of each one, and so on. Worse yet, these copies won’t be equal. You can easily work yourself into a place where FontStyle.ITALIC != FontStyle.ITALIC is true.

This wouldn’t actually be a problem for the hypothetical FontStyle class, because this class would be loaded only once by the bootstrap class loader, like all classes in the core library. However, it can be a very real problem for classes in user packages.
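The duplication can be reproduced locally without a servlet container. The sketch below is an illustration under stated assumptions, not production code: TwoLoadersDemo, Suit, IsolatingLoader, and heartsFromFreshLoader are hypothetical names, and it assumes the compiled .class files are available as resources on the classpath (the usual javac-then-java workflow) and Java 7 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Field;

class TwoLoadersDemo {

    // Loader that defines one target class itself instead of delegating to its parent
    static class IsolatingLoader extends ClassLoader {
        private final String target;

        IsolatingLoader(String target) {
            super(TwoLoadersDemo.class.getClassLoader());
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve);  // everything else: normal delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    byte[] bytes = readClassFile(name);
                    loaded = defineClass(name, bytes, 0, bytes.length);
                }
                return loaded;
            }
        }

        // Reads the raw bytes of the class file through the parent loader
        private byte[] readClassFile(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                for (int n; (n = in.read(buffer)) != -1; ) {
                    out.write(buffer, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    // Loads Suit through a brand-new loader and returns that copy's HEARTS constant
    static Object heartsFromFreshLoader() {
        try {
            String name = Suit.class.getName();
            Class<?> suit = new IsolatingLoader(name).loadClass(name);
            Field hearts = suit.getDeclaredField("HEARTS");
            hearts.setAccessible(true);  // Suit is not public
            return hearts.get(null);     // static field, so no receiver is needed
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object a = heartsFromFreshLoader();
        Object b = heartsFromFreshLoader();
        System.out.println(a.getClass().getName().equals(b.getClass().getName()));  // true
        System.out.println(a.getClass() == b.getClass());                           // false
        System.out.println(a == b);                                                 // false
        System.out.println(a == Suit.HEARTS);                                       // false
    }
}

// Hypothetical type-safe enum; any user-defined class behaves the same way
class Suit {
    static final Suit HEARTS = new Suit();

    private Suit() {}
}
```

The two objects come from classes with the same name, yet the classes themselves, and therefore the constants, are distinct.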

Sadly, there’s no easy solution for this problem. All I can suggest here is that if there’s the remotest chance your program might need to run in a multi-classloader environment, then give the class an equals method (and the corresponding hashCode method too, of course) and use that to compare objects rather than ==. Because things get really hinky when comparing instances of the same class loaded by different class loaders, you have to be careful to do this without using either instanceof or casts, as you would in a less robust equals method. You can usually rely on the class’s name and either its hashCode or toString method instead. For example, the hypothetical FontStyle class could be implemented like this:

public class FontStyle  {

  private final int value;

  public static final FontStyle PLAIN       = new FontStyle(1);
  public static final FontStyle BOLD        = new FontStyle(2);
  public static final FontStyle ITALIC      = new FontStyle(3);
  public static final FontStyle BOLD_ITALIC = new FontStyle(4);

  private FontStyle(int value) {
    this.value = value;
  }

  // must be public: an override cannot narrow the access of Object.equals
  public boolean equals(Object o) {

    // compare class names, not classes, so copies from other loaders match
    if (o != null && o.getClass().getName().equals("FontStyle")) {
      return this.hashCode() == o.hashCode();
    }
    return false;

  }

  public int hashCode() {
    return this.value;
  }

}

Note that the ints used here are purely arbitrary. They’re merely used to compare objects across class loaders. They have no particular meaning in and of themselves.

Enums

It was to avoid all this, plus the problems of using ints as constants, that Java 1.5 introduces enums. An enum is really just syntactic sugar for a class. It is defined in a separate file like a class. However, it uses the enum keyword rather than the class keyword. For instance, this is a FontStyle enum:

public enum FontStyle { PLAIN, BOLD, ITALIC, BOLD_ITALIC }

If you decompile the byte code for an enum, you’ll see it’s really just a type-safe enumerated class, like the ones I’ve been talking about all along. Indeed, you can even add additional methods to an enum, though that’s unusual. However, code that uses enums works only in Java 1.5 and later. The byte code format changes made to support enums are incompatible with earlier versions. Unlike code that uses generics, code that uses enums cannot be compiled with Java 1.5 and deployed to 1.4 VMs. Code that needs to operate in Java 1.4 and earlier environments will still be using type-safe enum classes for the foreseeable future.
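Short of decompiling, reflection gives a quick look at what the compiler generates. This is a small sketch; EnumIsAClassDemo and the isBold method are hypothetical additions made up for illustration.

```java
class EnumIsAClassDemo {

    // The FontStyle enum with one extra, purely illustrative method
    enum FontStyle {
        PLAIN, BOLD, ITALIC, BOLD_ITALIC;

        boolean isBold() {
            return this == BOLD || this == BOLD_ITALIC;
        }
    }

    public static void main(String[] args) {
        // The enum compiles to an ordinary class that extends java.lang.Enum
        System.out.println(FontStyle.class.getSuperclass());  // class java.lang.Enum

        // Each constant is a single instance of that class
        System.out.println(FontStyle.BOLD_ITALIC.isBold());   // true
        System.out.println(FontStyle.ITALIC.isBold());        // false
    }
}
```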

However, for Java 1.5 environments, there are a few advantages enums have over user-written type-safe enum classes:

  • Enums implement Comparable. The comparison algorithm is based on the order in which the enum constants are declared. That doesn’t make a lot of sense here, but it might if you were enumerating military ranks, diamond grades, or something else that had a well-defined order.
  • Enums provide toString and valueOf methods. The toString() method returns the name of the enum constant. The valueOf method returns the right constant object for a string, or throws an IllegalArgumentException if there’s no matching constant.
  • The values method returns an array of all the enum values, and can be used with Java 1.5’s new for loop syntax.
  • Enums can be used in switch statements. (This is the only thing we really can’t do with a type-safe enum class in earlier versions of Java.)
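The features in this list can be tried out in a few lines. The sketch below uses a hypothetical Rank enum, borrowing the military ranks idea mentioned above; EnumFeaturesDemo and describe are likewise made-up names.

```java
class EnumFeaturesDemo {

    // Hypothetical enum whose declaration order is meaningful
    enum Rank { PRIVATE, SERGEANT, CAPTAIN, GENERAL }

    // Enums can be used directly as switch labels
    static String describe(Rank rank) {
        switch (rank) {
            case PRIVATE:
                return "enlisted";
            case SERGEANT:
                return "non-commissioned officer";
            default:
                return "officer";
        }
    }

    public static void main(String[] args) {
        // Comparable: ordering follows declaration order
        System.out.println(Rank.SERGEANT.compareTo(Rank.GENERAL) < 0);  // true

        // toString and valueOf round-trip through the constant's name
        System.out.println(Rank.CAPTAIN);                                // CAPTAIN
        System.out.println(Rank.valueOf("CAPTAIN") == Rank.CAPTAIN);     // true

        // valueOf rejects names that match no constant
        try {
            Rank.valueOf("ADMIRAL");
        } catch (IllegalArgumentException e) {
            System.out.println("no such rank");
        }

        // values() works with the enhanced for loop
        for (Rank rank : Rank.values()) {
            System.out.println(rank + ": " + describe(rank));
        }
    }
}
```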

Most importantly for this article, serialization is handled automatically. You no longer have to worry about deserializing an extra instance.
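The serialization behavior can be checked with a round trip through a byte array. This is a minimal sketch; EnumSerializationDemo and roundTrip are hypothetical names, and the code assumes Java 7 or later for try-with-resources.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class EnumSerializationDemo {

    // Every enum is serializable without any extra code
    enum FontStyle { PLAIN, BOLD, ITALIC, BOLD_ITALIC }

    // Serializes an object to a byte array and reads it straight back
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(FontStyle.ITALIC);
        // Deserialization yields the existing constant, not a new object
        System.out.println(copy == FontStyle.ITALIC);  // true
    }
}
```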

On the other hand, enums do not handle multi-classloader issues automatically. It is absolutely possible to have FontStyle.PLAIN != FontStyle.PLAIN when the FontStyle class is loaded twice by two different class loaders. If this is a real possibility in your code (and if you’re writing a library, it’s always a real possibility), the obvious solution is to override equals with a method that does work. Unfortunately, you can’t do that. The equals method in java.lang.Enum is declared final, and all it does is compare objects for object identity with ==. So you really have no choice but to buckle down, write your own type-safe enum class, and warn client programmers to always use equals instead of == when comparing objects.

7 Responses to “Overloading int considered harmful: part 2”

  1. Elliotte Rusty Harold Says:

    Here’s the simple enum class that can be used to prove the enum contract is violated in multiple-classloader environments:

    public enum EnumTest { FIELD }

    You’ll need Java 1.5 to compile it.

  2. elharo Says:

    Loading enums with multiple class loaders

    Here’s a simple class that loads the same Enum class from two
    different class loaders, and then compares their respective enum
    constants for equality.

    import java.lang.reflect.Field;
    import java.net.*;
    import java.io.*;
    
    public class ClassLoaderTest {
    
        public static void main(String[] args) 
          throws IOException, ClassNotFoundException, SecurityException, 
          NoSuchFieldException, IllegalAccessException {
          
            URL url = new URL("http://www.cafeaulait.org/classes/");
            URL[] sources = new URL[1];
            sources[0] = url;
            
            URLClassLoader loader1 = new URLClassLoader(sources);
            URLClassLoader loader2 = new URLClassLoader(sources);
            
            Class enumTest1 = loader1.loadClass("EnumTest");
            Class enumTest2 = loader2.loadClass("EnumTest");
            
            Field field1 = enumTest1.getField("FIELD");
            Field field2 = enumTest2.getField("FIELD");
            Object o1 = field1.get(enumTest1);
            Object o2 = field2.get(enumTest2);
            
            if (o1 == o2) {
                System.out.println("Success!");
            }
            else {
                System.out.println("Failure!");            
            }
            
        }
        
    }

  3. Curt Says:

    Sun could make this work well and efficiently

    For type-safe enums, I extend a base class called AbstractConstant. It is based on an old JavaWorld article that provides a nice starting point. The base class handles serialization and multiple class loaders transparently. It is all based on using reflection and comparing the values using their field names, so it is transparent, but not all that efficient. Sun could change the spec so that Tiger enums have their desired singleton-like behaviour across class loaders. They are in the unique position to be able to enforce that efficiently via information currently private to the class loaders. Namely:

    if (thisClassIsAlsoInAnotherLoader())
        slowerLogic();
    else
        fasterLogic();

    This seems like a good idea to me, since it would make the most commonly desired behaviour the easiest to implement. Can anybody think of a time when you would want something else for the exact same class in two different loaders?

  4. andyjbs Says:

    Sealing your JARs would help some

    If you are distributing a library, sealing your JARs would help mitigate the multiple classloader problem somewhat. To do this, simply add:

    Sealed: true

    to your JAR’s Manifest (you can also seal a single package). If a package is sealed, then all classes loaded from that package must come from the same JAR file, otherwise the VM throws an exception. I would think this would help in the case where 2 different versions of a library that you distribute are inadvertently included on the classpath of an application. If “EnumTest” is loaded from more than one JAR, a SecurityException should immediately flag the problem. I’d have to investigate more on which cases Sealing may or may not solve. I have a feeling it would not work for ERH’s example above and, to be frank, most of the reference material I’ve found on Sealing doesn’t discuss how it behaves with multiple classloaders. It may not help at all!

    BTW, ERH – will the final XOM release be Sealed?

  5. germano.leichsenring Says:

    more on the Multiple ClassLoaders problem

    Hi, I believe that if you have the same class loaded with different classloaders, there’s much more trouble than the one explained, so you won’t even get close to the stated situation.

    Quote: If there are four class loaders, there will be four copies of each one, and so on. Worse yet, these copies won’t be equal. You can easily work yourself into a place where FontStyle.ITALIC != FontStyle.ITALIC is true.

    When a class is loaded by different classloaders, internally they’re different classes. So if you have one FontStyle loaded by CL1 and CL2, internally they’ll be CL1@FontStyle and CL2@FontStyle, for example. Any single class can be linked to CL1@FontStyle or CL2@FontStyle, never both, and it can’t be changed after loading. So as long as your class is linked to one of these (e.g. CL1@FontStyle) you can never access the other (e.g. CL2@FontStyle). One loaded class Font will only work with one or the other, never both. You have a problem only when a) you use reflection to access the fields, and b) you access only java.lang.Object methods, as in the given example. In this example you cannot cast the objects to FontStyle, you will see that enumTest1.isAssignableFrom(enumTest2) returns false, so you can never really use the classes. At the time you try to cast them to FontStyle, you get a ClassCastException. So it comes down to my question: is there a real world example where everything is working fine, and then you have a problem that FontStyle.ITALIC is not the same as FontStyle.ITALIC? I believe not. Please enlighten me.

  6. Elliotte Rusty Harold Says:

    I’ve definitely hit this issue sometime in the past few years, though I’ll have to hunt around in my files to remember exactly when. I think it was in a servlet container. The only reason this problem ever occurred to me is that I lost a day or so to debugging it once. I suspect the issue arises when using == to compare two objects. Even if two classes are loaded by different classloaders, instances of each class can still be passed to the methods of the other one. In other words, linking is not required. As long as one object can see objects of the other type, the problem can arise. Reflection is not required. This was just the easiest way to boil the problem down to a simple example.
