
9-19-04 Puzzling Through Erasure

If you've been struggling at all with Java Generics in JDK 5.0, you know that the implementation decision with the biggest impact is that of "erasure." (JDK 5.0 is currently the proper way to refer to the latest version of the Java programming environment; if you want to talk about the platform, it's J2SE 5.0.) People have been arguing about erasure for quite some time – a search of the Java forums shows that some of the arguments have gotten quite heated, and misunderstanding abounds.

In this and other articles (which will either appear in, or be the basis of, portions of the Generics chapter in Thinking in Java, fourth edition), I'll attempt to sidestep any political feelings about Java Generics, and erasure in particular, and just look at some of the implications. If I happen to vent about something, I'll try to keep that to articles that are clearly marked for that purpose.

Let's first consider a very basic impact of erasure. If I have a type parameter T, not only am I prevented from making an instance of that type (because, with erasure, the type is forgotten), I cannot make an array of that type. However, I can generate an array of Object and cast it to T[].
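A quick way to convince yourself that the type parameter really is forgotten is to compare the runtime classes of two differently parameterized containers. This is only a minimal sketch; the class name ErasureDemo and the helper sameRuntimeClass() are made up for illustration and are not part of the article's listings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not from the original article.
class ErasureDemo {
  // Returns true when both lists share one runtime class,
  // even though their type arguments differ at compile time.
  static boolean sameRuntimeClass() {
    List<String> strings = new ArrayList<String>();
    List<Integer> integers = new ArrayList<Integer>();
    return strings.getClass() == integers.getClass();
  }
  public static void main(String[] args) {
    System.out.println(sameRuntimeClass()); // prints "true"
  }
}
```

Both objects are plain ArrayList instances at run time, so there is no T left from which to build an instance or an array.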

Consider, for example, a simple generic wrapper around an array:

public class GenericArray<T> {
  private T[] array;
  @SuppressWarnings("unchecked") // Not supported in 1.5.0
  public GenericArray(int sz) {
    // It's what you mean, but it won't work (erasure):
    // array = new T[sz];
    array = (T[])new Object[sz];
  }
  public void put(int index, T item) {
    array[index] = item;
  }
  public T get(int index) {
    return array[index];
  }
  public static void main(String[] args) {
    GenericArray<Integer> gai =
      new GenericArray<Integer>(10);
    for(int i = 0; i < 10; i ++)
      gai.put(i, i);
    for(int i = 0; i < 10; i ++)
      System.out.println(gai.get(i));
  }
}

In the constructor, what you mean to do is create an array of T. I suspect one of the predominant stumbling blocks in Java Generics will be the fact that you must always remind yourself "Oh, yeah, it only seems like we know something about the type parameter. That information has actually been erased." So here, you mean T[] array = new T[sz];, but erasure forgets about T so that's not legal. To solve the problem, you create an array of objects and cast it.

The compiler responds by giving you a warning:

Note: GenericArray.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
Those of us who worked with C, especially pre-ANSI C, remember a particular effect of warnings: when you discover you can ignore them, you do. For that reason, it's best to not issue any kind of a message from the compiler unless the programmer must do something about it.

In this case, we've gotten a single warning, and we believe that it's about the cast. But if you really want to make sure, you should recompile with -Xlint:unchecked:

GenericArray.java:9: warning: [unchecked] unchecked cast
found   : java.lang.Object[]
required: T[]
    array = (T[])new Object[sz];
                 ^
1 warning

Yep, sure enough it's complaining about that cast. But now that I've verified this I will feel pretty confident that in the future I can just ignore the warnings. Of course that was sarcasm – the best thing we could possibly do, once we verify that a particular warning is expected, is to turn it off. That way, when a warning does appear, we'll actually investigate it.
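It may help to see what the compiler is worried about. The cast compiles, but the object is still an Object[] at run time, and that shows up as soon as the array escapes to code that knows the real type argument. The sketch below is an assumption-laden variation on GenericArray: the class ExposedArray and its rep() and castFails() methods are hypothetical names added only to expose the problem.

```java
// Hypothetical variation on GenericArray that leaks its internal array.
class ExposedArray<T> {
  private T[] array;
  ExposedArray(int sz) {
    // Unchecked cast: the object created here is an Object[], whatever T is.
    array = (T[]) new Object[sz];
  }
  // Hands the internal array to the caller as a T[].
  T[] rep() { return array; }

  // Returns true if using the leaked array as Integer[] fails at run time.
  static boolean castFails() {
    ExposedArray<Integer> ea = new ExposedArray<Integer>(10);
    try {
      // The compiler inserts a cast to Integer[] at this assignment.
      Integer[] ia = ea.rep();
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }
  public static void main(String[] args) {
    System.out.println(castFails()); // prints "true"
  }
}
```

As long as the array stays private and is only reached through put() and get(), as in GenericArray, the cast does no harm, which is why this particular warning can be treated as expected.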

To turn off the warning, Java provides an annotation, the one that you see in the listing:

@SuppressWarnings("unchecked")
Notice that this is placed on the method that generates the warning, rather than the entire class. It's best to be as "focused" as possible when you turn off a warning, so that you don't accidentally cloak a real problem by turning off warnings too broadly.
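The annotation can be focused even more tightly than a method, because Java also allows annotations on local variable declarations. Here is a sketch of that idea, assuming a compiler that honors @SuppressWarnings; the class name NarrowSuppress and the local variable unchecked are invented for this example.

```java
// Hypothetical class showing suppression scoped to a single declaration.
class NarrowSuppress<T> {
  private final T[] array;
  NarrowSuppress(int sz) {
    // The annotation covers only this one declaration,
    // so any other unchecked operation in the constructor still warns.
    @SuppressWarnings("unchecked")
    T[] unchecked = (T[]) new Object[sz];
    array = unchecked;
  }
  void put(int index, T item) { array[index] = item; }
  T get(int index) { return array[index]; }

  public static void main(String[] args) {
    NarrowSuppress<String> ns = new NarrowSuppress<String>(2);
    ns.put(0, "hello");
    System.out.println(ns.get(0)); // prints "hello"
  }
}
```

The extra local variable is the price of the narrower scope; whether that is worth it is a judgment call.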

So the SuppressWarnings annotation is just what we need ... except that, for some reason, it is not enabled in JDK 5.0. This brings up two questions: (1) what do the library designers do when they need to create an array inside of a generic type? (2) If they cast Object arrays, do the libraries produce warnings when they compile?

To answer the first question, I searched the library sources, and I don't think there's a better tool for that than GNU grep. From habit, I started by trying find, and over time I've even gotten at least tolerably functional with that tool. But GNU grep has added some really good extensions to the old *Nix grep. If you are using Windows, which has no GNU grep, you're in luck – there's a terrific project called cygwin that Red Hat supports, where most of the Linux environment has been simulated under Windows. Just go to http://www.cygwin.com and follow the directions, and you'll end up with the equivalent of a DOS command window running Linux, with all the tools like grep built in. If you want more information about grep, just type info grep. If you'd like to be able to have a "bash prompt here" facility, see this page.

Later: someone pointed out that Windows now has findstr already installed. I've been using cygwin and grep for a while now so I wasn't looking for this, and didn't know it had been added (or when). But it appears to be quite powerful and includes regular expressions -- and has the benefit of already being on your machine. I use cygwin to solve many other problems, so it has far more value than just grep.

With cygwin installed, go to the root of the Java source code directory (after you've installed the latest version of JDK 5.0 from java.sun.com, of course). Here's the grep command that I used:

grep -R -n -B 3 -A 10 --include="*.java" "([A-Z]\[\])" .
The -R flag tells grep to recurse through all subdirectories. The --include="*.java" flag tells it to only include Java files in the search. -n says to print line numbers. -B 3 says to print three lines before the match, and -A 10 says to print 10 lines after the match. And the regular expression itself says to find everything that looks like an array cast to a generic type, for example '(T[])'. You can see that GNU grep is a powerful tool for searching the Java sources for example code.

The findstr equivalent is:

findstr /N /S /R "([A-Z]\[\])" *.java
Unfortunately, you do not get the benefit of being able to print lines before and after the match, as you do with GNU grep. (It is possible that someone has ported GNU grep to Windows by itself, so that you don't have to install all of Cygwin.)
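If you want to check what the search pattern actually matches before running it over the whole source tree, you can try it in Java itself. This is just a sketch: the class CastPattern and its matches() helper are hypothetical. Note that java.util.regex treats unescaped parentheses as grouping, so the literal parentheses are escaped here, unlike in the grep command.

```java
import java.util.regex.Pattern;

// Hypothetical helper for trying the search pattern on single lines.
class CastPattern {
  // Java-regex form of the search: a literal "(", one capital letter, "[]", then ")".
  static final Pattern GENERIC_ARRAY_CAST =
    Pattern.compile("\\([A-Z]\\[\\]\\)");

  // True if the line contains something that looks like a cast such as (T[]) or (E[]).
  static boolean matches(String line) {
    return GENERIC_ARRAY_CAST.matcher(line).find();
  }
  public static void main(String[] args) {
    System.out.println(matches("elementData = (E[])new Object[size];")); // prints "true"
    System.out.println(matches("Object[] a = new Object[size];"));       // prints "false"
  }
}
```

Because the pattern allows only a single capital letter, it relies on the convention that type parameters have one-letter names; a cast such as (Object[]) is not matched.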

Using the above search, I discovered that, yes indeed, there are casts from Object array to parameterized types everywhere in the Java sources. So that appears to be the accepted and proper approach. For example, here's the copy-ArrayList-from-Collection constructor, after cleaning up and simplifying:

public ArrayList(Collection c) {
  size = c.size();
  elementData = (E[])new Object[size];
  c.toArray(elementData);
}
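To experiment with this idiom outside the library, the simplified constructor can be dropped into a small class of your own. The following is only a sketch under my own assumptions: MiniList, its get() and size() methods, and the bounded parameter type Collection<? extends E> are illustrative choices, not the actual ArrayList source.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical stand-in for ArrayList, just large enough to run the constructor.
class MiniList<E> {
  private E[] elementData;
  private int size;

  MiniList(Collection<? extends E> c) {
    size = c.size();
    // Unchecked cast from Object[] to E[], as in the simplified constructor.
    elementData = (E[]) new Object[size];
    // The array is exactly large enough, so toArray fills it in place.
    c.toArray(elementData);
  }
  E get(int index) { return elementData[index]; }
  int size() { return size; }

  public static void main(String[] args) {
    MiniList<String> ml = new MiniList<String>(Arrays.asList("a", "b"));
    System.out.println(ml.size() + " " + ml.get(1)); // prints "2 b"
  }
}
```

Compiling this produces the same kind of unchecked note, and the class works because the array never leaves the object as an E[].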

If you look through ArrayList.java, you'll find plenty more of these casts. And what happens when we compile it?

Note: ArrayList.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

Sure enough, the standard libraries produce lots of warnings. So "cast-and-ignore-warnings" is the acceptable idiom for JDK 5.0. I'm told that in a future release (I'm guessing a point release of JDK 5), SuppressWarnings will be enabled, and at that point the warnings will become meaningful again.

©2003 MindView, Inc.