9-19-04 Puzzling Through Erasure
If you've been struggling at all with Java Generics in JDK 5.0 (currently the
proper way to refer to the latest version of the Java programming environment;
if you want to talk about the platform, it's J2SE 5.0. See here for
clarification), you know that the implementation decision with the biggest
impact is that of "erasure." People have been arguing about it for quite
some time; this search on the Java forums shows that
some of the arguments have gotten quite heated, and misunderstanding abounds.
In this and other articles (which will either appear in, or be the basis of,
portions of the Generics chapter in Thinking in Java, fourth edition),
I'll attempt to sidestep any political feelings about Java Generics, and erasure
in particular, and just look at some of the implications (if I happen to vent
about something, I'll try to keep that to articles that are clearly marked for
that purpose).
Let's first consider a very basic impact of erasure. If I have a type
parameter T, not only am I prevented from making an instance of that type
(because, with erasure, the type is forgotten), I cannot make an array of that
type. However, I can generate an array of Object and cast it to
T[].
Consider, for example, a simple generic wrapper around an array:
public class GenericArray<T> {
  private T[] array;
  @SuppressWarnings("unchecked") // Not supported in 1.5.0
  public GenericArray(int sz) {
    // It's what you mean, but it won't work (erasure):
    // array = new T[sz];
    array = (T[])new Object[sz];
  }
  public void put(int index, T item) {
    array[index] = item;
  }
  public T get(int index) {
    return array[index];
  }
  public static void main(String[] args) {
    GenericArray<Integer> gai =
      new GenericArray<Integer>(10);
    for(int i = 0; i < 10; i++)
      gai.put(i, i);
    for(int i = 0; i < 10; i++)
      System.out.println(gai.get(i));
  }
}
In the constructor, what you mean to do is create an array of T.
I suspect one of the
predominant stumbling blocks in Java Generics will be the fact that you must
always remind yourself "Oh, yeah, it only seems like we know something
about the type parameter. That information has actually been erased." So here,
you mean T[] array = new T[sz];, but erasure forgets about T
so that's not legal. To solve the problem, you create an array of objects and
cast it.
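To see the forgetting in action, here is a minimal sketch. The class name ErasureDemo and its methods are hypothetical, made up for illustration, and it assumes a compiler that honors @SuppressWarnings. It checks two things at run time: that lists with different type arguments share a single class, and that the array behind the cast is still a plain Object array.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original listing.
class ErasureDemo<T> {
  private final T[] array;

  @SuppressWarnings("unchecked") // Assumes the compiler honors this annotation
  ErasureDemo(int size) {
    // Same idiom as GenericArray: create Object[], then cast.
    array = (T[])new Object[size];
  }

  // Reports the runtime type of the backing array.
  String backingArrayType() {
    return array.getClass().getSimpleName();
  }

  // True when two lists share one runtime class, whatever their type arguments.
  static boolean sameRuntimeClass(List<?> a, List<?> b) {
    return a.getClass() == b.getClass();
  }

  public static void main(String[] args) {
    System.out.println(
      sameRuntimeClass(new ArrayList<Integer>(), new ArrayList<String>()));
    System.out.println(new ErasureDemo<Integer>(3).backingArrayType());
  }
}
```

Running main prints true and then Object[]: the Integer in ErasureDemo<Integer> leaves no trace in the array that actually gets created.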
The compiler responds by giving you a warning:
Note: GenericArray.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
Those of us who worked with C, especially pre-ANSI C, remember a
particular effect of warnings: when you discover you can ignore them, you do.
For that reason, it's best to not issue any kind of a message from the compiler
unless the programmer must do something about it.
In this case, we've gotten a single warning, and we believe that it's about
the cast. But if you really want to make sure, you should recompile with
-Xlint:unchecked:
GenericArray.java:9: warning: [unchecked] unchecked cast
found : java.lang.Object[]
required: T[]
array = (T[])new Object[sz];
^
1 warning
Yep, sure enough, it's complaining about that cast. But now that I've verified
this, I will feel pretty confident that in the future I can just ignore the
warnings. Of course, that was sarcasm: the best thing we could possibly do,
once we verify that a particular warning is expected, is to turn it off.
That way, when a warning does appear, we'll actually investigate it.
To turn off the warning, Java provides an annotation, the one that you
see in the listing:
@SuppressWarnings("unchecked")
Notice that this is placed on the method that generates the warning,
rather than the entire class. It's best to be as "focused" as possible when you
turn off a warning, so that you don't accidentally cloak a real problem by
turning off warnings too broadly.
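The annotation can be focused even more tightly than a whole method or constructor, because Java also allows it on a local variable declaration. What follows is only a sketch under assumptions: the class name NarrowlySuppressed and the variable name created are hypothetical, and it presumes a compiler that actually honors @SuppressWarnings.

```java
// Hypothetical variant of GenericArray; the names are illustrative only.
class NarrowlySuppressed<T> {
  private final T[] array;

  NarrowlySuppressed(int size) {
    // The annotation covers only this one declaration, so any other
    // unchecked operation in the constructor would still be reported.
    @SuppressWarnings("unchecked")
    T[] created = (T[])new Object[size];
    array = created;
  }

  void put(int index, T item) {
    array[index] = item;
  }

  T get(int index) {
    return array[index];
  }

  public static void main(String[] args) {
    NarrowlySuppressed<String> names = new NarrowlySuppressed<String>(2);
    names.put(0, "erasure");
    System.out.println(names.get(0));
  }
}
```

The extra local variable is the price of the narrower scope: the annotation attaches to declarations, not to an assignment to an existing field.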
So the SuppressWarnings annotation is just what we need ... except
that, for some reason, it is not enabled in JDK 5.0. This brings up two
questions: (1) What do the library designers do when they need to create an
array inside of a generic type? (2) If they cast Object arrays, do the
libraries produce warnings when they compile?
To answer the first question, I don't think there's a better solution than
GNU grep. From habit, I started by trying find, and over time I've even
gotten at least tolerably functional with that tool. But GNU grep has
added some really good extensions to the old *Nix grep. If you are using
Windows, which has no GNU grep, you're in luck: there's a terrific project
called cygwin that Red Hat supports, where most of the Linux environment has
been simulated under Windows. Just go to http://www.cygwin.com and follow the
directions, and you'll end up with the equivalent of a DOS command window
running Linux, with all the tools like grep built in. If you want more
information about grep, just type info grep. If you'd like to be
able to have a "bash prompt here" facility, see this page.
Later: someone pointed out that Windows now has findstr already installed.
I've been using cygwin and grep for a while now so I wasn't looking for this, and
didn't know it had been added (or when). But it appears to be quite powerful and
includes regular expressions -- and has the benefit of already being on your machine.
I use cygwin to solve many other problems, so it has far more value than just grep.
With cygwin installed, go to the root of the Java source code
directory (after you've installed the latest version of JDK 5.0 from
java.sun.com, of course). Here's the grep command that I used:
grep -R -n -B 3 -A 10 --include="*.java" "([A-Z]\[\])" .
The -R flag tells grep to recurse through all subdirectories. The
--include="*.java" flag tells it to only include Java files in the
search. -n says to print line numbers. -B 3 says to print three
lines before the match, and -A 10 says to print 10 lines after the match.
And the regular expression itself says to find everything that looks like an
array cast to a generic type, for example '(T[])'. You can see that GNU
grep is a powerful tool for searching the Java sources for example code.
The findstr equivalent is:
findstr /N /S /R "([A-Z]\[\])" *.java
Unfortunately, you do not get the benefit of being able to print lines before
and after the match, as you do with GNU grep. (It is possible that someone has
ported GNU grep to Windows by itself, so that you don't have to install all of Cygwin.)
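If neither tool is handy, the same search can be roughed out in Java itself. This is only an illustrative sketch, not anything shipped with the JDK: the class name CastFinder and its methods are hypothetical, and it assumes Java 8 or later for Files.walk and lambdas. Like findstr, it prints only the matching lines, with no surrounding context.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: finds casts such as (T[]) or (E[]) in .java files.
class CastFinder {
  // Same idea as the grep expression: one capital letter followed by [].
  static final Pattern ARRAY_CAST = Pattern.compile("\\([A-Z]\\[\\]\\)");

  static boolean matchesArrayCast(String line) {
    return ARRAY_CAST.matcher(line).find();
  }

  // Walks root recursively and returns "file:line:text" for every match.
  static List<String> find(Path root) throws IOException {
    List<Path> files;
    try (Stream<Path> paths = Files.walk(root)) {
      files = paths
        .filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".java"))
        .sorted()
        .collect(Collectors.toList());
    }
    List<String> hits = new ArrayList<String>();
    for (Path file : files) {
      // ISO-8859-1 maps every byte to a character, so decoding cannot fail.
      List<String> lines = Files.readAllLines(file, StandardCharsets.ISO_8859_1);
      for (int i = 0; i < lines.size(); i++) {
        if (matchesArrayCast(lines.get(i)))
          hits.add(file + ":" + (i + 1) + ":" + lines.get(i));
      }
    }
    return hits;
  }

  public static void main(String[] args) throws IOException {
    Path root = Paths.get(args.length > 0 ? args[0] : ".");
    for (String hit : find(root))
      System.out.println(hit);
  }
}
```

Run it from the root of the source tree, or pass the directory as the first argument.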
Using the above search, I discovered that, yes indeed, there are casts from
Object array to parameterized types everywhere in the Java sources. So
that appears to be the accepted and proper approach. For example, here's the
copy-ArrayList-from-Collection constructor, after cleaning up
and simplifying:
public ArrayList(Collection<? extends E> c) {
  size = c.size();
  elementData = (E[])new Object[size];
  c.toArray(elementData);
}
If you look through ArrayList.java, you'll find plenty more of these casts.
And what happens when we compile it?
Note: ArrayList.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
Sure enough, the standard libraries produce lots of warnings. So
"cast-and-ignore-warnings" is the acceptable idiom for JDK 5.0. I'm told that in a future
release (I'm guessing a point release of JDK 5), SuppressWarnings will be
enabled, and at that point the warnings will become meaningful again.