9-26-04 Puzzling Through Erasure III
In this response to my article Puzzling Through Erasure, Neal Gafter points out
that he was lazy when rewriting the Java libraries (he doesn't have enough to do
writing the compiler, he has to write the libraries too?), and that we should
not do what he did.
So, first lesson: even though something appears in the Java library sources,
that's not necessarily the right way to do it. This is disappointing, since
"Java from Sun" has usually been held up as the reference implementation. Now
when I find something coded in the libraries, I'll have to question whether it
is the right way to do it or just expedience.
Neal goes on to say:
It would have been better for me to use an Object array inside the collection,
and add a cast (yes, to T) everywhere necessary when an element is removed from
the array.
Let's see how that would look with the GenericArray.java example, and then move on to
what Neal says should be the correct solution. First, moving the cast:
public class GenericArray2<T> {
  private Object[] array;
  public GenericArray2(int sz) {
    array = new Object[sz];
  }
  public void put(int index, T item) {
    array[index] = item;
  }
  @SuppressWarnings("unchecked") // Not supported in 1.5.0
  public T get(int index) {
    return (T)array[index];
  }
  public static void main(String[] args) {
    GenericArray2<Integer> gai =
      new GenericArray2<Integer>(10);
    for(int i = 0; i < 10; i++)
      gai.put(i, i);
    for(int i = 0; i < 10; i++)
      System.out.println(gai.get(i));
  }
}
At first glance this doesn't look very different; the cast has just been
moved, and you still get an unchecked warning. However, the internal
representation is now Object[] rather than T[]. In the above example it makes
no difference at all, but Neal's point was that if you ever expose the internal
representation (typically not a good idea if you can avoid it) by returning it
as T[], as you could in GenericArray.java, you end up with a
ClassCastException.
When we try it by modifying GenericArray.java, the results are in fact
quite surprising:
public class GenericArrayExposed<T> {
  private T[] array;
  public GenericArrayExposed(int sz) {
    array = (T[])new Object[sz];
  }
  public void put(int index, T item) {
    array[index] = item;
  }
  public T get(int index) {
    return array[index];
  }
  // Method that exposes the underlying representation:
  public T[] rep() { return array; }
  public static void main(String[] args) {
    GenericArrayExposed<Integer> gai =
      new GenericArrayExposed<Integer>(10);
    // Surprise! This causes a ClassCastException:
    Integer[] ia = gai.rep();
    // This is OK:
    Object[] oa = gai.rep();
  }
}
I've instantiated a GenericArrayExposed<Integer>, and
rep() returns a T[], which should be an
Integer[], but if I call it and try to capture it in an
Integer[] reference, I get a ClassCastException, presumably because
the actual runtime type is Object[]. I would not have guessed this
initially, but since it is the case I actually find the ClassCastException
reassuring, as I am glad to receive such information whenever I can get it (even
at runtime).
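To see where that exception comes from, here is a minimal sketch that isolates the cast. The class name ErasedArrayCheck and its helper methods are my own, made up for illustration; they are not part of GenericArray.java. The sketch assumes what the behavior above suggests: the (T[]) cast inside the generic code is erased and checks nothing, while the compiler inserts a real cast to Integer[] at the point where the caller asks for an Integer[].

```java
public class ErasedArrayCheck {
  // Hypothetical helper that mirrors the constructor cast in
  // GenericArrayExposed. After erasure the cast is to Object[],
  // so it cannot fail here.
  @SuppressWarnings("unchecked")
  static <T> T[] makeArray(int sz) {
    return (T[])new Object[sz];
  }
  // Viewed through an Object[] reference, no further cast is needed,
  // and the array reports its real runtime type.
  static String runtimeTypeName() {
    Object[] oa = ErasedArrayCheck.<Integer>makeArray(3);
    return oa.getClass().getSimpleName();
  }
  // Assigning to Integer[] makes the compiler insert a cast to
  // Integer[] at this call site; that cast is what fails.
  static boolean integerViewFails() {
    try {
      Integer[] ia = ErasedArrayCheck.<Integer>makeArray(3);
      return false;
    } catch(ClassCastException e) {
      return true;
    }
  }
  public static void main(String[] args) {
    System.out.println(runtimeTypeName());  // Object[]
    System.out.println(integerViewFails()); // true
  }
}
```

The array is an Object[] the whole time; the exception only appears when some piece of code outside the generic class insists on treating it as an Integer[].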
Now let's try it using Neal's first suggestion:
public class GenericArray2Exposed<T> {
  private Object[] array;
  public GenericArray2Exposed(int sz) {
    array = new Object[sz];
  }
  public void put(int index, T item) {
    array[index] = item;
  }
  public T get(int index) {
    return (T)array[index];
  }
  // Return the underlying representation:
  public T[] rep() { return (T[])array; }
  // Could this be what Neal meant?:
  public Object[] rep2() { return array; }
  public static void main(String[] args) {
    GenericArray2Exposed<Integer> gai =
      new GenericArray2Exposed<Integer>(10);
    // Still causes a ClassCastException:
    // Integer[] ia = gai.rep();
    // This is OK, like before:
    Object[] oa = gai.rep();
    // This also works:
    Object[] oa2 = gai.rep2();
    // These cause ClassCastExceptions:
    // Integer[] ia2 = (Integer[])gai.rep();
    // Integer[] ia3 = (Integer[])gai.rep2();
  }
}
I have to admit that I'm still confused. The results of using an
Object[] instead of a T[] don't look very different to
me. I will have to hope that Neal can take the time to explain it.
However, note that this issue only comes up if you actually expose the
underlying representation by returning a reference to it, and since Neal did not
do this in the examples that my grep through the Standard Java Libraries
discovered, he is safe for the moment. But if someone comes along in the future
and decides to extend the Java APIs by returning a reference to the underlying
implementation, their unit tests will immediately show that they will have to
change the underlying code. So maybe it's not so dangerous after all.
But this still isn't the way Neal said we should do it for new code (he
points out that he could not fix the Java library code without breaking the
existing interface). To do it right, we need to pass in a type token;
that is, explicitly pass in the class type that gets erased. So now the
generic array looks like this:
import java.lang.reflect.*;
public class GenericArrayWithTypeToken<T> {
  private T[] array;
  public GenericArrayWithTypeToken(Class<T> type, int sz) {
    array = (T[])Array.newInstance(type, sz);
  }
  public void put(int index, T item) {
    array[index] = item;
  }
  public T get(int index) {
    return array[index];
  }
  // Expose the underlying representation:
  public T[] rep() { return array; }
  public static void main(String[] args) {
    GenericArrayWithTypeToken<Integer> gai =
      new GenericArrayWithTypeToken<Integer>(Integer.class, 10);
    // Both of these now work:
    Integer[] ia = gai.rep();
    Object[] oa = gai.rep();
  }
}
The type token Class<T> type is passed into the
constructor in order to recover from the erasure, so that we can create the
actual type of array that we need. And once we do get the actual type, we can
return it and get the desired results, as you see in main().
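As a quick check on what the type token buys, the following sketch inspects the array that Array.newInstance() produces. The class name TypeTokenCheck and its helper methods are hypothetical, introduced only for this illustration. It relies on two things: java.lang.reflect.Array.newInstance(Class, int) creates an array whose runtime component type is the class you hand it, and Java arrays check the type of every element stored into them.

```java
import java.lang.reflect.Array;

public class TypeTokenCheck {
  // Hypothetical helper that mirrors the constructor of
  // GenericArrayWithTypeToken.
  @SuppressWarnings("unchecked")
  static <T> T[] makeArray(Class<T> type, int sz) {
    return (T[])Array.newInstance(type, sz);
  }
  // The array really is an Integer[], so this assignment succeeds
  // and the component type can be read back at runtime.
  static Class<?> componentType() {
    Integer[] ia = makeArray(Integer.class, 3);
    return ia.getClass().getComponentType();
  }
  // Because the runtime type is Integer[], the array itself rejects
  // anything that is not an Integer, even through an Object[] view.
  static boolean rejectsWrongElement() {
    Object[] oa = makeArray(Integer.class, 3);
    try {
      oa[0] = "not an Integer";
      return false;
    } catch(ArrayStoreException e) {
      return true;
    }
  }
  public static void main(String[] args) {
    System.out.println(componentType());       // class java.lang.Integer
    System.out.println(rejectsWrongElement()); // true
  }
}
```

The cast to T[] is still unchecked as far as the compiler is concerned, but here it happens to be telling the truth, which is why both assignments in main() above succeed.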
I will respond to the remainder of Neal's comments in the next article.