3-10-04 Generics Aren't
Note: there is a follow-up article.
I was invited to be the guest at the Silicon Valley Patterns Group last
night, and was able to choose the topic. In preparation for learning about JDK
1.5, I chose Java Generics, and we all ended up with a bit of a shock as a
result. Our primary source of information was the new Sun paper on Java
Generics.
My experience with "parameterized types" comes from C++, which was based on
Ada's generics. Apparently Lisp was the first language to do type
parameterization, and someone said Simula had it as well. In those languages,
when you use a type parameter, that parameter takes on a latent type: one
that is implied by how it is used, but never explicitly specified. That is, the
latent type is implied by the methods that you call on it. If your template
function calls f() and g() on a type, then you imply a
type that has methods f() and g(), even though that
type is never actually defined anywhere.
So for example, in Python you can do this:
def speak(anything):
    anything.talk()
Notice that there is no constraint on the type of anything, which is
just an identifier, except that it must be able to perform the operations
that speak() asks of it. That implies an interface, but you
never have to write that interface out explicitly, so it's latent.
Now I can say:
class Dog:
    def talk(self): print("Arf!")
    def reproduce(self): pass

class Robot:
    def talk(self): print("Click!")
    def oilChange(self): pass

a = Dog()
b = Robot()
speak(a)
speak(b)
speak() doesn't care about the type of its argument, so I can
pass anything to it, so long as the object I pass supports the
talk() method. The Ruby solution, I believe, has exactly the same
properties.
In C++ you can do the equivalent:
class Dog {
public:
    void talk() { }
    void reproduce() { }
};

class Robot {
public:
    void talk() { }
    void oilChange() { }
};

template<class T> void speak(T speaker) {
    speaker.talk();
}

int main() {
    Dog d;
    Robot r;
    speak(d);
    speak(r);
}
Again, speak() doesn't care about the type of its argument. But
it still makes sure at compile time that it can actually send
those messages.
But in Java (and apparently C#), you can't seem to say "any type." The
following won't compile with JDK 1.5 (note that you must invoke the compiler
with the -source 1.5 flag to compile Java Generics):
public class Communicate {
    public <T> void speak(T speaker) {
        speaker.talk(); // Error: talk() is not a method of Object
    }
}
However, this will:
public class Communicate {
    public <T> void speak(T speaker) {
        speaker.toString(); // Object methods work!
    }
}
Java Generics use "erasure," which drops everything back to
Object if you try to say "any type." So when I say
<T>, it doesn't really mean "anything" the way it does in C++, Ada,
Python, etc.; it means "Object."
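One way to see erasure for yourself is to compare the runtime classes of two differently parameterized lists. The following is a minimal sketch; the class name ErasureDemo and the method sameRuntimeClass() are made up for illustration and do not come from the Sun paper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, named only for this illustration.
class ErasureDemo {
    // Returns true if both lists share a single runtime class, which is
    // what happens when the compiler erases the type arguments.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```

At run time there is only ArrayList; the String and Integer parameters exist for the compiler and are gone afterwards.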
Apparently the "correct Java Generic" way of doing this is to define an
interface with the speak method in it, and specify that interface as a
constraint. This compiles:
interface Speaks { void speak(); }

public class Communicate {
    public <T extends Speaks> void speak(T speaker) {
        speaker.speak();
    }
}
What this says is that "T must be a subclass or implementation
of Speaks." So my reaction is "If I have to specify a subclass, why
not just use the normal extension mechanism and avoid the extra clutter and
confusion?" Like this:
interface Speaks { void speak(); }

public class CommunicateSimply {
    public void speak(Speaks speaker) {
        speaker.speak();
    }
}
In this example, generics have no advantage. In fact, it's confusing if you
see them used, because you scratch your head and wonder "why does he need a
generic here? What is the advantage?" Answer: none.
If we want to implement the "Dogs and Robots" example using generics, we are
forced to use an interface or superclass, to make the so-called "generic" be a
specific type:
interface Speaks { void talk(); }

class Dog implements Speaks {
    public void talk() { }
    public void reproduce() { }
}

class Robot implements Speaks {
    public void talk() { }
    public void oilChange() { }
}

class Communicate {
    public static <T extends Speaks> void speak(T speaker) {
        speaker.talk();
    }
}

public class DogsAndRobots {
    public static void main(String[] args) {
        Dog d = new Dog();
        Robot r = new Robot();
        Communicate.speak(d);
        Communicate.speak(r);
    }
}
(Aside: note the use of extends rather than
implements in the generic type constraint. implements
won't work. Java is precise and consistent because Sun says it is).
Again, this has zero advantage over the simple interface-only approach:
interface Speaks { void talk(); }

class Dog implements Speaks {
    public void talk() { }
    public void reproduce() { }
}

class Robot implements Speaks {
    public void talk() { }
    public void oilChange() { }
}

class Communicate {
    public static void speak(Speaks speaker) {
        speaker.talk();
    }
}

public class SimpleDogsAndRobots {
    public static void main(String[] args) {
        Dog d = new Dog();
        Robot r = new Robot();
        Communicate.speak(d);
        Communicate.speak(r);
    }
}
So if we write generic code that actually takes a "type of anything," that
type can only be an Object, and our generic code must only call
Object methods on it. So really, we are restricted to code that is
already "generic to Object," except for casting up to
Object and down from Object, which this wonderful new
syntax will do for us. Sounds like it's a solution for collection classes and
not much else, doesn't it? The conclusion that the Silicon Valley Patterns Group
came to was that these so-called Generics seem to only solve the problem of
automatically casting in and out of containers.
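To make the "automatic casting" point concrete, here is a small side-by-side sketch of the same container code written with and without generics. The class AutoCastDemo and its two methods are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, named only for this illustration.
class AutoCastDemo {
    // Pre-generics style: the list holds Object, so we cast on the way out.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String firstWithCast() {
        List names = new ArrayList();
        names.add("Rex");
        return (String) names.get(0); // explicit downcast from Object
    }

    // Generic style: no cast in the source; the compiler supplies it.
    static String firstWithGenerics() {
        List<String> names = new ArrayList<String>();
        names.add("Rex");
        return names.get(0);
    }

    public static void main(String[] args) {
        System.out.println(firstWithCast());     // prints "Rex"
        System.out.println(firstWithGenerics()); // prints "Rex"
    }
}
```

The two methods do the same thing. What the generic version buys you is that the compiler rejects names.add(new Object()) instead of letting the cast fail at run time.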
There seems to be an argument that if <T> could really
use an "anything" argument, it wouldn't be type-safe. This is obviously
incorrect because C++ catches such errors at compile-time. "Aha," they say, "but
we (chose to)/(were forced to) implement Java Generics in a different way."
(Previous rant removed). So generics are really "autocasting." That's the way of the
Java world, and we are going to miss out on latent typing (it's actually possible
to simulate latent typing using reflection, as I do once or twice in Thinking in Java,
but it's messy and much less elegant). I was shocked at first, but now I'm over it
and at least it's clear that this is the way things are going to happen. C# also doesn't
support latent typing, although it has a better generics model than Java (they
went ahead and changed the underlying IL so that, for example, a static field in a generic
class is distinct for each parameterization of that class). So if you want latent typing you'll
have to use C++ or D (or Python, Smalltalk, Ruby, etc.).
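For the curious, simulating latent typing with reflection might look something like the following. This is a minimal sketch under a few assumptions, not the code from Thinking in Java: the class name LatentCommunicate is made up, talk() returns a String here (rather than void, as in the earlier examples) so the result is easy to see, and ReflectiveOperationException requires a newer JDK than 1.5.

```java
import java.lang.reflect.Method;

// Note: Dog and Robot share no interface or common superclass.
class Dog {
    public String talk() { return "Arf!"; }
    public void reproduce() { }
}

class Robot {
    public String talk() { return "Click!"; }
    public void oilChange() { }
}

// Hypothetical helper class, named only for this illustration.
class LatentCommunicate {
    // Looks up a public no-argument talk() method at run time and calls it.
    // A missing method is only discovered when the call is made, not at
    // compile time as it would be in C++.
    static Object speak(Object speaker) {
        try {
            Method talk = speaker.getClass().getMethod("talk");
            return talk.invoke(speaker);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(
                speaker.getClass().getName() + " cannot talk", e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(speak(new Dog()));   // prints "Arf!"
        System.out.println(speak(new Robot())); // prints "Click!"
    }
}
```

It works, but you can see why I call it messy: the method name is a string, the return type is Object, and every mistake the C++ compiler would catch for you becomes a run-time exception.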
Please add comments to the wiki page here. In particular, add
code that you've compiled to show how it could be done better (for example,
please don't just say: "hey, you should use wildcards" unless you actually post
code that compiles showing the use of wildcards).