
3-11-04 About Latent Typing

The very first question to yesterday's weblog (where we discover that "Java Generics" are actually not generics at all, but simply an autocasting mechanism) was this:

Is latent typing (in a statically-typed language) desirable? Or rather, does it "go against the grain" of the language?

Let's say I have classes Gun and Camera, both with a shoot() method. If I had another method
void makeMyDay(T t){t.shoot()}
do I really want to be able to pass an instance of either Gun or Camera to it, without caring? Wouldn't it be better to use an interface like FireArm and make that the parameter to makeMyDay()?

Doesn't it make it likely that I'll "shoot myself in the foot" (pun intended) if I just call shoot() on any old object? In Java, the semantics of a method are not bound to its name, so much as they are bound to the interface the method belongs to. e.g.: print() means different things to a Stream than it does to an AWT component.

This question reveals just about every issue at hand. Let's look at the implications of this question and then address each one:

  1. Latent typing is somehow connected to whether the language is static or dynamic.
  2. A true generic mechanism does something other than loosen the explicit constraints on the use of a type.
  3. Java doesn't already have such a constraint-loosening mechanism, so the issue doesn't exist without true generics.
    And the big one:
  4. The syntax of a program is somehow connected to the semantics of that program.

Before diving into this, I want to point out that Martin Fowler recently made the brilliant observation that the difference in attitudes about issues like "when do we do type-correctness testing?" depends on whether you have a "directing" approach (you want to provide guidance to prevent people from falling down) or an "enabling" approach (you want to provide tools and abilities that allow people to move forward faster). Both approaches are reasonable and neither is wrong. I have been in both situations: on the one hand, trying to prevent interns from ignoring or even actively circumventing coding style guidelines (where more "direction" was required), and on the other hand, being frustrated by the loss of productivity that comes from being forced to conform to constraints that I wouldn't have violated anyway, and that therefore gave me no benefit. Both approaches can be helpful in appropriate situations, and neither is a universal solution.

What is Latent Typing?

As I pointed out yesterday, when you use a latent typing mechanism (the core concept in a true implementation of generics), the type is still there, it's just implied. So on a very simple level, you could look at it as a device that writes interfaces for you, so that you don't have to. It's an enabling device that reduces the amount of code you have to write.

Why do we want to use latent typing? It's a code organization and reuse mechanism. With it I can write a piece of code that can be reused more easily than without it. Code organization and reuse are the foundational "levers" of all computer programming: write it once, use it more than once, keep the code in one place. The most fundamental organization-and-reuse mechanism is hard-coded into the opcodes of your microprocessor: the subroutine. Procedural programming builds upon and improves the idea of the subroutine, and object-oriented programming collects structures and associated subroutines together to produce a larger and more sensible code chunk with better organization and reuse factors than independent structures and subroutines. Alongside developments that improve organization and reuse come, we hope, better ways to discover errors in our code.

Because I am not required to name an exact interface that my code operates upon, with latent typing I can write less code and apply it more easily in more places.

Why Can Latent Typing Appear "Dangerous"?

I believe that latent typing appears dangerous for two reasons. First, because you don't see the type explicitly defined, it can seem like it's not there and thus the type rules might not be enforced. However, in C++ the latent type is clearly enforced, and at compile time. In dynamic languages like Python, Smalltalk, etc., the latent type is also enforced: you still cannot successfully send an improper message to an object.
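Java has no latent typing of its own, but reflection can imitate the dynamic-language behavior described above. The following is only a rough sketch under that assumption: Camera, Rock, LatentShoot, and tryShoot() are hypothetical names invented for the illustration. It shows that even when no interface is named anywhere, an object that cannot respond to shoot() is still rejected; the check has simply moved to run time.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical classes for illustration; neither implements a shared interface.
class Camera {
  public String shoot() { return "Click!"; }
}

class Rock {
  // Deliberately has no shoot() method.
}

class LatentShoot {
  // Looks up a public, no-argument shoot() at run time and calls it.
  // Returns the method's result, or a message saying why the call was rejected.
  static String tryShoot(Object o) {
    try {
      Method m = o.getClass().getMethod("shoot");
      return String.valueOf(m.invoke(o));
    } catch (NoSuchMethodException e) {
      // The "latent type" is enforced here: no shoot(), no call.
      return "rejected: " + o.getClass().getSimpleName() + " has no shoot()";
    } catch (IllegalAccessException | InvocationTargetException e) {
      return "failed: " + e;
    }
  }

  public static void main(String[] args) {
    System.out.println(tryShoot(new Camera())); // prints "Click!"
    System.out.println(tryShoot(new Rock()));   // prints a "rejected: ..." message
  }
}
```

A C++ template would make the same rejection at compile time; the reflection version only stands in for the dynamic case.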

The second reason for the appearance of danger is the feeling that a latent type somehow makes it easier to "shoot yourself in the foot." To see why it's no easier with latent typing than without it, it's important to see the distinction between latent typing and weakening the type constraints on a function's arguments.

The argument of a function can be exactly specified ("I will only accept a String for this argument"), which is what a procedural language does (one that has strong type checking, anyway: assembly-language and old-style C simply took bits without discrimination). One of the great benefits of object-oriented programming is that it weakens this constraint a bit: polymorphism means that you can stick more than one type of object into a particular variable. So now our function argument can be "a Shape or anything derived from a Shape." Of course, it's possible that a "bad" kind of Shape could then be passed into our function and we would blithely operate upon it, and then be horribly surprised when our program breaks. However, with experience we've come to believe that this rarely happens, and when it does it comes from a gross misunderstanding of the paradigm.

Even saying that something expects a particular class, or subclass of that class, can be overconstraining. Without a formal interface mechanism in C++ (although "pure abstract base classes" could be created by hand), programmers tended to use classes everywhere and ended up with code that, for example, would only accept a Shape, when that particular piece of code could also have been used on anything that is "Drawable." By making the interface explicit, Java promoted this idea to first-class status.
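To make the overconstraint concrete, here is a small sketch. The names Drawable, Shape, Circle, Logo, and Rendering are assumptions made up for the example. Both methods call only draw(), but the one that demands a Shape cannot be reused on a drawable thing that happens not to be a Shape.

```java
// Hypothetical types for illustration.
interface Drawable {
  String draw();
}

abstract class Shape implements Drawable {}

class Circle extends Shape {
  public String draw() { return "circle"; }
}

// Not a Shape, but it can still be drawn.
class Logo implements Drawable {
  public String draw() { return "logo"; }
}

class Rendering {
  // Overconstrained: the body only ever calls draw(), yet it demands a Shape.
  static String renderShape(Shape s) { return "drawing " + s.draw(); }

  // Asks only for the one operation it actually uses.
  static String render(Drawable d) { return "drawing " + d.draw(); }

  public static void main(String[] args) {
    System.out.println(renderShape(new Circle())); // prints "drawing circle"
    System.out.println(render(new Circle()));      // prints "drawing circle"
    System.out.println(render(new Logo()));        // prints "drawing logo"
    // renderShape(new Logo()); // would not compile: Logo is not a Shape
  }
}
```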

But look at what the interface does: it weakens the type constraints by first decoupling the interface from implementation – so you only have an outline of what the type looks like, but no semantics attached. The second way the constraints are weakened may have a bigger impact: you can easily fragment groups of functions into interfaces according to what you want – and thus create code that is more generic by minimizing the constraints upon the argument to be only those methods that you are actually going to call. So with interfaces, you're able to say "I don't care what type you are as long as you can perform these operations." Which is exactly what the questioner was concerned about. He was worried that we could end up doing something like this:

interface Shootable {
  void shoot();
}

class Camera implements Shootable {
  public void shoot() { System.out.println("Click!"); }
}

class Gun implements Shootable {
  public void shoot() { System.out.println("Bang!"); }
}

public class Shooting {
  public static void shootEmUp(Shootable s) {
    System.out.println("Yee Haw!");
    s.shoot();
  }
  public static void main(String[] args) {
    Camera c = new Camera();
    Gun g = new Gun();
    shootEmUp(c);
    shootEmUp(g);
  }
}

Which of course we can, with interfaces. The questioner's solution was to try to attach semantics to the interface in order to prevent its misuse – to increase the constraints on the argument of shootEmUp() so that it could only be a firearm, and not a camera. Certainly this is an appropriate design in some cases, when you always know that you only want to ever pass a firearm or something derived from it. And if you are really concerned that this would be misused – that someone might pass in a gun that explodes – you can make the FireArm class final. But that is not the point of either classes or interfaces. Their point is to reduce the constraints, to provide more flexible programming opportunities, and only to increase the constraints when absolutely necessary.

So the constraint-loosening mechanism is already there, in the form of interfaces. Latent typing simply takes this one step further, and makes that interface latent so you don't have to express the interface, or to implement it in every class that you want to use in the function. Since we already have interfaces, latent typing is just a coding convenience.
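One way to get a feel for "not having to implement the interface in every class" in Java itself is with method references, which arrived in later Java versions (8 and up) than the one discussed here. This is an approximation, not latent typing: the caller still has to write `::shoot` at each call site. In the sketch below, LatentShooting is a hypothetical name, and shoot() returns a String (rather than printing, as in the example above) only so that the result is easy to inspect. Neither class declares a shared interface, and the compiler still checks every call.

```java
import java.util.function.Supplier;

// Neither class declares "implements Shootable" or any other shared interface.
class Camera {
  public String shoot() { return "Click!"; }
}

class Gun {
  public String shoot() { return "Bang!"; }
}

class LatentShooting {
  // Accepts anything that can be adapted to "no arguments in, String out".
  static String shootEmUp(Supplier<String> shooter) {
    return "Yee Haw! " + shooter.get();
  }

  public static void main(String[] args) {
    Camera c = new Camera();
    Gun g = new Gun();
    System.out.println(shootEmUp(c::shoot)); // prints "Yee Haw! Click!"
    System.out.println(shootEmUp(g::shoot)); // prints "Yee Haw! Bang!"
    // shootEmUp("text"::shoot); // would not compile: String has no shoot()
  }
}
```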

The questioner's other concern is independent of whether you have interfaces and/or latent typing. If the constraints are weakened this way, won't the code be misused? This is also a concern that arose when large numbers of programmers first began to program with an object-oriented language: because an object of a type can be replaced with an object of its subtype, won't that open up the potential for misuse? This is a fair question, but no one asks it anymore because we have come, through experience, to know that it only happens in the most egregious examples of misunderstanding, and in those cases the problems are not isolated to type errors.

Basically, the questioner is saying: "we can stop problems from happening by preventing the programmer from passing a Camera to shootEmUp()." To which I reply: "if that's a bad thing and your programmer is trying to do it, that programmer will find some other way to mess up your program even if you prevent him/her from breaking it here." We learn again and again that it's not possible to prevent people from doing something bad with your programming system no matter how safe you attempt to make it. And there's a boundary beyond which all the "directing" guidance will fail -- a programmer must have a certain level of understanding and be able to buy into a particular language, environment, framework, etc., up to a certain level in order to use those tools properly. Less than that, and they need training, not type-checking.

So to summarize, OO programming allows you to lessen the constraints on a function argument. Interfaces reduce these constraints even more by explicitly separating interface from implementation, and thus allowing you to require only the smallest possible set of operations on the argument, opening up the function's application to a larger set of possible classes. Latent typing simply makes the interfaces implicit by, in effect, writing both the interface part and the implementation part for you. But all these mechanisms reduce the constraints so you can apply your code more easily. (But type checking still happens!)

The Semantic Issue

An entirely separate discussion is "the meaning of an interface." Because there are no operations attached to the collection of signatures in an interface, the only thing that the compiler does when given an interface is to ensure that the signatures conform. That is the intent: the interface is separated from the implementation, thus reducing the coupling between the function that accepts the interface and the object that it acts upon.

But I often hear the argument that "an interface implies a semantic contract." For example, the questioner above states:

In Java, the semantics of a method are not bound to its name, so much as they are bound to the interface the method belongs to.

Perhaps I misunderstand his meaning, but there are no semantics at all "bound to the interface." If semantics can be said to exist in a program, they can only exist in the methods that are part of a class, rather than the signatures that are part of an interface (that is, a type, as distinguished from a class).

The only semantics that are associated with an interface are the ones that you enforce. A class can be an implementation of a particular interface, and that class has its own semantics. You may require that all classes that implement an interface follow a certain set of rules, but you can only enforce those rules using tests that you apply outside of the compiling environment. Your tests may be code walkthroughs or a conformance test framework, but even if you feel strongly that your interface implies a semantic contract, the only contract that is enforced by the language is that any class that implements the interface will include the signatures in that interface.
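A conformance test of the kind mentioned above might look something like the following sketch. Counter, GoodCounter, BrokenCounter, and CounterConformance are hypothetical, invented only to illustrate the point. Both implementations satisfy the compiler, because both provide the required signatures; only the hand-written check notices that one of them breaks the intended rule.

```java
// Hypothetical interface: the intended rule is "increment() raises value() by 1",
// but nothing in the signatures says so.
interface Counter {
  void increment();
  int value();
}

class GoodCounter implements Counter {
  private int count;
  public void increment() { count += 1; }
  public int value() { return count; }
}

// Compiles without complaint, yet violates the intended rule.
class BrokenCounter implements Counter {
  private int count;
  public void increment() { count -= 1; }
  public int value() { return count; }
}

class CounterConformance {
  // The rule the compiler cannot see, expressed as an ordinary test.
  static boolean conforms(Counter c) {
    int before = c.value();
    c.increment();
    return c.value() == before + 1;
  }

  public static void main(String[] args) {
    System.out.println(conforms(new GoodCounter()));   // prints "true"
    System.out.println(conforms(new BrokenCounter())); // prints "false"
  }
}
```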

Again, that's the point: by rigorously decoupling interface and implementation using the interface keyword, Java reduces the constraints on an object that's passed into your function, and thus allows you to apply your function more flexibly.

And we've found through experience that, just as with polymorphism in OOP, the dramatic increase in bugs that we anticipated from this technology doesn't seem to happen.

©2003 MindView, Inc.