12-03-2004 JDK 5 Class File Format Puzzle
Jeremy Meyer (who's here from London "sprinting" on the Annotations chapter
with me) and I did a marathon pair programming session yesterday writing a
method to extract the qualified class name (including the package) from a class
file (so that you can load a class using Class.forName() given only
the class file).
We used Bill Venners' "Inside the Java 2 Virtual Machine" (and called Bill at
one point) as a reference, and also found the source code for Chris Rathman's Jasper to be helpful. Here's
what we came up with to extract the name:
import java.io.*;
import java.util.*;

public class ClassNameFinder {
  // Constant pool tags from the class file format
  static final int
    UTF = 1,
    INTEGER = 3,
    FLOAT = 4,
    LONG = 5,
    DOUBLE = 6,
    CLASS = 7,
    STRING = 8,
    FIELD_REF = 9,
    METHOD_REF = 10,
    INTERFACE_METHOD_REF = 11,
    NAME_AND_TYPE = 12;

  public static String thisClass(String classFile) {
    // Index of each CLASS entry -> index of the UTF entry holding its name
    Map<Integer, Integer> offsetTable =
      new HashMap<Integer, Integer>();
    // Index of each UTF entry -> its string value
    Map<Integer, String> classNameTable =
      new HashMap<Integer, String>();
    // try-with-resources closes the stream even if parsing fails
    try (DataInputStream data = new DataInputStream(
           new BufferedInputStream(
             new FileInputStream(classFile)))) {
      int magic = data.readInt(); // 0xcafebabe
      // All two-byte fields are unsigned, so readShort() could go negative
      int minorVersion = data.readUnsignedShort();
      int majorVersion = data.readUnsignedShort();
      int constant_pool_count = data.readUnsignedShort();
      // Constant pool indices start at 1, not 0
      for(int i = 1; i < constant_pool_count; i++) {
        int tag = data.readUnsignedByte();
        switch(tag) {
          case CLASS:
            int offset = data.readUnsignedShort();
            offsetTable.put(i, offset);
            break;
          case UTF:
            // readUTF() reads the two-byte length and the modified
            // UTF-8 bytes that follow it
            String className = data.readUTF();
            classNameTable.put(i, className);
            break;
          case LONG:
          case DOUBLE:
            data.readLong(); // discard 8 bytes
            // Here's the fix (see wiki comments): a long or
            // double takes up two constant pool slots
            i++; // Special skip necessary
            break;
          case STRING:
            data.readUnsignedShort(); // discard 2 bytes
            break;
          default:
            // INTEGER, FLOAT, the three *_REF tags and NAME_AND_TYPE
            data.readInt(); // discard 4 bytes
        }
      }
      int access_flags = data.readUnsignedShort();
      int this_class = data.readUnsignedShort();
      int super_class = data.readUnsignedShort();
      String thisClassName =
        classNameTable.get(offsetTable.get(this_class));
      // The class file separates packages with '/';
      // Class.forName() expects '.'
      return thisClassName.replace('/', '.');
    } catch(Exception e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(thisClass(args[0]));
  }
}
This follows the class file information that we had, and we seemed to get it
working, but later discovered that it only works on some files. Other files that
appear to include some Java 5 constructs will trip it up. Those same files fail
with Jasper, as well, so it appears that the class file format has changed for
JDK 5. Here's a Sun document that describes it, but
nothing jumped out at us right away and we ran out of steam.
If anyone has insights please post them to the wiki comments. Thanks.
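For what it's worth, the fix flagged in the listing's comments comes down to one rule: a long or double constant occupies two constant pool slots, so the loop index has to advance by two for those tags. Here is a minimal sketch of that rule in isolation. The class PoolSlots and its helpers countEntries() and countSample() are hypothetical names made up for this illustration, and the pool is built by hand rather than read from a real class file.

```java
import java.io.*;

class PoolSlots {
  static final int UTF = 1, LONG = 5, DOUBLE = 6, CLASS = 7, STRING = 8;

  // Hypothetical helper: walks a pool with the given constant_pool_count
  // and returns how many entries were actually read from the stream.
  static int countEntries(DataInputStream data, int poolCount)
      throws IOException {
    int entries = 0;
    for (int i = 1; i < poolCount; i++) {
      int tag = data.readUnsignedByte();
      switch (tag) {
        case UTF:
          data.readUTF(); // 2-byte length, then that many bytes
          break;
        case LONG:
        case DOUBLE:
          data.readLong(); // 8 bytes of payload
          i++;             // plus a second slot that holds nothing
          break;
        case CLASS:
        case STRING:
          data.readUnsignedShort(); // 2-byte index
          break;
        default:
          data.readInt(); // the other tags in the listing: 4 bytes
      }
      entries++;
    }
    return entries;
  }

  // Builds a three-entry pool (Utf8, Long, Class) in memory and walks it.
  static int countSample() {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(buf);
      out.writeByte(UTF);   out.writeUTF("A");   // slot 1
      out.writeByte(LONG);  out.writeLong(42L);  // slots 2 and 3
      out.writeByte(CLASS); out.writeShort(1);   // slot 4
      DataInputStream in = new DataInputStream(
        new ByteArrayInputStream(buf.toByteArray()));
      // constant_pool_count is one more than the highest slot: 4 + 1
      return countEntries(in, 5);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(countSample()); // prints 3
  }
}
```

Without the extra i++, the loop would ask for a fourth entry and run off the end of the stream. That kind of misalignment may be what tripped up the parser on some files, though this sketch does not prove it.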
My current fallback position is to first try to use the name of the class
file as the name of the class to load, and if that fails use a regular
expression to look through the class file; I've tried it with this Python
program and it seems to work (and it seems reasonable to assume that the first
name in the package will be all letters, because it will typically be something
like 'com'):
import os, re

for root, dirs, files in os.walk('.'):
    for f in [f for f in files if f.endswith('.class')]:
        path = os.path.join(root, f)
        # Read the raw bytes; a class file is binary, not text
        with open(path, 'rb') as class_file:
            data = class_file.read()
        # Escape '$' (inner classes) and any other regex metacharacters
        name = re.escape(f.split('.')[0]).encode('utf-8')
        qualified = re.findall(
            rb'([a-z]+/[a-z0-9_/]*' + name + rb')[^$;]', data)
        if qualified:
            print(path)
            print("  ", qualified[0].decode('utf-8').replace('/', '.'))
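The same fallback could be sketched in Java, which is where the class actually has to be loaded. This is only an outline under some assumptions: NameFallback, findQualifiedName() and load() are hypothetical names, the regular expression mirrors the Python one above, and Class.forName() will only succeed if the class is reachable on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameFallback {
  // Searches the raw class-file bytes for "pkg/sub/SimpleName" and
  // returns it in dotted form, or null if nothing matches.
  static String findQualifiedName(String simpleName, byte[] classBytes) {
    // ISO-8859-1 maps each byte to exactly one char, so nothing is lost
    String text = new String(classBytes, StandardCharsets.ISO_8859_1);
    Pattern p = Pattern.compile(
      "([a-z]+/[a-z0-9_/]*" + Pattern.quote(simpleName) + ")[^$;]");
    Matcher m = p.matcher(text);
    return m.find() ? m.group(1).replace('/', '.') : null;
  }

  // First tries the file's base name as the class name, then falls
  // back to the name found by the regular expression.
  static Class<?> load(String simpleName, byte[] classBytes)
      throws ClassNotFoundException {
    try {
      return Class.forName(simpleName);
    } catch (ClassNotFoundException e) {
      String qualified = findQualifiedName(simpleName, classBytes);
      if (qualified == null)
        throw e;
      return Class.forName(qualified);
    }
  }
}
```

Pattern.quote() takes care of the '$' in inner class names, so the manual escaping in the Python version isn't needed here.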