Skip to content

Improve the code quality of Item and of related code in ClassWriter and ClassReader.

Eric Bruneton requested to merge improve-item-class into master
  • Item was a mutable class with many set* methods, used to set predefined Item key instances in ClassWriter, to perform lookups (while avoiding the creation of new instances for this). These preallocated keys were error prone, due to the nested calls (a nested call could inadvertently modify the key used by a caller).
  • The Javadoc was largely obsolete, refering mostly to the constant pool, while Item was also used for bootstrap methods and for the ASM specific type table.
  • The number of elements in ClassWriter#items was poorly tracked, for instance bootstrap methods were not taken into account in ClassWriter#put.
  • The threshold to limit hash collisions was not always respected (for instance in copyPool, items was created with a size equal to the constant pool count, without margin).
  • Lookups in ClassWriter#get were not optimal: a check on hashcode could be used before calling the expensive isEqualTo method.

Now Item is replaced with Symbol, which is immutable (except for the lazily computed info field). Also the hashCode and next fields are hidden in a private subclass (they are only used inside SymbolTable). The preallocated Item key, key2, ... fields are removed, while keeping the zero allocation property for lookups. The number of elements in entries is now explicitely tracked with entryCount, and the threshold is correctly used. Lookups are now optimized to do a quick hashcode comparison before slow equality tests. All Symbol management is now done is a separate SymbolTable class, allowing a cleaner implementation for the copyPool option (more private fields, less exposure of internal details to other classes).

According to performance tests using benchmarks/read-write, this new code is slightly faster than before (e.g. ~1.05 speedup for R+W with or without copyPool). The jar size is now 99581 bytes vs 95209 for ASM 6.0.

This is a large refactor, please review carefully.

PS: I spent several days on this, and considered several designs before choosing this one. For instance, I also considered splitting SymbolTable in 3 (e.g. ConstantPoolWriter, BootstrapMethodsWriter and TypeTable, all delegating to a shared (custom) HashSet), and/or splitting Symbol using one class per type of symbol, to get something even more clean from an Object Oriented design point of view. But I rejected this as "going too far" from the current design.

Merge request reports