Java memory profiling with jmap and jhat
My colleagues and I have just spent over a week tracking down a repeated OutOfMemoryError in a fairly complex web application. In the process we looked at the jmap and jhat memory profiling tools for the JVM.
Starting with Java 1.5, Sun has been shipping a neat tool called jmap which allows you to attach to any running 1.5 JVM and obtain heap layout information, class histograms and complete heap snapshots. The nice thing is that you don’t have to start the JVM with any special options, so it runs exactly as it does during normal operation. This is important since it makes the analysis much more predictable.
Obtaining information on the heap layout is a near-instantaneous operation, so it doesn’t slow down execution. By contrast, taking class histograms and heap snapshots can take considerable time, during which the execution of application code is stalled. For example, a snapshot of a ~700MB heap took nearly half an hour to complete on a 3GHz dual-Xeon Red Hat box.
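For reference, the heap layout and the class histogram are obtained with invocations along the lines of
jmap -heap <pid>
and
jmap -histo <pid>
respectively (the exact option spellings have shifted a little between JDK versions, so it is worth checking jmap’s own usage output on your machine).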
Btw, it took a while to figure out the correct options for taking a heap snapshot:
jmap -heap:format=b <pid>
This will produce a file called “heap.bin” in the current directory. The option is not mentioned in the jmap docs, though running
jmap -help
does list it.
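(For what it’s worth, the 1.6 version of jmap spells this differently – there the equivalent invocation is something like
jmap -dump:format=b,file=heap.bin <pid>
but on 1.5 the -heap:format=b form above is what worked for us.)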
Taking a heap snapshot is all very well, but what do you do with it? That’s when we ran into problems. There seem to be hardly any tools out there that can read the binary heap dump format. Sun ships one as part of JDK 1.6 (Mustang), called jhat, of which an older version is available for earlier JVMs. The tool is run like this:
jhat -J-mx768m -stack false heap.bin
which sets the memory available to jhat to a value just above the size of the heap to be analysed and suppresses the tracking of object allocation call stacks (leaving that on just seems to result in thousands of warnings).
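jhat then parses the dump and starts a small built-in web server – on port 7000 by default, changeable with the -port option – so once parsing has finished you point a browser at http://localhost:7000 to start exploring.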
On the surface, jhat looks quite promising – it allows you to traverse the heap, display histograms and so on, all through that web interface. However, on many occasions jhat was unable to read our heap dumps, failing with the error
java.io.IOException: Unrecognized heap dump sub-record type: 254
    at com.sun.tools.hat.internal.parser.HprofReader.readHeapDump(HprofReader.java:505)
    at com.sun.tools.hat.internal.parser.HprofReader.read(HprofReader.java:223)
    at com.sun.tools.hat.internal.parser.Reader.readFile(Reader.java:91)
    at com.sun.tools.hat.Main.main(Main.java:143)
The sub-record type value in the first line varies depending on the snapshot, but the rest of the error is always the same. Having spent hours waiting for a process to reach a critical state, and another half an hour taking the heap snapshot, it is more than a little frustrating if all you have to show for it at the end is a few hundred megs of useless heap dump data.
Another problem with jhat is that its analysis capabilities are very basic. When one is analysing heaps with millions of objects, one desperately needs tools that can group information in sensible ways. For example, it would be incredibly useful to select a class and get a percentage breakdown of which classes refer to how many instances of the chosen class, and then extend this view all the way back to the root set. jhat just doesn’t have anything like that.
Still, even the basic analysis did point us in some useful directions of enquiry in tracking down our memory leak. In fact, just looking at the class histograms produced by jmap – no need to take a full heap snapshot and fire up jhat – produced good clues.

The final insight came from another one of jmap’s outputs – the heap layout information. This showed that we were running out of space in the “permanent generation” – even though we had set its limit to an unusually high value. There isn’t much information around on the role of the permanent generation. Our theory is that it is a space for objects that the JVM believes are very unlikely to ever be garbage collected. The permanent generation still takes part in a full GC, but unlike the old generation there is probably no copying and compacting going on.

Anyway, the problem turned out to be that we were calling into Jython in a way that resulted in it associating a few megabytes of Java reflection information with individual threads. Tomcat uses several hundred threads and, more importantly, pools them, so that per-thread data never becomes garbage. Do the maths.
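For what it’s worth, the general shape of the problem can be sketched in a few lines of Java. This is a purely hypothetical illustration – it is not Jython’s or Tomcat’s actual code, and it fills the ordinary heap with byte arrays rather than the permanent generation with reflection data – but it shows how per-thread state multiplied by a pool of long-lived threads turns into memory that no full GC will ever reclaim:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalLeakSketch {

    // Stand-in for the few megabytes of per-thread data that accumulated in
    // our application (hypothetical; for illustration only).
    private static final ThreadLocal<List<byte[]>> PER_THREAD_CACHE =
            new ThreadLocal<List<byte[]>>() {
                protected List<byte[]> initialValue() {
                    return new ArrayList<byte[]>();
                }
            };

    static void handleRequest() {
        // Every request grows the calling thread's private cache; nothing ever clears it.
        PER_THREAD_CACHE.get().add(new byte[2 * 1024 * 1024]); // roughly 2MB per call
    }

    public static void main(String[] args) {
        // A Tomcat-style pool: the threads are reused rather than discarded, so
        // their thread-local values stay strongly reachable for the life of the
        // pool. 200 threads times a few MB each is several hundred MB that a
        // full GC can never reclaim.
        ExecutorService pool = Executors.newFixedThreadPool(200);
        for (int i = 0; i < 400; i++) {
            pool.execute(new Runnable() {
                public void run() {
                    handleRequest();
                }
            });
        }
        pool.shutdown();
    }
}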