Java memory profiling with jmap and jhat

March 8, 2006

My colleagues and I have just spent over a week tracking down a repeated OutOfMemoryError in a fairly complex web application. In the process we looked at the jmap and jhat memory profiling tools for the JVM.

Starting with Java 1.5, Sun has been shipping a neat tool called jmap which allows you to attach to any 1.5 JVM and obtain heap layout information, class histograms and complete heap snapshots. The beauty of it is that you don’t have to configure the JVM with any special options, so it runs exactly as during normal operation. This is important since it makes analysis much more predictable.

Obtaining information on the heap layout is a near-instantaneous operation, so it doesn’t slow down execution. By contrast, taking class histograms and heap snapshots can take considerable time, during which the execution of application code is stalled. For example, a snapshot of a ~700MB heap took nearly half an hour to complete on a 3GHz dual-Xeon Red Hat box.

Btw, it took a while to figure out the correct options for taking a heap snapshot:

jmap -heap:format=b <pid>

This will produce a file called “heap.bin” in the current directory. The option is not mentioned in the jmap docs, though running

jmap -help

does list it.

Taking a heap snapshot is all very well, but what do you do with it? That’s when we ran into problems. There seem to be hardly any tools out there that can read the binary heap dump format. Sun ships one as part of JDK 1.6 (Mustang), called jhat, of which an older version is available for earlier JVMs. The tool is run like this:

jhat -J-mx768m -stack false heap.bin

which sets the memory available to jhat to a value just above the size of the heap to be analysed and suppresses tracking of object allocations (leaving that on just seems to result in thousands of warnings).

On the surface, jhat looks quite promising – it allows you to traverse the heap, display histograms and so on, all via a web browser interface. However, on many occasions jhat failed to read our heap dumps, aborting with the error

java.io.IOException: Unrecognized heap dump sub-record type: 254
at com.sun.tools.hat.internal.parser.HprofReader.readHeapDump(HprofReader.java:505)
at com.sun.tools.hat.internal.parser.HprofReader.read(HprofReader.java:223)
at com.sun.tools.hat.internal.parser.Reader.readFile(Reader.java:91)
at com.sun.tools.hat.Main.main(Main.java:143)

The value in the first line varies depending on the snapshot, but the rest of the error is always the same. Having spent hours waiting for a process to reach a critical state, and another half an hour taking the heap snapshot, it is more than a little frustrating if all you have to show for it at the end is a few hundred megs of useless heap dump data.

Another problem with jhat is that the analysis capabilities are very basic. When one is analysing heaps with millions of objects, one desperately needs tools that can group information in sensible ways. For example, it would be incredibly useful to select a class and get a percentage breakdown of what classes refer to how many instances of the chosen class, and then extend this view all the way back to the root set. jhat just doesn’t have anything like that.

Still, even the basic analysis did point us in some useful directions of enquiry in tracking down our memory leak. In fact, just looking at the class histograms produced by jmap – no need to take a full heap snapshot and fire up jhat – produced good clues.

The final insight came from another of jmap’s outputs – the heap layout information. This showed that we were running out of space in the “permanent generation” – even though we had set its limit to an unusually high value. There isn’t much information around on the role of the permanent generation. Our theory is that it is a space for objects that the JVM believes are very unlikely ever to be garbage collected. The permanent generation still takes part in a full GC, but unlike the old generation there is probably no copying and compacting going on.

Anyway, the problem turned out to be that we were calling into Jython in a way that caused it to associate a few megabytes of Java reflection information with individual threads. Tomcat uses several hundred threads and, more importantly, pools them. Do the maths.
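The failure mode is easy to reproduce in plain Java: per-thread state attached while servicing one request quietly survives in the pooled thread and is never released. A minimal sketch (modern Java syntax for brevity; the class and field names are hypothetical, and the byte array stands in for Jython’s per-thread reflection data):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PooledThreadLeak {
    // Stands in for Jython's per-thread system state: each value
    // pins a sizeable chunk of memory to one thread.
    private static final ThreadLocal<byte[]> perThreadState = new ThreadLocal<>();

    public static void main(String[] args) throws Exception {
        // A pool with a single worker, like one thread in Tomcat's pool.
        ExecutorService pool = Executors.newFixedThreadPool(1);

        // First "request" attaches a few MB of state to the worker thread.
        pool.submit(() -> perThreadState.set(new byte[4 * 1024 * 1024])).get();

        // A later, unrelated "request" runs on the same pooled thread and
        // still sees the state: nothing released it when the task ended.
        boolean leaked = pool.submit(() -> perThreadState.get() != null).get();
        System.out.println("state survived across tasks: " + leaked);

        pool.shutdown();
    }
}
```

Multiply a few megabytes per thread by several hundred pooled threads and the OutOfMemoryError follows.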


18 Comments

  1. Alan says:

    The jmap -heap:format=b option which you found is not a documented option in J2SE 5.0. It was added in a 5.0 update to allow for the recovery of heap dump from a core file or hung process. As you noted, it takes considerable time to take a snapshot of a running process. If you download Mustang (Java SE 6) you’ll find that jmap gets a new option called -dump which will do what you want. jmap -dump:file= will write a snapshot of the heap to the specified file. It is relatively quick (a 700MB heap can typically be dumped to a file in 15-30 seconds depending on the speed of your system and disk). Also you’ll find that jmap -histo is very fast and will not disrupt a production system. The heap dump generated by the new -dump option will be parsable by jhat and you should not see any of the “Unrecognized heap dump…” and other errors which are observed when you recover a heap dump from a core file or process snapshot. As you noted, jhat is really only intended for rudimentary analysis but you can use the OQL interface to create your own queries. In time you can expect to see other profilers adding options to import these heap dumps.
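
    For reference, the Java SE 6 workflow described above would look something like this (the pid and file name here are placeholders – use jps to find the pid of the target JVM):

        jps -l                                     # list running JVMs and their pids
        jmap -histo 12345                          # fast, non-disruptive class histogram
        jmap -dump:format=b,file=heap.hprof 12345  # full binary heap dump
        jhat heap.hprof                            # browse the dump at http://localhost:7000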

  2. James Tauber says:

    So did you work out how to prevent the problem with Jython and Tomcat? I’m seeing the same thing.

  3. david says:

    In our application, we were creating new system states in order to support different class loaders and python paths, for every request. The solution was simply to cache these – so there was only a single instance for each class loader.

    In fact we did this only indirectly – by keeping the python objects themselves, and invoking methods on them, rather than executing scripts for each request. This is much faster anyway.

    If the web application reloads you will still be left with unused system states attached to each thread, until all the new threads have been used in your application. This might still be a problem, depending on the servlet container’s behaviour and the number of system states you need.

    A better (and easier) way is to make sure a servlet never leaks a system state:

        PySystemState old = Py.setSystemState(PyStateLoader.PYTHON_SYSTEM_STATE);
        try {
            ...
        }
        finally {
            Py.setSystemState(old);
        }

    The first line will actually create the default system state the first time it’s called in the VM, and this will end up attaching that singleton state to all the threads.

    You could do this with a filter on your whole application.

  4. Henry Goff says:

    Having exactly the same problem as Matthias. JHAT from JDK 1.6 can read the heap dump xxx.hprof created automatically by the JVM on OutOfMemoryError, but it can not read heap.bin generated by JMAP with -heap:format=b. Same with YourKit – it can read the .hprof dump but not heap.bin. Also tried jmap from JDK 1.6 to connect to a server running under JDK 1.5.0_07, but jmap from 1.6 can not connect to older JVMs.

  5. Philippe Lantin says:

    Workaround for buggy/unusable jhat heap format behavior:

    1) grab the core using gcore or similar
    2) generate an hprof binary from the core
    3) load the hprof binary using jhat from jdk16 beta 1 (beta2 proved to be buggy)

    I am able to analyze cores from 1.4.2 vm’s using jhat this way.

  6. Aaron Boyd says:

    Looks like what Philippe suggests for 1.4.2 VMs will only work on Solaris 1.4.2 VMs, because jmap was only backported to 1.4 for Solaris.

    http://java.sun.com/j2se/1.4.2/ReleaseNotes.html

    Anyone managed to turn a Linux 1.4.2 core into an hprof binary? Any response very much appreciated.

  7. Nirmal says:

    JHAT from JDK 1.6 can read heap dump xxx.hprof created automatically by JVM on OutOfMemoryError, but it can not read heap.bin generated by JMAP with -heap:format=b.

    Moreover, I am using SAP Memory Analyzer, which also cannot read heap.bin generated by JMAP with -heap:format=b.

    I need heap dumps at different points or scenarios while the application runs to identify memory related bottlenecks i.e. Heap Dump on Demand.

    Is there another way to take the “Heap Dump on Demand” using JDK 1.5 so that I can use that heap dump in SAP Memory Analyzer?

    I have to use jdk 1.5 only….

    Thanks & Regards,
    Nirmal

  8. Vlad says:

    To read the heap.bin file with JHAT or SAP Memory Analyzer all you need to do is rename heap.bin to heap.hprof (no, really), or run jmpa specifying the file name, e.g.:

    jmpa -F -dump:format=b,file=/tmp/heap.hprof

  9. Vlad says:

    Sorry jmap not jmpa…

  10. Vlad says:

    By the way, SAP Memory Analyzer is a very useful tool for analysing heap dumps – it is definitely worth using if you have heap problems (and it’s free).

  11. aschwin says:

    What would you guys say is the typical run time for jmap?
    I’m running jdk1.6.0_16 and trying to see what is happening in my webapps deployed in Tomcat. It is taking forever to run, which doesn’t seem right to me.
    Do I need to close every other program? I am just running on localhost on Windows Vista here. I was just trying out this tool, because I want to use it for memory issues on our production Tomcat server :-(

  12. Trejkaz says:

    Seconding SAP Memory Analyzer. I hadn’t heard of this app before today, and it’s incredible. I was skeptical of its feature where it finds the memory leak for you, but indeed, it found it! And it actually loaded a 2GB heap without any issues, unlike jhat.
