Tuesday, October 21, 2008

Tweaks to get maximum Heap size with Sun JDK (Windows)

Does your Java process on the Sun JRE fail to start with -Xmx1200m on a 32-bit Windows OS, even though you want a larger heap? Here is an explanation of why this happens and how to fix it.

By default, 32-bit Windows gives each process 2 GB of virtual address space; this is independent of the physical RAM in the system. When a process starts, it loads the process binary and other resources such as DLLs at different addresses within this 2 GB virtual address space.
The Sun JVM requires the heap to be reserved as one contiguous block of virtual address space (I guess it is for some GC optimizations), so the largest heap you can get depends on where the other resources loaded by the JVM sit in that address space.

The following Process Explorer screenshot of a simple Java process helps explain this concept. The Java process was run with the following command line:
  java -Xshare:on -Xmx500m MemTest




Now we look for large contiguous chunks of free address space and find two of them, between
 a) java.exe(0x400000) and classes.jsa (0x2AB80000) = 679 MB
 b) classes.jsa(0x2C03B000) and lpk.dll(0x629C0000) = 873 MB
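
The two sizes above are just the difference between neighbouring addresses, divided by 1 MB. Here is a minimal sketch of that arithmetic; the class and method names are made up for illustration, and like the figures above it measures from the listed addresses and ignores the size of the mapped images themselves.

```java
// Illustrative helper (not part of any JDK tool): size of the free
// address range between two mapped regions, in whole megabytes.
final class AddressGap {
    private static final long MB = 1024L * 1024L;

    static long gapMegabytes(long lowerAddress, long upperAddress) {
        if (upperAddress < lowerAddress) {
            throw new IllegalArgumentException("upperAddress must not be below lowerAddress");
        }
        // Integer division truncates, matching the rounded-down figures in the post.
        return (upperAddress - lowerAddress) / MB;
    }

    public static void main(String[] args) {
        // Addresses copied from the list above.
        System.out.println("a) " + gapMegabytes(0x400000L, 0x2AB80000L) + " MB");   // 679 MB
        System.out.println("b) " + gapMegabytes(0x2C03B000L, 0x629C0000L) + " MB"); // 873 MB
    }
}
```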

From these figures we see that when sharing mode is on, classes.jsa splits the free address space in two, greatly reducing the maximum heap size. If I change the command line to use -Xmx900m, the process fails to start, because 900 MB is more than the largest free chunk (873 MB).

DLLs can cause the same problem and should be watched too. In my experience, some DLLs from older Microsoft service packs load around 0x4000000, which reduces the maximum possible heap size.

Lastly, if your Java application is deployed at a customer site, some applications such as anti-virus software tend to inject their DLLs into every running process. These can cause the same issue.

How to fix it
a) use -Xshare:off (sharing can be on by default, so turn it off explicitly)
b) if some DLL loads after 0x400000 but before 0x60000000, find a work-around for it (a missing hotfix, rebasing the DLL, etc.)
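
To check what a given combination of options actually gives you, a tiny class that prints the heap limit is enough. This is only a sketch: HeapReport is a hypothetical name and is not the MemTest class used earlier. It relies on the standard Runtime.maxMemory() call, which can report a value slightly below the -Xmx setting.

```java
// Hypothetical helper: report the maximum heap this JVM will try to use.
final class HeapReport {
    static long maxHeapMegabytes() {
        // maxMemory() returns the limit in bytes (Long.MAX_VALUE if there is none).
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap for this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it as, for example, java -Xshare:off -Xmx900m HeapReport. If the JVM cannot reserve a contiguous block of that size, it fails at startup before main is reached, so getting any output at all means the reservation succeeded.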


Saturday, October 18, 2008

JVM thread dump: simple yet effective Profiler

Most people working with Java already know that a thread dump is useful for
a) debugging deadlocks, or
b) checking whether a thread is still alive

There is another important use, which my colleague reminded me of yesterday while we were debugging a performance issue at a customer deployment. Based on the logs, we had a theory that a particular piece of code was slowing down the processing, but we still could not pinpoint the real cause. We wanted to try it in-house with the YourKit profiler, but reproducing the customer issue would have taken a few hours. At that point my colleague reminded me that a few thread dumps taken 10-15 seconds apart can tell you which part of the code is consuming a lot of time, because there is a high probability that this code will be found running in the majority of those dumps.
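
The same idea can be sketched from inside the JVM: sample all thread stacks a few times and count which method is on top most often. This is a rough illustration, not how we analysed the customer dump; the class name DumpSampler is made up, and it only uses the standard Thread.getAllStackTraces() call. Threads that are merely parked or waiting will show up too, so the counts are a starting point for reading the full stacks rather than a verdict.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sampler: counts the topmost frame of every thread across
// several stack snapshots taken a fixed interval apart.
final class DumpSampler {

    /** Adds one count per thread for the method at the top of its stack. */
    static void recordSample(Map<Thread, StackTraceElement[]> dump,
                             Map<String, Integer> counts) {
        for (StackTraceElement[] stack : dump.values()) {
            if (stack.length == 0) {
                continue; // thread currently has no Java frames
            }
            String top = stack[0].getClassName() + "." + stack[0].getMethodName();
            counts.merge(top, 1, Integer::sum);
        }
    }

    /** Takes the given number of samples, sleeping intervalMillis between them. */
    static Map<String, Integer> sample(int samples, long intervalMillis) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            recordSample(Thread.getAllStackTraces(), counts);
            if (i < samples - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the flag and stop early
                    break;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Five samples, 10 seconds apart, as suggested in the text.
        sample(5, 10_000L).forEach((frame, hits) ->
                System.out.println(hits + "  " + frame));
    }
}
```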

Luckily, we had also asked the customer for a thread dump along with the logs, and within 2 minutes that thread dump showed us the root cause of the slow processing: extra database hits from a feature that was used heavily and had not been optimized for such use.

There are various ways to generate a thread dump; some of them are OS-specific or have other constraints. For a web application, the most useful way I have found is to embed a JSP that can be invoked from the browser. For an enterprise application, this lets the customer generate the dump easily and send it along with the logs for debugging. Code for the JSP I use is available here. (Note: this JSP doesn't display the lock information that is available from other ways of generating a thread dump.)
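
For readers who cannot reach the JSP, here is a minimal sketch of the kind of Java code such a page could call. It is not the JSP linked above; the class name ThreadDumpText is hypothetical. It is built on Thread.getAllStackTraces(), which, in line with the note above, does not report lock information.

```java
import java.util.Map;

// Hypothetical helper a JSP (or servlet) could call to print a thread dump.
final class ThreadDumpText {

    /** Renders every live thread and its stack frames as plain text. */
    static String render() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(thread.isDaemon() ? " daemon" : "")
               .append(" state=").append(thread.getState())
               .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n'); // blank line between threads
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render());
    }
}
```

In a page, the returned string would typically be written inside a preformatted block so the indentation survives in the browser.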
