As part of regular pre-production testing, it is common to run performance tests under Peak and Soak load conditions.
We had a WebLogic platform on which all the Peak tests passed, and so did the 12-hour Soak tests.
The JVM memory usage graph showed regular GCs clearing up memory and no obvious leaks. A simple graph plotted in Excel, showing JVM Used Heap over time, is shown below. It follows a pattern fairly close to an ideal sawtooth.
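How the heap figures were collected is not covered here, so the following is only a minimal sketch of one way to produce such data: it samples the used heap of the JVM it runs in through the standard MemoryMXBean and prints CSV rows that can be pasted into Excel. The class name HeapSampler, the column names and the default sample count and interval are assumptions made for illustration. Sampling a WebLogic server this way would mean running the code inside the server or reading the same MXBean over remote JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: samples the heap of the JVM it is running in.
public class HeapSampler {

    /** Returns one CSV row: epoch millis, used heap bytes, max heap bytes. */
    static String sampleRow() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        // getMax() may be -1 if the maximum heap size is undefined.
        return System.currentTimeMillis() + "," + heap.getUsed() + "," + heap.getMax();
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed defaults: 5 samples, 1 second apart.
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 5;
        long intervalMillis = args.length > 1 ? Long.parseLong(args[1]) : 1000L;

        System.out.println("timestamp_ms,used_bytes,max_bytes");
        for (int i = 0; i < samples; i++) {
            System.out.println(sampleRow());
            Thread.sleep(intervalMillis);
        }
    }
}
```

Plotting used_bytes against timestamp_ms gives the kind of sawtooth described above when GC is keeping up.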
An additional test we had planned was an Extended Soak: letting the system run at the expected normal volumes for 7 to 10 days, to see whether the JVM would show any unusual memory usage patterns or memory leaks.
And we came across some interesting issues!
We did not have much of a problem for the first 24 to 48 hours. But after 48 hours (2 whole days) of continuous load, the graph showed that the JVM was no longer able to clear up memory, even after GC.
As the graphs above show, the WebLogic server JVM, in spite of multiple GCs, was not bringing the used memory back down to the usual 30-40% or so. It stayed at around 80% until, after 7 days, the servers threw OutOfMemoryError and simply crashed.
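The symptom described here, the post-GC floor of the sawtooth no longer dropping, can also be checked mechanically on the sampled data. The sketch below is an illustration only, not what was used in the test: it treats local minima of a used-heap-percentage series as the post-GC troughs and reports whether the most recent troughs all stay above a threshold. The class name PostGcFloor, the sample series, the window of 3 troughs and the 70% threshold are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: looks for a post-GC floor that has stopped dropping.
public class PostGcFloor {

    /** Returns the local minima of the series, i.e. the troughs of the sawtooth. */
    static List<Double> troughs(double[] usedPercent) {
        List<Double> result = new ArrayList<>();
        for (int i = 1; i < usedPercent.length - 1; i++) {
            if (usedPercent[i] < usedPercent[i - 1] && usedPercent[i] <= usedPercent[i + 1]) {
                result.add(usedPercent[i]);
            }
        }
        return result;
    }

    /** True when the last `window` troughs all sit above `thresholdPercent`. */
    static boolean floorStuckAbove(double[] usedPercent, int window, double thresholdPercent) {
        List<Double> found = troughs(usedPercent);
        if (found.size() < window) {
            return false; // not enough GC cycles observed to decide
        }
        for (double value : found.subList(found.size() - window, found.size())) {
            if (value <= thresholdPercent) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up series: heap drops back to about 35% after each GC.
        double[] healthy = {35, 60, 85, 33, 58, 84, 36, 61, 86, 34, 60};
        // Made-up series: heap only drops back to about 80% after each GC.
        double[] leaking = {35, 70, 90, 80, 88, 95, 81, 90, 96, 82, 92};

        System.out.println(floorStuckAbove(healthy, 3, 70.0)); // prints false
        System.out.println(floorStuckAbove(leaking, 3, 70.0)); // prints true
    }
}
```

A check like this is crude, since it ignores the difference between minor and full GCs, but it is enough to flag a run where the floor has moved from 30-40% to 80%.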
To analyze this, we decided to take JVM heap dumps and examine them using VisualVM or Eclipse MAT.
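For reference, one way to produce a heap dump that both tools can open is the HotSpotDiagnosticMXBean, sketched below. This is an assumption-laden example and not necessarily how the dumps in this test were taken: it only works on HotSpot-based JVMs, it dumps the JVM it runs in, and the class name HeapDumper and the default file name are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: writes an .hprof dump of the current HotSpot JVM.
public class HeapDumper {

    /**
     * Dumps the heap to `target` and returns the path.
     * With liveOnly = true, only reachable objects are written.
     */
    static Path dumpHeap(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap fails if the output file already exists.
        Files.deleteIfExists(target);
        bean.dumpHeap(target.toAbsolutePath().toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default file name; recent JDKs require the .hprof extension.
        Path target = Paths.get(args.length > 0 ? args[0] : "heap.hprof");
        Path written = dumpHeap(target, true);
        System.out.println("Heap dump written to " + written.toAbsolutePath());
    }
}
```

Dumping only live objects keeps the file smaller and focuses the analysis on what GC cannot reclaim, which is the interesting part when hunting a leak.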
In this instance Eclipse MAT, which is easy to set up as an Eclipse plugin, gave instant feedback, and we were able to narrow down the Leak Suspects and find the root cause of the problem.
I will elaborate on that in Part 2 of this series.