Thursday, April 17, 2014

Understand G1 gc log

Basic flags: -XX:+PrintGCTimeStamps -XX:+PrintGCDetails

G1 prints out various logs, depending on the flags you use.  The basic ones are:
-XX:+PrintGCTimeStamps -XX:+PrintGCDetails

Log example:

39.570: [GC pause (G1 Evacuation Pause) (young), 0.2028160 secs]
   [Parallel Time: 149.2 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 39571.0, Avg: 39571.1, Max: 39571.2, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 0.4, Avg: 0.5, Max: 0.6, Diff: 0.2, Sum: 4.0]
      [Update RS (ms): Min: 5.1, Avg: 6.6, Max: 11.7, Diff: 6.6, Sum: 52.7]
         [Processed Buffers: Min: 1, Avg: 1.5, Max: 4, Diff: 3, Sum: 12]
      [Scan RS (ms): Min: 11.0, Avg: 14.8, Max: 17.8, Diff: 6.9, Sum: 118.7]
      [Code Root Scanning (ms): Min: 2.3, Avg: 3.7, Max: 4.7, Diff: 2.4, Sum: 29.4]
      [Object Copy (ms): Min: 121.1, Avg: 122.0, Max: 122.9, Diff: 1.8, Sum: 975.6]
      [Termination (ms): Min: 0.0, Avg: 0.5, Max: 0.6, Diff: 0.6, Sum: 4.0]
      [GC Worker Other (ms): Min: 0.1, Avg: 0.5, Max: 0.9, Diff: 0.8, Sum: 4.0]
      [GC Worker Total (ms): Min: 148.1, Avg: 148.6, Max: 148.9, Diff: 0.8, Sum: 1188.4]
      [GC Worker End (ms): Min: 39719.3, Avg: 39719.7, Max: 39720.0, Diff: 0.8]
   [Code Root Fixup: 0.0 ms]
   [Code Root Migration: 3.7 ms]
   [Clear CT: 4.6 ms]
   [Other: 45.4 ms]
      [Choose CSet: 0.1 ms]
      [Ref Proc: 0.8 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 43.2 ms]
   [Eden: 7744.0M(7744.0M)->0.0B(7743.0M) Survivors: 448.0M->449.0M Heap: 8199.9M(16.0G)->457.5M(16.0G)]
 [Times: user=1.26 sys=0.01, real=0.20 secs]

Parallel Phases:

39.570: [GC pause (G1 Evacuation Pause) (young), 0.2028160 secs]
At timestamp 39.570, a young gc happened, this young gc lasted 0.2028160 seconds.
[Parallel Time: 149.2 ms, GC Workers: 8]
Out of 202.8ms, 149.2 ms is executed by the parallel gc threads. We have 8 GC threads.
[GC Worker Start (ms): Min: 39571.0, Avg: 39571.1, Max: 39571.2, Diff: 0.2]
The gc threads start timestamp.
[Ext Root Scanning (ms): Min: 0.4, Avg: 0.5, Max: 0.6, Diff: 0.2, Sum: 4.0]
External root scanning takes a average of 0.5 ms. This is the time gc workers to scan the external roots, such as registers, thread stacks, system dictionary.
[Update RS (ms): Min: 5.1, Avg: 6.6, Max: 11.7, Diff: 6.6, Sum: 52.7]
         [Processed Buffers: Min: 1, Avg: 1.5, Max: 4, Diff: 3, Sum: 12]
Update remember set takes a average of 6.6 ms. On average, each thread processed 1.5 buffers. These buffers are completed but not processed by concurrent refinement threads.
[Scan RS (ms): Min: 11.0, Avg: 14.8, Max: 17.8, Diff: 6.9, Sum: 118.7]
Scan RS takes a average of 14.8ms. This is the time to scan the remember set for references into CSet.
[Code Root Scanning (ms): Min: 2.3, Avg: 3.7, Max: 4.7, Diff: 2.4, Sum: 29.4]
Time to scan code root.
[Object Copy (ms): Min: 121.1, Avg: 122.0, Max: 122.9, Diff: 1.8, Sum: 975.6]
Time to copy object from 1 region to the other region.
[Termination (ms): Min: 0.0, Avg: 0.5, Max: 0.6, Diff: 0.6, Sum: 4.0]
Time between the threads are offered to terminate, and when it terminates.
[GC Worker Other (ms): Min: 0.1, Avg: 0.5, Max: 0.9, Diff: 0.8, Sum: 4.0]
Time that can not be contributed to any known categories.
[GC Worker Total (ms): Min: 148.1, Avg: 148.6, Max: 148.9, Diff: 0.8, Sum: 1188.4]
Total parallel time for gc threads.
[GC Worker End (ms): Min: 39719.3, Avg: 39719.7, Max: 39720.0, Diff: 0.8]
Time stamp when gc worker threads end.
For all the parallel gc activities, if big difference is observed between min and max, that indicates the work is not distributed among gc worker threads evenly.

Sequential phases (we may introduce more parallelism later)

[Code Root Fixup: 0.0 ms]
Time for clean up the code root
[Code Root Migration: 3.7 ms]
Time for update the retained code root
[Other: 45.4 ms]
      [Choose CSet: 0.1 ms]
      [Ref Proc: 0.8 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 43.2 ms]
Choose CSet: time to decide regions added to CSet
Ref Proc: time spent in processing the deferred references from the prior gc
Ref Enq: time spent in processing the deferred references from the prior gc
Free CSet: time spent freeing the set of regions that have been collected, also includes the time to free their remembered sets.
Note, some times, we see uncounted Other time, especially when 'to-space exhausted' happened. Evacuation failure handling time includes recalculating the used heap, removing self-forward references, and restoring remembered sets.

Heap Summary

[Eden: 7744.0M(7744.0M)->0.0B(7743.0M) Survivors: 448.0M->449.0M Heap: 8199.9M(16.0G)->457.5M(16.0G)]
Eden: used Eden before gc is 7744.0M, the Eden size for this gc is 7744.0M. After gc, the used Eden is 0M, and the eden size for next gc should be 7743M.
Survivors: survivor used before gc is 448.0M, and after gc is 449.0M.
Heap: Heap used before gc is 8199.9M. The heap size before gc is 16.0G. Heap used after gc is 457.5M. Heap size for next gc is 16.0G.
TimeSummary
[Times: user=1.26 sys=0.01, real=0.20 secs]
High system CPU could be swapping, or some other kind of OS/kernel interaction, or activity
Low User CPU relative to real CPU can be an indication of lack of parallelism and root cause should also be identifiable in the “Parallel Time” stats area of the GC log

-XX:+PrintAdaptiveSizePolicy

If this flag is used, G1 will print out more information about how g1 makes the ergonomic decisions.
2.900: [GC pause (G1 Evacuation Pause) (mixed) 2.900: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 21789, predicted base time: 31.52 ms, remaining time: 168.48 ms, target pause time: 200.00 ms] 2.900: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 3 regions, survivors: 9 regions, predicted young region time: 22.64 ms]
 2.900: [G1Ergonomics (CSet Construction) finish adding old regions to CSet, reason: old CSet region num reached max, old: 26 regions, max: 26 regions]
 2.900: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 3 regions, survivors: 9 regions, old: 26 regions, predicted pause time: 96.49 ms, target pause time: 200.00 ms]
 2.927: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 12.52 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
At the beginning of a gc cycle, how G1 decides the regions in the CSet.
The minimum young regions included in cset should include the survivor regions, can keep up with allocation rate. Also the predicted time to process the cards and evacuate the regions should be under the target pause.
2.927: [G1Ergonomics (Concurrent Cycles) do not request concurrent cycle initiation, reason: still doing mixed collections, occupancy: 178257920 bytes, allocation request: 0 bytes, threshold: 120795930 bytes (45.00 %), source: end of GC]
 2.927: [G1Ergonomics (Mixed GCs) continue mixed GCs, reason: candidate old regions available, candidate old regions: 49 regions, reclaimable: 28742816 bytes (10.71 %), threshold: 10.00 %]
At the end of gc, continue to do another mixed gc. Because the reclaimable old gen(28742816 bytes (10.72%)) is higher than the G1HeapWastedPercent(10%).
3.019: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 189792256 bytes, allocation request: 2097168 bytes, threshold: 120795930 bytes (45.00 %), source: concurrent humongous allocation]
 3.019: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: requested by GC cause, GC cause: G1 Humongous Allocation]
 3.019: [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: concurrent cycle initiation requested]
At the end of gc, we decide to start a concurrent cycle, since the heap occupancy (189792256 bytes) is higher than InitiatingHeapOccupancyPercent(45%).

-XX:+UnlockExperimentalVMOptions -XX:G1LogLevel=finest

This flag prints out more detailed information, such as per gc worker thread.
2.900: [GC pause (G1 Evacuation Pause) (mixed) 2.900: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 21789, predicted base time: 31.52 ms, remaining time: 168.48 ms, target pause time: 200.00 ms] 2.900: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 3 regions, survivors: 9 regions, predicted young region time: 22.64 ms]
 2.900: [G1Ergonomics (CSet Construction) finish adding old regions to CSet, reason: old CSet region num reached max, old: 26 regions, max: 26 regions]
 2.900: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 3 regions, survivors: 9 regions, old: 26 regions, predicted pause time: 96.49 ms, target pause time: 200.00 ms]
 2.927: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 12.52 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
 2.927: [G1Ergonomics (Concurrent Cycles) do not request concurrent cycle initiation, reason: still doing mixed collections, occupancy: 178257920 bytes, allocation request: 0 bytes, threshold: 120795930 bytes (45.00 %), source: end of GC]
 2.927: [G1Ergonomics (Mixed GCs) continue mixed GCs, reason: candidate old regions available, candidate old regions: 49 regions, reclaimable: 28742816 bytes (10.71 %), threshold: 10.00 %]
, 0.0271740 secs]
   [Parallel Time: 21.0 ms, GC Workers: 23]
      [GC Worker Start (ms):  2900.3  2900.3  2900.4  2900.4  2900.5  2900.5  2900.6  2900.6  2900.6  2900.6  2900.6  2900.6  2900.7  2900.7  2900.7  2900.8  2900.8  2900.8  2900.9  2900.9  2900.9  2900.9  2901.0
       Min: 2900.3, Avg: 2900.7, Max: 2901.0, Diff: 0.7]
      [Ext Root Scanning (ms):  0.5  0.4  0.4  0.4  0.3  0.3  0.2  0.2  0.2  0.2  0.2  0.2  0.1  0.1  0.1  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
       Min: 0.0, Avg: 0.2, Max: 0.5, Diff: 0.5, Sum: 3.8]
      [Update RS (ms):  3.8  3.9  3.6  3.8  3.7  3.9  4.0  3.8  4.3  4.2  4.1  4.2  3.7  3.6  3.9  4.0  3.9  3.9  3.6  3.9  4.1  3.7  3.5
       Min: 3.5, Avg: 3.9, Max: 4.3, Diff: 0.8, Sum: 89.0]
         [Processed Buffers:  4  3  5  3  5  3  2  4  4  3  4  6  5  3  4  3  3  3  3  4  4  6  3
          Min: 2, Avg: 3.8, Max: 6, Diff: 4, Sum: 87]
      [Scan RS (ms):  1.4  1.3  1.6  1.4  1.5  1.3  1.2  1.4  0.9  1.0  1.1  1.1  1.4  1.6  1.2  1.1  1.3  1.4  1.5  1.2  1.0  1.3  1.5
       Min: 0.9, Avg: 1.3, Max: 1.6, Diff: 0.7, Sum: 29.6]
      [Code Root Scanning (ms):  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
       Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms):  13.4  13.4  13.4  13.4  13.4  13.4  13.4  13.4  13.4  13.4  13.4  13.4  13.5  13.4  13.5  13.4  13.4  13.3  13.4  13.4  13.4  13.4  13.5
       Min: 13.3, Avg: 13.4, Max: 13.5, Diff: 0.1, Sum: 308.7]
since the number of gc workers is 23, it prints out the time statistics for each thread:
[GC Worker Start (ms):  2900.3  2900.3  2900.4  2900.4  2900.5  2900.5  2900.6  2900.6  2900.6  2900.6  2900.6  2900.6  2900.7  2900.7  2900.7  2900.8  2900.8  2900.8  2900.9  2900.9  2900.9  2900.9  2901.0
       Min: 2900.3, Avg: 2900.7, Max: 2901.0, Diff: 0.7]
thread-1 starts at 2900.3
thread-2 starts at 2900.3
...
the minimum thread start time stamp is 2900.3.
the average thread start time stamp is 2900.7.
the latest thread start time stamp is 2901.0.