Java garbage collection got dramatically better over the last decade. For a backend service started today, picking the right GC takes 30 seconds and prevents a class of latency problems teams used to spend weeks on. This article covers the practical decision.

The three collectors worth knowing

G1 (Garbage-First) — the default since Java 9. Balances latency and throughput. Region-based, incremental collection. Target pause times typically 50-200ms, set via MaxGCPauseMillis (a goal, not a hard cap).

ZGC — concurrent collector, scales to heaps of TB+. Pause times measured in microseconds (usually < 1ms). Pays ~5-15% in throughput vs parallel. Generational since Java 21.

Parallel (ParallelGC) — throughput-first. Longer stop-the-world pauses (hundreds of ms to seconds) in exchange for highest raw throughput. Rarely the right choice for user-facing workloads; great for batch jobs.

Shenandoah is also a concurrent, low-pause collector, similar to ZGC in spirit, but less adopted in enterprise Java. The practical competition in 2025 is G1 vs ZGC.

The decision in one paragraph

For most user-facing services: G1 if heap < 8GB, ZGC if heap > 8GB or tail latency matters. For batch jobs and offline processing: ParallelGC. If unsure: G1.
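That rule of thumb is simple enough to write down as code. A minimal sketch (the class and method names are my own, purely illustrative):

```java
// Illustrative encoding of the decision rule above; names are hypothetical.
public class GcChoice {
    static String chooseCollector(int heapGb, boolean userFacing, boolean latencySensitive) {
        if (!userFacing) return "-XX:+UseParallelGC";              // batch/offline: throughput first
        if (heapGb > 8 || latencySensitive) return "-XX:+UseZGC";  // large heap or tight p99
        return "-XX:+UseG1GC";                                     // the safe default
    }

    public static void main(String[] args) {
        System.out.println(chooseCollector(4, true, false));   // typical Spring Boot service
        System.out.println(chooseCollector(16, true, true));   // big heap, latency-sensitive
        System.out.println(chooseCollector(32, false, false)); // overnight batch job
    }
}
```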

G1 — the safe default

Spring Boot services, heaps up to 8GB, normal load: G1 does the job. Configuration usually minimal:

-Xms4g -Xmx4g
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200

Setting -Xms equal to -Xmx prevents heap-resizing pauses. MaxGCPauseMillis=200 (also the default) tells G1 to target 200ms pauses, which it usually meets under reasonable load.
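To confirm which collector a JVM actually picked up, you can list the GC MXBeans at runtime; under G1 they report names like "G1 Young Generation" and "G1 Old Generation". A plain-JDK sketch:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcNames {
    public static void main(String[] args) {
        // Each active collector registers an MXBean; the names identify the GC in use.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": collections=" + gc.getCollectionCount());
        }
    }
}
```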

Common problems and fixes:

  • Long pauses — heap too small, too much allocation; increase -Xmx or find allocation hotspots
  • Full GCs — concurrent marking fell behind allocation; raise -XX:ConcGCThreads or increase heap
  • Humongous objects — allocations > half region size. Either fix the code or increase -XX:G1HeapRegionSize
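For context on that last item: a humongous allocation is simply an object bigger than half of G1's region size (regions are typically 1-32 MB, chosen from heap size). A sketch of the threshold check; the 2 MB region size here is an assumption, the real value is heap-dependent:

```java
public class Humongous {
    // G1 treats an allocation as "humongous" when it exceeds half the region size;
    // such objects take one or more contiguous regions and stress the collector.
    static boolean isHumongous(long sizeBytes, long regionSize) {
        return sizeBytes > regionSize / 2;
    }

    public static void main(String[] args) {
        long regionSize = 2 * 1024 * 1024; // assumed 2 MB regions
        System.out.println(isHumongous(1_048_577, regionSize)); // just over half a region: true
        System.out.println(isHumongous(512 * 1024, regionSize)); // comfortably small: false
    }
}
```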

For heaps < 4GB, G1 is practically invisible. Above 8GB, pause times creep up. That’s when ZGC earns its place.

ZGC — when you need it

Large heaps, latency-sensitive workloads:

-Xms16g -Xmx16g
-XX:+UseZGC
-XX:+ZGenerational

Generational ZGC shipped in Java 21 (opt-in via -XX:+ZGenerational) and became the default in Java 23; its throughput is dramatically better than the original single-generation design. For a service with heap > 8GB that cares about p99 latency, ZGC is almost always the right choice in 2025.

Trade-offs:

  • ~5-10% lower raw throughput than G1
  • Memory overhead ~5-10% higher
  • Can briefly use more CPU during heavy allocation

For user-facing traffic, those costs are usually fine. For batch processing, they’re wasted.

Parallel — for specific cases

-XX:+UseParallelGC

Right for:

  • Batch jobs where total throughput is the metric
  • Background workers with no user-facing latency
  • Small heaps where stop-the-world pauses are brief enough

Not for:

  • User-facing services (pauses > 100ms are noticeable)
  • Heaps > 8GB (pauses can exceed seconds)

Heap sizing

Separate decision from collector choice:

  • Minimum heap — enough for the application’s working set + overhead
  • Container-aware — use -XX:MaxRAMPercentage=75.0 to let the JVM size the heap from container memory

Wrong heap size is a bigger problem than wrong collector. Too-small heap → GC churning → everything slow. Too-large heap → long GC pauses (for G1), wasted memory.

Start with 75% of container memory as max heap, adjust based on actual usage.
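Whatever sizing strategy you pick, verify what the JVM actually resolved at startup; Runtime.maxMemory() reflects the effective -Xmx / MaxRAMPercentage:

```java
public class HeapCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the heap ceiling as resolved at startup (-Xmx or MaxRAMPercentage).
        System.out.printf("max heap: %d MB%n", rt.maxMemory() / (1024 * 1024));
        // totalMemory() is what is currently committed; with -Xms == -Xmx they match.
        System.out.printf("committed: %d MB%n", rt.totalMemory() / (1024 * 1024));
    }
}
```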

Metrics to watch

With Micrometer’s JvmGcMetrics:

  • GC pause duration (p99) — climbing means you need ZGC or more heap
  • GC overhead % — time spent in GC as fraction of wall clock; should be < 5%
  • Allocation rate — bytes allocated per second; sustained high rate → allocation hotspot
  • Heap used after GC — trend upward = leak
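Micrometer's JvmGcMetrics is built on the JDK's management beans, so if you want the raw numbers without a metrics library, the same data is available directly. A minimal sketch of the overhead and heap-used figures:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class GcStats {
    public static void main(String[] args) {
        long totalGcMs = 0;
        long totalCollections = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            totalGcMs += Math.max(0, gc.getCollectionTime());        // cumulative ms in this collector
            totalCollections += Math.max(0, gc.getCollectionCount());
        }
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        // "GC overhead": fraction of wall clock spent collecting; keep it under ~5%.
        double overheadPct = uptimeMs > 0 ? 100.0 * totalGcMs / uptimeMs : 0;
        System.out.printf("collections=%d, gcTime=%dms, overhead=%.2f%%%n",
                totalCollections, totalGcMs, overheadPct);

        // Heap used right now; the "used after GC" trend comes from sampling this after collections.
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        System.out.printf("heap used: %d MB%n", mem.getHeapMemoryUsage().getUsed() / (1024 * 1024));
    }
}
```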

Without these, GC is invisible until it’s breaking prod.

Flight recorder

JFR is low-overhead diagnostic data (roughly 1-2% with the profile settings), worth keeping always on:

-XX:StartFlightRecording=name=prod,filename=/var/log/jfr/rec.jfr,maxage=1h,maxsize=200m,settings=profile

When GC misbehaves, the JFR recording shows exactly which allocations are heaviest, which threads are waiting, which GC phases took time. Priceless diagnostic when you need it.
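The same recordings can also be driven programmatically through the jdk.jfr API, which is handy for dumping a snapshot from a debug endpoint. A sketch (the temp-file destination is arbitrary):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import jdk.jfr.Recording;

public class JfrDump {
    public static void main(String[] args) throws Exception {
        // Start an in-process recording with the default event settings.
        try (Recording recording = new Recording()) {
            recording.start();
            // ... let the workload run; here we just allocate a little to generate events.
            byte[][] junk = new byte[100][];
            for (int i = 0; i < junk.length; i++) junk[i] = new byte[64 * 1024];
            recording.stop();

            Path out = Files.createTempFile("rec", ".jfr");
            recording.dump(out); // snapshot for `jfr print` or JDK Mission Control
            System.out.println("dumped " + Files.size(out) + " bytes to " + out);
        }
    }
}
```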

Things I’ve seen go wrong

Large strings or byte arrays in caches. Huge allocation churn. Fix: stream, don’t materialize.

JSON deserialization on hot paths creating millions of temporary objects. Cache parsed objects or use a faster parser.
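One common shape for that caching fix is memoizing parse results keyed by the raw payload. A hedged sketch; the parser function is a stand-in for your real deserializer:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: parse each distinct payload once instead of churning
// millions of short-lived objects on the hot path.
public class ParseCache<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> parser;

    public ParseCache(Function<String, T> parser) {
        this.parser = parser;
    }

    public T get(String raw) {
        // Repeated payloads hit the cache; computeIfAbsent parses at most once per key.
        return cache.computeIfAbsent(raw, parser);
    }
}
```

This only pays off when identical payloads actually recur and the parsed objects are immutable; in production you would bound the map (e.g. with an eviction policy), since computeIfAbsent alone retains entries forever.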

Protobuf messages held longer than needed. They pin memory, get promoted to old gen, and add GC pressure. Release references promptly.

Oversized heaps. -Xmx32g “just in case” means old-gen GC takes forever. Right-size.

Collector choice wrong for workload. Parallel on user-facing = long pauses. ZGC on small heap = unnecessary overhead.

Generational ZGC specifics

Before Java 21, ZGC was non-generational — all objects went through the same collection cycle, making it slower than G1 on throughput. Generational ZGC (JEP 439, delivered in 21) handles young objects separately, closing most of the gap.

In 2025: on Java 21 or 22, enable it explicitly with -XX:+ZGenerational; since Java 23 it is the default, and Java 24 removed the non-generational mode entirely.

Closing note

GC tuning in 2025 is mostly “pick the right collector and size the heap correctly.” Deep tuning of specific GC flags is almost never necessary. Start with G1 + -Xms=-Xmx + container-aware sizing. Measure latency. If p99 matters and heap is large, switch to ZGC. If throughput is all and pauses are tolerable, ParallelGC. Move on — the 2010s era of obsessive GC tuning is mostly behind us.