---
title: "What is GC? How does it work in JVM?"
date: 2023-01-28T20:15:10+11:00
draft: false
showSummary: true
summary: "Tuning Garbage Collection on Minecraft servers is something of an arcane art that few people seem to understand
at a fundamental level. Let's remedy that. We'll start with the fundamentals of GC and the JVM, and use that to form the
foundations of our GC choice and how we tune it."
series:
- "JVM, GC, and Minecraft"
series_order: 1
---

# Part One: What is GC? How does it work?

> **(Tammy)** To preempt any fears you may have, Ashe has assured us that this will be written at a level where someone
> who just wants to run their Minecraft server well will understand.
>
> **(Ashe)** And for those of you who do have a technical understanding of Java, and/or sysadmin, I hope this deepens
> your understanding of the topics at hand.
>
> **(Doll)** Will this one be able to understand?
>
> **(Ashe)** That's the idea, yeah.
>
> **(Doll)** YAY!

## What is GC?

GC is an initialism for Garbage Collection, and refers to the process of a program (or operating system, or anything
else) freeing unused memory for re-use. We don't need to go super in-depth here,
except to say that the garbage collection algorithm and mechanism determine when and how garbage collection is done,
and what limitations there are to GC.

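
To make "freeing unused memory" a little more concrete, here is a minimal sketch of the classic mark-and-sweep idea: start from the things the program can still reach, mark everything they point at, and free the rest. This is a toy model and not how any real JVM collector is implemented; `ToyMarkSweep`, `Node`, and the object names are all made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of "find what is still in use, free the rest". Not how the JVM does it. */
final class ToyMarkSweep {
    /** A pretend heap object that may refer to other objects. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Mark phase: everything reachable from the roots is still in use. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = new HashSet<>();
        Deque<Node> todo = new ArrayDeque<>(roots);
        while (!todo.isEmpty()) {
            Node n = todo.pop();
            if (live.add(n)) {          // first visit only, so cycles terminate
                todo.addAll(n.refs);
            }
        }
        return live;
    }

    /** Sweep phase: drop every object the mark phase did not reach; returns the names freed. */
    static List<String> sweep(List<Node> heap, Set<Node> live) {
        List<String> freed = new ArrayList<>();
        heap.removeIf(n -> {
            if (live.contains(n)) return false;
            freed.add(n.name);
            return true;
        });
        return freed;
    }

    /** Tiny heap where "world" still points at "chunk", but "oldChunk" is orphaned. */
    static List<String> demo() {
        Node world = new Node("world");
        Node chunk = new Node("chunk");
        Node oldChunk = new Node("oldChunk");
        world.refs.add(chunk);
        List<Node> heap = new ArrayList<>(List.of(world, chunk, oldChunk));
        return sweep(heap, mark(List.of(world)));
    }

    public static void main(String[] args) {
        System.out.println("freed: " + demo()); // prints "freed: [oldChunk]"
    }
}
```

Every collector discussed below has to answer the same question this sketch answers; they differ in when they do the work and whether the program is allowed to keep running in the meantime.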
## Why do I care?

The choice of garbage collection algorithm determines a lot about how the underlying program will behave.
Broadly, there are three categories of garbage collection algorithms:

- Stop-the-world
- Concurrent
- Hybrid

### Stop-the-world garbage collection

{{< hover "STW" >}} Stop-the-world {{< /hover >}} GC algorithms, as the name suggests, halt program execution while
freeing memory. Often, they will also compact live objects (move them together to close the gaps left by freed ones),
as there's no risk of a reference changing while program execution is paused.

The Serial GC in the JVM (which HotSpot picks by default on small machines, such as those with a single CPU or very
little memory) is an example of a stop-the-world GC algorithm.

{{< hover "Stop-the-world" >}}STW{{< /hover >}} garbage collectors are ideal when short pauses in program execution
are acceptable and the running program should have the lion's share of resources.

### Concurrent garbage collection

Concurrent garbage collectors *do not* pause program execution while doing clean-up; instead, they run on threads
alongside the program. Concurrent algorithms cannot make the same assumptions as stop-the-world algorithms
about program execution, so they must go to much more effort to ensure that an object is stale or dead before
cleaning it up.
As a result, they tend to be much more CPU and RAM intensive in exchange for never forcing the program to stop.

Concurrent GC algorithms are ideal when program pauses are unacceptable (such as in realtime or networked applications),
and when some inefficiency in CPU and RAM usage is an acceptable price to pay for this.

The {{< hover "Z Garbage Collector" >}}ZGC{{< /hover >}} in Java is an example of a fully concurrent garbage collector.

### Hybrid

Hybrid garbage collectors are, as the name suggests, some hybrid of the above.
They will do some of their work concurrently, but do require {{< hover "Stop-the-world" >}}STW{{< /hover >}}
pauses to perform some of their functions.

The Concurrent-Mark-and-Sweep collector (deprecated in JDK 9, removed in JDK 14) is an example of a hybrid GC.
It does most of its marking and sweeping concurrently, using a small number of the available threads,
but stops the world briefly at the start and end of marking, and falls back to a full stop-the-world
collection with compaction when the heap fills up or becomes too fragmented.

The {{< hover "Garbage-First Garbage Collector" >}}G1GC{{< /hover >}}, which replaced the Concurrent-Mark-and-Sweep
collector, is another example.

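
If you're not sure which of these collectors a given JVM has actually picked, the JVM will tell you. The sketch below uses the standard `java.lang.management` API to list the collectors the running JVM has registered; the class name `WhichGc` is ours. The names printed depend on the JVM and its flags (on a G1 JVM you would expect names beginning with `G1`), so treat the output as informational and not as a stable format.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class WhichGc {
    /** Names of the collectors this JVM registered (often one per generation). */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "not available".
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running it twice, once with `-XX:+UseSerialGC` and once with `-XX:+UseG1GC`, is a quick way to confirm that a flag took effect.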
### A secret fourth thing

> **(Tammy)** Wait. Could you... simply not?
>
> **(Ashe)** Good catch. Yes! It's even useful sometimes!

You could simply *not* collect garbage and allow the program to continue using more and more memory until
it either terminates, or inevitably runs out of resources and crashes.

> **(Lorelai)** You only need to last long enough to fulfill your purpose. I can see the appeal.

Right. The appeal here is that you get the full processing power of the machine, *and* you never have to stop the world
to collect garbage; it's the best of both worlds. However, this also means that you're either relying on the programmer
to clean up their own garbage (in languages that allow this), or simply allowing the program to use more
and more memory until the host runs out of available memory and the program crashes.
For our purposes today, it is worth noting that disabling the garbage collector in the JVM means that
no garbage collection will be done and, as such, no memory will ever be freed,
since Java lacks explicit destruction of objects.

But Lorelai has the right idea. This is the approach used by some stock-trading software,
where any delay is unacceptable,
and the software has a bounded lifecycle. It needs to run for 8 hours every weekday. As such, so long as it doesn't crash
within those 8 hours, it can do whatever.

> **(Ashe)** As an aside, this is how we came across the phrase "Giant mainframes with 256GB of RAM running JVM 8,
> crashing every 9 hours, all according to keikaku[^1]"

[^1]: Keikaku means plan.

## The JVM and Garbage Collection

To understand the particularities of the various GCs available in modern Java, we'll have to take a small digression into
how the Java Virtual Machine itself works. The Java Virtual Machine is precisely that: a virtual machine.
It has its own instruction set, memory management, everything. It is in most ways a full architecture[^2].
As such, the Java compiler compiles Java source code into JVM bytecode, which the virtual machine then executes.
Java achieves its ability to "run on anything, including toasters" by publishing
{{< hover "Java Runtime Environments" >}}JREs{{< /hover >}} for many different underlying architectures.

[^2]: As an interesting side note, there are even some SoCs that can run Java bytecode natively.

Java as a programming language lands squarely in the object-oriented camp,
but makes several critical departures from C and friends to make itself easier to port to different architectures:

- No exposed pointers
- No explicit free/destroy

### No exposed pointers

Java as a language refuses to expose raw pointers to the programmer. It does this to make porting the JVM to
different architectures with potentially wildly different memory architectures much easier, as the programmer can never
refer to an object by address or perform pointer arithmetic. This means that the particular address of any particular
object can change at any time. This makes garbage collection significantly easier, as it means that the GC algorithm
can copy live data at any time, which makes memory compaction much easier.

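
Here is a rough sketch of why hiding addresses makes moving objects safe. The "program" below only ever holds opaque handles, so the "collector" is free to slide values around in memory as long as it patches its own table. This is a deliberately simplified model (real JVMs generally update references directly and do not route every access through a handle table), and `ToyCompactingHeap` and its methods are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy heap: the "program" only ever holds handles, never slot addresses. */
final class ToyCompactingHeap {
    private final String[] slots;                                 // pretend memory; null = free
    private final List<Integer> handleToSlot = new ArrayList<>(); // handle -> slot, -1 = freed

    ToyCompactingHeap(int size) {
        slots = new String[size];
    }

    /** Stores a value in the first free slot and returns an opaque handle to it. */
    int allocate(String value) {
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] == null) {
                slots[i] = value;
                handleToSlot.add(i);
                return handleToSlot.size() - 1;
            }
        }
        throw new IllegalStateException("toy heap is full");
    }

    /** Stands in for "the collector decided this object is dead". */
    void free(int handle) {
        slots[handleToSlot.get(handle)] = null;
        handleToSlot.set(handle, -1);
    }

    String read(int handle) { return slots[handleToSlot.get(handle)]; }

    int slotOf(int handle) { return handleToSlot.get(handle); }

    /** Slides live values to the front and patches the handle table to match. */
    void compact() {
        int[] newSlotOf = new int[slots.length];
        int next = 0;
        for (int old = 0; old < slots.length; old++) {
            if (slots[old] != null) {
                String value = slots[old];
                slots[old] = null;
                slots[next] = value;
                newSlotOf[old] = next++;
            }
        }
        for (int h = 0; h < handleToSlot.size(); h++) {
            int old = handleToSlot.get(h);
            if (old >= 0) handleToSlot.set(h, newSlotOf[old]);
        }
    }

    static String demo() {
        ToyCompactingHeap heap = new ToyCompactingHeap(4);
        heap.allocate("a");
        int b = heap.allocate("b");
        int c = heap.allocate("c");
        heap.free(b);                  // leaves a hole in the middle
        int before = heap.slotOf(c);
        heap.compact();                // "c" moves, but its handle still works
        return heap.read(c) + " moved from slot " + before + " to slot " + heap.slotOf(c);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "c moved from slot 2 to slot 1"
    }
}
```

If the program had been allowed to remember "slot 2" directly, as a C program can with a raw pointer, the move would have broken it.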
### No explicit free/destroy

The JVM does not provide any mechanism to explicitly free or destroy an object.
Even `System.gc()` is nothing more than a suggestion.

This is important as it takes a lot of control away from the programmer and grants it to the runtime instead.
The person running the program can determine the GC algorithm, how often it runs, what its runtime targets are,
and so on, based on the particular performance characteristics that they need. One could argue that this could
equally be done by the programmer or designer of the program, but in doing this, the JVM again gains a lot of portability,
as it means that the sole arbiter of when an object is freed is the runtime.

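
A small sketch of what "nothing more than a suggestion" looks like in practice. The only thing a Java program can do to "free" an object is stop referring to it; the example uses a `WeakReference` to peek at whether the runtime has collected it yet. The class name `GcIsOnlyAHint` is ours, and the second printout is intentionally left without an expected value, because the outcome is up to the JVM.

```java
import java.lang.ref.WeakReference;

final class GcIsOnlyAHint {
    /** While a strong reference exists, the object must survive any collection. */
    static boolean survivesWhileReachable() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();                 // a request, not a command
        return weak.get() == strong; // always true: 'strong' is still in use here
    }

    public static void main(String[] args) {
        System.out.println("reachable object survived: " + survivesWhileReachable());

        // Nothing holds this object strongly; dropping references is the only "free" Java offers.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        // May print true or false: whether and when the object goes away is the runtime's call.
        System.out.println("unreachable object already collected: " + (weak.get() == null));
    }
}
```

The person running the program can even turn the suggestion off entirely: with `-XX:+DisableExplicitGC`, which appears in the flag list later in this post, calls to `System.gc()` are ignored.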
## How does this relate to Minecraft?

> **(Doll)** But Miss Ashe! How does this help us get all the dolls on one biiiig Minecraft Server?

I'm glad you asked!

An interesting note here is that *because* the JVM abstracts away memory management from the programmer, it is up to
the deployment to choose the correct GC and parameters to ensure that their program performs the way that it is
intended to. Many programs ship with relatively sensible defaults; however, Minecraft (and especially modded Minecraft)
was notoriously bad about this. There was some folk wisdom back in the day that suggested using
the {{< hover "Concurrent-Mark-and-Sweep" >}}CMS{{< /hover >}} garbage collector alongside some sensible,
but unexplained, parameters that seemed to perform well, but the forum post in question seems to have either been lost
to bit-rot, or is too well-hidden for our google-fu.

In more modern times, various launchers have converged on some relatively sane defaults, using CMS at first
(as it performed best with Minecraft's workload), and moving to G1GC more recently, as Minecraft itself moved to using
JDK versions where CMS was deprecated (or even gone!).

PolyMC currently ships the following on all of their servers:
```bash
# Xmx and Xms set the maximum and initial (minimum) heap size, respectively.
# They can take any whole number, followed by an M or a G.
# M means Megabyte, G means Gigabyte.
# For example, to set the maximum to 3GB: -Xmx3G
# To set the minimum to roughly 2.5GB: -Xms2500M
# A good default for a modded server is 4GB.

-Xms4G
-Xmx6G
-XX:+UseG1GC
-XX:+ParallelRefProcEnabled
-XX:MaxGCPauseMillis=200
-XX:+UnlockExperimentalVMOptions
-XX:+DisableExplicitGC
-XX:+AlwaysPreTouch
-XX:G1NewSizePercent=30
-XX:G1MaxNewSizePercent=40
-XX:G1HeapRegionSize=8M
-XX:G1ReservePercent=20
-XX:G1HeapWastePercent=5
-XX:G1MixedGCCountTarget=4
-XX:InitiatingHeapOccupancyPercent=15
-XX:G1MixedGCLiveThresholdPercent=90
-XX:G1RSetUpdatingPauseTimePercent=5
-XX:SurvivorRatio=32
-XX:+PerfDisableSharedMem
-XX:MaxTenuringThreshold=1
```

If this is all Greek to you, don't worry! We'll go over this in our next post and see where we can go from here.

Suffice it to say there's plenty of room for improvement here without degrading performance, especially if your
workload differs significantly from what these defaults assume.

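
Before tuning anything, it helps to confirm that the flags you think you set are the ones the JVM actually received. The sketch below is one way to check from inside a running JVM, using only standard library calls; the class name `ShowJvmSettings` is ours, and this is a diagnostic aid, not something Minecraft or any launcher ships.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class ShowJvmSettings {
    /** Maximum heap the JVM will try to use, in MiB (roughly what -Xmx asked for). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    /** The flags this JVM was started with, such as -Xmx6G or -XX:+UseG1GC. */
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMiB() + " MiB");
        jvmArguments().forEach(arg -> System.out.println("flag: " + arg));
    }
}
```

On JDK 11 or newer you can run it directly with something like `java -Xms4G -Xmx6G -XX:+UseG1GC ShowJvmSettings.java`. The reported maximum should be close to the `-Xmx` value, though some collectors report slightly less.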
## Next Time!

This is probably a good place to call it for today. While we haven't gotten into actually tuning Minecraft GC,
I think this forms a good groundwork for understanding why we're going to make certain decisions down the line.

We've gone over how and why the JVM manages memory, so next time we'll focus on the various GCs that modern JREs
ship with, and what their various advantages and disadvantages are. If you'd like a head-start,
the [Oracle GC-Tuning Documentation](https://docs.oracle.com/en/java/javase/17/gctuning/preface.html)
is extremely thorough!

> **(Tammy)** Okay, I managed to keep up with all of that, nice work.
>
> **(Tess)** Was there ever any doubt?