add first GC post
parent 526f5b2adc
commit c963ef30c3
@ -0,0 +1,212 @
---
title: "What is GC? How does it work in JVM?"
date: 2023-01-22T22:22:22+11:00
draft: true
showSummary: true
summary: "Tuning Garbage Collection on Minecraft servers is something of an arcane art that few people seem to understand
at a fundamental level. Let's remedy that. We'll start with the fundamentals of GC and the JVM, and use them to form the
foundations of our GC choice and how we tune it."
series:
- "JVM, GC, and Minecraft"
series_order: 1
---
# Part One: What is GC? How does it work?

> **(Tammy)** To preempt any fears you may have, Ashe has assured us that this will be written at a level where someone
> who just wants to run their Minecraft server well will understand.
>
> **(Ashe)** And for those of you who do have a technical understanding of Java and/or sysadmin work, I hope this deepens
> your understanding of the topics at hand.
>
> **(Doll)** Will this one be able to understand?
>
> **(Ashe)** That's the idea, yeah.
>
> **(Doll)** YAY!
## What is GC?

GC is an initialism for Garbage Collection, and refers to the process of a program (or operating system, or anything
else) freeing unused memory for re-use. We don't need to go super in-depth here,
except to say that the garbage collection algorithm and mechanism determine when and how garbage collection is done,
and what limitations there are to GC.
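To make "freeing unused memory" a little more concrete, here is a minimal sketch of one classic approach, mark-and-sweep. It is a toy model and not how any real JVM collector is implemented: the `Node` class, the `heap` list, and the object names are all made up for illustration. The idea is simply that anything you can no longer reach from a known starting point (a "root") is garbage.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ToyMarkSweep {
    /** A toy heap object (hypothetical): a name plus the objects it points at. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Mark phase: everything reachable from the roots is still in use. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = new HashSet<>();
        List<Node> pending = new ArrayList<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.remove(pending.size() - 1);
            if (live.add(n)) {          // first visit: follow its references too
                pending.addAll(n.refs);
            }
        }
        return live;
    }

    /** Sweep phase: drop everything that was not marked; return the names freed. */
    static List<String> sweep(List<Node> heap, Set<Node> live) {
        List<String> freed = new ArrayList<>();
        heap.removeIf(n -> {
            if (live.contains(n)) return false;
            freed.add(n.name);
            return true;
        });
        return freed;
    }

    /** Builds a tiny heap, runs one collection, and returns the freed names. */
    static List<String> demo() {
        Node world = new Node("world");
        Node chunk = new Node("chunk");
        Node oldChunk = new Node("oldChunk");
        Node orphan = new Node("orphan");
        world.refs.add(chunk);      // world -> chunk: reachable from the root
        oldChunk.refs.add(orphan);  // oldChunk -> orphan, but nothing points at oldChunk
        List<Node> heap = new ArrayList<>(List.of(world, chunk, oldChunk, orphan));
        return sweep(heap, mark(List.of(world)));
    }

    public static void main(String[] args) {
        System.out.println("freed: " + demo());
    }
}
```

Note that `orphan` is freed even though `oldChunk` still points at it: what matters is reachability from the roots, not whether anything at all refers to the object.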
## Why do I care?

The choice of garbage collection algorithm determines a lot about how the underlying program will behave.
Broadly, there are three categories of garbage collection algorithms:

- Stop-the-world
- Concurrent
- Hybrid
### Stop-the-world garbage collection

{{< hover "STW" >}} Stop-the-world {{< /hover >}} GC algorithms, as the name suggests, halt program execution while
freeing memory. Often, they will also compact live objects as there's no risk of a reference changing
while program execution is paused.

The Serial GC in the JVM (which HotSpot still selects by default on hosts it does not consider server-class, such as
single-core or low-memory machines) is an example of a stop-the-world GC algorithm.

{{< hover "Stop-the-world" >}}STW{{< /hover >}} garbage collectors are ideal when short pauses in program execution
are acceptable and the running program should have the lion's share of resources.
### Concurrent garbage collection

Concurrent garbage collectors *do not* pause program execution while doing clean-up; instead, they run on threads
alongside the program. Concurrent algorithms cannot make the same assumptions as stop-the-world algorithms
about program execution, so they must go to much more effort to ensure that an object is truly dead before
cleaning it up.
As a result, they tend to be much more CPU and RAM intensive in exchange for never forcing the program to stop.

Concurrent GC algorithms are ideal when program pauses are unacceptable (such as in realtime or networked applications),
and when some inefficiency in CPU and RAM usage is an acceptable price to pay for this.

The {{< hover "Z Garbage Collector" >}}ZGC{{< /hover >}} in Java is an example of a (very nearly) fully concurrent
garbage collector: its few remaining pauses are brief enough to ignore for our purposes.
### Hybrid

Hybrid garbage collectors are, as the name suggests, some hybrid of the above.
They will do some of their work concurrently, but do require {{< hover "Stop-the-world" >}}STW{{< /hover >}}
pauses to perform some of their functions.

The Concurrent-Mark-and-Sweep collector (deprecated since JVM 9, removed in JVM 14) is an example of a hybrid GC.
It does most of its marking and sweeping concurrently, using a small number of the available threads,
but stops the world briefly at the start and end of marking, and falls back to a full stop-the-world
collection with compaction when it can't keep up.

The {{< hover "Garbage-First Garbage Collector" >}}G1GC{{< /hover >}}, which replaced the Concurrent-Mark-and-Sweep
collector, is another example.
### A secret fourth thing

> **(Tammy)** Wait. Could you... simply not?
>
> **(Ashe)** Good catch. Yes! It's even useful sometimes!

You could simply *not* collect garbage and allow the program to continue using more and more memory until
it either terminates, or inevitably runs out of resources and crashes.

> **(Lorelai)** You only need to last long enough to fulfill your purpose. I can see the appeal.

Right. The appeal here is that you get the full processing power of the machine, *and* you never have to stop the world
to collect garbage; it's the best of both worlds. However, this also means that you're either relying on the programmer
to clean up their own garbage (in languages that allow this), or simply allowing the program to use more
and more memory until the host runs out of available memory and the program crashes.
For our purposes today, it is worth noting that disabling the garbage collector in the JVM means that
no garbage collection will be done, and, as such, no memory will ever be freed,
since Java lacks explicit destruction of objects.

But Lorelai has the right idea. This is the approach used by some stock-trading software,
where any delay is unacceptable,
and the software has a bounded lifecycle. It needs to run for 8 hours every weekday. As such, so long as it doesn't crash
within those 8 hours, it can do whatever.

> **(Ashe)** As an aside, this is how we came across the phrase "Giant mainframes with 256GB of RAM running JVM 8,
> crashing every 9 hours, all according to keikaku[^1]"

[^1]: Keikaku means plan.
## The JVM and Garbage Collection

To understand the particularities of the various GCs available in modern Java, we'll have to take a small digression into
how the Java Virtual Machine itself works. The Java Virtual Machine is precisely that: a virtual machine.
It has its own instruction set, memory management, everything. It is in most ways a full architecture[^2].
As such, the Java compiler compiles Java source code into JVM bytecode, which the virtual machine then executes.
Java achieves its ability to "run on anything, including toasters" by publishing
{{< hover "Java Runtime Environments" >}}JREs{{< /hover >}} for many different underlying architectures.

[^2]: As an interesting side note, there are even some SoCs that can run Java bytecode natively.

Java as a programming language lands squarely in the object-oriented camp,
but makes several critical departures from C and friends to make itself easier to port to different architectures:

- No exposed pointers
- No explicit free/destroy
### No exposed pointers

Java as a language refuses to expose raw pointers to the programmer. It does this to make porting the JVM to
different architectures with potentially wildly different memory architectures much easier, as the programmer can never
refer to an object by address or perform pointer arithmetic. This means that the particular address of any particular
object can change at any time. This makes garbage collection significantly simpler, as the GC algorithm
can move live data whenever it needs to, which in turn makes memory compaction much easier.
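Here is a rough sketch of why hiding addresses makes compaction safe. It is a toy model under loud assumptions: "memory" is just an array of slots, and the program only ever holds a handle (an index into a lookup table), never a slot number. The class and field names are invented for this example, and as far as we know modern collectors rewrite the references themselves rather than going through a table like this. The principle is the same, though: the program never sees an address, so the runtime is free to change it.

```java
public class ToyCompaction {
    // "Memory" (hypothetical): each slot holds an object's payload, or null if free.
    final String[] slots;
    // Handle table: handle -> current slot. Program code only ever sees handles.
    final int[] slotOfHandle;

    ToyCompaction(String[] slots, int[] slotOfHandle) {
        this.slots = slots;
        this.slotOfHandle = slotOfHandle;
    }

    /** What the "program" does: look an object up by handle, never by address. */
    String read(int handle) {
        return slots[slotOfHandle[handle]];
    }

    /** Slide live objects to the front of memory, updating the handle table as we go. */
    void compact() {
        int next = 0; // next free position at the front
        for (int slot = 0; slot < slots.length; slot++) {
            if (slots[slot] == null) continue;   // free slot, nothing to move
            if (slot != next) {
                slots[next] = slots[slot];
                slots[slot] = null;
                for (int h = 0; h < slotOfHandle.length; h++) {
                    if (slotOfHandle[h] == slot) slotOfHandle[h] = next;
                }
            }
            next++;
        }
    }

    /** A fragmented heap: three live objects with gaps between them. */
    static ToyCompaction fragmented() {
        return new ToyCompaction(
            new String[] {"A", null, "B", null, "C"},
            new int[] {0, 2, 4});
    }

    public static void main(String[] args) {
        ToyCompaction heap = fragmented();
        System.out.println("before: handle 1 -> " + heap.read(1));
        heap.compact();
        System.out.println("after:  handle 1 -> " + heap.read(1));
    }
}
```

Handle 1 reads the same object before and after `compact()`, even though that object moved from slot 2 to slot 1. If the program had been holding the raw slot number instead, the move would have silently broken it.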
### No explicit free/destroy

The JVM does not provide any mechanism to explicitly free or destroy an object.
Even `System.gc()` is nothing more than a suggestion.

This is important as it takes a lot of control away from the programmer and grants it to the runtime instead.
The person running the program can determine the GC algorithm, how often it runs, what its runtime targets are,
and so on, based on the particular performance characteristics that they need. One could argue that this could
equally be done by the programmer or designer of the program, but in doing this, the JVM again gains a lot of portability,
as it means that the sole arbiter of when an object is freed is the runtime.
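If you'd like to see the "suggestion" part for yourself, the small sketch below drops the only strong reference to an object, asks for a collection, and then checks a `WeakReference` to see whether the object is actually gone. The class and method names are ours; `System.gc()` and `java.lang.ref.WeakReference` are standard library. We deliberately don't show an expected output for the first check, because the whole point is that the runtime decides.

```java
import java.lang.ref.WeakReference;

public class GcIsASuggestion {
    /** Drops the only strong reference, requests a GC, and reports whether the object was collected. */
    static boolean collectedAfterRequest() {
        Object garbage = new Object();
        WeakReference<Object> watcher = new WeakReference<>(garbage);
        garbage = null;   // no strong references remain; the object is now eligible
        System.gc();      // a request, not a command
        return watcher.get() == null;
    }

    /** For contrast: an object we still hold on to must survive the same request. */
    static boolean survivesWhileReferenced() {
        Object keep = new Object();
        WeakReference<Object> watcher = new WeakReference<>(keep);
        System.gc();
        return watcher.get() == keep; // 'keep' is still in use here, so it cannot be collected
    }

    public static void main(String[] args) {
        System.out.println("unreferenced object collected: " + collectedAfterRequest());
        System.out.println("referenced object survived:    " + survivesWhileReferenced());
    }
}
```

On many setups the first line will happen to print `true`, but nothing guarantees it. Run the same program with `-XX:+DisableExplicitGC` (one of the flags in the list later in this post) and the request is ignored outright, so the first result will most likely be `false`.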
## How does this relate to Minecraft?

> **(Doll)** But Miss Ashe! How does this help us get all the dolls on one biiiig Minecraft Server?

I'm glad you asked!

An interesting note here is that *because* the JVM abstracts away memory management from the programmer, it is up to
the deployment to choose the correct GC and parameters to ensure that their program performs the way that it is
intended to. Many programs ship with relatively sensible defaults; however, Minecraft (and especially modded Minecraft)
was notoriously bad about this. There was some folk wisdom back in the day that suggested using
the {{< hover "Concurrent-Mark-and-Sweep" >}}CMS{{< /hover >}} garbage collector alongside some sensible,
but unexplained, parameters that seemed to perform well, but the forum post it came from seems to have either been
lost to bit-rot, or is too well-hidden for our google-fu.

In more modern times, various launchers have converged on some relatively sane defaults, using CMS at first
(as it performed best with Minecraft's workload), and moving to G1GC more recently, as Minecraft itself moved to using
JDK versions where CMS was deprecated (or even gone!).

PolyMC currently ships the following on all of their servers:
```bash
# Xmx and Xms set the maximum and minimum RAM usage, respectively.
# They can take any number, followed by an M or a G.
# M means Megabyte, G means Gigabyte.
# For example, to set the maximum to 3GB: -Xmx3G
# To set the minimum to 2.5GB: -Xms2500M
# A good default for a modded server is 4GB.

-Xms4G
-Xmx6G
-XX:+UseG1GC
-XX:+ParallelRefProcEnabled
-XX:MaxGCPauseMillis=200
-XX:+UnlockExperimentalVMOptions
-XX:+DisableExplicitGC
-XX:+AlwaysPreTouch
-XX:G1NewSizePercent=30
-XX:G1MaxNewSizePercent=40
-XX:G1HeapRegionSize=8M
-XX:G1ReservePercent=20
-XX:G1HeapWastePercent=5
-XX:G1MixedGCCountTarget=4
-XX:InitiatingHeapOccupancyPercent=15
-XX:G1MixedGCLiveThresholdPercent=90
-XX:G1RSetUpdatingPauseTimePercent=5
-XX:SurvivorRatio=32
-XX:+PerfDisableSharedMem
-XX:MaxTenuringThreshold=1
```
If this is all Greek to you, don't worry! We'll go over this in our next post and see where we can go from here.

Suffice it to say there's plenty of room for improvement here without degrading performance, especially if your
workload differs significantly from what these defaults assume.
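Before tuning anything, it helps to confirm which collector your JVM is actually running and which flags it was started with, since launchers and wrapper scripts can quietly add their own. The sketch below is one way to check, using the standard `java.lang.management` API; the class name `WhichGc` is ours. The collector names it prints are implementation-specific strings (on HotSpot with G1 you would typically see names such as "G1 Young Generation" and "G1 Old Generation"), so treat them as a hint rather than a stable identifier.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class WhichGc {
    /** Names of the collectors the running JVM reports (implementation-specific strings). */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** The flags this JVM was started with, e.g. -Xmx6G or -XX:+UseG1GC. */
    static List<String> jvmFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("collectors: " + collectorNames());
        System.out.println("JVM flags:  " + jvmFlags());
    }
}
```

On Java 11 or newer you can run the file directly, for example `java -XX:+UseG1GC WhichGc.java`, and then swap the flag to see the reported collectors change.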
## Next Time!

This is probably a good place to call it for today. While we haven't gotten into actually tuning Minecraft GC,
I think this forms a good groundwork for understanding why we're going to make certain decisions down the line.

We've gone over how and why the JVM manages memory, so next time we'll focus on the various GCs that modern JREs
ship with, and what their various advantages and disadvantages are. If you'd like a head-start,
the [Oracle GC-Tuning Documentation](https://docs.oracle.com/en/java/javase/17/gctuning/preface.html)
is extremely thorough!

> **(Tammy)** Okay, I managed to keep up with all of that, nice work.
>
> **(Tess)** Was there ever any doubt?