From c963ef30c35c5a0f95dfff10b8fc598ba57eb63d Mon Sep 17 00:00:00 2001
From: Rin
Date: Sat, 28 Jan 2023 19:55:05 +1100
Subject: [PATCH] add first GC post

---
 content/posts/ | 212 +++++++++++++++++++++++++++++++++
 1 file changed, 212 insertions(+)
 create mode 100644 content/posts/

diff --git a/content/posts/ b/content/posts/
new file mode 100644
index 0000000..3e6c7d2
--- /dev/null
+++ b/content/posts/
@@ -0,0 +1,212 @@
+---
+title: "What is GC? How does it work in JVM?"
+date: 2023-01-22T22:22:22+11:00
+draft: true
+showSummary: true
+summary: "Tuning Garbage Collection on Minecraft servers is something of an arcane art that few people seem to understand
+at a fundamental level. Let's remedy that. We'll start with the fundamentals of GC and the JVM, and use them to form the
+foundations of our GC choice and how we tune it."
+series:
+  - "JVM, GC, and Minecraft"
+series_order: 1
+---
+
+# Part One: What is GC? How does it work?
+
+> **(Tammy)** To preempt any fears you may have, Ashe has assured us that this will be written at a level where someone
+> who just wants to run their Minecraft server well will understand.
+>
+> **(Ashe)** And for those of you who do have a technical understanding of Java, and/or sysadmin, I hope this deepens
+> your understanding of the topics at hand.
+>
+> **(Doll)** Will this one be able to understand?
+>
+> **(Ashe)** That's the idea, yeah.
+>
+> **(Doll)** YAY!
+
+## What is GC?
+
+GC is an initialism for Garbage Collection, and refers to the process of a program (or operating system, or anything
+else) freeing unused memory for re-use. We don't need to go super in-depth here,
+except to say that the garbage collection algorithm and mechanism determine when and how garbage collection is done,
+and what limitations there are to GC.
+
+## Why do I care?
+
+The choice of garbage collection algorithm determines a lot about how the underlying program will behave.
+Broadly, there are three categories of garbage collection algorithms:
+
+- Stop-the-world
+- Concurrent
+- Hybrid
+
+### Stop-the-world garbage collection
+
+{{< hover "STW" >}} Stop-the-world {{< /hover >}} GC algorithms, as the name suggests, halt program execution while
+freeing memory. Often, they will also compact live objects, as there's no risk of a reference changing
+while program execution is paused.
+
+The Serial GC in the JVM (which is the default in many implementations) is an example of a stop-the-world GC algorithm.
+
+{{< hover "Stop-the-world" >}}STW{{< /hover >}} garbage collectors are ideal when short pauses in program execution
+are acceptable and the running program should have the lion's share of resources.
+
+### Concurrent garbage collection
+
+Concurrent garbage collectors *do not* pause program execution while doing clean-up, and as such run on a thread
+parallel to program execution. Concurrent algorithms cannot make the same assumptions as stop-the-world algorithms
+about program execution, so they must go to much more effort to ensure that a reference is stale or dead before
+cleaning it up.
+They therefore tend to be much more CPU and RAM intensive, in exchange for never forcing the program to stop.
+
+Concurrent GC algorithms are ideal when program pauses are unacceptable (such as in realtime or networked applications),
+and when some inefficiency in CPU and RAM usage is an acceptable price to pay for this.
+
+The {{< hover "Z Garbage Collector" >}}ZGC{{< /hover >}} in Java is an example of a fully concurrent garbage collector.
+
+### Hybrid
+
+Hybrid garbage collectors are, as the name suggests, some hybrid of the above.
+They will do some of their work concurrently, but do require {{< hover "Stop-the-world" >}}STW{{< /hover >}}
+pauses to perform some of their functions.
+
+The Concurrent-Mark-and-Sweep collector (deprecated since JDK 9, removed in JDK 14) is an example of a hybrid GC.
+It does most of its marking and sweeping concurrently, using a small number of the available threads,
+and stops the world only for short marking phases (or for a full, compacting collection if it falls behind).
+
+The {{< hover "Garbage-First Garbage Collector" >}}G1GC{{< /hover >}}, which replaced the Concurrent-Mark-and-Sweep
+collector, is another example.
+
+### A secret fourth thing
+
+> **(Tammy)** Wait. Could you... simply not?
+>
+> **(Ashe)** Good catch. Yes! It's even useful sometimes!
+
+You could simply *not* collect garbage and allow the program to continue using more and more memory until
+it either terminates, or inevitably runs out of resources and crashes.
+
+> **(Lorelai)** You only need to last long enough to fulfill your purpose. I can see the appeal.
+
+Right. The appeal here is that you get the full processing power of the machine, *and* you never have to stop the world
+to collect garbage; it's the best of both worlds. However, this also means that you're either relying on the programmer
+to clean up their own garbage (in languages that allow this), or simply allowing the program to use more
+and more memory until the host runs out of available memory and the program crashes.
+For our purposes today, it is worth noting that disabling the garbage collector in the JVM means that
+no garbage collection will be done, and, as such, no memory will ever be freed,
+since Java lacks explicit destruction of objects.
+
+But Lorelai has the right idea. This is the approach used by some stock-trading software,
+where any delay is unacceptable,
+and the software has a bounded lifecycle. It needs to run for 8 hours every weekday. As such, so long as it doesn't crash
+within those 8 hours, it can do whatever.
+
+> **(Ashe)** As an aside, this is how we came across the phrase "Giant mainframes with 256GB of RAM running JVM 8,
+crashing every 9 hours, all according to keikaku[^1]"
+
+[^1]: Keikaku means plan.
+
+## The JVM and Garbage Collection
+
+To understand the particularities of the various GCs available in modern Java, we'll have to take a small digression into
+how the Java Virtual Machine itself works. The Java Virtual Machine is precisely that: a virtual machine.
+It has its own instruction set, memory management, everything. It is in most ways a full architecture[^2].
+As such, the Java compiler compiles Java source code into JVM bytecode, which the virtual machine then executes.
+Java achieves its ability to "run on anything, including toasters" by publishing
+{{< hover "Java Runtime Environments" >}}JREs{{< /hover >}} for many different underlying architectures.
+
+[^2]: As an interesting side note, there are even some SoCs that can run Java bytecode natively.
+
+Java as a programming language lands squarely in the Object-Orientated camp,
+but makes a couple of critical departures from C and friends to make itself easier to port to different architectures:
+
+- No exposed pointers
+- No explicit free/destroy
+
+### No exposed pointers
+
+Java as a language refuses to expose raw pointers to the programmer. It does this to make porting the JVM to
+different architectures with potentially wildly different memory architectures much easier, as the programmer can never
+refer to a pointer by address or perform pointer arithmetic. This means that the particular address of any particular
+object can change at any time. This makes garbage collection significantly easier, as it means that the GC algorithm
+can move live data at any time, which in turn makes memory compaction much simpler.
+
+### No explicit free/destroy
+
+The JVM does not provide any mechanism to explicitly free or destroy an object.
+Even `System.gc()` is nothing more than a suggestion.
+
+This is important as it takes a lot of control away from the programmer and grants it to the runtime instead.
+The person running the program can determine the GC algorithm, how often it runs, what its runtime targets are,
+and so on, based on the particular performance characteristics that they need. One could argue that this could
+equally be done by the programmer or designer of the program, but in doing this, the JVM again gains a lot of portability,
+as it means that the sole arbiter of when an object is freed is the runtime.
+
+## How does this relate to Minecraft?
+
+> **(Doll)** But Miss Ashe! How does this help us get all the dolls on one biiiig Minecraft Server?
+
+I'm glad you asked!
+
+An interesting note here is that *because* the JVM abstracts away memory management from the programmer, it is up to
+whoever deploys the program to choose the correct GC and parameters to ensure that it performs the way that it is
+intended to. Many programs ship with relatively sensible defaults; however, Minecraft (and especially modded Minecraft)
+was notoriously bad about this. There was some folk wisdom back in the day that suggested using
+the {{< hover "Concurrent-Mark-and-Sweep" >}}CMS{{< /hover >}} garbage collector alongside some sensible,
+but unexplained, parameters that seemed to perform well, but the forum post it came from has either been lost to bit-rot,
+or is too well-hidden for our Google-fu.
+
+In more modern times, various launchers have converged on some relatively sane defaults, using CMS at first
+(as it performed best with Minecraft's workload), and moving to G1GC more recently, as Minecraft itself moved to using
+JDK versions where CMS was deprecated (or even gone!).
+
+PolyMC currently ships the following on all of their servers:
+```bash
+# Xmx and Xms set the maximum and minimum RAM usage, respectively.
+# They can take any number, followed by an M or a G.
+# M means Megabyte, G means Gigabyte.
+# For example, to set the maximum to 3GB: -Xmx3G
+# To set the minimum to 2.5GB: -Xms2500M
+# A good default for a modded server is 4GB.
+
+-Xms4G
+-Xmx6G
+-XX:+UseG1GC
+-XX:+ParallelRefProcEnabled
+-XX:MaxGCPauseMillis=200
+-XX:+UnlockExperimentalVMOptions
+-XX:+DisableExplicitGC
+-XX:+AlwaysPreTouch
+-XX:G1NewSizePercent=30
+-XX:G1MaxNewSizePercent=40
+-XX:G1HeapRegionSize=8M
+-XX:G1ReservePercent=20
+-XX:G1HeapWastePercent=5
+-XX:G1MixedGCCountTarget=4
+-XX:InitiatingHeapOccupancyPercent=15
+-XX:G1MixedGCLiveThresholdPercent=90
+-XX:G1RSetUpdatingPauseTimePercent=5
+-XX:SurvivorRatio=32
+-XX:+PerfDisableSharedMem
+-XX:MaxTenuringThreshold=1
+```
+
+If this is all Greek to you, don't worry! We'll go over this in our next post and see where we can go from here.
+
+Suffice it to say there's plenty of room for improvement here without degrading performance, especially if your
+workload differs significantly from what these defaults assume.
+
+## Next Time!
+
+This is probably a good place to call it for today. While we haven't gotten into actually tuning Minecraft GC,
+I think this lays good groundwork for understanding why we're going to make certain decisions down the line.
+
+We've gone over how and why the JVM manages memory; next time, we'll focus on the various GCs that modern JREs
+ship with, and what their various advantages and disadvantages are. If you'd like a head-start,
+the [Oracle GC-Tuning Documentation](
+is extremely thorough!
+
+> **(Tammy)** Okay, I managed to keep up with all of that, nice work.
+>
+> **(Tess)** Was there ever any doubt?
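When experimenting with a flag list like the PolyMC one above, it can help to confirm which collector and heap limit the JVM actually ended up with. What follows is a minimal sketch, assuming a stock HotSpot-style JVM. `GcReport`, `collectorNames`, and `toMiB` are made-up names for illustration; `ManagementFactory.getGarbageCollectorMXBeans()` and the `Runtime` memory methods are standard library calls.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: reports what the running JVM is using.
public class GcReport {

    // Names of the collectors the running JVM has registered.
    // The exact strings depend on the JVM version and the chosen GC.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Converts a byte count to whole mebibytes for display.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Collectors:   " + collectorNames());
        // maxMemory() reflects the heap ceiling, which -Xmx controls.
        System.out.println("Max heap:     " + toMiB(rt.maxMemory()) + " MiB");
        // totalMemory() is the heap currently reserved; -Xms sets its starting size.
        System.out.println("Current heap: " + toMiB(rt.totalMemory()) + " MiB");
    }
}
```

On JDK 11 or newer this can be run as a single file with the flags under test, for instance `java -XX:+UseG1GC -Xmx6G GcReport.java`. The reported numbers may differ slightly from the values passed on the command line.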