diff --git a/content/posts/server-build.md b/content/posts/server-build.md
index 55342fc..bb3b1c3 100644
--- a/content/posts/server-build.md
+++ b/content/posts/server-build.md
@@ -59,7 +59,7 @@ Now that we have a spec, we need to figure out the best way to fulfill it.
 ### The Case
 This would normally be something to consider carefully, but a good friend was getting rid of a 3U server case with 12 3.5" bays,
 so we'll nick that off of him, and move on. He also donated us a few
-[Arctic P12s](https://www.arctic.de/en/P12/ACFAN00118A) for cooling as they'll be quieter than the jet engines
+[Arctic P12s](https://www.arctic.de/en/P12/ACFAN00118A) for cooling as they'd be quieter than the jet engines
 that came with the case.
 
 ### The PSU
@@ -70,7 +70,7 @@ It's probably strange to start with the storage, but we'll need to know how much
 the ZFS instance properly.
 
 #### The Goal
-The goal is to have a scaleable template that can be expanded in sets without breaking the bank each time,
+The goal is to have a scaleable template that can be expanded without breaking the bank each time,
 while maintaining favourable failure characteristics. We'll budget out 3000 AUD for each expansion.
 
 **(Doll)** That sounds like a lot, Miss...
@@ -107,9 +107,10 @@ With 11 usable slices our best bet would probably be 8/3/1 (8 data, 3 parity, 1
 we'd probably want to go for either 7/3/2, or 8/2/2.
 
 7/3/2 is certainly the safest set-up, as it gives us 3 parity slices and two hot-spares, potentially allowing for
-5 HDD failures before data-loss occurs. However this is over-conservative for the pool that we'll be building out initially.
-This pool won't be the only storage location for critical data, so we can afford to be a little more aggressive
-with our dRAID setup, so we're left with 8/2/2, or 8/3/1.
+5 HDD failures before data-loss occurs. However this is over-conservative for the pool that we'll be building out.
+This pool won't be used for critical data, so we can afford to be a little more aggressive
+with our dRAID setup.
+That leaves us with 8/2/2, or 8/3/1.
 
 These can both tolerate 4 sequential drive failures, however 8/2/2 has a slightly higher write and resilver speed
 at the cost of tolerating less concurrent failures (2 at a time vs 3 at a time).
@@ -118,13 +119,14 @@ With how large spinning rust drives can get these days (18 - 20 TB each), two pa
 we'll go with 8/3/1 as our template.
 
 #### ZIL/L2ARC
-We'll also need a ZIL and L2ARC device, which we can use some SSDs for.
-We can also use these drives as a read cache. The recommendation is two use two mirrored drives for this
-for resilience reasons, so we'll do that too. We'll want something with relatively high IOPS for this, as well as
-good random IO speeds.
+We'll also need a ZIL and L2ARC device. Sun's old documentation recommends mirroring this device for resilience and
+reliability reasons. That seems entirely reasonable, so let's do that, too.
+We can also use these drives as a read cache to improve latency on popular data.
+We'll want something with relatively high IOPS for this, as well as good random IO speeds.
 In theory, we could go hog wild and get Intel Optane and never worry about it again, but that's... prohibitively expensive
 to put it lightly.
-Again, balancing cost and performance, 2 ✕ 500GB Seagate FireCuda 530s does the job well. Relatively high endurance,
+
+Balancing cost and performance, 2 ✕ 500GB Seagate FireCuda 530s does the job well. Relatively high endurance,
 good random IO performance, and not too expensive. Two of these sets us back 600 AUD.
 
 #### Final Storage Configuration
@@ -135,6 +137,7 @@ With 8 data slices, 3 parity slices, and 1 hot spare slice, we should end up wit
 **(Ashe)** It is worth mentioning here that dRAID does not allow for variable stripe width due to how sequential resilvering works,
 so compression ratios and real vs on-disk utilisation may suffer.
 {{< /alert >}}
+
 ### The CPU
 We're going to make an executive decision and go with an AMD EPYC CPU, because we've always wanted to use one.
 That said, we have a few options:
@@ -143,7 +146,7 @@ That said, we have a few options:
 | -------- |:----------:|:---------:|:---------------:|:-------------:|:-------------:|:-------:|
 | 7272 | Rome | 625 | 12 (24) | 2.9 - 3.2 | 64 | 120 |
 | 7302 | Rome | 978 | 16 (32) | 3.0 - 3.3 | 128 | 155 |
-| **7352** | **Rome** | **1350** | **24 (48)** | **2.3 - 3.2** | **128** | **155** |
+| **7352** | **Rome** | **1350** | **24 (48)** | **2.3 - 3.2** | **128** | **155** |
 | 7402 | Rome | 1783 | 24 (48) | 2.8 - 3.35 | 128 | 180 |
 | 7F72 | Rome | 2450 | 24 (48) | 3.5 - 3.9 | 192 | 240 |
 | 7452 | Rome | 2025 | 32 (64) | 2.35 - 3.35 | 128 | 155 |
@@ -151,11 +154,11 @@ That said, we have a few options:
 | 7413 | Milan | 1825 | 24 (48) | 2.85 - 4.00 | 128 | 200 |
 | 7513 | Milan | 2840 | 32 (64) | 2.6 - 3.65 | 128 | 200 |
 
-The odd one out here is the `7F72`, which is a frequency-optimised model, designed for maximum performance per core,
+The odd one out here is the `7F72`, which is a frequency-optimised model, designed for maximum performance per core
 to get around per-core licensing issues in enterprise applications. While cool, it being nearly double the price of the
 comparable `7352` puts it outside our budget for this particular build.
 
-Balancing Price and Performance, we've decided to go with a `AMD EPYC 7352`, as 24/48 exceeds our spec, and doesn't break
+Again, balancing price and performance, we've decided to go with a `AMD EPYC 7352`, as 24/48 exceeds our spec, and doesn't break
 the bank while doing so. We miss out on some of the performance of the Milan line, but that's acceptable here.
 The SP3 socket also allows us to upgrade to Milan(-X) down the line if we need more performance (with a BIOS update).
 
@@ -164,7 +167,7 @@ Shipped to a friend, this sets us back ~1500 AUD.
 ### The Motherboard
 With our CPU chosen, we need a motherboard that fulfills our purposes.
 
-Of the options, we are looking for something with an
+Of the readily-available options, we are looking for something with an
 IPMI/BMC, and dual Ethernet interfaces onboard, as our data port requirement can be fulfilled by a PCIe network card.
 
 8 SATA ports under one SATA controller would be nice, as it makes configuring passthrough for ZFS easier, but is not
@@ -177,7 +180,16 @@ The `AsRock Rack ROMED8U-2T` serves our purposes perfectly:
 - [X]   2 ✕ 10GbE + 1 ✕ IPMI
 
 We don't *need* 10GbE for our management network, but this allows us to build this out with 2 ✕ 10GbE for data, and
-upgrading to 2 ✕ 40GbE later, which may be what we end up doing.
+upgrade to 2 ✕ 40GbE later, which may be what we end up doing. Notably, this leaves us without separate management and
+data interfaces.
+
+**(Selene)** Is that a problem?
+
+**(Ashe)** Not really, no. It makes securing the network *slightly* trickier as we can't airgap the management network,
+but it's not the end of the world. We intend to get a PCIe Ethernet card at some point anyway, so this'll
+be temporary.
+
+**(Octavia)** Famous last words.
 
 New from Newegg, this sets us back ~1100 AUD.
 
@@ -191,27 +203,27 @@ There are two ways we can estimate how much RAM we'll need.
 1. Foundry Server
     * 8GB will be plenty for this
 2. Chibisafe
-    * Chibisafe prides itself on running slim, so we should be able to go down to even 1 or 2 GB
+    * Chibisafe prides itself on running lean, so we should be able to go down to as low as 1 or 2 GB
 3. ZFS Storage Cluster
     * The extremely conservative guideline (as published by Sun back in the day) is 1GB RAM per TB of storage.
-      This guideline was published with the idea that at the level you should *never* encounter any bottlenecks or issues.
+      This guideline was published with the idea that at this level you should *never* encounter any bottlenecks or issues.
     * We do not need such a strict performance guarantee.
     * We should be able to halve or even quarter this and not encounter bottlenecks.
     * We'll initially provision 32GB, and adjust as necessary.
 4. Modded Minecraft Server
     * 16 - 32 GB is the rough ballpark for good performance with 8 - 16 players
-    * This is likely overkill for this, so we can dial it back to 12GB with some GC tuning on the server end
+    * This is likely overkill for our server, so we can dial it back to 12GB with some GC tuning on the server end
 
 So that totals up to 8 + 2 + 32 + 12 = 54 GB. We want to allow room for growth and for friends to start their own VMs,
 so the next logical stepping stones are 64 or 128 GB.
 
 #### Total Required RAM
-We have 48 vCPUs in our current setup (with no overcommit, but more on that later).
+We have 48 vCPUs in our current setup (with no overcommit, but more on that in the next blog post).
 Very broadly, most environments that we've had exposure to allocate approximately 4GB/vCPU, and adjust based on how
 CPU-hungry or memory-hungry a particular workload is.
 
-Since we're expecting mixed workloads (from friends), we'll follow the same guideline of 4GB RAM per vCPU,
-so we'll need at least 4 ✕ 48 = 192GB. With 8 slots on the mobo,
+Since we're expecting mixed workloads (from friends), we'll follow that same guideline.
+We'll need at least 4 ✕ 48 = 192GB. With 8 slots on the mobo,
 that means we'll have to use 32GB modules (or 64GB modules if we feel like going overboard).
 
 Our motherboard comes with 8 RAM slots and the Rome EPYC CPUs support octo-channel RAM, so we'd get