---
title: "understanding thread stack sizes and how alpine is different"
date: "2021-06-25"
---

From time to time, somebody reports a bug to some project about their program crashing on Alpine. Usually, one of two things happens: the developer doesn't care and doesn't fix the issue, because it works under GNU/Linux, or the developer fixes their program to behave correctly _only_ for the Alpine case, and it remains silently broken on other platforms.

## The Default Thread Stack Size

In general, it is my opinion that if your program is crashing on Alpine, it is because your program is dependent on behavior that is not guaranteed to actually exist, which means your program is not actually portable. When it comes to this kind of dependency, the typical issue has to do with the _thread stack size limit_.

You might be wondering: what is a thread stack, anyway? The answer, of course, is quite simple: each thread has its own stack memory, because it's not really feasible for multiple threads to use the same stack memory. On most platforms the size of that memory is much smaller than the main thread's stack, though programmers are not necessarily aware of that discontinuity.

Here is a table of common `x86_64` platforms and their default stack sizes for the main thread (process) and child threads:

| OS | Process Stack Size | Thread Stack Size |
| --- | --- | --- |
| Darwin (macOS, iOS, etc) | 8 MiB | 512 KiB |
| FreeBSD | 8 MiB | 2 MiB |
| OpenBSD (before 4.6) | 8 MiB | **64 KiB** |
| OpenBSD (4.6 and later) | 8 MiB | 512 KiB |
| Windows | 1 MiB | 1 MiB |
| Alpine 3.10 and older | 8 MiB | 80 KiB |
| Alpine 3.11 and newer | 8 MiB | 128 KiB |
| GNU/Linux | 8 MiB | **8 MiB** |

I've highlighted the OpenBSD and GNU/Linux default thread stack sizes because they represent the smallest and largest default thread stack sizes in this table.
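If you want to see which row of the table applies to the system in front of you, the defaults can be queried at runtime. The following is a minimal sketch, assuming a POSIX system with `pthread_attr_getstacksize` and `getrlimit`. The helper names `default_thread_stack_size` and `process_stack_soft_limit` are made up for this illustration.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: returns the stack size a freshly initialized
 * pthread_attr_t reports, which is the default used for new threads,
 * or 0 if the query fails. */
size_t default_thread_stack_size(void) {
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Hypothetical helper: returns the soft RLIMIT_STACK limit for the
 * process (main thread) in bytes, or 0 if the query fails.
 * Note that the limit may be RLIM_INFINITY on some configurations. */
unsigned long long process_stack_soft_limit(void) {
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return 0;
    return (unsigned long long) rl.rlim_cur;
}
```

Printing both values from a small `main` (build with `-pthread`) on a GNU/Linux system and on Alpine should make the difference between the two thread defaults visible. The exact numbers depend on the system's configuration.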
Because the Linux kernel has overcommit mode, GNU/Linux systems use 8 MiB by default, which leads to a potential problem when running code developed against GNU/Linux on other systems. As most threads only need a small amount of stack memory, other platforms use smaller limits, such as older OpenBSD releases using only 64 KiB and Alpine using at most 128 KiB by default. This leads to crashes in code which assumes a full 8 MiB is available for each thread to use.

If you find yourself debugging a weird crash that doesn't make sense, and your application is multi-threaded, there is a good chance that you're exhausting the thread stack limit.

## What can I do about it?

To fix the issue, you will need to either change the way your program is written, or change the way it is compiled. There are a few options you can take to fix the problem, depending on how much time you're willing to spend. In most cases, these sorts of crashes are caused by attempting to manipulate a large variable which is stored on the stack. Generally, moving the variable off the stack is the best way to fix the issue, but there are alternative options.

### Moving the variable off the stack

Let's say that the code has a large array that is stored on the stack, which causes the stack exhaustion issue. In this case, the easiest solution is to move it off the stack. There are two main approaches you can use to do this: _thread-local storage_ and _heap storage_. Thread-local storage is a way to reserve additional memory for thread variables; think of it like `static`, but with a separate copy for each thread. Heap storage is what you're working with when you use `malloc` and `free`.

To illustrate the example, we will adjust this code to use both kinds of storage:

```c
#include <string.h>

void some_function(void) {
    char scratchpad[500000];

    memset(scratchpad, 'A', sizeof scratchpad);
}
```

Thread-local variables are declared with the `thread_local` keyword. You must include `threads.h` in order to use it, and inside a function C also requires the variable to be declared `static`:

```c
#include <string.h>
#include <threads.h>

void some_function(void) {
    /* block-scope thread_local must be combined with static in C */
    static thread_local char scratchpad[500000];

    memset(scratchpad, 'A', sizeof scratchpad);
}
```

You can also use the heap. The most portable example would be the obvious one:

```c
#include <stdlib.h>
#include <string.h>

const size_t scratchpad_size = 500000;

void some_function(void) {
    char *scratchpad = calloc(1, scratchpad_size);

    /* allocation can fail, so check before writing */
    if (scratchpad == NULL)
        return;

    memset(scratchpad, 'A', scratchpad_size);

    free(scratchpad);
}
```

However, if you don't mind sacrificing portability outside `gcc` and `clang`, you can use the `cleanup` attribute. The cleanup handler is passed a pointer to the variable rather than the variable's value, so `free` needs a small wrapper (called `free_ptr` here):

```c
#include <stdlib.h>
#include <string.h>

/* cleanup handlers receive a pointer to the variable, not its value */
static void free_ptr(void *p) {
    free(*(void **) p);
}

#define autofree __attribute__((cleanup(free_ptr)))

const size_t scratchpad_size = 500000;

void some_function(void) {
    autofree char *scratchpad = calloc(1, scratchpad_size);

    if (scratchpad == NULL)
        return;

    memset(scratchpad, 'A', scratchpad_size);
}
```

This is probably the best way to fix code like this if you're not targeting compilers like the Microsoft one.

### Adjusting the thread stack size at runtime

`pthread_create` takes an optional `pthread_attr_t` pointer as the second parameter. This can be used to set an alternate stack size for the thread at runtime. Thread entry points must take and return `void *`, so `some_function` is called through a small wrapper here:

```c
#include <pthread.h>

void some_function(void);

pthread_t worker_thread;

/* pthread entry points take and return void *, so wrap some_function() */
static void *worker_main(void *arg) {
    (void) arg;
    some_function();
    return NULL;
}

void launch_worker(void) {
    pthread_attr_t attr;

    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, 1024768);

    pthread_create(&worker_thread, &attr, worker_main, NULL);
    pthread_attr_destroy(&attr);
}
```

Because the stack size is set in the attributes passed to `pthread_create`, the child thread will have a larger stack. Some systems reject stack sizes that are not a multiple of the page size, so in real code it is worth checking the return value of `pthread_attr_setstacksize`.

### Adjusting the stack size at link time

In modern Alpine systems, since 2018, it is possible to set the default thread stack size at link time. This can be done with a special `LDFLAGS` flag, like `-Wl,-z,stack-size=1024768`.

You can also use tools like [chelf](https://github.com/Gottox/chelf) or [muslstack](https://github.com/yaegashi/muslstack) to patch pre-built binaries to use a larger stack, but this shouldn't be done inside Alpine packaging, for example.

Hopefully, this article is helpful for those looking to learn how to solve the stack size issue.
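If you want to try the runtime approach end to end, here is a minimal sketch that puts the 500000-byte scratchpad back on the stack and runs it on a thread with an explicitly sized stack. The names `big_stack_worker` and `run_with_stack` are made up for this illustration, and every call is checked so that a rejected stack size shows up as an error code instead of a crash.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical worker: fills a large on-stack scratchpad, as in the
 * article's example, and reports its last byte back through arg. */
static void *big_stack_worker(void *arg) {
    char scratchpad[500000];

    memset(scratchpad, 'A', sizeof scratchpad);
    *(char *) arg = scratchpad[sizeof scratchpad - 1];
    return NULL;
}

/* Runs big_stack_worker on a thread with the requested stack size.
 * Returns 0 on success, or the error code of the failing pthread call. */
int run_with_stack(size_t stack_size, char *result) {
    pthread_attr_t attr;
    pthread_t thread;
    int err;

    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    err = pthread_attr_setstacksize(&attr, stack_size);
    if (err == 0)
        err = pthread_create(&thread, &attr, big_stack_worker, result);
    pthread_attr_destroy(&attr);
    if (err != 0)
        return err;

    /* Wait for the worker so result is valid when we return. */
    return pthread_join(thread, NULL);
}
```

Calling `run_with_stack(1024 * 1024, &out)` asks for a 1 MiB stack, which is a multiple of the usual page sizes and leaves plenty of room for the scratchpad. Build with `-pthread`.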