ARM64 fixes, Meson build system, and Maco-O support #22

Closed
osy wants to merge 13 commits from qemu-support into master
osy commented 2021-01-05 00:28:51 +00:00 (Migrated from github.com)

These changes are to get libucontext working with Apple Silicon as well as building as a subproject of QEMU. Additionally, there's a few bug fixes.

  • FETCH_LINKPTR was defined to use both register and memory output. But for x86, arm, and arm64, the MOV instruction only accepts register outputs. This causes an assembler error if the compiler decides to use a memory output. To be safe, I changed all the archs to only use register outputs.
  • On arm64 and arm, FETCH_LINKPTR had the wrong operands. Other than x86_64 (correct), I did not check any other archs.
  • _GNU_SOURCE may be defined by the build system (i.e. when building with QEMU) so I added a guard to it
  • makecontext should also be exported only when EXPORT_UNPREFIXED is defined
  • In Mach-O, function names are not automatically underscored when defined in assembly, so I added a macro to do it.
  • Clang's ARM64 assembler does not allow multiple directives to be placed on the same line (i.e. .global func ; .align 2 ; func:). This may be a bug but it means we cannot use preprocessor macros to output multiple directives as macros always expand to a single line regardless. Unfortunately this makes the arm64 functions out of place, so maybe there's a better solution (like a manual preprocess step that expands the newlines).
  • ARM64's bits.h uses size_t which requires stddef.h
  • Moved the location of all the bits.h headers so -isystem is not required. This allows Meson builds to work.
  • Meson doesn't support different CFLAGs for .S and .c files so moved stuff around to remove the need for -DLIBUCONTEXT_ASSEMBLY.
  • Created VERSION file to keep track of the version number which is used by both Make and Meson
  • Added Meson build scripts with support for building as a subproject. Docs are currently not built in meson.
  • ARM64 had incorrect types in bits.h leading to wrong offsets. I also added _Static_assert to verify that the offset matches what's in def.h. I would recommend this for the other archs as well.
  • The man page for makecontext says that each argument is int type. In aarch64, this means 32 bit ints in 64 bit registers as well as 32 bit ints in the stack.
These changes are to get libucontext working with Apple Silicon as well as building as a subproject of QEMU. Additionally, there's a few bug fixes. * `FETCH_LINKPTR` was defined to use both register and memory output. But for x86, arm, and arm64, the MOV instruction only accepts register outputs. This causes an assembler error if the compiler decides to use a memory output. To be safe, I changed all the archs to only use register outputs. * On arm64 and arm, `FETCH_LINKPTR` had the wrong operands. Other than x86_64 (correct), I did not check any other archs. * `_GNU_SOURCE` may be defined by the build system (i.e. when building with QEMU) so I added a guard to it * `makecontext` should also be exported only when `EXPORT_UNPREFIXED` is defined * In Mach-O, function names are not automatically underscored when defined in assembly, so I added a macro to do it. * Clang's ARM64 assembler does not allow multiple directives to be placed on the same line (i.e. `.global func ; .align 2 ; func:`). This may be a bug but it means we cannot use preprocessor macros to output multiple directives as macros always expand to a single line regardless. Unfortunately this makes the arm64 functions out of place, so maybe there's a better solution (like a manual preprocess step that expands the newlines). * ARM64's `bits.h` uses size_t which requires stddef.h * Moved the location of all the `bits.h` headers so `-isystem` is not required. This allows Meson builds to work. * Meson doesn't support different CFLAGs for .S and .c files so moved stuff around to remove the need for `-DLIBUCONTEXT_ASSEMBLY`. * Created `VERSION` file to keep track of the version number which is used by both Make and Meson * Added Meson build scripts with support for building as a subproject. Docs are currently not built in meson. * ARM64 had incorrect types in bits.h leading to wrong offsets. I also added `_Static_assert` to verify that the offset matches what's in def.h. I would recommend this for the other archs as well. * The man page for `makecontext` says that each argument is `int` type. In aarch64, this means 32 bit ints in 64 bit registers as well as 32 bit ints in the stack.
ariadne reviewed 2021-01-05 09:51:12 +00:00

I don't think we actually declare libucontext_sigset_t anywhere?

I don't think we actually declare `libucontext_sigset_t` anywhere?
@ -48,3 +57,3 @@
for (; i < argc; i++)
*sp++ = va_arg (va, unsigned long);
*sp++ = va_arg (va, int);

What happens if we pass a pointer as a stack argument to makecontext then?

What happens if we pass a pointer as a stack argument to makecontext then?
ariadne reviewed 2021-01-05 09:52:26 +00:00
ariadne left a comment

I think it would be better to just change the non-Meson build system to do -D_GNU_SOURCE on the commandline.

I think it would be better to just change the non-Meson build system to do -D_GNU_SOURCE on the commandline.
osy (Migrated from github.com) reviewed 2021-01-05 16:31:36 +00:00
@ -48,3 +57,3 @@
for (; i < argc; i++)
*sp++ = va_arg (va, unsigned long);
*sp++ = va_arg (va, int);
osy (Migrated from github.com) commented 2021-01-05 16:31:36 +00:00

Wouldn’t be allowed according to the man pages https://man7.org/linux/man-pages/man3/makecontext.3.html

When this context is later activated (using setcontext(3) or
swapcontext()) the function func is called, and passed the series
of integer (int) arguments that follow argc; the caller must
specify the number of these arguments in argc.

Wouldn’t be allowed according to the man pages https://man7.org/linux/man-pages/man3/makecontext.3.html > When this context is later activated (using setcontext(3) or > swapcontext()) the function func is called, and passed the series > of integer (int) arguments that follow argc; the caller must > specify the number of these arguments in argc.
osy (Migrated from github.com) reviewed 2021-01-05 16:34:55 +00:00
osy (Migrated from github.com) commented 2021-01-05 16:34:54 +00:00

I added it above but it’s not used and can easily be substituted. I just copy pasted the layout from some C library (forgot which).

I added it above but it’s not used and can easily be substituted. I just copy pasted the layout from some C library (forgot which).
osy commented 2021-01-05 16:40:47 +00:00 (Migrated from github.com)

I think it would be better to just change the non-Meson build system to do -D_GNU_SOURCE on the commandline.

~Sure but there’s still a problem where if it’s used as a sub project and it inherits cflags from the parent (i.e. qemu) then it gets defined but if built standalone with meason it needs to be defined.~

EDIT: nvm I just realized there’s no problem with having -D_GNU_SOURCE twice.

> I think it would be better to just change the non-Meson build system to do -D_GNU_SOURCE on the commandline. ~Sure but there’s still a problem where if it’s used as a sub project and it inherits cflags from the parent (i.e. qemu) then it gets defined but if built standalone with meason it needs to be defined.~ EDIT: nvm I just realized there’s no problem with having -D_GNU_SOURCE twice.
ariadne reviewed 2021-01-08 09:43:48 +00:00

Musl defines sigset_t as:

typedef struct __sigset_t { unsigned long __bits[128/sizeof(long)]; } sigset_t;

That works out to 16 bytes on aarch64.

Musl defines `sigset_t` as: ``` typedef struct __sigset_t { unsigned long __bits[128/sizeof(long)]; } sigset_t; ``` That works out to 16 bytes on aarch64.
ariadne reviewed 2021-01-08 09:46:45 +00:00
@ -48,3 +57,3 @@
for (; i < argc; i++)
*sp++ = va_arg (va, unsigned long);
*sp++ = va_arg (va, int);

The NetBSD and glibc implementations also do not narrow the arguments like this code is doing. The macOS implementation also does not appear to do so. What problem are you trying to solve here?

The NetBSD and glibc implementations also do not narrow the arguments like this code is doing. The macOS implementation also does not appear to do so. What problem are you trying to solve here?
osy (Migrated from github.com) reviewed 2021-01-08 09:48:29 +00:00
osy (Migrated from github.com) commented 2021-01-08 09:48:29 +00:00

I see, well it doesn’t seem to actually matter since it’s not used and is laid out after the regs.

I see, well it doesn’t seem to actually matter since it’s not used and is laid out after the regs.
osy (Migrated from github.com) reviewed 2021-01-08 09:52:26 +00:00
@ -48,3 +57,3 @@
for (; i < argc; i++)
*sp++ = va_arg (va, unsigned long);
*sp++ = va_arg (va, int);
osy (Migrated from github.com) commented 2021-01-08 09:52:26 +00:00

Well you need to choose some size or the stack layout won’t work. The old code doesn’t work because it assumed every register is 4 bytes so I bumped the register arguments to 8 bytes and kept the stack arguments at 4 bytes. You can make everything 8 bytes as I originally tried but then I read the man pages and realized it expects ‘int’ which is 4 bytes for all arguments.

Well you need to choose some size or the stack layout won’t work. The old code doesn’t work because it assumed every register is 4 bytes so I bumped the register arguments to 8 bytes and kept the stack arguments at 4 bytes. You can make everything 8 bytes as I originally tried but then I read the man pages and realized it expects ‘int’ which is 4 bytes for all arguments.
osy (Migrated from github.com) reviewed 2021-01-08 09:54:30 +00:00
@ -48,3 +57,3 @@
for (; i < argc; i++)
*sp++ = va_arg (va, unsigned long);
*sp++ = va_arg (va, int);
osy (Migrated from github.com) commented 2021-01-08 09:54:30 +00:00

To clarify I didn’t “narrow” any arguments. I widened it for the register arguments. unsigned long is 4 bytes on arm and arm64.

To clarify I didn’t “narrow” any arguments. I widened it for the register arguments. unsigned long is 4 bytes on arm and arm64.
ariadne reviewed 2021-01-08 10:27:33 +00:00

Hmm, but mcontext comes after the sigset. __pad was intended to include the sigset here.

Hmm, but mcontext comes after the sigset. __pad was intended to include the sigset here.
ariadne reviewed 2021-01-08 10:38:43 +00:00
@ -48,3 +57,3 @@
for (; i < argc; i++)
*sp++ = va_arg (va, unsigned long);
*sp++ = va_arg (va, int);

Yeah, I was misreading what the patch did. This is fine.

Yeah, I was misreading what the patch did. This is fine.
ariadne reviewed 2021-01-08 10:43:49 +00:00
@ -48,3 +57,3 @@
for (; i < argc; i++)
*sp++ = va_arg (va, unsigned long);
*sp++ = va_arg (va, int);

Wait what?

On my aarch64 machine, unsigned long is definitely 8 bytes:

petrie:~/libucontext$ cat moo2.c
#include <stdint.h>
#include <stdio.h>

int main() {
        printf("%zu\n", sizeof(unsigned long));
        return 0;
}
petrie:~/libucontext$ gcc -o moo2 moo2.c
petrie:~/libucontext$ ./moo2
8
Wait what? On my `aarch64` machine, `unsigned long` is definitely 8 bytes: ``` petrie:~/libucontext$ cat moo2.c #include <stdint.h> #include <stdio.h> int main() { printf("%zu\n", sizeof(unsigned long)); return 0; } petrie:~/libucontext$ gcc -o moo2 moo2.c petrie:~/libucontext$ ./moo2 8 ```
ariadne reviewed 2021-01-08 10:47:27 +00:00
@ -48,3 +57,3 @@
for (; i < argc; i++)
*sp++ = va_arg (va, unsigned long);
*sp++ = va_arg (va, int);

Apple says unsigned long is 8 bytes on Darwin ARM64 too. https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms

Apple says `unsigned long` is 8 bytes on Darwin ARM64 too. https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms
ariadne reviewed 2021-01-08 10:48:29 +00:00
@ -48,3 +57,3 @@
for (; i < argc; i++)
*sp++ = va_arg (va, unsigned long);
*sp++ = va_arg (va, int);

It seems to me that having it any other way would break alignment assumptions made by aarch64 CPUs? So I'm back to being confused about the two ARM-specific patches.

It seems to me that having it any other way would break alignment assumptions made by aarch64 CPUs? So I'm back to being confused about the two ARM-specific patches.
ariadne reviewed 2021-01-08 11:23:21 +00:00

This was an interesting bug, it turns out gcc was optimizing the struct layout in a way that matched my assumptions, but clang was not doing the same optimization. So I fixed the padding size in b1902729.

This was an interesting bug, it turns out gcc was optimizing the struct layout in a way that matched my assumptions, but clang was not doing the same optimization. So I fixed the padding size in b1902729.
ariadne approved these changes 2021-01-08 11:25:14 +00:00

Everything in here except for the clang assembler directive workaround should be merged (excluding the struct layout changes, I took a different approach there), please give the current head a go.

Everything in here except for the clang assembler directive workaround should be merged (excluding the struct layout changes, I took a different approach there), please give the current head a go.

Pull request closed

Sign in to join this conversation.
No reviewers
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: ariadne/libucontext#22
There is no content yet.