blog: add writing portable arm64 assembly article

Signed-off-by: Ariadne Conill <ariadne@dereferenced.org>
main
Ariadne Conill 2023-04-12 23:25:49 -07:00
parent 5070f39353
commit f412e77867
1 changed files with 115 additions and 0 deletions

@ -0,0 +1,115 @@
---
title: Writing portable ARM64 assembly
date: '2023-04-13'
---
An unfortunate side effect of the rising popularity of Apple's ARM-based
computers is an increase in unportable assembly code which targets the
64-bit ARM ISA. This is because developers are writing these bits of
assembly code to speed up their programs when run on Apple's ARM-based
computers, without considering the other 64-bit ARM devices out there,
such as SBCs and servers running Linux or BSD.

The good news is that it is very easy to write assembly which targets
Apple's computers as well as the other 64-bit ARM devices running
operating systems other than Darwin. It just requires being aware of
a few differences between the Mach-O and ELF ABIs, as well as knowing
what Apple-specific syntax extensions to avoid. By following the
guidance in this post, you will be able to write assembly code which
is portable between Apple's toolchain, the official ARM assembly
toolchain, and the GNU toolchain.
## Differences between the ELF and Mach-O ABIs
Modern UNIX systems, including Linux-based systems, largely use the
[ELF binary format][elf]. Apple uses [Mach-O][mach-o] in Darwin
instead for historical reasons. This is not a requirement imposed on
Apple by their use of Mach: OSFMK, the kernel that Darwin, MkLinux and
OSF/1 are all based on, supports ELF binaries just fine. Apple just
decided to use the Mach-O format instead.

[elf]: https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
[mach-o]: https://en.wikipedia.org/wiki/Mach-O
When it comes to writing assembly (or, really, just linking code
in general) targeting Darwin, the main difference to be aware of is
that all symbols are prefixed with a single underscore. For example,
if you have a function that would be declared in C like:
```c
extern void unmask(const char *payload, const char *mask, size_t len);
```
On Darwin, the function in your assembly code must be defined as `_unmask`.
The C compiler adds the underscore on its own, so C callers still refer to
the function as `unmask`.

The other major difference is that ELF assigns each symbol a type, for
example `STT_FUNC` or `STT_OBJECT`. There is no equivalent in Mach-O,
and thus the `.type` directive that you would use when writing
assembly for ELF targets is not supported.
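
For context, what `.type` controls on ELF is a four-bit field in each symbol
table entry. The sketch below models that encoding in C. The numeric values
come from the ELF specification; the `DEMO_*` constants and the `sym_*`
helpers are hypothetical names invented for this illustration, mirroring the
`ELF64_ST_*` macros that `<elf.h>` provides on Linux.

```c
#include <stdint.h>

/* Symbol type and binding values as defined by the ELF specification.
 * The DEMO_ prefix avoids clashing with <elf.h> if it is also included. */
enum { DEMO_STT_NOTYPE = 0, DEMO_STT_OBJECT = 1, DEMO_STT_FUNC = 2 };
enum { DEMO_STB_LOCAL = 0, DEMO_STB_GLOBAL = 1 };

/* st_info packs the binding into the high nibble and the type into
 * the low nibble of a single byte. */
uint8_t sym_info(unsigned bind, unsigned type)
{
    return (uint8_t)((bind << 4) | (type & 0xf));
}

/* Extract the symbol type (what `.type` sets). */
unsigned sym_type(uint8_t st_info)
{
    return st_info & 0xf;
}

/* Extract the symbol binding (what `.global` sets). */
unsigned sym_bind(uint8_t st_info)
{
    return st_info >> 4;
}
```

A `.global` symbol marked `@function` would carry
`sym_info(DEMO_STB_GLOBAL, DEMO_STT_FUNC)`. Without a `.type` directive the
low nibble stays at `STT_NOTYPE`, which is the information Mach-O has no
place to store.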
## Apple-specific vector mnemonics
The other main thing to watch out for is Apple's custom mnemonics for
NEON. To make NEON code less cumbersome to write, Apple introduced
mnemonics that attach the vector arrangement to the instruction name
instead of to each register. For example, if you are targeting Apple
devices only, you might write an exclusive-or NEON instruction like so:
```asm
eor.16b v2, v2, v0
```
This is an Apple-specific extension to the ARM assembly syntax. The
[official ARM assembly manual][armasm] requires the arrangement (here
`16b`, sixteen byte-sized lanes) to be given on each register operand:
```asm
eor v2.16b, v2.16b, v0.16b
```
[armasm]: https://developer.arm.com/documentation/dui0802/b/A64-SIMD-Vector-Instructions/EOR--vector-
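
To make the `16b` arrangement concrete, here is a small C model of what that
instruction computes: a bytewise XOR across sixteen independent lanes. This
is only a sketch. The function name `eor_16b` is hypothetical, and the NEON
path assumes a compiler that defines `__ARM_NEON` and ships `<arm_neon.h>`;
everywhere else the scalar loop gives the same result.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Models `eor v2.16b, v2.16b, v0.16b`: XOR two 128-bit values
 * treated as 16 independent byte lanes. */
void eor_16b(uint8_t dst[16], const uint8_t a[16], const uint8_t b[16])
{
#if defined(__ARM_NEON)
    /* Load both vectors, XOR them lane by lane, store the result. */
    vst1q_u8(dst, veorq_u8(vld1q_u8(a), vld1q_u8(b)));
#else
    /* Scalar fallback with the same lane-by-lane semantics. */
    for (size_t i = 0; i < 16; i++)
        dst[i] = (uint8_t)(a[i] ^ b[i]);
#endif
}
```

Because the intrinsics are spelled the same way for every compiler, this form
sidesteps the mnemonic question entirely, at the cost of leaving instruction
selection to the compiler.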
## Abstracting the ABI details with some macros
The good news is that the ABI details can easily be abstracted with a
few macros. As for NEON instructions, the answer is simple: stick to
what the ARM manual says to do, rather than using Apple's mnemonics.

There are two macros that you need. These can be placed in a header
file if you like. Because they rely on the C preprocessor, the assembly
source must be preprocessed before it is assembled; GCC and Clang do
this automatically for files with a `.S` extension.

The first macro allows you to deal with the underscore requirement of the
Darwin ABI:
```c
#ifdef __MACH__
# define PROC_NAME(__proc) _ ## __proc
#else
# define PROC_NAME(__proc) __proc
#endif
```
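
A quick way to check what `PROC_NAME` expands to on a given target is to
stringify it from C. The sketch below does that and compares the result with
`__USER_LABEL_PREFIX__`, a macro that GCC and Clang predefine with the
platform's symbol prefix. The `DEMO_*` macros and `demo_*` functions are
hypothetical names used only for this illustration.

```c
#include <string.h>

/* The macro from above, unchanged. */
#ifdef __MACH__
# define PROC_NAME(__proc) _ ## __proc
#else
# define PROC_NAME(__proc) __proc
#endif

/* Two-step stringification: the outer macro expands its argument
 * before the inner one turns it into a string literal. */
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)

/* Assembly-level name that PROC_NAME(unmask) produces on this target. */
const char *demo_proc_name(void)
{
    return DEMO_STR(PROC_NAME(unmask));
}

/* The same name built from the prefix the C compiler itself applies.
 * Assumes GCC or Clang, which predefine __USER_LABEL_PREFIX__. */
const char *demo_compiler_name(void)
{
    return DEMO_STR(__USER_LABEL_PREFIX__) "unmask";
}

/* Nonzero when both names agree, as expected on Darwin and ELF targets. */
int demo_names_agree(void)
{
    return strcmp(demo_proc_name(), demo_compiler_name()) == 0;
}
```

On Darwin `demo_proc_name()` yields `_unmask`, and on ELF targets it yields
`unmask`. If `demo_names_agree()` ever returned zero, the assembly symbol
would not match what C callers link against.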
The second macro is optional, but it allows you to define the correct
ELF symbol types outside of Apple's toolchain:
```c
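#ifdef __MACH__
/* Mach-O has no symbol types, so the directive expands to nothing. */
# define TYPE(__proc, __typ)
#else
/* ELF targets, whichever compiler drives the assembler. */
# define TYPE(__proc, __typ) .type __proc, __typ
#endif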
```
Then you just write your assembly as normal, but using these macros:
```asm
.global PROC_NAME(unmask)
.align 2
TYPE(unmask, @function)
PROC_NAME(unmask):
...
```
And that's all there is to it. As long as you follow these guidelines,
you will have assembly which is portable to any UNIX-like environment on
64-bit ARM.