blog: add writing portable arm64 assembly article
Signed-off-by: Ariadne Conill <ariadne@dereferenced.org>

branch: main
parent: 5070f39353
commit: f412e77867
---
title: Writing portable ARM64 assembly
date: '2023-04-13'
---

An unfortunate side effect of the rising popularity of Apple's ARM-based
computers is an increase in unportable assembly code which targets the
64-bit ARM ISA. This is because developers are writing these bits of
assembly code to speed up their programs when run on Apple's ARM-based
computers, without considering the other 64-bit ARM devices out there,
such as SBCs and servers running Linux or BSD.

The good news is that it is very easy to write assembly which targets
Apple's computers as well as the other 64-bit ARM devices running
operating systems other than Darwin. It just requires being aware of
a few differences between the Mach-O and ELF ABIs, as well as knowing
which Apple-specific syntax extensions to avoid. By following the
guidance in this post, you will be able to write assembly code which
is portable between Apple's toolchain, the official ARM assembly
toolchain, and the GNU toolchain.

## Differences between the ELF and Mach-O ABIs

Modern UNIX systems, including Linux-based systems, largely use the
[ELF binary format][elf]. Apple uses [Mach-O][mach-o] in Darwin
instead for historical reasons. This is not a requirement imposed on
Apple by their use of Mach: OSFMK, the kernel that Darwin, MkLinux
and OSF/1 are all based on, supports ELF binaries just fine. Apple
just decided to use the Mach-O format instead.

[elf]: https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
[mach-o]: https://en.wikipedia.org/wiki/Mach-O

When it comes to writing assembly (or, really, just linking code
in general) targeting Darwin, the main difference to be aware of is
that all symbols are prefixed with a single underscore. For example,
if you have a function that would be declared in C like:

```c
extern void unmask(const char *payload, const char *mask, size_t len);
```

On Darwin, the function in your assembly code must be defined as `_unmask`.
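GCC and Clang expose the target's symbol prefix to the preprocessor as
the predefined macro `__USER_LABEL_PREFIX__`, which gives a quick way to
see this convention from the C side. The following is only a sketch:
`STRINGIFY` and `linker_symbol_for` are hypothetical names made up for
this illustration, not part of any toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Two-level stringification, so the argument is macro-expanded first. */
#define STRINGIFY_(...) #__VA_ARGS__
#define STRINGIFY(...) STRINGIFY_(__VA_ARGS__)

/*
 * Hypothetical helper: writes the name the linker would see for a C
 * identifier into buf. Relies on __USER_LABEL_PREFIX__, which GCC and
 * Clang predefine as `_` on Darwin and as nothing on ELF targets.
 */
static const char *linker_symbol_for(const char *c_name, char *buf, size_t len)
{
    assert(c_name != NULL && buf != NULL);
    /* Room for a one-character prefix, the name, and the terminator. */
    assert(strlen(c_name) + 2 <= len);
    snprintf(buf, len, "%s%s", STRINGIFY(__USER_LABEL_PREFIX__), c_name);
    return buf;
}
```

Calling `linker_symbol_for("unmask", buf, sizeof buf)` gives `_unmask`
on Darwin and `unmask` on an ELF system such as Linux.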

The other major difference is that ELF defines different symbol types,
for example `STT_FUNC` and `STT_OBJECT`. There is no equivalent in
Mach-O, and thus the `.type` directive that you would use when writing
assembly for ELF targets is not supported.

## Apple-specific vector mnemonics

The other main thing to watch out for is Apple's custom mnemonics for
NEON. In order to make writing NEON code less cumbersome, Apple
introduced a shorthand which attaches the vector arrangement to the
mnemonic instead of to each register. For example, if you are
targeting Apple devices only, you might write an exclusive-or NEON
instruction like so:

```asm
eor.16b v2, v2, v0
```

This is an Apple-specific extension to the ARM assembly syntax. The
[official ARM assembly manual][armasm] specifies that the arrangement
(here `16b`, sixteen 8-bit lanes) must be given for each register:

```asm
eor v2.16b, v2.16b, v0.16b
```

[armasm]: https://developer.arm.com/documentation/dui0802/b/A64-SIMD-Vector-Instructions/EOR--vector-
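Both spellings encode the same instruction. For readers less familiar
with NEON, here is roughly what it computes, written in C. This is only
a sketch: `xor16` is a hypothetical helper name, the intrinsics path
assumes a compiler that ships `<arm_neon.h>`, and the scalar loop is a
fallback for every other target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
# include <arm_neon.h>
#endif

/*
 * Hypothetical helper: XORs two 16-byte blocks, which is what
 * `eor v2.16b, v2.16b, v0.16b` does with the contents of v2 and v0.
 */
static void xor16(uint8_t dst[16], const uint8_t a[16], const uint8_t b[16])
{
    assert(dst != NULL && a != NULL && b != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    /* Load both blocks as sixteen 8-bit lanes, XOR them, store the result. */
    vst1q_u8(dst, veorq_u8(vld1q_u8(a), vld1q_u8(b)));
#else
    /* Portable fallback: the same operation one byte at a time. */
    for (size_t i = 0; i < 16; i++)
        dst[i] = (uint8_t)(a[i] ^ b[i]);
#endif
}
```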

## Abstracting the ABI details with some macros

The good news is that the ABI details can easily be abstracted with a
few macros. As for NEON instructions, the answer is simple: stick to
what the ARM manual says to do, rather than using Apple's mnemonics.

Two macros do the job. These can be placed in a header file somewhere
if you like. Since they are C preprocessor macros, the assembly source
has to be run through the preprocessor, which GCC and Clang do
automatically for files with a `.S` extension.

The first macro allows you to deal with the underscore requirement of the
Darwin ABI:

```c
#ifdef __MACH__
# define PROC_NAME(__proc) _ ## __proc
#else
# define PROC_NAME(__proc) __proc
#endif
```
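One way to convince yourself that the macro does what you expect is to
stringify its expansion in a small C program. This is a sketch rather
than part of the original code; `STRINGIFY` and `proc_name_text` are
hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <string.h>

/* The macro from above, repeated so this example is self-contained. */
#ifdef __MACH__
# define PROC_NAME(__proc) _ ## __proc
#else
# define PROC_NAME(__proc) __proc
#endif

/* Two-level stringification, so PROC_NAME is expanded before quoting. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

/* Hypothetical helper: returns the label PROC_NAME(unmask) expands to. */
static const char *proc_name_text(void)
{
    const char *text = STRINGIFY(PROC_NAME(unmask));
    assert(strlen(text) > 0);
    return text;
}
```

This returns `"_unmask"` when built for Darwin and `"unmask"` elsewhere.
Running the preprocessor alone over the assembly source (`cc -E`) shows
the same expansion in place.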

The second macro is optional, but it allows you to emit the correct
ELF symbol types when building outside of Apple's toolchain:

```c
/* Mach-O has no symbol types, so only emit .type for non-Darwin targets. */
#ifdef __MACH__
# define TYPE(__proc, __typ)
#else
# define TYPE(__proc, __typ) .type __proc, __typ
#endif
```

Then you just write your assembly as normal, but using these macros:

```asm
.global PROC_NAME(unmask)
.align 2
TYPE(unmask, @function)
PROC_NAME(unmask):
...
```

And that's all there is to it. As long as you follow these guidelines,
you will have assembly which is portable to any UNIX-like environment on
64-bit ARM.