blog: add writing portable arm64 assembly article
ci/woodpecker/push/woodpecker Pipeline was successful
Signed-off-by: Ariadne Conill <ariadne@dereferenced.org>
parent 5070f39353
commit f412e77867
@@ -0,0 +1,115 @@
---
title: Writing portable ARM64 assembly
date: '2023-04-13'
---

An unfortunate side effect of the rising popularity of Apple's ARM-based
computers is an increase in unportable assembly code which targets the
64-bit ARM ISA. This is because developers are writing these bits of
assembly code to speed up their programs when run on Apple's ARM-based
computers, without considering the other 64-bit ARM devices out there,
such as SBCs and servers running Linux or BSD.

The good news is that it is very easy to write assembly which targets
Apple's computers as well as the other 64-bit ARM devices running
operating systems other than Darwin. It just requires being aware of
a few differences between the Mach-O and ELF ABIs, as well as knowing
which Apple-specific syntax extensions to avoid. By following the
guidance in this post, you will be able to write assembly code which
is portable between Apple's toolchain, the official ARM assembly
toolchain, and the GNU toolchain.

## Differences between the ELF and Mach-O ABIs

Modern UNIX systems, including Linux-based systems, largely use the
[ELF binary format][elf]. Apple uses [Mach-O][mach-o] in Darwin
instead for historical reasons. This is not a requirement imposed by
Apple's use of Mach: OSFMK, the kernel that Darwin, MkLinux and OSF/1
are all based on, supports ELF binaries just fine. Apple just decided
to use the Mach-O format instead.

[elf]: https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
[mach-o]: https://en.wikipedia.org/wiki/Mach-O

When it comes to writing assembly (or, really, just linking code
in general) targeting Darwin, the main difference to be aware of is
that all symbols are prefixed with a single underscore. For example,
if you have a function that would be declared in C like:

```c
extern void unmask(const char *payload, const char *mask, size_t len);
```

On Darwin, the function in your assembly code must be defined as `_unmask`.

The other major difference is that ELF defines different classes of
symbols, for example `STT_FUNC` and `STT_OBJECT`. There is no equivalent
in Mach-O, and thus the `.type` directive that you would use when writing
assembly for ELF targets is not supported.

## Apple-specific vector mnemonics

The other main thing to watch out for is Apple's custom mnemonics for
NEON. In order to make writing NEON code less cumbersome, Apple
introduced a set of mnemonics that attach the vector arrangement to the
instruction name rather than to each register. For example, if you are
targeting Apple devices only, you might write an exclusive-or NEON
instruction like so:

```asm
eor.16b v2, v2, v0
```

This is an Apple-specific extension to the ARM assembly syntax. The
[official ARM assembly manual][armasm] requires the arrangement (here
`16b`, sixteen 8-bit lanes) to be specified on each register instead:

```asm
eor v2.16b, v2.16b, v0.16b
```

[armasm]: https://developer.arm.com/documentation/dui0802/b/A64-SIMD-Vector-Instructions/EOR--vector-
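For readers less familiar with NEON, here is a rough C sketch of what that `eor` with a `16b` arrangement computes: a byte-wise exclusive-or across sixteen lanes. The function name `xor16` is made up for this illustration. The sketch assumes the ACLE intrinsics from `<arm_neon.h>` when building for AArch64, and falls back to a plain loop elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
# include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] ^= src[i] for 16 bytes, mirroring what
 * `eor v2.16b, v2.16b, v0.16b` does with two vector registers. */
void xor16(uint8_t dst[16], const uint8_t src[16])
{
#if defined(__aarch64__) && defined(__ARM_NEON)
    uint8x16_t a = vld1q_u8(dst);   /* load 16 bytes into a vector */
    uint8x16_t b = vld1q_u8(src);
    vst1q_u8(dst, veorq_u8(a, b));  /* lane-wise XOR, store result */
#else
    /* Scalar fallback so the sketch builds on non-ARM hosts too. */
    for (size_t i = 0; i < 16; i++)
        dst[i] ^= src[i];
#endif
}
```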
## Abstracting the ABI details with some macros

The good news is that the ABI details can easily be abstracted with a
few macros. As for NEON instructions, the answer is simple: stick to
what the ARM manual says to do, rather than using Apple's mnemonics.

There are two macros that you need. These can be placed in a header
file somewhere if you like.

The first macro allows you to deal with the underscore requirement of the
Darwin ABI:

```c
#ifdef __MACH__
# define PROC_NAME(__proc) _ ## __proc
#else
# define PROC_NAME(__proc) __proc
#endif
```
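One way to sanity-check this macro is to stringify its expansion in a small C file and compare it with the name the compiler itself would emit. The sketch below assumes GCC or Clang, which predefine `__USER_LABEL_PREFIX__` (an underscore on Darwin, empty on typical ELF targets). The `STR` helpers and the function names are made up for this illustration and are not part of the macros above.

```c
#include <string.h>

/* The macro from the post, unchanged. */
#ifdef __MACH__
# define PROC_NAME(__proc) _ ## __proc
#else
# define PROC_NAME(__proc) __proc
#endif

/* Illustrative helpers: expand the argument first, then stringify it. */
#define STR_RAW(...) #__VA_ARGS__
#define STR(...) STR_RAW(__VA_ARGS__)

/* The symbol name the assembler sees after preprocessing. */
const char *proc_name_text(void)
{
    return STR(PROC_NAME(unmask));
}

/* The symbol name the C compiler would emit for `unmask`, built from
 * the GCC/Clang predefined macro __USER_LABEL_PREFIX__ (assumption:
 * other compilers may not define it). */
const char *compiler_symbol_text(void)
{
    return STR(__USER_LABEL_PREFIX__) "unmask";
}

/* Returns 1 when PROC_NAME agrees with the compiler's own prefix. */
int proc_name_matches_compiler(void)
{
    return strcmp(proc_name_text(), compiler_symbol_text()) == 0;
}
```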

The second macro is optional, but it allows you to define the correct
ELF symbol types outside of Apple's toolchain:

```c
#ifdef __clang__
# define TYPE(__proc, __typ)
#else
# define TYPE(__proc, __typ) .type __proc, __typ
#endif
```
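Note that `__clang__` is also defined when Clang targets ELF systems such as Linux, so the macro above drops `.type` there as well. A possible variant, sketched below under the assumption that the compiler predefines `__ELF__` on ELF targets (GCC and Clang do), keys the decision on the object format instead. The name `TYPE_ELF`, the `STR` helpers and the two functions are made up for this illustration; the C wrapper only exists to make the expansion visible.

```c
#include <string.h>

/* Hypothetical variant of TYPE: emit .type only on ELF targets. */
#ifdef __ELF__
# define TYPE_ELF(__proc, __typ) .type __proc, __typ
#else
# define TYPE_ELF(__proc, __typ)
#endif

/* Illustrative helpers: expand the argument first, then stringify it. */
#define STR_RAW(...) #__VA_ARGS__
#define STR(...) STR_RAW(__VA_ARGS__)

/* Returns 1 when the compiler reports an ELF target. */
int is_elf_target(void)
{
#ifdef __ELF__
    return 1;
#else
    return 0;
#endif
}

/* Text of the directive the assembler would see: a `.type` line on
 * ELF targets, an empty string elsewhere. */
const char *type_directive_text(void)
{
    return STR(TYPE_ELF(unmask, @function));
}
```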

Then you just write your assembly as normal, but using these macros.
The file has to go through the C preprocessor for them to expand, which
GCC and Clang do automatically for files with a `.S` extension:

```asm
.global PROC_NAME(unmask)
.align 2
TYPE(unmask, @function)
PROC_NAME(unmask):
...
```

And that's all there is to it. As long as you follow these guidelines,
you will have assembly which is portable to any UNIX-like environment on
64-bit ARM.