UNDER CONSTRUCTION
This article describes target-specific details about AArch32 in ELF linkers. I described AArch64 in a previous article.
AArch32 is the 32-bit execution state for the Arm architecture and runs the A32 and T32 instruction sets. A32 refers to the old ISA with a 32-bit fixed width, while T32 refers to the mixed 16-bit and 32-bit Thumb2 instructions.
"AArch32", "A32", and "T32" are new names. Many projects use "ARM", "Arm", or "arm" as their port name.
The Global Offset Table consists of two sections:
.got.plt holds code addresses for PLT.
.got holds other addresses and offsets.
The symbol _GLOBAL_OFFSET_TABLE_ is defined at the beginning of the .got section. GNU ld reserves a single entry for .got, and .got[0] holds the link-time address of _DYNAMIC for a legacy reason: versions of glibc prior to 2.35 have the _DYNAMIC requirement. See All about Global Offset Table.
The PLT header looks like:
L1: str lr, [sp, #-4]!
If .got.plt-.plt-4 exceeds the +-128MiB range, a long-form PLT header is needed.
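The range condition above can be sketched as a small check. This is only an illustration that takes the +-128MiB figure at face value; the function name, the parameters, and the exact treatment of the boundary values are assumptions, not taken from any linker.

```python
MIB = 1024 * 1024

def needs_long_plt_header(got_plt_addr: int, plt_addr: int,
                          limit: int = 128 * MIB) -> bool:
    """Return True if .got.plt-.plt-4 falls outside the +-limit range.

    Hypothetical helper: the half-open interval [-limit, limit) is an
    assumption about how the boundary is treated.
    """
    displacement = got_plt_addr - plt_addr - 4
    return not (-limit <= displacement < limit)
```

For a typical small image, .got.plt sits close to .plt and the short header suffices; only very large images would take the long-form path.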
--cmse-implib
This option is for linker support for the Cortex-M Security Extensions (CMSE). It does two jobs: it synthesizes secure gateway veneers where needed, and it writes a CMSE import library if --out-implib= is specified.
If a non-local symbol __acle_se_$sym is present, report an error if $sym is not defined. Otherwise $sym is considered a secure gateway veneer. Both __acle_se_$sym and $sym must be non-absolute functions with an odd st_value.
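The symbol checks described above might look like the following sketch. The Sym record and check_cmse_symbols are hypothetical names invented for illustration; only the rules themselves (the __acle_se_ prefix, $sym must be defined, both symbols must be non-absolute functions with an odd st_value) come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List

SHN_UNDEF = 0       # undefined symbol
SHN_ABS = 0xFFF1    # absolute symbol
STT_FUNC = 2        # function symbol type
PREFIX = "__acle_se_"

@dataclass
class Sym:
    """Hypothetical, simplified view of an ELF symbol."""
    value: int           # st_value
    shndx: int           # st_shndx
    type: int            # STT_* type
    local: bool = False  # True for STB_LOCAL symbols

def check_cmse_symbols(symtab: Dict[str, Sym]) -> List[str]:
    """Return error messages for __acle_se_$sym / $sym pairs that break the rules."""
    errors = []
    for name, acle in symtab.items():
        # Only non-local __acle_se_$sym symbols are of interest.
        if not name.startswith(PREFIX) or acle.local:
            continue
        base = name[len(PREFIX):]
        sym = symtab.get(base)
        if sym is None or sym.shndx == SHN_UNDEF:
            errors.append(f"{base}: not defined")
            continue
        for n, s in ((name, acle), (base, sym)):
            if s.shndx == SHN_ABS or s.type != STT_FUNC or s.value % 2 == 0:
                errors.append(
                    f"{n}: must be a non-absolute function with an odd st_value")
    return errors
```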
If the addresses of __acle_se_$sym and $sym are not equal, the linker considers that there is an inline secure gateway and doesn't do anything special; otherwise the linker synthesizes a secure gateway veneer in a special section .gnu.sgstubs with the following logic.
The linker allocates an input section in .gnu.sgstubs and defines $sym relative to it. In the output file, $sym is moved to .gnu.sgstubs, a different text section.
<.gnu.sgstubs>:
  ...
$sym:
  sg
  b.w __acle_se_$sym
If --in-implib is specified and the library defines $sym (say the address is $addr), in the output $sym has a fixed address of $addr. Otherwise, the linker assigns an address (larger than all synthesized secure gateway veneers with fixed addresses).
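The address assignment can be illustrated with a short sketch. The function name, the base parameter (a hypothetical start address for .gnu.sgstubs), and the sequential packing of new veneers are assumptions; the text only requires that linker-assigned addresses be larger than all fixed ones. Each veneer is taken to be 8 bytes, since sg and b.w are both 32-bit instructions, and the Thumb bit is ignored.

```python
from typing import Dict, List

VENEER_SIZE = 8  # sg (4 bytes) followed by b.w (4 bytes)

def assign_veneer_addresses(syms: List[str], in_implib: Dict[str, int],
                            base: int) -> Dict[str, int]:
    """Assign an address to every synthesized veneer.

    Symbols found in the --in-implib library keep their address; the
    rest are placed after the highest fixed veneer (assumed policy).
    """
    result = {s: in_implib[s] for s in syms if s in in_implib}
    next_addr = base
    if result:
        next_addr = max(base, max(result.values()) + VENEER_SIZE)
    for s in syms:
        if s not in result:
            result[s] = next_addr
            next_addr += VENEER_SIZE
    return result
```

Keeping previously published addresses fixed is what lets non-secure code built against an older import library keep working after the secure image is relinked.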
--out-implib=out.lib
Used with --cmse-implib. Write the CMSE import library to out.lib.
out.lib will have 3 sections: .symtab, .strtab, .shstrtab. For every synthesized Secure Gateway veneer, write a SHN_ABS symbol whose address is $addr (if specified by the --in-implib library) or the linker-assigned address. (The CMSE import library does not contain text sections, so a defined symbol has to use SHN_ABS.)
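To make the SHN_ABS point concrete, here is a sketch that packs one such symbol as a 16-byte Elf32_Sym record. The helper name is hypothetical, and the little-endian layout, the STB_GLOBAL binding, and the default size of 0 are assumptions made for illustration.

```python
import struct

SHN_ABS = 0xFFF1
STB_GLOBAL = 1
STT_FUNC = 2

def pack_abs_symbol(name_offset: int, addr: int, size: int = 0) -> bytes:
    """Pack an Elf32_Sym (little-endian) for an absolute function symbol.

    Field order: st_name, st_value, st_size, st_info, st_other, st_shndx.
    name_offset is the symbol name's offset into .strtab.
    """
    st_info = (STB_GLOBAL << 4) | STT_FUNC
    return struct.pack("<IIIBBH", name_offset, addr, size, st_info, 0, SHN_ABS)
```

Because st_shndx is SHN_ABS, st_value is interpreted as an absolute address rather than a section offset, which is why no text section is needed in the import library.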
AArch32 uses a variant of TLS Variant I: the static TLS blocks are placed above the thread pointer. The thread pointer points to the end of the thread control block.
The linker doesn't perform TLS optimization.
The traditional general dynamic and local dynamic TLS models are used by default. There is a TLSDESC ABI.
See All about thread-local storage.
A destination not reachable by the branch instruction needs a range extension thunk. ARM and Thumb state changes need a thunk as well.
ARMv4T introduced the 16-bit Thumb instruction set. These processors do not support BLX. ARM/Thumb state change must be done using BX.
In the ARM state, the relocation types R_ARM_PC24/R_ARM_PLT32/R_ARM_JUMP24/R_ARM_CALL may need range extension or state change. lld 16.0.0 has added thunk support for Thumb. We can build Game Boy Advance or Nintendo DS ROMs using Thumb code with ld.lld.
// ARM to ARM, absolute
R_ARM_THM_CALL
// Thumb to ARM, absolute
  bx pc
  b #-6
  ldr pc, [pc, #-4]
L1: .word S

// Thumb to Thumb, absolute
  bx pc
  b #-6
  ldr ip, [pc]
  bx ip
L1: .word S

// Thumb to ARM, position-independent
  bx pc
  b #-6
  ldr ip, [pc, #]
L1: add ip, pc, ip
L2: .word S - (L1 + 8)

// Thumb to Thumb, position-independent
  bx pc
  b #-6
  ldr ip, [pc, #-4]
L1: add ip, pc, ip
  bx ip
L2: .word S - (L1 + 8)
These are pre-Cortex processors that support BLX, but not Thumb branch range extension or MOVT/MOVW.
There is no Thumb branch instruction in ARMv5 that supports thunks. LDR can switch processor states.
// absolute (LDR can switch processor states)
Only Thumb instructions are supported. These processors support BLX and J1 J2 encodings (branch range extension), but not MOVT/MOVW.
// near
ARMv6T2 introduced Thumb-2. These processors support BLX/MOVT/MOVW and Thumb branch range extension. (All architectures used in Cortex processors with the exception of v6-M and v6S-M have the MOVT/MOVW instructions.)
In the ARM state, these relocation types R_ARM_PC24, R_ARM_PLT32, R_ARM_JUMP24, R_ARM_CALL may need a thunk for range extension or state change.
// near and the destination is in the ARM state
In the Thumb state, these relocation types R_ARM_THM_JUMP19, R_ARM_THM_JUMP24, R_ARM_THM_CALL may need a thunk for range extension or state change.
// near and the destination is in the Thumb state
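The decision of whether a branch needs a thunk on a processor with BLX and Thumb-2 branch range extension can be sketched roughly as follows. This is a simplified model, not any linker's actual logic: needs_thunk and its parameters are hypothetical, PLT targets and addends are ignored, and so is the word alignment applied to the PC for a Thumb BLX. The ranges used are the architectural ones: +-32MiB for A32 B/BL, +-16MiB for T32 B.W/BL, and +-1MiB for the conditional T32 B.W.

```python
THUMB_RELOCS = {"R_ARM_THM_JUMP19", "R_ARM_THM_JUMP24", "R_ARM_THM_CALL"}
CALL_RELOCS = {"R_ARM_CALL", "R_ARM_THM_CALL"}

# Branch reach in bytes; the encodable offsets are [-reach, reach).
REACH = {
    "R_ARM_PC24": 1 << 25,
    "R_ARM_PLT32": 1 << 25,
    "R_ARM_JUMP24": 1 << 25,
    "R_ARM_CALL": 1 << 25,
    "R_ARM_THM_JUMP19": 1 << 20,
    "R_ARM_THM_JUMP24": 1 << 24,
    "R_ARM_THM_CALL": 1 << 24,
}

def needs_thunk(reloc: str, src: int, dst: int, dst_is_thumb: bool) -> bool:
    """Return True if the branch at src cannot reach dst directly.

    src and dst are plain addresses without the Thumb bit.
    """
    src_is_thumb = reloc in THUMB_RELOCS
    # A call can be rewritten between BL and BLX to change state;
    # other branches cannot, so a state change forces a thunk.
    if src_is_thumb != dst_is_thumb and reloc not in CALL_RELOCS:
        return True
    pc = src + (4 if src_is_thumb else 8)  # PC reads ahead of the branch
    offset = dst - pc
    reach = REACH[reloc]
    return not (-reach <= offset < reach)
```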
--fix-cortex-a8
This option enables a linker workaround for Arm Cortex-A8 erratum 657417. Linkers scan for a 4-byte Thumb-2 branch instruction (Bcc.w, B.w, BLX.w, BL.w) that spans two 4KiB pages, whose target address falls within the first page, and which follows a 4-byte non-branch instruction. Such a sequence may result in an incorrect instruction fetch or processor deadlock.
Once an erratum condition is detected, linkers try to rewrite it into an alternative code sequence. See the comments in the implementations for details.
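The detection condition can be sketched as a pair of small predicates. The function names are hypothetical and the caller is assumed to have already decoded whether the preceding instruction is a 4-byte non-branch; addresses are plain addresses without the Thumb bit. The width test relies on the Thumb encoding rule that a first halfword whose bits [15:11] are 0b11101, 0b11110, or 0b11111 starts a 32-bit instruction.

```python
PAGE_SIZE = 0x1000  # 4KiB

def is_thumb32(first_halfword: int) -> bool:
    """Return True if this halfword starts a 32-bit Thumb instruction."""
    return (first_halfword >> 11) in (0b11101, 0b11110, 0b11111)

def matches_erratum_657417(branch_addr: int, target_addr: int,
                           prev_is_32bit_non_branch: bool) -> bool:
    """Check the three conditions described for the erratum.

    A 4-byte instruction at a 2-byte aligned address spans two pages
    exactly when it starts 2 bytes before a page boundary.
    """
    spans_pages = branch_addr % PAGE_SIZE == PAGE_SIZE - 2
    target_in_first_page = target_addr // PAGE_SIZE == branch_addr // PAGE_SIZE
    return spans_pages and target_in_first_page and prev_is_32bit_non_branch
```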
SHT_ARM_ATTRIBUTES
Usually named .ARM.attributes.
SHT_ARM_EXIDX (.ARM.exidx) and .ARM.extab
See Stack unwinding.
--be8
When linking big-endian images there are the deprecated BE-32 mode (word-invariant addressing big-endian mode) and the new BE-8 mode (byte-invariant addressing big-endian mode).
BE-32 is used by older architectures like arm7tdmi and arm926ej-s.
For ARMv6-M, ARMv7, and later architectures the default is BE-8. The relocatable object files have big-endian code and data. Compiler drivers pass --be8 to the linker to convert big-endian code to little-endian.
The linker finds $a/$t/$d mapping symbols to locate ARM and Thumb code and perform byte swapping. ARM code is reversed as 4-byte units, Thumb code is reversed as 2-byte units, while data is unchanged. The linker sets the EF_ARM_BE8 flag in the ELF header.
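The byte swapping can be sketched as follows. be8_swap and its (offset, kind) mapping list are hypothetical stand-ins for a section's contents and its mapping symbols; $d is the Arm ELF ABI's marker for data. A real linker works on its own section and symbol structures.

```python
from typing import List, Tuple

# Swap unit per mapping symbol; "$d" (data) has no entry and is left alone.
UNIT = {"$a": 4, "$t": 2}

def be8_swap(section: bytes, mapping: List[Tuple[int, str]]) -> bytes:
    """Reverse ARM code in 4-byte units and Thumb code in 2-byte units.

    mapping holds (offset, kind) pairs; each region extends to the
    next mapping symbol or to the end of the section.
    """
    out = bytearray(section)
    marks = sorted(mapping)
    for i, (start, kind) in enumerate(marks):
        end = marks[i + 1][0] if i + 1 < len(marks) else len(section)
        unit = UNIT.get(kind)
        if unit is None:
            continue  # data region: keep big-endian bytes as they are
        for off in range(start, end - unit + 1, unit):
            out[off:off + unit] = section[off:off + unit][::-1]
    return bytes(out)
```

For example, a big-endian ARM instruction e1a00000 becomes 0000a0e1, a Thumb instruction 4770 becomes 7047, and a .word in a data region is left as is.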
cat > a.s <<e
a.o and a have data (.word) of the same endianness, but a has little-endian instructions.
% llvm-objdump -s -d -j .text -j .text.2 a.o