
    Linker notes on AArch32

    Posted by MaskRay on 2023-04-24 07:17:43

    UNDER CONSTRUCTION

    This article describes target-specific details about AArch32 in ELF linkers. I described AArch64 in a previous article.

    AArch32 is the 32-bit execution state for the Arm architecture and runs the A32 and T32 instruction sets. A32 refers to the old ISA with a 32-bit fixed width, while T32 refers to the mixed 16-bit and 32-bit Thumb-2 instructions.

    "AArch32", "A32", and "T32" are new names. Many projects use "ARM", "Arm", or "arm" as their port name.

    ABI documents

    • ELF for the Arm® Architecture
    • Procedure Call Standard for the Arm® Architecture
    • C++ ABI for the Arm® Architecture
    • Exception Handling ABI for the Arm® Architecture

    Global Offset Table

    The Global Offset Table consists of two sections:

    • .got.plt holds code addresses for PLT.
    • .got holds other addresses and offsets.

    The symbol _GLOBAL_OFFSET_TABLE_ is defined at the beginning of the .got section. GNU ld reserves a single entry in .got, and .got[0] holds the link-time address of _DYNAMIC for a legacy reason: versions of glibc prior to 2.35 have the _DYNAMIC requirement. See All about Global Offset Table.

    Procedure Linkage Table

    The PLT header looks like:

    L1: str lr, [sp, #-4]!
        add lr, pc, #0x0NN00000 & (.got.plt - L1 - 4)
        add lr, lr, #0x000NN000 & (.got.plt - L1 - 4)
        ldr pc, [lr, #0x00000NNN] & (.got.plt - L1 - 4)

    If .got.plt - .plt - 4 exceeds the +-128MiB range, a long-form PLT header is needed.
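    The three masks in the header split one displacement across three immediates. The following is a minimal sketch of that split, not a linker's actual code: the function name is hypothetical, and it assumes only non-negative displacements below the 128MiB limit quoted above are accepted for the short form.

```python
SHORT_FORM_LIMIT = 128 * 1024 * 1024  # assumption: mirrors the 128MiB limit in the text


def split_plt_header_offset(offset):
    """Split (.got.plt - L1 - 4) into the three immediates of the short PLT header.

    Hypothetical helper for illustration only.
    """
    if not 0 <= offset < SHORT_FORM_LIMIT:
        raise ValueError("displacement needs the long-form PLT header")
    hi = offset & 0x0FF00000   # add lr, pc, #0x0NN00000
    mid = offset & 0x000FF000  # add lr, lr, #0x000NN000
    lo = offset & 0x00000FFF   # ldr pc, [lr, #0x00000NNN]
    return hi, mid, lo
```

    The three pieces cover disjoint bit ranges, so adding them back together yields the original displacement.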

    Cortex-M Security Extensions

    --cmse-implib

    This option enables linker support for the Cortex-M Security Extensions (CMSE). It does two jobs:

    • synthesize secure gateway veneers
    • write a CMSE import library when --out-implib= is specified

    If a non-local symbol __acle_se_$sym is present, report an error if $sym is not defined. Otherwise $sym is considered a secure gateway veneer. Both __acle_se_$sym and $sym must be non-absolute functions with an odd st_value.

    If the addresses of __acle_se_$sym and $sym are not equal, the linker considers that there is an inline secure gateway and doesn't do anything special; otherwise the linker synthesizes a secure gateway veneer in a special section .gnu.sgstubs with the following logic.

    The linker allocates an input section in .gnu.sgstubs and defines $sym relative to it. In the output file, $sym is moved to .gnu.sgstubs, a different text section.

    <.gnu.sgstubs>:
      ...
    $sym:
      sg
      b.w __acle_se_$sym
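    The checks described above can be sketched as a small decision function. This is an illustrative model only: the Symbol class, its fields, and the returned strings are hypothetical and do not correspond to any linker's internal data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Symbol:
    value: int                 # st_value; odd for a Thumb function
    is_function: bool = True
    is_absolute: bool = False


def classify_entry(sym: Optional[Symbol], acle_sym: Symbol) -> str:
    """Decide what to do for a pair ($sym, __acle_se_$sym). Hypothetical model."""
    if sym is None:
        return "error: $sym is not defined"
    for s in (sym, acle_sym):
        if s.is_absolute or not s.is_function or s.value % 2 == 0:
            return "error: not a non-absolute function with an odd st_value"
    if sym.value != acle_sym.value:
        return "inline secure gateway: nothing to do"
    return "synthesize veneer in .gnu.sgstubs"
```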

    If --in-implib is specified and the library defines $sym (say the address is $addr), in the output $sym has a fixed address of $addr. Otherwise, the linker assigns an address (larger than all synthesized secure gateway veneers with fixed addresses).
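    A rough sketch of this address assignment follows. It assumes each veneer occupies 8 bytes (a 4-byte sg plus a 4-byte b.w), ignores the Thumb bit, and uses hypothetical names (assign_veneer_addresses, section_start); a real linker's layout logic is more involved.

```python
VENEER_SIZE = 8  # sg (4 bytes) + b.w (4 bytes)


def assign_veneer_addresses(names, implib, section_start):
    """Map veneer names to addresses. Hypothetical helper for illustration.

    names:         veneers to synthesize, in input order
    implib:        name -> address taken from the --in-implib library
    section_start: assumed start address of .gnu.sgstubs
    """
    # Veneers known to the input import library keep their old addresses.
    result = {n: implib[n] for n in names if n in implib}
    # New veneers are placed above every veneer with a fixed address.
    next_addr = max([section_start] + [a + VENEER_SIZE for a in result.values()])
    for n in names:
        if n not in result:
            result[n] = next_addr
            next_addr += VENEER_SIZE
    return result
```

    Keeping previously published entry points at fixed addresses means non-secure code built against the older import library keeps working.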

    --out-implib=out.lib

    Used with --cmse-implib. Write the CMSE import library to out.lib.

    out.lib will have 3 sections: .symtab, .strtab, .shstrtab. For every synthesized secure gateway veneer, write a SHN_ABS symbol whose address is $addr (if specified by the --in-implib library) or the linker-assigned address. (The CMSE import library does not contain text sections, so a defined symbol has to use SHN_ABS.)
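    To make the SHN_ABS point concrete, here is a minimal sketch that serializes such symbols as little-endian Elf32_Sym records. The helper name, the choice of STB_GLOBAL/STT_FUNC, and the zero st_size are assumptions for illustration; the output is only the raw .symtab and .strtab contents, not a complete ELF file.

```python
import struct

SHN_ABS = 0xFFF1
STB_GLOBAL = 1
STT_FUNC = 2
ELF32_SYM = "<IIIBBH"  # st_name, st_value, st_size, st_info, st_other, st_shndx


def build_implib_tables(veneers):
    """veneers: name -> address. Returns (symtab_bytes, strtab_bytes)."""
    strtab = b"\0"
    symtab = struct.pack(ELF32_SYM, 0, 0, 0, 0, 0, 0)  # mandatory null symbol
    for name, addr in sorted(veneers.items(), key=lambda kv: kv[1]):
        name_offset = len(strtab)
        strtab += name.encode() + b"\0"
        info = (STB_GLOBAL << 4) | STT_FUNC  # assumed binding and type
        # There is no text section to refer to, hence SHN_ABS.
        symtab += struct.pack(ELF32_SYM, name_offset, addr, 0, info, 0, SHN_ABS)
    return symtab, strtab
```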

    Thread Local Storage

    AArch32 uses a variant of TLS Variant I: the static TLS blocks are placed above the thread pointer. The thread pointer points to the end of the thread control block.

    The linker doesn't perform TLS optimization.

    The traditional general dynamic and local dynamic TLS models are used by default. A TLSDESC ABI exists.

    See All about thread-local storage.

    Thunks

    A destination not reachable by the branch instruction needs a range extension thunk.
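    As a rough illustration of "reachable", the sketch below checks whether a branch displacement fits the signed immediate of the instruction. The function name and table are hypothetical; the widths follow the branch encodings (A32 B/BL: 26-bit signed byte offset, Thumb BL/B.W with J1 J2: 25 bits, Thumb BL without J1 J2: 23 bits, Thumb-2 conditional B.W: 21 bits). State changes, which can also require a thunk, are not modeled.

```python
# Signed width, in bits, of the byte displacement each relocation can encode.
RANGE_BITS = {
    "R_ARM_PC24": 26, "R_ARM_PLT32": 26, "R_ARM_JUMP24": 26, "R_ARM_CALL": 26,
    "R_ARM_THM_CALL": 25, "R_ARM_THM_JUMP24": 25, "R_ARM_THM_JUMP19": 21,
}


def in_branch_range(reloc, place, target, has_j1j2=True):
    """Return True if `target` is reachable from a branch at `place`."""
    thumb = reloc.startswith("R_ARM_THM_")
    bits = RANGE_BITS[reloc]
    if thumb and not has_j1j2 and reloc != "R_ARM_THM_JUMP19":
        bits = 23  # Thumb BL without the J1 J2 range extension
    # The PC reads as the instruction address plus 8 (ARM) or 4 (Thumb).
    disp = target - (place + (4 if thumb else 8))
    return -(1 << (bits - 1)) <= disp < (1 << (bits - 1))
```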

    v4, v4T

    ARMv4T introduced the 16-bit Thumb instruction set. These processors do not support BLX. ARM/Thumb state change must be done using BX.

    In the ARM state, the relocation types R_ARM_PC24/R_ARM_PLT32/R_ARM_JUMP24/R_ARM_CALL may need range extension or state change. lld 16.0.0 has added thunk support for Thumb. We can build Game Boy Advance or Nintendo DS ROMs using Thumb code with ld.lld.

    // ARM to ARM, absolute
    ldr pc, [pc, #-4] ; L1
    L1: .word S

    // ARM to Thumb, absolute
    ldr r12, [pc] ; L1
    bx r12
    L1: .word S

    // ARM to ARM, position-independent
    ldr ip, [pc] ; L2
    L1: add pc, pc, ip
    L2: .word S - (L1 + 8)

    // ARM to Thumb, position-independent
    // (the literal sits one instruction further away, hence #4)
    ldr ip, [pc, #4] ; L2
    L1: add ip, pc, ip
    bx ip
    L2: .word S - (L1 + 8)
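    The position-independent literal works because the PC reads as the current instruction address plus 8 in the ARM state. Below is a small sketch, with hypothetical helper names, that builds the "ARM to ARM, position-independent" thunk and checks the arithmetic by modeling the two instructions. The instruction words are the A32 encodings of `ldr ip, [pc]` and `add pc, pc, ip`.

```python
import struct

LDR_IP_PC = 0xE59FC000     # ldr ip, [pc]
ADD_PC_PC_IP = 0xE08FF00C  # add pc, pc, ip


def arm_pi_thunk(thunk_addr, target):
    """Return the 12 thunk bytes for a thunk placed at thunk_addr."""
    l1 = thunk_addr + 4                           # address of the add
    literal = (target - (l1 + 8)) & 0xFFFFFFFF    # .word S - (L1 + 8)
    return struct.pack("<III", LDR_IP_PC, ADD_PC_PC_IP, literal)


def simulated_destination(thunk_addr, code):
    """Model the two instructions and return the resulting PC."""
    _, _, literal = struct.unpack("<III", code)
    ip = literal                   # ldr ip, [pc] loads the word at thunk_addr + 8
    pc_at_l1 = thunk_addr + 4 + 8  # PC as read by the add at L1
    return (pc_at_l1 + ip) & 0xFFFFFFFF
```

    Because the literal is a difference of two addresses, the thunk keeps working when the whole image is loaded at a different base.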

    R_ARM_THM_CALL

    // Each thunk starts in the Thumb state; bx pc switches to the ARM state,
    // so the instructions after b #-6 are ARM code.

    // Thumb to ARM, absolute
    bx pc
    b #-6
    ldr pc, [pc, #-4] ; L1
    L1: .word S

    // Thumb to Thumb, absolute
    bx pc
    b #-6
    ldr ip, [pc] ; L1
    bx ip
    L1: .word S

    // Thumb to ARM, position-independent
    bx pc
    b #-6
    ldr ip, [pc] ; L2
    L1: add pc, pc, ip
    L2: .word S - (L1 + 8)

    // Thumb to Thumb, position-independent
    bx pc
    b #-6
    ldr ip, [pc, #4] ; L2
    L1: add ip, pc, ip
    bx ip
    L2: .word S - (L1 + 8)

    v5, v6, v6-K, v6-KZ

    These are pre-Cortex processors that support BLX, but not Thumb branch range extension or MOVT/MOVW.

    There is no Thumb branch instruction in ARMv5 that supports thunks. LDR can switch processor states.

    // absolute (LDR can switch processor states)
    ldr pc, [pc, #-4] ; L1
    L1: .word S

    // position-independent
    ldr ip, [pc, #4] ; L2
    L1: add ip, pc, ip
    bx ip
    L2: .word S - (L1 + 8)

    v6-M

    Only Thumb instructions are supported. These processors support BLX and J1 J2 encodings (branch range extension), but not MOVT/MOVW.

    // near
    b.w S

    // far, absolute
    push {r0, r1}
    ldr r0, [pc, #4] ; L1
    str r0, [sp, #4]
    pop {r0, pc}
    L1: .word S

    // far, position-independent
    push {r0}
    ldr r0, [pc, #8] ; L2
    mov ip, r0
    pop {r0}
    L1: add pc, ip
    nop
    L2: .word S - (L1 + 4)

    v6T2, v7 (v7-A, v7-R, v7-M, v7E-M) and newer

    ARMv6T2 introduced Thumb-2. These processors support BLX/MOVT/MOVW and Thumb branch range extension. (All architectures used in Cortex processors with the exception of v6-M and v6S-M have the MOVT/MOVW instructions.)

    In the ARM state, the relocation types R_ARM_PC24, R_ARM_PLT32, R_ARM_JUMP24, and R_ARM_CALL may need a thunk for range extension or state change.

    // near and the destination is in the ARM state
    b S

    // absolute
    movw ip, :lower16:S
    movt ip, :upper16:S
    bx ip

    // position-independent
    movw ip, :lower16:S-(L1+8)
    movt ip, :upper16:S-(L1+8)
    L1: add ip, ip, pc
    bx ip
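    The :lower16: and :upper16: operators select the two halves of a 32-bit value, which the A32 MOVW/MOVT encodings store as imm4:imm12. A minimal sketch of that encoding for the ARM-state thunks follows; the helper names are hypothetical, and the base opcodes are the A32 encodings of `movw ip, #0` and `movt ip, #0`.

```python
MOVW_IP = 0xE300C000  # movw ip, #0
MOVT_IP = 0xE340C000  # movt ip, #0


def encode_imm16(opcode, imm16):
    """Place a 16-bit immediate into the imm4:imm12 fields of MOVW/MOVT."""
    return opcode | ((imm16 & 0xF000) << 4) | (imm16 & 0x0FFF)


def movw_movt_pair(value):
    """Return the (movw, movt) instruction words that load `value` into ip."""
    value &= 0xFFFFFFFF
    lower16 = value & 0xFFFF  # :lower16:
    upper16 = value >> 16     # :upper16:
    return encode_imm16(MOVW_IP, lower16), encode_imm16(MOVT_IP, upper16)


def pi_operand(target, l1):
    """Operand of the position-independent ARM thunk: S - (L1 + 8)."""
    return (target - (l1 + 8)) & 0xFFFFFFFF
```

    For the position-independent form, the value passed to movw_movt_pair would be pi_operand(S, L1) rather than S itself.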

    In the Thumb state, the relocation types R_ARM_THM_JUMP19, R_ARM_THM_JUMP24, and R_ARM_THM_CALL may need a thunk for range extension or state change.

    // near and the destination is in the Thumb state
    b.w S

    // absolute
    movw ip, :lower16:S
    movt ip, :upper16:S
    bx ip

    // position-independent
    movw ip, :lower16:S-(L1+4)
    movt ip, :upper16:S-(L1+4)
    L1: add ip, ip, pc
    bx ip

    --fix-cortex-a8

    This option enables a linker workaround for Arm Cortex-A8 Erratum 657417. The erratum can be triggered by a 4-byte Thumb-2 branch instruction (Bcc.w, B.w, BLX.w, BL.w) that spans two 4KiB pages, whose target address falls within the first region, and that follows a 4-byte non-branch instruction. Executing such a sequence may result in an incorrect instruction fetch or processor deadlock. Linkers scan the Thumb code for branches that match these conditions.

    Once an erratum condition is detected, linkers try to rewrite it into an alternative code sequence. See the comments in the implementations for details.
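    The detection conditions can be approximated as below. This is a simplified sketch with hypothetical function names: it only tests the address conditions and whether a halfword starts a 32-bit Thumb instruction, and it leaves the "is a branch" and "is a non-branch" classification to the caller. Real implementations decode the instructions themselves.

```python
PAGE_SIZE = 0x1000  # 4KiB


def starts_thumb32(first_halfword):
    """True if the halfword is the first half of a 32-bit Thumb instruction."""
    return (first_halfword & 0xE000) == 0xE000 and (first_halfword & 0x1800) != 0


def matches_erratum(branch_addr, target, prev_is_thumb32_non_branch):
    """Address conditions for a 32-bit Thumb branch located at branch_addr."""
    target &= ~1  # drop the Thumb bit, if present
    # The first halfword occupies the last two bytes of a page.
    spans_two_pages = branch_addr % PAGE_SIZE == PAGE_SIZE - 2
    # The destination lies in the page that holds the first halfword.
    target_in_first_page = target // PAGE_SIZE == branch_addr // PAGE_SIZE
    return spans_two_pages and target_in_first_page and prev_is_thumb32_non_branch
```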

    SHT_ARM_ATTRIBUTES

    Usually named .ARM.attributes.

    SHT_ARM_EXIDX (.ARM.exidx) and .ARM.extab

    See Stack unwinding.


