    Exploring the section layout in linker output

    Posted by MaskRay on 2023-12-18 04:45:13

    Let's begin with a Linux x86-64 example involving global variables that exhibit various properties, such as read-only versus writable and zero-initialized versus non-zero.

    #include <stdio.h>
    const int ro = 1;
    int w0, w1 = 1;
    int *const pw0 = &w0;
    int main() {
      printf("%d %d %d %p\n", ro, w0, w1, pw0);
    }

    % clang -c -fpie a.c
    % clang -pie -fuse-ld=lld -Wl,-z,separate-loadable-segments a.o -o a
    % objdump -wt a | grep -P 'main|w[01]|ro$'
    00000000000010f0 g F .text 000000000000002e main
    0000000000003044 g O .bss 0000000000000004 w0
    0000000000003010 g O .data 0000000000000004 w1
    000000000000058c g O .rodata 0000000000000004 ro
    0000000000002010 g O .data.rel.ro 0000000000000008 pw0
    % readelf -Wl a
    ...
    Program Headers:
    Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
    PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x000268 0x000268 R 0x8
    INTERP 0x0002a8 0x00000000000002a8 0x00000000000002a8 0x00001c 0x00001c R 0x1
    [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
    LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000628 0x000628 R 0x1000
    LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x000180 0x000180 R E 0x1000
    LOAD 0x002000 0x0000000000002000 0x0000000000002000 0x0001e0 0x001000 RW 0x1000
    LOAD 0x003000 0x0000000000003000 0x0000000000003000 0x000040 0x000048 RW 0x1000
    DYNAMIC 0x002018 0x0000000000002018 0x0000000000002018 0x0001a0 0x0001a0 RW 0x8
    GNU_RELRO 0x002000 0x0000000000002000 0x0000000000002000 0x0001e0 0x001000 R 0x1
    GNU_EH_FRAME 0x0005a0 0x00000000000005a0 0x00000000000005a0 0x00001c 0x00001c R 0x4
    GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0
    NOTE 0x0002c4 0x00000000000002c4 0x00000000000002c4 0x000020 0x000020 R 0x4
    ...

    (We will discuss -Wl,-z,separate-loadable-segments later.)

    We can see that these functions and global variables are placed in different sections.

    • .rodata: read-only data without dynamic relocations, constant in the link unit
    • .text: functions
    • .data.rel.ro: read-only data associated with dynamic relocations, constant after relocation resolving, part of the PT_GNU_RELRO segment
    • .data: writable data
    • .bss: writable data known to be zeros

    Section and segment layout

    TODO I may write more about how linkers lay out sections and segments.

    Anyhow, the linker will place .data and .bss in the same PT_LOAD program header (segment) and the rest into different PT_LOAD segments. (There are some nuances. If you use GNU ld's -z noseparate-code or lld's --no-rosegment, .rodata and .text will be placed in the same PT_LOAD segment.)

    The PT_LOAD segments have different flags (p_flags): PF_R, PF_R|PF_X, PF_R|PF_W.

    Subsequently, the dynamic loader, also known as the dynamic linker, will invoke mmap to map the file into memory using permissions specified by p_flags. For a PT_LOAD segment, its associated memory area starts at alignDown(p_vaddr, pagesize) and ends at alignUp(p_vaddr+p_memsz, pagesize).

        Start Addr           End Addr       Size     Offset  Perms  objfile
    0x555555554000 0x555555555000 0x1000 0x0 r--p /tmp/c/a
    0x555555555000 0x555555556000 0x1000 0x1000 r-xp /tmp/c/a
    0x555555556000 0x555555557000 0x1000 0x2000 r--p /tmp/c/a
    0x555555557000 0x555555558000 0x1000 0x3000 rw-p /tmp/c/a
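
    The Perms column above is derived from each segment's p_flags (the third mapping is the RELRO region, which ends up read-only after relocation processing). As a minimal sketch, and not the code of any particular loader, the translation from segment flags to mmap protection bits might look like the following. pf_to_prot and pf_to_prot_example are hypothetical names; the PF_* values are the ones defined by the ELF specification.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>

/* ELF segment permission bits (values from the ELF specification).
   <elf.h> provides them on Linux; they are repeated here so that the
   sketch does not depend on that header. */
#ifndef PF_X
#define PF_X 0x1
#define PF_W 0x2
#define PF_R 0x4
#endif

/* Hypothetical helper: translate p_flags into mmap/mprotect protection bits. */
int pf_to_prot(uint32_t p_flags) {
  int prot = PROT_NONE;
  if (p_flags & PF_R) prot |= PROT_READ;
  if (p_flags & PF_W) prot |= PROT_WRITE;
  if (p_flags & PF_X) prot |= PROT_EXEC;
  return prot;
}

/* Usage example: an executable segment (PF_R|PF_X) is mapped as r-x. */
void pf_to_prot_example(void) {
  assert(pf_to_prot(PF_R | PF_X) == (PROT_READ | PROT_EXEC));
}
```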

    Let's assume the page size is 4096 bytes. We'll calculate the alignDown(p_vaddr, pagesize) values and display them alongside the "Start Addr" values:

    Start Addr       alignDown(p_vaddr, pagesize)
    0x555555554000 0x0000000000000000
    0x555555555000 0x0000000000001000
    0x555555556000 0x0000000000002000
    0x555555557000 0x0000000000003000

    We observe that the start address equals the base address plus alignDown(p_vaddr, pagesize).
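
    The relationship can be written down as a small sketch. The helper names (align_down, align_up, struct mem_area, load_area) are hypothetical, and the base address is simply whatever the kernel picked for the first mapping (0x555555554000 in the table above).

```c
#include <assert.h>
#include <stdint.h>

/* Round x down to a multiple of align; align must be a power of two. */
uint64_t align_down(uint64_t x, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return x & ~(align - 1);
}

/* Round x up to a multiple of align; align must be a power of two. */
uint64_t align_up(uint64_t x, uint64_t align) {
  return align_down(x + align - 1, align);
}

/* Hypothetical type: the page-aligned address range of one PT_LOAD segment. */
struct mem_area {
  uint64_t start;
  uint64_t end;
};

/* Memory area of a PT_LOAD segment loaded at load base `base`. */
struct mem_area load_area(uint64_t base, uint64_t p_vaddr, uint64_t p_memsz,
                          uint64_t pagesize) {
  struct mem_area a;
  a.start = base + align_down(p_vaddr, pagesize);
  a.end = base + align_up(p_vaddr + p_memsz, pagesize);
  return a;
}
```

    For the executable segment above (p_vaddr 0x1000, p_memsz 0x180), load_area(0x555555554000, 0x1000, 0x180, 4096) gives the range 0x555555555000 to 0x555555556000, matching the second row of the mapping table.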

    --no-rosegment

        Start Addr           End Addr       Size     Offset  Perms  objfile
    0x555555554000 0x555555555000 0x1000 0x0 r-xp /tmp/c/a
    0x555555555000 0x555555556000 0x1000 0x0 r--p /tmp/c/a
    0x555555556000 0x555555557000 0x1000 0x1000 rw-p /tmp/c/a

    MAXPAGESIZE

    A page serves as the granularity at which memory exhibits different permissions, and within a page, we cannot have varying permissions. Using the previous example where p_align is 4096, if the page size is larger, for example, 65536 bytes, the program might crash.

    Typically, the dynamic loader allocates memory for the first PT_LOAD segment (PF_R) at a specific address allocated by the kernel. Subsequent PT_LOAD segments then overwrite the previous memory regions. Consequently, certain code pages or significant global variables might be replaced by garbage, leading to a crash.

    So, how can we create a link unit that works across different page sizes? We simply determine the maximum page size, let's say, 2097152, and then pass -z max-page-size=2097152 to the linker. The linker will set p_align values of PT_LOAD segments to MAXPAGESIZE. The program headers below were produced with a MAXPAGESIZE of 65536 (0x10000):

    Program Headers:
    Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
    PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x000268 0x000268 R 0x8
    INTERP 0x0002a8 0x00000000000002a8 0x00000000000002a8 0x00001c 0x00001c R 0x1
    [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
    LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000628 0x000628 R 0x10000
    LOAD 0x010000 0x0000000000010000 0x0000000000010000 0x000180 0x000180 R E 0x10000
    LOAD 0x020000 0x0000000000020000 0x0000000000020000 0x0001e0 0x001000 RW 0x10000
    LOAD 0x030000 0x0000000000030000 0x0000000000030000 0x000040 0x000048 RW 0x10000
    DYNAMIC 0x020018 0x0000000000020018 0x0000000000020018 0x0001a0 0x0001a0 RW 0x8
    GNU_RELRO 0x020000 0x0000000000020000 0x0000000000020000 0x0001e0 0x001000 R 0x1
    GNU_EH_FRAME 0x0005a0 0x00000000000005a0 0x00000000000005a0 0x00001c 0x00001c R 0x4
    GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0
    NOTE 0x0002c4 0x00000000000002c4 0x00000000000002c4 0x000020 0x000020 R 0x4

    In a linker script, the max-page-size can be obtained using CONSTANT(MAXPAGESIZE).

    -z separate-loadable-segments

    In previous examples using -z separate-loadable-segments, the p_vaddr values of PT_LOAD segments are multiples of MAXPAGESIZE. The generic ABI says "loadable process segments must have congruent values for p_vaddr and p_offset, modulo the page size."

    p_offset - This member gives the offset from the beginning of the file at which the first byte of the segment resides.

    p_vaddr - This member gives the virtual address at which the first byte of the segment resides in memory.

    This alignment requirement aligns with the mmap documentation. For example, Linux man-pages specifies, "offset must be a multiple of the page size as returned by sysconf(_SC_PAGE_SIZE)."

    The p_offset values are also multiples of MAXPAGESIZE. After laying out a PT_LOAD segment, the linker must pad the end by inserting zeros so that the next PT_LOAD segment starts at a multiple of MAXPAGESIZE.
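
    Both rules are easy to check numerically. The sketch below is only an illustration with hypothetical helper names (is_congruent, padding_to_next_segment); it is not taken from any linker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* gABI rule: p_vaddr and p_offset must be congruent modulo the page size.
   pagesize must be a power of two. */
bool is_congruent(uint64_t p_vaddr, uint64_t p_offset, uint64_t pagesize) {
  assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
  return ((p_vaddr - p_offset) & (pagesize - 1)) == 0;
}

/* Hypothetical helper: number of zero bytes needed after a segment whose
   file content ends at end_offset, so that the next segment starts at a
   multiple of maxpagesize (the -z separate-loadable-segments layout). */
uint64_t padding_to_next_segment(uint64_t end_offset, uint64_t maxpagesize) {
  assert(maxpagesize != 0 && (maxpagesize & (maxpagesize - 1)) == 0);
  return (maxpagesize - (end_offset & (maxpagesize - 1))) & (maxpagesize - 1);
}
```

    In the first example the read-only segment ends at file offset 0x628, so padding_to_next_segment(0x628, 0x1000) is 0x9d8 bytes, and with a 65536-byte MAXPAGESIZE it grows to 0xf9d8 bytes.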

    However, the alignment padding is wasteful. Fortunately, we can link a.o using different MAXPAGESIZE and different alignment settings: -z noseparate-code, -z separate-code, -z separate-loadable-segments.

    clang -pie -fuse-ld=lld -Wl,-z,noseparate-code a.o -o a0.4096
    clang -pie -fuse-ld=lld -Wl,-z,noseparate-code,-z,max-page-size=65536 a.o -o a0.65536
    clang -pie -fuse-ld=lld -Wl,-z,noseparate-code,-z,max-page-size=2097152 a.o -o a0.2097152

    clang -pie -fuse-ld=lld -Wl,-z,separate-code a.o -o a1.4096
    clang -pie -fuse-ld=lld -Wl,-z,separate-code,-z,max-page-size=65536 a.o -o a1.65536
    clang -pie -fuse-ld=lld -Wl,-z,separate-code,-z,max-page-size=2097152 a.o -o a1.2097152

    clang -pie -fuse-ld=lld -Wl,-z,separate-loadable-segments a.o -o a2.4096
    clang -pie -fuse-ld=lld -Wl,-z,separate-loadable-segments,-z,max-page-size=65536 a.o -o a2.65536
    clang -pie -fuse-ld=lld -Wl,-z,separate-loadable-segments,-z,max-page-size=2097152 a.o -o a2.2097152

    % stat -c %s a0.4096 a0.65536 a0.2097152
    6168
    6168
    6168
    % stat -c %s a1.4096 a1.65536 a1.2097152
    12392
    135272
    4198504
    % stat -c %s a2.4096 a2.65536 a2.2097152
    16120
    200440
    6295288

    We can derive two properties:

    • Under one MAXPAGESIZE, we have size(noseparate-code) < size(separate-code) < size(separate-loadable-segments).
    • For -z noseparate-code, increasing MAXPAGESIZE does not change the output size.

    -z noseparate-code

    How does -z noseparate-code work? Let's illustrate this with an example.

    At the end of the read-only PT_LOAD segment, the address is 0x628. Instead of starting the next segment at alignUp(0x628, MAXPAGESIZE) = 0x1000, we start at alignUp(0x628, MAXPAGESIZE) + 0x628 % MAXPAGESIZE = 0x1628. Since the .text section has an alignment (sh_addralign) of 16, we start at 0x1630. Although the address is advanced beyond necessity, the file offset (congruent to the address, modulo MAXPAGESIZE) can be decreased to 0x630, merely 8 bytes (due to alignment padding) after the previous section's end.

    Moving forward, the end of the executable PT_LOAD segment has an address of 0x17b0. Instead of starting the next segment at alignUp(0x17b0, MAXPAGESIZE) = 0x2000, we start at alignUp(0x17b0, MAXPAGESIZE) + 0x17b0 % MAXPAGESIZE = 0x27b0. While we advance the address more than needed, the file offset can be decreased to 0x7b0, precisely at the previous section's end.
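
    The arithmetic in the two paragraphs above can be sketched as follows. This is an illustration of the described calculation rather than lld's actual code, and the helper names (next_segment_addr, next_file_offset) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of align; align must be a power of two. */
uint64_t align_up(uint64_t x, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (x + align - 1) & ~(align - 1);
}

/* Address where the next PT_LOAD starts when the previous one ends at `end`:
   skip to the next page, but keep the same offset within the page. */
uint64_t next_segment_addr(uint64_t end, uint64_t maxpagesize) {
  return align_up(end, maxpagesize) + end % maxpagesize;
}

/* Smallest file offset >= prev_end_offset that is congruent to addr
   modulo maxpagesize. */
uint64_t next_file_offset(uint64_t addr, uint64_t prev_end_offset,
                          uint64_t maxpagesize) {
  assert(maxpagesize != 0 && (maxpagesize & (maxpagesize - 1)) == 0);
  return prev_end_offset + ((addr - prev_end_offset) & (maxpagesize - 1));
}
```

    With MAXPAGESIZE 0x1000, next_segment_addr(0x628, 0x1000) is 0x1628, which align_up(..., 16) turns into 0x1630, and next_file_offset(0x1630, 0x628, 0x1000) is 0x630. Repeating the calculation with MAXPAGESIZE 0x10000 moves the address to 0x10630 while the file offset stays at 0x630, which is consistent with the observation that the -z noseparate-code output size does not grow with MAXPAGESIZE.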

    % readelf -WSl a0.4096
    ...
    [Nr] Name Type Address Off Size ES Flg Lk Inf Al
    [ 0] NULL 0000000000000000 000000 000000 00 0 0 0
    [ 1] .interp PROGBITS 00000000000002a8 0002a8 00001c 00 A 0 0 1
    ...
    [12] .eh_frame PROGBITS 00000000000005c0 0005c0 000068 00 A 0 0 8
    [13] .text PROGBITS 0000000000001630 000630 00011e 00 AX 0 0 16
    ...
    [16] .plt PROGBITS 0000000000001780 000780 000030 00 AX 0 0 16
    [17] .fini_array FINI_ARRAY 00000000000027b0 0007b0 000008 08 WA 0 0 8
    ...
    [20] .dynamic DYNAMIC 00000000000027c8 0007c8 0001a0 10 WA 7 0 8
    [21] .got PROGBITS 0000000000002968 000968 000028 00 WA 0 0 8
    [22] .relro_padding NOBITS 0000000000002990 000990 000670 00 WA 0 0 1
    [23] .data PROGBITS 0000000000003990 000990 000014 00 WA 0 0 8
    ...
    [26] .bss NOBITS 00000000000039d0 0009d0 000008 00 WA 0 0 4
    ...
    LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000628 0x000628 R 0x1000
    LOAD 0x000630 0x0000000000001630 0x0000000000001630 0x000180 0x000180 R E 0x1000
    LOAD 0x0007b0 0x00000000000027b0 0x00000000000027b0 0x0001e0 0x000850 RW 0x1000
    LOAD 0x000990 0x0000000000003990 0x0000000000003990 0x000040 0x000048 RW 0x1000
    DYNAMIC 0x0007c8 0x00000000000027c8 0x00000000000027c8 0x0001a0 0x0001a0 RW 0x8
    GNU_RELRO 0x0007b0 0x00000000000027b0 0x00000000000027b0 0x0001e0 0x000850 R 0x1

    -z separate-code performs the trick when transitioning from the first RW PT_LOAD segment to the second, whereas -z separate-loadable-segments doesn't.

    When MAXPAGESIZE is larger than the actual page size

    Let's consider two adjacent PT_LOAD segments. The memory area associated with the first segment ends at alignUp(load[0].p_vaddr+load[0].p_memsz, pagesize) while the memory area associated with the second one starts at alignDown(load[1].p_vaddr, pagesize). When the actual page size equals MAXPAGESIZE, the two addresses are identical. However, if the actual page size is smaller, a gap emerges between these addresses.

    A typical link unit generally presents three gaps. These gaps might either be unmapped or mapped. When mapped, they necessitate struct vm_area_struct objects within the Linux kernel. As of Linux 6.3.13, the size of struct vm_area_struct is 152 bytes. For instance, 10000 mapped object files would require 10000 * 3 * sizeof(struct vm_area_struct) = 4,560,000 bytes, signifying a considerable memory footprint. You can refer to "Extra struct vm_area_struct with ---p created when PAGE_SIZE < max-page-size".
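
    As a rough sketch of the two calculations above (the helper names gap_bytes and gap_vma_cost are hypothetical, and 152 is the struct vm_area_struct size quoted for Linux 6.3.13):

```c
#include <assert.h>
#include <stdint.h>

/* Round x down/up to a multiple of align; align must be a power of two. */
uint64_t align_down(uint64_t x, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return x & ~(align - 1);
}
uint64_t align_up(uint64_t x, uint64_t align) {
  return align_down(x + align - 1, align);
}

/* Size of the gap between the memory areas of two adjacent PT_LOAD segments
   for a given runtime page size. */
uint64_t gap_bytes(uint64_t prev_vaddr, uint64_t prev_memsz,
                   uint64_t next_vaddr, uint64_t pagesize) {
  uint64_t prev_end = align_up(prev_vaddr + prev_memsz, pagesize);
  uint64_t next_start = align_down(next_vaddr, pagesize);
  return next_start > prev_end ? next_start - prev_end : 0;
}

/* Kernel memory spent on gap VMAs, assuming every gap is mapped. */
uint64_t gap_vma_cost(uint64_t files, uint64_t gaps_per_file,
                      uint64_t vma_size) {
  return files * gaps_per_file * vma_size;
}
```

    Taking the first two PT_LOAD segments of the MAXPAGESIZE=65536 layout shown earlier (p_vaddr 0 with p_memsz 0x628, followed by p_vaddr 0x10000), gap_bytes(0, 0x628, 0x10000, 4096) is 0xf000 on a 4096-byte-page system and 0 when the page size is 65536. gap_vma_cost(10000, 3, 152) reproduces the 4,560,000-byte figure.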

    Dynamic loaders typically invoke mmap using PROT_READ, encompassing the whole file, followed by multiple mmap calls using MAP_FIXED and the corresponding flags. When dynamic loaders, like musl, don't process gaps, the gaps retain r--p permissions. However, in glibc's elf/dl-map-segments.h, the has_holes code employs mprotect to transition permissions from r--p to ---p.

    While ---p might be perceived as a security enhancement, personally, I don't believe it significantly impacts exploitability. While there might be numerous gadgets in r-xp areas, reducing gadgets in r--p areas doesn't seem notably impactful. (https://isopenbsdsecu.re/mitigations/rop_removal/)

    Unmap the gap

    When the Linux kernel loads the executable and its interpreter (if present) (fs/binfmt_elf.c), the gap gets unmapped, thereby freeing a struct vm_area_struct object. Implementing a similar approach in dynamic loaders could yield comparable savings.

    However, unmapping the gap carries the risk of an unrelated future mmap occupying the gap:

    564d8e90f000-564d8e910000 r--p 00000000 08:05 2519504        /sample/build/main
    ================ an unrelated mmap may be placed in the gap
    564d8e91f000-564d8e920000 r-xp 00010000 08:05 2519504 /sample/build/main

    It is not clear whether the potential occurrence of an unrelated mmap is considered a regression in security. Personally, I don't think this poses a significant issue as the program does not access the gaps. This property can be guaranteed for direct access when input relocations to the linker use symbols with in-bounds addends (e.g. when x is defined relative to an input section, we know R_X86_64_PC32(x) must be in-bounds).

    However, some programs may expect contiguous mapped areas of a file (such as when glibc link_map::l_contiguous is set to 1). Does this choice render the program exploitable if an attacker can ensure a map within the gap instead of outside the file? It seems to me that they could achieve everything with a map outside of the file.

    Having said that, the presence of an unrelated map between maps associated with a single file descriptor remains odd, so it's preferable to avoid it if possible.

    Extend the memory area to cover the gap

    The loader code can adjust p_filesz and p_memsz when invoking mmap.

    564d8e90f000-564d8e91f000 r--p 00000000 08:05 2519504        /sample/build/main  (the end is extended)
    564d8e91f000-564d8e920000 r-xp 00010000 08:05 2519504 /sample/build/main

    This appears to be the best solution.
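
    A loader taking this approach would need to pick the end address of each mapping so that it reaches the start of the next segment. The following is only a sketch of that calculation under stated assumptions: extended_area_end is a hypothetical helper, addresses are relative to the load base, and the p_memsz value in the usage note is borrowed from the earlier example rather than from /sample/build/main.

```c
#include <assert.h>
#include <stdint.h>

/* Round x down/up to a multiple of align; align must be a power of two. */
uint64_t align_down(uint64_t x, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return x & ~(align - 1);
}
uint64_t align_up(uint64_t x, uint64_t align) {
  return align_down(x + align - 1, align);
}

/* End of the mapping for a PT_LOAD segment when the loader extends it to
   cover the gap before the next segment. If there is no gap (or the
   segments already touch), the natural page-aligned end is returned. */
uint64_t extended_area_end(uint64_t p_vaddr, uint64_t p_memsz,
                           uint64_t next_p_vaddr, uint64_t pagesize) {
  uint64_t natural_end = align_up(p_vaddr + p_memsz, pagesize);
  uint64_t next_start = align_down(next_p_vaddr, pagesize);
  return next_start > natural_end ? next_start : natural_end;
}
```

    With a first segment at p_vaddr 0 (assumed p_memsz 0x628), a second segment at p_vaddr 0x10000, and 4096-byte pages, the natural end is 0x1000 while extended_area_end(0, 0x628, 0x10000, 4096) is 0x10000. Added to the load base 0x564d8e90f000, that gives 0x564d8e91f000, the extended end shown in the mapping above.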

    A new linker option?

    Personally, I favor the end-extending approach. I've also pondered whether this falls under the purview of linkers. Such a change seems intrusive and unsightly. If the linker extends the end of p_memsz to cover the gap, should it also extend p_filesz?

    • If it doesn't, we create a PT_LOAD with p_filesz/p_memsz that is not for BSS, which is weird.
    • If it does, we have an output file featuring overlapping file offset ranges, which is weird as well.

    Moreover, a PT_LOAD whose end isn't backed by a section is unusual. I'm concerned that many binary manipulation tools may not handle this case correctly. Utilizing a linker script can intentionally create discontiguous address ranges. I'm concerned that the linker might not discern such cases with intelligent logic regarding p_filesz/p_memsz.

    This feature request seems to be within the realm of loaders, and specific information, such as the page size, is only accessible to loaders. I believe loaders are better equipped to handle this task.

    Transparent huge pages for mapped files

    Some programs optimize their usage of the limited Translation Lookaside Buffer (TLB) by employing transparent huge pages. When the Linux kernel loads an executable, it takes into account the p_align field to create a memory area. If p_align is 4096, the memory area will commence at a multiple of 4096, but not necessarily at a multiple of a huge page.

    Transparent huge pages for mapped files require both the start address and the start file offset to align with a huge page. To ensure compatibility with MADV_HUGEPAGE, linking the executable using -z max-page-size= with the huge page size is recommended. However, in -z noseparate-code layouts, the file content might start somewhere at the first page, potentially wasting half a huge page on unrelated content.
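
    The alignment requirement stated above can be expressed as a simple predicate. This is a sketch with a hypothetical helper name (thp_aligned); the 2 MiB huge page size used in the usage note is an assumption that matches the 2097152 value used for max-page-size earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if both the mapping start address and its file offset are multiples
   of the huge page size, which is what the text above requires for a
   file-backed transparent huge page. hugepagesize must be a power of two. */
bool thp_aligned(uint64_t map_start, uint64_t file_offset,
                 uint64_t hugepagesize) {
  assert(hugepagesize != 0 && (hugepagesize & (hugepagesize - 1)) == 0);
  return ((map_start | file_offset) & (hugepagesize - 1)) == 0;
}
```

    For instance, thp_aligned(0x555555555000, 0x1000, 0x200000) is false for the executable segment of the first example, whereas a segment mapped at 0x400000 from file offset 0x200000 satisfies the predicate.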

    Switching to -z separate-code allows reclaiming the benefits of the half huge page but increases the file size. Balancing these aspects poses a challenge. One potential solution is using fallocate(FALLOC_FL_PUNCH_HOLE), which introduces complexity into the linker. However, this approach feels like a workaround to address a kernel limitation. It would be preferable if a file-backed huge page didn't necessitate a file offset aligned to a huge page boundary.

    Cost of RELRO

    To accommodate PT_GNU_RELRO, the RW PT_LOAD segment will possess two permissions after the runtime linker maps the program. While lld employs two explicit RW PT_LOAD segments, GNU ld provides one RW segment split by the runtime linker. Ultimately, the effects of lld and GNU ld are similar.

    Due to RELRO, covering the two RW PT_LOAD segments necessitates a minimum of 2 huge pages. In contrast, without RELRO, only one huge page is required at minimum. This means potentially wasting up to MAXPAGESIZE-1 bytes, which could otherwise be utilized by huge pages to cover more data.

    Nowadays, RELRO is considered a security baseline and removing it might unsettle security-minded individuals.


