This article describes SHF_ALLOC|SHF_COMPRESSED sections in ELF and a proposed linker option --compress-sections to compress arbitrary sections.
In recent years, metadata sections have gained more uses. Some users may find compression attractive if they prioritize file size over decompression overhead.
Some metadata sections are non-SHF_ALLOC, like DWARF debug sections. They may be unused by a runtime library but provide additional information for an analysis tool to inspect the content.
Other metadata sections have the SHF_ALLOC flag and are therefore part of a PT_LOAD segment. When the program is executed, both the compressed and uncompressed copies will be in memory. This may appear inefficient, since the runtime library needs to allocate a buffer for the decompressed content. Why do some people accept this tradeoff? They may prioritize file size, or they may simply accept the inefficiency.
Some factors may lean more toward compression.
Anyhow, compression is sometimes useful, and designers may want to implement compression within the format. However, this would lead to duplicated code, considering that we have a generic feature at the object file format level: ELF SHF_COMPRESSED.
SHF_COMPRESSED sections
Compressed debug sections, which I wrote last year, describes SHF_COMPRESSED. Currently, its use cases are limited to debug sections.
Many ELF linkers provide --compress-debug-sections=[zlib|zstd] to compress .debug_* sections. This functionality can be extended to arbitrary sections. I filed a GNU ld feature request in 2021, and recently decided to push forward with the effort. I have implemented [ELF] Add --compress-sections in ld.lld.
My proposed syntax is --compress-sections <section-glob>=[zlib|zstd]. If an output section matches the glob pattern, it will be compressed using the specified format: zlib or zstd. A natural question arises: if a SHF_ALLOC section is matched, should the linker compress the section?
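To make the proposed matching rule concrete, here is a minimal Python sketch. It assumes shell-style glob semantics as provided by Python's fnmatch module, which may differ in detail from the glob matcher in ld.lld. The helper names parse_option and pick_format are hypothetical, and letting the last matching option win is an assumption of this sketch rather than documented linker behavior.

```python
from fnmatch import fnmatchcase

SUPPORTED_FORMATS = {"zlib", "zstd"}


def parse_option(value):
    """Split a '<section-glob>=<format>' option value into (glob, format)."""
    glob, sep, fmt = value.rpartition("=")
    if not sep or not glob or fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"expected <section-glob>=[zlib|zstd], got {value!r}")
    return glob, fmt


def pick_format(section_name, options):
    """Return the format for an output section, or None if no glob matches.

    Assumption of this sketch: when several options match, the last one wins.
    """
    chosen = None
    for glob, fmt in options:
        if fnmatchcase(section_name, glob):
            chosen = fmt
    return chosen


# Two hypothetical command-line options.
options = [parse_option("foo*=zlib"), parse_option(".debug_*=zstd")]
```

With these options, an output section named foo would be compressed with zlib, .debug_info with zstd, and .text would be left alone.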
SHF_ALLOC|SHF_COMPRESSED sections
I believe the answer is yes. However, I have concerns regarding non-compliance with the ELF standard, as stated in the current generic ABI documentation:
SHF_COMPRESSED - This flag identifies a section containing compressed data. SHF_COMPRESSED applies only to non-allocable sections, and cannot be used in conjunction with SHF_ALLOC. In addition, SHF_COMPRESSED cannot be applied to sections of type SHT_NOBITS.
I believe this restriction is unnecessary for relocatable object files, executables, and shared objects alike. Therefore, I have proposed a change in the generic-abi group to remove the incompatibility between SHF_COMPRESSED and SHF_ALLOC in the wording.
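The restriction and the proposed relaxation can be stated as a small predicate. This is a Python sketch: the flag and type values are the standard ELF constants, while the function name and the relaxed parameter are hypothetical and only model the proposed wording change.

```python
SHF_ALLOC = 0x2
SHF_COMPRESSED = 0x800
SHT_PROGBITS = 1
SHT_NOBITS = 8


def compressed_flag_allowed(sh_type, sh_flags, relaxed=False):
    """Check whether SHF_COMPRESSED may appear on a section.

    relaxed=False models the current generic ABI wording quoted above;
    relaxed=True models the proposed change that permits SHF_ALLOC.
    """
    if not sh_flags & SHF_COMPRESSED:
        return True  # nothing to check
    if sh_type == SHT_NOBITS:
        return False  # no file content to compress, under either wording
    if sh_flags & SHF_ALLOC and not relaxed:
        return False  # rejected only by the current wording
    return True
```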
SHF_ALLOC|SHF_COMPRESSED sections in relocatable object files are fine as long as the linker is capable of decompressing the content. It is also important to ensure that non-linker consumers are compatible with such sections, which is relatively easy to confirm.
For linker output (executable or shared object), having both the SHF_ALLOC and SHF_COMPRESSED flags should be permissible. For example, we can use PC-relative relocations in a metadata section to reference text sections. These relocations will be resolved at link time.
...
leaq __start_foo(%rip), %rdi
leaq __stop_foo(%rip), %rsi
call runtime
.section .text.foo,"ax"
nop
.section .text.bar,"ax"
nop
.section foo,"a"
.p2align 3
.quad .text.foo-.
.quad .text.bar-.
Section headers are optional in executables and shared objects and are ignored by runtime loaders. The runtime loader does not require any special handling for the SHF_COMPRESSED flag.
When the program is executed, the runtime library responsible for the metadata section will allocate memory and decompress the section's content. Let's say foo has the address 0x201000 in memory. If we allocate the decompressed buffer at address 0x202000, we need to add the offset 0x201000-0x202000 = -0x1000 to the label differences in foo.
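The arithmetic can be checked with a small Python sketch. The addresses 0x201000 and 0x202000 come from the example above; the address chosen for .text.foo (0x205000) and the function names are made up for illustration.

```python
import struct

SEC_ADDR = 0x201000   # link-time address of foo (where the compressed bytes live)
BUF_ADDR = 0x202000   # where the runtime library put the decompressed copy
TEXT_FOO = 0x205000   # hypothetical address of .text.foo


def link_time_content(targets):
    """Emulate the linker resolving `.quad target-.` for each entry.

    The linker computes each value as if the uncompressed section were
    located at SEC_ADDR.
    """
    out = b""
    for i, target in enumerate(targets):
        place = SEC_ADDR + 8 * i
        out += struct.pack("<q", target - place)
    return out


def resolve(buf, index):
    """Recover the absolute target from entry `index` of the decompressed copy."""
    (diff,) = struct.unpack_from("<q", buf, 8 * index)
    place_in_buf = BUF_ADDR + 8 * index
    adjust = SEC_ADDR - BUF_ADDR  # 0x201000 - 0x202000 = -0x1000
    return place_in_buf + diff + adjust


content = link_time_content([TEXT_FOO, TEXT_FOO + 0x10])
```

Without the adjustment, each resolved address would be off by 0x1000, because the entry is read from the buffer rather than from the place the linker assumed.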
Section decompression imposes certain limitations on use cases.
First, the sections cannot have dynamic relocations. Relocation offsets are relative to the decompressed section instead of the compressed section. However, the runtime loader doesn't know there are compressed sections, so it would blindly apply the relocation. In the common case that the compressed section is smaller, the runtime loader will modify bytes in a subsequent section or an unmapped address, leading to runtime crashes or corruption of writable data sections.
In general, absolute references should be replaced with label differences.
.section foo.wrong,"a"
.p2align 3
.quad .text.foo # wrong
.quad .text.bar # wrong
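To see why dynamic relocations are unsafe here, the following Python sketch compresses a highly compressible section and compares a relocation offset, expressed relative to the uncompressed content, with the bytes that actually occupy the section in memory. The section contents and the offset 0x800 are made up; zlib from the Python standard library stands in for the linker's compressor, and the 24-byte allowance models an Elf64_Chdr header.

```python
import zlib

CHDR64_SIZE = 24  # sizeof(Elf64_Chdr)

uncompressed = bytes(0x1000)  # hypothetical 4 KiB section of zeros
in_memory_size = CHDR64_SIZE + len(zlib.compress(uncompressed))

r_offset_in_section = 0x800   # where a dynamic relocation would write


def write_lands_outside(offset, size, width=8):
    """True if a `width`-byte write at `offset` is not fully inside the section."""
    return offset + width > size


outside = write_lands_outside(r_offset_in_section, in_memory_size)
```

Here the write at offset 0x800 falls past the end of the compressed bytes, so a loader applying it would touch whatever follows the section.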
Second, a symbol defined relative to an input section designates an offset relative to the compressed content. If the program tries to read the content at the symbol, it will load a random location from the compressed content or another section. Similar to the arithmetic example above (0x201000-0x202000 = -0x1000), the runtime library may adjust the symbol value by addr(compressed)-addr(decompressed) to obtain an offset relative to the decompressed content. If there are two symbols defined relative to one or two input sections, taking the difference can be more convenient.
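A sketch of that adjustment in Python, using the same two section addresses as the arithmetic example; the symbol values and the function name are hypothetical.

```python
ADDR_COMPRESSED = 0x201000    # link-time address of the section
ADDR_DECOMPRESSED = 0x202000  # buffer allocated by the runtime library


def to_decompressed(symbol_value):
    """Map a link-time symbol value to the matching address in the buffer."""
    return symbol_value - (ADDR_COMPRESSED - ADDR_DECOMPRESSED)


sym_a = 0x201010  # hypothetical symbols defined inside the section
sym_b = 0x201040
distance = sym_b - sym_a
```

The difference of the two symbols is the same before and after the adjustment, which is why working with differences can be more convenient.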
Third, there is a circular dependency between the compressed section content and the section layout.
- Label differences (e.g. .quad .text.foo-.) change values when the text section address changes.
- Location counter assignments (e.g. . += expr;) in an output section description may change.
SECTIONS {
If the linker compresses the section content at one step, the compressed content will be invalidated when the section content or size changes.
A sophisticated implementation can run compression and section content/size computation iteratively in a loop. In my ld.lld implementation, I just compute the uncompressed section size and content once, apply compression, and then perform other computations that need to be done iteratively. Since most metadata sections don't need an output section description using a linker script, I believe the limitation is acceptable.
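The iterative approach can be illustrated with a toy model in Python. This is not how ld.lld or any real linker is structured: the layout rule (the text section starts at the next 0x1000 boundary after the compressed section), the function names, and the addresses are all invented for the sketch, and zlib from the standard library stands in for the linker's compressor.

```python
import struct
import zlib

FOO_ADDR = 0x201000
CHDR64_SIZE = 24  # sizeof(Elf64_Chdr)


def align_up(value, alignment):
    """Round `value` up to a multiple of `alignment` (a power of two)."""
    return (value + alignment - 1) & -alignment


def foo_content(text_addr):
    """Two `.quad target-.` entries, resolved against the current layout."""
    targets = [text_addr, text_addr + 1]
    return b"".join(struct.pack("<q", t - (FOO_ADDR + 8 * i))
                    for i, t in enumerate(targets))


def layout(max_iterations=8):
    """Recompute content and compressed size until the layout stops changing."""
    text_addr = FOO_ADDR  # initial guess: foo occupies no space
    for iteration in range(1, max_iterations + 1):
        compressed = zlib.compress(foo_content(text_addr))
        new_text_addr = align_up(FOO_ADDR + CHDR64_SIZE + len(compressed), 0x1000)
        if new_text_addr == text_addr:
            return text_addr, compressed, iteration
        text_addr = new_text_addr
    raise RuntimeError("layout did not converge")


text_addr, compressed, iterations = layout()
```

In this toy model the coarse alignment makes the loop settle after the second pass. A real layout has no such guarantee, which is one reason a single-pass implementation is simpler.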
SHF_ALLOC provides protection from strip, which can be seen as a secondary benefit. I think it is valuable in practice, as linking and stripping are separate steps, and developers may have less control over the strip step.
In the GNU world, I think distributions have some requirements or restrictions on strip options. There was even a post, "using section flags to indicate stripable or persistent sections", discussing whether we want a section flag to avoid stripping.
Discerning between compressed and uncompressed sections
The runtime library can identify the section content with a pair of encapsulation symbols __start_<sectionname> and __stop_<sectionname>. These symbols are defined relative to the output section. In my ld.lld implementation, __stop_<sectionname> - __start_<sectionname> equals the compressed section size.
Note: if we use linker scripts to define a symbol in the middle of the output section, it will not have clear semantics. Such constructs should be avoided.
The content at __start_<sectionname> may be uncompressed, starting with a regular metadata section header, or compressed, starting with an Elf32_Chdr/Elf64_Chdr header.
typedef struct {
The metadata section header may start with a magic number. In practice, ch_type can only be 1 (ELFCOMPRESS_ZLIB) or 2 (ELFCOMPRESS_ZSTD). When I added ELFCOMPRESS_ZSTD (https://groups.google.com/g/generic-abi/c/satyPkuMisk/m/xRqMj8M3AwAJ), I acknowledged that we did not intend to include a plethora of formats. So even if we want to be a good ELF citizen and avoid colliding with future ELF extensions, there are still numerous values available for allocation.
While there is a slight risk of collision, it is not a significant cause for concern. Metadata sections generally face fewer backward compatibility restrictions, since prebuilt libraries with specific instrumentation are considered awkward and are therefore uncommon. In the rare circumstance that the metadata section header collides with an Elf32_Chdr/Elf64_Chdr header, we can update the metadata section header.
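As a sketch of how a runtime library might tell the two cases apart, the following Python code inspects the bytes between the encapsulation symbols. The metadata magic b"META" and the function names are hypothetical; the Elf64_Chdr field order (ch_type, ch_reserved, ch_size, ch_addralign) and the ELFCOMPRESS_* values are the standard ones. It assumes a little-endian ELF64 target and handles only zlib, using the Python standard library.

```python
import struct
import zlib

ELFCOMPRESS_ZLIB = 1
ELFCOMPRESS_ZSTD = 2
CHDR64 = struct.Struct("<IIQQ")  # ch_type, ch_reserved, ch_size, ch_addralign
MAGIC = b"META"                  # hypothetical metadata magic number


def load_section(data):
    """Return the uncompressed metadata given the bytes in [__start, __stop)."""
    if data.startswith(MAGIC):
        return data  # stored uncompressed
    ch_type, _reserved, ch_size, _align = CHDR64.unpack_from(data)
    if ch_type == ELFCOMPRESS_ZLIB:
        payload = zlib.decompress(data[CHDR64.size:])
    else:
        raise ValueError(f"unsupported ch_type {ch_type}")
    if len(payload) != ch_size or not payload.startswith(MAGIC):
        raise ValueError("corrupt compressed metadata section")
    return payload


def compress_section(data, align=8):
    """Build an Elf64_Chdr followed by a zlib stream, as a linker would emit."""
    return CHDR64.pack(ELFCOMPRESS_ZLIB, 0, len(data), align) + zlib.compress(data)


raw = MAGIC + bytes(range(64))
```

Because the first four bytes of the hypothetical magic do not decode to 1 or 2, the two layouts cannot be confused in this sketch. A real format would pick its magic with the same consideration.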
You may read the first paragraph of C++ standard library ABI compatibility, where I describe different compatibility guarantees. In reality, for many metadata sections, mixing different versions is unsupported. If mixing versions works for a user, great. If not, the user should know that the onus is on them.
The description has primarily focused on ELF, which is the most widely used object file format for server platforms. Many metadata sections are designed specifically for ELF and prioritize it as the primary platform.
In contrast, PE/COFF and Mach-O, while popular in their respective operating systems, have comparatively lower demand for certain metadata features. However, it would be beneficial if these formats also supported a generic section compression feature.
Designing metadata sections can be challenging due to limitations in PE/COFF and Mach-O. For instance, COFF ARM64 does not support IMAGE_REL_ARM64_REL64. My https://reviews.llvm.org/D104564 introduced the use of .quad .text.foo-. as a workaround. Mach-O requires atoms (non-temporary labels) as separators for subsections.