UNDER CONSTRUCTION
This article describes SHF_ALLOC|SHF_COMPRESSED sections in ELF and a proposed linker option --compress-sections to compress arbitrary sections.
In recent years, metadata sections have gained more uses. Some users may find compression attractive if they prioritize file size over decompression overhead.
Some metadata sections are non-SHF_ALLOC, like DWARF debug sections. They may be unused by a runtime library but provide additional information for an analysis tool to inspect the content.
Other metadata sections have the SHF_ALLOC flag and therefore are part of a PT_LOAD segment. If such a section is compressed, both the compressed and uncompressed copies will be in memory at run time. This may appear inefficient, since the runtime library needs to allocate a buffer for the decompressed content. Why do some people accept this tradeoff? Well, they may prioritize file size, or they simply accept this inefficiency.
Some factors may lean more toward compression.
Anyhow, compression is sometimes useful and designers may want to implement compression within the format. However, this would lead to duplicated code, considering that we have a generic feature at the object file format level: ELF SHF_COMPRESSED.
SHF_COMPRESSED sections
"Compressed debug sections", which I wrote last year, describes SHF_COMPRESSED. Currently, its use cases are limited to debug sections.
Many ELF linkers provide --compress-debug-sections=[zlib|zstd] to compress .debug_* sections. This functionality can be extended to arbitrary sections. I filed a GNU ld feature request in 2021, and recently decided to push forward with the effort. I have implemented "[ELF] Add --compress-sections" in ld.lld.
My proposed syntax is --compress-sections <section-glob>=[zlib|zstd]. If an output section matches the glob pattern, it will be compressed using the specified format: zlib or zstd. A natural question arises: if a SHF_ALLOC section is matched, should the linker compress the section?
SHF_ALLOC|SHF_COMPRESSED sections
I believe the answer is yes. However, I have concerns regarding non-compliance with the ELF standard, as stated in the current generic ABI documentation:
SHF_COMPRESSED - This flag identifies a section containing compressed data. SHF_COMPRESSED applies only to non-allocable sections, and cannot be used in conjunction with SHF_ALLOC. In addition, SHF_COMPRESSED cannot be applied to sections of type SHT_NOBITS.
I believe this restriction is unnecessary for relocatable object files, executables, and shared objects alike. Therefore, I have proposed a change in the generic-abi group to remove the incompatibility between SHF_COMPRESSED and SHF_ALLOC from the wording.
SHF_ALLOC|SHF_COMPRESSED sections in relocatable object files are fine as long as the linker is capable of decompressing the content. It is also important to ensure that non-linker consumers are compatible with such sections, which is relatively easy to confirm.
For linker output (executable or shared object), having both the SHF_ALLOC and SHF_COMPRESSED flags should be permissible. For example, we can use PC-relative relocations in a metadata section to reference text sections. These relocations will be resolved at link time.
...
leaq __start_foo0(%rip), %rdi
leaq __stop_foo0(%rip), %rsi
call runtime
.section .text.foo,"ax"
nop
.section .text.bar,"ax"
nop
.section foo,"a"
.p2align 3
.quad .text.foo-.
.quad .text.bar-.
Section headers are optional in executables and shared objects and are ignored by runtime loaders. The runtime loader does not require any special handling for the SHF_COMPRESSED flag.
When the program is executed, the runtime library responsible for the metadata section will allocate memory and decompress the section's content. Let's say foo has the address 0x201000 in memory. If we allocate the decompressed buffer at address 0x202000, we need to add the offset 0x201000-0x202000 = -0x1000 to the label differences in foo.
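The adjustment can be written out as a minimal C sketch. The helper name resolve_entry, the self_check function, and the address chosen for .text.foo are hypothetical; only the arithmetic follows the example above, and it assumes the decompressed buffer preserves the offsets of the uncompressed section content.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: recover the target address of one label-difference
 * entry after the section has been decompressed into a separate buffer.
 *
 * At link time the entry was computed as target - (foo_addr + offset),
 * where foo_addr is the address assigned to the (compressed) section foo.
 * At run time the entry is read from buf_addr + offset instead, so the
 * runtime adds the bias foo_addr - buf_addr to compensate. */
static uint64_t resolve_entry(int64_t entry, uint64_t foo_addr,
                              uint64_t buf_addr, uint64_t offset) {
  uint64_t place = buf_addr + offset;            /* where the entry is read */
  int64_t bias = (int64_t)(foo_addr - buf_addr); /* -0x1000 in the example */
  return place + (uint64_t)entry + (uint64_t)bias;
}

/* Replays the numbers from the text: foo at 0x201000, buffer at 0x202000.
 * The address of .text.foo (0x200800) is an assumption for illustration. */
static void self_check(void) {
  const uint64_t foo_addr = 0x201000, buf_addr = 0x202000;
  const uint64_t text_foo = 0x200800;
  /* Link-time value of `.quad .text.foo-.` at offset 0 of foo. */
  int64_t entry = (int64_t)(text_foo - (foo_addr + 0));
  assert(resolve_entry(entry, foo_addr, buf_addr, 0) == text_foo);
}
```

Without the bias, the same entry read from the buffer would resolve to 0x201800, which is 0x1000 past the intended target.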
Section decompression imposes certain limitations on use cases.
First, the sections cannot have dynamic relocations. Relocation offsets are relative to the decompressed section instead of the compressed section. However, the runtime loader doesn't know there are compressed sections, so it would blindly apply the relocation. In the common case that the compressed section is smaller, the runtime loader will modify bytes in a subsequent section or an unmapped address, leading to runtime crashes or corruption of writable data sections.
In general, absolute references should be replaced with label differences.
.section foo.wrong,"a"
.p2align 3
.quad .text.foo # wrong
.quad .text.bar # wrong
Second, a symbol defined relative to an input section designates an offset relative to the compressed content. If the program tries to read the content at the symbol, it will load a random location from the compressed content or another section. Similar to the arithmetic example above (0x201000-0x202000 = -0x1000), the runtime library may adjust the symbol value by addr(compressed)-addr(decompressed) to obtain an offset relative to the decompressed content. If there are two symbols defined relative to one or two input sections, taking the difference can be more convenient.
Third, there is a circular dependency between the compressed section content and the section layout. The compressed section size affects addresses of subsequent sections and symbol assignments. On the other hand, some linker script constructs may change the uncompressed section content or size. For example, if we use data commands in the output section description of a compressed section, the values may change while the linker computes section layout/symbol values iteratively. Similarly, if we use . += expr, the uncompressed section size may change.
SECTIONS {
...
foo : { *(foo*) QUAD(expr1) . += expr2; }
}
If the linker compresses the section content at one step, the compressed content will be invalidated when the section content or size changes.
A sophisticated implementation can run compression and section content/size computation iteratively in a loop. In my ld.lld implementation, I just compute the uncompressed section size and content once, apply compression, and then perform other computations that need to be done iteratively. Since most metadata sections don't need an output section description using a linker script, I believe the limitation is acceptable.
SHF_ALLOC provides protection from strip, which can be seen as a secondary benefit. I think it is valuable in practice, as linking and stripping are separate steps, and the developers may have less control on the strip side.
In the GNU world, I think distributions have some requirements or restrictions on strip options. There was even a post, "using section flags to indicate stripable or persistent sections", discussing whether we want a section flag to avoid stripping.
Discerning between compressed and uncompressed sections
The runtime library can identify the section content with a pair of encapsulation symbols __start_<sectionname> and __stop_<sectionname>. These symbols are defined relative to the output section. In my ld.lld implementation, __stop_<sectionname> - __start_<sectionname> equals the compressed section size.
Note: if we use linker scripts to define a symbol in the middle of the output section, it will not have clear semantics. Such constructs should be avoided.
The content at __start_<sectionname> may be uncompressed, starting with a regular metadata section header, or compressed, starting with an Elf32_Chdr/Elf64_Chdr header.
typedef struct {
The metadata section header may start with a magic number. In practice, ch_type can only be 1 (ELFCOMPRESS_ZLIB) or 2 (ELFCOMPRESS_ZSTD). When I added ELFCOMPRESS_ZSTD (https://groups.google.com/g/generic-abi/c/satyPkuMisk/m/xRqMj8M3AwAJ), I acknowledged that we did not intend to include a plethora of formats. So even if we want to be a good ELF citizen and not collide with future ELF extensions, there are still numerous values available for allocation.
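A runtime library could tell the two cases apart along the following lines. This is only a sketch: the names Chdr64, COMPRESS_ZLIB, COMPRESS_ZSTD, and probe_section are hypothetical. The struct mirrors the Elf64_Chdr layout from the generic ABI so that the example does not depend on <elf.h>. It assumes a 64-bit target, native byte order, and a metadata format whose own header never begins with the 32-bit value 1 or 2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as the generic ABI's Elf64_Chdr (hypothetical name). */
typedef struct {
  uint32_t ch_type;      /* compression format */
  uint32_t ch_reserved;
  uint64_t ch_size;      /* size of the uncompressed content */
  uint64_t ch_addralign; /* alignment of the uncompressed content */
} Chdr64;

/* Values of ELFCOMPRESS_ZLIB and ELFCOMPRESS_ZSTD. */
enum { COMPRESS_ZLIB = 1, COMPRESS_ZSTD = 2 };

/* Inspect [start, stop), e.g. the range delimited by the encapsulation
 * symbols. Returns 0 if the content looks uncompressed. Otherwise returns
 * ch_type, sets *payload to the compressed stream after the header, and
 * sets *raw_size to the buffer size needed for decompression. */
static uint32_t probe_section(const unsigned char *start,
                              const unsigned char *stop,
                              const unsigned char **payload,
                              uint64_t *raw_size) {
  Chdr64 hdr;
  assert(start <= stop);
  if ((size_t)(stop - start) < sizeof hdr)
    return 0; /* too small to hold a compression header */
  memcpy(&hdr, start, sizeof hdr); /* avoids unaligned access */
  if (hdr.ch_type != COMPRESS_ZLIB && hdr.ch_type != COMPRESS_ZSTD)
    return 0; /* treat as a regular metadata section header */
  *payload = start + sizeof hdr;
  *raw_size = hdr.ch_size;
  return hdr.ch_type;
}
```

A caller would pass __start_<sectionname> and __stop_<sectionname>, allocate raw_size bytes when the result is nonzero, and hand payload to the matching zlib or zstd decompressor.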
While there is a slight risk of collision, it is not a significant cause for concern. Metadata sections generally face fewer backward compatibility restrictions, since prebuilt libraries with specific instrumentation are considered awkward and therefore uncommon. In the super rare circumstance that the metadata section header collides with an Elf32_Chdr/Elf64_Chdr header, we can update the metadata section header.
You may read the first paragraph of "C++ standard library ABI compatibility", where I describe different compatibility guarantees. In reality, for many metadata sections, mixing different versions is unsupported. If it works for a user mixing different versions, great. If not, the user should know that the onus is on their side.