Exploring object file formats

Posted by MaskRay on 2024-01-15 08:11:04

My journey with the LLVM project began with a deep dive into the world of lld and binary utilities. Countless hours were spent unraveling the intricacies of object file formats and shaping LLVM's relevant components. Though my interests have since broadened, object file formats remain a personal fascination, often drawing me into discussions around potential changes within LLVM.

This article compares several prominent object file formats, drawing upon my experience and insights.

At the heart of each format lies the representation of essential components like symbols, sections, and relocations. For each control structure, we'll begin with ELF, a widely used format, before venturing into the landscapes of other notable formats.

History of object file formats

The appendix of All about Procedure Linkage Table contains some history of these object file formats. I have moved some of the notes here, with some additions.

a.out

The a.out format was designed for the PDP-11 (1970). The quantities were 16-bit, but can be naturally extended to 32-bit or 64-bit.

In Proceedings of the Summer 1990 USENIX Conference, ELF: An Object File to Mitigate Mischievous Misoneism by James Q. Arnold provided some description.

For 32-bit machines, the a.out format was extended in several ways. Most obviously, 16-bit quantities were enlarged to 32-bit values. The symbol table changed to allow names of unlimited length. Relocation entries also changed significantly. Larger programs and different relocation conventions made it necessary to associate a relocation entry with an explicit address, instead of relying on the implicit correspondence between program sections and relocation records.

Many UNIX and UNIX-like operating systems, including SunOS, HP-UX, BSD, and Linux, used a.out before switching to ELF.

The most noticeable extension is dynamic shared library support. (This feature is distinct from static shared libraries, where each shared library needs a fixed address in the address space.) There are two flavors:

In 1988, SunOS 4.0 was released with an extended a.out binary format with dynamic shared library support.
In 1993, on NetBSD, https://github.com/NetBSD/src/commit/97ca10e37476fb84a20a8ec4b0be3188db703670 (A linker supporting shared libraries.) and https://github.com/NetBSD/src/commit/3d68d0acaed0a32f929b2c174146c62940005a18 (A linker supporting shared libraries (run-time part).) added shared library support similar to the SunOS scheme.

FreeBSD's a.out(5) manual page provides a nice description.

If you have followed Linux kernel news in recent years, you may have seen the discussions when Linux finally removed a.out support in 2022.

COFF

a.out supports three fixed loadable sections, TEXT, DATA, and BSS, which is too restrictive. COFF introduces custom section support and allows up to 32767 sections. The ELF paper contains some remarks:

Common Object File Format (COFF), was designed primarily to support electronic switching systems (the telephone network). Its distinguishing features were multiple sections (text, data, uninitialized memory, reserved memory, overlays, etc.), some support for multiple target processors, defined structures for symbol tables and relocations, and debugging information tailored for the C language.

According to scnhdr.h in System V Release 2 for NS32xxx, COFF was designed no later than 1982. Then, System V Release 3 adopted COFF, which motivated a lot of follow-ups.

Windows extended COFF to the Portable Executable (PE) format.
Texas Instruments modified COFF for its TI toolset and then switched to ELF.
ECOFF used by Tru64 UNIX changed symbol representation.
IBM developed XCOFF (COFF combined with the TOC module format concept, CSECT, etc) and used it for AIX.

Prominent drawbacks:

Hard-wiring debugging information tailored for the C language into the symbol structure is complex, space-inefficient, and ugly.
The auxiliary symbol record design is inflexible and inefficient.
Symbol and section structures that are not 32-bit aligned caused performance issues on earlier systems.

Mach-O

Carnegie Mellon University developed the Mach kernel as a proof of the microkernel concept. The operating system used a format derived from a.out, named the Mach object file format. The abbreviation, Mach-O, is often used instead. The NeXTSTEP operating system and then Darwin adopted Mach-O.

Dynamic shared library support on Mach-O came later than other object file formats. In a NeXTSTEP manual released in 1995, I can find MH_FVMLIB (fixed virtual memory library, which appears to be a static shared library scheme) but not MH_DYLIB (used by modern macOS for .dylib files).

ELF

Driven by frustrations with the inherent constraints of COFF, coupled with a self-imposed byte order dilemma, AT&T introduced a groundbreaking format: Executable and Linking Format (ELF). ELF revisited fixed content and hard-wired concepts in previous object file formats, removed unnecessary elements, and made control structures more flexible.

This pivotal shift was embraced by System V Release 4, marking a new era in object file format design. In the 1990s, many UNIX and UNIX-like operating systems, including Solaris, IRIX, HP-UX, Linux, and FreeBSD, switched to ELF.

Symbols

At a minimum, a symbol control structure needs to encode the name, section, and value. We can require that every symbol is defined in relation to some section. We can use a section index of zero to represent an undefined symbol.

In a minimal object file format with only a few hard-coded sections (a.out), the section field can be omitted. A type field can be used to decide whether the symbol can reference a function or a data object.

// ELFCLASS32, 16 bytes
typedef struct {
  Elf32_Word st_name;
  Elf32_Addr st_value;
  Elf32_Word st_size;
  unsigned char st_info;
  unsigned char st_other;
  Elf32_Half st_shndx;
} Elf32_Sym;

// ELFCLASS64, 24 bytes
typedef struct {
  Elf64_Word st_name;     // index into the string table
  unsigned char st_info;  // type and binding
  unsigned char st_other; // visibility and others
  Elf64_Half st_shndx;    // section index
  Elf64_Addr st_value;
  Elf64_Xword st_size;
} Elf64_Sym;

The symbol name is represented as a 32-bit index into the string table. A 32-bit integer suffices, while a 16-bit integer would be too small.

st_shndx uses a size-saving trick. The 16-bit member encodes a section index. If the member is SHN_XINDEX (0xffff), then the actual value is contained in the associated section of type SHT_SYMTAB_SHNDX. This is a very nice trick because the number of sections is almost always smaller than 0xff00. In pathological cases, there can be more sections, where a section of type SHT_SYMTAB_SHNDX is needed.
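The lookup described above can be sketched in C as follows. This is a minimal illustration, not code from any particular linker: symbol_section_index is a hypothetical helper name, the extension table is assumed to be already loaded in host byte order, and reserved values other than SHN_XINDEX (such as SHN_ABS) are returned unchanged for the caller to interpret.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHN_XINDEX 0xffff /* escape value: real index is stored out of line */

/* Hypothetical helper: return the section index of symbol number `sym_index`.
 * `st_shndx` is the 16-bit member of the symbol table entry. `shndx_table`
 * is the content of the SHT_SYMTAB_SHNDX section (one 32-bit word per
 * symbol, assumed to be in host byte order), or NULL if the file has none. */
static uint32_t symbol_section_index(uint16_t st_shndx, size_t sym_index,
                                     const uint32_t *shndx_table) {
  if (st_shndx != SHN_XINDEX)
    return st_shndx;           /* common case: the index fits in 16 bits */
  assert(shndx_table != NULL); /* SHN_XINDEX requires the extension table */
  return shndx_table[sym_index];
}
```

Only files with an unusually large number of sections pay for the extra 32-bit word per symbol; everything else keeps the 16-bit member.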

st_info specifies the symbol's type (4 bits) and binding (4 bits) attributes. Types are allocated very conservatively and usually imply different linker behaviors. There are not that many inherently different linker behaviors for symbol types, so although 4 bits seem small, they are sufficient in practice. As we will learn, this is significantly smaller than COFF's type and storage class representation. A symbol's binding is for the local/weak/global distinction. The 4 bits reserved for it can accommodate more values, but only GNU reserves one value (STB_GNU_UNIQUE) (a misfeature in my opinion).
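As a small sketch of the 4+4 bit split, the functions below pack and unpack st_info the same way the generic ABI's ELF32_ST_BIND, ELF32_ST_TYPE, and ELF32_ST_INFO macros do. The function names are made up for this example; the numeric values of the bindings and types are the generic ABI ones.

```c
#include <assert.h>
#include <stdint.h>

enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };  /* bindings */
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2 }; /* a few types */

/* Binding lives in the high 4 bits, type in the low 4 bits. */
static inline uint8_t st_info_pack(unsigned bind, unsigned type) {
  return (uint8_t)((bind << 4) | (type & 0xf));
}
static inline unsigned st_info_bind(uint8_t info) { return info >> 4; }
static inline unsigned st_info_type(uint8_t info) { return info & 0xf; }
```

For instance, a global function symbol has st_info 0x12 and a weak function symbol has 0x22.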

In COFF, function symbols can use an auxiliary symbol record to encode the size of the function (x_fsize; TotalSize in PE). In ELF, st_size is a fixed member, used for copy relocations and symbolizers. If we eliminate copy relocations and don't need the symbolization heuristics, this field will become garbage.

Here is a demonstration if we remove st_size.

// 16 bytes
struct Elf64_Sym_minimized {
  Elf64_Word st_name;     // index into the string table
  unsigned char st_info;  // type and binding
  unsigned char st_other; // visibility and others
  Elf64_Half st_shndx;    // section index
  Elf64_Addr st_value;
};

Symbols (a.out)

// a.out (System V), 16 bytes
struct nlist {
  char n_name[8];
#if pdp11
  int n_type;
#else
  char n_type;
  char n_other;
  short n_desc;
#endif
  unsigned n_value;
};

a.out uses an nlist to represent a symbol table entry. In the original format, the name cannot contain more than 8 bytes. Unix's appreciation of shorter identifier names is related to this :)

To support longer names, extensions allow n_name to be interpreted as an index (n_strx) into the string table. This member then becomes a size-saving trick by inlining a short name (8 bytes or less) into the structure. Some variants, like binutils' 64-bit a.out format, use an index exclusively and removed n_name.

n_type, broken down into three sub-fields, describes whether a symbol is defined or undefined, external or local, and the symbol type. The values listed on the FreeBSD manpage are also used on PDP-11. Later System V releases extended the list with a lot of values for stabs (a debugging format). Some values seem specific to FORTRAN.

For a defined symbol, n_type describes whether it is relative to TEXT, DATA, or BSS.
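To make the three sub-fields concrete, here is a minimal sketch that decodes n_type. The masks and values are those listed in the BSD a.out(5) manual page; the aout_* helper names are invented for this example, and other a.out variants may assign different values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masks as listed in the BSD a.out(5) manual page. */
#define N_EXT  0x01 /* external (global) bit */
#define N_TYPE 0x1e /* mask for the undefined/absolute/section bits */
#define N_STAB 0xe0 /* mask for symbolic debugger (stabs) bits */

enum { N_UNDF = 0x00, N_ABS = 0x02, N_TEXT = 0x04, N_DATA = 0x06, N_BSS = 0x08 };

static inline bool aout_is_external(uint8_t n_type) { return (n_type & N_EXT) != 0; }
static inline bool aout_is_stab(uint8_t n_type) { return (n_type & N_STAB) != 0; }

/* Describe what a symbol is relative to. */
static const char *aout_symbol_kind(uint8_t n_type) {
  switch (n_type & N_TYPE) {
  case N_UNDF: return "undefined";
  case N_ABS:  return "absolute";
  case N_TEXT: return "TEXT";
  case N_DATA: return "DATA";
  case N_BSS:  return "BSS";
  default:     return "other";
  }
}
```

Because the section is folded into n_type, there is no room for anything beyond the three hard-coded sections.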

Symbols (COFF)

// COFF (System V Release 3), 18 bytes in the absence of padding
struct syment {
  union {
    char _n_name[SYMNMLEN]; /* old COFF version */
    struct {
      long _n_zeroes; /* new == 0 */
      long _n_offset; /* offset into string table */
    } _n_n;
  } _n;
  unsigned long n_value; /* value of symbol */
  short n_scnum; /* section number */
  unsigned short n_type; /* type and derived type */
  char n_sclass; /* storage class */
  char n_numaux; /* number of aux. entries */
};

COFF adopts a.out's approach to save space in symbol names. This likely made sense when most symbols were shorter. However, with today's often lengthy symbol names, this inlining technique complicates code and increases the control structure size (from 4 to 8 bytes).
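The extra code that the inlining trick imposes on every consumer looks roughly like this. It is a sketch under a few assumptions: the union uses fixed-width integers instead of long, coff_symbol_name is a hypothetical helper, and string table offsets are taken to count from the start of the table, whose first 4 bytes hold its size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SYMNMLEN 8

/* Name field of a COFF symbol, with fixed-width types instead of `long`. */
union coff_name {
  char short_name[SYMNMLEN]; /* inline name, NUL-padded; no terminator if 8 bytes long */
  struct {
    uint32_t zeroes; /* 0 means the name lives in the string table */
    uint32_t offset; /* byte offset into the string table */
  } n;
};

/* Hypothetical helper: return the symbol name as a NUL-terminated string.
 * Inline names are copied into `buf` because they may lack a terminator. */
static const char *coff_symbol_name(const union coff_name *name,
                                    const char *strtab,
                                    char buf[SYMNMLEN + 1]) {
  if (name->n.zeroes == 0)
    return strtab + name->n.offset;
  memcpy(buf, name->short_name, SYMNMLEN);
  buf[SYMNMLEN] = '\0';
  return buf;
}
```

An ELF consumer, by contrast, always computes strtab + st_name.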

The section number is a 16-bit signed integer, supporting up to 32,767 sections. Positive values indicate a section index, while special values include:

N_UNDEF (0): Undefined symbol (distinct from a.out's n_type representation).
N_ABS (-1): Symbol has an absolute value.
N_DEBUG (-2): Special debugging symbol (value is meaningless).

COFF's n_type and n_sclass encode C's type and storage class information. PE assigns longer names to these types and storage classes, e.g., IMAGE_SYM_TYPE_CHAR/IMAGE_SYM_TYPE_SHORT and IMAGE_SYM_CLASS_AUTOMATIC/IMAGE_SYM_CLASS_EXTERNAL. While the values are mostly consistent, minor differences exist:

PE's IMAGE_SYM_TYPE_VOID (1) is different from System V Release 3's #define T_ARG 1 /* function argument (only used by compiler) */.
PE's IMAGE_SYM_CLASS_WEAK_EXTERNAL (105) is different from System V Release 3's #define C_ALIAS 105 /* duplicate tag */.

Symbols with C_EXT (IMAGE_SYM_CLASS_EXTERNAL) are global and added to the linker's global symbol table, akin to ELF's STB_GLOBAL symbol binding.

If we discard the influence of stabs, n_type and n_sclass just provide a wasteful counterpart to ELF's st_info.

n_numaux relates to Auxiliary Symbol Records, allowing extra information but introducing non-uniform symbol table entries. While seemingly beneficial, their use cases are limited and could often be encoded using separate sections. In PE, an auxiliary symbol record can represent weak definitions, but weak references are not supported. They can also provide extra information to section symbols.

ECOFF defines Local Symbol Entry (SYMR) and External Symbol Entry (EXTR).

typedef struct {
  coff_long value;
  coff_int iss;
  coff_uint st : 6;
  coff_uint sc : 5;
  coff_uint reserved : 1;
  coff_uint index : 20;
} SYMR, *pSYMR;

typedef struct {
  SYMR asym;
  coff_uint jmptbl:1;
  coff_uint cobol_main:1;
  coff_uint weakext:1;
  coff_uint reserved:29;
  coff_int ifd;
} EXTR, *pEXTR;

Symbols (Mach-O)

// Mach-O, 16 bytes
struct nlist_64 {
  uint32_t n_strx;
  uint8_t n_type;
  uint8_t n_sect;
  uint16_t n_desc;
  uint64_t n_value;
};

Mach-O's nlist_64 is not that different from a.out's, with n_other changed to n_sect to indicate the section index. The 8-bit n_sect field restricts representable sections to 255 without out-of-band data (discussed later). If we extend n_sect to 32-bit, with alignment padding the structure size will increase to 24 bytes, the same as Elf64_Sym.
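The size claim can be checked with a quick sketch. nlist_64_wide is a hypothetical layout invented for this example, not an existing Mach-O structure, and the assertions assume an ABI where uint64_t is 8-byte aligned (true for the common 64-bit targets).

```c
#include <assert.h>
#include <stdint.h>

struct nlist_64 { /* the existing layout */
  uint32_t n_strx;
  uint8_t n_type;
  uint8_t n_sect;
  uint16_t n_desc;
  uint64_t n_value;
};

struct nlist_64_wide { /* hypothetical: n_sect widened to 32 bits */
  uint32_t n_strx;
  uint8_t n_type;  /* followed by 1 byte of padding */
  uint16_t n_desc;
  uint32_t n_sect; /* followed by 4 bytes of padding before n_value */
  uint64_t n_value;
};

static_assert(sizeof(struct nlist_64) == 16, "existing layout is 16 bytes");
static_assert(sizeof(struct nlist_64_wide) == 24, "widened layout is 24 bytes");
```

The 5 bytes of padding are the price of the wider field, which is why ELF's 16-bit st_shndx with an out-of-line escape is attractive.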

Like a.out, the N_EXT bit of n_type indicates an external symbol. The N_PEXT bit indicates a private external symbol.

Key bits in n_desc are N_WEAK_DEF, N_WEAK_REF, and N_ALT_ENTRY.

Sections

// ELFCLASS32, 40 bytes
typedef struct {
  Elf32_Word sh_name;
  Elf32_Word sh_type;
  Elf32_Word sh_flags;
  Elf32_Addr sh_addr;
  Elf32_Off sh_offset;
  Elf32_Word sh_size;
  Elf32_Word sh_link;
  Elf32_Word sh_info;
  Elf32_Word sh_addralign;
  Elf32_Word sh_entsize;
} Elf32_Shdr;

// ELFCLASS64, 64 bytes
typedef struct {
  Elf64_Word sh_name;
  Elf64_Word sh_type;
  Elf64_Xword sh_flags;
  Elf64_Addr sh_addr;
  Elf64_Off sh_offset;
  Elf64_Xword sh_size;
  Elf64_Word sh_link;
  Elf64_Word sh_info;
  Elf64_Xword sh_addralign;
  Elf64_Xword sh_entsize;
} Elf64_Shdr;

The section name is represented as a 32-bit index into the string table. If we use a 16-bit integer, a large number of section names with a symbol suffix (e.g. .text.foo .text.bar) could make the index overflow.

sh_type categorizes the section's contents and semantics. It avoids hard-coding magic names in many scenarios. Technically a 16-bit type could work pretty well but was deemed insufficient for flexibility.

sh_flags describes miscellaneous attributes, e.g. writable and executable permissions, and whether the section should appear in a loadable segment. This member is 32-bit in Elf32_Shdr while 64-bit in Elf64_Shdr. In practice no architecture defines flags for bits 32 to 63, therefore this member is somewhat wasteful.

Location and size. sh_offset gives the byte offset from the beginning of the file to the first byte in the section. To support object files larger than 4GiB, this member has to be 64-bit. sh_size gives the section's size in bytes. A section of type SHT_NOBITS occupies no space in the file. To support sections larger than 4GiB, this member has to be 64-bit.

Address and alignment. sh_addr describes the address at which the section's first byte should reside for an executable or shared object. It should be zero for relocatable files. sh_addralign holds the address alignment. In practice this member must be a power of 2 even if the generic ABI does not require so. This member is 64-bit in ELF64, which allows an alignment up to 2**63. In practice, an alignment larger than the page size (or the largest huge page size, if huge pages are enabled) does not make sense, and a maximum value of 2**31 is sufficient. Therefore, we could use a log2 value to hold the alignment.
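A minimal sketch of the log2 idea follows. The helper names are hypothetical, and the sketch assumes that alignments 0 and 1 both mean "no constraint" and can share the encoding 0.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a power-of-two alignment as its base-2 logarithm.
 * 0 and 1 (no alignment constraint) are both encoded as 0. */
static uint8_t align_to_log2(uint64_t align) {
  assert((align & (align - 1)) == 0); /* must be 0 or a power of two */
  uint8_t shift = 0;
  while (align > 1) {
    align >>= 1;
    ++shift;
  }
  return shift;
}

/* Decode the stored logarithm back into a byte alignment. */
static uint64_t log2_to_align(uint8_t shift) {
  assert(shift < 64);
  return (uint64_t)1 << shift;
}
```

A single byte stores every alignment up to 2**63, and 6 bits would already be enough, which leaves spare bits for other uses.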

Connection information. sh_link holds a section index. sh_info holds either a section index or a symbol index. If you recall that st_shndx is 16 bits for a very solid reason, you will know that the two fields are somewhat wasteful.

For a table of fixed-size entries, sh_entsize holds the entry size in bytes. In some use cases this member is not a power of two. In practice, one byte suffices.

While ELF's section header structure is designed for flexibility, potential optimizations could reduce its size without significant loss of functionality. By using smaller data types for sh_flags, sh_link, sh_info, and sh_entsize based on practical needs, we could make the structure significantly smaller.

// 32 bytes
struct Elf32_Shdr_minimized {
  Elf32_Word sh_name;
  Elf32_Word sh_type;    // Making this uint16_t and reordering it can decrease the size to 28 bytes
  Elf32_Word sh_flags;
  Elf32_Addr sh_addr;
  Elf32_Off sh_offset;
  Elf32_Word sh_size;
  uint8_t sh_addralign;
  uint8_t sh_entsize;
  Elf32_Half sh_link;
  Elf32_Half sh_info;
};

// 40 bytes
struct Elf64_Shdr_minimized {
  Elf64_Word sh_name;
  Elf64_Word sh_flags;
  Elf64_Addr sh_addr;
  Elf64_Off sh_offset;
  Elf64_Xword sh_size;
  Elf64_Half sh_type;
  uint8_t sh_addralign;
  uint8_t sh_entsize;
  Elf64_Half sh_link;
  Elf64_Half sh_info;
};

Reducing sh_type to 2 bytes loses a bit of flexibility. If this is deemed insufficient, we could take 3 bits from sh_addralign (by turning it into a bitfield) and give them to sh_type.
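The sizes of the minimized layouts can be double-checked with compile-time assertions. The sketch below restates them with fixed-width integer types (the shdr* struct names are made up for this example) and adds the 28-byte variant mentioned in the comment; it assumes uint64_t is 8-byte aligned and uint32_t is 4-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

struct shdr32_min { /* mirrors Elf32_Shdr_minimized */
  uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size;
  uint8_t sh_addralign, sh_entsize;
  uint16_t sh_link, sh_info; /* 30 bytes of members, padded to 32 */
};

struct shdr32_min16 { /* sh_type narrowed to 16 bits and moved down */
  uint32_t sh_name, sh_flags, sh_addr, sh_offset, sh_size;
  uint16_t sh_type;
  uint8_t sh_addralign, sh_entsize;
  uint16_t sh_link, sh_info;
};

struct shdr64_min { /* mirrors Elf64_Shdr_minimized */
  uint32_t sh_name, sh_flags;
  uint64_t sh_addr, sh_offset, sh_size;
  uint16_t sh_type;
  uint8_t sh_addralign, sh_entsize;
  uint16_t sh_link, sh_info;
};

static_assert(sizeof(struct shdr32_min) == 32, "32-bit minimized header");
static_assert(sizeof(struct shdr32_min16) == 28, "with 16-bit sh_type");
static_assert(sizeof(struct shdr64_min) == 40, "64-bit minimized header");
```

Placing the narrow members last keeps them packed together, so no padding appears between members.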

Sections (COFF)

// COFF (System V Release 3), 40 bytes, when sizeof(long) == 4
struct scnhdr {
  char            s_name[8];      /* section name */
  long            s_paddr;        /* physical address */
  long            s_vaddr;        /* virtual address */
  long            s_size;         /* section size */
  long            s_scnptr;       /* file ptr to raw data for section */
  long            s_relptr;       /* file ptr to relocation */
  long            s_lnnoptr;      /* file ptr to line numbers */
  unsigned short  s_nreloc;       /* number of relocation entries */
  unsigned short  s_nlnno;        /* number of line number entries */
  long            s_flags;        /* flags */
};

// PE, 40 bytes
struct section {
  char Name[8];
  uint32_t VirtualSize;
  uint32_t VirtualAddress;
  uint32_t SizeOfRawData;
  uint32_t PointerToRawData;
  uint32_t PointerToRelocations;
  uint32_t PointerToLineNumbers;
  uint16_t NumberOfRelocations;
  uint16_t NumberOfLineNumbers;
  uint32_t Characteristics;
};

PE's section control structure demonstrates a minor modification compared to COFF, s_paddr => VirtualSize.

The presented structure measures 40 bytes when long is 4 bytes. If we extend s_paddr, s_vaddr, s_size, s_scnptr, s_relptr, and s_lnnoptr to 8 bytes, the structure grows to 64 bytes.

The section name supports up to 8 bytes. A longer name would require an extension similar to the symbol control structure.

Encoding both s_paddr and s_vaddr is wasteful. ELF encodes the physical address in the segment and therefore removes the member from its section structure.

COFF embeds the location and size of relocations into the section structure. This is actually pretty nice. A 16-bit s_nreloc may appear restrictive but is usually sufficient for compiler-generated relocatable files. However, the number of relocations can exceed 65535 for a single section produced by relocatable linking.

Line number entries are used to relate addresses to source file line numbers. s_lnnoptr and s_nlnno are for obsolete line number information. They are embedded in the section structure, which is very inflexible.

Sections (Mach-O)

// Mach-O, 80 bytes
struct section_64 {
  char sectname[16];
  char segname[16];
  uint64_t addr;
  uint64_t size;
  uint32_t offset;
  uint32_t align;
  uint32_t reloff;
  uint32_t nreloc;
  uint32_t flags;
  uint32_t reserved1;
  uint32_t reserved2;
  uint32_t reserved3;
};

A Mach-O binary is divided into segments, each housing one or more sections. The section structure encodes the section name and the segment name, both of which can be up to 16 bytes. This representation allows the section names to be read without a string table, but is restrictive for descriptive names. Section semantics are derived from the name (unlike ELF).

The segment name is redundantly encoded within the section structure. We could derive the segment from the section name and flags, e.g., S_ATTR_SOME_INSTRUCTIONS => __TEXT, S_ZEROFILL => ZeroFill __DATA.

There is a severe limitation: a maximum of 255 sections due to nlist::n_sect being a uint8_t. This is apparently too restrictive. Thankfully, an innovative feature, .subsections_via_symbols, overcomes the limitation. The feature uses a monolithic section with "atoms" dividing it into pieces (subsections). This is more size-efficient than ELF's -ffunction-sections -fdata-sections. However, there are assembler limitations, relocation processing complexity, and potential loss of the ability to ensure that two non-local symbols are not reordered.

Like COFF, Mach-O embeds the location and size of relocations into the section structure.

The three trailing reserved members are a bad idea. They increase the size considerably. A better approach would be to just change the version number when the structure needs to grow. It's unlikely that an older consumer can interpret a new section with new members set for new semantics.

Relocations

// ELFCLASS32, 8 bytes
typedef struct {
 Elf32_Addr r_offset;
 Elf32_Word r_info;
} Elf32_Rel;
// ELFCLASS32, 12 bytes
typedef struct {
 Elf32_Addr r_offset;
 Elf32_Word r_info;
 Elf32_Sword r_addend;
} Elf32_Rela;

// ELFCLASS64, 16 bytes
typedef struct {
  Elf64_Addr r_offset;
  Elf64_Xword r_info;    // relocation type and symbol index
} Elf64_Rel;
// ELFCLASS64, 24 bytes
typedef struct {
  Elf64_Addr r_offset;
  Elf64_Xword r_info;    // relocation type and symbol index
  Elf64_Sxword r_addend;
} Elf64_Rela;

r_info specifies the symbol table index with respect to which the relocation must be made, and the type of relocation to apply.

ELFCLASS32: 8-bit type, 24-bit symbol index
ELFCLASS64: 32-bit type, 32-bit symbol index
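The two splits can be sketched as follows. The functions compute the same values as the generic ABI's ELF32_R_SYM, ELF32_R_TYPE, ELF32_R_INFO and their ELF64 counterparts; the r32_* and r64_* names are invented here to keep the example independent of any system header.

```c
#include <assert.h>
#include <stdint.h>

/* ELFCLASS32: symbol index in the upper 24 bits, type in the low 8 bits. */
static inline uint32_t r32_sym(uint32_t info) { return info >> 8; }
static inline uint32_t r32_type(uint32_t info) { return info & 0xff; }
static inline uint32_t r32_info(uint32_t sym, uint32_t type) {
  return (sym << 8) | (type & 0xff);
}

/* ELFCLASS64: symbol index in the upper 32 bits, type in the low 32 bits. */
static inline uint32_t r64_sym(uint64_t info) { return (uint32_t)(info >> 32); }
static inline uint32_t r64_type(uint64_t info) { return (uint32_t)info; }
static inline uint64_t r64_info(uint32_t sym, uint32_t type) {
  return ((uint64_t)sym << 32) | type;
}
```

With ELFCLASS32, a relocation referencing symbol 5 with type 2 has r_info 0x502; with ELFCLASS64 it is 0x500000002.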

There are two variants, REL and RELA. Let's quote the generic ABI:

As specified previously, only Elf32_Rela and Elf64_Rela entries contain an explicit addend. Entries of type Elf32_Rel and Elf64_Rel store an implicit addend in the location to be modified. Depending on the processor architecture, one form or the other might be necessary or more convenient. Consequently, an implementation for a particular machine may use one form exclusively or either form depending on context.
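The difference between the explicit and the implicit addend can be illustrated with a made-up 32-bit absolute relocation that computes S + A. This is a simplified sketch: the function names are hypothetical, the location is read and written in host byte order, and real relocation types have architecture-specific encodings.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* RELA: the addend A comes from the relocation record itself, and the
 * previous content of the location is ignored. */
static void apply_abs32_rela(uint8_t *loc, uint32_t sym_value, int32_t addend) {
  uint32_t v = sym_value + (uint32_t)addend;
  memcpy(loc, &v, sizeof v);
}

/* REL: the addend A is whatever is already stored at the location. */
static void apply_abs32_rel(uint8_t *loc, uint32_t sym_value) {
  uint32_t implicit;
  memcpy(&implicit, loc, sizeof implicit);
  uint32_t v = sym_value + implicit;
  memcpy(loc, &v, sizeof v);
}
```

REL saves the r_addend member in every record, but the addend must then fit in the bits available at the relocated location.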

Relocatable files need a lot of relocation types while executables and shared objects need only a few. The former are often called static relocations while the latter are called dynamic relocations.

Of the few dynamic relocation types, most do not need the addend member. lld provides an option -z rel to use SHT_REL/DT_REL dynamic relocations.

If we disregard the REL dynamic relocation scenario, then all modern architectures use RELA exclusively. With REL, the implicit addend would have to fit in the instruction's immediate field, and most architectures encode the immediate with only a few bits, which is inadequate for many relocatable file uses.

ELFCLASS64, with its 64-bit members, doubles the size compared to ELFCLASS32's 32-bit members. Since relocations often comprise a substantial portion of object files, this size difference can lead to user concerns. However, in practice, a 24-bit symbol index is often sufficient, even in 64-bit contexts. Therefore, if a 64-bit architecture needs fewer than 256 relocation types, ELFCLASS32 can be a viable and more size-efficient option.

Relocations (a.out)

// a.out (System V Release 2), 8 bytes
struct relocation_info {
  int   r_address;
  unsigned r_symbolnum : 24,
           r_pcrel : 1,
           r_length : 2,
           r_extern : 1,
           r_pad : 4;
};

r_symbolnum mirrors ELF's ELF32_R_SYM.

The other bitfields resemble ELF's ELF32_R_TYPE, but are split into distinct fields:

r_pcrel
r_length
others

Reserving dedicated semantics for individual bits can limit adaptability. COFF and ELF opted to remove bitfields in favor of a type to provide greater flexibility.

Relocations (COFF)

// COFF, 10 bytes on disk, 12 bytes with alignment padding
struct reloc {
  long r_vaddr;
  long r_symndx;
  unsigned short r_type;
};

This format resembles ELF's Elf32_Rel.

r_vaddr gives the virtual address of the location at which to apply the relocation action. If we interpret r_vaddr as an offset (as PE does) and restrict section size to 32 bits, we could reuse this structure for 64-bit architectures.

r_symndx is a 32-bit symbol table index.

r_type is a 16-bit relocation type, limited in number compared to ELF.

COFF generally supports fewer relocation types than ELF. System V Release 3 defines very few relocations for each architecture. In binutils, include/coff/*.h files define relocations for more architectures.

Relocations (Mach-O)

// Mach-O, 8 bytes
struct relocation_info {
  int32_t r_address;         /* offset in the section to what is being relocated */
  uint32_t r_symbolnum : 24, /* symbol index if r_extern == 1 or section ordinal if r_extern == 0 */
      r_pcrel : 1,           /* was relocated pc relative already */
      r_length : 2,          /* 0=byte, 1=word, 2=long, 3=quad */
      r_extern : 1,          /* does not include value of sym referenced */
      r_type : 4;            /* if not 0, machine specific relocation type */
};

struct scattered_relocation_info {
#ifdef __BIG_ENDIAN__
  uint32_t r_scattered : 1, r_pcrel : 1, r_length : 2, r_type : 4,
      r_address : 24;
#else
  uint32_t r_address : 24, r_type : 4, r_length : 2, r_pcrel : 1,
      r_scattered : 1;
#endif
  int32_t r_value;
};

Mach-O's relocation structure closely mirrors a.out's, with an adapted r_symbolnum meaning. When r_extern == 0 (local), the r_symbolnum member references a section index instead of a symbol index. This is to support custom sections, breaking the three-section limitation (text, data, and bss) of traditional a.out.
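Since C bitfield layout is implementation-defined, a portable reader typically decodes the second 32-bit word with shifts and masks. The sketch below assumes the little-endian layout, where r_symbolnum occupies the low 24 bits; macho_reloc_fields and decode_reloc_word are names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct macho_reloc_fields {
  uint32_t symbolnum; /* symbol index if is_extern, else section ordinal */
  bool pcrel;
  unsigned length;    /* 0=byte, 1=word, 2=long, 3=quad */
  bool is_extern;
  unsigned type;
};

/* Decode the second word of a relocation_info, little-endian layout. */
static struct macho_reloc_fields decode_reloc_word(uint32_t w) {
  struct macho_reloc_fields f;
  f.symbolnum = w & 0xffffff;
  f.pcrel = ((w >> 24) & 1) != 0;
  f.length = (w >> 25) & 3;
  f.is_extern = ((w >> 27) & 1) != 0;
  f.type = w >> 28;
  return f;
}
```

A consumer checks is_extern first and only then decides whether symbolnum indexes the symbol table or names a section. Only 4 bits remain for the type, so at most 16 relocation types can be expressed.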

As mentioned above, dedicating bits to bitfields (r_pcrel, r_length, and r_scattered) greatly restricts the number of relocation types.

Related to the relocation type limitation, .long foo - . in a data section requires a pair of relocations, SUBTRACTOR and UNSIGNED. I have some notes in Port LLVM XRay to Apple systems.

Size comparison

TODO

Size reduction opportunities

ELFCLASS32 structures are already compact, offering limited size reduction potential. ELFCLASS64 structures, while flexible, can be optimized by sacrificing some flexibility (64-bit quantities). The 64-bit symbol control structure is compact, but the section and relocation structures are quite wasteful if we can sacrifice some flexibility.

As the ELF paper acknowledges, "Relocatable and executable files do not necessarily have the same constraints, and we considered using two file formats. Eventually, we decided the two activities were similar enough that a single format would suffice." There are more tools inspecting executables than relocatable files. So, naturally, we might want to change just relocatable files. Can we use ELFCLASS32 relocatable files for 64-bit architectures?

Well, x86-64 and AArch64 make a clear distinction between ELFCLASS32 and ELFCLASS64. ELFCLASS32 is for ILP32 (x32, aarch64_ilp32) while ELFCLASS64 is for LP64. However, the discontinued Itanium architecture sets a precedent that ELFCLASS32 can be used for LP64 programs. Quoting its psABI (Intel Itanium Processor-specific Application Binary Interface (ABI)):

For Itanium architecture ILP32 relocatable (i.e. of type ET_REL) objects, the file class value in e_ident[EI_CLASS] must be ELFCLASS32. For LP64 relocatable objects, the file class value may be either ELFCLASS32 or ELFCLASS64, and a conforming linker must be able to process either or both classes. ET_EXEC or ET_DYN object file types must use ELFCLASS32 for ILP32 and ELFCLASS64 for LP64 programs.
Addresses appearing in ELFCLASS32 relocatable objects for LP64 programs are implicitly extended to 64 bits by zero-extending.
Note: Some constructs legal in LP64 programs, e.g. absolute 64-bit addresses outside the 32-bit range, may require use of an ELFCLASS64 relocatable object file.

Given the prior art, it seems promising to allow ELFCLASS32 when code size is a concern. Ideally there should be a marker to distinguish ILP32 and LP64-using-ELFCLASS32 object files.

The primary changes reside in the assembler and linker. It's also important to ensure that binary manipulation programs (like objcopy) and dump tools are happy with them.

Further optimization potential lies in exploring the use of Elf32_Rel instead of Elf32_Rela for even smaller relocations.

Replacing control structures

This approach is independent of whether ELFCLASS32 is adopted and can be applied to both ELFCLASS32 and ELFCLASS64. The ELF paper is clear: "ELF allows extension and redefinition for other control structures." However, caution is warranted due to the significant impact on the ecosystem, as many tools rely on the existing structures.

One promising example is Elf32_Shdr_minimized, a custom structure reduced to 32 bytes from the standard Elf32_Shdr's 40 bytes. I would be nervous about it, but if we reduce sh_type to a uint16_t, the structure size drops to 28 bytes.

TODO

WebAssembly