LLD, the LLVM linker, is a mature and fast linker supporting multiple binary formats (ELF, Mach-O, PE/COFF, WebAssembly). Designed as a standalone program, the code base relies heavily on global state, making it less than ideal for library integration. As outlined in RFC: Revisiting LLD-as-a-library design, two main hurdles exist:
I understand that calling a linker API could be convenient, especially when you want to avoid shipping another executable (which can be large when you are linking against LLVM statically). However, I believe that invoking LLD as a separate process remains the recommended approach. There are several advantages:
While spawning a new process offers build system benefits, the issue of global state usage within LLD remains a concern. This is a factor to consider, especially for advanced use cases. Here are the global variables in the LLD 15 code base.
% rg '^extern [^(]* \w+;' lld/ELF
Some global state exists as static member variables.
LLD has been undergoing a transformation to reduce its reliance on global variables. This improves its suitability for library integration.
In 2021, global variables were removed from lld/Common. The COFF port followed suit, eliminating most of its global variables.
Inspired by these advancements, I conceived a plan to eliminate global variables from the ELF port. In 2022, as part of the work to enable parallel section initialization, I introduced struct Ctx in lld/ELF/Config.h. Here is my plan:
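- Migrate global variables into Ctx.
- Modify functions to accept a Ctx &ctx parameter.
- Turn the global ctx into a local variable within lld::elf::link.
Moving global variables into Ctx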
Over the past two and a half years, I have migrated global variables into the Ctx class, e.g.:
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
I did not do anything with the global variables in 2023. The work was resumed in July 2024. I moved TarWriter, SymbolAux, Out, ElfSym, outputSections, etc. into Ctx.
struct Ctx {
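The migration pattern can be illustrated with a minimal sketch. The types and members below are simplified, hypothetical stand-ins (they are not LLD's real declarations); the point is only that state which used to live in namespace-scope variables becomes members of one object, and code reaches it through a reference.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the command-line options.
struct Config {
  std::string outputFile = "a.out";
  bool verbose = false;
};

// Before: state lived in namespace-scope variables, e.g.
//   extern Config config;
//   extern std::vector<std::string> outputSections;
// After: the same state becomes members of one context object.
struct Ctx {
  Config arg;                              // formerly a global options object
  std::vector<std::string> outputSections; // formerly a global vector
};

// Code reaches the state through the context it is handed.
inline void addOutputSection(Ctx &ctx, const std::string &name) {
  assert(!name.empty() && "section name must not be empty");
  ctx.outputSections.push_back(name);
}
```

With this shape, two Ctx objects can coexist without observing each other's sections, which is what library use requires.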
The config variable, used to store command-line options, was pervasive throughout lld/ELF. To enhance code clarity and maintainability, I renamed it to ctx.arg (mold naming).
Passing Ctx &ctx as parameters
The subsequent phase involved adding Ctx &ctx as a parameter to numerous functions and classes, gradually eliminating references to the global ctx.
I added Ctx &ctx as a member variable to a few classes (e.g. SyntheticSection, OutputSection) to minimize the modifications to member functions. This approach was not suitable for Symbol and InputSection, since these objects are so numerous that even a single extra word per object could increase memory consumption significantly.
// Writer.cpp
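The trade-off between storing the context and passing it can be sketched as follows. The class names OutputSectionLike and SymbolLike and the imageBase member are hypothetical, chosen only for illustration; they do not mirror LLD's actual class layouts.

```cpp
#include <cassert>
#include <cstdint>

struct Ctx { // hypothetical, simplified context
  uint64_t imageBase = 0x200000;
};

// Few instances exist, so storing a reference costs little and keeps
// member function signatures unchanged.
class OutputSectionLike {
public:
  OutputSectionLike(Ctx &ctx, uint64_t offset) : ctx(ctx), offset(offset) {}
  uint64_t getVA() const { return ctx.imageBase + offset; }

private:
  Ctx &ctx;
  uint64_t offset;
};

// Very many instances may exist, so the object stays small and the
// context is passed to each member function that needs it.
struct SymbolLike {
  uint64_t value;
  uint64_t getVA(Ctx &ctx) const { return ctx.imageBase + value; }
};

static_assert(sizeof(SymbolLike) == sizeof(uint64_t),
              "no per-object context pointer");
```

The static_assert documents the design choice: the frequently allocated type carries no per-object reference, at the price of an extra argument at each call site.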
Removing the global ctx variable
Once the global ctx variable's reference count reached zero, it was time to remove it entirely. I implemented the change on November 16, 2024.
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
Previously, cleanupCallback was essential for resetting the global ctx when lld::elf::link was invoked multiple times. With the removal of the global variable, this callback is no longer necessary. We can now rely on the constructor to initialize Ctx and avoid the need for a reset function.
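Why a local context removes the need for a reset can be shown with a small sketch. linkSketch is a hypothetical entry point, not the real lld::elf::link signature, and the Ctx members are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Ctx { // hypothetical, simplified context
  std::vector<std::string> inputFiles;
  unsigned errorCount = 0;
};

// Hypothetical driver entry point. Returns the number of inputs seen by
// this invocation.
inline std::size_t linkSketch(const std::vector<std::string> &args) {
  Ctx ctx; // fresh state, initialized by the constructor on every call
  for (const std::string &arg : args)
    ctx.inputFiles.push_back(arg);
  return ctx.inputFiles.size();
} // ctx is destroyed here, so nothing leaks into the next invocation
```

Calling linkSketch twice yields independent results because each call constructs its own Ctx; with a global, the second call would have seen the first call's inputs unless something reset them.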
lld/Common
While significant progress has been made in lld/ELF, lld/Common needs a lot of work as well. A lot of shared utility code (diagnostics, bump allocator) utilizes the global lld::context().
/// Returns the default error handler.
Although thread-local variables are an option, worker threads spawned by llvm/lib/Support/Parallel.cpp don't inherit their values from the main thread. Given our direct access to Ctx &ctx, we can leverage context-aware APIs as replacements.
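The thread-local limitation is a general C++ property and can be demonstrated with a short sketch. It uses std::thread as a stand-in for the worker threads in Parallel.cpp, and tlsCtx is a hypothetical variable, not something LLD defines.

```cpp
#include <cassert>
#include <thread>

struct Ctx { // hypothetical, simplified context
  int id = 0;
};

// Hypothetical thread-local context pointer.
thread_local Ctx *tlsCtx = nullptr;

// Returns true if a freshly spawned thread sees the spawning thread's value.
inline bool workerInheritsThreadLocal() {
  Ctx ctx;
  tlsCtx = &ctx; // set on the calling thread only
  bool seen = false;
  std::thread worker([&seen] { seen = (tlsCtx != nullptr); });
  worker.join();
  tlsCtx = nullptr;
  return seen;
}

// Explicit passing works regardless of which thread runs the task.
inline int workerReadsExplicitCtx() {
  Ctx ctx;
  ctx.id = 42;
  int result = 0;
  std::thread worker([&ctx, &result] { result = ctx.id; });
  worker.join();
  return result;
}
```

Each thread gets its own copy of a thread_local variable, initialized to nullptr here, so the worker never observes the pointer set by the spawning thread. Capturing Ctx &ctx explicitly avoids the problem.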
https://github.com/llvm/llvm-project/pull/112319 introduced context-aware diagnostic utilities:
- log("xxx") => Log(ctx) << "xxx"
- message("xxx") => Msg(ctx) << "xxx"
- warn("xxx") => Warn(ctx) << "xxx"
- errorOrWarn(toString(f) + "xxx") => Err(ctx) << f << "xxx"
- error(toString(f) + "xxx") => ErrAlways(ctx) << f << "xxx"
- fatal("xxx") => Fatal(ctx) << "xxx"
As of Nov 16, 2024, I have eliminated log/warn/error/fatal from lld/ELF.
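One way such stream-style, context-aware diagnostics can be built is sketched below. This is not LLD's implementation: WarnSketch and the warnings member are hypothetical. The idea is that a temporary object accumulates the message and delivers it to the context it was constructed with when the statement ends.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

struct Ctx { // hypothetical: diagnostics are collected per context
  std::vector<std::string> warnings;
};

// Accumulates a message and hands it to the context when the temporary
// is destroyed at the end of the full expression.
class WarnSketch {
public:
  explicit WarnSketch(Ctx &ctx) : ctx(ctx) {}
  WarnSketch(const WarnSketch &) = delete;
  WarnSketch &operator=(const WarnSketch &) = delete;
  ~WarnSketch() { ctx.warnings.push_back(os.str()); }

  template <typename T> WarnSketch &operator<<(const T &value) {
    os << value;
    return *this;
  }

private:
  Ctx &ctx;
  std::ostringstream os;
};
```

A call site then reads `WarnSketch(ctx) << "file " << 1 << ": xxx";`, and the message lands in that particular ctx rather than in a process-wide handler.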
Miscellaneous suggestions:
- lld::saver
- void message(const Twine &msg, llvm::raw_ostream &s = outs());, which utilizes lld::outs()
- lld::make from lld/include/lld/Common/Memory.h
LTO link jobs utilize LLVM. Understanding its global state is crucial.
While LLVM allows for multiple LLVMContext instances to be allocated and used concurrently, it's important to note that these instances share certain global states, such as cl::opt and ManagedStatic. Specifically, it's not possible to run two concurrent LLVM compilations (including LTO link jobs) with distinct sets of cl::opt option values. To link with distinct cl::opt values, even after removing LLD's global state, you'll need to spawn a new LLD process.
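The constraint can be pictured with a toy model. The sketch below does not use LLVM's cl::opt API; hypotheticalThreshold and CompileJobSketch are invented names that only mimic the shape of a process-wide option read by otherwise independent jobs.

```cpp
#include <cassert>

// Stand-in for a process-wide option: exactly one instance per process,
// no matter how many jobs or contexts exist.
inline int &hypotheticalThreshold() {
  static int value = 10;
  return value;
}

// Stand-in for a compilation or LTO job that owns its own context but
// still consults the process-wide option.
struct CompileJobSketch {
  int id;
  int thresholdSeen() const { return hypotheticalThreshold(); }
};
```

If one job sets the option to 100, every other job in the process observes 100 as well, so two jobs cannot run concurrently with different values. Separate processes each get their own copy.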
Any proposal that moves away from global state seems to complicate cl::opt usage, making it impractical.
LLD also utilizes functions from llvm/Support/Parallel.h for parallelism. These functions rely on global state like getDefaultExecutor and llvm::parallel::strategy. Ongoing work by Alexandre Ganea aims to make these functions context-aware. (It was nice to meet you in person at the LLVM Developers' Meeting last month.)