AddressSanitizer (ASan) is a compiler technology that detects addressability-related memory errors with some additional checks. It consists of two components: compiler instrumentation and a runtime library.
This article describes global variable instrumentation.
AddressSanitizer instruments certain defined global variables of LLVM external or internal linkage. To be instrumented, a variable must satisfy several conditions. For example, it must not have the no_sanitize_address attribute in LLVM IR. Variables receive this attribute when annotated as __attribute__((no_sanitize("address"))) or __attribute__((disable_sanitizer_instrumentation)) in C/C++.
int g0;
Each instrumented global variable is padded with a right redzone to detect out-of-bounds accesses.
@g0 = dso_local global { i32, [28 x i8] } zeroinitializer, comdat, align 32
@g1 = dso_local constant { i64, [24 x i8] } zeroinitializer, comdat, align 32
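To make the padding concrete, here is a minimal sketch of how a right redzone size could be derived from the variable size. It is loosely modeled on the heuristic in LLVM's instrumentation pass, but the names RedzoneSizeForGlobal, kMinRedzone, and kMaxRedzone are hypothetical and the constants are assumptions. Only the two cases visible in the IR above (4 bytes padded with 28, 8 bytes padded with 24) are confirmed by that output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed minimum redzone, chosen to match the `align 32` in the IR above.
constexpr std::uint64_t kMinRedzone = 32;
// Assumed upper bound on the redzone size.
constexpr std::uint64_t kMaxRedzone = 1u << 18;

// Hypothetical helper: right-redzone size in bytes for a global of `size` bytes.
std::uint64_t RedzoneSizeForGlobal(std::uint64_t size) {
  std::uint64_t rz;
  if (size <= kMinRedzone / 2) {
    // Small objects: pad up to exactly one minimum-redzone unit in total.
    rz = kMinRedzone - size;
  } else {
    // Roughly a quarter of the object size, clamped to [kMinRedzone, kMaxRedzone].
    rz = size / kMinRedzone / 4 * kMinRedzone;
    rz = std::min(std::max(rz, kMinRedzone), kMaxRedzone);
    // Round up so that size + rz is a multiple of kMinRedzone.
    if (size % kMinRedzone != 0)
      rz += kMinRedzone - size % kMinRedzone;
  }
  assert((size + rz) % kMinRedzone == 0);
  return rz;
}
```

Under these assumptions, the padded size is always a multiple of 32, which is why the instrumented variables can be aligned to 32 bytes.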
On ELF platforms, by default (since Clang 17.0), each instrumented global variable receives an associated __asan_global_$name variable, which is located within the asan_globals section. Additionally, there are several related variables, including some unnamed ones (@0 and @1), as well as __odr_asan_gen_g0 and __odr_asan_gen_g1, along with metadata nodes (!0 and !1), which we will discuss in more detail later.
@___asan_gen_.1 = private unnamed_addr constant [3 x i8] c"g0\00", align 1
The module constructor asan.module_ctor processes garbage-collectable asan_globals input sections. This constructor invokes a runtime callback to register the instrumented global variables, which involves poisoning the redzone and conducting ODR violation checks. I will discuss ODR violation checking later.
define internal void @asan.module_ctor() #0 comdat {
  call void @__asan_init()
  call void @__asan_version_mismatch_check_v8()
  call void @__asan_register_elf_globals(i64 ptrtoint (ptr @___asan_globals_registered to i64), i64 ptrtoint (ptr @__start_asan_globals to i64), i64 ptrtoint (ptr @__stop_asan_globals to i64))
  ret void
}
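The __start_asan_globals and __stop_asan_globals symbols passed to the callback are section boundary symbols that ELF linkers define for sections whose names are valid C identifiers. The following sketch shows the same mechanism with a made-up section. It assumes an ELF target with GCC or Clang and GNU ld or lld; DemoGlobal, demo_globals, CountDemoGlobals, and TotalDemoSize are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical descriptor standing in for __asan_global.
struct DemoGlobal {
  std::size_t size;
  std::size_t size_with_redzone;
};

// Place two descriptors in a section whose name is a valid C identifier.
// `used` keeps the compiler from dropping the otherwise unreferenced objects.
__attribute__((section("demo_globals"), used)) static DemoGlobal demo_g0 = {4, 32};
__attribute__((section("demo_globals"), used)) static DemoGlobal demo_g1 = {8, 32};

// Defined by the linker (assumption: ELF with GNU ld or lld), not by this file.
extern "C" DemoGlobal __start_demo_globals[];
extern "C" DemoGlobal __stop_demo_globals[];

// Number of descriptors the linker gathered into the output section.
std::size_t CountDemoGlobals() {
  std::uintptr_t start = reinterpret_cast<std::uintptr_t>(__start_demo_globals);
  std::uintptr_t stop = reinterpret_cast<std::uintptr_t>(__stop_demo_globals);
  assert((stop - start) % sizeof(DemoGlobal) == 0);
  return (stop - start) / sizeof(DemoGlobal);
}

// Walk the descriptors the way a registration callback would.
std::size_t TotalDemoSize() {
  std::size_t total = 0;
  std::size_t n = CountDemoGlobals();
  for (std::size_t i = 0; i < n; ++i)
    total += __start_demo_globals[i].size;
  return total;
}
```

Because the range is delimited by the linker, descriptors contributed by every object file in the link end up in one contiguous array, and a single call can register all of them.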
The runtime poisons the redzone of each instrumented global variable.
void __asan_register_elf_globals(uptr *flag, void *start, void *stop) {
  if (*flag) return;
  if (!start) return;
  CHECK_EQ(0, ((uptr)stop - (uptr)start) % sizeof(__asan_global));
  __asan_global *globals_start = (__asan_global*)start;
  __asan_global *globals_stop = (__asan_global*)stop;
  __asan_register_globals(globals_start, globals_stop - globals_start);
  *flag = 1;
}
void __asan_register_globals(__asan_global *globals, uptr n) {
  if (!flags()->report_globals) return;
  ...
  for (uptr i = 0; i < n; i++)
    RegisterGlobal(&globals[i]);
  // Poison the metadata. It should not be accessible to user code.
  PoisonShadow(reinterpret_cast<uptr>(globals), n * sizeof(__asan_global),
               kAsanGlobalRedzoneMagic);
}
static void RegisterGlobal(const Global *g) {
  ...
  if (CanPoisonMemory())
    PoisonRedZones(*g);
}
Every full granule in the shadow of the redzone is filled with 0xf9 (kAsanGlobalRedzoneMagic), while a partial granule is filled in a manner similar to partially-addressable stack memory.
ALWAYS_INLINE void PoisonRedZones(const Global &g) {
  uptr aligned_size = RoundUpTo(g.size, ASAN_SHADOW_GRANULARITY);
  FastPoisonShadow(g.beg + aligned_size, g.size_with_redzone - aligned_size,
                   kAsanGlobalRedzoneMagic);
  if (g.size != aligned_size) {
    FastPoisonShadowPartialRightRedzone(
        g.beg + RoundDownTo(g.size, ASAN_SHADOW_GRANULARITY),
        g.size % ASAN_SHADOW_GRANULARITY, ASAN_SHADOW_GRANULARITY,
        kAsanGlobalRedzoneMagic);
  }
}
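The effect of PoisonRedZones can be modeled without real shadow memory. The sketch below returns one shadow byte per granule of a padded global, assuming a shadow granularity of 8 bytes and that a partial granule stores the number of addressable leading bytes. ShadowForGlobal is a hypothetical helper, not a runtime function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kGranularity = 8;           // assumed ASAN_SHADOW_GRANULARITY
constexpr std::uint8_t kGlobalRedzoneMagic = 0xf9;  // kAsanGlobalRedzoneMagic

// Hypothetical model: shadow bytes covering [beg, beg + size_with_redzone)
// for a global whose first `size` bytes are addressable.
std::vector<std::uint8_t> ShadowForGlobal(std::uint64_t size,
                                          std::uint64_t size_with_redzone) {
  assert(size <= size_with_redzone);
  assert(size_with_redzone % kGranularity == 0);
  std::vector<std::uint8_t> shadow(size_with_redzone / kGranularity, 0);
  for (std::uint64_t i = 0; i < shadow.size(); ++i) {
    std::uint64_t granule_begin = i * kGranularity;
    if (granule_begin + kGranularity <= size) {
      shadow[i] = 0;  // fully addressable granule
    } else if (granule_begin >= size) {
      shadow[i] = kGlobalRedzoneMagic;  // granule lies entirely in the redzone
    } else {
      // Partial granule: only the first (size - granule_begin) bytes are valid.
      shadow[i] = static_cast<std::uint8_t>(size - granule_begin);
    }
  }
  return shadow;
}
```

For the 4-byte g0 padded to 32 bytes, this model yields the shadow bytes 04 f9 f9 f9, and for the 8-byte g1 it yields 00 f9 f9 f9.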
If an access occurs within a redzone byte poisoned by 0xf9 or within a partial redzone preceding 0xf9, the runtime will report a global-buffer-overflow error. Here is an example:
cat > a.c <<e
% ./a 1 # a[argc * 5] == a[10] is out-of-bounds
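The decision to report can be sketched as a check on a single shadow byte. This is a simplified model of the usual ASan check for an access of at most 8 bytes that does not cross a granule boundary; AccessIsReported is a hypothetical name, and the 8-byte granularity is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the per-access check.
//   shadow:            the shadow byte of the granule being accessed
//   offset_in_granule: address & 7
//   access_size:       1, 2, 4, or 8 bytes
bool AccessIsReported(std::int8_t shadow, unsigned offset_in_granule,
                      unsigned access_size) {
  assert(offset_in_granule + access_size <= 8);
  if (shadow == 0)
    return false;  // the whole granule is addressable
  // A negative value such as 0xf9 poisons the whole granule. A small positive
  // value k means that only the first k bytes of the granule are addressable.
  return static_cast<int>(offset_in_granule + access_size - 1) >= shadow;
}
```

With a shadow byte of 4, a 4-byte load at offset 0 passes while a 1-byte load at offset 4 is reported, which corresponds to the "partial redzone preceding 0xf9" case.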
The global variable poisoning mechanism offers a straightforward means to detect differences in variable definitions between two components, such as between the main executable and a shared object, or between two shared objects. This can be considered a category of ODR violations.
echo 'int var; int main() { return var; }' > a.cc
% ./a
The default mode, detect_odr_violation=2, also prohibits symbol interposition on variables. If you change long to int in b.cc, you will still encounter an odr-violation error. In contrast, with detect_odr_violation=1, errors are suppressed if the registered variables are of the same size.
% ASAN_OPTIONS=detect_odr_violation=1 ./a
% ASAN_OPTIONS=detect_odr_violation=2 ./a
=================================================================
==2574052==ERROR: AddressSanitizer: odr-violation (0x562d39db1200):
...
For a variable named $var, a one-byte variable, __odr_asan_gen_$var, is created with the original linkage (which essentially must be external).
If $var is defined in two instrumented modules, their __odr_asan_gen_$var symbols refer to the same copy due to symbol interposition. When registering $var, the runtime checks whether __odr_asan_gen_$var is already 1. If so, the program has an ODR violation; otherwise, __odr_asan_gen_$var is set to 1.
@__odr_asan_gen_g0 = global i8 0, align 1
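The indicator protocol, together with the two detect_odr_violation levels, can be modeled in a few lines. This is a sketch based only on the description in this article, not the runtime's actual code; ModelGlobal, ModelRuntime, and SecondDefinitionReported are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the relevant fields of a global descriptor.
struct ModelGlobal {
  std::size_t size;
  std::uint8_t *odr_indicator;  // models the address of __odr_asan_gen_$var
};

class ModelRuntime {
 public:
  explicit ModelRuntime(int detect_odr_violation) : level_(detect_odr_violation) {}

  // Registers `g` and returns true if an ODR violation would be reported.
  bool Register(const ModelGlobal &g) {
    assert(g.odr_indicator != nullptr);
    bool violation = false;
    if (*g.odr_indicator == 1) {
      // Another module already registered a variable sharing this indicator.
      for (const ModelGlobal &prev : registered_) {
        if (prev.odr_indicator == g.odr_indicator &&
            (level_ >= 2 || prev.size != g.size))
          violation = true;
      }
    } else {
      *g.odr_indicator = 1;  // first registration
    }
    registered_.push_back(g);
    return violation;
  }

 private:
  int level_;
  std::vector<ModelGlobal> registered_;
};

// Two modules define the same variable, so both descriptors share one indicator.
bool SecondDefinitionReported(int level, std::size_t size_a, std::size_t size_b) {
  std::uint8_t indicator = 0;
  ModelRuntime runtime(level);
  runtime.Register({size_a, &indicator});
  return runtime.Register({size_b, &indicator});
}
```

In this model, two same-sized definitions are reported at level 2 but not at level 1, while definitions of different sizes are reported at both levels.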
The private aliases @0 and @1 were due to http://reviews.llvm.org/D15642.
The previous example uses -fsanitize-address-use-odr-indicator.
Prior to Clang 16, -fno-sanitize-address-use-odr-indicator was the default for non-Windows platforms. The runtime checks whether a variable has been registered by verifying whether its redzone has been poisoned, and reports an ODR violation when the redzone has already been poisoned.
@___asan_gen_.1 = private unnamed_addr constant [3 x i8] c"g0\00", align 1
@___asan_gen_.2 = private unnamed_addr constant [3 x i8] c"g1\00", align 1
@__asan_global_g0 = private global { i64, i64, i64, i64, i64, i64, i64, i64 } { i64 ptrtoint (ptr @g0 to i64), i64 4, i64 32, i64 ptrtoint (ptr @___asan_gen_.1 to i64), i64 ptrtoint (ptr @___asan_gen_ to i64), i64 0, i64 0, i64 0 }, section "asan_globals", !associated !0
@__asan_global_g1 = private global { i64, i64, i64, i64, i64, i64, i64, i64 } { i64 ptrtoint (ptr @g1 to i64), i64 8, i64 32, i64 ptrtoint (ptr @___asan_gen_.2 to i64), i64 ptrtoint (ptr @___asan_gen_ to i64), i64 0, i64 0, i64 0 }, section "asan_globals", !associated !1
@llvm.compiler.used = appending global [4 x ptr] [ptr @g0, ptr @g1, ptr @__asan_global_g0, ptr @__asan_global_g1], section "llvm.metadata"
This mode eliminates the need for an additional variable like __odr_asan_gen_$var, but it can lead to interaction issues when mixing instrumented and uninstrumented components. In the case of a shared object, if the reference to $var in __asan_global_$var is interposed with an uninstrumented variable due to symbol interposition, it may result in a spurious error stating, "The following global variable is not properly aligned."
For Clang 16, I introduced the use of -fsanitize-address-use-odr-indicator by default for non-Windows targets (see https://reviews.llvm.org/D137227).
(Additionally, https://reviews.llvm.org/D127911 changed the ODR indicator symbol name to __odr_asan_gen_$demangled.)
Private aliases have an interesting interaction with copy relocations. This issue is reported at https://gcc.gnu.org/PR68016.
The default -fsanitize-address-use-odr-indicator in Clang 16 and later cannot detect the global-buffer-overflow error below:
echo 'int f[5] = {1};' > foo.cc
The definition of f in foo.cc is instrumented, resulting in the creation of __asan_global_f. However, the executable actually accesses the copy created by the linker due to copy relocation.
When -asan-use-private-alias=1 is in effect (the default since Clang 16), the __asan_global_f variable references the unused copy inside the shared object. The executable accesses the copy-relocated variable, whose redzone is not poisoned, resulting in no error.
Conversely, when -asan-use-private-alias=0 is in effect, the __asan_global_f variable references the copy-relocated variable and poisons the redzone within the executable. Consequently, accessing f[5] leads to the expected error.
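The two cases reduce to one question: does __asan_global_f describe the same copy of f that the executable accesses? The toy model below encodes only that reasoning. Copy, DescribedCopy, and OverflowDetected are hypothetical names, and nothing here touches a real linker or the ASan runtime.

```cpp
#include <cassert>

// The two copies of `f` that exist at run time after a copy relocation.
enum class Copy { InSharedObject, InExecutable };

// Which copy __asan_global_f describes, and therefore whose redzone gets poisoned.
// A private alias always binds to the definition inside the shared object, whereas
// a reference to the public symbol is resolved to the copy-relocated variable.
Copy DescribedCopy(bool use_private_alias) {
  return use_private_alias ? Copy::InSharedObject : Copy::InExecutable;
}

// The executable reads f[5] from its own copy because of the copy relocation.
// The overflow is caught only if that copy is the one with a poisoned redzone.
bool OverflowDetected(bool use_private_alias) {
  const Copy accessed = Copy::InExecutable;
  return DescribedCopy(use_private_alias) == accessed;
}
```

Here OverflowDetected(true) is false and OverflowDetected(false) is true, matching the behavior of -asan-use-private-alias=1 and -asan-use-private-alias=0 described above.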
Since Clang 17, asan.module_ctor is, by default, placed in a COMDAT group. When multiple instrumented relocatable object files are linked together, only one asan.module_ctor is retained.
__asan_global_g0 is positioned in a section that links to the section defining g0 using the SHF_LINK_ORDER flag. During linking, if the linker discards the section defining g0, the asan_globals section containing __asan_global_g0 will also be discarded. For more detail on SHF_LINK_ORDER, you can refer to Metadata sections, COMDAT and SHF_LINK_ORDER.
Before Clang 17, the default behavior was to use -fno-sanitize-address-globals-dead-stripping. In this mode, the instrumentation places pointers to instrumented global variables in a metadata array and calls __asan_register_globals. __asan_register_globals then iterates over the array and registers each global variable.
@g0 = dso_local global { i32, [28 x i8] } zeroinitializer, align 32
asan.module_ctor references the metadata array @0, which, in turn, references @1 and @2. @1 and @2 reference the global variables g0 and g1, respectively. This unfortunately means that g0 and g1 cannot be discarded by section-based garbage collection.
It's important to note that this version of asan.module_ctor is not placed within a COMDAT group. In another compilation unit, a separate asan.module_ctor references a different metadata array. As a result, these asan.module_ctor functions cannot share the same implementation.
In a linked component, both __asan_init and __asan_version_mismatch_check_v8 will be called multiple times, incurring a small overhead.
Regrettably, the default setting of -fsanitize-address-globals-dead-stripping in Clang 17 had a bug. Specifically, when there are no global variables, and the unique module ID is non-empty, a COMDAT asan.module_ctor is created without any __asan_register_elf_globals calls. If this COMDAT is selected as the prevailing copy by the linker, the linkage unit will lack a __asan_register_elf_globals call, resulting in an unpoisoned redzone and a non-functional ODR violation checker.
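The failure mode can be illustrated with a small model of COMDAT selection. It assumes that every listed object file contributes a COMDAT asan.module_ctor and that the linker keeps the first copy it encounters; ModelObject and LinkageUnitRegistersGlobals are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical view of one relocatable object file in link order. With the buggy
// behavior, its asan.module_ctor calls __asan_register_elf_globals only when the
// file contains instrumented global variables.
struct ModelObject {
  bool has_instrumented_globals;
};

// Returns whether the linked output still calls __asan_register_elf_globals.
// Assumption: the first COMDAT copy in link order becomes the prevailing one.
bool LinkageUnitRegistersGlobals(const std::vector<ModelObject> &objects) {
  if (objects.empty())
    return false;
  return objects.front().has_instrumented_globals;
}
```

Under these assumptions, linking an object without global variables ahead of one with global variables loses the registration call for the whole linkage unit, whereas the opposite order keeps it.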
I have fixed this in the main branch (#67745), but LLVM 17.0.2 does not contain the fix.
Before Clang 15, Clang's instrumentation included llvm.asan.globals, and the AddressSanitizer runtime required its object file feature for symbolization. https://reviews.llvm.org/D127552 enabled debug information for symbolization and https://reviews.llvm.org/D127911 deleted the metadata node llvm.asan.globals.