
    AddressSanitizer: global variable instrumentation

    Posted by MaskRay on 2023-10-16 06:41:55

    AddressSanitizer (ASan) is a compiler technology that detects addressability-related memory errors with some additional checks. It consists of two components: compiler instrumentation and a runtime library. To put it simply,

    • The compiler instruments global variables, stack frames, and heap allocations to monitor shadow memory.
    • The compiler also instruments memory access instructions to verify shadow memory.
    • In case of an error, the inserted code invokes a callback (implemented in the runtime library) to report the error along with a stack trace. Typically, the program will terminate after displaying the error message.

    This article describes global variable instrumentation.

    Global variable instrumentation

    AddressSanitizer instruments certain defined global variables of LLVM external or internal linkage. To be instrumented, the variable must satisfy several conditions.

    • It is not thread-local.
    • Its alignment is not large (no greater than the minimum redzone size, typically 32 bytes).
    • It is not synthesized by LLVM.
    • It does not have the no_sanitize_address attribute in LLVM IR. Variables receive this attribute when annotated as __attribute__((no_sanitize("address"))) or __attribute__((disable_sanitizer_instrumentation)) in C/C++.

    int g0;
    const long g1 = 42;

    Each instrumented global variable is padded with a right redzone to detect out-of-bounds accesses.

    @g0 = dso_local global { i32, [28 x i8] } zeroinitializer, comdat, align 32
    @g1 = dso_local constant { i64, [24 x i8] } { i64 42, [24 x i8] zeroinitializer }, comdat, align 32
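
    The padded layout can be mirrored in plain C++ to check the arithmetic. This is only an illustrative sketch: the names PaddedG0, PaddedG1 and InRedzoneG0 are made up for this example, and fixed-width integer types stand in for the 4-byte int and 8-byte long that the IR above implies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of `{ i32, [28 x i8] }` with `align 32`.
struct alignas(32) PaddedG0 {
  std::int32_t value;  // the original variable
  char redzone[28];    // right redzone appended by the instrumentation
};

// Hypothetical mirror of `{ i64, [24 x i8] }` with `align 32`.
struct alignas(32) PaddedG1 {
  std::int64_t value;
  char redzone[24];
};

static_assert(sizeof(PaddedG0) == 32, "4-byte object + 28-byte redzone");
static_assert(sizeof(PaddedG1) == 32, "8-byte object + 24-byte redzone");

// Returns true if a byte offset into the padded g0 falls inside the redzone.
inline bool InRedzoneG0(std::size_t offset) {
  assert(offset < sizeof(PaddedG0));
  return offset >= offsetof(PaddedG0, redzone);
}
```

    In both cases the object and its redzone together fill one 32-byte unit, which is also the alignment given to the padded variable.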

    On ELF platforms, by default (since Clang 17.0) each instrumented global variable receives an associated __asan_global_$name variable, which is located within the asan_globals section. Additionally, there are several related variables, including some unnamed ones (@0 and @1), as well as __odr_asan_gen_g0 and __odr_asan_gen_g1, along with metadata nodes (!0 and !1), which we will discuss in more detail later.

    @___asan_gen_.1 = private unnamed_addr constant [3 x i8] c"g0\00", align 1
    @___asan_gen_.2 = private unnamed_addr constant [3 x i8] c"g1\00", align 1
    @__asan_global_g0 = private global { i64, i64, i64, i64, i64, i64, i64, i64 } { i64 ptrtoint (ptr @0 to i64), i64 4, i64 32, i64 ptrtoint (ptr @___asan_gen_.1 to i64), i64 ptrtoint (ptr @___asan_gen_ to i64), i64 0, i64 0, i64 ptrtoint (ptr @__odr_asan_gen_g0 to i64) }, section "asan_globals", comdat($g0), !associated !0
    @__asan_global_g1 = private global { i64, i64, i64, i64, i64, i64, i64, i64 } { i64 ptrtoint (ptr @1 to i64), i64 8, i64 32, i64 ptrtoint (ptr @___asan_gen_.2 to i64), i64 ptrtoint (ptr @___asan_gen_ to i64), i64 0, i64 0, i64 ptrtoint (ptr @__odr_asan_gen_g1 to i64) }, section "asan_globals", comdat($g1), !associated !1
    @llvm.compiler.used = appending global [4 x ptr] [ptr @g0, ptr @g1, ptr @__asan_global_g0, ptr @__asan_global_g1], section "llvm.metadata"

    !0 = !{ptr @g0}
    !1 = !{ptr @g1}

    The module constructor asan.module_ctor processes garbage-collectable asan_globals input sections. This constructor invokes a runtime callback to register the instrumented global variables, which involves poisoning the redzone and conducting ODR violation checks. I will discuss ODR violation checking later.

    define internal void @asan.module_ctor() #0 comdat {
      call void @__asan_init()
      call void @__asan_version_mismatch_check_v8()
      call void @__asan_register_elf_globals(i64 ptrtoint (ptr @___asan_globals_registered to i64), i64 ptrtoint (ptr @__start_asan_globals to i64), i64 ptrtoint (ptr @__stop_asan_globals to i64))
      ret void
    }

    The runtime poisons the redzone of each instrumented global variable.

    void __asan_register_elf_globals(uptr *flag, void *start, void *stop) {
      if (*flag) return;
      if (!start) return;
      CHECK_EQ(0, ((uptr)stop - (uptr)start) % sizeof(__asan_global));
      __asan_global *globals_start = (__asan_global*)start;
      __asan_global *globals_stop = (__asan_global*)stop;
      __asan_register_globals(globals_start, globals_stop - globals_start);
      *flag = 1;
    }

    void __asan_register_globals(__asan_global *globals, uptr n) {
      if (!flags()->report_globals) return;
      ...
      for (uptr i = 0; i < n; i++)
        RegisterGlobal(&globals[i]);

      // Poison the metadata. It should not be accessible to user code.
      PoisonShadow(reinterpret_cast<uptr>(globals), n * sizeof(__asan_global),
                   kAsanGlobalRedzoneMagic);
    }

    static void RegisterGlobal(const Global *g) {
      ...
      if (CanPoisonMemory())
        PoisonRedZones(*g);
    }
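
    The flag guard in __asan_register_elf_globals can be tried in isolation with a small model. This is a sketch under stated assumptions, not the runtime's code: GlobalDesc, Registry and RegisterRange are hypothetical names, and the registry is a plain vector instead of the runtime's own bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the runtime's per-global descriptor.
struct GlobalDesc {
  std::uintptr_t beg;             // start address of the global
  std::size_t size;               // size of the original object
  std::size_t size_with_redzone;  // size including the right redzone
};

// Hypothetical registry of everything registered so far.
inline std::vector<GlobalDesc> &Registry() {
  static std::vector<GlobalDesc> registry;
  return registry;
}

// Same shape as __asan_register_elf_globals: register the descriptors in
// [start, stop) once, then set *flag so a later call with the same flag
// becomes a no-op.
inline void RegisterRange(bool *flag, const GlobalDesc *start,
                          const GlobalDesc *stop) {
  assert(flag != nullptr);
  if (*flag) return;   // this range has already been registered
  if (!start) return;  // no descriptor section is present
  for (const GlobalDesc *g = start; g != stop; ++g)
    Registry().push_back(*g);
  *flag = true;
}
```

    Calling RegisterRange twice with the same flag registers each descriptor only once, which is the property the real function gets from ___asan_globals_registered.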

    Every full granule in the shadow of the redzone is filled with 0xf9 (kAsanGlobalRedzoneMagic) while a partial granule is filled in a manner similar to partially-addressable stack memory.

    ALWAYS_INLINE void PoisonRedZones(const Global &g) {
      uptr aligned_size = RoundUpTo(g.size, ASAN_SHADOW_GRANULARITY);
      FastPoisonShadow(g.beg + aligned_size, g.size_with_redzone - aligned_size,
                       kAsanGlobalRedzoneMagic);
      if (g.size != aligned_size) {
        FastPoisonShadowPartialRightRedzone(
            g.beg + RoundDownTo(g.size, ASAN_SHADOW_GRANULARITY),
            g.size % ASAN_SHADOW_GRANULARITY, ASAN_SHADOW_GRANULARITY,
            kAsanGlobalRedzoneMagic);
      }
    }
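
    To see which shadow bytes PoisonRedZones produces, the same arithmetic can be applied to a small vector that stands in for the shadow of one global. This is a minimal sketch: ShadowForGlobal is a hypothetical helper, the granularity is assumed to be 8 bytes, and the partial granule is assumed to hold the number of addressable bytes (size modulo 8).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kGranularity = 8;        // application bytes per shadow byte
constexpr std::uint8_t kGlobalRedzone = 0xf9;  // kAsanGlobalRedzoneMagic

// Returns the shadow bytes that cover one padded global.
inline std::vector<std::uint8_t> ShadowForGlobal(std::size_t size,
                                                 std::size_t size_with_redzone) {
  assert(size <= size_with_redzone);
  assert(size_with_redzone % kGranularity == 0);
  // Start with everything addressable (shadow byte 0x00).
  std::vector<std::uint8_t> shadow(size_with_redzone / kGranularity, 0x00);
  // Granules that lie entirely in the redzone get the magic value.
  std::size_t aligned_size =
      (size + kGranularity - 1) / kGranularity * kGranularity;
  for (std::size_t i = aligned_size / kGranularity; i < shadow.size(); ++i)
    shadow[i] = kGlobalRedzone;
  // A granule shared by the object and the redzone records how many of its
  // leading bytes are addressable.
  if (size != aligned_size)
    shadow[size / kGranularity] =
        static_cast<std::uint8_t>(size % kGranularity);
  return shadow;
}
```

    For a 10-byte object padded to 32 bytes this yields 00 02 f9 f9, the same pattern that appears in the shadow dump of the example later in this article.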

    global-buffer-overflow example

    If an access occurs within a redzone byte poisoned by 0xf9 or within a partial redzone preceding 0xf9, the runtime will report a global-buffer-overflow error. Here is an example:

    cat > a.c <<e
    #include <string.h>
    int main(int argc, char **argv) {
      static char a[10];
      memset(a, 0, 10);
      return a[argc * 5];
    }
    e
    clang -fsanitize=address a.c -o a

    % ./a 1  # a[argc * 5] == a[10] is out-of-bounds
    =================================================================
    ==240472==ERROR: AddressSanitizer: global-buffer-overflow on address 0x5592092356aa at pc 0x5592088dc38f bp 0x7ffd457ab520 sp 0x7ffd457ab518
    READ of size 1 at 0x5592092356aa thread T0
    #0 0x5592088dc38e (/tmp/c/a+0x14238e)
    #1 0x7fd59d38f6c9 (/lib/x86_64-linux-gnu/libc.so.6+0x276c9) (BuildId: 2ac5fa07c22f99cfd5dc47c70cd5f0e78b974269)
    #2 0x7fd59d38f784 (/lib/x86_64-linux-gnu/libc.so.6+0x27784) (BuildId: 2ac5fa07c22f99cfd5dc47c70cd5f0e78b974269)
    #3 0x559208800f80 (/tmp/c/a+0x66f80)

    0x5592092356aa is located 0 bytes after global variable 'main.a' defined in 'a.c' (0x5592092356a0) of size 10
    SUMMARY: AddressSanitizer: global-buffer-overflow (/tmp/c/a+0x14238e)
    Shadow bytes around the buggy address:
    0x559209235400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x559209235480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x559209235500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x559209235580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x559209235600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    =>0x559209235680: 00 00 00 00 00[02]f9 f9 00 00 00 00 00 00 00 00
    0x559209235700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x559209235780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x559209235800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x559209235880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x559209235900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    Shadow byte legend (one shadow byte represents 8 application bytes):
    ...
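
    The decision behind this report can be reproduced from the shadow bytes in the dump (00 02 f9 f9 for the 32 bytes starting at main.a). The following is a sketch of the check, not the code the compiler emits: IsPoisonedAccess is a hypothetical helper that takes byte offsets instead of addresses and assumes the access does not cross a granule boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if an access of `access_size` bytes at byte offset `offset`
// touches memory that `shadow` marks as not addressable. One shadow byte
// describes 8 application bytes.
inline bool IsPoisonedAccess(const std::vector<std::uint8_t> &shadow,
                             std::size_t offset, std::size_t access_size) {
  assert(offset / 8 < shadow.size());
  std::uint8_t k = shadow[offset / 8];
  if (k == 0) return false;    // the whole granule is addressable
  if (k >= 0x80) return true;  // magic values such as 0xf9: fully poisoned
  // 1..7: only the first k bytes of the granule are addressable.
  return (offset % 8) + access_size > k;
}
```

    With that shadow, a one-byte read at offset 9 passes, while the read at offset 10 (a[10]) lands in the granule marked 02 beyond its two addressable bytes and is flagged.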

    ODR violation checker

    The global variable poisoning mechanism offers a straightforward means to detect differences in variable definitions between two components, such as between the main executable and a shared object, or between two shared objects. This can be considered a category of ODR violations.

    echo 'int var; int main() { return var; }' > a.cc
    echo 'long var;' > b.cc
    clang++ -fpic -fsanitize=address -shared b.cc -o b.so
    clang++ -fsanitize=address a.cc ./b.so -o a

    % ./a
    =================================================================
    ==2770366==ERROR: AddressSanitizer: odr-violation (0x562fa202af00):
    [1] size=4 'var' a.cc
    [2] size=8 'var' b.cc
    These globals were registered at these points:
    [1]:
    #0 0x562fa158b2e6 (/tmp/c/a+0x7d2e6)
    #1 0x562fa158c429 (/tmp/c/a+0x7e429)
    #2 0x7f3cc02217f5 (/lib/x86_64-linux-gnu/libc.so.6+0x277f5) (BuildId: f4017039b18cb668db130b83647b6a0dbefd4414)

    [2]:
    #0 0x562fa158b2e6 (/tmp/c/a+0x7d2e6)
    #1 0x562fa158c429 (/tmp/c/a+0x7e429)
    #2 0x7f3cc0794e2d (/lib64/ld-linux-x86-64.so.2+0x4e2d) (BuildId: f5756553d3c09f2e23001148fcb4df1ebd89afe6)

    ==2770366==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0
    SUMMARY: AddressSanitizer: odr-violation: global 'var' at a.cc
    ==2770366==ABORTING

    The default mode, detect_odr_violation=2, also prohibits symbol interposition on variables. If you change long to int in b.cc, you will still encounter an odr-violation error. In contrast, with detect_odr_violation=1, errors are suppressed if the registered variables are of the same size.

    % ASAN_OPTIONS=detect_odr_violation=1 ./a
    % ASAN_OPTIONS=detect_odr_violation=2 ./a
    =================================================================
    ==2574052==ERROR: AddressSanitizer: odr-violation (0x562d39db1200):
    ...
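
    The three settings of detect_odr_violation described above can be summarized as a small predicate. This is a hypothetical model of the policy only (ShouldReportOdr is not a runtime function), and it assumes the two definitions have already been matched to the same variable.

```cpp
#include <cassert>
#include <cstddef>

// Decides whether a second registration of the same variable is reported.
//   level 0: ODR checking is disabled
//   level 1: report only when the two definitions differ in size
//   level 2: report every duplicate definition, even with equal sizes
inline bool ShouldReportOdr(int level, std::size_t first_size,
                            std::size_t second_size) {
  assert(level >= 0 && level <= 2);
  if (level == 0) return false;
  if (level == 1) return first_size != second_size;
  return true;
}
```

    With int var in both files (sizes 4 and 4), level 1 stays silent and level 2 reports, which matches the two runs shown above.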

    For a variable named $var, a one-byte variable, __odr_asan_gen_$var, is created with the original linkage (which in practice must be external).

    If $var is defined in two instrumented modules, their __odr_asan_gen_$var symbols refer to the same copy due to symbol interposition. When registering $var, the runtime checks whether __odr_asan_gen_$var is already 1. If so, the program has an ODR violation; otherwise __odr_asan_gen_$var is set to 1.

    @__odr_asan_gen_g0 = global i8 0, align 1
    @__odr_asan_gen_g1 = global i8 0, align 1

    @0 = private alias { i32, [28 x i8] }, ptr @g0
    @1 = private alias { i64, [24 x i8] }, ptr @g1
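
    The indicator protocol is easy to model: two modules whose indicator symbols resolve to the same byte collide on the second registration. The sketch below is illustrative only; ModuleGlobal and RegisterWithIndicator are hypothetical names, and the real runtime does additional work to find and report the earlier definition.

```cpp
#include <cassert>
#include <string>

// Hypothetical view of one module's definition of a variable.
struct ModuleGlobal {
  std::string module;        // e.g. "a.cc"
  unsigned char *indicator;  // what __odr_asan_gen_$var resolves to
};

// Registers `g` and returns true if its indicator was already set, which
// means another module has registered a definition of the same variable.
inline bool RegisterWithIndicator(const ModuleGlobal &g) {
  assert(g.indicator != nullptr);
  if (*g.indicator == 1) return true;  // ODR violation
  *g.indicator = 1;                    // first registration
  return false;
}
```

    When symbol interposition makes both modules point at one indicator byte, the first call returns false and the second returns true. A variable with its own indicator is unaffected.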

    The private aliases @0 and @1 were introduced by http://reviews.llvm.org/D15642.

    ODR indicator

    The previous example uses -fsanitize-address-use-odr-indicator.

    Prior to Clang 16, -fno-sanitize-address-use-odr-indicator was the default for non-Windows platforms. In this mode, the runtime checks whether a variable has been registered by verifying whether its redzone has been poisoned, and reports an ODR violation when the redzone has been poisoned.

    @___asan_gen_.1 = private unnamed_addr constant [3 x i8] c"g0\00", align 1
    @___asan_gen_.2 = private unnamed_addr constant [3 x i8] c"g1\00", align 1
    @__asan_global_g0 = private global { i64, i64, i64, i64, i64, i64, i64, i64 } { i64 ptrtoint (ptr @g0 to i64), i64 4, i64 32, i64 ptrtoint (ptr @___asan_gen_.1 to i64), i64 ptrtoint (ptr @___asan_gen_ to i64), i64 0, i64 0, i64 0 }, section "asan_globals", !associated !0
    @__asan_global_g1 = private global { i64, i64, i64, i64, i64, i64, i64, i64 } { i64 ptrtoint (ptr @g1 to i64), i64 8, i64 32, i64 ptrtoint (ptr @___asan_gen_.2 to i64), i64 ptrtoint (ptr @___asan_gen_ to i64), i64 0, i64 0, i64 0 }, section "asan_globals", !associated !1
    @llvm.compiler.used = appending global [4 x ptr] [ptr @g0, ptr @g1, ptr @__asan_global_g0, ptr @__asan_global_g1], section "llvm.metadata"

    This mode eliminates the need for an additional variable like __odr_asan_gen_$var, but it can lead to interaction issues when mixing instrumented and uninstrumented components. In the case of a shared object, if the reference to $var in __asan_global_$var is interposed with an uninstrumented variable due to symbol interposition, it may result in a spurious error stating, "The following global variable is not properly aligned."

    In Clang 16, I made -fsanitize-address-use-odr-indicator the default for non-Windows targets (see https://reviews.llvm.org/D137227).

    (Additionally, https://reviews.llvm.org/D127911 changed the ODR indicator symbol name to __odr_asan_gen_$demangled.)

    Copy relocations

    Private aliases have an interesting interaction with copy relocations. This issue is reported at https://gcc.gnu.org/PR68016.

    The default -fsanitize-address-use-odr-indicator in Clang 16 and later cannot detect the global-buffer-overflow error below:

    echo 'int f[5] = {1};' > foo.cc
    echo 'extern int f[5]; int main() { return f[5]; }' > a.cc
    clang++ -fpic -fsanitize=address -mllvm -asan-use-private-alias=1 -shared foo.cc -o foo1.so
    clang++ -fno-pic -fsanitize=address -mllvm -asan-use-private-alias=1 -no-pie a.cc ./foo1.so -o a1
    ./a1 # no error

    clang++ -fpic -fsanitize=address -mllvm -asan-use-private-alias=0 -shared foo.cc -o foo0.so
    clang++ -fno-pic -fsanitize=address -mllvm -asan-use-private-alias=0 -no-pie a.cc ./foo0.so -o a0
    ./a0 # error

    The definition of f in foo.cc is instrumented, resulting in the creation of __asan_global_f. However, the executable actually accesses the copy created by the linker due to copy relocation.

    When -asan-use-private-alias=1 is in effect (the default since Clang 16), the __asan_global_f variable references the unused copy inside the shared object. The executable accesses the copy-relocated variable, whose redzone is not poisoned, resulting in no error.

    Conversely, when -asan-use-private-alias=0 is in effect, the __asan_global_f variable references the copy-relocated variable and poisons the redzone within the executable. Consequently, accessing f[5] leads to the expected error.
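
    The two outcomes can be restated as a tiny model of which copy of f gets registered. Everything here is hypothetical (the struct, the function names, and the addresses used with it); the sketch only encodes the reasoning of the paragraphs above and does not perform any real relocation.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// The two copies of `f` that exist after a copy relocation.
struct CopyRelocScenario {
  std::uintptr_t dso_copy;  // definition inside the shared object (unused)
  std::uintptr_t exe_copy;  // copy in the executable, used at run time
};

// Address recorded in __asan_global_f. A private alias always binds to the
// shared object's own definition, while the public symbol resolves to the
// executable's copy.
inline std::uintptr_t RegisteredAddress(const CopyRelocScenario &s,
                                        bool use_private_alias) {
  return use_private_alias ? s.dso_copy : s.exe_copy;
}

// True if the redzone of the copy that the executable accesses is poisoned,
// so that reading f[5] would be reported.
inline bool OverflowDetected(const CopyRelocScenario &s,
                             bool use_private_alias) {
  assert(s.dso_copy != s.exe_copy);
  std::set<std::uintptr_t> poisoned_globals;
  poisoned_globals.insert(RegisteredAddress(s, use_private_alias));
  return poisoned_globals.count(s.exe_copy) != 0;
}
```

    OverflowDetected returns false with the private alias and true without it, mirroring the "no error" and "error" results of ./a1 and ./a0.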

    Garbage collection

    Since Clang 17, asan.module_ctor is, by default, placed in a COMDAT group. When multiple instrumented relocatable object files are linked together, only one asan.module_ctor is retained.

    __asan_global_g0 is positioned in a section that links to the section defining g0 using the SHF_LINK_ORDER flag. During linking, if the linker discards the section defining g0, the asan_globals section containing __asan_global_g0 will also be discarded. For more detail on SHF_LINK_ORDER, you can refer to "Metadata sections, COMDAT and SHF_LINK_ORDER".

    Before Clang 17, the default behavior was to use -fno-sanitize-address-globals-dead-stripping. In this mode, the instrumentation places pointers to instrumented global variables in a metadata array and calls __asan_register_globals. __asan_register_globals then iterates over the array and registers each global variable.

    @g0 = dso_local global { i32, [28 x i8] } zeroinitializer, align 32
    @g1 = dso_local constant { i64, [24 x i8] } { i64 42, [24 x i8] zeroinitializer }, align 32

    @___asan_gen_.1 = private unnamed_addr constant [3 x i8] c"g0\00", align 1
    @___asan_gen_.2 = private unnamed_addr constant [3 x i8] c"g1\00", align 1

    @llvm.compiler.used = appending global [2 x ptr] [ptr @g0, ptr @g1], section "llvm.metadata"
    @0 = internal global [2 x { i64, i64, i64, i64, i64, i64, i64, i64 }] [{ i64, i64, i64, i64, i64, i64, i64, i64 } { i64 ptrtoint (ptr @1 to i64), i64 4, i64 32, i64 ptrtoint (ptr @___asan_gen_.1 to i64), i64 ptrtoint (ptr @___asan_gen_ to i64), i64 0, i64 0, i64 ptrtoint (ptr @__odr_asan_gen_g0 to i64) }, { i64, i64, i64, i64, i64, i64, i64, i64 } { i64 ptrtoint (ptr @2 to i64), i64 8, i64 32, i64 ptrtoint (ptr @___asan_gen_.2 to i64), i64 ptrtoint (ptr @___asan_gen_ to i64), i64 0, i64 0, i64 ptrtoint (ptr @__odr_asan_gen_g1 to i64) }]

    @1 = private alias { i32, [28 x i8] }, ptr @g0
    @2 = private alias { i64, [24 x i8] }, ptr @g1

    define internal void @asan.module_ctor() #0 {
      call void @__asan_init()
      call void @__asan_version_mismatch_check_v8()
      call void @__asan_register_globals(i64 ptrtoint (ptr @0 to i64), i64 2)
      ret void
    }

    asan.module_ctor references the metadata array @0, which, in turn, references @1 and @2. @1 and @2 reference the global variables g0 and g1, respectively. This unfortunately indicates that g0 and g1 cannot be discarded by section-based garbage collection.
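
    The retention argument is a reachability argument, and it can be checked with a generic mark phase over the reference edges just described. This is a sketch of the idea rather than of any linker's implementation: Graph, Reachable and LegacyScheme are hypothetical names, and nodes are plain strings instead of sections.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each node to the nodes it references.
using Graph = std::map<std::string, std::vector<std::string>>;

// Returns every node reachable from `root`, i.e. everything that a
// section-based garbage collector has to keep.
inline std::set<std::string> Reachable(const Graph &refs,
                                       const std::string &root) {
  std::set<std::string> live;
  std::vector<std::string> worklist{root};
  while (!worklist.empty()) {
    std::string node = worklist.back();
    worklist.pop_back();
    if (!live.insert(node).second) continue;  // already visited
    auto it = refs.find(node);
    if (it == refs.end()) continue;           // node has no outgoing references
    for (const std::string &target : it->second) worklist.push_back(target);
  }
  return live;
}

// Reference edges of the scheme used before Clang 17, as described above.
inline Graph LegacyScheme() {
  return {{"asan.module_ctor", {"@0"}},
          {"@0", {"@1", "@2"}},
          {"@1", {"g0"}},
          {"@2", {"g1"}}};
}
```

    Starting from asan.module_ctor, which is itself retained as a constructor, the traversal reaches both g0 and g1, so neither can be dropped even when nothing else refers to them.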

    It's important to note that this version of asan.module_ctor is not placed within a COMDAT group. In another compilation unit, a separate asan.module_ctor references a different metadata array. As a result, these asan.module_ctor functions cannot share the same implementation.

    In a linked component, both __asan_init and __asan_version_mismatch_check_v8 will be called multiple times, incurring a small overhead.

    Regrettably, the default setting of -fsanitize-address-globals-dead-stripping in Clang 17 had a bug. Specifically, when there are no global variables, and the unique module ID is non-empty, a COMDAT asan.module_ctor is created without any __asan_register_elf_globals calls. If this COMDAT is selected as the prevailing copy by the linker, the linkage unit will lack a __asan_register_elf_globals call, resulting in an unpoisoned redzone and a non-functional ODR violation checker.

    I have fixed this in the main branch (#67745) but LLVM 17.0.2 does not contain the fix.

    Global variable metadata

    Before Clang 15, Clang's instrumentation included llvm.asan.globals, and the AddressSanitizer runtime required its object file feature for symbolization.

    https://reviews.llvm.org/D127552 enabled debug information for symbolization and https://reviews.llvm.org/D127911 deleted the metadata node llvm.asan.globals.


