
    All about sanitizer interceptors

    MaskRay, posted on 2023-02-13 00:11:23

    Many sanitizers want to know every function in the program. User functions are instrumented and therefore known by the sanitizer runtime. For library functions, some (e.g. mmap, munmap, memory allocation/deallocation functions, longjmp, vfork) need special treatment. Sanitizers leverage symbol interposition to redirect such function calls to their own implementations: interceptors. Other library functions can be treated as normal user code. Either instrumenting the function or providing an interceptor is fine.

    In some cases instrumenting is infeasible:

    • Assembly source files usually do not call sanitizer callbacks (or it is inconvenient for them to do so)
    • Many libc implementations cannot be instrumented. See "When can glibc be built with Clang?"
    • Some functions have performance issues if instrumented instead of intercepted (mostly mem* and str*)

    In these cases interceptors may be the practical choice.

    This article talks about how interceptors work and the requirements of sanitizer interceptors.

    How interceptors work

    Here is the short summary and I will elaborate.

    • ELF platforms: __interceptor_$name has a weak alias $name, which has the same name as the intercepted function.
    • Apple platforms: __DATA,__interpose holds the address of the interceptor (wrap_$name) and the intercepted function.
    • Windows: hot patching the intercepted function

    ELF platforms

    In Clang, sanitizer runtime files are named $resource_dir/lib/$triple/libclang_rt.*.{a,so} on ELF platforms. In the older LLVM_ENABLE_PER_TARGET_RUNTIME_DIR=off configuration (default before LLVM 15.0.0), the files are named $resource_dir/lib/libclang_rt.*-$arch.{a,so}. The .a files are called static runtime while the .so files are called shared runtime or dynamic runtime. As of 2023-01, Android and Fuchsia default to shared runtime while other ELF platforms (Linux, *BSD, etc) default to static runtime.

    In GCC, sanitizer runtime files are named lib*san.{a,so}. Shared runtime is the default. Specify -static-libasan to use static runtime.

    Static runtime

    Most static runtime files are only used when linking executables. When linking an executable, Clang Driver passes --whole-archive $resource_dir/lib/libclang_rt.$name.a --no-whole-archive to the linker. For the following example, if libclang_rt.$name.a defines malloc and free, the executable will get the definitions.

    printf '#include <stdlib.h>\nint main() { void *p = malloc(42); free(p); }' > a.c
    clang -fsanitize=address a.c -o a

    malloc, free, and in fact all libc interceptors are exported to .dynsym because they are defined/referenced by a link-time shared object (glibc libc.so.6), even if -Wl,--export-dynamic is not specified. See "Explain GNU style linker options", section --export-dynamic, for details. -Wl,--gc-sections cannot discard these interceptors as .dynsym symbols are considered GC roots.

    Also, note that the definitions are weak and unversioned.

    % nm -D a | grep -w 'malloc\|free'
    00000000000ead10 W free
    00000000000eb010 W malloc

    When linking a shared object, the static runtime is not used. On Linux glibc, the shared object has a versioned reference.

    printf '#include <stdlib.h>\nvoid *foo() { return malloc(42); }' > b.c
    clang -fsanitize=address -fpic -shared b.c -o b.so

    % nm -D b.so | grep 'malloc\|free'
    U malloc@GLIBC_2.2.5

    If we make b.so a link-time dependency of the executable a or dlopen b.so at run-time, the malloc@GLIBC_2.2.5 reference from b.so will be bound to the definition in a. The dynamic loader computes a breadth-first symbol search list (executable, needed0, needed1, needed2, needed0_of_needed0, needed1_of_needed0, ...). For each symbol reference, the dynamic loader iterates over the list and finds the first component which provides a definition. The executable provides a definition and the search stops at the executable. See the first few paragraphs of "ELF interposition and -Bsymbolic".
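
    The lookup order described above can be sketched with a toy model. The Object struct, its fields, and the fixed array sizes are hypothetical and only serve the illustration; a real dynamic loader also deals with symbol versions, visibility, and binding.

```c
#include <stddef.h>
#include <string.h>

#define MAX_OBJS 8 /* toy limit; order[] must have room for this many */

/* Hypothetical model of a loaded component: the symbols it defines and the
   indices of its DT_NEEDED dependencies. */
struct Object {
  const char *name;
  const char *defs[4]; /* NULL-terminated list of defined symbols */
  int needed[4];       /* -1-terminated list of dependency indices */
};

/* Build the breadth-first search list starting from the executable
   (index 0). Returns the number of entries written to order[]. */
static size_t build_search_list(const struct Object *objs, size_t n,
                                int *order) {
  int seen[MAX_OBJS] = {0};
  size_t head = 0, tail = 0;
  order[tail++] = 0;
  seen[0] = 1;
  while (head < tail) {
    const struct Object *o = &objs[order[head++]];
    for (size_t i = 0; i < 4 && o->needed[i] >= 0; i++) {
      int dep = o->needed[i];
      if ((size_t)dep < n && !seen[dep]) {
        seen[dep] = 1;
        order[tail++] = dep;
      }
    }
  }
  return tail;
}

/* Return the index of the first component in the search list that defines
   sym, or -1 if no component does. */
static int lookup(const struct Object *objs, const int *order, size_t len,
                  const char *sym) {
  for (size_t i = 0; i < len; i++) {
    const struct Object *o = &objs[order[i]];
    for (size_t j = 0; j < 4 && o->defs[j]; j++)
      if (strcmp(o->defs[j], sym) == 0)
        return order[i];
  }
  return -1;
}
```

    With the executable at index 0 defining malloc, a lookup of malloc stops at index 0 even when a later component (libc.so.6 in the text) defines it too.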

    This also relies on the rule that a malloc@GLIBC_2.2.5 reference can be bound to a malloc definition of VER_NDX_GLOBAL. See "All about symbol versioning", section rtld-behavior.

    Since the executable defines malloc and free, its calls to the two symbols do not use PLT. This is an advantage over shared runtime. Calls from the shared object still need PLT. Preloading malloc functions does not work since the executable's definitions take priority.

    An interceptor is available as two symbols: one named __interceptor_$name with STB_GLOBAL binding and the other named $name with STB_WEAK binding. User code can define a STB_GLOBAL $name to override the interceptor and call __interceptor_$name to get the sanitizer definition.

    % readelf -Ws $(clang --print-file-name=libclang_rt.asan.a) | grep -E ' (malloc|__interceptor_malloc)$'
    2196: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __interceptor_malloc
    57: 0000000000000000 295 FUNC GLOBAL DEFAULT 10 __interceptor_malloc
    121: 0000000000000000 295 FUNC WEAK DEFAULT 10 malloc
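
    The symbol pair can be reproduced in miniature with the GCC/Clang alias attribute on an ELF target. This is only a sketch of the pattern: demo_len is a hypothetical function name chosen so that nothing in libc is actually interposed, and the sanitizer runtime generates its pairs through its own macros.

```c
#include <stddef.h>

/* Strong (STB_GLOBAL) definition: the name that stays reachable even if
   someone overrides the weak symbol below. */
size_t __interceptor_demo_len(const char *s) {
  size_t n = 0;
  while (s[n])
    n++;
  return n;
}

/* Weak (STB_WEAK) alias carrying the plain name. A STB_GLOBAL definition of
   demo_len elsewhere in the link would take precedence over this one. */
size_t demo_len(const char *s)
    __attribute__((weak, alias("__interceptor_demo_len")));
```

    Compiling this to an object file and running readelf -Ws on it should show one GLOBAL and one WEAK FUNC symbol with the same value.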

    Shared runtime

    On targets that default to static runtime, use -shared-libsan to select this configuration. Both executables and shared objects will link against libclang_rt.$name.so. Here is an example on Linux glibc.

    % clang -fsanitize=address -shared-libsan a.c -o a
    % readelf -Wd a | grep 'clang_rt\|libc'
    0x0000000000000001 (NEEDED) Shared library: [libclang_rt.asan.so]
    0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
    % readelf -d b.so | grep 'clang_rt\|libc'
    0x0000000000000001 (NEEDED) Shared library: [libclang_rt.asan.so]
    0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
    % readelf -W --dyn-syms $(clang --print-file-name=libclang_rt.asan.so) | grep -w malloc
    1295: 000000000011a2d0 295 FUNC WEAK DEFAULT 11 malloc

    In the symbol search list, libclang_rt.asan.so appears before libc.so.6, so malloc references from a and b.so will be bound to the definition malloc in libclang_rt.asan.so. The rule that a versioned reference can be bound to a definition of VER_NDX_GLOBAL kicks in again.

    Preloading a shared object is dangerous and the asan runtime warns about it.

    % clang -fsanitize=address -shared-libsan -Wl,-rpath=$(dirname $(clang --print-file-name=libclang_rt.asan.so)) a.c -o a
    % LD_PRELOAD=/lib/x86_64-linux-gnu/libjemalloc.so.2 ./a
    ==1650190==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.

    dlsym RTLD_NEXT

    An interceptor needs to call the intercepted library function. During initialization, the runtime calls dlsym(RTLD_NEXT, $name) for all intercepted functions and saves the addresses. An interceptor __interceptor_$name calls the saved return value of dlsym(RTLD_NEXT, $name).

    In the previous examples, whether the definition is in the executable or libclang_rt.asan.so, the component is prior to libc.so.6 in the symbol search list. Therefore dlsym(RTLD_NEXT, $name) will retrieve the symbol address in libc.so.6.
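
    A minimal sketch of this pattern follows. The wrapper name counting_strlen and the call counter are hypothetical; a real interceptor carries the intercepted function's own name so that interposition routes calls to it, while the distinct name here keeps the sketch from interposing anything. The fallback definition of RTLD_NEXT is an assumption for the case where feature-test macros were fixed by an earlier include. On glibc older than 2.34, linking may additionally need -ldl.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#endif
#include <dlfcn.h>
#include <stddef.h>

#ifndef RTLD_NEXT
#define RTLD_NEXT ((void *)-1L) /* assumption: the value glibc and musl use */
#endif

typedef size_t (*strlen_fn)(const char *);

/* Saved address of the next strlen definition in the symbol search list. */
static strlen_fn real_strlen;
/* Hypothetical bookkeeping standing in for the sanitizer's checks. */
static unsigned long strlen_calls;

size_t counting_strlen(const char *s) {
  if (!real_strlen) /* the runtime does this lookup once, at initialization */
    real_strlen = (strlen_fn)dlsym(RTLD_NEXT, "strlen");
  strlen_calls++;
  return real_strlen(s); /* forward to the intercepted function */
}
```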

    Before glibc 2.36, dlsym(RTLD_NEXT, name) returned the address of the oldest version definition of name in libc.so.6, so interceptors ended up with the old semantics. This is usually benign but not ideal (see https://github.com/google/sanitizers/issues/1371 for a regexec issue). I fixed this in glibc 2.36 (BZ#14932) so that dlsym(RTLD_NEXT, name) now returns the default version definition.

    This fix led to an interesting issue. glibc made __pthread_mutex_lock a non-default version definition in 2.34, so dlsym(RTLD_NEXT, "__pthread_mutex_lock") would return NULL. I fixed it by disabling the interceptor if glibc>=2.34 at build time.

    Apple platforms

    On Apple platforms, sanitizers use a dyld feature to intercept symbols. Only dynamic runtime is supported.

    The runtime defines a special section __DATA,__interpose which contains a list of pairs holding the addresses of the interceptor (named wrap_$name) and the intercepted function. dyld will redirect calls to the interceptor, as long as the call is not coming from the library defining the interceptor.

    Let's say we have an interceptor wrap_strcmp. It calls the real definition strcmp. dyld will bind the reference to the real definition in libsystem_kernel.dylib, instead of wrap_strcmp.

    // compiler-rt/lib/interception/interception.h
    // For a function foo() and a wrapper function bar() create a global pair
    // of pointers { bar, foo } in the __DATA,__interpose section.
    // As a result all the calls to foo() will be routed to bar() at runtime.
    #define INTERPOSER_2(func_name, wrapper_name) __attribute__((used)) \
    const interpose_substitution substitution_##func_name[] \
        __attribute__((section("__DATA, __interpose"))) = { \
        { reinterpret_cast<const uptr>(wrapper_name), \
          reinterpret_cast<const uptr>(func_name) } \
    }

    Let's use a reduced example to get a feel for it.

    cat > a.c <<eof
    #include <string.h>

    int initialized;
    int main(int argc, char *argv[]) {
      initialized = 1;
      return strcmp(argv[1], argv[2]);
    }
    eof
    cat > b.c <<eof
    #include <stdio.h>
    #include <string.h>

    extern int initialized;
    int wrap_strcmp(const char *a, const char *b) {
      if (initialized)
        printf("strcmp: %s %s\n", a, b);
      return strcmp(a, b);
    }

    __attribute__((used, section("__DATA, __interpose")))
    static void *const interpose[] = {(void *)wrap_strcmp, (void *)strcmp};
    eof
    clang -dynamiclib -Wl,-U,_initialized b.c -o b.dylib # or use -Wl,-undefined,dynamic_lookup
    clang a.c b.dylib -o a
    ./a a a

    Windows

    See "AddressSanitizer for Windows", 2014 LLVM Developers' Meeting. The current implementation is at compiler-rt/lib/interception/interception_win.cpp. Interceptors are implemented by hot patching the entry block of the intercepted function.

    There are multiple ways to link the runtime.

    For /MT,

    • Some functions (e.g. malloc) are defined by the lib. The runtime just defines the interceptor with the same name.
    • Many functions are dllimported. The runtime calls __interception::OverrideFunction to try multiple hot patching techniques.

    For /MD, the runtime maintains a list of interesting DLLs, checks which DLL defines the intercepted function, and calls __interception::OverrideFunction for hot patching.

    After hot patching, the first instruction of the intercepted function will jump to the interceptor either directly or through a trampoline. The interceptor will jump back.
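
    To make the "jump directly" case concrete, the sketch below only computes the five bytes of an x86 jmp rel32 instruction that would be written over a function entry. The helper name encode_jmp_rel32 is hypothetical and not taken from interception_win.cpp. Actual patching additionally has to make the page writable and preserve the overwritten instructions, which this sketch leaves out.

```c
#include <stdint.h>

/* Encode `jmp rel32` (opcode 0xE9) so that, when placed at address `from`,
   it transfers control to `to`. Returns 0 on success, or -1 when the
   displacement does not fit in 32 bits, in which case a direct jump is not
   possible and a nearby trampoline would be needed. */
static int encode_jmp_rel32(uint64_t from, uint64_t to, uint8_t out[5]) {
  /* The displacement is relative to the end of the 5-byte instruction. */
  int64_t disp = (int64_t)(to - (from + 5));
  if (disp < INT32_MIN || disp > INT32_MAX)
    return -1;
  uint32_t rel = (uint32_t)(int32_t)disp;
  out[0] = 0xE9;
  out[1] = (uint8_t)(rel & 0xFF); /* little-endian immediate */
  out[2] = (uint8_t)((rel >> 8) & 0xFF);
  out[3] = (uint8_t)((rel >> 16) & 0xFF);
  out[4] = (uint8_t)((rel >> 24) & 0xFF);
  return 0;
}
```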

    Interceptor requirements

    AddressSanitizer

    AddressSanitizer detects addressability bugs. For a mapped memory region, AddressSanitizer uses shadow memory to track whether user bytes are unaddressable (poisoned); accesses to poisoned bytes are considered bugs (heap-buffer-overflow, heap-use-after-free, stack-buffer-overflow, stack-use-after-{return,scope}, etc). Each aligned 8-byte granule of user memory is mapped to one shadow memory byte. (This can be patched to support other granules, e.g. the now-deleted Myriad RTEMS port used 32 as the granule.)

    A shadow memory byte is 0 (unpoisoned, all 8 bytes are addressable) or a non-zero integer (poisoned, not all 8 bytes are addressable). The non-zero integer may be smaller than 8 (the first X bytes are addressable) or a predefined special value (to indicate a bug category).
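
    The encoding above translates into a small predicate. This is a sketch under the stated 8-byte granule, with a hypothetical function name; it assumes the access does not cross a granule boundary and treats every value of 8 or more as a special "fully poisoned" marker.

```c
#include <stdbool.h>
#include <stdint.h>

enum { kGranule = 8 };

/* Decide whether an access of `size` bytes (1..8, within one granule) at
   `addr` is addressable, given the shadow byte of that granule. */
static bool access_ok(uint8_t shadow, uintptr_t addr, unsigned size) {
  if (shadow == 0)
    return true; /* all 8 bytes are addressable */
  if (shadow >= kGranule)
    return false; /* special value: nothing in the granule is addressable */
  /* Only the first `shadow` bytes are addressable, so the offset of the
     last accessed byte must be smaller than `shadow`. */
  unsigned last = (unsigned)(addr % kGranule) + size - 1;
  return last < shadow;
}
```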

    At the start of an interceptor, AsanInitFromRtl is called if the runtime hasn't been initialized yet.

    mmap and munmap do not need special treatment.

    When a chunk of memory is reserved from a mapped region for heap allocation, the associated shadow memory is poisoned with 0xfa (kAsanHeapLeftRedzoneMagic). For a malloc-family function, its interceptor records the allocation information (thread ID, requested size, stack trace, allocation type (malloc, new, new[]), etc) and unpoisons the shadow (sets it to zeros), which may be 0xfa if previously unallocated.

    For a free-family function, its interceptor detects double free and alloc-dealloc type mismatch bugs, records the deallocation information (thread ID, stack trace), and poisons the shadow with 0xfd (kAsanHeapFreeMagic). Instrumented/intercepted accesses to the deallocated memory will cause an error.
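
    The shadow transitions for the heap can be played through on a toy shadow array. The function names and the 16-granule heap are hypothetical, and detecting a double free by looking at the shadow byte is a simplification made for this sketch, not a statement about how the runtime tracks chunk state.

```c
#include <stdint.h>
#include <string.h>

enum {
  kHeapLeftRedzoneMagic = 0xfa, /* reserved for the heap, not allocated */
  kHeapFreeMagic = 0xfd,        /* deallocated */
  kToyGranules = 16             /* toy heap of 16 granules (8 bytes each) */
};

static uint8_t heap_shadow[kToyGranules];

/* A region is reserved for heap allocation: poison everything. */
static void reserve_region(void) {
  memset(heap_shadow, kHeapLeftRedzoneMagic, sizeof heap_shadow);
}

/* malloc-like event covering n granules starting at granule g: unpoison. */
static void on_malloc(unsigned g, unsigned n) {
  memset(heap_shadow + g, 0, n);
}

/* free-like event: returns -1 for a double free, otherwise poisons the
   granules with the "freed" magic and returns 0. */
static int on_free(unsigned g, unsigned n) {
  if (heap_shadow[g] == kHeapFreeMagic)
    return -1;
  memset(heap_shadow + g, kHeapFreeMagic, n);
  return 0;
}
```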

    For a library function which performs memory reads or writes, its interceptor emulates an instrumented memory read/write: check the shadow memory and report an error in case of a poisoned byte.

    Some library functions allocate memory internally. An implementation typically takes care to call the interposable symbol malloc rather than a private alias, so the allocation/deallocation is known to AddressSanitizer and the shadow is unpoisoned/poisoned properly.

    A non-special function which is neither instrumented nor intercepted just leads to fewer detected errors.

    Stack use after scope

    An instrumented function poisons stack variables to catch stack use after scope bugs. Instrumentation unpoisons stack variables before an epilogue. If a process creates a subprocess with shared memory, and the subprocess exits due to a noreturn function (including throw expressions), the shadow that was never unpoisoned (because the epilogue did not run) may cause a false positive in the first process. This is fixed by calling __asan_handle_no_return before calling noreturn functions to conservatively unpoison the whole stack.

    vfork has similar issues and needs an interceptor.

    longjmp-family functions have similar issues. The stack memory may be reused, causing a false positive. These functions are intercepted in the runtime to call __asan_handle_no_return.

    https://github.com/android/ndk/issues/988

    HWAddressSanitizer

    HWAddressSanitizer detects addressability bugs (the same class of errors as the main feature of AddressSanitizer) using a different algorithm (software memory tagging). Each aligned 16-byte granule of user memory is associated with a non-zero tag. The tag is implemented as one byte and stored in the shadow memory.

    A memory allocation chooses a random non-zero tag and sets it in the high bits of the returned pointer. The shadow memory of the allocated chunk is filled with the tag. To support accessing the pointer with non-zero high bits, hardware features (ARM Top Byte Ignore, Intel Linear Address Masking, RISC-V Pointer Masking) or page aliases are needed.

    The interceptor behavior is similar to AddressSanitizer.

    mmap and munmap do not need special treatment.

    For an instrumented memory read or write operation, the instrumentation checks the pointer tag against the tag stored in the shadow memory and reports an error in case of a mismatch.

    To detect use-after-free, a memory deallocation needs to clear the associated shadow memory.

    For an interceptor, do something similar to AddressSanitizer: check the pointer tag against the shadow memory and report an error in case of a mismatch.
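
    The tag check can be sketched on plain 64-bit integers. The sketch assumes the tag occupies bits 56-63 (as with ARM Top Byte Ignore) and uses a hypothetical toy region of 8 granules; the helper names are made up for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

enum { kTagShift = 56, kTagGranule = 16, kTagGranules = 8 };

/* One tag byte per 16-byte granule of the toy region. */
static uint8_t tag_shadow[kTagGranules];

/* Place `tag` in the top byte of a 64-bit address. */
static uint64_t tag_pointer(uint64_t addr, uint8_t tag) {
  return (addr & ~(0xffULL << kTagShift)) | ((uint64_t)tag << kTagShift);
}

/* Compare the pointer tag with the tag stored for the granule it points to.
   `base` is the untagged start address of the toy region. */
static bool tag_matches(uint64_t ptr, uint64_t base) {
  uint8_t ptr_tag = (uint8_t)(ptr >> kTagShift);
  uint64_t untagged = ptr & ~(0xffULL << kTagShift);
  return tag_shadow[(untagged - base) / kTagGranule] == ptr_tag;
}
```

    After an allocation writes its tag into tag_shadow, a pointer carrying the same tag passes the check; once deallocation clears the shadow, the stale pointer no longer matches.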

    HWAddressSanitizer is deployed on Android. Its C library bionic is instrumented so that very few interceptors are needed.

    To use HWAddressSanitizer with glibc in the future, either interceptors need to be provided or glibc can be instrumented.

    longjmp-family functions have issues similar to AddressSanitizer. The stack memory may be reused, causing a false positive. These functions are intercepted to call __hwasan_handle_longjmp to clear the shadow memory. vfork needs an interceptor similar to AddressSanitizer.

    ThreadSanitizer

    In the old runtime (tsan v2), each aligned 8-byte granule of user memory is mapped to a shadow cell of 32 bytes, which contains 4 shadow values. The representation uses 13 bits to record a thread ID (up to 8192 threads are supported), and 42 bits to record a vector clock timestamp.

    In the new runtime (tsan v3), each aligned 8-byte granule of user memory is mapped to a shadow cell of 16 bytes, which contains 4 shadow values. A shadow value records the bitmask of accessed bytes (8 bits), a thread slot ID (8 bits), a vector clock timestamp (14 bits), is_read (1 bit), is_atomic (1 bit). Shrinking the timestamp is possible because it increments more slowly (only on atomic releases, mutex unlocks, thread creation/destruction).

    At the start of an interceptor, the runtime calls cur_thread_init and retrieves the thread state and the return address of the current function. For a library function which performs memory reads or writes, its interceptor emulates an instrumented memory read/write: record the access as a thread event (EventAccess), form a new shadow value, and check existing shadow values (at most 4) in the shadow cell. If the current shadow value and a previous shadow value overlap in the bitmask of accessed bytes, have different thread slot IDs, have at least one write, and have at least one non-atomic access, report a data race. Otherwise replace one shadow value with the new one.
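
    The four conditions above map onto a small predicate. The struct layout is a hypothetical rendering of the field widths given in the text, not the runtime's actual definition, and the sketch checks only those four conditions, deliberately leaving out any comparison of timestamps.

```c
#include <stdbool.h>

/* Hypothetical 32-bit shadow value following the field widths in the text. */
struct ShadowValue {
  unsigned access_mask : 8; /* which of the 8 bytes were accessed */
  unsigned sid : 8;         /* thread slot ID */
  unsigned epoch : 14;      /* vector clock timestamp (unused in this sketch) */
  unsigned is_read : 1;
  unsigned is_atomic : 1;
};

/* Return true if the two accesses satisfy all four conditions from the text. */
static bool is_race(struct ShadowValue cur, struct ShadowValue old) {
  if ((cur.access_mask & old.access_mask) == 0)
    return false; /* the accessed bytes do not overlap */
  if (cur.sid == old.sid)
    return false; /* same thread slot */
  if (cur.is_read && old.is_read)
    return false; /* no write involved */
  if (cur.is_atomic && old.is_atomic)
    return false; /* both accesses are atomic */
  return true;
}
```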

    pthread mutex functions such as pthread_mutex_{init,destroy,lock,trylock,timedlock,unlock} (and pthread_{rwlock,spin,cond,barrier}_*, pthread_once) are intercepted to record mutex lifetime and synchronization points.

    Most libc functions do not have synchronization semantics.

    A non-special function which is neither instrumented nor intercepted just leads to fewer detected errors.

    MemorySanitizer

    MemorySanitizer uses shadow memory to track whether a memory region has uninitialized values.

    One user byte is mapped to one shadow memory byte. A shadow memory byte is 0 (unpoisoned, all 8 bits are initialized) or a non-zero integer (poisoned, some bits are uninitialized).

    At the start of an interceptor, __msan_init is called if the runtime hasn't been initialized yet. Then __errno_location() is unpoisoned. To support -fsanitize=memory,fuzzer, interceptors introduce a small overhead by checking whether interceptors are disabled due to libFuzzer. This is so that the libFuzzer runtime does not need to be instrumented by MemorySanitizer.

    For a library function which performs memory reads: if the shadow memory is poisoned, report a use-of-uninitialized-value error. For a library function which performs memory writes: unpoison the shadow memory, i.e. mark the memory region as initialized.

    If an uninstrumented function which performs memory writes does not have an interceptor, the lack of unpoisoning may lead to false positives when the memory is subsequently read. This property is different from AddressSanitizer/ThreadSanitizer where a missing interceptor usually just leads to fewer detected errors.
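
    A toy shadow array shows both the read/write rules and why a missing interceptor hurts. All names and the 32-byte region are hypothetical; offsets stand in for user addresses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { kMsanRegion = 32 };

/* One shadow byte per user byte; non-zero means "uninitialized". */
static uint8_t msan_shadow[kMsanRegion];

/* Fresh memory: mark [off, off+n) as uninitialized. */
static void poison(size_t off, size_t n) {
  memset(msan_shadow + off, 0xff, n);
}

/* What an interceptor does for a library write: mark bytes initialized. */
static void on_write(size_t off, size_t n) {
  memset(msan_shadow + off, 0, n);
}

/* What an interceptor does for a library read: returns 1 if a
   use-of-uninitialized-value error would be reported, 0 otherwise. */
static int on_read(size_t off, size_t n) {
  for (size_t i = 0; i < n; i++)
    if (msan_shadow[off + i])
      return 1;
  return 0;
}
```

    If an uninstrumented, unintercepted function fills a buffer, nothing calls on_write for it, so a later on_read over the same bytes still reports an error even though the memory was in fact initialized.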

    DataFlowSanitizer

    DataFlowSanitizer is a dynamic data flow analysis (taint analysis) tool. It allows tagging a user byte with up to 8 labels. Compiler instrumentation propagates labels when a user byte affects the computation of another one. This process is similar to uninitialized value propagation in MemorySanitizer. A user byte is mapped to a shadow memory byte which supports 8 labels.

    At the start of an interceptor, dfsan_init is called if the runtime hasn't been initialized yet. Then the label of __errno_location() is cleared.

    For a malloc-family or free-family function, its interceptor clears labels (assuming the bytes are unaffected by other values) by default.

    For a library function which performs memory writes or returns a value, its interceptor propagates the label of the source value.
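
    Label handling can be sketched the same way as the MemorySanitizer shadow. The sketch assumes each of the 8 labels is one bit of the shadow byte, so that combining two label sets is a bitwise OR; the function names and the 16-byte region are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t label_t; /* assumption: 8 labels, one bit each */

enum { kLabelRegion = 16 };

/* One label byte per user byte. */
static label_t label_shadow[kLabelRegion];

/* Label of a value computed from two inputs: the union of their labels. */
static label_t label_union(label_t a, label_t b) {
  return (label_t)(a | b);
}

/* What an interceptor for a copying function would do: the destination
   bytes take the labels of the source bytes (ranges must not overlap). */
static void propagate_copy(size_t dst, size_t src, size_t n) {
  for (size_t i = 0; i < n; i++)
    label_shadow[dst + i] = label_shadow[src + i];
}

/* What a malloc-family or free-family interceptor does by default. */
static void clear_labels(size_t off, size_t n) {
  memset(label_shadow + off, 0, n);
}
```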

    A non-special function which performs memory writes and is neither instrumented nor intercepted misses label propagation. This may cause false negatives.

    Standalone LeakSanitizer

    For most major 64-bit platforms (except Apple), AddressSanitizer integrates and enables LeakSanitizer by default. LeakSanitizer can be used standalone as well to just detect memory leak bugs. See "All about LeakSanitizer".

    Standalone LeakSanitizer intercepts very few functions: malloc-family and free-family functions (like a preloaded memory allocator), and a few functions like pthread_create.

    At the start of an interceptor, __lsan_init is called if the runtime hasn't been initialized yet.

    For a malloc-family function, its interceptor records the allocation information (requested size, stack trace).

    For pthread_create, its interceptor ensures correct thread ID, ignores allocations from the real pthread_create, and registers the new thread.

    At exit time, LeakSanitizer performs a GC style stop-the-world and scans all reachable memory chunks. For an unreachable chunk, report an error using the recorded information.
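
    The reachability scan can be illustrated with a tiny conservative marker. The Chunk struct and function names are hypothetical, the root set is just an array of words supplied by the caller, and stopping other threads is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record kept for every live allocation. */
struct Chunk {
  uintptr_t *mem; /* start of the allocation */
  size_t words;   /* size in pointer-sized words */
  bool reachable;
};

/* Return the chunk whose memory contains addr, or NULL. */
static struct Chunk *find_chunk(struct Chunk *chunks, size_t n,
                                uintptr_t addr) {
  for (size_t i = 0; i < n; i++) {
    uintptr_t begin = (uintptr_t)chunks[i].mem;
    uintptr_t end = (uintptr_t)(chunks[i].mem + chunks[i].words);
    if (addr >= begin && addr < end)
      return &chunks[i];
  }
  return NULL;
}

/* Treat every word in [words, words+count) as a potential pointer. Mark the
   chunk it points into and scan that chunk's contents in turn. */
static void mark_from(struct Chunk *chunks, size_t n, const uintptr_t *words,
                      size_t count) {
  for (size_t i = 0; i < count; i++) {
    struct Chunk *c = find_chunk(chunks, n, words[i]);
    if (c && !c->reachable) {
      c->reachable = true;
      mark_from(chunks, n, c->mem, c->words);
    }
  }
}
```

    Chunks still unmarked after scanning all roots are the ones that would be reported as leaks.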

    Scudo hardened allocator and GWP-ASan

    Scudo is a user-mode memory allocator designed to be resilient against heap-related vulnerabilities. GWP-ASan uses Scudo as the main allocator and lets Scudo hand over sampled memory allocations to its own memory allocator.

    In Scudo, only memory allocator related functions are intercepted.

    MemProf

    TODO MemProf isn't a sanitizer, but it uses sanitizer interceptors.

    Portability and maintainability

    Sanitizers support many operating systems and architectures.

    • https://clang.llvm.org/docs/AddressSanitizer.html#supported-platforms
    • https://clang.llvm.org/docs/MemorySanitizer.html#supported-platforms
    • https://clang.llvm.org/docs/ThreadSanitizer.html#supported-platforms
    • ...

    Implementing interceptors (though the code can be shared) may be the most challenging part of porting a sanitizer to a new platform.

    • A libc implementation more or less supports some extensions. These functions need to be intercepted.
    • Type definitions from a newer standard may not be supported by an implementation. A version dispatch is needed or a shim may be provided.

    The sanitizer runtime is therefore littered with #if conditional inclusions. Runtime tests with combinatorial explosion sometimes make debugging tricky.

    Therefore, support for a new operating system should be considered very carefully. I believe sanitizer maintainers tend to focus on the ability to instrument libc and provide sanitizer callbacks if a new OS is ever considered. One may ask whether this imposes pressure when the sanitizer runtime gets updated and needs to work with older libc versions. It seems that the intercepted read/write behavior part is quite stable in the sanitizer runtime, so working with multiple libc versions seems fine.

    glibc

    As mentioned at the beginning of this article, glibc cannot be built with Clang. I hope that this will change in the next few years.

    OK, an observation: new releases of glibc oftentimes require adaptation from sanitizer interceptors.

    The Linux kernel user-space API and glibc traditionally did not play well together. The discord can occasionally cause portability issues for projects. Sanitizer interceptors use #include <linux/*.h> a lot and are subject to such issues. E.g. I landed https://reviews.llvm.org/D129471 to remove #include <linux/fs.h> to resolve the fsconfig_command/mount_attr conflict with glibc 2.36. TODO mention why it is difficult to pull #include <linux/*.h> out of compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp.

    musl

    I ported asan, cfi, lsan, msan, tsan, ubsan to Linux musl in 2021-01 and learned a lot in the process. It seemed that code readability improved in some places (where glibc-ism was refined with a better condition, e.g. change from "Linux but not Android" or "all OSes but not X/Y/Z" (where X/Y/Z basically enumerates all non-glibc platforms) to "Linux glibc"). Maintaining the port required little effort. I applied very few fixes in 2021 and 2022.

    musl provides _LARGEFILE64_SOURCE symbols (mmap64, pread64, readdir64, stat64, etc) for glibc ABI compatibility which are not intended for linking. musl 1.2.4 will disallow linking against LFS64 symbols; therefore these interceptors have to be removed.

    Interesting problems

    Non-intercepted library function calls an intercepted library function

    MemorySanitizer/DataFlowSanitizer and ptsname

    ptsname used not to be intercepted and this caused false positives with MemorySanitizer.

    In glibc, ptsname calls __ptsname_r with a static variable (unknown to MemorySanitizer, having a zero value shadow). Since __ptsname_r is not instrumented, the shadow memory of numbuf is whatever the last shadow value was for the memory region that numbuf happens to occupy. If memcpy is interposable, its interceptor copies the incorrect (poisoned) shadow to buf, which may cause a false positive if buf is accessed by the caller.

    int
    __ptsname_r (int fd, char *buf, size_t buflen)
    {
      ...
      char numbuf[21];
      ...
      numbuf[sizeof (numbuf) - 1] = '\0';
      p = _itoa_word (ptyno, &numbuf[sizeof (numbuf) - 1], 10, 0);
      ...
      memcpy (__stpcpy (buf, devpts), p, &numbuf[sizeof (numbuf)] - p);
      ...
    }

    The fix is to intercept ptsname and ptsname_r.

    In glibc before 2018 (BZ#18822), many functions had PLT calls. There were many lurking issues like the above.

    _FORTIFY_SOURCE

    In short, disable _FORTIFY_SOURCE when a sanitizer is used. If -D_FORTIFY_SOURCE=2 is appended after your specified options, use -Wp,-U_FORTIFY_SOURCE to override it.

    When _FORTIFY_SOURCE is enabled, some library functions are redirected to *_chk. Interceptors don't provide *_chk (with an exception https://reviews.llvm.org/D40951), so we just end up with fewer detected errors (e.g. asan, tsan) or false positives (msan).

    See https://github.com/google/sanitizers/issues/247.

    Android linker namespace and libc++ interceptor

    https://github.com/android/ndk/issues/988


