IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    ELF hash function may overflow

    MaskRay发表于 2023-04-12 07:58:46
    love 0

    This article describes an interesting overflow bug in the ELF hashfunction.

    The System V Application Binary Interface (generic ABI) specifies theELF object file format. When producing an output executable or sharedobject needing a dynamic symbol table (.dynsym), a linkergenerates a .hash section with type SHT_HASHto hold a symbolhash table. A DT_HASH tag is produced to hold theaddress of .hash.

    DT_HASH is used by a dynamic loader to perform symbollookup (for dynamic relocations and dlsym familyfunctions). A detailed description of the format can be found in ELF: symbol lookup viaDT_HASH.

    Other use cases

    In a Solaris Version Definition Section, vd_hash holds avalue generated using the ELF hash function. The GNU symbol versioningscheme inherits this field from Solaris.

    Overflow bug

    The generic ABI gives the following code fragment in "Figure 5-13:Hashing Function".

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    unsigned long
    elf_hash(const unsigned char *name)
    {
    unsigned longh = 0, g;
    while (*name)
    {
    h = (h << 4) + *name++;
    if (g = h & 0xf0000000)
    h ^= g >> 24;
    h &= ~g;
    }
    return h;
    }

    The function is supposed to return a value no larger than 0x0fffffff.Unfortunately, there is a bug. When unsigned long consistsof more than 32 bits, the return value may be larger thanUINT32_MAX. For instance,elf_hash((const unsigned char *)"\xff\x0f\x0f\x0f\x0f\x0f\x12")returns 0x100000002, which is clearly unintended, as the function shouldbehave the same way regardless of whether long represents a32-bit integer or a 64-bit integer.

    It is possible to use 7-bit ASCII characters to trigger the issue.For instance,

    • elf_hash((const unsigned char *)"iiiiii\na")) == 100000001
    • elf_hash((const unsigned char *)"ZZZZZX+a")) == 100000011
    • elf_hash((const unsigned char *)"ZZZZZW9p")) == 100000000

    Note: most ELF operating systems have switched fromDT_HASH to DT_GNU_HASH for many years. Versiondefinitions are rarely used.

    Project survey

    glibc fixedthe overflow issue while optimizing the function in April 1995. Thetwo XOR operations were optimizedto one in Dec 2011.

    binutils-gdb fixedthe overflow issue in May 2003.

    musl has had the elegant and efficient implementation since June 2011(initial check-in of the dynamic linker). It is worth noting thatuint_fast32_t is used, so that an architecture can optimizethe implementation if the architecture has slow 32-bit integerarithmetic operations.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    static uint32_t sysv_hash(const char *s0)
    {
    const unsigned char *s = (void *)s0;
    uint_fast32_t h = 0;
    while (*s) {
    h = 16*h + *s++;
    h ^= h>>24 & 0xf0;
    }
    return h & 0xfffffff;
    }

    Nathan Sidwell raised theissue for llvm-project and pointed out a bug about usingchar instead of unsigned char on2023-04-09.

    I asked Whatif the result of elf_hash is larger than UINT32_MAX? on2023-04-11.



沪ICP备19023445号-2号
友情链接