Both compiler developers and security researchers have built disassemblers. They often prioritize different aspects. Compiler toolchains, benefiting from direct contributions from CPU vendors, tend to offer more accurate and robust decoding. Security-focused tools, on the other hand, often excel in user interface design.
For quick disassembly tasks, rizin provides a convenient command-line interface.
% rz-asm -a x86 -b 64 -d 4829c390
The -a x86 option can be omitted.
Within the LLVM ecosystem, llvm-objdump serves as a drop-in replacement for the traditional GNU objdump, leveraging instruction information from LLVM's TableGen files (llvm/lib/Target/*/*.td). Another LLVM tool, llvm-mc, was originally designed for internal testing of the Machine Code (MC) layer, particularly the assembler and disassembler components. There are numerous RUN: llvm-mc ... tests within llvm/test/MC. Despite its internal origins, llvm-mc is often distributed as part of the LLVM toolset, making it accessible to users.
However, using llvm-mc for simple disassembly tasks can be cumbersome. It requires explicitly prefixing hexadecimal byte values with 0x:
% echo 0x48 0x29 0xc3 0x90 | llvm-mc --triple=x86_64 --cdis --output-asm-variant=1
Let's break down the options used in this command:
--triple=x86_64: This specifies the target architecture. If your LLVM build's default target triple is already x86_64-*-*, this option can be omitted.
--output-asm-variant=1: LLVM, like GCC, defaults to AT&T syntax for x86 assembly. This option switches to the Intel syntax. See lhmouse/mcfgthread/wiki/Intel-syntax if you prefer the Intel syntax in compiler toolchains.
--cdis: Introduced in LLVM 18, this option enables colored disassembly. In older LLVM versions, you have to use --disassemble.

I have contributed patches to remove the .text directive from the output and to allow disassembling raw bytes without the 0x prefix. You can now use the --hex option:
% echo 4829c390 | llvm-mc --cdis --hex --output-asm-variant=1
You can further simplify this by creating a bash/zsh function. bash and zsh's "here string" feature provides a clean way to specify stdin.
disasm() { llvm-mc --cdis --hex --output-asm-variant=1 <<< "$*"; }
% disasm 4829c390
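Here strings are not available in plain POSIX sh (dash, for example). A minimal sketch of a pipe-based equivalent follows. `LLVM_MC` is a hypothetical override variable introduced only for this example; llvm-mc itself does not read it.

```shell
# Sketch: POSIX-portable variant of disasm that feeds stdin through a pipe
# instead of a here string.
# LLVM_MC is a hypothetical variable (an assumption of this example) that
# lets the llvm-mc binary be swapped out; it defaults to "llvm-mc".
disasm() {
  # "$*" joins all arguments with single spaces; --hex ignores the whitespace.
  printf '%s\n' "$*" | "${LLVM_MC:-llvm-mc}" --cdis --hex --output-asm-variant=1
}
```

It can be called as `disasm 4829c390` or `disasm 48 29 c3 90`; in both cases llvm-mc receives one line of hex text on stdin.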
The --hex option conveniently ignores whitespace and #-style comments within the input.
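With an llvm-mc that lacks --hex, similar input can be normalized by hand before it reaches the tool. The sketch below is a stand-in and not a description of how --hex works internally. `hex2mc` is a hypothetical helper name; it strips #-style comments and whitespace, then prints the bytes in the 0x-prefixed form that llvm-mc otherwise expects.

```shell
# hex2mc (hypothetical helper, not part of LLVM): read hex text on stdin,
# drop '#' comments and all whitespace, and print the bytes as
# space-separated 0x-prefixed values on a single line.
hex2mc() {
  sed 's/#.*//' |                 # remove everything from '#' to end of line
    tr -d ' \t\r\n' |             # remove spaces, tabs, and line breaks
    sed 's/../0x& /g; s/ $//'     # prefix each byte pair with 0x, trim the tail
}

# Possible use with an older llvm-mc (flags taken from the text above):
#   echo 4829c390 | hex2mc | llvm-mc --triple=x86_64 --disassemble --output-asm-variant=1
```

The helper assumes an even number of hex digits; a stray trailing digit is passed through unchanged, without a 0x prefix.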
For address information, llvm-mc falls short. We need to turn to llvm-objdump to get that detail. Here are fish and zsh scripts that take raw hex bytes as input, convert them to a binary format (xxd -r -p), and then create an ELF relocatable file (llvm-objcopy -I binary) targeting the x86-64 architecture. Finally, llvm-objdump with the -D flag disassembles the data section (.data) containing the converted binary.
% cat disasm.fish
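The three steps described above (hex to binary, binary to ELF, ELF to disassembly) can also be sketched as a plain POSIX shell function. This is an illustration of the pipeline, not the script from this post. `disasm_addr` is a hypothetical name, and `OBJCOPY` and `OBJDUMP` are hypothetical override variables that default to the LLVM tools. The `-O elf64-x86-64` output format is an assumption about how the x86-64 target is selected.

```shell
# Sketch: disassemble raw hex bytes with address information.
# disasm_addr, OBJCOPY, and OBJDUMP are hypothetical names for this example.
disasm_addr() {
  tmp=$(mktemp -d) || return 1
  # 1. Convert plain hex text to raw bytes.
  printf '%s\n' "$*" | xxd -r -p > "$tmp/code.bin" &&
    # 2. Wrap the bytes in an x86-64 ELF relocatable file; they land in .data.
    "${OBJCOPY:-llvm-objcopy}" -I binary -O elf64-x86-64 "$tmp/code.bin" "$tmp/code.o" &&
    # 3. -D disassembles all sections, including .data.
    "${OBJDUMP:-llvm-objdump}" -D "$tmp/code.o"
  status=$?
  rm -rf "$tmp"   # clean up the temporary files whether or not a step failed
  return $status
}
```

A call such as `disasm_addr 4829c390` should then print each instruction together with its offset within .data.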