Dna is a static binary analysis framework built on top of LLVM. Notably, it is written almost entirely in C#, including managed bindings for LLVM, Remill, and Souper.
Dna implements iterative control flow graph reconstruction, inspired heavily by the SATURN paper. It repeatedly applies recursive descent, lifting (using Remill), and path solving until the complete control flow graph is recovered. In the case of jump tables, we use a recursive algorithm based on Souper and Z3 to solve for the set of possible jump table targets. You can find the iterative exploration algorithm here, and the jump table solving algorithm here.
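To make the "explore, solve, repeat" loop concrete, here is a minimal toy sketch in Python. It is not Dna's implementation: the instruction encoding (`jmp`, `jcc`, `ijmp`, `ret`), the `recursive_descent` and `explore` helpers, and the `solve` callback are all hypothetical, and the lifting and path solving steps are replaced by a simple lookup. It only illustrates running recursive descent to a fixed point as indirect jump targets get resolved.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

# Hypothetical toy "binary": address -> (mnemonic, operands).
# "ijmp" stands in for an indirect jump whose targets are unknown
# until a solving step has run.
Instr = Tuple[str, Tuple[int, ...]]
Program = Dict[int, Instr]
Edge = Tuple[int, int]


def recursive_descent(program: Program, entry: int,
                      resolved: Dict[int, Set[int]]):
    """Follow control flow from `entry`, using already resolved indirect targets."""
    visited: Set[int] = set()
    edges: Set[Edge] = set()
    unresolved: Set[int] = set()
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        if addr in visited:
            continue
        visited.add(addr)
        op, operands = program[addr]
        if op in ("jmp", "jcc"):
            successors = operands          # direct targets are known statically
        elif op == "ijmp":
            successors = tuple(resolved.get(addr, ()))
            if not successors:
                unresolved.add(addr)       # needs solving before we can continue
        else:                              # "ret": no successors
            successors = ()
        for succ in successors:
            edges.add((addr, succ))
            worklist.append(succ)
    return visited, edges, unresolved


def explore(program: Program, entry: int,
            solve: Callable[[int], Iterable[int]]):
    """Repeat descent + solving until no new indirect targets are found."""
    resolved: Dict[int, Set[int]] = {}
    while True:
        visited, edges, unresolved = recursive_descent(program, entry, resolved)
        progress = False
        for addr in sorted(unresolved):
            targets = set(solve(addr))     # stand-in for lifting + path solving
            if targets:
                resolved[addr] = targets
                progress = True
        if not progress:                   # fixed point (or nothing solvable left)
            return visited, edges, unresolved


program: Program = {
    0x10: ("jcc", (0x20, 0x30)),
    0x20: ("ijmp", ()),
    0x30: ("ret", ()),
    0x40: ("jmp", (0x30,)),
    0x50: ("ret", ()),
}
# Pretend the solver proves that the jump at 0x20 can reach 0x40 or 0x50.
blocks, edges, unresolved = explore(
    program, 0x10, lambda addr: {0x20: [0x40, 0x50]}.get(addr, []))
```

In this toy run, the first pass only reaches 0x10, 0x20, and 0x30; resolving the indirect jump at 0x20 exposes 0x40 and 0x50, and the second pass finds nothing left to solve, so the loop stops.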
Once a control flow graph has been fully explored, it can then be recompiled to x86 and reinserted into the binary using the algorithms from here and here. Though the compiled code is not pretty by any means, it should run so long as the recovered control flow graph is correct. That being said, it is still a research prototype - bugs and edge cases are expected. Control flow graph exploration may fail in the case of e.g. unbounded jump tables or unliftable instructions.
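As a rough illustration of why unbounded jump tables are a problem, the Python sketch below enumerates jump table targets only when an upper bound on the index is known. It is an assumption-laden stand-in, not the Souper/Z3-based algorithm Dna uses: the `jump_table_targets` helper, the flat `image` buffer, and the fixed 8-byte little-endian entries are all hypothetical, and the bound is passed in directly rather than being proven by a solver.

```python
import struct
from typing import List, Optional


def jump_table_targets(image: bytes, table_offset: int,
                       max_index: Optional[int]) -> List[int]:
    """Read targets for indices 0..max_index from a table of 8-byte entries.

    `max_index` models the bound a solver would derive from the guarding
    comparison (e.g. `cmp idx, 3; ja default`). None means no bound was found.
    """
    if max_index is None:
        # Without a bound, every possible index would have to be considered,
        # so the set of targets cannot be enumerated.
        raise ValueError("unbounded jump table index; cannot enumerate targets")
    targets: List[int] = []
    for index in range(max_index + 1):
        (target,) = struct.unpack_from("<Q", image, table_offset + index * 8)
        if target not in targets:          # several indices may share one case
            targets.append(target)
    return targets


# Toy table with four entries, two of which point at the same case block.
image = struct.pack("<4Q", 0x1000, 0x1010, 0x1000, 0x1020)
targets = jump_table_targets(image, table_offset=0, max_index=3)
```

Calling the same helper with `max_index=None` raises `ValueError`, which mirrors the failure mode described above: with no bound on the index, there is no finite target set to hand back to the exploration loop.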
Some other notable features:
- Supports most jump tables, including MSVC's nested or so-called compressed jump tables.
- Supports lifting code with SEH to LLVM IR. When SEH is present, try/catch statements and filter intrinsics are inserted into the control flow graph. The recompiler does not (yet) support SEH (the SEH entries are not fixed up), so exceptions will cause crashes.
- Includes a strong API for writing LLVM passes natively in C#. We have bindings for e.g. MemorySSA, LoopInfo, dominator trees, pass pipeline management, etc.
- Graph visualization for LLVM IR and binary control flow graphs using Graphviz or, alternatively, a script generator for Binary Ninja.
Some caveats:
- Only x86_64 is supported
- Recompiled code is not CET compliant
Dependencies:
- LLVM/LLVMSharp
- Remill
- Souper
- AsmResolver
- Rivers
Note that Dna is currently based on LLVM 17.
Dna contains a VMProtect devirtualization plugin located in Dna.BinaryTranslator/VMProtect. See this PR for more info.
Dna currently targets LLVM 17 and is expected to be built on Windows x64 with Visual Studio 2022.
Build Dna.LLVMInterop in Release mode; the native dependency tree is Release-built and Debug interop builds are not supported.
Prerequisites:
- Visual Studio 2022 with C++/MSBuild tools
- CMake
- Ninja
- clang-cl / LLVM tools available from the VS toolchain
- Rust/Cargo, for the EqSat simplifier DLL
- .NET SDK 8+
Run the commands below from a VS x64 developer shell, or another shell with the VS C++ tools on PATH.
The dependency superbuild installs LLVM 17, Remill, Z3, XED, gflags/glog, and related native libraries into Dna.LLVMInterop/dependencies/install.
cmake -S Dna.LLVMInterop/dependencies `
-B Dna.LLVMInterop/dependencies/build `
-G Ninja `
-DCMAKE_BUILD_TYPE=Release `
-DCMAKE_C_COMPILER=clang-cl `
-DCMAKE_CXX_COMPILER=clang-cl
cmake --build Dna.LLVMInterop/dependencies/build

If changing compiler, build type, or CRT settings, delete both Dna.LLVMInterop/dependencies/build and Dna.LLVMInterop/dependencies/install before reconfiguring.
Dna.Example and the simplifier projects copy eq_sat.dll from the Cargo release output.
cargo build --manifest-path Simplifier/EqSat/Cargo.toml --release

Then build the solution with MSBuild:
& "C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\Bin\MSBuild.exe" `
Dna.sln `
/restore `
/p:Configuration=Release `
/p:Platform=x64 `
/m