
Benchmark Study: AI Agents Using Ghidra to Detect Backdoors in Binary Executables

By jakozaur

3mo ago · 15 min read · en · Insight

Summary

The article describes a benchmark study called BinaryAudit that evaluates AI agents' ability to detect backdoors in compiled binary executables. Researchers partnered with reverse engineering expert Michał "Redford" Kowalczyk to create a benchmark testing AI agents using Ghidra (NSA's reverse engineering tool) to find malicious code in ~40MB binaries of real open-source servers, proxies, and network infrastructure without access to source code. The benchmark measures detection accuracy, false positive rates, and tool proficiency for practical malware detection applications.
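As a rough illustration of the two headline metrics, here is a minimal sketch using the standard definitions of detection rate and false positive rate. The summary does not describe how BinaryAudit actually scores runs, so the `TaskResult` type, its field names, and the sample data are hypothetical and not taken from the benchmark.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskResult:
    """One agent run on one binary (hypothetical record layout)."""
    has_backdoor: bool  # ground truth: was a backdoor planted in this binary?
    flagged: bool       # agent verdict: did it report a backdoor?


def detection_rate(results: List[TaskResult]) -> Optional[float]:
    """Share of backdoored binaries that the agent flagged."""
    positives = [r for r in results if r.has_backdoor]
    if not positives:
        return None  # undefined when no backdoored binaries were tested
    return sum(r.flagged for r in positives) / len(positives)


def false_positive_rate(results: List[TaskResult]) -> Optional[float]:
    """Share of clean binaries that the agent wrongly flagged."""
    negatives = [r for r in results if not r.has_backdoor]
    if not negatives:
        return None  # undefined when no clean binaries were tested
    return sum(r.flagged for r in negatives) / len(negatives)


# Made-up sample: 4 backdoored binaries (3 caught), 4 clean binaries (1 false alarm).
sample = [
    TaskResult(True, True), TaskResult(True, True),
    TaskResult(True, True), TaskResult(True, False),
    TaskResult(False, False), TaskResult(False, False),
    TaskResult(False, False), TaskResult(False, True),
]

print(detection_rate(sample), false_positive_rate(sample))
```

Reporting both numbers matters because an agent that flags every binary would reach a perfect detection rate while being useless in practice.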

Key quotes

We partnered with Michał 'Redford' Kowalczyk, reverse engineering expert from Dragon Sector, known for finding malicious code in Polish trains, to create a benchmark of finding backdoors in binary executables, without access to source code.
See BinaryAudit for the full benchmark results — including false positive rates, tool proficiency, and the Pareto
BinaryAudit benchmarks AI agents using Ghidra to find backdoors in compiled binaries of real open-source servers, proxies, and network infrastructure.
We hid backdoors in ~40MB binaries and asked AI + Ghidra to find them
