OpenLATTE is a reimplementation of LATTE (LLM-Powered Static Binary Taint Analysis), a static analysis pipeline for discovering vulnerabilities in stripped binaries. The project automates the three major phases described in the paper:
- Function classification – external library calls are labelled as potential taint sources or sinks using an LLM.
- Flow extraction – Ghidra scripts identify vulnerable destinations and build call chains that trace the flow of tainted data.
- LLM inspection – the discovered flows are analysed by an LLM to report possible vulnerabilities. The rag.py script optionally augments these prompts with context retrieved from a knowledge base (retrieval-augmented generation, RAG).
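The flow-extraction phase can be pictured as a backward walk over the call graph, starting at the function that contains a sink call. The sketch below is a simplified illustration, not the project's Ghidra script: the `find_call_chains` helper, the toy call graph, and every function name in it are hypothetical, and real taint analysis also has to follow data dependencies, which is omitted here.

```python
from typing import Dict, List, Set


def find_call_chains(
    callers: Dict[str, List[str]],
    dest_func: str,
    roots: Set[str],
) -> List[List[str]]:
    """Return every acyclic caller chain from a root function down to dest_func.

    callers maps a function name to the functions that call it. This is a
    hypothetical input; a real pipeline would derive it from the disassembler.
    """
    chains: List[List[str]] = []

    def walk(func: str, path: List[str]) -> None:
        path = [func] + path  # prepend so each chain reads root -> destination
        if func in roots:
            chains.append(path)
            return
        for caller in callers.get(func, []):
            if caller not in path:  # skip recursion cycles
                walk(caller, path)

    walk(dest_func, [])
    return chains


# Toy call graph: main -> parse_input -> copy_field, and main -> copy_field.
callers = {
    "copy_field": ["parse_input", "main"],
    "parse_input": ["main"],
}
chains = find_call_chains(callers, "copy_field", roots={"main"})
for chain in chains:
    print(" -> ".join(chain))
```

Each resulting chain is a candidate path along which tainted data could reach the sink, which is the kind of unit the LLM inspection phase would then be asked to judge.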
The repository contains helper scripts for running Ghidra in headless mode, exporting code for a knowledge base and querying either a local Ollama model or Google Gemini.
build/ # Compiled test binaries and symbol maps
results/ # JSON output of each analysis stage
ghidra-workspace/ # Ghidra projects created during headless runs
*.py, *.sh # Analysis scripts used in each LATTE phase
- Python 3.9+
- Ghidra 11.3.2 with the Ghidrathon 4.0 plugin
- A running LLM backend
  - Local model via Ollama (classifyLocal.py, inspect_flows_with_llm.py)
  - Google Gemini for higher quality results (classifyGemini.py, rag.py)
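For the local backend, a request to Ollama's HTTP API might look like the sketch below. How the project's own scripts talk to Ollama is not shown here, so treat this as an assumption: it targets Ollama's default local endpoint, and the model name `llama3` and the prompt text are placeholders.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ollama_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request.

    The model name is a placeholder; use whichever model you have pulled.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_ollama_request("Is strcpy a taint sink? Answer yes or no.")
print(req.get_method(), req.full_url)

# To actually query a running server:
#   with urllib.request.urlopen(req) as resp:
#       print(json.loads(resp.read())["response"])
```

Building the request separately from sending it makes it easy to check that the backend settings are right before starting a long classification run.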
pip install -r requirement.txt
Several scripts expect an environment variable such as GOOGLE_API_KEY or GOOGLE_API_KEY42 to be set to your Gemini API key.
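A quick way to confirm a key is visible before starting a run is sketched below. Which script reads which variable is not stated here, so the lookup order and the `get_gemini_key` helper are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def get_gemini_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty key found, or None if neither is set.

    The lookup order is an assumption, not the project's documented behaviour.
    """
    for name in ("GOOGLE_API_KEY", "GOOGLE_API_KEY42"):
        value = env.get(name)
        if value:
            return value
    return None


# A plain dict stands in for the real environment in this example.
key = get_gemini_key({"GOOGLE_API_KEY42": "example-key"})
print("key found" if key else "no key set")
```

Calling `get_gemini_key()` with no argument checks the real process environment.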
- Export external functions
    ./flow.sh /path/to/binary.out   # runs Ghidra to dump external functions
- Classify as sources or sinks
    python3 batch_classify.py --ext-funcs build/external_funcs_<binary>.out.txt \
        --mode sink --output-dir results
    python3 batch_classify.py --ext-funcs build/external_funcs_<binary>.out.txt \
        --mode source --output-dir results
- Find dangerous flows (headless Ghidra)
    ./DF.sh   # wrapper around find_dangerous_flows.py
- Export code for each flow
    "<ghidra>/support/analyzeHeadless" <workspace> ProjectName \
        -import <binary> -scriptPath . -postScript export_flow_code.py -deleteProject
- Inspect flows with an LLM
    python3 inspect_flows_with_llm.py \
        --flows-with-code results/flows_with_code_<binary>.json \
        --sources results/source_classification_<binary>.json \
        --output results/vulnerability_reports.json
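When the same steps are run for several binaries, the Python invocations above can be generated instead of typed by hand. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, `demo` is a placeholder for `<binary>`, and the Ghidra steps are left out because their paths depend on the local installation.

```python
import shlex
from typing import List


def classify_commands(binary: str, output_dir: str = "results") -> List[List[str]]:
    """Build the batch_classify.py invocations for sink mode, then source mode."""
    ext_funcs = f"build/external_funcs_{binary}.out.txt"
    return [
        ["python3", "batch_classify.py", "--ext-funcs", ext_funcs,
         "--mode", mode, "--output-dir", output_dir]
        for mode in ("sink", "source")
    ]


def inspect_command(binary: str, output_dir: str = "results") -> List[str]:
    """Build the inspect_flows_with_llm.py invocation for one binary."""
    return [
        "python3", "inspect_flows_with_llm.py",
        "--flows-with-code", f"{output_dir}/flows_with_code_{binary}.json",
        "--sources", f"{output_dir}/source_classification_{binary}.json",
        "--output", f"{output_dir}/vulnerability_reports.json",
    ]


# "demo" stands in for <binary>; substitute the name of your own target.
commands = classify_commands("demo") + [inspect_command("demo")]
for argv in commands:
    # Print only; pass argv to subprocess.run(argv, check=True) to execute.
    print(shlex.join(argv))
```

Keeping each command as an argument list rather than a single string avoids shell-quoting problems if a binary name contains spaces.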