Hsmnasiri/OpenLATTE

OpenLATTE

OpenLATTE is a reimplementation of LATTE (LLM-Powered Static Binary Taint Analysis), a static analysis pipeline for discovering vulnerabilities in stripped binaries. The project automates the three major phases described in the paper:

  1. Function classification – external library calls are labelled as potential taint sources or sinks using an LLM.
  2. Flow extraction – Ghidra scripts identify vulnerable destinations and build call chains that trace the flow of tainted data.
  3. LLM inspection – the discovered flows are analysed by an LLM to report possible vulnerabilities. The rag.py script optionally augments these prompts with a retrieval‑augmented knowledge base (RAG).

The repository also contains helper scripts for running Ghidra in headless mode, exporting code for a knowledge base, and querying either a local Ollama model or Google Gemini.
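To make the flow extraction phase more concrete, here is a minimal sketch of what "building call chains" can mean on a toy call graph. It is not the code used by the Ghidra scripts in this repository: the graph, the function names, and the helper `call_chains_to` are all hypothetical, and the real pipeline works on data extracted from the binary rather than a hand-written dictionary.

```python
from typing import Dict, List, Set

# Hypothetical call graph (caller -> callees). Names are illustrative only.
CALL_GRAPH: Dict[str, List[str]] = {
    "main": ["read_input", "process"],
    "read_input": ["recv"],
    "process": ["copy_data"],
    "copy_data": ["strcpy"],
}


def call_chains_to(graph: Dict[str, List[str]], start: str,
                   sinks: Set[str]) -> List[List[str]]:
    """Return every acyclic call chain from `start` that ends at a sink."""
    chains: List[List[str]] = []
    stack: List[List[str]] = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node in sinks:
            chains.append(path)
            continue
        for callee in graph.get(node, []):
            # Skip callees already on the path to avoid looping on recursion.
            if callee not in path:
                stack.append(path + [callee])
    return chains


# Assumes phase 1 has labelled strcpy as a sink.
chains = call_chains_to(CALL_GRAPH, "main", {"strcpy"})
```

In this toy graph the only chain ending at the assumed sink is `main -> process -> copy_data -> strcpy`. A chain like this, together with the decompiled code of each function on it, is the kind of input the LLM inspection phase would then receive.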

Repository layout

build/                 # Compiled test binaries and symbol maps
results/               # JSON output of each analysis stage
ghidra-workspace/      # Ghidra projects created during headless runs
*.py, *.sh             # Analysis scripts used in each LATTE phase

Requirements

  • Python 3.9+
  • Ghidra 11.3.2 with the Ghidrathon 4.0 plugin
  • A running LLM backend
    • Local model via Ollama (classifyLocal.py, inspect_flows_with_llm.py)
    • Google Gemini for higher quality results (classifyGemini.py, rag.py)
  • Python dependencies, installed with pip install -r requirement.txt

Several scripts expect an environment variable such as GOOGLE_API_KEY or GOOGLE_API_KEY42 to hold your Gemini API key.
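Before starting a long run, it can help to check that a key is actually available. The sketch below is one way to do that. The helper `find_gemini_key` is hypothetical and not part of the repository, and the lookup order (GOOGLE_API_KEY first, then GOOGLE_API_KEY42) is an assumption. Check the individual scripts to see which variable each one reads.

```python
import os
from typing import Mapping, Optional

# Variable names mentioned in this README; the lookup order is an assumption.
KEY_VARS = ("GOOGLE_API_KEY", "GOOGLE_API_KEY42")


def find_gemini_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty Gemini key found in `env`, or None."""
    for name in KEY_VARS:
        value = env.get(name)
        if value:
            return value
    return None


if __name__ == "__main__":
    if find_gemini_key() is None:
        print("No Gemini key found; set one of: " + ", ".join(KEY_VARS))
```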

Basic workflow

  1. Export external functions

    ./flow.sh /path/to/binary.out   # runs Ghidra to dump external functions

  2. Classify as sources or sinks

    python3 batch_classify.py --ext-funcs build/external_funcs_<binary>.out.txt \
        --mode sink   --output-dir results
    python3 batch_classify.py --ext-funcs build/external_funcs_<binary>.out.txt \
        --mode source --output-dir results

  3. Find dangerous flows (headless Ghidra)

    ./DF.sh   # wrapper around find_dangerous_flows.py

  4. Export code for each flow

    "<ghidra>/support/analyzeHeadless" <workspace> ProjectName \
        -import <binary> -scriptPath . -postScript export_flow_code.py -deleteProject

  5. Inspect flows with an LLM

    python3 inspect_flows_with_llm.py \
        --flows-with-code results/flows_with_code_<binary>.json \
        --sources results/source_classification_<binary>.json \
        --output results/vulnerability_reports.json
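Step 2 runs the same command twice, once per mode, so it is easy to script. The following is a minimal sketch of a wrapper that builds both commands from the flags shown above. The helpers `classify_cmd` and `run_classification` are hypothetical and not part of the repository. The sketch substitutes the binary name for the `<binary>` placeholder exactly as written in step 2, so whether that name should include a file extension is something to verify against the files that step 1 actually produces.

```python
import subprocess
from typing import List


def classify_cmd(binary: str, mode: str, out_dir: str = "results") -> List[str]:
    """Build the step 2 command for one mode ("sink" or "source")."""
    return [
        "python3", "batch_classify.py",
        "--ext-funcs", f"build/external_funcs_{binary}.out.txt",
        "--mode", mode,
        "--output-dir", out_dir,
    ]


def run_classification(binary: str, dry_run: bool = True) -> List[List[str]]:
    """Run (or, with dry_run, only print) the sink and source classification."""
    cmds = [classify_cmd(binary, mode) for mode in ("sink", "source")]
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing command.
            subprocess.run(cmd, check=True)
    return cmds


if __name__ == "__main__":
    # "demo" is a placeholder binary name, not a file shipped with the project.
    run_classification("demo", dry_run=True)
```

The dry-run default only prints the commands, which makes it easy to confirm the generated paths before spending LLM calls. Steps 3 and 4 still need to run before the inspection in step 5.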
