…elper, add exit codes - fix B110: replace bare except:pass with specific exceptions + debug logging - fix B104: default API host 0.0.0.0 -> 127.0.0.1 - fix B324: add usedforsecurity=False to MD5 hash - fix B701: add jinja2 autoescape - extract _build_runner helper to deduplicate CLI setup - add SystemExit(1) to all CLI error paths - remove 19 unused imports across 10 files - strict score: 18.7 -> 70.9
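The B324 fix above amounts to marking the digest as non-cryptographic; a minimal sketch (the input bytes are a made-up example, not from the PR):

```python
import hashlib

# usedforsecurity=False (Python 3.9+) tells Bandit the hash is used for
# fingerprinting/cache keys, not cryptography, which silences B324.
digest = hashlib.md5(b"example-driver-bytes", usedforsecurity=False).hexdigest()
print(digest)  # 32-character hex fingerprint
```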
…y types in stage/runner/cli
… (loldrivers_filter, pe_ingest) - test_providers_llm: LLMProvider properties, completion, retry, rate limiting, token errors, import guard - test_providers_decompiler: analyzeHeadless finder, cache hit/corrupt, ghidra execution - test_loldrivers_filter: DB loading, process logic, skip/pass verdicts - test_pe_ingest: directory discovery, extensions, recursion, subdirs, metadata - also: add _load_env helper for dotenv ImportError guard in cli.py
… fix pe_ingest B110 - test_stages_llm: 13 tests covering process flow, caching, classification, template vars, artifact loading - test_ghidra_decompile: 7 tests covering error paths, successful/failed decompilation, cache skip - fix imphash extraction B110: add debug logging and noqa annotation - total: 129 tests (up from 74)
…inner, add --model to resume - extract run_subprocess_with_kill and kill_process_tree to engine/process.py (design coherence) - rename _run_inner to _execute_pipeline_stages (naming quality) - add --model/-m option to resume command for API coherence with run - add network exposure warning when serve --host is non-localhost
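The extracted `run_subprocess_with_kill`/`kill_process_tree` helpers are not shown in this thread; a POSIX-only sketch of the underlying idea, assuming the real `engine/process.py` also handles Windows:

```python
import os
import signal
import subprocess


def run_subprocess_with_kill(cmd: list[str], timeout: float):
    """Run cmd; on timeout, kill the whole process tree (POSIX sketch).

    start_new_session=True puts the child in its own process group, so
    killing that group also takes down any grandchildren it spawned.
    Returns the exit code, or None if the tree was killed on timeout.
    """
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
        proc.wait()
        return None
```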
…les without suppressions
…state abstractions
… legacy ghidra_runner.py
…tive asyncio execution
…chitecture and workflow execution schema
…gainst Github Actions environment containers
Pull request overview
This PR introduces a new “DeepZero” pipeline architecture centered around filesystem-backed state (“state ledger”), adds built-in + external processors for driver-analysis workflows, and replaces/removes the previous byovd_agent implementation.
Changes:
- Add core engine modules (pipeline loading/validation, runner, registry, state store, process utilities, UI) and built-in processors (ingest/filter/sort/top-k/command/LLM).
- Add external processors for Ghidra decompilation, LOLDrivers filtering, and bulk Semgrep scanning, plus a reference `pipelines/loldrivers` pipeline.
- Add extensive test coverage, CI workflow, packaging/metadata updates, and remove legacy `byovd_agent` code.
Reviewed changes
Copilot reviewed 81 out of 88 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test_state.py | Adds tests for namespaced sample state + state store persistence/atomic writes. |
| tests/test_stages_llm.py | Adds tests for GenericLLM processing, caching, template vars, classification. |
| tests/test_stage_sort.py | Adds a basic unit test for Sort reduce processor. |
| tests/test_stage.py | Tests processor resolution, base classes, dataclasses, registry behavior. |
| tests/test_sort.py | Adds more complete tests for sorting behavior + validation edge cases. |
| tests/test_semgrep_scanner.py | Smoke test for Semgrep scanner processor construction. |
| tests/test_runner.py | Tests PipelineRunner execution, resume, failure isolation, parallel map, limits. |
| tests/test_registry.py | Tests engine.registry registry + resolving built-ins and file processors. |
| tests/test_providers_llm.py | Tests LLMProvider model parsing and retry/backoff behavior via mocks. |
| tests/test_process.py | Tests subprocess execution with timeout and cross-platform kill behavior. |
| tests/test_pipeline_e2e.py | End-to-end tests for ingest/map/batch/reduce interactions and caching/abort flows. |
| tests/test_pipeline.py | Tests pipeline YAML loading/validation behavior (env vars, ordering, work dir). |
| tests/test_pe_ingest.py | Tests external PE ingest processor behavior (hashes, recursion, filters). |
| tests/test_loldrivers_filter.py | Tests LOLDrivers filter DB loading + filtering by SHA256. |
| tests/test_ghidra_decompile.py | Tests external Ghidra decompile processor validation, caching, and execution paths. |
| tests/test_extract_dispatch.py | Tests Ghidra post-script helper extraction with fully mocked Ghidra modules. |
| tests/test_engine_ui.py | Tests PipelineDashboard stats and layout generation. |
| tests/test_engine_registry.py | Additional registry tests for register/resolve behavior. |
| tests/test_engine_pipeline.py | Tests load_pipeline/validate_pipeline APIs. |
| tests/test_engine_llm.py | Basic tests for LLMProvider init/complete/import guard. |
| tests/test_engine_context.py | Tests generate_context() markdown output/truncation and artifacts listing. |
| tests/test_context.py | Additional context generation test using SampleState history. |
| tests/test_cli.py | Tests CLI command registration/help and basic behaviors for validate/init/status/run/resume/serve. |
| tests/test_builtin_ingest.py | Tests built-in FileDiscovery ingest behavior. |
| tests/test_builtin_filters.py | Tests built-in MetadataFilter, HashExclude, and TopKSelector. |
| tests/test_builtin_external.py | Tests built-in GenericCommand behavior using a mocked async runner. |
| tests/test_api_server.py | Tests Starlette API endpoints against filesystem state store. |
| src/deepzero/stages/top_k.py | Adds TopKSelector reduce processor. |
| src/deepzero/stages/sort.py | Adds Sort reduce processor. |
| src/deepzero/stages/llm.py | Adds GenericLLM map processor (jinja prompts, caching, classification, artifact context). |
| src/deepzero/stages/ingest.py | Adds FileDiscovery ingest processor. |
| src/deepzero/stages/hash_filter.py | Adds HashExclude map processor. |
| src/deepzero/stages/filter.py | Adds MetadataFilter map processor. |
| src/deepzero/stages/command.py | Adds GenericCommand map processor for running external commands. |
| src/deepzero/stages/__init__.py | Registers built-in processors in the registry. |
| src/deepzero/engine/ui.py | Adds Rich-based pipeline dashboard TUI. |
| src/deepzero/engine/types.py | Adds shared enums for verdict/status types. |
| src/deepzero/engine/registry.py | Adds processor registry + resolution (built-ins, processors/ file refs, dotted refs). |
| src/deepzero/engine/process.py | Adds subprocess runner with timeout + process-tree killing. |
| src/deepzero/engine/pipeline.py | Adds pipeline definition/YAML loader, env-var expansion, processor validation. |
| src/deepzero/engine/llm.py | Adds LLMProvider wrapper with adaptive retry/backoff and import guards. |
| src/deepzero/engine/context.py | Adds generate_context() to write sample context markdown via atomic writes. |
| src/deepzero/engine/__init__.py | Defines engine package. |
| src/deepzero/api/server.py | Adds Starlette API server for run/sample listing and artifact viewing. |
| src/deepzero/api/__init__.py | Defines api package. |
| src/deepzero/__main__.py | Adds module entrypoint to CLI. |
| src/deepzero/__init__.py | Defines package metadata/version. |
| src/byovd_agent/translate/ghidra_runner.py | Removes legacy BYOVD Ghidra runner implementation. |
| src/byovd_agent/translate/__init__.py | Removes legacy translate package init. |
| src/byovd_agent/prompts.py | Removes legacy prompt content. |
| src/byovd_agent/models.py | Removes legacy pydantic models. |
| src/byovd_agent/knowledge/vuln_patterns.py | Removes legacy vuln-pattern knowledge module. |
| src/byovd_agent/knowledge/loldrivers.py | Removes legacy LOLDrivers DB wrapper. |
| src/byovd_agent/knowledge/__init__.py | Removes legacy knowledge package init. |
| src/byovd_agent/ingest/triage.py | Removes legacy ingest/triage implementation. |
| src/byovd_agent/ingest/scraper.py | Removes legacy scraper implementation. |
| src/byovd_agent/ingest/__init__.py | Removes legacy ingest package init. |
| src/byovd_agent/identify/semgrep_scanner.py | Removes legacy semgrep formatting/scanning utilities. |
| src/byovd_agent/identify/__init__.py | Removes legacy identify package init. |
| src/byovd_agent/config.py | Removes legacy dotenv-based config. |
| src/byovd_agent/agent.py | Removes legacy DeepAgents-based orchestrator agent. |
| src/byovd_agent/__init__.py | Removes legacy package init. |
| pyproject.toml | Renames project to deepzero, updates dependencies/extras, pytest config, scripts. |
| processors/semgrep_scanner/semgrep_scanner.py | Adds external bulk Semgrep scanner processor (batch stage). |
| processors/loldrivers_filter/loldrivers_filter.py | Adds external LOLDrivers filter processor (auto-download + cache). |
| processors/ghidra_decompile/ghidra_decompile.py | Adds external Ghidra decompile processor (headless, caching, timeouts). |
| pipelines/loldrivers/rules/msr_access.yaml | Adds Semgrep rules for MSR access patterns. |
| pipelines/loldrivers/rules/method_neither.yaml | Updates METHOD_NEITHER rule text (contains a mojibake issue). |
| pipelines/loldrivers/rules/buffer_overflow.yaml | Adds Semgrep rules for overflow/info-leak patterns. |
| pipelines/loldrivers/rules/arbitrary_rw.yaml | Adds Semgrep rules for arbitrary R/W and related primitives. |
| pipelines/loldrivers/pipeline.yaml | Adds reference LOLDrivers pipeline definition wiring external+built-in processors. |
| pipelines/loldrivers/assessment.j2 | Adds LLM assessment prompt template. |
| func.txt | Adds an unshown auxiliary file. |
| copy_reports.ps1 | Removes legacy report-copying script. |
| README.md | Rewrites documentation to match the new DeepZero pipeline/orchestrator focus. |
| LICENSE | Adds MIT license file. |
| .gitignore | Updates ignore patterns for DeepZero workspace/cache/build/test artifacts. |
| .github/workflows/ci.yml | Adds CI for linting (ruff), security scan (bandit), and pytest on 3.11/3.12. |
| .env.example | Replaces env template to match litellm + ghidra configuration. |
Comments suppressed due to low confidence (1)
pipelines/loldrivers/rules/method_neither.yaml:9
- The message contains mojibake (`ΓÇö`) instead of an em dash, which will show up incorrectly in Semgrep output and reports. Replace it with a proper em dash (`—`) or a plain `-`.
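This kind of mojibake is typically UTF-8 bytes decoded with a legacy codepage (here CP437); a round-trip sketch showing how the garbled text maps back to the intended character:

```python
# "ΓÇö" is what the UTF-8 em dash (bytes E2 80 94) looks like when
# decoded as CP437; re-encoding and decoding correctly recovers it.
bad = "ΓÇö"
fixed = bad.encode("cp437").decode("utf-8")
print(fixed)  # prints an em dash (U+2014)
```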
```python
class StageOutput:  # helper
    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)
```
The StageOutput helper class defined here is unused in this test module (no references found), and it shadows the real deepzero.engine.state.StageOutput name used elsewhere. Removing it avoids confusion and keeps the tests focused on the runner behavior.
```python
from deepzero.engine.stage import (
    Sample,
    BulkMapProcessor,
    ProcessorResult,
    ProcessorContext,
)


class SemgrepScanner(BulkMapProcessor):
    description = (
        "runs semgrep batch scan against decompiled source across all active samples"
    )

    def validate(self, ctx: ProcessorContext) -> list[str]:
        errors = []
        if not shutil.which("semgrep"):
            errors.append(
                "semgrep CLI not found in PATH - install with: pip install semgrep"
            )

        rules_dir = self.config.get("rules_dir")
        if not rules_dir:
            errors.append("semgrep_scanner requires 'rules_dir' in config")
        else:
            rules_path = (Path.cwd() / rules_dir).resolve()
            if not rules_path.exists():
                rules_path = (ctx.pipeline_dir / rules_dir).resolve()
            if not rules_path.exists():
                errors.append(f"rules_dir does not exist: {rules_dir}")

        return errors

    def process(
        self, ctx: ProcessorContext, entries: list[Sample]
    ) -> list[ProcessorResult]:
        rules_dir = self.config.get("rules_dir", "")
```
SemgrepScanner subclasses BulkMapProcessor, whose process() contract expects list[ProcessorEntry]. This implementation types the argument as list[Sample] and then accesses entry.sample_dir, which Sample does not have, causing an AttributeError at runtime when the runner calls this batch stage. Update the signature/imports to use ProcessorEntry (and keep the list order aligned with the input entries).
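A sketch of the signature shape the comment asks for, using stand-in types since the real `deepzero.engine.stage` classes aren't shown in this thread:

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ProcessorEntry:
    # Hypothetical stand-in: the review implies batch entries carry
    # sample_dir, which Sample does not.
    sample_dir: Path


def process(entries: list[ProcessorEntry]) -> list[Path]:
    # Batch stages receive ProcessorEntry items; the result list must
    # stay aligned one-to-one with the input order.
    return [entry.sample_dir for entry in entries]
```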
```python
    def validate(self, ctx: ProcessorContext) -> list[str]:
        errors = []

        prompt_ref = self.config.get("prompt")
        if not prompt_ref:
            errors.append("GenericLLM requires 'prompt' template mapping in config")
        else:
            prompt_path = (Path.cwd() / prompt_ref).resolve()
            if not prompt_path.exists():
                prompt_path = (ctx.pipeline_dir / prompt_ref).resolve()
            if not prompt_path.exists():
                errors.append(f"Prompt template does not exist: {prompt_ref}")

        # structurally validate LLM bindings early
        model = ctx.global_config.get("model")
        if model:
            try:
                import litellm

                env_state = litellm.validate_environment(model=model)
                if not env_state.get("keys_in_environment", True):
                    missing_keys = env_state.get("missing_keys", [])
                    if missing_keys:
                        errors.append(
                            f"LLM backend '{model}' missing credentials in environment. Need: {missing_keys}"
                        )
            except ImportError:
                errors.append(
                    "LLM configured, but 'litellm' framework is not installed"
                )

        return errors

    def process(self, ctx: ProcessorContext, entry: ProcessorEntry) -> ProcessorResult:
        if ctx.llm is None:
            return ProcessorResult.fail("no llm provider configured for generic_llm")

        prompt_ref = self.config.get("prompt", "")
        prompt_text = self._render_prompt(prompt_ref, ctx, entry)

        output_file = self.config.get("output_file", "assessment.md")
        output_path = entry.sample_dir / output_file
        if output_path.exists():
            self.log.info("output already cached: %s", output_path.name)
            content = output_path.read_text(encoding="utf-8", errors="replace")
            return self._make_result(content, output_file)

        max_retries = self.config.get("max_retries", 3)
        backoff_config = self.config.get("backoff", {})

        messages = [{"role": "user", "content": prompt_text}]

        response = ctx.llm.complete(
            messages,
            max_retries=max_retries,
            initial_backoff=backoff_config.get("initial", 2.0),
            max_backoff=backoff_config.get("max", 60.0),
            backoff_decay=backoff_config.get("decay", 0.7),
        )

        tmp = output_path.with_suffix(".tmp")
        tmp.write_text(response, encoding="utf-8")
        os.replace(tmp, output_path)
        self.log.info("response written to %s (%d chars)", output_file, len(response))

        return self._make_result(response, output_file)

    def _make_result(self, content: str, output_file: str) -> ProcessorResult:
        data: dict[str, Any] = {"llm_output_file": output_file}

        classify_by = self.config.get("classify_by", "")
        if classify_by:
            import re

            match = re.search(classify_by, content[:200], re.IGNORECASE)
            if match:
                verdict_text = match.group(0).strip("[]").lower()
                data["classification"] = verdict_text

        return ProcessorResult.ok(
            artifacts={"llm_output": output_file},
            data=data,
        )

    def _render_prompt(
        self, prompt_ref: str, ctx: ProcessorContext, entry: ProcessorEntry
    ) -> str:
        template_path = self._resolve_template(prompt_ref)

        if template_path is not None:
            raw = template_path.read_text(encoding="utf-8")
            template_vars = self._build_template_vars(ctx, entry)

            env = jinja2.Environment(
                loader=jinja2.FileSystemLoader(str(template_path.parent)),
                undefined=jinja2.Undefined,
                autoescape=jinja2.select_autoescape(),
            )
            template = env.from_string(raw)
            return template.render(**template_vars)

        return prompt_ref
```
GenericLLM.validate() resolves prompt templates relative to both Path.cwd() and ctx.pipeline_dir, but _render_prompt() / _resolve_template() only resolve via Path.cwd() (and absolute paths). This means a template that exists relative to the pipeline directory can pass validation but be treated as a literal prompt string at runtime. Consider resolving relative refs against ctx.pipeline_dir as well (or pass ctx into _resolve_template).
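One way to make runtime resolution mirror `validate()` is to try both base directories in order; a sketch, with `pipeline_dir` passed in explicitly since the real `_resolve_template` signature isn't shown:

```python
from pathlib import Path
from typing import Optional


def resolve_template(prompt_ref: str, pipeline_dir: Path) -> Optional[Path]:
    """Resolve a template ref against cwd first, then the pipeline dir.

    Returns None when nothing exists, letting the caller fall back to
    treating prompt_ref as a literal prompt string.
    """
    ref = Path(prompt_ref)
    if ref.is_absolute():
        return ref if ref.exists() else None
    for base in (Path.cwd(), pipeline_dir):
        candidate = (base / prompt_ref).resolve()
        if candidate.exists():
            return candidate
    return None
```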
```python
    spec = importlib.util.spec_from_file_location(
        f"deepzero.custom.{file_path.stem}", file_path
    )
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load processor from {file_path}")

    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
```
_load_specific_class() executes the module without first registering it in sys.modules (unlike _load_processor_from_file()). This can lead to duplicate module instances on repeated loads and can break imports/state that rely on module identity. Insert the module into sys.modules[spec.name] before exec_module().
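The registration the comment asks for follows the pattern documented for `importlib`: insert the module into `sys.modules` before executing it, and roll back on failure. A self-contained sketch:

```python
import importlib.util
import sys
from pathlib import Path


def load_module_from_file(file_path: Path):
    """Load a module from a path, registering it in sys.modules first so
    repeated loads and intra-module imports see a single instance."""
    spec = importlib.util.spec_from_file_location(
        f"deepzero.custom.{file_path.stem}", file_path
    )
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load processor from {file_path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module  # register before exec_module()
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Don't leave a half-initialized module behind on failure.
        sys.modules.pop(spec.name, None)
        raise
    return module
```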
```python
    try:
        hist = history
    except NameError:
        hist = {}
```
This helper uses try/except NameError around variables that are always defined in this scope (e.g., history), which is dead code and makes the fixture harder to understand/maintain. Simplify by assigning directly (or parameterizing the fixture) and remove the unreachable exception paths.
|
@copilot apply changes based on the comments in this thread
… template resolution Agent-Logs-Url: https://github.com/416rehman/DeepZero/sessions/4d129d7b-9c62-4794-a11a-285adce9fb26 Co-authored-by: 416rehman <15690606+416rehman@users.noreply.github.com>
Applied all 5 review changes in commit
Agent-Logs-Url: https://github.com/416rehman/DeepZero/sessions/ff83a24c-e0b7-4091-9be5-080ebcff2236 Co-authored-by: 416rehman <15690606+416rehman@users.noreply.github.com>