feat: add get_config support to device-discovery by leoparente · Pull Request #267 · netboxlabs/orb-discovery

leoparente · 2026-01-30T13:16:58Z

This pull request introduces support for translating and ingesting device configuration data (such as startup, running, and candidate configs) from NAPALM into the Diode SDK, while maintaining compatibility with environments where the DeviceConfig protobuf message is not yet available. The changes are structured to gracefully handle the absence of this feature in the SDK and to ensure robust error handling during data collection.

Device configuration ingestion and translation:

Added logic in runner.py to collect device configuration data with error handling, ensuring that failures to retrieve config do not interrupt the data collection process.
Introduced a new DeviceConfig wrapper and related translation functions in translate.py to convert NAPALM config data into the Diode SDK DeviceConfig protobuf, including safe handling when the SDK does not yet support this message.
Updated the translate_device and translate_data functions to pass configuration data through the translation pipeline and attach it to the resulting device entity when supported.
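
The collection and translation steps listed above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: `load_optional`, `collect_config`, and `translate_config` are hypothetical helpers, the `startup`/`running` keyword arguments of the config class are assumed, and the real SDK import path is not shown. The only external call relied on is NAPALM's `get_config()`, which returns a dict with `startup`, `running`, and `candidate` keys.

```python
from __future__ import annotations

import importlib
import logging
from typing import Any

logger = logging.getLogger(__name__)


def load_optional(module_name: str, attr: str) -> Any | None:
    """Return module_name.attr, or None if the module or attribute is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


def collect_config(driver: Any) -> dict[str, str] | None:
    """Fetch config from a NAPALM-style driver without letting a failure propagate."""
    try:
        return driver.get_config()
    except Exception as exc:  # broad on purpose: config is optional data
        logger.warning("get_config failed, continuing without config: %s", exc)
        return None


def translate_config(config: dict[str, str] | None, config_cls: Any | None) -> Any | None:
    """Build a config message, or return None when unsupported or empty.

    config_cls stands in for the SDK's DeviceConfig; its keyword
    arguments here are an assumption made for this sketch.
    """
    if config_cls is None or not config:
        return None
    startup = config.get("startup") or ""
    running = config.get("running") or ""
    if not (startup or running):
        return None
    return config_cls(
        startup=startup.encode("utf-8"),
        running=running.encode("utf-8"),
    )
```

With this shape, a device whose driver raises in `get_config()` still produces a device entity, just without an attached config, and an SDK that lacks the message class behaves the same way.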

Type hinting and imports:

Added missing imports and type hints to support new functionality and improve code clarity.

github-actions · 2026-01-30T13:17:41Z

Coverage Report

File	Stmts	Miss	Cover	Missing
device_discovery
client.py	55	9	84%	140–158
discovery.py	63	2	97%	142–145
interface.py	157	3	98%	140–144
main.py	49	2	96%	181, 187
metrics.py	53	1	98%	114
server.py	88	10	89%	44–46, 72–87, 184, 187
translate.py	128	30	77%	27–31, 41–59, 138, 140, 165, 235–263
version.py	7	1	86%	14
device_discovery/policy
manager.py	61	3	95%	37–38, 161
portscan.py	84	11	87%	34–35, 59–60, 64, 72–76, 117
run.py	83	2	98%	150, 182
runner.py	189	36	81%	211–212, 217–218, 245, 426–511
TOTAL	1122	110	90%

Tests	Skipped	Failures	Errors	Time
178	0 💤	0 ❌	0 🔥	7.838s ⏱️

mfiedorowicz

Argus Code Review Summary

🟡 Medium: 6
🔵 Low: 6

🔍 Automated review by Argus on behalf of @mfiedorowicz

mfiedorowicz · 2026-02-14T20:39:02Z

device-discovery/device_discovery/client.py

+            chunk_num = 1
+            size_bytes = estimate_message_size(entities_list)
+
+            if size_bytes > (3.0 * 1024 * 1024):  # 3MB threshold


🟡 Medium

The 3MB threshold is hardcoded as a magic number. More importantly, estimate_message_size and create_message_chunks may use different internal thresholds for chunking, leading to inconsistency. Consider using a named constant and verifying alignment with the SDK's chunking logic.

Suggested change

-            if size_bytes > (3.0 * 1024 * 1024):  # 3MB threshold
+            MAX_MESSAGE_SIZE_BYTES = 3 * 1024 * 1024  # 3MB threshold
+            if size_bytes > MAX_MESSAGE_SIZE_BYTES:

mfiedorowicz · 2026-02-14T20:39:02Z

device-discovery/device_discovery/client.py

+                            f"ERROR ingestion failed for {hostname} chunk {i}/{chunk_num}: "
+                            f"{response.errors}"
+                        )
+                        return  # Stop on first error


🟡 Medium

On chunk ingestion error, the method logs the failure and returns None. The caller has no way to know that ingestion failed: earlier chunks may already have been ingested, yet no error is raised or returned. The non-chunked path has the same limitation, since it also only logs the error. Consider raising an exception or returning a status so callers can handle failures.
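
One way to surface the failure, sketched under assumptions: `ChunkIngestionError` and `ingest_chunks` are hypothetical names, and `send` stands in for whatever performs the actual ingest call and is assumed to return a list of error strings (empty on success). The exception records which chunk failed so the caller can tell that earlier chunks were already ingested.

```python
from typing import Any, Callable, Sequence


class ChunkIngestionError(RuntimeError):
    """Raised when a chunk fails; records how far ingestion got."""

    def __init__(
        self,
        hostname: str,
        failed_chunk: int,
        total_chunks: int,
        errors: Sequence[str],
    ) -> None:
        self.hostname = hostname
        self.failed_chunk = failed_chunk
        self.total_chunks = total_chunks
        self.errors = list(errors)
        super().__init__(
            f"ingestion failed for {hostname} "
            f"chunk {failed_chunk}/{total_chunks}: {self.errors}"
        )

    @property
    def chunks_ingested(self) -> int:
        """Number of chunks that succeeded before the failure."""
        return self.failed_chunk - 1


def ingest_chunks(
    hostname: str,
    chunks: Sequence[Any],
    send: Callable[[Any], Sequence[str]],
) -> int:
    """Send chunks in order; raise on the first failure, else return the count sent."""
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        errors = send(chunk)
        if errors:
            raise ChunkIngestionError(hostname, i, total, errors)
    return total
```

A caller can then catch `ChunkIngestionError`, log it, and decide whether to retry or mark the run as failed, instead of treating a `None` return as success.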

mfiedorowicz · 2026-02-14T20:39:02Z

device-discovery/device_discovery/client.py

-            logger.info(f"Hostname {hostname}: Successful ingestion")
+
+            # Convert to list for size estimation and chunking
+            entities_list = list(translated_entities)


🔵 Low

Calling list(translated_entities) materializes the entire generator into memory. If translate_data returns a very large dataset (which is likely given the need for chunking at 3MB+), this doubles memory usage — once for the list, then again for the chunks. Consider whether the SDK's create_message_chunks could accept an iterator, or whether size estimation could be done differently.
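
If the SDK helpers cannot take an iterator, a size-bounded chunker over the generator is one alternative. The sketch below is hypothetical and does not use the SDK: `iter_chunks` and `MAX_CHUNK_BYTES` are illustrative names, and `size_of` is a caller-supplied function that estimates the size of one entity.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")

MAX_CHUNK_BYTES = 3 * 1024 * 1024  # mirrors the 3MB threshold in the diff


def iter_chunks(
    items: Iterable[T],
    size_of: Callable[[T], int],
    max_bytes: int = MAX_CHUNK_BYTES,
) -> Iterator[list[T]]:
    """Yield lists of items whose estimated total size stays within max_bytes.

    Only one chunk is held in memory at a time. An item larger than
    max_bytes is emitted alone in its own chunk.
    """
    chunk: list[T] = []
    chunk_bytes = 0
    for item in items:
        item_bytes = size_of(item)
        # Flush the current chunk before it would exceed the limit.
        if chunk and chunk_bytes + item_bytes > max_bytes:
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(item)
        chunk_bytes += item_bytes
    if chunk:
        yield chunk
```

The trade-off is that the total number of chunks is not known up front, so a log message of the form `chunk {i}/{chunk_num}` would have to drop the total or count lazily.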

mfiedorowicz · 2026-02-14T20:39:02Z

device-discovery/device_discovery/client.py

@@ -119,15 +125,50 @@ def ingest(self, metadata: dict[str, Any] | None, data: dict):
        with self._lock:


🔵 Low

The entire ingest operation (including network calls to self.diode_client.ingest for potentially multiple chunks) is performed while holding self._lock. This means all other operations (including init_client and concurrent ingest calls) are blocked for the duration of potentially slow network I/O. Consider whether finer-grained locking is appropriate — e.g., only lock around shared state access, not the network calls.
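
A minimal sketch of the finer-grained variant, assuming the underlying client object is itself safe to call from several threads (this is not confirmed for the Diode client). `LockScopedClient` and its `transport` with a `send` method are hypothetical stand-ins for the real class and `self.diode_client`.

```python
import threading
from typing import Any


class LockScopedClient:
    """Holds the lock only while reading or replacing shared state."""

    def __init__(self, transport: Any) -> None:
        self._lock = threading.Lock()
        self._transport = transport  # shared state, replaced by init_client

    def init_client(self, transport: Any) -> None:
        with self._lock:
            self._transport = transport

    def ingest(self, payload: Any) -> Any:
        # Take a reference to the shared transport under the lock...
        with self._lock:
            transport = self._transport
        # ...then do the slow network call outside it, so init_client and
        # other ingest calls are not blocked for the duration of the I/O.
        return transport.send(payload)
```

If chunk ordering or per-host exclusivity matters, a separate lock around the send loop would still be needed; this sketch only removes the I/O from the state lock.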

mfiedorowicz · 2026-02-14T20:39:02Z

device-discovery/device_discovery/translate.py

+            running = running.encode("utf-8")
+
+    # Skip if no actual config data present
+    if not any([startup, running]):


🟡 Medium

Using any() with a list literal is unnecessary and slightly less efficient. any([startup, running]) creates a list before evaluating; use any((startup, running)) or simply startup or running instead.

Suggested change

-    if not any([startup, running]):
+    if not (startup or running):
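
A quick check that the two forms skip the same inputs, covering `None`, empty and non-empty `str`, and empty and non-empty `bytes` (the diff encodes `running` to bytes before this test). The function names are illustrative only.

```python
from itertools import product


def skip_with_any(startup, running) -> bool:
    """Original form: builds a list, then tests truthiness of each element."""
    return not any([startup, running])


def skip_with_or(startup, running) -> bool:
    """Suggested form: short-circuits and allocates nothing."""
    return not (startup or running)


candidates = [None, "", b"", "hostname r1", b"hostname r1"]

# True if both forms agree for every combination of candidate values.
all_agree = all(
    skip_with_any(s, r) == skip_with_or(s, r)
    for s, r in product(candidates, repeat=2)
)
```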

mfiedorowicz · 2026-02-14T20:39:02Z