Skip to content

[Medium] CSV run-mode rows print uninitialized event_number and thread_id #128

@zhaozhiwen

Description

@zhaozhiwen

one-line summary
In run mode the CSV factory writes event_number and thread_id that were never set, because run publishing never calls startEvent/publishEventHeader; both ints are default-constructed without an initializer and are therefore indeterminate.

Affected files: gemc/gstreamer/factories/CSV/run/publishDigitized.cc:52, gemc/gstreamer/factories/CSV/gstreamerCSVFactory.h:265,271 (member declarations)

Details: The run-mode row writes these members directly:

// run/publishDigitized.cc:52
ofile_digitized << event_number << ", " << timestamp << ", " << thread_id << ", " << detectorName << ", ";

But the members are declared without initializers:

// gstreamerCSVFactory.h
/// \brief Cached event number copied at the start of the most recent event publish cycle.
int event_number;          // :265
...
/// \brief Cached thread id copied from the most recent event header.
int thread_id;             // :271

event_number is only assigned in the event-mode startEvent path and thread_id only in publishEventHeaderImpl. The run-mode path (publishRunDigitizedDataImpl) never goes through either, so both ints hold indeterminate values — reading them is undefined behavior and emits garbage into the CSV. timestamp is a std::string (default-constructed empty), so it prints as an empty field, which is benign.

Impact: Every row of a run-mode CSV digitized output prints garbage for the evn and thread_id columns. Triggered by any csv gstreamer output configured with run-level type.

Proposed fix: Default-initialize the members so the run-mode columns are deterministic (e.g. -1, matching the JSON factory's fallback convention):

 	/// \brief Cached event number copied at the start of the most recent event publish cycle.
-	int event_number;
+	int event_number = -1;
 
 	/// \brief Cached run number copied at the start of the most recent run publish cycle.
-	int runId;
+	int runId = -1;
 
 	/// \brief Cached thread id copied from the most recent event header.
-	int thread_id;
+	int thread_id = -1;

(runId is included for the same reason — it is the run-mode analogue and is read by the run header path; initializing it is a one-line consistency fix. timestamp already defaults to empty, so it needs no change.)

Note: this makes run-mode rows show -1 for evn/thread_id, which are not meaningful at run granularity but are at least well-defined and machine-parseable. If a real run identifier is desired in those columns, the run publish path should instead set event_number/thread_id from the run header before writing rows — but that is a schema decision; the minimal correctness fix is the initializers above.


Split out from #102 (one issue per bug). Found via an AI-assisted code review with manual verification against commit 5f8ce875.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    Planned

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions