Telemetry anomaly detection portfolio repo.
poetry install

Generate synthetic telemetry and ground truth:
poetry run python -m anomaly_detector.cli generate-data --out data.csv --truth truth.json --seconds 600 --seed 42

Stream replay with rolling z-score + hysteresis:
poetry run python -m anomaly_detector.cli run-stream \
--data data.csv \
--truth truth.json \
--detector rolling_zscore \
--trigger-threshold 3.5 \
--clear-threshold 2.5 \
--clear-patience 3 \
--min-event-samples 3 \
--merge-gap-samples 2 \
--phase-aware \
--out pred.json

Train Isolation Forest and save the model:
poetry run python -m anomaly_detector.cli train-if --data data.csv --model if_model.joblib --window-size 20 --contamination 0.01

Run Isolation Forest streaming:
poetry run python -m anomaly_detector.cli run-stream \
--data data.csv \
--truth truth.json \
--detector isolation_forest \
--model-path if_model.joblib \
--if-threshold 0.0 \
--out pred_if.json

Train LSTM predictor (next-step forecasting):
poetry run python -m anomaly_detector.cli train-lstm \
--data data.csv \
--model-path lstm_model.pt \
--lookback 30 \
--epochs 20 \
--batch-size 64 \
--lr 1e-3 \
--features vibration,rpm,actuation,temp,current

Run LSTM streaming (prediction error scoring):
poetry run python -m anomaly_detector.cli run-stream \
--data data.csv \
--truth truth.json \
--detector lstm \
--model-path lstm_model.pt \
--lstm-threshold 0.5 \
--min-event-samples 3 \
--merge-gap-samples 2 \
--out pred_lstm.json

Evaluate event metrics:
poetry run python -m anomaly_detector.cli evaluate --truth truth.json --pred pred.json

Optional comparison table after training Isolation Forest:
poetry run python -m anomaly_detector.cli train-if \
--data data.csv \
--model if_model.joblib \
--window-size 20 \
--contamination 0.01 \
--truth truth.json \
--compare

Run tests:
poetry run pytest

This repo uses interval IoU (intersection over union) to match predicted events to ground truth events. A prediction is a true positive if its IoU with a truth interval exceeds the threshold. Precision/recall/F1 are computed at the event level.
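
The matching rule above can be sketched as follows. This is a minimal illustration, not the repo's implementation: the function names, the greedy one-to-one matching strategy, and the default IoU threshold of 0.5 are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end) with end > start


def interval_iou(a: Interval, b: Interval) -> float:
    """Intersection over union of two 1-D intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def event_metrics(
    truth: List[Interval],
    pred: List[Interval],
    iou_threshold: float = 0.5,  # assumed default, not taken from the repo
) -> Dict[str, float]:
    """Event-level precision/recall/F1 with greedy one-to-one matching."""
    matched = set()  # indices of truth events already claimed by a prediction
    tp = 0
    for p in pred:
        best_iou, best_idx = 0.0, None
        for i, t in enumerate(truth):
            if i in matched:
                continue
            iou = interval_iou(p, t)
            if iou > best_iou:
                best_iou, best_idx = iou, i
        # True positive only if the best IoU strictly exceeds the threshold.
        if best_idx is not None and best_iou > iou_threshold:
            matched.add(best_idx)
            tp += 1
    fp = len(pred) - tp
    fn = len(truth) - tp
    precision = tp / (tp + fp) if pred else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}
```

Greedy matching processes predictions in order, so a different prediction order can produce a different pairing when events overlap heavily. An optimal assignment would avoid that at the cost of extra complexity.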
Alert rate (alerts per minute) provides a simple proxy for alert fatigue: lower alert rates reduce operator overload but can lower recall. In practice, tune thresholds to balance detection coverage with operational noise.
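
As a rough illustration of the alert-rate arithmetic, the sketch below divides an alert count by the replay duration in minutes. The function name is hypothetical, and it assumes one alert per predicted event, which may differ from how the CLI counts alerts.

```python
def alert_rate_per_minute(num_alerts: int, duration_seconds: float) -> float:
    """Alerts per minute over a replay of the given duration."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return num_alerts / (duration_seconds / 60.0)


# Example: 5 alerts (a made-up count) over the 600-second dataset
# generated with --seconds 600.
rate = alert_rate_per_minute(5, 600)
```

Comparing this value across threshold settings alongside event-level recall gives a quick view of the coverage versus noise trade-off.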
The LSTM detector learns to predict the next timestep from a fixed lookback window. Anomaly scores are derived from the prediction error, which helps surface temporal dynamics and context-dependent deviations (e.g., drifting signals or coordinated changes). This adds model complexity and requires training data representative of normal operation, so calibrate thresholds (e.g., the 99th percentile of training errors) and watch for drift and out-of-distribution (OOD) conditions. If no LSTM threshold is provided, the CLI defaults to the stored training p99 value.
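
The threshold calibration described above can be sketched in plain Python. This is an illustration under stated assumptions: the error metric (mean squared error across features), the linear-interpolation percentile, and all function names are chosen for the example and may not match what the repo stores at training time.

```python
import math
from typing import List, Sequence


def prediction_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean squared error across features for one timestep (assumed metric)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0 to 100), interpolating linearly between order statistics."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def flag_anomalies(errors: Sequence[float], threshold: float) -> List[bool]:
    """Mark timesteps whose prediction error exceeds the threshold."""
    return [e > threshold for e in errors]


# Calibrate on errors from normal-operation training data (synthetic here),
# then apply the p99 threshold to new prediction errors.
train_errors = [i / 100 for i in range(101)]
threshold = percentile(train_errors, 99)
flags = flag_anomalies([0.5, 1.2], threshold)
```

A p99 threshold implies that roughly 1% of normal training samples would be flagged, so the minimum event length and merge gap settings matter for keeping the alert rate manageable.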