ms4624/anomaly-detector

anomaly-detector

Telemetry anomaly detection portfolio repo.

Install

poetry install

Run (MVP)

Generate synthetic telemetry and ground truth:

poetry run python -m anomaly_detector.cli generate-data --out data.csv --truth truth.json --seconds 600 --seed 42

Stream replay with rolling z-score + hysteresis:

poetry run python -m anomaly_detector.cli run-stream \
  --data data.csv \
  --truth truth.json \
  --detector rolling_zscore \
  --trigger-threshold 3.5 \
  --clear-threshold 2.5 \
  --clear-patience 3 \
  --min-event-samples 3 \
  --merge-gap-samples 2 \
  --phase-aware \
  --out pred.json
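
The trigger/clear thresholds and clear patience in the command above suggest a hysteresis scheme: an event opens when the z-score crosses the higher threshold and closes only after the score stays below the lower threshold for several consecutive samples. The following is a minimal sketch of that idea, not the repository's implementation. The function name, the trailing-window baseline, the `window` default, and the choice to freeze the baseline during an event are all assumptions made for illustration; phase-aware handling, minimum event length, and gap merging are not modeled here.

```python
from collections import deque
from statistics import mean, pstdev


def rolling_zscore_events(values, window=20, trigger=3.5, clear=2.5, patience=3):
    """Return [start, end) sample-index intervals flagged by a rolling z-score
    with hysteresis. Hypothetical helper, for illustration only."""
    history = deque(maxlen=window)  # trailing baseline window
    events = []
    start = None  # index where the open event began, or None if no event is open
    calm = 0      # consecutive samples below the clear threshold

    for i, x in enumerate(values):
        # Score the sample against the baseline seen so far.
        if len(history) >= 2:
            mu = mean(history)
            sigma = pstdev(history)
            z = abs(x - mu) / sigma if sigma > 0 else 0.0
        else:
            z = 0.0  # not enough history to score yet

        if start is None:
            if z >= trigger:
                start, calm = i, 0  # open an event
        elif z < clear:
            calm += 1
            if calm >= patience:
                # Close the event; the exclusive end is the first calm sample.
                events.append((start, i - patience + 1))
                start = None
        else:
            calm = 0  # score rose again, so restart the patience counter

        # Assumption: freeze the baseline while an event is open so that
        # anomalous samples do not inflate the mean and standard deviation.
        if start is None:
            history.append(x)

    if start is not None:
        events.append((start, len(values)))  # event still open at end of stream
    return events


if __name__ == "__main__":
    signal = [0.0, 1.0] * 15 + [10.0] * 4 + [0.0, 1.0] * 5
    print(rolling_zscore_events(signal))
```

Using two thresholds instead of one keeps a score that hovers near the trigger level from opening and closing many short events in a row.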

Train Isolation Forest and save the model:

poetry run python -m anomaly_detector.cli train-if --data data.csv --model if_model.joblib --window-size 20 --contamination 0.01

Run Isolation Forest streaming:

poetry run python -m anomaly_detector.cli run-stream \
  --data data.csv \
  --truth truth.json \
  --detector isolation_forest \
  --model-path if_model.joblib \
  --if-threshold 0.0 \
  --out pred_if.json

Train LSTM predictor (next-step forecasting):

poetry run python -m anomaly_detector.cli train-lstm \
  --data data.csv \
  --model-path lstm_model.pt \
  --lookback 30 \
  --epochs 20 \
  --batch-size 64 \
  --lr 1e-3 \
  --features vibration,rpm,actuation,temp,current

Run LSTM streaming (prediction error scoring):

poetry run python -m anomaly_detector.cli run-stream \
  --data data.csv \
  --truth truth.json \
  --detector lstm \
  --model-path lstm_model.pt \
  --lstm-threshold 0.5 \
  --min-event-samples 3 \
  --merge-gap-samples 2 \
  --out pred_lstm.json
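
The `--min-event-samples` and `--merge-gap-samples` flags are not documented here. Going by their names, one plausible reading is that nearby events are merged when the gap between them is small, and events that are too short are dropped. The sketch below illustrates that reading only. The function name, the half-open `(start, end)` interval convention, and the order of operations (merge first, then filter) are assumptions, and the CLI may behave differently.

```python
def postprocess_events(events, min_event_samples=3, merge_gap_samples=2):
    """Merge events separated by a small gap, then drop short events.

    `events` is a list of half-open (start, end) sample-index intervals.
    Hypothetical helper, for illustration only.
    """
    merged = []
    for start, end in sorted(events):
        if merged and start - merged[-1][1] <= merge_gap_samples:
            # Gap to the previous event is small enough: extend that event.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))

    # Keep only events that span at least `min_event_samples` samples.
    return [(s, e) for s, e in merged if e - s >= min_event_samples]


if __name__ == "__main__":
    raw = [(10, 12), (13, 16), (40, 41)]
    print(postprocess_events(raw))
```

Merging before filtering means two short bursts that sit close together can survive as one event, whereas filtering first would discard both.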

Evaluate event metrics:

poetry run python -m anomaly_detector.cli evaluate --truth truth.json --pred pred.json

Optional comparison table after training Isolation Forest:

poetry run python -m anomaly_detector.cli train-if \
  --data data.csv \
  --model if_model.joblib \
  --window-size 20 \
  --contamination 0.01 \
  --truth truth.json \
  --compare

Run tests:

poetry run pytest

Event-based evaluation and alert fatigue

This repo uses interval IoU (intersection over union) to match predicted events to ground truth events. A prediction is a true positive if its IoU with a truth interval exceeds the threshold. Precision/recall/F1 are computed at the event level.
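
As a rough illustration of the matching described above, the sketch below computes interval IoU and event-level precision, recall, and F1. It is not the repository's evaluation code. The function names, the greedy one-to-one matching strategy, and the default `iou_threshold=0.5` are assumptions.

```python
def interval_iou(a, b):
    """IoU of two (start, end) intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def event_metrics(truth, pred, iou_threshold=0.5):
    """Event-level precision/recall/F1 using greedy one-to-one IoU matching.

    Hypothetical helper, for illustration only.
    """
    matched = set()  # indices of truth events that are already matched
    tp = 0
    for p in pred:
        best_j, best_iou = None, 0.0
        for j, t in enumerate(truth):
            if j in matched:
                continue
            iou = interval_iou(p, t)
            if iou > best_iou:
                best_j, best_iou = j, iou
        # A prediction counts as a true positive only if its best IoU
        # exceeds the threshold.
        if best_j is not None and best_iou > iou_threshold:
            matched.add(best_j)
            tp += 1

    fp = len(pred) - tp   # predictions with no matching truth event
    fn = len(truth) - tp  # truth events that no prediction matched
    precision = tp / (tp + fp) if pred else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    truth = [(10, 20), (50, 60)]
    pred = [(12, 20), (30, 35), (55, 70)]
    print(event_metrics(truth, pred))
```

In the usage example, the first prediction overlaps its truth event with IoU 0.8 and counts as a match. The third overlaps the second truth event with IoU 0.25, so it is scored as both a false positive and a missed event.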

Alert rate (alerts per minute) provides a simple proxy for alert fatigue: lower alert rates reduce operator overload but can lower recall. In practice, tune thresholds to balance detection coverage with operational noise.
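
For reference, alert rate can be computed from the number of predicted events and the replay duration. The helper below is a sketch with a hypothetical name, and it assumes each predicted event raises exactly one alert.

```python
def alert_rate_per_minute(events, duration_seconds):
    """Alerts per minute, assuming one alert per predicted event.

    Hypothetical helper, for illustration only.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(events) / (duration_seconds / 60.0)


if __name__ == "__main__":
    # Five predicted events over a 600-second replay.
    predicted = [(30, 34), (120, 131), (250, 262), (400, 405), (555, 570)]
    print(alert_rate_per_minute(predicted, 600))
```

Tracking this number next to event-level recall while sweeping thresholds makes the trade-off described above visible.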

LSTM predictor detector

The LSTM detector learns to predict the next timestep from a fixed lookback window. Anomaly scores are derived from the prediction error, which helps surface temporal dynamics and context-dependent deviations (e.g., drifting signals or coordinated changes). The trade-off is added model complexity and a need for training data that is representative of normal operation. Calibrate the threshold against the training error distribution (e.g., the 99th percentile of training errors) and watch for drift and out-of-distribution (OOD) conditions. If no LSTM threshold is provided, the CLI defaults to the stored training p99 value.
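
The percentile-based calibration mentioned above can be sketched without any deep learning dependencies, given a list of per-timestep prediction errors collected on training data. This is an illustration, not the repository's code. The function names, the use of mean squared error as the error measure, and the linear-interpolation percentile are assumptions.

```python
import math


def prediction_error(predicted, actual):
    """Mean squared error across features for one timestep (assumed metric)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def percentile(values, q):
    """Linearly interpolated percentile of `values`, with q in [0, 100]."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def calibrate_threshold(training_errors, q=99.0):
    """Pick the alert threshold as the q-th percentile of training errors."""
    return percentile(training_errors, q)


if __name__ == "__main__":
    # Placeholder errors standing in for errors measured on training data.
    training_errors = [i / 100 for i in range(101)]
    threshold = calibrate_threshold(training_errors)
    score = prediction_error([1.0, 2.0], [1.0, 4.0])
    print(threshold, score, score > threshold)
```

A threshold set this way flags roughly 1% of training-like samples by construction, so it is worth recalibrating when the operating regime shifts.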

About

End-to-end multivariate telemetry anomaly detection toolkit. Uses synthetic flight-test data, streaming replay alerts, event-based IoU evaluation, and baseline/ML/deep detectors (z-score, Isolation Forest, LSTM).
