feat(server): add mongodb command monitoring metrics #5437

Open
mihir-dixit2k27 wants to merge 6 commits into litmuschaos:master from mihir-dixit2k27:feat/mongo-observability

@mihir-dixit2k27
Contributor

What this PR does

This PR introduces MongoDB observability to the GraphQL server by implementing a Prometheus-based event.CommandMonitor. It enables real-time tracking of database latency, throughput, and concurrency without modifying repository or business logic.

Key Changes

New Telemetry Package

  • Added pkg/telemetry/mongo_monitor.go to intercept MongoDB Started, Succeeded, and Failed events.

New Metrics

  • litmus_mongo_command_duration_seconds (Histogram) — latency (5ms–10s buckets)
  • litmus_mongo_command_total (Counter) — total commands by status (success, failed)
  • litmus_mongo_in_flight_commands (Gauge) — concurrent operations

Integration

  • Injected monitor in pkg/database/mongodb/init.go
  • Exposed metrics via /metrics

Technical Details

  • Cardinality Protection: Commands are normalized. High-value commands (find, insert, update, delete, aggregate) are labeled explicitly; others are grouped as "other" to prevent metric explosion.
  • Safety: Uses sync.Once for safe metric registration. The gauge is decremented in both Succeeded and Failed handlers to prevent stuck metrics.
  • Performance: Relies on MongoDB driver event hooks with negligible overhead and does not wrap or alter repository logic.
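
The normalization and gauge-pairing ideas above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's implementation: `normalizeCommand`, `inFlight`, and its methods are hypothetical names, the map-backed counter stands in for a labeled Prometheus gauge, and the sketch assumes command names arrive in lowercase form for the five tracked commands.

```go
package main

import (
	"fmt"
	"sync"
)

// trackedCommands holds the command names that keep their own label value.
var trackedCommands = map[string]bool{
	"find": true, "insert": true, "update": true, "delete": true, "aggregate": true,
}

// normalizeCommand maps any command name onto a fixed label set of six values,
// so the number of time series cannot grow with the variety of commands.
func normalizeCommand(name string) string {
	if trackedCommands[name] {
		return name
	}
	return "other"
}

// inFlight is a stand-in for a labeled gauge (hypothetical type, not the
// Prometheus client API). It counts commands that started but have not finished.
type inFlight struct {
	mu     sync.Mutex
	counts map[string]int
}

func newInFlight() *inFlight { return &inFlight{counts: make(map[string]int)} }

func (g *inFlight) add(cmd string, delta int) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.counts[normalizeCommand(cmd)] += delta
}

// started increments; succeeded and failed BOTH decrement, so every
// terminal event releases the slot taken at start.
func (g *inFlight) started(cmd string)   { g.add(cmd, 1) }
func (g *inFlight) succeeded(cmd string) { g.add(cmd, -1) }
func (g *inFlight) failed(cmd string)    { g.add(cmd, -1) }

// value reads the current count for an already-normalized label.
func (g *inFlight) value(label string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.counts[label]
}

func main() {
	g := newInFlight()
	g.started("find")
	g.started("createIndexes") // grouped under "other"
	fmt.Println(g.value("find"), g.value("other"))

	g.succeeded("find")
	g.failed("createIndexes")
	fmt.Println(g.value("find"), g.value("other"))
}
```

If the failure path skipped the decrement, each failed command would leave the gauge permanently one too high, which is the "stuck metric" the Safety bullet refers to.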

Verification

Verified locally using MongoDB 4.4 (Docker).

Steps

go run server.go
curl http://localhost:8080/metrics | grep litmus_mongo

Run the curl command after triggering database operations via the UI or GraphQL queries.

Example Output

litmus_mongo_command_total{command="find",status="success"} 2
litmus_mongo_in_flight_commands{command="find"} 0

Note: Metric values vary based on traffic.

Impact

  • Additive change
  • No breaking changes
  • Negligible performance overhead

mihir-dixit2k27 and others added 5 commits February 15, 2026 21:32
Implements event.CommandMonitor to track database latency, throughput, and concurrency. Exposes Prometheus metrics (duration, total, in-flight) via the /metrics endpoint.

Signed-off-by: Mihir Dixit <dixitmihir1@gmail.com>
Contributor

Copilot AI left a comment

Pull request overview

Adds MongoDB command-level observability to the GraphQL server by introducing a Prometheus-backed MongoDB event.CommandMonitor, enabling visibility into command latency, throughput, and concurrency with minimal impact to existing repository/business logic.

Changes:

  • Added a new telemetry package that defines and registers Prometheus metrics and a MongoDB CommandMonitor.
  • Wired the command monitor into the MongoDB client initialization (SetMonitor(...)).
  • Exposed a Prometheus scrape endpoint at GET /metrics on the main Gin server.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| chaoscenter/graphql/server/server.go | Registers telemetry metrics at startup and exposes /metrics via promhttp.Handler(). |
| chaoscenter/graphql/server/pkg/telemetry/mongo_monitor.go | Implements MongoDB command monitoring + Prometheus histogram/counter/gauge with command normalization. |
| chaoscenter/graphql/server/pkg/database/mongodb/init.go | Enables MongoDB driver command monitoring by setting a client monitor. |
| chaoscenter/graphql/server/go.mod | Adds Prometheus client dependency (and related indirects). |
| chaoscenter/graphql/server/go.sum | Adds checksums for newly introduced Prometheus dependencies. |

Comment on lines 151 to 153
router.GET("/", handlers.PlaygroundHandler())
router.GET("/metrics", gin.WrapH(promhttp.Handler()))
router.Any("/query", authorization.Middleware(srv, mongodb.MgoClient))
@@ -91,7 +92,7 @@ func MongoConnection() (*mongo.Client, error) {
Password: dbPassword,
}

3 participants