Grafana Queries - Complete Guide with Configuration

This guide provides ready-to-use PromQL queries for all Enhanced HTTP Metrics, including panel configuration details (legend, min step, units).

1. Traffic Overview Metrics

1.1 Total Requests Per Second (RPS)

Query:

sum(rate(rr_http_requests_by_endpoint_total[5m]))

Configuration:

Legend: Total RPS
Min Step: 15s
Unit: requests/sec (reqps)
Panel Type: Graph
Description: Overall request rate across all endpoints
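
As a rough illustration of what rate(...[5m]) computes, the sketch below derives a per-second rate from raw counter samples and accounts for counter resets. It is a simplification: Prometheus also extrapolates to the window boundaries, which is omitted here. The function name and the sample data are hypothetical.

```python
from typing import List, Tuple


def simple_rate(samples: List[Tuple[float, float]]) -> float:
    """Approximate per-second increase of a counter over the sampled window.

    samples: (unix_timestamp_seconds, counter_value) pairs, oldest first.
    Simplification: no extrapolation to the window boundaries.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the counter was reset (e.g. process restart);
        # count the new value as the increase since the reset.
        increase += curr - prev if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be strictly increasing")
    return increase / elapsed


# Hypothetical scrape data: three samples taken 15 seconds apart.
print(simple_rate([(0, 100), (15, 250), (30, 400)]))
```

A longer range such as [5m] smooths short spikes. Shorter ranges react faster but need at least two scrapes inside the window.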

1.2 RPS by Endpoint

Query:

topk(10, sum by (endpoint) (rate(rr_http_requests_by_endpoint_total[5m])))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: requests/sec (reqps)
Panel Type: Graph or Bar gauge
Description: Top 10 endpoints by request rate

1.3 RPS by HTTP Method

Query:

sum by (method) (rate(rr_http_requests_by_endpoint_total[5m]))

Configuration:

Legend: {{method}}
Min Step: 15s
Unit: requests/sec (reqps)
Panel Type: Graph or Pie chart
Description: Request distribution by HTTP method (GET, POST, etc.)

1.4 RPS by Status Code

Query:

sum by (status) (rate(rr_http_requests_by_endpoint_total[5m]))

Configuration:

Legend: Status {{status}}
Min Step: 15s
Unit: requests/sec (reqps)
Panel Type: Graph (stacked area)
Description: Request rate grouped by status code

1.5 Success Rate (2xx responses)

Query:

sum(rate(rr_http_requests_by_endpoint_total{status=~"2.."}[5m])) / sum(rate(rr_http_requests_by_endpoint_total[5m])) * 100

Configuration:

Legend: Success Rate
Min Step: 15s
Unit: percent (0-100)
Panel Type: Graph or Gauge
Thresholds: Red < 95%, Yellow 95-99%, Green > 99%
Description: Percentage of successful (2xx) requests
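
The same calculation can be sketched in Python to show how the status matcher behaves. PromQL regex matchers are fully anchored, so "2.." matches exactly three characters starting with 2; re.fullmatch mirrors that here. The function name and the input rates are hypothetical.

```python
import math
import re
from typing import Dict


def success_rate_percent(rates_by_status: Dict[str, float]) -> float:
    """Share of 2xx traffic in percent, given per-status request rates."""
    total = sum(rates_by_status.values())
    if total == 0:
        # No traffic: the PromQL division would yield NaN as well.
        return math.nan
    ok = sum(
        rate
        for status, rate in rates_by_status.items()
        # fullmatch mirrors PromQL's anchored regex semantics.
        if re.fullmatch(r"2..", status)
    )
    return ok / total * 100


# Hypothetical per-status rates in requests/sec.
print(success_rate_percent({"200": 90.0, "201": 5.0, "404": 3.0, "500": 2.0}))
```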

1.6 Current Queue Size

Query:

rr_http_requests_queue

Configuration:

Legend: Queue Size
Min Step: 5s
Unit: short
Panel Type: Graph
Description: Number of requests currently waiting in queue

2. Performance Metrics

2.1 Average Request Duration

Query:

sum(rate(rr_http_request_duration_seconds_sum[5m])) / sum(rate(rr_http_request_duration_seconds_count[5m]))

Configuration:

Legend: Avg Duration
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph
Description: Mean request duration across all requests
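
A mean across all requests is total duration divided by total count. When the metric has several series (for example one per endpoint), summing before dividing weights each series by its traffic, whereas averaging per-series means does not. The sketch below shows the difference with hypothetical numbers; the function names are illustrative only.

```python
from typing import List, Tuple

# Each entry: (rate of _sum in seconds/sec, rate of _count in requests/sec)
Series = List[Tuple[float, float]]


def overall_mean(series: Series) -> float:
    """Request-weighted mean: sum of sums divided by sum of counts."""
    total_sum = sum(s for s, _ in series)
    total_count = sum(c for _, c in series)
    return total_sum / total_count if total_count else float("nan")


def mean_of_means(series: Series) -> float:
    """Unweighted mean of per-series averages (usually not what is wanted)."""
    means = [s / c for s, c in series if c]
    return sum(means) / len(means) if means else float("nan")


# Hypothetical: a busy fast endpoint and a quiet slow one.
data = [(9.0, 90.0), (5.0, 10.0)]
print(overall_mean(data), mean_of_means(data))
```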

2.2 Request Duration by Endpoint

Query:

sum by (endpoint) (rate(rr_http_request_duration_seconds_sum[5m])) / sum by (endpoint) (rate(rr_http_request_duration_seconds_count[5m]))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph
Description: Average duration per endpoint

2.3 P50 Latency (Median)

Query:

histogram_quantile(0.50, sum by (le) (rate(rr_http_request_duration_seconds_bucket[5m])))

Configuration:

Legend: P50 (Median)
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph
Description: 50th percentile latency (median response time)

2.4 P95 Latency

Query:

histogram_quantile(0.95, sum by (le) (rate(rr_http_request_duration_seconds_bucket[5m])))

Configuration:

Legend: P95
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph
Thresholds: Green < 1s, Yellow 1-5s, Red > 5s
Description: 95th percentile latency (95% of requests complete faster)

2.5 P99 Latency

Query:

histogram_quantile(0.99, sum by (le) (rate(rr_http_request_duration_seconds_bucket[5m])))

Configuration:

Legend: P99
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph
Description: 99th percentile latency (99% of requests complete faster)
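
The percentile queries above rely on histogram_quantile, which estimates a quantile by linear interpolation inside the bucket that contains the target rank. The following is a minimal sketch of that idea for classic histograms, not the Prometheus implementation itself; the bucket boundaries and counts are hypothetical.

```python
import math
from typing import List, Tuple


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative buckets.

    buckets: (upper bound "le", cumulative count or rate), sorted by bound,
    with the last bound being +Inf.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be within [0, 1]")
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least two buckets, the last one +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_le, prev_count = 0.0, 0.0
    for le, count in buckets:
        if count >= rank:
            if math.isinf(le) or count == prev_count:
                # Rank lies beyond the last finite bound (or the bucket is
                # empty): the best available answer is the previous bound.
                return prev_le
            # Linear interpolation within the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_le + (le - prev_le) * fraction
        prev_le, prev_count = le, count
    return prev_le


# Hypothetical cumulative buckets (seconds).
buckets = [(0.1, 50.0), (0.5, 90.0), (1.0, 99.0), (math.inf, 100.0)]
for q in (0.50, 0.95, 0.99):
    print(q, histogram_quantile(q, buckets))
```

Because of the interpolation, accuracy depends on the bucket layout: a P99 that falls in a wide bucket is only an estimate, and a quantile that falls in the +Inf bucket is capped at the highest finite bound.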

2.6 P95 Latency by Endpoint

Query:

histogram_quantile(0.95, sum by (endpoint, le) (rate(rr_http_request_duration_seconds_bucket[5m])))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph or Table
Description: 95th percentile per endpoint

2.7 Queue Time (Average)

Query:

sum(rate(rr_http_queue_time_seconds_sum[5m])) / sum(rate(rr_http_queue_time_seconds_count[5m]))

Configuration:

Legend: Queue Time
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph
Description: Average time requests spend waiting in queue

2.8 Queue Time P95

Query:

histogram_quantile(0.95, sum by (le) (rate(rr_http_queue_time_seconds_bucket[5m])))

Configuration:

Legend: Queue Time P95
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph
Thresholds: Green < 0.1s, Yellow 0.1-0.5s, Red > 0.5s
Description: 95th percentile queue wait time

2.9 Processing Time (Average)

Query:

sum(rate(rr_http_processing_time_seconds_sum[5m])) / sum(rate(rr_http_processing_time_seconds_count[5m]))

Configuration:

Legend: Processing Time
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph
Description: Average PHP worker processing time

2.10 Processing Time P95

Query:

histogram_quantile(0.95, sum by (le) (rate(rr_http_processing_time_seconds_bucket[5m])))

Configuration:

Legend: Processing Time P95
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph
Description: 95th percentile processing time

2.11 Performance Breakdown (Stacked)

Query 1 (Queue Time):

sum(rate(rr_http_queue_time_seconds_sum[5m])) / sum(rate(rr_http_queue_time_seconds_count[5m]))

Query 2 (Processing Time):

sum(rate(rr_http_processing_time_seconds_sum[5m])) / sum(rate(rr_http_processing_time_seconds_count[5m]))

Configuration:

Legend:
- Query 1: Queue Time
- Query 2: Processing Time
Min Step: 15s
Unit: seconds (s)
Panel Type: Graph (stacked area)
Description: Visual breakdown of where time is spent

2.12 Queue vs Processing Time Ratio

Query:

(sum(rate(rr_http_queue_time_seconds_sum[5m])) / sum(rate(rr_http_queue_time_seconds_count[5m]))) / (sum(rate(rr_http_request_duration_seconds_sum[5m])) / sum(rate(rr_http_request_duration_seconds_count[5m]))) * 100

Configuration:

Legend: Queue Time %
Min Step: 15s
Unit: percent (0-100)
Panel Type: Graph
Description: Percentage of total time spent waiting in queue

3. Endpoint Analysis

3.1 Top 10 Slowest Endpoints (by P95)

Query:

topk(10, histogram_quantile(0.95, sum by (endpoint, le) (rate(rr_http_duration_by_endpoint_seconds_bucket[15m]))))

Configuration:

Legend: {{endpoint}}
Min Step: 30s
Unit: seconds (s)
Panel Type: Bar gauge (horizontal) or Table
Description: Slowest endpoints ranked by 95th percentile

3.2 Top 10 Slowest Endpoints (by Average)

Query:

topk(10, sum by (endpoint) (rate(rr_http_duration_by_endpoint_seconds_sum[5m])) / sum by (endpoint) (rate(rr_http_duration_by_endpoint_seconds_count[5m])))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: seconds (s)
Panel Type: Bar gauge or Table
Description: Slowest endpoints by mean duration

3.3 Top 10 Most Trafficked Endpoints

Query:

topk(10, sum by (endpoint) (rate(rr_http_requests_by_endpoint_total[5m])))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: requests/sec (reqps)
Panel Type: Bar gauge or Table
Description: Endpoints with highest request rate

3.4 Endpoint Performance Table

Query 1 (RPS):

sum by (endpoint) (rate(rr_http_requests_by_endpoint_total[5m]))

Query 2 (Avg Duration):

sum by (endpoint) (rate(rr_http_duration_by_endpoint_seconds_sum[5m])) / sum by (endpoint) (rate(rr_http_duration_by_endpoint_seconds_count[5m]))

Query 3 (P95 Duration):

histogram_quantile(0.95, sum by (endpoint, le) (rate(rr_http_duration_by_endpoint_seconds_bucket[5m])))

Query 4 (Error %):

sum by (endpoint) (rate(rr_http_errors_total[5m])) / sum by (endpoint) (rate(rr_http_requests_by_endpoint_total[5m])) * 100

Configuration:

Legend: N/A (Table columns)
Min Step: 15s
Unit:
- Query 1: requests/sec (reqps)
- Query 2: seconds (s)
- Query 3: seconds (s)
- Query 4: percent (0-100)
Panel Type: Table
Column Names: Endpoint, RPS, Avg Duration, P95 Duration, Error Rate %
Description: Comprehensive endpoint performance overview
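
The table combines four query results keyed by the endpoint label. The sketch below shows the join in plain Python, assuming each query result has already been reduced to an endpoint-to-value mapping; the function name and sample values are hypothetical. An endpoint may be missing from a query result (for example Query 4, if it has no error series), so missing cells are left as None.

```python
from typing import Dict, List


def build_endpoint_rows(
    rps: Dict[str, float],
    avg_duration: Dict[str, float],
    p95_duration: Dict[str, float],
    error_pct: Dict[str, float],
) -> List[dict]:
    """Join per-endpoint query results into table rows (outer join)."""
    endpoints = sorted(set(rps) | set(avg_duration) | set(p95_duration) | set(error_pct))
    return [
        {
            "Endpoint": endpoint,
            "RPS": rps.get(endpoint),
            "Avg Duration": avg_duration.get(endpoint),
            "P95 Duration": p95_duration.get(endpoint),
            # None means the query returned no series for this endpoint.
            "Error Rate %": error_pct.get(endpoint),
        }
        for endpoint in endpoints
    ]


# Hypothetical query results.
rows = build_endpoint_rows(
    rps={"/api/users": 12.0, "/health": 1.0},
    avg_duration={"/api/users": 0.2, "/health": 0.01},
    p95_duration={"/api/users": 0.8, "/health": 0.02},
    error_pct={"/api/users": 2.5},
)
for row in rows:
    print(row)
```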

3.5 Endpoint Duration Heatmap

Query:

sum by (le, endpoint) (rate(rr_http_duration_by_endpoint_seconds_bucket[5m]))

Configuration:

Legend: {{endpoint}}
Min Step: 30s
Unit: N/A (heatmap)
Panel Type: Heatmap
Description: Visual distribution of response times across endpoints

4. Error Tracking

4.1 Total Error Rate

Query:

sum(rate(rr_http_errors_total[5m]))

Configuration:

Legend: Errors/sec
Min Step: 15s
Unit: errors/sec
Panel Type: Graph
Description: Total errors per second (all types)

4.2 Error Rate Percentage

Query:

sum(rate(rr_http_errors_total[5m])) / sum(rate(rr_http_requests_by_endpoint_total[5m])) * 100

Configuration:

Legend: Error Rate
Min Step: 15s
Unit: percent (0-100)
Panel Type: Graph or Gauge
Thresholds: Green < 1%, Yellow 1-5%, Red > 5%
Description: Percentage of requests that result in errors

4.3 Error Rate by Type

Query:

sum by (type) (rate(rr_http_errors_total[5m]))

Configuration:

Legend: {{type}}
Min Step: 15s
Unit: errors/sec
Panel Type: Graph (stacked) or Pie chart
Description: Errors grouped by classification (client_error, server_error, timeout, no_workers)

4.4 Error Rate by Status Code

Query:

sum by (status) (rate(rr_http_errors_total[5m]))

Configuration:

Legend: HTTP {{status}}
Min Step: 15s
Unit: errors/sec
Panel Type: Graph or Table
Description: Errors grouped by specific HTTP status code

4.5 Error Rate by Endpoint

Query:

topk(10, sum by (endpoint) (rate(rr_http_errors_total[5m])))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: errors/sec
Panel Type: Bar gauge or Table
Description: Endpoints with highest error rate

4.6 Most Error-Prone Endpoints (Percentage)

Query:

topk(10, sum by (endpoint) (rate(rr_http_errors_total[5m])) / sum by (endpoint) (rate(rr_http_requests_by_endpoint_total[5m])) * 100)

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: percent (0-100)
Panel Type: Bar gauge or Table
Description: Endpoints with highest error percentage

4.7 4xx vs 5xx Errors

Query 1 (Client Errors):

sum(rate(rr_http_errors_total{status=~"4.."}[5m]))

Query 2 (Server Errors):

sum(rate(rr_http_errors_total{status=~"5.."}[5m]))

Configuration:

Legend:
- Query 1: 4xx (Client Errors)
- Query 2: 5xx (Server Errors)
Min Step: 15s
Unit: errors/sec
Panel Type: Graph (stacked area)
Description: Comparison of client vs server errors

4.8 Error Rate by Endpoint and Type (Heatmap)

Query:

sum by (endpoint, type) (rate(rr_http_errors_total[5m]))

Configuration:

Legend: N/A (heatmap)
Min Step: 30s
Unit: errors/sec
Panel Type: Heatmap
Description: Visual correlation between endpoints and error types

4.9 No Free Workers Errors

Query:

sum(rate(rr_http_no_free_workers_total[5m]))

Configuration:

Legend: No Workers Available
Min Step: 15s
Unit: errors/sec
Panel Type: Graph
Thresholds: Any value > 0 is critical
Description: Rate of requests rejected due to worker pool exhaustion

4.10 Timeout Errors

Query:

sum(rate(rr_http_errors_total{type="timeout"}[5m]))

Configuration:

Legend: Timeouts
Min Step: 15s
Unit: errors/sec
Panel Type: Graph
Description: Rate of timeout errors (408, 504)

4.11 Client Errors (4xx)

Query:

sum(rate(rr_http_errors_total{type="client_error"}[5m]))

Configuration:

Legend: Client Errors (4xx)
Min Step: 15s
Unit: errors/sec
Panel Type: Graph
Description: Rate of client-side errors

4.12 Server Errors (5xx)

Query:

sum(rate(rr_http_errors_total{type="server_error"}[5m]))

Configuration:

Legend: Server Errors (5xx)
Min Step: 15s
Unit: errors/sec
Panel Type: Graph
Description: Rate of server-side errors

4.13 Specific Status Code Rates

Query (404 Not Found):

sum(rate(rr_http_requests_by_endpoint_total{status="404"}[5m]))

Query (500 Internal Server Error):

sum(rate(rr_http_requests_by_endpoint_total{status="500"}[5m]))

Query (503 Service Unavailable):

sum(rate(rr_http_requests_by_endpoint_total{status="503"}[5m]))

Configuration:

Legend: HTTP {{status}}
Min Step: 15s
Unit: errors/sec
Panel Type: Graph
Description: Track specific problematic status codes

5. Worker Pool Health

5.1 Active Workers (Current)

Query:

rr_http_active_workers

Configuration:

Legend: Active Workers
Min Step: 5s
Unit: short
Panel Type: Graph or Stat
Description: Number of workers currently processing requests

5.2 Idle Workers (Current)

Query:

rr_http_idle_workers

Configuration:

Legend: Idle Workers
Min Step: 5s
Unit: short
Panel Type: Graph or Stat
Description: Number of workers available and waiting

5.3 Worker Utilization Percentage

Query:

rr_http_worker_utilization_percent

Configuration:

Legend: Utilization
Min Step: 5s
Unit: percent (0-100)
Panel Type: Gauge or Graph
Thresholds: Green < 70%, Yellow 70-90%, Red > 90%
Description: Worker pool utilization percentage

5.4 Total Workers

Query:

rr_http_active_workers + rr_http_idle_workers

Configuration:

Legend: Total Workers
Min Step: 5s
Unit: short
Panel Type: Stat
Description: Total number of workers in pool

5.5 Active vs Idle Workers (Stacked)

Query 1:

rr_http_active_workers

Query 2:

rr_http_idle_workers

Configuration:

Legend:
- Query 1: Active
- Query 2: Idle
Min Step: 5s
Unit: short
Panel Type: Graph (stacked area)
Description: Visual representation of worker pool state

5.6 Worker Utilization Over Time

Query:

avg_over_time(rr_http_worker_utilization_percent[5m])

Configuration:

Legend: Avg Utilization (5m)
Min Step: 15s
Unit: percent (0-100)
Panel Type: Graph
Description: Smoothed worker utilization trend

5.7 Peak Worker Utilization

Query:

max_over_time(rr_http_worker_utilization_percent[1h])

Configuration:

Legend: Peak Utilization (1h)
Min Step: 1m
Unit: percent (0-100)
Panel Type: Graph or Stat
Description: Maximum utilization observed in last hour

5.8 Average Queue Length Over Time

Query:

avg_over_time(rr_http_requests_queue[5m])

Configuration:

Legend: Avg Queue Size
Min Step: 15s
Unit: short
Panel Type: Graph
Description: Average number of requests in queue

5.9 Maximum Queue Length

Query:

max_over_time(rr_http_requests_queue[1h])

Configuration:

Legend: Max Queue Size (1h)
Min Step: 1m
Unit: short
Panel Type: Graph or Stat
Description: Peak queue size in last hour

6. Request/Response Sizes

6.1 Average Request Size

Query:

sum(rate(rr_http_request_size_bytes_sum[5m])) / sum(rate(rr_http_request_size_bytes_count[5m]))

Configuration:

Legend: Avg Request Size
Min Step: 15s
Unit: bytes (IEC)
Panel Type: Graph
Description: Mean request body size

6.2 Average Response Size

Query:

sum(rate(rr_http_response_size_bytes_sum[5m])) / sum(rate(rr_http_response_size_bytes_count[5m]))

Configuration:

Legend: Avg Response Size
Min Step: 15s
Unit: bytes (IEC)
Panel Type: Graph
Description: Mean response body size

6.3 Request Size P95

Query:

histogram_quantile(0.95, sum by (le) (rate(rr_http_request_size_bytes_bucket[5m])))

Configuration:

Legend: Request Size P95
Min Step: 15s
Unit: bytes (IEC)
Panel Type: Graph
Description: 95th percentile request size

6.4 Response Size P95

Query:

histogram_quantile(0.95, sum by (le) (rate(rr_http_response_size_bytes_bucket[5m])))

Configuration:

Legend: Response Size P95
Min Step: 15s
Unit: bytes (IEC)
Panel Type: Graph
Description: 95th percentile response size

6.5 Request Size by Endpoint

Query:

sum by (endpoint) (rate(rr_http_request_size_bytes_sum[5m])) / sum by (endpoint) (rate(rr_http_request_size_bytes_count[5m]))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: bytes (IEC)
Panel Type: Graph or Table
Description: Average request size per endpoint

6.6 Response Size by Endpoint

Query:

sum by (endpoint) (rate(rr_http_response_size_bytes_sum[5m])) / sum by (endpoint) (rate(rr_http_response_size_bytes_count[5m]))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: bytes (IEC)
Panel Type: Graph or Table
Description: Average response size per endpoint

6.7 Largest Request Endpoints

Query:

topk(10, sum by (endpoint) (rate(rr_http_request_size_bytes_sum[5m])) / sum by (endpoint) (rate(rr_http_request_size_bytes_count[5m])))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: bytes (IEC)
Panel Type: Bar gauge or Table
Description: Endpoints with largest average request size

6.8 Largest Response Endpoints

Query:

topk(10, sum by (endpoint) (rate(rr_http_response_size_bytes_sum[5m])) / sum by (endpoint) (rate(rr_http_response_size_bytes_count[5m])))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: bytes (IEC)
Panel Type: Bar gauge or Table
Description: Endpoints with largest average response size

6.9 Total Bandwidth In (Request Rate)

Query:

sum(rate(rr_http_request_size_bytes_sum[5m]))

Configuration:

Legend: Inbound Bandwidth
Min Step: 15s
Unit: bytes/sec (Bps)
Panel Type: Graph
Description: Total bytes per second received

6.10 Total Bandwidth Out (Response Rate)

Query:

sum(rate(rr_http_response_size_bytes_sum[5m]))

Configuration:

Legend: Outbound Bandwidth
Min Step: 15s
Unit: bytes/sec (Bps)
Panel Type: Graph
Description: Total bytes per second sent

6.11 Bandwidth by Endpoint (Inbound)

Query:

topk(10, sum by (endpoint) (rate(rr_http_request_size_bytes_sum[5m])))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: bytes/sec (Bps)
Panel Type: Bar gauge or Table
Description: Request bandwidth per endpoint

6.12 Bandwidth by Endpoint (Outbound)

Query:

topk(10, sum by (endpoint) (rate(rr_http_response_size_bytes_sum[5m])))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: bytes/sec (Bps)
Panel Type: Bar gauge or Table
Description: Response bandwidth per endpoint

6.13 Request/Response Size Ratio

Query:

(sum(rate(rr_http_response_size_bytes_sum[5m])) / sum(rate(rr_http_response_size_bytes_count[5m]))) / (sum(rate(rr_http_request_size_bytes_sum[5m])) / sum(rate(rr_http_request_size_bytes_count[5m])))

Configuration:

Legend: Response/Request Ratio
Min Step: 15s
Unit: short (ratio)
Panel Type: Graph
Description: How many times larger responses are compared to requests

7. Advanced Analytics

7.1 Request Rate Trend (Day over Day)

Query:

sum(rate(rr_http_requests_by_endpoint_total[1h])) / sum(rate(rr_http_requests_by_endpoint_total[1h] offset 24h))

Configuration:

Legend: DoD Change
Min Step: 5m
Unit: short (ratio)
Panel Type: Graph or Stat
Description: Current hour traffic vs same hour yesterday (1.0 = same, 2.0 = double)

7.2 Performance Degradation Detection

Query:

histogram_quantile(0.95, sum by (le) (rate(rr_http_request_duration_seconds_bucket[5m]))) / histogram_quantile(0.95, sum by (le) (rate(rr_http_request_duration_seconds_bucket[5m] offset 1h)))

Configuration:

Legend: P95 Degradation
Min Step: 15s
Unit: short (ratio)
Panel Type: Graph
Thresholds: Green < 1.2x, Yellow 1.2-1.5x, Red > 1.5x
Description: Current P95 vs 1 hour ago (1.0 = no change, 2.0 = twice as slow)

7.3 Throughput per Worker

Query:

sum(rate(rr_http_requests_by_endpoint_total[5m])) / sum(rr_http_active_workers + rr_http_idle_workers)

Configuration:

Legend: RPS per Worker
Min Step: 15s
Unit: requests/sec (reqps)
Panel Type: Graph
Description: Average requests handled per worker

7.4 Error Burst Detection

Query:

sum(rate(rr_http_errors_total[1m])) > bool 2 * avg_over_time(sum(rate(rr_http_errors_total[1m]))[10m:1m])

Configuration:

Legend: Error Burst
Min Step: 15s
Unit: bool (0 or 1)
Panel Type: Graph (binary)
Description: Detects sudden spikes in errors (>2x baseline)
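
The burst rule compares the current error rate against a multiple of its recent average. A minimal Python sketch of the same rule follows, returning 1 for a burst and 0 otherwise to match the bool unit above. The function name, the factor parameter, and the sample rates are hypothetical.

```python
from typing import Sequence


def is_error_burst(
    current_rate: float,
    baseline_rates: Sequence[float],
    factor: float = 2.0,
) -> int:
    """Return 1 if current_rate exceeds factor times the baseline mean, else 0.

    baseline_rates: recent error rates, e.g. one value per minute over 10 minutes.
    """
    if not baseline_rates:
        # Without a baseline there is nothing to compare against.
        return 0
    baseline = sum(baseline_rates) / len(baseline_rates)
    return int(current_rate > factor * baseline)


# Hypothetical: steady 1 error/sec baseline, then a spike to 5 errors/sec.
print(is_error_burst(5.0, [1.0] * 10))
print(is_error_burst(1.5, [1.0] * 10))
```

With a baseline close to zero, even a handful of errors counts as a burst, so pairing the rule with a minimum absolute rate may reduce noise.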

7.5 Slowest Hour of Day

Query:

avg_over_time((sum(rate(rr_http_request_duration_seconds_sum[1h])) / sum(rate(rr_http_request_duration_seconds_count[1h])))[24h:1h])

Configuration:

Legend: Hourly Avg Latency
Min Step: 1h
Unit: seconds (s)
Panel Type: Graph (24 hour view)
Description: Average of the hourly mean latencies over the trailing 24 hours

7.6 Endpoint Performance Correlation

Query:

(sum by (endpoint) (rate(rr_http_errors_total[5m])) / sum by (endpoint) (rate(rr_http_requests_by_endpoint_total[5m]))) * (sum by (endpoint) (rate(rr_http_duration_by_endpoint_seconds_sum[5m])) / sum by (endpoint) (rate(rr_http_duration_by_endpoint_seconds_count[5m])))

Configuration:

Legend: {{endpoint}}
Min Step: 15s
Unit: short
Panel Type: Graph or Table
Description: Correlation score: slow + error-prone endpoints have higher values

7.7 Server Uptime

Query:

rr_http_uptime_seconds

Configuration:

Legend: Uptime
Min Step: 1m
Unit: seconds (s) or duration (s)
Panel Type: Stat
Description: How long server has been running

7.8 Request Distribution by HTTP Method (Pie)

Query:

sum by (method) (rate(rr_http_requests_by_endpoint_total[5m])) / sum(rate(rr_http_requests_by_endpoint_total[5m])) * 100

Configuration:

Legend: {{method}}
Min Step: 15s
Unit: percent (0-100)
Panel Type: Pie chart
Description: Percentage breakdown of requests by HTTP method

7.9 Status Code Distribution (Pie)

Query:

sum by (status) (rate(rr_http_requests_by_endpoint_total[5m])) / sum(rate(rr_http_requests_by_endpoint_total[5m])) * 100

Configuration:

Legend: {{status}}
Min Step: 15s
Unit: percent (0-100)
Panel Type: Pie chart
Description: Percentage breakdown of responses by status code

8. Dashboard Layout Recommendations

Row 1: Key Metrics Overview (4 panels)

Total RPS - Stat panel
P95 Latency - Stat panel with threshold colors
Error Rate % - Gauge with thresholds
Worker Utilization - Gauge with thresholds

Row 2: Traffic & Performance (2 panels)

RPS by Endpoint - Graph (time series)
Latency Percentiles - Graph (P50, P95, P99)

Row 3: Performance Breakdown (2 panels)

Queue vs Processing Time - Stacked area graph
Top Slowest Endpoints - Bar gauge

Row 4: Error Analysis (2 panels)

Error Rate by Type - Stacked area graph
Most Error-Prone Endpoints - Table

Row 5: Worker Pool Health (2 panels)

Active vs Idle Workers - Stacked area graph
Queue Size - Graph with threshold line

Row 6: Bandwidth Analysis (2 panels)

Request/Response Sizes - Graph (dual Y-axis)
Top Bandwidth Consumers - Table

Row 7: Detailed Endpoint Table (1 panel)

Endpoint Performance Table - Table with multiple queries
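
Grafana places panels on a 24-column grid through each panel's gridPos (h, w, x, y). The sketch below turns the row layout above into gridPos values by splitting each row evenly. The panel height of 8 and the helper name are assumptions for illustration, and only the first two rows are shown.

```python
from typing import Dict, List

GRID_WIDTH = 24  # Grafana dashboards use a 24-column grid


def layout_rows(rows: List[List[str]], panel_height: int = 8) -> List[Dict]:
    """Assign an equal-width gridPos to every panel, one list of titles per row."""
    panels = []
    y = 0
    for titles in rows:
        width = GRID_WIDTH // len(titles)
        for index, title in enumerate(titles):
            panels.append(
                {
                    "title": title,
                    "gridPos": {"h": panel_height, "w": width, "x": index * width, "y": y},
                }
            )
        y += panel_height
    return panels


rows = [
    ["Total RPS", "P95 Latency", "Error Rate %", "Worker Utilization"],
    ["RPS by Endpoint", "Latency Percentiles"],
]
for panel in layout_rows(rows):
    print(panel)
```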

9. Unit Reference Guide

Standard Grafana Units

Time Units:

seconds (s) - for durations
milliseconds (ms) - for sub-second timings
duration (s) - auto-formats (1m 30s, 2h 15m, etc.)

Rate Units:

requests/sec (reqps) - for request rates
errors/sec - for error rates
bytes/sec (Bps) - for bandwidth

Size Units:

bytes (IEC) - auto-formats (KiB, MiB, GiB) using 1024 base
bytes (SI) - auto-formats (kB, MB, GB) using 1000 base

Percentage:

percent (0-100) - displays 95 as 95%
percentunit (0.0-1.0) - displays 0.95 as 95%
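
The choice between the two depends on whether the query already multiplies by 100. The queries in this guide do, so they pair with percent (0-100); dropping the "* 100" would call for percentunit instead. A small sketch of the difference, with hypothetical helper names that only mimic the display behaviour:

```python
def format_percent(value: float) -> str:
    """percent (0-100): the value is already a percentage."""
    return f"{value:g}%"


def format_percentunit(value: float) -> str:
    """percentunit (0.0-1.0): the value is a fraction and is scaled by 100."""
    return f"{value * 100:g}%"


print(format_percent(95))         # query result with "* 100"
print(format_percentunit(0.95))   # query result without "* 100"
```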

Count:

short - auto-formats large numbers (1K, 1M)
none - raw number

Boolean:

bool - 0 or 1
bool_yes_no - displays as Yes/No
bool_on_off - displays as On/Off

10. Common Threshold Configurations

Latency Thresholds

Green: < 1s
Yellow: 1-5s
Red: > 5s

Error Rate Thresholds

Green: < 1%
Yellow: 1-5%
Red: > 5%

Worker Utilization Thresholds

Green: < 70%
Yellow: 70-90%
Red: > 90%

Queue Time Thresholds

Green: < 100ms
Yellow: 100-500ms
Red: > 500ms

Success Rate Thresholds

Red: < 95%
Yellow: 95-99%
Green: > 99%
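
In Grafana's panel JSON, thresholds are typically expressed as ordered steps, where the first step has a null value and acts as the base color. The sketch below builds such a structure for the configurations above, including the inverted success-rate case. The helper name and its parameters are hypothetical, and the output should be checked against the Grafana version in use.

```python
import json
from typing import Dict


def threshold_steps(lower: float, upper: float, higher_is_worse: bool = True) -> Dict:
    """Build a three-color threshold block with breakpoints at lower and upper."""
    colors = ("green", "yellow", "red") if higher_is_worse else ("red", "yellow", "green")
    return {
        "mode": "absolute",
        "steps": [
            # The base step has no value (serialized as null in JSON).
            {"color": colors[0], "value": None},
            {"color": colors[1], "value": lower},
            {"color": colors[2], "value": upper},
        ],
    }


latency = threshold_steps(1, 5)                                # seconds
error_rate = threshold_steps(1, 5)                             # percent
utilization = threshold_steps(70, 90)                          # percent
queue_time = threshold_steps(0.1, 0.5)                         # seconds
success_rate = threshold_steps(95, 99, higher_is_worse=False)  # percent

print(json.dumps(success_rate, indent=2))
```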
