Add OpenTelemetry Observability Support
Description
Integrate OpenTelemetry to provide comprehensive observability for Queuety message broker operations. This will enable users to monitor performance, track message flows, and debug issues using their preferred observability platform.
Problem Statement
Currently, Queuety lacks observability features, making it difficult for users to:
- Monitor message throughput and latency
- Track topic and subscriber metrics
- Debug connection and persistence issues
- Identify performance bottlenecks
- Set up production monitoring and alerting
Proposed Solution
Implement a dual observability approach:
- Prometheus for metrics collection and monitoring
- OpenTelemetry for distributed tracing
Metrics (Prometheus)
Expose Prometheus metrics at /metrics endpoint:
- `queuety_messages_published_total{topic}` - Total published messages
- `queuety_messages_delivered_total{topic}` - Total delivered messages
- `queuety_messages_failed_total{topic, reason}` - Failed message deliveries
- `queuety_message_processing_seconds{topic, operation}` - Processing latency histogram
- `queuety_topics_total` - Number of active topics
- `queuety_subscribers_total{topic}` - Subscribers per topic
- `queuety_active_connections` - Current TCP connection count
- `queuety_badger_operations_total{operation, status}` - BadgerDB operation metrics
- `queuety_auth_attempts_total{result}` - Authentication attempt tracking
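As a minimal sketch, a couple of these instruments could be registered through the OpenTelemetry metric API with the Prometheus exporter listed under Dependencies. The `telemetry` package and all names below are assumptions from this proposal, not existing Queuety code:

```go
package telemetry

import (
	"go.opentelemetry.io/otel/exporters/prometheus"
	"go.opentelemetry.io/otel/metric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

// Metrics bundles a subset of the instruments listed above (hypothetical shape).
type Metrics struct {
	MessagesPublished metric.Int64Counter
	ProcessingSeconds metric.Float64Histogram
}

// NewMetrics wires the Prometheus exporter into an OTel MeterProvider and
// creates the instruments. The exporter registers with the default Prometheus
// registry, which the server would expose at /metrics via promhttp.
func NewMetrics() (*Metrics, *sdkmetric.MeterProvider, error) {
	exporter, err := prometheus.New()
	if err != nil {
		return nil, nil, err
	}
	provider := sdkmetric.NewMeterProvider(sdkmetric.WithReader(exporter))
	meter := provider.Meter("queuety")

	// The Prometheus exporter appends the conventional _total suffix to
	// counters, so this is exported as queuety_messages_published_total.
	published, err := meter.Int64Counter("queuety_messages_published",
		metric.WithDescription("Total published messages"))
	if err != nil {
		return nil, nil, err
	}
	latency, err := meter.Float64Histogram("queuety_message_processing_seconds",
		metric.WithDescription("Processing latency"))
	if err != nil {
		return nil, nil, err
	}
	return &Metrics{MessagesPublished: published, ProcessingSeconds: latency}, provider, nil
}
```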
Distributed Tracing (OpenTelemetry)
Export traces to OTLP-compatible backends:
- Message lifecycle spans: publish → persist → deliver → ack
- BadgerDB operations: save, update, delete operations
- Connection flows: accept, authenticate, disconnect
- Topic management: create topic, add subscriber operations
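For illustration, the publish → persist → deliver part of that lifecycle could be expressed as nested spans roughly like this (`Message`, `persist`, and `deliver` are placeholders, not Queuety's actual types or functions):

```go
package server

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/codes"
	"go.opentelemetry.io/otel/trace"
)

// Placeholders standing in for Queuety's real message type and handlers.
type Message struct{ ID, Body string }

func persist(Message) error         { return nil }
func deliver(string, Message) error { return nil }

var tracer = otel.Tracer("queuety/server")

// publishTraced records a parent span for the publish and child spans for
// the persist and deliver steps, so a backend can show the full lifecycle.
func publishTraced(ctx context.Context, topic string, msg Message) error {
	ctx, span := tracer.Start(ctx, "queuety.publish",
		trace.WithAttributes(attribute.String("queuety.topic", topic)))
	defer span.End()

	// persist: BadgerDB save as a child span
	_, persistSpan := tracer.Start(ctx, "queuety.persist")
	err := persist(msg)
	persistSpan.End()
	if err != nil {
		span.RecordError(err)
		span.SetStatus(codes.Error, "persist failed")
		return err
	}

	// deliver: fan-out to subscribers as a child span
	_, deliverSpan := tracer.Start(ctx, "queuety.deliver")
	defer deliverSpan.End()
	return deliver(topic, msg)
}
```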
Supported Tracing Backends
- Jaeger - Open source distributed tracing
- Datadog APM - Enterprise tracing and APM
- New Relic - Full observability platform
- OTLP Generic - Any OpenTelemetry-compatible backend (Zipkin, etc.)
Implementation Plan
Phase 1: Core Infrastructure
- Create `telemetry/` package structure
- Implement base `Telemetry` struct and configuration (see sketch below)
- Add Prometheus exporter support
- Environment variable configuration
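One possible shape for the base type in the proposed `telemetry/` package; every name here is a suggestion for discussion, not an existing API:

```go
package telemetry

import (
	"context"

	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// Telemetry bundles the OTel providers so the server holds a single handle
// and can shut both down cleanly on exit. A nil *Telemetry is a no-op,
// which keeps the disabled-by-default path trivial.
type Telemetry struct {
	Meters  *sdkmetric.MeterProvider
	Tracers *sdktrace.TracerProvider
}

// Shutdown flushes and stops whichever providers were configured.
func (t *Telemetry) Shutdown(ctx context.Context) error {
	if t == nil {
		return nil
	}
	if t.Tracers != nil {
		if err := t.Tracers.Shutdown(ctx); err != nil {
			return err
		}
	}
	if t.Meters != nil {
		return t.Meters.Shutdown(ctx)
	}
	return nil
}
```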
Phase 2: Instrumentation
- Instrument message publish/deliver operations
- Add BadgerDB persistence metrics
- Track TCP connection lifecycle
- Implement distributed tracing spans
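As a sketch of what the publish-path instrumentation might look like, using the hypothetical instrument bundle from the metrics section above:

```go
package server

import (
	"context"
	"time"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// instruments stands in for the hypothetical telemetry.Metrics bundle.
type instruments struct {
	published metric.Int64Counter
	latency   metric.Float64Histogram
}

// recordPublish wraps an existing publish call with the per-topic counter
// and the per-topic/operation latency histogram proposed above.
func recordPublish(ctx context.Context, m instruments, topic string, publish func() error) error {
	start := time.Now()
	err := publish()

	m.published.Add(ctx, 1,
		metric.WithAttributes(attribute.String("topic", topic)))
	m.latency.Record(ctx, time.Since(start).Seconds(),
		metric.WithAttributes(
			attribute.String("topic", topic),
			attribute.String("operation", "publish")))
	return err
}
```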
Phase 3: Multi-Backend Support
- Add Datadog OTLP integration
- Implement New Relic support
- Generic OTLP exporter for other backends
- Jaeger tracing exporter
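Since Datadog, New Relic, and most collectors accept OTLP over HTTP, the backend switch could largely reduce to endpoint and header selection on the generic OTLP trace exporter. The header names below are placeholders; real values must come from each vendor's OTLP ingest documentation:

```go
package telemetry

import (
	"context"

	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// newTracerProvider builds an OTLP/HTTP trace exporter for the configured
// backend. Endpoints and header names are illustrative only.
func newTracerProvider(ctx context.Context, backend, endpoint, apiKey string) (*sdktrace.TracerProvider, error) {
	opts := []otlptracehttp.Option{otlptracehttp.WithEndpoint(endpoint)}

	switch backend {
	case "datadog", "newrelic":
		// Vendor backends authenticate with an API-key header; the exact
		// header name differs per vendor and is a placeholder here.
		opts = append(opts, otlptracehttp.WithHeaders(map[string]string{"api-key": apiKey}))
	default:
		// Jaeger (which accepts OTLP natively) and local collectors
		// typically take unauthenticated, plaintext OTLP.
		opts = append(opts, otlptracehttp.WithInsecure())
	}

	exporter, err := otlptracehttp.New(ctx, opts...)
	if err != nil {
		return nil, err
	}
	return sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter)), nil
}
```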
Phase 4: Documentation & Examples
- Docker Compose examples with Prometheus/Grafana
- Configuration documentation
- Grafana dashboard templates
- Production deployment guides
Technical Details
Configuration
```bash
# Environment variables
QUEUETY_TELEMETRY_ENABLED=true
QUEUETY_TELEMETRY_BACKEND=prometheus|datadog|newrelic|otlp|jaeger
QUEUETY_TELEMETRY_ENDPOINT=https://custom-endpoint
QUEUETY_TELEMETRY_API_KEY=your-api-key
```
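A minimal sketch of reading these variables into a config struct inside the proposed `telemetry` package (the struct and defaults are assumptions, not an existing API):

```go
package telemetry

import "os"

// Config mirrors the QUEUETY_TELEMETRY_* environment variables above.
type Config struct {
	Enabled  bool
	Backend  string // prometheus | datadog | newrelic | otlp | jaeger
	Endpoint string
	APIKey   string
}

// ConfigFromEnv leaves telemetry disabled unless explicitly enabled and
// falls back to the Prometheus backend when none is set.
func ConfigFromEnv() Config {
	cfg := Config{
		Enabled:  os.Getenv("QUEUETY_TELEMETRY_ENABLED") == "true",
		Backend:  os.Getenv("QUEUETY_TELEMETRY_BACKEND"),
		Endpoint: os.Getenv("QUEUETY_TELEMETRY_ENDPOINT"),
		APIKey:   os.Getenv("QUEUETY_TELEMETRY_API_KEY"),
	}
	if cfg.Backend == "" {
		cfg.Backend = "prometheus"
	}
	return cfg
}
```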
Dependencies
```
go.opentelemetry.io/otel v1.24.0
go.opentelemetry.io/otel/exporters/prometheus v0.46.0
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.24.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.24.0
go.opentelemetry.io/otel/sdk v1.24.0
```
Code Structure
```
queuety/
├── telemetry/
│   ├── telemetry.go       # Main setup and providers
│   └── metrics.go         # Metrics definitions and helpers
├── server/
│   ├── server.go          # Instrumented message handling
│   └── persistence.go     # BadgerDB metrics
└── examples/
    └── docker-compose-prometheus.yml
```
Benefits
For Users
- Production Ready: Monitor Queuety in production environments
- Vendor Freedom: Choose any observability platform
- Debug Capabilities: Trace message flows and identify bottlenecks
- Performance Insights: Understand throughput and latency patterns
- Alerting: Set up proactive monitoring and alerts
For Project
- Enterprise Adoption: Makes Queuety suitable for production use
- Community Growth: Observability is essential for serious deployments
- Debugging: Easier to troubleshoot issues and performance problems
- Competitive Advantage: Many message brokers lack comprehensive observability
Example Usage
Prometheus + Grafana
```bash
# Start with observability stack
docker-compose -f examples/docker-compose-prometheus.yml up

# View metrics
curl http://localhost:9090/metrics | grep queuety
```
Datadog Integration
```bash
QUEUETY_TELEMETRY_ENABLED=true \
QUEUETY_TELEMETRY_BACKEND=datadog \
QUEUETY_TELEMETRY_API_KEY=${DD_API_KEY} \
./queuety
```
Backward Compatibility
- Telemetry is disabled by default - zero impact on existing deployments
- No breaking changes to existing APIs
- Optional dependencies - OTEL libs only loaded when telemetry enabled
- Graceful degradation - Server continues running if telemetry setup fails
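The graceful-degradation behaviour could look roughly like this at server startup; `telemetry.Setup` and the import path are hypothetical names from this proposal:

```go
package server

import (
	"context"
	"log"

	"queuety/telemetry" // hypothetical import path
)

// startTelemetry never aborts startup: if telemetry is disabled or its
// setup fails, the broker simply runs without it.
func startTelemetry(ctx context.Context) *telemetry.Telemetry {
	cfg := telemetry.ConfigFromEnv()
	if !cfg.Enabled {
		return nil
	}
	tel, err := telemetry.Setup(ctx, cfg) // hypothetical constructor
	if err != nil {
		log.Printf("telemetry disabled: setup failed: %v", err)
		return nil
	}
	return tel
}
```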
Success Criteria
- Metrics are exported correctly to all supported backends
- Distributed traces show complete message lifecycle
- Zero performance impact when telemetry disabled
- Documentation includes setup guides for each backend
- Example dashboards provided for Grafana
- CI/CD tests verify telemetry functionality
Contributing
This is a significant feature that would benefit from community input:
- Feedback on metric names and labels
- Additional backend support requests
- Dashboard and alerting rule contributions
- Documentation improvements
- Testing on different deployment scenarios
Labels
enhancement observability monitoring production-ready good-first-issue help-wanted
References
- [OpenTelemetry Go Documentation](https://opentelemetry.io/docs/instrumentation/go/)
- [Prometheus Best Practices](https://prometheus.io/docs/practices/naming/)
- [OTEL Semantic Conventions](https://opentelemetry.io/docs/reference/specification/trace/semantic_conventions/)
Priority: High - Observability is crucial for production message broker deployments
Effort: Large - Comprehensive feature requiring instrumentation across the codebase
Impact: High - Enables enterprise adoption and production monitoring capabilities