In 3scale SaaS we have been successfully using limitador together with Redis for a couple of years to protect all our public endpoints. However:
- We are using an old community image
- YAMLs are managed individually via ArgoCD

We would like to update how we manage the limitador application and use the recommended limitador setup via limitador-operator, at a production-ready grade.
The current limitador-operator (at least version 0.4.0, which we use):
- Provides a few Prometheus metrics on the HTTP port
- Does not create a Prometheus PodMonitor by default
- Does not create a GrafanaDashboard by default
- Does not permit creating a Prometheus PodMonitor via CR
- Does not permit creating a GrafanaDashboard via CR
Desired features:
- Permit creating a Prometheus PodMonitor via CR
- Permit creating a GrafanaDashboard via CR
- Since observability is optional, it should not be enabled by default
3scale SaaS specific example
Example of the PodMonitor used in 3scale SaaS production to manage between 3,500 and 5,500 requests/second with 3 limitador pods (the selector labels need to match the labels currently managed by limitador-operator):
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: limitador
spec:
  podMetricsEndpoints:
    - interval: 30s
      path: /metrics
      port: http
      scheme: http
  selector:
    matchLabels:
      app.kubernetes.io/name: limitador
```
Possible CR config
Both the PodMonitor and the GrafanaDashboard should be customizable via CR, but should use sane default values when enabled, so you don't need to provide the full config if you prefer to rely on the defaults.
- PodMonitor possible customization:
  - enabled: true/false
  - interval: how often prometheus-operator will scrape limitador pods (has an impact on Prometheus memory and time-series database size)
  - labelSelector: sometimes prometheus-operator is configured to select only PodMonitors/ServiceMonitors with specific labels
- GrafanaDashboard possible customization:
  - enabled: true/false
  - labelSelector: sometimes grafana-operator is configured to select only GrafanaDashboards with specific labels
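To illustrate why the labelSelector customization matters: a Prometheus instance managed by prometheus-operator often restricts which PodMonitors it picks up via `podMonitorSelector`, so the operator-created PodMonitor would need a matching label. A minimal sketch (the `monitoring-key: middleware` label is just an assumed example, not an existing convention in limitador-operator):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  # Only PodMonitors carrying this label will be scraped by this
  # Prometheus instance; the limitador PodMonitor must carry it too.
  podMonitorSelector:
    matchLabels:
      monitoring-key: middleware
```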
```yaml
apiVersion: limitador.kuadrant.io/v1alpha1
kind: Limitador
metadata:
  name: limitador-sample
spec:
  podMonitor:
    enabled: true # false by default, so no PodMonitor is created
    interval: 30s # 30s by default if not defined
    labelSelector: XX # by default no label/selector is defined
    # Maybe in the future permit overriding more PodMonitor fields if needed;
    # I don't think anything else is needed for now
  grafanaDashboard:
    enabled: true
    labelSelector: XX # by default no label/selector is defined
```
The initial dashboard would be provided by us (3scale SRE) and could be embedded into the operator as an asset, as done with 3scale-operator.
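For reference, the resource the operator would create could look like the following sketch, assuming the grafana-operator v4 `GrafanaDashboard` API (`integreatly.org/v1alpha1`); the label is again an assumed example, and the dashboard JSON is elided:

```yaml
apiVersion: integreatly.org/v1alpha1
kind: GrafanaDashboard
metadata:
  name: limitador
  labels:
    # Example label matched by grafana-operator's dashboardLabelSelector
    monitoring-key: middleware
spec:
  json: |
    { "title": "Limitador", ... }
```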
Current dashboard screenshots (not included here) show limitador metrics broken down by limitador_namespace (the app being limited), as well as pod CPU/memory/network resource metrics.
PrometheusRules (aka prometheus alerts)
Regarding PrometheusRules (Prometheus alerts), my advice is not to embed them into the operator, but to provide in the repo a YAML file with example alerts that can be deployed and tuned by the app administrator as needed.
Example:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: limitador
spec:
  groups:
    - name: limitador.rules
      rules:
        - alert: LimitadorJobDown
          annotations:
            message: Prometheus Job {{ $labels.job }} on {{ $labels.namespace }} is DOWN
          expr: up{job=~".*limitador.*"} == 0
          for: 5m
          labels:
            severity: critical
        - alert: LimitadorPodDown
          annotations:
            message: Limitador pod {{ $labels.pod }} on {{ $labels.namespace }} is DOWN
          expr: limitador_up == 0
          for: 5m
          labels:
            severity: critical
```
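The example file could also carry rate-based alerts for administrators to tune. A hedged sketch of one more rule, assuming limitador exposes `authorized_calls` and `limited_calls` counters labelled by `limitador_namespace` (metric names and the 10% threshold are assumptions to be verified against the actual exposition):

```yaml
        - alert: LimitadorHighLimitedRate
          annotations:
            message: Limitador is rejecting over 10% of requests for {{ $labels.limitador_namespace }}
          expr: |
            sum by (limitador_namespace) (rate(limited_calls[5m]))
              / sum by (limitador_namespace) (rate(authorized_calls[5m]) + rate(limited_calls[5m])) > 0.10
          for: 10m
          labels:
            severity: warning
```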