Problem Statement
The current User Cluster MLA Admin Guide primarily documents the API-based approach for managing alert rules via REST endpoints. It does not document the CRD-based approach, which is better suited to GitOps workflows.
Through cluster investigation, I discovered that Kubermatic provides native Kubernetes CRDs for managing alerting rules:
- `rulegroups.kubermatic.k8c.io`: for defining Prometheus alert/recording rules
- `alertmanagers.kubermatic.k8c.io`: for configuring Alertmanager routing
These CRDs can be applied directly to user cluster namespaces (e.g., cluster-*) in the seed cluster, enabling GitOps-based alert management.
Proposed Documentation Addition
Add a new section: "Managing User Cluster Alerting via GitOps" to the User Cluster MLA documentation.
Content to Include:
1. RuleGroup CRD Overview
- Explain that RuleGroups can be created as Kubernetes resources in user cluster namespaces
- Document the CRD structure and fields
Example:

    apiVersion: kubermatic.k8c.io/v1
    kind: RuleGroup
    metadata:
      name: haproxy-alerts
      namespace: cluster-xxxxx # User cluster namespace in seed
    spec:
      cluster:
        name: xxxxx
      ruleGroupType: Metrics # or "Logs"
      isDefault: false
      data: |
        groups:
          - name: haproxy-service-specific-alerts
            rules:
              - alert: HighErrorRate
                expr: rate(http_requests_total{code=~"5.."}[5m]) > 0.05
                for: 5m
                labels:
                  severity: critical
                annotations:
                  summary: High 5xx error rate detected

2. Alertmanager Configuration via Secret
- Document how to configure Alertmanager by updating the `alertmanager` secret
- Show the secret structure and key names
Example:

    apiVersion: v1
    kind: Secret
    metadata:
      name: alertmanager
      namespace: cluster-xxxxx
    type: Opaque
    stringData:
      alertmanager.yaml: |
        template_files: {}
        alertmanager_config: |
          route:
            receiver: 'default'
            group_by: ['alertname', 'cluster', 'service']
            routes:
              - receiver: 'slack-critical'
                match:
                  severity: critical
          receivers:
            - name: 'default'
              slack_configs:
                - api_url: 'https://hooks.slack.com/services/XXX'
                  channel: '#alerts'
            - name: 'slack-critical'
              slack_configs:
                - api_url: 'https://hooks.slack.com/services/XXX'
                  channel: '#critical-alerts'

3. Alertmanager CRD Reference
- Document the `alertmanagers.kubermatic.k8c.io` CRD
- Explain its relationship with the secret
Example:

    apiVersion: kubermatic.k8c.io/v1
    kind: Alertmanager
    metadata:
      name: alertmanager
      namespace: cluster-xxxxx
    spec:
      configSecret:
        name: alertmanager # References the secret above

4. GitOps Workflow Examples
- Show how to structure alert rules in a Git repository
- Provide ArgoCD/Flux application examples
- Best practices for organizing rules by service/component
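
As a starting point for the ArgoCD/Flux examples, the sketch below shows one way to sync a directory of RuleGroup and Alertmanager manifests into a user cluster namespace. The repository URL, the `clusters/xxxxx` path layout, the resource names, and the `argocd` and `flux-system` namespaces are all hypothetical, and the sketch assumes Argo CD or Flux runs in the seed cluster itself. Pruning is left disabled so that the GitOps tool does not delete objects that KKP manages in the same namespace.

```yaml
# Argo CD: sync manifests from clusters/xxxxx into the user cluster namespace.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: user-cluster-xxxxx-alerting   # hypothetical name
  namespace: argocd                   # assumes a default Argo CD install
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/kkp-alerting.git  # hypothetical repo
    targetRevision: main
    path: clusters/xxxxx              # hypothetical layout: one directory per user cluster
  destination:
    server: https://kubernetes.default.svc  # in-cluster, i.e. the seed cluster
    namespace: cluster-xxxxx
  syncPolicy:
    automated:
      prune: false                    # do not delete resources missing from Git
      selfHeal: true
---
# Flux alternative: the same directory reconciled by a Kustomization.
# Assumes a GitRepository named kkp-alerting already exists in flux-system.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: user-cluster-xxxxx-alerting   # hypothetical name
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/xxxxx
  prune: false
  sourceRef:
    kind: GitRepository
    name: kkp-alerting
```

Only one of the two resources is needed, depending on which tool is in use. Organizing the repository by user cluster first and by service second (for example `clusters/xxxxx/haproxy-alerts.yaml`) keeps each RuleGroup next to the namespace it targets.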
5. API vs CRD Comparison Table
| Aspect | API Approach | CRD Approach |
|---|---|---|
| GitOps Support | Requires CI/CD integration | Native Kubernetes resources |
| Version Control | Manual API calls | Git history |
| Declarative | No | Yes |
| Access Control | KKP API permissions | Kubernetes RBAC |
| Tooling | curl, API clients | kubectl, ArgoCD, Flux |
| Use Case | Programmatic management | Infrastructure as Code |
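
To illustrate the "Kubernetes RBAC" row, here is a minimal sketch of a namespaced Role that limits a GitOps controller to the alerting resources in one user cluster namespace. The resource names are derived from the CRD names above. The Role name and the bound service account (`argocd-application-controller` in the `argocd` namespace) are assumptions, and the actual permissions needed may differ between KKP versions.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: alerting-gitops               # hypothetical name
  namespace: cluster-xxxxx
rules:
  # Full lifecycle on the two alerting CRDs
  - apiGroups: ["kubermatic.k8c.io"]
    resources: ["rulegroups", "alertmanagers"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # Read and update only the Alertmanager config secret
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["alertmanager"]
    verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: alerting-gitops
  namespace: cluster-xxxxx
subjects:
  - kind: ServiceAccount
    name: argocd-application-controller  # assumption: default Argo CD service account
    namespace: argocd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: alerting-gitops
```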
Benefits
This documentation would:
- Enable GitOps workflows for alert management
- Provide a more Kubernetes-native approach
- Help teams already using ArgoCD/Flux for infrastructure
- Reduce the learning curve for Kubernetes users
- Fill a gap in current documentation
Additional Context
Current documentation focuses on:
- API endpoints: `GET/POST/PUT/DELETE /api/v2/projects/{project_id}/clusters/{cluster_id}/rulegroups`
- UI-based management via KKP dashboard
Missing:
- CRD-based declarative approach
- GitOps integration patterns
- Complete CRD specification examples