Prometheus Alert Rule Generator
Build a Prometheus alerting rules file from common presets: instance down, high CPU, memory pressure, low disk, request latency, error rate, and pod crash-looping. Toggle the alerts you want, tune the threshold, for duration, and severity label of each, and the tool writes a valid rules YAML with PromQL expressions and templated annotations. Save it as a rules file and reference it from your Prometheus config. Everything is generated in your browser.
| On | Alert | Threshold | For | Severity |
|---|---|---|---|---|
Expressions assume the common node_exporter and HTTP/Kubernetes metric names. Confirm the metric names match your exporters, then load the file via a rule_files entry in prometheus.yml and route the severities in Alertmanager.
How to use the Prometheus Alert Rule Generator
Tick the alerts you want in the table. For each, set the threshold (a percentage for resource alerts, seconds for latency), the for duration — how long the condition must hold before firing, which suppresses brief spikes — and a severity label that Alertmanager uses for routing. The YAML regenerates live; save it to a file such as alerts.yml and reference it from the rule_files list in prometheus.yml.
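As a minimal sketch of that last step, assuming the generated file was saved as alerts.yml next to prometheus.yml, the relevant parts of the Prometheus config might look like this. The file name, evaluation interval, and Alertmanager address are placeholders rather than values the tool sets.

```yaml
# prometheus.yml (excerpt)
global:
  evaluation_interval: 1m      # how often rule expressions are evaluated; placeholder value

rule_files:
  - alerts.yml                 # assumed name of the file saved from the generator

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093   # assumed Alertmanager address
```

Prometheus reads rule files at startup and on a configuration reload, so reload or restart it after adding the entry.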
Check the metric names against your setup before relying on the rules. The CPU, memory, and disk expressions use node_exporter metrics; the latency and error-rate alerts assume an HTTP histogram named http_request_duration_seconds and a counter http_requests_total with a status label; the crash-loop alert uses kube-state-metrics. Rename them to match your exporters and instrumentation. After loading, verify the rules under Status → Rules in the Prometheus UI and test routing by triggering a low-severity alert.
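For illustration, rules built on the HTTP metric names mentioned above might look like the sketch below. The group name, thresholds, and durations are placeholders, and the generator's actual output may differ. The parts to rename are the metric and label names inside each expr.

```yaml
groups:
  - name: http-alerts            # hypothetical group name
    rules:
      - alert: HighErrorRate
        # Percentage of requests answered with a 5xx status over the last 5 minutes.
        # Rename http_requests_total and the status label to match your instrumentation.
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) * 100 > 5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'HTTP error rate is {{ $value | printf "%.1f" }}%'
      - alert: HighRequestLatency
        # 95th percentile latency, computed from the histogram's _bucket series.
        expr: |
          histogram_quantile(0.95,
            sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: 'p95 request latency is {{ $value | printf "%.2f" }}s'
```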
How Prometheus alerting rules work
Prometheus separates detecting a problem from notifying someone about it. Alerting rules, defined in a YAML file that Prometheus loads, evaluate PromQL expressions against your metrics at a regular interval; each time series an expression returns becomes an active alert. A companion service, Alertmanager, receives the alerts that fire and handles grouping, silencing, deduplication, and delivery to email, Slack, PagerDuty, and the like. This generator produces the rules half; Alertmanager configuration is separate.
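To make the division of labour concrete, here is a minimal sketch of the Alertmanager side, which this generator does not produce. The receiver names are hypothetical and carry no notification settings; a real configuration would add email, Slack, or PagerDuty blocks to them.

```yaml
# alertmanager.yml (sketch)
route:
  receiver: default-team            # fallback for alerts no child route matches
  group_by: [alertname, instance]   # alerts sharing these labels are batched together
  routes:
    - matchers:
        - severity="critical"
      receiver: on-call-pager
    - matchers:
        - severity="warning"
      receiver: team-chat

receivers:                          # hypothetical names, no delivery settings yet
  - name: default-team
  - name: on-call-pager
  - name: team-chat
```

The severity values matched here must be the same strings the rules set in their labels.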
Each rule has four important parts. The expr is the PromQL condition — for example CPU utilisation above a threshold or up == 0 for a target that has stopped responding. The for clause requires the condition to stay true for a duration before the alert actually fires, which filters out momentary spikes that would otherwise page someone needlessly. labels attach metadata, most importantly a severity, that Alertmanager routes on. annotations carry human-readable text, and they can use Go templating like {{ $labels.instance }} and {{ $value }} to embed the offending instance and the current metric value directly in the message.
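The four parts fit together as in the sketch below, which uses the up == 0 condition mentioned above. The group name, the 5m duration, and the annotation wording are illustrative choices, not necessarily what the generator emits.

```yaml
groups:
  - name: example                 # hypothetical group name
    rules:
      - alert: InstanceDown
        # expr: the PromQL condition; every series it returns is a candidate alert
        expr: up == 0
        # for: the condition must hold this long before the alert fires
        for: 5m
        # labels: metadata attached to the alert; Alertmanager routes on severity
        labels:
          severity: critical
        # annotations: human-readable text, rendered with Go templating
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} (job {{ $labels.job }}) has been unreachable for more than 5 minutes."
```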
Good alerts are symptom-based and actionable. Alerting on user-visible symptoms — high error rate, slow requests, a down instance — rather than on every internal fluctuation keeps noise low and signal high, and a sensible for duration prevents flapping. The presets here follow that philosophy: thresholds that indicate a real problem, durations long enough to ignore transient blips, and severities that distinguish a page-now critical from a look-soon warning. Start from them, tune the numbers to your environment's normal behaviour, and expand with service-specific rules over time.
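One way to express the warning and critical distinction is a pair of rules on the same signal with different thresholds and durations. The sketch below assumes the node_exporter filesystem metrics; the 20% and 5% thresholds, the durations, and the excluded filesystem types are placeholders to tune.

```yaml
groups:
  - name: disk-alerts             # hypothetical group name
    rules:
      - alert: DiskSpaceLow
        # Look-soon: less than 20% free for a sustained period.
        expr: |
          node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
            / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"} * 100 < 20
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: '{{ $labels.instance }}:{{ $labels.mountpoint }} has {{ $value | printf "%.0f" }}% free'
      - alert: DiskSpaceCritical
        # Page-now: less than 5% free, with a shorter wait before firing.
        expr: |
          node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
            / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"} * 100 < 5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: '{{ $labels.instance }}:{{ $labels.mountpoint }} has {{ $value | printf "%.0f" }}% free'
```

Once free space drops below 5%, both rules are active at the same time; that overlap is expected with this pattern.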
Common use cases
- Bootstrapping monitoring. Get a solid baseline of infrastructure alerts without writing PromQL from scratch.
- Standard thresholds. Apply consistent CPU, memory, and disk alerts across many hosts.
- Learning alerting rules. See how expr, for, labels, and annotations combine in valid syntax.
- Reducing noise. Add for durations to flapping alerts that currently fire on brief spikes.