Prerequisites:
- Docker environment
Reference documentation:
- Prometheus recording rules
- Prometheus alerting rules
Checking rule syntax:
promtool check rules /path/to/example.rules.yml
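Passing the check does not by itself make Prometheus load the file: it also has to be listed under rule_files in the main configuration file. A minimal sketch, assuming the same placeholder path as the command above and an illustrative evaluation_interval (under Docker, the path would be the one visible inside the container):

```yaml
# prometheus.yml (main configuration file)
global:
  evaluation_interval: 15s        # assumed value; default interval for every rule group
rule_files:
  - /path/to/example.rules.yml    # placeholder path, same as in the promtool command
```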
1: Recording rule syntax
groups syntax:
groups:
  [ - <rule_group> ]
rule_group syntax:
# The name of the group. Must be unique within a file.
name: <string>
# How often rules in the group are evaluated.
[ interval: <duration> | default = global.evaluation_interval ]
# Limit the number of alerts an alerting rule and series a recording
# rule can produce. 0 is no limit.
[ limit: <int> | default = 0 ]
rules:
  [ - <rule> ... ]
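As a rough sketch of how the optional fields fit together, the group below overrides the evaluation interval and caps the number of series. The group name and the values 30s and 100 are illustrative assumptions, not settings taken from this setup:

```yaml
groups:
  - name: cpu-node-30s            # hypothetical group name, unique within the file
    interval: 30s                 # overrides global.evaluation_interval for this group only
    limit: 100                    # at most 100 series per recording rule; 0 would mean no limit
    rules:
      - record: job_instance_mode:node_cpu_seconds:avg_rate5m
        expr: avg by (job, instance, mode) (rate(node_cpu_seconds_total[5m]))
```

If a rule would produce more series than limit allows, Prometheus treats that evaluation as failed and discards its output, so the cap should sit comfortably above the expected series count.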
rule syntax:
# The name of the time series to output to. Must be a valid metric name.
record: <string>
# The PromQL expression to evaluate. Every evaluation cycle this is
# evaluated at the current time, and the result recorded as a new set of
# time series with the metric name as given by 'record'.
expr: <string>
# Labels to add or overwrite before storing the result.
labels:
  [ <labelname>: <labelvalue> ]
Example rules file:
groups:
  - name: cpu-node
    rules:
      - record: job_instance_mode:node_cpu_seconds:avg_rate5m
        expr: avg by (job, instance, mode) (rate(node_cpu_seconds_total{instance="10.1.32.231"}[5m]))
        labels:
          job_instance_mode: node_cpu_seconds
![](https://img.laitimes.com/img/9ZDMuAjOiMmIsIjOiQnIsIyZuBnLzcTYiJ2N3MmZ2MmZyQWN5cjNyQDM2UTZhBTZyMzN2Q2Lc52YucWbp5GZzNmLn9Gbi1yZtl2Lc9CX6MHc0RHaiojIsJye.png)
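Beyond the syntax check, the recording rule can also be exercised with a promtool unit test. The sketch below assumes the example rules are saved as example.rules.yml next to the test file. The test file name, the job, mode and cpu label values, and the synthetic counter are all made up for illustration. The counter grows by 60 per minute, so the 5-minute rate should come out at 1 per second:

```yaml
# example.rules.test.yml (hypothetical file name)
rule_files:
  - example.rules.yml             # assumed location of the example rules file
evaluation_interval: 1m
tests:
  - interval: 1m                  # spacing between the synthetic samples below
    input_series:
      # Synthetic counter: 0, 60, 120, ... 600 (one sample per minute)
      - series: 'node_cpu_seconds_total{job="node", instance="10.1.32.231", mode="idle", cpu="0"}'
        values: '0+60x10'
    promql_expr_test:
      - expr: job_instance_mode:node_cpu_seconds:avg_rate5m
        eval_time: 10m
        exp_samples:
          # The extra job_instance_mode label comes from the rule's labels section
          - labels: 'job_instance_mode:node_cpu_seconds:avg_rate5m{job="node", instance="10.1.32.231", mode="idle", job_instance_mode="node_cpu_seconds"}'
            value: 1
```

Such a file is run with promtool test rules followed by the test file path.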
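2: Alerting rule syntax
Alerting rules let you define alert conditions as Prometheus expression language (PromQL) expressions and send notifications about firing alerts to an external service.
Syntax: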
# The name of the alert. Must be a valid label value.
alert: <string> # alert name
# The PromQL expression to evaluate. Every evaluation cycle this is
# evaluated at the current time, and all resultant time series become
# pending/firing alerts.
expr: <string> # custom PromQL expression
# Alerts are considered firing once they have been returned for this long.
# Alerts which have not yet fired for long enough are considered pending.
[ for: <duration> | default = 0s ] # the condition must hold for this long before the alert fires; until then it stays pending
# Labels to add or overwrite for each alert.
labels:
  [ <labelname>: <tmpl_string> ] # labels attached to the alert
# Annotations to add to each alert.
annotations:
  [ <labelname>: <tmpl_string> ]
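The <tmpl_string> fields above are rendered as templates for each alert instance. A minimal sketch of what that can look like is shown below. The group name, alert name and severity are hypothetical, and the $externalLabels.cluster reference assumes that external_labels in the main configuration file defines a cluster label:

```yaml
groups:
  - name: template-demo           # hypothetical group name
    rules:
      - alert: InstanceDown       # hypothetical alert name
        expr: up == 0
        for: 5m
        labels:
          severity: warning       # illustrative value
        annotations:
          # $labels holds the label key/value pairs of the alert instance
          summary: 'Instance {{ $labels.instance }} is down'
          # $externalLabels.cluster assumes a "cluster" external label is configured;
          # $value is the evaluated value of the expression for this instance
          description: '{{ $labels.instance }} (job {{ $labels.job }}, cluster {{ $externalLabels.cluster }}) reports up = {{ $value }}'
```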
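Defining alerting rules:
Label and annotation values can be templated using console templates. The $labels variable holds the label key/value pairs of an alert instance. The configured external labels can be accessed through the $externalLabels variable. The $value variable holds the evaluated value of an alert instance.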
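groups:
  - name: Dos port probe
    rules:
      - alert: DosPortProbe # alert name
        expr: probe_success{job="Dos-Port-Status"} == 0 # matching expression
        for: 1m # the condition must hold this long before the alert fires
        labels: # labels section
          severity: critical
          team: "{{ $labels.job }}" # $labels.job is the job name defined in the main Prometheus configuration file
        annotations: # annotations section
          summary: '{{ $labels.env }} TCP probe failed' # env is a label on the scraped host
          description: '{{ $labels.env }} [{{ $labels.name }}] TCP port probe failed, current status code: {{ $value }}' # env and name are labels on the scraped host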
Result when the alert fires