![](https://img.laitimes.com/img/__Qf2AjLwojIjJCLyojI0JCLiAjM2EzLcd3LcJzLcJzdllmVldWYtl2PnVGcq5SY4p3Z5NHd1gDevwVN1QDOwkTNtUGall3LcVmdhNXLwRHdo9CXt92YucWbpRWdvx2Yx5yazF2Lc9CX6MHc0RHaiojIsJye.jpeg)
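Clusters built with kubeadm issue certificates that are valid for one year by default. Renewing them is quick, but the real danger is only noticing after something has already broken; professional engineers are busy every day, after all.
For that reason, monitoring the validity period of the cluster certificates is something you have to do. Prometheus is the de facto standard in the cloud-native world, so it would be ideal to use it to watch certificate expiry and raise alerts in time.
ssl_exporter does exactly this. It is a Prometheus exporter that offers several SSL checks, including the start and expiry times of HTTPS certificates, the start and expiry times of certificate files, OCSP, and other related metrics.
The rest of this post uses it to monitor the validity period of the cluster certificates.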
Installation
apiVersion: v1
kind: Service
metadata:
  labels:
    name: ssl-exporter
  name: ssl-exporter
spec:
  ports:
    - name: ssl-exporter
      protocol: TCP
      port: 9219
      targetPort: 9219
  selector:
    app: ssl-exporter
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ssl-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ssl-exporter
  template:
    metadata:
      name: ssl-exporter
      labels:
        app: ssl-exporter
    spec:
      initContainers:
        # Install kube ca cert as a root CA
        - name: ca
          image: alpine
          command:
            - sh
            - -c
            - |
              set -e
              apk add --update ca-certificates
              cp /var/run/secrets/kubernetes.io/serviceaccount/ca.crt /usr/local/share/ca-certificates/kube-ca.crt
              update-ca-certificates
              cp /etc/ssl/certs/* /ssl-certs
          volumeMounts:
            - name: ssl-certs
              mountPath: /ssl-certs
      containers:
        - name: ssl-exporter
          image: ribbybibby/ssl-exporter:v0.6.0
          ports:
            - name: tcp
              containerPort: 9219
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs
      volumes:
        - name: ssl-certs
          emptyDir: {}
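Run
kubectl apply -f .
to install it. The manifests do not set a namespace, so run the command against the monitoring namespace (for example by adding -n monitoring) to match the later steps.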
Wait until the Pod is running normally, as shown below:
# kubectl get po -n monitoring -l app=ssl-exporter
NAME                            READY   STATUS    RESTARTS   AGE
ssl-exporter-7ff4759679-f4qbs   1/1     Running   0          21m
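Before wiring the exporter into Prometheus, it can help to look at what the /probe endpoint returns. The following is a minimal Python sketch, not part of the original setup, that pulls the ssl_cert_not_after samples out of Prometheus text-format output. The SAMPLE text is a hypothetical, heavily abbreviated response (the real exporter attaches more labels), and the simple regular expression assumes label values do not contain a closing brace.

```python
import re
from typing import List, Tuple

# Matches lines such as: ssl_cert_not_after{cn="..."} 1.7e+09
# Assumption: label values contain no "}" character.
LINE = re.compile(r'^ssl_cert_not_after(\{[^}]*\})?\s+(\S+)$')


def not_after_values(exposition: str) -> List[Tuple[str, float]]:
    """Return (raw label string, expiry as Unix seconds) for each sample."""
    results = []
    for line in exposition.splitlines():
        match = LINE.match(line.strip())
        if match:
            results.append((match.group(1) or "", float(match.group(2))))
    return results


# Hypothetical, abbreviated exporter output used only for illustration.
SAMPLE = """\
# TYPE ssl_cert_not_after gauge
ssl_cert_not_after{cn="kube-apiserver"} 1.7e+09
ssl_tls_connect_success 1
"""

if __name__ == "__main__":
    for labels, expiry in not_after_values(SAMPLE):
        print(labels, expiry)
```

In practice the input would be the body returned by the exporter on port 9219 at /probe with a target parameter, for example fetched through kubectl port-forward.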
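Next, configure the Prometheus scrape job.
Note: my Prometheus is deployed with Prometheus Operator, so the job is added through additionalScrapeConfigs.
First create a file named
prometheus-additional.yaml
with the following contents: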
- job_name: ssl-exporter
  metrics_path: /probe
  static_configs:
    - targets:
        - kubernetes.default.svc:443
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: ssl-exporter.monitoring:9219
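The three relabel rules in this job are easy to misread, so here is a small Python sketch that mimics what they do to a single static target. It is only a model of the relabeling steps and not how Prometheus is implemented; the function names apply_relabeling and scrape_url are made up for this illustration, and the http scheme is the Prometheus default when none is configured.

```python
from urllib.parse import urlencode


def apply_relabeling(target: str) -> dict:
    """Mimic the three relabel rules for one static target."""
    labels = {
        "__address__": target,
        "__metrics_path__": "/probe",
        "__scheme__": "http",  # Prometheus default scheme
    }
    # Rule 1: copy the target address into the ?target= URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Rule 2: keep the probed endpoint as the instance label.
    labels["instance"] = labels["__param_target"]
    # Rule 3: point the actual scrape at the ssl-exporter Service.
    labels["__address__"] = "ssl-exporter.monitoring:9219"
    return labels


def scrape_url(labels: dict) -> str:
    """Build the URL that would be requested after relabeling."""
    query = urlencode({"target": labels["__param_target"]})
    return (
        f'{labels["__scheme__"]}://{labels["__address__"]}'
        f'{labels["__metrics_path__"]}?{query}'
    )


if __name__ == "__main__":
    print(scrape_url(apply_relabeling("kubernetes.default.svc:443")))
```

The result is that Prometheus talks to the exporter, the exporter probes kubernetes.default.svc:443, and the resulting series still carry the probed endpoint in their instance label.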
Then create the secret with the following commands:
kubectl delete secret additional-config -n monitoring
kubectl -n monitoring create secret generic additional-config --from-file=prometheus-additional.yaml
Then edit the prometheus-prometheus.yaml configuration file and add the following under spec:
additionalScrapeConfigs:
  name: additional-config
  key: prometheus-additional.yaml
The complete prometheus-prometheus.yaml then looks like this:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: k8s
  name: k8s
  namespace: monitoring
spec:
  alerting:
    alertmanagers:
      - name: alertmanager-main
        namespace: monitoring
        port: web
  baseImage: quay.io/prometheus/prometheus
  nodeSelector:
    kubernetes.io/os: linux
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  replicas: 2
  resources:
    requests:
      memory: 400Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  additionalScrapeConfigs:
    name: additional-config
    key: prometheus-additional.yaml
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: v2.11.0
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: managed-nfs-storage
        resources:
          requests:
            storage: 10Gi
Then re-apply prometheus-prometheus.yaml with the following command:
kubectl apply -f prometheus-prometheus.yaml
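The ssl-exporter scrape job should now show up as healthy in the Prometheus web UI.
Then query
(ssl_cert_not_after - time()) / 3600 / 24
to see how many days remain before each certificate expires.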
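The arithmetic in that query is simple: ssl_cert_not_after is the expiry time as a Unix timestamp in seconds, time() is the current Unix time, and dividing the difference by 3600 and then by 24 converts seconds to days. A minimal Python sketch of the same calculation follows; the function name days_until_expiry and the timestamps are illustrative, not taken from a real cluster.

```python
import time
from typing import Optional


def days_until_expiry(not_after: float, now: Optional[float] = None) -> float:
    """Same arithmetic as (ssl_cert_not_after - time()) / 3600 / 24."""
    if now is None:
        now = time.time()
    return (not_after - now) / 3600 / 24


if __name__ == "__main__":
    now = 1_700_000_000               # illustrative "current" Unix time
    not_after = now + 45 * 24 * 3600  # a certificate expiring 45 days later
    print(days_until_expiry(not_after, now))  # 45.0
```

A negative result means the certificate has already expired.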
The metric
ssl_tls_connect_success
shows whether the TLS connection to the target succeeded.
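Alerting
ssl_exporter is now installed and collecting data, so the next step is to configure some alerting rules so that the operations team finds out quickly when something is wrong.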
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: monitoring-ssl-tls-rules
  namespace: monitoring
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  groups:
    - name: check_ssl_validity
      rules:
        - alert: "K8S cluster certificate expires in less than 30 days"
          expr: (ssl_cert_not_after - time()) / 3600 / 24 < 30
          for: 1h
          labels:
            severity: critical
          annotations:
            description: 'The K8S cluster certificate expires in {{ printf "%.1f" $value }} days; please renew it as soon as possible'
            summary: "K8S cluster certificate expiry warning"
    - name: ssl_connect_status
      rules:
        - alert: "K8S cluster certificate availability problem"
          expr: ssl_tls_connect_success == 0
          for: 1m
          labels:
            severity: critical
          annotations:
            summary: "K8S cluster certificate connection problem"
            description: "K8S cluster {{ $labels.instance }} certificate connection problem"
Once the rules show up as loaded in Prometheus, you will receive an alert whenever one of these conditions is met.
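The for field in the rules above means an alert does not fire the moment its expression becomes true; it first stays pending until the expression has held continuously for that duration. The Python sketch below models that behaviour in a simplified way. The function alert_states is hypothetical, and the model ignores details of real Prometheus such as staleness handling and evaluation jitter.

```python
from typing import Iterable, List, Optional, Tuple


def alert_states(samples: Iterable[Tuple[float, bool]],
                 hold_seconds: float) -> List[str]:
    """Return inactive/pending/firing for each (timestamp, condition) pair."""
    states: List[str] = []
    active_since: Optional[float] = None
    for ts, condition in samples:
        if not condition:
            # Expression is false: the alert resets completely.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            # First evaluation at which the expression is true.
            active_since = ts
        held_long_enough = ts - active_since >= hold_seconds
        states.append("firing" if held_long_enough else "pending")
    return states


if __name__ == "__main__":
    # Evaluations every 30 seconds against a rule with "for: 1m".
    evaluations = [(0, True), (30, True), (60, True), (90, False), (120, True)]
    print(alert_states(evaluations, hold_seconds=60))
    # ['pending', 'pending', 'firing', 'inactive', 'pending']
```

With for: 1h on the expiry rule, a brief scrape glitch will not page anyone, while the 1m window on the connection rule reacts quickly to a TLS handshake that keeps failing.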
WeChat official account: 運維開發故事
GitHub: https://github.com/orgs/sunsharing-note/dashboard
Love life, love ops.