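Background
Kubernetes supports cloud disk snapshots through the CSI external-snapshotter, but the upstream project only covers the most basic operations: creating and deleting snapshots.
On ACK, installing the storage-auto-snapshotter component enables scheduled snapshots of cloud disks.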
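Prerequisites
Deploy csi-snapshotter
First deploy csi-snapshotter, which provides basic snapshot creation. What you need to do depends on the version of your ACK cluster:
- ACK cluster version >= 1.18: csi-snapshotter is already deployed when the cluster is created, so no extra deployment is needed.
- ACK cluster version < 1.18: deploy it by following https://developer.aliyun.com/article/757325.
Deploy the storage-auto-snapshotter plugin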
Save the following manifest as deployment.yaml and create the Deployment with kubectl apply -f deployment.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: storage-auto-snapshotter
      namespace: kube-system
      labels:
        app: storage-auto-snapshotter
    spec:
      selector:
        matchLabels:
          app: storage-auto-snapshotter
      template:
        metadata:
          labels:
            app: storage-auto-snapshotter
        spec:
          tolerations:
            - operator: "Exists"
          priorityClassName: system-node-critical
          serviceAccount: admin
          hostNetwork: true
          hostPID: true
          containers:
            - name: storage-auto-snapshotter
              image: registry.cn-beijing.aliyuncs.com/gyq193577/csi_auto_snapshotter:v1.16.6-9268802
              imagePullPolicy: Always
              env:
                - name: SNAPSHOT_CLASS
                  value: ""
              volumeMounts:
                - name: date-config
                  mountPath: /etc/localtime
          volumes:
            - name: date-config
              hostPath:
                path: /etc/localtime

Check that the plugin has started normally:

    $ kubectl get pods -n kube-system | grep storage-auto-snapshotter | grep Running

Deploy the VolumeSnapshotPolicy CRD
Save the following manifest as volumesnapshotcrd.yaml and create the CRD with kubectl create -f volumesnapshotcrd.yaml:

    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: volumesnapshotpolicies.storage.alibabacloud.com
    spec:
      group: storage.alibabacloud.com
      versions:
        - name: v1alpha1
          served: true
          storage: true
          schema:
            openAPIV3Schema:
              description: VolumeSnapshotPolicy is the Schema for the VolumeSnapshotPolicy API
              properties:
                apiVersion:
                  description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#resources'
                  type: string
                kind:
                  description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#types-kinds'
                  type: string
                metadata:
                  type: object
                spec:
                  description: VolumeSnapshotPolicySpec defines the desired Specification of VolumeSnapshotPolicy
                  properties:
                    retentionDays:
                      description: retentionDays is days to save snapshot
                      format: int64
                      type: integer
                    repeatWeekdays:
                      description: RepeatWeekdays is a list of days in a week to create disk snapshot
                      type: array
                      items:
                        type: string
                    timePoints:
                      description: TimePoints is a list of hours in a day to create disk snapshot
                      type: array
                      items:
                        type: string
                  type: object
              type: object
          subresources:
            status: {}
      scope: Cluster
      names:
        kind: VolumeSnapshotPolicy
        plural: volumesnapshotpolicies
        shortNames:
          - vsp

Check that the CRD was created correctly:

    $ kubectl get crd volumesnapshotpolicies.storage.alibabacloud.com

Add permissions
- Managed clusters (standard managed and ACK Pro) need no additional permissions.
- Dedicated ACK clusters need the following permissions added to the RAM worker role:

      {
        "Version": "1",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "ecs:DescribeInstances",
              "ecs:CreateAutoSnapshotPolicy",
              "ecs:DeleteAutoSnapshotPolicy",
              "ecs:DescribeSnapshots",
              "ecs:ApplyAutoSnapshotPolicy",
              "ecs:ModifyAutoSnapshotPolicy",
              "ecs:DescribeAutoSnapshotPolicyEX"
            ],
            "Resource": [
              "*"
            ],
            "Condition": {}
          }
        ]
      }

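
For a dedicated cluster, one way to double-check the worker role's policy document is to compare it against the seven actions listed above. The Python sketch below does this on a policy that has already been parsed from JSON. The function name `missing_actions` is hypothetical, `*` wildcards in actions are assumed to behave like shell-style patterns, and `Resource`, `Condition`, and Deny statements are deliberately ignored, so this is a rough check rather than a full RAM evaluation.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# The actions listed in the policy document above.
REQUIRED_ACTIONS = [
    "ecs:DescribeInstances",
    "ecs:CreateAutoSnapshotPolicy",
    "ecs:DeleteAutoSnapshotPolicy",
    "ecs:DescribeSnapshots",
    "ecs:ApplyAutoSnapshotPolicy",
    "ecs:ModifyAutoSnapshotPolicy",
    "ecs:DescribeAutoSnapshotPolicyEX",
]


def missing_actions(policy: Dict) -> List[str]:
    """Return the required actions that no Allow statement grants."""
    allowed: List[str] = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        # "Action" may be a single string or a list of strings.
        allowed.extend([actions] if isinstance(actions, str) else actions)
    return [
        action
        for action in REQUIRED_ACTIONS
        if not any(fnmatchcase(action, pattern) for pattern in allowed)
    ]


if __name__ == "__main__":
    # Hypothetical policy that only grants the read-only actions.
    read_only = {
        "Version": "1",
        "Statement": [{"Effect": "Allow", "Action": "ecs:Describe*", "Resource": ["*"]}],
    }
    print(missing_actions(read_only))
```

An empty result means every listed action is granted by at least one Allow statement.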
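Using scheduled snapshots
After the storage-operator deployment starts, it checks whether any VolumeSnapshotPolicy exists in the current cluster. For each policy it finds, it checks whether that policy already exists in ECS: if it does, the policy is skipped; if it does not, the policy is created.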
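
The skip-or-create check described above can be sketched in Python as follows. This only illustrates the logic and is not the component's actual code: the function `reconcile`, the `create_in_ecs` callback, and matching a policy to ECS through a `policyId` annotation are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Set


def reconcile(
    policies: List[Dict],
    ecs_policy_ids: Set[str],
    create_in_ecs: Callable[[Dict], str],
) -> List[str]:
    """Create an ECS policy for each cluster policy that has none yet.

    Returns the names of the policies that were created.
    """
    created = []
    for policy in policies:
        annotations = policy.setdefault("annotations", {})
        if annotations.get("policyId") in ecs_policy_ids:
            continue  # already present in ECS: skip it
        # Not present in ECS: create it and remember the returned ID.
        policy_id = create_in_ecs(policy)
        annotations["policyId"] = policy_id
        ecs_policy_ids.add(policy_id)
        created.append(policy["name"])
    return created


if __name__ == "__main__":
    # Hypothetical data: one policy already exists in ECS, one does not.
    cluster = [
        {"name": "policy-a", "annotations": {"policyId": "sp-aaa"}},
        {"name": "policy-b"},
    ]
    print(reconcile(cluster, {"sp-aaa"}, lambda p: "sp-" + p["name"]))
```

Running the function a second time on the same data creates nothing, which is the property a periodic check like this needs.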
Create a VolumeSnapshotPolicy instance

    apiVersion: v1
    items:
      - apiVersion: storage.alibabacloud.com/v1alpha1
        kind: VolumeSnapshotPolicy
        metadata:
          name: volumesnapshotpolicy1
        spec:
          retentionDays: 1
          repeatWeekdays: ["1", "2"]
          timePoints: ["11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"]
    kind: List
    metadata:
      resourceVersion: ""
      selfLink: ""

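This instance represents an automatic snapshot policy in ECS. When you create such an instance in Kubernetes, the system automatically creates the matching automatic snapshot policy in your ECS service. The spec fields have the following meanings:

Field | Meaning |
---|---|
retentionDays | Number of days an automatic snapshot is retained; -1 keeps it permanently |
repeatWeekdays | Days of the week on which snapshots are created automatically |
timePoints | Hours of the day at which snapshots are created automatically |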
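
Before applying a policy it can help to sanity-check the three spec fields. The Python sketch below is one way to do that. The value ranges it enforces (weekdays "1" to "7", hours "0" to "23", retention of -1 or 1 to 65536 days) are assumptions taken from the ECS automatic snapshot policy API, not something stated in this article, and `validate_policy_spec` is a hypothetical helper name.

```python
from typing import Dict, List


def validate_policy_spec(spec: Dict) -> List[str]:
    """Return a list of problems found in a VolumeSnapshotPolicy spec."""
    errors = []

    days = spec.get("retentionDays")
    is_int = isinstance(days, int) and not isinstance(days, bool)
    # Assumed range: -1 (keep forever) or 1..65536 days.
    if not is_int or not (days == -1 or 1 <= days <= 65536):
        errors.append("retentionDays must be -1 or an integer in 1..65536")

    # The CRD declares both lists as arrays of strings.
    for field, low, high in (("repeatWeekdays", 1, 7), ("timePoints", 0, 23)):
        values = spec.get(field)
        if not isinstance(values, list) or not values:
            errors.append(f"{field} must be a non-empty list of strings")
            continue
        for value in values:
            valid = (
                isinstance(value, str)
                and value.isdecimal()
                and low <= int(value) <= high
            )
            if not valid:
                errors.append(f"{field}: {value!r} is not a string in {low}..{high}")
    return errors


if __name__ == "__main__":
    spec = {
        "retentionDays": 1,
        "repeatWeekdays": ["1", "2"],
        "timePoints": ["11", "12", "13"],
    }
    print(validate_policy_spec(spec) or "spec looks valid")
```

Note that the list items are checked as strings because the CRD schema above declares them with `type: string`.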
Create a PVC/PV and set the automatic snapshot policy on the PVC
Associate the PVC with a snapshot policy by setting an annotation on it:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: csi-pvc-snapshot-policy
      annotations:
        # Associates this PVC with the volumesnapshotpolicy created in the previous step.
        policy.volumesnapshot.csi.alibabacloud.com: volumesnapshotpolicy1
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 25Gi
      selector:
        matchLabels:
          alicloud-pvname: static-disk-pv-snapshot-policy
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: csi-pv-snapshot-policy
      labels:
        alicloud-pvname: static-disk-pv-snapshot-policy
    spec:
      capacity:
        storage: 25Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      csi:
        driver: diskplugin.csi.alibabacloud.com
        volumeHandle: <your-disk-id>

Note that at this point storage-auto-snapshot does not yet associate the cloud disk bound to the PVC with the policy. The disk holds no data yet, so creating snapshots would only incur unnecessary cost.
Create a pod that uses this PVC/PV
The automatic snapshot policy only takes effect after the cloud disk has been mounted by a pod:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web-policy
    spec:
      selector:
        matchLabels:
          app: nginx-policy
      serviceName: "nginx"
      template:
        metadata:
          labels:
            app: nginx-policy
        spec:
          containers:
            - name: nginx-policy
              image: nginx
              ports:
                - containerPort: 80
                  name: web
              volumeMounts:
                - name: pvc-disk
                  mountPath: /data
          volumes:
            - name: pvc-disk
              persistentVolumeClaim:
                claimName: csi-pvc-snapshot-policy

Once the pod has started, storage-operator automatically associates the diskId of the PV with the VolumeSnapshotPolicy and creates snapshots according to the policy.
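
The rule that a disk is only associated once a pod mounts its PVC can be modelled with a small Python sketch. It works on simplified dictionaries rather than real API objects and ignores namespaces. The function `disks_to_associate` and the dictionary shapes are assumptions for illustration, not the component's implementation.

```python
from typing import Dict, List, Tuple

POLICY_ANNOTATION = "policy.volumesnapshot.csi.alibabacloud.com"


def disks_to_associate(
    pvcs: List[Dict], pvs: List[Dict], pods: List[Dict]
) -> List[Tuple[str, str]]:
    """Return (disk ID, policy name) pairs for annotated PVCs that a pod mounts."""
    mounted_claims = {claim for pod in pods for claim in pod.get("claims", [])}
    disk_by_pv = {pv["name"]: pv["volumeHandle"] for pv in pvs}
    pairs = []
    for pvc in pvcs:
        policy = pvc.get("annotations", {}).get(POLICY_ANNOTATION)
        if policy is None:
            continue  # no snapshot policy requested for this PVC
        if pvc["name"] not in mounted_claims:
            continue  # not mounted by any pod yet: nothing to associate
        disk_id = disk_by_pv.get(pvc.get("volumeName"))
        if disk_id is not None:
            pairs.append((disk_id, policy))
    return pairs


if __name__ == "__main__":
    pvcs = [{
        "name": "csi-pvc-snapshot-policy",
        "annotations": {POLICY_ANNOTATION: "volumesnapshotpolicy1"},
        "volumeName": "csi-pv-snapshot-policy",
    }]
    # "d-example" is a placeholder disk ID.
    pvs = [{"name": "csi-pv-snapshot-policy", "volumeHandle": "d-example"}]
    print(disks_to_associate(pvcs, pvs, pods=[]))
    print(disks_to_associate(pvcs, pvs, pods=[{"claims": ["csi-pvc-snapshot-policy"]}]))
```

With no pods the result is empty, and the disk only appears once a pod references the claim.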
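Verify that the automatic snapshot policy has taken effect
- Log in to the ECS console.
- Open the Storage & Snapshots page.
- Open the Automatic Snapshot Policies page.
- Check that the snapshot policy is associated with the expected cloud disk.
Modify the scheduled snapshot policy
Note
- Modifying a scheduled snapshot policy affects every cloud disk associated with it, so change it with care.
- Do not modify the policy in the ECS console; make all changes through the CRD.
- Change the snapshot policy by editing the volumesnapshotpolicy resource: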
```
$ kubectl edit volumesnapshotpolicy volumesnapshotpolicy1
```

    apiVersion: v1
    items:
    - apiVersion: storage.alibabacloud.com/v1alpha1
      kind: VolumeSnapshotPolicy
      metadata:
        annotations:
          kubectl.kubernetes.io/last-applied-configuration: |
            {"apiVersion":"storage.alibabacloud.com/v1alpha1","kind":"VolumeSnapshotPolicy","metadata":{"annotations":{},"name":"volumesnapshotpolicy1"},"spec":{"repeatWeekdays":["1","2"],"retentionDays":1,"timePoints":["11","12","13","14","15","16","17","18","19","20","21","22","23"]}}
          policyId: sp-uf6ahkkav6016ondbiyk
        creationTimestamp: "2021-01-05T11:13:52Z"
        generation: 2
        managedFields:
        - apiVersion: storage.alibabacloud.com/v1alpha1
          fieldsType: FieldsV1
          fieldsV1:
            f:metadata:
              f:annotations:
                .: {}
                f:kubectl.kubernetes.io/last-applied-configuration: {}
            f:spec:
              .: {}
              f:retentionDays: {}
              f:timePoints: {}
          manager: kubectl-client-side-apply
          operation: Update
          time: "2021-01-05T11:13:52Z"
        - apiVersion: storage.alibabacloud.com/v1alpha1
          fieldsType: FieldsV1
          fieldsV1:
            f:spec:
              f:repeatWeekdays: {}
          manager: kubectl-edit
          operation: Update
          time: "2021-01-06T07:16:59Z"
        - apiVersion: storage.alibabacloud.com/v1alpha1
          fieldsType: FieldsV1
          fieldsV1:
            f:metadata:
              f:annotations:
                f:policyId: {}
          manager: operator
          operation: Update
          time: "2021-01-06T07:16:59Z"
        name: volumesnapshotpolicy1
        resourceVersion: "339669"
        selfLink: /apis/storage.alibabacloud.com/v1alpha1/volumesnapshotpolicies/volumesnapshotpolicy1
        uid: 02257b4a-28e6-46f4-a767-81c0d117aba0
      spec:
        repeatWeekdays:
        - "1"
        - "2"
        - "3"
        - "4"
        retentionDays: 1
        timePoints:
        - "11"
        - "12"
        - "13"
        - "14"
        - "15"
        - "16"
        - "17"
        - "18"
        - "19"
        - "20"
        - "21"
        - "22"
        - "23"
    kind: List
    metadata:
      resourceVersion: ""
      selfLink: ""

- Check in the ECS console that the scheduled snapshot policy has been updated.
### Restore a disk from a snapshot created by the scheduled snapshot policy
---
- Once a scheduled snapshot policy is bound, the automatically created snapshots (volumesnapshot & volumesnapshotcontent) appear in the ACK cluster:
```
$ kubectl get volumesnapshot
NAME                  READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                            VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT           AGE
s-uf6221xxxxxxxxxxx   true         41943040      Delete           diskplugin.csi.alibabacloud.com   default-snapclass     s-uf622145z6iibqtlrbwi   7m40s
s-uf65y0zxxxxxxxxx    true         41943040      Delete           diskplugin.csi.alibabacloud.com   default-snapclass     s-uf65y0zrhwsd581q60mg   7m40s
s-uf6a83xxxxxxxxxx    true         41943040      Delete           diskplugin.csi.alibabacloud.com   default-snapclass     s-uf6a83009o5s2jgcch9f   7m40s
s-uf6fmpbyrxxxxxxx    true         41943040      Delete           diskplugin.csi.alibabacloud.com   default-snapclass     s-uf6fmpbyrlm10amjicha   7m40s
```
You can now use any of these volumesnapshot objects to restore a cloud disk:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web-restore
    spec:
      selector:
        matchLabels:
          app: nginx
      serviceName: "nginx"
      replicas: 1
      template:
        metadata:
          labels:
            app: nginx
        spec:
          hostNetwork: true
          containers:
            - name: nginx
              image: nginx
              command: ["sh", "-c"]
              args: ["sleep 10000"]
              volumeMounts:
                - name: disk-ssd
                  mountPath: /data
      volumeClaimTemplates:
        - metadata:
            name: disk-ssd
          spec:
            accessModes: [ "ReadWriteOnce" ]
            storageClassName: alicloud-disk-ssd
            resources:
              requests:
                storage: 20Gi
            dataSource:
              name: s-uf6221xxxxxxxxxxx
              kind: VolumeSnapshot
              apiGroup: snapshot.storage.k8s.io

- Once the pod has started, the data from the scheduled snapshot has been restored.
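
The same `dataSource` reference can also be used on a standalone PersistentVolumeClaim instead of a StatefulSet claim template. As a minimal sketch, the Python helper below builds such a claim for a chosen snapshot and prints it as JSON. The helper name `restore_pvc_manifest` and the claim name `disk-restore` are hypothetical; the snapshot name, storage class, and size are the ones used above.

```python
import json
from typing import Dict


def restore_pvc_manifest(name: str, snapshot: str, storage_class: str, size: str) -> Dict:
    """Build a PVC manifest whose data is populated from a VolumeSnapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
            # Points the new volume at the snapshot to restore from.
            "dataSource": {
                "name": snapshot,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
        },
    }


if __name__ == "__main__":
    manifest = restore_pvc_manifest(
        name="disk-restore",  # hypothetical claim name
        snapshot="s-uf6221xxxxxxxxxxx",
        storage_class="alicloud-disk-ssd",
        size="20Gi",
    )
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON manifests, the printed output can be saved to a file or piped to `kubectl apply -f -`.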