Resync in Kubernetes informers

Preface

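I originally thought that resync, in the informer used by the Kubernetes controller pattern, meant the controller periodically synchronizing with the api-server to keep data consistent. It turns out that is not the case. Let's go through it point by point. All quotations in this post come from the book Programming Kubernetes.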

A controller does not need to sync with the api-server periodically to keep data consistent

See the following passage:

Programming Kubernetes, Chapter 3 (client-go), section "Informers and Caching":

The resync is purely in-memory and does not trigger a call to the server. This used to be different but was eventually changed because the error behavior of the watch mechanism had been improved enough to make relists unnecessary.

Informers also have advanced error behavior: when the long-running watch connection breaks down, they recover from it by trying another watch request, picking up the event stream without losing any events. If the outage is long, and the API server lost events because etcd purged them from its database before the new watch request was successful, the informer will relist all objects.

Next to relists, there is a configurable resync period for reconciliation between the in-memory cache and the business logic: the registered event handlers will be called for all objects each time this period has passed. Common values are in minutes (e.g., 10 or 30 minutes).
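The recovery behavior described in the quote can be sketched with a toy model. This is only an illustration under stated assumptions, not client-go code: `Server`, `Event`, `errGone`, `WatchFrom`, and `recoverWatch` are hypothetical names, and the "purged" condition is modelled as a simple lower bound on retained revisions.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a hypothetical change record with an increasing revision number.
type Event struct {
	Revision int
	Key      string
}

// errGone stands in for the server's answer when the requested revision
// has already been purged from its event history.
var errGone = errors.New("requested revision is no longer retained")

// Server is a toy event log that only retains events from `oldest` onward.
type Server struct {
	events []Event // retained events, ascending by revision
	oldest int     // oldest revision still retained
}

// WatchFrom returns every retained event newer than rev, or errGone if
// at least one event after rev has been purged.
func (s *Server) WatchFrom(rev int) ([]Event, error) {
	if rev+1 < s.oldest {
		return nil, errGone
	}
	var out []Event
	for _, e := range s.events {
		if e.Revision > rev {
			out = append(out, e)
		}
	}
	return out, nil
}

// recoverWatch first tries to resume the event stream after lastSeen.
// If the history is gone it reports "relist", meaning the caller would
// have to fetch all objects again instead of replaying events.
func recoverWatch(s *Server, lastSeen int) (string, []Event) {
	events, err := s.WatchFrom(lastSeen)
	if errors.Is(err, errGone) {
		return "relist", nil
	}
	return "resume", events
}

func main() {
	s := &Server{
		events: []Event{{5, "a"}, {6, "b"}, {7, "a"}, {8, "c"}},
		oldest: 5,
	}

	// Short outage: revisions 7 and 8 are still available, so the watch resumes.
	mode, events := recoverWatch(s, 6)
	fmt.Println(mode, len(events))

	// Long outage: revisions 3 and 4 were purged, so a relist is needed.
	mode, events = recoverWatch(s, 2)
	fmt.Println(mode, len(events))
}
```

In this model a relist is needed only when part of the event history has been lost; otherwise the stream continues from the last seen revision with nothing missed.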

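I remember reading in a blog post that Kubernetes controllers originally did sync with the api-server periodically to keep data consistent. Now that the watch mechanism has been improved, that periodic sync is no longer necessary.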

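As the passage above says, resync exists to reconcile the business logic with the in-memory cache (the state built from the most recent relist plus the watch events received since then).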
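To make "purely in-memory" concrete, here is a minimal sketch using only the standard library. It is a simplified model, not the client-go implementation: `Store`, `Relist`, `Resync`, and the `serverCalls` counter are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// Object is a hypothetical stand-in for a Kubernetes object; only the
// fields needed for the illustration are modelled.
type Object struct {
	Name            string
	ResourceVersion string
}

// Store is a toy in-memory cache playing the role of the informer's
// local store.
type Store struct {
	objects     map[string]Object
	serverCalls int // number of simulated API server requests
}

// Relist simulates fetching the full object list from the API server
// and replacing the cache with it.
func (s *Store) Relist(fromServer []Object) {
	s.serverCalls++
	s.objects = make(map[string]Object, len(fromServer))
	for _, o := range fromServer {
		s.objects[o.Name] = o
	}
}

// Resync replays every cached object to the handler as an update whose
// old and new objects are identical. It reads only the cache and never
// contacts the server. It returns the number of events delivered.
func (s *Store) Resync(onUpdate func(oldObj, newObj Object)) int {
	delivered := 0
	for _, o := range s.objects {
		onUpdate(o, o)
		delivered++
	}
	return delivered
}

func main() {
	s := &Store{}
	s.Relist([]Object{
		{Name: "web", ResourceVersion: "101"},
		{Name: "db", ResourceVersion: "205"},
	})

	delivered := s.Resync(func(oldObj, newObj Object) {
		fmt.Printf("update for %s: rv %s -> %s\n",
			newObj.Name, oldObj.ResourceVersion, newObj.ResourceVersion)
	})
	fmt.Println("events delivered:", delivered, "server calls:", s.serverCalls)
}
```

The server call counter stays at one (the initial relist) no matter how often `Resync` runs, which is the point of the quoted passage: the handlers are invoked for every cached object, but nothing is fetched again.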

If resync is meant to reconcile the business logic with the in-memory cache, why do some controllers compare ResourceVersion?

The resync interval of 30 seconds in this example leads to a complete set of events being sent to the registered UpdateFunc such that the controller logic is able to reconcile its state with that of the API server. By comparing the ObjectMeta.resourceVersion field, it is possible to distinguish a real update from a resync.

I would think that, to reconcile the business logic with the in-memory cache, every event should be put on the workqueue. In practice, though, some controllers compare ResourceVersion first. See the issue I filed: in the Kubernetes source, NewDeploymentController at line 101 of pkg/controller/deployment/deployment_controller.go takes a DeploymentInformer and a ReplicaSetInformer. The UpdateFunc handlers registered through AddEventHandler on these two informers are dc.updateDeployment and dc.updateReplicaSet. Why does updateReplicaSet compare ResourceVersion while updateDeployment does not?
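The difference between the two handler styles can be sketched as follows. This is a hedged illustration, not the code from deployment_controller.go: `Object`, `Queue`, `isResync`, `updateFiltered`, and `updateUnfiltered` are hypothetical names, and the queue is a plain slice rather than a real workqueue.

```go
package main

import "fmt"

// Object is a hypothetical stand-in for a Kubernetes object.
type Object struct {
	Name            string
	ResourceVersion string
}

// Queue is a toy stand-in for a workqueue: it just records the keys added.
type Queue struct{ keys []string }

// Add appends a key to the queue.
func (q *Queue) Add(key string) { q.keys = append(q.keys, key) }

// isResync reports whether an update carries no real change, using the
// rule from the quoted passage: equal resourceVersion means resync.
func isResync(oldObj, newObj Object) bool {
	return oldObj.ResourceVersion == newObj.ResourceVersion
}

// updateFiltered ignores resync events and enqueues only real updates.
func updateFiltered(q *Queue, oldObj, newObj Object) {
	if isResync(oldObj, newObj) {
		return
	}
	q.Add(newObj.Name)
}

// updateUnfiltered enqueues every update, so each resync period causes
// every cached object to be reconciled again.
func updateUnfiltered(q *Queue, oldObj, newObj Object) {
	q.Add(newObj.Name)
}

func main() {
	filtered, unfiltered := &Queue{}, &Queue{}
	updates := [][2]Object{
		// Resync: old and new share the same resourceVersion.
		{{Name: "rs-1", ResourceVersion: "10"}, {Name: "rs-1", ResourceVersion: "10"}},
		// Real update: the resourceVersion changed.
		{{Name: "rs-2", ResourceVersion: "11"}, {Name: "rs-2", ResourceVersion: "12"}},
	}
	for _, u := range updates {
		updateFiltered(filtered, u[0], u[1])
		updateUnfiltered(unfiltered, u[0], u[1])
	}
	fmt.Println("filtered:", filtered.keys)
	fmt.Println("unfiltered:", unfiltered.keys)
}
```

With the filter, a resync never reaches the queue; without it, the resync period effectively becomes a periodic re-reconciliation of every object. Which behavior a given controller wants is exactly the question this post leaves open.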

I still don't fully understand how resync reconciles the business logic. I'll dig into it further and update this post once I've figured it out.

Update, 2020-12-08

I asked the expert Brendan Burns, and he replied by email as follows:

Hello,

I'm not super familiar with that code since I wasn't involved in writting it, but my guess from looking at it is that it is an oversight.

In both cases it is possible that there will be an "update" where the resource version doesn't change, due to a re-list of the resources. Such an "update" is in fact a no-op and should be ignored. So I think it would be reasonable to add the same check to the updateDeployment code.

Hope that helps.

--brendan

His answer still did not resolve my confusion, so I also posted the question on discuss.kubernetes.io. Nobody has replied yet.

