Let's jump straight into some code.

func (po *PreOccupy) SetPodAnnotation(pod *v1.Pod) error {
   clientSet := po.Handle.ClientSet()
 
   if _, ok := pod.Annotations[OccupyExpireTime]; ok {
      pod.Annotations[OccupyExpireTime] = ""
   }
   if _, ok := pod.Annotations[OccupyKilled]; ok {
      pod.Annotations[OccupyKilled] = "false"
   }
   // pod may be a stale copy: if its ResourceVersion is outdated, Update fails with a Conflict error.
   _, err := clientSet.CoreV1().Pods(pod.Namespace).Update(context.TODO(), pod, metav1.UpdateOptions{})
   return err
}

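The code is easy to follow: get a client, then use it to update one object. Yet it fails at runtime. The log says, roughly, that the pod object cannot be modified and that you should retry or force the update. This is a Conflict error from the API server.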

Enter ResourceVersion

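ResourceVersion is how Kubernetes implements optimistic locking. Its value is controlled by the server and comes from the modification index that etcd keeps for the object (modifiedIndex). So can I just change the value myself and submit it? No.
Because the server owns ResourceVersion, an update whose ResourceVersion does not match the stored one is rejected by the API server, even if the client edited the value by hand.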
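To make the optimistic-locking idea concrete, here is a minimal in-memory sketch. It is not how the API server is implemented: Store, Object and ErrConflict are hypothetical names, and the version is an int only to keep things short (real resourceVersions are opaque strings). Two writers read the same version, the first update wins, and the second one is rejected because its version is stale.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrConflict stands in for the Conflict error the API server returns.
var ErrConflict = errors.New("conflict: the object has been modified")

// Object is a hypothetical resource. Real Kubernetes resourceVersions are
// opaque strings; an int is used here only to keep the sketch short.
type Object struct {
	ResourceVersion int
	Annotations     map[string]string
}

// clone returns a deep copy so callers never share the store's map.
func (o Object) clone() Object {
	c := Object{ResourceVersion: o.ResourceVersion, Annotations: map[string]string{}}
	for k, v := range o.Annotations {
		c.Annotations[k] = v
	}
	return c
}

// Store plays the role of the server: only it assigns versions.
type Store struct {
	mu      sync.Mutex
	current Object
}

// NewStore returns a store holding one object at version 1.
func NewStore() *Store {
	return &Store{current: Object{ResourceVersion: 1, Annotations: map[string]string{}}}
}

// Get returns a copy of the current object, including its version.
func (s *Store) Get() Object {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.current.clone()
}

// Update accepts obj only if it was read at the current version,
// then bumps the version on the server side.
func (s *Store) Update(obj Object) (Object, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if obj.ResourceVersion != s.current.ResourceVersion {
		return Object{}, ErrConflict
	}
	next := obj.clone()
	next.ResourceVersion = s.current.ResourceVersion + 1
	s.current = next
	return next.clone(), nil
}

func main() {
	store := NewStore()
	stale := store.Get() // read by writer A
	fresh := store.Get() // read by writer B at the same version

	fresh.Annotations["owner"] = "b"
	_, err := store.Update(fresh)
	fmt.Println("writer B:", err)

	stale.Annotations["owner"] = "a"
	_, err = store.Update(stale) // version is now outdated
	fmt.Println("writer A:", err)
}
```

Writer A never touched the version field, yet its update still fails: the check compares what the writer read against what the store holds now.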

Improved code

func (po *PreOccupy) SetCacheNodeName(pod *v1.Pod, nodeName string) error {
   clientSet := po.Handle.ClientSet()
   return retry.RetryOnConflict(retry.DefaultRetry, func() error { // rerun the closure on Conflict
      if _, ok := pod.Annotations[OccupyExpireTime]; ok {
         if _, ok := pod.Annotations[OccupyNodeName]; ok {
            newPod, err := clientSet.CoreV1().Pods(pod.Namespace).Get(context.TODO(), pod.Name, metav1.GetOptions{})
            if err != nil {
               return err
            }
            newPod.Annotations[OccupyNodeName] = nodeName // newPod carries the ResourceVersion just read
            _, err = clientSet.CoreV1().Pods(newPod.Namespace).Update(context.TODO(), newPod, metav1.UpdateOptions{})
            // Return the Update error unmodified so RetryOnConflict can recognize a Conflict.
            return err
         }
      }

      return nil
   })
}

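Nothing in the code above sets ResourceVersion explicitly. Look at the Update call: it submits newPod, the copy just returned by Get, rather than the pod that was passed in. newPod already carries the latest ResourceVersion, so this amounts to submitting the latest pod data in full.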

I have to say, the community designed this really cleverly.

RetryOnConflict

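The retry package covers the case where you worry that an operation might fail: it reruns the operation for you. RetryOnConflict is the variant meant for updates, and it retries only when the error is a Conflict. That design is clever too.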

Explanation from the source code

RetryOnConflict is used to make an update to a resource when you have to worry about
conflicts caused by other code making unrelated updates to the resource at the same
time. fn should fetch the resource to be modified, make appropriate changes to it, try
to update it, and return (unmodified) the error from the update function. On a
successful update, RetryOnConflict will return nil. If the update function returns a
“Conflict” error, RetryOnConflict will wait some amount of time as described by
backoff, and then try again. On a non-“Conflict” error, or if it retries too many times
and gives up, RetryOnConflict will return an error to the caller.

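With that, the problem is solved. This pattern is mostly used when a client actively modifies Kubernetes resources, and it also works when several writers update the same object concurrently.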
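For the concurrent case, a small sketch shows why re-reading inside the retry loop matters. It is an illustration under assumptions, not Kubernetes code: counter, increment and runWorkers are hypothetical names, and the loop retries without a limit or backoff purely to keep the demo short (real code should bound its retries). Several goroutines each add one to a versioned counter, and no increment is lost.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errConflict = errors.New("conflict")

// counter is a hypothetical versioned resource, standing in for an
// object stored behind the API server.
type counter struct {
	mu      sync.Mutex
	version int
	value   int
}

// get returns the current version together with the value.
func (c *counter) get() (version, value int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.version, c.value
}

// update stores value only if version matches the current one.
func (c *counter) update(version, value int) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if version != c.version {
		return errConflict
	}
	c.version++
	c.value = value
	return nil
}

// increment repeats get-modify-update until the update is accepted.
// Each attempt re-reads the counter, so it always works on fresh data.
func increment(c *counter) {
	for {
		version, value := c.get()
		if err := c.update(version, value+1); err == nil {
			return
		}
	}
}

// runWorkers starts n goroutines that each increment the counter once
// and returns the final value.
func runWorkers(n int) int {
	c := &counter{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			increment(c)
		}()
	}
	wg.Wait()
	_, value := c.get()
	return value
}

func main() {
	fmt.Println(runWorkers(8)) // prints 8
}
```

If increment read the counter once outside the loop and kept resubmitting that copy, every retry would carry the same stale version and would conflict forever.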
