Summary of assorted k8s problems
k8s environment issues
Image pull failure, part 1
The kubelet log on the affected node shows:
Get https://registry-1.docker.io/v2/: x509: certificate signed by unknown authority
Pulling manually:
docker pull natsio/nats-server-config-reloader:0.6.2
fails with the same message.
The root cause appears to be a failure to load the CA certificate. After kubelet restarted for unknown reasons (spotted via journalctl -u kubelet --since "1 day ago" | grep "Stopping kubelet"), the problem went away, presumably because kubelet reloads the root certificates on startup. Judging by the logs, the file involved is probably /etc/kubernetes/pki/ca.crt.
The second time we hit this, someone had been poking around in /etc/kubernetes/pki/, which confirms it really was /etc/kubernetes/pki/ca.crt that got broken.
The certificates can be restored with kubeadm init phase certs all --apiserver-advertise-address=192.168.x.x (this regenerates the master's certificates and affects every node in the cluster; not recommended if you don't want to touch the other nodes).
A lighter fix is to try systemctl restart kubelet first; if that doesn't work, re-join the node with kubeadm join ...
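Before regenerating anything, it is worth sanity-checking the CA file itself on the suspect node; this is a minimal sketch of my own, not a step from the original investigation:
# Confirm the file still parses as a certificate and has not expired
openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -subject -dates
# Compare its checksum against the same file on a known-good node
md5sum /etc/kubernetes/pki/ca.crt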
Image pull failure, part 2
The kubelet log shows image pulls failing with:
error unmarshalling content: invalid character '<' looking for beginning of value
A manual docker pull prints the same thing; the stray '<' means the client received HTML where it expected JSON.
curl confirms something is off with TLS:
curl https://curl.haxx.se/ca/cacert.pem
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
Downloading the CA bundle with wget to look closer:
wget https://curl.haxx.se/ca/cacert.pem --no-check-certificate
The resulting pem is not a certificate at all: cat cacert.pem shows an HTML page for a network-admission portal.
Corporate IT was intercepting the traffic and returning its own page telling users to install the admission-control software.
IT really did us in here: we spent two days chasing this as a CA certificate problem.
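In hindsight, a quick way to tell a transparent TLS-intercepting proxy apart from a genuine CA problem is to look at the certificate the server actually presents; this check is my own suggestion, not part of the original debugging:
# Print the issuer of the certificate presented for the registry;
# a corporate proxy CA here instead of a public CA means the
# connection is being intercepted
openssl s_client -connect registry-1.docker.io:443 -servername registry-1.docker.io </dev/null 2>/dev/null | openssl x509 -noout -issuer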
A pod on one node reports Init:CrashLoopBackOff
Jun 01 01:26:45 globalrsyncid-2 kubelet[12340]: I0601 01:26:45.912330 12340 topology_manager.go:221] [topologymanager] RemoveContainer - Container ID: 8c49c81e6a05a7d5d59b712b0b20ff648f661641e285ccb67ab8e2b413ed8928
Jun 01 01:26:45 globalrsyncid-2 kubelet[12340]: I0601 01:26:45.924174 12340 topology_manager.go:221] [topologymanager] RemoveContainer - Container ID: 8c49c81e6a05a7d5d59b712b0b20ff648f661641e285ccb67ab8e2b413ed8928
Jun 01 01:26:45 globalrsyncid-2 kubelet[12340]: E0601 01:26:45.927884 12340 remote_runtime.go:291] RemoveContainer "8c49c81e6a05a7d5d59b712b0b20ff648f661641e285ccb67ab8e2b413ed8928" from runtime service failed: rpc error: code = Unknown desc = failed to remove container "8c49c81e6a05a7d5d59b712b0b20ff648f661641e285ccb67ab8e2b413ed8928": Error response from daemon: removal of container 8c49c81e6a05a7d5d59b712b0b20ff648f661641e285ccb67ab8e2b413ed8928 is already in progress
Jun 01 01:26:45 globalrsyncid-2 kubelet[12340]: E0601 01:26:45.927938 12340 kuberuntime_container.go:704] failed to remove pod init container "istio-init": rpc error: code = Unknown desc = failed to remove container "8c49c81e6a05a7d5d59b712b0b20ff648f661641e285ccb67ab8e2b413ed8928": Error response from daemon: removal of container 8c49c81e6a05a7d5d59b712b0b20ff648f661641e285ccb67ab8e2b413ed8928 is already in progress; Skipping pod "ssi-54fb84ddc8-585fc_synccontroller(9fbdce45-d0f1-457f-95a7-a06fa21ffe3b)"
Jun 01 01:26:45 globalrsyncid-2 kubelet[12340]: E0601 01:26:45.928328 12340 pod_workers.go:191] Error syncing pod 9fbdce45-d0f1-457f-95a7-a06fa21ffe3b ("ssi-54fb84ddc8-585fc_synccontroller(9fbdce45-d0f1-457f-95a7-a06fa21ffe3b)"), skipping: failed to "StartContainer" for "istio-init" with CrashLoopBackOff: "back-off 5m0s restarting failed container=istio-init pod=ssi-54fb84ddc8-585fc_synccontroller(9fbdce45-d0f1-457f-95a7-a06fa21ffe3b)"
kubectl logs podname -n xxx istio-proxy showed errors, and our guess was that the pod's iptables rules had been corrupted.
Since this namespace doesn't need istio anyway, injection can simply be disabled:
kubectl label namespace synccontroller istio-injection=disabled --overwrite=true
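Note that relabeling only affects pods created afterwards; pods that already exist keep their istio-init container until they are recreated. A minimal sketch, assuming kubectl 1.15+ (restarting every deployment in the namespace may be heavier than you need):
# Recreate the pods so they come back up without the istio containers
kubectl rollout restart deployment -n synccontroller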
etcd helm anti-affinity
antiAffinity: true
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: etcd_cluster
          operator: In
          values:
          - etcd-cluster
      topologyKey: kubernetes.io/hostname
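With requiredDuringSchedulingIgnoredDuringExecution, an etcd pod that cannot get a node to itself stays Pending rather than co-locating, so it's worth confirming the spread after deploying; the selector below assumes the etcd_cluster=etcd-cluster label used above:
# Each pod should show a different value in the NODE column
kubectl get pods -l etcd_cluster=etcd-cluster -o wide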