k8s and KubeSphere fail to recover after a reboot

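Problem description: today I installed KubeSphere in All-in-One mode. The installation succeeded, but after I shut the machine down and rebooted it, the cluster stopped working.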

kubectl get pods -n kube-system
The connection to the server lb.kubesphere.local:6443 was refused - did you specify the right host or port?

k8s and KubeSphere would not start. Docker itself started without problems, but none of its containers were running.

journalctl -fu kubelet
Jun 09 00:57:09 node1 kubelet[31921]: I0609 00:57:09.922299   31921 docker_service.go:263] "Docker Info" dockerInfo=&{ID:92a716e0-deb8-4383-9764-3dc01289e973 Containers:76 ContainersRunning:0 ContainersPaused:0 ContainersStopped:76 Images:32 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Using metacopy false] [Native Overlay Diff true] [userxattr false]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true KernelMemoryTCP:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:25 OomKillDisable:true NGoroutines:40 SystemTime:2024-06-09T00:57:09.910584434-07:00 LoggingDriver:json-file CgroupDriver:cgroupfs CgroupVersion:1 NEventsListener:0 KernelVersion:3.10.0-1160.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSVersion:7 OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc00067c150 NCPU:4 MemTotal:8076972032 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:node1 Labels:[] ExperimentalBuild:false ServerVersion:24.0.6 ClusterStore: ClusterAdvertise: Runtimes:map[io.containerd.runc.v2:{Path:runc Args:[] Shim:<nil>} runc:{Path:runc Args:[] Shim:<nil>}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil> Warnings:[]} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:7880925980b188f4c97b462f709d0db8e8962aff Expected:7880925980b188f4c97b462f709d0db8e8962aff} RuncCommit:{ID:v1.1.9-0-gccaecfc Expected:v1.1.9-0-gccaecfc} InitCommit:{ID:de40ad0 Expected:de40ad0} SecurityOptions:[name=seccomp,profile=builtin] ProductLicense:Community Engine 
DefaultAddressPools:[] Warnings:[]}
Jun 09 00:57:09 node1 kubelet[31921]: E0609 00:57:09.922396   31921 server.go:294] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
Jun 09 00:57:09 node1 systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
Jun 09 00:57:09 node1 systemd[1]: Unit kubelet.service entered failed state.
Jun 09 00:57:09 node1 systemd[1]: kubelet.service failed.

The key error message extracted from the log:

Failed to run kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

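This error means the kubelet is configured to use the systemd cgroup driver while Docker is using cgroupfs. Kubernetes and Docker must use the same cgroup driver to work properly. A mismatch can cause resource management and isolation problems, so the kubelet refuses to start.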
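Before changing anything, it can help to confirm which driver each side is actually using. The following is a minimal sketch, not a definitive procedure. It assumes the kubelet reads its configuration from /var/lib/kubelet/config.yaml (the usual kubeadm location, not confirmed by this post), and the function names are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Sketch: report the cgroup driver seen by Docker and by the kubelet.
# KUBELET_CONFIG is an ASSUMED path (the usual kubeadm default); adjust as needed.
KUBELET_CONFIG="${KUBELET_CONFIG:-/var/lib/kubelet/config.yaml}"

# Print the cgroupDriver value from a kubelet config file (empty if absent).
kubelet_cgroup_driver() {
  sed -n 's/^cgroupDriver:[[:space:]]*//p' "$1" 2>/dev/null | tr -d '"' | head -n 1
}

# Print the cgroup driver reported by the Docker daemon.
docker_cgroup_driver() {
  docker info --format '{{.CgroupDriver}}' 2>/dev/null
}

# Succeed only when both values are non-empty and identical.
drivers_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage:
#   drivers_match "$(docker_cgroup_driver)" "$(kubelet_cgroup_driver "$KUBELET_CONFIG")" \
#     && echo "cgroup drivers match" || echo "cgroup drivers differ"
```

If the kubelet's driver is set through a command-line flag rather than the config file, the second helper prints nothing and the comparison reports a difference, so treat an empty value as "unknown" rather than as proof of a mismatch.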

Method 1: Change Docker's cgroup driver

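You can change Docker's configuration so that it uses the same cgroup driver as the kubelet. This usually means editing Docker's daemon configuration file (for example /etc/docker/daemon.json) and adding or modifying the "exec-opts" setting to specify systemd as the cgroup driver. An example configuration: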

{
  "exec-opts": ["native.cgroupdriver=systemd"]
}

Then restart the Docker service to apply the change:

sudo systemctl daemon-reload
sudo systemctl restart docker
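The edit in Method 1 could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper name set_docker_cgroup_driver is hypothetical, and the function overwrites the target file, so it is only appropriate when daemon.json is missing or contains nothing you need to keep. If the file already holds other settings, merge the "exec-opts" key by hand instead.

```shell
#!/bin/sh
# Sketch: write a daemon.json that selects the systemd cgroup driver.
# WARNING: overwrites the target file; a backup is kept as <file>.bak.
set_docker_cgroup_driver() {
  conf="${1:-/etc/docker/daemon.json}"
  # Keep a copy of the current config so the change can be rolled back.
  if [ -f "$conf" ]; then
    cp -p "$conf" "$conf.bak"
  fi
  mkdir -p "$(dirname "$conf")"
  cat > "$conf" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Usage (as root), followed by the restart commands shown above:
#   set_docker_cgroup_driver /etc/docker/daemon.json
```

After Docker restarts, running docker info and checking the CgroupDriver field is a quick way to confirm the change took effect.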

Method 2: Change the kubelet's cgroup driver

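If you would rather leave Docker's settings unchanged, you can instead configure the kubelet to use the cgroupfs driver. This usually means editing the kubelet's startup arguments: in the kubelet's systemd drop-in file (for example /etc/systemd/system/kubelet.service.d/10-kubeadm.conf) or in the environment file it typically loads (/var/lib/kubelet/kubeadm-flags.env), find the line containing --cgroup-driver and change its value to cgroupfs. If no such line exists, you may need to add one manually, similar to the following: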

KUBELET_EXTRA_ARGS="--cgroup-driver=cgroupfs"

Afterwards, remember to restart the kubelet service:

sudo systemctl daemon-reload
sudo systemctl restart kubelet
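Since the setting can live in more than one place, a small search helper can show where the kubelet's cgroup driver is currently defined before you edit anything. This is a sketch: the function name is hypothetical, and /var/lib/kubelet/config.yaml is an assumed extra location (the usual kubeadm config file) that this post does not mention.

```shell
#!/bin/sh
# Sketch: list every line that sets the kubelet cgroup driver in the given files.
# Matches both the command-line flag form and the config-file key form.
find_cgroup_driver_settings() {
  grep -HnE 'cgroup-driver|cgroupDriver' "$@" 2>/dev/null
}

# Usage (as root); the third path is an assumption, not taken from this post:
#   find_cgroup_driver_settings \
#     /etc/systemd/system/kubelet.service.d/10-kubeadm.conf \
#     /var/lib/kubelet/kubeadm-flags.env \
#     /var/lib/kubelet/config.yaml
```

Each match is printed as file:line:content, which makes it easier to spot two places that set conflicting values.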

Once the services are back up, check the pods:

kubectl get pods -A
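With many pods in the list, it is easy to miss one that did not come back. The sketch below filters the listing down to pods that are neither Running nor Completed. The helper name unhealthy_pods is hypothetical, and it assumes the default column layout of kubectl get pods -A, where STATUS is the fourth column.

```shell
#!/bin/sh
# Sketch: read `kubectl get pods -A --no-headers` output on stdin and
# print "<namespace>/<pod> <status>" for pods that are not Running or Completed.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}

# Usage:
#   kubectl get pods -A --no-headers | unhealthy_pods
```

An empty result suggests every pod has recovered. Pods may need a few minutes after the kubelet starts before they all settle.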

Problem solved.
