k8s installation and troubleshooting notes
1 Install Docker
1.1 Add the yum repository
curl https://download.docker.com/linux/centos/docker-ce.repo -o /etc/yum.repos.d/docker-ce.repo
1.2 Install the containerd package
yum install https://download.docker.com/linux/fedora/30/x86_64/stable/Packages/containerd.io-1.2.6-3.3.fc30.x86_64.rpm
1.3 Install Docker
yum install docker-ce
systemctl start docker
systemctl enable docker
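A quick sanity check that the daemon came up:
systemctl status docker --no-pager
docker version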
1.4 Add the Aliyun registry mirror
cat <<EOF > /etc/docker/daemon.json
{
"registry-mirrors": [
"https://3laho3y3.mirror.aliyuncs.com"
]
}
EOF
systemctl restart docker
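To confirm the mirror took effect, docker info should list it under Registry Mirrors:
docker info | grep -A 1 "Registry Mirrors"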
2 Install Kubernetes
2.1 Pull the Kubernetes images
docker pull mirrorgooglecontainers/kube-apiserver:v1.15.0
docker pull mirrorgooglecontainers/kube-controller-manager:v1.15.0
docker pull mirrorgooglecontainers/kube-scheduler:v1.15.0
docker pull mirrorgooglecontainers/kube-proxy:v1.15.0
docker pull mirrorgooglecontainers/pause:3.1
docker pull mirrorgooglecontainers/etcd:3.3.10
docker pull coredns/coredns:1.3.1
docker tag docker.io/mirrorgooglecontainers/kube-apiserver:v1.15.0 k8s.gcr.io/kube-apiserver:v1.15.0
docker tag docker.io/mirrorgooglecontainers/kube-controller-manager:v1.15.0 k8s.gcr.io/kube-controller-manager:v1.15.0
docker tag docker.io/mirrorgooglecontainers/kube-scheduler:v1.15.0 k8s.gcr.io/kube-scheduler:v1.15.0
docker tag docker.io/mirrorgooglecontainers/kube-proxy:v1.15.0 k8s.gcr.io/kube-proxy:v1.15.0
docker tag docker.io/mirrorgooglecontainers/pause:3.1 k8s.gcr.io/pause:3.1
docker tag docker.io/mirrorgooglecontainers/etcd:3.3.10 k8s.gcr.io/etcd:3.3.10
docker tag docker.io/coredns/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1
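The pulls and retags above can also be scripted; a minimal sketch using the same images and versions:
for img in kube-apiserver:v1.15.0 kube-controller-manager:v1.15.0 \
           kube-scheduler:v1.15.0 kube-proxy:v1.15.0 pause:3.1 etcd:3.3.10; do
  docker pull mirrorgooglecontainers/$img                  # pull from the Docker Hub mirror
  docker tag  mirrorgooglecontainers/$img k8s.gcr.io/$img  # retag to the name kubeadm expects
done
docker pull coredns/coredns:1.3.1
docker tag  coredns/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1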
2.2 Install kubeadm, kubelet, and kubectl
Add the repository:
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Install:
yum -y install kubelet-1.15.0 kubeadm-1.15.0 kubectl-1.15.0
systemctl start kubelet
systemctl enable kubelet
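To verify that the pinned 1.15.0 versions were installed:
kubeadm version -o short      # should print v1.15.0
kubectl version --client --short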
2.3 Initialize the cluster
kubeadm init --kubernetes-version=v1.15.0 --pod-network-cidr=10.100.0.0/16 --ignore-preflight-errors='all'
[init] Using Kubernetes version: v1.15.0
[preflight] Running pre-flight checks
[WARNING NumCPU]: the number of available CPUs 1 is less than the required 2
[WARNING Port-6443]: Port 6443 is in use
[WARNING Port-10251]: Port 10251 is in use
[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[WARNING FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 19.03.11. Latest validated version: 18.09
[WARNING Hostname]: hostname "gauz" could not be reached
[WARNING Hostname]: hostname "gauz": lookup gauz on 100.100.2.136:53: no such host
[WARNING Port-10250]: Port 10250 is in use
[WARNING Port-2379]: Port 2379 is in use
[WARNING Port-2380]: Port 2380 is in use
[WARNING DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing ca certificate authority
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/scheduler.conf"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 0.012336 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.15" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node gauz as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node gauz as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: dixxvr.j43qbc7vp8p0j8ia
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.18.11.71:6443 --token dixxvr.j43qbc7vp8p0j8ia \
--discovery-token-ca-cert-hash sha256:1542ffe87114dfb4c764dc55c53c98bb6ef4e1511e14b7f672d1c82680c70be5
Set up the kubeconfig for kubectl:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
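With the kubeconfig in place, kubectl should now be able to reach the API server:
kubectl get nodes
kubectl get pods -n kube-system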
2.4 Check the running containers
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
71f15ae2e1cf d235b23c3570 "/usr/local/bin/kube…" 2 minutes ago Up 2 minutes k8s_kube-proxy_kube-proxy-q2tn9_kube-system_ca009125-94e7-4eb5-ac39-e735c434bbb3_0
55aa12592397 k8s.gcr.io/pause:3.1 "/pause" 2 minutes ago Up 2 minutes k8s_POD_kube-proxy-q2tn9_kube-system_ca009125-94e7-4eb5-ac39-e735c434bbb3_0
094a6f086ad2 8328bb49b652 "kube-controller-man…" 3 minutes ago Up 3 minutes k8s_kube-controller-manager_kube-controller-manager-gauz_kube-system_d5c660bbfe23fa080a4fd4de0b58cd5a_6
818325d7186d 2c4adeb21b4f "etcd --advertise-cl…" 3 minutes ago Up 3 minutes k8s_etcd_etcd-gauz_kube-system_80213ba6c7294a012c698bed95cfd1ec_0
31c801d0d3c3 201c7a840312 "kube-apiserver --ad…" 4 minutes ago Up 4 minutes k8s_kube-apiserver_kube-apiserver-gauz_kube-system_cc5422db7959b7b7d322007ee9e83b19_14
2a55791a815a k8s.gcr.io/pause:3.1 "/pause" 8 minutes ago Up 8 minutes k8s_POD_kube-controller-manager-gauz_kube-system_d5c660bbfe23fa080a4fd4de0b58cd5a_0
6b77398f7ac4 2d3813851e87 "kube-scheduler --bi…" 23 minutes ago Up 23 minutes k8s_kube-scheduler_kube-scheduler-gauz_kube-system_31d9ee8b7fb12e797dc981a8686f6b2b_0
9c35e45be1d2 k8s.gcr.io/pause:3.1 "/pause" 23 minutes ago Up 23 minutes k8s_POD_kube-scheduler-gauz_kube-system_31d9ee8b7fb12e797dc981a8686f6b2b_0
aa40dd695a5b k8s.gcr.io/pause:3.1 "/pause" 23 minutes ago Up 23 minutes k8s_POD_kube-apiserver-gauz_kube-system_cc5422db7959b7b7d322007ee9e83b19_0
dff3a24339da k8s.gcr.io/pause:3.1 "/pause" 23 minutes ago Up 23 minutes k8s_POD_etcd-gauz_kube-system_80213ba6c7294a012c698bed95cfd1ec_0
2.5 Test
kubectl run nginx --image=nginx --replicas=2 --port=80
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
deployment.apps/nginx created
kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-7c45b84548-dfp97 0/1 Pending 0 12s
nginx-7c45b84548-mbb59 0/1 Pending 0 12s
For security reasons, Kubernetes does not schedule Pods onto the master node by default. To also use k8s-master as a worker node, remove the taint:
kubectl taint node k8s-master node-role.kubernetes.io/master-
Here k8s-master is the hostname of the master node. To restore the master-only behaviour, re-add the taint:
kubectl taint node k8s-master node-role.kubernetes.io/master="":NoSchedule
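The taints currently applied to the node can be checked with:
kubectl describe node k8s-master | grep -i taints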
Check again:
kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-7c45b84548-dfp97 0/1 ContainerCreating 0 111s
nginx-7c45b84548-mbb59 0/1 ContainerCreating 0 111s
kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-7c45b84548-dfp97 1/1 Running 0 4m13s
nginx-7c45b84548-mbb59 1/1 Running 0 4m13s
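Note that the Ingress manifest in section 3.7 points at a Service named nginx, which kubectl run alone does not create; if it is missing, one way to add it is kubectl expose (the service name nginx here is chosen only to match that Ingress backend):
kubectl expose deployment nginx --port=80 --target-port=80
kubectl get svc nginx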
3 Deploy ingress-nginx
3.1 Download mandatory.yaml and replace the image sources
wget https://raw.githubusercontent.com/kubernetes/ingress-nginx/nginx-0.20.0/deploy/mandatory.yaml
sed -i 's#k8s.gcr.io/defaultbackend-amd64#registry.cn-qingdao.aliyuncs.com/kubernetes_xingej/defaultbackend-amd64#g' mandatory.yaml # replace the defaultbackend-amd64 image address
sed -i 's#quay.io/kubernetes-ingress-controller/nginx-ingress-controller#registry.cn-qingdao.aliyuncs.com/kubernetes_xingej/nginx-ingress-controller#g' mandatory.yaml # replace the nginx-ingress-controller image address
3.2 Deploy the nginx-ingress-controller
kubectl apply -f mandatory.yaml
3.3 Expose the controller via NodePort
kubectl apply -f service-nodeport.yaml
The contents of service-nodeport.yaml are as follows:
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  type: NodePort
  ports:
    - name: http
      port: 80
      targetPort: 80
      protocol: TCP
      nodePort: 32080   # http
    - name: https
      port: 443
      targetPort: 443
      protocol: TCP
      nodePort: 32443   # https
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
3.4 Check the service
kubectl get svc -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default-http-backend ClusterIP 10.96.7.58 <none> 80/TCP 79m
ingress-nginx NodePort 10.96.177.26 <none> 80:32080/TCP,443:32443/TCP 51m
3.5 Open the ports
iptables -I FORWARD -p tcp --sport 32080 -j ACCEPT
iptables -I FORWARD -p tcp --dport 32080 -j ACCEPT
iptables -I FORWARD -p tcp --sport 80 -j ACCEPT
iptables -I FORWARD -p tcp --dport 80 -j ACCEPT
3.6 Test
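With no Ingress rules defined yet, a request to the NodePort should reach the default backend and return a 404 (replace <node-ip> with the address of any node):
curl http://<node-ip>:32080
# default backend - 404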
3.7 Access via a domain name
apiVersion: extensions/v1beta1   # assumed: the v1beta1 Ingress schema used by Kubernetes 1.15
kind: Ingress
metadata:
  name: ingress-app
  namespace: default
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
    - host: service.gauz   # in production this domain should be resolvable from the public network
      http:
        paths:
          - path:
            backend:
              serviceName: nginx
              servicePort: 80
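Assuming the manifest above is saved as ingress-app.yaml (filename chosen here for illustration), apply it and check the rule:
kubectl apply -f ingress-app.yaml
kubectl get ingress ingress-app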
Test:
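One way to test is to send a request with a matching Host header to the NodePort; 172.18.11.71 is the master address taken from the kubeadm join command above and is used here only as an example:
curl -H "Host: service.gauz" http://172.18.11.71:32080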
4 Miscellaneous issues
4.1 Nodes stay NotReady
kubectl get nodes
NAME STATUS ROLES AGE VERSION
test1 NotReady master 13m v1.18.4
test2 NotReady <none> 8m45s v1.18.4
Check the kubelet logs:
journalctl -f -u kubelet
— Logs begin at Fri 2020-03-13 23:48:41 HKT. —
Jun 19 10:13:45 test2 kubelet[12770]: W0619 10:13:45.254218 12770 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Jun 19 10:13:46 test2 kubelet[12770]: E0619 10:13:46.598642 12770 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Jun 19 10:13:50 test2 kubelet[12770]: W0619 10:13:50.254465 12770 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Jun 19 10:13:51 test2 kubelet[12770]: E0619 10:13:51.599853 12770 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
(these two messages keep repeating every few seconds)

Install the flannel network plugin:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
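Once the flannel pods are running on every node, the nodes should change to Ready; progress can be watched with:
kubectl get pods -n kube-system -o wide | grep flannel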
Check the node status again:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
test1 Ready master 29m v1.18.4
test2 Ready <none> 24m v1.18.4
4.2 [ERROR Swap]: running with swap on is not supported. Please disable swap
swapoff -a
Edit /etc/fstab and comment out the automatic swap mount, then use free -m to confirm that swap is off. To tune the swappiness parameter, add the following line to /etc/sysctl.d/k8s.conf:
vm.swappiness=0
Run sysctl -p /etc/sysctl.d/k8s.conf to apply the change.
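For reference, the fstab edit and the swap check described above as commands (the sed pattern assumes a standard swap entry in /etc/fstab):
sed -ri '/\sswap\s/ s/^/#/' /etc/fstab   # comment out the automatic swap mount
free -m                                   # the Swap line should now show 0 total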
4.3 [ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
On a single-vCPU machine this check can be bypassed with --ignore-preflight-errors:
kubeadm init --kubernetes-version=v1.18.4 --pod-network-cidr=10.100.0.0/16 --ignore-preflight-errors="all"
Copyright notice: this is an original article by CSDN blogger "qauzy", licensed under CC 4.0 BY-SA; please include the original link and this notice when reposting.
Original link: https://blog.csdn.net/idwtwt/article/details/106845714