Preface

Installing k8s by following the official docs is a fairly painless affair if everything goes smoothly; as a shameless plug, here is an earlier walkthrough of mine on building a k8s cluster for production: https://nieoding-dis-doc.readthedocs.io . Recently a friend asked for help: under Vagrant the cluster refused to come up no matter what, failing with all sorts of strange errors. I gave it a try myself, and indeed the default setup does not work. After a long round of troubleshooting, everything turned out to hinge on one core problem:
The eth0 interface that Vagrant creates is reserved for its own SSH access, so every VM ends up with the same NAT address 10.0.2.15 on eth0. Kubernetes uses eth0 by default, which means the resulting "cluster" is built on identical node IPs; of course it cannot work.
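You can confirm the diagnosis on any VM booted from a default Vagrant box (a quick check; eth1 here refers to the private_network interface we will add later in this post):

# eth0 is Vagrant's own NAT interface and looks the same on every VM
ip -4 addr show eth0    # typically reports 10.0.2.15 everywhere
# eth1, once a private_network is configured, carries the per-VM IP that k8s should actually use
ip -4 addr show eth1
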
The fix is conceptually simple:
Plan A: stop Vagrant from putting eth0 behind NAT and give it a fixed IP instead
Plan B: make k8s use eth1
Practice tells the truth: Plan A is a dead end, because the NAT eth0 is how Vagrant itself is designed upstream and there is currently no way around it, so Plan B it is. Treat the rest of this post as a record of the pits I stepped into; if you use cloud VMs directly, you will never hit this problem.
The key points for implementing Plan B:

  1. On every node, force kubelet to use the IP assigned to eth1
  2. On the master, force kubeadm init to advertise the eth1 IP via apiserver-advertise-address
  3. Force the flannel component to initialize on the eth1 interface

Building the k8s base image

Initialize a virtual machine

# Based on centos/7
vagrant init centos/7
# Start the VM
vagrant up
# Log in to the VM
vagrant ssh
# Switch to root inside the VM before doing the rest of the work
sudo su

Preparation

# Disable SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
# Adjust kernel parameters
cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
# Stop and disable the firewall
service firewalld stop
systemctl disable firewalld
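
A quick optional check that the bridge parameters actually took effect (on a fresh box the br_netfilter module may not be loaded yet, in which case the keys do not exist until you load it):

# Both values should print "= 1"
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables
# If the keys are missing, load the module and re-apply
modprobe br_netfilter && sysctl --system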

Install docker-ce

yum install -y yum-utils \
  device-mapper-persistent-data \
  lvm2
yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce
systemctl enable docker
service docker start
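
Optionally confirm the Docker daemon is up before moving on:

# Should print the server version and the cgroup driver in use
docker info | grep -E 'Server Version|Cgroup Driver'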

Install kubeadm

cat <<EOF > /etc/yum.repos.d/k8s.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

cat >/etc/sysconfig/kubelet<<EOF
KUBELET_EXTRA_ARGS="--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1"
EOF
yum install -y kubelet kubeadm kubectl
systemctl enable kubelet
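
A quick check that the three tools are installed and on matching versions:

kubeadm version -o short
kubelet --version
kubectl version --client --short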

Package the box

Exit the virtual machine, then run the following on the host:

# Shut down the VM
vagrant halt
# Package the box
vagrant package --output k8s.box
# Add the box to the local box list; the rest of this post is based on it
vagrant box add k8s k8s.box
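
You can verify the box is now registered locally:

# The k8s box should appear in the list
vagrant box list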

Create two virtual machines

Role     IP
master   172.28.128.10
node     172.28.128.11
# Create the Vagrantfile
cat << EOF > Vagrantfile
Vagrant.configure("2") do |config|
  config.vm.box = "k8s"
  config.vm.provider "virtualbox" do |vb|
    vb.memory = "2048"
    vb.cpus = 2
  end
  config.vm.define :master do |cfg|
    cfg.vm.hostname = "master"
    cfg.vm.network :private_network, ip: "172.28.128.10"
  end
  config.vm.define :node do |cfg|
    cfg.vm.hostname = "node"
    cfg.vm.network :private_network, ip: "172.28.128.11"
  end
end
EOF
# Start both VMs
vagrant up master
vagrant up node
# In a new terminal window
vagrant ssh master
# In another terminal window
vagrant ssh node 
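
Before touching k8s, it is worth confirming that eth1 on each VM really got the address from the Vagrantfile (run inside each VM):

# On master this should show 172.28.128.10, on node 172.28.128.11
ip -4 addr show eth1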

Initialize k8s on the master using eth1

Start kubelet with the eth1 IP

# /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--node-ip=172.28.128.10 --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1"
# Restart kubelet
service kubelet restart
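
To make sure kubelet actually picked up the extra argument, a quick optional check:

# The running kubelet process should now carry --node-ip=172.28.128.10
ps -ef | grep kubelet | grep -- --node-ip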

Pre-pull the base images with kubeadm

kubeadm config images pull --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers
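
Once the pull finishes, the control-plane images should be cached locally:

# All kubeadm base images should show up under the aliyuncs mirror repository
docker images | grep registry.cn-hangzhou.aliyuncs.com/google_containers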

kubeadm init (force apiserver-advertise-address to the eth1 IP)

kubeadm init --pod-network-cidr 10.244.0.0/16 --apiserver-advertise-address 172.28.128.10 --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers

If everything goes well, you will see output like this:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.28.128.10:6443 --token 8k0umw.19b2qpmtwfzcqigw \
    --discovery-token-ca-cert-hash sha256:bf9d3c0702222d1ae693fc3d6bf114b8668601e48b3d3ec67062401208617d3a 

Every line of that output matters.

 mkdir -p $HOME/.kube
 sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
 sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run these three commands first, otherwise kubectl cannot talk to the cluster; the kubeadm join ... line is the command worker nodes use to join, so copy it down.

[root@master vagrant]# kubectl get nodes
NAME     STATUS     ROLES    AGE     VERSION
master   NotReady   master   4m52s   v1.15.0
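
By the way, if you forget to copy the join command, or the token expires (it is valid for 24 hours by default), you can generate a fresh one on the master at any time:

# Prints a ready-to-use "kubeadm join ..." command with a new token
kubeadm token create --print-join-command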

Join the node to k8s using eth1

Start kubelet with the eth1 IP

# /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--node-ip=172.28.128.11 --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1"
# Restart kubelet
service kubelet restart

Join the cluster with kubeadm

# Run the join command generated on the master
kubeadm join 172.28.128.10:6443 --token 8k0umw.19b2qpmtwfzcqigw \
    --discovery-token-ca-cert-hash sha256:bf9d3c0702222d1ae693fc3d6bf114b8668601e48b3d3ec67062401208617d3a

If everything goes well, you will see output like this:

[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Back on the master, check the nodes: INTERNAL-IP now shows exactly the addresses we want (the status is still NotReady because the network add-on has not been installed yet).

[root@master vagrant]# kubectl get nodes -o wide
NAME     STATUS     ROLES    AGE     VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION              CONTAINER-RUNTIME
master   NotReady   master   5m13s   v1.15.0   172.28.128.10   <none>        CentOS Linux 7 (Core)   3.10.0-957.5.1.el7.x86_64   docker://18.9.6
node     NotReady   <none>   2m33s   v1.15.0   172.28.128.11   <none>        CentOS Linux 7 (Core)   3.10.0-957.5.1.el7.x86_64   docker://18.9.6

Looking at the system pods, their IPs are correct as well:

[root@master vagrant]# kubectl get pods -n kube-system -o wide
NAME                             READY   STATUS    RESTARTS   AGE     IP              NODE     NOMINATED NODE   READINESS GATES
coredns-6967fb4995-dvmdq         0/1     Pending   0          10m     <none>          <none>   <none>           <none>
coredns-6967fb4995-zgpwh         0/1     Pending   0          10m     <none>          <none>   <none>           <none>
etcd-master                      1/1     Running   0          9m58s   172.28.128.10   master   <none>           <none>
kube-apiserver-master            1/1     Running   0          10m     172.28.128.10   master   <none>           <none>
kube-controller-manager-master   1/1     Running   0          9m54s   172.28.128.10   master   <none>           <none>
kube-proxy-drzzq                 1/1     Running   0          8m20s   172.28.128.11   node     <none>           <none>
kube-proxy-vlfvk                 1/1     Running   0          10m     172.28.128.10   master   <none>           <none>
kube-scheduler-master            1/1     Running   0          9m49s   172.28.128.10   master   <none>           <none>

One last task remains: installing the network add-on.

Install the flannel network add-on

Run this on the master:

curl https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml -o kube-flannel.yml

Edit kube-flannel.yml to force flannel to initialize on the eth1 interface (you only need to find and modify the amd64 section):

# kube-flannel.yml
...
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=eth1
...
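
If you prefer to patch the file non-interactively, a one-liner like the following should do it with GNU sed (note that it appends the flag after every --kube-subnet-mgr occurrence, i.e. in the DaemonSets for all architectures, which is harmless here; double-check the result before applying):

# Insert "- --iface=eth1" right after every "- --kube-subnet-mgr" argument line
sed -i 's/- --kube-subnet-mgr$/&\n        - --iface=eth1/' kube-flannel.yml
# Verify the change
grep -n -B1 -- '--iface=eth1' kube-flannel.yml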

Create flannel

kubectl apply -f kube-flannel.yml

Check the pods again: a kube-flannel-ds group has appeared, and coredns has moved from Pending to Running:

[root@master vagrant]# kubectl get pods -n kube-system -o wide
NAME                             READY   STATUS    RESTARTS   AGE   IP              NODE     NOMINATED NODE   READINESS GATES
coredns-6967fb4995-dvmdq         0/1     Running   0          14m   10.244.1.2      node     <none>           <none>
coredns-6967fb4995-zgpwh         0/1     Running   0          14m   10.244.0.2      master   <none>           <none>
etcd-master                      1/1     Running   0          13m   172.28.128.10   master   <none>           <none>
kube-apiserver-master            1/1     Running   0          13m   172.28.128.10   master   <none>           <none>
kube-controller-manager-master   1/1     Running   0          13m   172.28.128.10   master   <none>           <none>
kube-flannel-ds-amd64-q8phh      1/1     Running   0          33s   172.28.128.11   node     <none>           <none>
kube-flannel-ds-amd64-zv2rw      1/1     Running   0          33s   172.28.128.10   master   <none>           <none>
kube-proxy-drzzq                 1/1     Running   0          12m   172.28.128.11   node     <none>           <none>
kube-proxy-vlfvk                 1/1     Running   0          14m   172.28.128.10   master   <none>           <none>
kube-scheduler-master            1/1     Running   0          13m   172.28.128.10   master   <none>           <none>

Check the nodes: everything is now Ready:

[root@master vagrant]# kubectl get nodes -o wide       
NAME     STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION              CONTAINER-RUNTIME
master   Ready    master   17m   v1.15.0   172.28.128.10   <none>        CentOS Linux 7 (Core)   3.10.0-957.5.1.el7.x86_64   docker://18.9.6
node     Ready    <none>   14m   v1.15.0   172.28.128.11   <none>        CentOS Linux 7 (Core)   3.10.0-957.5.1.el7.x86_64   docker://18.9.6
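
As an optional smoke test (not part of the original walkthrough), you can schedule a throwaway deployment and check that its pod gets a 10.244.x.x address on the node:

# Deploy a test nginx, see where it lands, then clean up
kubectl create deployment nginx --image=nginx
kubectl get pods -o wide
kubectl delete deployment nginx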