EKS Bootcamp: Autoscaling (Part 4)



wzlinux · Published 2021-06-07 10:12:05

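Introduction

Autoscaling in EKS covers two areas:

Horizontal Pod autoscaling (HPA: Horizontal Pod Autoscaler), which adjusts the replica count of a workload through the Kubernetes API
Cluster node autoscaling (CA: Cluster Autoscaler), which adds or removes worker nodes at the cluster level

We will use kube-ops-view (deployed with Helm; see its official site) to visualize the scaling as it happens.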

Install Helm

1. Download and install Helm

mkdir -p ~/environment/helm && cd ~/environment/helm/
curl -sSL https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash

helm version --short

2. Configure the repository

helm repo add stable https://charts.helm.sh/stable
helm repo update

Install kube-ops-view

Here we pass --set service.type=LoadBalancer so that kube-ops-view is exposed through an ELB (which avoids having to forward a port with kubectl proxy or kubectl port-forward)

helm install kube-ops-view \
  stable/kube-ops-view \
  --set service.type=LoadBalancer \
  --set rbac.create=True

Check the installation

helm list

Output similar to the following means the deployment succeeded

NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
kube-ops-view   default         1               2021-05-21 15:33:24.092579976 +0800 CST deployed        kube-ops-view-1.2.4     20.4.0

Get the entry URL

kubectl get svc kube-ops-view | tail -n 1 | awk '{ print "Kube-ops-view URL = http://"$4 }'

The command returns something like

Kube-ops-view URL = http://ab370ee399c94476d995bcfbfb56ef05-718234254.eu-west-1.elb.amazonaws.com
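The hostname column can stay empty for a short while after the Service is created, in which case the tail/awk pipeline prints an incomplete URL. A minimal sketch of a polling helper, assuming the load balancer reports a hostname (as an ELB does) under .status.loadBalancer.ingress[0].hostname; the function name, attempt count, and delay are illustrative choices, not part of the original walkthrough:

```shell
# Poll a Service until its load balancer hostname is populated.
# Arguments: service name, max attempts (default 30), delay in seconds (default 10).
wait_for_lb_hostname() {
  local svc="$1" attempts="${2:-30}" delay="${3:-10}"
  local i=0 host
  while [ "$i" -lt "$attempts" ]; do
    # jsonpath prints an empty string while the hostname is not yet assigned
    host=$(kubectl get svc "$svc" \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null)
    if [ -n "$host" ]; then
      echo "$host"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for a hostname on service $svc" >&2
  return 1
}

# Example usage (requires cluster access):
#   echo "Kube-ops-view URL = http://$(wait_for_lb_hostname kube-ops-view)"
```

A populated hostname only means the ELB has been assigned; DNS propagation can still take a little longer.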

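Open the URL (because we are using an ELB, creating the ELB and propagating its DNS record takes some time, usually about 1-2 minutes). You will see a monitoring dashboard.

[Figure: annotated kube-ops-view screenshot explaining the elements of the dashboard]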

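Pod Scaling (HPA)

1. Preparation

Horizontal autoscaling depends on Metrics Server, which must be deployed first: the HPA can only act once it has metrics data.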

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

2. Deploy the test environment

We deploy an Apache server

kubectl create deployment php-apache --image=us.gcr.io/k8s-artifacts-prod/hpa-example
kubectl set resources deploy php-apache --requests=cpu=200m
kubectl expose deploy php-apache --port 80

kubectl get pod -l app=php-apache

Scale out when average CPU utilization exceeds 50%

kubectl autoscale deployment php-apache \
    --cpu-percent=50 \
    --min=1 \
    --max=10
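The same autoscaling policy can also be kept as a manifest under version control. A sketch that writes an equivalent HorizontalPodAutoscaler using the autoscaling/v1 API; the file name php-apache-hpa.yaml is an arbitrary choice for illustration:

```shell
# Write a declarative equivalent of the "kubectl autoscale" command above.
cat <<'EoF' > php-apache-hpa.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EoF

# Apply it instead of running "kubectl autoscale" (requires cluster access):
#   kubectl apply -f php-apache-hpa.yaml
```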

Check the status

kubectl get hpa

NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   <unknown>/50%   1         10        0          15s

After about 1-2 minutes, <unknown>/50% changes to 0%/50%.
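The percentage in the TARGETS column is measured against the CPU request of the pods (200m in this setup), which is why the request had to be set before creating the HPA. A small sketch of that arithmetic for a single pod; the function name and the usage figures are made up for illustration:

```shell
# CPU utilization as the HPA reports it: usage relative to the CPU request.
# Arguments: usage in millicores, request in millicores.
cpu_utilization_pct() {
  local usage_m="$1" request_m="$2"
  echo $(( usage_m * 100 / request_m ))
}

cpu_utilization_pct 100 200    # half of the request is in use
cpu_utilization_pct 1244 200   # usage far above the request
```

Because the request is only a scheduling reservation, utilization can exceed 100% as long as no lower CPU limit is set.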

3. Generate load manually

Start a pod to run commands from

kubectl run -i --tty load-generator --image=busybox -- /bin/sh

Inside the pod, start three background loops to generate sustained load

while true; do wget -q -O - http://php-apache; done &
while true; do wget -q -O - http://php-apache; done &
while true; do wget -q -O - http://php-apache; done &

Open another terminal to watch the effect of the load

kubectl get hpa -w

Once the load rises, the Deployment scales out step by step (to reach 10 replicas faster, start a few more load loops)

wangzan:~/environment $ kubectl get hpa -w
NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   622%/50%   1         10        1          2m41s
php-apache   Deployment/php-apache   622%/50%   1         10        4          2m49s
php-apache   Deployment/php-apache   622%/50%   1         10        8          3m4s
php-apache   Deployment/php-apache   622%/50%   1         10        10         3m20s
php-apache   Deployment/php-apache   125%/50%   1         10        10         3m35s
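The replica counts follow the rule documented for the HPA: desired replicas = ceil(current replicas × current utilization / target utilization), bounded by the configured minimum and maximum. A sketch of that calculation in integer shell arithmetic; it deliberately ignores the controller's tolerance band and its scale-up rate limiting, and the function name is illustrative:

```shell
# Desired replica count per the documented HPA formula, clamped to [min, max].
# Arguments: current replicas, current utilization %, target utilization %, min, max.
hpa_desired_replicas() {
  local current="$1" util="$2" target="$3" min="$4" max="$5"
  # Integer ceiling of current * util / target
  local desired=$(( (current * util + target - 1) / target ))
  [ "$desired" -lt "$min" ] && desired="$min"
  [ "$desired" -gt "$max" ] && desired="$max"
  echo "$desired"
}

hpa_desired_replicas 1 622 50 1 10    # raw result is 13, capped at the maximum of 10
hpa_desired_replicas 10 125 50 1 10   # still above target, stays at the maximum
```

The formula alone would jump straight to 10 replicas; the 1, 4, 8, 10 progression in the output is most likely the controller limiting how fast it scales up.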

4. Clean up

Remove the horizontal scaling test environment as follows

kubectl delete hpa,svc php-apache
kubectl delete deployment php-apache
kubectl delete pod load-generator

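Cluster Scaling (CA)

In EKS, Cluster Autoscaler is implemented on top of AWS Auto Scaling groups (ASGs) and supports the following setups

Single ASG
Multiple ASGs
Auto-discovery (see the Cluster Autoscaler documentation for AWS for details)
Control-plane node setup

1. Configure the Auto Scaling group

View the ASG that has already been deployed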

aws autoscaling \
    describe-auto-scaling-groups \
    --region eu-west-1 \
    --query "AutoScalingGroups[? Tags[? (Key=='eks:cluster-name') && Value=='my-cluster']].[AutoScalingGroupName, MinSize, MaxSize,DesiredCapacity]" \
    --output table

The result looks like this (by default the minimum, maximum, and desired instance counts are all 2)

-------------------------------------------------------------
|                 DescribeAutoScalingGroups                 |
+-------------------------------------------+----+----+-----+
|  eks-96bcc1d0-d5ac-a1c7-56be-d4ffdab08a88 |  2 |  2 |  2  |
+-------------------------------------------+----+----+-----+

Here we raise the maximum instance count to 5

aws autoscaling \
    update-auto-scaling-group \
    --auto-scaling-group-name eks-96bcc1d0-d5ac-a1c7-56be-d4ffdab08a88 \
    --min-size 2 \
    --desired-capacity 2 \
    --max-size 5 \
    --region eu-west-1
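The ASG name above was copied by hand from the previous output. A sketch of looking it up by the eks:cluster-name tag instead, assuming the cluster has a single managed node group so that the first match is the right one; the function name is illustrative:

```shell
# Print the name of the first ASG tagged with the given EKS cluster name.
# Arguments: cluster name, region.
asg_name_for_cluster() {
  local cluster="$1" region="$2"
  aws autoscaling describe-auto-scaling-groups \
    --region "$region" \
    --query "AutoScalingGroups[?Tags[?Key=='eks:cluster-name' && Value=='${cluster}']].AutoScalingGroupName | [0]" \
    --output text
}

# Example usage (requires AWS credentials):
#   ASG_NAME=$(asg_name_for_cluster my-cluster eu-west-1)
#   aws autoscaling update-auto-scaling-group \
#       --auto-scaling-group-name "$ASG_NAME" \
#       --min-size 2 --desired-capacity 2 --max-size 5 \
#       --region eu-west-1
```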

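2. Configure the IAM role

Here we use IAM Roles for Service Accounts; interested readers can refer to the official documentation

Enable IAM roles for service accounts by associating an OIDC provider with the cluster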

eksctl utils associate-iam-oidc-provider \
    --cluster my-cluster \
    --approve

Create the IAM policy

mkdir -p ~/environment/cluster-autoscaler && cd ~/environment/cluster-autoscaler

cat <<EoF > k8s-asg-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup",
                "ec2:DescribeLaunchTemplateVersions"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}
EoF

aws iam create-policy   \
  --policy-name k8s-asg-policy \
  --policy-document file://k8s-asg-policy.json

Associate the IAM role with the service account

eksctl create iamserviceaccount \
    --name cluster-autoscaler \
    --namespace kube-system \
    --cluster my-cluster \
    --attach-policy-arn "arn:aws:iam::921283538843:policy/k8s-asg-policy" \
    --approve \
    --override-existing-serviceaccounts \
    --region eu-west-1
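The policy ARN above embeds the author's AWS account ID, so it has to be changed for any other account. A sketch that derives the ARN from the current credentials, assuming the standard aws partition; the function name is illustrative:

```shell
# Build the ARN of a customer-managed IAM policy in the caller's own account.
# Argument: policy name.
policy_arn() {
  local name="$1" account
  # Ask STS which account the current credentials belong to
  account=$(aws sts get-caller-identity --query Account --output text) || return 1
  echo "arn:aws:iam::${account}:policy/${name}"
}

# Example usage (requires AWS credentials):
#   eksctl create iamserviceaccount \
#       --name cluster-autoscaler \
#       --namespace kube-system \
#       --cluster my-cluster \
#       --attach-policy-arn "$(policy_arn k8s-asg-policy)" \
#       --approve \
#       --override-existing-serviceaccounts \
#       --region eu-west-1
```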

Verify the result

kubectl -n kube-system describe sa cluster-autoscaler

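3. Deploy Cluster Autoscaler

Complete the following steps to deploy Cluster Autoscaler. We recommend reviewing the deployment considerations and tuning the Cluster Autoscaler deployment before rolling it out to a production cluster.

Deploy Cluster Autoscaler from the auto-discovery example manifest.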

kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

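Patch the Deployment with the following command so that the cluster-autoscaler.kubernetes.io/safe-to-evict annotation is added to the Cluster Autoscaler pods. The annotation has to go on the pod template; annotating the Deployment object itself does not reach the pods.

kubectl -n kube-system \
    patch deployment cluster-autoscaler \
    -p '{"spec":{"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict":"false"}}}}}'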

Edit the Cluster Autoscaler Deployment with the following command.

kubectl -n kube-system edit deployment.apps/cluster-autoscaler

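In the cluster-autoscaler container command, replace the cluster name in the auto-discovery tag with your own cluster name (my-cluster here) and add the following two options.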

--balance-similar-node-groups
--skip-nodes-with-system-pods=false

    spec:
      containers:
      - command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
        - --balance-similar-node-groups
        - --skip-nodes-with-system-pods=false

Save and close the file to apply the changes.
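As a non-interactive alternative to editing the Deployment by hand, the cluster name could be substituted into the manifest before it is applied. A sketch, assuming the example manifest marks the cluster name with the literal text <YOUR CLUSTER NAME> (check the downloaded file to confirm); the function name is illustrative, and the two extra options listed above would still need to be added separately:

```shell
# Replace the cluster-name placeholder in a manifest read from stdin.
# Argument: cluster name. The placeholder text is an assumption about the manifest.
set_cluster_name() {
  sed "s|<YOUR CLUSTER NAME>|$1|g"
}

# Example usage (requires network and cluster access):
#   curl -sSL https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml \
#     | set_cluster_name my-cluster \
#     | kubectl apply -f -
```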

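Open the Cluster Autoscaler releases page on GitHub in a web browser and find the latest Cluster Autoscaler version that matches your cluster's Kubernetes major and minor version. For example, if your cluster runs Kubernetes 1.19, look for the latest Cluster Autoscaler release that begins with 1.19. Note the semantic version number of that release for use in the next step.

Use the following command to set the Cluster Autoscaler image tag to the version you noted in the previous step. Replace v1.20.0 with your own value.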

kubectl set image deployment cluster-autoscaler \
  -n kube-system \
  cluster-autoscaler=k8s.gcr.io/autoscaling/cluster-autoscaler:v1.20.0
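To know which release series to look for, the major.minor prefix can be extracted from a version string such as the one kubectl reports for a node. A small sketch; the function name is illustrative, and reading the kubelet version of the first node is only an approximation of the control-plane version:

```shell
# Reduce a Kubernetes version string to its "major.minor" prefix.
# Argument: a version such as v1.20.4-eks-6b7464.
k8s_minor_version() {
  local v="${1#v}"            # drop the leading "v" if present
  echo "$v" | cut -d. -f1,2   # keep the first two dot-separated fields
}

k8s_minor_version v1.20.4-eks-6b7464

# Example usage (requires cluster access):
#   k8s_minor_version "$(kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}')"
```

The Cluster Autoscaler release to pick is then the newest one whose tag starts with that prefix.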

4. Deploy the nginx application

Deploy a test Nginx service as follows

cd ~/environment/cluster-autoscaler/

cat <<EoF > nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-to-scaleout
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        service: nginx
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx-to-scaleout
        resources:
          limits:
            cpu: 500m
            memory: 512Mi
          requests:
            cpu: 500m
            memory: 512Mi
EoF

kubectl apply -f ~/environment/cluster-autoscaler/nginx.yaml
kubectl get deployment/nginx-to-scaleout

Set the replica count to 10

kubectl scale --replicas=10 deployment/nginx-to-scaleout

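Some pods will stay in the Pending state because they are waiting for new EC2 instances to join the cluster.

The detailed log output shows what Cluster Autoscaler is doing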

kubectl -n kube-system logs -f deployment/cluster-autoscaler

The cluster has grown from the original 2 nodes to 4 nodes, as shown below

wangzan:~/environment/cluster-autoscaler $ kubectl get no
NAME                                          STATUS   ROLES    AGE     VERSION
ip-172-31-19-134.eu-west-1.compute.internal   Ready    <none>   50s     v1.20.4-eks-6b7464
ip-172-31-34-171.eu-west-1.compute.internal   Ready    <none>   3h50m   v1.20.4-eks-6b7464
ip-172-31-45-112.eu-west-1.compute.internal   Ready    <none>   54s     v1.20.4-eks-6b7464
ip-172-31-9-180.eu-west-1.compute.internal    Ready    <none>   3h50m   v1.20.4-eks-6b7464

Clean up

Remove the resources used for the cluster autoscaling test as follows

cd ~/environment/cluster-autoscaler/

kubectl delete -f nginx.yaml
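kubectl delete -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml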

eksctl delete iamserviceaccount \
  --name cluster-autoscaler \
  --namespace kube-system \
  --cluster my-cluster \
  --wait

aws iam delete-policy \
  --policy-arn arn:aws:iam::921283538843:policy/k8s-asg-policy

kubectl delete -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
helm uninstall kube-ops-view
