Kubernetes GPU slicing for fine-grained allocation

Scenario:

Sometimes a pod needs a GPU but does not need an entire card. With the native one-card-per-pod allocation, GPU utilization is too low.

Solution:

Use the device plugin's timeSlicing feature to time-slice the GPU.

GPU time-slicing

https://github.com/NVIDIA/k8s-device-plugin#shared-access-to-gpus-with-cuda-time-slicing

Example: NVIDIA GPU timeSlicing

When deploying the NVIDIA GPU device plugin, split each GPU into 100 shares:

kubectl get cm nvidia-config   -n kube-system -o yaml
apiVersion: v1
data:
  config: |
    {
       "version": "v1",
       "sharing": {
         "timeSlicing": {
           "resources": [
             {
               "name": "nvidia.com/gpu",
               "replicas": 100,
             }
           ]
         }
       }
    }
kind: ConfigMap
metadata:
  name: nvidia-config
  namespace: kube-system

The device plugin DaemonSet mounts this ConfigMap at /etc/nvidia and points the plugin at it with --config-file:

kubectl get ds nvidia-device-plugin-daemonset  -n kube-system -o yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: nvidia-device-plugin-ds
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/nvidia-gpu
                operator: Exists
      containers:
      - args:
        - --config-file=/etc/nvidia/config
        env:
        - name: FAIL_ON_INIT_ERROR
          value: "false"
        image: nvcr.io/nvidia/k8s-device-plugin:v0.12.2
        imagePullPolicy: IfNotPresent
        name: nvidia-device-plugin-ctr
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/kubelet/device-plugins
          name: device-plugin
        - mountPath: /etc/nvidia
          name: config
      dnsPolicy: ClusterFirst
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: nvidia.com/gpu
        operator: Exists
      volumes:
      - hostPath:
          path: /var/lib/kubelet/device-plugins
          type: ""
        name: device-plugin
      - configMap:
          defaultMode: 420
          name: nvidia-config
        name: config
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
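
After the plugin reloads this config, each physical GPU on a matching node is advertised as 100 schedulable nvidia.com/gpu resources, and a pod consumes one time slice with an ordinary resource request. A minimal sketch follows (the pod name and image are placeholders, not taken from this cluster; the toleration matches the nvidia.com/gpu taint the DaemonSet above already tolerates):

kubectl describe node <gpu-node> | grep nvidia.com/gpu

apiVersion: v1
kind: Pod
metadata:
  name: gpu-timeslice-demo
spec:
  restartPolicy: OnFailure
  tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
  containers:
  - name: cuda
    image: nvcr.io/nvidia/cuda:11.7.1-base-ubuntu20.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1   # one of the 100 time-sliced replicas, not a whole card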

Where this NVIDIA approach fits and where it does not

Suitable for: nodes with a single GPU

Not suitable for: nodes with multiple GPUs, where a pod can end up seeing several physical cards

Workarounds:
1. Use failRequestsGreaterThanOne (see the config sketch after this list)

References:
https://developer.nvidia.com/blog/improving-gpu-utilization-in-kubernetes/
https://github.com/NVIDIA/k8s-device-plugin
2. Use the open-source Alibaba Cloud solution (see the pod sketch after this list)

Reference:
https://github.com/AliyunContainerService/gpushare-scheduler-extender
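
A sketch of option 1, extending the ConfigMap above: failRequestsGreaterThanOne makes the plugin reject any pod that asks for more than one shared replica, so a single pod can never be handed slices that live on different physical cards. The flag is documented in the k8s-device-plugin repository linked above; check it against the plugin version you run:

    {
       "version": "v1",
       "sharing": {
         "timeSlicing": {
           "failRequestsGreaterThanOne": true,
           "resources": [
             {
               "name": "nvidia.com/gpu",
               "replicas": 100
             }
           ]
         }
       }
    }

For option 2, the gpushare-scheduler-extender schedules by GPU memory rather than time slices; per the project's README a pod requests its share through the aliyun.com/gpu-mem extended resource. A sketch (pod name and image are placeholders; the unit depends on how the gpushare device plugin is deployed, GiB by default):

apiVersion: v1
kind: Pod
metadata:
  name: gpushare-demo
spec:
  containers:
  - name: cuda
    image: nvcr.io/nvidia/cuda:11.7.1-base-ubuntu20.04
    resources:
      limits:
        aliyun.com/gpu-mem: 3   # roughly 3 GiB of memory on a single shared GPU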
