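In k8s, one container keeps holding the GPU and never releases it, even though the processes that were using the GPU have already been killed.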

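Checking with dmesg shows the error "Unable to allocate memory on node -1". Restarting the affected container works around it, but that treats the symptom rather than the cause.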

From searching around, the root cause is the current system kernel, 4.4.0-xxxx: a problem in this kernel version is what breaks things on k8s.

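The real fix is to upgrade the Ubuntu kernel. Do not do this by manually downloading and installing some arbitrary higher-version deb; upgrade through the appropriate upgrade commands instead. The dmesg output from the affected node: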
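Before and after upgrading, it helps to confirm which kernel series a node is actually running. The following is a minimal sketch, not a definitive check: the helper names `kernel_series` and `is_affected` are hypothetical, and it assumes that only the 4.4.0 series named above should be flagged.

```python
import platform
import re


def kernel_series(release: str) -> tuple:
    """Parse the leading 'major.minor.patch' from a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(release: str) -> bool:
    # Assumption: only the 4.4.0 series mentioned in this post is flagged.
    return kernel_series(release) == (4, 4, 0)


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    release = platform.release()
    status = "in the 4.4.0 series" if is_affected(release) else "not in the 4.4.0 series"
    print(f"{release}: {status}")
```

Running this on each node after the upgrade and reboot gives a quick way to see whether any node is still on the old kernel.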

anon:0KB active_anon:11652KB inactive_file:516KB active_file:180KB unevictable:0KB
[233318.319275] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
[233318.319513] [45521]     0 45521      255        1       5       2        0          -998 pause
[233318.319520] [46984]     0 46984   359100     2503      82       6        0          -998 flanneld
[233318.319527] [ 6504]     0  6504     2550      191      10       3        0          -998 iptables
[233318.319569] [ 6631]     0  6631      302        6       4       3        0          -998 iptables
[233318.319583] Memory cgroup out of memory: Kill process 45521 (pause) score 0 or sacrifice child
[233318.321901] Killed process 45521 (pause) total-vm:1020kB, anon-rss:4kB, file-rss:0kB
[233335.103348] NVRM: RmInitAdapter failed! (0x26:0x65:1106)
[233335.103397] NVRM: rm_init_adapter failed for device bearing minor number 5
[233395.716627] SLUB: Unable to allocate memory on node -1 (gfp=0x2088020)
[233395.716635]   cache: mnt_cache(14988:e196f5f19fcea94079334d52d6fbb730dc94693de78a9902be307037e5eb5a0c), object size: 384, buffer size: 384, default order: 2, min order: 0
[233395.716639]   node 0: slabs: 18, objs: 756, free: 0
[233395.716641]   node 1: slabs: 8, objs: 336, free: 0
[233429.318915] NVRM: RmInitAdapter failed! (0x26:0x65:1106)
[233429.318990] NVRM: rm_init_adapter failed for device bearing minor number 5
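The messages above can be spotted automatically instead of reading dmesg by eye. This is a rough sketch under stated assumptions: the function name `scan_dmesg` and the category labels are made up for illustration, and the patterns are taken only from the log lines shown here, so other kernel or driver versions may word these messages differently.

```python
import re

# Patterns copied from the messages in the log above (case-sensitive).
PATTERNS = {
    "slub_alloc_failure": re.compile(r"SLUB: Unable to allocate memory on node -?\d+"),
    "nvidia_init_failure": re.compile(r"NVRM: RmInitAdapter failed"),
    "cgroup_oom_kill": re.compile(r"Memory cgroup out of memory"),
}


def scan_dmesg(text: str) -> dict:
    """Count how many lines of dmesg output match each known symptom."""
    counts = {name: 0 for name in PATTERNS}
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

To use it on a live node, pass in the text captured from dmesg, for example `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`. A nonzero `slub_alloc_failure` count together with `nvidia_init_failure` matches the situation described in this post.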
