Deploying ES on k8s: the container keeps restarting with the error "Back-off restarting failed container"
metadata:
  name: elasticsearch
spec:
  selector:
    matchLabels:
      name: elasticsearch
  replicas: 1
  template:
    metadata:
      labels:
        name: elasticsearch
    spec:
      initContainers:
      - name: init-sysctl
        image: busybox
        command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        securityContext:
          privileged: true
      containers:
      - name: elasticsearch
        command: [ "/bin/bash", "-c", "--" ]
        args: [ "while true; do sleep 30; done;" ]
        image: elasticsearch:8.6.2
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            cpu: 1000m
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 1Gi
        env:
        - name: ES_JAVA_OPTS
          value: -Xms512m -Xmx512m
        ports:
        - containerPort: 9200
        - containerPort: 9300
        volumeMounts:
        - name: elasticsearch-data
          mountPath: /usr/share/elasticsearch/data/
        - name: es-config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          subPath: elasticsearch.yml
      volumes:
      - name: elasticsearch-data
        persistentVolumeClaim:
          claimName: es-pvc
      - name: es-config
        configMap:
          name: es
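2. After redeploying, the container no longer exits and the Deployment stays green, but the ES service is unavailable (as expected: with `command` overridden, ES is never started). Enter the container with `kubectl exec -it <pod_name> -c <container_name> -- /bin/bash` and start ES manually with `bash /usr/share/elasticsearch/bin/elasticsearch`. The following error appears: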
{
  "@timestamp": "2023-04-06T10:06:47.648Z",
  "log.level": "ERROR",
  "message": "fatal exception while booting Elasticsearch",
  "ecs.version": "1.2.0",
  "service.name": "ES_ECS",
  "event.dataset": "elasticsearch.server",
  "process.thread.name": "main",
  "log.logger": "org.elasticsearch.bootstrap.Elasticsearch",
  "elasticsearch.node.name": "node-1",
  "elasticsearch.cluster.name": "my-cluster",
  "error.type": "java.lang.IllegalStateException",
  "error.message": "failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?",
  "error.stack_trace": "java.lang.IllegalStateException: failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?\n\tat org.elasticsearch.server@8.6.2/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:285)\n\tat org.elasticsearch.server@8.6.2/org.elasticsearch.node.Node.<init>(Node.java:478)\n\tat org.elasticsearch.server@8.6.2/org.elasticsearch.node.Node.<init>(Node.java:322)\n\t…\n\t… 7 more\n"
}
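The mounted path `/usr/share/elasticsearch/data` is not writable, which points to a permissions problem. Trying to create a file under the mount path inside the container confirms it: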
elasticsearch@elasticsearch-5b75df88cb-xbng8:~/data$ touch demo.txt
touch: cannot touch ‘demo.txt’: Permission denied
elasticsearch@elasticsearch-5b75df88cb-xbng8:~/data$
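A search for this error turns up similar cases in which a container started as a non-root user runs into permission problems (Elasticsearch refuses to run as root, so the elasticsearch container runs as the `elasticsearch` user).
3. Fix the permissions on the mounted path. Creating a user with the same name on the node and granting it access does not work here, because container users map to host users by uid, not by name. So first confirm the uid of the user inside the container.
Run in the ES container: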
elasticsearch@elasticsearch-5b75df88cb-xbng8:~/data$ id
uid=1000(elasticsearch) gid=1000(elasticsearch) groups=1000(elasticsearch),0(root)
elasticsearch@elasticsearch-5b75df88cb-xbng8:~/data$
Run on the node:
root@minikube:/# grep 1000 /etc/passwd
docker:x:1000:999:,:/home/docker:/bin/bash
root@minikube:/#
Check the permissions of the hostPath directory on the node:
root@minikube:/# ll -dh data/es-data
drwxr-xr-x 4 root root 4.0K Apr 6 09:40 data/es-data/
root@minikube:/#
Give ownership of the hostPath directory to the user found above:
root@minikube:/# chown docker -R data/es-data
After the change:
root@minikube:/# ll -dh data/es-data
drwxr-xr-x 4 docker root 4.0K Apr 6 09:40 data/es-data/
root@minikube:/#
Confirm that the mount path inside the container is now writable.
Run in the ES container:
elasticsearch@elasticsearch-5b75df88cb-xbng8:~/data$ touch demo.txt
elasticsearch@elasticsearch-5b75df88cb-xbng8:~/data$ ll demo.txt
-rw-r--r-- 1 elasticsearch elasticsearch 0 Apr 6 10:29 demo.txt
elasticsearch@elasticsearch-5b75df88cb-xbng8:~/data$
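4. Remove the debugging content from deployment.yaml (**and move the ownership change from step 3 into an initContainer**), then redeploy. The problem is solved. The complete YAML is as follows:
PV and PVC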
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-pv
  namespace: century-creator
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: es-host
  hostPath:
    path: /data/es-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: es-pvc
  namespace: century-creator
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: es-host
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: es
  namespace: century-creator
data:
  elasticsearch.yml: |
    cluster.name: my-cluster
    discovery.type: single-node
    node.name: node-1
    network.host: 0.0.0.0
    http.port: 9200
    http.cors.enabled: true
    http.cors.allow-origin: /.*/
    ingest.geoip.downloader.enabled: false
    xpack.security.enabled: false
---
apiVersion: apps/v1
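The rest of the Deployment manifest is cut off at this point. As a rough sketch of the initContainer described in step 4 (an illustration, not the original manifest), the ownership change could look something like the fragment below. It assumes the uid/gid of 1000 reported by `id` in step 3 and the `elasticsearch-data` volume from the debug Deployment above. The container name `fix-data-permissions` is hypothetical, and the explicit `runAsUser: 0` is an assumption added so that `chown` is allowed to run.

```yaml
# Hypothetical fragment of spec.template.spec in the Deployment.
# It would sit alongside the existing init-sysctl initContainer.
initContainers:
- name: fix-data-permissions        # hypothetical name
  image: busybox
  # Hand the data directory to uid/gid 1000, the elasticsearch user inside the ES container.
  command: [ "sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data" ]
  securityContext:
    runAsUser: 0                    # assumption: chown must run as root
  volumeMounts:
  - name: elasticsearch-data        # same PVC-backed volume the ES container mounts
    mountPath: /usr/share/elasticsearch/data
```

Because the initContainer runs on every Pod start, the directory is fixed up even if the Pod is rescheduled onto a node where `/data/es-data` was freshly created as root.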