Our setup: JupyterLab is deployed on Kubernetes, and pyspark or spark-shell is launched from inside JupyterLab. Because YARN cannot call back to the driver's port inside the pod, the YARN application logs fill with connection errors.

One way to solve this is to pin the IP and ports completely.

The other is to resolve them dynamically, in code or in the YAML.

The relevant parts of the code follow. First, the Maven dependency on the Kubernetes Java client:

    <dependency>
        <groupId>io.kubernetes</groupId>
        <artifactId>client-java</artifactId>
        <version>4.0.0-beta1</version>
    </dependency>
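
The snippet below operates on a service object that has already been fetched through this client. A minimal sketch of that step, assuming a hypothetical Service name and namespace ("spark-driver-svc" / "jupyterlab") and default kubeconfig loading:

    import io.kubernetes.client.ApiClient;
    import io.kubernetes.client.Configuration;
    import io.kubernetes.client.apis.CoreV1Api;
    import io.kubernetes.client.models.V1Service;
    import io.kubernetes.client.util.Config;

    // Build a client from the default kubeconfig (in-cluster config works too).
    ApiClient client = Config.defaultClient();
    Configuration.setDefaultApiClient(client);
    CoreV1Api api = new CoreV1Api();

    // "spark-driver-svc" / "jupyterlab" are illustrative placeholders for the
    // NodePort Service that exposes the driver and blockManager ports.
    V1Service service = api.readNamespacedService(
            "spark-driver-svc", "jupyterlab", null, null, null);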
    import io.kubernetes.client.custom.IntOrString;
    import io.kubernetes.client.models.V1EnvVarBuilder;
    import java.util.concurrent.atomic.AtomicReference;

    // Update the Service ports: expose each nodePort as the container's
    // targetPort, and hand the driver/blockManager nodePorts to the pod as
    // env vars. DRIVER_PORT_NAME / BM_PORT_NAME are the port names defined
    // on the Service; 'develop' collects the env vars for the pod spec.
    AtomicReference<Boolean> needUpdate = new AtomicReference<>(false);
    service.getSpec().getPorts().forEach(p -> {
        if (!p.getPort().equals(p.getNodePort())) {
            // Make the container listen on the nodePort value itself, so the
            // port advertised to YARN is the one the driver actually binds to.
            p.setTargetPort(new IntOrString(p.getNodePort()));
            needUpdate.set(true); // the Service must be written back, see below
        }
        if (DRIVER_PORT_NAME.equalsIgnoreCase(p.getName())) {
            develop.getSparkEnv().add(new V1EnvVarBuilder()
                    .withName("DRIVERPORT")
                    .withValue(String.valueOf(p.getNodePort()))
                    .build());
        } else if (BM_PORT_NAME.equalsIgnoreCase(p.getName())) {
            develop.getSparkEnv().add(new V1EnvVarBuilder()
                    .withName("BMPORT")
                    .withValue(String.valueOf(p.getNodePort()))
                    .build());
        }
    });

    // Inject the node's host IP into the pod via the Downward API
    // (status.hostIP), so the driver can advertise an address that is
    // routable from the YARN cluster.
    V1ObjectFieldSelector v1ObjectFieldSelector = new V1ObjectFieldSelector();
    v1ObjectFieldSelector.setApiVersion("v1");
    v1ObjectFieldSelector.setFieldPath("status.hostIP");
    V1EnvVarSource v1EnvVarSource = new V1EnvVarSource();
    v1EnvVarSource.setFieldRef(v1ObjectFieldSelector);
    develop.getSparkEnv().add(new V1EnvVarBuilder()
            .withName("MY_HOST_IP").withValueFrom(v1EnvVarSource).build());

The resulting env section in the pod YAML looks like this:

 env:
    - name: USERNAME
      value: x1 
    - name: NODESELECTOR
    - name: DRIVERPORT
      value: "31218"
    - name: BMPORT
      value: "32095"
    - name: MY_HOST_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.hostIP

The submit command then has the following format:

/export/spark-bin-hadoop2.7/bin/pyspark-cust.sh --conf spark.driver.bindAddress=0.0.0.0 --conf spark.driver.host=${MY_HOST_IP} --conf spark.driver.port=${DRIVERPORT} --conf spark.driver.blockManager.port=${BMPORT} --master yarn
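
With spark.driver.bindAddress=0.0.0.0 the driver binds to every interface inside the pod, while spark.driver.host advertises the node IP injected above; combined with the nodePort values in spark.driver.port and spark.driver.blockManager.port, the YARN side connects back through the node rather than the unroutable pod IP.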