Kerberos Installation and Configuration for Hadoop Authentication
1. Disable SELinux
vim /etc/sysconfig/selinux
Also stop the firewall and disable it on boot:
systemctl stop firewalld
chkconfig firewalld off
2. Set up the yum repository
For reference: https://blog.csdn.net/Jerry_991/article/details/118910505
3. Install the Kerberos server
(usually on a dedicated machine)
yum install -y krb5-libs krb5-server krb5-workstation
(verify with yum list | grep krb that the packages were installed)
4. Configure the krb5.conf file
vim /etc/krb5.conf
Terminology:
realms: a realm represents a company or organization, a logical scope of authentication. An account belongs to a realm, and accounts are not shared across realms. Realm names look much like FQDNs but are written in uppercase. Multiple realms can be configured.
logging: log configuration
libdefaults: default settings, including the ticket lifetime
domain_realm: the mapping between hostname domains and Kerberos realms
.itcast.cn behaves like *.itcast.cn, i.e. hosts such as cdh0.itcast.cn and cdh1.itcast.cn all belong to the ITCAST.CN realm
itcast.cn means the host named itcast.cn is also part of the ITCAST.CN realm
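For orientation, a minimal sketch of /etc/krb5.conf, assuming the realm ITCAST.CN and a KDC running on cdh0.itcast.cn (both are this guide's examples; substitute your own hosts):
[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log
[libdefaults]
 default_realm = ITCAST.CN
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
[realms]
 ITCAST.CN = {
  kdc = cdh0.itcast.cn
  admin_server = cdh0.itcast.cn
 }
[domain_realm]
 .itcast.cn = ITCAST.CN
 itcast.cn = ITCAST.CN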
5. Configure kdc.conf
vim /var/kerberos/krb5kdc/kdc.conf
Terminology:
acl_file: the file holding the Kerberos ACL settings (access control)
admin_keytab: the admin login credential; holding this file is effectively holding a ticket, allowing passwordless login to an account, so it is a very sensitive file
supported_enctypes: the supported encryption types
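A sketch of what the stock CentOS file looks like, with only the realm changed to this guide's ITCAST.CN:
[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88
[realms]
 ITCAST.CN = {
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal
 }
(note: aes256-cts requires the JCE unlimited-strength policy covered in step 9 further below)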
6. Configure the kadm5.acl file
vim /var/kerberos/krb5kdc/kadm5.acl
Change it to:
*/admin@ITCAST.CN *
Here */admin is a Kerberos principal pattern. For example, rm/cdh0.itcast.cn@ITCAST.CN is the resourcemanager account on the cdh0 machine, belonging to the ITCAST.CN realm.
The trailing * means any principal matching */admin is granted all permissions.
7. Initialize the Kerberos database
Run: kdb5_util create -s -r ITCAST.CN, where ITCAST.CN is the realm; change it if yours differs.
You are then asked for a password twice; this guide uses krb5kdc.
This creates the database master account K/M@ITCAST.CN with password krb5kdc.
The corresponding files are generated under /var/kerberos/krb5kdc/.
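You can check that the database files were created; the listing below is what a stock MIT KDC on CentOS typically produces, yours may differ slightly:
ls -a /var/kerberos/krb5kdc/
# typically: .k5.ITCAST.CN  kadm5.acl  kdc.conf  principal  principal.kadm5  principal.kadm5.lock  principal.ok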
Run kadmin.local to enter the admin console,
then run listprincs to list the machine's current principals.
Create an account:
addprinc root/admin@ITCAST.CN, then enter a password. This principal matches the */admin@ITCAST.CN rule configured above, so it is an administrator account with all permissions.
Restart Kerberos
# restart the services
systemctl restart krb5kdc.service
systemctl restart kadmin.service
# enable on boot
chkconfig krb5kdc on
chkconfig kadmin on
8. Kerberos client
yum install -y krb5-libs krb5-workstation
On the client only the /etc/krb5.conf file needs to be configured; just copy the server's file.
Test logging in with an admin account:
kinit test/admin@ITCAST.CN (use an admin account you created, e.g. root/admin@ITCAST.CN), enter the password and press Enter; no output means the configuration works.
klist verifies the ticket (24h validity).
kadmin, then enter the correct password (the console looks the same as kadmin.local on the server side).
Run listprincs.
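A quick sanity check; the output shape below is typical of MIT Kerberos, the details will vary:
klist
# Ticket cache: FILE:/tmp/krb5cc_0
# Default principal: root/admin@ITCAST.CN
# Valid starting     Expires            Service principal
# ...                ...                krbtgt/ITCAST.CN@ITCAST.CN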
OK ... Kerberos is now installed. Next, the Hadoop cluster has to be configured to use it.
9. Hadoop Kerberos configuration
1) Configure HDFS
1. Add users (run on all three nodes)
groupadd hadoop;useradd hdfs -g hadoop -p hdfs;useradd hive -g hadoop -p hive;useradd yarn -g hadoop -p yarn;useradd mapred -g hadoop -p mapred
(set up passwordless SSH login for each account)
2. Create the HDFS-related kerberos accounts
Note: Hadoop authenticates via kerberos. Taking service startup as an example: when configuring Hadoop below, each service is assigned a kerberos account. The namenode runs on cdh-master and is assigned nn/cdh-master.hadoop.cn@HADOOP.CN, so starting the namenode requires authenticating as that account first.
1. On every node run mkdir /etc/security/keytabs (to store the keytab files)
2. Create the kerberos accounts for the services running on each cdh-* node
(if kadmin fails with: kadmin: GSS-API (or Kerberos) error while initializing kadmin interface, make the node's clock agree with the KDC server's clock)
# run kadmin
# enter the password to reach the kerberos admin console (the password you set)
################################ cdh-master ################################
# create the namenode account
addprinc -randkey nn/cdh-master.hadoop.cn@HADOOP.CN
# create the secondarynamenode account
addprinc -randkey sn/cdh-master.hadoop.cn@HADOOP.CN
# create the resourcemanager account
addprinc -randkey rm/cdh-master.hadoop.cn@HADOOP.CN
# create the account for the https services
addprinc -randkey http/cdh-master.hadoop.cn@HADOOP.CN
# create keytab files so no password is needed during startup and operation
# keytab for the nn account
ktadd -k /etc/security/keytabs/nn.service.keytab nn/cdh-master.hadoop.cn@HADOOP.CN
# keytab for the sn account
ktadd -k /etc/security/keytabs/sn.service.keytab sn/cdh-master.hadoop.cn@HADOOP.CN
# keytab for the rm account
ktadd -k /etc/security/keytabs/rm.service.keytab rm/cdh-master.hadoop.cn@HADOOP.CN
# keytab for the http account
ktadd -k /etc/security/keytabs/http.service.keytab http/cdh-master.hadoop.cn@HADOOP.CN
This yields four files:
-r-------- 1 hdfs hadoop 406 Sep 27 17:48 nn.service.keytab
-r-------- 1 hdfs hadoop 406 Sep 27 17:48 sn.service.keytab
-r-------- 1 yarn hadoop 406 Sep 27 17:48 rm.service.keytab
-r--r----- 1 hdfs hadoop 406 Sep 27 17:48 http.service.keytab
Adjust the file permissions with chmod:
chmod 400 nn.service.keytab
chmod 400 sn.service.keytab
chmod 400 rm.service.keytab
chmod 440 http.service.keytab
(http.service.keytab is needed by the whole hadoop group for the web services, hence 440)
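To confirm a keytab contains the expected principal, klist -kt (standard MIT tooling) lists every entry:
klist -kt /etc/security/keytabs/nn.service.keytab
# lists the KVNO, timestamp and principal of each entry, e.g. nn/cdh-master.hadoop.cn@HADOOP.CN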
################################ cdh-worker1 ################################
# create the datanode account
addprinc -randkey dn/cdh-worker1.hadoop.cn@HADOOP.CN
# create the nodemanager account
addprinc -randkey nm/cdh-worker1.hadoop.cn@HADOOP.CN
# create the http account
addprinc -randkey http/cdh-worker1.hadoop.cn@HADOOP.CN
# create keytab files so no password is needed during startup and operation
# keytab for the dn account
ktadd -k /etc/security/keytabs/dn.service.keytab dn/cdh-worker1.hadoop.cn@HADOOP.CN
# keytab for the nm account
ktadd -k /etc/security/keytabs/nm.service.keytab nm/cdh-worker1.hadoop.cn@HADOOP.CN
# keytab for the http account
ktadd -k /etc/security/keytabs/http.service.keytab http/cdh-worker1.hadoop.cn@HADOOP.CN
This yields three files:
-r-------- 1 hdfs hadoop 406 Sep 27 17:48 dn.service.keytab
-r-------- 1 yarn hadoop 406 Sep 27 17:48 nm.service.keytab
-r--r----- 1 hdfs hadoop 406 Sep 27 17:48 http.service.keytab
Adjust the file permissions with chmod:
chmod 400 dn.service.keytab
chmod 400 nm.service.keytab
chmod 440 http.service.keytab
(http.service.keytab is needed by the whole hadoop group for the web services, hence 440)
################################ cdh-worker2 ################################
# create the datanode account
addprinc -randkey dn/cdh-worker2.hadoop.cn@HADOOP.CN
# create the nodemanager account
addprinc -randkey nm/cdh-worker2.hadoop.cn@HADOOP.CN
# create the http account
addprinc -randkey http/cdh-worker2.hadoop.cn@HADOOP.CN
# create keytab files so no password is needed during startup and operation
# keytab for the dn account
ktadd -k /etc/security/keytabs/dn.service.keytab dn/cdh-worker2.hadoop.cn@HADOOP.CN
# keytab for the nm account
ktadd -k /etc/security/keytabs/nm.service.keytab nm/cdh-worker2.hadoop.cn@HADOOP.CN
# keytab for the http account
ktadd -k /etc/security/keytabs/http.service.keytab http/cdh-worker2.hadoop.cn@HADOOP.CN
This yields three files:
-r-------- 1 hdfs hadoop 406 Sep 27 17:48 dn.service.keytab
-r-------- 1 yarn hadoop 406 Sep 27 17:48 nm.service.keytab
-r--r----- 1 hdfs hadoop 406 Sep 27 17:48 http.service.keytab
Adjust the file permissions with chmod:
chmod 400 dn.service.keytab
chmod 400 nm.service.keytab
chmod 440 http.service.keytab
(http.service.keytab is needed by the whole hadoop group for the web services, hence 440)
3. Create the data directories (adjust these to your actual installation)
mkdir -p /data/nn;mkdir /data/dn;mkdir /data/nm-local;mkdir /data/nm-log;mkdir /data/mr-history
The script below then assigns owners and permissions across the Hadoop installation; adjust the path variables at the top to your environment.
#!/bin/bash
hadoop_home=/opt/bigdata/hadoop/hadoop-3.2.0
dfs_namenode_name_dir=/opt/bigdata/hadoop/dfs/name
dfs_datanode_data_dir=/opt/bigdata/hadoop/dfs/data
nodemanager_local_dir=/opt/bigdata/hadoop/yarn/local-dir1
nodemanager_log_dir=/opt/bigdata/hadoop/yarn/log-dir
mr_history=/opt/bigdata/hadoop/mapred
# abort if hadoop_home is not set
if [ ! -n "$hadoop_home" ];then
echo "please set the hadoop home path ..."
exit
fi
# group-own the whole tree first, then set per-path owners below
chgrp -R hadoop $hadoop_home
chown -R hdfs:hadoop $hadoop_home
chown root:hadoop $hadoop_home
chown hdfs:hadoop $hadoop_home/sbin/distribute-exclude.sh
chown hdfs:hadoop $hadoop_home/sbin/hadoop-daemon.sh
chown hdfs:hadoop $hadoop_home/sbin/hadoop-daemons.sh
# chown hdfs:hadoop $hadoop_home/sbin/hdfs-config.cmd
# chown hdfs:hadoop $hadoop_home/sbin/hdfs-config.sh
chown mapred:hadoop $hadoop_home/sbin/mr-jobhistory-daemon.sh
chown hdfs:hadoop $hadoop_home/sbin/refresh-namenodes.sh
chown hdfs:hadoop $hadoop_home/sbin/workers.sh
chown hdfs:hadoop $hadoop_home/sbin/start-all.cmd
chown hdfs:hadoop $hadoop_home/sbin/start-all.sh
chown hdfs:hadoop $hadoop_home/sbin/start-balancer.sh
chown hdfs:hadoop $hadoop_home/sbin/start-dfs.cmd
chown hdfs:hadoop $hadoop_home/sbin/start-dfs.sh
chown hdfs:hadoop $hadoop_home/sbin/start-secure-dns.sh
chown yarn:hadoop $hadoop_home/sbin/start-yarn.cmd
chown yarn:hadoop $hadoop_home/sbin/start-yarn.sh
chown hdfs:hadoop $hadoop_home/sbin/stop-all.cmd
chown hdfs:hadoop $hadoop_home/sbin/stop-all.sh
chown hdfs:hadoop $hadoop_home/sbin/stop-balancer.sh
chown hdfs:hadoop $hadoop_home/sbin/stop-dfs.cmd
chown hdfs:hadoop $hadoop_home/sbin/stop-dfs.sh
chown hdfs:hadoop $hadoop_home/sbin/stop-secure-dns.sh
chown yarn:hadoop $hadoop_home/sbin/stop-yarn.cmd
chown yarn:hadoop $hadoop_home/sbin/stop-yarn.sh
chown yarn:hadoop $hadoop_home/sbin/yarn-daemon.sh
chown yarn:hadoop $hadoop_home/sbin/yarn-daemons.sh
chown mapred:hadoop $hadoop_home/bin/mapred*
chown yarn:hadoop $hadoop_home/bin/yarn*
chown hdfs:hadoop $hadoop_home/bin/hdfs*
chown hdfs:hadoop $hadoop_home/etc/hadoop/capacity-scheduler.xml
chown hdfs:hadoop $hadoop_home/etc/hadoop/configuration.xsl
chown hdfs:hadoop $hadoop_home/etc/hadoop/core-site.xml
chown hdfs:hadoop $hadoop_home/etc/hadoop/hadoop-*
chown hdfs:hadoop $hadoop_home/etc/hadoop/hdfs-*
chown hdfs:hadoop $hadoop_home/etc/hadoop/httpfs-*
chown hdfs:hadoop $hadoop_home/etc/hadoop/kms-*
chown hdfs:hadoop $hadoop_home/etc/hadoop/log4j.properties
chown mapred:hadoop $hadoop_home/etc/hadoop/mapred-*
chown hdfs:hadoop $hadoop_home/etc/hadoop/workers
chown hdfs:hadoop $hadoop_home/etc/hadoop/ssl-*
chown yarn:hadoop $hadoop_home/etc/hadoop/yarn-*
chmod 755 -R $hadoop_home/etc/hadoop/*
chown root:hadoop $hadoop_home/etc
chown root:hadoop $hadoop_home/etc/hadoop
chown root:hadoop $hadoop_home/etc/hadoop/container-executor.cfg
chown root:hadoop $hadoop_home/bin/container-executor
chown root:hadoop $hadoop_home/bin/test-container-executor
chmod 6050 $hadoop_home/bin/container-executor
chmod 6050 $hadoop_home/bin/test-container-executor
mkdir -p $hadoop_home/logs
mkdir -p $hadoop_home/logs/hdfs
mkdir -p $hadoop_home/logs/yarn
mkdir -p $mr_history/done-dir
mkdir -p $mr_history/intermediate-done-dir
mkdir -p $nodemanager_local_dir
mkdir -p $nodemanager_log_dir
chown -R hdfs:hadoop $hadoop_home/logs
chmod -R 755 $hadoop_home/logs
chown -R hdfs:hadoop $hadoop_home/logs/hdfs
chmod -R 755 $hadoop_home/logs/hdfs
chown -R yarn:hadoop $hadoop_home/logs/yarn
chmod -R 755 $hadoop_home/logs/yarn
chown -R hdfs:hadoop $dfs_datanode_data_dir
chown -R hdfs:hadoop $dfs_namenode_name_dir
chmod -R 700 $dfs_datanode_data_dir
chmod -R 700 $dfs_namenode_name_dir
chown -R yarn:hadoop $nodemanager_local_dir
chown -R yarn:hadoop $nodemanager_log_dir
chmod -R 770 $nodemanager_local_dir
chmod -R 770 $nodemanager_log_dir
chown -R mapred:hadoop $mr_history
chmod -R 770 $mr_history
4. Configure hadoop's lib/native (native libraries)
/opt/bigdata/hadoop/hadoop-3.2.0/lib/native needs no extra configuration here
5. HDFS configuration files
1) hadoop-env.sh, add the following (adjust the paths to your installation)
export JAVA_HOME=/usr/java/jdk1.8.0_144
export HADOOP_HOME=/opt/bigdata/hadoop/hadoop-3.2.0
export HADOOP_CONF_DIR=/opt/bigdata/hadoop/hadoop-3.2.0/etc/hadoop
export HADOOP_LOG_DIR=/opt/bigdata/hadoop/hadoop-3.2.0/logs
export HADOOP_COMMON_LIB_NATIVE_DIR=/opt/bigdata/hadoop/hadoop-3.2.0/lib/native
export HADOOP_OPTS="-Djava.library.path=/opt/bigdata/hadoop/hadoop-3.2.0/lib/native"
2) yarn-env.sh, add the following (adjust the paths to your installation)
export JAVA_HOME=/usr/java/jdk1.8.0_144
export YARN_CONF_DIR=/opt/bigdata/hadoop/hadoop-3.2.0/etc/hadoop
export YARN_LOG_DIR=/opt/bigdata/hadoop/hadoop-3.2.0/logs/yarn
3) mapred-env.sh, add the following (adjust the paths to your installation)
export JAVA_HOME=/usr/java/jdk1.8.0_144
4) core-site.xml as follows (note: the file differs between nodes ...)
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cdh-master:9000</value>
<description>The default filesystem name, usually the namenode URL including host and port</description>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
<description>Buffer size used for sequence files; should be a multiple of the page size; it determines how much data is buffered during reads and writes</description>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
<description>Whether to enable hadoop service-level authorization</description>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
<description>Use kerberos as hadoop's authentication mechanism</description>
</property>
<property>
<name>hadoop.security.auth_to_local</name>
<value>
RULE:[2:$1@$0](nn/.*@HADOOP.CN)s/.*/hdfs/
RULE:[2:$1@$0](sn/.*@HADOOP.CN)s/.*/hdfs/
RULE:[2:$1@$0](dn/.*@HADOOP.CN)s/.*/hdfs/
RULE:[2:$1@$0](rm/.*@HADOOP.CN)s/.*/yarn/
RULE:[2:$1@$0](nm/.*@HADOOP.CN)s/.*/yarn/
RULE:[2:$1@$0](http/.*@HADOOP.CN)s/.*/hdfs/
DEFAULT
</value>
<description>Maps kerberos principals to local user names. For example, the first rule maps nn/*@HADOOP.CN principals to the local hdfs account: to obtain an authenticated hdfs identity, authenticate as the kerberos nn/*@HADOOP.CN account.
Likewise the http principals also map to the local hdfs account, so either nn or http can be used to operate HDFS; we simply create several kerberos accounts to keep the assignment tidy (nn for the namenode, dn for the datanode), and both correspond to hdfs.
</description>
</property>
<!-- Hive Kerberos -->
<property>
<name>hadoop.proxyuser.hive.hosts</name>
<value>*</value>
<description></description>
</property>
<property>
<name>hadoop.proxyuser.hive.groups</name>
<value>*</value>
<description></description>
</property>
<property>
<name>hadoop.proxyuser.hdfs.hosts</name>
<value>*</value>
<description></description>
</property>
<property>
<name>hadoop.proxyuser.hdfs.groups</name>
<value>*</value>
<description></description>
</property>
<property>
<name>hadoop.proxyuser.HTTP.hosts</name>
<value>*</value>
<description></description>
</property>
<property>
<name>hadoop.proxyuser.HTTP.groups</name>
<value>*</value>
<description></description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/bigdata/hadoop/tmp</value>
<description>Parent directory for other temporary directories; referenced by other temp-dir settings</description>
</property>
<property>
<name>hadoop.http.cross-origin.enabled</name>
<value>true</value>
</property>
<property>
<name>hadoop.http.cross-origin.allowed-origins</name>
<value>*</value>
</property>
<property>
<name>hadoop.http.cross-origin.allowed-methods</name>
<value>GET,POST,HEAD</value>
</property>
<property>
<name>hadoop.http.cross-origin.allowed-headers</name>
<value>X-Requested-With,Content-Type,Accept,Origin</value>
</property>
<property>
<name>hadoop.http.cross-origin.max-age</name>
<value>1800</value>
</property>
<property>
<name>hadoop.http.filter.initializers</name>
<value>org.apache.hadoop.security.HttpCrossOriginFilterInitializer</value>
</property>
</configuration>
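You can check how a principal maps under these auth_to_local rules with Hadoop's built-in resolver (the HadoopKerberosName class ships with Hadoop; the principal below is this guide's namenode account):
hadoop org.apache.hadoop.security.HadoopKerberosName nn/cdh-master.hadoop.cn@HADOOP.CN
# expected output: Name: nn/cdh-master.hadoop.cn@HADOOP.CN to hdfs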
5) hdfs-site.xml as follows
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Replication factor; should not exceed the number of datanodes</description>
</property>
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
<description>HDFS block size</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/bigdata/hadoop/dfs/name</value>
<description>namenode name directory</description>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>200</value>
<description>Number of namenode RPC handler threads; default is 10</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/bigdata/hadoop/dfs/data</value>
<description>datanode data directory</description>
</property>
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
<description>Enable block access tokens for datanode access</description>
</property>
<!-- NameNode security config -->
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>nn/cdh-master.hadoop.cn@HADOOP.CN</value>
<description>The namenode's kerberos account, nn/hostname@HADOOP.CN; _HOST would be expanded to the hostname automatically</description>
</property>
<property>
<name>dfs.namenode.keytab.file</name>
<value>/etc/security/keytabs/nn.service.keytab</value>
<description>Accounts created with -randkey have random passwords, so this keytab file tells the namenode where its login credential is</description>
</property>
<!-- http security config -->
<property>
<name>dfs.namenode.kerberos.internal.spnego.principal</name>
<value>http/cdh-master.hadoop.cn@HADOOP.CN</value>
<description>Account used for https (e.g. the namenode web UI)</description>
</property>
<property>
<name>dfs.web.authentication.kerberos.principal</name>
<value>http/cdh-master.hadoop.cn@HADOOP.CN</value>
<description>Account used by webhdfs</description>
</property>
<property>
<name>dfs.web.authentication.kerberos.keytab</name>
<value>/etc/security/keytabs/http.service.keytab</value>
<description>Keytab file for the web account</description>
</property>
<!-- Secondary NameNode security config -->
<property>
<name>dfs.secondary.namenode.kerberos.principal</name>
<value>sn/cdh-master.hadoop.cn@HADOOP.CN</value>
<description>Account used by the secondarynamenode</description>
</property>
<property>
<name>dfs.secondary.namenode.keytab.file</name>
<value>/etc/security/keytabs/sn.service.keytab</value>
<description>Keytab file for sn</description>
</property>
<property>
<name>dfs.secondary.namenode.kerberos.internal.spnego.principal</name>
<value>http/cdh-master.hadoop.cn@HADOOP.CN</value>
<description>Account the sn uses for its http page</description>
</property>
<!-- DataNode security config -->
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>dn/cdh-master.hadoop.cn@HADOOP.CN</value>
<description>Account used by the datanode</description>
</property>
<property>
<name>dfs.datanode.keytab.file</name>
<value>/etc/security/keytabs/dn.service.keytab</value>
<description>Path to the datanode's keytab file</description>
</property>
<property>
<name>dfs.datanode.data.dir.perm</name>
<value>700</value>
<description>Permission level of the datanode data directories</description>
</property>
<!-- dfs 权限用户组 -->
<property>
<name>dfs.permissions.supergroup</name>
<value>hdfs</value>
<description>HDFS permissions supergroup</description>
</property>
<!-- http web config -->
<property>
<name>dfs.http.policy</name>
<value>HTTPS_ONLY</value>
<description>All web UIs use https only; the details live in the ssl-server and ssl-client config files</description>
</property>
<property>
<name>dfs.data.transfer.protection</name>
<value>integrity</value>
<description></description>
</property>
<property>
<name>dfs.https.port</name>
<value>50070</value>
<description>Web (https) port</description>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.http.address</name>
<value>cdh-master:50070</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>cdh-master:50090</value>
</property>
</configuration>
6. Create the HTTPS certificates
Create the directory on every node:
mkdir -p /etc/security/https
cd /etc/security/https
Run (set CN to each node's hostname):
openssl req -new -x509 -keyout bd_ca_key -out bd_ca_cert -days 99999 -subj /C=CN/ST=beijing/L=beijing/O=test/OU=test/CN=test
Enter a passphrase twice (123456 here); this yields two files; copy them to the other nodes.
Then, under /etc/security/https on every node, run the following (set CN to each node's hostname):
# 1 enter the password 123456 (you are asked to set a new password, use 123456 again, four prompts in total); generates a keystore file
keytool -keystore keystore -alias localhost -validity 99999 -genkey -keyalg RSA -keysize 2048 -dname "CN=test,OU=test,O=test,L=beijing,ST=beijing,C=CN"
# 2 enter and confirm the password (123456); answer yes when asked whether to trust the certificate; outputs a truststore file
keytool -keystore truststore -alias CARoot -import -file bd_ca_cert
# 3 enter the password (123456); outputs a cert file
keytool -certreq -alias localhost -keystore keystore -file cert
# 4 outputs a cert_signed file
openssl x509 -req -CA bd_ca_cert -CAkey bd_ca_key -in cert -out cert_signed -days 99999 -CAcreateserial -passin pass:123456
# 5 enter and confirm the password (123456); answer yes to trusting the certificate; updates the keystore file
keytool -keystore keystore -alias CARoot -import -file bd_ca_cert
# 6 enter and confirm the password (123456)
keytool -keystore keystore -alias localhost -import -file cert_signed
You end up with these files: bd_ca_key, bd_ca_cert, cert, cert_signed, keystore and truststore (plus a .srl serial file created by -CAcreateserial).
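To double-check the stores (standard keytool usage; it prompts for the 123456 password):
keytool -list -keystore keystore
# should show two entries: localhost (PrivateKeyEntry) and caroot (trustedCertEntry)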
7. Configure ssl-server.xml, ssl-client.xml and the workers file, then push them to the other nodes
The https certificates and passwords created above are used here.
1). ssl-server.xml
<configuration>
<property>
<name>ssl.server.truststore.location</name>
<value>/etc/security/https/truststore</value>
<description></description>
</property>
<property>
<name>ssl.server.truststore.password</name>
<value>123456</value>
<description></description>
</property>
<property>
<name>ssl.server.truststore.type</name>
<value>jks</value>
<description>Optional. The keystore file format. default value is "jks".</description>
</property>
<property>
<name>ssl.server.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval. in milliseconds. Default value is 10000(10 seconds).</description>
</property>
<property>
<name>ssl.server.keystore.location</name>
<value>/etc/security/https/keystore</value>
<description>Keystore to be used by NN and DN. Must be specified.</description>
</property>
<property>
<name>ssl.server.keystore.password</name>
<value>123456</value>
<description>Must be specified.</description>
</property>
<property>
<name>ssl.server.keystore.keypassword</name>
<value>123456</value>
<description>Must be specified.</description>
</property>
<property>
<name>ssl.server.keystore.type</name>
<value>jks</value>
<description>Optional.The keystore file format.default value is "jks".</description>
</property>
<property>
<name>ssl.server.exclude.cipher.list</name>
<value>TLS_ECDHE_RSA_WITH_RC4_128_SHA,SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA,
SSL_RSA_WITH_DES_CBC_SHA,SSL_DHE_RSA_WITH_DES_CBC_SHA,
SSL_RSA_EXPORT_WITH_RC4_40_MD5,SSL_RSA_EXPORT_WITH_DES40_CBC_SHA,
SSL_RSA_WITH_RC4_128_MD5</value>
<description>Optional. the weak security cipher suites that you want excluded from SSL communication.</description>
</property>
</configuration>
2). ssl-client.xml
<configuration>
<property>
<name>ssl.client.truststore.location</name>
<value>/etc/security/https/truststore</value>
<description>Truststore to be used by clients like distcp.Must be specified.</description>
</property>
<property>
<name>ssl.client.truststore.password</name>
<value>123456</value>
<description>Optional.Default value is "".</description>
</property>
<property>
<name>ssl.client.truststore.type</name>
<value>jks</value>
<description>Optional.The keystore file format.default value is "jks".</description>
</property>
<property>
<name>ssl.client.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval.in milliseconds. Default value is 10000 (10 seconds).</description>
</property>
<property>
<name>ssl.client.keystore.location</name>
<value>/etc/security/https/keystore</value>
<description>Keystore to be used by clients like distcp.Must be specified.</description>
</property>
<property>
<name>ssl.client.keystore.password</name>
<value>123456</value>
<description>Optional. Default value is "".</description>
</property>
<property>
<name>ssl.client.keystore.keypassword</name>
<value>123456</value>
<description>Optional.Default value is "".</description>
</property>
<property>
<name>ssl.client.keystore.type</name>
<value>jks</value>
<description>Optional. The keystore file format.default value is "jks".</description>
</property>
</configuration>
3). workers file
cdh-worker1.hadoop.cn
cdh-worker2.hadoop.cn
8. Set HDFS file permissions
hadoop fs -chown hdfs:hadoop /
hadoop fs -mkdir /tmp
hadoop fs -chown hdfs:hadoop /tmp
hadoop fs -chmod 777 /tmp
hadoop fs -mkdir /user
hadoop fs -chown hdfs:hadoop /user
hadoop fs -chmod 755 /user
hadoop fs -mkdir /mr-data
hadoop fs -chown mapred:hadoop /mr-data
hadoop fs -mkdir /tmp/hadoop-yarn
hadoop fs -chmod 770 /tmp/hadoop-yarn
9. A very important step:
If HDFS fails to start with a kerberos encryption-related error (the original error screenshot is omitted here), the cause is the JCE policy. Download it from:
https://www.oracle.com/technetwork/java/javase/downloads/jce8-download-2133166.html
Unpacking the download yields:
-rw-rw-r-- 1 root root 3035 Dec 21 2013 local_policy.jar
-rw-r--r-- 1 root root 7323 Dec 21 2013 README.txt
-rw-rw-r-- 1 root root 3023 Dec 21 2013 US_export_policy.jar
Copy the jars into the corresponding JRE directory, then restart HDFS:
# cp UnlimitedJCEPolicyJDK8/*.jar $JAVA_HOME/jre/lib/security
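A quick way to confirm the unlimited policy took effect (jrunscript ships with JDK 8):
jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'
# prints 2147483647 with the unlimited policy installed (128 with the default policy)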
2) Configure YARN
1. Add the kerberos accounts
Enter the kerberos admin console with kadmin
# add the resourcemanager accounts
addprinc -randkey rm/cdh-master.hadoop.cn@HADOOP.CN
addprinc -randkey rm/cdh-worker1.hadoop.cn@HADOOP.CN
addprinc -randkey rm/cdh-worker2.hadoop.cn@HADOOP.CN
# add the job history accounts
addprinc -randkey jhs/cdh-master.hadoop.cn@HADOOP.CN
addprinc -randkey jhs/cdh-worker1.hadoop.cn@HADOOP.CN
addprinc -randkey jhs/cdh-worker2.hadoop.cn@HADOOP.CN
# write the keytabs into the local keytabs directory; run the matching line on each of the three nodes
ktadd -k /etc/security/keytabs/jhs.service.keytab jhs/cdh-master.hadoop.cn@HADOOP.CN
ktadd -k /etc/security/keytabs/jhs.service.keytab jhs/cdh-worker1.hadoop.cn@HADOOP.CN
ktadd -k /etc/security/keytabs/jhs.service.keytab jhs/cdh-worker2.hadoop.cn@HADOOP.CN
# fix ownership and permissions
chown mapred:hadoop jhs.service.keytab
chmod 400 jhs.service.keytab
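As before, you can verify each node's keytab with klist:
klist -kt /etc/security/keytabs/jhs.service.keytab
# should list the jhs/<this-node-fqdn>@HADOOP.CN entries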
2. Configure yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>cdh-master.hadoop.cn</value>
<description>Hostname of the RM</description>
</property>
<!-- log server address, used by the worker nodes -->
<property>
<name>yarn.log.server.url</name>
<value>https://cdh-master.hadoop.cn:19888/jobhistory/logs/</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/opt/bigdata/hadoop/yarn/local-dir1</value>
<description>Where intermediate data is stored; multiple directories can be configured</description>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
<description>Whether to enable log aggregation</description>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/tmp/logs</value>
<description>Log aggregation directory</description>
</property>
<!--
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
<description>Scheduler</description>
</property>
-->
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/opt/bigdata/hadoop/yarn/log-dir</value>
<description>Where container logs are stored; multiple directories can be configured</description>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>3600</value>
<description>Log retention time in seconds</description>
</property>
<!-- ResourceManager security configs -->
<property>
<name>yarn.resourcemanager.principal</name>
<value>rm/cdh-master.hadoop.cn@HADOOP.CN</value>
<description></description>
</property>
<property>
<name>yarn.resourcemanager.keytab</name>
<value>/etc/security/keytabs/rm.service.keytab</value>
<description></description>
</property>
<property>
<name>yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled</name>
<value>true</value>
<description></description>
</property>
<!--
<property>
<name>yarn.nodemanager.principal</name>
<value>nm/cdh-master.hadoop.cn@HADOOP.CN</value>
<description></description>
</property>
<property>
<name>yarn.nodemanager.keytab</name>
<value>/etc/security/keytabs/nm.service.keytab</value>
<description></description>
</property>
-->
<!-- To enable SSL -->
<property>
<name>yarn.http.policy</name>
<value>HTTPS_ONLY</value>
<description></description>
</property>
<property>
<name>yarn.nodemanager.linux-container-executor.group</name>
<value>hadoop</value>
<description></description>
</property>
<property>
<name>yarn.nodemanager.container-executor.class</name>
<value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
<description></description>
</property>
<property>
<name>yarn.nodemanager.linux-container-executor.path</name>
<value>/opt/bigdata/hadoop/hadoop-3.2.0/bin/container-executor</value>
<description></description>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>20480</value>
<description>Total physical memory available to the NM, in MB; cannot be changed dynamically</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>20480</value>
<description>Maximum memory a single allocation may request, in MB</description>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>5</value>
<description>Number of CPU vcores available for allocation</description>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>Auxiliary services on the NodeManager; must be set to mapreduce_shuffle for MapReduce jobs to run</description>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>cdh-master.hadoop.cn:8032</value>
<description>Address the RM exposes to clients; clients submit applications through it</description>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>cdh-master.hadoop.cn:8030</value>
<description>Address the RM exposes to AMs; AMs request and release resources through it</description>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>cdh-master.hadoop.cn:8031</value>
<description>Address the RM exposes to NMs; NMs send heartbeats and pick up tasks through it</description>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>cdh-master.hadoop.cn:8033</value>
<description>Address administrators use to send management commands to the RM</description>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>cdh-master.hadoop.cn:8088</value>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.acl.enable</name>
<value>false</value>
<description>Enable ACLs,Defaults to false.</description>
</property>
<property>
<name>yarn.admin.acl</name>
<value>*</value>
<description></description>
</property>
</configuration>
Note: on the master node, comment out the scheduler block and yarn.nodemanager.principal; on the worker nodes, uncomment yarn.nodemanager.principal.
3. Configure mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- mapred security config -->
<property>
<name>mapreduce.jobhistory.principal</name>
<value>jhs/cdh-master.hadoop.cn@HADOOP.CN</value>
<description></description>
</property>
<property>
<name>mapreduce.jobhistory.keytab</name>
<value>/etc/security/keytabs/jhs.service.keytab</value>
<description></description>
</property>
<!-- http security config -->
<property>
<name>mapreduce.jobhistory.webapp.spnego-principal</name>
<value>http/cdh-master.hadoop.cn@HADOOP.CN</value>
<description></description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.spnego-keytab-file</name>
<value>/etc/security/keytabs/http.service.keytab</value>
<description></description>
</property>
<!-- HTTPS SSL -->
<property>
<name>mapreduce.jobhistory.http.policy</name>
<value>HTTPS_ONLY</value>
<description></description>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>cdh-master.hadoop.cn:10020</value>
<description>JobHistory server IPC host:port</description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>cdh-master.hadoop.cn:19888</value>
<description>JobHistory server web UI address; users can browse past Hadoop jobs here</description>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/opt/bigdata/hadoop/mapred/done-dir</value>
<description>Directory for completed hadoop jobs</description>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/opt/bigdata/hadoop/mapred/intermediate-done-dir</value>
<description>Records of currently running hadoop jobs</description>
</property>
</configuration>
4. Configure container-executor.cfg
yarn.nodemanager.local-dirs=/opt/bigdata/hadoop/nm/nm-local
yarn.nodemanager.log-dirs=/opt/bigdata/hadoop/nm/nm-log
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=bin
min.user.id=100
allowed.system.users=root,yarn,hdfs,mapred,hive,dev
Note: the /opt/bigdata/hadoop directory must be owned by root, and container-executor.cfg must be owned by root:hadoop.
Start the resourcemanager, nodemanager and jobhistory services.
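You can also sanity-check the setuid bits set by the earlier script (6050 renders as ---Sr-s---):
ls -l /opt/bigdata/hadoop/hadoop-3.2.0/bin/container-executor
# expect something like: ---Sr-s--- 1 root hadoop ... container-executor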
5. Submit an MR job to test
1. su yarn
2. kinit -kt /etc/security/keytabs/rm.service.keytab rm/cdh-master.hadoop.cn@HADOOP.CN
3. hadoop jar /opt/bigdata/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar wordcount /a.sh /a.txt
(/a.sh and /a.txt are HDFS paths: the input file and the output directory)
3) Configure Hive Kerberos authentication
1. Create the hive user
useradd hive -g hadoop
chown -R hive:hadoop /opt/bigdata/apache-hive-2.3.5-bin
2. Create hive's kerberos account
On the hive node, enter the kerberos console with kadmin
addprinc -randkey hive/cdh-master.hadoop.cn@HADOOP.CN
ktadd -k /etc/security/keytabs/hive.service.keytab hive/cdh-master.hadoop.cn@HADOOP.CN
Fix ownership and permissions:
chown hive:hadoop hive.service.keytab
chmod 400 hive.service.keytab
3. Copy mysql-connector-java-5.1.44.jar into hive's lib directory
4. Install the mysql service (omitted ...)
5. Configure hive-env.sh
# for reference:
export HADOOP_HOME=/opt/bigdata/hadoop/hadoop-3.2.0
export JAVA_HOME=/usr/java/jdk1.8.0_144
export HIVE_HOME=/opt/bigdata/hive/apache-hive-2.3.5-bin
export HIVE_CONF_DIR=/opt/bigdata/hive/apache-hive-2.3.5-bin/conf
6. Configure hive-site.xml
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.88.101:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<!-- hive security config -->
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive/cdh-master.hadoop.cn@HADOOP.CN</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/etc/security/keytabs/hive.service.keytab</value>
</property>
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<!-- hive.metastore security config -->
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/etc/security/keytabs/hive.service.keytab</value>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/cdh-master.hadoop.cn@HADOOP.CN</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>datanucleus.schema.autoCreateTables</name>
<value>true</value>
</property>
<property>
<name>datanucleus.metadata.validate</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
</configuration>
7. Initialize the hive metastore database
bin/schematool -dbType mysql -initSchema -verbose
8. Authenticate the hive account (authenticate with kerberos on this machine before starting the hive services in the background)
kinit -kt /etc/security/keytabs/hive.service.keytab hive/cdh-master.hadoop.cn@HADOOP.CN
9. Start the hive services
nohup bin/hive --service metastore > metastore.log 2>&1 &
nohup bin/hive --service hiveserver2 > hiveserver2.log 2>&1 &
10. Test
1. Run bin/hive; after it starts, create databases and tables, etc.
2. Run bin/beeline; after it starts, run
!connect jdbc:hive2://cdh-master.hadoop.cn:10000/default;principal=hive/cdh-master.hadoop.cn@HADOOP.CN
then create databases and tables.
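Equivalently, you can pass the URL on the beeline command line (quote it because of the semicolon):
bin/beeline -u "jdbc:hive2://cdh-master.hadoop.cn:10000/default;principal=hive/cdh-master.hadoop.cn@HADOOP.CN"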
(Once hive authentication works, remember to copy hive-site.xml into spark's conf directory, otherwise spark cannot submit jobs.)
(Hive Kerberos setup is complete ...)
4) Integration testing with Java code