Hive Installation Tutorial
1 Installation Resources
1.1 Hive official site
http://hive.apache.org/
1.2 Documentation
https://cwiki.apache.org/confluence/display/Hive/GettingStarted
1.3 Downloads
http://archive.apache.org/dist/hive/
1.4 GitHub repository
https://github.com/apache/hive
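As a convenience, the download address in 1.3 can be turned into a direct link for one release. The sketch below assumes the archive stores each release under hive-<version>/apache-hive-<version>-bin.tar.gz; the function name hive_download_url is made up for illustration and is not part of any Hive tooling.

```shell
#!/bin/sh
# Hypothetical helper: build the archive URL for a given Hive release.
# Assumes the layout <base>/hive-<version>/apache-hive-<version>-bin.tar.gz.
hive_download_url() {
    version=$1
    base="http://archive.apache.org/dist/hive"
    printf '%s/hive-%s/apache-hive-%s-bin.tar.gz\n' "$base" "$version" "$version"
}

# Example (needs network access, so it is left commented out):
# wget -P /opt/software "$(hive_download_url 1.2.2)"
```

If the archive layout differs for a release, browsing the download address directly remains the reliable route.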
2 Installation and Deployment
2.1 Installing and configuring Hive
1. Upload apache-hive-1.2.2-bin.tar.gz to the /opt/software directory on the Linux host.
2. Extract apache-hive-1.2.2-bin.tar.gz into /opt/module/.
[caimh@master-node software]$ ll
total 585688
-rw-rw-r--. 1 caimh caimh 90859180 Sep 26 2019 apache-hive-1.2.2-bin.tar.gz
-rw-r--r--. 1 caimh caimh 198865940 Sep 1 06:20 hadoop-2.7.4-with-centos-6.5.tar.gz
-rw-r--r--. 1 caimh caimh 8009 Sep 10 11:35 HDFSClientDemo-1.0-SNAPSHOT.jar
-rw-r--r--. 1 caimh caimh 194990602 May 28 18:07 jdk-8u211-linux-x64.tar.gz
-rw-rw-r--. 1 caimh caimh 77807942 Mar 3 2017 mysql-libs.zip
-rw-r--r--. 1 caimh caimh 37191810 Jun 7 17:16 zookeeper-3.4.13.tar.gz
[caimh@master-node software]$ tar -zxvf apache-hive-1.2.2-bin.tar.gz -C /opt/module/
3. Rename apache-hive-1.2.2-bin to hive-1.2.2.
[caimh@master-node module]$ mv apache-hive-1.2.2-bin/ hive-1.2.2
4. In /opt/module/hive-1.2.2/conf, rename hive-env.sh.template to hive-env.sh.
[caimh@master-node conf]$ mv hive-env.sh.template hive-env.sh
5. Edit hive-env.sh.
(a) Set the HADOOP_HOME path
export HADOOP_HOME=/opt/module/hadoop-2.7.4
(b) Set the HIVE_CONF_DIR path
export HIVE_CONF_DIR=/opt/module/hive-1.2.2/conf
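Steps 2 to 5 can also be scripted. The following is a minimal sketch, not part of the original walkthrough: install_hive is a hypothetical helper, all paths are passed in as arguments, and it copies the template instead of renaming it so that the shipped file is preserved.

```shell
#!/bin/sh
# Sketch of steps 2-5: unpack, rename, and write hive-env.sh.
# install_hive is a hypothetical helper; paths come from its arguments.
install_hive() {
    archive=$1        # e.g. /opt/software/apache-hive-1.2.2-bin.tar.gz
    module_dir=$2     # e.g. /opt/module
    hadoop_home=$3    # e.g. /opt/module/hadoop-2.7.4
    hive_home="$module_dir/hive-1.2.2"

    # Step 2: unpack into the module directory.
    tar -zxf "$archive" -C "$module_dir" || return 1
    # Step 3: shorten the directory name.
    mv "$module_dir/apache-hive-1.2.2-bin" "$hive_home" || return 1
    # Step 4: start from the shipped template (copied here, not renamed).
    cp "$hive_home/conf/hive-env.sh.template" "$hive_home/conf/hive-env.sh" || return 1
    # Step 5: append the two settings.
    {
        echo "export HADOOP_HOME=$hadoop_home"
        echo "export HIVE_CONF_DIR=$hive_home/conf"
    } >> "$hive_home/conf/hive-env.sh"
}

# Example:
# install_hive /opt/software/apache-hive-1.2.2-bin.tar.gz /opt/module /opt/module/hadoop-2.7.4
```

The function stops at the first failing step, so a missing archive or an unwritable target directory does not leave a half-written hive-env.sh behind.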
2.2 Hadoop cluster configuration
1. HDFS and YARN must be running.
[caimh@master-node hadoop-2.7.4]$ sbin/start-dfs.sh
[caimh@master-node hadoop-2.7.4]$ sbin/start-yarn.sh
2. On HDFS, create the /tmp and /user/hive/warehouse directories and make both group-writable.
[caimh@master-node hadoop-2.7.4]$ bin/hadoop fs -mkdir /tmp
[caimh@master-node hadoop-2.7.4]$ bin/hadoop fs -mkdir -p /user/hive/warehouse
[caimh@master-node hadoop-2.7.4]$ bin/hadoop fs -chmod g+w /tmp
[caimh@master-node hadoop-2.7.4]$ bin/hadoop fs -chmod g+w /user/hive/warehouse
Alternatively, disable permission checking in Hadoop's hdfs-site.xml (advisable only on a test cluster):
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
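The directory preparation in step 2 can be wrapped in a small function so that it is safe to rerun. This is only a sketch: the HADOOP_CMD variable and the prepare_hive_dirs name are assumptions of the example, and the default of bin/hadoop presumes the current directory is the Hadoop home, as in the commands above.

```shell
#!/bin/sh
# Sketch of step 2: create the HDFS directories Hive expects and make them group-writable.
# HADOOP_CMD is an assumption of this sketch; it defaults to bin/hadoop.
HADOOP_CMD=${HADOOP_CMD:-bin/hadoop}

prepare_hive_dirs() {
    for dir in /tmp /user/hive/warehouse; do
        # -p creates parent directories and does not fail if the directory already exists.
        "$HADOOP_CMD" fs -mkdir -p "$dir" || return 1
        "$HADOOP_CMD" fs -chmod g+w "$dir" || return 1
    done
    # List the directories themselves (-d) to confirm the permissions.
    "$HADOOP_CMD" fs -ls -d /tmp /user/hive/warehouse
}

# Example (run from the Hadoop home directory with HDFS started):
# prepare_hive_dirs
```

Using -mkdir -p for both directories avoids the error that a plain -mkdir /tmp reports when /tmp already exists on HDFS.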
3 Basic Hive Operations
[caimh@master-node hive-1.2.2]$ bin/hive # 1. Start Hive
hive> show databases; -- 2. List the databases
OK
default
Time taken: 2.566 seconds, Fetched: 1 row(s)
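hive> use default; -- 3. Switch to the default database
hive> show tables; -- 4. List the tables in the default database
hive> create table Student(id int,name string); -- 5. Create a table
hive> show tables; -- 6. List the tables again to confirm the new table exists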
OK
student
Time taken: 0.04 seconds, Fetched: 1 row(s)
hive> desc student; -- 7. View the table structure
OK
id int
name string
Time taken: 0.436 seconds, Fetched: 2 row(s)
hive> insert into student(id,name) values(1,"caimh"); -- 8. Insert a row (this launches a MapReduce job)
Query ID = caimh_20190925122936_1860f0bc-b2b9-4d2c-b1c8-1cb0bde16cc1
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1569383663698_0001, Tracking URL = http://master-node:8088/proxy/application_1569383663698_0001/
Kill Command = /opt/module/hadoop-2.7.4/bin/hadoop job -kill job_1569383663698_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2019-09-25 12:30:08,181 Stage-1 map = 0%, reduce = 0%
2019-09-25 12:30:22,782 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 8.1 sec
MapReduce Total cumulative CPU time: 8 seconds 100 msec
Ended Job = job_1569383663698_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://master-node:9000/user/hive/warehouse/student/.hive-staging_hive_2019-09-25_12-29-36_074_6546745352170556491-1/-ext-10000
Loading data to table default.student
Table default.student stats: [numFiles=1, numRows=1, totalSize=8, rawDataSize=7]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 8.1 sec HDFS Read: 3572 HDFS Write: 79 SUCCESS
Total MapReduce CPU Time Spent: 8 seconds 100 msec
OK
Time taken: 49.656 seconds
hive> select * from student; -- 9. Query the table
OK
1 caimh
Time taken: 0.241 seconds, Fetched: 1 row(s)
hive> quit; -- 10. Exit Hive
[caimh@master-node hive-1.2.2]$
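The interactive session above can also be replayed without the prompt by putting the statements in a file and passing it to the Hive CLI's -f option. The sketch below is one way to do that; write_student_demo and the file path are hypothetical, and IF NOT EXISTS is an addition that the original session does not use.

```shell
#!/bin/sh
# Sketch: write the statements from the session above into a script file
# so they can be replayed with `bin/hive -f`. write_student_demo is a hypothetical helper.
write_student_demo() {
    cat > "$1" <<'EOF'
-- Replays the interactive session. IF NOT EXISTS lets CREATE be rerun, but each run inserts another row.
USE default;
CREATE TABLE IF NOT EXISTS student (id INT, name STRING);
INSERT INTO TABLE student VALUES (1, 'caimh');
SELECT * FROM student;
EOF
}

# Example (run from the Hive home directory with HDFS and YARN started):
# write_student_demo /tmp/student_demo.hql
# bin/hive -f /tmp/student_demo.hql
```

For a single statement, bin/hive -e "select * from student;" does the same job without a file.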
4 Known Issue
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases
Hive 2.x prints the message above when it runs on MapReduce. Because Hive in this tutorial runs on MR, a 1.x release is used to avoid it, which is why the installation is demonstrated with apache-hive-1.2.2-bin.tar.gz.
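To check in advance whether an installed release falls under this message, the major version can be read from the version banner. The sketch below assumes the first line of `bin/hive --version` looks like "Hive 1.2.2"; both function names are hypothetical, and treating every release from 2 upward as affected is an assumption based only on the wording of the message.

```shell
#!/bin/sh
# Hypothetical helper: extract the major version from the first line of
# `hive --version` output, assumed to look like "Hive 1.2.2".
hive_major_version() {
    printf '%s\n' "$1" | sed -n '1s/^Hive \([0-9][0-9]*\)\..*/\1/p'
}

# Succeeds when running on MapReduce is expected to trigger the deprecation message.
mr_warning_expected() {
    major=$(hive_major_version "$1")
    [ -n "$major" ] && [ "$major" -ge 2 ]
}

# Example:
# mr_warning_expected "$(bin/hive --version)" && echo "Hive-on-MR deprecation message expected"
```

The version string is passed in as an argument so the check can be tried without a Hive installation.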