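1. After receiving the Excel sheet, keep only the data and delete everything else, then save it as a txt file. Use Notepad++ to convert the file to UTF-8 encoding. Here the file is named cityprovince.txt
  2. Upload cityprovince.txt to the Linux machine you are working on
  3. Create the table in Hive; make sure the column types match the data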
drop table tmp.cityprovince;
create table tmp.cityprovince (province String,city String,county String,station String) 
row format delimited fields terminated by '\t' STORED AS TEXTFILE;

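    The columns in the txt file are separated by tabs, so the table uses '\t' as the field delimiter. With the wrong delimiter, each whole row ends up in the first column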

  4. Run the following statement in the Hive shell

load data local inpath '/home/chengwu_1/cityprovince.txt' into table tmp.cityprovince;

  5. Once the previous step reports OK, verify the result with select * from tmp.cityprovince;

    Note: the file must be converted to UTF-8, otherwise tmp.cityprovince will show garbled characters
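
If Notepad++ is not at hand, the encoding conversion and a quick delimiter check can also be done on the Linux side before loading. The following is only a sketch under assumptions: it assumes the file exported by Excel is GBK-encoded (check the real encoding with `file -i` first), that `iconv` and `awk` are installed, and the function names `to_utf8` and `check_columns` are made up for illustration.

```shell
#!/usr/bin/env bash

# Convert a text file to UTF-8.
# The default source encoding (GBK) is an assumption; pass the real one as $3.
to_utf8() {
  local src="$1" dst="$2" from="${3:-GBK}"
  iconv -f "$from" -t UTF-8 "$src" > "$dst"
}

# Print every line whose number of tab-separated fields differs from the
# expected count. Prints nothing when all lines match the table's columns.
check_columns() {
  local file="$1" expected="${2:-4}"
  awk -F'\t' -v n="$expected" \
    'NF != n { printf "line %d: %d fields\n", NR, NF }' "$file"
}

# Example usage (file names are placeholders):
#   to_utf8 cityprovince_gbk.txt cityprovince.txt
#   check_columns cityprovince.txt 4
```

If `check_columns` prints anything, those lines would not split into the four columns of tmp.cityprovince, so it is worth fixing them before running load data.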

 

  Tables with identical columns can be merged with union all:

select * from tableA union all select * from tableB

 

insert into tmp.applogresult select a.* from (select * from tmp.name1 union all select * from tmp.name2 union all select * from tmp.name7 union all select * from tmp.name12 union all select * from tmp.name17 ) a;
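
When many tables have to be merged, writing the union all chain by hand gets error-prone. As a rough sketch (not part of the original workflow), the statement above could be generated from a list of table names. The helper names `build_union_all` and `build_insert` are hypothetical, and the commented `hive -e` call assumes the hive CLI is on the PATH.

```shell
#!/usr/bin/env bash

# Join the given table names into
# "select * from t1 union all select * from t2 ...".
build_union_all() {
  local sql="" t
  for t in "$@"; do
    if [ -n "$sql" ]; then
      sql="$sql union all "
    fi
    sql="${sql}select * from $t"
  done
  printf '%s\n' "$sql"
}

# Wrap the union in an insert into the target table (first argument).
build_insert() {
  local target="$1"
  shift
  printf 'insert into %s select a.* from (%s) a;\n' \
    "$target" "$(build_union_all "$@")"
}

# Example usage:
#   hive -e "$(build_insert tmp.applogresult tmp.name1 tmp.name2 tmp.name7)"
```

The generated statement only works if every source table has the same number of columns with compatible types in the same order, since union all matches columns by position.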

 
