Preface: When testing Hive data we frequently need to load data into Hive, whether from another database or from local files. Sooner or later we run into columns in timestamp format, so how do we import those?

1. First, create a temporary table in which every timestamp column is declared as string

CREATE TABLE `dwd.dwd_jp_user_member_bk`(
  `etl_insert_date` string, 
  `batch_task_name` string, 
  `member_id` string COMMENT 'ID', 
  `user_id` string COMMENT 'business ID', 
  `nickname` string COMMENT 'nickname', 
  `last_name` string COMMENT 'last name', 
  `first_name` string COMMENT 'first name', 
  `avatar_url` string COMMENT 'avatar URL', 
  `update_time` string COMMENT 'update time', 
  `mobile` string COMMENT 'mobile number', 
  `id_card` string COMMENT 'ID card number', 
  `email` string COMMENT 'email', 
  `gender` string COMMENT 'gender', 
  `age` string COMMENT 'age', 
  `birth_date` string COMMENT 'birth date', 
  `channel_name` string COMMENT 'member channel', 
  `channel_first_name` string COMMENT 'member first source', 
  `full_name` string COMMENT 'full name', 
  `register_time` string COMMENT 'registration time', 
  `domain_id` string COMMENT 'domain id')
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
WITH SERDEPROPERTIES ( 
  'field.delim'=',', 
  'serialization.format'=',') 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://nameservice1/user/hive/warehouse/dwd.db/dwd_jp_user_member_bk'
TBLPROPERTIES (
  'transient_lastDdlTime'='1617160263');
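
The SerDe above splits each row on commas, so the CSV to be loaded must be comma-delimited as well. As an optional sanity check (not part of the import itself), a standard DESCRIBE confirms that every column really is string and shows the delimiter settings:

describe formatted dwd.dwd_jp_user_member_bk;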

2. Load the data into the temporary table (in my case the file is loaded from the local filesystem)

load data local inpath 'user_member_user_id.csv' into table dwd.dwd_jp_user_member_bk;
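
If the file is already on HDFS rather than on the local filesystem, the same statement works without the LOCAL keyword; the path below is only a placeholder for illustration. A quick count afterwards confirms the rows landed:

-- variant for a file that already lives on HDFS (placeholder path)
load data inpath '/tmp/user_member_user_id.csv' into table dwd.dwd_jp_user_member_bk;

-- confirm the load
select count(*) from dwd.dwd_jp_user_member_bk;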

3. Insert the data from the temporary table into the final table

insert overwrite table dwd.dwd_jp_user_member
select 
etl_insert_date
,batch_task_name
,member_id
,user_id
,nickname
,last_name
,first_name
,avatar_url
,cast(substring(update_time,1,10) as timestamp) as update_time
,mobile
,id_card
,email
,gender
,age
,birth_date
,channel_name
,channel_first_name
,full_name
,cast(substring(register_time,1,10) as timestamp) as register_time
from dwd.dwd_jp_user_member_bk;
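
Before overwriting the final table, it can be worth spot-checking how the cast behaves on a few rows. The sketch below is a minimal check against the temporary table, assuming the source strings start with a yyyy-MM-dd date (substring(..., 1, 10) drops everything after the date part):

-- optional spot check of the timestamp conversion
select
 update_time
,cast(substring(update_time,1,10) as timestamp) as update_time_ts
,register_time
,cast(substring(register_time,1,10) as timestamp) as register_time_ts
from dwd.dwd_jp_user_member_bk
limit 10;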

And with that, the data import is done!
