前提:doris使用broker导入需要单独部署broker,我这边看着部署教程都是不带broker部署部分的。
1.建表
我测试环境的hive数据是用时间分区,在导入时总是报错
type:ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel
使用SHOW PARTITIONS FROM demo.`table1`发现没有我导入数据所对应的分区
解决办法:用如下建表语句在新建时就把分区建好,其中比较关键的是"dynamic_partition.create_history_partition" = "true"这个参数。由于默认是false,所以不管设置"dynamic_partition.start" = "-300",这个参数为多大,历史分区就是建不起来,如果导入的数据分区是用历史时间命名,需要带上第一个参数为true才能新建出来历史分区。
建表语句
HDFS导入dorisCREATE TABLE demo.`table1` (`a` decimal(10, 0) NULL COMMENT "a",`b` varchar(65533) NULL COMMENT "b",`c` varchar(65533) NULL COMMENT "c",`date` int NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`a`)
COMMENT "测试表"
PARTITION BY RANGE(`date`)
(PARTITION `p20240301` VALUES LESS THAN ("20240302"),PARTITION `p20240302` VALUES LESS THAN ("20240303"),PARTITION `p20240303` VALUES LESS THAN ("20240304")
)
DISTRIBUTED BY RANDOM
PROPERTIES ("dynamic_partition.enable" = "true","dynamic_partition.time_unit" = "DAY","dynamic_partition.start" = "-300","dynamic_partition.end" = "3","dynamic_partition.prefix" = "p","dynamic_partition.buckets" = "1","dynamic_partition.replication_num" = "1","dynamic_partition.create_history_partition" = "true"
)
;
2.导入
导入语句use demo;
LOAD LABEL load_table1
(DATA INFILE("hdfs://ip:port/user/hive/warehouse/table1/*/*")INTO TABLE table1FORMAT AS "ORC"(
a,b,c)COLUMNS FROM PATH AS (date))WITH BROKER broker1
("hadoop.security.authentication" = "kerberos","kerberos_principal" = "xxx/xxx@HADOOP.COM","kerberos_keytab" = "/opt/keytabs/xxx.keytab"
)
PROPERTIES
("timeout" = "72000"
);
3.查看导入结果
show load order by CreateTime desc;