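Overview
A table created with the EXTERNAL keyword is called an external table.
Dropping an external table removes only its metadata from the metastore; the table data in HDFS is not deleted.
An external table involves a single step: creating the table and loading the data happen together. The data is not moved into the data warehouse directory; the table only establishes a link to the external data. When an external table is dropped, only that link is removed.
It points to data that already exists in HDFS, and it can be partitioned.
Its metadata is organized the same way as that of an internal (managed) table, but the actual data is stored quite differently.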
Syntax
CREATE EXTERNAL TABLE page_view
( viewTime INT,
userid BIGINT,
page_url STRING,
referrer_url STRING,
ip STRING COMMENT 'IP Address of the User',
country STRING COMMENT 'country of origination'
)
COMMENT 'This is the staging page view table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'hdfs://centos:9000/user/data/staging/page_view';
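External tables can also be partitioned. The following is a minimal sketch of what that might look like; the table name page_view_by_day, the partition column dt, and all paths are hypothetical and chosen only for illustration.

```sql
-- Hypothetical partitioned variant of page_view.
-- Table name, partition column, and paths are assumptions, not from the original notes.
CREATE EXTERNAL TABLE page_view_by_day
( viewTime INT,
  userid BIGINT,
  page_url STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/data/staging/page_view_by_day';

-- Register a directory that already exists in HDFS as one partition.
ALTER TABLE page_view_by_day
ADD PARTITION (dt = '2016-09-21')
LOCATION '/user/data/staging/page_view_by_day/dt=2016-09-21';

-- List the partitions the metastore now knows about.
SHOW PARTITIONS page_view_by_day;
```

As with the table itself, dropping such a partition from an external table removes only the metadata entry, not the files under the partition directory.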
Experiment
hive (default)> create external table
> external_test(name string,city string,address string)
> row format delimited fields terminated by '\t'
> location '/home/hive/extable';
OK
Time taken: 0.559 seconds
hive (default)> desc formatted external_test;
OK
col_name data_type comment
# col_name data_type comment
name string
city string
address string
# Detailed Table Information
Database: default
Owner: hadoop
CreateTime: Wed Sep 21 20:18:21 CST 2016
LastAccessTime: UNKNOWN
Retention: 0
Location: hdfs://hello110:9000/home/hive/extable
Table Type: EXTERNAL_TABLE
Table Parameters:
EXTERNAL TRUE
transient_lastDdlTime 1474460301
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim \t
serialization.format \t
Time taken: 0.065 seconds, Fetched: 29 row(s)
hive (default)> load data local inpath '/data/ext_test' into table external_test;
Loading data to table default.external_test
OK
Time taken: 1.286 seconds
hive (default)> select * from external_test;
OK
external_test.name external_test.city external_test.address
1 dddd dddd
2 www www
3 eeee wwww
4 tttt cccc
5 yyycc dddd
Time taken: 1.92 seconds, Fetched: 5 row(s)
If no LOCATION is specified, the table is placed by default under hdfs://hello110:9000/user/hive/warehouse/.
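One way to check the default location is to create an external table without a LOCATION clause and inspect it. This is only a sketch; ext_default is a hypothetical table name.

```sql
-- ext_default is a hypothetical table name used only for illustration.
create external table ext_default(name string, city string)
row format delimited fields terminated by '\t';

-- Look at the "Location:" field in the output to see where Hive placed the table.
desc formatted ext_default;
```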
Dropping
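For an external table, running drop table t1 leaves the data in HDFS; it is not deleted.
If you then create a table with the same name again
and run select * from t1, you will find that the table data is still there.
(If a LOCATION was specified, the new table must use the same LOCATION; if none was specified, both tables live under /user/hive/warehouse.)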
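The drop-and-recreate behavior can be tried with the external_test table from the experiment above. This is a sketch that assumes the table and its files under /home/hive/extable still exist as shown there.

```sql
-- Drop the external table; only the metastore entry is removed.
drop table external_test;

-- The data files should still be listed under the table's old location.
dfs -ls /home/hive/extable;

-- Re-create the table with the same schema and the same LOCATION.
create external table
external_test(name string, city string, address string)
row format delimited fields terminated by '\t'
location '/home/hive/extable';

-- The rows loaded earlier should be visible again.
select * from external_test;
```

Doing the same with a managed table would behave differently, because dropping a managed table also deletes its data directory.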