HiveQL: Queries

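Contents

    • 1. select from
      • 1.1 Specifying columns with regular expressions
      • 1.2 Computing with column values
      • 1.3 Using functions
      • 1.4 limit: restricting the number of rows returned
      • 1.5 Column aliases: as name
      • 1.6 case when then statements
    • 2. where clauses
    • 3. JOIN optimization
    • 4. Sampling queries
    • 5. union all

Notes from studying Programming Hive (《Hive编程指南》).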

1. select from

hive (default)> create table employees(
              > name string,
              > salary float,
              > subordinates array<string>,
              > deductions map<string, float>,
              > address struct<street:string, city:string, state:string, zip:int>)
              > partitioned by(country string, state string);
hive (default)> load data local inpath "/home/hadoop/workspace/employees.txt"
              > overwrite into table employees
              > partition(country='US', state='CA');
Loading data to table default.employees partition (country=US, state=CA)
hive (default)> select * from employees;
John Doe	100000.0	["Mary Smith","Todd Jones"]	{"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1}	{"street":"1 Michigan Ave.","city":"Chicago","state":"IL","zip":60600}	US	CA
Mary Smith	80000.0	["Bill King"]	{"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1}	{"street":"100 Ontario St.","city":"Chicago","state":"IL","zip":60601}	US	CA
Todd Jones	70000.0	[]	{"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1}	{"street":"200 Chicago Ave.","city":"Oak Park","state":"IL","zip":60700}	US	CA
Bill King	60000.0	[]	{"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1}	{"street":"300 Obscure Dr.","city":"Obscuria","state":"IL","zip":60100}	US	CA
Boss Man	200000.0	["John Doe","Fred Finance"]	{"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05}	{"street":"1 Pretentious Drive.","city":"Chicago","state":"IL","zip":60500}	US	CA
Fred Finance	150000.0	["Stacy Accountant"]	{"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05}	{"street":"2 Pretentious Drive.","city":"Chicago","state":"IL","zip":60500}	US	CA
Stacy Accountant	60000.0	[]	{"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1}	{"street":"300 Main St.","city":"Naperville","state":"IL","zip":60563}	US	CA              
  • A table can be given an alias; the two queries below return the same rows
hive (default)> select name, salary from employees;
hive (default)> select e.name, e.salary from employees e;
John Doe	100000.0
Mary Smith	80000.0
Todd Jones	70000.0
Bill King	60000.0
Boss Man	200000.0
Fred Finance	150000.0
Stacy Accountant	60000.0
  • Extract an array element with [idx]; an index that does not exist yields NULL, and the extracted string is shown without quotes
hive (default)> select e.name, e.subordinates[0] from employees e;
John Doe	Mary Smith
Mary Smith	Bill King
Todd Jones	NULL
Bill King	NULL
Boss Man	John Doe
Fred Finance	Stacy Accountant
Stacy Accountant	NULL
  • Extract a map element with [key]
hive (default)> select e.name, e.deductions['State Taxes'] from employees e;
John Doe	0.05
Mary Smith	0.05
Todd Jones	0.03
Bill King	0.03
Boss Man	0.07
Fred Finance	0.07
Stacy Accountant	0.03
  • Extract a field of a struct with the dot (.) notation
hive (default)> select e.name, e.address.city from employees e;
John Doe	Chicago
Mary Smith	Chicago
Todd Jones	Oak Park
Bill King	Obscuria
Boss Man	Chicago
Fred Finance	Chicago
Stacy Accountant	Naperville

1.1 Specifying columns with regular expressions

select `price.*` from stocks;

This selects every column of stocks whose name starts with price.
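One practical detail: in Hive 0.13 and later, a backquoted name is treated as a literal identifier by default, so the regex form only works after hive.support.quoted.identifiers is set to none. A minimal sketch follows; the column names of stocks (symbol, price_open, price_close and so on) are assumptions, since the table definition is not shown in these notes.

```sql
-- Needed on Hive 0.13+ so that backquoted names are parsed as regular expressions.
set hive.support.quoted.identifiers=none;

-- Assumption: stocks has a symbol column plus several price_* columns.
select symbol, `price.*` from stocks;

-- The same mechanism can exclude a column: everything from employees except salary.
select `(salary)?+.+` from employees;
```

The regular expressions use Java syntax, so a pattern tested in any Java regex tool should behave the same way here.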

1.2 Computing with column values

  • Compute the salary after federal taxes
hive (default)> select upper(name), salary, deductions['Federal Taxes'],
              > round(salary*(1-deductions['Federal Taxes'])) from employees;
JOHN DOE	100000.0	0.2	80000.0
MARY SMITH	80000.0	0.2	64000.0
TODD JONES	70000.0	0.15	59500.0
BILL KING	60000.0	0.15	51000.0
BOSS MAN	200000.0	0.3	140000.0
FRED FINANCE	150000.0	0.3	105000.0
STACY ACCOUNTANT	60000.0	0.15	51000.0

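1.3 Using functions

  • Aggregate functions
select count(*), avg(salary) from employees;
-- map-side aggregation can speed up aggregate queries, but needs more memory
set hive.map.aggr=true;
-- distinct removes duplicate values
select distinct address.city from employees;
  • Table-generating functions expand a single column into multiple rows or columns
hive (default)> select explode(subordinates) as sub from employees;
Mary Smith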
Todd Jones
Bill King
John Doe
Fred Finance
Stacy Accountant
  • Built-in functions
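The notes leave the built-in functions without an example, so here is a small sketch against the employees table defined above. The aliases (name_len, num_subordinates, location) are made up for illustration; the functions themselves are standard Hive built-ins.

```sql
-- List the available functions and show the help text for one of them.
show functions;
describe function concat;

-- length() and concat() work on strings; size() returns the number of
-- elements in an array or map (0 for an empty array).
select name,
       length(name) as name_len,
       size(subordinates) as num_subordinates,
       concat(address.city, ', ', address.state) as location
from employees;
```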

1.4 limit: restricting the number of rows returned

limit n returns at most n rows.
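A short sketch on the employees table. Without an order by, which rows come back is not guaranteed, so the second query sorts first to make the result well defined.

```sql
-- Any two rows of the table.
select upper(name), salary from employees limit 2;

-- The three highest salaries; on the sample data above these should be
-- Boss Man, Fred Finance and John Doe.
select name, salary from employees order by salary desc limit 3;
```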

1.5 Column aliases: as name
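An alias gives a computed column a readable name in the result. A minimal sketch, reusing the after-tax calculation from section 1.2; the alias names themselves are only illustrative.

```sql
select upper(name) as upper_name,
       deductions['Federal Taxes'] as fed_taxes,
       round(salary * (1 - deductions['Federal Taxes'])) as salary_minus_fed_taxes
from employees
limit 2;
```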

1.6 case when then statements

hive (default)> select name, salary,
              > case when salary < 50000 then 'low'
              > 	else 'high'
              > 	end as bracket from employees;
John Doe	100000.0	high
Mary Smith	80000.0	high
Todd Jones	70000.0	high
Bill King	60000.0	high
Boss Man	200000.0	high
Fred Finance	150000.0	high
Stacy Accountant	60000.0	high
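Since every salary in the sample data is at least 60000, the single-branch query above labels all rows high. A sketch with several branches shows the statement better; the thresholds and labels here are chosen only for illustration. Branches are tested in order and the first match wins, so each when only needs an upper bound.

```sql
select name, salary,
  case
    when salary < 50000 then 'low'
    when salary < 70000 then 'middle'
    when salary < 100000 then 'high'
    else 'very high'
  end as bracket
from employees;
```

On the sample data this should label the two 60000 salaries as middle, 70000 and 80000 as high, and everything from 100000 up as very high.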

2. where clauses

  • Filter conditions: where keeps only the rows that satisfy a predicate
  • like, rlike (regular expressions)
hive (default)> select name, address.street from employees where address.street like "%Ave.";
OK
John Doe	1 Michigan Ave.
Todd Jones	200 Chicago Ave.
hive (default)> select name, address.street from employees where address.street like "%Chi%";
OK
Todd Jones	200 Chicago Ave.
hive (default)> select name, address.street from employees where address.street rlike ".*(Chicago|Ontario).*";
OK
Mary Smith	100 Ontario St.
Todd Jones	200 Chicago Ave.
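One limitation worth knowing: a where clause cannot refer to a column alias defined in the same select list. A common workaround is to compute the value in a nested select and filter in the outer query. The sketch below assumes the employees table from section 1; the alias net and the 70000 threshold are made up for the example.

```sql
-- This form is rejected, because where cannot see the alias net:
--   select name, round(salary * (1 - deductions['Federal Taxes'])) as net
--   from employees where net > 70000;

-- Nested select: define the alias inside, filter on it outside.
select e.name, e.net
from (
  select name,
         round(salary * (1 - deductions['Federal Taxes'])) as net
  from employees
) e
where e.net > 70000;
```

With the after-tax values listed in section 1.2, this should keep John Doe, Boss Man and Fred Finance.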

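3. JOIN optimization

When joining several tables, put the smaller tables on the left. Hive assumes the last table in the query is the largest: it buffers the earlier tables and streams the last one through.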
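A sketch of what this looks like in practice. The dividends table and all column names (ymd, symbol, price_close, dividend) are assumptions for illustration, as is the premise that dividends is much smaller than stocks.

```sql
-- Smaller table first, largest table last.
select s.ymd, s.symbol, s.price_close, d.dividend
from dividends d join stocks s
  on s.ymd = d.ymd and s.symbol = d.symbol
where s.symbol = 'AAPL';

-- If the query cannot be reordered, the streamtable hint names the table to stream.
select /*+ streamtable(s) */ s.ymd, s.symbol, s.price_close, d.dividend
from stocks s join dividends d
  on s.ymd = d.ymd and s.symbol = d.symbol
where s.symbol = 'AAPL';
```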

4. Sampling queries

  • Bucket sampling; with rand(), each run returns a different sample
hive> select name from employees tablesample(bucket 3 out of 4 on rand());
John Doe
hive> select name from employees tablesample(bucket 3 out of 4 on rand());
Boss Man
Fred Finance
  • Without rand(), the result is the same on every run
hive> select name from employees tablesample(bucket 3 out of 4 on name);
Mary Smith
Todd Jones
hive> select name from employees tablesample(bucket 3 out of 4 on name);
Mary Smith
Todd Jones
  • Percentage sampling
hive> select name from employees tablesample(70 percent);
John Doe
Mary Smith
Todd Jones
Bill King
Boss Man
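To spell out the bucket syntax used above: the clause has the form tablesample(bucket x out of y on expr). Each row is assigned to one of y buckets by hashing expr, and the rows that land in bucket x (counted from 1) are returned. A small sketch with different fractions; the exact rows returned depend on the hash, so no output is shown.

```sql
-- About one quarter of the rows; repeatable, because name hashes the same way every run.
select name from employees tablesample(bucket 1 out of 4 on name);

-- About half of the rows; a different sample on each run, because rand() changes.
select name from employees tablesample(bucket 1 out of 2 on rand());
```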

5. union all

union all merges the results of several queries; each query must return the same columns, with matching field types.

hive> select name from(
    > select e1.name from employees e1 where e1.name like "Mary%"
    > union all
    > select e2.name from employees e2 where e2.name like "Bill%"
    > ) name_tab
    > sort by name;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20210411221203_b3dde291-8596-4b91-95e0-707eeaa873f6
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
2021-04-11 22:12:04,856 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_local1468526053_0003
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 31360 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
Bill King
Mary Smith
