Spark SQL: JOIN
- 1. Description
- 2. Syntax
- 3. Parameters
- 4. Join Types
- 4.1 Inner Join
- 4.2 Left Join
- 4.3 Right Join
- 4.4 Full Join
- 4.5 Cross Join
- 4.6 Semi Join
- 4.7 Anti Join
- 5. Examples
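1. Description
A SQL join combines rows from two relations based on join criteria. The following section describes the overall join syntax, and the subsections cover the different join types along with examples.
2. Syntax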
relation { [ join_type ] JOIN [ LATERAL ] relation [ join_criteria ] | NATURAL join_type JOIN [ LATERAL ] relation }
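3. Parameters
- relation
Specifies the relation to be joined.
- join_type
Specifies the join type.
Syntax:
[ INNER ] | CROSS | LEFT [ OUTER ] | [ LEFT ] SEMI | RIGHT [ OUTER ] | FULL [ OUTER ] | [ LEFT ] ANTI
- join_criteria
Specifies how the rows from one relation are combined with the rows of another relation.
Syntax:
ON boolean_expression | USING ( column_name [ , ... ] )
- boolean_expression
Specifies an expression with a return type of boolean.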
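The difference between the two join_criteria forms can be sketched in plain Python. This is only a model of the semantics, not Spark code: `join_on` and `join_using` are hypothetical helper names, rows are modelled as dicts, and the sample rows are a subset of the employee and department tables used in the examples section.

```python
# Plain-Python model of the two join_criteria forms (illustrative, not Spark code).
employee = [
    {"id": 101, "name": "John", "deptno": 1},
    {"id": 105, "name": "Chloe", "deptno": 5},
]
department = [
    {"deptno": 1, "deptname": "Marketing"},
    {"deptno": 2, "deptname": "Sales"},
]


def join_on(left, right, predicate):
    """ON: keep every (left, right) pair for which the boolean expression is true."""
    return [(l, r) for l in left for r in right if predicate(l, r)]


def join_using(left, right, columns):
    """USING: match on equality of the named columns and emit each of them once."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[c] == r[c] for c in columns)
    ]


on_rows = join_on(employee, department, lambda e, d: e["deptno"] == d["deptno"])
using_rows = join_using(employee, department, ["deptno"])

print(on_rows)     # pairs of rows: deptno is present on both sides
print(using_rows)  # merged rows: deptno appears a single time
```

With ON, both relations keep their own `deptno` column, which is why the example queries qualify it as `employee.deptno`. With USING, the shared column shows up once in the result.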
4. Join Types
4.1 Inner Join
The inner join is the default join in Spark SQL. It selects rows that have matching values in both relations.
Syntax:
relation [ INNER ] JOIN relation [ join_criteria ]
4.2 Left Join
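A left join returns all values from the left relation and the matched values from the right relation, or appends NULL if there is no match. It is also referred to as a left outer join.
Syntax: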
relation LEFT [ OUTER ] JOIN relation [ join_criteria ]
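The NULL padding of a left join can be illustrated with a small plain-Python model. It is a sketch of the semantics only: `left_join` is a hypothetical helper, `None` stands in for SQL NULL, and the rows copy the employee and department tables from the examples section.

```python
# Plain-Python model of LEFT [OUTER] JOIN on a single equality key (illustrative only).
employee = [
    {"id": 105, "name": "Chloe", "deptno": 5},
    {"id": 103, "name": "Paul", "deptno": 3},
    {"id": 101, "name": "John", "deptno": 1},
    {"id": 102, "name": "Lisa", "deptno": 2},
    {"id": 104, "name": "Evan", "deptno": 4},
    {"id": 106, "name": "Amy", "deptno": 6},
]
department = [
    {"deptno": 3, "deptname": "Engineering"},
    {"deptno": 2, "deptname": "Sales"},
    {"deptno": 1, "deptname": "Marketing"},
]


def left_join(left, right, key, right_columns):
    rows = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if not matches:
            # No partner on the right: pad the right-hand columns with None (NULL).
            matches = [{c: None for c in right_columns}]
        for r in matches:
            rows.append({**l, **{c: r[c] for c in right_columns}})
    return rows


left_rows = left_join(employee, department, "deptno", ["deptname"])
for row in left_rows:
    print(row["id"], row["name"], row["deptno"], row["deptname"])
```

Every employee appears in the output, and those in departments 4, 5 and 6 carry `None` for `deptname`.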
4.3 Right Join
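A right join returns all values from the right relation and the matched values from the left relation, or appends NULL if there is no match. It is also referred to as a right outer join.
Syntax: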
relation RIGHT [ OUTER ] JOIN relation [ join_criteria ]
4.4 Full Join
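A full join returns all values from both relations, appending NULL values on the side that does not have a match. It is also referred to as a full outer join.
Syntax: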
relation FULL [ OUTER ] JOIN relation [ join_criteria ]
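A full join pads on both sides, which the following plain-Python sketch tries to make visible. The helper `full_join` and the two tiny relations are hypothetical and chosen so that each side has one unmatched row. A `None` in place of a row stands for "all columns of that side are NULL".

```python
# Plain-Python model of FULL [OUTER] JOIN on a single equality key (illustrative only).
left_rel = [{"k": 1, "l": "a"}, {"k": 2, "l": "b"}]
right_rel = [{"k": 2, "r": "x"}, {"k": 3, "r": "y"}]


def full_join(left, right, key):
    rows = []
    matched_right = set()
    for l in left:
        matched = False
        for i, r in enumerate(right):
            if l[key] == r[key]:
                rows.append((l, r))
                matched_right.add(i)
                matched = True
        if not matched:
            rows.append((l, None))  # left row without a partner
    for i, r in enumerate(right):
        if i not in matched_right:
            rows.append((None, r))  # right row without a partner
    return rows


full_rows = full_join(left_rel, right_rel, "k")
for pair in full_rows:
    print(pair)
```

In this model, keeping only the matched pairs and the `(l, None)` rows corresponds to a left join, and keeping only the matched pairs and the `(None, r)` rows corresponds to a right join.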
4.5 Cross Join
A cross join returns the Cartesian product of two relations.
Syntax:
relation CROSS JOIN relation [ join_criteria ]
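As a quick size check, the Cartesian product can be modelled with `itertools.product` from the Python standard library. The two lists below are taken from the employee ids and department names in the examples section, and the snippet is only a model of the row pairing, not Spark code.

```python
from itertools import product

# Values taken from the employee and department tables in the examples section.
employee_ids = [105, 103, 101, 102, 104, 106]
department_names = ["Engineering", "Sales", "Marketing"]

# Every employee is paired with every department: 6 * 3 combinations.
cross_rows = list(product(employee_ids, department_names))

print(len(cross_rows))  # 18
print(cross_rows[:3])
```

The row count is the product of the two input sizes, so cross joins on large relations grow quickly.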
4.6 Semi Join
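A semi join returns the values from the left relation that have a match in the right relation. It is also referred to as a left semi join.
Syntax: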
relation [ LEFT ] SEMI JOIN relation [ join_criteria ]
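One property worth spelling out is that a semi join only filters the left relation: right-hand columns are not returned, and a left row is emitted once even when several right rows match it. The sketch below models this in plain Python. `semi_join` is a hypothetical helper, and the second "Marketing (East)" row is invented purely to create a duplicate key.

```python
# Plain-Python model of [LEFT] SEMI JOIN on a single equality key (illustrative only).
employee = [
    {"id": 101, "name": "John", "deptno": 1},
    {"id": 105, "name": "Chloe", "deptno": 5},
]
# Hypothetical data: two department rows share deptno 1.
department = [
    {"deptno": 1, "deptname": "Marketing"},
    {"deptno": 1, "deptname": "Marketing (East)"},
]


def semi_join(left, right, key):
    right_keys = {r[key] for r in right}
    # Existence test only: each left row is kept at most once.
    return [l for l in left if l[key] in right_keys]


semi_rows = semi_join(employee, department, "deptno")
inner_pairs = [(e, d) for e in employee for d in department if e["deptno"] == d["deptno"]]

print(semi_rows)         # John once, with left-hand columns only
print(len(inner_pairs))  # 2: an inner join would repeat John per match
```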
4.7 Anti Join
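An anti join returns the values from the left relation that have no match in the right relation. It is also referred to as a left anti join.
Syntax: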
relation [ LEFT ] ANTI JOIN relation [ join_criteria ]
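An anti join can be read as the complement of a semi join over the left relation. The plain-Python sketch below illustrates that on the employee and department data from the examples section, under the simplifying assumption that the join key contains no NULLs. Rows are modelled as `(id, name, deptno)` tuples, and the variable names are illustrative.

```python
# Plain-Python model of [LEFT] ANTI JOIN as the complement of a semi join.
# Assumption: the join key has no NULLs, so every row either matches or does not.
employee = [
    (105, "Chloe", 5),
    (103, "Paul", 3),
    (101, "John", 1),
    (102, "Lisa", 2),
    (104, "Evan", 4),
    (106, "Amy", 6),
]
department_keys = {3, 2, 1}  # deptno values present in the department table

matched = [e for e in employee if e[2] in department_keys]        # semi join
unmatched = [e for e in employee if e[2] not in department_keys]  # anti join

print([e[0] for e in unmatched])
```

Together the two lists cover every employee row exactly once, which matches the semi join and anti join outputs in the examples section.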
5. Examples
-- Use employee and department tables to demonstrate different types of joins.
SELECT * FROM employee;
+---+-----+------+
| id| name|deptno|
+---+-----+------+
|105|Chloe|     5|
|103| Paul|     3|
|101| John|     1|
|102| Lisa|     2|
|104| Evan|     4|
|106|  Amy|     6|
+---+-----+------+

SELECT * FROM department;
+------+-----------+
|deptno|   deptname|
+------+-----------+
|     3|Engineering|
|     2|      Sales|
|     1|  Marketing|
+------+-----------+

-- Use employee and department tables to demonstrate inner join.
SELECT id, name, employee.deptno, deptname FROM employee INNER JOIN department ON employee.deptno = department.deptno;
+---+-----+------+-----------+
| id| name|deptno|   deptname|
+---+-----+------+-----------+
|103| Paul|     3|Engineering|
|101| John|     1|  Marketing|
|102| Lisa|     2|      Sales|
+---+-----+------+-----------+

-- Use employee and department tables to demonstrate left join.
SELECT id, name, employee.deptno, deptname FROM employee LEFT JOIN department ON employee.deptno = department.deptno;
+---+-----+------+-----------+
| id| name|deptno|   deptname|
+---+-----+------+-----------+
|105|Chloe|     5|       NULL|
|103| Paul|     3|Engineering|
|101| John|     1|  Marketing|
|102| Lisa|     2|      Sales|
|104| Evan|     4|       NULL|
|106|  Amy|     6|       NULL|
+---+-----+------+-----------+

-- Use employee and department tables to demonstrate right join.
SELECT id, name, employee.deptno, deptname FROM employee RIGHT JOIN department ON employee.deptno = department.deptno;
+---+-----+------+-----------+
| id| name|deptno|   deptname|
+---+-----+------+-----------+
|103| Paul|     3|Engineering|
|101| John|     1|  Marketing|
|102| Lisa|     2|      Sales|
+---+-----+------+-----------+

-- Use employee and department tables to demonstrate full join.
SELECT id, name, employee.deptno, deptname FROM employee FULL JOIN department ON employee.deptno = department.deptno;
+---+-----+------+-----------+
| id| name|deptno|   deptname|
+---+-----+------+-----------+
|101| John|     1|  Marketing|
|106|  Amy|     6|       NULL|
|103| Paul|     3|Engineering|
|105|Chloe|     5|       NULL|
|104| Evan|     4|       NULL|
|102| Lisa|     2|      Sales|
+---+-----+------+-----------+

-- Use employee and department tables to demonstrate cross join.
SELECT id, name, employee.deptno, deptname FROM employee CROSS JOIN department;
+---+-----+------+-----------+
| id| name|deptno|   deptname|
+---+-----+------+-----------+
|105|Chloe|     5|Engineering|
|105|Chloe|     5|  Marketing|
|105|Chloe|     5|      Sales|
|103| Paul|     3|Engineering|
|103| Paul|     3|  Marketing|
|103| Paul|     3|      Sales|
|101| John|     1|Engineering|
|101| John|     1|  Marketing|
|101| John|     1|      Sales|
|102| Lisa|     2|Engineering|
|102| Lisa|     2|  Marketing|
|102| Lisa|     2|      Sales|
|104| Evan|     4|Engineering|
|104| Evan|     4|  Marketing|
|104| Evan|     4|      Sales|
|106|  Amy|     6|Engineering|
|106|  Amy|     6|  Marketing|
|106|  Amy|     6|      Sales|
+---+-----+------+-----------+

-- Use employee and department tables to demonstrate semi join.
SELECT * FROM employee SEMI JOIN department ON employee.deptno = department.deptno;
+---+-----+------+
| id| name|deptno|
+---+-----+------+
|103| Paul|     3|
|101| John|     1|
|102| Lisa|     2|
+---+-----+------+

-- Use employee and department tables to demonstrate anti join.
SELECT * FROM employee ANTI JOIN department ON employee.deptno = department.deptno;
+---+-----+------+
| id| name|deptno|
+---+-----+------+
|105|Chloe|     5|
|104| Evan|     4|
|106|  Amy|     6|
+---+-----+------+