The project at hand needed to decrypt data encrypted with SM4, the Chinese national commercial cipher, and Hive has no built-in support for it. We also tried Huawei's DWS data warehouse, but DWS only supports SM4 encryption and decryption performed inside DWS itself; it cannot decrypt data that was encrypted externally.
Create a Maven project
Only the third-party dependencies we reference need to be bundled into the jar; the Hadoop and Hive dependencies do not. For the dependencies that should stay out of the jar, simply set their scope to provided.
Create the Maven project in IDEA and configure pom.xml as follows:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.szc.bigdata.hive</groupId>
    <artifactId>sm4_decode</artifactId>
    <version>0.1</version>

    <properties>
        <hadoop.version>3.1.1-hw-ei-311006</hadoop.version>
        <hive.version>3.1.0-hw-ei-311006</hive.version>
    </properties>

    <dependencies>
        <!-- SM4 decryption needs the following 3 artifacts; commons-codec is used to turn byte arrays into strings -->
        <dependency>
            <groupId>commons-codec</groupId>
            <artifactId>commons-codec</artifactId>
            <version>1.15</version>
        </dependency>
        <dependency>
            <groupId>org.bouncycastle</groupId>
            <artifactId>bcprov-jdk15to18</artifactId>
            <version>1.69</version>
        </dependency>
        <dependency>
            <groupId>cn.hutool</groupId>
            <artifactId>hutool-crypto</artifactId>
            <version>5.8.16</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-auth</artifactId>
            <version>${hadoop.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>${hive.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-common</artifactId>
            <version>${hive.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-shims</artifactId>
            <version>${hive.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>${hadoop.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>${hive.version}</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>

    <repositories>
        <!-- Pick a repository that fits your environment; this setup uses the Huawei mirror -->
        <repository>
            <id>huaweicloudsdk</id>
            <url>https://mirrors.huaweicloud.com/repository/maven/huaweicloudsdk/</url>
            <releases><enabled>true</enabled></releases>
            <snapshots><enabled>true</enabled></snapshots>
        </repository>
    </repositories>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>3.1.0</version>
                <configuration>
                    <!-- This configuration produces two jars: one without dependencies, and one with all dependencies bundled -->
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <mainClass>com.szc.bigdata.hive.udf.SM4Decode</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <!-- bind to the packaging phase -->
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
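Building is a plain mvn package; the assembly plugin above is bound to that phase. Going by the <version>0.1</version> declared in the pom, the outputs under target/ would be sm4_decode-0.1.jar and sm4_decode-0.1-jar-with-dependencies.jar; the jar used in the upload steps below carries the Hive version in its name instead, so rename accordingly. A minimal sketch:

# Build from the project root; the assembly plugin runs during the package phase
mvn clean package
# Expected artifacts under target/ (names follow the pom's artifactId/version):
#   sm4_decode-0.1.jar                         -- classes only
#   sm4_decode-0.1-jar-with-dependencies.jar   -- third-party dependencies bundled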
Write the custom function class
SmUtil is the SM4 helper class from hutool.
package com.szc.bigdata.hive.udf;

import cn.hutool.crypto.SmUtil;
import cn.hutool.crypto.symmetric.SymmetricCrypto;
import org.apache.commons.codec.binary.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;

public class SM4Decode extends UDF {

    public String evaluate(String data, String key) {
        if (data == null || "".equals(data)) {
            return null;
        }
        // Build an SM4 cipher from the raw key bytes
        SymmetricCrypto sm4 = SmUtil.sm4(key.getBytes());
        // decrypt() returns the plaintext bytes; commons-codec turns them into a UTF-8 string
        return StringUtils.newStringUtf8(sm4.decrypt(data));
    }
}
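Before deploying the jar, it can help to sanity-check the cipher locally. The sketch below is a made-up round-trip test, not part of the UDF project: the class name and the 16-byte key are illustrative only, and it relies on the same hutool calls the UDF uses.

import cn.hutool.crypto.SmUtil;
import cn.hutool.crypto.symmetric.SymmetricCrypto;

public class SM4LocalTest {
    public static void main(String[] args) {
        // SM4 uses a 128-bit key, i.e. exactly 16 bytes
        SymmetricCrypto sm4 = SmUtil.sm4("0123456789abcdef".getBytes());
        // encryptHex produces hex-encoded ciphertext; decrypt accepts the encoded string back
        String cipher = sm4.encryptHex("hello sm4");
        System.out.println(cipher);
        System.out.println(new String(sm4.decrypt(cipher)));
    }
}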
Upload the jar to HDFS
If the cluster has access control enabled (Kerberos), you must log in with kinit first.
# Refresh the environment variables
source bigdata_env
# Log in with kinit
kinit <用户名>
# Enter the password at the prompt
# Upload the jar to the target directory
hdfs dfs -put ~/sm4_decode-3.1.0-hw-ei-311006-jar-with-dependencies.jar /tmp
# Grant permissions so Hive can read the jar
hdfs dfs -chmod 777 /tmp/sm4_decode-3.1.0-hw-ei-311006-jar-with-dependencies.jar
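As an optional sanity check (not part of the original steps), confirm the jar is in place and readable before creating the function:

# Verify the upload and the permissions
hdfs dfs -ls /tmp/sm4_decode-3.1.0-hw-ei-311006-jar-with-dependencies.jar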
Create the function
# Go to the Hive client directory and start beeline
beeline

-- Switch to the admin role (required for CREATE FUNCTION)
set role admin;
-- Create a permanent function
CREATE FUNCTION sm4decode AS 'com.szc.bigdata.hive.udf.SM4Decode' using jar 'hdfs:///tmp/sm4_decode-3.1.0-hw-ei-311006-jar-with-dependencies.jar';
-- Or create a temporary function (visible only in the current session)
CREATE TEMPORARY FUNCTION sm4decode AS 'com.szc.bigdata.hive.udf.SM4Decode' using jar 'hdfs:///tmp/sm4_decode-3.1.0-hw-ei-311006-jar-with-dependencies.jar';
-- Use the function
select sm4decode('decodestr','key');
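The literals above are placeholders; in practice the function runs over an encrypted column. A hedged example, where user_info and id_card_enc are hypothetical table and column names and the key must be the same 16-byte key used at encryption time:

-- user_info / id_card_enc are hypothetical names; the key must match the one used to encrypt
SELECT sm4decode(id_card_enc, '0123456789abcdef') AS id_card
FROM user_info
LIMIT 10;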