Getting Started with Spark (3): Building a Spark Project in IDEA

1. Dependency configuration

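The project needs the Scala and Spark dependencies. The number after the underscore in each Spark artifact name must match the first two components of the Scala version, which is 2.11 here.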
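The naming rule above can be checked mechanically. The following is a minimal sketch, assuming a Scala 2.x version string; the class ScalaArtifacts and its methods are hypothetical helpers written for this illustration and are not part of Spark or Maven.

```java
// Hypothetical helper for illustration only; not part of Spark or Maven.
final class ScalaArtifacts {
    private ScalaArtifacts() {}

    /** Returns the first two components of a Scala 2.x version, e.g. "2.11" for "2.11.1". */
    static String binaryVersion(String scalaVersion) {
        String[] parts = scalaVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected a version like 2.11.1, got: " + scalaVersion);
        }
        return parts[0] + "." + parts[1];
    }

    /** Builds the artifactId that matches the given Scala version, e.g. "spark-core_2.11". */
    static String artifactId(String module, String scalaVersion) {
        return module + "_" + binaryVersion(scalaVersion);
    }

    public static void main(String[] args) {
        // With scala.version = 2.11.1 from the pom, both Spark artifacts end in _2.11.
        System.out.println(artifactId("spark-core", "2.11.1"));
        System.out.println(artifactId("spark-sql", "2.11.1"));
    }
}
```

If the suffix and the Scala library version disagree, the project may compile but fail at runtime with missing-method or class errors, so it is worth checking both values whenever either one is changed.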

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.mk</groupId>
  <artifactId>spark-test</artifactId>
  <version>1.0</version>
  <name>spark-test</name>
  <url>http://spark.mk.com</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    <scala.version>2.11.1</scala.version>
    <spark.version>2.4.4</spark.version>
    <hadoop.version>2.6.0</hadoop.version>
  </properties>

  <dependencies>
    <!-- Scala dependency -->
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>

    <!-- Spark dependencies -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.11</artifactId>
      <version>${spark.version}</version>
    </dependency>

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <artifactId>maven-clean-plugin</artifactId>
          <version>3.1.0</version>
        </plugin>
        <plugin>
          <artifactId>maven-resources-plugin</artifactId>
          <version>3.0.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-compiler-plugin</artifactId>
          <version>3.8.0</version>
        </plugin>
        <plugin>
          <artifactId>maven-surefire-plugin</artifactId>
          <version>2.22.1</version>
        </plugin>
        <plugin>
          <artifactId>maven-jar-plugin</artifactId>
          <version>3.0.2</version>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
</project>

2. Pi example

This is the Scala Pi example rewritten in Java.

package com.mk;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

import java.util.ArrayList;
import java.util.List;

public class App {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf();
        // On a Windows development machine, run Spark in local mode with 2 threads
        if (System.getProperty("os.name").toLowerCase().contains("win")) {
            sparkConf.setMaster("local[2]");
            System.out.println("Running Spark in local mode");
        }

        SparkSession session = SparkSession.builder()
                .appName("Pi")
                .config(sparkConf)
                .getOrCreate();

        int slices = 2;
        int n = (int) Math.min(100_000L * slices, Integer.MAX_VALUE);
        JavaSparkContext sparkContext = new JavaSparkContext(session.sparkContext());

        // One list element per random sample
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(i + 1);
        }

        // Map each element to 1 if a random point falls inside the unit circle, else 0, then sum
        int count = sparkContext.parallelize(list, slices)
                .map(v -> {
                    double x = Math.random() * 2 - 1;
                    double y = Math.random() * 2 - 1;
                    if (x * x + y * y < 1) {
                        return 1;
                    }
                    return 0;
                })
                .reduce((Integer a, Integer b) -> a + b);

        System.out.println("PI:" + 4.0 * count / n);
        session.stop();
    }
}
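The arithmetic behind the final line is a Monte Carlo estimate: points are drawn uniformly from the square [-1, 1] x [-1, 1], which has area 4, and the unit circle inside it has area Pi, so the fraction of points inside the circle approaches Pi / 4. The sketch below shows the same calculation in plain Java without Spark. The class name LocalPi and the fixed seed are assumptions made for this illustration, and java.util.Random replaces Math.random() only so that the result is repeatable.

```java
import java.util.Random;

// Illustrative single-threaded version of the estimate; not part of the original project.
final class LocalPi {
    private LocalPi() {}

    /** Estimates Pi from n random points in the square [-1, 1] x [-1, 1]. */
    static double estimate(int n, long seed) {
        Random random = new Random(seed);
        int inside = 0;
        for (int i = 0; i < n; i++) {
            double x = random.nextDouble() * 2 - 1;
            double y = random.nextDouble() * 2 - 1;
            if (x * x + y * y < 1) {
                inside++; // the point lies inside the unit circle
            }
        }
        // inside / n approximates Pi / 4, so multiply by 4
        return 4.0 * inside / n;
    }

    public static void main(String[] args) {
        // Same sample count as the Spark job with slices = 2
        System.out.println("PI:" + estimate(200_000, 42L));
    }
}
```

In the Spark version, the loop body becomes the map function and the counter becomes the reduce step, so each of the 2 slices counts its own points and the partial counts are added together.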

3. Running locally in IDEA

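Run the main method of App directly in IDEA. The console output includes a line beginning with "PI:" followed by the estimated value.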

4. Submitting to a Spark cluster

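Package the project as a jar, upload spark-test.jar to the home directory (~) on the cluster, and run the shell command below. With the pom above, Maven names the jar spark-test-1.0.jar by default, so rename it or adjust the path in the command.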

~/software/spark-2.4.4-bin-hadoop2.6/bin/spark-submit --master spark://hadoop01:7077,hadoop02:7077,hadoop03:7077 --class com.mk.App ~/spark-test.jar
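The command has four parts: the spark-submit script, the --master URL listing each master as host:port, the --class to run, and the jar path. As a rough sketch of how those parts fit together, the Java class below assembles the same argument list. SubmitCommand and its methods are hypothetical names introduced here, and the host names, port, and paths are simply the ones from the command above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; it only builds strings and does not launch anything.
final class SubmitCommand {
    private SubmitCommand() {}

    /** Joins master hosts into one standalone master URL, e.g. spark://a:7077,b:7077. */
    static String masterUrl(List<String> hosts, int port) {
        return "spark://" + hosts.stream()
                .map(host -> host + ":" + port)
                .collect(Collectors.joining(","));
    }

    /** Returns the spark-submit arguments in the order used in the article. */
    static List<String> build(String sparkHome, String master, String mainClass, String jar) {
        return Arrays.asList(
                sparkHome + "/bin/spark-submit",
                "--master", master,
                "--class", mainClass,
                jar);
    }

    public static void main(String[] args) {
        String master = masterUrl(Arrays.asList("hadoop01", "hadoop02", "hadoop03"), 7077);
        List<String> command = build(
                "~/software/spark-2.4.4-bin-hadoop2.6", master, "com.mk.App", "~/spark-test.jar");
        System.out.println(String.join(" ", command));
    }
}
```

Because App only calls setMaster when it detects Windows, the --master value passed here is the one that takes effect on the cluster nodes.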
