Start HDFS
(omitted here)
See: http://blog.csdn.net/zengmingen/article/details/53006541
Start Spark
(omitted here)
Installation: http://blog.csdn.net/zengmingen/article/details/72123717
spark-shell: http://blog.csdn.net/zengmingen/article/details/72162821
Prepare the data
vi wordcount.txt
hello zeng
hello miao
hello gen
hello zeng
hello wen
hello biao
zeng miao gen
zeng wen biao
lu ting ting
zhang xiao zhu
chang sheng xiang qi lai
zhu ye su ai ni
Upload it to HDFS
hdfs dfs -put wordcount.txt /
Write the code
In the spark-shell window, using Scala:
sc.textFile("hdfs://nbdo1:9000/wordcount.txt")
.flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_)
.saveAsTextFile("hdfs://nbdo1:9000/out")
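The Spark pipeline above is: flatMap splits each line into words, map pairs each word with a 1, and reduceByKey sums the 1s per word. A minimal pure-Python sketch of the same logic (the sample lines and variable names below are illustrative, not part of the Spark job):

```python
from collections import Counter

# A few lines of input, like the contents of wordcount.txt
lines = [
    "hello zeng",
    "hello miao",
    "zeng miao gen",
]

# flatMap(_.split(" ")): flatten all lines into one stream of words
words = [w for line in lines for w in line.split(" ")]

# map((_, 1)) then reduceByKey(_ + _): sum the 1s for each word
counts = Counter(words)

print(sorted(counts.items()))
# [('gen', 1), ('hello', 2), ('miao', 2), ('zeng', 2)]
```

In the real job, Spark performs the reduceByKey step in parallel across partitions and writes one part-NNNNN file per output partition under /out.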
Result: the word counts are written to hdfs://nbdo1:9000/out.
Supplement:
Saving the output to a single file.
Code:
sc.textFile("hdfs://nbdo1:9000/wordcount.txt")
.flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_)
.coalesce(1,true).saveAsTextFile("hdfs://nbdo1:9000/out2")
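Without coalesce, saveAsTextFile writes one part-NNNNN file per output partition; coalesce(1, true) shuffles the data down to a single partition first, so only one output file is produced. A pure-Python sketch of that merge step (the partition contents below are made up for illustration):

```python
# Pretend reduceByKey left its results spread over two partitions,
# which would otherwise become two part-NNNNN files.
partitions = [
    {"hello": 6, "zeng": 4},   # would be part-00000
    {"miao": 2, "gen": 2},     # would be part-00001
]

# coalesce(1, true): merge every partition into a single one,
# so saveAsTextFile emits a single output file.
merged = {}
for part in partitions:
    merged.update(part)

print(len(partitions), "partitions ->", 1, "partition:", merged)
```

Note that forcing a single partition funnels all data through one task, so this is only sensible for small result sets like a word count.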
Result: the merged output is written to hdfs://nbdo1:9000/out2 as a single file.
-------------
More tutorials and videos on Java, Android, big data, J2EE, Python, databases, Linux, and Java architecture:
http://www.cnblogs.com/zengmiaogen/p/7083694.html