First, upload the January order data to HDFS. Each order record consists of two fields, ID and Goods, separated by a space.
Save the order data in a file named order.txt before uploading (remember to start the cluster first).
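For reference, order.txt might look like the following (the goods names are made-up placeholders; each line is a user ID and a goods name separated by a single space):

```text
1 apple
2 banana
3 apple
3 orange
```

It can then be uploaded with, for example, hdfs dfs -put order.txt /bigdata/ (the /bigdata directory matches the input path used when running the job later).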
Open IDEA and create a new project.
Modify pom.xml and add the dependencies:
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.1.4</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
        <version>1.7.30</version>
    </dependency>
</dependencies>
Specify the packaging type as jar.
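In pom.xml this is the packaging element:

```xml
<packaging>jar</packaging>
```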
Plugin configuration for packaging:
<build>
    <plugins>
        <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.1</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <!-- The single goal must be bound here, otherwise the assembly is never built -->
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
Create a new log4j configuration file named log4j.properties under the resources directory:
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
log4j.appender.logfile=org.apache.log4j.FileAppender
log4j.appender.logfile.File=D:\\ordercount.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n
Create a new class ShoppingOrderCount in the com.maidu.ordercount package and implement the following modules.
1. Writing the Mapper module
Define an inner class MyMap in ShoppingOrderCount:
public static class MyMap extends Mapper<Object, Text, Text, IntWritable> {
    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        // A line such as "3 apple" holds the user ID and the goods name.
        // Emit (goods, 1): the 1 is the count for this order, not the user ID.
        String[] arr = line.split(" ");
        if (arr.length == 2) {
            context.write(new Text(arr[1]), new IntWritable(1));
        }
    }
}
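The mapper's parsing step can be sanity-checked on its own; a minimal sketch using a made-up order line:

```java
public class SplitCheck {
    public static void main(String[] args) {
        // "3 apple" is a hypothetical order line: user 3 ordered an apple
        String[] arr = "3 apple".split(" ");
        System.out.println(arr.length);  // 2, so the mapper would emit a pair
        System.out.println(arr[1]);      // "apple" becomes the output key
    }
}
```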
2. Writing the Reducer module
Define an inner class MyReduce in ShoppingOrderCount:
public static class MyReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        // Every value is a 1 emitted by the mapper, so counting the values
        // gives the number of orders for this goods item.
        int count = 0;
        for (IntWritable val : values) {
            count++;
        }
        context.write(key, new IntWritable(count));
    }
}
3. Writing the Driver module
Write the main method in the ShoppingOrderCount class:
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length < 2) {
        System.out.println("An input path and an output path are required");
        System.exit(2);
    }
    Job job = Job.getInstance(conf, "order count");
    job.setJarByClass(ShoppingOrderCount.class);
    job.setMapperClass(MyMap.class);
    job.setReducerClass(MyReduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Add the input paths (all arguments except the last)
    for (int i = 0; i < otherArgs.length - 1; i++) {
        FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
    }
    // Set the output path (the last argument)
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[otherArgs.length - 1]));
    // Run the job
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
4. Use Maven to compile and package the project into a jar.
Run the lifecycle steps from top to bottom in IDEA's Maven panel, four steps in all; the jar file is generated under the target directory (equivalently, run mvn clean package from the command line).
5. Copy orderCount-1.0-SNAPSHOT.jar to the master host, e.g. with scp target/orderCount-1.0-SNAPSHOT.jar yt@master:~/.
6. Run the jar
[yt@master ~]$ hadoop jar orderCount-1.0-SNAPSHOT.jar com.maidu.ordercount.ShoppingOrderCount /bigdata/order.txt /output-2301-02/
7. After the job completes, check the results, e.g. with hdfs dfs -cat /output-2301-02/part-r-00000.
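Each line of the reducer output is a goods name and its order count separated by a tab; a hypothetical example (the goods names and counts depend on the actual data):

```text
apple	3
banana	1
orange	1
```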
Note: if the job fails with insufficient virtual memory (an error such as "is running 261401088B beyond the 'VIRTUAL' memory limit. Current usage: 171.0 MB of 1 GB physical memory used"), see the CSDN blog post on this error.
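A commonly used workaround for that error is to relax YARN's virtual-memory check in yarn-site.xml on every node and then restart YARN; this is a sketch, not a tuning recommendation:

```xml
<!-- Disable YARN's virtual-memory check entirely -->
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
<!-- Or raise the allowed virtual-to-physical memory ratio (default 2.1) -->
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
</property>
```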