1、简介
Flume 自带 Source 有 Avro、Thrift、Netcat、Taildir、Kafka、Http等,有些场合比如我们指定访问接口获取数据当做 Flume 的 Source,像这种定制化的 Source 需要我们自己实现,下面我将介绍如何自定义实现 Source。
2、自定义实现 Flume 的 Source
使用自定义 Source ,访问自身的web 服务,并且发送至 logger 的 Sink。
2.1、引入依赖
<dependencies><dependency><groupId>org.apache.flume</groupId><artifactId>flume-ng-core</artifactId><version>1.11.0</version></dependency>
</dependencies><build><plugins><plugin><groupId>org.springframework.boot</groupId><artifactId>spring-boot-maven-plugin</artifactId><version>3.2.1</version><configuration><skip>true</skip></configuration></plugin></plugins>
</build>
2.2、自定义Source
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.PollableSource;
import org.apache.flume.conf.Configurable;
import org.apache.flume.event.EventBuilder;
import org.apache.flume.source.AbstractSource;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;public class MySource extends AbstractSource implements Configurable, PollableSource {private String path;private final static Logger log = LoggerFactory.getLogger(MySource.class);@Overridepublic Status process() throws EventDeliveryException {Status status = null;try{URL url = new URL(this.path);HttpURLConnection connection = (HttpURLConnection) url.openConnection();connection.setRequestMethod("GET");connection.connect();InputStream is = connection.getInputStream();BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(is, "utf-8"));StringBuffer stringBuffer = new StringBuffer();String line = null;while ((line = bufferedReader.readLine()) != null){stringBuffer.append(line);stringBuffer.append("\t");}log.info("test =======================: {}" , stringBuffer.toString());Event event = EventBuilder.withBody(stringBuffer.toString().getBytes());getChannelProcessor().processEvent(event);status = Status.READY;// 2s调用一次Thread.sleep(2000);}catch (Exception ex){log.info("出错了!, {}", ex.getMessage());status = Status.BACKOFF;}return status;}@Overridepublic long getBackOffSleepIncrement() {return 0;}@Overridepublic long getMaxBackOffSleepInterval() {return 0;}@Overridepublic void configure(Context context) {// 从配置文件中读取String path = context.getString("path", "http://baidu.com");log.info("path ==========================: {}", path);this.path = path;}
}
2.3、将自定义Source 打包放到 flume 的 lib目录下
2.4、flume 的配置文件编写
vim flume-self-source.conf
a1.sources = r1
a1.channels = c1
a1.sinks=k1
# source
a1.sources.r1.type = com.weilong.flumeselfdefinition.MySource # 自定义 Source 的全限定类名
a1.sources.r1.path = http://192.168.30.33:8088/hello # 自定义参数
# channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Sink
a1.sinks.k1.type = logger
# bind
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
2.5、测试
2.5.1、启动web服务
2.5.2、启动flume
bin/flume-ng agent -c conf/ -n a1 -f testconf/flume-self-source.conf -Dflume.root.logger=INFO,console
2.5.3、结果
3、总结
本文详细介绍 Flume 如何实现自定义 Source,帮助大家进一步了解Flume的使用。
本人是一个从小白自学计算机技术,对运维、后端、各种中间件技术、大数据等有一定的学习心得,想获取自学总结资料(pdf版本)或者希望共同学习,关注微信公众号:it自学社团。后台回复相应技术名称/技术点即可获得。(本人学习宗旨:学会了就要免费分享)