Java Interview Essentials: JVM + GC Tutorial
In this article, learn how to use Spring Batch with StaxEventItemReader, an ItemReader implementation, to read an XML file and write its data to a NoSQL database.
We will show how to read an XML file with Spring Batch's StaxEventItemReader, and how to write the data to a NoSQL store through a custom ItemWriter backed by a MongoRepository. The NoSQL store used here is MongoDB.
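A custom ItemReader or ItemWriter is a class in which we write our own logic for reading or writing data. A custom reader must also keep track of its own position in the input: the step calls read() once per item and groups the results into chunks, so the reader returns one item per call and null when the input is exhausted. This is handy when the read logic is complex and cannot be handled by the default ItemReader implementations that Spring provides.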
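To make the idea of a custom reader concrete, here is a minimal sketch in plain Java. SimpleItemReader is a hypothetical stand-in that only mirrors the shape of Spring Batch's ItemReader contract (one item per read() call, null at the end of input); ListBackedReader and countItems are illustration names as well, not part of the project in this article.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReaderContractSketch {

    /** Hypothetical stand-in for the shape of an item reader: not the Spring Batch type. */
    interface SimpleItemReader<T> {
        /** Returns the next item, or null once the input is exhausted. */
        T read();
    }

    /** A custom reader that keeps track of its own position in an in-memory list. */
    static class ListBackedReader<T> implements SimpleItemReader<T> {
        private final Iterator<T> iterator;

        ListBackedReader(List<T> items) {
            this.iterator = items.iterator();
        }

        @Override
        public T read() {
            return iterator.hasNext() ? iterator.next() : null;
        }
    }

    /** Drains a reader the way a caller would: keep reading until null comes back. */
    static <T> int countItems(SimpleItemReader<T> reader) {
        int count = 0;
        while (reader.read() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        SimpleItemReader<String> reader =
                new ListBackedReader<String>(Arrays.asList("John", "James", "Jonie"));
        System.out.println(countItems(reader)); // prints 3
    }
}
```

A real custom reader would implement org.springframework.batch.item.ItemReader instead and would usually read from a file, queue, or service rather than a list.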
Tools and libraries used:
1. Maven 3.5+
2. Spring Batch Starter
3. Spring OXM
4. Spring Boot Data MongoDB Starter
5. XStream
Maven dependencies - needed to configure the project.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.2.2.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.example</groupId>
    <artifactId>spring-batch-mongodb</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>spring-batch-mongodb</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <java.version>1.8</java.version>
        <maven-jar-plugin.version>3.1.1</maven-jar-plugin.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-batch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-oxm</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-mongodb</artifactId>
        </dependency>
        <dependency>
            <groupId>com.thoughtworks.xstream</groupId>
            <artifactId>xstream</artifactId>
            <version>1.4.7</version>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.junit.vintage</groupId>
                    <artifactId>junit-vintage-engine</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.springframework.batch</groupId>
            <artifactId>spring-batch-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>com.h2database</groupId>
            <artifactId>h2</artifactId>
            <scope>runtime</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
CustomerWriter - a custom writer that we created to write customer data to MongoDB. A custom writer also gives us room to perform more complex operations.
package com.example.writer;

import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.beans.factory.annotation.Autowired;

import com.example.domain.Customer;
import com.example.repository.CustomerRepository;

public class CustomerWriter implements ItemWriter<Customer> {

    @Autowired
    private CustomerRepository customerRepository;

    // Called once per chunk with all items read for that chunk.
    @Override
    public void write(List<? extends Customer> customers) throws Exception {
        customerRepository.saveAll(customers);
    }
}
CustomerRepository - a Mongo repository that talks to the Mongo database and performs the operations needed to store and retrieve data.
package com.example.repository;

import org.springframework.data.mongodb.repository.MongoRepository;

import com.example.domain.Customer;

// The ID type parameter matches the Long id field declared on Customer.
public interface CustomerRepository extends MongoRepository<Customer, Long> {
}
Customer - a Mongo document class that holds the business data.
package com.example.domain;

import java.time.LocalDate;

import javax.xml.bind.annotation.XmlRootElement;

import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;
import org.springframework.data.mongodb.core.mapping.Field;

import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

@AllArgsConstructor
@NoArgsConstructor
@Builder
@Data
@XmlRootElement(name = "Customer")
@Document
public class Customer {

    @Id
    private Long id;

    @Field
    private String firstName;

    @Field
    private String lastName;

    @Field
    private LocalDate birthdate;
}
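CustomerConverter - implements XStream's Converter interface. A Converter is responsible for marshalling Java objects to and from text data. If an exception occurs during processing, a ConversionException should be thrown. When using the high-level com.thoughtworks.xstream.XStream facade, new converters can be registered with the XStream.registerConverter() method.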
package com.example.config;

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

import com.example.domain.Customer;
import com.thoughtworks.xstream.converters.Converter;
import com.thoughtworks.xstream.converters.MarshallingContext;
import com.thoughtworks.xstream.converters.UnmarshallingContext;
import com.thoughtworks.xstream.io.HierarchicalStreamReader;
import com.thoughtworks.xstream.io.HierarchicalStreamWriter;

public class CustomerConverter implements Converter {

    private static final DateTimeFormatter DT_FORMATTER = DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss");

    @Override
    public boolean canConvert(Class type) {
        return type.equals(Customer.class);
    }

    @Override
    public void marshal(Object source, HierarchicalStreamWriter writer, MarshallingContext context) {
        // Don't do anything: this job only reads XML.
    }

    // Reads the child elements in document order: id, firstName, lastName, birthdate.
    @Override
    public Object unmarshal(HierarchicalStreamReader reader, UnmarshallingContext context) {
        Customer customer = new Customer();

        reader.moveDown();
        customer.setId(Long.valueOf(reader.getValue()));
        reader.moveUp();

        reader.moveDown();
        customer.setFirstName(reader.getValue());
        reader.moveUp();

        reader.moveDown();
        customer.setLastName(reader.getValue());
        reader.moveUp();

        reader.moveDown();
        // The time part of the input is parsed but discarded by LocalDate.
        customer.setBirthdate(LocalDate.parse(reader.getValue(), DT_FORMATTER));
        // Return to the <customer> element so every moveDown() has a matching moveUp().
        reader.moveUp();

        return customer;
    }
}
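The converter parses a date-and-time string into a LocalDate, which can look surprising at first. The sketch below isolates that one step using only java.time, with the same pattern string as DT_FORMATTER; the class and method names are made up for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class BirthdateParsingSketch {

    // Same pattern string as DT_FORMATTER in CustomerConverter.
    private static final DateTimeFormatter DT_FORMATTER =
            DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss");

    /** Parses a date-time string and keeps only the date part. */
    static LocalDate parseBirthdate(String text) {
        return LocalDate.parse(text, DT_FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parseBirthdate("10-10-1988 19:43:23")); // prints 1988-10-10
    }
}
```

Input that does not match the pattern raises a DateTimeParseException, so a malformed birthdate in the XML would fail the item rather than be stored silently.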
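JobConfiguration - the main class responsible for running the batch job. In it we define several beans, each performing a separate task.
StaxEventItemReader - an item reader for StAX-based XML input. It extracts fragments from the input XML document, each corresponding to one record to be processed. Each fragment is wrapped in StartDocument and EndDocument events so that it can be processed further like a standalone XML document. The implementation is not thread-safe.
CustomerWriter - the custom class that writes data to MongoDB.
step1 - this step configures the ItemReader and the ItemWriter. An ItemProcessor is optional, and we have skipped it.
Job - the batch domain object representing a job. Job is an explicit abstraction representing the job configuration specified by the developer. Note that the restart policy applies to the job as a whole, not to individual steps.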
package com.example.config;

import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.xml.StaxEventItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.oxm.xstream.XStreamMarshaller;

import com.example.domain.Customer;
import com.example.writer.CustomerWriter;

@Configuration
public class JobConfiguration {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public StaxEventItemReader<Customer> customerItemReader() {
        // Map the <customer> element name to the Customer class.
        Map<String, Class<?>> aliases = new HashMap<>();
        aliases.put("customer", Customer.class);

        CustomerConverter converter = new CustomerConverter();

        XStreamMarshaller unmarshaller = new XStreamMarshaller();
        unmarshaller.setAliases(aliases);
        unmarshaller.setConverters(converter);

        StaxEventItemReader<Customer> reader = new StaxEventItemReader<>();
        reader.setResource(new ClassPathResource("/data/customer.xml"));
        // Each <customer> element is one fragment, that is, one item.
        reader.setFragmentRootElementName("customer");
        reader.setUnmarshaller(unmarshaller);
        return reader;
    }

    @Bean
    public CustomerWriter customerWriter() {
        return new CustomerWriter();
    }

    @Bean
    public Step step1() throws Exception {
        return stepBuilderFactory.get("step1")
                .<Customer, Customer>chunk(200)
                .reader(customerItemReader())
                .writer(customerWriter())
                .build();
    }

    @Bean
    public Job job() throws Exception {
        return jobBuilderFactory.get("job")
                .start(step1())
                .build();
    }
}
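The chunk(200) setting in step1 decides how many items are buffered before the writer is called. The following plain-Java sketch imitates that read-then-write loop so the effect of the chunk size is visible. It is a simplification under stated assumptions: processInChunks is a hypothetical helper, an Iterator stands in for the reader and a Consumer for the writer, and the transaction that Spring Batch wraps around each chunk is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class ChunkLoopSketch {

    /**
     * Reads items one at a time, buffers them, and hands each full buffer
     * (plus a final partial one) to the writer. Returns the number of write calls.
     */
    static <T> int processInChunks(Iterator<T> reader, int chunkSize, Consumer<List<T>> writer) {
        int writes = 0;
        List<T> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(reader.next());
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk));
                chunk.clear();
                writes++;
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(new ArrayList<>(chunk));
            writes++;
        }
        return writes;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12);
        // 12 items with chunk size 200: everything fits into a single write.
        System.out.println(processInChunks(ids.iterator(), 200, chunk -> { })); // prints 1
        // 12 items with chunk size 5: chunks of 5, 5 and 2.
        System.out.println(processInChunks(ids.iterator(), 5, chunk -> { })); // prints 3
    }
}
```

With the twelve sample customers and a chunk size of 200, the writer in this article is therefore invoked once with all twelve items.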
application.properties
spring.data.mongodb.host=localhost
spring.data.mongodb.port=27017
customer.xml - the sample data that Spring Batch reads, loaded from the classpath at data/customer.xml.
<?xml version="1.0" encoding="UTF-8" ?>
<customers>
    <customer><id>1</id><firstName>John</firstName><lastName>Doe</lastName><birthdate>10-10-1988 19:43:23</birthdate></customer>
    <customer><id>2</id><firstName>James</firstName><lastName>Moss</lastName><birthdate>01-04-1991 10:20:23</birthdate></customer>
    <customer><id>3</id><firstName>Jonie</firstName><lastName>Gamble</lastName><birthdate>21-07-1982 11:12:13</birthdate></customer>
    <customer><id>4</id><firstName>Mary</firstName><lastName>Kline</lastName><birthdate>07-08-1973 11:27:42</birthdate></customer>
    <customer><id>5</id><firstName>William</firstName><lastName>Lockhart</lastName><birthdate>04-04-1994 04:15:11</birthdate></customer>
    <customer><id>6</id><firstName>John</firstName><lastName>Doe</lastName><birthdate>10-10-1988 19:43:23</birthdate></customer>
    <customer><id>7</id><firstName>Kristi</firstName><lastName>Dukes</lastName><birthdate>17-09-1983 21:22:23</birthdate></customer>
    <customer><id>8</id><firstName>Angel</firstName><lastName>Porter</lastName><birthdate>15-12-1980 18:09:09</birthdate></customer>
    <customer><id>9</id><firstName>Mary</firstName><lastName>Johnston</lastName><birthdate>07-07-1987 19:43:03</birthdate></customer>
    <customer><id>10</id><firstName>Linda</firstName><lastName>Rodriguez</lastName><birthdate>16-09-1991 09:13:43</birthdate></customer>
    <customer><id>11</id><firstName>Phillip</firstName><lastName>Lopez</lastName><birthdate>18-12-1965 11:10:09</birthdate></customer>
    <customer><id>12</id><firstName>Peter</firstName><lastName>Dixon</lastName><birthdate>09-05-1996 19:09:23</birthdate></customer>
</customers>
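To get a feel for what reading fragments from this file means at the StAX level, here is a small sketch that uses only the JDK's javax.xml.stream API. It is not how StaxEventItemReader is implemented; it just walks the event stream, notices each customer element, and pulls out one field. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class StaxFragmentSketch {

    /** Collects the text of every firstName element found inside a customer element. */
    static List<String> firstNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLEventReader events = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            boolean insideCustomer = false;
            while (events.hasNext()) {
                XMLEvent event = events.nextEvent();
                if (event.isStartElement()) {
                    String tag = event.asStartElement().getName().getLocalPart();
                    if (tag.equals("customer")) {
                        insideCustomer = true; // start of one record
                    } else if (insideCustomer && tag.equals("firstName")) {
                        names.add(events.getElementText());
                    }
                } else if (event.isEndElement()
                        && event.asEndElement().getName().getLocalPart().equals("customer")) {
                    insideCustomer = false; // end of one record
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML input", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<customers>"
                + "<customer><id>1</id><firstName>John</firstName></customer>"
                + "<customer><id>2</id><firstName>James</firstName></customer>"
                + "</customers>";
        System.out.println(firstNames(xml)); // prints [John, James]
    }
}
```

Because the input is consumed as a stream of events, only the current record needs to be held in memory, which is why a StAX-based reader suits large XML files.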
MainApp - SpringBatchMongodbApplication can be run as a Spring Boot application.
package com.example;

import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;
import org.springframework.data.mongodb.repository.config.EnableMongoRepositories;

@SpringBootApplication(exclude = {DataSourceAutoConfiguration.class})
@EnableBatchProcessing
@EnableMongoRepositories(basePackages = "com.example.repository")
public class SpringBatchMongodbApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBatchMongodbApplication.class, args);
    }
}
Output: we can conclude that Spring Batch read the data and wrote it to MongoDB using the expected schema/document type.
db.getCollection('customer').find({})
/* 1 */
{"_id" : NumberLong(1),"firstName" : "John","lastName" : "Doe","birthdate" : ISODate("1988-10-09T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 2 */
{"_id" : NumberLong(2),"firstName" : "James","lastName" : "Moss","birthdate" : ISODate("1991-03-31T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 3 */
{"_id" : NumberLong(3),"firstName" : "Jonie","lastName" : "Gamble","birthdate" : ISODate("1982-07-20T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 4 */
{"_id" : NumberLong(4),"firstName" : "Mary","lastName" : "Kline","birthdate" : ISODate("1973-08-06T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 5 */
{"_id" : NumberLong(5),"firstName" : "William","lastName" : "Lockhart","birthdate" : ISODate("1994-04-03T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 6 */
{"_id" : NumberLong(6),"firstName" : "John","lastName" : "Doe","birthdate" : ISODate("1988-10-09T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 7 */
{"_id" : NumberLong(7),"firstName" : "Kristi","lastName" : "Dukes","birthdate" : ISODate("1983-09-16T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 8 */
{"_id" : NumberLong(8),"firstName" : "Angel","lastName" : "Porter","birthdate" : ISODate("1980-12-14T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 9 */
{"_id" : NumberLong(9),"firstName" : "Mary","lastName" : "Johnston","birthdate" : ISODate("1987-07-06T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 10 */
{"_id" : NumberLong(10),"firstName" : "Linda","lastName" : "Rodriguez","birthdate" : ISODate("1991-09-15T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 11 */
{"_id" : NumberLong(11),"firstName" : "Phillip","lastName" : "Lopez","birthdate" : ISODate("1965-12-17T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
/* 12 */
{"_id" : NumberLong(12),"firstName" : "Peter","lastName" : "Dixon","birthdate" : ISODate("1996-05-08T18:30:00.000Z"),"_class" : "com.example.domain.Customer"}
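The stored birthdate values fall on the previous day at 18:30 UTC rather than on the date from the XML. A likely explanation, offered here as an assumption rather than something the output proves, is that the LocalDate was converted to an instant at local midnight in the JVM's default time zone, and that the job ran on a machine at UTC+05:30. The sketch below reproduces the arithmetic with java.time only; the class and method names are illustrative.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class BirthdateZoneSketch {

    /** Converts a date to the instant of its local midnight in the given zone. */
    static Instant startOfDay(LocalDate date, ZoneId zone) {
        return date.atStartOfDay(zone).toInstant();
    }

    public static void main(String[] args) {
        // Assumption: the machine that ran the job used an offset of UTC+05:30.
        ZoneId assumedZone = ZoneOffset.ofHoursMinutes(5, 30);
        System.out.println(startOfDay(LocalDate.of(1988, 10, 10), assumedZone));
        // prints 1988-10-09T18:30:00Z
    }
}
```

Running the same job on a machine set to UTC would, under this assumption, store 1988-10-10T00:00:00Z instead, so the stored instant depends on where the job runs.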