In this post, we will show you how to configure a Spring Batch job to read data from multiple CSV files and write it into a single CSV file.
Project structure
This is the directory structure of a standard Gradle project.
Project dependencies
task wrapper(type: Wrapper) {
  gradleVersion = '3.2.1'
}

apply plugin: 'java'
apply plugin: 'eclipse'
apply plugin: 'org.springframework.boot'

sourceCompatibility = 1.8

repositories {
  mavenLocal()
  mavenCentral()
}

dependencies {
  compileOnly('org.projectlombok:lombok:1.16.12')
  compile('org.springframework.boot:spring-boot-starter-batch:1.5.2.RELEASE')
  testCompile('org.springframework.boot:spring-boot-starter-test:1.5.2.RELEASE')
}

buildscript {
  repositories {
    mavenLocal()
    jcenter()
  }
  dependencies {
    classpath "org.springframework.boot:spring-boot-gradle-plugin:1.5.2.RELEASE"
  }
}
application.properties file
#empty
Spring Batch Jobs
CSV files
Three input files are placed under csv/inputs/ so that they match the domain*.csv pattern used by the job:

1,facebook.com
2,yahoo.com
3,google.com

200,walkingtechie.blogspot.com
300,stackoverflow.com
400,oracle.com

999,eclipse.org
888,baidu.com
777,twitter.com
Create a job that reads from multiple CSV files and writes into a single CSV file.
package com.walking.techie.multiresource.jobs;

import com.walking.techie.multiresource.model.Domain;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor;
import org.springframework.batch.item.file.transform.DelimitedLineAggregator;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.core.io.Resource;

@Configuration
@EnableBatchProcessing
public class ReadMultiFileJob {

  @Autowired
  private JobBuilderFactory jobBuilderFactory;

  @Autowired
  private StepBuilderFactory stepBuilderFactory;

  @Value("csv/inputs/domain*.csv")
  private Resource[] resources;

  @Bean
  public Job readFiles() {
    return jobBuilderFactory.get("readFiles").incrementer(new RunIdIncrementer())
        .flow(step1()).end().build();
  }

  @Bean
  public Step step1() {
    return stepBuilderFactory.get("step1").<Domain, Domain>chunk(10)
        .reader(multiResourceItemReader()).writer(writer()).build();
  }

  @Bean
  public MultiResourceItemReader<Domain> multiResourceItemReader() {
    MultiResourceItemReader<Domain> resourceItemReader = new MultiResourceItemReader<>();
    resourceItemReader.setResources(resources);
    resourceItemReader.setDelegate(reader());
    return resourceItemReader;
  }

  @Bean
  public FlatFileItemReader<Domain> reader() {
    FlatFileItemReader<Domain> reader = new FlatFileItemReader<>();
    reader.setLineMapper(new DefaultLineMapper<Domain>() {{
      setLineTokenizer(new DelimitedLineTokenizer() {{
        setNames(new String[]{"id", "domain"});
      }});
      setFieldSetMapper(new BeanWrapperFieldSetMapper<Domain>() {{
        setTargetType(Domain.class);
      }});
    }});
    return reader;
  }

  @Bean
  public FlatFileItemWriter<Domain> writer() {
    FlatFileItemWriter<Domain> writer = new FlatFileItemWriter<>();
    writer.setResource(new FileSystemResource("output/domain.all.csv"));
    writer.setLineAggregator(new DelimitedLineAggregator<Domain>() {{
      setDelimiter(",");
      setFieldExtractor(new BeanWrapperFieldExtractor<Domain>() {{
        setNames(new String[]{"id", "domain"});
      }});
    }});
    return writer;
  }
}
This job maps each CSV record to a Domain object and writes all records to one CSV file.
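Conceptually, the step does little more than concatenate the records of every matched input file into one output file. A minimal plain-Java sketch of that merge logic, with no Spring Batch dependency (the CsvMergeSketch class and its method name are illustrative, not part of the article's code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class CsvMergeSketch {

  // Read every input CSV in order and collect all records into one list,
  // mirroring MultiResourceItemReader delegating each file to the item reader.
  public static List<String> merge(List<Path> inputs) throws IOException {
    List<String> allLines = new ArrayList<>();
    for (Path input : inputs) {
      allLines.addAll(Files.readAllLines(input));
    }
    return allLines;
  }
}
```

Writing the merged lines back out (for example with Files.write) corresponds to what FlatFileItemWriter does in the job.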
A Java model class
package com.walking.techie.multiresource.model;

import lombok.Data;

@Data
public class Domain {
  int id;
  String domain;
}
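The DelimitedLineTokenizer and BeanWrapperFieldSetMapper pair in the reader amounts to splitting each line on a comma and binding the fields by name to the model. A dependency-free sketch of that mapping (the DomainLineParser class and its parseLine helper are hand-rolled for illustration, not part of the article's code):

```java
public class DomainLineParser {

  // Plain-Java stand-in for the Lombok-annotated Domain model.
  public static class Domain {
    int id;
    String domain;
  }

  // Split an "id,domain" line on the first comma and bind both fields,
  // roughly what DelimitedLineTokenizer + BeanWrapperFieldSetMapper do.
  public static Domain parseLine(String line) {
    String[] tokens = line.split(",", 2);
    Domain d = new Domain();
    d.id = Integer.parseInt(tokens[0].trim());
    d.domain = tokens[1].trim();
    return d;
  }
}
```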
Run Application
DataSourceAutoConfiguration is excluded so the application can start without a database; Spring Batch then falls back to an in-memory job repository.

package com.walking.techie;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;

@SpringBootApplication(exclude = DataSourceAutoConfiguration.class)
public class Application {

  public static void main(String[] args) {
    SpringApplication.run(Application.class, args);
  }
}
Output
The output of the application is written to output/domain.all.csv:
1,facebook.com
2,yahoo.com
3,google.com
200,walkingtechie.blogspot.com
300,stackoverflow.com
400,oracle.com
999,eclipse.org
888,baidu.com
777,twitter.com
Note: This code was compiled and run on a Mac notebook with IntelliJ IDEA.
Comments

Reader: The piece of code below will not work and will throw an exception:

@Value("csv/inputs/domain*.csv")
private Resource[] resources;

It should inject a String[] instead, then iterate over it and convert each entry into a Resource:

@Value("csv/inputs/domain*.csv")
private String[] resources;

List<Resource> resourceList = new ArrayList<>();
for (String file : resources) {
  UrlResource urlResource = new UrlResource(file);
  resourceList.add(urlResource);
}

Author: The code in the article works as expected; this is not required.

John Doe: It was throwing an error.

Author: Can you share your stack trace?