Spring Batch Tasklet example using spring boot - Walking Techie

Tuesday, March 28, 2017

Spring Batch Tasklet example using spring boot

Tasklet is an interface in Spring Batch. It is used to perform a single operation inside a step, typically setting up or cleaning up resources before or after another step runs.
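To make the contract concrete, here is a minimal plain-Java sketch of how a step drives a tasklet: it keeps calling the tasklet until the tasklet reports that it is finished. The `SimpleTasklet` interface, the `Status` enum and the `runStep` method are hypothetical stand-ins written for this illustration. They only mimic the roles of Spring Batch's `Tasklet` and `RepeatStatus`, so the sketch runs without any Spring dependency.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class TaskletContractSketch {

  // Hypothetical stand-in for Spring Batch's RepeatStatus.
  enum Status { CONTINUABLE, FINISHED }

  // Hypothetical stand-in for the Tasklet interface: one unit of work per call.
  interface SimpleTasklet {
    Status execute();
  }

  // Calls the tasklet until it reports FINISHED and returns the number of calls made.
  static int runStep(SimpleTasklet tasklet) {
    int calls = 0;
    Status status;
    do {
      status = tasklet.execute();
      calls++;
    } while (status == Status.CONTINUABLE);
    return calls;
  }

  public static void main(String[] args) {
    Deque<String> pending = new ArrayDeque<>(Arrays.asList("a.tmp", "b.tmp", "c.tmp"));

    // Handles one entry per call and finishes once nothing is left.
    SimpleTasklet cleanup = () -> {
      pending.poll();
      return pending.isEmpty() ? Status.FINISHED : Status.CONTINUABLE;
    };

    System.out.println("execute() was called " + runStep(cleanup) + " times");
  }
}
```

The tasklet shown later in this post does all of its work in one call and returns `RepeatStatus.FINISHED` right away, which is the common case for cleanup tasks.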

Project structure

The project follows the standard Gradle directory structure.

Figure: Spring Batch Tasklet project structure

Project dependencies

// The buildscript block makes the Spring Boot Gradle plugin available to this script
buildscript {
    repositories {
        mavenLocal()
        jcenter()
    }
    dependencies {
        classpath "org.springframework.boot:spring-boot-gradle-plugin:1.5.2.RELEASE"
    }
}

apply plugin: 'java'
apply plugin: 'eclipse'
apply plugin: 'org.springframework.boot'

version = '0.0.1'
sourceCompatibility = 1.8

repositories {
    mavenLocal()
    mavenCentral()
}

dependencies {
    compileOnly('org.projectlombok:lombok:1.16.12')
    compile('org.springframework.boot:spring-boot-starter-batch:1.5.2.RELEASE')
    testCompile('org.springframework.boot:spring-boot-starter-test:1.5.2.RELEASE')
}

task wrapper(type: Wrapper) {
    gradleVersion = '3.2.1'
}

application.properties file

#empty

Spring Batch Jobs

CSV files. The sample records below are spread across the input files that match csv/domain*.csv:

1,facebook.com
2,yahoo.com
3,google.com
200,walkingtechie.blogspot.com
300,stackoverflow.com
400,oracle.com
999,eclipse.org
888,baidu.com
777,twitter.com

Create a job with two steps. The first step reads from multiple CSV files and writes their records into a single CSV file. The second step deletes the input CSV files.
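Before looking at the Spring Batch configuration, it may help to see the same two-step flow in plain Java. This is only a sketch of what the job achieves, not how Spring Batch implements it. The class and method names (`TwoStepFlowSketch`, `mergeCsvFiles`, `deleteFiles`, `runJob`, `createSampleInput`) and the sample file names are assumptions made up for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class TwoStepFlowSketch {

  // Step 1: collect the lines of every file matching the glob and write them to one file.
  static List<String> mergeCsvFiles(Path inputDir, String glob, Path output) {
    try {
      List<Path> inputs = new ArrayList<>();
      try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir, glob)) {
        for (Path input : stream) {
          inputs.add(input);
        }
      }
      Collections.sort(inputs); // directory listing order is unspecified, so sort by path
      List<String> merged = new ArrayList<>();
      for (Path input : inputs) {
        merged.addAll(Files.readAllLines(input, StandardCharsets.UTF_8));
      }
      if (output.getParent() != null) {
        Files.createDirectories(output.getParent());
      }
      Files.write(output, merged, StandardCharsets.UTF_8);
      return merged;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Step 2: delete every entry in the input directory.
  static void deleteFiles(Path inputDir) {
    try {
      List<Path> entries = new ArrayList<>();
      try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir)) {
        for (Path entry : stream) {
          entries.add(entry);
        }
      }
      for (Path entry : entries) {
        Files.delete(entry); // throws if the entry cannot be deleted
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Runs the steps in order; step 2 is only reached if step 1 did not throw.
  static List<String> runJob(Path inputDir, Path output) {
    List<String> merged = mergeCsvFiles(inputDir, "domain*.csv", output);
    deleteFiles(inputDir);
    return merged;
  }

  // Hypothetical sample input: a temporary directory holding two small CSV files.
  static Path createSampleInput() {
    try {
      Path dir = Files.createTempDirectory("csv");
      Files.write(dir.resolve("domain-1.csv"),
          Arrays.asList("1,facebook.com", "2,yahoo.com"), StandardCharsets.UTF_8);
      Files.write(dir.resolve("domain-2.csv"),
          Arrays.asList("3,google.com"), StandardCharsets.UTF_8);
      return dir;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path inputDir = createSampleInput();
    Path output = Files.createTempDirectory("output").resolve("domain.all.csv");
    System.out.println(runJob(inputDir, output));
  }
}
```

In this sketch the output file has to live outside the input directory, otherwise the second step would delete it as well. The job in this post follows the same layout, with inputs under csv/ and the result under output/.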

package com.walking.techie.taskletstep.jobs;

import com.walking.techie.taskletstep.model.Domain;
import com.walking.techie.taskletstep.tasklet.FileDeletingTasklet;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor;
import org.springframework.batch.item.file.transform.DelimitedLineAggregator;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;

@Configuration
@EnableBatchProcessing
public class ReadMultiFileJob {

  @Autowired
  private JobBuilderFactory jobBuilderFactory;
  @Autowired
  private StepBuilderFactory stepBuilderFactory;

  @Value("csv/domain*.csv")
  private Resource[] resources;

  @Value("output/domain.all.csv")
  private Resource resource;

  @Value("csv/")
  private Resource directory;

  @Bean
  public Job readFiles() {
    return jobBuilderFactory.get("readFiles").incrementer(new RunIdIncrementer()).
        flow(step1()).next(step2()).end().build();
  }

  @Bean
  public Step step1() {
    return stepBuilderFactory.get("step1").<Domain, Domain>chunk(10)
        .reader(multiResourceItemReader()).writer(writer()).build();
  }

  @Bean
  public Step step2() {
    return stepBuilderFactory.get("step2").tasklet(fileDeletingTasklet()).build();
  }

  @Bean
  public FileDeletingTasklet fileDeletingTasklet() {
    FileDeletingTasklet tasklet = new FileDeletingTasklet();
    tasklet.setDirectory(directory);
    return tasklet;
  }

  @Bean
  public MultiResourceItemReader<Domain> multiResourceItemReader() {
    MultiResourceItemReader<Domain> resourceItemReader = new MultiResourceItemReader<Domain>();
    resourceItemReader.setResources(resources);
    resourceItemReader.setDelegate(reader());
    return resourceItemReader;
  }

  @Bean
  public FlatFileItemReader<Domain> reader() {
    FlatFileItemReader<Domain> reader = new FlatFileItemReader<Domain>();
    reader.setLineMapper(new DefaultLineMapper<Domain>() {{
      setLineTokenizer(new DelimitedLineTokenizer() {{
        setNames(new String[]{"id", "domain"});
      }});
      setFieldSetMapper(new BeanWrapperFieldSetMapper<Domain>() {{
        setTargetType(Domain.class);
      }});
    }});
    return reader;
  }

  @Bean
  public FlatFileItemWriter<Domain> writer() {
    FlatFileItemWriter<Domain> writer = new FlatFileItemWriter<>();
    writer.setResource(resource);
    writer.setLineAggregator(new DelimitedLineAggregator<Domain>() {{
      setDelimiter(",");
      setFieldExtractor(new BeanWrapperFieldExtractor<Domain>() {{
        setNames(new String[]{"id", "domain"});
      }});
    }});
    return writer;
  }
}

The reader maps the values of each CSV line to a Domain object, and the writer writes all Domain objects to one CSV file.
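As a rough illustration of that mapping, the plain-Java sketch below splits a line on the comma, binds the two tokens to the fields id and domain, and joins them back into a line. It is a simplified stand-in for what the tokenizer, field set mapper and line aggregator do together. The `LineMappingSketch` class, its nested `Domain` class and both helper methods are hypothetical, and quoted fields are deliberately not handled here.

```java
public class LineMappingSketch {

  // Hypothetical plain-Java counterpart of the Domain model used in this post.
  static final class Domain {
    final int id;
    final String domain;

    Domain(int id, String domain) {
      this.id = id;
      this.domain = domain;
    }
  }

  // Tokenizes the line on the comma and binds the tokens to "id" and "domain".
  static Domain mapLine(String line) {
    String[] tokens = line.split(",", -1);
    if (tokens.length != 2) {
      throw new IllegalArgumentException(
          "Expected 2 fields but found " + tokens.length + ": " + line);
    }
    return new Domain(Integer.parseInt(tokens[0].trim()), tokens[1].trim());
  }

  // Joins the field values with the delimiter, in the same order they were read.
  static String aggregate(Domain item) {
    return item.id + "," + item.domain;
  }

  public static void main(String[] args) {
    Domain item = mapLine("200,walkingtechie.blogspot.com");
    System.out.println(item.id + " -> " + item.domain);
    System.out.println(aggregate(item));
  }
}
```

Because the reader and the writer use the same field names in the same order, a record that passes through the job comes out unchanged.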

A Java model class

package com.walking.techie.taskletstep.model;

import lombok.Data;

@Data
public class Domain {

  // Lombok's @Data generates the getters and setters used by the reader and writer
  private int id;
  private String domain;
}

Tasklet: the Java code below deletes all the files in the given directory.

package com.walking.techie.taskletstep.tasklet;

import java.io.File;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.UnexpectedJobExecutionException;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.core.io.Resource;
import org.springframework.util.Assert;

public class FileDeletingTasklet implements Tasklet, InitializingBean {

  private Resource directory;

  @Override
  public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext)
      throws Exception {
    File dir = directory.getFile();
    Assert.state(dir.isDirectory(), "directory must point to an existing directory");

    // listFiles() returns null if the directory cannot be read
    File[] files = dir.listFiles();
    Assert.state(files != null, "could not list files in " + dir.getPath());
    for (File file : files) {
      // fail the step as soon as one file cannot be deleted
      if (!file.delete()) {
        throw new UnexpectedJobExecutionException("Could not delete file " + file.getPath());
      }
      System.out.println(file.getPath() + " is deleted!");
    }
    return RepeatStatus.FINISHED;
  }

  @Override
  public void afterPropertiesSet() throws Exception {
    Assert.notNull(directory, "directory must be set");
  }

  public Resource getDirectory() {
    return directory;
  }

  public void setDirectory(Resource directory) {
    this.directory = directory;
  }
}
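One behavior worth knowing: `File.listFiles()` also returns subdirectories, and `File.delete()` returns false for a directory that is not empty, so the tasklet above would fail the step if the directory contained a non-empty subdirectory. The plain-Java sketch below shows a variant that deletes only regular files and leaves subdirectories alone. `RegularFileCleanupSketch`, `deleteRegularFiles`, `createSampleDirectory` and the sample file names are hypothetical and not part of the original project.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class RegularFileCleanupSketch {

  // Deletes only the regular files in dir, skips subdirectories, and returns the count.
  static int deleteRegularFiles(File dir) {
    File[] entries = dir.listFiles();
    if (entries == null) {
      throw new IllegalStateException(dir + " is not a readable directory");
    }
    int deleted = 0;
    for (File entry : entries) {
      if (!entry.isFile()) {
        continue; // leave subdirectories untouched
      }
      if (!entry.delete()) {
        throw new IllegalStateException("Could not delete file " + entry.getPath());
      }
      deleted++;
    }
    return deleted;
  }

  // Hypothetical sample: two CSV files plus a non-empty subdirectory named "archive".
  static File createSampleDirectory() {
    try {
      File dir = Files.createTempDirectory("csv").toFile();
      byte[] content = "1,facebook.com\n".getBytes(StandardCharsets.UTF_8);
      Files.write(new File(dir, "domain-1.csv").toPath(), content);
      Files.write(new File(dir, "domain-2.csv").toPath(), content);
      File sub = new File(dir, "archive");
      Files.createDirectories(sub.toPath());
      Files.write(new File(sub, "old.csv").toPath(), content);
      return dir;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    File dir = createSampleDirectory();
    File sub = new File(dir, "archive");
    System.out.println("delete() on non-empty directory: " + sub.delete());
    System.out.println("regular files deleted: " + deleteRegularFiles(dir));
    System.out.println("subdirectory still exists: " + sub.isDirectory());
  }
}
```

Whether skipping subdirectories is the right choice depends on the job. For the flat csv/ directory used in this post, both versions behave the same.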

Run Application

package com.walking.techie;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;

@SpringBootApplication(exclude = DataSourceAutoConfiguration.class)
public class Application {

  public static void main(String[] args) {
    SpringApplication.run(Application.class, args);
  }
}

Output

The application writes its output to build/resources/main/output/domain.all.csv:
1,facebook.com
2,yahoo.com
3,google.com
200,walkingtechie.blogspot.com
300,stackoverflow.com
400,oracle.com
999,eclipse.org
888,baidu.com
777,twitter.com

Console output

2017-03-26 13:11:31.438  INFO 9585 --- [           main] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=readFiles]] launched with the following parameters: [{run.id=1}]
2017-03-26 13:11:31.455  INFO 9585 --- [           main] o.s.batch.core.job.SimpleStepHandler     : Executing step: [step1]
2017-03-26 13:11:31.498  INFO 9585 --- [           main] o.s.batch.core.job.SimpleStepHandler     : Executing step: [step2]
/Users/santoshkumar/Desktop/blogger/Spring boot/spring-batch-tasklet/build/resources/main/csv/domain-1-03-2017.csv is deleted!
/Users/santoshkumar/Desktop/blogger/Spring boot/spring-batch-tasklet/build/resources/main/csv/domain-2-03-2017.csv is deleted!
/Users/santoshkumar/Desktop/blogger/Spring boot/spring-batch-tasklet/build/resources/main/csv/domain-3-03-2017.csv is deleted!
2017-03-26 13:11:31.506  INFO 9585 --- [           main] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=readFiles]] completed with the following parameters: [{run.id=1}] and the following status: [COMPLETED]

Note: This code was compiled and run on a Mac notebook using IntelliJ IDEA.
