Change shapefile registration method

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Spring Boot 3.5.7 CLI application that converts PostgreSQL PostGIS spatial data to ESRI shapefiles and GeoJSON. It processes AI inference results from the KAMCO database and generates geographic data files for visualization in GIS applications, using **Spring Batch** for memory-efficient processing of large datasets (1M+ records) and supporting automatic GeoServer layer registration via REST API.

**Key Features**:

- Memory-optimized batch processing (90-95% reduction: 2-13GB → 150-200MB)
- Chunk-based streaming with cursor pagination (fetch-size: 1000)
- Automatic geometry validation and type conversion (MultiPolygon → Polygon)
- Coordinate system validation (EPSG:5186 Korean 2000 / Central Belt)
- Dual execution modes: Spring Batch (recommended) and Legacy mode

## Build and Run Commands

### Build

```bash
./gradlew build                 # Full build with tests
./gradlew clean build -x test   # Skip tests
./gradlew spotlessApply         # Apply Google Java Format (2-space indentation)
./gradlew spotlessCheck         # Verify formatting without applying
```

Output: `build/libs/shp-exporter.jar` (fixed name configured in the `bootJar` task, no version suffix)

### Run Application

#### Spring Batch Mode (Recommended)

```bash
# Generate shapefile + GeoJSON
./gradlew bootRun --args="--batch --converter.batch-ids[0]=252"

# With GeoServer registration
export GEOSERVER_USERNAME=admin
export GEOSERVER_PASSWORD=geoserver
./gradlew bootRun --args="--batch --geoserver.enabled=true --converter.batch-ids[0]=252"

# Using JAR (production)
java -jar build/libs/shp-exporter.jar \
  --batch \
  --converter.inference-id=D5E46F60FC40B1A8BE0CD1F3547AA6 \
  --converter.batch-ids[0]=252 \
  --converter.batch-ids[1]=253
```

#### Legacy Mode (Small Datasets Only)

```bash
./gradlew bootRun   # No --batch flag
# Warning: may OOM on large datasets
```

#### Upload Shapefile to GeoServer

…

By default, the application runs with `spring.profiles.active=prod` (set in `application.yml`).

## Architecture

### Dual Execution Modes

The application supports two execution modes with distinct processing pipelines:

#### Spring Batch Mode (Recommended)

**Trigger**: `--batch` flag
**Use Case**: Large datasets (100K+ records), production workloads
**Memory**: 150-200MB constant (chunk-based streaming)

**Pipeline Flow**:

```
ConverterCommandLineRunner
→ JobLauncher.run(mergedModeJob)
  → Step 1: GeometryTypeValidationTasklet (validates geometry homogeneity)
  → Step 2: generateShapefileStep (chunk-oriented)
    → JdbcCursorItemReader (fetch-size: 1000)
    → FeatureConversionProcessor (InferenceResult → SimpleFeature)
    → StreamingShapefileWriter (chunk-based append)
  → Step 3: generateGeoJsonStep (chunk-oriented, same pattern)
  → Step 4: CreateZipTasklet (creates .zip for GeoServer)
  → Step 5: GeoServerRegistrationTasklet (conditional, if --geoserver.enabled=true)
  → Step 6: generateMapIdFilesStep (partitioned, sequential map_id processing)
```

**Key Components**:

- `JdbcCursorItemReader`: Cursor-based streaming (no full result set loading)
- `StreamingShapefileWriter`: Opens a GeoTools transaction, writes chunks incrementally, commits at the end
- `GeometryTypeValidationTasklet`: Pre-validates with SQL `DISTINCT ST_GeometryType()`, auto-converts MultiPolygon
- `CompositeItemWriter`: Writes shapefile and GeoJSON simultaneously in the map_id worker step

#### Legacy Mode

**Trigger**: No `--batch` flag (deprecated)
**Use Case**: Small datasets (<10K records)
**Memory**: 1.4-9GB (loads the entire result set)

**Pipeline Flow**:

```
ConverterCommandLineRunner
→ ShapefileConverterService.convertAll()
  → InferenceResultRepository.findByBatchIds() (full List<InferenceResult>)
  → validateGeometries() (in-memory validation)
  → ShapefileWriter.write() (DefaultFeatureCollection accumulation)
  → GeoJsonWriter.write()
```

**Conversion Modes** (`converter.mode`, legacy mode only):

- `MERGED`: creates a single shapefile for all data matching `batch-ids` (ignores `map-ids`)
- `MAP_IDS`: processes only the `map-ids` specified in configuration
- `RESOLVE`: queries the database for all distinct `map-ids` matching `batch-ids`, then processes each (avoids OS command-line length limits)
- If `mode` is unspecified: defaults to `MERGED` when `map-ids` is empty, otherwise `MAP_IDS`

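
The legacy-mode default (MERGED when `map-ids` is empty, otherwise `MAP_IDS`) can be sketched as a small decision function. The enum and method names here are illustrative, not taken from the codebase:

```java
import java.util.List;

public class ModeResolution {
  enum Mode { MERGED, MAP_IDS, RESOLVE }

  // Mirrors the documented default: MERGED when map-ids is empty, else MAP_IDS.
  static Mode resolve(Mode configured, List<String> mapIds) {
    if (configured != null) {
      return configured; // An explicit converter.mode always wins
    }
    return (mapIds == null || mapIds.isEmpty()) ? Mode.MERGED : Mode.MAP_IDS;
  }

  public static void main(String[] args) {
    System.out.println(resolve(null, List.of()));           // MERGED
    System.out.println(resolve(null, List.of("35811027"))); // MAP_IDS
    System.out.println(resolve(Mode.RESOLVE, List.of()));   // RESOLVE
  }
}
```

The sample map_id `35811027` is a made-up value for illustration only.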
### Key Design Patterns

**Geometry Type Validation & Auto-Conversion**:

- Pre-validation step runs SQL `SELECT DISTINCT ST_GeometryType(geometry)` to detect mixed types
- Supports automatic conversion: `ST_MultiPolygon` → `ST_Polygon` (extracts the first polygon only)
- Fails fast on unsupported mixed types (e.g., Polygon + LineString)
- Validates EPSG:5186 coordinate bounds (X: 125-530km, Y: -600-988km) and `ST_IsValid()`
- See `GeometryTypeValidationTasklet` (batch/tasklet/GeometryTypeValidationTasklet.java:1-290)

**WKT to JTS Conversion Pipeline**:

1. PostGIS query returns `ST_AsText(geometry)` as a WKT string
2. `GeometryConvertingRowMapper` converts each ResultSet row to an `InferenceResult` holding the WKT string (batch/reader/GeometryConvertingRowMapper.java:1-74)
3. `FeatureConversionProcessor` uses `GeometryConverter.parseGeometry()` to convert WKT → JTS Geometry (service/GeometryConverter.java:1-92)
4. `StreamingShapefileWriter` wraps the JTS geometry in a GeoTools `SimpleFeature` and writes it to the shapefile

**Chunk-Based Transaction Management** (Spring Batch only):

```java
// StreamingShapefileWriter (simplified)
@BeforeStep
public void open(StepExecution stepExecution) {
  transaction = new DefaultTransaction("create");
  featureStore.setTransaction(transaction); // Long-running transaction spanning all chunks
}

@Override
public void write(Chunk<? extends SimpleFeature> chunk) {
  ListFeatureCollection collection =
      new ListFeatureCollection(featureType, new ArrayList<>(chunk.getItems()));
  featureStore.addFeatures(collection); // Append chunk to shapefile
  // chunk goes out of scope afterwards → eligible for GC
}

@AfterStep
public ExitStatus afterStep(StepExecution stepExecution) {
  transaction.commit(); // Commit all chunks at once
  transaction.close();
  return ExitStatus.COMPLETED;
}
```

**PostgreSQL Array Parameter Handling**:

```java
// InferenceResultItemReaderConfig uses a PreparedStatementSetter
ps -> {
  Array batchIdsArray = ps.getConnection().createArrayOf("bigint", batchIds.toArray());
  ps.setArray(1, batchIdsArray); // WHERE batch_id = ANY(?)
  ps.setString(2, mapId);
}
```

This enables `WHERE batch_id = ANY(?)` queries.

**Output Directory Strategy**:

- Batch mode (MERGED): `{output-base-dir}/{inference-id}/merge/` → single merged shapefile + GeoJSON
- Batch mode (map_id partitioning): `{output-base-dir}/{inference-id}/{map-id}/` → per-map_id files
- Legacy mode: `{output-base-dir}/{inference-id}/{map-id}/` (no merge folder)

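
The layout above can be illustrated with `java.nio.file.Path`. The helper names and the sample map_id are hypothetical, not from the codebase:

```java
import java.nio.file.Path;

public class OutputDirs {
  // MERGED mode: {output-base-dir}/{inference-id}/merge/
  static Path mergedDir(String baseDir, String inferenceId) {
    return Path.of(baseDir, inferenceId, "merge");
  }

  // map_id partitioning / legacy mode: {output-base-dir}/{inference-id}/{map-id}/
  static Path mapIdDir(String baseDir, String inferenceId, String mapId) {
    return Path.of(baseDir, inferenceId, mapId);
  }

  public static void main(String[] args) {
    String base = "/data/model_output/export";
    String inferenceId = "D5E46F60FC40B1A8BE0CD1F3547AA6";
    System.out.println(mergedDir(base, inferenceId));
    System.out.println(mapIdDir(base, inferenceId, "35811027"));
  }
}
```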
**GeoServer Registration**:

- Only the shapefile ZIP is uploaded (GeoJSON is not registered)
- Requires the pre-created workspace 'cd' and environment variables for auth (`GEOSERVER_USERNAME`, `GEOSERVER_PASSWORD`)
- Conditional execution via the `geoserver.enabled` JobParameter
- Non-blocking: failures are logged but don't stop the batch job

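
For reference, GeoServer's REST API uploads a shapefile ZIP with a PUT to the datastore `file.shp` endpoint using Basic auth. A minimal sketch of the request shape; the store name `batch_252` is a hypothetical example:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class GeoServerUploadRequest {
  // PUT {base}/rest/workspaces/{ws}/datastores/{store}/file.shp
  // with Content-Type: application/zip and the ZIP bytes as the body
  static String uploadUrl(String baseUrl, String workspace, String storeName) {
    return baseUrl + "/rest/workspaces/" + workspace
        + "/datastores/" + storeName + "/file.shp";
  }

  static String basicAuthHeader(String user, String password) {
    String token = Base64.getEncoder()
        .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
    return "Basic " + token;
  }

  public static void main(String[] args) {
    System.out.println(uploadUrl(
        "https://kamco.geo-dev.gs.dabeeo.com/geoserver", "cd", "batch_252"));
    System.out.println(basicAuthHeader("admin", "geoserver"));
  }
}
```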
## Configuration

Configuration files live in `src/main/resources/`: `application.yml` (base configuration, sets the active profile) plus `application-prod.yml` (production), `application-dev.yml` (development), and `application-local.yml` (local development).

### Profile System

- Default profile: `prod` (set in `application.yml`)
- Configuration hierarchy: `application.yml` → `application-{profile}.yml`
- Override via: `--spring.profiles.active=dev`

### Key Configuration Properties

**Converter Settings** (`ConverterProperties.java`):

```yaml
converter:
  inference-id: 'D5E46F60FC40B1A8BE0CD1F3547AA6'  # Output folder name
  batch-ids: [252, 253, 257]                      # PostgreSQL batch_id filter (required)
  map-ids: []                                     # Legacy mode only (ignored in batch mode)
  mode: 'MERGED'                                  # Legacy mode only: MERGED, MAP_IDS, or RESOLVE
  output-base-dir: '/data/model_output/export/'
  crs: 'EPSG:5186'                                # Korean 2000 / Central Belt

batch:
  chunk-size: 1000            # Records per chunk (affects memory usage)
  fetch-size: 1000            # JDBC cursor fetch size
  skip-limit: 100             # Max skippable records per chunk
  enable-partitioning: false  # Future: parallel map_id processing
```

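
A back-of-envelope check of why memory stays flat: only one chunk is in flight at a time, so resident data scales with `chunk-size`, not with total row count. The per-record size below is an assumption for illustration, not a measured value:

```java
public class ChunkMemoryEstimate {
  public static void main(String[] args) {
    int chunkSize = 1000;              // batch.chunk-size
    int assumedBytesPerRecord = 4096;  // rough guess, incl. the WKT geometry string
    long inFlight = (long) chunkSize * assumedBytesPerRecord;
    // A few MB in flight per chunk, regardless of how many rows the query returns
    System.out.println(inFlight + " bytes per chunk");
  }
}
```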
**GeoServer Settings** (`GeoServerProperties.java`):

```yaml
geoserver:
  base-url: 'https://kamco.geo-dev.gs.dabeeo.com/geoserver'
  username: 'admin'           # Optional: prefer environment variables
  password: 'geoserver'       # Optional: prefer environment variables
  workspace: 'cd'             # Must be pre-created in GeoServer
  overwrite-existing: true    # Delete existing layer before registration
  connection-timeout: 30000   # 30 seconds
  read-timeout: 60000         # 60 seconds
  # Credentials from environment variables (preferred):
  # GEOSERVER_USERNAME, GEOSERVER_PASSWORD
```

**Spring Batch Metadata**:

```yaml
spring:
  batch:
    job:
      enabled: false            # Prevent auto-run on startup
    jdbc:
      initialize-schema: always # Auto-create BATCH_* tables
```

## Database Integration

### Query Strategies

Repository entry points:

- `findByMapId(batchIds, mapId)`: retrieves records for a specific map_id
- `findByBatchIds(batchIds)`: retrieves all records for the batch_ids (merged mode)
- `findMapIdByBatchIds(batchIds)`: queries distinct map_ids for RESOLVE mode

**Spring Batch Mode** (streaming):

```sql
-- InferenceResultItemReaderConfig.java
SELECT uid, map_id, probability, before_year, after_year,
       before_c, before_p, after_c, after_p,
       ST_AsText(geometry) AS geometry_wkt
FROM inference_results_testing
WHERE batch_id = ANY(?)
  AND ST_GeometryType(geometry) IN ('ST_Polygon', 'ST_MultiPolygon')
  AND ST_SRID(geometry) = 5186
  AND ST_X(ST_Centroid(geometry)) BETWEEN 125000 AND 530000
  AND ST_Y(ST_Centroid(geometry)) BETWEEN -600000 AND 988000
  AND ST_IsValid(geometry) = true
ORDER BY map_id, uid
-- Uses a server-side cursor with fetch-size=1000
```

**Legacy Mode** (full load):

```sql
-- InferenceResultRepository.java
SELECT uid, map_id, probability, before_year, after_year,
       before_c, before_p, after_c, after_p,
       ST_AsText(geometry) AS geometry_wkt
FROM inference_results_testing
WHERE batch_id = ANY(?) AND map_id = ?
-- Returns the full List<InferenceResult> in memory
```

**Geometry Type Validation**:

```sql
-- GeometryTypeValidationTasklet.java
SELECT DISTINCT ST_GeometryType(geometry)
FROM inference_results_testing
WHERE batch_id = ANY(?) AND geometry IS NOT NULL
-- Pre-validates the homogeneous geometry requirement
```

### Field Mapping

Database columns map to shapefile fields (shapefile/DBF field names are limited to 10 characters):

| Database Column | DB Type | Shapefile Field | Shapefile Type | Notes |
|-----------------|---------|-----------------|----------------|-------|
| uid | uuid | chnDtctId | String | Change detection ID |
| map_id | text | mpqd_no | String | Map quadrant number |
| probability | float8 | chn_dtct_p | Double | Change detection probability |
| before_year | bigint | cprs_yr | Long | Comparison year |
| after_year | bigint | crtr_yr | Long | Criteria year |
| before_c | text | bf_cls_cd | String | Before classification code |
| before_p | float8 | bf_cls_pro | Double | Before classification probability |
| after_c | text | af_cls_cd | String | After classification code |
| after_p | float8 | af_cls_pro | Double | After classification probability |
| geometry | geom | the_geom | Polygon | Geometry in EPSG:5186 |

**Field name source**: See `FeatureTypeFactory.java` (batch/util/FeatureTypeFactory.java:1-104)

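
The 10-character limit comes from the dBASE (.dbf) header format used by shapefiles. A quick sanity check that the mapped names fit; the helper name is illustrative:

```java
import java.util.List;

public class DbfFieldNameCheck {
  // dBASE field descriptors reserve 10 visible characters for the field name
  static boolean fitsDbfLimit(String name) {
    return name != null && !name.isEmpty() && name.length() <= 10;
  }

  public static void main(String[] args) {
    List<String> fields = List.of(
        "chnDtctId", "mpqd_no", "chn_dtct_p", "cprs_yr", "crtr_yr",
        "bf_cls_cd", "bf_cls_pro", "af_cls_cd", "af_cls_pro", "the_geom");
    boolean allOk = fields.stream().allMatch(DbfFieldNameCheck::fitsDbfLimit);
    System.out.println("all field names within 10 chars: " + allOk); // true
  }
}
```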
### Coordinate Reference System

All geometries use **EPSG:5186** (Korean 2000 / Central Belt); the PostGIS geometry column is `geometry(Polygon, 5186)`.

- **CRS**: EPSG:5186 (Korean 2000 / Central Belt)
- **Valid Coordinate Bounds**: X ∈ [125km, 530km], Y ∈ [-600km, 988km]
- **Encoding**: WKT in SQL → JTS Geometry → GeoTools SimpleFeature → `.prj` file
- **Validation**: Automatic in batch mode via an `ST_X(ST_Centroid())` range check

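
The bounds check done in SQL via `ST_X(ST_Centroid())` can be expressed in plain Java; the method name is illustrative, the constants match the documented range:

```java
public class CoordinateBoundsCheck {
  // Valid centroid range for EPSG:5186 (meters), as used in the batch-mode query
  static final double MIN_X = 125_000, MAX_X = 530_000;
  static final double MIN_Y = -600_000, MAX_Y = 988_000;

  static boolean centroidWithinBounds(double x, double y) {
    return x >= MIN_X && x <= MAX_X && y >= MIN_Y && y <= MAX_Y;
  }

  public static void main(String[] args) {
    System.out.println(centroidWithinBounds(200_000, 450_000)); // true  (plausible Korean coordinate)
    System.out.println(centroidWithinBounds(0, 0));             // false (outside valid range)
  }
}
```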
## Dependencies

**Core Framework**:

- Spring Boot 3.5.7
  - `spring-boot-starter`: DI container, logging
  - `spring-boot-starter-jdbc`: JDBC template, HikariCP
  - `spring-boot-starter-batch`: Spring Batch framework, job repository
  - `spring-boot-starter-web`: RestTemplate for GeoServer API calls
  - `spring-boot-starter-validation`: `@NotBlank` annotations

**Spatial Libraries**:

- GeoTools 30.0 (via the OSGeo repository)
  - `gt-shapefile`: Shapefile I/O (DataStore, FeatureStore, Transaction)
  - `gt-geojson`: GeoJSON encoding/decoding
  - `gt-referencing`: CRS transformations
  - `gt-epsg-hsql`: EPSG database for CRS lookups
- JTS 1.19.0: Geometry primitives (Polygon, MultiPolygon, GeometryFactory)
- PostGIS JDBC 2.5.1: PostGIS geometry type support

**Database**:

- PostgreSQL JDBC Driver (latest)
- HikariCP (bundled with Spring Boot)

**Build Configuration**:

```gradle
// build.gradle
configurations.all {
  exclude group: 'javax.media', module: 'jai_core' // Conflicts with GeoTools
}

bootJar {
  archiveFileName = "shp-exporter.jar" // Fixed JAR name
}

spotless {
  java {
    googleJavaFormat('1.19.2') // 2-space indentation
  }
}
```

## Development Patterns

### Adding a New Step to the Spring Batch Job

When adding steps to `mergedModeJob`, follow this pattern:

1. **Create a Tasklet or ItemWriter** in `batch/tasklet/` or `batch/writer/`
2. **Define a Step bean** in `MergedModeJobConfig.java`:

```java
@Bean
public Step myNewStep(JobRepository jobRepository,
                      PlatformTransactionManager transactionManager,
                      MyTasklet tasklet,
                      BatchExecutionHistoryListener historyListener) {
  return new StepBuilder("myNewStep", jobRepository)
      .tasklet(tasklet, transactionManager)
      .listener(historyListener) // REQUIRED for history tracking
      .build();
}
```

3. **Add it to the job flow** in `mergedModeJob()`:

```java
.next(myNewStep)
```

4. **Always include `BatchExecutionHistoryListener`** to track execution metrics

### Modifying ItemReader Configuration

ItemReaders are **not thread-safe**. Each step requires its own instance:

```java
// WRONG: sharing one reader between steps
@Bean
public JdbcCursorItemReader<InferenceResult> reader() { ... }

// RIGHT: separate readers with @StepScope
@Bean
@StepScope // Creates a new instance per step execution
public JdbcCursorItemReader<InferenceResult> shapefileReader() { ... }

@Bean
@StepScope
public JdbcCursorItemReader<InferenceResult> geoJsonReader() { ... }
```

See `InferenceResultItemReaderConfig.java` for working examples.

### Streaming Writers Pattern

When writing custom streaming writers, follow the `StreamingShapefileWriter` pattern:

```java
@Component
@StepScope
public class MyStreamingWriter implements ItemStreamWriter<MyType> {
  private Transaction transaction;

  @BeforeStep
  public void open(StepExecution stepExecution) {
    // Open resources, start the long-running transaction
    transaction = new DefaultTransaction("create");
  }

  @Override
  public void write(Chunk<? extends MyType> chunk) {
    // Write the chunk incrementally
    // Do NOT accumulate items in memory
  }

  @AfterStep
  public ExitStatus afterStep(StepExecution stepExecution) {
    transaction.commit(); // Commit all chunks
    transaction.close();
    return ExitStatus.COMPLETED;
  }
}
```

### JobParameters and StepExecutionContext

**Pass data between steps** using the `ExecutionContext` (step-scoped values must be promoted to the job-level context to be visible in later steps):

```java
// Step 1: store data
stepExecution.getExecutionContext().putString("geometryType", "ST_Polygon");

// Step 2: retrieve data (from the job-level context)
@BeforeStep
public void beforeStep(StepExecution stepExecution) {
  String geomType = stepExecution.getJobExecution()
      .getExecutionContext()
      .getString("geometryType");
}
```

**Job-level parameters** from the command line:

```java
// ConverterCommandLineRunner.buildJobParameters()
JobParametersBuilder builder = new JobParametersBuilder();
builder.addString("inferenceId", converterProperties.getInferenceId());
builder.addLong("timestamp", System.currentTimeMillis()); // Ensures uniqueness
```

### Partitioning Pattern (Map ID Processing)

The `generateMapIdFilesStep` uses partitioning but runs **sequentially** to avoid exhausting the DB connection pool:

```java
@Bean
public Step generateMapIdFilesStep(...) {
  return new StepBuilder("generateMapIdFilesStep", jobRepository)
      .partitioner("mapIdWorker", partitioner)
      .step(mapIdWorkerStep)
      .taskExecutor(new SyncTaskExecutor()) // SEQUENTIAL execution
      .build();
}
```

For future parallel execution (requires connection pool tuning):

```java
.taskExecutor(new SimpleAsyncTaskExecutor())
.gridSize(4) // 4 concurrent workers
```

### GeoServer REST API Integration

GeoServer operations use `RestTemplate` with custom error handling:

```java
// GeoServerRegistrationService.java
try {
  restTemplate.exchange(url, HttpMethod.PUT, entity, String.class);
} catch (HttpClientErrorException e) {
  if (e.getStatusCode() == HttpStatus.NOT_FOUND) {
    // Handle workspace not found
  }
}
```

Always check workspace existence before layer registration.

### Testing Considerations

- **Unit tests**: Mock `JdbcTemplate` and `DataSource` for repository tests
- **Integration tests**: Use `@SpringBatchTest` with an embedded H2 database
- **GeoTools**: Use `MemoryDataStore` for shapefile writer tests
- **Current state**: Limited test coverage (focused on critical-path validation)

Refer to `claudedocs/SPRING_BATCH_MIGRATION.md` for detailed batch architecture documentation.