PostgreSQL to Shapefile Converter
Spring Boot CLI application that queries PostgreSQL spatial data and generates shapefiles.
Features
- Batch processing for multiple map_ids
- PostGIS geometry to Shapefile conversion
- GeoServer REST API integration for automatic layer registration
- Configurable via `application.yml`
- Generates all required shapefile files (.shp, .shx, .dbf, .prj)
- Supports EPSG:5186 (Korean 2000 / Central Belt) coordinate reference system
- GeoJSON export support
Prerequisites
- Java 17 or higher
- PostgreSQL database with PostGIS extension
- Access to the KAMCO database at 192.168.2.127:15432
Configuration
Edit src/main/resources/application.yml to configure:
converter:
  spring.profiles.active: 'dev' #profiles.active
  inference-id: 'D5E46F60FC40B1A8BE0CD1F3547AA6' # Inference ID (used for output folder structure)
  map-ids:
    - '35813030' # Add your map_ids here (text type)
  batch-ids:
    - 252
    - 253
    - 257
  output-base-dir: '/kamco-nfs/dataset/export/'
  crs: 'EPSG:5186' # Korean 2000 / Central Belt
geoserver:
  base-url: 'https://kamco.geo-dev.gs.dabeeo.com'
  workspace: 'cd'
  datastore: 'inference_result'
  overwrite-existing: true # Delete existing layers before re-registering
  connection-timeout: 30000 # 30 seconds
  read-timeout: 60000 # 60 seconds
## Converter Mode (`converter.mode`)
`converter.mode` was added to avoid the OS command-line length limit
(`Argument list too long`) when processing a large number of `map_ids`.
### Supported Modes
#### MERGED
- Merges **all data matching `batch-ids` into a single Shapefile**
- `map-ids` is **ignored** even if it is set
- Suitable when a single result file is needed
#### MAP_IDS
- Generates Shapefiles only for the explicitly passed `map-ids`
- `converter.map-ids` is **required**
- A large number of `map-ids` may hit the OS command-line length limit
#### RESOLVE
- **Looks up map_ids inside the JAR** based on `batch-ids`, then generates Shapefiles
- `map-ids` are not passed as command-line arguments
- The safest option when processing large volumes of data
### Default Behavior (when mode is not specified)
- If `converter.map-ids` is empty → **MERGED**
- If `converter.map-ids` is set → **MAP_IDS**
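The fallback rule above can be sketched in plain Java. This is only an illustration under assumptions: the class `ConverterModeResolver` and its `resolve` method are hypothetical names, not the project's actual code, and the real application may apply additional validation (for example, rejecting MAP_IDS with an empty list).

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of the default-mode rule; not the project's actual class. */
final class ConverterModeResolver {

  enum Mode { MERGED, MAP_IDS, RESOLVE }

  /**
   * Returns the explicit mode when one is given; otherwise falls back to
   * MERGED (no map-ids configured) or MAP_IDS (map-ids configured).
   */
  static Mode resolve(String explicitMode, List<String> mapIds) {
    if (explicitMode != null && !explicitMode.isBlank()) {
      // Mode names are matched case-insensitively in this sketch.
      return Mode.valueOf(explicitMode.trim().toUpperCase(Locale.ROOT));
    }
    boolean hasMapIds = mapIds != null && !mapIds.isEmpty();
    return hasMapIds ? Mode.MAP_IDS : Mode.MERGED;
  }

  public static void main(String[] args) {
    System.out.println(resolve(null, List.of()));
    System.out.println(resolve(null, List.of("35813030")));
    System.out.println(resolve("resolve", List.of()));
  }
}
```

An explicit `converter.mode` always wins in this sketch, so passing `MERGED` together with `map-ids` still yields MERGED, matching the note that `map-ids` is ignored in that mode.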
### Command Line Parameters
You can override configuration values using command line arguments:
**Using Gradle (recommended - no quoting issues):**
```bash
./gradlew bootRun --args="--converter.inference-id=D5E46F60FC40B1A8BE0CD1F3547AA6 --converter.map-ids[0]=35813030 --converter.batch-ids[0]=252 --converter.batch-ids[1]=253 --converter.batch-ids[2]=257 --converter.mode=MERGED"
```
Using JAR (zsh shell - quote arguments with brackets):
java -jar build/libs/shp-exporter.jar \
'--converter.inference-id=D5E46F60FC40B1A8BE0CD1F3547AA6' \
'--converter.map-ids[0]=35813030' \
'--converter.batch-ids[0]=252' \
'--converter.batch-ids[1]=253' \
'--converter.batch-ids[2]=257' \
'--converter.mode=MERGED'
Using JAR (bash shell - no quotes needed):
java -jar build/libs/shp-exporter.jar \
--converter.inference-id=D5E46F60FC40B1A8BE0CD1F3547AA6 \
--converter.map-ids[0]=35813030 \
--converter.batch-ids[0]=252 \
--converter.batch-ids[1]=253 \
--converter.batch-ids[2]=257 \
--converter.mode=MERGED
java -jar build/libs/shp-exporter.jar \
--batch \
--converter.inference-id=test009 \
--converter.batch-ids[0]=111 \
--converter.batch-ids[1]=114 \
--converter.batch-ids[2]=162 \
--geoserver.enabled=true
Note for zsh users: zsh interprets square brackets [] as glob patterns. Always quote arguments containing brackets when using zsh.
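When a wrapper script or scheduler has to assemble these arguments, the indexed form can be generated rather than typed by hand. The following is a minimal sketch; `IndexedArgs`, `indexed`, and `shellQuote` are hypothetical helper names introduced here for illustration and are not part of the project.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that builds indexed command-line arguments. */
final class IndexedArgs {

  /** Expands values into the indexed form used above: --name[0]=a, --name[1]=b, ... */
  static List<String> indexed(String name, List<?> values) {
    List<String> args = new ArrayList<>();
    for (int i = 0; i < values.size(); i++) {
      args.add("--" + name + "[" + i + "]=" + values.get(i));
    }
    return args;
  }

  /** Wraps an argument in single quotes so a shell such as zsh does not glob the brackets. */
  static String shellQuote(String arg) {
    // An embedded single quote is closed, escaped, and reopened: ' -> '\''
    return "'" + arg.replace("'", "'\\''") + "'";
  }

  public static void main(String[] args) {
    for (String a : indexed("converter.batch-ids", List.of(252, 253, 257))) {
      System.out.println(shellQuote(a));
    }
  }
}
```

Quoting is only needed when the arguments pass through a shell. If the JAR is launched with `ProcessBuilder`, the unquoted strings from `indexed` can be passed directly.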
Building
./gradlew build
Running
Generate Shapefiles (Spring Batch Mode - Recommended)
New in v1.1.0: Spring Batch mode provides memory-optimized processing for large datasets.
# MERGED mode (creates single shapefile + GeoJSON for all batch-ids)
./gradlew bootRun --args="--batch --converter.batch-ids[0]=252 --converter.batch-ids[1]=253"
# With GeoServer registration
./gradlew bootRun --args="--batch --geoserver.enabled=true --converter.batch-ids[0]=252"
Output Files (in {output-base-dir}/{inference-id}/merge/):
- `{inference-id}.shp` (+ .shx, .dbf, .prj) - Shapefile
- `{inference-id}.geojson` - GeoJSON file
- `{inference-id}.zip` - ZIP archive of shapefile
Benefits:
- 90-95% memory reduction (2-13GB → 150-200MB for 1M records)
- Chunk-based streaming (1000 records per chunk)
- Restart capability after failures
- Step-by-step execution support
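The memory benefit of chunk-based streaming comes from holding only one chunk in memory at a time. The sketch below illustrates that idea in plain Java. It is not the Spring Batch API and not the project's reader or writer; `ChunkedProcessing` and `processInChunks` are hypothetical names, and the chunk size of 1000 is taken from the list above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.IntStream;

/** Plain-Java illustration of chunk-based streaming (not Spring Batch itself). */
final class ChunkedProcessing {

  /**
   * Pulls items from the source one at a time, hands them to the writer in
   * chunks of at most chunkSize, and returns the number of chunks written.
   */
  static <T> int processInChunks(Iterator<T> source, int chunkSize, Consumer<List<T>> writer) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    int chunks = 0;
    List<T> buffer = new ArrayList<>(chunkSize);
    while (source.hasNext()) {
      buffer.add(source.next());
      if (buffer.size() == chunkSize) {
        writer.accept(List.copyOf(buffer)); // the writer gets its own immutable copy
        buffer.clear();                     // memory use stays bounded by one chunk
        chunks++;
      }
    }
    if (!buffer.isEmpty()) {
      writer.accept(List.copyOf(buffer));   // final partial chunk
      chunks++;
    }
    return chunks;
  }

  public static void main(String[] args) {
    Iterator<Integer> records = IntStream.range(0, 2500).boxed().iterator();
    int chunks = processInChunks(records, 1000, c -> System.out.println("wrote " + c.size()));
    System.out.println("chunks: " + chunks);
  }
}
```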
See claudedocs/SPRING_BATCH_MIGRATION.md for detailed documentation.
Generate Shapefiles (Legacy Mode)
./gradlew bootRun
Or run the JAR directly:
java -jar build/libs/shp-exporter.jar
Register Shapefile to GeoServer
First, set GeoServer credentials as environment variables:
export GEOSERVER_USERNAME=admin
export GEOSERVER_PASSWORD=geoserver
Then register a shapefile:
./gradlew bootRun --args="--upload-shp /kamco-nfs/dataset/export/D5E46F60FC40B1A8BE0CD1F3547AA6/35813030/35813030.shp --layer inference_35813030"
Or using the JAR:
java -jar build/libs/shp-exporter.jar \
--upload-shp /path/to/shapefile.shp \
--layer layer_name
GeoServer Registration Process:
- Verifies workspace 'cd' exists (must be pre-created in GeoServer)
- Creates datastore 'inference_result' if it doesn't exist
- Deletes existing layer if `overwrite-existing: true`
- Publishes shapefile via REST API
- Verifies successful registration
- Automatically enables WMS, WFS, WMTS services
Important Notes:
- Workspace 'cd' must exist in GeoServer before registration
- Environment variables `GEOSERVER_USERNAME` and `GEOSERVER_PASSWORD` must be set
- Shapefile path must be absolute
- GeoServer must have file system access to the shapefile location
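Two of these requirements (credentials set, absolute shapefile path) can be checked before any request is sent. The sketch below shows one possible pre-flight check. `GeoServerPreflight` and its methods are hypothetical names, and the use of HTTP Basic authentication is an assumption about how the credentials are sent, not something this README confirms.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Base64;

/** Hypothetical pre-flight checks run before GeoServer registration. */
final class GeoServerPreflight {

  /** Builds an HTTP Basic Authorization header value; fails fast if credentials are missing. */
  static String basicAuthHeader(String username, String password) {
    if (username == null || username.isBlank() || password == null || password.isBlank()) {
      throw new IllegalStateException("GEOSERVER_USERNAME and GEOSERVER_PASSWORD must be set");
    }
    String token = username + ":" + password;
    return "Basic " + Base64.getEncoder().encodeToString(token.getBytes(StandardCharsets.UTF_8));
  }

  /** Returns true only for an absolute path that ends in .shp. */
  static boolean isAbsoluteShapefilePath(String path) {
    return path != null && path.endsWith(".shp") && Path.of(path).isAbsolute();
  }

  public static void main(String[] args) {
    try {
      basicAuthHeader(System.getenv("GEOSERVER_USERNAME"), System.getenv("GEOSERVER_PASSWORD"));
      System.out.println("Credentials found");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
    System.out.println(isAbsoluteShapefilePath("relative/35813030.shp"));
  }
}
```

The check cannot confirm that GeoServer itself can read the path; that still has to be verified on the GeoServer host.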
Output
Legacy Mode Output
Shapefiles will be created in directories structured as output-base-dir/inference-id/map-id/:
/kamco-nfs/dataset/export/D5E46F60FC40B1A8BE0CD1F3547AA6/35813030/
├── 35813030.shp # Shapefile geometry
├── 35813030.shx # Shape index
├── 35813030.dbf # Attribute data
└── 35813030.prj # Projection information
Spring Batch Mode Output
Output structure for MERGED mode (output-base-dir/inference-id/merge/):
/kamco-nfs/dataset/export/D5E46F60FC40B1A8BE0CD1F3547AA6/merge/
├── D5E46F60FC40B1A8BE0CD1F3547AA6.shp # Shapefile geometry
├── D5E46F60FC40B1A8BE0CD1F3547AA6.shx # Shape index
├── D5E46F60FC40B1A8BE0CD1F3547AA6.dbf # Attribute data
├── D5E46F60FC40B1A8BE0CD1F3547AA6.prj # Projection information
├── D5E46F60FC40B1A8BE0CD1F3547AA6.geojson # GeoJSON format
└── D5E46F60FC40B1A8BE0CD1F3547AA6.zip # ZIP archive (for GeoServer)
Note: Only the shapefile (.shp and its companion files) is registered to GeoServer. GeoJSON files are generated for alternative consumption.
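A script that consumes the MERGED output can derive the expected file locations from `output-base-dir` and `inference-id`. The following sketch assumes the layout shown above. `MergedOutputPaths` is a hypothetical helper, not a class in the project.

```java
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper that derives MERGED-mode output paths. */
final class MergedOutputPaths {

  /** Returns {output-base-dir}/{inference-id}/merge. */
  static Path mergeDir(String outputBaseDir, String inferenceId) {
    return Path.of(outputBaseDir, inferenceId, "merge");
  }

  /** Lists the six files expected after a MERGED run, in the order shown in the tree above. */
  static List<Path> expectedFiles(String outputBaseDir, String inferenceId) {
    Path dir = mergeDir(outputBaseDir, inferenceId);
    return List.of("shp", "shx", "dbf", "prj", "geojson", "zip").stream()
        .map(ext -> dir.resolve(inferenceId + "." + ext))
        .toList();
  }

  public static void main(String[] args) {
    expectedFiles("/kamco-nfs/dataset/export/", "D5E46F60FC40B1A8BE0CD1F3547AA6")
        .forEach(System.out::println);
  }
}
```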
Database Query
Spring Batch Mode (Recommended)
The Spring Batch mode applies comprehensive validation to ensure data quality:
ORDER BY map_id, uid
Validation Criteria:
- Geometry Type: Only ST_Polygon and ST_MultiPolygon (excludes Point, LineString, etc.)
- Coordinate System: EPSG:5186 (Korean 2000 / Central Belt)
- Coordinate Range: Korea territory bounds (X: 125 to 530 km, Y: -600 to 988 km)
- Geometry Validity: Valid topology (ST_IsValid)
Rows failing validation are automatically excluded from processing, ensuring clean shapefile generation.
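The coordinate-range criterion can be illustrated with a small check. This is a sketch under stated assumptions: the bounds are taken from the list above and converted to meters (the unit of EPSG:5186), and `CoordinateRangeCheck` is a hypothetical class. The actual validation runs in the database query, which is not reproduced here.

```java
/** Hypothetical client-side version of the coordinate-range criterion. */
final class CoordinateRangeCheck {

  // Assumption: the README's kilometer bounds expressed in meters.
  static final double MIN_X = 125_000.0;
  static final double MAX_X = 530_000.0;
  static final double MIN_Y = -600_000.0;
  static final double MAX_Y = 988_000.0;

  /** Returns true when a single coordinate lies inside the bounds (NaN is rejected). */
  static boolean inKoreaBounds(double x, double y) {
    return x >= MIN_X && x <= MAX_X && y >= MIN_Y && y <= MAX_Y;
  }

  /** Returns true when every {x, y} pair of a ring lies inside the bounds. */
  static boolean allInBounds(double[][] ring) {
    for (double[] xy : ring) {
      if (xy.length < 2 || !inKoreaBounds(xy[0], xy[1])) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(inKoreaBounds(200_000.0, 600_000.0)); // projected coordinates
    System.out.println(inKoreaBounds(127.0, 37.5));          // longitude/latitude degrees
  }
}
```

A check like this mainly catches geometries stored in the wrong coordinate system, such as longitude/latitude degrees, which fall far outside the projected range.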
Performance: See PERFORMANCE_OPTIMIZATION.md for indexing recommendations.
Legacy Mode
Legacy mode uses a simpler query without validation:
SELECT uid, map_id, probability, before_year, after_year,
before_c, before_p, after_c, after_p, ST_AsText(geometry) as geometry_wkt
FROM inference_results_testing
WHERE batch_id = ANY(?) AND map_id = ?
Database Schema
- geometry: `geometry(Polygon, 5186)` - EPSG:5186 coordinate system
- map_id: `text` - string type
- before_year, after_year: `bigint` - Long type
- batch_id: `bigint` - Long type
- uid: `uuid` - UUID type
Field Mapping
Shapefile field names are limited to 10 characters:
| Database Column | DB Type | Shapefile Field | Shapefile Type |
|---|---|---|---|
| uid | uuid | chnDtctId | String |
| map_id | text | mpqd_no | String |
| probability | float8 | chn_dtct_p | Double |
| before_year | bigint | cprs_yr | Long |
| after_year | bigint | crtr_yr | Long |
| before_c | text | bf_cls_cd | String |
| before_p | float8 | bf_cls_pro | Double |
| after_c | text | af_cls_cd | String |
| after_p | float8 | af_cls_pro | Double |
| geometry | geom | the_geom | Polygon |
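The 10-character limit is easy to violate when a new column is added, so the mapping can be checked in code. The sketch below copies the names from the table above. `ShapefileFieldNames` is a hypothetical class and does not reflect how the project's `FeatureTypeFactory` is actually written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the column-to-field mapping in the table above. */
final class ShapefileFieldNames {

  static final int MAX_FIELD_NAME_LENGTH = 10;

  /** Returns database column -> shapefile field, in table order. */
  static Map<String, String> mapping() {
    Map<String, String> m = new LinkedHashMap<>();
    m.put("uid", "chnDtctId");
    m.put("map_id", "mpqd_no");
    m.put("probability", "chn_dtct_p");
    m.put("before_year", "cprs_yr");
    m.put("after_year", "crtr_yr");
    m.put("before_c", "bf_cls_cd");
    m.put("before_p", "bf_cls_pro");
    m.put("after_c", "af_cls_cd");
    m.put("after_p", "af_cls_pro");
    m.put("geometry", "the_geom");
    return m;
  }

  /** Throws if any shapefile field name exceeds the 10-character limit. */
  static void validate(Map<String, String> fields) {
    fields.forEach((column, field) -> {
      if (field.length() > MAX_FIELD_NAME_LENGTH) {
        throw new IllegalArgumentException(
            "Field '" + field + "' for column '" + column + "' exceeds "
                + MAX_FIELD_NAME_LENGTH + " characters");
      }
    });
  }

  public static void main(String[] args) {
    validate(mapping());
    mapping().forEach((column, field) -> System.out.println(column + " -> " + field));
  }
}
```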
Error Handling
Shapefile Generation
- No results: Logs warning and continues to next map_id
- Mixed geometry types: Throws exception (shapefiles require homogeneous geometry)
- Database connection failure: Application exits with error
- Invalid geometry: Logs warning and continues processing
GeoServer Registration
- Workspace not found: Logs error with remediation steps (must be pre-created)
- Authentication failure: Logs error prompting to verify environment variables
- Network timeout: Logs connection error with timeout details
- Layer already exists: Automatically deletes and re-registers if `overwrite-existing: true`
- Registration failure: Logs error but does not stop application (non-blocking)
Validating Output
Shapefile Validation
Open the generated shapefiles in QGIS or ArcGIS to verify:
- Geometry displays correctly
- Attribute table contains all expected fields
- CRS is EPSG:5186 (Korean 2000 / Central Belt)
GeoServer Layer Validation
After registering to GeoServer, verify the layer:
- GeoServer Admin Console: https://kamco.geo-dev.gs.dabeeo.com/geoserver/web
- Navigate to Layers → Find your layer (e.g., `cd:inference_35813030`)
- Preview the layer using Layer Preview
- Verify services are enabled:
  - WMS: https://kamco.geo-dev.gs.dabeeo.com/geoserver/cd/wms
  - WFS: https://kamco.geo-dev.gs.dabeeo.com/geoserver/cd/wfs
  - WMTS: https://kamco.geo-dev.gs.dabeeo.com/geoserver/cd/wmts
Example WMS GetMap Request:
https://kamco.geo-dev.gs.dabeeo.com/geoserver/cd/wms?
service=WMS&
version=1.1.0&
request=GetMap&
layers=cd:inference_35813030&
bbox=<bounds>&
width=768&
height=768&
srs=EPSG:5186&
format=image/png
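The same request can be assembled programmatically, which avoids hand-editing the `bbox` placeholder. The sketch below is illustrative only: `WmsGetMapUrl` is a hypothetical helper, the bounding box values in `main` are made-up sample coordinates, and parameter values are URL-encoded, so `:` and `,` appear as `%3A` and `%2C` in the result.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical builder for the WMS 1.1.0 GetMap request shown above. */
final class WmsGetMapUrl {

  static String build(String baseUrl, String workspace, String layer,
                      double minX, double minY, double maxX, double maxY,
                      int width, int height) {
    // LinkedHashMap keeps the parameters in the same order as the example.
    Map<String, String> params = new LinkedHashMap<>();
    params.put("service", "WMS");
    params.put("version", "1.1.0");
    params.put("request", "GetMap");
    params.put("layers", workspace + ":" + layer);
    params.put("bbox", minX + "," + minY + "," + maxX + "," + maxY);
    params.put("width", String.valueOf(width));
    params.put("height", String.valueOf(height));
    params.put("srs", "EPSG:5186");
    params.put("format", "image/png");

    String query = params.entrySet().stream()
        .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
        .collect(Collectors.joining("&"));
    return baseUrl + "/geoserver/" + workspace + "/wms?" + query;
  }

  public static void main(String[] args) {
    // Sample bbox only; use the layer's real bounds from GeoServer.
    System.out.println(build("https://kamco.geo-dev.gs.dabeeo.com", "cd", "inference_35813030",
        190000.0, 590000.0, 200000.0, 600000.0, 768, 768));
  }
}
```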
Development
Code Formatting
The project uses Google Java Format with 2-space indentation:
./gradlew spotlessApply
Project Structure
src/main/java/com/kamco/makesample/
├── MakeSampleApplication.java # Main application class
├── batch/ # Spring Batch components (v1.1.0+)
│ ├── config/
│ │ ├── BatchConfiguration.java # Spring Batch configuration
│ │ └── MergedModeJobConfig.java # MERGED mode Job definition
│ ├── processor/
│ │ └── FeatureConversionProcessor.java # InferenceResult → SimpleFeature processor
│ ├── reader/
│ │ ├── GeometryConvertingRowMapper.java # WKT → JTS converter
│ │ └── InferenceResultItemReaderConfig.java # Cursor-based DB reader
│ ├── tasklet/
│ │ ├── CreateZipTasklet.java # ZIP creation tasklet
│ │ ├── GeoServerRegistrationTasklet.java # GeoServer registration tasklet
│ │ └── GeometryTypeValidationTasklet.java # Geometry validation tasklet
│ ├── util/
│ │ └── FeatureTypeFactory.java # Shared feature type creation
│ └── writer/
│ ├── StreamingGeoJsonWriter.java # Streaming GeoJSON writer
│ └── StreamingShapefileWriter.java # Streaming shapefile writer
├── cli/
│ └── ConverterCommandLineRunner.java # CLI entry point (batch + legacy)
├── config/
│ ├── ConverterProperties.java # Shapefile converter configuration
│ ├── GeoServerProperties.java # GeoServer configuration
│ ├── GeoServerCredentials.java # GeoServer authentication
│ └── RestTemplateConfig.java # HTTP client configuration
├── exception/
│ ├── ShapefileConversionException.java
│ ├── GeometryConversionException.java
│ ├── MixedGeometryException.java
│ └── GeoServerRegistrationException.java # GeoServer registration errors
├── model/
│ └── InferenceResult.java # Domain model
├── repository/
│ └── InferenceResultRepository.java # Data access layer (legacy)
├── service/
│ ├── GeometryConverter.java # PostGIS to JTS conversion
│ ├── ShapefileConverterService.java # Orchestration service (legacy)
│ └── GeoServerRegistrationService.java # GeoServer REST API integration
└── writer/
├── ShapefileWriter.java # GeoTools shapefile writer (legacy)
└── GeoJsonWriter.java # GeoJSON export writer (legacy)
Dependencies
- Spring Boot 3.5.7
- spring-boot-starter
- spring-boot-starter-jdbc
- spring-boot-starter-batch (v1.1.0+)
- spring-boot-starter-web (for RestTemplate)
- spring-boot-starter-validation (for @NotBlank annotations)
- GeoTools 30.0
- gt-shapefile
- gt-referencing
- gt-epsg-hsql
- gt-geojson
- PostgreSQL JDBC Driver
- PostGIS JDBC 2.5.1
- JTS (Java Topology Suite) 1.19.0
Troubleshooting
GeoServer Registration Issues
Problem: "Workspace not found: cd"
Solution: Create workspace 'cd' in GeoServer admin console before registration
Steps:
1. Login to GeoServer admin: https://kamco.geo-dev.gs.dabeeo.com/geoserver/web
2. Go to Workspaces → Add new workspace
3. Name: cd, Namespace URI: http://cd
4. Click Save
Problem: "GeoServer credentials not configured"
Solution: Set environment variables before running
export GEOSERVER_USERNAME=admin
export GEOSERVER_PASSWORD=geoserver
Problem: "Layer already exists and overwrite is disabled"
Solution: Enable overwrite in application.yml
geoserver:
  overwrite-existing: true
Problem: Connection timeout to GeoServer
Solution: Increase timeout values in application.yml
geoserver:
  connection-timeout: 60000 # 60 seconds
  read-timeout: 120000 # 120 seconds
Problem: "Registration failed. Layer not found after publication"
Possible causes:
1. GeoServer cannot access shapefile path (check file system permissions)
2. Shapefile is corrupted or invalid
3. Network issue interrupted registration
Solution:
1. Verify GeoServer has read access to shapefile directory
2. Validate shapefile using QGIS or ogr2ogr
3. Check GeoServer logs for detailed error messages
Database Connection Issues
Problem: "Connection refused to 192.168.2.127:15432"
Solution: Verify PostgreSQL is running and accessible
psql -h 192.168.2.127 -p 15432 -U kamco_cds -d kamco_cds
Problem: "No results found for map_id"
Solution: Verify data exists in database
SELECT COUNT(*) FROM inference_results_testing
WHERE batch_id IN (252, 253, 257) AND map_id = '35813030';
Batch Execution History
Overview
Spring Batch mode automatically tracks execution history for each step, recording:
- Start time, end time, duration
- Success/failure status
- Error messages and stack traces (if failed)
- Processing statistics (read/write/commit/rollback/skip counts)
Table Setup
Create the batch_execution_history table before running batch jobs:
psql -h 192.168.2.127 -p 15432 -U kamco_cds -d kamco_cds \
-f src/main/resources/db/migration/V1__create_batch_execution_history.sql
Query Examples
View execution history for a specific job:
SELECT step_name, start_time, end_time, duration_ms, status, read_count, write_count
FROM batch_execution_history
WHERE job_execution_id = 123
ORDER BY start_time;
Check failed steps:
SELECT job_execution_id, step_name, start_time, error_message
FROM batch_execution_history
WHERE status = 'FAILED'
ORDER BY start_time DESC
LIMIT 10;
Average step duration:
SELECT step_name,
COUNT(*) as executions,
ROUND(AVG(duration_ms) / 1000.0, 2) as avg_duration_sec
FROM batch_execution_history
WHERE status = 'COMPLETED'
GROUP BY step_name
ORDER BY avg_duration_sec DESC;
For more query examples and detailed documentation, see BATCH_EXECUTION_HISTORY.md.
License
KAMCO Internal Use Only