CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Spring Boot CLI application that queries spatial data from a PostgreSQL/PostGIS database and converts it to ESRI shapefiles and GeoJSON. The application processes AI inference results from the KAMCO database and generates geographic data files for visualization in GIS applications. It can also register the generated shapefiles with GeoServer via its REST API.

Build and Run Commands

Build

./gradlew build

The built JAR is named shp-exporter.jar (configured in the bootJar task) and is written to build/libs/.

Run Application

Generate Shapefiles

./gradlew bootRun

Or using JAR:

java -jar build/libs/shp-exporter.jar

Upload Shapefile to GeoServer

Set environment variables first:

export GEOSERVER_USERNAME=admin
export GEOSERVER_PASSWORD=geoserver

Then upload:

./gradlew bootRun --args="--upload-shp /path/to/file.shp --layer layer_name"

Or using JAR:

java -jar build/libs/shp-exporter.jar --upload-shp /path/to/file.shp --layer layer_name

Override Configuration via Command Line

Using Gradle (recommended - no quoting issues):

./gradlew bootRun --args="--converter.inference-id=ABC123 --converter.map-ids[0]=35813030 --converter.batch-ids[0]=252 --converter.mode=MERGED"

Using the JAR from zsh (quote every argument that contains brackets, because zsh otherwise treats [0] as a glob pattern):

java -jar build/libs/shp-exporter.jar '--converter.inference-id=ABC123' '--converter.map-ids[0]=35813030'

Code Formatting

Apply Google Java Format (2-space indentation) before committing:

./gradlew spotlessApply

Check formatting without applying:

./gradlew spotlessCheck

Active Profile

By default, the application runs with spring.profiles.active=prod (set in application.yml). Profile-specific configurations are in application-{profile}.yml files.

Architecture

Processing Pipeline

The application follows a layered architecture with a linear data flow:

  1. CLI Entry (ConverterCommandLineRunner) → Parses command-line args and routes to either shapefile generation or GeoServer upload
  2. Service Orchestration (ShapefileConverterService) → Coordinates the conversion workflow based on mode (MERGED, MAP_IDS, or RESOLVE)
  3. Data Access (InferenceResultRepository) → Queries PostGIS database using PreparedStatementCreator for PostgreSQL array parameters
  4. Geometry Conversion (GeometryConverter) → Converts PostGIS WKT format to JTS Geometry objects using WKTReader
  5. File Writing (ShapefileWriter, GeoJsonWriter, ResultZipWriter) → Generates output files using GeoTools
  6. GeoServer Integration (GeoServerRegistrationService) → Registers shapefiles to GeoServer via REST API (optional)
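
The routing decision in step 1 can be sketched as follows. This is a minimal illustration that uses only the JDK; the class name, the Action enum, and the helper methods are hypothetical and are not taken from ConverterCommandLineRunner. It assumes upload mode is selected whenever --upload-shp is followed by a value, and that every other invocation (including Spring-style --converter.* overrides) falls through to shapefile generation.

```java
import java.util.Optional;

public class CliRoutingSketch {

  /** The two top-level actions the runner can dispatch to (hypothetical names). */
  public enum Action {
    GENERATE_SHAPEFILES,
    UPLOAD_TO_GEOSERVER
  }

  /** Returns the token that follows the given flag, if the flag is present and has a value. */
  public static Optional<String> optionValue(String[] args, String flag) {
    for (int i = 0; i < args.length - 1; i++) {
      if (flag.equals(args[i])) {
        return Optional.of(args[i + 1]);
      }
    }
    return Optional.empty();
  }

  /** Chooses upload only when --upload-shp carries a value; otherwise generation. */
  public static Action route(String[] args) {
    return optionValue(args, "--upload-shp").isPresent()
        ? Action.UPLOAD_TO_GEOSERVER
        : Action.GENERATE_SHAPEFILES;
  }

  public static void main(String[] args) {
    System.out.println(route(args));
  }
}
```

With the arguments from the upload example above, route returns UPLOAD_TO_GEOSERVER and optionValue(args, "--layer") yields the layer name.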

Key Design Points

Conversion Modes: The application supports three execution modes controlled by converter.mode:

  • MERGED: Creates a single shapefile for all data matching batch-ids (ignores map-ids)
  • MAP_IDS: Processes only the map-ids specified in configuration (requires map-ids to be set)
  • RESOLVE: Queries the database for all distinct map-ids matching batch-ids, then processes each (avoids OS command-line length limits)
  • If mode is unspecified: defaults to MERGED if map-ids is empty, otherwise MAP_IDS
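
The mode rules above, including the default when converter.mode is unset, can be expressed in a few lines. This is a sketch under stated assumptions: the class and method names are hypothetical, map-ids are modeled as strings, and rejecting MAP_IDS without any map-ids is one plausible reading of "requires map-ids to be set" rather than confirmed behavior.

```java
import java.util.List;
import java.util.Locale;

public class ModeResolutionSketch {

  public enum Mode {
    MERGED,
    MAP_IDS,
    RESOLVE
  }

  /**
   * Picks the effective mode. An explicit value wins; when none is given,
   * an empty map-ids list means MERGED and a non-empty one means MAP_IDS.
   */
  public static Mode resolve(String configuredMode, List<String> mapIds) {
    boolean hasMapIds = mapIds != null && !mapIds.isEmpty();
    if (configuredMode == null || configuredMode.isBlank()) {
      return hasMapIds ? Mode.MAP_IDS : Mode.MERGED;
    }
    // Mode.valueOf throws IllegalArgumentException for unknown names.
    Mode mode = Mode.valueOf(configuredMode.trim().toUpperCase(Locale.ROOT));
    if (mode == Mode.MAP_IDS && !hasMapIds) {
      // Assumption: fail fast instead of silently producing no output.
      throw new IllegalArgumentException("MAP_IDS mode requires converter.map-ids to be set");
    }
    return mode;
  }
}
```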

Geometry Handling: Geometries pass through a two-step conversion (PostGIS to WKT, WKT to JTS) before being written:

  • PostGIS returns geometries as WKT (Well-Known Text) via ST_AsText(geometry) in SQL query
  • GeometryConverter parses WKT to JTS Geometry objects using WKTReader
  • ShapefileWriter uses JTS geometries with GeoTools to write shapefile artifacts (.shp, .shx, .dbf, .prj)

Shapefile Constraints:

  • Validates that all geometries are homogeneous (same type) via ShapefileConverterService.validateGeometries()
  • Shapefiles cannot contain mixed geometry types (e.g., cannot mix Polygon and Point)
  • Geometry type determined from first valid geometry in result set
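
The homogeneity rule can be illustrated without GeoTools on the classpath. The sketch below is a simplification, not the body of validateGeometries(): it works on raw WKT strings and compares their leading type keyword, whereas the real service presumably compares JTS Geometry objects. All names are hypothetical. Null or blank entries are skipped, and the first valid entry fixes the expected type.

```java
import java.util.List;
import java.util.Locale;

public class GeometryTypeCheckSketch {

  /** Extracts the leading type keyword of a WKT string, e.g. "POLYGON" from "POLYGON ((...))". */
  public static String wktType(String wkt) {
    String trimmed = wkt.trim();
    int end = 0;
    while (end < trimmed.length() && Character.isLetter(trimmed.charAt(end))) {
      end++;
    }
    return trimmed.substring(0, end).toUpperCase(Locale.ROOT);
  }

  /**
   * Returns the single geometry type shared by all non-blank entries.
   * The first valid entry is the reference; a differing later entry is an error.
   */
  public static String requireHomogeneous(List<String> wkts) {
    String expected = null;
    for (String wkt : wkts) {
      if (wkt == null || wkt.isBlank()) {
        continue; // skip invalid rows instead of failing on them
      }
      String type = wktType(wkt);
      if (expected == null) {
        expected = type;
      } else if (!expected.equals(type)) {
        throw new IllegalStateException(
            "Mixed geometry types: expected " + expected + " but found " + type);
      }
    }
    if (expected == null) {
      throw new IllegalStateException("No valid geometries found");
    }
    return expected;
  }
}
```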

Output Structure:

  • For MAP_IDS/RESOLVE mode: {output-base-dir}/{inference-id}/{map-id}/
  • For MERGED mode: {output-base-dir}/{inference-id}/merge/
  • Each directory contains: .shp, .shx, .dbf, .prj, .geojson, and .zip files
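
The directory layout above maps directly onto java.nio.file paths. The helper names below are hypothetical; only the path segments (base directory, inference ID, map ID or the literal "merge") come from this document. File names inside each directory are deliberately left out because they are not specified here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputPathSketch {

  /** Directory for one map_id in MAP_IDS or RESOLVE mode. */
  public static Path mapIdDir(String outputBaseDir, String inferenceId, String mapId) {
    return Paths.get(outputBaseDir, inferenceId, mapId);
  }

  /** Directory for the single merged output in MERGED mode. */
  public static Path mergedDir(String outputBaseDir, String inferenceId) {
    return Paths.get(outputBaseDir, inferenceId, "merge");
  }

  public static void main(String[] args) {
    System.out.println(mapIdDir("/data/model_output/export/", "ABC123", "35813030"));
    System.out.println(mergedDir("/data/model_output/export/", "ABC123"));
  }
}
```

Paths.get normalizes the trailing slash in output-base-dir, so the configured value can be used as-is.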

PostgreSQL Array Parameters: The repository uses PreparedStatementCreator to handle PostgreSQL array syntax:

Array batchIdsArray = con.createArrayOf("bigint", batchIds.toArray());
ps.setArray(1, batchIdsArray);

This enables WHERE batch_id = ANY(?) queries.
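
For context, the two lines above fit into a statement factory like the one sketched here, written against plain java.sql so it compiles without Spring. The table name inference_results is hypothetical and the select list is abbreviated; the column names and the ST_AsText call follow this document. The method has the same shape as Spring's PreparedStatementCreator.createPreparedStatement(Connection), so it can be handed to JdbcTemplate as a lambda.

```java
import java.sql.Array;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class ArrayParameterSketch {

  // Hypothetical table name; the single "?" is bound to a PostgreSQL bigint[] value.
  public static final String SQL =
      "SELECT uid, map_id, ST_AsText(geometry) AS wkt "
          + "FROM inference_results "
          + "WHERE batch_id = ANY(?)";

  /** Creates the statement and binds the whole batch ID list as one array parameter. */
  public static PreparedStatement prepare(Connection con, List<Long> batchIds)
      throws SQLException {
    Array batchIdsArray = con.createArrayOf("bigint", batchIds.toArray());
    PreparedStatement ps = con.prepareStatement(SQL);
    ps.setArray(1, batchIdsArray);
    return ps;
  }
}
```

Assuming a JdbcTemplate and a RowMapper are available, usage would look like jdbcTemplate.query(con -> ArrayParameterSketch.prepare(con, batchIds), rowMapper). Binding one array keeps the SQL text constant regardless of how many batch IDs are configured.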

GeoServer Integration:

  • Workspace 'cd' must be pre-created in GeoServer before registration
  • Uses environment variables GEOSERVER_USERNAME and GEOSERVER_PASSWORD for authentication
  • Supports automatic deletion and re-registration when overwrite-existing: true
  • Non-fatal: registration failures are logged but do not stop the application
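
As an illustration of the authentication and upload request, the sketch below builds (but does not send) an HTTP request with the JDK's java.net.http API. Several things here are assumptions: the project itself uses RestTemplate, the class and method names are hypothetical, and the endpoint shown is GeoServer's documented data store file upload (PUT .../datastores/{store}/file.shp with a zipped shapefile), which may differ from what GeoServerRegistrationService actually calls.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Base64;

public class GeoServerRequestSketch {

  /** Builds the value of an HTTP Basic Authorization header. */
  public static String basicAuth(String username, String password) {
    String token = username + ":" + password;
    return "Basic " + Base64.getEncoder().encodeToString(token.getBytes(StandardCharsets.UTF_8));
  }

  /** Builds a PUT request that uploads a zipped shapefile into the given workspace. */
  public static HttpRequest buildUpload(
      String baseUrl,
      String workspace,
      String storeName,
      String username,
      String password,
      byte[] zippedShapefile) {
    URI uri =
        URI.create(
            baseUrl + "/rest/workspaces/" + workspace + "/datastores/" + storeName + "/file.shp");
    return HttpRequest.newBuilder(uri)
        // Request timeout, loosely corresponding to geoserver.read-timeout (60000 ms).
        .timeout(Duration.ofMillis(60_000))
        .header("Authorization", basicAuth(username, password))
        .header("Content-Type", "application/zip")
        .PUT(HttpRequest.BodyPublishers.ofByteArray(zippedShapefile))
        .build();
  }
}
```

In practice the credentials would come from System.getenv("GEOSERVER_USERNAME") and System.getenv("GEOSERVER_PASSWORD"). To keep a failure non-fatal, the caller would wrap the send in a try/catch that logs the error and returns.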

Configuration

Configuration files are located in src/main/resources/:

  • application.yml: Base configuration (sets active profile)
  • application-prod.yml: Production database and converter settings
  • application-dev.yml: Development settings
  • application-local.yml: Local development settings

Converter Configuration

converter:
  inference-id: 'D5E46F60FC40B1A8BE0CD1F3547AA6'
  map-ids: []           # Optional: list of map_ids, or empty for merged mode
  batch-ids: [252, 253, 257]  # Required: batch ID filter
  mode: 'MERGED'        # Optional: MERGED, MAP_IDS, or RESOLVE
  output-base-dir: '/data/model_output/export/'
  crs: 'EPSG:5186'      # Korea 2000 / Central Belt 2010

GeoServer Configuration

geoserver:
  base-url: 'https://kamco.geo-dev.gs.dabeeo.com/geoserver'
  workspace: 'cd'
  overwrite-existing: true
  connection-timeout: 30000
  read-timeout: 60000
  username: 'admin'     # Optional: prefer environment variables
  password: 'geoserver' # Optional: prefer environment variables

Database Integration

Query Pattern

All queries filter by batch_id = ANY(?) and include after_c IS NOT NULL AND after_p IS NOT NULL to ensure data quality.

Primary queries:

  • findByMapId(batchIds, mapId): Retrieve records for a specific map_id
  • findByBatchIds(batchIds): Retrieve all records for batch_ids (merged mode)
  • findMapIdByBatchIds(batchIds): Query distinct map_ids for RESOLVE mode

Field Mapping

Database columns map to shapefile fields as follows (note: shapefile field names are limited to 10 characters by the DBF format):

| Database Column | DB Type | Shapefile Field | Shapefile Type |
|-----------------|---------|-----------------|----------------|
| uid             | uuid    | uid             | String         |
| map_id          | text    | map_id          | String         |
| probability     | float8  | chn_dtct_p      | String         |
| before_year     | bigint  | cprs_yr         | Long           |
| after_year      | bigint  | crtr_yr         | Long           |
| before_c        | text    | bf_cls_cd       | String         |
| before_p        | float8  | bf_cls_pro      | String         |
| after_c         | text    | af_cls_cd       | String         |
| after_p         | float8  | af_cls_pro      | String         |
| geometry        | geom    | the_geom        | Polygon        |

Note: Probability and classification probability fields are stored as Strings in shapefiles (converted via String.valueOf()) to preserve precision.
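
The mapping and the 10-character limit can be captured in a small lookup with a guard. This is a sketch, not the project's code: the class and method names are hypothetical, and returning null for a missing probability (rather than the string "null") is an assumption about how absent values should be written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldMappingSketch {

  /** DBF attribute names cannot exceed 10 characters. */
  public static final int MAX_FIELD_NAME_LENGTH = 10;

  /** Database column name to shapefile field name, in output order. */
  public static Map<String, String> columnToField() {
    Map<String, String> mapping = new LinkedHashMap<>();
    mapping.put("uid", "uid");
    mapping.put("map_id", "map_id");
    mapping.put("probability", "chn_dtct_p");
    mapping.put("before_year", "cprs_yr");
    mapping.put("after_year", "crtr_yr");
    mapping.put("before_c", "bf_cls_cd");
    mapping.put("before_p", "bf_cls_pro");
    mapping.put("after_c", "af_cls_cd");
    mapping.put("after_p", "af_cls_pro");
    mapping.put("geometry", "the_geom");
    return mapping;
  }

  /** Fails fast if any target field name would be rejected or truncated by the DBF writer. */
  public static void requireValidFieldNames(Map<String, String> mapping) {
    for (String field : mapping.values()) {
      if (field.length() > MAX_FIELD_NAME_LENGTH) {
        throw new IllegalArgumentException("Shapefile field name too long: " + field);
      }
    }
  }

  /** Converts a float8 value to its String form; null stays null (assumption). */
  public static String probabilityToField(Double value) {
    return value == null ? null : String.valueOf(value);
  }
}
```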

Coordinate Reference System

All geometries use EPSG:5186 (Korea 2000 / Central Belt 2010). The PostGIS geometry column is declared as geometry(Polygon, 5186), and GeoTools encodes this CRS in the output shapefile's .prj file.

Dependencies

Key libraries:

  • Spring Boot 3.5.7: Framework (DI, JDBC, web for RestTemplate)
  • GeoTools 30.0: Shapefile and GeoJSON generation (gt-shapefile, gt-referencing, gt-epsg-hsql, gt-geojson)
  • JTS 1.19.0: Java Topology Suite for geometry representation
  • PostGIS JDBC 2.5.1: PostgreSQL spatial extension support
  • PostgreSQL JDBC Driver: Database connectivity
  • HikariCP: Connection pooling

Important: javax.media:jai_core is globally excluded in build.gradle to avoid conflicts with GeoTools.