Sentinel-2 Water Quality Processing - Workflow Documentation

Overview

This document provides detailed technical information about the Sentinel-2 water quality processing workflow. For quick start instructions, see the main README.md.

The system processes Level-1C Sentinel-2 satellite imagery to extract three water quality parameters: Chlorophyll-a (CHL), Total Suspended Matter (TSM), and Colored Dissolved Organic Matter (CDOM). CHL and TSM come directly from the C2RCC (Case-2 Regional Coast Colour) atmospheric correction algorithm; CDOM is calculated with band math on the C2RCC reflectances.

Understanding the Science? For comprehensive background on water quality monitoring, the C2RCC algorithm, and the scientific basis for the processing steps, see Scientific Background & Theory.

Table of Contents

  1. Detailed Directory Structure

  2. Technical Configuration

  3. Processing Pipeline Details

  4. Advanced Usage

  5. Algorithm Details

  6. File Formats and Outputs

  7. Performance Optimization

Detailed Directory Structure

The workflow keeps scripts, configuration, data, and outputs in numbered directories that follow the processing order:

Sentinel2_WQ/
├── 01_scripts/                    # All processing scripts
│   ├── download.py                # Data download script
│   ├── process_pipeline.py        # Main processing pipeline
│   ├── plotting.py                # Visualization script
│   └── utils.py                   # Utility functions
│
├── 02_config/                     # Configuration files
│   ├── parameters.yaml            # Main configuration
│   └── snap_graphs/               # SNAP processing graphs
│       ├── resample_subset.xml    # Resample and subset graph
│       ├── reproject.xml          # Reprojection parameters
│       ├── c2rcc_param.xml        # C2RCC parameters
│       ├── cdom_band_math.xml     # CDOM calculation graph
│       └── rgb_profile_s2.rgb     # RGB profile for true color
│
├── 03_raw_data/                   # Downloaded raw data
│   └── sentinel2_l1c/             # Sentinel-2 L1C zip files
│
├── 04_processed_data/             # Intermediate processed data
│   ├── l2a_resampled/             # Resampled and subset data
│   ├── l2a_reprojected/           # Reprojected data
│   ├── c2rcc_output/              # C2RCC processed data
│   └── cdom_output/               # CDOM calculated data
│
├── 05_final_products/             # Final output products
│   ├── chl/                       # Chlorophyll-a maps
│   ├── tsm/                       # Total Suspended Matter maps
│   ├── cdom/                      # CDOM maps
│   └── true_color/                # True color images
│
├── 06_logs/                       # Processing logs
├── 07_documentation/              # Documentation and metadata
├── requirements.txt               # Python dependencies
├── run_workflow.py                # Master workflow script
├── run_workflow.bat               # Windows batch script
└── workflow_demo.ipynb            # Jupyter notebook demo

Technical Configuration

For basic setup instructions, see the main README.md and GETTING_STARTED.md.

Main Configuration File: 02_config/parameters.yaml

# Study Area Configuration
study_area:
  name: "Western Australia Coast"
  wkt_geometry: "POLYGON ((115.54 -31.93, 115.79 -31.93, 115.78 -32.26, 115.53 -32.26, 115.54 -31.93))"
  subset_geometry: "POLYGON ((115.40 -31.95, 115.80 -31.95, 115.80 -32.30, 115.40 -32.30, 115.40 -31.95))"

# Data Download Configuration
download:
  copernicus_user: "your_username@email.com"
  copernicus_password: "your_password"
  cloud_cover_threshold: 10
  default_start_date: "2025-05-01"
  default_end_date: "2025-06-30"

# Processing Parameters
processing:
  c2rcc:
    salinity: 35.0
    temperature: 30.0
    valid_pixel_expression: "B8 > 0 && B8 < 0.1"
  
  cdom:
    expression: "exp(0.544 * log(rhown_B1)-0.571 * log(rhown_B2)-2.181*log(rhown_B3)+1.398*log(rhown_B4)-1.406)"
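One quick sanity check on this configuration is whether subset_geometry actually encloses wkt_geometry. The sketch below is not part of the workflow scripts: wkt_bbox and bbox_contains are hypothetical helper names, and the check assumes simple single-ring POLYGON strings like the ones above and compares bounding boxes only.

```python
import re

def wkt_bbox(wkt):
    """Return (min_lon, min_lat, max_lon, max_lat) for a single-ring WKT POLYGON."""
    # Each vertex is written as "<lon> <lat>"; vertices are separated by commas.
    pairs = re.findall(r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)", wkt)
    if not pairs:
        raise ValueError("no coordinates found in WKT string")
    lons = [float(lon) for lon, _ in pairs]
    lats = [float(lat) for _, lat in pairs]
    return min(lons), min(lats), max(lons), max(lats)

def bbox_contains(outer, inner):
    """True if the outer bounding box fully encloses the inner one."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

# Values copied from the example parameters.yaml above.
study_area = ("POLYGON ((115.54 -31.93, 115.79 -31.93, 115.78 -32.26, "
              "115.53 -32.26, 115.54 -31.93))")
subset = ("POLYGON ((115.40 -31.95, 115.80 -31.95, 115.80 -32.30, "
          "115.40 -32.30, 115.40 -31.95))")

print(bbox_contains(wkt_bbox(subset), wkt_bbox(study_area)))  # prints False
```

With the example values this prints False, because the study area's northern edge (-31.93) lies slightly north of the subset's northern edge (-31.95). Whether that offset is intentional is not stated here, so treat the result as a prompt to review the two polygons rather than as a confirmed error.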

Multi-tile and Single-tile Handling

The workflow selects its processing path based on how many Sentinel-2 tiles cover your study area:

🟢 Single-Tile Coverage

When your study area falls entirely within a single Sentinel-2 tile:

Processing: C2RCC → CDOM Calculation → Plotting
Benefits:
  - No mosaic processing (30-50% faster)
  - Saves ~30-50% disk space
  - Direct output to final products

🟠 Multi-Tile Coverage

When your study area spans multiple Sentinel-2 tiles:

Processing: C2RCC → Mosaic → CDOM Calculation → Plotting
Benefits:
  - Seamless coverage across tiles
  - Single unified product per date
  - Automatic tile stitching

Automatic Detection: The workflow counts the tiles available for each date and selects the single-tile or multi-tile path accordingly, so no manual configuration is needed.
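A minimal sketch of how such detection could work, assuming products keep the standard Sentinel-2 file naming (sensing date and tile ID embedded in the name). The function names and the example file names are invented for illustration; the actual logic in process_pipeline.py may differ.

```python
import re
from collections import defaultdict

# Standard Sentinel-2 product name layout: mission, level, sensing time,
# processing baseline, relative orbit, tile ID, product discriminator.
S2_NAME = re.compile(
    r"^S2[A-C]_MSIL1C_(?P<date>\d{8})T\d{6}_N\d{4}_R\d{3}"
    r"_T(?P<tile>[0-9A-Z]{5})_\d{8}T\d{6}"
)

def tiles_by_date(product_names):
    """Group tile IDs by sensing date (YYYYMMDD); unparseable names are skipped."""
    grouped = defaultdict(set)
    for name in product_names:
        match = S2_NAME.match(name)
        if match:
            grouped[match["date"]].add(match["tile"])
    return dict(grouped)

def needs_mosaic(product_names):
    """Map each sensing date to True when more than one tile covers it."""
    return {date: len(tiles) > 1
            for date, tiles in tiles_by_date(product_names).items()}

# Invented example names: two tiles on one date, a single tile on another.
products = [
    "S2A_MSIL1C_20250501T021341_N0511_R003_T50HLK_20250501T040832.zip",
    "S2A_MSIL1C_20250501T021341_N0511_R003_T50HMK_20250501T040832.zip",
    "S2B_MSIL1C_20250506T021339_N0511_R003_T50HLK_20250506T035510.zip",
]
print(needs_mosaic(products))  # {'20250501': True, '20250506': False}
```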

Directory Structure for Multi-tile Processing

04_processed_data/
├── c2rcc_output/              # Individual tile products (all scenarios)
├── mosaic_output/             # Multi-tile mosaics (created automatically if needed)
└── cdom_output/               # Final CDOM products

05_final_products/
├── chl/                       # Chlorophyll plots
├── tsm/                       # TSM plots
├── cdom/                      # CDOM plots
└── true_color/                # RGB composites

Workflow Steps

Step 1: Data Download

Downloads Sentinel-2 L1C data from the Copernicus Data Space Ecosystem based on:

  • Date range

  • Study area boundary

  • Cloud cover threshold

  • Product type (L1C)
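The four criteria above map naturally onto a single catalogue filter. The sketch below builds such a filter string in the style of the Copernicus Data Space Ecosystem OData catalogue; it makes no network call. It is an assumption that download.py queries the catalogue this way, build_odata_filter is a hypothetical name, and the attribute names (cloudCover, productType, S2MSI1C) should be checked against the current catalogue documentation before use.

```python
def build_odata_filter(wkt, start_date, end_date, max_cloud,
                       product_type="S2MSI1C"):
    """Combine area, date range, cloud cover, and product type into one filter.

    Dates are 'YYYY-MM-DD' strings; both ends of the range are inclusive.
    """
    clauses = [
        "Collection/Name eq 'SENTINEL-2'",
        f"OData.CSC.Intersects(area=geography'SRID=4326;{wkt}')",
        f"ContentDate/Start ge {start_date}T00:00:00.000Z",
        f"ContentDate/Start le {end_date}T23:59:59.999Z",
        "Attributes/OData.CSC.DoubleAttribute/any(att:att/Name eq 'cloudCover' "
        f"and att/OData.CSC.DoubleAttribute/Value le {max_cloud:.2f})",
        "Attributes/OData.CSC.StringAttribute/any(att:att/Name eq 'productType' "
        f"and att/OData.CSC.StringAttribute/Value eq '{product_type}')",
    ]
    return " and ".join(clauses)

# Values taken from the example parameters.yaml.
odata_filter = build_odata_filter(
    wkt="POLYGON ((115.54 -31.93, 115.79 -31.93, 115.78 -32.26, "
        "115.53 -32.26, 115.54 -31.93))",
    start_date="2025-05-01",
    end_date="2025-06-30",
    max_cloud=10,
)
print(odata_filter)
```

The resulting string would be passed as the $filter query parameter of a catalogue search request.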

Step 2: Resample and Subset

  • Resamples all bands to 10 m resolution using B2 as reference

  • Subsets data to study area extent

  • Outputs BEAM-DIMAP format
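SNAP graphs such as resample_subset.xml are typically run through the gpt command-line tool. The sketch below only assembles the command; build_gpt_command is a hypothetical helper, the file names are invented, and the parameter names input and wkt are assumptions that would only work if the graph declares ${input} and ${wkt} placeholders. A graph that contains its own Write node may not need the -t and -f options.

```python
import shlex

def build_gpt_command(graph_xml, target, params, fmt="BEAM-DIMAP"):
    """Assemble a gpt call: graph file, -P parameters, target file, format."""
    cmd = ["gpt", graph_xml]
    cmd += [f"-P{key}={value}" for key, value in params.items()]
    cmd += ["-t", target, "-f", fmt]
    return cmd

cmd = build_gpt_command(
    graph_xml="02_config/snap_graphs/resample_subset.xml",
    target="04_processed_data/l2a_resampled/example_resampled.dim",
    params={
        "input": "03_raw_data/sentinel2_l1c/example_product.zip",
        "wkt": "POLYGON ((115.40 -31.95, 115.80 -31.95, 115.80 -32.30, "
               "115.40 -32.30, 115.40 -31.95))",
    },
)
# shlex.join quotes arguments containing spaces so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would execute it and raise an error if gpt exits with a non-zero status.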

Step 3: Reproject

  • Reprojects data to the WGS84 coordinate system

  • Maintains spatial resolution and extent

Step 4: True Color Generation

  • Generates RGB true color images

  • Uses predefined RGB profile for Sentinel-2

  • Outputs PNG format

Step 5: C2RCC Processing

  • Applies Case-2 Regional Coast Colour (C2RCC) atmospheric correction

  • Calculates water quality parameters (CHL, TSM)

  • Processes individual tiles (single or multiple per date)

  • Outputs NetCDF format

Step 6: Mosaic Processing (Conditional)

Mosaicking depends on the number of tiles available for each date:

  • Multiple tiles for same date: Automatically creates mosaic

    • Combines tiles into single seamless dataset

    • Outputs to mosaic_output/

    • Used for subsequent CDOM calculation

  • Single tile: Skips mosaic processing

    • Directly uses C2RCC output for water quality calculation

    • Saves processing time and disk space

Step 7: CDOM Calculation (Adaptive)

  • Calculates CDOM using band math on atmospherically corrected data

  • Adapts to data source: Automatically uses mosaic if available, otherwise C2RCC

  • Uses empirical algorithm with reflectance bands

  • Outputs NetCDF format
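The "mosaic if available, otherwise C2RCC" rule can be sketched as a simple file lookup. This is an illustration rather than the pipeline's actual code: select_cdom_input is a hypothetical name, and it assumes output files end in .nc and carry the date as YYYYMMDD somewhere in the file name.

```python
from pathlib import Path

def _first_match(directory, pattern):
    """Return the first file matching pattern, or None if the directory is missing or empty."""
    if not directory.is_dir():
        return None
    matches = sorted(directory.glob(pattern))
    return matches[0] if matches else None

def select_cdom_input(processed_dir, date_str):
    """Prefer a mosaic for the given date; fall back to the C2RCC tile output."""
    processed_dir = Path(processed_dir)
    pattern = f"*{date_str}*.nc"
    mosaic = _first_match(processed_dir / "mosaic_output", pattern)
    if mosaic is not None:
        return mosaic
    return _first_match(processed_dir / "c2rcc_output", pattern)
```

For example, select_cdom_input("04_processed_data", "20250501") returns a path under mosaic_output/ when a mosaic exists for that date, a path under c2rcc_output/ when only a tile product exists, and None when neither is found.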

Step 8: Visualization

  • Generates publication-ready plots

  • Applies custom colormaps for each parameter

  • Reads from both C2RCC and mosaic outputs as needed

  • Outputs high-resolution PNG files

Usage Examples

Complete Workflow

# Run complete workflow
python run_workflow.py --action full

# Run with specific date range
python run_workflow.py --action full --start-date 2025-05-01 --end-date 2025-06-30

# Run after cleaning previously processed data
python run_workflow.py --action full --clean

Individual Steps

# Download data only
python run_workflow.py --action download --start-date 2025-05-01 --end-date 2025-06-30

# Process data only
python run_workflow.py --action process

# Generate plots only
python run_workflow.py --action plot
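For readers who want to see how these flags fit together, here is a minimal argparse sketch that accepts the same options as the commands above. It is not the source of run_workflow.py; the default action and the help texts are assumptions.

```python
import argparse

def build_parser():
    """Parser mirroring the flags used in the usage examples."""
    parser = argparse.ArgumentParser(
        description="Sentinel-2 water quality workflow (illustrative sketch)")
    parser.add_argument("--action",
                        choices=["full", "download", "process", "plot"],
                        default="full",  # assumed default
                        help="which part of the workflow to run")
    parser.add_argument("--start-date", help="first sensing date, YYYY-MM-DD")
    parser.add_argument("--end-date", help="last sensing date, YYYY-MM-DD")
    parser.add_argument("--clean", action="store_true",
                        help="remove previously processed data first")
    return parser

args = build_parser().parse_args(
    ["--action", "download",
     "--start-date", "2025-05-01", "--end-date", "2025-06-30"])
print(args.action, args.start_date, args.end_date, args.clean)
# prints: download 2025-05-01 2025-06-30 False
```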

Using Individual Scripts

# Download data
python 01_scripts/download.py --config 02_config/parameters.yaml

# Process data
python 01_scripts/process_pipeline.py --config 02_config/parameters.yaml

# Generate plots
python 01_scripts/plotting.py --config 02_config/parameters.yaml

Fresh Repository Setup

For new users starting with a fresh repository:

python setup_fresh_repo.py

This will:

  • Create the complete directory structure

  • Set up configuration templates

  • Create placeholder README files

  • Prepare the repository for first use

After setup, configure your credentials:

# Edit configuration with your Copernicus credentials
# Update 02_config/parameters.yaml with your study area and credentials

Troubleshooting

Common Issues

  1. SNAP GPT not found

    • Ensure SNAP is installed and the gpt command is on your PATH

    • Test with: gpt --help

  2. Python import errors

    • Install missing packages: pip install -r requirements.txt

    • Check Python version: python --version

  3. Authentication errors

    • Verify Copernicus credentials in configuration

    • Check internet connectivity

  4. Processing errors

    • Check log files in 06_logs/

    • Verify input data exists

    • Check disk space

Log Files

Processing logs are stored in 06_logs/ with timestamps:

  • processing_YYYYMMDD_HHMMSS.log - Processing pipeline logs

  • master_YYYYMMDD_HHMMSS.log - Master workflow logs
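Because the timestamps are zero-padded, sorting the file names alphabetically also sorts them chronologically, which makes it easy to find the most recent log when troubleshooting. The helpers below are a sketch with hypothetical names (make_log_path, latest_log); they assume only the naming pattern listed above.

```python
from datetime import datetime
from pathlib import Path

def make_log_path(log_dir, prefix, now=None):
    """Build a log path such as 06_logs/processing_20250501_143005.log."""
    now = now or datetime.now()
    return Path(log_dir) / f"{prefix}_{now:%Y%m%d_%H%M%S}.log"

def latest_log(log_dir, prefix):
    """Return the newest log for a prefix, or None if there is none."""
    logs = sorted(Path(log_dir).glob(f"{prefix}_*.log"))
    return logs[-1] if logs else None

example = make_log_path("06_logs", "processing", datetime(2025, 5, 1, 14, 30, 5))
print(example.name)  # processing_20250501_143005.log
```

Calling latest_log("06_logs", "master") would then return the path of the most recent master workflow log.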

Technical Details

Water Quality Parameters

  1. Chlorophyll-a (CHL)

    • Units: mg/m³

    • Range: 0.01 - 20.0 mg/m³

    • Colormap: Custom 21-color scale

    • Source: C2RCC conc_chl variable

  2. Total Suspended Matter (TSM)

    • Units: g/m³

    • Range: 0 - 4 g/m³

    • Colormap: cmocean turbid

    • Source: C2RCC conc_tsm variable

  3. Colored Dissolved Organic Matter (CDOM)

    • Units: m⁻¹

    • Range: 0 - 4 m⁻¹

    • Colormap: YlOrBr

    • Source: Calculated using band math

CDOM Algorithm

The CDOM algorithm uses the following empirical relationship:

CDOM = exp(0.544 * log(rhown_B1) - 0.571 * log(rhown_B2) - 2.181 * log(rhown_B3) + 1.398 * log(rhown_B4) - 1.406)

Where:

  • rhown_B1, rhown_B2, rhown_B3, and rhown_B4 are the normalized water-leaving reflectances output by C2RCC

  • Coefficients are derived from regional calibration
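As a worked illustration, the same relationship can be evaluated for a single pixel in plain Python. Because it is log-linear, it is equivalent to the power-law product exp(-1.406) * B1^0.544 * B2^-0.571 * B3^-2.181 * B4^1.398. The function name cdom is hypothetical, and returning NaN for non-positive reflectances is an assumption made here because the logarithm is undefined there; how the SNAP band math graph treats such pixels is not stated in this document.

```python
import math

# Coefficients copied from the expression above.
COEFFS = {"B1": 0.544, "B2": -0.571, "B3": -2.181, "B4": 1.398}
INTERCEPT = -1.406

def cdom(rhown_b1, rhown_b2, rhown_b3, rhown_b4):
    """Evaluate the empirical CDOM relationship for one pixel."""
    bands = {"B1": rhown_b1, "B2": rhown_b2, "B3": rhown_b3, "B4": rhown_b4}
    # log() is undefined for zero or negative reflectance (assumed: mask as NaN).
    if any(value <= 0 for value in bands.values()):
        return math.nan
    log_sum = sum(COEFFS[name] * math.log(value) for name, value in bands.items())
    return math.exp(log_sum + INTERCEPT)

# Arbitrary illustrative reflectances, not measured data.
print(cdom(0.010, 0.012, 0.015, 0.005))
```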

C2RCC Parameters

Key parameters for Australian coastal waters:

  • Salinity: 35.0 psu

  • Temperature: 30.0 °C

  • Valid pixel expression: B8 > 0 && B8 < 0.1

  • TSM factor: 1.06

  • CHL factor: 21.0

Performance Optimization

  1. Parallel Processing: SNAP operations use available CPU cores

  2. Memory Management: Large datasets processed in chunks

  3. Disk I/O: Intermediate files stored on fast storage

  4. Caching: Processed data cached to avoid recomputation

Quality Control

Data Quality Checks

  1. Cloud Filtering: Products with more than 10% cloud cover (cloud_cover_threshold) are excluded

  2. Valid Pixel Filtering: Invalid pixels masked using C2RCC flags

  3. Range Validation: Parameter values outside physical ranges excluded

  4. Spatial Consistency: Obvious outliers identified and flagged
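The range validation check can be sketched as follows. The limits are the ranges listed under Water Quality Parameters; using those as the validation limits is an assumption (they may only be display ranges), and mask_out_of_range is a hypothetical name.

```python
import math

# Assumed validation limits, taken from the documented parameter ranges.
VALID_RANGES = {
    "chl": (0.01, 20.0),   # mg/m^3
    "tsm": (0.0, 4.0),     # g/m^3
    "cdom": (0.0, 4.0),    # 1/m
}

def mask_out_of_range(values, parameter):
    """Replace values outside the parameter's range with NaN.

    NaN inputs stay NaN because every comparison with NaN is False.
    """
    low, high = VALID_RANGES[parameter]
    return [v if low <= v <= high else math.nan for v in values]

print(mask_out_of_range([0.005, 1.2, 25.0], "chl"))  # [nan, 1.2, nan]
```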

Output Validation

  1. File Integrity: NetCDF files checked for corruption

  2. Metadata Validation: Ensure all required attributes present

  3. Statistical Summary: Basic statistics logged for each parameter

  4. Visual Inspection: Sample plots generated for quality assessment
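A statistical summary of the kind described in item 3 might look like the sketch below, which ignores masked (NaN) pixels. The function name summarize and the choice of statistics are assumptions; the workflow's own logging may report different fields.

```python
import math
import statistics

def summarize(values):
    """Basic statistics over valid (non-NaN) pixels."""
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        return {"count": 0}
    return {
        "count": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": statistics.fmean(valid),
        "median": statistics.median(valid),
    }

print(summarize([1.0, 2.0, math.nan, 3.0]))
# {'count': 3, 'min': 1.0, 'max': 3.0, 'mean': 2.0, 'median': 2.0}
```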

Future Enhancements

  1. Multi-temporal Analysis: Time series analysis capabilities

  2. Machine Learning: Advanced classification algorithms

  3. Real-time Processing: Automated processing of new acquisitions

  4. Web Interface: Browser-based visualization and analysis

  5. API Integration: RESTful API for programmatic access


Support

For technical support and questions:

  • Check documentation in 07_documentation/

  • Review log files in 06_logs/

  • Consult SNAP documentation: https://step.esa.int/main/doc/

  • ESA Sentinel-2 resources: https://sentinel.esa.int/web/sentinel/missions/sentinel-2