Data Flow Architecture

The platform processes all data through a standardized six-stage pipeline, from raw data collection to reporting and distribution.

Processing Pipeline

  1. Raw Data Collection
  2. Validation & QC
  3. Standardization
  4. Processing & Analysis
  5. Insights & Metrics
  6. Reporting & Distribution
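
The stage order above can be sketched as an ordered enumeration with a small runner. This is only an illustration of the sequencing, not the platform's implementation: the names Stage, next_stage, run_pipeline, and the handlers mapping are all hypothetical.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Pipeline stages, numbered in the order the pipeline lists them."""
    RAW_DATA_COLLECTION = 1
    VALIDATION_QC = 2
    STANDARDIZATION = 3
    PROCESSING_ANALYSIS = 4
    INSIGHTS_METRICS = 5
    REPORTING_DISTRIBUTION = 6


def next_stage(stage):
    """Return the stage that follows `stage`, or None after the last one."""
    if stage is Stage.REPORTING_DISTRIBUTION:
        return None
    return Stage(stage + 1)


def run_pipeline(record, handlers):
    """Pass `record` through each stage's handler in stage order.

    `handlers` maps a Stage to a function that takes and returns a record.
    Stages without a handler are skipped (a hypothetical convention).
    """
    for stage in Stage:
        handler = handlers.get(stage)
        if handler is not None:
            record = handler(record)
    return record
```

Keeping the order in one enumeration means a handler can be added or replaced for a single stage without touching the others.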

Stage 1: Raw Data Collection

Data is collected from multiple sources:

  • Satellite Imagery - Automated satellite data acquisition
  • IoT Sensors - Real-time sensor data streams
  • Field Observations - Manual data entry
  • Third-party APIs - External data sources

Stage 2: Validation & QC

Data quality assurance:

  • Format Validation - File format and structure checks
  • Data Quality Checks - Completeness and accuracy validation
  • Geospatial Validation - Coordinate system and boundary checks
  • Metadata Validation - Required metadata verification
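
As a rough illustration of the metadata and geospatial checks listed above, the sketch below collects problems for a single record. The record layout, the REQUIRED_METADATA field names, and the assumption that coordinates arrive as latitude/longitude in degrees are all hypothetical; the platform's actual rules may differ.

```python
REQUIRED_METADATA = ("source", "collected_at", "crs")  # hypothetical field names


def validate_record(record):
    """Return a list of problems found; an empty list means the record passed."""
    problems = []

    # Metadata validation: every required field must be present and non-empty.
    metadata = record.get("metadata", {})
    for field in REQUIRED_METADATA:
        if not metadata.get(field):
            problems.append(f"missing metadata field: {field}")

    # Geospatial validation: assumes lat/lon are given in degrees.
    lat, lon = record.get("lat"), record.get("lon")
    if lat is None or lon is None:
        problems.append("missing coordinates")
    else:
        if not -90.0 <= lat <= 90.0:
            problems.append(f"latitude out of range: {lat}")
        if not -180.0 <= lon <= 180.0:
            problems.append(f"longitude out of range: {lon}")

    return problems
```

Returning every problem at once, rather than stopping at the first, lets a data provider fix a rejected upload in a single pass.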

Stage 3: Standardization

Data normalization:

  • Format Standardization - Convert to internal formats
  • Coordinate System - Standardize to WGS84 (EPSG:4326)
  • Schema Mapping - Map to internal data models
  • Unit Conversion - Standardize measurement units
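
Two of the steps above, coordinate and unit standardization, can be sketched with the standard library alone. The example assumes the incoming coordinates are in Web Mercator (EPSG:3857) and that square metres are the internal area unit; both are illustrative assumptions, and a production system would more likely use a dedicated projection library.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used by Web Mercator


def web_mercator_to_wgs84(x, y):
    """Convert EPSG:3857 metres to (lon, lat) in degrees (EPSG:4326)."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


# Conversion factors to the assumed internal area unit (square metres).
AREA_TO_M2 = {"m2": 1.0, "ha": 10_000.0, "km2": 1_000_000.0}


def area_to_m2(value, unit):
    """Convert an area to square metres; raises KeyError for unknown units."""
    return value * AREA_TO_M2[unit]
```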

Stage 4: Processing & Analysis

Core analysis and computation:

  • Geospatial Processing - Spatial analysis and calculations
  • ML Model Inference - Run AI/ML models on data
  • Time-series Analysis - Temporal pattern analysis
  • Metric Calculation - Compute biodiversity and carbon metrics
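
The biodiversity metrics the platform computes are not specified here. As one common example of such a metric, the sketch below calculates the Shannon diversity index, H' = -sum(p_i * ln(p_i)), from a list of species observations. The function name and input shape are hypothetical.

```python
import math
from collections import Counter


def shannon_index(species_observations):
    """Shannon diversity H' over the proportions of each observed species."""
    counts = Counter(species_observations)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no observations, no measurable diversity
    return -sum((n / total) * math.log(n / total) for n in counts.values())
```

A site where every observation is the same species scores 0, and the index grows as observations spread more evenly across more species.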

Stage 5: Insights & Metrics

Generate actionable insights:

  • Aggregation - Project and site-level aggregations
  • Visualization Data - Prepare data for charts and maps
  • Trend Analysis - Identify patterns and trends
  • Risk Assessment - Calculate risk scores and categories
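
A minimal sketch of the aggregation and risk categorization steps follows. It assumes site-level metrics arrive as (project_id, site_id, value) tuples, that project-level values are simple means, and that risk scores lie between 0 and 1. The thresholds and category names are illustrative, not the platform's.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_project(site_metrics):
    """Average a per-site metric up to project level.

    `site_metrics` is an iterable of (project_id, site_id, value) tuples.
    """
    by_project = defaultdict(list)
    for project_id, _site_id, value in site_metrics:
        by_project[project_id].append(value)
    return {project_id: mean(values) for project_id, values in by_project.items()}


def risk_category(score):
    """Map a 0-1 risk score to a category (thresholds are illustrative)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score < 0.33:
        return "low"
    if score < 0.66:
        return "medium"
    return "high"
```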

Processing Times

Typical processing times vary by data type:

Data Type                     Typical Processing Time
Small audio files (< 5 MB)    2-5 minutes
Images                        1-3 minutes
Medium datasets               10-30 minutes
Large satellite jobs          1-4 hours
Complex analyses              4-24 hours

Processing status can be tracked via the processing_status and processing_message fields in API responses.
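
Because jobs can run from minutes to hours, a client will usually poll these fields instead of blocking on one request. The sketch below shows one way to do that. Only processing_status and processing_message come from the API; the status values "completed" and "failed", the caller-supplied fetch function, and the default intervals are assumptions made for illustration.

```python
import time


def wait_for_processing(fetch, poll_seconds=30.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll until processing finishes, fails, or the timeout elapses.

    `fetch` is a caller-supplied function that returns the API response as a
    dict containing `processing_status` and `processing_message`.
    """
    terminal = {"completed", "failed"}  # assumed status values
    waited = 0.0
    while True:
        response = fetch()
        status = response.get("processing_status")
        if status in terminal:
            return status, response.get("processing_message")
        if waited >= timeout_seconds:
            raise TimeoutError(f"still {status!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Passing fetch in as a function keeps the sketch independent of any particular HTTP client or endpoint. The timeout should be chosen with the processing times above in mind, since large satellite jobs and complex analyses can take hours.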