Update documentation and add architectural approach document
64
collab/proposals/gis-weather-proposal.md
Normal file
@@ -0,0 +1,64 @@
# GIS and Weather Data Processing Container Proposal

## Proposal Summary
Create specialized Docker containers for GIS data processing and weather data analysis to support CTO-mode R&D activities, particularly infrastructure planning and balloon path prediction for your TSYS Group projects.

## Business Rationale
GIS and weather data analysis are becoming increasingly important to your TSYS Group projects, particularly infrastructure planning (solar farms, building datasets) and balloon path prediction. These workloads need specialized containers that handle geospatial and weather data efficiently while staying consistent with existing infrastructure patterns. The containers will support:
- A self-hosted GIS stack for privacy and control
- Processing of public datasets (NOAA, European weather APIs, etc.)
- ETL workflows for both technical and business data processing
- Integration with MinIO for data output to business systems

## Technical Approach
- Follow the same disciplined container architecture as the documentation tools
- Use a layered approach with base and specialized containers
- Implement the same security patterns (non-root user, UID/GID mapping)
- Maintain consistent naming conventions
- Use the same operational patterns (wrapper scripts, etc.)
- Include PostGIS, DuckDB, and optional GPU support
- Implement MinIO integration for data output
- Support prototyping on workstations and production use on large servers
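
The non-root, UID/GID-mapped wrapper pattern listed above could look roughly like the following. This is a minimal sketch, not the existing wrapper scripts: the function name `build_docker_command`, the `/workspace` mount point, and the image name `tsys-gis-base:latest` are all assumptions made for illustration. It only builds the `docker run` argument list, so it can be inspected without Docker installed.

```python
import os
import shlex


def build_docker_command(image, workdir, args, uid=None, gid=None):
    """Build a `docker run` argv that runs as the caller's UID/GID."""
    # Default to the invoking user's IDs so files written to the mount
    # are owned by that user rather than root (POSIX hosts only).
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--rm",
        "--user", f"{uid}:{gid}",       # run as a non-root, host-mapped user
        "-v", f"{workdir}:/workspace",  # bind-mount the project directory
        "-w", "/workspace",             # start in the mounted directory
        image,
        *args,
    ]


if __name__ == "__main__":
    # "tsys-gis-base:latest" is a placeholder, not a published image.
    cmd = build_docker_command(
        "tsys-gis-base:latest", os.getcwd(), ["gdalinfo", "--version"]
    )
    print(shlex.join(cmd))
```

Passing the host UID/GID through `--user` is what keeps output files on the bind mount owned by the calling user on both workstations and servers.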

## Technology Stack
- **GIS Tools**: GDAL, Tippecanoe, DuckDB with the spatial extension
- **Database**: PostgreSQL/PostGIS client tools
- **Formats**: Shapefiles, Parquet, GRIB, GeoJSON
- **Weather**: cfgrib, xarray, MetPy
- **ETL**: Pandas, Dask, custom workflow tools
- **APIs**: NOAA, European weather APIs
- **Visualization**: Folium, Plotly, command-line tools
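
GeoJSON is the format in this stack that both Tippecanoe and Folium accept as input, so it is a likely hand-off point between processing and visualization. As a minimal sketch of that hand-off, the snippet below turns a list of `(lat, lon)` positions into a GeoJSON `LineString` feature using only the standard library. The function name `track_to_geojson` and the sample coordinates are hypothetical and do not come from any real prediction run.

```python
import json


def track_to_geojson(points, properties=None):
    """Convert (lat, lon) pairs into a GeoJSON LineString Feature."""
    if len(points) < 2:
        raise ValueError("a LineString needs at least two positions")
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            # GeoJSON (RFC 7946) orders each position as [longitude, latitude].
            "coordinates": [[lon, lat] for lat, lon in points],
        },
        "properties": dict(properties or {}),
    }


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    track = [(30.0, -97.0), (30.1, -96.8), (30.3, -96.5)]
    feature = track_to_geojson(track, {"name": "example-track"})
    print(json.dumps(feature, indent=2))
```

The coordinate swap is the detail most worth getting right: many data sources report latitude first, while GeoJSON expects longitude first.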

## Benefits
- Consistent environment across development (workstations) and production (large servers)
- Proper file permission handling across different systems
- Isolated tools prevent dependency conflicts
- Reproducible analysis environments for GIS and weather data
- Integration with documentation tools for CTO mode workflows
- Support for both technical (GIS/weather) and business (COO) ETL workflows
- Scalable architecture with optional GPU support
- Data output capability to MinIO buckets for business use

## Resource Requirements
- Development time: 3-4 weeks for complete implementation
- Storage: Additional container images (est. 3-6 GB each)
- Compute: Higher requirements for processing (can be isolated to CTO mode)
- Optional: GPU resources for performance-intensive tasks
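
The storage line can be turned into a rough total once the number of images is known. The sketch below simply multiplies the 3-6 GB per-image estimate by an image count. The count of four used in the example is an assumption for illustration, since the proposal does not fix how many images will be built, and `estimate_image_storage_gb` is a hypothetical helper name.

```python
def estimate_image_storage_gb(image_count, low_gb=3, high_gb=6):
    """Return (low, high) total storage in GB for `image_count` images."""
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return image_count * low_gb, image_count * high_gb


if __name__ == "__main__":
    # Four images is an assumed count, not a figure from the proposal.
    low, high = estimate_image_storage_gb(4)
    print(f"Estimated image storage: {low}-{high} GB")
```

Because Docker stores layers shared between images only once, a layered base/specialized design would likely land below the upper bound of this range.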

## Expected Outcomes
- Improved capability for spatial and weather data analysis
- Consistent environments across development and production systems
- Better integration with documentation workflows
- Faster setup for ETL projects (both technical and business)
- Efficient processing of large datasets using DuckDB and Parquet
- Proper data output to MinIO buckets for business use
- Reduced technical debt through consistent patterns

## Implementation Timeline
- Week 1: Base GIS container with PostGIS, DuckDB, and data format support
- Week 2: Base Weather container with GRIB support and API integration
- Week 3: Advanced processing containers with Jupyter and visualization
- Week 4: Optional GPU variants and MinIO integration testing

## Approval Request
Please review and approve this proposal to proceed with implementation of the GIS and weather data processing containers that will support your infrastructure planning and balloon path prediction work.