Update documentation and add architectural approach document
@@ -26,7 +26,7 @@ This document tracks the various agents, tools, and systems used in the AIOS-Pub
 - mdbook-pdf (installed via Cargo)
 - Typst
 - Marp CLI
-- Markwhen: Interactive text-to-timeline tool
+- Wandmalfarbe pandoc-latex-template: Beautiful Eisvogel LaTeX template for professional PDF generation
 - Spell/Grammar checking:
   - Hunspell (with en-US dictionary)
   - Aspell (with en dictionary)
@@ -95,8 +95,9 @@ docker-compose up --build
 # Spell checking with hunspell
 ./docker-compose-wrapper.sh run docmaker-full hunspell -d en_US document.md
 
-# Create timeline with Markwhen
-./docker-compose-wrapper.sh run docmaker-full markwhen input.mw --output output.html
+# Create timeline with Markwhen (not currently available)
+# This will be enabled when Markwhen installation issue is resolved
+# ./docker-compose-wrapper.sh run docmaker-full markwhen input.mw --output output.html
 
 # Grammar/style checking with Vale
 ./docker-compose-wrapper.sh run docmaker-full vale document.md
@@ -46,8 +46,15 @@ RUN curl -L https://github.com/typst/typst/releases/latest/download/typst-x86_64
 # Install Marp CLI
 RUN npm install -g @marp-team/marp-cli
 
-# Install Markwhen
-RUN npm install -g @markwhen/cli
+# Install Wandmalfarbe pandoc-latex-template for beautiful PDF generation
+RUN git clone --depth 1 https://github.com/Wandmalfarbe/pandoc-latex-template.git /tmp/pandoc-latex-template && \
+    mkdir -p /root/.local/share/pandoc/templates && \
+    # Find and copy any .latex template files to the templates directory
+    find /tmp/pandoc-latex-template -name "*.latex" -exec cp {} /root/.local/share/pandoc/templates/ \; && \
+    # Also install to system-wide location for all users
+    mkdir -p /usr/share/pandoc/templates && \
+    find /tmp/pandoc-latex-template -name "*.latex" -exec cp {} /usr/share/pandoc/templates/ \; && \
+    rm -rf /tmp/pandoc-latex-template
 
 # Install spell/grammar checking tools
 RUN apt-get update && apt-get install -y \
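The template-install step in that Dockerfile hunk can be sketched in pure Python to show what the `find … -exec cp` pipeline does: locate every `*.latex` file in the cloned repository and copy it into each pandoc template directory. This is an illustrative sketch, not part of the build; the function name is made up.

```python
# Sketch of the Dockerfile's template-copy logic: collect all .latex files
# from the cloned template repo and install them into each destination
# directory. Hypothetical helper, mirroring the `find ... -exec cp` step.
import shutil
from pathlib import Path


def install_latex_templates(repo_dir: str, *dests: str) -> int:
    """Copy every .latex template under repo_dir into each destination; return count copied."""
    copied = 0
    for dest in dests:
        Path(dest).mkdir(parents=True, exist_ok=True)
        for tpl in Path(repo_dir).rglob("*.latex"):
            shutil.copy(tpl, dest)
            copied += 1
    return copied
```

In the actual image the destinations are `/root/.local/share/pandoc/templates` and `/usr/share/pandoc/templates`, both of which pandoc searches for `--template eisvogel`.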
@@ -60,7 +67,7 @@ RUN curl -L https://github.com/errata-ai/vale/releases/download/v3.12.0/vale_3.1
     | tar xz -C /tmp && cp /tmp/vale /usr/local/bin && chmod +x /usr/local/bin/vale
 
 # Install text statistics tool for reading time estimation
-RUN pip3 install mdstat textstat
+RUN pip3 install --break-system-packages mdstat textstat
 
 # Install additional text processing tools
 RUN apt-get update && apt-get install -y \
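For context on what "reading time estimation" means here, a minimal sketch of the statistic such tools report — word count divided by an assumed reading speed. The 200 words-per-minute figure and the function are illustrative assumptions, not how mdstat or textstat are actually implemented.

```python
# Minimal reading-time estimate: word count over an assumed reading speed.
# Illustrative sketch only; not mdstat/textstat internals.
import math

WORDS_PER_MINUTE = 200  # assumed average reading speed


def reading_time_minutes(text: str) -> int:
    """Estimate reading time in whole minutes (rounded up, minimum 1)."""
    words = len(text.split())
    return max(1, math.ceil(words / WORDS_PER_MINUTE))
```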
@@ -18,11 +18,11 @@ The RCEO-AIOS-Public-Tools-DocMaker-Base container is designed for lightweight d
 
 ### Documentation Generation
 - **Pandoc**: Universal document converter
+- **Wandmalfarbe pandoc-latex-template**: Beautiful Eisvogel LaTeX template for professional PDFs
 - **mdBook**: Create books from Markdown files
 - **mdbook-pdf**: PDF renderer for mdBook
 - **Typst**: Modern typesetting system
 - **Marp CLI**: Create presentations from Markdown
-- **Markwhen**: Interactive text-to-timeline tool
 
 ### LaTeX
 - **TeX Live**: Lightweight LaTeX packages for basic document typesetting
@@ -51,6 +51,9 @@ cd /home/localuser/AIWorkspace/AIOS-Public/Docker/RCEO-AIOS-Public-Tools-DocMake
 # Example: Convert a Markdown file to PDF using pandoc
 ./docker-compose-wrapper.sh run docmaker-base pandoc input.md -o output.pdf
 
+# Example: Create beautiful PDF using Eisvogel template
+./docker-compose-wrapper.sh run docmaker-base pandoc input.md --template eisvogel -o output.pdf
+
 # Example: Create a timeline with Markwhen
 ./docker-compose-wrapper.sh run docmaker-base markwhen input.mw --output output.html
 ```
@@ -43,14 +43,15 @@ This document tracks potential enhancements and tools to be added to the documen
 - ✅ Core system packages (bash, curl, wget, git)
 - ✅ Programming languages (Python 3, Node.js, Rust)
 - ✅ Pandoc - Universal document converter
+- ✅ Wandmalfarbe pandoc-latex-template - Beautiful Eisvogel LaTeX template for professional PDFs
 - ✅ mdBook - Create books from Markdown files
 - ✅ mdbook-pdf - PDF renderer for mdBook
 - ✅ Typst - Modern typesetting system
 - ✅ Marp CLI - Create presentations from Markdown
-- ✅ Markwhen - Interactive text-to-timeline tool
+- ⏳ Markwhen - Interactive text-to-timeline tool (installation failed, needs fix)
 - ✅ Light LaTeX packages (texlive-latex-base)
 - ✅ Spell/grammar checking tools (Hunspell, Aspell, Vale)
-- ✅ Text statistics tools (mdstat)
+- ✅ Text statistics tools (mdstat, textstat)
 - ✅ Non-root user management with UID/GID mapping
 - ✅ Entrypoint script for runtime user creation
@@ -41,6 +41,9 @@ cd /home/localuser/AIWorkspace/AIOS-Public/Docker/RCEO-AIOS-Public-Tools-DocMake
 # Example: Run Python analysis
 ./docker-compose-wrapper.sh run docmaker-computational python analysis.py
 
+# Example: Convert a Markdown file to beautiful PDF using Eisvogel template
+./docker-compose-wrapper.sh run docmaker-computational pandoc input.md --template eisvogel -o output.pdf --pdf-engine=xelatex
+
 # Example: Start Jupyter notebook server
 ./docker-compose-wrapper.sh up
 # Then access at http://localhost:8888
@@ -29,6 +29,9 @@ cd /home/localuser/AIWorkspace/AIOS-Public/Docker/RCEO-AIOS-Public-Tools-DocMake
 
 # Example: Convert a Markdown file to PDF using pandoc with full LaTeX
 ./docker-compose-wrapper.sh run docmaker-full pandoc input.md -o output.pdf --pdf-engine=xelatex
+
+# Example: Create beautiful PDF using Eisvogel template
+./docker-compose-wrapper.sh run docmaker-full pandoc input.md --template eisvogel -o output.pdf --pdf-engine=xelatex
 ```
 
 ### Using with docker-compose directly
@@ -26,6 +26,9 @@ cd /home/localuser/AIWorkspace/AIOS-Public/Docker/RCEO-AIOS-Public-Tools-DocMake
 
 # Example: Convert a Markdown file to PDF using pandoc
 ./docker-compose-wrapper.sh run docmaker-light pandoc input.md -o output.pdf
+
+# Example: Create beautiful PDF using Eisvogel template
+./docker-compose-wrapper.sh run docmaker-light pandoc input.md --template eisvogel -o output.pdf
 ```
 
 ### Using with docker-compose directly
@@ -19,3 +19,8 @@ Additional Rules:
 - Create thin wrapper scripts that detect and handle UID/GID mapping to ensure file permissions work across any host environment.
 - Maintain disciplined naming and organization to prevent technical debt as the number of projects grows.
 - Keep the repository root directory clean. Place all project-specific files and scripts in appropriate subdirectories rather than at the top level.
+- Use conventional commits for all git commits with proper formatting: type(scope): brief description, followed by a more verbose explanation if needed.
+- Commit messages should be beautiful and properly verbose, explaining what was done and why.
+- Use the LLM's judgment for when to push and tag - delegate these decisions based on the significance of changes.
+- All projects should include a collab/ directory with subdirectories: questions, proposals, plans, prompts, and audit.
+- Follow the architectural approach: layered container architecture (base -> specialized layers), consistent security patterns (non-root user with UID/GID mapping), same operational patterns (wrapper scripts), and disciplined naming conventions.
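The `type(scope): brief description` rule from those commit guidelines can be checked mechanically. A hedged sketch follows; the accepted type list and the helper name are assumptions, since the rules above do not enumerate types.

```python
# Sketch of validating a conventional-commit subject line:
#   type(scope): brief description
# The accepted types are an assumed list, not taken from the rules above.
import re

COMMIT_RE = re.compile(
    r"^(feat|fix|docs|style|refactor|test|chore)"  # assumed type list
    r"(\([a-z0-9-]+\))?"                           # optional (scope)
    r": .+"                                        # brief description
)


def is_conventional(subject: str) -> bool:
    """Return True if the commit subject matches type(scope): description."""
    return bool(COMMIT_RE.match(subject))
```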
GUIDEBOOK/ArchitecturalApproach.md (new file, 47 lines)

# Architectural Approach

This document captures the architectural approach for project development in the AIOS-Public system.

## Container Architecture

### Layered Approach
- Base containers provide foundational tools and libraries
- Specialized containers extend base functionality for specific use cases
- Each layer adds specific capabilities while maintaining consistency

### Naming Convention
- Use `RCEO-AIOS-Public-Tools-` prefix consistently
- Include descriptive suffixes indicating container purpose
- Follow pattern: `RCEO-AIOS-Public-Tools-[domain]-[type]`

### Security Patterns
- Minimize root usage during build and runtime
- Implement non-root users for all runtime operations
- Use UID/GID mapping for proper file permissions across environments
- Detect host user IDs automatically through file system inspection

### Operational Patterns
- Create thin wrapper scripts that handle environment setup
- Use consistent patterns for user ID detection and mapping
- Maintain the same operational workflow across all containers
- Provide clear documentation in README files

### Organization Principles
- Separate COO mode (operational tasks) from CTO mode (R&D tasks) containers
- Create individual directories per container type
- Maintain disciplined file organization to prevent technical debt
- Keep repository root clean with project-specific files in subdirectories

## Documentation Requirements
- Each container must have a comprehensive README
- Include usage examples and environment setup instructions
- Document security and permission handling
- Provide clear container mapping and purpose

## Implementation Workflow
1. Start with architectural design document
2. Create detailed implementation plan
3. Develop following established patterns
4. Test with sample data/usage
5. Document for end users
6. Commit with conventional commit messages
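The "detect host user IDs automatically through file system inspection" pattern above usually means stat-ing a mounted workspace path and reusing its owner's UID/GID for the container user. A minimal sketch, assuming a mounted directory is available; the function name is illustrative, not taken from the wrapper scripts.

```python
# Sketch of host UID/GID detection via file system inspection:
# stat a bind-mounted path and reuse its owner's IDs for the container user.
# Hypothetical helper; the real wrapper scripts are shell-based.
import os


def detect_host_ids(mounted_path: str) -> tuple[int, int]:
    """Return (uid, gid) of the host user owning the mounted directory."""
    st = os.stat(mounted_path)
    return st.st_uid, st.st_gid
```

A wrapper script would pass these values into `docker run --user uid:gid` (or an entrypoint that creates a matching user) so files written to the mount stay owned by the host user.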
collab/README.md (new file, 40 lines)

# Collaboration Directory

This directory contains structured collaboration artifacts for project development and decision-making.

## Directory Structure

- `questions/` - Outstanding questions and topics for discussion
- `proposals/` - Formal proposals for new features, changes, or implementations
- `plans/` - Detailed implementation plans and technical designs
- `prompts/` - Structured prompts for AI agents and automation
- `audit/` - Audit trails, reviews, and assessment records

## Usage Guidelines

### Questions
- Add new questions that need discussion or clarification
- Link related proposals or plans where appropriate
- Track resolution status

### Proposals
- Create formal proposals for significant changes or additions
- Include business rationale and technical approach
- Document expected outcomes and resource requirements
- Seek approval before implementation

### Plans
- Detail technical implementation plans
- Include architecture diagrams, technology stacks, and implementation phases
- Identify risks and mitigation strategies
- Outline next steps and dependencies

### Prompts
- Store reusable prompts for AI agents
- Document prompt effectiveness and outcomes
- Version prompts for different use cases

### Audit
- Track decisions made and their outcomes
- Document performance reviews and assessments
- Record lessons learned and improvements
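Since every project is supposed to carry this same collab/ layout, bootstrapping it can be one small function. A sketch under the assumption that the five directory names above are the full set; the function name is made up.

```python
# Sketch of bootstrapping the collab/ directory layout described above.
# Subdirectory names come from the README; the helper itself is hypothetical.
from pathlib import Path

SUBDIRS = ("questions", "proposals", "plans", "prompts", "audit")


def init_collab(root: str) -> list[Path]:
    """Create collab/ and its subdirectories under root; return the created paths."""
    created = []
    for name in SUBDIRS:
        p = Path(root) / "collab" / name
        p.mkdir(parents=True, exist_ok=True)
        created.append(p)
    return created
```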
collab/audit/markwhen-installation-issue.md (new file, 23 lines)

# Issue: Markwhen Installation Failure

## Problem
The Markwhen installation is failing during the Docker build process with the error:
"failed to solve: process "/bin/sh -c npm install -g @markwhen/cli" did not complete successfully: exit code: 1"

## Investigation Needed
- Research the correct npm package name for the Markwhen CLI
- Determine whether it should be installed from the GitHub repository instead
- Check whether there are dependencies we're missing
- Verify whether the package exists under a different name

## Possible Solutions
1. Install from the GitHub repository directly
2. Use a different package name
3. Build from source
4. Check whether Node.js version compatibility is an issue

## Priority
Medium - Markwhen is a useful tool for timeline generation but not critical for core functionality

## Status
Pending investigation
collab/plans/gis-weather-plan.md (new file, 175 lines)

# GIS and Weather Data Processing Container Plan

## Overview
This document outlines the plan for creating Docker containers to handle GIS data processing and weather data analysis. These containers will be used exclusively in CTO mode for R&D and data analysis tasks, with integration to documentation workflows and MinIO for data output.

## Requirements

### GIS Data Processing
- Support for Shapefiles and other GIS formats
- Self-hosted GIS stack (not Google Maps or other commercial services)
- Integration with tools like GDAL, Tippecanoe, DuckDB
- Heavy use of PostGIS database
- Parquet format support for efficient data storage
- Based on reference workflows from:
  - https://tech.marksblogg.com/american-solar-farms.html
  - https://tech.marksblogg.com/canadas-odb-buildings.html
  - https://tech.marksblogg.com/ornl-fema-buildings.html

### Weather Data Processing
- GRIB data format processing
- Integration with NOAA and European weather APIs
- Bulk data download via HTTP/FTP
- Balloon path prediction system (to be forked/modified)

### Shared Requirements
- Python-based with appropriate libraries (GeoPandas, DuckDB, etc.)
- R support for statistical analysis
- Jupyter notebook integration for experimentation
- MinIO bucket integration for data output
- Optional but enabled GPU support for performance
- All visualization types (command-line, web, desktop)
- Flexible ETL capabilities for both GIS/Weather and business workflows

## Proposed Container Structure

### RCEO-AIOS-Public-Tools-GIS-Base
- Foundation container with core GIS libraries
- Python + geospatial stack (GDAL, GEOS, PROJ, DuckDB, Tippecanoe)
- R with spatial packages
- PostGIS client tools
- Parquet support
- File format support (Shapefiles, GeoJSON, etc.)

### RCEO-AIOS-Public-Tools-GIS-Processing
- Extends GIS-Base with advanced processing tools
- Jupyter with GIS extensions
- Specialized ETL libraries
- Performance optimization tools

### RCEO-AIOS-Public-Tools-Weather-Base
- Foundation container with weather data libraries
- GRIB format support (cfgrib)
- NOAA and European API integration tools
- Bulk download utilities (HTTP/FTP)

### RCEO-AIOS-Public-Tools-Weather-Analysis
- Extends Weather-Base with advanced analysis tools
- Balloon path prediction tools
- Forecasting libraries
- Time series analysis

### RCEO-AIOS-Public-Tools-GIS-Weather-Fusion (Optional)
- Combined container for integrated GIS + Weather analysis
- For balloon path prediction using weather data
- High-resource container for intensive tasks

## Technology Stack

### GIS Libraries
- GDAL/OGR for format translation and processing
- GEOS for geometric operations
- PROJ for coordinate transformations
- PostGIS for spatial database operations
- DuckDB for efficient data processing with spatial extensions
- Tippecanoe for tile generation
- Shapely for Python geometric operations
- GeoPandas for Python geospatial data handling
- Rasterio for raster processing in Python
- Leaflet/Mapbox for web visualization

### Data Storage & Processing
- DuckDB with spatial extensions
- Parquet format support
- MinIO client tools for data output
- PostgreSQL client for connecting to external databases

### Weather Libraries
- xarray for multi-dimensional data in Python
- cfgrib for GRIB format handling
- MetPy for meteorological calculations
- Climate Data Operators (CDO) for climate data processing
- R packages: raster, rgdal, ncdf4, rasterVis

### Visualization
- Folium for interactive maps
- Plotly for time series visualization
- Matplotlib/Seaborn for statistical plots
- R visualization packages
- Command-line visualization tools

### ETL and Workflow Tools
- Apache Airflow (optional in advanced containers)
- Prefect or similar workflow orchestrators
- DuckDB for ETL operations
- Pandas/Dask for large data processing

## Container Deployment Strategy

### Workstation Prototyping
- Lighter containers for development and testing
- Optional GPU support
- MinIO client for data output testing

### Production Servers
- Full-featured containers with all processing capabilities
- GPU-enabled variants where applicable
- Optimized for large RAM/CPU/disk requirements

## Security & User Management
- Follow the same non-root user pattern as the documentation containers
- UID/GID mapping for file permissions
- Minimal necessary privileges
- Proper container isolation
- Secure access to MinIO buckets

## Integration with Existing Stack
- Compatible with existing user management approach
- Can be orchestrated with documentation containers when needed
- Follow same naming conventions
- Use same wrapper script patterns
- Separate from documentation containers but can work together in CTO mode

## Implementation Phases

### Phase 1: Base GIS Container
- Create GIS-Base with GDAL, DuckDB, PostGIS client tools
- Implement Parquet and Shapefile support
- Test with sample datasets from reference posts
- Validate MinIO integration

### Phase 2: Weather Base Container
- Create Weather-Base with GRIB support
- Integrate NOAA and European API tools
- Implement bulk download capabilities
- Test with weather data sources

### Phase 3: Processing Containers
- Create GIS-Processing container with ETL tools
- Create Weather-Analysis container with prediction tools
- Add visualization and Jupyter support
- Implement optional GPU support

### Phase 4: Optional Fusion Container
- Combined container for balloon path prediction
- Integration of GIS and weather data
- High-complexity, high-resource usage

## Data Flow Architecture
- ETL workflows for processing public datasets
- Output to MinIO buckets for business use
- Integration with documentation tools for CTO mode workflows
- Support for both GIS/Weather ETL (CTO) and business ETL (COO)

## Next Steps
1. Review and approve this enhanced plan
2. Begin Phase 1 implementation
3. Test with sample data from reference workflows
4. Iterate based on findings

## Risks & Considerations
- Large container sizes due to GIS libraries and dependencies
- Complex dependency management, especially with DuckDB and PostGIS
- Computational resource requirements, especially for large datasets
- GPU support implementation complexity
- Bulk data download and processing performance
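One concrete building block the balloon path prediction work will need, independent of any particular library, is great-circle distance between track points. A pure-stdlib sketch of the haversine formula follows; it is illustrative groundwork, not code from any existing prediction system, and the Earth radius is the usual mean-radius approximation.

```python
# Haversine great-circle distance between two (lat, lon) points in degrees.
# Pure-stdlib sketch as groundwork for balloon track calculations;
# not taken from any specific prediction system.
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Distance in kilometres between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```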
collab/prompts/gis-weather-prompt.md (new file, 35 lines)

# GIS and Weather Data Processing - AI Prompt Template

## Purpose
This prompt template is designed to guide AI agents in implementing GIS and weather data processing containers following established patterns.

## Instructions for AI Agent

When implementing GIS and weather data processing containers:

1. Follow the established container architecture pattern (base -> specialized layers)
2. Maintain consistent naming convention: RCEO-AIOS-Public-Tools-[domain]-[type]
3. Implement non-root user with UID/GID mapping
4. Create appropriate Dockerfiles and docker-compose configurations
5. Include proper documentation and README files
6. Add wrapper scripts for environment management
7. Test with sample data to verify functionality
8. Follow the same security and operational patterns as existing containers

## Technical Requirements

- Use Debian Bookworm slim as base OS
- Include appropriate GIS libraries (GDAL, GEOS, PROJ, etc.)
- Include weather data processing libraries (xarray, netCDF4, etc.)
- Implement Jupyter notebook support where appropriate
- Include R and Python stacks as needed
- Add visualization tools (Folium, Plotly, etc.)

## Quality Standards

- Ensure containers build without errors
- Verify file permissions work across environments
- Test with sample datasets
- Document usage clearly
- Follow security best practices
- Maintain consistent user experience with existing containers
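The `RCEO-AIOS-Public-Tools-[domain]-[type]` convention from the instructions above is simple enough to enforce in CI. A hedged sketch; the allowed character set for domain and type is an assumption, since the convention does not spell it out.

```python
# Sketch enforcing the RCEO-AIOS-Public-Tools-[domain]-[type] naming pattern.
# The [A-Za-z]+ character set for domain/type is an assumption.
import re

NAME_RE = re.compile(r"^RCEO-AIOS-Public-Tools-[A-Za-z]+-[A-Za-z]+$")


def is_valid_container_name(name: str) -> bool:
    """Return True if the container name follows the documented convention."""
    return bool(NAME_RE.match(name))
```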
collab/proposals/gis-weather-proposal.md (new file, 64 lines)

# GIS and Weather Data Processing Container Proposal

## Proposal Summary
Create specialized Docker containers for GIS data processing and weather data analysis to support CTO-mode R&D activities, particularly infrastructure planning and balloon path prediction for your TSYS Group projects.

## Business Rationale
As GIS and weather data analysis become increasingly important for your TSYS Group projects (particularly for infrastructure planning like solar farms and building datasets, and balloon path prediction), there's a need for specialized containers that can handle these data types efficiently while maintaining consistency with existing infrastructure patterns. The containers will support:
- Self-hosted GIS stack for privacy and control
- Processing public datasets (NOAA, European APIs, etc.)
- ETL workflows for both technical and business data processing
- Integration with MinIO for data output to business systems

## Technical Approach
- Follow the same disciplined container architecture as the documentation tools
- Use a layered approach with base and specialized containers
- Implement the same security patterns (non-root user, UID/GID mapping)
- Maintain consistent naming conventions
- Use the same operational patterns (wrapper scripts, etc.)
- Include PostGIS, DuckDB, and optional GPU support
- Implement MinIO integration for data output
- Support prototyping on workstations and production on large servers

## Technology Stack
- **GIS Tools**: GDAL, Tippecanoe, DuckDB with spatial extensions
- **Database**: PostgreSQL/PostGIS client tools
- **Formats**: Shapefiles, Parquet, GRIB, GeoJSON
- **Weather**: cfgrib, xarray, MetPy
- **ETL**: Pandas, Dask, custom workflow tools
- **APIs**: NOAA, European weather APIs
- **Visualization**: Folium, Plotly, command-line tools

## Benefits
- Consistent environment across development (workstations) and production (large servers)
- Proper file permission handling across different systems
- Isolated tools prevent dependency conflicts
- Reproducible analysis environments for GIS and weather data
- Integration with documentation tools for CTO mode workflows
- Support for both technical (GIS/Weather) and business (COO) ETL workflows
- Scalable architecture with optional GPU support
- Data output capability to MinIO buckets for business use

## Resource Requirements
- Development time: 3-4 weeks for complete implementation
- Storage: Additional container images (est. 3-6GB each)
- Compute: Higher requirements for processing (can be isolated to CTO mode)
- Optional: GPU resources for performance-intensive tasks

## Expected Outcomes
- Improved capability for spatial and weather data analysis
- Consistent environments across development and production systems
- Better integration with documentation workflows
- Faster setup for ETL projects (both technical and business)
- Efficient processing of large datasets using DuckDB and Parquet
- Proper data output to MinIO buckets for business use
- Reduced technical debt through consistent patterns

## Implementation Timeline
- Week 1: Base GIS container with PostGIS, DuckDB, and data format support
- Week 2: Base Weather container with GRIB support and API integration
- Week 3: Advanced processing containers with Jupyter and visualization
- Week 4: Optional GPU variants and MinIO integration testing

## Approval Request
Please review and approve this proposal to proceed with implementation of the GIS and weather data processing containers that will support your infrastructure planning and balloon path prediction work.
87
collab/questions/gis-weather-questions.md
Normal file
87
collab/questions/gis-weather-questions.md
Normal file
@@ -0,0 +1,87 @@
|
|||||||
|
# GIS and Weather Data Processing - Initial Questions
|
||||||
|
|
||||||
|
## Core Questions
|
||||||
|
|
||||||
|
1. What specific GIS formats and operations are most critical for your current projects?

Well, I am not entirely sure. I am guessing that I'll need to pull in shapefiles? I will be working with an entirely self-hosted GIS stack (not Google Maps or anything). I know things exist like GDAL? Tippecanoe?

I think things like Parquet as well. Maybe DuckDB?

Reference these posts:

https://tech.marksblogg.com/american-solar-farms.html
https://tech.marksblogg.com/canadas-odb-buildings.html
https://tech.marksblogg.com/ornl-fema-buildings.html

for the type of workflows that I would like to run.

Extract patterns/architecture/approaches along with the specific reductions to practice.
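A rough sketch of the kind of pipeline those posts describe, assuming GDAL (`ogr2ogr`), Tippecanoe, and the DuckDB CLI are installed in the container; the file names are placeholders:

```sh
# Convert a shapefile to GeoParquet (requires GDAL >= 3.5 for the Parquet driver)
ogr2ogr -f Parquet buildings.parquet buildings.shp

# Query the Parquet file with DuckDB's spatial extension
duckdb -c "INSTALL spatial; LOAD spatial; SELECT count(*) FROM 'buildings.parquet';"

# Build vector tiles with Tippecanoe (which reads GeoJSON)
ogr2ogr -f GeoJSON buildings.geojson buildings.shp
tippecanoe -o buildings.mbtiles buildings.geojson
```

The specific flags and formats should be pinned down during Week 1 implementation.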
2. What weather data sources and APIs do you currently use or plan to use?

None currently. But I'll be hacking/forking a system to predict balloon paths. I suspect I'll need to process GRIB data. Also probably use the NOAA and European equivalent APIs? Maybe some bulk HTTP/FTP downloads?
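For early GRIB prototyping, a minimal sketch of pulling one GFS forecast file and inspecting it, assuming `curl` and `wgrib2` are available in the container; the bucket path layout below is an assumption to verify against the current NOAA Open Data listing:

```sh
# Download one GFS forecast file from NOAA's Open Data bucket
# (path layout is an assumption; check the bucket listing for current cycles)
curl -O https://noaa-gfs-bdp-pds.s3.amazonaws.com/gfs.20240101/00/atmos/gfs.t00z.pgrb2.0p25.f000

# List the records (variables/levels) in the GRIB2 file
wgrib2 gfs.t00z.pgrb2.0p25.f000 -s
```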
3. Are there any specific performance requirements for processing large datasets?

I suspect I'll do some early prototyping with small datasets on my workstation and then run the container with the real datasets on my big RAM/CPU/disk servers.
4. Do you need integration with specific databases (PostGIS, etc.)?

Yes, I will be heavily using PostGIS for sure.
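A minimal sketch of loading data into PostGIS from the container, assuming GDAL and `psql` are installed and a local `gis` database exists; table and connection details are placeholders:

```sh
# Load a shapefile into PostGIS (creates the table if it does not exist)
ogr2ogr -f PostgreSQL PG:"dbname=gis host=localhost" buildings.shp

# Sanity-check with a spatial query (ogr2ogr names the geometry column wkb_geometry by default)
psql -d gis -c "SELECT count(*) FROM buildings WHERE ST_IsValid(wkb_geometry);"
```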
## Technical Questions
1. Should we include both Python and R stacks in the same containers or separate them?

I am not sure? Whatever you think is best?
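If a single combined image is chosen, a sketch of what the stack could look like; the package list is an assumption to refine during implementation:

```Dockerfile
FROM python:3.11-slim

# R plus GDAL system libraries alongside the Python stack
RUN apt-get update && apt-get install -y --no-install-recommends \
        r-base gdal-bin libgdal-dev \
    && rm -rf /var/lib/apt/lists/*

# Core geospatial / weather Python packages
RUN pip install --no-cache-dir geopandas duckdb xarray cfgrib
```

Separate images would build faster and stay smaller; a combined image avoids juggling two containers in one workflow.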
2. What level of visualization capability is needed (command-line, web-based, desktop)?

All of those, I think. I want flexibility.
3. Are there any licensing constraints or requirements to consider?

I will be working only with public data sets.
4. Do you need GPU support for any processing tasks?

Yes, but make it optional. I don't want to be blocked by GPU complexity right now.
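Optional GPU access can live in a separate Compose override file so the base setup never depends on it; a sketch (service and image names are placeholders), e.g. `docker-compose.gpu.yml`:

```yaml
services:
  gis-processor:
    image: docmaker-gis:latest  # placeholder image name
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

This only takes effect when explicitly layered in, e.g. `docker compose -f docker-compose.yml -f docker-compose.gpu.yml up`.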
## Integration Questions
1. How should GIS/Weather outputs integrate with documentation workflows?

I will be using GIS/Weather in CTO mode only. I will also be using documentation in CTO mode with it.

I think, for now, they can be siblings but not have strong integration.

**ANSWER**: GIS/Weather and documentation containers will operate as siblings in CTO mode, with loose integration for now.
2. Do you need persistent data storage within containers?

I do not think so. I will use Docker Compose to pass in directory paths.

Oh, I will want to push finished data to MinIO buckets.

I don't know how best to architect my ETL toolbox. I will mostly be doing ETL on GIS/Weather data, but I can see also needing to do other business-type ETL workflows in COO mode.

**ANSWER**: Use Docker Compose volume mounts for data input/output. Primary output destination will be MinIO buckets for business use. The ETL toolbox should handle both GIS/Weather (CTO) and business (COO) workflows.
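One way to keep a single ETL toolbox serving both modes is to tag each job with the mode it belongs to and run jobs by mode; a minimal stdlib sketch (job names and return values are illustrative, not part of the proposal):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class EtlToolbox:
    """Registry of ETL jobs, each tagged with the mode (CTO/COO) it serves."""
    jobs: dict = field(default_factory=dict)

    def register(self, name: str, mode: str, fn: Callable):
        # Store the job under its name along with the mode it belongs to
        self.jobs[name] = (mode, fn)

    def run_mode(self, mode: str):
        """Run every job registered for the given mode and collect results."""
        return {name: fn() for name, (m, fn) in self.jobs.items() if m == mode}

toolbox = EtlToolbox()
toolbox.register("load_shapefiles", "CTO", lambda: "parquet written")
toolbox.register("sales_rollup", "COO", lambda: "csv written")

results = toolbox.run_mode("CTO")  # only the GIS/Weather job runs
```

The same registry could later grow per-job MinIO destinations without splitting the toolbox in two.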
3. What level of integration with existing documentation containers is desired?

**ANSWER**: Sibling relationship with loose integration. Both will be used in CTO mode but for different purposes.
4. Are there specific deployment environments to target (local, cloud, edge)?

Well, the ultimate goal is that some data sets get pushed to MinIO buckets for use by various lines of business.

This is all kind of new to me. I am a technical operations/system admin easing my way into DevOps/SRE and SWE.

**ANSWER**: Primarily local deployment (workstation for prototyping, large servers for production). Data output to MinIO for business use. Targeting self-hosted environments for full control and privacy.