Remove collab directory - AIOS-Public serves as a template for projects, not a project using collab system itself

This commit is contained in:
2025-10-16 13:20:52 -05:00
parent bd9aea4cd8
commit 89a6ffeacc
6 changed files with 0 additions and 424 deletions

View File

@@ -1,40 +0,0 @@
# Collaboration Directory
This directory contains structured collaboration artifacts for project development and decision-making.
## Directory Structure
- `questions/` - Outstanding questions and topics for discussion
- `proposals/` - Formal proposals for new features, changes, or implementations
- `plans/` - Detailed implementation plans and technical designs
- `prompts/` - Structured prompts for AI agents and automation
- `audit/` - Audit trails, reviews, and assessment records
## Usage Guidelines
### Questions
- Add new questions that need discussion or clarification
- Link related proposals or plans where appropriate
- Track resolution status
### Proposals
- Create formal proposals for significant changes or additions
- Include business rationale and technical approach
- Document expected outcomes and resource requirements
- Seek approval before implementation
### Plans
- Detail technical implementation plans
- Include architecture diagrams, technology stacks, and implementation phases
- Identify risks and mitigation strategies
- Outline next steps and dependencies
### Prompts
- Store reusable prompts for AI agents
- Document prompt effectiveness and outcomes
- Version prompts for different use cases
### Audit
- Track decisions made and their outcomes
- Document performance reviews and assessments
- Record lessons learned and improvements

View File

@@ -1,23 +0,0 @@
# Issue: Markwhen Installation Failure
## Problem
The Markwhen installation is failing during the Docker build process with the error:
`failed to solve: process "/bin/sh -c npm install -g @markwhen/cli" did not complete successfully: exit code: 1`
## Investigation Needed
- Research the correct npm package name for Markwhen CLI
- Determine if it should be installed from GitHub repository instead
- Check if there are dependencies we're missing
- Verify if the package exists under a different name
## Possible Solutions
1. Install from GitHub repository directly
2. Use a different package name
3. Build from source
4. Check if Node.js version compatibility is an issue
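Solutions 1 and 4 can be combined into a single Dockerfile sketch: pin a known-good Node.js version, then fall back to a GitHub install if the npm package name turns out to be wrong. Both the package name and the `mark-when/markwhen` repository reference are assumptions to verify upstream, not confirmed fixes.

```dockerfile
# Sketch only: pin Node 20 (rules out version skew), then try the npm name
# and fall back to installing straight from the GitHub repository.
# Package/repo names here are unverified assumptions.
FROM node:20-bookworm-slim
RUN npm install -g @markwhen/cli \
    || npm install -g "github:mark-when/markwhen"
```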
## Priority
Medium - Markwhen is a useful tool for timeline generation but not critical for core functionality
## Status
Pending investigation

View File

@@ -1,175 +0,0 @@
# GIS and Weather Data Processing Container Plan
## Overview
This document outlines the plan for creating Docker containers to handle GIS data processing and weather data analysis. These containers will be used exclusively in CTO mode for R&D and data analysis tasks, with integration to documentation workflows and MinIO for data output.
## Requirements
### GIS Data Processing
- Support for Shapefiles and other GIS formats
- Self-hosted GIS stack (not Google Maps or other commercial services)
- Integration with tools like GDAL, Tippecanoe, DuckDB
- Heavy use of PostGIS database
- Parquet format support for efficient data storage
- Based on reference workflows from:
- https://tech.marksblogg.com/american-solar-farms.html
- https://tech.marksblogg.com/canadas-odb-buildings.html
- https://tech.marksblogg.com/ornl-fema-buildings.html
### Weather Data Processing
- GRIB data format processing
- NOAA and European weather APIs integration
- Bulk data download via HTTP/FTP
- Balloon path prediction system (to be forked/modified)
### Shared Requirements
- Python-based with appropriate libraries (GeoPandas, DuckDB, etc.)
- R support for statistical analysis
- Jupyter notebook integration for experimentation
- MinIO bucket integration for data output
- Optional but enabled GPU support for performance
- All visualization types (command-line, web, desktop)
- Flexible ETL capabilities for both GIS/Weather and business workflows
## Proposed Container Structure
### RCEO-AIOS-Public-Tools-GIS-Base
- Foundation container with core GIS libraries
- Python + geospatial stack (GDAL, GEOS, PROJ, DuckDB, Tippecanoe)
- R with spatial packages
- PostGIS client tools
- Parquet support
- File format support (Shapefiles, GeoJSON, etc.)
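A minimal sketch of what the GIS-Base image could look like, assuming the Debian Bookworm slim base named in the prompt template; the Debian package names and unpinned Python versions are assumptions worth re-checking before use:

```dockerfile
# Sketch of RCEO-AIOS-Public-Tools-GIS-Base; not a confirmed implementation.
FROM debian:bookworm-slim

# Core geospatial libraries and PostGIS client tools from Debian.
RUN apt-get update && apt-get install -y --no-install-recommends \
        gdal-bin libgdal-dev proj-bin postgresql-client \
        python3 python3-pip python3-venv \
    && rm -rf /var/lib/apt/lists/*

# Python geospatial stack with Parquet support; versions unpinned for the sketch.
RUN python3 -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir geopandas duckdb pyarrow

# Non-root user with overridable UID/GID, matching the documentation containers.
ARG UID=1000
ARG GID=1000
RUN groupadd -g "${GID}" gis && useradd -m -u "${UID}" -g "${GID}" gis
USER gis
ENV PATH="/opt/venv/bin:${PATH}"
```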
### RCEO-AIOS-Public-Tools-GIS-Processing
- Extends GIS-Base with advanced processing tools
- Jupyter with GIS extensions
- Specialized ETL libraries
- Performance optimization tools
### RCEO-AIOS-Public-Tools-Weather-Base
- Foundation container with weather data libraries
- GRIB format support (cfgrib)
- NOAA and European API integration tools
- Bulk download utilities (HTTP/FTP)
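The bulk-download side of Weather-Base can start as plain URL construction. The sketch below builds filtered GRIB2 URLs for NOAA's NOMADS GFS endpoint using only the standard library; the filter-endpoint layout and parameter names (`file`, `dir`, `var_*`, `lev_*`) are assumptions based on the public NOMADS service and should be verified against the live site.

```python
# Sketch: build NOMADS GFS GRIB-filter URLs with only the standard library.
# Endpoint layout is an assumption; verify against nomads.ncep.noaa.gov.
from urllib.parse import urlencode

NOMADS_FILTER = "https://nomads.ncep.noaa.gov/cgi-bin/filter_gfs_0p25.pl"

def gfs_grib_url(date: str, cycle: int, forecast_hour: int,
                 variables=("TMP", "UGRD", "VGRD"),
                 levels=("500 mb",)) -> str:
    """Build a filtered GRIB2 URL for one GFS forecast step.

    date: YYYYMMDD, cycle: one of 0/6/12/18, forecast_hour: 0..384.
    """
    params = {
        "file": f"gfs.t{cycle:02d}z.pgrb2.0p25.f{forecast_hour:03d}",
        "dir": f"/gfs.{date}/{cycle:02d}/atmos",
    }
    # Each requested variable/level becomes its own on/off flag.
    for var in variables:
        params[f"var_{var}"] = "on"
    for lev in levels:
        params["lev_" + lev.replace(" ", "_")] = "on"
    return f"{NOMADS_FILTER}?{urlencode(params)}"

url = gfs_grib_url("20251016", 0, 3)
```

From here, a bulk downloader is a loop over forecast hours feeding these URLs to an HTTP client.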
### RCEO-AIOS-Public-Tools-Weather-Analysis
- Extends Weather-Base with advanced analysis tools
- Balloon path prediction tools
- Forecasting libraries
- Time series analysis
### RCEO-AIOS-Public-Tools-GIS-Weather-Fusion (Optional)
- Combined container for integrated GIS + Weather analysis
- For balloon path prediction using weather data
- High-resource container for intensive tasks
## Technology Stack
### GIS Libraries
- GDAL/OGR for format translation and processing
- GEOS for geometric operations
- PROJ for coordinate transformations
- PostGIS for spatial database operations
- DuckDB for efficient data processing with spatial extensions
- Tippecanoe for tile generation
- Shapely for Python geometric operations
- GeoPandas for Python geospatial data handling
- Rasterio for raster processing in Python
- Leaflet/Mapbox for web visualization
### Data Storage & Processing
- DuckDB with spatial extensions
- Parquet format support
- MinIO client tools for data output
- PostgreSQL client for connecting to external databases
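The DuckDB-plus-Parquet pipeline above can be sketched in a few lines of SQL using DuckDB's spatial extension; the file names are placeholders:

```sql
-- Sketch: read a Shapefile via the spatial extension and write Parquet.
INSTALL spatial;
LOAD spatial;
CREATE TABLE buildings AS
  SELECT * FROM ST_Read('buildings.shp');
COPY buildings TO 'buildings.parquet' (FORMAT PARQUET);
```

This is the pattern the referenced blog posts lean on: land raw GIS files in DuckDB, then emit Parquet for cheap downstream querying.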
### Weather Libraries
- xarray for multi-dimensional data in Python
- cfgrib for GRIB format handling
- MetPy for meteorological calculations
- Climate Data Operators (CDO) for climate data processing
- R packages: raster, rgdal, ncdf4, rasterVis
### Visualization
- Folium for interactive maps
- Plotly for time series visualization
- Matplotlib/Seaborn for statistical plots
- R visualization packages
- Command-line visualization tools
### ETL and Workflow Tools
- Apache Airflow (optional in advanced containers)
- Prefect or similar workflow orchestrators
- DuckDB for ETL operations
- Pandas/Dask for large data processing
## Container Deployment Strategy
### Workstation Prototyping
- Lighter containers for development and testing
- Optional GPU support
- MinIO client for data output testing
### Production Servers
- Full-featured containers with all processing capabilities
- GPU-enabled variants where applicable
- Optimized for large RAM/CPU/disk requirements
## Security & User Management
- Follow same non-root user pattern as documentation containers
- UID/GID mapping for file permissions
- Minimal necessary privileges
- Proper container isolation
- Secure access to MinIO buckets
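In docker-compose form, the non-root UID/GID pattern could look like the following; service and image names are placeholders, not confirmed values:

```yaml
# Sketch of the UID/GID mapping pattern; names are illustrative only.
services:
  gis-base:
    image: rceo-aios-public-tools-gis-base:latest
    user: "${UID:-1000}:${GID:-1000}"
    volumes:
      - ./data:/data          # input datasets
      - ./output:/output      # results to be pushed to MinIO
```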
## Integration with Existing Stack
- Compatible with existing user management approach
- Can be orchestrated with documentation containers when needed
- Follow same naming conventions
- Use same wrapper script patterns
- Separate from documentation containers but can work together in CTO mode
## Implementation Phases
### Phase 1: Base GIS Container
- Create GIS-Base with GDAL, DuckDB, PostGIS client tools
- Implement Parquet and Shapefile support
- Test with sample datasets from reference posts
- Validate MinIO integration
### Phase 2: Weather Base Container
- Create Weather-Base with GRIB support
- Integrate NOAA and European API tools
- Implement bulk download capabilities
- Test with weather data sources
### Phase 3: Processing Containers
- Create GIS-Processing container with ETL tools
- Create Weather-Analysis container with prediction tools
- Add visualization and Jupyter support
- Implement optional GPU support
### Phase 4: Optional Fusion Container
- Combined container for balloon path prediction
- Integration of GIS and weather data
- High-complexity, high-resource usage
## Data Flow Architecture
- ETL workflows for processing public datasets
- Output to MinIO buckets for business use
- Integration with documentation tools for CTO mode workflows
- Support for both GIS/Weather ETL (CTO) and business ETL (COO)
## Next Steps
1. Review and approve this enhanced plan
2. Begin Phase 1 implementation
3. Test with sample data from reference workflows
4. Iterate based on findings
## Risks & Considerations
- Large container sizes due to GIS libraries and dependencies
- Complex dependency management, especially with DuckDB and PostGIS
- Computational resource requirements, especially for large datasets
- GPU support implementation complexity
- Bulk data download and processing performance

View File

@@ -1,35 +0,0 @@
# GIS and Weather Data Processing - AI Prompt Template
## Purpose
This prompt template is designed to guide AI agents in implementing GIS and weather data processing containers following established patterns.
## Instructions for AI Agent
When implementing GIS and weather data processing containers:
1. Follow the established container architecture pattern (base -> specialized layers)
2. Maintain consistent naming convention: RCEO-AIOS-Public-Tools-[domain]-[type]
3. Implement non-root user with UID/GID mapping
4. Create appropriate Dockerfiles and docker-compose configurations
5. Include proper documentation and README files
6. Add wrapper scripts for environment management
7. Test with sample data to verify functionality
8. Follow same security and operational patterns as existing containers
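The naming convention in step 2 is easy to enforce mechanically. Below is a hypothetical helper an agent could run against proposed container names; the allowed character set and the two-to-three-segment suffix (which also admits `RCEO-AIOS-Public-Tools-GIS-Weather-Fusion`) are assumptions, not part of the original specification.

```python
import re

# Hypothetical validator for the RCEO-AIOS-Public-Tools-[domain]-[type]
# convention. Character set and segment count are assumptions.
NAME_RE = re.compile(r"^RCEO-AIOS-Public-Tools(-[A-Za-z0-9]+){2,3}$")

def is_valid_container_name(name: str) -> bool:
    """Return True when a container name follows the documented pattern."""
    return NAME_RE.fullmatch(name) is not None
```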
## Technical Requirements
- Use Debian Bookworm slim as base OS
- Include appropriate GIS libraries (GDAL, GEOS, PROJ, etc.)
- Include weather data processing libraries (xarray, netCDF4, etc.)
- Implement Jupyter notebook support where appropriate
- Include R and Python stacks as needed
- Add visualization tools (Folium, Plotly, etc.)
## Quality Standards
- Ensure containers build without errors
- Verify file permissions work across environments
- Test with sample datasets
- Document usage clearly
- Follow security best practices
- Maintain consistent user experience with existing containers

View File

@@ -1,64 +0,0 @@
# GIS and Weather Data Processing Container Proposal
## Proposal Summary
Create specialized Docker containers for GIS data processing and weather data analysis to support CTO-mode R&D activities, particularly for infrastructure planning and balloon path prediction for your TSYS Group projects.
## Business Rationale
As GIS and weather data analysis become increasingly important for your TSYS Group projects (particularly infrastructure planning for solar farms and building datasets, and balloon path prediction), there is a need for specialized containers that can handle these data types efficiently while staying consistent with existing infrastructure patterns. The containers will support:
- Self-hosted GIS stack for privacy and control
- Processing public datasets (NOAA, European APIs, etc.)
- ETL workflows for both technical and business data processing
- Integration with MinIO for data output to business systems
## Technical Approach
- Follow the same disciplined container architecture as the documentation tools
- Use layered approach with base and specialized containers
- Implement same security patterns (non-root user, UID/GID mapping)
- Maintain consistent naming conventions
- Use same operational patterns (wrapper scripts, etc.)
- Include PostGIS, DuckDB, and optional GPU support
- Implement MinIO integration for data output
- Support for prototyping on workstations and production on large servers
## Technology Stack
- **GIS Tools**: GDAL, Tippecanoe, DuckDB with spatial extensions
- **Database**: PostgreSQL/PostGIS client tools
- **Formats**: Shapefiles, Parquet, GRIB, GeoJSON
- **Weather**: cfgrib, xarray, MetPy
- **ETL**: Pandas, Dask, custom workflow tools
- **APIs**: NOAA, European weather APIs
- **Visualization**: Folium, Plotly, command-line tools
## Benefits
- Consistent environment across development (workstations) and production (large servers)
- Proper file permission handling across different systems
- Isolated tools prevent dependency conflicts
- Reproducible analysis environments for GIS and weather data
- Integration with documentation tools for CTO mode workflows
- Support for both technical (GIS/Weather) and business (COO) ETL workflows
- Scalable architecture with optional GPU support
- Data output capability to MinIO buckets for business use
## Resource Requirements
- Development time: 3-4 weeks for complete implementation
- Storage: Additional container images (est. 3-6GB each)
- Compute: Higher requirements for processing (can be isolated to CTO mode)
- Optional: GPU resources for performance-intensive tasks
## Expected Outcomes
- Improved capability for spatial and weather data analysis
- Consistent environments across development and production systems
- Better integration with documentation workflows
- Faster setup for ETL projects (both technical and business)
- Efficient processing of large datasets using DuckDB and Parquet
- Proper data output to MinIO buckets for business use
- Reduced technical debt through consistent patterns
## Implementation Timeline
- Week 1: Base GIS container with PostGIS, DuckDB, and data format support
- Week 2: Base Weather container with GRIB support and API integration
- Week 3: Advanced processing containers with Jupyter and visualization
- Week 4: Optional GPU variants and MinIO integration testing
## Approval Request
Please review and approve this proposal to proceed with implementation of the GIS and weather data processing containers that will support your infrastructure planning and balloon path prediction work.

View File

@@ -1,87 +0,0 @@
# GIS and Weather Data Processing - Initial Questions
## Core Questions
1. What specific GIS formats and operations are most critical for your current projects?
Well, I am not entirely sure. I am guessing that I'll need to pull in shapefiles? I will be working with an entirely self-hosted GIS stack (not Google Maps or anything). I know things exist like GDAL? Tippecanoe? I think things like Parquet as well. Maybe DuckDB?
Reference these posts:
https://tech.marksblogg.com/american-solar-farms.html
https://tech.marksblogg.com/canadas-odb-buildings.html
https://tech.marksblogg.com/ornl-fema-buildings.html
For the type of workflows that I would like to run.
Extract patterns/architecture/approaches along with the specific reductions to practice.
2. What weather data sources and APIs do you currently use or plan to use?
None currently. But I'll be hacking/forking a system to predict balloon paths. I suspect I'll need to process GRIB data.
Also probably use the NOAA and European equivalent APIs? Maybe some bulk HTTP/FTP download?
3. Are there any specific performance requirements for processing large datasets?
I suspect I'll do some early prototyping with small data sets on my workstation and then run the container with the real data sets on my big RAM/CPU/disk servers.
4. Do you need integration with specific databases (PostGIS, etc.)?
Yes I will be heavily using PostGIS for sure.
## Technical Questions
1. Should we include both Python and R stacks in the same containers or separate them?
I am not sure? Whatever you think is best?
2. What level of visualization capability is needed (command-line, web-based, desktop)?
All of those I think. I want flexibility.
3. Are there any licensing constraints or requirements to consider?
I will be working only with public data sets.
4. Do you need GPU support for any processing tasks?
Yes, but make it optional. I don't want to be blocked with GPU complexity right now.
## Integration Questions
1. How should GIS/Weather outputs integrate with documentation workflows?
I will be using GIS/Weather in CTO mode only. I will also be using documentation in CTO mode with it.
I think, for now, they can be siblings but not have strong integration.
**ANSWER**: GIS/Weather and documentation containers will operate as siblings in CTO mode, with loose integration for now.
2. Do you need persistent data storage within containers?
I do not think so. I will use Docker Compose to pass in directory paths.
Oh, I will want to push finished data to MinIO buckets.
I don't know how to best architect my ETL toolbox... I will mostly be doing ETL on GIS/Weather data, but I can see also needing to do other business-type ETL workflows in COO mode.
**ANSWER**: Use Docker Compose volume mounts for data input/output. Primary output destination will be MinIO buckets for business use. ETL toolbox should handle both GIS/Weather (CTO) and business (COO) workflows.
3. What level of integration with existing documentation containers is desired?
**ANSWER**: Sibling relationship with loose integration. Both will be used in CTO mode but for different purposes.
4. Are there specific deployment environments to target (local, cloud, edge)?
Well the ultimate goal is some data sets get pushed to minio buckets for use by various lines of business.
This is all kind of new to me. I am a technical operations/system admin and easing my way into DevOps/SRE and SWE.
**ANSWER**: Primarily local deployment (workstation for prototyping, large servers for production). Data output to MinIO for business use. Targeting self-hosted environments for full control and privacy.