Remove collab directory - AIOS-Public serves as a template for projects, not as a project that uses the collab system itself
@@ -1,40 +0,0 @@
# Collaboration Directory

This directory contains structured collaboration artifacts for project development and decision-making.

## Directory Structure

- `questions/` - Outstanding questions and topics for discussion
- `proposals/` - Formal proposals for new features, changes, or implementations
- `plans/` - Detailed implementation plans and technical designs
- `prompts/` - Structured prompts for AI agents and automation
- `audit/` - Audit trails, reviews, and assessment records

## Usage Guidelines

### Questions

- Add new questions that need discussion or clarification
- Link related proposals or plans where appropriate
- Track resolution status

### Proposals

- Create formal proposals for significant changes or additions
- Include business rationale and technical approach
- Document expected outcomes and resource requirements
- Seek approval before implementation

### Plans

- Detail technical implementation plans
- Include architecture diagrams, technology stacks, and implementation phases
- Identify risks and mitigation strategies
- Outline next steps and dependencies

### Prompts

- Store reusable prompts for AI agents
- Document prompt effectiveness and outcomes
- Version prompts for different use cases

### Audit

- Track decisions made and their outcomes
- Document performance reviews and assessments
- Record lessons learned and improvements
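
For a project that does adopt this layout, the five subdirectories listed under "Directory Structure" could be scaffolded with a few lines of Python. This is a minimal sketch, not part of the removed files: the function name `scaffold_collab` is hypothetical, and the top-level folder name `collab` is taken from the commit message.

```python
from pathlib import Path

# Subdirectory names and purposes, copied from the "Directory Structure" list.
COLLAB_SUBDIRS = {
    "questions": "Outstanding questions and topics for discussion",
    "proposals": "Formal proposals for new features, changes, or implementations",
    "plans": "Detailed implementation plans and technical designs",
    "prompts": "Structured prompts for AI agents and automation",
    "audit": "Audit trails, reviews, and assessment records",
}


def scaffold_collab(project_root: Path) -> list[Path]:
    """Create collab/<subdir>/ under project_root and return the directory paths.

    Hypothetical helper: safe to re-run because existing directories are kept.
    """
    created = []
    for name in COLLAB_SUBDIRS:
        target = project_root / "collab" / name
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Calling `scaffold_collab(Path("."))` from a project root would create `collab/questions/`, `collab/proposals/`, and so on.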
@@ -1,23 +0,0 @@
# Issue: Markwhen Installation Failure

## Problem

The Markwhen installation is failing during the Docker build process with the error:
"failed to solve: process "/bin/sh -c npm install -g @markwhen/cli" did not complete successfully: exit code: 1"
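
When triaging a failure like the one quoted in the Problem section, it can help to pull the failing command and exit code out of the build log so the command can be re-run by hand. The sketch below is only an illustration: `parse_build_failure` and `BuildFailure` are hypothetical names, and the pattern is written against the exact message quoted above, so other Docker or BuildKit versions may word it differently.

```python
import re
from typing import NamedTuple, Optional


class BuildFailure(NamedTuple):
    command: str    # the command that was run inside the build step
    exit_code: int  # exit status reported by the builder


# Matches the message shape quoted in this issue. The "/bin/sh -c " wrapper
# is optional so that exec-form RUN steps would also match.
_FAILURE_RE = re.compile(
    r'process "(?:/bin/sh -c )?(?P<command>.+?)" '
    r"did not complete successfully: exit code: (?P<code>\d+)"
)


def parse_build_failure(message: str) -> Optional[BuildFailure]:
    """Return the failing command and exit code, or None if nothing matches."""
    match = _FAILURE_RE.search(message)
    if match is None:
        return None
    return BuildFailure(match.group("command"), int(match.group("code")))
```

For the error in this issue, the helper returns the command `npm install -g @markwhen/cli` with exit code 1, which is the command to investigate under "Investigation Needed".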

## Investigation Needed

- Research the correct npm package name for the Markwhen CLI
- Determine whether it should be installed from the GitHub repository instead
- Check whether there are dependencies we're missing
- Verify whether the package exists under a different name

## Possible Solutions

1. Install from the GitHub repository directly
2. Use a different package name
3. Build from source
4. Check whether Node.js version compatibility is an issue

## Priority

Medium - Markwhen is a useful tool for timeline generation but not critical for core functionality

## Status

Pending investigation
@@ -1,175 +0,0 @@
# GIS and Weather Data Processing Container Plan

## Overview

This document outlines the plan for creating Docker containers to handle GIS data processing and weather data analysis. These containers will be used exclusively in CTO mode for R&D and data analysis tasks, with integration to documentation workflows and MinIO for data output.

## Requirements

### GIS Data Processing

- Support for Shapefiles and other GIS formats
- Self-hosted GIS stack (not Google Maps or other commercial services)
- Integration with tools like GDAL, Tippecanoe, DuckDB
- Heavy use of PostGIS database
- Parquet format support for efficient data storage
- Based on reference workflows from:
  - https://tech.marksblogg.com/american-solar-farms.html
  - https://tech.marksblogg.com/canadas-odb-buildings.html
  - https://tech.marksblogg.com/ornl-fema-buildings.html

### Weather Data Processing

- GRIB data format processing
- NOAA and European weather APIs integration
- Bulk data download via HTTP/FTP
- Balloon path prediction system (to be forked/modified)

### Shared Requirements

- Python-based with appropriate libraries (GeoPandas, DuckDB, etc.)
- R support for statistical analysis
- Jupyter notebook integration for experimentation
- MinIO bucket integration for data output
- Optional but enabled GPU support for performance
- All visualization types (command-line, web, desktop)
- Flexible ETL capabilities for both GIS/Weather and business workflows

## Proposed Container Structure

### RCEO-AIOS-Public-Tools-GIS-Base

- Foundation container with core GIS libraries
- Python + geospatial stack (GDAL, GEOS, PROJ, DuckDB, Tippecanoe)
- R with spatial packages
- PostGIS client tools
- Parquet support
- File format support (Shapefiles, GeoJSON, etc.)

### RCEO-AIOS-Public-Tools-GIS-Processing

- Extends GIS-Base with advanced processing tools
- Jupyter with GIS extensions
- Specialized ETL libraries
- Performance optimization tools

### RCEO-AIOS-Public-Tools-Weather-Base

- Foundation container with weather data libraries
- GRIB format support (cfgrib)
- NOAA and European API integration tools
- Bulk download utilities (HTTP/FTP)

### RCEO-AIOS-Public-Tools-Weather-Analysis

- Extends Weather-Base with advanced analysis tools
- Balloon path prediction tools
- Forecasting libraries
- Time series analysis

### RCEO-AIOS-Public-Tools-GIS-Weather-Fusion (Optional)

- Combined container for integrated GIS + Weather analysis
- For balloon path prediction using weather data
- High-resource container for intensive tasks
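
Because the specialized images extend the base images, they have to be built in dependency order. The sketch below shows one way that order could be derived with the standard-library `graphlib` module (Python 3.9+). The `EXTENDS` mapping and `build_order` function are hypothetical; the two "Extends" relationships come from the plan, while the parents of the optional Fusion image are an assumption, since the plan does not state what it builds on.

```python
from graphlib import TopologicalSorter

PREFIX = "RCEO-AIOS-Public-Tools"

# Each image maps to the set of images that must be built before it.
# GIS-Processing -> GIS-Base and Weather-Analysis -> Weather-Base are stated
# in the plan. ASSUMPTION: the plan does not say what the optional Fusion
# image extends; it is modeled here as needing both specialized images.
EXTENDS = {
    "GIS-Base": set(),
    "GIS-Processing": {"GIS-Base"},
    "Weather-Base": set(),
    "Weather-Analysis": {"Weather-Base"},
    "GIS-Weather-Fusion": {"GIS-Processing", "Weather-Analysis"},
}


def build_order(extends: dict[str, set[str]]) -> list[str]:
    """Return full image names ordered so every parent precedes its children."""
    sorter = TopologicalSorter(extends)
    return [f"{PREFIX}-{name}" for name in sorter.static_order()]


if __name__ == "__main__":
    for image in build_order(EXTENDS):
        print(image)
```

`TopologicalSorter` also raises `CycleError` if the mapping ever contains a circular dependency, which makes a mistake in the layering visible before any build starts.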

## Technology Stack

### GIS Libraries

- GDAL/OGR for format translation and processing
- GEOS for geometric operations
- PROJ for coordinate transformations
- PostGIS for spatial database operations
- DuckDB for efficient data processing with spatial extensions
- Tippecanoe for tile generation
- Shapely for Python geometric operations
- GeoPandas for Python geospatial data handling
- Rasterio for raster processing in Python
- Leaflet/Mapbox for web visualization

### Data Storage & Processing

- DuckDB with spatial extensions
- Parquet format support
- MinIO client tools for data output
- PostgreSQL client for connecting to external databases

### Weather Libraries

- xarray for multi-dimensional data in Python
- cfgrib for GRIB format handling
- MetPy for meteorological calculations
- Climate Data Operators (CDO) for climate data processing
- R packages: raster, rgdal, ncdf4, rasterVis

### Visualization

- Folium for interactive maps
- Plotly for time series visualization
- Matplotlib/Seaborn for statistical plots
- R visualization packages
- Command-line visualization tools

### ETL and Workflow Tools

- Apache Airflow (optional in advanced containers)
- Prefect or similar workflow orchestrators
- DuckDB for ETL operations
- Pandas/Dask for large data processing

## Container Deployment Strategy

### Workstation Prototyping

- Lighter containers for development and testing
- Optional GPU support
- MinIO client for data output testing

### Production Servers

- Full-featured containers with all processing capabilities
- GPU-enabled variants where applicable
- Optimized for large RAM/CPU/disk requirements

## Security & User Management
- Follow same non-root user pattern as documentation containers
- UID/GID mapping for file permissions
- Minimal necessary privileges
- Proper container isolation
- Secure access to MinIO buckets
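
One common way to get UID/GID mapping is to start the container as the invoking host user, so files written to a mounted directory keep that user's ownership. The sketch below builds such a `docker run` argument list. It is an illustration only: the function name and example paths are hypothetical, and it is not necessarily the pattern the documentation containers use.

```python
import os
from typing import Optional


def docker_run_args(
    image: str,
    host_dir: str,
    container_dir: str,
    uid: Optional[int] = None,
    gid: Optional[int] = None,
) -> list[str]:
    """Build a `docker run` argument list that runs as a non-root UID:GID.

    Hypothetical helper. When uid/gid are omitted, the current host user's
    IDs are used (os.getuid/os.getgid are available on POSIX systems only).
    """
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--rm",
        "--user", f"{uid}:{gid}",                   # run as the mapped user, not root
        "--volume", f"{host_dir}:{container_dir}",  # bind-mount the data directory
        image,
    ]
```

A wrapper script could pass the resulting list to `subprocess.run`, for example `docker_run_args("RCEO-AIOS-Public-Tools-GIS-Base", "/srv/data", "/work")`, where both paths are placeholders.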

## Integration with Existing Stack
- Compatible with existing user management approach
- Can be orchestrated with documentation containers when needed
- Follow same naming conventions
- Use same wrapper script patterns
- Separate from documentation containers but can work together in CTO mode

## Implementation Phases

### Phase 1: Base GIS Container

- Create GIS-Base with GDAL, DuckDB, PostGIS client tools
- Implement Parquet and Shapefile support
- Test with sample datasets from reference posts
- Validate MinIO integration

### Phase 2: Weather Base Container

- Create Weather-Base with GRIB support
- Integrate NOAA and European API tools
- Implement bulk download capabilities
- Test with weather data sources

### Phase 3: Processing Containers

- Create GIS-Processing container with ETL tools
- Create Weather-Analysis container with prediction tools
- Add visualization and Jupyter support
- Implement optional GPU support

### Phase 4: Optional Fusion Container

- Combined container for balloon path prediction
- Integration of GIS and weather data
- High-complexity, high-resource usage

## Data Flow Architecture
- ETL workflows for processing public datasets
- Output to MinIO buckets for business use
- Integration with documentation tools for CTO mode workflows
- Support for both GIS/Weather ETL (CTO) and business ETL (COO)
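
The "output to MinIO buckets" step could be driven from a wrapper script by shelling out to the MinIO client (`mc`). The sketch below only builds the `mc cp --recursive` argument list; the alias `aios`, the bucket `gis-output`, and the function name are placeholders that do not appear in the plan, and the alias is assumed to have been configured separately.

```python
def mc_copy_command(local_dir: str, alias: str, bucket: str, prefix: str = "") -> list[str]:
    """Build an `mc cp --recursive` argument list targeting alias/bucket/prefix/.

    Hypothetical helper: it does not run the command or check that the
    alias and bucket exist.
    """
    parts = [alias, bucket]
    cleaned = prefix.strip("/")
    if cleaned:
        parts.append(cleaned)
    target = "/".join(parts) + "/"
    return ["mc", "cp", "--recursive", local_dir, target]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(mc_copy_command("./output", "aios", "gis-output", "solar-farms")))
```

Keeping command construction separate from execution makes the upload step easy to test without a running MinIO server.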

## Next Steps
1. Review and approve this enhanced plan
2. Begin Phase 1 implementation
3. Test with sample data from reference workflows
4. Iterate based on findings

## Risks & Considerations
- Large container sizes due to GIS libraries and dependencies
- Complex dependency management, especially with DuckDB and PostGIS
- Computational resource requirements, especially for large datasets
- GPU support implementation complexity
- Bulk data download and processing performance
@@ -1,35 +0,0 @@
# GIS and Weather Data Processing - AI Prompt Template

## Purpose
This prompt template is designed to guide AI agents in implementing GIS and weather data processing containers following established patterns.

## Instructions for AI Agent

When implementing GIS and weather data processing containers:

1. Follow the established container architecture pattern (base -> specialized layers)
2. Maintain consistent naming convention: RCEO-AIOS-Public-Tools-[domain]-[type]
3. Implement non-root user with UID/GID mapping
4. Create appropriate Dockerfiles and docker-compose configurations
5. Include proper documentation and README files
6. Add wrapper scripts for environment management
7. Test with sample data to verify functionality
8. Follow same security and operational patterns as existing containers
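
The naming convention in step 2 is simple enough to check mechanically. The following sketch builds and validates names of the form `RCEO-AIOS-Public-Tools-[domain]-[type]`. The function names are hypothetical, and the rule that each part consists of letters only (with hyphens allowed inside the domain) is an assumption inferred from the container names used in the plan.

```python
import re

PREFIX = "RCEO-AIOS-Public-Tools"

# ASSUMPTION: domain is one or more hyphen-separated alphabetic words
# (e.g. "GIS" or "GIS-Weather"); type is a single alphabetic word.
NAME_RE = re.compile(
    rf"^{re.escape(PREFIX)}-(?P<domain>[A-Za-z]+(?:-[A-Za-z]+)*)-(?P<type>[A-Za-z]+)$"
)


def container_name(domain: str, kind: str) -> str:
    """Return the full container name, raising ValueError if it breaks the convention."""
    name = f"{PREFIX}-{domain}-{kind}"
    if NAME_RE.match(name) is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    return name


def split_name(name: str) -> tuple[str, str]:
    """Split a full container name into its (domain, type) parts."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    return match.group("domain"), match.group("type")
```

For example, `container_name("Weather", "Base")` yields `RCEO-AIOS-Public-Tools-Weather-Base`, and `split_name` treats the last hyphen-separated word as the type.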

## Technical Requirements

- Use Debian Bookworm slim as base OS
- Include appropriate GIS libraries (GDAL, GEOS, PROJ, etc.)
- Include weather data processing libraries (xarray, netCDF4, etc.)
- Implement Jupyter notebook support where appropriate
- Include R and Python stacks as needed
- Add visualization tools (Folium, Plotly, etc.)

## Quality Standards

- Ensure containers build without errors
- Verify file permissions work across environments
- Test with sample datasets
- Document usage clearly
- Follow security best practices
- Maintain consistent user experience with existing containers
@@ -1,64 +0,0 @@
# GIS and Weather Data Processing Container Proposal

## Proposal Summary
Create specialized Docker containers for GIS data processing and weather data analysis to support CTO-mode R&D activities, particularly for infrastructure planning and balloon path prediction for your TSYS Group projects.

## Business Rationale
As GIS and weather data analysis become increasingly important for your TSYS Group projects (particularly for infrastructure planning like solar farms and building datasets, and balloon path prediction), there's a need for specialized containers that can handle these data types efficiently while maintaining consistency with existing infrastructure patterns. The containers will support:
- Self-hosted GIS stack for privacy and control
- Processing public datasets (NOAA, European APIs, etc.)
- ETL workflows for both technical and business data processing
- Integration with MinIO for data output to business systems

## Technical Approach
- Follow the same disciplined container architecture as the documentation tools
- Use layered approach with base and specialized containers
- Implement same security patterns (non-root user, UID/GID mapping)
- Maintain consistent naming conventions
- Use same operational patterns (wrapper scripts, etc.)
- Include PostGIS, DuckDB, and optional GPU support
- Implement MinIO integration for data output
- Support for prototyping on workstations and production on large servers

## Technology Stack
- **GIS Tools**: GDAL, Tippecanoe, DuckDB with spatial extensions
- **Database**: PostgreSQL/PostGIS client tools
- **Formats**: Shapefiles, Parquet, GRIB, GeoJSON
- **Weather**: cfgrib, xarray, MetPy
- **ETL**: Pandas, Dask, custom workflow tools
- **APIs**: NOAA, European weather APIs
- **Visualization**: Folium, Plotly, command-line tools

## Benefits
- Consistent environment across development (workstations) and production (large servers)
- Proper file permission handling across different systems
- Isolated tools prevent dependency conflicts
- Reproducible analysis environments for GIS and weather data
- Integration with documentation tools for CTO mode workflows
- Support for both technical (GIS/Weather) and business (COO) ETL workflows
- Scalable architecture with optional GPU support
- Data output capability to MinIO buckets for business use

## Resource Requirements
- Development time: 3-4 weeks for complete implementation
- Storage: Additional container images (est. 3-6GB each)
- Compute: Higher requirements for processing (can be isolated to CTO mode)
- Optional: GPU resources for performance-intensive tasks
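
The per-image storage estimate can be turned into a rough total. The sketch below multiplies the 3-6GB range by an image count; the counts of four and five are taken from the companion plan (four containers plus the optional fusion container), and the function name is hypothetical. Treat the result as an upper bound, because Docker stores layers shared between images only once.

```python
def storage_range_gb(image_count: int, low_gb: float = 3, high_gb: float = 6) -> tuple[float, float]:
    """Return the (minimum, maximum) total image storage in GB.

    Uses the 3-6GB per-image estimate from this proposal and ignores
    layer sharing between images, so it overestimates real disk usage.
    """
    return image_count * low_gb, image_count * high_gb


if __name__ == "__main__":
    for count in (4, 5):  # without and with the optional fusion container
        low, high = storage_range_gb(count)
        print(f"{count} images: {low}-{high} GB")
```

With four images the estimate is 12-24GB, and with the optional fusion container it is 15-30GB.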

## Expected Outcomes
- Improved capability for spatial and weather data analysis
- Consistent environments across development and production systems
- Better integration with documentation workflows
- Faster setup for ETL projects (both technical and business)
- Efficient processing of large datasets using DuckDB and Parquet
- Proper data output to MinIO buckets for business use
- Reduced technical debt through consistent patterns

## Implementation Timeline
- Week 1: Base GIS container with PostGIS, DuckDB, and data format support
- Week 2: Base Weather container with GRIB support and API integration
- Week 3: Advanced processing containers with Jupyter and visualization
- Week 4: Optional GPU variants and MinIO integration testing

## Approval Request
Please review and approve this proposal to proceed with implementation of the GIS and weather data processing containers that will support your infrastructure planning and balloon path prediction work.
@@ -1,87 +0,0 @@
# GIS and Weather Data Processing - Initial Questions

## Core Questions

1. What specific GIS formats and operations are most critical for your current projects?

Well, I am not entirely sure. I am guessing that I'll need to pull in shapefiles? I will be working with an entirely self-hosted GIS stack (not Google Maps or anything). I know things exist like GDAL? Tippecanoe?

I think things like Parquet as well. Maybe DuckDB?

Reference these posts for the type of workflows that I would like to run:

https://tech.marksblogg.com/american-solar-farms.html
https://tech.marksblogg.com/canadas-odb-buildings.html
https://tech.marksblogg.com/ornl-fema-buildings.html

Extract patterns/architecture/approaches along with the specific reductions to practice.

2. What weather data sources and APIs do you currently use or plan to use?

None currently. But I'll be hacking/forking a system to predict balloon paths. I suspect I'll need to process GRIB data. Also probably use the NOAA and European equivalent APIs? Maybe some bulk HTTP/FTP download?

3. Are there any specific performance requirements for processing large datasets?

I suspect I'll do some early prototyping with small data sets on my workstation and then run the container with the real data sets on my big RAM/CPU/disk servers.

4. Do you need integration with specific databases (PostGIS, etc.)?

Yes, I will be heavily using PostGIS for sure.

## Technical Questions

1. Should we include both Python and R stacks in the same containers or separate them?

I am not sure? Whatever you think is best?

2. What level of visualization capability is needed (command-line, web-based, desktop)?

All of those, I think. I want flexibility.

3. Are there any licensing constraints or requirements to consider?

I will be working only with public data sets.

4. Do you need GPU support for any processing tasks?

Yes, but make it optional. I don't want to be blocked by GPU complexity right now.

## Integration Questions

1. How should GIS/Weather outputs integrate with documentation workflows?

I will be using GIS/Weather in CTO mode only. I will also be using documentation in CTO mode with it.

I think, for now, they can be siblings but not have strong integration.

**ANSWER**: GIS/Weather and documentation containers will operate as siblings in CTO mode, with loose integration for now.

2. Do you need persistent data storage within containers?

I do not think so. I will use Docker Compose to pass in directory paths.

Oh, I will want to push finished data to MinIO buckets.

I don't know how best to architect my ETL toolbox... I will mostly be doing ETL on GIS/Weather data, but I can see also needing to do other business-type ETL workflows in COO mode.

**ANSWER**: Use Docker Compose volume mounts for data input/output. Primary output destination will be MinIO buckets for business use. ETL toolbox should handle both GIS/Weather (CTO) and business (COO) workflows.

3. What level of integration with existing documentation containers is desired?

**ANSWER**: Sibling relationship with loose integration. Both will be used in CTO mode but for different purposes.

4. Are there specific deployment environments to target (local, cloud, edge)?

Well, the ultimate goal is that some data sets get pushed to MinIO buckets for use by various lines of business.

This is all kind of new to me. I am a technical operations/system admin, easing my way into DevOps/SRE and SWE.

**ANSWER**: Primarily local deployment (workstation for prototyping, large servers for production). Data output to MinIO for business use. Targeting self-hosted environments for full control and privacy.