feat: implement human/LLM dual-format databank architecture with Joplin integration

- Restructure databank with collab/artifacts/human/llm top-level directories
- Move CTO and COO directories under pmo/artifacts/ as requested
- Create dual-format architecture for human-friendly markdown and LLM-optimized structured data
- Add Joplin integration pipeline in databank/collab/fromjoplin/
- Create intake system with templates, responses, and workflows
- Add sample files demonstrating human/LLM format differences
- Link to TSYSDevStack repository in main README
- Update PMO structure to reflect CTO/COO under artifacts/
- Add processing scripts and workflows for automated conversion
- Maintain clear separation between editable collab/ and readonly databank/
- Create comprehensive README documentation for new architecture
- Ensure all changes align with single source of truth principle

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2025-10-24 12:15:36 -05:00
parent 61919ae452
commit 919349aad2
34 changed files with 1154 additions and 14 deletions
--- a/databank/collab/fromjoplin/README.md
+++ b/databank/collab/fromjoplin/README.md
@@ -0,0 +1,153 @@
+# Joplin Processing Pipeline
+
+This directory contains scripts and configurations for processing Joplin markdown exports.
+
+## Structure
+
+```
+joplin-processing/
+├── process-joplin-export.sh    # Main processing script
+├── convert-to-human-md.py      # Convert Joplin to human-friendly markdown
+├── convert-to-llm-json.py      # Convert Joplin to LLM-optimized JSON
+├── joplin-template-config.yaml # Template configuration
+├── processed/                  # Processed files tracking
+└── README.md                   # This file
+```
+
+## Workflow
+
+1. **Export**: Joplin notes exported as markdown
+2. **Place**: Drop exports in `../collab/fromjoplin/`
+3. **Trigger**: Processing script runs on a schedule (cron or a file watcher) and scans the directory for new exports
+4. **Convert**: Scripts convert to both human and LLM formats
+5. **Store**: Results placed in `../../artifacts/`, `../../human/`, and `../../llm/`
+6. **Track**: Processing logged in `processed/`
+
+## Processing Script
+
+```bash
+#!/bin/bash
+# process-joplin-export.sh
+set -euo pipefail  # Abort on the first failed step
+
+JOPLIN_DIR="../collab/fromjoplin"
+HUMAN_DIR="../../human"
+LLM_DIR="../../llm"
+ARTIFACTS_DIR="../../artifacts"
+PROCESSED_DIR="./processed"
+mkdir -p "$HUMAN_DIR" "$LLM_DIR" "$ARTIFACTS_DIR" "$PROCESSED_DIR"
+
+# Process new Joplin exports
+for file in "$JOPLIN_DIR"/*.md; do
+    if [[ -f "$file" ]]; then
+        filename=$(basename "$file")
+        echo "Processing $filename..."
+
+        # Convert to human-friendly markdown
+        python3 convert-to-human-md.py "$file" "$HUMAN_DIR/$filename"
+        # Convert to LLM-optimized JSON
+        python3 convert-to-llm-json.py "$file" "$LLM_DIR/${filename%.md}.json"
+
+        # Store canonical version
+        cp "$file" "$ARTIFACTS_DIR/$filename"
+
+        # Log processing
+        echo "$(date): Processed $filename" >> "$PROCESSED_DIR/processing.log"
+        # Move processed file to avoid reprocessing (reached only if every step above succeeded)
+        mv "$file" "$PROCESSED_DIR/"
+    fi
+done
+```
+
+## Conversion Scripts
+
+### Human-Friendly Markdown Converter
+```python
+# convert-to-human-md.py
+import sys
+
+def convert_joplin_to_human_md(input_file, output_file):
+    """Convert Joplin markdown to human-friendly format"""
+    with open(input_file, 'r', encoding='utf-8') as f:
+        content = f.read()
+
+    # Placeholder: front-matter parsing and formatting (tables, headers, etc.)
+    # are not implemented yet; the content is written back unchanged.
+
+    # Write human-friendly version
+    with open(output_file, 'w', encoding='utf-8') as f:
+        f.write(content)
+
+if __name__ == "__main__":
+    if len(sys.argv) != 3:
+        sys.exit("Usage: convert-to-human-md.py INPUT.md OUTPUT.md")
+    convert_joplin_to_human_md(sys.argv[1], sys.argv[2])
+```
+
+### LLM-Optimized JSON Converter
+```python
+# convert-to-llm-json.py
+import sys
+import json
+from datetime import datetime
+
+def convert_joplin_to_llm_json(input_file, output_file):
+    """Convert Joplin markdown to LLM-optimized JSON"""
+    with open(input_file, 'r', encoding='utf-8') as f:
+        content = f.read()
+
+    # TODO: parse sections, key-value pairs, and metadata into "structured"
+
+    structured_data = {
+        "source": "joplin",
+        "processed_at": datetime.now().astimezone().isoformat(),  # local time with UTC offset
+        "content": content,
+        "structured": {}  # Extracted structured data (empty until parsing is implemented)
+    }
+
+    # Write LLM-optimized version
+    with open(output_file, 'w', encoding='utf-8') as f:
+        json.dump(structured_data, f, indent=2, ensure_ascii=False)
+
+if __name__ == "__main__":
+    if len(sys.argv) != 3:
+        sys.exit("Usage: convert-to-llm-json.py INPUT.md OUTPUT.json")
+    convert_joplin_to_llm_json(sys.argv[1], sys.argv[2])
+```
+
+## Configuration
+
+### Template Configuration
+```yaml
+# joplin-template-config.yaml
+processing:
+  input_format: "joplin_markdown"
+  output_formats:
+    - "human_markdown"
+    - "llm_json"
+  retention_days: 30
+  
+conversion_rules:
+  human_friendly:
+    add_tables: true
+    add_formatting: true
+    add_visual_hierarchy: true
+    add_navigation: true
+    
+  llm_optimized:
+    minimize_tokens: true
+    structure_data: true
+    extract_metadata: true
+    add_semantic_tags: true
+```
+
+## Automation
+
+Set up a cron job or file watcher to process new exports automatically:
+
+```bash
+# Run every 5 minutes
+*/5 * * * * cd /path/to/joplin-processing && ./process-joplin-export.sh
+```
+
+---
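
The README's LLM converter leaves the `structured` field empty and only names the goal ("Extract key-value pairs, sections, metadata"). A minimal sketch of one way that extraction might look is given here, assuming exports use ATX `#` headings and an optional flat `key: value` front-matter block delimited by `---`. The function name `extract_structure` and the output keys `metadata` and `sections` are hypothetical and do not appear in the commit.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def extract_structure(text):
    """Split markdown into front-matter metadata and heading-delimited sections.

    Assumption: front matter is flat `key: value` pairs; nested YAML is not handled.
    """
    lines = text.splitlines()
    metadata = {}
    body_start = 0
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                body_start = i + 1
                break
            key, sep, value = line.partition(":")
            if sep:
                metadata[key.strip()] = value.strip()
        else:
            # No closing delimiter: treat the whole file as body text.
            metadata = {}
            body_start = 0

    sections = []
    current = {"heading": None, "level": 0, "text": []}
    in_fence = False
    for line in lines[body_start:]:
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # do not read '# comment' inside code as a heading
        match = None if in_fence else HEADING.match(line)
        if match:
            sections.append(current)
            current = {"heading": match.group(2), "level": len(match.group(1)), "text": []}
        else:
            current["text"].append(line)
    sections.append(current)

    result = []
    for section in sections:
        body = "\n".join(section["text"]).strip()
        if section["heading"] is None and not body:
            continue  # drop an empty preamble before the first heading
        result.append({"heading": section["heading"], "level": section["level"], "text": body})
    return {"metadata": metadata, "sections": result}
```

The returned dictionary could be assigned to `structured_data["structured"]` in `convert-to-llm-json.py`; whether that shape suits downstream LLM consumers is an open design question.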
--- a/databank/collab/fromjoplin/process-joplin.sh
+++ b/databank/collab/fromjoplin/process-joplin.sh
@@ -0,0 +1,16 @@
+#!/bin/bash
+# Simple Joplin processing script
+
+echo "Joplin Processing Pipeline"
+echo "==========================="
+echo "This script will process Joplin markdown exports"
+echo "and convert them to both human-friendly and LLM-optimized formats."
+echo ""
+echo "To use:"
+echo "1. Export notes from Joplin as markdown"
+echo "2. Place them in ./fromjoplin/"
+echo "3. Run this script to process them"
+echo "4. Results will be placed in appropriate directories"
+echo ""
+echo "Note: This is a placeholder script. Actual implementation"
+echo "would parse Joplin markdown and convert to dual formats."
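
Since `process-joplin.sh` is only a placeholder, the four steps it prints can be sketched for a single export as follows. This is an illustration under stated assumptions, not the commit's implementation: the function name `process_export` and its parameters are hypothetical, the human-friendly output is a pass-through copy, and the JSON fields mirror those shown in the README's converter.

```python
import json
import shutil
from datetime import datetime
from pathlib import Path


def process_export(src, human_dir, llm_dir, artifacts_dir, processed_dir):
    """Process one Joplin markdown export and return the path of the JSON output.

    The source file is moved to processed_dir only after every output has
    been written, so a failure leaves it in place for the next run.
    """
    src = Path(src)
    dirs = [Path(d) for d in (human_dir, llm_dir, artifacts_dir, processed_dir)]
    for directory in dirs:
        directory.mkdir(parents=True, exist_ok=True)
    human_dir, llm_dir, artifacts_dir, processed_dir = dirs

    content = src.read_text(encoding="utf-8")

    # Human-friendly copy (pass-through; real formatting is not implemented here).
    (human_dir / src.name).write_text(content, encoding="utf-8")

    # LLM-oriented JSON with the same fields as the README's converter.
    json_path = llm_dir / (src.stem + ".json")
    record = {
        "source": "joplin",
        "processed_at": datetime.now().astimezone().isoformat(),
        "content": content,
        "structured": {},
    }
    json_path.write_text(json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8")

    # Canonical copy, log entry, then move the source out of the inbox.
    shutil.copy2(src, artifacts_dir / src.name)
    with open(processed_dir / "processing.log", "a", encoding="utf-8") as log:
        log.write(f"{record['processed_at']}: Processed {src.name}\n")
    shutil.move(str(src), str(processed_dir / src.name))
    return json_path
```

Moving the source last is the main design choice here: a run that fails partway can simply be repeated, at the cost of possibly rewriting outputs that were already produced.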