feat: implement human/LLM dual-format databank architecture with Joplin integration

- Restructure databank with collab/artifacts/human/llm top-level directories
- Move CTO and COO directories under pmo/artifacts/ as requested
- Create dual-format architecture for human-friendly markdown and LLM-optimized structured data
- Add Joplin integration pipeline in databank/collab/fromjoplin/
- Create intake system with templates, responses, and workflows
- Add sample files demonstrating human/LLM format differences
- Link to TSYSDevStack repository in main README
- Update PMO structure to reflect CTO/COO under artifacts/
- Add processing scripts and workflows for automated conversion
- Maintain clear separation between editable collab/ and readonly databank/
- Create comprehensive README documentation for new architecture
- Ensure all changes align with single source of truth principle

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2025-10-24 12:15:36 -05:00
parent 61919ae452
commit 919349aad2
34 changed files with 1154 additions and 14 deletions


@@ -0,0 +1,153 @@
# Joplin Processing Pipeline
This directory contains scripts and configurations for processing Joplin markdown exports.
## Structure
```
joplin-processing/
├── process-joplin-export.sh # Main processing script
├── convert-to-human-md.py # Convert Joplin to human-friendly markdown
├── convert-to-llm-json.py # Convert Joplin to LLM-optimized JSON
├── joplin-template-config.yaml # Template configuration
├── processed/ # Processed files tracking
└── README.md # This file
```
## Workflow
1. **Export**: Joplin notes exported as markdown
2. **Place**: Drop exports in `../collab/fromjoplin/`
3. **Trigger**: A cron job or file watcher runs the processing script, which picks up new exports from that directory
4. **Convert**: Scripts convert to both human and LLM formats
5. **Store**: Results placed in `../../artifacts/`, `../../human/`, and `../../llm/`
6. **Track**: Processing logged in `processed/`
## Processing Script
```bash
#!/bin/bash
# process-joplin-export.sh
# Paths are relative, so run this from the joplin-processing/ directory.
JOPLIN_DIR="../collab/fromjoplin"
HUMAN_DIR="../../human"
LLM_DIR="../../llm"
ARTIFACTS_DIR="../../artifacts"
PROCESSED_DIR="./processed"

# Make sure the tracking directory exists before logging or moving files
mkdir -p "$PROCESSED_DIR"

# Process new Joplin exports
for file in "$JOPLIN_DIR"/*.md; do
    # The -f test also skips the literal pattern when the glob matches nothing
    if [[ -f "$file" ]]; then
        filename=$(basename "$file")
        echo "Processing $filename..."

        # Convert to human-friendly markdown
        python3 convert-to-human-md.py "$file" "$HUMAN_DIR/$filename"

        # Convert to LLM-optimized JSON
        python3 convert-to-llm-json.py "$file" "$LLM_DIR/${filename%.md}.json"

        # Store canonical version
        cp "$file" "$ARTIFACTS_DIR/$filename"

        # Log processing
        echo "$(date): Processed $filename" >> "$PROCESSED_DIR/processing.log"

        # Move processed file to avoid reprocessing
        mv "$file" "$PROCESSED_DIR/"
    fi
done
```
## Conversion Scripts
### Human-Friendly Markdown Converter
```python
# convert-to-human-md.py
import sys
import yaml  # PyYAML (third-party); not used yet, intended for front matter
import json


def convert_joplin_to_human_md(input_file, output_file):
    """Convert Joplin markdown to human-friendly format"""
    with open(input_file, 'r', encoding='utf-8') as f:
        content = f.read()

    # Parse front matter if present
    # Add beautiful formatting, tables, headers, etc.
    # (Placeholder: content currently passes through unchanged.)

    # Write human-friendly version
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(content)


if __name__ == "__main__":
    convert_joplin_to_human_md(sys.argv[1], sys.argv[2])
```
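The converter above leaves the "Parse front matter if present" step as a comment. A minimal sketch of that step might look like the following, assuming the export carries a front matter block delimited by `---` lines at the top of the file. The helper name `split_front_matter` is hypothetical and not part of this commit; it only separates the block from the body and leaves the actual YAML parsing to the caller.

```python
def split_front_matter(text):
    """Split a note into (front_matter_lines, body).

    Assumes front matter is delimited by '---' lines at the very top of
    the file. Returns ([], text) when no complete block is found.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return [], text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # Lines between the two delimiters, without line endings
            front = [line.rstrip("\r\n") for line in lines[1:i]]
            body = "".join(lines[i + 1:])
            return front, body
    # Opening delimiter without a closing one: treat everything as body
    return [], text
```

The returned lines could then be joined and handed to a YAML parser, while the body goes through the formatting steps.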
### LLM-Optimized JSON Converter
```python
# convert-to-llm-json.py
import sys
import json
import yaml  # PyYAML (third-party); not used yet, intended for metadata
from datetime import datetime


def convert_joplin_to_llm_json(input_file, output_file):
    """Convert Joplin markdown to LLM-optimized JSON"""
    with open(input_file, 'r', encoding='utf-8') as f:
        content = f.read()

    # Parse and structure for LLM consumption
    # Extract key-value pairs, sections, metadata
    structured_data = {
        "source": "joplin",
        "processed_at": datetime.now().isoformat(),
        "content": content,
        "structured": {}  # Extracted structured data (placeholder, left empty)
    }

    # Write LLM-optimized version
    with open(output_file, 'w', encoding='utf-8') as f:
        json.dump(structured_data, f, indent=2)


if __name__ == "__main__":
    convert_joplin_to_llm_json(sys.argv[1], sys.argv[2])
```
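The `"structured"` field above is left empty. One possible way to fill it, sketched here under the assumption that notes use ATX-style `#` headings, is to split the note into sections by heading. The function name `extract_sections` and the shape of its output are illustrative only; the commit does not define a schema.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
FENCE = "`" * 3  # code fence marker, built this way to keep the example readable


def extract_sections(markdown):
    """Return one {"level", "title", "text"} dict per heading.

    Text before the first heading is ignored, and '#' lines inside
    fenced code blocks are not treated as headings.
    """
    sections = []
    current = None
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            current = {"level": len(match.group(1)),
                       "title": match.group(2),
                       "lines": []}
            sections.append(current)
        elif current is not None:
            current["lines"].append(line)
    return [
        {"level": s["level"], "title": s["title"],
         "text": "\n".join(s["lines"]).strip()}
        for s in sections
    ]
```

The resulting list could be assigned to `structured_data["structured"]["sections"]` before the JSON is written.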
## Configuration
### Template Configuration
```yaml
# joplin-template-config.yaml
processing:
  input_format: "joplin_markdown"
  output_formats:
    - "human_markdown"
    - "llm_json"
  retention_days: 30

conversion_rules:
  human_friendly:
    add_tables: true
    add_formatting: true
    add_visual_hierarchy: true
    add_navigation: true
  llm_optimized:
    minimize_tokens: true
    structure_data: true
    extract_metadata: true
    add_semantic_tags: true
```
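The configuration declares `retention_days: 30`, but none of the scripts shown here read it, and the README does not say what it governs. As a sketch only, assuming it caps how long already-processed exports stay in `processed/`, a cleanup helper could list the files that have aged out. The name `find_expired` is hypothetical, and the function only reports candidates rather than deleting anything.

```python
import time
from pathlib import Path


def find_expired(processed_dir, retention_days, now=None):
    """Return processed .md files last modified more than retention_days ago."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # 86400 seconds per day
    return sorted(
        p for p in Path(processed_dir).glob("*.md")
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

Returning the list instead of removing files keeps the decision (delete, archive, or ignore) with the caller, and `processing.log` is left alone because only `*.md` files are matched.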
## Automation
Set up a cron job or file watcher to process new exports automatically:
```bash
# Run every 5 minutes
*/5 * * * * cd /path/to/joplin-processing && ./process-joplin-export.sh
```
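The cron entry covers the scheduled case. For the file watcher alternative, a simple polling loop is one option; the sketch below uses only the standard library and assumes the script name and inbox path from this README. The function names, the 30-second default interval, and the polling approach itself are illustrative choices, not something this commit ships.

```python
import subprocess
import time
from pathlib import Path


def pending_exports(inbox):
    """Return the .md files currently waiting in the inbox directory."""
    return sorted(p for p in Path(inbox).glob("*.md") if p.is_file())


def watch(inbox, script="./process-joplin-export.sh", interval=30):
    """Poll the inbox and run the processing script whenever files are waiting.

    Runs until interrupted. A failing script run is not treated as fatal,
    so the loop retries on the next poll.
    """
    while True:
        if pending_exports(inbox):
            subprocess.run([script], check=False)
        time.sleep(interval)
```

Calling `watch("../collab/fromjoplin")` from the `joplin-processing/` directory would start the loop. Polling trades some latency for having no extra dependencies; an event-based watcher would react faster but needs platform-specific tooling.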
---


@@ -0,0 +1,16 @@
#!/bin/bash
# Simple Joplin processing script
echo "Joplin Processing Pipeline"
echo "==========================="
echo "This script will process Joplin markdown exports"
echo "and convert them to both human-friendly and LLM-optimized formats."
echo ""
echo "To use:"
echo "1. Export notes from Joplin as markdown"
echo "2. Place them in ./fromjoplin/"
echo "3. Run this script to process them"
echo "4. Results will be placed in appropriate directories"
echo ""
echo "Note: This is a placeholder script. Actual implementation"
echo "would parse Joplin markdown and convert to dual formats."