Joplin Processing Pipeline
This directory contains scripts and configurations for processing Joplin markdown exports.
Structure
joplin-processing/
├── process-joplin-export.sh # Main processing script
├── convert-to-human-md.py # Convert Joplin to human-friendly markdown
├── convert-to-llm-json.py # Convert Joplin to LLM-optimized JSON
├── joplin-template-config.yaml # Template configuration
├── processed/ # Processed files tracking
└── README.md # This file
Workflow
- Export: Joplin notes exported as markdown
- Place: Drop exports in ../collab/fromjoplin/
- Trigger: Processing script monitors directory
- Convert: Scripts convert to both human and LLM formats
- Store: Results placed in ../../artifacts/, ../../human/, and ../../llm/
- Track: Processing logged in processed/
Processing Script
#!/bin/bash
# process-joplin-export.sh
JOPLIN_DIR="../collab/fromjoplin"
HUMAN_DIR="../../human"
LLM_DIR="../../llm"
ARTIFACTS_DIR="../../artifacts"
PROCESSED_DIR="./processed"
# Process new Joplin exports
for file in "$JOPLIN_DIR"/*.md; do
    if [[ -f "$file" ]]; then
        filename=$(basename "$file")
        echo "Processing $filename..."
        # Convert to human-friendly markdown; on failure, leave the export in place for the next run
        python3 convert-to-human-md.py "$file" "$HUMAN_DIR/$filename" || continue
        # Convert to LLM-optimized JSON; on failure, leave the export in place for the next run
        python3 convert-to-llm-json.py "$file" "$LLM_DIR/${filename%.md}.json" || continue
        # Store canonical version
        cp "$file" "$ARTIFACTS_DIR/$filename" || continue
        # Log processing
        echo "$(date): Processed $filename" >> "$PROCESSED_DIR/processing.log"
        # Move processed file to avoid reprocessing
        mv "$file" "$PROCESSED_DIR/"
    fi
done
Conversion Scripts
Human-Friendly Markdown Converter
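# convert-to-human-md.py
import sys
# yaml (PyYAML, third-party) and json are not used yet; kept for the TODO steps below
import yaml
import json


def convert_joplin_to_human_md(input_file, output_file):
    """Convert Joplin markdown to human-friendly format"""
    with open(input_file, 'r', encoding='utf-8') as f:
        content = f.read()
    # TODO: Parse front matter if present
    # TODO: Add formatting, tables, headers, etc.
    # Write human-friendly version (currently the content is copied unchanged)
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(content)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: convert-to-human-md.py INPUT_FILE OUTPUT_FILE")
    convert_joplin_to_human_md(sys.argv[1], sys.argv[2])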
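The converter leaves front matter parsing as a placeholder comment. A minimal sketch of that step, assuming the export begins with a front matter block delimited by `---` lines; the helper name `split_front_matter` is hypothetical and not part of the scripts above. It only separates the block from the body and leaves the YAML itself unparsed:

```python
def split_front_matter(text):
    """Split a leading '---' delimited block from the note body.

    Returns (front_matter, body). front_matter is None when the text
    does not start with a complete front matter block.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # Everything between the delimiters is front matter
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    # Opening delimiter without a closing one: treat it all as body
    return None, text
```

The returned front matter string could then be passed to a YAML parser, and the body to the formatting steps.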
LLM-Optimized JSON Converter
# convert-to-llm-json.py
import sys
import json
# yaml (PyYAML, third-party) is not used yet; kept for the TODO steps below
import yaml
from datetime import datetime


def convert_joplin_to_llm_json(input_file, output_file):
    """Convert Joplin markdown to LLM-optimized JSON"""
    with open(input_file, 'r', encoding='utf-8') as f:
        content = f.read()
    # TODO: Parse and structure for LLM consumption
    # TODO: Extract key-value pairs, sections, metadata
    structured_data = {
        "source": "joplin",
        "processed_at": datetime.now().isoformat(),
        "content": content,
        "structured": {}  # Extracted structured data (currently empty)
    }
    # Write LLM-optimized version
    with open(output_file, 'w', encoding='utf-8') as f:
        json.dump(structured_data, f, indent=2)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: convert-to-llm-json.py INPUT_FILE OUTPUT_FILE")
    convert_joplin_to_llm_json(sys.argv[1], sys.argv[2])
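The `"structured"` field is written as an empty dictionary. One way it might be filled is by splitting the note into sections at its headings. The sketch below assumes ATX-style headings (`#` through `######`); the function name `extract_sections` and the output shape are illustrative choices, not part of the converter. It does not skip headings that appear inside fenced code blocks.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def extract_sections(markdown_text):
    """Return a list of {'level', 'title', 'text'} dicts, one per section.

    Text before the first heading becomes a section with level 0 and an
    empty title. Sections with no title and no text are dropped.
    """
    sections = []
    current = {"level": 0, "title": "", "lines": []}
    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append(current)
            current = {"level": len(match.group(1)),
                       "title": match.group(2),
                       "lines": []}
        else:
            current["lines"].append(line)
    sections.append(current)
    return [
        {"level": s["level"], "title": s["title"],
         "text": "\n".join(s["lines"]).strip()}
        for s in sections
        if s["title"] or any(l.strip() for l in s["lines"])
    ]
```

The result could be stored as `structured_data["structured"]["sections"]` before the JSON is written.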
Configuration
Template Configuration
# joplin-template-config.yaml
processing:
  input_format: "joplin_markdown"
  output_formats:
    - "human_markdown"
    - "llm_json"
  retention_days: 30
conversion_rules:
  human_friendly:
    add_tables: true
    add_formatting: true
    add_visual_hierarchy: true
    add_navigation: true
  llm_optimized:
    minimize_tokens: true
    structure_data: true
    extract_metadata: true
    add_semantic_tags: true
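The configuration does not say what `retention_days` governs, and none of the scripts above read it. As one possible interpretation, the sketch below assumes it is the maximum age of already-processed exports kept in `processed/`. The function name `prune_old_files` and the `*.md` filter are assumptions made for illustration; the filter keeps `processing.log` from being deleted.

```python
import time
from pathlib import Path


def prune_old_files(directory, retention_days, pattern="*.md", now=None):
    """Delete files matching `pattern` whose modification time is more
    than `retention_days` days in the past. Returns the removed names."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # seconds per day
    removed = []
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

A call such as `prune_old_files("./processed", 30)` could run at the end of each processing pass.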
Automation
Set up a cron job or file watcher to process new exports automatically:
# Run every 5 minutes
*/5 * * * * cd /path/to/joplin-processing && ./process-joplin-export.sh
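For the file watcher alternative, a minimal polling sketch using only the Python standard library is shown here. The function names, the polling interval, and the `max_polls` parameter are assumptions for illustration. It runs the processing script from the script's own directory, as the cron entry does, whenever `.md` files are waiting.

```python
import subprocess
import time
from pathlib import Path


def pending_exports(watch_dir):
    """Return the sorted .md files currently waiting in watch_dir."""
    return sorted(p for p in Path(watch_dir).glob("*.md") if p.is_file())


def watch(watch_dir, script, interval=10.0, max_polls=None):
    """Poll watch_dir and run `script` whenever exports are present.

    Loops forever when max_polls is None. Returns the number of times
    the script was started.
    """
    script_path = Path(script).resolve()
    runs = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        if pending_exports(watch_dir):
            # Run from the script's directory so its relative paths resolve
            subprocess.run([str(script_path)], cwd=script_path.parent,
                           check=False)
            runs += 1
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
    return runs
```

Polling can fire while an export is still being written, so a real watcher may need to wait until a file's size stops changing before starting the script.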