# Joplin Processing Pipeline

This directory contains scripts and configurations for processing Joplin markdown exports.

## Structure

```
joplin-processing/
├── process-joplin-export.sh      # Main processing script
├── convert-to-human-md.py        # Convert Joplin to human-friendly markdown
├── convert-to-llm-json.py        # Convert Joplin to LLM-optimized JSON
├── joplin-template-config.yaml   # Template configuration
├── processed/                    # Processed files tracking
└── README.md                     # This file
```

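## Workflow

1. **Export**: Export Joplin notes as markdown.
2. **Place**: Drop the exported files in `../collab/fromjoplin/`.
3. **Trigger**: A cron job or file watcher runs the processing script, which picks up every `.md` file in that directory.
4. **Convert**: The conversion scripts produce both a human-friendly markdown version and an LLM-optimized JSON version.
5. **Store**: Results are written to `../../human/` and `../../llm/`, and an unmodified copy of each export goes to `../../artifacts/`.
6. **Track**: Each processed file is logged in `processed/processing.log`, and the original export is moved into `processed/`.
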
## Processing Script

```bash
#!/bin/bash
# process-joplin-export.sh

JOPLIN_DIR="../collab/fromjoplin"
HUMAN_DIR="../../human"
LLM_DIR="../../llm"
ARTIFACTS_DIR="../../artifacts"
PROCESSED_DIR="./processed"

# Make sure the output directories exist before writing to them
mkdir -p "$HUMAN_DIR" "$LLM_DIR" "$ARTIFACTS_DIR" "$PROCESSED_DIR"

# Process new Joplin exports
for file in "$JOPLIN_DIR"/*.md; do
    # Skip the unexpanded glob pattern when no exports are present
    if [[ -f "$file" ]]; then
        filename=$(basename "$file")
        echo "Processing $filename..."

        # Convert to human-friendly markdown
        python3 convert-to-human-md.py "$file" "$HUMAN_DIR/$filename"

        # Convert to LLM-optimized JSON
        python3 convert-to-llm-json.py "$file" "$LLM_DIR/${filename%.md}.json"

        # Store canonical version
        cp "$file" "$ARTIFACTS_DIR/$filename"

        # Log processing
        echo "$(date): Processed $filename" >> "$PROCESSED_DIR/processing.log"

        # Move processed file to avoid reprocessing
        mv "$file" "$PROCESSED_DIR/"
    fi
done
```

## Conversion Scripts

### Human-Friendly Markdown Converter

```python
# convert-to-human-md.py
import sys
import yaml  # third-party (PyYAML)
import json


def convert_joplin_to_human_md(input_file, output_file):
    """Convert Joplin markdown to human-friendly format."""
    with open(input_file, 'r', encoding='utf-8') as f:
        content = f.read()

    # Parse front matter if present
    # Add beautiful formatting, tables, headers, etc.
    # (Placeholder: content currently passes through unchanged.)

    # Write human-friendly version
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(content)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit(f"Usage: {sys.argv[0]} <input.md> <output.md>")
    convert_joplin_to_human_md(sys.argv[1], sys.argv[2])
```

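The `# Parse front matter if present` step in the converter is still a placeholder. The following is a minimal sketch of one way it could be filled in, assuming the export begins with a `---`-delimited block of flat `key: value` lines. Nested YAML would need a real parser such as PyYAML's `yaml.safe_load`. The helper name `split_front_matter` is hypothetical and not part of the existing script.

```python
def split_front_matter(text):
    """Split a note into (metadata dict, body).

    Assumption: front matter is a leading block fenced by '---' lines and
    holds only flat 'key: value' pairs. Anything else is returned untouched.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter present

    # Find the closing '---' fence
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        return {}, text  # unterminated block: treat everything as content

    metadata = {}
    for line in lines[1:end]:
        # Split on the first colon only, so values may contain colons
        key, sep, value = line.partition(":")
        if sep and key.strip():
            metadata[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return metadata, body
```

Inside `convert_joplin_to_human_md`, the returned `body` could then be formatted on its own while the metadata is rendered separately, for example as a small table at the top of the note.
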
### LLM-Optimized JSON Converter

```python
# convert-to-llm-json.py
import sys
import json
import yaml  # third-party (PyYAML)
from datetime import datetime


def convert_joplin_to_llm_json(input_file, output_file):
    """Convert Joplin markdown to LLM-optimized JSON."""
    with open(input_file, 'r', encoding='utf-8') as f:
        content = f.read()

    # Parse and structure for LLM consumption
    # Extract key-value pairs, sections, metadata

    structured_data = {
        "source": "joplin",
        "processed_at": datetime.now().isoformat(),
        "content": content,
        "structured": {}  # Extracted structured data (placeholder, currently empty)
    }

    # Write LLM-optimized version
    with open(output_file, 'w', encoding='utf-8') as f:
        json.dump(structured_data, f, indent=2)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit(f"Usage: {sys.argv[0]} <input.md> <output.json>")
    convert_joplin_to_llm_json(sys.argv[1], sys.argv[2])
```

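The `"structured"` field in the converter is left empty. As a rough sketch of what "extract sections" might look like, the hypothetical helper below splits a note on ATX headings (`#` through `######`). The output shape (`level`, `title`, `text`) is an assumption for illustration, not a format the pipeline defines.

```python
import re

# ATX heading: 1-6 '#' characters, whitespace, then the title
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def extract_sections(markdown_text):
    """Return one {'level', 'title', 'text'} dict per heading.

    Text before the first heading is dropped; '#' lines inside
    fenced code blocks are not treated as headings.
    """
    sections = []
    current = None
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # toggle on every fence line
        match = None if in_fence else HEADING.match(line)
        if match:
            current = {"level": len(match.group(1)),
                       "title": match.group(2),
                       "lines": []}
            sections.append(current)
        elif current is not None:
            current["lines"].append(line)
    return [
        {"level": s["level"], "title": s["title"],
         "text": "\n".join(s["lines"]).strip()}
        for s in sections
    ]
```

The result could be stored as `structured_data["structured"]["sections"]`, since a list of plain dicts serializes directly with `json.dump`.
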
## Configuration

### Template Configuration

```yaml
# joplin-template-config.yaml
processing:
  input_format: "joplin_markdown"
  output_formats:
    - "human_markdown"
    - "llm_json"
  retention_days: 30

conversion_rules:
  human_friendly:
    add_tables: true
    add_formatting: true
    add_visual_hierarchy: true
    add_navigation: true

  llm_optimized:
    minimize_tokens: true
    structure_data: true
    extract_metadata: true
    add_semantic_tags: true
```

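The converters shown in this README do not read this file yet. As a sketch of how the rule flags might be consumed, assume the YAML has already been loaded into a plain dict (for instance with PyYAML's `yaml.safe_load`). The helper `enabled_rules` and the inline `CONFIG` dict are hypothetical and only mirror part of the configuration above.

```python
# Stand-in for the dict that loading joplin-template-config.yaml would yield
CONFIG = {
    "conversion_rules": {
        "human_friendly": {
            "add_tables": True,
            "add_formatting": True,
            "add_visual_hierarchy": True,
            "add_navigation": True,
        },
        "llm_optimized": {
            "minimize_tokens": True,
            "structure_data": True,
            "extract_metadata": True,
            "add_semantic_tags": True,
        },
    },
}


def enabled_rules(config, profile):
    """List the rule names switched on for one conversion profile.

    Unknown profiles and missing sections yield an empty list.
    """
    rules = config.get("conversion_rules", {}).get(profile, {})
    return sorted(name for name, enabled in rules.items() if enabled is True)
```

A converter could call `enabled_rules(config, "human_friendly")` once at startup and apply only the listed transformations.
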
## Automation

Set up a cron job or file watcher to automatically process new exports:

```bash
# Run every 5 minutes
*/5 * * * * cd /path/to/joplin-processing && ./process-joplin-export.sh
```

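For the file-watcher option, a minimal polling sketch using only the standard library might look like the following. It assumes it is started from the `joplin-processing/` directory so that `./process-joplin-export.sh` resolves. The function names, the 5-second interval, and the `max_polls` parameter are illustrative choices, not part of the existing pipeline. A typical call would be `watch("../collab/fromjoplin")`, which polls until interrupted.

```python
import os
import subprocess
import time


def pending_exports(directory):
    """Return the sorted .md files currently waiting in `directory`."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".md") and os.path.isfile(os.path.join(directory, name))
    )


def run_pipeline():
    """Invoke the processing script from the current directory."""
    subprocess.run(["./process-joplin-export.sh"], check=True)


def watch(directory, process=run_pipeline, interval=5.0, max_polls=None):
    """Poll `directory` and call `process()` whenever exports are waiting.

    `max_polls` bounds the loop (handy for testing); None polls forever.
    Returns the number of times `process` was called.
    """
    runs = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        if pending_exports(directory):
            process()
            runs += 1
        polls += 1
        # Sleep only if another poll will follow
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
    return runs
```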
---