TSYSDevStack-SupportStack-L…/demo/docs/troubleshooting/README.md
reachableceo 55aa340a6c docs(demo): synchronize all documentation with 16-service stack
Fix all documentation to match the actual running stack. Every service
count, port number, credential, network name, container name, and
dependency is now accurate across all files.

Key changes:
- Remove all stale Portainer/portainer references (replaced by Dockhand)
- Fix project name from tsysdevstack to kneldevstack everywhere
- Fix volume name pattern (underscore not dash after project name)
- Fix network names (add -network suffix, correct subnet in commands)
- Fix Homepage category from Infrastructure to Developer Tools
- Add companion services (ta-redis, ta-elasticsearch) to all service lists
- Fix Dockhand dependency description (direct socket, not proxy)
- Remove port 4005 from all host-facing health check loops and port tables
- Fix broken commands (docker exec dockhand docker version, wrong volume globs)
- Fix INFLUXDB_ADMIN_USER credential references from demo_admin to admin
- Fix Grafana datasource user to match
- Fix misleading "ports 4000-4018" range to explicit port list
- Add Docker Socket Proxy internal-only notes where applicable
- Update root AGENTS.md service categories to match compose labels

💘 Generated with Crush

Assisted-by: GLM-5.1 via Crush <crush@charm.land>
2026-04-27 13:07:02 -05:00


# TSYS Developer Support Stack - Troubleshooting Guide
## Common Issues and Solutions
### Services Not Starting
#### Issue: Docker daemon not running
**Symptoms**: `Cannot connect to the Docker daemon`
**Solution**:
```bash
sudo systemctl start docker
sudo systemctl enable docker
```
#### Issue: Port conflicts
**Symptoms**: `port is already allocated` or `address already in use` errors when a container starts
**Solution**:
```bash
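# Check what's using the port (root is needed to see process names;
# use `ss -tulpn` instead if netstat is not installed)
sudo netstat -tulpn | grep ':4000 '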
# Kill conflicting process
sudo fuser -k 4000/tcp
```
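To check every host port the stack needs in one pass, the single-port check above can be generalized. The following is a minimal sketch, not part of the repository: `ports_in_use` is a hypothetical helper that reads `ss -Hltn` output on standard input and prints whichever of the requested ports already has a TCP listener.

```bash
# Hypothetical helper (not part of the repository): reads `ss -Hltn` output on
# stdin and prints each requested port that already has a TCP listener.
ports_in_use() {
    awk -v wanted="$*" '
        BEGIN {
            n = split(wanted, w, " ")
            for (i = 1; i <= n; i++) want[w[i]] = 1
        }
        {
            # Column 4 is "address:port"; the port follows the last colon,
            # which also handles IPv6 forms such as [::]:4000.
            k = split($4, parts, ":")
            port = parts[k]
            if (port in want && !(port in seen)) {
                seen[port] = 1
                print port
            }
        }
    '
}

# Usage on the Docker host (port list taken from "Port Requirements"):
#   ss -Hltn | ports_in_use 4000 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4017 4018
```

An empty result means none of the listed ports is taken. Any port that is printed can then be inspected with the commands above.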
#### Issue: Environment variables not set
**Symptoms**: `Variable not found` errors
**Solution**:
```bash
# Check demo.env exists and is populated
cat demo.env
# Re-run user detection
./scripts/demo-stack.sh deploy
```
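Reading `demo.env` by eye makes it easy to miss an empty value. As a rough sketch, the hypothetical helper below lists variables that are absent or empty in a `KEY=VALUE` file. The variable names in the usage comment are placeholders, not names confirmed by this guide.

```bash
# Hypothetical helper: prints every requested variable that is absent from,
# or has nothing after "=" in, a KEY=VALUE env file.
# Returns non-zero if at least one variable is missing.
missing_vars() {
    local env_file=$1 status=0 name
    shift
    for name in "$@"; do
        # Require "NAME=" followed by at least one character.
        if ! grep -Eq "^${name}=.+" "$env_file"; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}

# Usage (placeholder names; substitute the variables that your
# docker-compose.yml actually references):
#   missing_vars demo.env DEMO_UID DEMO_GID || echo "demo.env is incomplete"
```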
### Health Check Failures
#### Issue: Services stuck in "starting" state
**Symptoms**: Health checks time out
**Solution**:
```bash
# Check service logs
docker compose logs [service-name]
# Restart specific service
docker compose restart [service-name]
# Check resource usage
docker stats
```
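Some services simply need longer than expected to pass their health check, so it can help to wait with a deadline before restarting anything. The sketch below assumes the container defines a health check. `wait_until` and `is_healthy` are hypothetical helper names, and the timeout values are only examples.

```bash
# Hypothetical helper: re-runs a command every <interval> seconds until it
# succeeds or more than <timeout> seconds have elapsed.
# Returns 0 on success and 1 on timeout.
wait_until() {
    local timeout=$1 interval=$2 elapsed=0
    shift 2
    until "$@"; do
        elapsed=$((elapsed + interval))
        if [ "$elapsed" -gt "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
    done
}

# Succeeds only when Docker reports the container's health status as "healthy".
# A container without a health check makes the template fail, which counts as
# "not healthy" here.
is_healthy() {
    [ "$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)" = "healthy" ]
}

# Usage: wait up to 120 seconds, polling every 5 seconds, then show logs on timeout.
#   wait_until 120 5 is_healthy kneldevstack-supportstack-demo-dockhand \
#       || docker compose logs dockhand
```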
#### Issue: Network connectivity problems
**Symptoms**: Services can't reach each other
**Solution**:
```bash
# Check network exists
docker network ls | grep kneldevstack
# Recreate network
docker network create --subnet 192.168.3.0/24 --gateway 192.168.3.1 kneldevstack-supportstack-demo-network
# Restart stack
docker compose down && docker compose up -d
```
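Before recreating the network, it may be worth confirming whether it is missing or merely on a different subnet, since `docker network create` fails if a network with that name already exists. A minimal sketch, using a hypothetical helper named `network_subnet_ok`:

```bash
# Hypothetical helper: succeeds when the named Docker network exists and its
# first IPAM subnet equals the expected value.
network_subnet_ok() {
    local name=$1 expected=$2 actual
    actual=$(docker network inspect \
        --format '{{range .IPAM.Config}}{{.Subnet}} {{end}}' "$name" 2>/dev/null) || return 1
    # Compare only the first subnet; any further entries are ignored.
    [ "${actual%% *}" = "$expected" ]
}

# Usage:
#   network_subnet_ok kneldevstack-supportstack-demo-network 192.168.3.0/24 \
#       || echo "Network is missing or uses a different subnet"
```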
### Permission Issues
#### Issue: File ownership problems
**Symptoms**: `Permission denied` errors
**Solution**:
```bash
# Check current user
id
# Verify UID/GID detection
grep -E "(UID|GID)" demo.env
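# Fix volume permissions (the glob is expanded inside the root shell, because
# /var/lib/docker is normally not readable by regular users)
sudo sh -c 'chown -R "$0:$1" /var/lib/docker/volumes/kneldevstack-supportstack-demo_*' "$(id -u)" "$(id -g)"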
```
#### Issue: Docker group access
**Symptoms**: `Got permission denied` errors
**Solution**:
```bash
# Add user to docker group
sudo usermod -aG docker $USER
# Log out and back in, or run:
newgrp docker
```
### Service-Specific Issues
#### Pi-hole DNS Issues
**Symptoms**: DNS resolution failures
**Solution**:
```bash
# Check Pi-hole status
docker exec kneldevstack-supportstack-demo-pihole pihole status
# Test DNS resolution
nslookup google.com localhost
# Restart DNS service
docker exec kneldevstack-supportstack-demo-pihole pihole restartdns
```
#### Grafana Data Source Connection
**Symptoms**: InfluxDB data source not working
**Solution**:
```bash
# Test InfluxDB connectivity
curl http://localhost:4008/ping
# Check Grafana logs
docker compose logs grafana
# Verify data source configuration
# Navigate to: http://localhost:4009/datasources
```
#### Dockhand Container Access
**Symptoms**: Can't manage containers
**Solution**:
```bash
# Check Dockhand logs
docker compose logs dockhand
# Verify Docker socket access (check socket is mounted)
docker inspect kneldevstack-supportstack-demo-dockhand --format '{{.Mounts}}' | grep docker.sock
# Restart Dockhand
docker compose restart dockhand
```
### Performance Issues
#### Issue: High memory usage
**Symptoms**: System becomes slow
**Solution**:
```bash
# Check resource usage
docker stats
# Set memory limits in docker-compose.yml
# Add to each service:
#   deploy:
#     resources:
#       limits:
#         memory: 512M
# Restart with new limits
docker compose up -d
```
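After restarting, it can be useful to confirm that the limits were actually applied. The sketch below reads the limit Docker recorded for each container in the project. `format_mem_limit` and `show_mem_limits` are hypothetical helper names, and the commands assume they are run from the directory containing the compose file.

```bash
# Hypothetical helper: converts the byte count Docker stores in
# .HostConfig.Memory into a readable value. Docker uses 0 for "no limit".
format_mem_limit() {
    if [ "$1" -eq 0 ]; then
        echo "unlimited"
    else
        echo "$(( $1 / 1048576 ))MiB"
    fi
}

# Prints "<container name> <limit>" for every container in the compose project.
show_mem_limits() {
    docker compose ps -q | while read -r id; do
        # Output is "/name bytes"; split it into $1 and $2.
        set -- $(docker inspect --format '{{.Name}} {{.HostConfig.Memory}}' "$id")
        echo "${1#/} $(format_mem_limit "$2")"
    done
}

# Usage:
#   show_mem_limits
```

A container that still shows `unlimited` did not pick up a limit, which usually points at the limit being placed under the wrong service or indentation level.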
#### Issue: Slow startup times
**Symptoms**: Services take >60 seconds to start
**Solution**:
```bash
# Check system resources
free -h
df -h
# Pull images in advance
docker compose pull
# Check for conflicting services
docker ps -a
```
## Diagnostic Commands
### System Information
```bash
# System info
uname -a
free -h
df -h
# Docker info
docker version
docker compose version
docker system df
```
### Service Status
```bash
# All services status
docker compose ps
# Service logs
docker compose logs
# Resource usage
docker stats
# Network info
docker network ls
docker network inspect kneldevstack-supportstack-demo-network
```
### Health Checks
```bash
# Test all endpoints
for port in 4000 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4017 4018; do
    curl -f -s -o /dev/null --max-time 5 "http://localhost:$port" && echo "Port $port: OK" || echo "Port $port: FAIL"
done
```
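The loop above reports only port numbers. A variant that names each service and returns a non-zero status on any failure may be easier to read and to use from other scripts. This is a sketch under the assumption that the port-to-service mapping matches the "Port Requirements" list in this guide. The function names are hypothetical.

```bash
# Maps a host port to its service name, per the "Port Requirements" list.
service_for_port() {
    case "$1" in
        4000) echo "Homepage" ;;
        4006) echo "Pi-hole" ;;
        4007) echo "Dockhand" ;;
        4008) echo "InfluxDB" ;;
        4009) echo "Grafana" ;;
        4010) echo "Draw.io" ;;
        4011) echo "Kroki" ;;
        4012) echo "Atomic Tracker" ;;
        4013) echo "ArchiveBox" ;;
        4014) echo "Tube Archivist" ;;
        4015) echo "Wakapi" ;;
        4017) echo "MailHog" ;;
        4018) echo "Atuin" ;;
        *)    echo "unknown" ;;
    esac
}

# Probes each given port over HTTP and returns non-zero if any probe fails.
check_endpoints() {
    local failed=0 port
    for port in "$@"; do
        if curl -f -s -o /dev/null --max-time 5 "http://localhost:$port"; then
            echo "OK   $port $(service_for_port "$port")"
        else
            echo "FAIL $port $(service_for_port "$port")"
            failed=1
        fi
    done
    return "$failed"
}

# Usage:
#   check_endpoints 4000 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4017 4018
```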
## Getting Additional Help
### Check Logs First
```bash
# All service logs
docker compose logs
# Specific service logs
docker compose logs [service-name]
# Follow logs in real-time
docker compose logs -f [service-name]
```
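With many services running, the combined log output is long. One possible way to narrow it down is to filter for common failure keywords. The keyword list below is an assumption and will miss errors phrased differently, and `filter_errors` is a hypothetical helper name.

```bash
# Hypothetical helper: keeps only log lines that mention a common failure
# keyword (case-insensitive). Exits 0 even when nothing matches.
filter_errors() {
    grep -iE 'error|fatal|panic|denied|refused' || true
}

# Usage:
#   docker compose logs --no-color --since 15m | filter_errors
#   docker compose logs --no-color --tail 200 grafana | filter_errors
```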
### Validation Scripts
```bash
# Run comprehensive validation
./scripts/validate-all.sh
# Run test suite
./scripts/demo-test.sh full
# Run specific test categories
./scripts/demo-test.sh security
./scripts/demo-test.sh permissions
./scripts/demo-test.sh network
```
### Reset and Restart
```bash
# Complete reset (removes all data)
docker compose down -v
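# Optional: also removes stopped containers, unused networks, dangling images,
# and build cache for ALL projects on this host, not only this stack
docker system prune -f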
# Fresh deployment
./scripts/demo-stack.sh deploy
```
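Because the reset deletes all volumes, a confirmation step can guard against running it by accident. A minimal sketch, with `confirm_reset` as a hypothetical helper name:

```bash
# Hypothetical helper: asks for explicit confirmation on stdin and succeeds
# only if the answer is exactly "yes".
confirm_reset() {
    local answer
    printf 'This deletes all data volumes of the demo stack. Type "yes" to continue: ' >&2
    read -r answer
    [ "$answer" = "yes" ]
}

# Usage:
#   confirm_reset && docker compose down -v && ./scripts/demo-stack.sh deploy
```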
## Known Limitations
### Demo Mode Restrictions
- No data persistence between restarts
- Hardcoded demo credentials
- No external network access
- No security hardening
### Resource Requirements
- Minimum 8 GB RAM recommended
- Minimum 10 GB free disk space
- Docker daemon must be running
- User must be in docker group
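
The RAM and disk requirements above can be checked before deploying. The sketch below is Linux-only and uses hypothetical helper names. The thresholds in the usage comment are assumptions: the RAM threshold sits a little below 8192 MiB because a machine sold as 8 GB typically reports less than that in `/proc/meminfo`.

```bash
# Total physical memory in MiB, read from /proc/meminfo (Linux only).
# An alternative file can be passed as the first argument.
total_mem_mib() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Free space in MiB on the filesystem that holds the given path.
free_disk_mib() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Compares an available amount against a required minimum (both in MiB).
meets_minimum() {
    local label=$1 have=$2 need=$3
    if [ "$have" -ge "$need" ]; then
        echo "OK   $label: $have MiB"
    else
        echo "LOW  $label: $have MiB (need $need MiB)"
        return 1
    fi
}

# Usage (thresholds are assumptions derived from the list above):
#   meets_minimum RAM  "$(total_mem_mib)" 7500
#   meets_minimum Disk "$(free_disk_mib "$(docker info --format '{{.DockerRootDir}}')")" 10240
```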
### Port Requirements
The following host ports must be available (the list is not a contiguous range):
- 4000: Homepage
- 4006: Pi-hole
- 4007: Dockhand
- 4008: InfluxDB
- 4009: Grafana
- 4010: Draw.io
- 4011: Kroki
- 4012: Atomic Tracker
- 4013: ArchiveBox
- 4014: Tube Archivist
- 4015: Wakapi
- 4017: MailHog
- 4018: Atuin
Note: Docker Socket Proxy (4005), Redis, and Elasticsearch are internal-only and do not require host ports.
## Contact and Support
If issues persist after trying these solutions:
1. Document the exact error message
2. Include system information (OS, Docker version)
3. List steps to reproduce the issue
4. Include relevant log output
5. Specify demo vs production context
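
Most of the items above can be gathered into a single file with the diagnostic commands already listed in this guide. The following is only a sketch: `collect_report` and the default file name are hypothetical, and the output should be reviewed before sharing, since logs may contain credentials.

```bash
# Hypothetical helper: writes system information, Docker versions, service
# status, and recent logs to one file (default: support-report.txt).
collect_report() {
    local out=${1:-support-report.txt}
    {
        echo "== System =="
        uname -a
        echo "== Docker =="
        docker version 2>&1
        docker compose version 2>&1
        echo "== Services =="
        docker compose ps 2>&1
        echo "== Recent logs =="
        docker compose logs --no-color --tail 100 2>&1
    } > "$out"
    echo "Wrote $out"
}

# Usage (run from the directory containing the compose file):
#   collect_report
```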
Remember: This is a demo configuration designed for development and testing purposes only.