Operations Runbook
This document captures the operational playbooks for the MerchantsOfHope Supply & Demand Portal. It is intended for on-call engineers and SREs maintaining the platform across Coolify environments.
1. Service Topology
- Backend API (`merchantsofhope-supplyanddemandportal-backend`)
  - Node.js 18, Express server on port 3001.
  - Entry point waits for PostgreSQL, runs migrations, optional seeding (`RUN_SEED`).
  - Health probe: `GET /api/health`.
  - Persistent uploads stored at `/app/uploads/resumes` (mounted volume required in production).
- Frontend (`merchantsofhope-supplyanddemandportal-frontend`)
  - React 18 application served by the CRA dev server (in dev) or static bundle (in production image).
  - Communicates with the backend via `REACT_APP_API_URL` (set to the internal service URL).
- PostgreSQL (`merchantsofhope-supplyanddemandportal-database`)
  - PostgreSQL 15, health-checked via `pg_isready`.
  - Volume-backed data directory `merchantsofhope-supplyanddemandportal-postgres-data`.
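
The backend's wait-then-migrate startup can be pictured with the minimal sketch below. It is not the project's actual entrypoint: `waitForDatabase`, `probe`, and `intervalMs` are hypothetical names, and the readiness probe is injected so that any check (for example a `SELECT 1` through a `pg` client) could be plugged in. The 60000 ms default is assumed to mirror `DB_WAIT_TIMEOUT_MS`.

```javascript
// Hypothetical sketch of a wait-for-database loop; not the project's real entrypoint.
// `probe` is any async function that resolves once the database accepts connections
// and rejects otherwise.
async function waitForDatabase(probe, { timeoutMs = 60000, intervalMs = 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  for (;;) {
    attempts += 1;
    try {
      await probe();
      return attempts; // number of attempts it took to connect
    } catch (err) {
      // Give up if another sleep would take us past the deadline.
      if (Date.now() + intervalMs > deadline) {
        throw new Error(`database not ready after ${timeoutMs} ms: ${err.message}`);
      }
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
}
```

Under this sketch, migrations and optional seeding would only run after `waitForDatabase` resolves, so a slow database start delays the boot rather than failing it outright.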
2. Environment Variables
| Variable | Purpose | Default |
|---|---|---|
| `POSTGRES_*` | Database credentials used by backend and DB container | See `.env.example` |
| `DATABASE_URL` | Overrides assembled connection string | Derived automatically |
| `JWT_SECRET` | Required for signing auth tokens | none (must be supplied) |
| `RATE_LIMIT_MAX`, `RATE_LIMIT_WINDOW_MS` | Express rate limiter configuration | 100 req / 15 min |
| `DB_POOL_MAX`, `DB_POOL_IDLE_MS`, `DB_POOL_CONNECTION_TIMEOUT_MS` | pg connection pool tuning | 10 / 30000 / 5000 |
| `DB_WAIT_TIMEOUT_MS` | Maximum wait for database readiness in entrypoint | 60000 |
| `RUN_MIGRATIONS` | Run schema migrations on container boot | true |
| `RUN_SEED` | Run seed data on container boot | false |
| `USE_DOCKER_TEST_DB` | Jest helper flag (set to false in CI to reuse managed Postgres) | true locally |
| `UPLOAD_DIR` | Resume storage path | `uploads/resumes` |
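
To make the defaults concrete, here is a minimal sketch of how a Node.js service might read these variables. The function names (`loadConfig`, `intFromEnv`, `boolFromEnv`) and the shape of the returned object are hypothetical, and mapping the pool variables onto node-postgres `Pool` options (`max`, `idleTimeoutMillis`, `connectionTimeoutMillis`) is an assumption rather than a description of the backend's code.

```javascript
// Hypothetical config loader; default values are copied from the table above.
function intFromEnv(env, name, fallback) {
  const raw = env[name];
  if (raw === undefined || raw === '') return fallback;
  const value = Number.parseInt(raw, 10);
  if (Number.isNaN(value)) throw new Error(`${name} must be an integer, got "${raw}"`);
  return value;
}

function boolFromEnv(env, name, fallback) {
  const raw = env[name];
  if (raw === undefined || raw === '') return fallback;
  return raw.toLowerCase() === 'true';
}

function loadConfig(env = process.env) {
  // JWT_SECRET has no default, so fail fast when it is missing.
  if (!env.JWT_SECRET) throw new Error('JWT_SECRET must be supplied');
  return {
    jwtSecret: env.JWT_SECRET,
    rateLimit: {
      max: intFromEnv(env, 'RATE_LIMIT_MAX', 100),
      windowMs: intFromEnv(env, 'RATE_LIMIT_WINDOW_MS', 15 * 60 * 1000),
    },
    pool: {
      max: intFromEnv(env, 'DB_POOL_MAX', 10),
      idleTimeoutMillis: intFromEnv(env, 'DB_POOL_IDLE_MS', 30000),
      connectionTimeoutMillis: intFromEnv(env, 'DB_POOL_CONNECTION_TIMEOUT_MS', 5000),
    },
    dbWaitTimeoutMs: intFromEnv(env, 'DB_WAIT_TIMEOUT_MS', 60000),
    runMigrations: boolFromEnv(env, 'RUN_MIGRATIONS', true),
    runSeed: boolFromEnv(env, 'RUN_SEED', false),
    uploadDir: env.UPLOAD_DIR || 'uploads/resumes',
  };
}
```

Failing at startup on a missing `JWT_SECRET` or a non-numeric limit tends to surface misconfiguration in the deploy logs instead of at the first request.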
3. Deployments (Coolify)
- Ensure the Gitea pipeline has published new backend/frontend images (see workflow summary for SHA tags).
- In Coolify, update `BACKEND_IMAGE`/`FRONTEND_IMAGE` environment variables to the new tags.
- Trigger a deployment; Coolify will:
  - Bring up PostgreSQL (if not already running).
  - Start backend, wait for DB, run migrations, and expose `/api/health`.
  - Start frontend once backend healthcheck passes.
- Post-deploy checks:
  - `curl https://<domain>/api/health` returns `200` with JSON payload.
  - Frontend login screen reachable.
  - Review container logs for migration output (`docker compose logs backend` in Coolify shell).
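
The health check in the post-deploy list can also be scripted. The sketch below assumes only what the runbook states (a `200` response with a JSON body from `/api/health`); `checkHealth` and its return shape are hypothetical, and the `fetch` implementation is injectable so the function can be exercised without a live deployment. Node.js 18 ships a global `fetch`.

```javascript
// Hypothetical post-deploy smoke check; relies on the global fetch in Node.js 18+.
async function checkHealth(baseUrl, fetchImpl = fetch) {
  const url = new URL('/api/health', baseUrl).toString();
  const response = await fetchImpl(url);
  if (response.status !== 200) {
    return { ok: false, url, reason: `unexpected status ${response.status}` };
  }
  try {
    const payload = await response.json();
    return { ok: true, url, payload };
  } catch (err) {
    // A 200 with a non-JSON body usually means a proxy answered instead of the API.
    return { ok: false, url, reason: 'response body is not valid JSON' };
  }
}
```

Calling `checkHealth('https://<domain>')` with the real domain substituted would return `{ ok: true, ... }` once the backend is serving traffic.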
4. Rollback Procedure
- Identify the previous known-good image tags (from Gitea workflow history or Coolify activity log).
- Update `BACKEND_IMAGE`/`FRONTEND_IMAGE` to the old tags.
- Redeploy in Coolify. Migrations are idempotent; no additional action needed.
- Validate health endpoints and smoke-test the UI.
5. Local Development
- Run `docker compose up --build` to start the stack. The backend container waits for PostgreSQL, runs migrations automatically, and skips seeding by default. To seed once, run `RUN_SEED=true docker compose up backend` or execute `docker compose exec ... npm run seed` manually.
- `./scripts/run-ci-tests.sh` runs lint + unit tests with the same coverage thresholds as CI.
- Backend tests rely on Docker; ensure Docker Desktop/Engine is running.
6. Backup & Restore
Database
- Use `docker compose exec merchantsofhope-supplyanddemandportal-database pg_dump -U <user> <db>` to generate a dump file.
- Restore by piping the dump into `psql` inside the running container.
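
As a rough illustration of the dump and restore pair, the helper below assembles both command lines. `buildBackupCommands` and its parameters are hypothetical, and values are interpolated without shell quoting, so it is only suitable for trusted input. The `-T` flag disables pseudo-TTY allocation in `docker compose exec`, which is generally needed when redirecting output to a file or piping a file into the container.

```javascript
// Hypothetical helper that builds backup/restore command lines for review or logging.
// Values are interpolated as-is (no shell quoting); use only with trusted input.
function buildBackupCommands({ service, user, db, file }) {
  const exec = `docker compose exec -T ${service}`;
  return {
    // Write a plain-SQL dump of the database to a file on the host.
    dump: `${exec} pg_dump -U ${user} ${db} > ${file}`,
    // Feed the dump back into psql running inside the container.
    restore: `${exec} psql -U ${user} ${db} < ${file}`,
  };
}
```

For example, passing `service: 'merchantsofhope-supplyanddemandportal-database'` together with the real user, database name, and a dated file name yields a matched pair of commands to paste into the Coolify shell.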
Uploads
- Archive the `merchantsofhope-supplyanddemandportal-uploads` volume (Coolify: Settings → Backups → Volume Snapshot).
7. Monitoring & Alerting
- Healthcheck endpoints should be wired into external monitoring (e.g., Uptime Kuma, Grafana Cloud).
- Rate limiter defaults protect against bursts; adjust `RATE_LIMIT_MAX`/`RATE_LIMIT_WINDOW_MS` if legitimate traffic patterns trigger 429s.
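
To reason about how the two settings interact before changing them, a simplified fixed-window counter is sketched below. It is an illustration only and does not claim to match the middleware the backend uses; `createFixedWindowLimiter` and `allow` are hypothetical names, and the defaults are the documented 100 requests per 15 minutes.

```javascript
// Simplified fixed-window counter illustrating RATE_LIMIT_MAX / RATE_LIMIT_WINDOW_MS.
// Illustration only; not the middleware the backend actually uses.
function createFixedWindowLimiter({ max = 100, windowMs = 15 * 60 * 1000 } = {}) {
  const windows = new Map(); // client key -> { start, count }
  return function allow(key, now = Date.now()) {
    const current = windows.get(key);
    // Start a fresh window for new clients or when the old window has expired.
    if (!current || now - current.start >= windowMs) {
      windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count < max) {
      current.count += 1;
      return true;
    }
    return false; // the caller would answer with HTTP 429
  };
}
```

In this model a client that exhausts its quota early stays blocked until the window expires, so raising `RATE_LIMIT_MAX` relieves sustained legitimate load while shortening `RATE_LIMIT_WINDOW_MS` mainly shortens recovery time after a burst.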
8. Incident Response Checklist
- Validate Health – `curl` the backend health endpoint and inspect Coolify container logs.
- Check Database – `docker compose exec ... pg_isready` and `docker compose exec ... psql -c 'SELECT NOW();'`.
- Restart Services – In Coolify or locally, redeploy the backend/frontend containers (the entrypoint will re-run migrations safely).
- Rollback if Needed – Follow rollback steps above.
- Postmortem – Capture root cause, update this runbook with remediation notes.
9. Security Posture
- JWT secrets must be at least 32 bytes and rotated regularly.
- Uploaded files are sanitized and stored on disk; configure antivirus scanning if compliance requires it.
- Rate limiting is enabled globally; consider pairing with IP allowlists at the reverse proxy if stricter controls are needed.
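
For the 32-byte requirement on JWT secrets, a minimal sketch of generating and checking a secret with Node's built-in `crypto` module follows. `generateJwtSecret` and `isJwtSecretLongEnough` are hypothetical helpers, and measuring the secret's UTF-8 byte length assumes the secret is passed to the signing library as a plain string.

```javascript
const crypto = require('node:crypto');

// Generate a secret from `bytes` random bytes, encoded as URL-safe base64.
function generateJwtSecret(bytes = 32) {
  return crypto.randomBytes(bytes).toString('base64url');
}

// Check that a secret string is at least `minBytes` long when encoded as UTF-8.
function isJwtSecretLongEnough(secret, minBytes = 32) {
  return Buffer.byteLength(secret, 'utf8') >= minBytes;
}
```

A check like `isJwtSecretLongEnough(process.env.JWT_SECRET)` at startup would reject short or placeholder secrets before any token is signed.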
Keep this runbook updated as infrastructure evolves.