# Operations Runbook

This document captures the operational playbooks for the MerchantsOfHope Supply & Demand Portal. It is intended for on-call engineers and SREs maintaining the platform across Coolify environments.

## 1. Service Topology

- **Backend API (`merchantsofhope-supplyanddemandportal-backend`)**
  - Node.js 18, Express server on port 3001.
  - The entrypoint waits for PostgreSQL, runs migrations, and optionally seeds data (`RUN_SEED`).
  - Health probe: `GET /api/health`.
  - Persistent uploads are stored at `/app/uploads/resumes` (a mounted volume is required in production).
- **Frontend (`merchantsofhope-supplyanddemandportal-frontend`)**
  - React 18 application served by the Create React App (CRA) dev server in development, or as a static bundle in the production image.
  - Communicates with the backend via `REACT_APP_API_URL` (set to the internal service URL).
- **PostgreSQL (`merchantsofhope-supplyanddemandportal-database`)**
  - PostgreSQL 15, health-checked via `pg_isready`.
  - Volume-backed data directory `merchantsofhope-supplyanddemandportal-postgres-data`.
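
The backend boot order described above (wait for PostgreSQL, migrate, optionally seed, then serve) can be sketched roughly as follows. This is a minimal illustration, not the actual entrypoint: `waitForDatabase`, `boot`, and the injected `isReady`, `migrate`, `seed`, and `startServer` callbacks are hypothetical names, and the one-second polling interval is an assumption.

```javascript
// Hypothetical sketch of the boot sequence; the real entrypoint may differ.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll `isReady` until it resolves to true or `timeoutMs` elapses.
async function waitForDatabase(isReady, { timeoutMs = 60000, intervalMs = 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      if (await isReady()) return;
    } catch (err) {
      // A refused connection simply means "not ready yet".
    }
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Database not ready after ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}

// Run the boot steps in order. Every step is injected, so nothing here
// depends on a particular migration tool or database driver.
async function boot({ isReady, migrate, seed, startServer }, env = process.env) {
  const timeoutMs = Number(env.DB_WAIT_TIMEOUT_MS) || 60000;
  await waitForDatabase(isReady, { timeoutMs });
  if (env.RUN_MIGRATIONS !== 'false') await migrate(); // on unless explicitly disabled
  if (env.RUN_SEED === 'true') await seed();           // off unless explicitly enabled
  return startServer();
}
```

Injecting the steps keeps the ordering logic easy to exercise without a live database.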

## 2. Environment Variables

| Variable | Purpose | Default |
| --- | --- | --- |
| `POSTGRES_*` | Database credentials used by backend and DB container | See `.env.example` |
| `DATABASE_URL` | Overrides assembled connection string | Derived automatically |
| `JWT_SECRET` | Required for signing auth tokens | none (must be supplied) |
| `RATE_LIMIT_MAX`, `RATE_LIMIT_WINDOW_MS` | Express rate limiter configuration | 100 req / 15 min |
| `DB_POOL_MAX`, `DB_POOL_IDLE_MS`, `DB_POOL_CONNECTION_TIMEOUT_MS` | pg connection pool tuning | 10 / 30000 / 5000 |
| `DB_WAIT_TIMEOUT_MS` | Maximum wait for database readiness in entrypoint | 60000 |
| `RUN_MIGRATIONS` | Run schema migrations on container boot | `true` |
| `RUN_SEED` | Run seed data on container boot | `false` |
| `USE_DOCKER_TEST_DB` | Jest helper flag (set to `false` in CI to reuse managed Postgres) | `true` locally |
| `UPLOAD_DIR` | Resume storage path | `uploads/resumes` |
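
As an illustration of how these defaults fit together, the sketch below parses the variables into a single config object. It is an assumption about structure, not the backend's actual loader: `loadConfig`, `intFromEnv`, and `boolFromEnv` are hypothetical helpers, and the mapping of the `DB_POOL_*` variables onto the node-postgres pool options `max`, `idleTimeoutMillis`, and `connectionTimeoutMillis` is inferred from their names.

```javascript
// Parse an integer env var, falling back to a default when unset or invalid.
function intFromEnv(env, name, fallback) {
  const parsed = Number.parseInt(env[name] ?? '', 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

// Parse a boolean flag; only the literal strings "true" and "false" count.
function boolFromEnv(env, name, fallback) {
  if (env[name] === 'true') return true;
  if (env[name] === 'false') return false;
  return fallback;
}

// Hypothetical loader applying the defaults from the table above.
function loadConfig(env = process.env) {
  if (!env.JWT_SECRET) {
    throw new Error('JWT_SECRET must be supplied');
  }
  return {
    jwtSecret: env.JWT_SECRET,
    rateLimit: {
      max: intFromEnv(env, 'RATE_LIMIT_MAX', 100),
      windowMs: intFromEnv(env, 'RATE_LIMIT_WINDOW_MS', 15 * 60 * 1000), // 15 min
    },
    pool: {
      max: intFromEnv(env, 'DB_POOL_MAX', 10),
      idleTimeoutMillis: intFromEnv(env, 'DB_POOL_IDLE_MS', 30000),
      connectionTimeoutMillis: intFromEnv(env, 'DB_POOL_CONNECTION_TIMEOUT_MS', 5000),
    },
    dbWaitTimeoutMs: intFromEnv(env, 'DB_WAIT_TIMEOUT_MS', 60000),
    runMigrations: boolFromEnv(env, 'RUN_MIGRATIONS', true),
    runSeed: boolFromEnv(env, 'RUN_SEED', false),
    uploadDir: env.UPLOAD_DIR || 'uploads/resumes',
  };
}
```

Failing fast on a missing `JWT_SECRET` mirrors the table's "must be supplied" requirement; every other variable degrades to its documented default.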

## 3. Deployments (Coolify)

1. Ensure the Gitea pipeline has published new backend/frontend images (see the workflow summary for SHA tags).
2. In Coolify, update the `BACKEND_IMAGE` / `FRONTEND_IMAGE` environment variables to the new tags.
3. Trigger a deployment; Coolify will:
   - Bring up PostgreSQL (if not already running).
   - Start the backend, which waits for the DB, runs migrations, and exposes `/api/health`.
   - Start the frontend once the backend healthcheck passes.
4. Post-deploy checks:
   - `curl https://<domain>/api/health` returns `200` with a JSON payload.
   - The frontend login screen is reachable.
   - Container logs show the expected migration output (`docker compose logs backend` in the Coolify shell).
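
The health check in step 4 can be automated with a small poller. The sketch below is one possible shape, assuming Node.js 18's global `fetch`; `waitForHealthy` and its `attempts`, `delayMs`, and `fetchImpl` options are hypothetical, and the retry counts are arbitrary starting points.

```javascript
// Poll a health URL until it answers HTTP 200 with a JSON body, or give up.
// `fetchImpl` defaults to the global fetch available in Node.js 18.
async function waitForHealthy(url, { attempts = 10, delayMs = 3000, fetchImpl = fetch } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      const res = await fetchImpl(url);
      if (res.status === 200) {
        return await res.json(); // throws if the body is not JSON, which triggers a retry
      }
      lastError = new Error(`Unexpected status ${res.status}`);
    } catch (err) {
      lastError = err;
    }
    if (attempt < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  const reason = lastError ? lastError.message : 'no attempts made';
  throw new Error(`${url} not healthy after ${attempts} attempts: ${reason}`);
}

// Usage (replace <domain> with the deployed host):
//   waitForHealthy('https://<domain>/api/health').then(console.log);
```

Retrying for a short period tolerates the window in which the backend is still waiting for the database or running migrations.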

## 4. Rollback Procedure

1. Identify the previous known-good image tags (from the Gitea workflow history or the Coolify activity log).
2. Update `BACKEND_IMAGE` / `FRONTEND_IMAGE` to the old tags.
3. Redeploy in Coolify. Migrations are idempotent, so re-running them on boot is safe and needs no additional action. Redeploying an older image does not revert schema changes already applied by the newer release, so confirm the older code is compatible with the current schema.
4. Validate the health endpoints and smoke-test the UI.

## 5. Local Development

- Run `docker compose up --build` to start the stack. The backend container waits for PostgreSQL, runs migrations automatically, and skips seeding by default.
- To seed once, run `RUN_SEED=true docker compose up backend`, or execute `docker compose exec ... npm run seed` manually.
- `./scripts/run-ci-tests.sh` runs lint and unit tests with the same coverage thresholds as CI.
- Backend tests rely on Docker; ensure Docker Desktop or Docker Engine is running.
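
## 6. Backup & Restore

### Database

- Generate a dump with `docker compose exec -T merchantsofhope-supplyanddemandportal-database pg_dump -U <user> <db> > <dump-file>`. `pg_dump` writes to stdout, so redirect it to a file on the host; `-T` disables TTY allocation so the redirected output stays clean.
- Restore by piping the dump file into `psql` inside the running container: `docker compose exec -T merchantsofhope-supplyanddemandportal-database psql -U <user> <db> < <dump-file>`.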
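
For scheduled or scripted backups, the dump and restore commands can be wrapped in a small Node.js helper. The sketch below is illustrative only: `dumpArgs`, `restoreArgs`, and `runDocker` are hypothetical helpers, the file name in the usage comment is a placeholder, and the `-T` flag is included so that stdin and stdout can be piped without a pseudo-TTY.

```javascript
const { spawn } = require('node:child_process');
const fs = require('node:fs');

// Build the `docker` argument list for dumping a database to stdout.
function dumpArgs({ service, user, db }) {
  return ['compose', 'exec', '-T', service, 'pg_dump', '-U', user, db];
}

// Build the `docker` argument list for restoring a dump read from stdin.
function restoreArgs({ service, user, db }) {
  return ['compose', 'exec', '-T', service, 'psql', '-U', user, '-d', db];
}

// Run `docker <args>`, optionally feeding a file to stdin and/or
// capturing stdout to a file. Resolves when docker exits with code 0.
function runDocker(args, { stdinFile, stdoutFile } = {}) {
  return new Promise((resolve, reject) => {
    const child = spawn('docker', args, { stdio: ['pipe', 'pipe', 'inherit'] });
    child.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) resolve();
      else reject(new Error(`docker exited with code ${code}`));
    });
    if (stdinFile) fs.createReadStream(stdinFile).pipe(child.stdin);
    else child.stdin.end();
    if (stdoutFile) child.stdout.pipe(fs.createWriteStream(stdoutFile));
    else child.stdout.pipe(process.stdout);
  });
}

// Usage (placeholder values):
//   const target = { service: 'merchantsofhope-supplyanddemandportal-database', user: '<user>', db: '<db>' };
//   runDocker(dumpArgs(target), { stdoutFile: 'backup.sql' });
//   runDocker(restoreArgs(target), { stdinFile: 'backup.sql' });
```

Passing arguments as an array avoids shell quoting issues with user or database names.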

### Uploads

- Archive the `merchantsofhope-supplyanddemandportal-uploads` volume (Coolify: Settings → Backups → Volume Snapshot).

## 7. Monitoring & Alerting

- Healthcheck endpoints should be wired into external monitoring (e.g., Uptime Kuma, Grafana Cloud).
- Rate limiter defaults protect against bursts; adjust `RATE_LIMIT_MAX` / `RATE_LIMIT_WINDOW_MS` if legitimate traffic patterns trigger 429s.

## 8. Incident Response Checklist

1. **Validate Health** – `curl` the backend health endpoint and inspect the Coolify container logs.
2. **Check Database** – Run `docker compose exec ... pg_isready` and `docker compose exec ... psql -c 'SELECT NOW();'`.
3. **Restart Services** – In Coolify or locally, redeploy the backend/frontend containers (the entrypoint will re-run migrations safely).
4. **Rollback if Needed** – Follow the rollback procedure in section 4.
5. **Postmortem** – Capture the root cause and update this runbook with remediation notes.

## 9. Security Posture

- JWT secrets must be at least 32 bytes and rotated regularly.
- Uploaded files are sanitized and stored on disk; configure antivirus scanning if compliance requires it.
- Rate limiting is enabled globally; consider pairing with IP allowlists at the reverse proxy if stricter controls are needed.
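
The 32-byte minimum for `JWT_SECRET` can be enforced at startup and satisfied with Node's built-in `crypto` module. The following is a minimal sketch, not the backend's actual check; `generateJwtSecret` and `assertJwtSecretStrength` are hypothetical names.

```javascript
const crypto = require('node:crypto');

const MIN_SECRET_BYTES = 32;

// Generate a random secret from `bytes` bytes of entropy, hex-encoded.
function generateJwtSecret(bytes = MIN_SECRET_BYTES) {
  return crypto.randomBytes(bytes).toString('hex');
}

// Throw if the secret is missing or shorter than the required minimum,
// measured in UTF-8 bytes rather than characters.
function assertJwtSecretStrength(secret) {
  const length = Buffer.byteLength(secret ?? '', 'utf8');
  if (length < MIN_SECRET_BYTES) {
    throw new Error(`JWT_SECRET must be at least ${MIN_SECRET_BYTES} bytes, got ${length}`);
  }
  return true;
}
```

Calling `generateJwtSecret()` during rotation and `assertJwtSecretStrength(process.env.JWT_SECRET)` at boot would catch a weak or missing secret before the server starts accepting traffic.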

Keep this runbook updated as infrastructure evolves.