governance: adopt TDD with full unit/integration tests; update system prompt and AGENTS templates

plan: add TDD milestones and test coverage criteria; update proposal accordingly

2025-09-17 10:26:02 -05:00
parent 3859459754
commit de8ce1a4dc
11 changed files with 77 additions and 46 deletions

View File

@@ -2,15 +2,16 @@
- Scope: Phase 1 (Crawl) MVP for `CodexHelper` with subcommands, scaffolding, prompt composition, config precedence, and safety. - Scope: Phase 1 (Crawl) MVP for `CodexHelper` with subcommands, scaffolding, prompt composition, config precedence, and safety.
- Milestones: - Milestones (TDD for each step):
1) CLI skeleton + guardrails 0) Test harness setup (bats-core), tests/ structure, CI script (local)
2) Binary detection + pass-through flags 1) CLI skeleton + guardrails (write failing tests first)
3) new-mode scaffolder (repo-only) 2) Binary detection + pass-through flags (tests first)
4) new-project scaffolder (outside repo) 3) new-mode scaffolder (repo-only) (tests first)
5) run: compose prompts + invoke codex 4) new-project scaffolder (outside repo) (tests first)
6) Config precedence (YAML+yq) 5) run: compose prompts + invoke codex (tests first)
7) Templates + copies (AGENTS.md, prompts/_mode) 6) Config precedence (YAML+yq) (tests first)
8) Docs: README quickstart + wrapper usage 7) Templates + copies (AGENTS.md, prompts/_mode) (tests first)
8) Docs: README quickstart + wrapper usage (ensure examples validated by tests where feasible)
- Key rules honored: - Key rules honored:
- One-way workflow; minimal chat; read `.llm.md`, write both. - One-way workflow; minimal chat; read `.llm.md`, write both.
@@ -23,6 +24,7 @@
- `templates/project/_shared/AGENTS.md` (copy into project root) - `templates/project/_shared/AGENTS.md` (copy into project root)
- `templates/project/<ModeName>/...` (optional mode-specific add-ons; start minimal) - `templates/project/<ModeName>/...` (optional mode-specific add-ons; start minimal)
- `docs/wrapper.md` and README updates - `docs/wrapper.md` and README updates
- Test harness (`bats`), `tests/` with coverage of all CLI paths
- Implementation details: - Implementation details:
- Guard where running: inside repo → only `new-mode` allowed. - Guard where running: inside repo → only `new-mode` allowed.
@@ -38,11 +40,12 @@
- `run` composes prompts and calls detected binary; artifacts under `runs/<ts>/`. - `run` composes prompts and calls detected binary; artifacts under `runs/<ts>/`.
- Precedence: CLI > env > project > mode > global. - Precedence: CLI > env > project > mode > global.
- `prompts/global/` used in composition. - `prompts/global/` used in composition.
- Tests: all features covered by unit/integration tests (bats); TDD observed (tests committed alongside implementation); CI/local test script present.
- Open choices (defaulting now): - Open choices (defaulting now):
- Include empty `prompts/style.md`: Yes. - Include empty `prompts/style.md`: Yes.
- Config format: YAML only; tool: yq. - Config format: YAML only; tool: yq.
- Project `.gitignore`: include `runs/` and any `*.llm.*` if user prefers later (for now, only `runs/`). - Project `.gitignore`: include `runs/` and any `*.llm.*` if user prefers later (for now, only `runs/`).
- Test framework: bats-core for bash; simple runner `scripts/test.sh`.
- Next: Implement per milestones; add concise README quickstart. - Next: Implement per milestones; add concise README quickstart.
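A minimal sketch of what the `scripts/test.sh` runner named above might look like. The function names (`find_bats`, `run_tests`) and the choice to exit successfully with a notice when bats-core is missing are assumptions for illustration, not part of the plan.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of scripts/test.sh: run the bats suite under tests/.

# Print the path of the bats executable, or return non-zero if it is not on PATH.
find_bats() {
  command -v bats 2>/dev/null
}

# Run the bats suite in the given directory (default: tests/).
# Assumption: a missing bats-core is a skip (exit 0 with a notice), not a failure.
run_tests() {
  local tests_dir="${1:-tests}"
  local bats_bin
  if ! bats_bin="$(find_bats)"; then
    echo "SKIP: bats-core not found; install it to run the test suite." >&2
    return 0
  fi
  if [ ! -d "$tests_dir" ]; then
    echo "ERROR: tests directory '$tests_dir' not found." >&2
    return 1
  fi
  # bats accepts a directory and runs the *.bats files it contains.
  "$bats_bin" "$tests_dir"
}
```

A real runner would end with a call such as `run_tests "$@"`; it is left out here so the functions can be sourced and exercised on their own.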

View File

@@ -2,38 +2,34 @@
Purpose: Deliver Phase 1 (Crawl) MVP of CodexHelper: subcommands, scaffolding, prompt composition, config precedence, and safety. Purpose: Deliver Phase 1 (Crawl) MVP of CodexHelper: subcommands, scaffolding, prompt composition, config precedence, and safety.
## Milestones & Tasks ## Milestones & Tasks (TDD)
1) CLI skeleton + guardrails 0) Test harness setup
- Add `CodexHelper` bash script with `new-project`, `run`, `new-mode` subcommands and `--help`. - Add `tests/` directory and `scripts/test.sh` using bats-core (document installation/usage).
- Enforce location rules: inside this repo → only `new-mode` allowed; outside → `new-project`, `run`. - Write initial smoke tests that will fail until implementation exists.
2) Binary detection + pass-through 1) CLI skeleton + guardrails (tests first)
- Implement `detect_codex`: `CODEX_BIN` env > `which codex` > `which codex-cli`; fail with helpful message if none. - Write tests for `--help`, subcommands, and location guardrails.
- Support pass-through flags: `--mode`, `--prompt-file`, `--config`, `--sandbox`, `--full-auto`, plus `--` to forward extras. - Implement `CodexHelper` with `new-project`, `run`, `new-mode` and enforce location rules.
3) new-mode (repo-only) 2) Binary detection + pass-through (tests first)
- Create `modes/<Name>/{mode.md,defaults.yaml,system.md?}` with intake comments. - Write tests for `CODEX_BIN`, `codex`, `codex-cli` resolution and pass-through flags.
- Refuse to overwrite unless `--force`. - Implement `detect_codex` and flag forwarding.
4) new-project (outside repo) 3) new-mode (repo-only) (tests first)
- Create `<path>/<name>` (or use `<path>` if existing and empty/`--force`). - Write tests for scaffolding `modes/<Name>/...` and `--force` behavior.
- Copy templates: - Implement creation with intake comments and overwrite safeguards.
- `templates/project/_shared/AGENTS.md` → `<project>/AGENTS.md`
- Create `prompts/project.md` (narrative template) and empty `prompts/style.md` (optional)
- Create `prompts/_mode/` and copy read-only references of `prompts/global/system.md` and selected mode prompts
- Generate `codex.yaml` with mode + codex settings (placeholders)
- Generate `codex.sh` entrypoint to compose prompts and call codex
- Add `.gitignore` (includes `runs/`)
5) run: compose + invoke 4) new-project (outside repo) (tests first)
- Validate project structure; ensure `prompts/_mode/` exists. - Write tests for project directory creation, copying AGENTS.md, prompts, codex.yaml, codex.sh, and `.gitignore` content.
- Compose prompts in order: Global system → Mode system overlay (optional) → Mode rules → Project narrative. - Implement scaffolding and read-only copies under `prompts/_mode/`.
- Create `runs/<timestamp>/` and save composed prompt and invocation metadata.
- Invoke `$CODEX_BIN` with pass-through flags; handle `--sandbox` and `--full-auto`.
6) Config precedence (YAML + yq) 5) run: compose + invoke (tests first)
- Load `codex.yaml`; merge with mode defaults and global defaults. - Write tests for prompt composition order, `runs/<timestamp>/` outputs, and invocation args.
- Apply ENV overrides; apply CLI overrides last. - Implement composition and execution using `$CODEX_BIN`.
6) Config precedence (YAML + yq) (tests first)
- Write tests covering precedence: global < mode < project < env < CLI.
- Implement merging with `yq` and apply overrides.
7) Docs 7) Docs
- Add `docs/wrapper.md` with usage examples and config reference. - Add `docs/wrapper.md` with usage examples and config reference.
@@ -45,6 +41,7 @@ Purpose: Deliver Phase 1 (Crawl) MVP of CodexHelper: subcommands, scaffolding, p
- Write outputs to `<project>/runs/<timestamp>/...`. - Write outputs to `<project>/runs/<timestamp>/...`.
- Minimal chat; read `.llm.md`, write both `.md` and `.llm.md` for collab artifacts. - Minimal chat; read `.llm.md`, write both `.md` and `.llm.md` for collab artifacts.
- Governance/Propagation: reflect future non-project-specific norms into `prompts/global/` and AGENTS templates; log in DevLog. - Governance/Propagation: reflect future non-project-specific norms into `prompts/global/` and AGENTS templates; log in DevLog.
- TDD default: write failing tests before implementing features; require unit/integration tests for all new functionality in this repo and generated projects.
## Acceptance Criteria ## Acceptance Criteria
- Inside this repo: `CodexHelper new-mode --name Demo` creates `modes/Demo/{mode.md,defaults.yaml}` (and optional `system.md`) and refuses overwrites without `--force`. - Inside this repo: `CodexHelper new-mode --name Demo` creates `modes/Demo/{mode.md,defaults.yaml}` (and optional `system.md`) and refuses overwrites without `--force`.
@@ -53,13 +50,15 @@ Purpose: Deliver Phase 1 (Crawl) MVP of CodexHelper: subcommands, scaffolding, p
- Precedence: CLI > env > project > mode > global. - Precedence: CLI > env > project > mode > global.
- `prompts/global/{system.md,system.llm.md}` are present and included in composition. - `prompts/global/{system.md,system.llm.md}` are present and included in composition.
- Running `CodexHelper run` or `new-project` inside this repo errors with guidance. - Running `CodexHelper run` or `new-project` inside this repo errors with guidance.
- Tests: bats test suite covers all CLI paths and guardrails; tests pass locally; test runner script present.
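The location guardrail in these criteria (inside the repo only `new-mode` is allowed; `run` and `new-project` belong outside) could be sketched as below. How the repo root is discovered is not specified in the plan, so the helper names `is_inside_repo` and `check_location` and the explicit `repo_root` argument are assumptions.

```bash
# Return 0 if directory $2 is the repo root $1 or lies beneath it.
is_inside_repo() {
  local repo_root="$1" dir="$2"
  case "$dir/" in
    "$repo_root"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed only if the subcommand is allowed at the given location.
# Usage: check_location <subcommand> <repo_root> <current_dir>
check_location() {
  local subcommand="$1" repo_root="$2" dir="$3"
  if is_inside_repo "$repo_root" "$dir"; then
    if [ "$subcommand" != "new-mode" ]; then
      echo "error: '$subcommand' cannot run inside the CodexHelper repo; only 'new-mode' is allowed here." >&2
      return 1
    fi
  elif [ "$subcommand" = "new-mode" ]; then
    echo "error: 'new-mode' must be run inside the CodexHelper repo." >&2
    return 1
  fi
  return 0
}
```

Keeping the check a pure function of its arguments makes it easy to cover from a bats test without changing directories.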
## Assumptions/Risks ## Assumptions/Risks
- codex-cli flags may vary; we'll design pass-through and document tested flags. - codex-cli flags may vary; we'll design pass-through and document tested flags.
- `yq` is available; if missing, we provide a helpful error. - `yq` is available; if missing, we provide a helpful error.
- bats-core availability assumed; if not present, document installation and provide graceful skip with clear message.
## Timeline (targeted) ## Timeline (targeted)
- Day 1: Milestones 1–3 - Day 1: Milestones 1–3
- Day 2: Milestones 4–5 - Day 2: Milestones 4–5
- Day 3: Milestone 6 + Docs - Day 3: Milestone 6 + Docs
- Ongoing: Maintain/expand tests with each feature change (TDD).
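The binary resolution order named in the plan (`CODEX_BIN` env, then `codex`, then `codex-cli`) might be sketched as follows. This version uses the shell builtin `command -v` in place of `which`, and it treats an invalid `CODEX_BIN` as an error rather than falling through to the PATH lookup; both choices are assumptions, not something the plan states.

```bash
# Resolve the codex binary: CODEX_BIN env > `codex` on PATH > `codex-cli` on PATH.
# Prints the resolved command on stdout; returns 1 with a message if none is found.
detect_codex() {
  local candidate
  if [ -n "${CODEX_BIN:-}" ]; then
    if command -v "$CODEX_BIN" >/dev/null 2>&1; then
      printf '%s\n' "$CODEX_BIN"
      return 0
    fi
    # Assumption: an explicit but unusable override is an error, not a fallback.
    echo "error: CODEX_BIN='$CODEX_BIN' is not an executable command." >&2
    return 1
  fi
  for candidate in codex codex-cli; do
    if command -v "$candidate" >/dev/null 2>&1; then
      command -v "$candidate"
      return 0
    fi
  done
  echo "error: no codex binary found; set CODEX_BIN or install codex or codex-cli." >&2
  return 1
}
```

Because the function only reads `CODEX_BIN` and `PATH`, a test can point `PATH` at a temporary directory of stub executables to cover each branch.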

View File

@@ -13,7 +13,8 @@
- Outputs: `<project>/runs/<ts>/...`. - Outputs: `<project>/runs/<ts>/...`.
- Layout (repo): `CodexHelper`, `bin/install.sh`, `prompts/global/{system.md,system.llm.md}`, `modes/<Name>/{mode.md,system.md?,defaults.yaml}`, `templates/project/<Name>/...`, `templates/project/_shared/AGENTS.md`, `meta/{AGENTS.seed.md,AGENTS.seed.llm.md}`. - Layout (repo): `CodexHelper`, `bin/install.sh`, `prompts/global/{system.md,system.llm.md}`, `modes/<Name>/{mode.md,system.md?,defaults.yaml}`, `templates/project/<Name>/...`, `templates/project/_shared/AGENTS.md`, `meta/{AGENTS.seed.md,AGENTS.seed.llm.md}`.
- Layout (project): `AGENTS.md`, `prompts/{project.md,style.md?}`, `prompts/_mode/`, `codex.yaml`, `codex.sh`, `runs/`. - Layout (project): `AGENTS.md`, `prompts/{project.md,style.md?}`, `prompts/_mode/`, `codex.yaml`, `codex.sh`, `runs/`.
- Governance/Propagation: non-project-specific workflow changes get recorded in `prompts/global/` and seed AGENTS templates; proposal/plan updated so scaffolding includes them. - Governance/Propagation: non-project-specific workflow changes get recorded in `prompts/global/` and seed AGENTS templates; proposal/plan updated so scaffolding includes them.
- TDD Governance: adopt test-driven development with full unit/integration tests for all features in this repo and generated projects; tests written first and required for acceptance.
- Phase 1 acceptance: - Phase 1 acceptance:
- new-mode creates mode skeleton - new-mode creates mode skeleton
- new-project scaffolds without overwrites - new-project scaffolds without overwrites
@@ -22,6 +23,7 @@
- new-project copies `templates/project/_shared/AGENTS.md` into project root as `AGENTS.md` - new-project copies `templates/project/_shared/AGENTS.md` into project root as `AGENTS.md`
- prompts/global present and used in prompt composition - prompts/global present and used in prompt composition
- governance rule: changes to global norms propagate to prompts/global and AGENTS templates; logged in DevLog - governance rule: changes to global norms propagate to prompts/global and AGENTS templates; logged in DevLog
- tests: unit/integration tests (bats) cover CLI flows and guardrails; TDD observed
## Approval — Tick All That Apply ## Approval — Tick All That Apply
- Subcommands approved: `new-project`, `run`, `new-mode` [ ] - Subcommands approved: `new-project`, `run`, `new-mode` [ ]

View File

@@ -38,7 +38,8 @@ Purpose: Implement a bash wrapper (CodexHelper) around codex-cli with “modes
- `templates/project/<ModeName>/...` (project scaffolding templates copied on `new-project`) - `templates/project/<ModeName>/...` (project scaffolding templates copied on `new-project`)
- `templates/project/_shared/AGENTS.md` (AGENTS template copied into projects) - `templates/project/_shared/AGENTS.md` (AGENTS template copied into projects)
- `meta/{AGENTS.seed.md, AGENTS.seed.llm.md}` (seed AGENTS templates for bootstrap/reference) - `meta/{AGENTS.seed.md, AGENTS.seed.llm.md}` (seed AGENTS templates for bootstrap/reference)
- Governance/Propagation: maintain global norms in `prompts/global/` and seed AGENTS templates; reflect such changes in proposal/plan for scaffolding. - Governance/Propagation: maintain global norms in `prompts/global/` and seed AGENTS templates; reflect such changes in proposal/plan for scaffolding.
- TDD Governance: enforce test-driven development; require unit/integration tests for all features here and in generated projects.
## Project Layout (generated) ## Project Layout (generated)
- `AGENTS.md` (from `templates/project/_shared/AGENTS.md`) - `AGENTS.md` (from `templates/project/_shared/AGENTS.md`)
@@ -62,6 +63,7 @@ Purpose: Implement a bash wrapper (CodexHelper) around codex-cli with “modes
- Implementation: `codex.sh` concatenates/feeds prompts to `codex` in that order (exact mechanism depends on codex-cli interface; we'll use files/flags as supported). - Implementation: `codex.sh` concatenates/feeds prompts to `codex` in that order (exact mechanism depends on codex-cli interface; we'll use files/flags as supported).
- Explicit: `prompts/global/` is present and used as the base of composition. - Explicit: `prompts/global/` is present and used as the base of composition.
- Governance/Propagation: non-project-specific rules are folded back into global/system and templates; changes logged. - Governance/Propagation: non-project-specific rules are folded back into global/system and templates; changes logged.
- TDD: tests are written first and required for acceptance.
## Safety ## Safety
- Guardrails: - Guardrails:
@@ -96,6 +98,7 @@ Purpose: Implement a bash wrapper (CodexHelper) around codex-cli with “modes
- Project scaffold includes `AGENTS.md` copied from `templates/project/_shared/AGENTS.md`. - Project scaffold includes `AGENTS.md` copied from `templates/project/_shared/AGENTS.md`.
- `prompts/global/{system.md, system.llm.md}` exist and are included in composition. - `prompts/global/{system.md, system.llm.md}` exist and are included in composition.
- Governance/Propagation honored: when norms change, update `prompts/global/` and AGENTS templates; log in DevLog. - Governance/Propagation honored: when norms change, update `prompts/global/` and AGENTS templates; log in DevLog.
- TDD honored: a test suite (bats) covers CLI flows and guardrails; tests pass.
## Open Items for Confirmation ## Open Items for Confirmation
- Template coverage: include `prompts/style.md` by default? (we'll include as optional, empty file) - Template coverage: include `prompts/style.md` by default? (we'll include as optional, empty file)
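The composition order described in the proposal (global system, optional mode system overlay, mode rules, project narrative) can be illustrated with a small concatenation helper. The function name `compose_prompts`, its argument order, and the blank line written between parts are assumptions; how the composed prompt is handed to `codex` is deliberately left out, since the proposal says that depends on the codex-cli interface.

```bash
# Concatenate prompt files into one output file, in composition order.
# Usage: compose_prompts <out> <global_system> <mode_system_or_empty> <mode_rules> <project>
compose_prompts() {
  local out="$1" global_system="$2" mode_system="$3" mode_rules="$4" project="$5"
  local f
  # The global system prompt, mode rules, and project narrative are required.
  for f in "$global_system" "$mode_rules" "$project"; do
    if [ ! -f "$f" ]; then
      echo "error: required prompt file missing: $f" >&2
      return 1
    fi
  done
  : > "$out"
  for f in "$global_system" "$mode_system" "$mode_rules" "$project"; do
    # The mode system overlay is optional: skip it when empty or absent.
    if [ -n "$f" ] && [ -f "$f" ]; then
      cat "$f" >> "$out"
      printf '\n' >> "$out"
    fi
  done
}
```

Passing an empty string as the third argument composes a prompt without the overlay.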

View File

@@ -170,3 +170,17 @@ Details:
Next Steps: Next Steps:
- Await plan review/approval; then start implementation. - Await plan review/approval; then start implementation.
---
Date: 2025-09-17 16:12 (UTC)
Summary:
- Incorporated TDD as a governance rule with full unit/integration tests for all features in this repo and generated projects.
Details:
- Updated `prompts/global/system.{md,llm.md}`, `meta/AGENTS.seed.*`, and `templates/project/_shared/AGENTS.md` to mandate TDD.
- Amended the proposal and plan to include TDD milestones and test coverage acceptance criteria.
Next Steps:
- Execute the plan using TDD, starting with test harness setup and failing tests.

View File

@@ -131,3 +131,11 @@ This log is concise and structured for quick machine parsing and summarization.
- Added `collab/plan/01-codexhelper.md` and `.llm.md` with milestones, deliverables, and acceptance criteria - Added `collab/plan/01-codexhelper.md` and `.llm.md` with milestones, deliverables, and acceptance criteria
- next: - next:
- Await plan review/approval; then implement per milestones - Await plan review/approval; then implement per milestones
## 2025-09-17T16:12Z
- context: Governance update — adopt TDD and full unit tests across this repo and generated projects
- actions:
- Updated global system prompt and seed/project AGENTS templates to encode TDD requirement
- Amended proposal and plan to include TDD milestones and acceptance criteria
- next:
- Proceed with TDD in implementation; scaffold tests first

View File

@@ -3,7 +3,7 @@
- One-way workflow: questions → proposal → plan → implement; no backsteps after approval. - One-way workflow: questions → proposal → plan → implement; no backsteps after approval.
- Read `.llm.md` only; write both `.md` and `.llm.md` siblings for collab artifacts. - Read `.llm.md` only; write both `.md` and `.llm.md` siblings for collab artifacts.
- Chat ≤5 lines; default “Updated <filepath>…”; no diffs; announce only collab file changes; log details in `docs/devlog/`. - Chat ≤5 lines; default “Updated <filepath>…”; no diffs; announce only collab file changes; log details in `docs/devlog/`.
- Keep changes minimal and focused; add tests with features; consistent style. - Keep changes minimal and focused; adopt TDD (tests first); require unit/integration tests for all features; consistent style.
- Git: Conventional Commits; branch `main`; optional tags `YYYY-MM-DD-HHMM`. - Git: Conventional Commits; branch `main`; optional tags `YYYY-MM-DD-HHMM`.
- Tools: file-first; use `rg`; read ≤250 lines; respect sandbox/approvals; preface grouped commands. - Tools: file-first; use `rg`; read ≤250 lines; respect sandbox/approvals; preface grouped commands.
- Prompts/config (if applicable): YAML+yq; precedence CLI>ENV>project>mode>global; prompts order global→mode-system?→mode→project; outputs to `runs/<ts>/`; `--force` to overwrite; never `git push`. - Prompts/config (if applicable): YAML+yq; precedence CLI>ENV>project>mode>global; prompts order global→mode-system?→mode→project; outputs to `runs/<ts>/`; `--force` to overwrite; never `git push`.
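The precedence rule CLI>ENV>project>mode>global can be read as "the first layer that provides a value wins". A minimal sketch under that reading follows; `first_set` is a hypothetical helper, and in a real implementation the project, mode, and global values would presumably be read from the YAML files with `yq` before being passed in.

```bash
# Print the first non-empty argument; return 1 if every argument is empty.
# Callers pass candidate values from highest to lowest precedence:
#   first_set "$cli_value" "$env_value" "$project_value" "$mode_value" "$global_value"
first_set() {
  local v
  for v in "$@"; do
    if [ -n "$v" ]; then
      printf '%s\n' "$v"
      return 0
    fi
  done
  return 1
}
```

One limitation of this sketch is that an empty string cannot act as an explicit override, because empty and unset are treated the same way.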

View File

@@ -24,7 +24,8 @@ Note: This is a template copied into generated projects. Customize as needed for
## Code and Tests ## Code and Tests
- Keep changes minimal and focused; avoid unrelated refactors. - Keep changes minimal and focused; avoid unrelated refactors.
- Add unit tests in `tests/` alongside features. - Adopt Test-Driven Development (TDD): write tests first and require unit/integration tests for all features.
- Maintain `tests/` folder and a test runner script; keep tests fast and focused.
- Maintain consistent style with the existing codebase. - Maintain consistent style with the existing codebase.
## Git Workflow ## Git Workflow

View File

@@ -5,7 +5,7 @@
- Linear workflow: questions → proposal → plan → implement; no backsteps after approval; edits stay in current steps file. - Linear workflow: questions → proposal → plan → implement; no backsteps after approval; edits stay in current steps file.
- Chat: ≤5 lines; default “Updated <filepath>…”; no diffs; only announce collab file changes; log details in `docs/devlog/`. - Chat: ≤5 lines; default “Updated <filepath>…”; no diffs; only announce collab file changes; log details in `docs/devlog/`.
- Dev logs: update `docs/devlog/DEVLOG_{LLM,HUMAN}.md` each meaningful change. - Dev logs: update `docs/devlog/DEVLOG_{LLM,HUMAN}.md` each meaningful change.
- Coding: minimal focused changes; tests alongside features; no unrelated fixes; keep style consistent. - Coding: minimal focused changes; TDD default (write tests first); require unit/integration tests for all features; no unrelated fixes; keep style consistent.
- Git: work on `main`; Conventional Commits; tags `YYYY-MM-DD-HHMM` when needed. - Git: work on `main`; Conventional Commits; tags `YYYY-MM-DD-HHMM` when needed.
- Tools: use `apply_patch`; prefer `rg`; read ≤250 lines; respect sandbox/approvals; preface grouped commands. - Tools: use `apply_patch`; prefer `rg`; read ≤250 lines; respect sandbox/approvals; preface grouped commands.
- Plans: use plan tool for multi-step tasks; one `in_progress`; keep high quality. - Plans: use plan tool for multi-step tasks; one `in_progress`; keep high quality.

View File

@@ -33,7 +33,8 @@ You are a coding agent running in the Codex CLI (terminal-based). Be precise, sa
## Coding and Tests ## Coding and Tests
- Fix root causes; avoid unrelated refactors. - Fix root causes; avoid unrelated refactors.
- Keep changes minimal and consistent with existing style. - Keep changes minimal and consistent with existing style.
- Add unit tests under `tests/` alongside new features when appropriate. - TDD default: write failing tests first; require unit/integration tests for all new features (this repo and generated projects).
- Maintain `tests/` and a local test runner; keep tests fast and focused.
- Do not add licenses/headers unless requested. - Do not add licenses/headers unless requested.
## Git Workflow ## Git Workflow

View File

@@ -21,7 +21,7 @@ This file is copied by scaffolding into new projects. Edit to suit the project w
- Keep chat ≤5 lines; no diffs; announce only collab file changes; log details in DevLog. - Keep chat ≤5 lines; no diffs; announce only collab file changes; log details in DevLog.
## Coding, Tests, and Git ## Coding, Tests, and Git
- Minimal, focused changes; add tests with features; consistent style. - Minimal, focused changes; adopt TDD (write tests first) and require unit/integration tests for all features; consistent style.
- Conventional Commits; branch `main`; tags `YYYY-MM-DD-HHMM` when warranted. - Conventional Commits; branch `main`; tags `YYYY-MM-DD-HHMM` when warranted.
## Tooling and Safety ## Tooling and Safety