# KNEL-Football Development Journal

> **IMPORTANT**: This file is APPEND-ONLY. Never delete or modify existing entries.
> Add new entries at the TOP (after this header) with date and context.
> This serves as long-term memory for AI agents and human developers.

---

## Entry 2026-02-19 (Session 5): Critical Bug Fixes

### Context
Resumed session after context overflow. Deep orientation revealed critical bugs in
security-hardening.sh hook that were blocking FIM and SSH client configuration.

### Changes Implemented

1. **Bug Fix: Function Name Mismatch**
   - `config/hooks/live/security-hardening.sh:19` called `configure_ssh`
   - But `src/security-hardening.sh` defines `configure_ssh_client`
   - Fixed: Changed hook to call `configure_ssh_client`

2. **Bug Fix: Missing FIM Call**
   - `configure_fim` function existed in src/security-hardening.sh
   - But hook was never calling it
   - Fixed: Added `configure_fim` call to hook

### Root Cause Analysis

Commit 0807611 "feat: add FIM, comprehensive audit logging, SSH client-only" added
functions to src/security-hardening.sh, but the corresponding hook was left out of sync:
- It was not updated to call the new function (`configure_fim`)
- It called the wrong function name (`configure_ssh` instead of `configure_ssh_client`)

This is a common pattern in codebase consolidation: when adding features to source
files, remember to update ALL callers (hooks, scripts, tests).

### Lessons Learned

1. **Cross-Reference Source and Callers**
   - When adding functions, search for ALL callers
   - `grep -r function_name config/` to find hooks
   - Test execution paths, not just function existence

2. **Documentation vs Reality Gap**
   - JOURNAL.md said "FIM ADDED" but hook never called it
   - STATUS.md said "SSH client-only CONFIGURED" but wrong function name
   - Lesson: Verify code execution, not just code presence
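
The "code presence vs code execution" gap above can be checked mechanically. The following is a minimal sketch, not the project's tooling: `list_functions` and `find_uncalled_functions` are hypothetical helper names, and the check only recognizes definitions written as `name() {` (optionally prefixed with `function`). It matches whole words, so a mention in a comment also counts as a reference.

```bash
#!/bin/bash
set -euo pipefail

# List function names defined in a shell source file.
# Only definitions of the form "name() {" or "function name() {" are recognized.
list_functions() {
    local src_file="$1"
    sed -nE 's/^[[:space:]]*(function[[:space:]]+)?([A-Za-z_][A-Za-z0-9_]*)[[:space:]]*\(\).*$/\2/p' "$src_file"
}

# Print every function defined in src_file that is never mentioned under hooks_dir.
# grep -w matches whole words, so configure_ssh does not match configure_ssh_client.
find_uncalled_functions() {
    local src_file="$1" hooks_dir="$2" fn
    while IFS= read -r fn; do
        if ! grep -rqw -- "$fn" "$hooks_dir"; then
            printf '%s\n' "$fn"
        fi
    done < <(list_functions "$src_file")
}
```

Run as `find_uncalled_functions src/security-hardening.sh config/hooks/`; any name printed is defined in the source file but not referenced by a hook.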

### Verification

```bash
./run.sh lint    # ✅ Zero warnings
./run.sh test    # ✅ 92 pass, 19 skip (VM tests)
```

### Action Items

1. Rebuild ISO with bug fixes (in progress)
2. Update STATUS.md with accurate state
3. Consider adding hook validation tests

### ⚠️ PERMANENT LESSONS FOR FUTURE SESSIONS

**These mistakes have happened multiple times. DO NOT repeat them.**

1. **When Adding/Modifying Functions: ALWAYS Update All Callers**
   - Pattern: Function added to `src/*.sh` but hook in `config/hooks/` not updated
   - Prevention: After editing `src/security-hardening.sh`, immediately run:
     ```bash
     grep -rE "configure_ssh|configure_fim|configure_audit" config/hooks/
     ```
   - Test: Run `./run.sh test` before committing - don't just assume it works

2. **Documentation Claims Must Match Code Reality**
   - Pattern: JOURNAL says "ADDED" but hook never calls the function
   - Prevention: After implementing a feature, verify execution path:
     ```bash
     # For each new function in src/:
     # 1. Find where it should be called
     # 2. Add the call
     # 3. Test that it runs
     ```
   - Never trust docs without code verification

3. **Cross-Reference Before Committing**
   - This project has: `src/*.sh` → `config/hooks/**/*.sh` → executed during build
   - Any change to source files requires checking ALL downstream callers
   - Use `grep -r "function_name" .` liberally
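
One way to automate the downstream-caller check is to look for names that hooks use but the source file never defines, which is the `configure_ssh` vs `configure_ssh_client` failure mode. This is a hedged sketch: `find_undefined_calls` is a hypothetical helper, and it assumes all shared functions use the `configure_` prefix and are defined as `name() {`.

```bash
#!/bin/bash
set -euo pipefail

# Print each configure_* name used under hooks_dir that src_file does not define.
find_undefined_calls() {
    local src_file="$1" hooks_dir="$2" name
    while IFS= read -r name; do
        # A definition looks like "name() {", optionally prefixed with "function".
        if ! grep -Eq "^[[:space:]]*(function[[:space:]]+)?${name}[[:space:]]*\(\)" "$src_file"; then
            printf '%s\n' "$name"
        fi
    done < <(grep -rhowE 'configure_[A-Za-z0-9_]+' "$hooks_dir" | sort -u)
}
```

`find_undefined_calls src/security-hardening.sh config/hooks/` printing nothing means every `configure_*` name used in a hook has a matching definition.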

---

## Entry 2026-02-17 (Session 4): Script Consolidation

### Context
Continued session focused on consolidating all top-level scripts into run.sh as the single
entry point. Merged test-iso.sh (344 lines) and monitor-build.sh (43 lines) into run.sh.

### Changes Implemented

1. **Script Consolidation**
   - Merged test-iso.sh VM testing framework into run.sh
   - Merged monitor-build.sh build monitoring into run.sh
   - Deleted test-iso.sh and monitor-build.sh
   - run.sh is now 500+ lines and the single entry point for all operations

2. **New run.sh Commands**
   ```bash
   ./run.sh monitor [secs]          # Monitor build progress
   ./run.sh test:iso check          # Check VM testing prerequisites
   ./run.sh test:iso create         # Create and start test VM
   ./run.sh test:iso console        # Connect to VM console
   ./run.sh test:iso status         # Show VM status
   ./run.sh test:iso destroy        # Destroy VM and cleanup
   ./run.sh test:iso boot-test      # Run automated boot test
   ./run.sh test:iso secure-boot    # Test Secure Boot
   ./run.sh test:iso fde-test       # Test FDE passphrase prompt
   ```

3. **Test Updates**
   - Updated tests/system/boot_test.bats to test run.sh instead of test-iso.sh
   - Updated skip messages in fde_test.bats and secureboot_test.bats

4. **ISO Rebuild**
   - Built successfully at 15:19 CST (449 MB)
   - Checksums verified (SHA256, MD5)

### Architectural Decision Records

#### ADR-009: Single Entry Point (run.sh)
**Date**: 2026-02-17
**Status**: Accepted

**Context**: Multiple top-level scripts (run.sh, test-iso.sh, monitor-build.sh) caused
fragmentation and made the project harder to navigate.

**Decision**: Consolidate all scripts into run.sh as the single entry point.

**Rationale**:
- Simpler user experience - one command to remember
- Consistent interface for all operations
- Easier to maintain and test
- Follows Unix philosophy of doing one thing well

**Consequences**:
- run.sh is larger (~500 lines) but well-organized
- All functionality accessible via subcommands
- Deleted scripts: test-iso.sh, monitor-build.sh
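
For illustration, a single-entry-point script of this kind usually routes subcommands through a `case` statement. The sketch below is not the actual run.sh: the `cmd_*` handlers are hypothetical stand-ins that only echo, and the 60-second default for `monitor` is an assumption.

```bash
#!/bin/bash
set -euo pipefail

# Hypothetical stand-ins for the real command implementations.
cmd_lint()    { echo "lint: running shellcheck"; }
cmd_test()    { echo "test: running bats"; }
cmd_monitor() { echo "monitor: polling every ${1:-60}s"; }

# Second-level dispatch for the test:iso family of subcommands.
cmd_test_iso() {
    case "${1:-}" in
        check|create|console|status|destroy|boot-test|secure-boot|fde-test)
            echo "test:iso $1" ;;
        *)
            echo "usage: run.sh test:iso <check|create|...>" >&2
            return 2 ;;
    esac
}

# Route the first argument to its handler; remaining arguments pass through.
dispatch() {
    local cmd="${1:-help}"
    if [ "$#" -gt 0 ]; then shift; fi
    case "$cmd" in
        lint)     cmd_lint "$@" ;;
        test)     cmd_test "$@" ;;
        monitor)  cmd_monitor "$@" ;;
        test:iso) cmd_test_iso "$@" ;;
        *)
            echo "usage: run.sh <lint|test|monitor|test:iso> [args]" >&2
            return 2 ;;
    esac
}
```

A real script would end with `dispatch "$@"`. Keeping each subcommand in its own function is what lets a large file stay navigable.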

### Lessons Learned

1. **VM Testing Requires libvirt Group**
   - virt-install fails if user not in libvirt group
   - QEMU fallback works but virt-install preferred for libvirt integration
   - Fix: `sudo usermod -aG libvirt "$USER"`, then log out and back in

2. **Test Updates Required After Script Moves**
   - When moving/deleting scripts, grep for all references
   - Tests in tests/system/ referenced test-iso.sh directly
   - Updated to use run.sh test:iso commands
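
The libvirt prerequisite can be checked before attempting `virt-install`. A minimal sketch, with `session_in_group` as a hypothetical helper: `id -nG` with no user argument reports the groups of the current process, which is why a fresh `usermod -aG` does not show up until the next login.

```bash
#!/bin/bash
set -euo pipefail

# Return 0 if the current login session already carries the named group.
session_in_group() {
    local group="$1" g
    for g in $(id -nG); do
        if [ "$g" = "$group" ]; then
            return 0
        fi
    done
    return 1
}
```

Usage: `session_in_group libvirt || echo "not in libvirt group yet; log out and back in" >&2`.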

### Files Changed

| File | Action |
|------|--------|
| run.sh | Merged test-iso.sh and monitor-build.sh |
| test-iso.sh | DELETED |
| monitor-build.sh | DELETED |
| tests/system/boot_test.bats | Updated to test run.sh |
| tests/system/fde_test.bats | Updated skip message |
| tests/system/secureboot_test.bats | Updated skip message |
| STATUS.md | Updated status to COMPLETE |
| JOURNAL.md | This entry |

### Commit

```
d9f2f02 refactor: consolidate test-iso.sh and monitor-build.sh into run.sh
```

---

## Entry 2026-02-17 (Session 3): Project Re-Orientation

### Context
New session start. User requested deep project review and orientation. Reviewed git logs,
STATUS.md, JOURNAL.md, and current system state.

### Current State Assessment

1. **ISO Status**: STALE
   - Built: 2026-02-17 10:50
   - 6 commits since build (FIM, audit, SSH client-only, shellcheck fixes)
   - Missing features: AIDE FIM, comprehensive auditd, SSH client-only
   - Rebuild required to include recent security features

2. **Test Suite**: HEALTHY
   - 111 tests total, 92 pass, 19 skip (VM-required)
   - Skip reasons: VM not running, requires manual verification
   - Categories: unit (12), integration (6), security (44), system (47)
   - Zero failures, zero shellcheck warnings

3. **Compliance**: IN PROGRESS
   - CIS 1.4 (FIM): Code ready, not in ISO
   - CIS 5.2 (SSH): Code ready, not in ISO
   - CIS 6.2 (Audit): Code ready, not in ISO
   - NIST/FedRAMP/CMMC: Same status - config ready, needs rebuild

4. **Blockers**:
   - User NOT in libvirt group (blocks VM testing)
   - ISO outdated (blocks runtime verification)
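
A rough staleness check can compare file modification times against the ISO. Note that this swaps in an mtime comparison for the commit count used above, so a fresh checkout that resets mtimes can produce false positives. `files_newer_than_iso` is a hypothetical helper and the ISO path in the usage line is an assumption.

```bash
#!/bin/bash
set -euo pipefail

# Print source files modified more recently than the ISO.
# Empty output means the ISO is at least as new as every file checked.
files_newer_than_iso() {
    local iso="$1"
    shift
    find "$@" -type f -newer "$iso" -print
}
```

Usage: `files_newer_than_iso output/example.iso src config`, where every path printed is a change the ISO does not contain.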

### Architecture Review

```
KNEL-Football OS (this project)
    │ WireGuard (outbound only)
    ▼
Privileged Access Workstation
    │ Direct access
    ▼
Tier0 Infrastructure
```

Key design principle: **No inbound services**. SSH client, RDP client, WireGuard client only.

### Security Features Implemented (Code)

| Feature | File | Status |
|---------|------|--------|
| Full Disk Encryption | config/hooks/installed/encryption-*.sh | ✅ Code ready |
| Password Policy | src/security-hardening.sh | ✅ Code ready |
| Firewall (nftables) | config/hooks/live/firewall-setup.sh | ✅ Code ready |
| FIM (AIDE) | config/hooks/live/aide-setup.sh | ✅ Code ready |
| Audit Logging | config/hooks/live/audit-logging.sh | ✅ Code ready |
| SSH Client-Only | config/hooks/live/ssh-client-only.sh | ✅ Code ready |
| WiFi/Bluetooth Block | config/hooks/live/security-hardening.sh | ✅ Code ready |

### Key Files to Understand

- `run.sh` - Main entry point for all operations
- `AGENTS.md` - Agent behavior guidelines (READ FIRST)
- `STATUS.md` - Manager status report
- `JOURNAL.md` - This file - AI memory
- `PRD.md` - Product requirements
- `config/preseed.cfg` - Debian installer configuration
- `config/hooks/live/` - Runtime configuration hooks
- `tests/` - BATS test suite

### Open Action Items (from STATUS.md)

1. Rebuild ISO with new security features
2. Logout/login for libvirt access (user action)
3. Run VM boot tests after ISO rebuild
4. Remove hardcoded passwords from preseed.cfg
5. Consider Secure Boot implementation

### Session Decision

**Next step**: Rebuild ISO to include FIM, audit logging, SSH client-only changes.
This is a 60-90 minute build. User should decide if they want to start it now.

### ADR-008: ISO Rebuild Priority
**Date**: 2026-02-17
**Status**: Proposed

**Context**: 6 commits with security features made since last ISO build. Need to decide
whether to rebuild now or continue development.

**Options**:
1. Rebuild now - validates features, enables runtime testing
2. Continue development - batch more changes, rebuild later

**Recommendation**: Rebuild now. Features are ready, compliance requires verification.

---

## Entry 2026-02-17 (Session 2): FIM, Audit, SSH Security Enhancements

### Context
Continued session focused on closing compliance gaps for CIS, FedRAMP, and CMMC.
Added File Integrity Monitoring (FIM), comprehensive audit logging, and SSH client-only
configuration. Resolved all shellcheck warnings and added git safety documentation.

### Changes Implemented

1. **File Integrity Monitoring (AIDE)**
   - Added `config/hooks/live/aide-setup.sh`
   - Configured to monitor /etc, /bin, /sbin, /usr/bin, /usr/sbin, /lib
   - Initializes database on first boot
   - Compliance: CIS 1.4, FedRAMP AU-7, CMMC AU.3.059

2. **Comprehensive Audit Logging**
   - Added `config/hooks/live/audit-logging.sh`
   - Monitors: auth, access, modification, privilege, session events
   - Log retention: 90 days
   - Compliance: CIS 6.2, FedRAMP AU-2, CMMC AU.2.042

3. **SSH Client-Only Configuration**
   - Modified `config/hooks/live/ssh-client-only.sh`
   - Disabled sshd service, removed server package
   - SSH client tools remain for outbound connections
   - Compliance: CIS 5.2, NIST 800-53 IA-5, CMMC IA.2.078

4. **Shellcheck Fixes**
   - Resolved all warnings in shell scripts
   - SC2120/SC2119: Functions called without arguments (correct behavior)
   - SC1091: Source files exist at runtime
   - SC2034: Variables used in templates
   - Result: ZERO shellcheck warnings

5. **Git Safety Rules**
   - Added to AGENTS.md:
     - Quote all path arguments (handles spaces)
     - Prefer non-interactive rebase (`git rebase` has no `--no-interactive` flag; plain `git rebase` is already non-interactive, so use `-i` only with care)
     - Destructive operations require user confirmation

### Test Coverage Update

```
Before Session: 31 tests
After Session:  111 tests (+80)

Unit Tests:        12 → 12 (unchanged)
Integration Tests:  6 →  6 (unchanged)
Security Tests:    13 → 44 (+31)
System Tests:       0 → 47 (+47, new category)
```

### Architectural Decision Records

#### ADR-005: File Integrity Monitoring via AIDE
**Date**: 2026-02-17
**Status**: Accepted

**Context**: Need file integrity monitoring for compliance (CIS 1.4, FedRAMP AU-7).

**Decision**: Use AIDE (Advanced Intrusion Detection Environment) with focused monitoring
of critical system directories.

**Rationale**:
- AIDE is mature, well-supported on Debian
- Lightweight compared to commercial alternatives
- Meets multiple compliance requirements
- Database can be rebuilt if needed

**Consequences**:
- Initial database creation on first boot (minor delay)
- Regular checks recommended via cron
- False positives if system packages updated legitimately
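
To make "focused monitoring" concrete, here is a sketch of a function that generates AIDE selection rules for a list of directories. It is an illustration only, not the contents of `aide-setup.sh`: `generate_aide_rules`, the `KnelFim` group name, and the attribute set are all assumptions.

```bash
#!/bin/bash
set -euo pipefail

# Write AIDE selection rules for the given directories to stdout.
# KnelFim is a hypothetical group; the attributes cover permissions, inode,
# link count, owner, group, size, mtime, ctime and a SHA-256 checksum.
generate_aide_rules() {
    local dir
    echo 'KnelFim = p+i+n+u+g+s+m+c+sha256'
    for dir in "$@"; do
        printf '%s KnelFim\n' "$dir"
    done
}
```

Usage: `generate_aide_rules /etc /bin /sbin /usr/bin /usr/sbin /lib` emits one group definition followed by one rule per monitored directory.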

#### ADR-006: Comprehensive Audit via auditd
**Date**: 2026-02-17
**Status**: Accepted

**Context**: Need comprehensive audit logging for CIS 6.2, FedRAMP AU-2.

**Decision**: Use auditd with rules for all major event categories.

**Rationale**:
- auditd is the Linux standard for audit logging
- Kernel-level monitoring (cannot be bypassed by userspace)
- Structured logs for analysis
- Meets multiple compliance requirements

**Consequences**:
- Increased log volume (manageable with rotation)
- Performance impact minimal on workstation workloads
- Log retention policy required (90 days set)

#### ADR-007: SSH Client-Only Mode
**Date**: 2026-02-17
**Status**: Accepted

**Context**: KNEL-Football should have no inbound services.

**Decision**: Remove SSH server, keep only client tools.

**Rationale**:
- Reduces attack surface significantly
- Aligns with "outbound only" security model
- User can SSH out to other systems as needed
- No management via SSH (physical console only)

**Consequences**:
- No remote administration via SSH
- Must use physical console for management
- WireGuard outbound only, no inbound connections
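
The "no inbound services" rule lends itself to a runtime check on the booted system. The sketch below is an assumption about how such a check could look, not an existing test: `non_loopback_listeners` is a hypothetical helper that reads `ss -Htln` style output on stdin, and it covers TCP only.

```bash
#!/bin/bash
set -euo pipefail

# Read `ss -Htln` style output on stdin and print every listening address
# that is not bound to loopback. Empty output means nothing is exposed.
# Column 1 is the socket state, column 4 the local address:port.
non_loopback_listeners() {
    awk '$1 == "LISTEN" && $4 !~ /^(127\.|\[::1\])/ { print $4 }'
}
```

Usage: `ss -Htln | non_loopback_listeners`. Reading from stdin keeps the parsing testable without a live system.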

### Lessons Learned

1. **Shellcheck Warnings Can Be Misleading**
   - SC2120/SC2119 warnings were false positives
   - Functions intentionally don't use arguments (generate static config)
   - Used `# shellcheck disable` sparingly, documented why

2. **Compliance Requirements Overlap**
   - CIS 1.4 (FIM) → FedRAMP AU-7 → CMMC AU.3.059
   - Single AIDE implementation satisfies all three
   - Document compliance mappings clearly

3. **Test Framework Scales Well**
   - Adding 80 new tests was straightforward
   - BATS + custom helpers pattern works
   - System tests for VM boot require special handling (libvirt)

### Action Items for Future Sessions

1. Rebuild ISO with new security features
2. Run VM boot tests after user logout/login for libvirt
3. Verify FDE runtime behavior in VM
4. Consider Secure Boot implementation
5. Update preseed.cfg to remove hardcoded passwords

---

## Entry 2026-02-17 (Session 1): Project Assessment and Test Coverage Analysis

### Context
Comprehensive project review after session handoff. User requested full orientation
and 100% test coverage including VM boot tests, Secure Boot, and FDE runtime tests.

### Insights

1. **Test Infrastructure Pattern**
   - BATS tests work well for static analysis but lack runtime verification
   - Current tests validate file existence and content, not actual behavior
   - Missing entire category: system/integration tests that boot the ISO

2. **Docker-Only Workflow is Correct**
   - All build/test commands run inside Docker containers
   - Prevents host system pollution
   - Makes builds reproducible across environments
   - Volumes: `/workspace` (read-only), `/build` (temp), `/output` (artifacts)

3. **Shellcheck Warnings Are Non-Critical**
   - SC2120/SC2119: Functions don't use arguments and are called without `"$@"`
   - SC1091: Source files not available during shellcheck (exist at runtime)
   - Pattern: Functions generate config, don't need arguments

### Architectural Decision Records (ADRs)

#### ADR-001: Two-Tier Security Model
**Date**: 2026-01-28 (documented 2026-02-17)
**Status**: Accepted

**Context**: How should KNEL-Football OS access tier0 infrastructure?

**Decision**: KNEL-Football OS is a secure remote terminal, NOT direct tier0 access.
Flow: KNEL-Football OS → WireGuard VPN → Privileged Access Workstation → Tier0

**Rationale**:
- Defense in depth - multiple hops before tier0
- Compromise of laptop doesn't directly expose tier0
- WireGuard provides encrypted tunnel
- Physical workstation adds another security layer

**Consequences**:
- Network configuration focuses on WireGuard only
- WiFi/Bluetooth permanently disabled
- SSH configured for key-based auth only

#### ADR-002: Docker-Only Build Environment
**Date**: 2026-01-28 (documented 2026-02-17)
**Status**: Accepted

**Context**: How should ISO builds be executed?

**Decision**: ALL build operations run inside Docker containers. No host modifications.

**Rationale**:
- Reproducible builds across different host systems
- No pollution of host environment
- Easy cleanup (just remove containers/images)
- CI/CD friendly

**Consequences**:
- `run.sh` wraps all commands with `docker run`
- ISO build requires `--privileged` for loop devices
- Output artifacts copied via volume mounts
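
As a rough picture of what wrapping a command with `docker run` involves, the sketch below assembles the command line without executing it. The image name, the `PRIVILEGED` switch, and the `docker_cmd` helper are hypothetical; only the three mount points and the `--privileged` requirement come from this journal.

```bash
#!/bin/bash
set -euo pipefail

# Hypothetical image name, used for illustration only.
IMAGE="knel-football-builder:latest"

# Print the docker command for one containerized step, one argument per line.
docker_cmd() {
    local workspace="$1" build="$2" output="$3"
    shift 3
    local -a cmd=(
        docker run --rm
        -v "${workspace}:/workspace:ro"   # source tree, read-only
        -v "${build}:/build"              # intermediate files
        -v "${output}:/output"            # final artifacts
    )
    # Hypothetical opt-in switch: ISO builds need loop devices, hence --privileged.
    if [ "${PRIVILEGED:-0}" = "1" ]; then
        cmd+=(--privileged)
    fi
    cmd+=("$IMAGE" "$@")
    printf '%s\n' "${cmd[@]}"
}
```

Usage: `PRIVILEGED=1 docker_cmd "$PWD" /tmp/build "$PWD/output" lb build` prints the arguments; building the command as an array keeps paths with spaces intact when it is eventually executed.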

#### ADR-003: LUKS2 Over LUKS1
**Date**: 2026-01-28 (documented 2026-02-17)
**Status**: Accepted

**Context**: Which disk encryption format to use?

**Decision**: Use LUKS2 with Argon2id KDF, AES-256-XTS cipher, 512-bit key.

**Rationale**:
- LUKS2 is newer, more secure format
- Argon2id resists GPU/ASIC attacks better than PBKDF2
- AES-XTS is NIST-approved for disk encryption
- 512-bit key provides security margin

**Consequences**:
- Modern systems only (older GRUB versions may not support LUKS2)
- Boot requires passphrase entry
- No recovery without passphrase
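
For reference, the parameters in this decision map onto `cryptsetup luksFormat` options as sketched below. This only prints the command and is not how the installer applies encryption in this project; `luks_format_cmd` is a hypothetical helper and the device path is a placeholder.

```bash
#!/bin/bash
set -euo pipefail

# Print (not run) a cryptsetup invocation matching ADR-003's parameters.
# With XTS, a 512-bit key is split into two 256-bit halves, giving AES-256.
luks_format_cmd() {
    local device="$1"
    local -a cmd=(
        cryptsetup luksFormat
        --type luks2
        --pbkdf argon2id
        --cipher aes-xts-plain64
        --key-size 512
        "$device"
    )
    echo "${cmd[*]}"
}
```

On an existing volume, `cryptsetup luksDump <device>` shows the version, cipher, key size, and PBKDF actually in use.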

#### ADR-004: BATS Without External Libraries
**Date**: 2026-01-28 (documented 2026-02-17)
**Status**: Accepted

**Context**: BATS test framework libraries were failing to load.

**Decision**: Remove bats-support, bats-assert, bats-file dependencies.
Use custom assertion functions in `tests/test_helper/common.bash`.

**Rationale**:
- External library loading was unreliable
- Custom functions provide same functionality
- Fewer dependencies = fewer failure points
- Easier to debug when tests fail

**Consequences**:
- Custom assertions must be maintained
- Tests don't benefit from upstream library fixes
- But: simpler, more predictable behavior
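
Custom assertions of this kind can be very small. The following is a sketch of what such helpers might look like; the function names are assumptions and are not taken from `tests/test_helper/common.bash`.

```bash
#!/bin/bash

# Fail with a diagnostic unless the two values are identical.
assert_equal() {
    local expected="$1" actual="$2"
    if [ "$expected" != "$actual" ]; then
        printf 'expected: %s\nactual:   %s\n' "$expected" "$actual" >&2
        return 1
    fi
}

# Fail unless the path exists and is a regular file.
assert_file_exists() {
    local path="$1"
    if [ ! -f "$path" ]; then
        printf 'file not found: %s\n' "$path" >&2
        return 1
    fi
}

# Fail unless the file contains a line matching the extended regex.
assert_file_contains() {
    local path="$1" pattern="$2"
    if ! grep -qE -- "$pattern" "$path"; then
        printf 'pattern %s not found in %s\n' "$pattern" "$path" >&2
        return 1
    fi
}
```

Because each helper returns non-zero on failure and writes its diagnostic to stderr, a BATS `@test` body that calls one fails at that line with the message in the test output.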

### Patterns Observed

1. **Hook Organization**
   - `config/hooks/live/` - Runs during live session (before install)
   - `config/hooks/installed/` - Runs after installation
   - Pattern: Source shared functions, call main function

2. **Script Structure**
   ```bash
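   #!/bin/bash
   set -euo pipefail
   # Functions that generate config
   main() {
       :  # placeholder: call the config-generating functions here
   }
   # Call main only if the script is executed directly (not sourced)
   if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
       main "$@"
   fi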

3. **Test Structure**
   ```bash
   #!/usr/bin/env bats
   @test "description" {
       # Setup
       # Exercise
       # Verify
   }
   ```

### Lessons Learned

1. **test:iso Command Was Broken**
   - `run.sh:172` references deleted `test-iso.sh`
   - Commit c1505a9 removed obsolete scripts including test-iso.sh
   - But run.sh was not updated to remove the command
   - Lesson: When removing files, search for all references

2. **Preseed.cfg Has Hardcoded Passwords**
   - Lines 28-31 contain default passwords
   - These are installer defaults, should be changed on first boot
   - Security risk if users don't change them
   - Lesson: Consider using installer prompts instead

3. **Test Coverage Claim vs Reality**
   - Documentation claimed 95% coverage
   - Reality: 100% static analysis, 0% runtime/VM testing
   - Lesson: Be precise about what "coverage" means
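
The hardcoded-password lesson could be turned into a check that fails while cleartext passwords remain. This is a sketch under assumptions: `find_cleartext_passwords` is a hypothetical helper, and it looks only for the standard Debian installer keys `passwd/root-password` and `passwd/user-password` (with their `-again` variants), ignoring the `-crypted` forms.

```bash
#!/bin/bash
set -euo pipefail

# Print preseed lines (with line numbers) that set a cleartext root or user password.
# Commented-out lines and the hashed "-crypted" keys are not reported.
# "|| true" keeps a clean file from aborting the caller under set -e.
find_cleartext_passwords() {
    local preseed="$1"
    grep -nE '^[[:space:]]*d-i[[:space:]]+passwd/(root|user)-password(-again)?[[:space:]]+password[[:space:]]+[^[:space:]]' "$preseed" || true
}
```

Usage: `find_cleartext_passwords config/preseed.cfg`; empty output means no cleartext password keys were found.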

### Action Items for Future Sessions

1. Implement VM boot tests using libvirt
2. Add Secure Boot support (shim-signed, grub-efi-amd64-signed)
3. Create runtime FDE passphrase prompt tests
4. Remove hardcoded passwords from preseed.cfg
5. Fix shellcheck warnings (low priority, non-critical)

---

## Entry 2026-01-28: Initial Build Completion

### Context
First successful ISO build completed after 72 minutes.

### Insights

1. **Live-Build Stages**
   - bootstrap: Downloads base system (longest stage)
   - chroot: Installs packages, runs hooks
   - binary: Creates ISO filesystem
   - checksum: Generates SHA256/MD5

2. **Build Time Breakdown**
   - Total: ~72 minutes
   - bootstrap: ~40 minutes (network dependent)
   - chroot: ~20 minutes
   - binary: ~10 minutes

3. **ISO Size**
   - Final ISO: 450 MB
   - Includes: Debian base, IceWM, WireGuard, security tools
   - Reasonable size for secure workstation

### Patterns

1. **Docker Volume Strategy**
   - `/workspace` mounted read-only (source code)
   - `/build` for intermediate files
   - `/output` for final artifacts
   - Prevents accidental modification of source

2. **Checksum Generation**
   - Generate both SHA256 and MD5
   - Name checksum files after ISO
   - Copy to output directory with ISO
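
The checksum pattern can be sketched as follows. The `.sha256` and `.md5` suffixes and both function names are assumptions; the journal only states that checksum files are named after the ISO.

```bash
#!/bin/bash
set -euo pipefail

# Write <iso>.sha256 and <iso>.md5 next to the ISO.
# Running from the ISO's directory keeps bare filenames in the checksum files,
# so they still verify after being copied elsewhere together with the ISO.
generate_checksums() {
    local iso="$1" dir name
    dir="$(dirname "$iso")"
    name="$(basename "$iso")"
    (
        cd "$dir" &&
        sha256sum "$name" > "${name}.sha256" &&
        md5sum "$name" > "${name}.md5"
    )
}

# Re-check both files; returns non-zero if either checksum does not match.
verify_checksums() {
    local iso="$1" dir name
    dir="$(dirname "$iso")"
    name="$(basename "$iso")"
    (
        cd "$dir" &&
        sha256sum --check --quiet "${name}.sha256" &&
        md5sum --check --quiet "${name}.md5"
    )
}
```

MD5 is useful here only for detecting accidental corruption; SHA256 is the one to rely on for integrity.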

---

*End of Journal. Add new entries at the top.*