fix: graceful TPM fallback in VM creation, fix vm_destroy cleanup

vm_create() now handles swtpm initialization gracefully:
- Pre-initializes swtpm state dir if /var/lib/libvirt/swtpm/ is writable
- Falls back to VM without TPM if swtpm setup fails (with clear warnings)
- Uses PID-suffixed paths for disk and ISO to avoid stale file conflicts
- Removes the unused VM_DISK_PATH/VM_ISO_PATH globals (the paths are now local variables)

vm_destroy() cleanup:
- No longer references undefined local variables from vm_create
- Uses glob patterns to clean all VM files in /tmp/
- Explicitly preserves ISO in output/

Template changes:
- The TPM device is now an @TPM_SECTION@ placeholder (filled in based on swtpm availability)
- Allows the same template to work with or without TPM

AGENTS.md additions:
- VM testing & swtpm setup documentation
- Direct QEMU alternative when libvirt has issues
- Session lessons: never delete ISO, never remove TPM, always test E2E

All 523 unit tests pass, 0 lint warnings.

💘 Generated with Crush

Assisted-by: GLM-5.1 via Crush <crush@charm.land>
2026-05-07 12:39:47 -05:00
parent ccab1e2b19
commit 88d670efbe
4 changed files with 173 additions and 27 deletions


@@ -523,3 +523,88 @@ patch -p1 < changes.diff
- Can preview changes with `sed 's/old/new/g' file` (no -i) first
---
## VM Testing & swtpm
### VM Creation via `./run.sh test:iso create`
The `vm_create()` function in `run.sh` handles TPM gracefully:
- If `/var/lib/libvirt/swtpm/` exists and is writable: TPM 2.0 emulation is enabled
- If it is missing or not writable: the VM is created WITHOUT TPM, with clear warnings
- TPM is required for Secure Boot and disk encryption testing, but NOT required for live ISO boot testing
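
The decision described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in `run.sh`: the helper names `tpm_available` and `select_tpm_mode` are hypothetical, and the sketch covers only the writability check, not the fallback taken when swtpm setup itself fails.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the swtpm state directory exists
# and is writable by the current user. Defaults to the libvirt path.
tpm_available() {
    local swtpm_dir="${1:-/var/lib/libvirt/swtpm}"
    [ -d "$swtpm_dir" ] && [ -w "$swtpm_dir" ]
}

# Hypothetical helper: print "tpm" or "no-tpm" on stdout.
# On fallback, a warning goes to stderr so it stays visible to the user.
select_tpm_mode() {
    local swtpm_dir="${1:-/var/lib/libvirt/swtpm}"
    if tpm_available "$swtpm_dir"; then
        echo "tpm"
    else
        echo "WARNING: $swtpm_dir is missing or not writable; creating VM WITHOUT TPM" >&2
        echo "no-tpm"
    fi
}
```

Calling `select_tpm_mode` with no argument checks the default libvirt directory. Passing a directory makes the check easy to exercise without root.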
### One-Time swtpm Setup (if needed for full security testing)
```bash
sudo mkdir -p /var/lib/libvirt/swtpm
sudo chown libvirt-qemu:libvirt-qemu /var/lib/libvirt/swtpm
```
After this, `./run.sh test:iso create` will automatically enable TPM.
### VM Lifecycle
```bash
./run.sh test:iso create # Create and start VM (shows in virt-manager)
./run.sh test:iso status # Check VM status
./run.sh test:iso console # Serial console
./run.sh test:iso destroy # Destroy VM (ISO preserved in output/)
```
### Direct QEMU (alternative when libvirt has issues)
If libvirt swtpm is broken, you can boot the ISO directly:
```bash
# Setup swtpm manually
mkdir -p /tmp/swtpm-state
swtpm_setup --tpm-state /tmp/swtpm-state --tpm2 --createek --allow-signing --pcr-banks sha256
swtpm socket --tpmstate dir=/tmp/swtpm-state --tpm2 \
--ctrl type=unixio,path=/tmp/swtpm-sock --daemon --flags not-need-init
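# Create the writable files the QEMU command below expects
# (assumptions: the OVMF_VARS_4M.fd path and the 20G disk size; adjust for your distro)
cp /usr/share/OVMF/OVMF_VARS_4M.fd /tmp/ovmf-vars.fd
qemu-img create -f qcow2 /tmp/disk.qcow2 20G
# Boot with QEMU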
qemu-system-x86_64 \
-machine q35,smm=on -accel kvm -cpu host -smp 2 -m 4096 \
-drive if=pflash,format=raw,unit=0,file=/usr/share/OVMF/OVMF_CODE_4M.secboot.fd,readonly=on \
-drive if=pflash,format=raw,unit=1,file=/tmp/ovmf-vars.fd \
-drive file=/tmp/disk.qcow2,format=qcow2,if=virtio \
-cdrom output/knel-football-secure.iso -boot d \
-netdev user,id=net0 -device virtio-net-pci,netdev=net0 \
-vnc :5 -device virtio-gpu-pci \
-chardev socket,id=chrtpm,path=/tmp/swtpm-sock \
-tpmdev emulator,id=tpm0,chardev=chrtpm -device tpm-tis,tpmdev=tpm0
```
### Key Lesson: swtpm Must Be Pre-Initialized
swtpm's CMD_INIT fails if the TPM state hasn't been set up with `swtpm_setup` first.
For libvirt integration, this means `/var/lib/libvirt/swtpm/<vm-name>/` must exist
with initialized state and correct ownership (`libvirt-qemu:libvirt-qemu`).
---
## Session Lessons & Hard-Won Knowledge
### DO NOT Delete the ISO in vm_destroy
The ISO takes 7+ minutes to build. `vm_destroy()` must NEVER delete files from `output/`.
Only clean up `/tmp/` files (disks, copies, XML).
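
A cleanup routine that follows this rule might look like the sketch below. The function name `vm_cleanup_tmp` and its second argument are hypothetical (the argument exists only to make the function testable), and the real `vm_destroy()` may differ. What matters is that the glob is anchored to the temp directory and never touches `output/`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: remove every file matching <tmp_dir>/<vm_name>-*
# and nothing else. tmp_dir defaults to /tmp.
vm_cleanup_tmp() {
    local vm_name="$1" tmp_dir="${2:-/tmp}" f
    for f in "$tmp_dir/$vm_name"-*; do
        [ -e "$f" ] || continue   # the glob matched nothing
        # Files left behind by another user may not be removable; warn and move on.
        rm -f -- "$f" 2>/dev/null || echo "WARNING: could not remove $f" >&2
    done
}
```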
### DO NOT Remove TPM From Templates
TPM is required for UEFI Secure Boot and disk encryption. The template must keep TPM
support as a conditional `@TPM_SECTION@` placeholder rather than dropping the TPM device entirely.
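
As an illustration of the placeholder approach, the sketch below fills `@TPM_SECTION@` using bash parameter substitution. The function name `render_vm_xml` and the exact `<tpm>` XML fragment are assumptions made for this example; the real template and injected section may differ.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: replace every @TPM_SECTION@ in the template text with
# a TPM device definition (mode "tpm") or with nothing (any other mode).
render_vm_xml() {
    local template="$1" mode="$2" section=""
    if [ "$mode" = "tpm" ]; then
        # Illustrative libvirt TPM fragment; not copied from the real template.
        section="<tpm model='tpm-tis'><backend type='emulator' version='2.0'/></tpm>"
    fi
    printf '%s\n' "${template//@TPM_SECTION@/$section}"
}
```

The same template text then serves both cases: `render_vm_xml "$(cat template.xml)" tpm` includes the device, and any other mode leaves it out.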
### Always Use PID-Suffixed Paths in /tmp
Previous VM runs may leave files owned by `libvirt-qemu` that the current user can't
delete. Use `/tmp/${VM_NAME}-$$.ext` to avoid conflicts.
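
A small helper keeps the naming consistent. The name `vm_tmp_path` is hypothetical; the sketch only shows how `$$` (the PID of the running shell) ends up in the path.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build /tmp/<vm_name>-<pid>.<ext> for the current shell.
vm_tmp_path() {
    local vm_name="$1" ext="$2"
    printf '/tmp/%s-%s.%s\n' "$vm_name" "$$" "$ext"
}
```

For example, `disk="$(vm_tmp_path "$VM_NAME" qcow2)"` yields a path unique to this run, so stale files from earlier runs are never overwritten.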
### Test End-to-End, Not Just Components
A passing validation harness does NOT mean the ISO actually boots. Always:
1. Boot in QEMU with serial capture
2. Check for kernel panics, hung tasks, failed services
3. Verify login prompt appears
4. Capture and analyze full serial output
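
Steps 2 and 3 can be partly automated once the serial output is in a file. The sketch below is one possible check, not part of the project: the function name `check_serial_log` is hypothetical, and the patterns are assumptions based on typical Linux and systemd console messages, so they may need tuning for this ISO.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: return 0 if the serial log has a login prompt and no
# obvious failure markers, 1 otherwise. Diagnostics go to stderr.
check_serial_log() {
    local log="$1" status=0
    # Assumed markers: kernel panic, hung-task warning, systemd "[FAILED]" unit line.
    if grep -Eq 'Kernel panic|blocked for more than [0-9]+ seconds|\[FAILED\]' "$log"; then
        echo "FAIL: panic, hung task, or failed service found in $log" >&2
        status=1
    fi
    if ! grep -q 'login:' "$log"; then
        echo "FAIL: no login prompt in $log" >&2
        status=1
    fi
    return "$status"
}
```

With QEMU, the log can be captured by adding `-serial file:/tmp/serial.log` to the command line. A clean result from this check narrows the manual review but does not replace reading the full output.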