fix: graceful TPM fallback in VM creation, fix vm_destroy cleanup

vm_create() now handles swtpm initialization gracefully:
- Pre-initializes swtpm state dir if /var/lib/libvirt/swtpm/ is writable
- Falls back to VM without TPM if swtpm setup fails (with clear warnings)
- Uses PID-suffixed paths for disk and ISO to avoid stale file conflicts
- Removed unused VM_DISK_PATH/VM_ISO_PATH globals (now local vars)

vm_destroy() cleanup:
- No longer references undefined local variables from vm_create
- Uses glob patterns to clean all VM files in /tmp/
- Explicitly preserves ISO in output/

Template changes:
- The TPM device is now an @TPM_SECTION@ placeholder (filled in only when swtpm is available)
- The same template therefore works with or without TPM

AGENTS.md additions:
- VM testing & swtpm setup documentation
- Direct QEMU alternative when libvirt has issues
- Session lessons: never delete ISO, never remove TPM, always test E2E

All 523 unit tests pass, 0 lint warnings.

💘 Generated with Crush

Assisted-by: GLM-5.1 via Crush <crush@charm.land>
2026-05-07 12:39:47 -05:00
parent ccab1e2b19
commit 88d670efbe
4 changed files with 173 additions and 27 deletions
+85
@@ -523,3 +523,88 @@ patch -p1 < changes.diff
- Can preview changes with `sed 's/old/new/g' file` (no -i) first
---
## VM Testing & swtpm
### VM Creation via `./run.sh test:iso create`
The `vm_create()` function in `run.sh` handles TPM gracefully:
- If `/var/lib/libvirt/swtpm/` exists and is writable: TPM 2.0 emulation is enabled
- If not accessible: VM is created WITHOUT TPM with clear warnings
- TPM is required for Secure Boot and disk encryption testing, but NOT required for live ISO boot testing
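The decision described above can be sketched as a small check. This is only an illustration: `tpm_mode_for` is a hypothetical helper, not a function in `run.sh`, and the real `vm_setup_swtpm()` goes further by trying to create the directory and run `swtpm_setup`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of run.sh): report whether TPM could be
# enabled, based only on whether the swtpm base directory exists and is writable.
tpm_mode_for() {
    local base_dir="$1"
    if [[ -d "$base_dir" && -w "$base_dir" ]]; then
        echo "tpm"
    else
        echo "no-tpm"
    fi
}

# Prints "tpm" or "no-tpm" depending on the local setup.
tpm_mode_for /var/lib/libvirt/swtpm
```

Running this before `./run.sh test:iso create` gives a quick hint about which path the VM creation is likely to take.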
### One-Time swtpm Setup (if needed for full security testing)
```bash
sudo mkdir -p /var/lib/libvirt/swtpm
sudo chown libvirt-qemu:libvirt-qemu /var/lib/libvirt/swtpm
```
After this, `./run.sh test:iso create` will automatically enable TPM.
### VM Lifecycle
```bash
./run.sh test:iso create # Create and start VM (shows in virt-manager)
./run.sh test:iso status # Check VM status
./run.sh test:iso console # Serial console
./run.sh test:iso destroy # Destroy VM (ISO preserved in output/)
```
### Direct QEMU (alternative when libvirt has issues)
If libvirt's swtpm integration is broken, you can boot the ISO directly with QEMU:
```bash
# Set up swtpm manually
mkdir -p /tmp/swtpm-state
swtpm_setup --tpm-state /tmp/swtpm-state --tpm2 --createek --allow-signing --pcr-banks sha256
swtpm socket --tpmstate dir=/tmp/swtpm-state --tpm2 \
    --ctrl type=unixio,path=/tmp/swtpm-sock --daemon --flags not-need-init
# Create the writable UEFI vars file and the disk that the QEMU command below expects.
# Assumption: the vars template sits next to the OVMF code image; adjust the path for your distro.
cp /usr/share/OVMF/OVMF_VARS_4M.fd /tmp/ovmf-vars.fd
qemu-img create -f qcow2 /tmp/disk.qcow2 10G
# Boot with QEMU
qemu-system-x86_64 \
    -machine q35,smm=on -accel kvm -cpu host -smp 2 -m 4096 \
    -drive if=pflash,format=raw,unit=0,file=/usr/share/OVMF/OVMF_CODE_4M.secboot.fd,readonly=on \
    -drive if=pflash,format=raw,unit=1,file=/tmp/ovmf-vars.fd \
    -drive file=/tmp/disk.qcow2,format=qcow2,if=virtio \
    -cdrom output/knel-football-secure.iso -boot d \
    -netdev user,id=net0 -device virtio-net-pci,netdev=net0 \
    -vnc :5 -device virtio-gpu-pci \
    -chardev socket,id=chrtpm,path=/tmp/swtpm-sock \
    -tpmdev emulator,id=tpm0,chardev=chrtpm -device tpm-tis,tpmdev=tpm0
```
### Key Lesson: swtpm Must Be Pre-Initialized
swtpm's CMD_INIT fails if the TPM state hasn't been set up with `swtpm_setup` first.
For libvirt integration, this means `/var/lib/libvirt/swtpm/<vm-name>/` must exist
with initialized state and correct ownership (`libvirt-qemu:libvirt-qemu`).
---
## Session Lessons & Hard-Won Knowledge
### DO NOT Delete the ISO in vm_destroy
The ISO takes 7+ minutes to build. `vm_destroy()` must NEVER delete files from `output/`.
Only clean up `/tmp/` files (disks, copies, XML).
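One way to keep cleanup away from `output/` is to scope every `rm` to a single scratch directory. The sketch below illustrates that idea; `cleanup_vm_scratch` is a hypothetical helper, not the actual `vm_destroy()` implementation.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of run.sh): remove per-VM scratch files
# from one directory only, so nothing outside it can be touched.
cleanup_vm_scratch() {
    local scratch_dir="$1" vm_name="$2"
    # The globs sit outside the quotes so they expand;
    # rm -f silently ignores patterns that match nothing.
    rm -f -- "$scratch_dir/$vm_name"*.qcow2 "$scratch_dir/$vm_name"*.iso \
             "$scratch_dir/$vm_name"*.xml "$scratch_dir/$vm_name"*.fd
}

# Example call (assumed VM name): cleanup_vm_scratch /tmp knel-football-test
```

Because the directory is an explicit argument, the function cannot reach `output/` unless a caller passes it deliberately.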
### DO NOT Remove TPM From Templates
TPM is required for UEFI Secure Boot and disk encryption. The template must support
conditional TPM via `@TPM_SECTION@` placeholder, not have TPM removed entirely.
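A minimal sketch of how such a placeholder could be filled or left empty with `sed`. The function name `render_tpm_template` and its `yes`/`no` argument are assumptions for illustration. The TPM XML matches the device that was previously hard-coded in the template.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of run.sh): render a template with or
# without a TPM device, printing the result to stdout.
render_tpm_template() {
    local template="$1" tpm_enabled="$2"
    local tpm_section=""
    if [[ "$tpm_enabled" == "yes" ]]; then
        tpm_section="<tpm model='tpm-crb'><backend type='emulator' version='2.0'/></tpm>"
    fi
    # '|' is the sed delimiter because the XML replacement contains '/'.
    sed -e "s|@TPM_SECTION@|${tpm_section}|g" "$template"
}

# Example call (assumed path): render_tpm_template vm/template.xml yes
```

With `no`, the placeholder is replaced by an empty string, so the rendered XML simply has no `<tpm>` element.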
### Always Use PID-Suffixed Paths in /tmp
Previous VM runs may leave files owned by `libvirt-qemu` that the current user can't
delete. Use `/tmp/${VM_NAME}-$$.ext` to avoid conflicts.
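The naming scheme can be wrapped in a small function so every path is built the same way. This is a sketch only: `vm_tmp_path` is a hypothetical helper, and `VM_NAME` is set here just to keep the example self-contained.

```bash
#!/usr/bin/env bash
# Assumed value for illustration; run.sh defines VM_NAME itself.
VM_NAME="knel-football-test"

# Hypothetical helper (not part of run.sh): build a PID-suffixed path in /tmp.
# "$$" is the PID of the main shell, even when this runs inside $(...).
vm_tmp_path() {
    local ext="$1"
    printf '/tmp/%s-%s.%s\n' "$VM_NAME" "$$" "$ext"
}

vm_disk_path="$(vm_tmp_path qcow2)"
vm_iso_path="$(vm_tmp_path iso)"
echo "Disk: $vm_disk_path"
echo "ISO:  $vm_iso_path"
```

Each new run of the script gets a different PID, so it never collides with files left behind by an earlier run.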
### Test End-to-End, Not Just Components
A passing validation harness does NOT mean the ISO actually boots. Always:
1. Boot in QEMU with serial capture
2. Check for kernel panics, hung tasks, failed services
3. Verify login prompt appears
4. Capture and analyze full serial output
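Steps 2 and 3 can be partly automated once the serial output is in a file (for example, captured with QEMU's `-serial file:<path>` option). The sketch below is an assumption-laden illustration: `check_serial_log` is a hypothetical helper, and the patterns are typical kernel and systemd messages rather than a complete list.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of run.sh): scan a captured serial log
# for common failure messages and for a login prompt.
check_serial_log() {
    local log="$1"
    # Assumed patterns: kernel panic, hung-task warning, failed systemd unit.
    if grep -Eiq 'kernel panic|blocked for more than [0-9]+ seconds|failed to start' "$log"; then
        echo "FAIL: boot problems found in $log"
        return 1
    fi
    if ! grep -q 'login:' "$log"; then
        echo "FAIL: no login prompt found in $log"
        return 1
    fi
    echo "PASS: no known failures and a login prompt is present"
}

# Example call (assumed path): check_serial_log /tmp/serial.log
```

A passing result only means none of the listed patterns appeared; the full log is still worth reading, as step 4 says.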
+81 -18
@@ -18,14 +18,10 @@ readonly CACHE_VOLUME="knel-football-cache"
# VM Testing Configuration (system libvirt for virt-manager visibility, /tmp for no sudo)
readonly ISO_PATH="${SCRIPT_DIR}/output/knel-football-secure.iso"
readonly VM_NAME="knel-football-test"
readonly VM_RAM="2048"
readonly VM_RAM="4096"
readonly VM_CPUS="2"
readonly VM_DISK_SIZE="10"
readonly LIBVIRT_URI="qemu:///system"
VM_DISK_PATH="/tmp/${VM_NAME}.qcow2"
readonly VM_DISK_PATH
VM_ISO_PATH="/tmp/${VM_NAME}.iso"
readonly VM_ISO_PATH
# Colors for output
readonly RED='\033[0;31m'
@@ -169,6 +165,54 @@ vm_check_prerequisites() {
return 0
}
# Setup swtpm state for libvirt TPM emulation
# Returns 0 if TPM is available, 1 if not
vm_setup_swtpm() {
local swtpm_state_dir="/var/lib/libvirt/swtpm/${VM_NAME}"
# Check if swtpm is installed
if ! command -v swtpm_setup &> /dev/null; then
log_warn "swtpm_setup not found - VM will run without TPM"
return 1
fi
# Try to create and initialize swtpm state directory
# For system libvirt, this lives under /var/lib/libvirt/swtpm/
if [[ "$LIBVIRT_URI" == *"system"* ]]; then
# Check if the base directory exists and is writable
if [[ ! -d "/var/lib/libvirt/swtpm" ]]; then
# Try to create it (may fail without root)
if ! mkdir -p "/var/lib/libvirt/swtpm" 2>/dev/null; then
log_warn "Cannot create /var/lib/libvirt/swtpm/ (needs sudo)"
log_warn "Fix: sudo mkdir -p /var/lib/libvirt/swtpm && sudo chown libvirt-qemu:libvirt-qemu /var/lib/libvirt/swtpm"
log_warn "VM will be created WITHOUT TPM"
return 1
fi
fi
# Try to create VM-specific state dir
if ! mkdir -p "$swtpm_state_dir" 2>/dev/null; then
log_warn "Cannot create swtpm state dir: $swtpm_state_dir"
log_warn "Fix: sudo mkdir -p $swtpm_state_dir && sudo chown -R libvirt-qemu:libvirt-qemu $swtpm_state_dir"
log_warn "VM will be created WITHOUT TPM"
return 1
fi
# Try to initialize TPM state
if ! swtpm_setup --tpm-state "$swtpm_state_dir" --tpm2 --createek --allow-signing --pcr-banks sha256 2>/dev/null; then
log_warn "swtpm_setup failed - VM will run without TPM"
rm -rf "$swtpm_state_dir" 2>/dev/null || true
return 1
fi
# Fix ownership for libvirt-qemu
chown -R libvirt-qemu:libvirt-qemu "$swtpm_state_dir" 2>/dev/null || true
fi
log_info "swtpm initialized successfully"
return 0
}
# Create and start VM using virsh define (virt-install requires storage pools)
vm_create() {
log_info "Creating VM: $VM_NAME (libvirt: $LIBVIRT_URI)"
@@ -177,12 +221,14 @@ vm_create() {
virsh -c "$LIBVIRT_URI" destroy "$VM_NAME" 2>/dev/null || true
virsh -c "$LIBVIRT_URI" undefine "$VM_NAME" --nvram 2>/dev/null || true
# Ensure libvirt images directory exists
mkdir -p "$(dirname "$VM_ISO_PATH")"
# Use unique paths to avoid stale libvirt-qemu owned files from previous runs
local vm_iso_path="/tmp/${VM_NAME}-$$.iso"
local vm_disk_path="/tmp/${VM_NAME}-$$.qcow2"
# Copy ISO to user storage (no root required for session libvirt)
# Copy ISO to user storage
log_info "Copying ISO to libvirt storage..."
if ! cp -f "$ISO_PATH" "$VM_ISO_PATH"; then
mkdir -p "$(dirname "$vm_iso_path")"
if ! cp -f "$ISO_PATH" "$vm_iso_path"; then
log_error "Failed to copy ISO"
return 1
fi
@@ -221,11 +267,11 @@ vm_create() {
log_warn "Using UEFI WITHOUT Secure Boot: $uefi_code"
fi
# Pre-create disk image (no root required for session libvirt)
log_info "Creating disk image: $VM_DISK_PATH"
rm -f "$VM_DISK_PATH" 2>/dev/null || true
mkdir -p "$(dirname "$VM_DISK_PATH")"
if ! qemu-img create -f qcow2 "$VM_DISK_PATH" "${VM_DISK_SIZE}G"; then
# Pre-create disk image
log_info "Creating disk image: $vm_disk_path"
rm -f "$vm_disk_path" 2>/dev/null || true
mkdir -p "$(dirname "$vm_disk_path")"
if ! qemu-img create -f qcow2 "$vm_disk_path" "${VM_DISK_SIZE}G"; then
log_error "Failed to create disk image"
return 1
fi
@@ -241,6 +287,17 @@ vm_create() {
local vm_uuid
vm_uuid=$(cat /proc/sys/kernel/random/uuid)
# Check TPM availability and configure accordingly
local tpm_section=""
if vm_setup_swtpm; then
tpm_section="<tpm model='tpm-crb'><backend type='emulator' version='2.0'/></tpm>"
log_info "TPM 2.0 emulation enabled"
else
tpm_section=""
log_warn "TPM disabled - Secure Boot and disk encryption will not work"
log_warn "This is OK for live ISO testing but not for installation"
fi
# Create VM XML from template
local vm_xml="/tmp/${VM_NAME}.xml"
sed -e "s|@VM_NAME@|${VM_NAME}|g" \
@@ -250,8 +307,9 @@ vm_create() {
-e "s|@SECURE_BOOT@|${secure_boot}|g" \
-e "s|@UEFI_CODE@|${uefi_code}|g" \
-e "s|@UEFI_VARS_TEMPLATE@|${uefi_vars}|g" \
-e "s|@VM_DISK@|${VM_DISK_PATH}|g" \
-e "s|@ISO_PATH@|${VM_ISO_PATH}|g" \
-e "s|@VM_DISK@|${vm_disk_path}|g" \
-e "s|@ISO_PATH@|${vm_iso_path}|g" \
-e "s|@TPM_SECTION@|${tpm_section}|g" \
"$template" > "$vm_xml"
log_info "Defining VM from XML..."
@@ -285,6 +343,8 @@ vm_create() {
log_info "VNC display: $vnc_display"
log_info ""
log_info "Open virt-manager - VM '$VM_NAME' should be visible under QEMU/KVM"
log_info "Disk: $vm_disk_path"
log_info "ISO: $vm_iso_path"
}
# Connect to VM console
@@ -323,9 +383,12 @@ vm_destroy() {
log_info "Destroying VM: $VM_NAME"
virsh -c "$LIBVIRT_URI" destroy "$VM_NAME" 2>/dev/null || true
virsh -c "$LIBVIRT_URI" undefine "$VM_NAME" --nvram 2>/dev/null || true
rm -f "$VM_DISK_PATH" "$VM_ISO_PATH" "/tmp/${VM_NAME}.xml" "/tmp/${VM_NAME}_VARS.fd"
log_info "Cleanup complete (ISO preserved)"
# Cleanup all VM files (ISO is preserved in output/)
rm -f /tmp/${VM_NAME}*.qcow2 /tmp/${VM_NAME}*.iso /tmp/${VM_NAME}*.xml /tmp/${VM_NAME}*.fd 2>/dev/null || true
rm -rf /var/lib/libvirt/swtpm/${VM_NAME} 2>/dev/null || true
log_info "Cleanup complete (ISO preserved in output/)"
}
# Run automated boot test
+6 -6
@@ -263,16 +263,16 @@
# VM TPM Support
# =============================================================================
@test "VM template includes TPM device" {
grep -q "tpm model" /workspace/vm/template.xml
@test "VM template has TPM placeholder" {
grep -q '@TPM_SECTION@' /workspace/vm/template.xml
}
@test "VM TPM uses version 2.0" {
grep -q "version='2.0'" /workspace/vm/template.xml
@test "run.sh generates TPM XML when swtpm available" {
grep -q "tpm-crb" /workspace/run.sh
}
@test "VM TPM uses CRB model" {
grep -q "tpm-crb" /workspace/vm/template.xml
@test "run.sh has vm_setup_swtpm function" {
grep -q "vm_setup_swtpm" /workspace/run.sh
}
# =============================================================================
+1 -3
@@ -24,9 +24,7 @@
</clock>
<devices>
<emulator>/usr/bin/qemu-system-x86_64</emulator>
<tpm model='tpm-crb'>
<backend type='emulator' version='2.0'/>
</tpm>
@TPM_SECTION@
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='@VM_DISK@'/>