Build Loop Test Areas
This file holds the broad coverage checklist and first human test for
build-loop-tests.md.
Required Test Areas
The Build Test Fix loop must create and run tests for:
- database writes
- MCP tool contracts
- agent logging proof
- agent feedback proof
- stored feedback retrieval
- planned work logging and start gating
- task cancellation with audit trail
- stdio MCP plan, start planned, and cancel lifecycle
- blocker and decision dashboard visibility
- planned work and risks in agent team-state responses
- recent completions in agent team-state responses
- task check before start
- scope check before start
- task-to-workstream linking with ownership checks
- workstream dashboard linked task counts and recent task names
- overlap warning creation
- overlap recall/precision and Performance quality gate
- stale completed work ignored by pre-start overlap checks
- start rejection without confirmation
- start success with confirmation
- WorkOS AuthKit login and local-dev tenant onboarding
- MCP developer token mapping
- cross-developer task mutation rejection
- cross-developer workstream and note mutation rejection
- local-dev identity selection through the dev-login API test path
- MCP token inventory and revocation after refresh
- demo token cleanup
- usage event writes
- session duration tracking foundation
- session heartbeat and stale status
- local repo snapshot capture without file contents
- local repo snapshot conflict warnings for overlapping dirty or unpushed files
- agent-to-agent coordination claims, questions, replies, handoffs, and releases
- idempotent task and workstream completion retries
- completed work cannot be reopened through update tools
- parallel work audits for already-running and recently completed duplicate work
- agent team-state warnings from persisted parallel audit conflicts
- no stale team-state warnings after both sides of an audit conflict are done
- cross-developer session heartbeat/end rejection
- dashboard metrics from real records
- Workstreams planned-work data
- Weekly Summary planned-work data
- Coordination risks, decisions, code claims, and peer messages data
- Workstreams data
- Activity completion data
- Activity audit data
- Weekly Summary generation
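
Several of the areas above (overlap warning creation, stale completed work ignored by pre-start checks, start rejection without confirmation, start success with confirmation) describe one gating flow. The following is a minimal sketch of that flow against an in-memory store, not the project's actual implementation. `Task`, `TaskStore`, `find_overlaps`, `start`, and the response keys are hypothetical names chosen for illustration, and treating any shared file path as an overlap is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Task:
    developer: str
    title: str
    files: set[str]
    status: str = "planned"  # planned | in_progress | completed | cancelled


@dataclass
class TaskStore:
    """In-memory stand-in for the real database (hypothetical)."""

    tasks: list[Task] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)

    def find_overlaps(self, candidate: Task) -> list[Task]:
        # Assumption: only in-progress work counts as an overlap, so
        # completed or cancelled work never blocks a start.
        return [
            t
            for t in self.tasks
            if t is not candidate
            and t.status == "in_progress"
            and t.files & candidate.files
        ]

    def start(self, task: Task, confirmed: bool = False) -> dict:
        overlaps = self.find_overlaps(task)
        if overlaps:
            names = ", ".join(repr(t.title) for t in overlaps)
            message = f"{task.title!r} overlaps with in-progress work: {names}"
            # Persist the warning once, even if the start is retried.
            if message not in self.warnings:
                self.warnings.append(message)
            if not confirmed:
                # Reject, and tell the caller what to do next.
                return {
                    "started": False,
                    "warning": message,
                    "next_step": "Review the overlap, then retry with confirmed=True.",
                }
        task.status = "in_progress"
        return {"started": True, "warning": None, "next_step": None}


store = TaskStore()
running = Task("alice", "Refactor auth", {"auth.py"}, status="in_progress")
done = Task("carol", "Old auth cleanup", {"session.py"}, status="completed")
new = Task("bob", "Fix login bug", {"auth.py", "login.py"})
store.tasks += [running, done, new]

rejected = store.start(new)                  # overlap, no confirmation
accepted = store.start(new, confirmed=True)  # same overlap, confirmed
```

A test built on this shape can assert three things separately: the rejected call leaves the task in `planned`, the warning is stored exactly once, and the confirmed call moves the task to `in_progress`.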
If the app has a browser UI, the loop must include a browser test for the main user journey.
First Human Test
When a human opens the app in the morning, they should try this first:
- Start the server.
- Open Workstreams.
- Run the duplicate-work test scenario.
- Confirm the overlap warning appears.
- Try starting without confirmation and confirm it fails.
- Start with confirmation and confirm the started task appears in the UI.
- Confirm the MCP response told the agent what to do.
- Confirm the task and warning were saved.
- Confirm usage/session events exist.
- Confirm Performance changed.
- Run `Audit parallel work`.
- Confirm the Parallel Audit panel shows both overlapping records.
- Refresh and confirm the latest audit still appears.
- Generate the weekly summary.
- Confirm structured summary data is visible.
- Confirm `Parallel audits` is included.
- Confirm summary generation does not create a new MCP token.
- Complete work through MCP and confirm Activity shows the result.
- Cancel planned work through MCP and confirm the dashboard removes it from Workstreams.
If this works, the customer-readiness slice is real.
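
The last two steps (completing work and cancelling planned work through MCP) depend on lifecycle rules listed under Required Test Areas: idempotent completion retries, completed work that cannot be reopened, cross-developer mutation rejection, and cancellation with an audit trail. The sketch below shows one way those rules could be expressed and checked in isolation. `WorkItem`, `Lifecycle`, `OwnershipError`, and the audit entry fields are hypothetical, and the rule that completed work cannot be cancelled is an assumption of this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class OwnershipError(Exception):
    """Raised when a developer tries to mutate another developer's work."""


@dataclass
class WorkItem:
    owner: str
    title: str
    status: str = "planned"  # planned | in_progress | completed | cancelled


@dataclass
class Lifecycle:
    """In-memory stand-in for the real task service (hypothetical)."""

    audit_log: list[dict] = field(default_factory=list)

    def _check_owner(self, item: WorkItem, actor: str) -> None:
        if item.owner != actor:
            raise OwnershipError(f"{actor} does not own {item.title!r}")

    def complete(self, item: WorkItem, actor: str) -> bool:
        """Return True if this call changed state, False on a retry."""
        self._check_owner(item, actor)
        if item.status == "completed":
            return False  # idempotent retry: no second audit entry
        if item.status == "cancelled":
            raise ValueError("cancelled work cannot be completed")
        item.status = "completed"
        self.audit_log.append(
            {"action": "complete", "actor": actor, "title": item.title}
        )
        return True

    def cancel(self, item: WorkItem, actor: str, reason: str) -> None:
        self._check_owner(item, actor)
        if item.status == "completed":
            # Assumption: completed work is final and cannot be cancelled.
            raise ValueError("completed work cannot be cancelled")
        item.status = "cancelled"
        self.audit_log.append(
            {"action": "cancel", "actor": actor, "title": item.title, "reason": reason}
        )


lifecycle = Lifecycle()
shipped = WorkItem("alice", "Ship dashboard", status="in_progress")
first = lifecycle.complete(shipped, "alice")  # changes state
retry = lifecycle.complete(shipped, "alice")  # no-op retry

planned = WorkItem("bob", "Plan metrics")
lifecycle.cancel(planned, "bob", reason="superseded")
```

Returning a boolean from `complete` keeps retries safe for agents that resend a request after a timeout, while the audit log still records each real state change exactly once.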