<p>I came into this session expecting to close two issues and move on. #261 was
twelve test classes failing because <code>tenancyId</code> was null on direct Panache
persist. #234 was routing inbound messages to WorkItem creation. Both filed as
S/Low. Neither was.</p>

<p>The twelve tests from #261 were down to four by the time I picked them up —
the previous session’s commits had fixed most of them. The four survivors
weren’t tenancyId nulls at all. They were test ordering contamination:
<code>MutableCurrentPrincipal</code>, an <code>@ApplicationScoped</code> mutable bean, was leaking
tenant state between test classes. A tenancy test sets <code>tenancyId = TENANT_B</code>,
the bean survives to the next class, and the next class’s queries silently
return empty results. Not errors — empty. You’d never know unless you ran the
full suite in the right order.</p>
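<p>The failure mode is easier to see in a stripped-down model. What follows is a minimal plain-Java sketch, not the project's code: <code>SharedPrincipal</code>, <code>TenantScopedStore</code> and <code>ContaminationDemo</code> are hypothetical stand-ins for an application-scoped mutable bean and a store that filters reads by the current tenant.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an application-scoped, mutable principal bean:
// one instance is shared by everything that runs in the same container.
final class SharedPrincipal {
    private String tenancyId = "TENANT_A";

    String tenancyId() { return tenancyId; }

    void setTenancyId(String tenancyId) { this.tenancyId = tenancyId; }
}

// Hypothetical tenant-scoped store: reads only return rows for the current tenant.
final class TenantScopedStore {
    private static final class Row {
        final String tenancyId;
        final String name;

        Row(String tenancyId, String name) {
            this.tenancyId = tenancyId;
            this.name = name;
        }
    }

    private final List<Row> rows = new ArrayList<>();
    private final SharedPrincipal principal;

    TenantScopedStore(SharedPrincipal principal) { this.principal = principal; }

    void save(String tenancyId, String name) { rows.add(new Row(tenancyId, name)); }

    List<String> findAll() {
        List<String> names = new ArrayList<>();
        for (Row row : rows) {
            if (row.tenancyId.equals(principal.tenancyId())) {
                names.add(row.name);
            }
        }
        return names;
    }
}

final class ContaminationDemo {
    // Simulates two test classes sharing one principal instance.
    static List<String> run() {
        SharedPrincipal principal = new SharedPrincipal();
        TenantScopedStore store = new TenantScopedStore(principal);
        store.save("TENANT_A", "invoice-1");

        // "Test class 1" switches tenant and never switches back.
        principal.setTenancyId("TENANT_B");

        // "Test class 2" assumes the default tenant and queries: no exception, no rows.
        return store.findAll();
    }
}
```

<p>Calling <code>ContaminationDemo.run()</code> returns an empty list rather than throwing, which is why this kind of leak only shows up as a puzzling assertion failure further down the line.</p>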

<p>The fix was a <code>QuarkusTestBeforeEachCallback</code> registered via ServiceLoader:
an infrastructure-level reset that fires before every test method without
per-class wiring. One class, one registration file, problem structurally
eliminated.</p>
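<p>The shape of that reset can be sketched without Quarkus on the classpath. Everything here is an assumption made for illustration: <code>BeforeEachHook</code> stands in for the Quarkus callback interface, <code>ResettablePrincipal</code> for the mutable bean, and <code>MiniTestRunner</code> for the test engine. The real callback would look up the bean through CDI rather than hold a reference to it.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Stand-in for the callback contract: invoked before every test method.
interface BeforeEachHook {
    void beforeEach();
}

// Hypothetical mutable principal with an explicit way back to its default state.
final class ResettablePrincipal {
    static final String DEFAULT_TENANT = "TENANT_A";
    private String tenancyId = DEFAULT_TENANT;

    String tenancyId() { return tenancyId; }

    void setTenancyId(String tenancyId) { this.tenancyId = tenancyId; }

    void reset() { this.tenancyId = DEFAULT_TENANT; }
}

// The one class that does the reset, so individual test classes need no wiring.
final class ResetPrincipalHook implements BeforeEachHook {
    private final ResettablePrincipal principal;

    ResetPrincipalHook(ResettablePrincipal principal) { this.principal = principal; }

    @Override
    public void beforeEach() { principal.reset(); }
}

// Toy runner: applies every hook, records the tenant the test starts with, runs the test.
final class MiniTestRunner {
    static List<String> run(ResettablePrincipal principal,
                            List<BeforeEachHook> hooks,
                            List<Consumer<ResettablePrincipal>> tests) {
        List<String> tenantSeenAtStart = new ArrayList<>();
        for (Consumer<ResettablePrincipal> test : tests) {
            for (BeforeEachHook hook : hooks) {
                hook.beforeEach();
            }
            tenantSeenAtStart.add(principal.tenancyId());
            test.accept(principal);
        }
        return tenantSeenAtStart;
    }
}
```

<p>In a Quarkus test module the equivalent class implements <code>io.quarkus.test.junit.callback.QuarkusTestBeforeEachCallback</code>, and its fully qualified name is listed in a <code>META-INF/services</code> file named after that interface. With the hook installed, a test that switches to <code>TENANT_B</code> no longer affects the test that runs after it.</p>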

<p>Then the cascade started. Every module that depends on runtime needs
<code>quarkus.scheduler.start-mode=halted</code>, not <code>quarkus.scheduler.enabled=false</code>.
The names sound interchangeable. They’re not. <code>enabled=false</code> removes the Quartz
<code>Scheduler</code> CDI bean entirely, which is fine until something injects it for
programmatic scheduling. <code>start-mode=halted</code> keeps the bean injectable and simply
fires no jobs until the scheduler is resumed. Eight modules had the wrong setting.</p>
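<p>As a config fragment, the difference looks roughly like this. Which file it lives in and whether it carries a <code>%test.</code> profile prefix are assumptions here; the post does not say how each module scopes the setting.</p>

```properties
# Keep the scheduler bean available for injection, but do not fire any jobs.
quarkus.scheduler.start-mode=halted

# Avoid in modules where something injects the scheduler for programmatic scheduling:
# quarkus.scheduler.enabled=false
```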

<p>Then the examples. Then the queues-examples. Then the dashboard. Then the AI
unit tests constructing <code>InMemoryWorkItemStore</code> outside CDI with no
<code>CurrentPrincipal</code>. Then the postgres-broadcaster SSE tests filtering by
tenant with mismatched IDs. Then a Merkle hash mismatch from <code>canonicalBytes</code>
including <code>tenancyId</code> before the entry had it set. Each fix revealed the next,
and each time I pushed to CI without verifying locally first, I lost five
minutes waiting to discover what I could have found in seconds.</p>
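<p>The Merkle mismatch is an ordering bug, and a small model shows why. This is a sketch under stated assumptions: <code>EntrySketch</code>, its fields, the <code>|</code> separator and the choice of SHA-256 are all hypothetical, and the project's real <code>canonicalBytes</code> layout is not shown in this post.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical entry whose canonical form includes the tenant.
final class EntrySketch {
    String tenancyId; // may still be null when the hash is first computed
    final String payload;

    EntrySketch(String payload) { this.payload = payload; }

    // Assumed canonical layout: "<tenancyId>|<payload>" encoded as UTF-8.
    byte[] canonicalBytes() {
        return (String.valueOf(tenancyId) + "|" + payload).getBytes(StandardCharsets.UTF_8);
    }

    byte[] hash() {
        try {
            return MessageDigest.getInstance("SHA-256").digest(canonicalBytes());
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}

final class HashOrderingDemo {
    // Hashes the entry, assigns the tenant afterwards, then hashes again.
    static boolean hashChangesWhenTenantAssignedLate() {
        EntrySketch entry = new EntrySketch("work-item-created");
        byte[] hashedTooEarly = entry.hash();

        entry.tenancyId = "TENANT_A";
        byte[] hashedAfterAssignment = entry.hash();

        return !Arrays.equals(hashedTooEarly, hashedAfterAssignment);
    }
}
```

<p>Any verifier that recomputes the hash after the tenant has been assigned will disagree with the stored value, so the tenant has to be set before the first hash is taken.</p>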

<p>That pattern — fix one thing, push, wait, find the next — was the real
lesson. I built a <code>fix-ci</code> skill to formalise the right workflow: reproduce
locally, root-cause exhaustively, verify every failing test in isolation,
full local build, one push. The skill exists because I didn’t follow the
process I knew was correct.</p>

<p>#234 turned out to be a design question, not an implementation task. The
original issue said to put an <code>@ObservesAsync InboundMessage</code> observer in
casehub-work. But casehub-work and casehub-qhorus are Foundation-tier peers —
neither depends on the other. The bridge between them belongs in casehub-engine,
which already aggregates both. We spec’d a new <code>casehub-engine-inbound</code> module
that observes Qhorus <code>MessageReceivedEvent</code> (not raw <code>InboundMessage</code>) and
creates WorkItems through the existing <code>WorkItemService</code>. Two dependencies:
<code>casehub-qhorus-api</code> + <code>casehub-work-core</code>. The spec lives in the engine
workspace; two issues filed there; #234 closed as “design moved to engine.”</p>
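<p>The bridge itself should be small. The sketch below only illustrates the dependency direction and is not the spec: <code>MessageReceivedEventSketch</code> and <code>WorkItemServiceSketch</code> are stand-ins for types owned by the two Foundation-tier modules, and the <code>tenancyId</code> and <code>subject</code> fields and the <code>create</code> signature are invented for the example. In the real module the handler parameter would carry CDI's <code>@ObservesAsync</code> annotation.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the event published by the messaging side (fields are hypothetical).
final class MessageReceivedEventSketch {
    final String tenancyId;
    final String subject;

    MessageReceivedEventSketch(String tenancyId, String subject) {
        this.tenancyId = tenancyId;
        this.subject = subject;
    }
}

// Stand-in for the existing work item service (signature is hypothetical).
interface WorkItemServiceSketch {
    String create(String tenancyId, String title);
}

// The bridge lives in the aggregating module and depends on both peers;
// neither peer needs to know about the other.
final class InboundWorkItemBridge {
    private final WorkItemServiceSketch workItems;

    InboundWorkItemBridge(WorkItemServiceSketch workItems) { this.workItems = workItems; }

    // In CDI this would be: void onMessageReceived(@ObservesAsync MessageReceivedEvent event)
    String onMessageReceived(MessageReceivedEventSketch event) {
        return workItems.create(event.tenancyId, event.subject);
    }
}

// Recording fake used to exercise the bridge without a container.
final class RecordingWorkItemService implements WorkItemServiceSketch {
    final List<String> created = new ArrayList<>();

    @Override
    public String create(String tenancyId, String title) {
        created.add(tenancyId + ":" + title);
        return "work-" + created.size();
    }
}
```

<p>Keeping the observer in the engine means the two Foundation-tier modules stay independent of each other, which is the constraint that ruled out the original placement.</p>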

<p>CI is green. All fifteen modules pass locally. Three garden entries submitted
for the gotchas that cost the most time. The session ran long, but the codebase
is cleaner for it — every module now handles tenancy correctly, and the test
infrastructure prevents the contamination class of bugs from recurring.</p>