Two CaseHubs, One Design
Date: 2026-04-09
Type: phase-update
Where this started
A colleague had been building casehub-engine — a CMMN-inspired implementation for agentic AI. Good work, progressing well. I wanted to go deeper into the design space myself, partly to understand it more concretely and partly so we could arrive at a shared picture of the broader architectural direction we were aiming for.
I started some exploratory work with Claude. What I didn’t expect was how quickly it would build out an entire system. What began as exploration to improve my own understanding turned into something of real value in its own right — a working framework built across just three sessions, starting from Blackboard Architecture and evolving to incorporate CMMN concepts, as discussed in the earlier posts in this series.
That created an interesting situation. Two implementations now exist. Both have real value. Getting the best of both requires careful work — not a simple merge but a deliberate synthesis of the strongest ideas from each. This post is about that synthesis.
First, what those three sessions actually produced.
What three sessions produced
Before getting to today’s work, it’s worth being precise about what the previous three sessions actually built — because the scope still surprises me when I look at it.
Session 1 — March 27, one commit, 2:33am
- Full Blackboard control loop: CaseEngine, CasePlanModel, PlanItem, PlanningStrategy
- CaseFile with per-key versioning, optimistic concurrency, change listeners, contribution tracking
- TaskDefinition with DFS cycle detection preventing circular dependencies at registration
- Complete resilience layer: RetryPolicy, PoisonPillDetector, DeadLetterQueue, IdempotencyService, TimeoutEnforcer, ConflictResolver
- TaskBroker, TaskScheduler, WorkerRegistry with autonomous worker support
- SPI interfaces for storage: CaseFileStorageProvider, TaskStorageProvider
- Two working example applications, LLM worker integration
- 2,400-line design document
- 73 files, 14,003 lines
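The registration-time cycle check mentioned in the list above can be sketched roughly as follows. This is a minimal illustration and not casehub's actual code: `TaskGraph`, `register`, and `reaches` are hypothetical names, and tasks are reduced to plain strings. It assumes each task declares the names of the tasks it depends on, and that a registration which would close a loop is rolled back and rejected.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry: names and shape are illustrative, not casehub's real API.
class TaskGraph {
    // task name -> names of the tasks it depends on
    private final Map<String, Set<String>> dependencies = new HashMap<>();

    /** Registers a task, rejecting it if its dependencies would close a cycle. */
    void register(String task, Set<String> dependsOn) {
        Set<String> previous = dependencies.put(task, Set.copyOf(dependsOn));
        if (reaches(task, task, new HashSet<>())) {
            // Roll back so a rejected registration leaves the graph unchanged.
            if (previous == null) {
                dependencies.remove(task);
            } else {
                dependencies.put(task, previous);
            }
            throw new IllegalArgumentException("Circular dependency involving task: " + task);
        }
    }

    /** Depth-first search: true if target is reachable from current via dependency edges. */
    private boolean reaches(String current, String target, Set<String> visited) {
        for (String dep : dependencies.getOrDefault(current, Set.of())) {
            if (dep.equals(target)) {
                return true;
            }
            // visited.add returns false for nodes already explored, so each node is walked once.
            if (visited.add(dep) && reaches(dep, target, visited)) {
                return true;
            }
        }
        return false;
    }
}
```

Registering `a` (depends on `b`) and `b` (depends on `c`) succeeds; registering `c` with a dependency on `a` then throws, and `c` can still be registered afterwards with a dependency that does not lead back to itself.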
Session 2 — March 28
- CMMN Stage lifecycle: nested stages, autocomplete, manual activation, entry/exit criteria
- CMMN Milestones: named achievement markers, PENDING/ACHIEVED
- In-memory storage providers: fast, zero-dependency tests
- Quarkus Flow bridge: FlowWorker, FlowWorkflowRegistry, FlowExecutionContext
Session 3 — April 9
- POJO graph refactor: CaseFile.getParentCase/getChildCases, Task.getOwningCase/getChildTasks
- PropagationContext slimmed to W3C traceId + attributes + budget
- CaseFileRepository + TaskRepository SPIs extracted to dedicated modules
- casehub-persistence-memory and casehub-persistence-hibernate modules
- Goal model research across GOAP, BDI, HTN, DCR, CMMN, KAOS, LangChain4j; ADR-0001 written
- GitHub repository, issue tracking, retrospective mapping
Three sessions. The framework was real and working. Then I looked at what my colleague had been building in parallel.
The other implementation
The colleague had been building casehub-engine with a distinct set of architectural choices: a reactive event bus (Vert.x + Mutiny), JQ expressions for conditions, Quartz for durable worker execution, a YAML/JSON schema with codegen, and a Goal model that was already implemented.
The question wasn’t whether to merge — it was which direction and how.
Claude and I started with a surface comparison. Claude’s first pass was useful but incomplete — it missed that casehub-engine’s Workers already support plain Java lambdas alongside JQ expressions, reported PropagationContext as removed when it had only been slimmed, and initially underestimated the significance of the EventLog. I pushed for a systematic review — every file in both codebases, nothing skimmed. That’s what surfaced evalObjectTemplate(), a full template mini-DSL for input/output mapping that doesn’t use JQ at all.
What each system brings
| casehub | casehub-engine |
|---|---|
| Blackboard control loop | Reactive async: Vert.x EventBus + Mutiny |
| PlanningStrategy: pluggable control reasoning | Goal + GoalExpression + GoalKind |
| CMMN Stage lifecycle: nested, autocomplete | EventLog: full ordered event history |
| Full resilience suite: DLQ, PoisonPill, Idempotency | Binding + Trigger: contextChange, cloudEvent, schedule |
| Hybrid orchestration + choreography | Capability + input/output mapping |
| Per-key versioning + ConflictResolver | YAML/JSON schema + codegen |
| PropagationContext: tracing + budget | Durable execution via Quartz |
| Persistence SPI: memory + Hibernate | |
The gap runs in both directions — each system has capabilities the other lacks entirely.
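As one concrete illustration from the casehub column, per-key versioning with optimistic concurrency might look something like the sketch below. All names here (`VersionedCaseFile`, `Entry`, `putIfVersion`) are assumptions made for the example, not casehub's real CaseFile API. The idea shown is that every key carries its own version counter, and a write succeeds only if the writer saw the latest version of that key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; the real CaseFile interface may look quite different.
class VersionedCaseFile {
    /** A value paired with the version at which it was written. */
    record Entry(Object value, long version) {}

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    /** Returns the current entry for the key, or null if the key was never written. */
    Entry get(String key) {
        return entries.get(key);
    }

    /**
     * Writes only if expectedVersion matches the key's current version
     * (0 for a key that does not exist yet). Returns false on a stale write.
     */
    boolean putIfVersion(String key, Object value, long expectedVersion) {
        boolean[] written = {false};
        // compute() runs atomically per key, so check-and-write cannot interleave.
        entries.compute(key, (k, current) -> {
            long currentVersion = current == null ? 0 : current.version();
            if (currentVersion != expectedVersion) {
                return current; // stale writer: leave the existing entry untouched
            }
            written[0] = true;
            return new Entry(value, currentVersion + 1);
        });
        return written[0];
    }
}
```

A `false` return is the point where something like a ConflictResolver would step in, for example to retry against the fresh version or merge the two values. That part is deliberately left out of the sketch.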
The direction: casehub as the base
Both systems bring substantial architectural work. casehub-engine has a production reactive event cycle, durable Quartz execution, and a working Goal model. casehub has the Blackboard control loop, CMMN stage lifecycle, resilience layer, and lineage model. The assessment at this point was that layering casehub-engine’s reactive infrastructure into casehub would involve a larger Phase 5 async refactor than the reverse. As the next post shows, that assessment was revisited once implementation began.
The merge direction was clear. The approach: evolve casehub in place, phase by phase.
The async question: making the case for a hybrid event-driven PlanningStrategy loop
The most important architectural discussion was about the synchronous control loop.
casehub currently runs a blocking while loop: evaluate → fire one task → re-evaluate. casehub-engine is fully async — Vert.x event bus, Mutiny, everything non-blocking. The instinct was to frame this as sync vs async. That’s the wrong frame.
The real question is two separate things: who decides what fires next, and does execution block threads? casehub conflates them. A PlanningStrategy that reasons sequentially about what should run next does not require blocking threads. The loop can be event-driven: a CaseFileChangedEvent fires on the bus, the strategy evaluates, a WorkerScheduleEvent is published, Quartz picks it up. Sequential logic, non-blocking execution.
More importantly: the synchronous loop can’t handle casehub’s own hybrid model cleanly. Autonomous workers run on their own threads and currently reach back into the engine via notifyAutonomousWork() — coupling workers to engine internals. With an event bus, autonomous workers just write to the CaseFile. The engine reacts. Same as everything else.
```mermaid
graph LR
    change[CaseFile changes\nany source] --> event[CaseFileChangedEvent]
    event --> eval[PlanningStrategy\nevaluates]
    eval --> schedule[WorkerScheduleEvent]
    schedule --> quartz[Quartz executes\nnon-blocking]
    quartz --> result[Result written\nto CaseFile]
    result --> change
```
Async isn’t a performance improvement here. It’s architecturally necessary for the hybrid model to work cleanly.
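The shape of that cycle can be sketched in plain Java. This is a deliberately simplified model under several assumptions: an in-memory queue stands in for both the Vert.x event bus and Quartz, everything runs on the caller's thread, and the class `EventDrivenLoop` plus the method names `write`, `drain`, and `read` are invented for the example. Only the event names and PlanningStrategy come from the design. What it shows is the separation of deciding from executing, not real threading behaviour.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Queue;
import java.util.function.Function;

// Illustrative only: an in-memory queue stands in for the event bus and for Quartz.
class EventDrivenLoop {
    record CaseFileChangedEvent(String key) {}
    record WorkerScheduleEvent(String worker) {}

    /** Picks the next worker for the current case state; empty means nothing to schedule. */
    interface PlanningStrategy {
        Optional<String> next(Map<String, Object> caseFile);
    }

    private final Map<String, Object> caseFile = new LinkedHashMap<>();
    private final Queue<Object> bus = new ArrayDeque<>();
    private final PlanningStrategy strategy;
    private final Map<String, Function<Map<String, Object>, Map<String, Object>>> workers;
    final List<String> executed = new ArrayList<>();

    EventDrivenLoop(PlanningStrategy strategy,
                    Map<String, Function<Map<String, Object>, Map<String, Object>>> workers) {
        this.strategy = strategy;
        this.workers = workers;
    }

    /** Any source writes here: the engine, a scheduled worker, or an autonomous worker. */
    void write(String key, Object value) {
        caseFile.put(key, value);
        bus.add(new CaseFileChangedEvent(key));
    }

    /** Delivers queued events until the bus is empty. */
    void drain() {
        Object event;
        while ((event = bus.poll()) != null) {
            if (event instanceof CaseFileChangedEvent) {
                // Decision step: sequential reasoning, no worker is invoked here.
                strategy.next(Map.copyOf(caseFile))
                        .ifPresent(worker -> bus.add(new WorkerScheduleEvent(worker)));
            } else if (event instanceof WorkerScheduleEvent scheduled) {
                // Execution step: results come back as ordinary CaseFile writes.
                executed.add(scheduled.worker());
                workers.get(scheduled.worker()).apply(Map.copyOf(caseFile)).forEach(this::write);
            }
        }
    }

    Object read(String key) {
        return caseFile.get(key);
    }
}
```

With a strategy that schedules an `extract` worker until an `extracted` key exists and then a `summarise` worker until a `summary` key exists, a single external `write` followed by `drain()` runs both workers in order. Nothing in the workers calls back into the loop; they only return values to be written, which is the property that removes the need for something like notifyAutonomousWork().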
Key decisions
| Decision | Choice |
|---|---|
| Merge direction | casehub as base |
| Execution model | Async event cycle — logically configurable, physically always non-blocking |
| TaskDefinition vs Worker | TaskDefinition is sugar over Worker + Binding |
| Schema vs code | Both first-class — same pattern as Quarkus Flow |
| Expression language | Pluggable — JQ and Java lambdas, both valid everywhere |
| Context model | Pluggable CaseFile impls — JSON, typed POJO, Map |
| Quarkus Flow depth | One backend among several — not forced, natural choice for I/O-bound workers |
| Naming | bindings (not rules or dispatch-rules) |
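To make the "pluggable expression language" row more concrete, here is a rough sketch of what an open evaluator interface could look like. The interface shape, the `Binding` record, and `KeyPresentEvaluator` are all assumptions for illustration. In particular, `KeyPresentEvaluator` is only a stand-in for a string-expression backend such as JQ and does not implement JQ semantics.

```java
import java.util.Map;
import java.util.function.Predicate;

/** Evaluates a condition against the case context. Hypothetical shape, not the real interface. */
interface ExpressionEvaluator {
    boolean evaluate(Map<String, Object> context);
}

/** Wraps a plain Java lambda as an evaluator. */
final class LambdaExpressionEvaluator implements ExpressionEvaluator {
    private final Predicate<Map<String, Object>> predicate;

    LambdaExpressionEvaluator(Predicate<Map<String, Object>> predicate) {
        this.predicate = predicate;
    }

    @Override
    public boolean evaluate(Map<String, Object> context) {
        return predicate.test(context);
    }
}

/** Placeholder for a string-expression backend; it only checks that a key is present. */
final class KeyPresentEvaluator implements ExpressionEvaluator {
    private final String key;

    KeyPresentEvaluator(String key) {
        this.key = key;
    }

    @Override
    public boolean evaluate(Map<String, Object> context) {
        return context.containsKey(key);
    }
}

/** A binding that does not know which kind of evaluator backs its condition. */
record Binding(String worker, ExpressionEvaluator when) {
    boolean matches(Map<String, Object> context) {
        return when.evaluate(context);
    }
}
```

Because `Binding` depends only on the interface, a condition written as a lambda and one backed by an expression string can be used interchangeably wherever a condition is expected, which is the sense of "both valid everywhere" in the table.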
What gets merged
The implementation plan has nine phases — naming decisions deferred where still under discussion, so work can start immediately on the naming-safe pieces:
- Unseal ExpressionEvaluator, add LambdaExpressionEvaluator
- Adopt Goal model: Goal, GoalExpression, GoalKind, CaseCompletion
- Adopt EventLog + Quartz for durable execution
- Replace synchronous control loop with async event cycle
- Pluggable CaseFile implementations
- Binding + Trigger model
- YAML schema adoption
- Sub-cases: wire the existing POJO graph into the engine
- casehub-quarkus extension: full Quarkus DX layer
The design document is written. Two implementations, one design. Implementation starts next session.