The v1.3.0 saturation termination silently capped the search after only
the heuristic-favored part of the tree, leaving most per-set ceiling cells
stuck at "0 specs" and hiding genuinely-feasible 3-spec plans in
maximize-count mode. Replace with full exhaustive enumeration plus a
batch of UX refinements that emerged during testing.
Algorithm:
- Drop the saturation early-termination entirely. Search now runs the
full open-set cartesian product to completion; the iteration cap is
also removed so no scenario exits partial.
- Add mode-dependent DFS child ordering: priority-order keeps the
priority-target-first heuristic; maximize-count orders children by
descending count of qualifications for reachable specs (generalist
courses tried first).
- Make the (count, priorityScore) comparator mode-aware: priority-order
ranks by (priorityScore, count) so the user's top spec surfaces;
maximize-count ranks by (count, priorityScore) so the highest count
wins. The same rule drives both top-K position and per-cell ceiling
selection (and the Recommended badge).
- Add an evaluated boolean to each ChoiceOutcome and set it on first
leaf evaluation. Distinguishes "still searching" from "evaluated, no
specs achieved" so the UI never shows misleading 0 specs for a cell
the search hasn't reached yet.
- Throttled progress events (~100ms) carrying iterations / total leaf
count, drive both the per-set spinner and the global progress bar.
UI:
- Top Plans header shows a horizontal progress bar with
"iterations / total · NN%" while the search runs; collapses to
"Search complete · N explored" on completion.
- Per-set spinner next to each elective set heading while any choice
in that set is unevaluated.
- Per-cell pulsing dot + "searching" text for unevaluated cells.
- Replace the "(HCR, BNK, ...)" text labels on each course with
color-coded SpecTag pills using a new fixed per-spec palette
(app/src/data/specColors.ts). Same palette applied to the Top Plans
achievement badges so the two views are visually consistent.
- "Top outcome if picked ↓" caption above the right side of each open
elective set so the spec tags are clearly identified as decision-tree
outcomes (not the course's own qualifications).
- Recommended badge moved inline next to the course name (instead of
on a separate row below) to keep button heights stable.
Tests:
- Replace the saturation early-termination test with an exhaustion test
asserting every cell ends with evaluated: true and partial: false.
- Add mode-dependent ordering test (max-count visits Climate Finance
before Corporate Governance in fall3).
- Add evaluated-flag transition test.
- Add throttled progress-event test (>= ~100ms between consecutive
emits).
- Performance smoke updated to a 60s budget for the exhaustive
user-scenario search; 8-open-set typical case completes in ~7s.
Files: solver/decisionTree.ts, solver/priority.ts (already shipped),
data/specColors.ts (new), components/{TopPlans,CourseSelection}.tsx,
state/appState.ts, workers/decisionTree.worker.ts,
__tests__/searchDecisionTree.test.ts, vite.config.ts, CHANGELOG.md,
openspec/changes/decision-tree-exhaustive-search/* (full change spec).
8.8 KiB
Context
v1.3.0 introduced a streaming decision-tree search (searchDecisionTree) with two early-termination criteria: a hard iteration cap (MAX_TREE_ITERATIONS = 10000) and a saturation criterion (SATURATION_LIMIT = 500 iterations of no top-K change). The saturation criterion fires too eagerly because the top-K updates only when a strictly better outcome is inserted; many leaves produce duplicate outcome classes that don't update top-K but still increment the saturation counter.
Two visible consequences in the user's testing:
- Per-set ceiling table shows "0 specs" for many courses. Root cause: cells are initialized with
{count: 0, specs: []}and only updated when a leaf containing them is evaluated. The DFS, ordered by the priority-target heuristic, exhausts target-favoring branches first. Saturation fires before the DFS backtracks to non-target choices in early sets. - Top Plans in maximize-count mode shows only 2-spec outcomes when 3-spec combinations exist. Root cause: the same saturation fires before the search reaches the part of the tree where 3-spec-feasible combinations live (typically generalist-course-heavy combinations).
Investigation in app/src/solver/decisionTree.ts:204-225 confirmed both behaviors. A diagnostic showed the user's reproduction case (8 open sets, 49,152 leaves) saturates at ~500 iterations, leaving most cells unevaluated.
Goals / Non-Goals
Goals:
- Per-set ceiling cells reflect the true best outcome for every (set, course) pair after search completes
- Top-K reflects the genuinely-best plans achievable for the given pin/ranking, in either mode
- UI distinguishes "still searching" from "search complete, this course achieves nothing"
- Search remains responsive: high-quality results appear in the stream within the first ~hundred iterations even though full search takes seconds
Non-Goals:
- Sub-second exhaustive search for the 8-open-set worst case (50K leaves × ~1ms LP = ~50s is acceptable in a worker)
- Replacing the LP solver or re-architecting the optimizer
- Caching across runs
- Configurable iteration cap from the UI
Decisions
Drop saturation termination entirely (not "make it smarter")
The user explicitly chose Approach A: exhaustive. Smarter saturation criteria (e.g., "stable for N iters AND every cell visited at least once") add complexity without reaching demonstrated correctness — there's always a pathological combination that defeats the heuristic. Exhaustive is simpler, demonstrably correct, and feasible in a worker.
Alternative considered: Two-phase search (fast heuristic + background exhaustive sweep). Rejected — added complexity (phase transition events, partial state), and the user's preference for "exhaustive" was explicit. The progress UI absorbs the same UX pain (user sees the search running) without the implementation cost.
Mode-dependent enumeration ordering
The current target-first heuristic biases toward priority-order mode's expected outcome (high-priority spec achieved early). For maximize-count mode, ordering by qualification breadth is a better fit because generalist courses lead to higher-count plans. Both heuristics are cheap to compute (one upper-bounds map lookup, then a per-course count) and run once per analysis.
For maximize-count, the score per course is count of (specId in course.qualifications) where upperBounds[specId] >= 9. Sorting children by this score descending puts the generalist courses first. Stable sort keeps declaration order on ties.
Alternative considered: A single unified ordering (e.g., always order by qualification breadth). Rejected — for priority-order mode, the user's priority is the meaningful signal; using breadth would suppress the priority spec early in the stream.
evaluated: boolean on ChoiceOutcome
Adds one boolean per (set, course) cell — negligible overhead. Cleaner than a sentinel value (e.g., ceilingCount = -1) that consumers might forget to handle and would need to be filtered everywhere ceilingCount is read. Boolean has obvious semantics in the UI ("not yet known" vs "known, value is 0").
Alternative considered: Omit cells entirely until evaluated. Rejected — the UI needs the courseName to render the row; restructuring to fetch course names from coursesBySet would push more responsibility into the renderer for no clear win.
Per-set "Recommended" derivation in UI, not in worker
The recommended choice is a function of analysis.choices and the comparator. Computing in the UI keeps the worker protocol simple (no new field), avoids a duplicate computation, and lets the UI re-render cheaply on each choiceUpdate.
The comparator: (ceilingCount desc, priorityScore desc). Same as the top-K. The "Recommended" course in a set is the one whose ceiling best matches the user's overall objective.
Throttled progress event (≈100ms)
Without throttling, the worker would emit a progress event per iteration — 50,000 events × 1ms = 50s of message overhead. With ~100ms throttling: ~500 events per search, each tiny. Implementation: track lastProgressEmit timestamp; emit if Date.now() - lastProgressEmit >= 100.
Alternative considered: No progress events; rely on topKUpdate for activity signal. Rejected — top-K updates fire only when something changes; long stretches of "exploring duplicates" would look like a frozen UI.
Mode-aware comparator (emerged during implementation)
After dropping saturation, exhaustive search surfaces 3-spec non-HCR plans that beat 2-spec HCR plans on the original (count, priorityScore) comparator. This conflicted with v1.3.0 spec scenarios that asserted HCR appears at topK[0] in priority-order mode with HCR ranked first. Resolution: make both the top-K and per-cell ceiling comparators mode-dependent.
priority-ordermode:(priorityScore desc, count desc, key asc)— surfaces the user's top-priority spec even when higher-count alternatives existmaximize-countmode:(count desc, priorityScore desc, key asc)— surfaces the maximum number of specs achievable
Both comparators share the deterministic assignmentKey tiebreaker for streaming stability. The CourseSelection "Recommended" badge uses the same mode-dependent rule so cell recommendations align with the top-K ranking.
Alternative considered: Keep the count-first comparator and let exhaustive search reveal high-count alternatives. Rejected — contradicts user-stated intent ("HCR top priority should surface") and breaks v1.3.0 spec scenarios.
MAX_TREE_ITERATIONS = 100,000
Empirically, 8-open-set worst case is ~50K leaves. 100K provides 2× headroom. Larger scenarios (10+ open sets) would still be capped, with partial: true displayed. The fallback to "empty choices" already exists for openSetIds.length > MAX_OPEN_SETS_FOR_ENUMERATION (= 9), so this cap rarely fires in practice.
Risks / Trade-offs
- 50s search feels slow → Progress UI + streamed top-K make it feel active; user can adopt a plan partway through if they like what's shown
- Worker CPU usage during search → Acceptable; runs in a worker thread, doesn't block UI; user can change pins to abort and restart
- Throttled progress means iteration count "jumps" → Cosmetic only; UI doesn't depend on monotonic small steps
evaluated: falseinitial state for every cell → Slightly verbose payloads; choiceUpdate already sends the full set's choices array, so the change is one boolean field per cell (negligible)- Mode switch mid-search → Current behavior already terminates and restarts the worker on any pin/ranking/mode change; unchanged
- Tests need amendments → Saturation tests removed (3 tests); exhaustion test added; mode-ordering test added; per-cell evaluated transition test added
Migration Plan
Single-PR change. No data migration. Steps:
- Algorithm + worker + state changes; tests updated
- UI updates: per-cell evaluated rendering, per-set spinner, global progress, recommended badge
- Browser-verify both modes against the v1.3.0 reproduction scenario; confirm exhaustive search completes and all cells are populated
- Bump version (
1.3.1); CHANGELOG entry; ship
Rollback: revert the change; v1.3.0 behavior restored. No persistent state to migrate.
Open Questions
- Recommended badge for ties — if two choices in a set have identical
(count, priorityScore), currently the comparator's deterministic tiebreaker (assignmentKey) picks one. UI shows just one Recommended. Acceptable for v1; could be revisited if confusing. - Should "Recommended" still show before search completes — derived from current ceilings, so it updates as the search streams. Possibly confusing if the recommendation flips mid-search. Initial behavior: show as soon as any choice has
evaluated: true; let it update with the stream. - Future: progressive
partialflag during search — out of scope. Today,partialonly matters at the cap, which fires rarely.