v1.3.0: Streamed top-K decision-tree plans + priority-aware ceiling

Fixes the bug where a specialization could show "Achievable" while no per-set ceiling cell surfaces a path to it.

Reproduction: pin SP2=Business of Health & Medical Care, SP4=Foundations of Fintech, SP5=Corporate Finance, SE1=GIE; rank HCR first. Healthcare showed Achievable but every ceiling cell excluded HCR.

Root cause: computeCeiling used strict > on count alone, so the first equal-count combination found won permanently and HCR-including outcomes were never recorded.

Changes:
- Replace per-(set, choice) computeCeiling loop with a single full-tree searchDecisionTree DFS. Both the per-set ceiling table and a new ranked top-K plan list (default K=10) are populated from one enumeration.
- Comparison rule everywhere is (count desc, priority score desc, deterministic-tiebreak). priorityScore extracted from optimizer.ts into a shared priority.ts module used by both call sites.
- Heuristic enumeration ordering: select the first reachable ranked spec as priorityTarget; reorder DFS children at every level so target-qualifying courses are tried first. High-priority outcomes surface in early iterations instead of being blocked by less-relevant equal-count results.
- Bounded search: terminate on saturation (top-K stable for 500 iterations) or hard cap (10000 iterations); set partial=true if cap hit. Mitigates the worst-case enumeration cost.
- Worker protocol: tagged-union response with topKUpdate, choiceUpdate (per-cell, replaces per-set setComplete), and allComplete events.
- App state adds topPlans/topPlansPartial slices and an adoptPlan action that pins a plan's full course assignment in one click. Also fixes loadState's stale "ranking.length !== 14" check (now uses SPECIALIZATIONS.length so HCR-era saved state restores correctly).
- New TopPlans component renders the ranked list with adopt buttons, placed above CourseSelection in the right column.
- 17 new tests in searchDecisionTree.test.ts covering priority scoring, bounded ranked list, comparison rule, target selection, the user's reproduction scenario, streaming monotonicity, saturation termination, and a performance smoke test (< 5s for the 8-open-set case).
- Existing decisionTree.test.ts: one test amended for per-cell streaming semantics; remaining 3 unchanged and passing.

@@ -0,0 +1,130 @@
## Context

The EMBA Specialization Solver's "Decision Tree" view computes, for each open elective set, the ceiling outcome (best achievable specialization count and which specs) for each course choice. Implementation: `analyzeDecisionTree` (`app/src/solver/decisionTree.ts:90`) runs a per-(set, choice) loop calling `computeCeiling`, which itself enumerates the cartesian product of remaining open sets, runs the optimizer per leaf, and returns the best result by count.

After adding the Healthcare specialization (J27 update), a contradiction surfaced: HCR shows status "Achievable" but no per-set ceiling cell shows HCR as part of its outcome. Reproduction:

```
Pin: SP2=spr2-health-medical, SP4=spr4-fintech,
     SP5=spr5-corporate-finance, SE1=sum1-global-immersion
Rank: HCR first
Result: HCR status = 'achievable' (upper bound = 10 ≥ 9)
Decision tree: 0 of 32 ceilings include HCR
```

Diagnostic test confirmed: `priorityOrder` returns `[HCR, BNK]` when fed an HCR-friendly 12-course pin set, so HCR genuinely *is* achievable. The bug is in `computeCeiling`'s comparison (`decisionTree.ts:55`):

```ts
// Strict `>` on count alone: an equal-count result never replaces the current best.
if (result.achieved.length > bestCount) {
  bestCount = result.achieved.length;
  bestSpecs = result.achieved;
}
```

Strict `>` means the first equal-count result found wins permanently. Combined with declaration-order enumeration, finance-heavy combinations (which appear early in the tree) yield non-HCR `[FIN, MTO]` outcomes that block HCR-including outcomes from ever being recorded.

The user also wants a richer view than per-set ceilings: a streamed ranked list of complete plans (`PlanOutcome`s, top K=10), each with its full course assignment, achieved specs, and priority score, so they can pick a complete plan rather than reasoning about set choices independently.

## Goals / Non-Goals

**Goals:**

- Decision-tree outcomes that include the user's top-priority spec surface naturally, both in the per-set table and in a new ranked top-K plan list
- One enumeration produces both views (no duplicated work)
- Both views update progressively with monotonic improvement (entries only enter or move up)
- Search is bounded: terminates on saturation (top-K stable) or hard iteration cap, with a `partial` flag if cap hit
- "Achievable" status stays permissive (per user's intent: it indicates reachability anywhere in the tree, regardless of whether a path has been found)

**Non-Goals:**

- Replacing the per-set ceiling table (both views remain)
- Restructuring the optimizer or LP feasibility checker
- Changing optimizer score weights or rank tiebreakers
- Designing the visual placement of the new "Top Plans" panel (out of scope here, follow-up brainstorm)
- User-configurable K (fixed at 10 for this change)

## Decisions

### Single full-tree DFS instead of nested per-choice loop

Today's structure: outer loop over (setId, choice), each calling `computeCeiling`, which itself enumerates remaining sets. That is `O(sets × choices × ∏ other-sets-courses)` work, much of it redundant: every full path is enumerated up to `setCount` times.

New structure: one DFS over the cartesian product of all open-set courses. Each leaf evaluates the optimizer once. Per-set ceilings update as side effects ("for each (setId, courseId) in this combination, is this leaf's outcome better than the current ceiling for that cell?"). Top-K updates as side effects too.

**Alternative considered:** Keep the nested loop and just fix the comparison. Rejected: the algorithm needs to materialize complete plans anyway for the top-K view, and the nested loop's per-choice context isn't useful for that. Switching paradigms is cleaner than bolting top-K onto two enumeration layers.

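
A minimal sketch of the single-pass idea, assuming a hypothetical `evaluate` callback that stands in for one optimizer run and a simplified `Outcome` shape (neither is the solver's real API). It shows each leaf being evaluated once and then credited to every (setId, courseId) cell on its path; the top-K update is omitted for brevity:

```typescript
// Hypothetical shapes for illustration only; the real types live in the solver.
type Outcome = { count: number; priorityScore: number };
type OpenSet = { setId: string; courseIds: string[] };

// Depth-first walk over the cartesian product of open-set choices.
// Returns the best outcome seen per cell, keyed by "setId/courseId".
function searchAll(
  openSets: OpenSet[],
  evaluate: (assignment: Map<string, string>) => Outcome,
): Map<string, Outcome> {
  const ceilings = new Map<string, Outcome>();
  const assignment = new Map<string, string>();

  // Count first, then priority score (deterministic tiebreak left out here).
  const better = (a: Outcome, b: Outcome): boolean =>
    a.count !== b.count ? a.count > b.count : a.priorityScore > b.priorityScore;

  const visit = (depth: number): void => {
    if (depth === openSets.length) {
      const outcome = evaluate(assignment); // exactly one evaluation per leaf
      assignment.forEach((courseId, setId) => {
        const key = `${setId}/${courseId}`;
        const current = ceilings.get(key);
        if (current === undefined || better(outcome, current)) {
          ceilings.set(key, outcome);
        }
      });
      return;
    }
    const { setId, courseIds } = openSets[depth];
    for (const courseId of courseIds) {
      assignment.set(setId, courseId);
      visit(depth + 1);
    }
    assignment.delete(setId); // restore state before returning to the parent
  };

  visit(0);
  return ceilings;
}
```

With two open sets of two courses each, `evaluate` runs four times and all four cells receive a ceiling, whereas the nested-loop structure would walk each full path once per set.
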
### Comparison tuple `(count, priorityScore, deterministic-tiebreak)`

`priorityScore(specs, ranking)` matches the optimizer's existing definition (`optimizer.ts:71-74`): `sum over specs of (15 - rankIndex(spec))`. Same formula in both modules to avoid drift; extracted into a shared utility.

Tiebreaker on a deterministic hash of `courseAssignments` ensures streaming order is stable across runs and across worker restarts. Without it, two equally-ranked plans could "swap" position on every emit, causing UI flicker.

**Alternative considered:** Compare only `(count, priorityScore)` and accept whichever inserted first when equal. Rejected: non-deterministic order makes monotonicity tests unstable and produces visible flicker if two plans tie.

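
The comparison rule can be sketched as below. Several details are assumptions rather than facts from the codebase: `rankIndex` is taken to be the 0-based position in the ranking, unranked specs contribute nothing, the `Plan` shape is hypothetical, and a canonical sorted string of `courseAssignments` is used in place of the hash (any deterministic total order serves the same purpose):

```typescript
// Hypothetical plan shape for illustration.
type Plan = {
  achieved: string[];                        // spec ids achieved by this plan
  courseAssignments: Record<string, string>; // setId -> courseId
};

// Formula quoted in the text: sum over specs of (15 - rankIndex(spec)).
// Assumption: rankIndex is the 0-based index into `ranking`.
function priorityScore(specs: string[], ranking: string[]): number {
  let score = 0;
  for (const spec of specs) {
    const idx = ranking.indexOf(spec);
    if (idx >= 0) score += 15 - idx; // assumption: unranked specs add 0
  }
  return score;
}

// Canonical key: "setId=courseId" pairs sorted by setId.
// Stands in for the deterministic hash named in the text.
function assignmentKey(a: Record<string, string>): string {
  return Object.keys(a)
    .sort()
    .map((k) => `${k}=${a[k]}`)
    .join(";");
}

// Negative result means `a` ranks ahead of `b`.
function comparePlans(a: Plan, b: Plan, ranking: string[]): number {
  if (a.achieved.length !== b.achieved.length) {
    return b.achieved.length - a.achieved.length; // count, descending
  }
  const pa = priorityScore(a.achieved, ranking);
  const pb = priorityScore(b.achieved, ranking);
  if (pa !== pb) return pb - pa; // priority score, descending
  const ka = assignmentKey(a.courseAssignments);
  const kb = assignmentKey(b.courseAssignments);
  return ka < kb ? -1 : ka > kb ? 1 : 0; // deterministic tiebreak
}
```

Under these assumptions, with HCR ranked first and BNK second, a `[HCR, BNK]` plan scores 15 + 14 = 29 and outranks an equal-count `[FIN, MTO]` plan whose specs sit lower in the ranking, which is the case the strict `>` comparison could not distinguish.
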
### `priorityTarget` heuristic = first reachable spec in user's ranking

Selected once per `analyzeDecisionTree` call. We walk the ranking in order and pick the first specId whose `upperBound >= 9`. If no spec is reachable, `priorityTarget = null` and reordering is skipped (no-op).

Why "reachable" and not just `ranking[0]`: if the user's #1 spec has no possible path to 9 credits given the pinned + open universe, prioritizing it would just delay finding good results. Walking to the first reachable one is cheap (one upper-bound array lookup per spec).

**Alternative considered:** Always use `ranking[0]` regardless of reachability. Rejected: it wastes the heuristic on impossible specs in cases where the user has a long ranking and their top picks are gated by missed required courses.

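
As a rough illustration of the selection rule, assuming upper bounds are available as a plain specId-to-credits record (a hypothetical shape; the text only says the lookup is an array access) and that a missing entry means unreachable:

```typescript
const CREDIT_THRESHOLD = 9; // the reachability threshold named in the text

// Returns the first spec in ranking order whose upper bound reaches the
// threshold, or null when none does (reordering is then skipped).
function pickPriorityTarget(
  ranking: string[],
  upperBound: Record<string, number>, // hypothetical shape: specId -> max credits
): string | null {
  for (const specId of ranking) {
    const bound = upperBound[specId] ?? 0; // assumption: missing entry = unreachable
    if (bound >= CREDIT_THRESHOLD) return specId;
  }
  return null;
}
```

For example, `pickPriorityTarget(["HCR", "FIN"], { HCR: 6, FIN: 12 })` skips HCR and selects FIN, while an upper bound of 10 for HCR (as in the reproduction) selects HCR.
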
### Heuristic ordering of DFS children

Per open set, courses qualifying for `priorityTarget` move to the front (stable sort, ties keep declaration order). Cancelled courses are still skipped (existing behavior).

As a result, the first combinations evaluated include all `priorityTarget`-qualifying choices simultaneously. With the user's ranking (HCR first), the optimizer evaluates an HCR-feasible pin set on iteration 1 and inserts an HCR-achieving outcome immediately into top-K and the relevant per-set ceilings.

**Alternative considered:** Branch-and-bound style pruning. Rejected: significantly more code, harder to verify correct, and the simple reordering already gives roughly an order-of-magnitude speedup for the common case.

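
The reordering amounts to a stable partition. A minimal sketch, where `qualifies` is a hypothetical predicate standing in for "this course counts toward `priorityTarget`" and cancelled-course filtering is assumed to have happened already:

```typescript
// Moves qualifying courses to the front while keeping declaration order
// within each group (a stable partition, equivalent to a stable sort on a
// two-valued key).
function orderForTarget(
  courseIds: string[],
  qualifies: (courseId: string) => boolean,
): string[] {
  const front: string[] = [];
  const back: string[] = [];
  for (const id of courseIds) {
    (qualifies(id) ? front : back).push(id);
  }
  return front.concat(back);
}
```

When `priorityTarget` is null the caller would skip this step entirely; passing a predicate that always returns false gives the same result, since the input order comes back unchanged.
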
### Two complementary terminators: hard cap + saturation

- `MAX_TREE_ITERATIONS = 10000`: absolute upper bound. Returns `{ partial: true }` if hit.
- `SATURATION_LIMIT = 500`: stop if top-K hasn't changed in the last 500 iterations.

Saturation handles the typical case (top-K converges quickly with the heuristic). Hard cap handles pathological cases (large open-set count, long search space).

**Alternative considered:** Time-based cap (e.g., 5000ms). Rejected: JS time measurement in a worker is fiddly, and iteration count is a more deterministic test surface. Time cap could be added later if needed.

**Alternative considered:** Run to exhaustion. Rejected: for ≥8 open sets the cartesian product is in the tens of thousands; full enumeration takes seconds to minutes and provides diminishing returns once top-K saturates.

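
How the two terminators interact can be sketched with a driver loop. The `step` callback is hypothetical: it evaluates one leaf and reports whether top-K changed, or returns `null` once the tree is exhausted. Treating a saturation stop as non-partial follows the commit message ("set partial=true if cap hit") and is otherwise an assumption:

```typescript
const MAX_TREE_ITERATIONS = 10000;
const SATURATION_LIMIT = 500;

type SearchResult = { iterations: number; partial: boolean };

// Drives the search until exhaustion, saturation, or the hard cap.
function runBounded(step: () => boolean | null): SearchResult {
  let iterations = 0;
  let sinceLastChange = 0;
  while (iterations < MAX_TREE_ITERATIONS) {
    const changed = step();
    if (changed === null) {
      return { iterations, partial: false }; // tree fully enumerated
    }
    iterations++;
    sinceLastChange = changed ? 0 : sinceLastChange + 1;
    if (sinceLastChange >= SATURATION_LIMIT) {
      return { iterations, partial: false }; // top-K stable: saturated
    }
  }
  return { iterations, partial: true }; // hard cap reached
}
```

Counting iterations rather than elapsed time keeps the stop conditions reproducible in tests, which is the reason the text gives for rejecting a time-based cap.
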
### `BoundedRankedList<T>` as a sorted array, not a heap

K ≤ 50 in practice. Insertion sort is `O(K)` per insert. A heap would shave a constant factor but complicates the "did the visible list change?" check (which drives the streaming emits). The simpler structure is fast enough and easier to reason about.

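
One possible shape for such a list, offered as a sketch rather than the actual class: the method names, the comparator convention (negative means "ranks first"), and the absence of duplicate detection are all assumptions. The point it illustrates is that `insert` can answer the "did the visible list change?" question directly:

```typescript
class BoundedRankedList<T> {
  private items: T[] = [];
  private readonly capacity: number;
  private readonly compare: (a: T, b: T) => number; // negative: a ranks first

  constructor(capacity: number, compare: (a: T, b: T) => number) {
    this.capacity = capacity;
    this.compare = compare;
  }

  // Inserts in sorted position; returns true when the visible list changed.
  insert(item: T): boolean {
    // Linear scan from the tail: O(K), which is fine for small K.
    let i = this.items.length;
    while (i > 0 && this.compare(item, this.items[i - 1]) < 0) i--;
    if (i >= this.capacity) return false; // worse than every entry of a full list
    this.items.splice(i, 0, item);
    if (this.items.length > this.capacity) this.items.pop(); // evict the worst
    return true;
  }

  // Copy of the current ranking, best first.
  toArray(): T[] {
    return this.items.slice();
  }
}
```

Because the array is always sorted, a rejected insert leaves it untouched and an accepted one only shifts entries down, which matches the monotonic "entries only enter or move up" goal for the entries that remain.
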
### Worker emits per-cell `choiceUpdate`, not per-set `setComplete`

Today, the worker emits one event when an entire set's analysis finishes. Under streaming, a set's ceilings update incrementally as combinations are evaluated. Per-cell events let the UI re-render exactly the changed cell instead of re-rendering the whole set's row.

**Alternative considered:** Coalesce per-set events on a 100ms timer. Rejected for now: per-cell is simpler and the message volume (a few hundred events per analysis, each <1KB) is well within worker `postMessage` throughput. Coalescing can be added later in the UI layer if needed.

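
A sketch of what the tagged-union response and a consumer might look like. The three event names (`topKUpdate`, `choiceUpdate`, `allComplete`) come from the commit message; every field name, the `CellCeiling` and `TreeViewState` shapes, and the `applyResponse` reducer are hypothetical:

```typescript
// Hypothetical payload shapes for illustration.
type PlanOutcome = {
  courseAssignments: Record<string, string>;
  achieved: string[];
  priorityScore: number;
};
type CellCeiling = { count: number; specs: string[] };

// Tagged union: the `type` field discriminates the three events.
type WorkerResponse =
  | { type: "topKUpdate"; plans: PlanOutcome[] }
  | { type: "choiceUpdate"; setId: string; courseId: string; ceiling: CellCeiling }
  | { type: "allComplete"; partial: boolean };

type TreeViewState = {
  topPlans: PlanOutcome[];
  ceilings: Record<string, CellCeiling>; // key: "setId/courseId"
  done: boolean;
  partial: boolean;
};

// Pure reducer: each message touches only the slice it concerns, so a
// choiceUpdate changes exactly one cell.
function applyResponse(state: TreeViewState, msg: WorkerResponse): TreeViewState {
  switch (msg.type) {
    case "topKUpdate":
      return { ...state, topPlans: msg.plans };
    case "choiceUpdate":
      return {
        ...state,
        ceilings: { ...state.ceilings, [`${msg.setId}/${msg.courseId}`]: msg.ceiling },
      };
    case "allComplete":
      return { ...state, done: true, partial: msg.partial };
  }
}
```

On the main thread this would typically sit behind the worker's `onmessage` handler; the discriminated union lets the compiler check that every event kind is handled.
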
### "Achievable" status semantics unchanged

Per user's stated intent: "Achievable" should mean "the spec is reachable somewhere in the remaining decision tree, regardless of priority." The current implementation (`optimizer.ts:185-194`) already does this: it checks the upper bound and returns `achievable` when open sets exist, without verifying joint feasibility with achieved specs.

This change preserves those semantics. The UX contradiction the user reported ("Achievable but no path shows it") is fixed by making the top-K and per-set views actually find the path, not by tightening the status check.

## Risks / Trade-offs

- **Performance regression risk** → Mitigation: heuristic ordering should make typical case faster than today (saturates well before hard cap); performance smoke test verifies user's scenario completes in <5s for K=10 in worker
- **Worker message volume** (50–500 small events per analysis) → Mitigation: each event <1KB; UI can coalesce with `requestAnimationFrame` if profiling shows main-thread pressure; defer
- **Stable streaming order** depends on deterministic hash of `courseAssignments` → Mitigation: explicit tiebreaker test; document the hash function as part of the public contract
- **Two views displaying inconsistent info briefly** during streaming (top-K shows HCR plan, per-set table cell still shows old ceiling for one beat) → Acceptable; both converge on the same data within a few hundred ms
- **K=10 fixed** → User-facing limitation; if 10 isn't enough we can ship a follow-up making it configurable. Defer.

## Migration Plan

Single-PR change. No data migration. Steps:

1. Land algorithm + worker + state changes; new "Top Plans" component starts hidden behind a feature flag (or simply absent from the layout), with the user-facing UI added in a sibling commit/PR
2. Verify all existing decision-tree tests pass (with priority-tiebreak amendments)
3. Verify regression test for user's scenario passes
4. Add Top Plans component to layout
5. Browser-verify both views update progressively
6. Bump version (`1.3.0`), CHANGELOG entry, ship

Rollback: revert. The change is internal to the decision-tree module and worker protocol; no persistent state to migrate back.

## Open Questions

- **UI layout** for the Top Plans panel: handled in a follow-up brainstorm focused on UX
- **`MAX_TREE_ITERATIONS = 10000` / `SATURATION_LIMIT = 500`**: initial values; may need tuning after browser-side measurement on representative inputs
- **Worker message coalescing**: defer until profiling shows it's needed