Files
emba-course-solver/openspec/changes/decision-tree-leaf-cache/tasks.md
T
Bill ee7ea352c4 v1.3.2: Leaf cache for instant pin/unpin + TopPlans block UX
Decision-tree leaf outcomes are now cached on the main thread keyed by
their full 12-course assignment. Pin operations filter the cache and
re-derive top-K + per-set ceilings instantly with no worker spawn. Unpin
operations show the cached subset immediately and stream improvements as
a background worker fills in the missing leaves. Cache survives pin,
unpin, and adopt-plan; only ranking or mode changes invalidate it.

Solver / worker:

- searchDecisionTree accepts skipKeys (Set<string>) and pinnedAssignments
  (Record<setId,courseId>). Leaves are emitted with their full 12-set
  assignment so cache keys are stable across pin/unpin operations.
- evaluateLeaf short-circuits when the leaf's assignmentKey is in
  skipKeys: increments iterations + emits progress, but skips the
  optimizer call and all callbacks. Keeps progress percentage honest
  (counts whole tree, not just delta).
- New deriveFromLeaves pure helper produces {topK, setAnalyses} from a
  leaf collection; used by the main-thread cache filter and gives a
  reusable derivation primitive for tests.
- Worker request gains skipKeys and pinnedAssignments fields. Worker
  response gains a leafEvaluated event so the main thread can populate
  its cache as the search streams.

App state:

- leafCacheRef holds Map<assignmentKey, PlanOutcome> scoped to the
  current (ranking, mode) pair. The search effect now: invalidates on
  ranking/mode change; computes the orderedCourses + expectedTotal;
  filters the cache against the current pinned/excluded state; calls
  deriveFromLeaves to render immediately; spawns the worker only when
  filtered.length < expectedTotal, passing skipKeys.
- Cache cap of 500,000 leaves with full clear on overflow. Bounds
  worst-case memory at ~150 MB.

UI (TopPlans):

- Course blocks in the per-plan row are now interactive buttons. Click
  pins (or unpins, if the course is currently pinned) the course in
  that set. Pinned blocks render in a selected blue color.
- Each plan row now shows the FULL 12-set sequence including pinned
  courses (interleaved with the search's recommended choices for the
  remaining open sets) so the displayed plan is always complete.
- Spec qualification tags removed from per-block display (kept the
  set-label + course-name treatment for clarity).

Tests:

- New app/src/solver/__tests__/leafCache.test.ts with 4 tests:
  skipKeys parity (second-pass run with skipKeys evaluates zero
  leaves), deriveFromLeaves parity (matches a fresh search), cache
  filter on pinned assignments, cache filter on excluded courses.
- All 78 prior tests continue to pass; 82 total.

Browser-verified: pin click on a Top Plans block from the cached
8-open-set scenario completes instantly with no spinner; unpin restores
the original cached subset (also instant when the prior space was
already cached); mode toggle correctly invalidates and re-runs the
search.
2026-05-09 16:27:52 -04:00

62 lines
6.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
## 1. Solver: skipKeys + deriveFromLeaves
- [x] 1.1 In `app/src/solver/decisionTree.ts`, extend `SearchCallbacks` with no new fields (callbacks unchanged); extend `searchDecisionTree` signature to accept `skipKeys?: Set<string>` as a new optional parameter
- [x] 1.2 In `evaluateLeaf`, after computing `aKey = assignmentKey(accumulated)`, short-circuit when `skipKeys?.has(aKey)`: increment `iterations`, call `emitProgress()`, then return without invoking the optimizer or any callback. Per-set ceiling and `evaluated` flag updates are skipped — the main thread already has those values from the cached path
- [x] 1.3 Implement `export function deriveFromLeaves(leaves: Iterable<PlanOutcome>, K: number, mode: OptimizationMode, ranking: string[], openSetIds: string[], excludedCourseIds?: Set<string>): { topK: PlanOutcome[]; setAnalyses: SetAnalysis[] }`
- Initialize `setAnalyses` for every `setId` in `openSetIds` using the same per-mode reorder helpers (`reorderForTarget` / `reorderByReachableQualCount`)
- For each leaf, run the same per-set ceiling-update loop already in `evaluateLeaf`, plus `topK.tryInsert`
- Return the same shape `searchDecisionTree` returns minus `iterations`/`partial`
- [x] 1.4 Refactor `searchDecisionTree` to call `deriveFromLeaves` for its own final emission (or keep the inline loop and ensure both paths produce identical output — must add a parity test either way)
## 2. Worker contract
- [x] 2.1 In `app/src/workers/decisionTree.worker.ts`, extend `WorkerRequest` with `skipKeys?: string[]`
- [x] 2.2 In the message handler, convert `skipKeys` to `Set<string>` and pass it through to `searchDecisionTree`
## 3. App state: cache + filter pipeline
- [x] 3.1 In `app/src/state/appState.ts`, add `leafCacheRef = useRef<{ ranking: string[]; mode: OptimizationMode; leaves: Map<string, PlanOutcome> }>({ ranking: [], mode: 'maximize-count', leaves: new Map() })`
- [x] 3.2 Add a helper `function shouldInvalidate(cache, ranking, mode): boolean` that returns true when ranking or mode has changed (use shallow equality on ranking via `JSON.stringify` or element-wise compare)
- [x] 3.3 Add `function filterCacheToCurrentState(cache, pinnedCourses, excludedCourseIds, openSetIds): PlanOutcome[]` that returns leaves where (a) every pinned set's assignment matches the leaf, (b) no excluded courses appear in the leaf's assignments, and (c) the leaf's assignment keys are exactly `openSetIds` (filters out leaves cached under a different set partition)
- [x] 3.4 Restructure the existing search effect:
- On every effect run, check `shouldInvalidate(cacheRef.current, ranking, mode)`. If true, clear `cacheRef.current.leaves` and update `ranking` / `mode` fields
- Compute `filtered = filterCacheToCurrentState(...)` and `expectedTotal = product over openSetIds of orderedCourses[setId].length` (use the same reorder helpers, mode-dependent)
- Compute `{ topK, setAnalyses } = deriveFromLeaves(filtered, ...)` and call all the existing `setX` setters to render immediately
- Set `searchProgress = { iterations: filtered.length, iterationsTotal: expectedTotal }`
- If `filtered.length === expectedTotal`: set `treeLoading=false`, return (no worker)
- Else: set `treeLoading=true`, debounce, spawn worker as today, BUT include `skipKeys: [...cacheRef.current.leaves.keys()]` in the request
- [x] 3.5 In the worker `onmessage` handler:
- On `topKUpdate`: insert any newly-seen leaves into the cache (worker emits leaves implicitly via topK; we may need a richer event — see 3.6)
- On `choiceUpdate`: insert any newly-seen leaves implicitly is hard; better to add an explicit leaf-emit event
- [x] 3.6 Add a new `WorkerResponse` event type `{ type: 'leafEvaluated'; leaf: PlanOutcome }` emitted from inside `evaluateLeaf` when the leaf is NOT skipped. This is what feeds the main-thread cache. Throttling: emit each leaf as its own event (small payload, ~300 bytes); existing topKUpdate/choiceUpdate already throttle the heavy work
- [x] 3.7 Update `appState`'s onmessage to handle `leafEvaluated`: insert into cache; if cache size > 500_000 (after insert), clear it
- [x] 3.8 On `allComplete`, ensure final `topK` / `setAnalyses` come from the worker (which had the full picture) — don't second-guess from cache
## 4. Cache cap
- [x] 4.1 Define `const LEAF_CACHE_CAP = 500_000` near the cache ref
- [x] 4.2 In the `leafEvaluated` handler, after insertion, check size; if `> LEAF_CACHE_CAP` then `cache.leaves.clear()` (subsequent searches behave as v1.3.1)
## 5. Tests
- [x] 5.1 In `app/src/solver/__tests__/searchDecisionTree.test.ts`: add a test that runs `searchDecisionTree` twice on the same scenario, capturing all assignmentKeys from the first run, then passing them as `skipKeys` to the second run. Assert the second run's `iterations` equals `iterationsTotal` (all visited) and that the optimizer was not called for skipped leaves (use a counter via the optimizer mock or by asserting timing — second run should be at least 50× faster)
- [x] 5.2 Add a test for `deriveFromLeaves`: run a small `searchDecisionTree`, capture all leaves via a `leafEvaluated`-style hook (or by augmenting the search result), call `deriveFromLeaves` with the same inputs, assert the output `topK` and `setAnalyses` match the search's
- [x] 5.3 Add a test for `filterCacheToCurrentState`: build a small synthetic cache, filter for various pinned/excluded states, assert the filtered subset is correct
- [x] 5.4 Add a test for cache-cap eviction: synthesize 500_001 leaf insertions; assert cache is cleared after the threshold is crossed
- [x] 5.5 Add a test for invalidation: change ranking, then mode; assert cache is empty after each
- [x] 5.6 Run full suite; confirm 78+ existing tests still pass
## 6. Browser verification
- [x] 6.1 Start dev server. Pin a course and observe: no "searching" spinner appears, top-K and per-set ceilings update instantly
- [x] 6.2 Adopt-plan a complete plan (8 pins): instant
- [x] 6.3 Unpin a course: cached subset renders immediately; per-set spinner + global progress bar appear; results refine over a few seconds
- [x] 6.4 Toggle mode: full re-search runs (cache invalidated)
- [x] 6.5 Re-order ranking: full re-search runs (cache invalidated)
- [x] 6.6 Verify no console errors; verify memory in devtools stays bounded (<200 MB heap for typical use)
## 7. Version + changelog
- [x] 7.1 Bump `__APP_VERSION__` to `1.3.2` and `__APP_VERSION_DATE__` in `app/vite.config.ts`
- [x] 7.2 Add `## v1.3.2` entry to `CHANGELOG.md` describing: leaf caching, instant pin/unpin, partial-hit streaming on unpin, 500k cap