emba-course-solver/openspec/changes/decision-tree-leaf-cache/specs/optimization-engine/spec.md
Bill ee7ea352c4 v1.3.2: Leaf cache for instant pin/unpin + TopPlans block UX
Decision-tree leaf outcomes are now cached on the main thread keyed by
their full 12-course assignment. Pin operations filter the cache and
re-derive top-K + per-set ceilings instantly with no worker spawn. Unpin
operations show the cached subset immediately and stream improvements as
a background worker fills in the missing leaves. Cache survives pin,
unpin, and adopt-plan; only ranking or mode changes invalidate it.

Solver / worker:

- searchDecisionTree accepts skipKeys (Set<string>) and pinnedAssignments
  (Record<setId,courseId>). Leaves are emitted with their full 12-set
  assignment so cache keys are stable across pin/unpin operations.
- evaluateLeaf short-circuits when the leaf's assignmentKey is in
  skipKeys: increments iterations + emits progress, but skips the
  optimizer call and all callbacks. Keeps progress percentage honest
  (counts whole tree, not just delta).
- New deriveFromLeaves pure helper produces {topK, setAnalyses} from a
  leaf collection; used by the main-thread cache filter and gives a
  reusable derivation primitive for tests.
- Worker request gains skipKeys and pinnedAssignments fields. Worker
  response gains a leafEvaluated event so the main thread can populate
  its cache as the search streams.

App state:

- leafCacheRef holds Map<assignmentKey, PlanOutcome> scoped to the
  current (ranking, mode) pair. The search effect now: invalidates on
  ranking/mode change; computes the orderedCourses + expectedTotal;
  filters the cache against the current pinned/excluded state; calls
  deriveFromLeaves to render immediately; spawns the worker only when
  filtered.length < expectedTotal, passing skipKeys.
- Cache cap of 500,000 leaves with full clear on overflow. Bounds
  worst-case memory at ~150 MB.

UI (TopPlans):

- Course blocks in the per-plan row are now interactive buttons. Click
  pins (or unpins, if the course is currently pinned) the course in
  that set. Pinned blocks render in a selected blue color.
- Each plan row now shows the FULL 12-set sequence including pinned
  courses (interleaved with the search's recommended choices for the
  remaining open sets) so the displayed plan is always complete.
- Spec qualification tags removed from per-block display (kept the
  set-label + course-name treatment for clarity).

Tests:

- New app/src/solver/__tests__/leafCache.test.ts with 4 tests:
  skipKeys parity (second-pass run with skipKeys evaluates zero
  leaves), deriveFromLeaves parity (matches a fresh search), cache
  filter on pinned assignments, cache filter on excluded courses.
- All 78 prior tests continue to pass; 82 total.

Browser-verified: pin click on a Top Plans block from the cached
8-open-set scenario completes instantly with no spinner; unpin restores
the original cached subset (also instant when the prior space was
already cached); mode toggle correctly invalidates and re-runs the
search.
2026-05-09 16:27:52 -04:00

## ADDED Requirements
### Requirement: Persistent leaf cache across pin and unpin operations
The application SHALL maintain a main-thread cache of evaluated decision-tree leaves keyed by the leaf's `assignmentKey` (the deterministic sorted `setId:courseId` join already used as the comparator tiebreaker). The cache SHALL persist across pin, unpin, and adopt-plan operations as long as `state.ranking` and `state.mode` are unchanged. Each cache entry SHALL store the full `PlanOutcome` (`courseAssignments`, `achievedSpecs`, `priorityScore`).
#### Scenario: Pin operation hits cache fully
- **WHEN** the user has completed a search with no pins on a small scenario, then pins a course
- **THEN** the new top-K and per-set ceilings are derived entirely from the cache without spawning a worker
- **AND** no "searching" indicators appear in the UI
#### Scenario: Cache survives consecutive pin clicks
- **WHEN** the user pins multiple courses one after another (or via "Adopt plan")
- **THEN** every pin produces an instant UI update sourced from the existing cache
#### Scenario: Unpin gets immediate cached subset and streams improvements
- **WHEN** the user unpins a course after a search has populated the cache
- **THEN** the UI immediately renders top-K and per-set ceilings derived from the cache subset matching the new state
- **AND** a worker spawns to compute the missing leaves
- **AND** as the worker streams new leaves, the UI's top-K and ceilings improve monotonically
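
As a minimal sketch of how the cache lookup in this requirement could work: the `|` separator, the `filterCache` and `planSearch` helpers, and the choice to send only the filtered keys as `skipKeys` are all assumptions made for illustration, not the project's actual code. Only the three `PlanOutcome` fields and the sorted `setId:courseId` key shape come from the requirement above.

```typescript
type Assignments = Record<string, string>; // setId -> courseId

interface PlanOutcome {
  courseAssignments: Assignments;
  achievedSpecs: string[];
  priorityScore: number;
}

// Deterministic key: "setId:courseId" pairs sorted by setId.
// The "|" separator is an assumption.
function assignmentKey(assignments: Assignments): string {
  return Object.keys(assignments)
    .sort()
    .map((setId) => `${setId}:${assignments[setId]}`)
    .join("|");
}

// Hypothetical helper: keep only cached leaves that agree with every pin
// and use no excluded course.
function filterCache(
  cache: Map<string, PlanOutcome>,
  pinned: Assignments,
  excluded: Set<string>,
): PlanOutcome[] {
  const kept: PlanOutcome[] = [];
  cache.forEach((leaf) => {
    const a = leaf.courseAssignments;
    const matchesPins = Object.keys(pinned).every((setId) => a[setId] === pinned[setId]);
    const usesExcluded = Object.keys(a).some((setId) => excluded.has(a[setId]));
    if (matchesPins && !usesExcluded) kept.push(leaf);
  });
  return kept;
}

// Hypothetical helper: decide whether a worker is needed at all.
// A pin after a complete search finds every leaf cached, so no worker spawns.
function planSearch(
  cache: Map<string, PlanOutcome>,
  pinned: Assignments,
  excluded: Set<string>,
  expectedTotal: number,
) {
  const filtered = filterCache(cache, pinned, excluded);
  return {
    filtered,
    spawnWorker: filtered.length < expectedTotal,
    skipKeys: filtered.map((leaf) => assignmentKey(leaf.courseAssignments)),
  };
}
```

Because each leaf is keyed by its full assignment rather than by the open sets alone, the same entry stays valid before and after a pin, which is what lets a pin be served by filtering instead of recomputation.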
### Requirement: `skipKeys` worker contract
The worker request SHALL accept an optional `skipKeys: string[]` field. The worker SHALL convert this list to a `Set<string>` and pass it to `searchDecisionTree`. Inside `evaluateLeaf`, leaves whose `assignmentKey` is in `skipKeys` SHALL be skipped: the optimizer SHALL NOT be invoked, no `topKUpdate` or `choiceUpdate` event SHALL be emitted for them, and the leaf SHALL NOT mutate per-set `evaluated` flags. Skipped leaves SHALL still increment the iteration counter so that throttled `progress` events report the total tree size, not just the delta.
#### Scenario: Worker bypasses optimizer for cached leaves
- **WHEN** the worker receives a request with `skipKeys` containing the keys of N cached leaves
- **THEN** the worker performs at most `(iterationsTotal - N)` optimizer evaluations
#### Scenario: Progress reports total tree size
- **WHEN** the worker is processing a request with `skipKeys` containing 50,000 keys out of an `iterationsTotal` of 200,000
- **THEN** progress events include `iterations` counting up to 200,000 (not 150,000) so the displayed percentage reflects whole-tree progress
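
The skip contract can be sketched as follows. This is an illustration under stated assumptions, not the real `evaluateLeaf`: the `SearchCounters` shape, the `runOptimizer` callback, and the `walkLeaves` driver are hypothetical stand-ins for the actual search internals.

```typescript
interface SearchCounters {
  iterations: number;     // every leaf visited, cached or not
  optimizerCalls: number; // only leaves that were actually evaluated
}

// Hypothetical stand-in for evaluateLeaf: the iteration counter always
// advances, but cached leaves return before any optimizer work or callbacks.
function evaluateLeafSketch(
  key: string,
  skipKeys: Set<string>,
  counters: SearchCounters,
  runOptimizer: (key: string) => void,
): void {
  counters.iterations += 1;
  if (skipKeys.has(key)) return;
  counters.optimizerCalls += 1;
  runOptimizer(key);
}

// Hypothetical driver: the request carries string[], converted to a Set once.
function walkLeaves(allKeys: string[], skipKeys: string[]): SearchCounters {
  const skip = new Set(skipKeys);
  const counters: SearchCounters = { iterations: 0, optimizerCalls: 0 };
  for (const key of allKeys) {
    evaluateLeafSketch(key, skip, counters, () => {});
  }
  return counters;
}
```

Counting skipped leaves in `iterations` keeps the progress denominator equal to the whole tree, so a resumed search starts at the fraction already cached rather than at zero.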
### Requirement: Cache invalidation on ranking, mode, or data change
The leaf cache SHALL be cleared when `state.ranking` changes, when `state.mode` changes, or when the underlying course/specialization data is changed (e.g., a course is marked cancelled). Pin/unpin operations SHALL NOT trigger cache invalidation.
#### Scenario: Mode toggle clears cache
- **WHEN** the user toggles between maximize-count and priority-order
- **THEN** the cache is emptied and the next search runs as a full recomputation
#### Scenario: Ranking re-order clears cache
- **WHEN** the user reorders the specialization ranking
- **THEN** the cache is emptied and the next search runs as a full recomputation
#### Scenario: Pin does not clear cache
- **WHEN** the user pins or unpins a course
- **THEN** the cache retains all previously evaluated leaves
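
One way to express this invalidation rule is to fingerprint the cache's scope and clear only when the fingerprint changes. The sketch below assumes a hypothetical `ScopedLeafCache` class and a numeric `dataVersion` that is bumped whenever course or specialization data changes; neither name comes from the codebase.

```typescript
type Mode = "maximize-count" | "priority-order";

class ScopedLeafCache<V> {
  private scope = "";
  readonly entries = new Map<string, V>();

  // Call before each search. Returns true when the cache was cleared because
  // ranking, mode, or the (hypothetical) data version changed.
  // Pins are deliberately not part of the fingerprint.
  syncScope(ranking: string[], mode: Mode, dataVersion: number): boolean {
    const next = JSON.stringify([ranking, mode, dataVersion]);
    if (next === this.scope) return false;
    this.entries.clear();
    this.scope = next;
    return true;
  }
}
```

Leaving pinned and excluded state out of the fingerprint is the whole point: those inputs only select a subset of leaves, while ranking, mode, and data change what each leaf's outcome is.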
### Requirement: Cache size cap
The leaf cache SHALL be cleared when inserting a new entry would take its size above 500,000 entries. Subsequent searches SHALL repopulate the cache from scratch.
#### Scenario: Cap clears cache when exceeded
- **WHEN** the cache is at 500,000 entries and a new search would add at least one more entry
- **THEN** the cache is emptied before the next entry is inserted, and the new search proceeds without `skipKeys`
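
A minimal sketch of the full-clear-on-overflow policy, assuming a hypothetical `insertWithCap` helper (the name, the return value, and the overridable `cap` parameter are illustrative only):

```typescript
const LEAF_CACHE_CAP = 500_000;

// Inserts a leaf, clearing the whole cache first if a new key would push it
// past the cap. Returns true when a clear happened, so the caller can drop
// any skipKeys it was about to send.
function insertWithCap<V>(
  cache: Map<string, V>,
  key: string,
  value: V,
  cap: number = LEAF_CACHE_CAP,
): boolean {
  let cleared = false;
  if (!cache.has(key) && cache.size >= cap) {
    cache.clear();
    cleared = true;
  }
  cache.set(key, value);
  return cleared;
}
```

A full clear is cruder than LRU eviction but needs no per-entry bookkeeping, which matters when the cache may hold hundreds of thousands of entries.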
### Requirement: `deriveFromLeaves` shared helper
The decision-tree module SHALL export a pure function `deriveFromLeaves(leaves, K, mode, ranking, openSetIds, excludedCourseIds): { topK, setAnalyses }` that produces the top-K plan list and per-set ceiling table from a collection of leaf outcomes. This helper SHALL be used both by the worker at `allComplete` and by the main thread when rendering filtered cache results.
#### Scenario: Helper output matches a fresh search
- **WHEN** `deriveFromLeaves` is called with the complete leaf set from a finished `searchDecisionTree` run
- **THEN** the returned `topK` and `setAnalyses` match the values that the search itself returned (modulo deterministic tiebreaker stability)
#### Scenario: Helper output is correct for filtered subsets
- **WHEN** `deriveFromLeaves` is called with a strict subset of cached leaves matching the user's current pinned/excluded state
- **THEN** the returned top-K and ceilings reflect only those leaves and never reference courses outside the filter
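
To illustrate the kind of pure derivation this requirement describes, here is a reduced sketch. It is not the project's `deriveFromLeaves`: it drops the `ranking` and `excludedCourseIds` parameters, and the comparator (spec count first in maximize-count mode, `priorityScore` first in priority-order mode, `assignmentKey` as the final tiebreaker) and the shape of `setAnalyses` (best leaf per set and course) are assumptions.

```typescript
type RankMode = "maximize-count" | "priority-order";

interface Leaf {
  assignmentKey: string;
  courseAssignments: Record<string, string>; // setId -> courseId
  achievedSpecs: string[];
  priorityScore: number;
}

// Assumed ordering: negative when `a` ranks ahead of `b`.
function compareLeaves(a: Leaf, b: Leaf, mode: RankMode): number {
  const countDiff = b.achievedSpecs.length - a.achievedSpecs.length;
  const scoreDiff = b.priorityScore - a.priorityScore;
  const primary = mode === "maximize-count" ? countDiff : scoreDiff;
  const secondary = mode === "maximize-count" ? scoreDiff : countDiff;
  if (primary !== 0) return primary;
  if (secondary !== 0) return secondary;
  if (a.assignmentKey === b.assignmentKey) return 0;
  return a.assignmentKey < b.assignmentKey ? -1 : 1;
}

// Pure: the input array is never mutated, so the same leaves always yield
// the same result regardless of who calls it.
function deriveSketch(leaves: Leaf[], K: number, mode: RankMode, openSetIds: string[]) {
  const topK = leaves
    .slice()
    .sort((a, b) => compareLeaves(a, b, mode))
    .slice(0, K);

  // Per-set ceilings: the best leaf seen for each course choice in each open set.
  const setAnalyses: Record<string, Record<string, Leaf>> = {};
  for (const setId of openSetIds) setAnalyses[setId] = {};
  for (const leaf of leaves) {
    for (const setId of openSetIds) {
      const courseId = leaf.courseAssignments[setId];
      if (courseId === undefined) continue;
      const best = setAnalyses[setId][courseId];
      if (best === undefined || compareLeaves(leaf, best, mode) < 0) {
        setAnalyses[setId][courseId] = leaf;
      }
    }
  }
  return { topK, setAnalyses };
}
```

Since the result depends only on the leaves passed in, feeding the helper a filtered cache subset can never surface a course outside that filter, which is the property the second scenario relies on.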