emba-course-solver/openspec/changes/decision-tree-leaf-cache/specs/optimization-engine/spec.md

## ADDED Requirements

### Requirement: Persistent leaf cache across pin and unpin operations
The application SHALL maintain a main-thread cache of evaluated decision-tree leaves keyed by each leaf's `assignmentKey` (the deterministic sorted `setId:courseId` join already used as the comparator tiebreaker). The cache SHALL persist across pin, unpin, and adopt-plan operations as long as `state.ranking`, `state.mode`, and the underlying course/specialization data are unchanged. Each cache entry SHALL store the full `PlanOutcome` (`courseAssignments`, `achievedSpecs`, `priorityScore`).
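
As an illustration only, the key and cache described above might look like the following sketch. The `CourseAssignment` shape, the `"|"` separator, and the `rememberLeaf` helper are assumptions made for the example; the spec fixes only the `setId:courseId` pairing, the sort, and the three `PlanOutcome` fields.

```typescript
// Hypothetical shapes: the spec names these fields but not their types.
type CourseAssignment = { setId: string; courseId: string };

interface PlanOutcome {
  courseAssignments: CourseAssignment[];
  achievedSpecs: string[];
  priorityScore: number;
}

// Deterministic key: format each pair as `setId:courseId`, sort, then join.
// The "|" separator is an assumption; the spec only says "join".
function assignmentKey(assignments: CourseAssignment[]): string {
  return assignments
    .map((a) => `${a.setId}:${a.courseId}`)
    .sort()
    .join("|");
}

// Main-thread cache. It is keyed by the leaf alone, so pinning or
// unpinning a course never changes which entry a leaf maps to.
const leafCache = new Map<string, PlanOutcome>();

// Hypothetical helper: store (or overwrite) one evaluated leaf.
function rememberLeaf(outcome: PlanOutcome): void {
  leafCache.set(assignmentKey(outcome.courseAssignments), outcome);
}
```

Because the pairs are sorted before joining, two leaves that list the same assignments in a different order map to the same cache entry.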

#### Scenario: Pin operation hits cache fully
- **WHEN** the user has completed a search with no pins on a small scenario, then pins a course
- **THEN** the new top-K and per-set ceilings are derived entirely from the cache without spawning a worker
- **AND** no "searching" indicators appear in the UI

#### Scenario: Cache survives consecutive pin clicks
- **WHEN** the user pins multiple courses one after another (or via "Adopt plan")
- **THEN** every pin produces an instant UI update sourced from the existing cache

#### Scenario: Unpin gets immediate cached subset and streams improvements
- **WHEN** the user unpins a course after a search has populated the cache
- **THEN** the UI immediately renders top-K and per-set ceilings derived from the cache subset matching the new state
- **AND** a worker spawns to compute the missing leaves
- **AND** as the worker streams new leaves, the UI's top-K and ceilings improve monotonically
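
The "cache subset matching the new state" used in the pin and unpin scenarios can be pictured as a filter over the cache. The sketch below is a minimal illustration under two assumptions that the spec does not state: pins are represented as a hypothetical `Map` from `setId` to the pinned `courseId`, and every leaf carries an assignment for each pinned set. The keys of the matching leaves are one plausible source for the worker's `skipKeys` list.

```typescript
type CourseAssignment = { setId: string; courseId: string };

// Only the field needed for filtering; a real entry would be a full PlanOutcome.
interface CachedLeaf {
  courseAssignments: CourseAssignment[];
}

// A leaf is compatible with the pins when no assignment contradicts a pin.
function matchesPins(
  leaf: CachedLeaf,
  pins: ReadonlyMap<string, string>,
): boolean {
  return leaf.courseAssignments.every((a) => {
    const pinned = pins.get(a.setId);
    return pinned === undefined || pinned === a.courseId;
  });
}

// Hypothetical helper: collect the cached leaves that are still valid for
// the current pins, together with their keys.
function cachedSubset<L extends CachedLeaf>(
  cache: ReadonlyMap<string, L>,
  pins: ReadonlyMap<string, string>,
): { leaves: L[]; skipKeys: string[] } {
  const leaves: L[] = [];
  const skipKeys: string[] = [];
  cache.forEach((leaf, key) => {
    if (matchesPins(leaf, pins)) {
      leaves.push(leaf);
      skipKeys.push(key);
    }
  });
  return { leaves, skipKeys };
}
```

Under these assumptions, adding a pin only shrinks the matching subset, which is why a pin after a completed search needs no worker, while removing a pin can admit leaves that were never evaluated.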

### Requirement: `skipKeys` worker contract
The worker request SHALL accept an optional `skipKeys: string[]` field. The worker SHALL convert this list to a `Set<string>` and pass it to `searchDecisionTree`. Inside `evaluateLeaf`, leaves whose `assignmentKey` is in `skipKeys` SHALL be skipped: the optimizer SHALL NOT be invoked, no `topKUpdate` or `choiceUpdate` event SHALL be emitted for them, and the leaf SHALL NOT mutate per-set `evaluated` flags. Skipped leaves SHALL still increment the iteration counter so that throttled `progress` events measure progress against the total tree size, not just the uncached remainder.
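
The skip rule can be sketched as follows. This is a simplified stand-in, not the real `evaluateLeaf`: the `WorkerRequest` and `SearchCounters` shapes, the `toSkipSet` helper, and the `runOptimizer` callback are hypothetical names introduced for the example, and event emission is reduced to a comment.

```typescript
// Hypothetical request shape: only `skipKeys` comes from the spec.
interface WorkerRequest {
  skipKeys?: string[];
}

interface SearchCounters {
  iterations: number; // every leaf visited, cached or not
  optimizerCalls: number; // only leaves that were actually evaluated
}

// Convert the optional array to a Set once, so each lookup is O(1).
function toSkipSet(request: WorkerRequest): Set<string> {
  return new Set(request.skipKeys ?? []);
}

// Returns true when the leaf was evaluated, false when it was skipped.
function evaluateLeaf(
  key: string,
  skipKeys: ReadonlySet<string>,
  counters: SearchCounters,
  runOptimizer: (key: string) => void,
): boolean {
  // Counted before the skip check so progress covers the whole tree.
  counters.iterations += 1;
  if (skipKeys.has(key)) {
    // Cached leaf: no optimizer call, no topKUpdate/choiceUpdate event,
    // and no change to per-set `evaluated` flags.
    return false;
  }
  counters.optimizerCalls += 1;
  runOptimizer(key);
  return true;
}
```

With N skipped keys, `optimizerCalls` ends at most at `iterationsTotal - N` while `iterations` still reaches `iterationsTotal`.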

#### Scenario: Worker bypasses optimizer for cached leaves
- **WHEN** the worker receives a request with `skipKeys` containing the keys of N cached leaves
- **THEN** the worker performs at most `(iterationsTotal − N)` optimizer evaluations

#### Scenario: Progress reports total tree size
- **WHEN** the worker is processing a request with `skipKeys` containing 50,000 keys out of an `iterationsTotal` of 200,000
- **THEN** progress events include `iterations` counting up to 200,000 (not 150,000) so the displayed percentage reflects whole-tree progress

### Requirement: Cache invalidation on ranking, mode, or data change
The leaf cache SHALL be cleared when `state.ranking` changes, when `state.mode` changes, or when the underlying course/specialization data changes (e.g., a course is marked cancelled). Pin, unpin, and adopt-plan operations SHALL NOT trigger cache invalidation.
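
One way to implement this rule is to fingerprint the inputs that define cache validity and clear the cache whenever the fingerprint changes. The sketch below assumes a hypothetical `dataVersion` counter that is bumped whenever course/specialization data changes; the `ScopedLeafCache` class and its method names are illustrative, not part of the spec.

```typescript
type Mode = "maximize-count" | "priority-order";

// Everything that invalidates the cache. Pins are deliberately absent.
interface CacheScope {
  ranking: string[];
  mode: Mode;
  dataVersion: number; // hypothetical: incremented on any data change
}

function scopeFingerprint(scope: CacheScope): string {
  return JSON.stringify([scope.ranking, scope.mode, scope.dataVersion]);
}

class ScopedLeafCache<V> {
  private entries = new Map<string, V>();
  private fingerprint: string | null = null;

  // Call before each search. Returns true when the cache was cleared.
  sync(scope: CacheScope): boolean {
    const next = scopeFingerprint(scope);
    const invalidated = this.fingerprint !== null && this.fingerprint !== next;
    if (invalidated) {
      this.entries.clear();
    }
    this.fingerprint = next;
    return invalidated;
  }

  set(key: string, value: V): void {
    this.entries.set(key, value);
  }

  get(key: string): V | undefined {
    return this.entries.get(key);
  }

  size(): number {
    return this.entries.size;
  }
}
```

Since the pinned state never enters the fingerprint, pinning and unpinning leave the entries untouched by construction.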

#### Scenario: Mode toggle clears cache
- **WHEN** the user toggles between maximize-count and priority-order
- **THEN** the cache is emptied and the next search runs as a full recomputation

#### Scenario: Ranking re-order clears cache
- **WHEN** the user reorders the specialization ranking
- **THEN** the cache is emptied and the next search runs as a full recomputation

#### Scenario: Pin does not clear cache
- **WHEN** the user pins or unpins a course
- **THEN** the cache retains all previously evaluated leaves

### Requirement: Cache size cap
The leaf cache SHALL be cleared when inserting a new entry would grow it beyond 500,000 entries. Subsequent searches SHALL repopulate the cache from scratch.
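
A minimal sketch of the cap, under one reading of the rule: the check runs at insertion time and empties the cache just before the entry that would push it past the limit. The `insertWithCap` helper and `MAX_CACHE_ENTRIES` constant are hypothetical names; only the 500,000 figure comes from the spec.

```typescript
const MAX_CACHE_ENTRIES = 500_000;

// Inserts one entry, clearing the cache first if it is already full.
// Returns true when the cache was cleared as part of this insertion.
function insertWithCap<V>(
  cache: Map<string, V>,
  key: string,
  value: V,
  maxEntries: number = MAX_CACHE_ENTRIES,
): boolean {
  // Overwriting an existing key does not grow the cache.
  const wouldGrow = !cache.has(key);
  const cleared = wouldGrow && cache.size >= maxEntries;
  if (cleared) {
    cache.clear();
  }
  cache.set(key, value);
  return cleared;
}
```

Clearing everything at once trades hit rate for simplicity: there is no eviction order to maintain, and the next search simply runs without any cached keys to skip.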

#### Scenario: Cap clears cache when exceeded
- **WHEN** the cache is at 500,000 entries and a new search would add at least one more entry
- **THEN** the cache is emptied before the next entry is inserted, and the new search proceeds without `skipKeys`

### Requirement: `deriveFromLeaves` shared helper
The decision-tree module SHALL export a pure function `deriveFromLeaves(leaves, K, mode, ranking, openSetIds, excludedCourseIds): { topK, setAnalyses }` that produces the top-K plan list and per-set ceiling table from a collection of leaf outcomes. This helper SHALL be used both by the worker at `allComplete` and by the main thread when rendering filtered cache results.
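
To make the contract concrete, here is a heavily simplified sketch of such a helper. Everything beyond the parameter list and the `{ topK, setAnalyses }` return shape is an assumption made for illustration: the `Leaf` type, the comparator ordering per mode, the reading of a per-set ceiling as the best `achievedSpecs` count reachable through each course of an open set, and the idea that `priorityScore` already encodes `ranking` (so the parameter is unused here).

```typescript
type Mode = "maximize-count" | "priority-order";

interface PlanOutcome {
  courseAssignments: { setId: string; courseId: string }[];
  achievedSpecs: string[];
  priorityScore: number;
}

interface Leaf {
  assignmentKey: string;
  outcome: PlanOutcome;
}

// Assumed ordering: higher count/score first; assignmentKey breaks ties.
function compareLeaves(a: Leaf, b: Leaf, mode: Mode): number {
  const count = b.outcome.achievedSpecs.length - a.outcome.achievedSpecs.length;
  const score = b.outcome.priorityScore - a.outcome.priorityScore;
  const primary = mode === "maximize-count" ? count : score;
  const secondary = mode === "maximize-count" ? score : count;
  if (primary !== 0) return primary;
  if (secondary !== 0) return secondary;
  if (a.assignmentKey === b.assignmentKey) return 0;
  return a.assignmentKey < b.assignmentKey ? -1 : 1;
}

function deriveFromLeaves(
  leaves: Iterable<Leaf>,
  K: number,
  mode: Mode,
  _ranking: string[], // unused in this sketch (see assumptions above)
  openSetIds: string[],
  excludedCourseIds: string[],
): { topK: Leaf[]; setAnalyses: Map<string, Map<string, number>> } {
  const excluded = new Set(excludedCourseIds);
  const open = new Set(openSetIds);
  // Drop any leaf that uses an excluded course.
  const usable = Array.from(leaves).filter((leaf) =>
    leaf.outcome.courseAssignments.every((a) => !excluded.has(a.courseId)),
  );
  const topK = usable.slice().sort((a, b) => compareLeaves(a, b, mode)).slice(0, K);
  // Per open set: best achieved-spec count seen for each course choice.
  const setAnalyses = new Map<string, Map<string, number>>();
  for (const leaf of usable) {
    const achieved = leaf.outcome.achievedSpecs.length;
    for (const { setId, courseId } of leaf.outcome.courseAssignments) {
      if (!open.has(setId)) continue;
      const perCourse = setAnalyses.get(setId) ?? new Map<string, number>();
      perCourse.set(courseId, Math.max(perCourse.get(courseId) ?? 0, achieved));
      setAnalyses.set(setId, perCourse);
    }
  }
  return { topK, setAnalyses };
}
```

Keeping the function pure, with no access to worker or UI state, is what allows the same code to run in the worker at `allComplete` and on the main thread over a filtered cache subset.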

#### Scenario: Helper output matches a fresh search
- **WHEN** `deriveFromLeaves` is called with the complete leaf set from a finished `searchDecisionTree` run
- **THEN** the returned `topK` and `setAnalyses` match the values that the search itself returned (modulo deterministic tiebreaker stability)

#### Scenario: Helper output is correct for filtered subsets
- **WHEN** `deriveFromLeaves` is called with a strict subset of cached leaves matching the user's current pinned/excluded state
- **THEN** the returned top-K and ceilings reflect only those leaves and never reference courses outside the filter