v1.3.2: Leaf cache for instant pin/unpin + TopPlans block UX
Decision-tree leaf outcomes are now cached on the main thread keyed by
their full 12-course assignment. Pin operations filter the cache and
re-derive top-K + per-set ceilings instantly with no worker spawn. Unpin
operations show the cached subset immediately and stream improvements as
a background worker fills in the missing leaves. Cache survives pin,
unpin, and adopt-plan; only ranking or mode changes invalidate it.
Solver / worker:
- searchDecisionTree accepts skipKeys (Set<string>) and pinnedAssignments
(Record<setId,courseId>). Leaves are emitted with their full 12-set
assignment so cache keys are stable across pin/unpin operations.
- evaluateLeaf short-circuits when the leaf's assignmentKey is in
skipKeys: increments iterations + emits progress, but skips the
optimizer call and all callbacks. Keeps progress percentage honest
(counts whole tree, not just delta).
- New deriveFromLeaves pure helper produces {topK, setAnalyses} from a
leaf collection; used by the main-thread cache filter and gives a
reusable derivation primitive for tests.
- Worker request gains skipKeys and pinnedAssignments fields. Worker
response gains a leafEvaluated event so the main thread can populate
its cache as the search streams.
App state:
- leafCacheRef holds Map<assignmentKey, PlanOutcome> scoped to the
current (ranking, mode) pair. The search effect now: invalidates on
ranking/mode change; computes the orderedCourses + expectedTotal;
filters the cache against the current pinned/excluded state; calls
deriveFromLeaves to render immediately; spawns the worker only when
filtered.length < expectedTotal, passing skipKeys.
- Cache cap of 500,000 leaves with full clear on overflow. Bounds
worst-case memory at ~150 MB.
UI (TopPlans):
- Course blocks in the per-plan row are now interactive buttons. Click
pins (or unpins, if the course is currently pinned) the course in
that set. Pinned blocks render in a selected blue color.
- Each plan row now shows the FULL 12-set sequence including pinned
courses (interleaved with the search's recommended choices for the
remaining open sets) so the displayed plan is always complete.
- Spec qualification tags removed from per-block display (kept the
set-label + course-name treatment for clarity).
Tests:
- New app/src/solver/__tests__/leafCache.test.ts with 4 tests:
skipKeys parity (second-pass run with skipKeys evaluates zero
leaves), deriveFromLeaves parity (matches a fresh search), cache
filter on pinned assignments, cache filter on excluded courses.
- All 78 prior tests continue to pass; 82 total.
Browser-verified: pin click on a Top Plans block from the cached
8-open-set scenario completes instantly with no spinner; unpin restores
the original cached subset (also instant when the prior space was
already cached); mode toggle correctly invalidates and re-runs the
search.
This commit is contained in:
@@ -0,0 +1,116 @@
|
||||
## Context
|
||||
|
||||
v1.3.1 ships exhaustive decision-tree search. Per-cell ceilings and top-K plans are correct, but every pin/unpin click triggers a full re-search (~7s for an 8-open-set scenario). The compute is wasteful: leaf outcomes are pure functions of `(courseAssignments, ranking, mode)`. A leaf evaluated once stays correct as long as ranking and mode don't change.
|
||||
|
||||
Today the search effect in `useAppState` (`appState.ts:124-191`) lists `[selectedCourseIds, openSetIds, ranking, mode, excludedCourseIds]` as deps. Any change kills the running worker and starts a fresh search after a 300ms debounce.
|
||||
|
||||
This change adds an in-memory cache of leaf outcomes keyed by the existing `assignmentKey`. Pin operations become 100% cache hits (no worker). Unpin operations get a partial hit (cached subset rendered immediately, missing leaves computed in the background).
|
||||
|
||||
## Goals / Non-Goals
|
||||
|
||||
**Goals:**
|
||||
- Pin clicks produce instant top-K + per-cell updates with no spinner
|
||||
- Unpin clicks render the cached subset immediately as a lower bound, then improve as the worker streams new leaves
|
||||
- "Adopt plan" (which fires multiple pin actions in quick succession) feels instant
|
||||
- Cache memory is bounded (~80 MB worst case at the 500k cap)
|
||||
|
||||
**Non-Goals:**
|
||||
- Persisting the cache across page reloads (no localStorage; recompute on first use after reload)
|
||||
- Caching across ranking or mode changes
|
||||
- Re-architecting the optimizer or worker protocol beyond the new `skipKeys` field
|
||||
- A "tree of caches" keyed by ranking — keep one cache, invalidate on ranking change
|
||||
|
||||
## Decisions
|
||||
|
||||
### Cache key = `assignmentKey` only; cache scope = `(ranking, mode)`
|
||||
|
||||
`assignmentKey` already exists (`decisionTree.ts:79-84`) and is the deterministic stringification used by the comparator's tiebreaker. Reusing it avoids a parallel hashing scheme. Cache scope is bound to the current `(ranking, mode)` pair: when either changes, the cache is wiped and rebuilt from scratch on the next search.
|
||||
|
||||
**Alternative considered:** Keep multiple caches, one per `(ranking, mode)` pair the user has visited. Rejected — extra complexity for a workflow where ranking/mode changes are infrequent compared to pin clicks. If profiling later shows ranking-toggle becoming a hot path, can add then.
|
||||
|
||||
### Immediate render on partial hit (unpin), then stream improvements
|
||||
|
||||
When the user unpins a set, the new openSetIds product is larger than the cached subset. Today's UX would block on a fresh worker. Instead:
|
||||
|
||||
1. Filter the cache against the new pinned + excluded state. Derive top-K and per-set ceilings from the filtered leaves. Render immediately.
|
||||
2. Compute missing leaves count. If non-zero, spawn the worker with `skipKeys = cache.keys()`. The worker DFS visits every leaf in the new search space but skips the optimizer call for keys already cached.
|
||||
3. As the worker streams new leaves (existing `topKUpdate` / `choiceUpdate` events plus a final `allComplete`), insert each into the cache and re-derive top-K/ceilings via the existing streaming path.
|
||||
|
||||
The cached subset is a strict lower bound on the true result: top-K from cache is a valid (possibly incomplete) subset of the true top-K; per-set ceilings from cache are ≤ the true ceilings. Streaming improvements are monotonically non-decreasing under the existing comparator.
|
||||
|
||||
**Alternative considered:** Show a spinner and wait for the search to complete before rendering. Rejected — would feel like a regression after pin became instant. The streaming UI already handles monotonic improvement gracefully.
|
||||
|
||||
### Worker contract: `skipKeys: string[]`
|
||||
|
||||
The simplest extension. Main thread serializes `cache.keys()` to an array; worker reconstructs the Set on the receiving end. For 65k cached keys × ~30 chars each = ~2 MB transfer per worker spawn. structured-cloned in tens of milliseconds. Acceptable.
|
||||
|
||||
**Alternative considered:** Persistent worker that holds the cache internally. Rejected — adds lifecycle complexity (when to terminate, how to abort in-flight work, race conditions on cache writes). The fresh-worker model already exists and the transfer cost is bearable.
|
||||
|
||||
### `evaluateLeaf` short-circuit semantics
|
||||
|
||||
When a leaf's key is in `skipKeys`:
|
||||
- Increment `iterations` (so progress reflects total leaves visited, not just newly-computed)
|
||||
- Skip the optimizer call, skip `topK.tryInsert`, skip `choiceUpdate` emit
|
||||
- Skip `evaluated` flag updates (those cells were already marked from the cached results on the main thread)
|
||||
- Still emit `progress` events at the throttled rate
|
||||
|
||||
Rationale: the user-visible "iterations explored" should reflect tree size, not just delta. Otherwise an unpin would show "Searching… 100,000/200,000" jumping from 100k mid-search, which is confusing.
|
||||
|
||||
**Alternative considered:** Don't increment iterations for skipped leaves. Rejected — the percentage in the progress bar would be misleading.
|
||||
|
||||
### `deriveFromLeaves` extracted as a pure helper
|
||||
|
||||
To produce top-K and per-set ceilings from a leaf collection on the main thread, factor out the relevant logic from `searchDecisionTree`. The helper takes `(leaves, K, mode, ranking, openSetIds, excludedCourseIds)` and returns `{ topK: PlanOutcome[], setAnalyses: SetAnalysis[] }`.
|
||||
|
||||
The worker uses the same helper at `allComplete` to produce its final emission. The main thread uses it for the immediate-render path.
|
||||
|
||||
**Alternative considered:** Maintain top-K and per-set ceilings incrementally in the cache structure itself. Rejected — too coupled; recomputing from leaves is O(cache.size) which is fast (linear scan + small comparator work).
|
||||
|
||||
### Cache cap = 500k leaves with full clear on overflow
|
||||
|
||||
500k × ~300 bytes ≈ 150 MB. Comfortable on desktop, snug on mobile. Tripping the cap requires extreme exploration paths (≥10 open sets + repeated cycles); typical sessions stay under 100k.
|
||||
|
||||
When the cap fires, clear the entire cache. Simplest possible policy. Subsequent searches behave as v1.3.1 (full recompute).
|
||||
|
||||
**Alternative considered:** LRU eviction. Rejected for v1 — adds bookkeeping overhead per insert; benefit is marginal given cap is rarely hit. Add later if profiling shows churn.
|
||||
|
||||
### Cache invalidation events
|
||||
|
||||
| Event | Cache action |
|
||||
|---|---|
|
||||
| Pin a course | Keep cache; filter |
|
||||
| Unpin a course | Keep cache; filter (partial hit) |
|
||||
| Adopt plan | Keep cache; filter |
|
||||
| Ranking re-order | Clear cache; re-search from scratch |
|
||||
| Mode toggle | Clear cache; re-search from scratch |
|
||||
| Cancellation toggle (data file edit) | Clear cache (data version changed) |
|
||||
| Cache size exceeds 500k | Clear cache |
|
||||
|
||||
The ranking/mode/cancellation changes are uncommon compared to pin clicks, so the simplification is worth it.
|
||||
|
||||
## Risks / Trade-offs
|
||||
|
||||
- **Memory footprint** → Mitigation: 500k cap clears the cache when exceeded; typical sessions stay well under
|
||||
- **Worker transfer of `skipKeys`** → ~2 MB per worker spawn at 65k cached keys; acceptable, measure and revisit if it becomes painful at scale
|
||||
- **Cached subset can briefly disagree with worker's final result** during streaming → Existing streaming UI handles monotonic improvement; the only visible effect is some cells starting at a lower count and improving as the worker fills in
|
||||
- **Iteration counter semantics** during skip-mode runs → Counts total leaves visited (cached + new). Decision documented above; clear in the proposal scenarios so users understand "Searching… 50,000/200,000" early in an unpin run
|
||||
- **First-time experience unchanged** → Empty cache means full search; no improvement on initial page load. Subsequent operations are where the win happens
|
||||
|
||||
## Migration Plan
|
||||
|
||||
Single-PR change. No data migration. No persistent state to migrate.
|
||||
|
||||
1. Implement `skipKeys` plumbing through `searchDecisionTree` and worker
|
||||
2. Extract `deriveFromLeaves` helper
|
||||
3. Add cache to `useAppState`; restructure the worker effect to filter-then-spawn
|
||||
4. Tests for skip-keys correctness, derive helper parity, cache-cap eviction, ranking/mode invalidation
|
||||
5. Browser verify: pin/unpin behave instantly after the first search
|
||||
6. Bump version (`1.3.2`); CHANGELOG entry; ship
|
||||
|
||||
Rollback: revert. v1.3.1 behavior restored — every pin/unpin runs a fresh worker.
|
||||
|
||||
## Open Questions
|
||||
|
||||
- Memory measurement on representative scenarios (8-set typical, 10-set extreme) — instrument once for the changelog notes
|
||||
- Whether `skipKeys` transfer becomes a bottleneck at very large cache sizes — defer; haven't observed it in typical use
|
||||
- Whether to expose a debug toggle to disable the cache (useful for A/B feel testing) — defer; can add as a query param if needed
|
||||
Reference in New Issue
Block a user