Add skip_completed parameter to JobManager.create_job() to control duplicate detection:
- When skip_completed=True (default), skips already-completed simulations (existing behavior)
- When skip_completed=False, includes ALL requested simulations regardless of completion status
API endpoint now uses request.replace_existing to control skip_completed parameter:
- replace_existing=false (default): skip_completed=True (skip duplicates)
- replace_existing=true: skip_completed=False (force re-run all simulations)
This allows users to force re-running completed simulations when needed.
- Skip already-completed model-day pairs in create_job()
- Return warnings for skipped simulations
- Raise error if all simulations are already completed
- Update create_job() return type from str to Dict[str, Any]
- Update all callers to handle new dict return type
- Add comprehensive test coverage for duplicate detection
- Log warnings when simulations are skipped
When a Docker container is shutdown and restarted, jobs with status
'pending', 'downloading_data', or 'running' remained in the database,
preventing new jobs from starting due to concurrency control checks.
This commit adds automatic cleanup of stale jobs during FastAPI startup:
- New cleanup_stale_jobs() method in JobManager (api/job_manager.py:702-779)
- Integrated into FastAPI lifespan startup (api/main.py:164-168)
- Intelligent status determination based on completion percentage:
- 'partial' if any model-days completed (preserves progress data)
- 'failed' if no progress made
- Detailed error messages with original status and completion counts
- Marks incomplete job_details as 'failed' with clear error messages
- Deployment-aware: skips cleanup in DEV mode when DB is reset
- Comprehensive logging at warning level for visibility
Testing:
- 6 new unit tests covering all cleanup scenarios (451-609)
- All 30 existing job_manager tests still pass
- Tests verify pending, running, downloading_data, partial progress,
no stale jobs, and multiple stale jobs scenarios
Resolves issue where container restarts left stale jobs blocking the
can_start_new_job() concurrency check.
- Add 'skipped' to terminal states in update_job_detail_status()
- Ensures skipped dates properly:
- Update status and completed_at timestamp
- Store skip reason in error field
- Trigger job completion checks
- Add comprehensive test suite (11 tests) covering:
- Database schema validation
- Job completion with skipped dates
- Progress tracking with skip counts
- Multi-model skip handling
- Skip reason storage
Bug was discovered via TDD - created tests first, which revealed
that skipped status wasn't being handled in the terminal state
block at line 397.
All 11 tests passing.
Implement skip status tracking to fix jobs hanging when dates are
filtered out. Jobs now properly complete when all model-days reach
terminal states (completed/failed/skipped).
Changes:
- database.py: Add 'skipped' status to job_details CHECK constraint
- job_manager.py: Update completion logic to count skipped as done
- job_manager.py: Add skipped count to progress tracking
- simulation_worker.py: Implement skip tracking with per-model granularity
- simulation_worker.py: Add _filter_completed_dates_with_tracking()
- simulation_worker.py: Add _mark_skipped_dates()
- simulation_worker.py: Update _prepare_data() to use skip tracking
- simulation_worker.py: Improve warning messages to distinguish skip types
Skip reasons:
- "Already completed" - Position data exists from previous job
- "Incomplete price data" - Missing prices (weekends/holidays/future)
The implementation correctly handles multi-model scenarios where different
models have different completion states for the same date.
BREAKING CHANGE: end_date is now required and cannot be null/empty
New Features:
- Resume mode: Set start_date to null to continue from last completed date per model
- Idempotent by default: Skip already-completed dates with replace_existing=false
- Per-model independence: Each model resumes from its own last completed date
- Cold start handling: If no data exists in resume mode, runs only end_date as single day
API Changes:
- start_date: Now optional (null enables resume mode)
- end_date: Now REQUIRED (cannot be null or empty string)
- replace_existing: New optional field (default: false for idempotent behavior)
Implementation:
- Added JobManager.get_last_completed_date_for_model() method
- Added JobManager.get_completed_model_dates() method
- Updated create_job() to support model_day_filter for selective task creation
- Fixed bug with start_date=None in price data checks
Documentation:
- Updated API_REFERENCE.md with complete examples and behavior matrix
- Updated QUICK_START.md with resume mode examples
- Updated docs/user-guide/using-the-api.md
- Added CHANGELOG_NEW_API.md with migration guide
- Updated all integration tests for new schema
- Updated client library examples (Python, TypeScript)
Migration:
- Old: {"start_date": "2025-01-16"}
- New: {"start_date": "2025-01-16", "end_date": "2025-01-16"}
- Resume: {"start_date": null, "end_date": "2025-01-31"}
See CHANGELOG_NEW_API.md for complete details.