Commit Graph

9 Commits

Author SHA1 Message Date
837962ceea feat: integrate deployment mode path resolution in database module 2025-11-01 11:22:03 -04:00
8fb2ead8ff feat: add dev database initialization and cleanup functions 2025-11-01 11:20:15 -04:00
2575e0c12a fix: add database schema migration for simulation_run_id column
Add automatic schema migration to handle existing databases that don't
have the simulation_run_id column in the positions table.

Problem:
- v0.3.0-alpha.3 databases lack simulation_run_id column
- CREATE TABLE IF NOT EXISTS doesn't add new columns to existing tables
- Index creation fails with "no such column: simulation_run_id"

Solution:
- Add _migrate_schema() function to detect and migrate old schemas
- Check if positions table exists and inspect its columns
- ALTER TABLE to add simulation_run_id if missing
- Run migration before creating indexes

This allows seamless upgrades from alpha.3 to alpha.4 without manual
database deletion or migration scripts.

Fixes docker compose startup error:
  sqlite3.OperationalError: no such column: simulation_run_id

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-31 18:41:38 -04:00
c3ea358a12 test: add comprehensive test suite for v0.3.0 on-demand price downloads
Add 64 new tests covering date utilities, price data management, and
on-demand download workflows with 100% coverage for date_utils and 85%
coverage for price_data_manager.

New test files:
- tests/unit/test_date_utils.py (22 tests)
  * Date range expansion and validation
  * Max simulation days configuration
  * Chronological ordering and boundary checks
  * 100% coverage of api/date_utils.py

- tests/unit/test_price_data_manager.py (33 tests)
  * Initialization and configuration
  * Symbol date retrieval and coverage detection
  * Priority-based download ordering
  * Rate limit and error handling
  * Data storage and coverage tracking
  * 85% coverage of api/price_data_manager.py

- tests/integration/test_on_demand_downloads.py (10 tests)
  * End-to-end download workflows
  * Rate limit handling with graceful degradation
  * Coverage tracking and gap detection
  * Data validation and filtering

Code improvements:
- Add DownloadError exception class for non-rate-limit failures
- Update all ValueError raises to DownloadError for consistency
- Add API key validation at download start
- Improve response validation to check for Meta Data

Test coverage:
- 64 tests passing (54 unit + 10 integration)
- api/date_utils.py: 100% coverage
- api/price_data_manager.py: 85% coverage
- Validates priority-first download strategy
- Confirms graceful rate limit handling
- Verifies database storage and retrieval

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-31 17:13:03 -04:00
76b946449e feat: implement date range API and on-demand downloads (WIP phase 2)
Phase 2 progress - API integration complete.

API Changes:
- Replace date_range (List[str]) with start_date/end_date (str)
- Add automatic end_date defaulting to start_date for single day
- Add date format validation
- Integrate PriceDataManager for on-demand downloads
- Add rate limit handling (trusts provider, no pre-config)
- Validate date ranges with configurable max days (MAX_SIMULATION_DAYS)

New Modules:
- api/date_utils.py - Date validation and expansion utilities
- scripts/migrate_price_data.py - Migration script for merged.jsonl

API Flow:
1. Validate date range (start <= end, max 30 days, not future)
2. Check missing price data coverage
3. Download missing data if AUTO_DOWNLOAD_PRICE_DATA=true
4. Priority-based download (maximize date completion)
5. Create job with available trading dates
6. Graceful handling of partial data (rate limits)

Configuration:
- AUTO_DOWNLOAD_PRICE_DATA (default: true)
- MAX_SIMULATION_DAYS (default: 30)
- No rate limit configuration needed

Still TODO:
- Update tools/price_tools.py to read from database
- Implement simulation run tracking
- Update .env.example
- Comprehensive testing
- Documentation updates

Breaking Changes:
- API request format changed (date_range -> start_date/end_date)
- This completes v0.3.0 preparation
2025-10-31 16:40:50 -04:00
bddf4d8b72 feat: add price data management infrastructure (WIP)
Phase 1 of v0.3.0 date range and on-demand download implementation.

Database changes:
- Add price_data table (OHLCV data, replaces merged.jsonl)
- Add price_data_coverage table (track downloaded date ranges)
- Add simulation_runs table (soft delete support)
- Add simulation_run_id to positions table
- Add comprehensive indexes for new tables

New modules:
- api/price_data_manager.py - Priority-based download manager
  - Coverage gap detection
  - Smart download prioritization (maximize date completion)
  - Rate limit handling with retry logic
  - Alpha Vantage integration

Configuration:
- configs/nasdaq100_symbols.json - NASDAQ 100 constituent list

Next steps (not yet implemented):
- Migration script for merged.jsonl -> price_data
- Update API models (start_date/end_date)
- Update tools/price_tools.py to read from database
- Simulation run tracking implementation
- API integration
- Tests and documentation

This is work in progress for the v0.3.0 release.
2025-10-31 16:37:14 -04:00
8e7e80807b refactor: remove config_path from API interface
Makes config_path an internal server detail rather than an API parameter.

Changes:
- Remove config_path from SimulateTriggerRequest
- Add config_path parameter to create_app() with default
- Store in app.state.config_path for internal use
- Update trigger endpoint to use internal config path
- Change missing config error from 400 to 500 (server error)

API calls now only need to specify date_range (and optionally models):
  POST /simulate/trigger
  {"date_range": ["2025-01-16"]}

The server uses configs/default_config.json by default.
This simplifies the API and hides implementation details from clients.
2025-10-31 15:18:56 -04:00
ec2a37e474 feat: use enabled field from config to determine which models run
Changed the API to respect the 'enabled' field in model configurations,
rather than requiring models to be explicitly specified in API requests.

Changes:
- Make 'models' parameter optional in POST /simulate/trigger
- If models not provided, read config and use enabled models
- If models provided, use as explicit override (for testing)
- Raise error if no enabled models found and none specified
- Update response message to show model count

Behavior:
- Default: Only runs models with "enabled": true in config
- Override: Can still specify models in request for manual testing
- Safety: Prevents accidental execution of disabled/expensive models

Example before (required):
  POST /simulate/trigger
  {"config_path": "...", "date_range": [...], "models": ["gpt-4"]}

Example after (optional):
  POST /simulate/trigger
  {"config_path": "...", "date_range": [...]}
  # Uses models where enabled: true

This makes the config file the source of truth for which models
should run, while still allowing ad-hoc overrides for testing.
2025-10-31 15:12:11 -04:00
fb9583b374 feat: transform to REST API service with SQLite persistence (v0.3.0)
Major architecture transformation from batch-only to API service with
database persistence for Windmill integration.

## REST API Implementation
- POST /simulate/trigger - Start simulation jobs
- GET /simulate/status/{job_id} - Monitor job progress
- GET /results - Query results with filters (job_id, date, model)
- GET /health - Service health checks

## Database Layer
- SQLite persistence with 6 tables (jobs, job_details, positions,
  holdings, reasoning_logs, tool_usage)
- Foreign key constraints with cascade deletes
- Replaces JSONL file storage

## Backend Components
- JobManager: Job lifecycle management with concurrency control
- RuntimeConfigManager: Thread-safe isolated runtime configs
- ModelDayExecutor: Single model-day execution engine
- SimulationWorker: Date-sequential, model-parallel orchestration

## Testing
- 102 unit and integration tests (85% coverage)
- Database: 98% coverage
- Job manager: 98% coverage
- API endpoints: 81% coverage
- Pydantic models: 100% coverage
- TDD approach throughout

## Docker Deployment
- Dual-mode: API server (persistent) + batch (one-time)
- Health checks with 30s interval
- Volume persistence for database and logs
- Separate entrypoints for each mode

## Validation Tools
- scripts/validate_docker_build.sh - Build validation
- scripts/test_api_endpoints.sh - Complete API testing
- scripts/test_batch_mode.sh - Batch mode validation
- DOCKER_API.md - Deployment guide
- TESTING_GUIDE.md - Testing procedures

## Configuration
- API_PORT environment variable (default: 8080)
- Backwards compatible with existing configs
- FastAPI, uvicorn, pydantic>=2.0 dependencies

Co-Authored-By: AI Assistant <noreply@example.com>
2025-10-31 11:47:10 -04:00