docs: restructure roadmap with v1.0 stability milestone and v1.x features

Major changes:
- Simplified v0.4.0 to focus on smart date-based simulation API with automatic resume
- Added v1.0.0 milestone for production stability, testing, and validation
- Reorganized post-1.0 features into manageable v1.x releases:
  - v1.1.0: Position history & analytics
  - v1.2.0: Performance metrics & analytics
  - v1.3.0: Data management API
  - v1.4.0: Web dashboard UI
  - v1.5.0: Advanced configuration & customization
- Moved quantitative modeling to v2.0.0 (major version bump)

Key improvements:
- v0.4.0 now has single /simulate/to-date endpoint with idempotent behavior
- Explicit force_resimulate flag prevents accidental re-simulation
- v1.0.0 includes comprehensive quality gates and production readiness checklist
- Each v1.x release focuses on specific domain for easier implementation
2026-04-01 17:17:24 -04:00 · 2025-11-01 12:23:11 -04:00
parent 0e739a9720
commit 4ac89f1724
1 changed file with 435 additions and 69 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -4,81 +4,441 @@ This document outlines planned features and improvements for the AI-Trader proje

 ## Release Planning

-### v0.4.0 - Enhanced Simulation Management (Planned)
+### v0.4.0 - Simplified Simulation Control (Planned)

-**Focus:** Improved simulation control, resume capabilities, and performance analysis
+**Focus:** Streamlined date-based simulation API with automatic resume from last completed date

-#### Simulation Resume & Continuation
- **Resume from Last Completed Date** - API to continue simulations without re-running completed dates
-  - `POST /simulate/resume` - Resume last incomplete job or start from last completed date
-  - `POST /simulate/continue` - Extend existing simulation with new date range
-  - Query parameters to specify which model(s) to continue
-  - Automatic detection of last completed date per model
-  - Validation to prevent overlapping simulations
-  - Support for extending date ranges forward in time
-  - Use cases:
-    - Daily simulation updates (add today's date to existing run)
-    - Recovering from failed jobs (resume from interruption point)
-    - Incremental backtesting (extend historical analysis)
+#### Core Simulation API
+- **Smart Date-Based Simulation** - Simple API for running simulations to a target date
+  - `POST /simulate/to-date` - Run simulation up to specified date
+    - Request: `{"target_date": "2025-01-31", "models": ["model1", "model2"]}`
+    - Automatically starts from last completed date in position.jsonl
+    - Skips already-simulated dates by default (idempotent)
+    - Optional `force_resimulate: true` flag to re-run completed dates
+    - Returns: job_id, date range to be simulated, models included
+  - `GET /simulate/status/{model_name}` - Get last completed date and available date ranges
+    - Returns: last_simulated_date, next_available_date, data_coverage
+  - Behavior:
+    - If no position.jsonl exists: starts from initial_date in config or first available data
+    - If position.jsonl exists: continues from last completed date + 1 day
+    - Validates target_date has available price data
+    - Skips weekends automatically
+    - Prevents accidental re-simulation without explicit flag
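The resume behavior described above can be sketched as a small date planner. This is only an illustration under stated assumptions: the function name `plan_simulation_dates` and its parameters are hypothetical, a forced re-run is assumed to restart from the initial date, and only weekends are skipped (market holidays are not handled).

```python
from datetime import date, timedelta
from typing import List, Optional


def plan_simulation_dates(
    target_date: date,
    last_completed: Optional[date],
    initial_date: date,
    force_resimulate: bool = False,
) -> List[date]:
    """Return the weekdays a to-date request would still need to simulate."""
    # No history yet, or an explicit re-run: start from the configured initial date
    # (assumption: a forced re-run covers the whole range again).
    if last_completed is None or force_resimulate:
        start = initial_date
    else:
        # Otherwise continue from the day after the last completed date.
        start = last_completed + timedelta(days=1)

    days: List[date] = []
    current = start
    while current <= target_date:
        if current.weekday() < 5:  # Monday=0 .. Friday=4, so weekends are skipped
            days.append(current)
        current += timedelta(days=1)
    return days
```

Calling the planner twice with the same target and an up-to-date `last_completed` yields an empty list the second time, which is the idempotent behavior the endpoint is meant to have.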

-#### Position History & Analysis
- **Position History Tracking** - Track position changes over time
-  - Query endpoint: `GET /positions/history?model=<name>&start_date=<date>&end_date=<date>`
-  - Timeline view of all trades and position changes
-  - Calculate holding periods and turnover rates
-  - Support for position snapshots at specific dates
+#### Benefits
+- **Simplicity** - Single endpoint for "simulate to this date"
+- **Idempotent** - Safe to call repeatedly, won't duplicate work
+- **Incremental Updates** - Easy daily simulation updates: `POST /simulate/to-date {"target_date": "today"}`
+- **Explicit Re-simulation** - Require `force_resimulate` flag to prevent accidental data overwrites
+- **Automatic Resume** - Handles crash recovery transparently

-#### Performance Metrics
- **Advanced Performance Analytics** - Calculate standard trading metrics
-  - Sharpe ratio, Sortino ratio, maximum drawdown
-  - Win rate, average win/loss, profit factor
-  - Volatility and beta calculations
-  - Risk-adjusted returns
-  - Comparison across models
+#### Example Usage
+```bash
+# Initial backtest (Jan 1 - Jan 31)
+curl -X POST http://localhost:5000/simulate/to-date -H "Content-Type: application/json" \
+  -d '{"target_date": "2025-01-31", "models": ["gpt-4"]}'

-#### Data Management
- **Price Data Management API** - Endpoints for price data operations
-  - `GET /data/coverage` - Check date ranges available per symbol
-  - `POST /data/download` - Trigger manual price data downloads
-  - `GET /data/status` - Check download progress and rate limits
-  - `DELETE /data/range` - Remove price data for specific date ranges
+# Daily update (simulate new trading day)
+curl -X POST http://localhost:5000/simulate/to-date -H "Content-Type: application/json" \
+  -d '{"target_date": "2025-02-03", "models": ["gpt-4"]}'

-#### Web UI
- **Dashboard Interface** - Web-based monitoring and control interface
-  - Job management dashboard
-    - View active, pending, and completed jobs
-    - Start new simulations with form-based configuration
-    - Monitor job progress in real-time
-    - Cancel running jobs
-  - Results visualization
-    - Performance charts (P&L over time, cumulative returns)
-    - Position history timeline
-    - Model comparison views
-    - Trade log explorer with filtering
-  - Configuration management
-    - Model configuration editor
-    - Date range selection with calendar picker
-    - Price data coverage visualization
-  - Technical implementation
-    - Modern frontend framework (React, Vue.js, or Svelte)
-    - Real-time updates via WebSocket or SSE
-    - Responsive design for mobile access
-    - Chart library (Plotly.js, Chart.js, or Recharts)
-    - Served alongside API (single container deployment)
+# Check status
+curl http://localhost:5000/simulate/status/gpt-4

-#### Development Infrastructure
- **Migration to uv Package Manager** - Modern Python package management
-  - Replace pip with uv for dependency management
-  - Create pyproject.toml with project metadata and dependencies
-  - Update Dockerfile to use uv for faster, more reliable builds
-  - Update development documentation and workflows
-  - Benefits:
-    - 10-100x faster dependency resolution and installation
-    - Better dependency locking and reproducibility
-    - Unified tool for virtual environments and package management
-    - Drop-in pip replacement with improved UX
+# Force re-simulation (e.g., after config change)
+curl -X POST http://localhost:5000/simulate/to-date -H "Content-Type: application/json" \
+  -d '{"target_date": "2025-01-31", "models": ["gpt-4"], "force_resimulate": true}'
+```

-### v0.5.0 - Advanced Quantitative Modeling (Planned)
+#### Technical Implementation
+- Modify `main.py` and `api/app.py` to support target date parameter
+- Update `BaseAgent.get_trading_dates()` to detect last completed date from position.jsonl
+- Add validation: target_date must have price data available
+- Add `force_resimulate` flag handling: clear position.jsonl range if enabled
+- Preserve existing `/simulate` endpoint for backward compatibility
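Detecting the last completed date could look roughly like the sketch below. The record layout of position.jsonl is not shown in this roadmap, so the sketch assumes one JSON object per line with an ISO-formatted `date` field; both the field name and the helper `last_completed_date` are hypothetical.

```python
import json
from datetime import date
from typing import Iterable, Optional


def last_completed_date(lines: Iterable[str]) -> Optional[date]:
    """Return the latest date found in JSONL records, or None if there are none."""
    latest: Optional[date] = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        record = json.loads(line)
        day = date.fromisoformat(record["date"])  # assumed field name
        if latest is None or day > latest:
            latest = day
    return latest
```

Because the function accepts any iterable of strings, an open file handle can be passed in directly, and a missing or empty file maps naturally to `None` (start from the initial date).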
+
+### v1.0.0 - Production Stability & Validation (Planned)
+
+**Focus:** Comprehensive testing, documentation, and production readiness
+
+#### Testing & Validation
+- **Comprehensive Test Suite** - Full coverage of core functionality
+  - Unit tests for all agent components
+    - BaseAgent methods (initialize, run_trading_session, get_trading_dates)
+    - Position management and tracking
+    - Date range handling and validation
+    - MCP tool integration
+  - Integration tests for API endpoints
+    - All /simulate endpoints with various configurations
+    - /jobs endpoints (status, cancel, results)
+    - /models endpoint for listing available models
+    - Error handling and validation
+  - End-to-end simulation tests
+    - Multi-day trading simulations with mock data
+    - Multiple concurrent model execution
+    - Resume functionality after interruption
+    - Force re-simulation scenarios
+  - Anti-look-ahead validation tests
+    - Verify price data temporal boundaries
+    - Verify search results date filtering
+    - Confirm no future data leakage in system prompts
+  - Test coverage target: >80% code coverage
+  - Continuous Integration: GitHub Actions workflow for automated testing
+
+#### Stability & Error Handling
+- **Robust Error Recovery** - Handle failures gracefully
+  - Retry logic for transient API failures (already implemented, validate)
+  - Graceful degradation when MCP services are unavailable
+  - Database connection pooling and error handling
+  - File system error handling (disk full, permission errors)
+  - Comprehensive error messages with troubleshooting guidance
+  - Logging improvements:
+    - Structured logging with consistent format
+    - Log rotation and size management
+    - Error classification (user error vs. system error)
+    - Debug mode for detailed diagnostics
+
+#### Performance & Scalability
+- **Performance Optimization** - Ensure efficient resource usage
+  - Database query optimization and indexing
+  - Price data caching and efficient lookups
+  - Concurrent simulation handling validation
+  - Memory usage profiling and optimization
+  - Long-running simulation stability testing (30+ day ranges)
+  - Load testing: multiple concurrent API requests
+  - Resource limits and rate limiting considerations
+
+#### Documentation & Examples
+- **Production-Ready Documentation** - Complete user and developer guides
+  - API documentation improvements:
+    - OpenAPI/Swagger specification
+    - Interactive API documentation (Swagger UI)
+    - Example requests/responses for all endpoints
+    - Error response documentation
+  - User guides:
+    - Quickstart guide refinement
+    - Common workflows and recipes
+    - Troubleshooting guide expansion
+    - Best practices for model configuration
+  - Developer documentation:
+    - Architecture deep-dive
+    - Contributing guidelines
+    - Custom agent development guide
+    - MCP tool development guide
+  - Example configurations:
+    - Various model providers (OpenAI, Anthropic, local models)
+    - Different trading strategies
+    - Development vs. production setups
+
+#### Security & Best Practices
+- **Security Hardening** - Production security review
+  - API authentication/authorization review (if applicable)
+  - API key management best practices documentation
+  - Input validation and sanitization review
+  - SQL injection prevention validation
+  - Rate limiting for public deployments
+  - Security considerations documentation
+  - Dependency vulnerability scanning
+  - Docker image security scanning
+
+#### Release Readiness
+- **Production Deployment Support** - Everything needed for production use
+  - Production deployment checklist
+  - Health check endpoints improvements
+  - Monitoring and observability guidance
+    - Key metrics to track (job success rate, execution time, error rates)
+    - Integration with monitoring systems (Prometheus, Grafana)
+    - Alerting recommendations
+  - Backup and disaster recovery guidance
+  - Database migration strategy
+  - Upgrade path documentation (v0.x to v1.0)
+  - Version compatibility guarantees going forward
+
+#### Quality Gates for v1.0.0 Release
+All of the following must be met before v1.0.0 release:
+- [ ] Test suite passes with >80% code coverage
+- [ ] All critical and high-priority bugs resolved
+- [ ] API documentation complete (OpenAPI spec)
+- [ ] Production deployment guide complete
+- [ ] Security review completed
+- [ ] Performance benchmarks established
+- [ ] Docker image published and tested
+- [ ] Migration guide from v0.3.0 available
+- [ ] At least 2 weeks of community testing (beta period)
+- [ ] Zero known data integrity issues
+
+### v1.1.0 - Position History & Analytics (Planned)
+
+**Focus:** Track and analyze trading behavior over time
+
+#### Position History API
+- **Position Tracking Endpoints** - Query historical position changes
+  - `GET /positions/history` - Get position timeline for model(s)
+    - Query parameters: `model`, `start_date`, `end_date`, `symbol`
+    - Returns: chronological list of all position changes
+    - Pagination support for long histories
+  - `GET /positions/snapshot` - Get positions at specific date
+    - Query parameters: `model`, `date`
+    - Returns: portfolio state at end of trading day
+  - `GET /positions/summary` - Get position statistics
+    - Holdings duration (average, min, max)
+    - Turnover rate (daily, weekly, monthly)
+    - Most/least traded symbols
+    - Trading frequency patterns
+
+#### Trade Analysis
+- **Trade-Level Insights** - Analyze individual trades
+  - `GET /trades` - List all trades with filtering
+    - Filter by: model, date range, symbol, action (buy/sell)
+    - Sort by: date, profit/loss, volume
+  - `GET /trades/{trade_id}` - Get trade details
+    - Entry/exit prices and dates
+    - Holding period
+    - Realized profit/loss
+    - Context (what else was traded that day)
+  - Trade classification:
+    - Round trips (buy + sell of same stock)
+    - Partial positions (multiple entries/exits)
+    - Long-term holds vs. day trades
+
+#### Benefits
+- Understand agent trading patterns and behavior
+- Identify strategy characteristics (momentum, mean reversion, etc.)
+- Debug unexpected trading decisions
+- Compare trading styles across models
+
+### v1.2.0 - Performance Metrics & Analytics (Planned)
+
+**Focus:** Calculate standard financial performance metrics
+
+#### Risk-Adjusted Performance
+- **Performance Metrics API** - Calculate trading performance statistics
+  - `GET /metrics/performance` - Overall performance metrics
+    - Query parameters: `model`, `start_date`, `end_date`
+    - Returns:
+      - Total return, annualized return
+      - Sharpe ratio (risk-adjusted return)
+      - Sortino ratio (downside risk-adjusted)
+      - Calmar ratio (return/max drawdown)
+      - Information ratio
+      - Alpha and beta (vs. NASDAQ 100 benchmark)
+  - `GET /metrics/risk` - Risk metrics
+    - Maximum drawdown (peak-to-trough decline)
+    - Value at Risk (VaR) at 95% and 99% confidence
+    - Conditional VaR (CVaR/Expected Shortfall)
+    - Volatility (daily, annualized)
+    - Downside deviation
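As a rough illustration of two of the metrics listed above, the sketch below computes maximum drawdown and an annualized Sharpe ratio with the standard library. The function names, the 252-trading-day annualization factor, and the zero default risk-free rate are assumptions for the example, not decisions recorded in this roadmap.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def max_drawdown(values: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the peak (0.0 if none).

    Assumes a non-empty series of positive portfolio values.
    """
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                     # highest value seen so far
        worst = max(worst, (peak - value) / peak)   # decline from that peak
    return worst


def annualized_sharpe(
    daily_returns: Sequence[float],
    risk_free_daily: float = 0.0,
    periods_per_year: int = 252,
) -> float:
    """Mean excess return over its sample standard deviation, annualized.

    Needs at least two returns with non-zero variance.
    """
    excess = [r - risk_free_daily for r in daily_returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)
```

For the value series 100, 120, 90, 110, 80 the peak is 120 and the lowest later value is 80, so the drawdown is 40/120, or one third.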
+
+#### Win/Loss Analysis
+- **Trade Quality Metrics** - Analyze trade outcomes
+  - `GET /metrics/trades` - Trade statistics
+    - Win rate (% profitable trades)
+    - Average win vs. average loss
+    - Profit factor (gross profit / gross loss)
+    - Largest win/loss
+    - Win/loss streaks
+    - Expectancy (average $ per trade)
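The trade statistics above follow directly from a list of realized profit/loss values per trade. The sketch below is one possible formulation; the function name `trade_stats` and the returned key names are hypothetical, and break-even trades are counted as neither wins nor losses.

```python
from typing import Dict, Sequence


def trade_stats(pnls: Sequence[float]) -> Dict[str, float]:
    """Summarize realized profit/loss per trade."""
    if not pnls:
        raise ValueError("at least one trade is required")
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p < 0]
    gross_profit = sum(wins)
    gross_loss = -sum(losses)  # reported as a positive amount
    return {
        "win_rate": len(wins) / len(pnls),
        "avg_win": gross_profit / len(wins) if wins else 0.0,
        "avg_loss": gross_loss / len(losses) if losses else 0.0,
        # With no losing trades the ratio is unbounded.
        "profit_factor": gross_profit / gross_loss if gross_loss else float("inf"),
        "expectancy": sum(pnls) / len(pnls),  # average profit/loss per trade
    }
```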
+
+#### Comparison & Benchmarking
+- **Model Comparison** - Compare multiple models
+  - `GET /metrics/compare` - Side-by-side comparison
+    - Query parameters: `models[]`, `start_date`, `end_date`
+    - Returns: all metrics for specified models
+    - Ranking by various metrics
+  - `GET /metrics/benchmark` - Compare to NASDAQ 100
+    - Outperformance/underperformance
+    - Correlation with market
+    - Beta calculation
+
+#### Time Series Metrics
+- **Rolling Performance** - Metrics over time
+  - `GET /metrics/timeseries` - Performance evolution
+    - Query parameters: `model`, `metric`, `window` (days)
+    - Returns: daily/weekly/monthly metric values
+    - Examples: rolling Sharpe ratio, rolling volatility
+    - Useful for detecting strategy degradation
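A rolling metric is the same statistic recomputed over a sliding window. As a minimal sketch, rolling volatility could be computed as below; the function name is hypothetical and the window is counted in observations, assuming one return per trading day.

```python
from statistics import stdev
from typing import List, Sequence


def rolling_volatility(returns: Sequence[float], window: int) -> List[float]:
    """Sample standard deviation of returns over each full window of `window` points."""
    if window < 2:
        raise ValueError("window must cover at least two observations")
    # One value per window ending at index i - 1; shorter inputs yield an empty list.
    return [stdev(returns[i - window:i]) for i in range(window, len(returns) + 1)]
```

A sustained rise in this series, or a sustained fall in a rolling Sharpe ratio computed the same way, is the kind of signal the degradation check would look for.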
+
+#### Benefits
+- Quantify agent performance objectively
+- Identify risk characteristics
+- Compare effectiveness of different AI models
+- Detect performance changes over time
+
+### v1.3.0 - Data Management API (Planned)
+
+**Focus:** Price data operations and coverage management
+
+#### Data Coverage Endpoints
+- **Price Data Management** - Control and monitor price data
+  - `GET /data/coverage` - Check available data
+    - Query parameters: `symbol`, `start_date`, `end_date`
+    - Returns: date ranges with data per symbol
+    - Identify gaps in historical data
+    - Show last refresh date per symbol
+  - `GET /data/symbols` - List all available symbols
+    - NASDAQ 100 constituents
+    - Data availability per symbol
+    - Metadata (company name, sector)
+
+#### Data Operations
+- **Download & Refresh** - Manage price data updates
+  - `POST /data/download` - Trigger data download
+    - Query parameters: `symbol`, `start_date`, `end_date`
+    - Async operation (returns job_id)
+    - Respects Alpha Vantage rate limits
+    - Updates existing data or fills gaps
+  - `GET /data/download/status` - Check download progress
+    - Query parameters: `job_id`
+    - Returns: progress, completed symbols, errors
+  - `POST /data/refresh` - Update to latest available
+    - Automatically downloads new data for all symbols
+    - Scheduled refresh capability
+
+#### Data Cleanup
+- **Data Management Operations** - Clean and maintain data
+  - `DELETE /data/range` - Remove data for date range
+    - Query parameters: `symbol`, `start_date`, `end_date`
+    - Use case: remove corrupted data before re-download
+    - Validation: prevent deletion of in-use data
+  - `POST /data/validate` - Check data integrity
+    - Verify no missing dates (weekday gaps)
+    - Check for outliers/anomalies
+    - Returns: validation report with issues
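The weekday-gap check could be sketched as follows. The helper `missing_weekdays` is hypothetical, and it treats every Monday to Friday as an expected trading day, so market holidays would show up as gaps unless a holiday calendar is added.

```python
from datetime import date, timedelta
from typing import Iterable, List


def missing_weekdays(available: Iterable[date], start: date, end: date) -> List[date]:
    """List weekdays in [start, end] that have no price data."""
    have = set(available)
    missing: List[date] = []
    current = start
    while current <= end:
        # Weekends are never expected to have data, so only weekdays are checked.
        if current.weekday() < 5 and current not in have:
            missing.append(current)
        current += timedelta(days=1)
    return missing
```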
+
+#### Rate Limit Management
+- **API Quota Tracking** - Monitor external API usage
+  - `GET /data/quota` - Check Alpha Vantage quota
+    - Calls remaining today
+    - Reset time
+    - Historical usage pattern
+
+#### Benefits
+- Visibility into data coverage
+- Control over data refresh timing
+- Ability to fill gaps in historical data
+- Prevent simulations with incomplete data
+
+### v1.4.0 - Web Dashboard UI (Planned)
+
+**Focus:** Browser-based interface for monitoring and control
+
+#### Core Dashboard
+- **Web UI Foundation** - Modern web interface
+  - Technology stack:
+    - Frontend: React or Svelte (lightweight, modern)
+    - Charts: Recharts or Chart.js
+    - Real-time: Server-Sent Events (SSE) for updates
+    - Styling: Tailwind CSS for responsive design
+  - Deployment: Served alongside API (single container)
+  - URL structure: `/` (UI), `/api/` (API endpoints)
+
+#### Job Management View
+- **Simulation Control** - Monitor and start simulations
+  - Dashboard home page:
+    - Active jobs with real-time progress
+    - Recent completed jobs
+    - Failed jobs with error messages
+  - Start simulation form:
+    - Model selection (checkboxes)
+    - Date picker for target_date
+    - Force re-simulate toggle
+    - Submit button → launches job
+  - Job detail view:
+    - Live log streaming (SSE)
+    - Per-model progress
+    - Cancel job button
+    - Download logs
+
+#### Results Visualization
+- **Performance Charts** - Visual analysis of results
+  - Portfolio value over time (line chart)
+    - Multiple models on same chart
+    - Zoom/pan interactions
+    - Hover tooltips with daily values
+  - Cumulative returns comparison (line chart)
+    - Percentage-based for fair comparison
+    - Benchmark overlay (NASDAQ 100)
+  - Position timeline (stacked area chart)
+    - Show holdings composition over time
+    - Click to filter by symbol
+  - Trade log table:
+    - Sortable columns (date, symbol, action, amount)
+    - Filters (model, date range, symbol)
+    - Pagination for large histories
+
+#### Configuration Management
+- **Settings & Config** - Manage simulation settings
+  - Model configuration editor:
+    - Add/remove models
+    - Edit base URLs and API keys (masked)
+    - Enable/disable models
+    - Save to config file
+  - Data coverage visualization:
+    - Calendar heatmap showing data availability
+    - Identify gaps in price data
+    - Quick link to download missing dates
+
+#### Real-Time Updates
+- **Live Monitoring** - SSE-based updates
+  - Job status changes
+  - Progress percentage updates
+  - New trade notifications
+  - Error alerts
+
+#### Benefits
+- User-friendly interface (no curl commands needed)
+- Visual feedback for long-running simulations
+- Easy model comparison through charts
+- Quick access to results without API queries
+
+### v1.5.0 - Advanced Configuration & Customization (Planned)
+
+**Focus:** Enhanced configuration options and extensibility
+
+#### Agent Configuration
+- **Advanced Agent Settings** - Fine-tune agent behavior
+  - Per-model configuration overrides:
+    - Custom system prompts
+    - Different max_steps per model
+    - Model-specific retry policies
+    - Temperature/top_p settings
+  - Trading constraints:
+    - Maximum position sizes per stock
+    - Sector exposure limits
+    - Cash reserve requirements
+    - Maximum trades per day
+  - Risk management rules:
+    - Stop-loss thresholds
+    - Take-profit targets
+    - Maximum portfolio concentration
+
+#### Custom Trading Rules
+- **Rule Engine** - Enforce trading constraints
+  - Pre-trade validation hooks:
+    - Check if trade violates constraints
+    - Reject or adjust trades automatically
+  - Post-trade validation:
+    - Ensure position limits respected
+    - Verify portfolio balance
+  - Configurable via JSON rules file
+  - API to query active rules
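A pre-trade validation hook driven by a JSON rules file might look like the sketch below. The rule names `max_position_fraction` and `max_trades_per_day`, the JSON layout, and the `check_trade` signature are all hypothetical; the roadmap does not define a rules schema.

```python
import json
from typing import Any, Dict, List

# Hypothetical rules file content; the real schema is not specified in the roadmap.
RULES_JSON = '{"max_position_fraction": 0.25, "max_trades_per_day": 3}'


def check_trade(
    rules: Dict[str, Any],
    portfolio_value: float,
    position_value_after: float,
    trades_today: int,
) -> List[str]:
    """Return the names of the rules a proposed trade would violate."""
    violations: List[str] = []
    # Position size limit: value of the position after the trade vs. the whole portfolio.
    if position_value_after / portfolio_value > rules["max_position_fraction"]:
        violations.append("max_position_fraction")
    # Daily trade limit: this trade would be number trades_today + 1.
    if trades_today + 1 > rules["max_trades_per_day"]:
        violations.append("max_trades_per_day")
    return violations
```

Returning the list of violated rule names, rather than a bare boolean, leaves the caller free to either reject the trade or adjust it, matching the two options listed above.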
+
+#### Multi-Strategy Support
+- **Strategy Variants** - Run same model with different strategies
+  - Strategy configurations:
+    - Different initial cash amounts
+    - Different universes (e.g., tech stocks only)
+    - Different time periods for same model
+  - Compare strategy effectiveness
+  - A/B testing framework
+
+#### Benefits
+- Greater control over agent behavior
+- Risk management beyond AI decision-making
+- Strategy experimentation and optimization
+- Support for diverse use cases
+
+### v2.0.0 - Advanced Quantitative Modeling (Planned)

 **Focus:** Enable AI agents to create, test, and deploy custom quantitative models

@@ -164,8 +524,14 @@ To propose a new feature:
 - **v0.1.0** - Initial release with batch execution
 - **v0.2.0** - Docker deployment support
 - **v0.3.0** - REST API, on-demand downloads, database storage (current)
- **v0.4.0** - Enhanced simulation management (planned)
- **v0.5.0** - Advanced quantitative modeling (planned)
+- **v0.4.0** - Simplified simulation control (planned)
+- **v1.0.0** - Production stability & validation (planned)
+- **v1.1.0** - Position history & analytics (planned)
+- **v1.2.0** - Performance metrics & analytics (planned)
+- **v1.3.0** - Data management API (planned)
+- **v1.4.0** - Web dashboard UI (planned)
+- **v1.5.0** - Advanced configuration & customization (planned)
+- **v2.0.0** - Advanced quantitative modeling (planned)

 ---