docs: restructure roadmap with v1.0 stability milestone and v1.x features

Major changes:
- Simplified v0.4.0 to focus on smart date-based simulation API with automatic resume
- Added v1.0.0 milestone for production stability, testing, and validation
- Reorganized post-1.0 features into manageable v1.x releases:
  - v1.1.0: Position history & analytics
  - v1.2.0: Performance metrics & analytics
  - v1.3.0: Data management API
  - v1.4.0: Web dashboard UI
  - v1.5.0: Advanced configuration & customization
- Moved quantitative modeling to v2.0.0 (major version bump)

Key improvements:
- v0.4.0 now has single /simulate/to-date endpoint with idempotent behavior
- Explicit force_resimulate flag prevents accidental re-simulation
- v1.0.0 includes comprehensive quality gates and production readiness checklist
- Each v1.x release focuses on specific domain for easier implementation
This commit is contained in:
2025-11-01 12:23:11 -04:00
parent 0e739a9720
commit 4ac89f1724

View File

@@ -4,81 +4,441 @@ This document outlines planned features and improvements for the AI-Trader proje
## Release Planning
### v0.4.0 - Enhanced Simulation Management (Planned)
### v0.4.0 - Simplified Simulation Control (Planned)
**Focus:** Improved simulation control, resume capabilities, and performance analysis
**Focus:** Streamlined date-based simulation API with automatic resume from last completed date
#### Simulation Resume & Continuation
- **Resume from Last Completed Date** - API to continue simulations without re-running completed dates
- `POST /simulate/resume` - Resume last incomplete job or start from last completed date
- `POST /simulate/continue` - Extend existing simulation with new date range
- Query parameters to specify which model(s) to continue
- Automatic detection of last completed date per model
- Validation to prevent overlapping simulations
- Support for extending date ranges forward in time
- Use cases:
- Daily simulation updates (add today's date to existing run)
- Recovering from failed jobs (resume from interruption point)
- Incremental backtesting (extend historical analysis)
#### Core Simulation API
- **Smart Date-Based Simulation** - Simple API for running simulations to a target date
- `POST /simulate/to-date` - Run simulation up to specified date
- Request: `{"target_date": "2025-01-31", "models": ["model1", "model2"]}`
- Automatically starts from last completed date in position.jsonl
- Skips already-simulated dates by default (idempotent)
- Optional `force_resimulate: true` flag to re-run completed dates
- Returns: job_id, date range to be simulated, models included
- `GET /simulate/status/{model_name}` - Get last completed date and available date ranges
- Returns: last_simulated_date, next_available_date, data_coverage
- Behavior:
- If no position.jsonl exists: starts from initial_date in config or first available data
- If position.jsonl exists: continues from last completed date + 1 day
- Validates target_date has available price data
- Skips weekends automatically
- Prevents accidental re-simulation without explicit flag
#### Position History & Analysis
- **Position History Tracking** - Track position changes over time
- Query endpoint: `GET /positions/history?model=<name>&start_date=<date>&end_date=<date>`
- Timeline view of all trades and position changes
- Calculate holding periods and turnover rates
- Support for position snapshots at specific dates
#### Benefits
- **Simplicity** - Single endpoint for "simulate to this date"
- **Idempotent** - Safe to call repeatedly, won't duplicate work
- **Incremental Updates** - Easy daily simulation updates: `POST /simulate/to-date {"target_date": "today"}`
- **Explicit Re-simulation** - Require `force_resimulate` flag to prevent accidental data overwrites
- **Automatic Resume** - Handles crash recovery transparently
#### Performance Metrics
- **Advanced Performance Analytics** - Calculate standard trading metrics
- Sharpe ratio, Sortino ratio, maximum drawdown
- Win rate, average win/loss, profit factor
- Volatility and beta calculations
- Risk-adjusted returns
- Comparison across models
#### Example Usage
```bash
# Initial backtest (Jan 1 - Jan 31)
curl -X POST http://localhost:5000/simulate/to-date \
-d '{"target_date": "2025-01-31", "models": ["gpt-4"]}'
#### Data Management
- **Price Data Management API** - Endpoints for price data operations
- `GET /data/coverage` - Check date ranges available per symbol
- `POST /data/download` - Trigger manual price data downloads
- `GET /data/status` - Check download progress and rate limits
- `DELETE /data/range` - Remove price data for specific date ranges
# Daily update (simulate new trading day)
curl -X POST http://localhost:5000/simulate/to-date \
-d '{"target_date": "2025-02-01", "models": ["gpt-4"]}'
#### Web UI
- **Dashboard Interface** - Web-based monitoring and control interface
- Job management dashboard
- View active, pending, and completed jobs
- Start new simulations with form-based configuration
- Monitor job progress in real-time
- Cancel running jobs
- Results visualization
- Performance charts (P&L over time, cumulative returns)
- Position history timeline
- Model comparison views
- Trade log explorer with filtering
- Configuration management
- Model configuration editor
- Date range selection with calendar picker
- Price data coverage visualization
- Technical implementation
- Modern frontend framework (React, Vue.js, or Svelte)
- Real-time updates via WebSocket or SSE
- Responsive design for mobile access
- Chart library (Plotly.js, Chart.js, or Recharts)
- Served alongside API (single container deployment)
# Check status
curl http://localhost:5000/simulate/status/gpt-4
#### Development Infrastructure
- **Migration to uv Package Manager** - Modern Python package management
- Replace pip with uv for dependency management
- Create pyproject.toml with project metadata and dependencies
- Update Dockerfile to use uv for faster, more reliable builds
- Update development documentation and workflows
- Benefits:
- 10-100x faster dependency resolution and installation
- Better dependency locking and reproducibility
- Unified tool for virtual environments and package management
- Drop-in pip replacement with improved UX
# Force re-simulation (e.g., after config change)
curl -X POST http://localhost:5000/simulate/to-date \
-d '{"target_date": "2025-01-31", "models": ["gpt-4"], "force_resimulate": true}'
```
### v0.5.0 - Advanced Quantitative Modeling (Planned)
#### Technical Implementation
- Modify `main.py` and `api/app.py` to support target date parameter
- Update `BaseAgent.get_trading_dates()` to detect last completed date from position.jsonl
- Add validation: target_date must have price data available
- Add `force_resimulate` flag handling: clear position.jsonl range if enabled
- Preserve existing `/simulate` endpoint for backward compatibility
### v1.0.0 - Production Stability & Validation (Planned)
**Focus:** Comprehensive testing, documentation, and production readiness
#### Testing & Validation
- **Comprehensive Test Suite** - Full coverage of core functionality
- Unit tests for all agent components
- BaseAgent methods (initialize, run_trading_session, get_trading_dates)
- Position management and tracking
- Date range handling and validation
- MCP tool integration
- Integration tests for API endpoints
- All /simulate endpoints with various configurations
- /jobs endpoints (status, cancel, results)
- /models endpoint for listing available models
- Error handling and validation
- End-to-end simulation tests
- Multi-day trading simulations with mock data
- Multiple concurrent model execution
- Resume functionality after interruption
- Force re-simulation scenarios
- Anti-look-ahead validation tests
- Verify price data temporal boundaries
- Verify search results date filtering
- Confirm no future data leakage in system prompts
- Test coverage target: >80% code coverage
- Continuous Integration: GitHub Actions workflow for automated testing
#### Stability & Error Handling
- **Robust Error Recovery** - Handle failures gracefully
- Retry logic for transient API failures (already implemented, validate)
- Graceful degradation when MCP services are unavailable
- Database connection pooling and error handling
- File system error handling (disk full, permission errors)
- Comprehensive error messages with troubleshooting guidance
- Logging improvements:
- Structured logging with consistent format
- Log rotation and size management
- Error classification (user error vs. system error)
- Debug mode for detailed diagnostics
#### Performance & Scalability
- **Performance Optimization** - Ensure efficient resource usage
- Database query optimization and indexing
- Price data caching and efficient lookups
- Concurrent simulation handling validation
- Memory usage profiling and optimization
- Long-running simulation stability testing (30+ day ranges)
- Load testing: multiple concurrent API requests
- Resource limits and rate limiting considerations
#### Documentation & Examples
- **Production-Ready Documentation** - Complete user and developer guides
- API documentation improvements:
- OpenAPI/Swagger specification
- Interactive API documentation (Swagger UI)
- Example requests/responses for all endpoints
- Error response documentation
- User guides:
- Quickstart guide refinement
- Common workflows and recipes
- Troubleshooting guide expansion
- Best practices for model configuration
- Developer documentation:
- Architecture deep-dive
- Contributing guidelines
- Custom agent development guide
- MCP tool development guide
- Example configurations:
- Various model providers (OpenAI, Anthropic, local models)
- Different trading strategies
- Development vs. production setups
#### Security & Best Practices
- **Security Hardening** - Production security review
- API authentication/authorization review (if applicable)
- API key management best practices documentation
- Input validation and sanitization review
- SQL injection prevention validation
- Rate limiting for public deployments
- Security considerations documentation
- Dependency vulnerability scanning
- Docker image security scanning
#### Release Readiness
- **Production Deployment Support** - Everything needed for production use
- Production deployment checklist
- Health check endpoints improvements
- Monitoring and observability guidance
- Key metrics to track (job success rate, execution time, error rates)
- Integration with monitoring systems (Prometheus, Grafana)
- Alerting recommendations
- Backup and disaster recovery guidance
- Database migration strategy
- Upgrade path documentation (v0.x to v1.0)
- Version compatibility guarantees going forward
#### Quality Gates for v1.0.0 Release
All of the following must be met before v1.0.0 release:
- [ ] Test suite passes with >80% code coverage
- [ ] All critical and high-priority bugs resolved
- [ ] API documentation complete (OpenAPI spec)
- [ ] Production deployment guide complete
- [ ] Security review completed
- [ ] Performance benchmarks established
- [ ] Docker image published and tested
- [ ] Migration guide from v0.3.0 available
- [ ] At least 2 weeks of community testing (beta period)
- [ ] Zero known data integrity issues
### v1.1.0 - Position History & Analytics (Planned)
**Focus:** Track and analyze trading behavior over time
#### Position History API
- **Position Tracking Endpoints** - Query historical position changes
- `GET /positions/history` - Get position timeline for model(s)
- Query parameters: `model`, `start_date`, `end_date`, `symbol`
- Returns: chronological list of all position changes
- Pagination support for long histories
- `GET /positions/snapshot` - Get positions at specific date
- Query parameters: `model`, `date`
- Returns: portfolio state at end of trading day
- `GET /positions/summary` - Get position statistics
- Holdings duration (average, min, max)
- Turnover rate (daily, weekly, monthly)
- Most/least traded symbols
- Trading frequency patterns
#### Trade Analysis
- **Trade-Level Insights** - Analyze individual trades
- `GET /trades` - List all trades with filtering
- Filter by: model, date range, symbol, action (buy/sell)
- Sort by: date, profit/loss, volume
- `GET /trades/{trade_id}` - Get trade details
- Entry/exit prices and dates
- Holding period
- Realized profit/loss
- Context (what else was traded that day)
- Trade classification:
- Round trips (buy + sell of same stock)
- Partial positions (multiple entries/exits)
- Long-term holds vs. day trades
#### Benefits
- Understand agent trading patterns and behavior
- Identify strategy characteristics (momentum, mean reversion, etc.)
- Debug unexpected trading decisions
- Compare trading styles across models
### v1.2.0 - Performance Metrics & Analytics (Planned)
**Focus:** Calculate standard financial performance metrics
#### Risk-Adjusted Performance
- **Performance Metrics API** - Calculate trading performance statistics
- `GET /metrics/performance` - Overall performance metrics
- Query parameters: `model`, `start_date`, `end_date`
- Returns:
- Total return, annualized return
- Sharpe ratio (risk-adjusted return)
- Sortino ratio (downside risk-adjusted)
- Calmar ratio (return/max drawdown)
- Information ratio
- Alpha and beta (vs. NASDAQ 100 benchmark)
- `GET /metrics/risk` - Risk metrics
- Maximum drawdown (peak-to-trough decline)
- Value at Risk (VaR) at 95% and 99% confidence
- Conditional VaR (CVaR/Expected Shortfall)
- Volatility (daily, annualized)
- Downside deviation
#### Win/Loss Analysis
- **Trade Quality Metrics** - Analyze trade outcomes
- `GET /metrics/trades` - Trade statistics
- Win rate (% profitable trades)
- Average win vs. average loss
- Profit factor (gross profit / gross loss)
- Largest win/loss
- Win/loss streaks
- Expectancy (average $ per trade)
#### Comparison & Benchmarking
- **Model Comparison** - Compare multiple models
- `GET /metrics/compare` - Side-by-side comparison
- Query parameters: `models[]`, `start_date`, `end_date`
- Returns: all metrics for specified models
- Ranking by various metrics
- `GET /metrics/benchmark` - Compare to NASDAQ 100
- Outperformance/underperformance
- Correlation with market
- Beta calculation
#### Time Series Metrics
- **Rolling Performance** - Metrics over time
- `GET /metrics/timeseries` - Performance evolution
- Query parameters: `model`, `metric`, `window` (days)
- Returns: daily/weekly/monthly metric values
- Examples: rolling Sharpe ratio, rolling volatility
- Useful for detecting strategy degradation
#### Benefits
- Quantify agent performance objectively
- Identify risk characteristics
- Compare effectiveness of different AI models
- Detect performance changes over time
### v1.3.0 - Data Management API (Planned)
**Focus:** Price data operations and coverage management
#### Data Coverage Endpoints
- **Price Data Management** - Control and monitor price data
- `GET /data/coverage` - Check available data
- Query parameters: `symbol`, `start_date`, `end_date`
- Returns: date ranges with data per symbol
- Identify gaps in historical data
- Show last refresh date per symbol
- `GET /data/symbols` - List all available symbols
- NASDAQ 100 constituents
- Data availability per symbol
- Metadata (company name, sector)
#### Data Operations
- **Download & Refresh** - Manage price data updates
- `POST /data/download` - Trigger data download
- Query parameters: `symbol`, `start_date`, `end_date`
- Async operation (returns job_id)
- Respects Alpha Vantage rate limits
- Updates existing data or fills gaps
- `GET /data/download/status` - Check download progress
- Query parameters: `job_id`
- Returns: progress, completed symbols, errors
- `POST /data/refresh` - Update to latest available
- Automatically downloads new data for all symbols
- Scheduled refresh capability
#### Data Cleanup
- **Data Management Operations** - Clean and maintain data
- `DELETE /data/range` - Remove data for date range
- Query parameters: `symbol`, `start_date`, `end_date`
- Use case: remove corrupted data before re-download
- Validation: prevent deletion of in-use data
- `POST /data/validate` - Check data integrity
- Verify no missing dates (weekday gaps)
- Check for outliers/anomalies
- Returns: validation report with issues
#### Rate Limit Management
- **API Quota Tracking** - Monitor external API usage
- `GET /data/quota` - Check Alpha Vantage quota
- Calls remaining today
- Reset time
- Historical usage pattern
#### Benefits
- Visibility into data coverage
- Control over data refresh timing
- Ability to fill gaps in historical data
- Prevent simulations with incomplete data
### v1.4.0 - Web Dashboard UI (Planned)
**Focus:** Browser-based interface for monitoring and control
#### Core Dashboard
- **Web UI Foundation** - Modern web interface
- Technology stack:
- Frontend: React or Svelte (lightweight, modern)
- Charts: Recharts or Chart.js
- Real-time: Server-Sent Events (SSE) for updates
- Styling: Tailwind CSS for responsive design
- Deployment: Served alongside API (single container)
- URL structure: `/` (UI), `/api/` (API endpoints)
#### Job Management View
- **Simulation Control** - Monitor and start simulations
- Dashboard home page:
- Active jobs with real-time progress
- Recent completed jobs
- Failed jobs with error messages
- Start simulation form:
- Model selection (checkboxes)
- Date picker for target_date
- Force re-simulate toggle
- Submit button → launches job
- Job detail view:
- Live log streaming (SSE)
- Per-model progress
- Cancel job button
- Download logs
#### Results Visualization
- **Performance Charts** - Visual analysis of results
- Portfolio value over time (line chart)
- Multiple models on same chart
- Zoom/pan interactions
- Hover tooltips with daily values
- Cumulative returns comparison (line chart)
- Percentage-based for fair comparison
- Benchmark overlay (NASDAQ 100)
- Position timeline (stacked area chart)
- Show holdings composition over time
- Click to filter by symbol
- Trade log table:
- Sortable columns (date, symbol, action, amount)
- Filters (model, date range, symbol)
- Pagination for large histories
#### Configuration Management
- **Settings & Config** - Manage simulation settings
- Model configuration editor:
- Add/remove models
- Edit base URLs and API keys (masked)
- Enable/disable models
- Save to config file
- Data coverage visualization:
- Calendar heatmap showing data availability
- Identify gaps in price data
- Quick link to download missing dates
#### Real-Time Updates
- **Live Monitoring** - SSE-based updates
- Job status changes
- Progress percentage updates
- New trade notifications
- Error alerts
#### Benefits
- User-friendly interface (no curl commands needed)
- Visual feedback for long-running simulations
- Easy model comparison through charts
- Quick access to results without API queries
### v1.5.0 - Advanced Configuration & Customization (Planned)
**Focus:** Enhanced configuration options and extensibility
#### Agent Configuration
- **Advanced Agent Settings** - Fine-tune agent behavior
- Per-model configuration overrides:
- Custom system prompts
- Different max_steps per model
- Model-specific retry policies
- Temperature/top_p settings
- Trading constraints:
- Maximum position sizes per stock
- Sector exposure limits
- Cash reserve requirements
- Maximum trades per day
- Risk management rules:
- Stop-loss thresholds
- Take-profit targets
- Maximum portfolio concentration
#### Custom Trading Rules
- **Rule Engine** - Enforce trading constraints
- Pre-trade validation hooks:
- Check if trade violates constraints
- Reject or adjust trades automatically
- Post-trade validation:
- Ensure position limits respected
- Verify portfolio balance
- Configurable via JSON rules file
- API to query active rules
#### Multi-Strategy Support
- **Strategy Variants** - Run same model with different strategies
- Strategy configurations:
- Different initial cash amounts
- Different universes (e.g., tech stocks only)
- Different time periods for same model
- Compare strategy effectiveness
- A/B testing framework
#### Benefits
- Greater control over agent behavior
- Risk management beyond AI decision-making
- Strategy experimentation and optimization
- Support for diverse use cases
### v2.0.0 - Advanced Quantitative Modeling (Planned)
**Focus:** Enable AI agents to create, test, and deploy custom quantitative models
@@ -164,8 +524,14 @@ To propose a new feature:
- **v0.1.0** - Initial release with batch execution
- **v0.2.0** - Docker deployment support
- **v0.3.0** - REST API, on-demand downloads, database storage (current)
- **v0.4.0** - Enhanced simulation management (planned)
- **v0.5.0** - Advanced quantitative modeling (planned)
- **v0.4.0** - Simplified simulation control (planned)
- **v1.0.0** - Production stability & validation (planned)
- **v1.1.0** - Position history & analytics (planned)
- **v1.2.0** - Performance metrics & analytics (planned)
- **v1.3.0** - Data management API (planned)
- **v1.4.0** - Web dashboard UI (planned)
- **v1.5.0** - Advanced configuration & customization (planned)
- **v2.0.0** - Advanced quantitative modeling (planned)
---