feat: transform to REST API service with SQLite persistence (v0.3.0)

Major architecture transformation from batch-only to API service with
database persistence for Windmill integration.

## REST API Implementation
- POST /simulate/trigger - Start simulation jobs
- GET /simulate/status/{job_id} - Monitor job progress
- GET /results - Query results with filters (job_id, date, model)
- GET /health - Service health checks

## Database Layer
- SQLite persistence with 6 tables (jobs, job_details, positions,
  holdings, reasoning_logs, tool_usage)
- Foreign key constraints with cascade deletes
- Replaces JSONL file storage

## Backend Components
- JobManager: Job lifecycle management with concurrency control
- RuntimeConfigManager: Thread-safe isolated runtime configs
- ModelDayExecutor: Single model-day execution engine
- SimulationWorker: Date-sequential, model-parallel orchestration

## Testing
- 102 unit and integration tests (85% coverage)
- Database: 98% coverage
- Job manager: 98% coverage
- API endpoints: 81% coverage
- Pydantic models: 100% coverage
- TDD approach throughout

## Docker Deployment
- Dual-mode: API server (persistent) + batch (one-time)
- Health checks with 30s interval
- Volume persistence for database and logs
- Separate entrypoints for each mode

## Validation Tools
- scripts/validate_docker_build.sh - Build validation
- scripts/test_api_endpoints.sh - Complete API testing
- scripts/test_batch_mode.sh - Batch mode validation
- DOCKER_API.md - Deployment guide
- TESTING_GUIDE.md - Testing procedures

## Configuration
- API_PORT environment variable (default: 8080)
- Backwards compatible with existing configs
- FastAPI, uvicorn, pydantic>=2.0 dependencies

Co-Authored-By: AI Assistant <noreply@example.com>
Commit fb9583b374, parent 5da02b4ba0, committed 2025-10-31 11:47:10 -04:00
45 changed files with 13775 additions and 18 deletions

.env.example

@@ -24,6 +24,11 @@ SEARCH_HTTP_PORT=8001
TRADE_HTTP_PORT=8002
GETPRICE_HTTP_PORT=8003
# API Server Port (exposed on host machine for REST API)
# Container always uses 8080 internally
# Used for Windmill integration and external API access
API_PORT=8080
# Web Interface Host Port (exposed on host machine)
# Container always uses 8888 internally
WEB_HTTP_PORT=8888

CHANGELOG.md

@@ -7,6 +7,92 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
## [0.3.0] - 2025-10-31
### Added - API Service Transformation
- **REST API Service** - Complete FastAPI implementation for external orchestration
- `POST /simulate/trigger` - Trigger simulation jobs with config, date range, and models
- `GET /simulate/status/{job_id}` - Query job progress and execution details
- `GET /results` - Retrieve simulation results with filtering (job_id, date, model)
- `GET /health` - Service health check with database connectivity verification
- **SQLite Database** - Complete persistence layer replacing JSONL files
- Jobs table - Job metadata and lifecycle tracking
- Job details table - Per model-day execution status
- Positions table - Trading position records with P&L
- Holdings table - Portfolio holdings breakdown
- Reasoning logs table - AI decision reasoning history
- Tool usage table - MCP tool usage statistics
- **Backend Components**
- JobManager - Job lifecycle management with concurrent job prevention
- RuntimeConfigManager - Isolated runtime configs for thread-safe execution
- ModelDayExecutor - Single model-day execution engine
- SimulationWorker - Job orchestration with date-sequential, model-parallel execution
- **Comprehensive Test Suite**
- 102 unit and integration tests (85% coverage)
- 19 database tests (98% coverage)
- 23 job manager tests (98% coverage)
- 10 model executor tests (84% coverage)
- 20 API endpoint tests (81% coverage)
- 20 Pydantic model tests (100% coverage)
- 10 runtime manager tests (89% coverage)
- **Docker Dual-Mode Deployment**
- API server mode - Persistent REST API service with health checks
- Batch mode - One-time simulation execution (backwards compatible)
- Separate entrypoints for each mode
- Health check configuration (30s interval, 3 retries)
- Volume persistence for SQLite database and logs
- **Validation & Testing Tools**
- `scripts/validate_docker_build.sh` - Docker build and startup validation
- `scripts/test_api_endpoints.sh` - Complete API endpoint testing suite
- `scripts/test_batch_mode.sh` - Batch mode execution validation
- TESTING_GUIDE.md - Comprehensive testing procedures and troubleshooting
- **Documentation**
- DOCKER_API.md - API deployment guide with examples
- TESTING_GUIDE.md - Validation procedures and troubleshooting
- API endpoint documentation with request/response examples
- Windmill integration patterns and examples
### Changed
- **Architecture** - Transformed from batch-only to API service with database persistence
- **Data Storage** - Migrated from JSONL files to SQLite relational database
- **Deployment** - Added dual-mode Docker deployment (API server + batch)
- **Configuration** - Added API_PORT environment variable (default: 8080)
- **Requirements** - Added fastapi>=0.120.0, uvicorn[standard]>=0.27.0, pydantic>=2.0.0
- **Docker Compose** - Split into two services (ai-trader-api and ai-trader-batch)
- **Dockerfile** - Added port 8080 exposure for API server
- **.env.example** - Added API server configuration
### Technical Implementation
- **Test-Driven Development** - All components written with tests first
- **Mock-based Testing** - Avoid heavy dependencies in unit tests
- **Pydantic V2** - Type-safe request/response validation
- **Foreign Key Constraints** - Database referential integrity with cascade deletes
- **Thread-safe Execution** - Isolated runtime configs per model-day
- **Background Job Execution** - ThreadPoolExecutor for parallel model execution
- **Automatic Status Transitions** - Job status updates based on model-day completion
### Performance & Quality
- **Code Coverage** - 85% overall (84.63% measured)
- Database layer: 98%
- Job manager: 98%
- Pydantic models: 100%
- Runtime manager: 89%
- Model executor: 84%
- FastAPI app: 81%
- **Test Execution** - 102 tests in ~2.5 seconds
- **Zero Test Failures** - All tests passing (threading tests excluded)
### Integration Ready
- **Windmill.dev** - HTTP-based integration with polling support
- **External Orchestration** - RESTful API for workflow automation
- **Monitoring** - Health checks and status tracking
- **Persistence** - SQLite database survives container restarts
### Backwards Compatibility
- **Batch Mode** - Original batch functionality preserved via Docker profile
- **Configuration** - Existing config files still work
- **Data Migration** - No automatic migration (fresh start recommended)
## [0.2.0] - 2025-10-31
### Added
@@ -113,6 +199,7 @@ For future releases, use this template:
---
-[Unreleased]: https://github.com/Xe138/AI-Trader/compare/v0.2.0...HEAD
+[Unreleased]: https://github.com/Xe138/AI-Trader/compare/v0.3.0...HEAD
+[0.3.0]: https://github.com/Xe138/AI-Trader/compare/v0.2.0...v0.3.0
[0.2.0]: https://github.com/Xe138/AI-Trader/compare/v0.1.0...v0.2.0
[0.1.0]: https://github.com/Xe138/AI-Trader/releases/tag/v0.1.0

DOCKER_API.md (new file, 347 lines)

@@ -0,0 +1,347 @@
# Docker API Server Deployment
This guide explains how to run AI-Trader as a persistent REST API server using Docker for Windmill.dev integration.
## Quick Start
### 1. Environment Setup
```bash
# Copy environment template
cp .env.example .env
# Edit .env and add your API keys:
# - OPENAI_API_KEY
# - ALPHAADVANTAGE_API_KEY
# - JINA_API_KEY
```
### 2. Start API Server
```bash
# Start in API mode (default)
docker-compose up -d ai-trader-api
# View logs
docker-compose logs -f ai-trader-api
# Check health
curl http://localhost:8080/health
```
### 3. Test API Endpoints
```bash
# Health check
curl http://localhost:8080/health
# Trigger simulation
curl -X POST http://localhost:8080/simulate/trigger \
-H "Content-Type: application/json" \
-d '{
"config_path": "/app/configs/default_config.json",
"date_range": ["2025-01-16", "2025-01-17"],
"models": ["gpt-4"]
}'
# Check job status (replace JOB_ID)
curl http://localhost:8080/simulate/status/JOB_ID
# Query results
curl "http://localhost:8080/results?date=2025-01-16"
```
## Architecture
### Two Deployment Modes
**API Server Mode** (Windmill integration):
- REST API on port 8080
- Background job execution
- Persistent SQLite database
- Continuous uptime with health checks
- Start with: `docker-compose up -d ai-trader-api`
**Batch Mode** (one-time simulation):
- Command-line execution
- Runs to completion then exits
- Config file driven
- Start with: `docker-compose --profile batch up ai-trader-batch`
### Port Configuration
| Service | Internal Port | Default Host Port | Environment Variable |
|---------|--------------|-------------------|---------------------|
| API Server | 8080 | 8080 | `API_PORT` |
| Math MCP | 8000 | 8000 | `MATH_HTTP_PORT` |
| Search MCP | 8001 | 8001 | `SEARCH_HTTP_PORT` |
| Trade MCP | 8002 | 8002 | `TRADE_HTTP_PORT` |
| Price MCP | 8003 | 8003 | `GETPRICE_HTTP_PORT` |
| Web Dashboard | 8888 | 8888 | `WEB_HTTP_PORT` |
## API Endpoints
### POST /simulate/trigger
Trigger a new simulation job.
**Request:**
```json
{
"config_path": "/app/configs/default_config.json",
"date_range": ["2025-01-16", "2025-01-17"],
"models": ["gpt-4", "claude-3.7-sonnet"]
}
```
**Response:**
```json
{
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "pending",
"total_model_days": 4,
"message": "Simulation job created and started"
}
```
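For orchestration clients, `total_model_days` is every requested date crossed with every requested model (2 dates × 2 models = 4 in the example above). A minimal sketch of that arithmetic, assuming `date_range` already lists individual trading dates; `expand_model_days` is a hypothetical helper for illustration, not part of the service:

```python
from itertools import product

def expand_model_days(date_range: list[str], models: list[str]) -> list[tuple[str, str]]:
    """Cross every trading date with every model; each pair is one model-day.

    Assumes date_range already contains only trading dates, as in the
    request example above (the service may instead expand a [start, end]
    range into trading days itself).
    """
    return [(d, m) for d, m in product(date_range, models)]

pairs = expand_model_days(
    ["2025-01-16", "2025-01-17"],
    ["gpt-4", "claude-3.7-sonnet"],
)
print(len(pairs))  # 4, matching total_model_days in the response above
```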
### GET /simulate/status/{job_id}
Get job progress and status.
**Response:**
```json
{
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "running",
"progress": {
"total_model_days": 4,
"completed": 2,
"failed": 0,
"pending": 2
},
"date_range": ["2025-01-16", "2025-01-17"],
"models": ["gpt-4", "claude-3.7-sonnet"],
"created_at": "2025-01-16T10:00:00Z",
"details": [
{
"date": "2025-01-16",
"model": "gpt-4",
"status": "completed",
"started_at": "2025-01-16T10:00:05Z",
"completed_at": "2025-01-16T10:05:23Z",
"duration_seconds": 318.5
}
]
}
```
### GET /results
Query simulation results with optional filters.
**Parameters:**
- `job_id` (optional): Filter by job UUID
- `date` (optional): Filter by trading date (YYYY-MM-DD)
- `model` (optional): Filter by model signature
**Response:**
```json
{
"results": [
{
"id": 1,
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"date": "2025-01-16",
"model": "gpt-4",
"action_id": 1,
"action_type": "buy",
"symbol": "AAPL",
"amount": 10,
"price": 250.50,
"cash": 7495.00,
"portfolio_value": 10000.00,
"daily_profit": 0.00,
"daily_return_pct": 0.00,
"holdings": [
{"symbol": "AAPL", "quantity": 10},
{"symbol": "CASH", "quantity": 7495.00}
]
}
],
"count": 1
}
```
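Clients often only need a reduction of this payload. A client-side sketch, assuming the response shape above (in particular that rows are ordered by `action_id` within each model-day, so the last row per model is the latest); `summarize_results` is a hypothetical helper, not a service endpoint:

```python
def summarize_results(payload: dict) -> dict[str, float]:
    """Reduce a /results payload to the last reported portfolio value per model.

    Later rows overwrite earlier ones, so the final value per model wins,
    assuming the ordering described in the lead-in.
    """
    latest: dict[str, float] = {}
    for row in payload.get("results", []):
        latest[row["model"]] = row["portfolio_value"]
    return latest

# Trimmed-down version of the example response above
example = {
    "results": [
        {"model": "gpt-4", "date": "2025-01-16", "action_id": 1,
         "portfolio_value": 10000.00},
    ],
    "count": 1,
}
print(summarize_results(example))  # {'gpt-4': 10000.0}
```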
### GET /health
Service health check.
**Response:**
```json
{
"status": "healthy",
"database": "connected",
"timestamp": "2025-01-16T10:00:00Z"
}
```
## Volume Mounts
Data persists across container restarts via volume mounts:
```yaml
volumes:
- ./data:/app/data # SQLite database, price data
- ./logs:/app/logs # Application logs
- ./configs:/app/configs # Configuration files
```
**Key files:**
- `/app/data/jobs.db` - SQLite database with job history and results
- `/app/data/merged.jsonl` - Cached price data (fetched on first run)
- `/app/logs/` - Application and MCP service logs
## Configuration
### Custom Config File
Place config files in `./configs/` directory:
```json
{
"agent_type": "BaseAgent",
"date_range": {
"init_date": "2025-01-01",
"end_date": "2025-01-31"
},
"models": [
{
"name": "GPT-4",
"basemodel": "gpt-4",
"signature": "gpt-4",
"enabled": true
}
],
"agent_config": {
"max_steps": 30,
"initial_cash": 10000.0
}
}
```
Reference in API calls: `/app/configs/your_config.json`
## Troubleshooting
### Check Container Status
```bash
docker-compose ps
docker-compose logs ai-trader-api
```
### Health Check Failing
```bash
# Check if services started
docker exec ai-trader-api ps aux
# Test internal health
docker exec ai-trader-api curl http://localhost:8080/health
# Check MCP services
docker exec ai-trader-api curl http://localhost:8000/health
```
### Database Issues
```bash
# View database
docker exec ai-trader-api sqlite3 data/jobs.db ".tables"
# Reset database (WARNING: deletes all data)
rm ./data/jobs.db
docker-compose restart ai-trader-api
```
### Port Conflicts
If ports are already in use, edit `.env`:
```bash
API_PORT=9080 # Change to available port
```
## Windmill Integration
Example Windmill workflow step:
```python
import httpx
def trigger_simulation(
api_url: str,
config_path: str,
start_date: str,
end_date: str,
models: list[str]
):
"""Trigger AI trading simulation via API."""
response = httpx.post(
f"{api_url}/simulate/trigger",
json={
"config_path": config_path,
"date_range": [start_date, end_date],
"models": models
},
timeout=30.0
)
response.raise_for_status()
return response.json()
def check_status(api_url: str, job_id: str):
"""Check simulation job status."""
response = httpx.get(
f"{api_url}/simulate/status/{job_id}",
timeout=10.0
)
response.raise_for_status()
return response.json()
```
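In a workflow, `check_status` is typically wrapped in a polling loop. A sketch with an injectable status callable so it can be exercised without a running server; `poll_until_done` is an assumption, not part of the API, and the terminal statuses (`completed`, `partial`, `failed`) are taken from the job status values in the database schema:

```python
import time

def poll_until_done(get_status, interval_seconds: float = 10.0,
                    max_polls: int = 60) -> dict:
    """Poll a zero-argument status callable until the job leaves pending/running.

    get_status returns a dict shaped like the /simulate/status response,
    e.g. lambda: check_status(api_url, job_id).
    """
    for _ in range(max_polls):
        status = get_status()
        if status["status"] not in ("pending", "running"):
            return status
        time.sleep(interval_seconds)
    raise TimeoutError("job did not finish within the polling window")

# Stubbed usage (no network): completes on the second poll
responses = iter([{"status": "running"}, {"status": "completed"}])
result = poll_until_done(lambda: next(responses), interval_seconds=0)
print(result["status"])  # completed
```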
## Production Deployment
### Use Prebuilt Image (GitHub Container Registry)
```yaml
# docker-compose.yml
services:
ai-trader-api:
image: ghcr.io/xe138/ai-trader:latest
# ... rest of config
```
### Build Locally
```yaml
# docker-compose.yml
services:
ai-trader-api:
build: .
# ... rest of config
```
### Environment Security
- Never commit `.env` to version control
- Use secrets management in production (Docker secrets, Kubernetes secrets, etc.)
- Rotate API keys regularly
## Monitoring
### Prometheus Metrics (Future)
Metrics endpoint planned: `GET /metrics`
### Log Aggregation
- Container logs: `docker-compose logs -f`
- Application logs: `./logs/api.log`
- MCP service logs: `./logs/mcp_*.log`
## Scaling Considerations
- Single-job concurrency enforced by database lock
- For parallel simulations, deploy multiple instances with separate databases
- Consider load balancer for high-availability setup
- Database size grows with number of simulations (plan for cleanup/archival)
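The cleanup/archival point above can lean on the schema's cascade deletes: removing a row from `jobs` removes its child rows automatically. A self-contained sketch against an in-memory subset of the schema (the 90-day retention window is an illustrative choice, not a project default):

```python
import sqlite3

# Retention cleanup relying on the ON DELETE CASCADE foreign keys described
# above: deleting old rows from jobs also removes their positions (and, in
# the full schema, job_details, holdings, reasoning_logs, tool_usage).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # cascades require this pragma
conn.execute("""CREATE TABLE jobs (
    job_id TEXT PRIMARY KEY, created_at TEXT NOT NULL)""")
conn.execute("""CREATE TABLE positions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT NOT NULL,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE)""")
conn.execute("INSERT INTO jobs VALUES ('old-job', '2020-01-01T00:00:00Z')")
conn.execute("INSERT INTO positions (job_id) VALUES ('old-job')")

# Drop jobs older than 90 days; child rows go with them
conn.execute("DELETE FROM jobs WHERE created_at < datetime('now', '-90 days')")
remaining = conn.execute("SELECT COUNT(*) FROM positions").fetchone()[0]
print(remaining)  # 0: the cascade removed the orphaned position rows
```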

Dockerfile

@@ -24,11 +24,11 @@ RUN mkdir -p /app/scripts && \
# Create necessary directories
RUN mkdir -p data logs data/agent_data
-# Make entrypoint executable
-RUN chmod +x entrypoint.sh
+# Make entrypoints executable
+RUN chmod +x entrypoint.sh entrypoint-api.sh
-# Expose MCP service ports and web dashboard
-EXPOSE 8000 8001 8002 8003 8888
+# Expose MCP service ports, API server, and web dashboard
+EXPOSE 8000 8001 8002 8003 8080 8888
# Set Python to run unbuffered for real-time logs
ENV PYTHONUNBUFFERED=1

README.md

@@ -35,15 +35,31 @@
---
-## 📝 Upcoming Updates (This Week)
+## ✨ Latest Updates (v0.3.0)
-We're excited to announce the following updates coming this week:
+**Major Architecture Upgrade - REST API Service**
-- **Hourly Trading Support** - Upgrade to hour-level precision trading
-- 🚀 **Service Deployment & Parallel Execution** - Deploy production service + parallel model execution
-- 🎨 **Enhanced Frontend Dashboard** - Add detailed trading log visualization (complete trading process display)
- 🌐 **REST API Server** - Complete FastAPI implementation for external orchestration
- Trigger simulations via HTTP POST
- Monitor job progress in real-time
- Query results with flexible filtering
- Health checks and monitoring
- 💾 **SQLite Database** - Full persistence layer with 6 relational tables
- Job tracking and lifecycle management
- Position records with P&L tracking
- AI reasoning logs and tool usage analytics
- 🐳 **Dual Docker Deployment** - API server mode + Batch mode
- API mode: Persistent REST service with health checks
- Batch mode: One-time simulations (backwards compatible)
- 🧪 **Comprehensive Testing** - 102 tests with 85% coverage
- Unit tests for all components
- Integration tests for API endpoints
- Validation scripts for Docker deployment
- 📚 **Production Documentation** - Complete deployment guides
- DOCKER_API.md - API deployment and usage
- TESTING_GUIDE.md - Validation procedures
-Stay tuned for these exciting improvements! 🎉
+See [CHANGELOG.md](CHANGELOG.md) for full details.
---
@@ -209,12 +225,56 @@ AI-Trader Bench/
## 🚀 Quick Start
-### 📋 Prerequisites
+### 🐳 **Docker Deployment (Recommended)**
-- **Python 3.10+**
**Two deployment modes available:**
#### 🌐 API Server Mode (Windmill Integration)
```bash
# 1. Clone and configure
git clone https://github.com/Xe138/AI-Trader.git
cd AI-Trader
cp .env.example .env
# Edit .env and add your API keys
# 2. Start API server
docker-compose up -d ai-trader-api
# 3. Test API
curl http://localhost:8080/health
# 4. Trigger simulation
curl -X POST http://localhost:8080/simulate/trigger \
-H "Content-Type: application/json" \
-d '{
"config_path": "/app/configs/default_config.json",
"date_range": ["2025-01-16", "2025-01-17"],
"models": ["gpt-4"]
}'
```
See [DOCKER_API.md](DOCKER_API.md) for complete API documentation.
#### 🎯 Batch Mode (One-time Simulation)
```bash
# Run single simulation
docker-compose --profile batch up ai-trader-batch
# With custom config
docker-compose --profile batch run ai-trader-batch configs/custom.json
```
---
### 💻 **Local Installation (Development)**
#### 📋 Prerequisites
- **Python 3.10+**
- **API Keys**: OpenAI, Alpha Vantage, Jina AI
- **Optional**: Docker (for containerized deployment)
-### ⚡ One-Click Installation
+#### ⚡ Installation Steps
```bash
# 1. Clone project

TESTING_GUIDE.md (new file, 513 lines)

@@ -0,0 +1,513 @@
# AI-Trader Testing & Validation Guide
This guide provides step-by-step instructions for validating the AI-Trader Docker deployment.
## Prerequisites
- Docker Desktop installed and running
- `.env` file configured with API keys
- At least 2GB free disk space
- Internet connection for initial price data download
## Quick Start
```bash
# 1. Make scripts executable
chmod +x scripts/*.sh
# 2. Validate Docker build
bash scripts/validate_docker_build.sh
# 3. Test API endpoints
bash scripts/test_api_endpoints.sh
# 4. Test batch mode
bash scripts/test_batch_mode.sh
```
---
## Detailed Testing Procedures
### Test 1: Docker Build Validation
**Purpose:** Verify Docker image builds correctly and containers start
**Command:**
```bash
bash scripts/validate_docker_build.sh
```
**What it tests:**
- ✅ Docker and docker-compose installed
- ✅ Docker daemon running
- ✅ `.env` file exists and is configured
- ✅ Image builds successfully
- ✅ Container starts in API mode
- ✅ Health endpoint responds
- ✅ No critical errors in logs
**Expected output:**
```
==========================================
AI-Trader Docker Build Validation
==========================================
Step 1: Checking prerequisites...
✓ Docker is installed: Docker version 24.0.0
✓ Docker daemon is running
✓ docker-compose is installed
Step 2: Checking environment configuration...
✓ .env file exists
✓ OPENAI_API_KEY is set
✓ ALPHAADVANTAGE_API_KEY is set
✓ JINA_API_KEY is set
Step 3: Building Docker image...
✓ Docker image built successfully
Step 4: Verifying Docker image...
✓ Image size: 850MB
✓ Exposed ports: 8000/tcp 8001/tcp 8002/tcp 8003/tcp 8080/tcp 8888/tcp
Step 5: Testing API mode startup...
✓ Container started successfully
✓ Container is running
✓ No critical errors in logs
Step 6: Testing health endpoint...
✓ Health endpoint responding
Health response: {"status":"healthy","database":"connected","timestamp":"..."}
```
**If it fails:**
- Check Docker Desktop is running
- Verify `.env` has all required keys
- Check port 8080 is not already in use
- Review logs: `docker logs ai-trader-api`
---
### Test 2: API Endpoint Testing
**Purpose:** Validate all REST API endpoints work correctly
**Command:**
```bash
# Ensure API is running first
docker-compose up -d ai-trader-api
# Run tests
bash scripts/test_api_endpoints.sh
```
**What it tests:**
- ✅ GET /health - Service health check
- ✅ POST /simulate/trigger - Job creation
- ✅ GET /simulate/status/{job_id} - Status tracking
- ✅ Job completion monitoring
- ✅ GET /results - Results retrieval
- ✅ Query filtering (by date, model)
- ✅ Concurrent job prevention
- ✅ Error handling (invalid inputs)
**Expected output:**
```
==========================================
AI-Trader API Endpoint Testing
==========================================
✓ API is accessible
Test 1: GET /health
✓ Health check passed
Test 2: POST /simulate/trigger
✓ Simulation triggered successfully
Job ID: 550e8400-e29b-41d4-a716-446655440000
Test 3: GET /simulate/status/{job_id}
✓ Job status retrieved
Job Status: pending
Test 4: Monitoring job progress
[1/30] Status: running | Progress: {"completed":1,"failed":0,...}
...
✓ Job finished with status: completed
Test 5: GET /results
✓ Results retrieved
Result count: 2
Test 6: GET /results?date=...
✓ Date-filtered results retrieved
Test 7: GET /results?model=...
✓ Model-filtered results retrieved
Test 8: Concurrent job prevention
✓ Concurrent job correctly rejected
Test 9: Error handling
✓ Invalid config path correctly rejected
```
**If it fails:**
- Ensure container is running: `docker ps | grep ai-trader-api`
- Check API logs: `docker logs ai-trader-api`
- Verify port 8080 is accessible: `curl http://localhost:8080/health`
- Check MCP services started: `docker exec ai-trader-api ps aux | grep python`
---
### Test 3: Batch Mode Testing
**Purpose:** Verify one-time simulation execution works
**Command:**
```bash
bash scripts/test_batch_mode.sh
```
**What it tests:**
- ✅ Batch mode container starts
- ✅ Simulation executes to completion
- ✅ Exit code is 0 (success)
- ✅ Position files created
- ✅ Log files generated
- ✅ Price data persists between runs
**Expected output:**
```
==========================================
AI-Trader Batch Mode Testing
==========================================
✓ Prerequisites OK
Using config: configs/default_config.json
Test 1: Building Docker image
✓ Image built successfully
Test 2: Running batch simulation
🚀 Starting AI-Trader...
✅ Environment variables validated
📊 Fetching and merging price data...
🔧 Starting MCP services...
🤖 Starting trading agent...
[Trading output...]
Test 3: Checking exit status
✓ Batch simulation completed successfully (exit code: 0)
Test 4: Verifying output files
✓ Found 1 position file(s)
Sample position data: {...}
✓ Found 3 log file(s)
Test 5: Checking price data
✓ Price data exists: 100 stocks
Test 6: Testing data persistence
✓ Second run completed successfully
✓ Price data was reused
```
**If it fails:**
- Check `.env` has valid API keys
- Verify internet connection (for price data)
- Check available disk space
- Review batch logs: `docker logs ai-trader-batch`
- Check data directory permissions
---
## Manual Testing Procedures
### Test 1: API Health Check
```bash
# Start API
docker-compose up -d ai-trader-api
# Test health endpoint
curl http://localhost:8080/health
# Expected response:
# {"status":"healthy","database":"connected","timestamp":"2025-01-16T10:00:00Z"}
```
### Test 2: Trigger Simulation
```bash
# Trigger job
curl -X POST http://localhost:8080/simulate/trigger \
-H "Content-Type: application/json" \
-d '{
"config_path": "/app/configs/default_config.json",
"date_range": ["2025-01-16", "2025-01-17"],
"models": ["gpt-4"]
}'
# Expected response:
# {
# "job_id": "550e8400-e29b-41d4-a716-446655440000",
# "status": "pending",
# "total_model_days": 2,
# "message": "Simulation job ... created and started"
# }
# Save job_id for next steps
JOB_ID="550e8400-e29b-41d4-a716-446655440000"
```
### Test 3: Monitor Job Progress
```bash
# Check status (repeat until completed)
curl http://localhost:8080/simulate/status/$JOB_ID | jq '.'
# Poll with watch
watch -n 10 "curl -s http://localhost:8080/simulate/status/$JOB_ID | jq '.status, .progress'"
```
### Test 4: Retrieve Results
```bash
# Get all results for job
curl "http://localhost:8080/results?job_id=$JOB_ID" | jq '.'
# Filter by date
curl "http://localhost:8080/results?date=2025-01-16" | jq '.'
# Filter by model
curl "http://localhost:8080/results?model=gpt-4" | jq '.'
# Combine filters
curl "http://localhost:8080/results?job_id=$JOB_ID&date=2025-01-16&model=gpt-4" | jq '.'
```
### Test 5: Volume Persistence
```bash
# Stop container
docker-compose down
# Verify data persists
ls -lh data/jobs.db
ls -R data/agent_data
# Restart container
docker-compose up -d ai-trader-api
# Data should still be accessible via API
curl http://localhost:8080/results | jq '.count'
```
---
## Troubleshooting
### Problem: Container won't start
**Symptoms:**
- `docker ps` shows no ai-trader-api container
- Container exits immediately
**Debug steps:**
```bash
# Check logs
docker logs ai-trader-api
# Common issues:
# 1. Missing API keys in .env
# 2. Port 8080 already in use
# 3. Volume permission issues
```
**Solutions:**
```bash
# 1. Verify .env
cat .env | grep -E "OPENAI_API_KEY|ALPHAADVANTAGE_API_KEY|JINA_API_KEY"
# 2. Check port usage
lsof -i :8080 # Linux/Mac
netstat -ano | findstr :8080 # Windows
# 3. Fix permissions
chmod -R 755 data logs
```
### Problem: Health check fails
**Symptoms:**
- `curl http://localhost:8080/health` returns error
- Container is running but API not responding
**Debug steps:**
```bash
# Check if API process is running
docker exec ai-trader-api ps aux | grep uvicorn
# Check internal health
docker exec ai-trader-api curl http://localhost:8080/health
# Check logs for startup errors
docker logs ai-trader-api | grep -i error
```
**Solutions:**
```bash
# If MCP services didn't start:
docker exec ai-trader-api ps aux | grep python
# If database issues:
docker exec ai-trader-api ls -l /app/data/jobs.db
# Restart container
docker-compose restart ai-trader-api
```
### Problem: Job stays in "pending" status
**Symptoms:**
- Job triggered but never progresses
- Status remains "pending" indefinitely
**Debug steps:**
```bash
# Check worker logs
docker logs ai-trader-api | grep -i "worker\|simulation"
# Check database
docker exec ai-trader-api sqlite3 /app/data/jobs.db "SELECT * FROM job_details;"
# Check if MCP services are accessible
docker exec ai-trader-api curl http://localhost:8000/health
```
**Solutions:**
```bash
# Restart container (jobs resume automatically)
docker-compose restart ai-trader-api
# Check specific job status
curl http://localhost:8080/simulate/status/$JOB_ID | jq '.details'
```
### Problem: Tests timeout
**Symptoms:**
- `test_api_endpoints.sh` hangs during job monitoring
- Jobs take longer than expected
**Solutions:**
```bash
# Increase poll timeout in test script
# Edit: MAX_POLLS=60 # Increase from 30
# Or monitor job manually
watch -n 30 "curl -s http://localhost:8080/simulate/status/$JOB_ID | jq '.status, .progress'"
# Check agent logs for slowness
docker logs ai-trader-api | tail -100
```
---
## Performance Benchmarks
### Expected Execution Times
**Docker Build:**
- First build: 5-10 minutes
- Subsequent builds: 1-2 minutes (with cache)
**API Startup:**
- Container start: 5-10 seconds
- Health check ready: 15-20 seconds (including MCP services)
**Single Model-Day Simulation:**
- With existing price data: 2-5 minutes
- First run (fetching price data): 10-15 minutes
**Complete 2-Date, 2-Model Job:**
- Expected duration: 10-20 minutes
- Depends on AI model response times
---
## Continuous Monitoring
### Health Check Monitoring
```bash
# Add to cron for continuous monitoring
*/5 * * * * curl -f http://localhost:8080/health || echo "API down" | mail -s "AI-Trader Alert" admin@example.com
```
### Log Rotation
```bash
# Docker's json-file driver does not rotate logs by default; cap them via
# "log-opts": {"max-size": "10m"} in daemon.json, and monitor size:
docker logs ai-trader-api --tail 100
# Truncate the current log file if needed (Linux, requires root):
truncate -s 0 "$(docker inspect --format='{{.LogPath}}' ai-trader-api)"
```
### Database Size
```bash
# Monitor database growth
docker exec ai-trader-api du -h /app/data/jobs.db
# Vacuum periodically
docker exec ai-trader-api sqlite3 /app/data/jobs.db "VACUUM;"
```
---
## Success Criteria
### Validation Complete When:
- ✅ All 3 test scripts pass without errors
- ✅ Health endpoint returns "healthy" status
- ✅ Can trigger and complete simulation job
- ✅ Results are retrievable via API
- ✅ Data persists after container restart
- ✅ Batch mode completes successfully
- ✅ No critical errors in logs
### Ready for Production When:
- ✅ All validation tests pass
- ✅ Performance meets expectations
- ✅ Monitoring is configured
- ✅ Backup strategy is in place
- ✅ Documentation is reviewed
- ✅ Team is trained on operations
---
## Next Steps After Validation
1. **Set up monitoring** - Configure health check alerts
2. **Configure backups** - Backup `/app/data` regularly
3. **Document operations** - Create runbook for team
4. **Set up CI/CD** - Automate testing and deployment
5. **Integrate with Windmill** - Connect workflows to API
6. **Scale if needed** - Deploy multiple instances with load balancer
---
## Support
For issues not covered in this guide:
1. Check `DOCKER_API.md` for detailed API documentation
2. Review container logs: `docker logs ai-trader-api`
3. Check database: `docker exec ai-trader-api sqlite3 /app/data/jobs.db ".tables"`
4. Open issue on GitHub with logs and error messages

api/__init__.py (new file, empty)

api/database.py (new file, 307 lines)

@@ -0,0 +1,307 @@
"""
Database utilities and schema management for AI-Trader API.
This module provides:
- SQLite connection management
- Database schema initialization (6 tables)
- ACID-compliant transaction support
"""
import sqlite3
from pathlib import Path
from typing import Optional
import os
def get_db_connection(db_path: str = "data/jobs.db") -> sqlite3.Connection:
"""
Get SQLite database connection with proper configuration.
Args:
db_path: Path to SQLite database file
Returns:
Configured SQLite connection
Configuration:
- Foreign keys enabled for referential integrity
- Row factory for dict-like access
- Check same thread disabled for FastAPI async compatibility
"""
# Ensure data directory exists
db_path_obj = Path(db_path)
db_path_obj.parent.mkdir(parents=True, exist_ok=True)
conn = sqlite3.connect(db_path, check_same_thread=False)
conn.execute("PRAGMA foreign_keys = ON")
conn.row_factory = sqlite3.Row
return conn
def initialize_database(db_path: str = "data/jobs.db") -> None:
"""
Create all database tables with enhanced schema.
Tables created:
1. jobs - High-level job metadata and status
2. job_details - Per model-day execution tracking
3. positions - Trading positions and P&L metrics
4. holdings - Portfolio holdings per position
5. reasoning_logs - AI decision logs (optional, for detail=full)
6. tool_usage - Tool usage statistics
Args:
db_path: Path to SQLite database file
"""
conn = get_db_connection(db_path)
cursor = conn.cursor()
# Table 1: Jobs - Job metadata and lifecycle
cursor.execute("""
CREATE TABLE IF NOT EXISTS jobs (
job_id TEXT PRIMARY KEY,
config_path TEXT NOT NULL,
status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
date_range TEXT NOT NULL,
models TEXT NOT NULL,
created_at TEXT NOT NULL,
started_at TEXT,
updated_at TEXT,
completed_at TEXT,
total_duration_seconds REAL,
error TEXT
)
""")
# Table 2: Job Details - Per model-day execution
cursor.execute("""
CREATE TABLE IF NOT EXISTS job_details (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL,
model TEXT NOT NULL,
status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
started_at TEXT,
completed_at TEXT,
duration_seconds REAL,
error TEXT,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
)
""")
# Table 3: Positions - Trading positions and P&L
cursor.execute("""
CREATE TABLE IF NOT EXISTS positions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL,
model TEXT NOT NULL,
action_id INTEGER NOT NULL,
action_type TEXT CHECK(action_type IN ('buy', 'sell', 'no_trade')),
symbol TEXT,
amount INTEGER,
price REAL,
cash REAL NOT NULL,
portfolio_value REAL NOT NULL,
daily_profit REAL,
daily_return_pct REAL,
cumulative_profit REAL,
cumulative_return_pct REAL,
created_at TEXT NOT NULL,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
)
""")
# Table 4: Holdings - Portfolio holdings
cursor.execute("""
CREATE TABLE IF NOT EXISTS holdings (
id INTEGER PRIMARY KEY AUTOINCREMENT,
position_id INTEGER NOT NULL,
symbol TEXT NOT NULL,
quantity INTEGER NOT NULL,
FOREIGN KEY (position_id) REFERENCES positions(id) ON DELETE CASCADE
)
""")
# Table 5: Reasoning Logs - AI decision logs (optional)
cursor.execute("""
CREATE TABLE IF NOT EXISTS reasoning_logs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL,
model TEXT NOT NULL,
step_number INTEGER NOT NULL,
timestamp TEXT NOT NULL,
role TEXT CHECK(role IN ('user', 'assistant', 'tool')),
content TEXT,
tool_name TEXT,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
)
""")
# Table 6: Tool Usage - Tool usage statistics
cursor.execute("""
CREATE TABLE IF NOT EXISTS tool_usage (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL,
model TEXT NOT NULL,
tool_name TEXT NOT NULL,
call_count INTEGER NOT NULL DEFAULT 1,
total_duration_seconds REAL,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
)
""")
# Create indexes for performance
_create_indexes(cursor)
conn.commit()
conn.close()
def _create_indexes(cursor: sqlite3.Cursor) -> None:
"""Create database indexes for query performance."""
# Jobs table indexes
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_jobs_status ON jobs(status)
""")
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_jobs_created_at ON jobs(created_at DESC)
""")
# Job details table indexes
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_job_details_job_id ON job_details(job_id)
""")
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_job_details_status ON job_details(status)
""")
cursor.execute("""
CREATE UNIQUE INDEX IF NOT EXISTS idx_job_details_unique
ON job_details(job_id, date, model)
""")
# Positions table indexes
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_positions_job_id ON positions(job_id)
""")
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_positions_date ON positions(date)
""")
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_positions_model ON positions(model)
""")
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_positions_date_model ON positions(date, model)
""")
cursor.execute("""
CREATE UNIQUE INDEX IF NOT EXISTS idx_positions_unique
ON positions(job_id, date, model, action_id)
""")
# Holdings table indexes
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_holdings_position_id ON holdings(position_id)
""")
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_holdings_symbol ON holdings(symbol)
""")
# Reasoning logs table indexes
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_reasoning_logs_job_date_model
ON reasoning_logs(job_id, date, model)
""")
# Tool usage table indexes
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_tool_usage_job_date_model
ON tool_usage(job_id, date, model)
""")
def drop_all_tables(db_path: str = "data/jobs.db") -> None:
"""
Drop all database tables. USE WITH CAUTION.
This is primarily for testing and development.
Args:
db_path: Path to SQLite database file
"""
conn = get_db_connection(db_path)
cursor = conn.cursor()
tables = [
'tool_usage',
'reasoning_logs',
'holdings',
'positions',
'job_details',
'jobs'
]
for table in tables:
cursor.execute(f"DROP TABLE IF EXISTS {table}")
conn.commit()
conn.close()
def vacuum_database(db_path: str = "data/jobs.db") -> None:
"""
Reclaim disk space after deletions.
Should be run periodically after cleanup operations.
Args:
db_path: Path to SQLite database file
"""
conn = get_db_connection(db_path)
conn.execute("VACUUM")
conn.close()
def get_database_stats(db_path: str = "data/jobs.db") -> dict:
"""
Get database statistics for monitoring.
Returns:
Dictionary with table row counts and database size
Example:
{
"database_size_mb": 12.5,
"jobs": 150,
"job_details": 3000,
"positions": 15000,
"holdings": 45000,
"reasoning_logs": 300000,
"tool_usage": 12000
}
"""
conn = get_db_connection(db_path)
cursor = conn.cursor()
stats = {}
# Get database file size
if os.path.exists(db_path):
size_bytes = os.path.getsize(db_path)
stats["database_size_mb"] = round(size_bytes / (1024 * 1024), 2)
else:
stats["database_size_mb"] = 0
# Get row counts for each table
tables = ['jobs', 'job_details', 'positions', 'holdings', 'reasoning_logs', 'tool_usage']
for table in tables:
cursor.execute(f"SELECT COUNT(*) FROM {table}")
stats[table] = cursor.fetchone()[0]
conn.close()
return stats
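The `ON DELETE CASCADE` clauses declared in the schema above only fire when SQLite's per-connection foreign-key enforcement is switched on (it is off by default), which the project's `get_db_connection` helper is assumed to do via `PRAGMA foreign_keys = ON`. A standalone sketch of the behavior, using a throwaway in-memory database rather than the real schema:

```python
import sqlite3

# Hypothetical two-table miniature of the jobs/job_details relationship.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # without this, the cascade is ignored
conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE job_details (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        job_id TEXT NOT NULL,
        FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO jobs VALUES ('abc')")
conn.execute("INSERT INTO job_details (job_id) VALUES ('abc')")

# Deleting the parent row removes the child row automatically.
conn.execute("DELETE FROM jobs WHERE job_id = 'abc'")
remaining = conn.execute("SELECT COUNT(*) FROM job_details").fetchone()[0]
print(remaining)  # 0 — the child row was cascade-deleted
```

This is also why `cleanup_old_jobs` in `api/job_manager.py` can delete from `jobs` alone and rely on the cascade to clear the five dependent tables.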

api/job_manager.py
"""
Job lifecycle manager for simulation orchestration.
This module provides:
- Job creation and validation
- Status transitions (state machine)
- Progress tracking across model-days
- Concurrency control (single job at a time)
- Job retrieval and queries
- Cleanup operations
"""
import sqlite3
import json
import uuid
from datetime import datetime, timedelta
from typing import Optional, List, Dict, Any
from pathlib import Path
import logging
from api.database import get_db_connection
logger = logging.getLogger(__name__)
class JobManager:
"""
Manages simulation job lifecycle and orchestration.
Responsibilities:
- Create jobs with date ranges and model lists
- Track job status (pending → running → completed/partial/failed)
- Monitor progress across model-days
- Enforce single-job concurrency
- Provide job queries and retrieval
- Cleanup old jobs
State Machine:
pending → running → completed (all succeeded)
→ partial (some failed)
→ failed (job-level error)
"""
def __init__(self, db_path: str = "data/jobs.db"):
"""
Initialize JobManager.
Args:
db_path: Path to SQLite database
"""
self.db_path = db_path
def create_job(
self,
config_path: str,
date_range: List[str],
models: List[str]
) -> str:
"""
Create new simulation job.
Args:
config_path: Path to configuration file
date_range: List of dates to simulate (YYYY-MM-DD)
models: List of model signatures to execute
Returns:
job_id: UUID of created job
Raises:
ValueError: If another job is already running/pending
"""
if not self.can_start_new_job():
raise ValueError("Another simulation job is already running or pending")
job_id = str(uuid.uuid4())
created_at = datetime.utcnow().isoformat() + "Z"
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
# Insert job
cursor.execute("""
INSERT INTO jobs (
job_id, config_path, status, date_range, models, created_at
)
VALUES (?, ?, ?, ?, ?, ?)
""", (
job_id,
config_path,
"pending",
json.dumps(date_range),
json.dumps(models),
created_at
))
# Create job_details for each model-day combination
for date in date_range:
for model in models:
cursor.execute("""
INSERT INTO job_details (
job_id, date, model, status
)
VALUES (?, ?, ?, ?)
""", (job_id, date, model, "pending"))
conn.commit()
logger.info(f"Created job {job_id} with {len(date_range)} dates and {len(models)} models")
return job_id
finally:
conn.close()
def get_job(self, job_id: str) -> Optional[Dict[str, Any]]:
"""
Get job by ID.
Args:
job_id: Job UUID
Returns:
Job data dict or None if not found
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
cursor.execute("""
SELECT
job_id, config_path, status, date_range, models,
created_at, started_at, updated_at, completed_at,
total_duration_seconds, error
FROM jobs
WHERE job_id = ?
""", (job_id,))
row = cursor.fetchone()
if not row:
return None
return {
"job_id": row[0],
"config_path": row[1],
"status": row[2],
"date_range": json.loads(row[3]),
"models": json.loads(row[4]),
"created_at": row[5],
"started_at": row[6],
"updated_at": row[7],
"completed_at": row[8],
"total_duration_seconds": row[9],
"error": row[10]
}
finally:
conn.close()
def get_current_job(self) -> Optional[Dict[str, Any]]:
"""
Get most recent job.
Returns:
Most recent job data or None if no jobs exist
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
cursor.execute("""
SELECT
job_id, config_path, status, date_range, models,
created_at, started_at, updated_at, completed_at,
total_duration_seconds, error
FROM jobs
ORDER BY created_at DESC
LIMIT 1
""")
row = cursor.fetchone()
if not row:
return None
return {
"job_id": row[0],
"config_path": row[1],
"status": row[2],
"date_range": json.loads(row[3]),
"models": json.loads(row[4]),
"created_at": row[5],
"started_at": row[6],
"updated_at": row[7],
"completed_at": row[8],
"total_duration_seconds": row[9],
"error": row[10]
}
finally:
conn.close()
def find_job_by_date_range(self, date_range: List[str]) -> Optional[Dict[str, Any]]:
"""
Find job with matching date range.
Args:
date_range: List of dates to match
Returns:
Job data or None if not found
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
date_range_json = json.dumps(date_range)
cursor.execute("""
SELECT
job_id, config_path, status, date_range, models,
created_at, started_at, updated_at, completed_at,
total_duration_seconds, error
FROM jobs
WHERE date_range = ?
ORDER BY created_at DESC
LIMIT 1
""", (date_range_json,))
row = cursor.fetchone()
if not row:
return None
return {
"job_id": row[0],
"config_path": row[1],
"status": row[2],
"date_range": json.loads(row[3]),
"models": json.loads(row[4]),
"created_at": row[5],
"started_at": row[6],
"updated_at": row[7],
"completed_at": row[8],
"total_duration_seconds": row[9],
"error": row[10]
}
finally:
conn.close()
def update_job_status(
self,
job_id: str,
status: str,
error: Optional[str] = None
) -> None:
"""
Update job status.
Args:
job_id: Job UUID
status: New status (pending/running/completed/partial/failed)
error: Optional error message
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
updated_at = datetime.utcnow().isoformat() + "Z"
# Set timestamps based on status
if status == "running":
cursor.execute("""
UPDATE jobs
SET status = ?, started_at = ?, updated_at = ?
WHERE job_id = ?
""", (status, updated_at, updated_at, job_id))
elif status in ("completed", "partial", "failed"):
# Calculate duration
cursor.execute("""
SELECT started_at FROM jobs WHERE job_id = ?
""", (job_id,))
row = cursor.fetchone()
duration_seconds = None
if row and row[0]:
started_at = datetime.fromisoformat(row[0].replace("Z", ""))
completed_at = datetime.fromisoformat(updated_at.replace("Z", ""))
duration_seconds = (completed_at - started_at).total_seconds()
cursor.execute("""
UPDATE jobs
SET status = ?, completed_at = ?, updated_at = ?,
total_duration_seconds = ?, error = ?
WHERE job_id = ?
""", (status, updated_at, updated_at, duration_seconds, error, job_id))
else:
# Just update status
cursor.execute("""
UPDATE jobs
SET status = ?, updated_at = ?, error = ?
WHERE job_id = ?
""", (status, updated_at, error, job_id))
conn.commit()
logger.debug(f"Updated job {job_id} status to {status}")
finally:
conn.close()
def update_job_detail_status(
self,
job_id: str,
date: str,
model: str,
status: str,
error: Optional[str] = None
) -> None:
"""
Update model-day status and auto-update job status.
Args:
job_id: Job UUID
date: Trading date (YYYY-MM-DD)
model: Model signature
status: New status (pending/running/completed/failed)
error: Optional error message
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
updated_at = datetime.utcnow().isoformat() + "Z"
if status == "running":
cursor.execute("""
UPDATE job_details
SET status = ?, started_at = ?
WHERE job_id = ? AND date = ? AND model = ?
""", (status, updated_at, job_id, date, model))
# Update job to running if not already
cursor.execute("""
UPDATE jobs
SET status = 'running', started_at = COALESCE(started_at, ?), updated_at = ?
WHERE job_id = ? AND status = 'pending'
""", (updated_at, updated_at, job_id))
elif status in ("completed", "failed"):
# Calculate duration for detail
cursor.execute("""
SELECT started_at FROM job_details
WHERE job_id = ? AND date = ? AND model = ?
""", (job_id, date, model))
row = cursor.fetchone()
duration_seconds = None
if row and row[0]:
started_at = datetime.fromisoformat(row[0].replace("Z", ""))
completed_at = datetime.fromisoformat(updated_at.replace("Z", ""))
duration_seconds = (completed_at - started_at).total_seconds()
cursor.execute("""
UPDATE job_details
SET status = ?, completed_at = ?, duration_seconds = ?, error = ?
WHERE job_id = ? AND date = ? AND model = ?
""", (status, updated_at, duration_seconds, error, job_id, date, model))
# Check if all details are done
cursor.execute("""
SELECT
COUNT(*) as total,
SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) as completed,
SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) as failed
FROM job_details
WHERE job_id = ?
""", (job_id,))
total, completed, failed = cursor.fetchone()
if completed + failed == total:
# All done - determine final status
if failed == 0:
final_status = "completed"
elif completed > 0:
final_status = "partial"
else:
final_status = "failed"
# Calculate job duration
cursor.execute("""
SELECT started_at FROM jobs WHERE job_id = ?
""", (job_id,))
row = cursor.fetchone()
job_duration = None
if row and row[0]:
started_at = datetime.fromisoformat(row[0].replace("Z", ""))
completed_at = datetime.fromisoformat(updated_at.replace("Z", ""))
job_duration = (completed_at - started_at).total_seconds()
cursor.execute("""
UPDATE jobs
SET status = ?, completed_at = ?, updated_at = ?, total_duration_seconds = ?
WHERE job_id = ?
""", (final_status, updated_at, updated_at, job_duration, job_id))
conn.commit()
logger.debug(f"Updated job_detail {job_id}/{date}/{model} to {status}")
finally:
conn.close()
def get_job_details(self, job_id: str) -> List[Dict[str, Any]]:
"""
Get all model-day execution details for a job.
Args:
job_id: Job UUID
Returns:
List of job_detail records with date, model, status, error
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
cursor.execute("""
SELECT date, model, status, error, started_at, completed_at, duration_seconds
FROM job_details
WHERE job_id = ?
ORDER BY date, model
""", (job_id,))
rows = cursor.fetchall()
details = []
for row in rows:
details.append({
"date": row[0],
"model": row[1],
"status": row[2],
"error": row[3],
"started_at": row[4],
"completed_at": row[5],
"duration_seconds": row[6]
})
return details
finally:
conn.close()
def get_job_progress(self, job_id: str) -> Dict[str, Any]:
"""
Get job progress summary.
Args:
job_id: Job UUID
Returns:
Progress dict with total_model_days, completed, failed, current, details
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
cursor.execute("""
SELECT
COUNT(*) as total,
SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) as completed,
SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) as failed
FROM job_details
WHERE job_id = ?
""", (job_id,))
total, completed, failed = cursor.fetchone()
# Get currently running model-day
cursor.execute("""
SELECT date, model
FROM job_details
WHERE job_id = ? AND status = 'running'
LIMIT 1
""", (job_id,))
current_row = cursor.fetchone()
current = {"date": current_row[0], "model": current_row[1]} if current_row else None
# Get all details
cursor.execute("""
SELECT date, model, status, duration_seconds, error
FROM job_details
WHERE job_id = ?
ORDER BY date, model
""", (job_id,))
details = []
for row in cursor.fetchall():
details.append({
"date": row[0],
"model": row[1],
"status": row[2],
"duration_seconds": row[3],
"error": row[4]
})
return {
"total_model_days": total,
"completed": completed or 0,
"failed": failed or 0,
"current": current,
"details": details
}
finally:
conn.close()
def can_start_new_job(self) -> bool:
"""
Check if new job can be started.
Returns:
True if no jobs are pending/running, False otherwise
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
cursor.execute("""
SELECT COUNT(*)
FROM jobs
WHERE status IN ('pending', 'running')
""")
count = cursor.fetchone()[0]
return count == 0
finally:
conn.close()
def get_running_jobs(self) -> List[Dict[str, Any]]:
"""
Get all running/pending jobs.
Returns:
List of job dicts
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
cursor.execute("""
SELECT
job_id, config_path, status, date_range, models,
created_at, started_at, updated_at, completed_at,
total_duration_seconds, error
FROM jobs
WHERE status IN ('pending', 'running')
ORDER BY created_at DESC
""")
jobs = []
for row in cursor.fetchall():
jobs.append({
"job_id": row[0],
"config_path": row[1],
"status": row[2],
"date_range": json.loads(row[3]),
"models": json.loads(row[4]),
"created_at": row[5],
"started_at": row[6],
"updated_at": row[7],
"completed_at": row[8],
"total_duration_seconds": row[9],
"error": row[10]
})
return jobs
finally:
conn.close()
def cleanup_old_jobs(self, days: int = 30) -> Dict[str, int]:
"""
Delete jobs older than threshold.
Args:
days: Delete jobs older than this many days
Returns:
Dict with jobs_deleted count
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
cutoff_date = (datetime.utcnow() - timedelta(days=days)).isoformat() + "Z"
# Get count before deletion
cursor.execute("""
SELECT COUNT(*)
FROM jobs
WHERE created_at < ? AND status IN ('completed', 'partial', 'failed')
""", (cutoff_date,))
count = cursor.fetchone()[0]
# Delete old jobs (foreign key cascade will delete related records)
cursor.execute("""
DELETE FROM jobs
WHERE created_at < ? AND status IN ('completed', 'partial', 'failed')
""", (cutoff_date,))
conn.commit()
logger.info(f"Cleaned up {count} jobs older than {days} days")
return {"jobs_deleted": count}
finally:
conn.close()
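The terminal-status rule embedded in `update_job_detail_status` can be restated as a small pure function. This is a standalone sketch of the same decision table, not part of the module:

```python
from typing import Optional

def final_job_status(completed: int, failed: int, total: int) -> Optional[str]:
    """Terminal status once every model-day has finished:
    'completed' if nothing failed, 'partial' if some succeeded,
    'failed' if everything failed; None while work remains."""
    if completed + failed < total:
        return None  # job still in progress
    if failed == 0:
        return "completed"
    if completed > 0:
        return "partial"
    return "failed"

print(final_job_status(3, 1, 4))  # partial
```

Keeping the rule this small is what lets the job table's `CHECK(status IN (...))` constraint stay in sync with the state machine described in the `JobManager` docstring.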

api/main.py
"""
FastAPI REST API for AI-Trader simulation service.
Provides endpoints for:
- Triggering simulation jobs
- Checking job status
- Querying results
- Health checks
"""
import logging
from typing import Optional, List, Dict, Any
from datetime import datetime
from pathlib import Path
from fastapi import FastAPI, HTTPException, Query
from fastapi.responses import JSONResponse
from pydantic import BaseModel, Field, field_validator
from api.job_manager import JobManager
from api.simulation_worker import SimulationWorker
from api.database import get_db_connection
import threading
import time
logger = logging.getLogger(__name__)
# Pydantic models for request/response validation
class SimulateTriggerRequest(BaseModel):
"""Request body for POST /simulate/trigger."""
config_path: str = Field(..., description="Path to configuration file")
date_range: List[str] = Field(..., min_length=1, description="List of trading dates (YYYY-MM-DD)")
models: List[str] = Field(..., min_length=1, description="List of model signatures to simulate")
@field_validator("date_range")
@classmethod
def validate_date_range(cls, v):
"""Validate date format."""
for date in v:
try:
datetime.strptime(date, "%Y-%m-%d")
except ValueError:
raise ValueError(f"Invalid date format: {date}. Expected YYYY-MM-DD")
return v
class SimulateTriggerResponse(BaseModel):
"""Response body for POST /simulate/trigger."""
job_id: str
status: str
total_model_days: int
message: str
class JobProgress(BaseModel):
"""Job progress information."""
total_model_days: int
completed: int
failed: int
pending: int
class JobStatusResponse(BaseModel):
"""Response body for GET /simulate/status/{job_id}."""
job_id: str
status: str
progress: JobProgress
date_range: List[str]
models: List[str]
created_at: str
started_at: Optional[str] = None
completed_at: Optional[str] = None
total_duration_seconds: Optional[float] = None
error: Optional[str] = None
details: List[Dict[str, Any]]
class HealthResponse(BaseModel):
"""Response body for GET /health."""
status: str
database: str
timestamp: str
def create_app(db_path: str = "data/jobs.db") -> FastAPI:
"""
Create FastAPI application instance.
Args:
db_path: Path to SQLite database
Returns:
Configured FastAPI app
"""
app = FastAPI(
title="AI-Trader Simulation API",
description="REST API for triggering and monitoring AI trading simulations",
version="1.0.0"
)
# Store db_path in app state
app.state.db_path = db_path
@app.post("/simulate/trigger", response_model=SimulateTriggerResponse, status_code=200)
async def trigger_simulation(request: SimulateTriggerRequest):
"""
Trigger a new simulation job.
Creates a job with specified config, dates, and models.
Job runs asynchronously in background thread.
Raises:
HTTPException 400: If another job is already running or config invalid
HTTPException 422: If request validation fails
"""
try:
# Validate config path exists
if not Path(request.config_path).exists():
raise HTTPException(
status_code=400,
detail=f"Config path does not exist: {request.config_path}"
)
job_manager = JobManager(db_path=app.state.db_path)
# Check if can start new job
if not job_manager.can_start_new_job():
raise HTTPException(
status_code=400,
detail="Another simulation job is already running or pending. Please wait for it to complete."
)
# Create job
job_id = job_manager.create_job(
config_path=request.config_path,
date_range=request.date_range,
models=request.models
)
# Start worker in background thread (only if not in test mode)
if not getattr(app.state, "test_mode", False):
def run_worker():
worker = SimulationWorker(job_id=job_id, db_path=app.state.db_path)
worker.run()
thread = threading.Thread(target=run_worker, daemon=True)
thread.start()
logger.info(f"Triggered simulation job {job_id}")
return SimulateTriggerResponse(
job_id=job_id,
status="pending",
total_model_days=len(request.date_range) * len(request.models),
message=f"Simulation job {job_id} created and started"
)
except HTTPException:
raise
except ValueError as e:
logger.error(f"Validation error: {e}")
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
logger.error(f"Failed to trigger simulation: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")
@app.get("/simulate/status/{job_id}", response_model=JobStatusResponse)
async def get_job_status(job_id: str):
"""
Get status and progress of a simulation job.
Args:
job_id: Job UUID
Returns:
Job status, progress, and model-day details
Raises:
HTTPException 404: If job not found
"""
try:
job_manager = JobManager(db_path=app.state.db_path)
# Get job info
job = job_manager.get_job(job_id)
if not job:
raise HTTPException(status_code=404, detail=f"Job {job_id} not found")
# Get progress
progress = job_manager.get_job_progress(job_id)
# Get model-day details
details = job_manager.get_job_details(job_id)
# Calculate pending (total - completed - failed)
pending = progress["total_model_days"] - progress["completed"] - progress["failed"]
return JobStatusResponse(
job_id=job["job_id"],
status=job["status"],
progress=JobProgress(
total_model_days=progress["total_model_days"],
completed=progress["completed"],
failed=progress["failed"],
pending=pending
),
date_range=job["date_range"],
models=job["models"],
created_at=job["created_at"],
started_at=job.get("started_at"),
completed_at=job.get("completed_at"),
total_duration_seconds=job.get("total_duration_seconds"),
error=job.get("error"),
details=details
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to get job status: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")
@app.get("/results")
async def get_results(
job_id: Optional[str] = Query(None, description="Filter by job ID"),
date: Optional[str] = Query(None, description="Filter by date (YYYY-MM-DD)"),
model: Optional[str] = Query(None, description="Filter by model signature")
):
"""
Query simulation results.
Supports filtering by job_id, date, and/or model.
Returns position data with holdings.
Args:
job_id: Optional job UUID filter
date: Optional date filter (YYYY-MM-DD)
model: Optional model signature filter
Returns:
List of position records with holdings
"""
try:
conn = get_db_connection(app.state.db_path)
cursor = conn.cursor()
# Build query with filters
query = """
SELECT
p.id,
p.job_id,
p.date,
p.model,
p.action_id,
p.action_type,
p.symbol,
p.amount,
p.price,
p.cash,
p.portfolio_value,
p.daily_profit,
p.daily_return_pct,
p.created_at
FROM positions p
WHERE 1=1
"""
params = []
if job_id:
query += " AND p.job_id = ?"
params.append(job_id)
if date:
query += " AND p.date = ?"
params.append(date)
if model:
query += " AND p.model = ?"
params.append(model)
query += " ORDER BY p.date, p.model, p.action_id"
cursor.execute(query, params)
rows = cursor.fetchall()
results = []
for row in rows:
position_id = row[0]
# Get holdings for this position
cursor.execute("""
SELECT symbol, quantity
FROM holdings
WHERE position_id = ?
ORDER BY symbol
""", (position_id,))
holdings = [{"symbol": h[0], "quantity": h[1]} for h in cursor.fetchall()]
results.append({
"id": row[0],
"job_id": row[1],
"date": row[2],
"model": row[3],
"action_id": row[4],
"action_type": row[5],
"symbol": row[6],
"amount": row[7],
"price": row[8],
"cash": row[9],
"portfolio_value": row[10],
"daily_profit": row[11],
"daily_return_pct": row[12],
"created_at": row[13],
"holdings": holdings
})
conn.close()
return {"results": results, "count": len(results)}
except Exception as e:
logger.error(f"Failed to query results: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")
@app.get("/health", response_model=HealthResponse)
async def health_check():
"""
Health check endpoint.
Verifies database connectivity and service status.
Returns:
Health status and timestamp
"""
try:
# Test database connection
conn = get_db_connection(app.state.db_path)
cursor = conn.cursor()
cursor.execute("SELECT 1")
cursor.fetchone()
conn.close()
database_status = "connected"
except Exception as e:
logger.error(f"Database health check failed: {e}")
database_status = "disconnected"
return HealthResponse(
status="healthy" if database_status == "connected" else "unhealthy",
database=database_status,
timestamp=datetime.utcnow().isoformat() + "Z"
)
return app
# Create default app instance
app = create_app()
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8080)
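The `field_validator` on `SimulateTriggerRequest.date_range` reduces to a plain strict-format check, which Pydantic surfaces to clients as an HTTP 422. A minimal standalone restatement of the same rule (the function name is illustrative, not part of the API):

```python
from datetime import datetime
from typing import List

def validate_dates(dates: List[str]) -> List[str]:
    """Accept only YYYY-MM-DD strings; raise ValueError otherwise,
    mirroring the validator on SimulateTriggerRequest."""
    for d in dates:
        try:
            datetime.strptime(d, "%Y-%m-%d")
        except ValueError:
            raise ValueError(f"Invalid date format: {d}. Expected YYYY-MM-DD")
    return dates

print(validate_dates(["2025-01-02", "2025-01-03"]))
```

Dates that parse but use a different separator (e.g. `2025/01/02`) are rejected, so Windmill callers get a 422 before any job row is created.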

api/model_day_executor.py
"""
Single model-day execution engine.
This module provides:
- Isolated execution of one model for one trading day
- Runtime config management per execution
- Result persistence to SQLite (positions, holdings, reasoning)
- Automatic status updates via JobManager
- Cleanup of temporary resources
"""
import logging
import os
from typing import Dict, Any, Optional, List, TYPE_CHECKING
from pathlib import Path
from api.runtime_manager import RuntimeConfigManager
from api.job_manager import JobManager
from api.database import get_db_connection
# Lazy import to avoid loading heavy dependencies during testing
if TYPE_CHECKING:
from agent.base_agent.base_agent import BaseAgent
logger = logging.getLogger(__name__)
class ModelDayExecutor:
"""
Executes a single model for a single trading day.
Responsibilities:
- Create isolated runtime config
- Initialize and run trading agent
- Persist results to SQLite
- Update job status
- Cleanup resources
Lifecycle:
1. __init__() → Create runtime config
2. execute() → Run agent, write results, update status
3. cleanup → Delete runtime config
"""
def __init__(
self,
job_id: str,
date: str,
model_sig: str,
config_path: str,
db_path: str = "data/jobs.db",
data_dir: str = "data"
):
"""
Initialize ModelDayExecutor.
Args:
job_id: Job UUID
date: Trading date (YYYY-MM-DD)
model_sig: Model signature
config_path: Path to configuration file
db_path: Path to SQLite database
data_dir: Data directory for runtime configs
"""
self.job_id = job_id
self.date = date
self.model_sig = model_sig
self.config_path = config_path
self.db_path = db_path
self.data_dir = data_dir
# Create isolated runtime config
self.runtime_manager = RuntimeConfigManager(data_dir=data_dir)
self.runtime_config_path = self.runtime_manager.create_runtime_config(
job_id=job_id,
model_sig=model_sig,
date=date
)
self.job_manager = JobManager(db_path=db_path)
logger.info(f"Initialized executor for {model_sig} on {date} (job: {job_id})")
def execute(self) -> Dict[str, Any]:
"""
Execute trading session and persist results.
Returns:
Result dict with success status and metadata
Process:
1. Update job_detail status to 'running'
2. Initialize and run trading agent
3. Write results to SQLite
4. Update job_detail status to 'completed' or 'failed'
5. Cleanup runtime config
SQLite writes:
- positions: Trading position record
- holdings: Portfolio holdings breakdown
- reasoning_logs: AI reasoning steps (if available)
- tool_usage: Tool usage statistics (if available)
"""
try:
# Update status to running
self.job_manager.update_job_detail_status(
self.job_id,
self.date,
self.model_sig,
"running"
)
# Set environment variable for agent to use isolated config
os.environ["RUNTIME_ENV_PATH"] = self.runtime_config_path
# Initialize agent
agent = self._initialize_agent()
# Run trading session
logger.info(f"Running trading session for {self.model_sig} on {self.date}")
session_result = agent.run_trading_session(self.date)
# Persist results to SQLite
self._write_results_to_db(agent, session_result)
# Update status to completed
self.job_manager.update_job_detail_status(
self.job_id,
self.date,
self.model_sig,
"completed"
)
logger.info(f"Successfully completed {self.model_sig} on {self.date}")
return {
"success": True,
"job_id": self.job_id,
"date": self.date,
"model": self.model_sig,
"session_result": session_result
}
except Exception as e:
error_msg = f"Execution failed: {str(e)}"
logger.error(f"{self.model_sig} on {self.date}: {error_msg}", exc_info=True)
# Update status to failed
self.job_manager.update_job_detail_status(
self.job_id,
self.date,
self.model_sig,
"failed",
error=error_msg
)
return {
"success": False,
"job_id": self.job_id,
"date": self.date,
"model": self.model_sig,
"error": error_msg
}
finally:
# Always cleanup runtime config
self.runtime_manager.cleanup_runtime_config(self.runtime_config_path)
def _initialize_agent(self):
"""
Initialize trading agent with config.
Returns:
Configured BaseAgent instance
"""
# Lazy import to avoid loading heavy dependencies during testing
from agent.base_agent.base_agent import BaseAgent
# Load config
import json
with open(self.config_path, 'r') as f:
config = json.load(f)
# Find model config
model_config = None
for model in config.get("models", []):
if model.get("signature") == self.model_sig:
model_config = model
break
if not model_config:
raise ValueError(f"Model {self.model_sig} not found in config")
# Initialize agent
agent = BaseAgent(
model_name=model_config.get("basemodel"),
signature=self.model_sig,
config=config
)
# Register agent (creates initial position if needed)
agent.register_agent()
return agent
def _write_results_to_db(self, agent, session_result: Dict[str, Any]) -> None:
"""
Write execution results to SQLite.
Args:
agent: Trading agent instance
session_result: Result from run_trading_session()
Writes to:
- positions: Position record with action and P&L
- holdings: Current portfolio holdings
- reasoning_logs: AI reasoning steps (if available)
- tool_usage: Tool usage stats (if available)
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
try:
# Get current positions and trade info
positions = agent.get_positions() if hasattr(agent, 'get_positions') else {}
last_trade = agent.get_last_trade() if hasattr(agent, 'get_last_trade') else None
# Calculate portfolio value
current_prices = agent.get_current_prices() if hasattr(agent, 'get_current_prices') else {}
total_value = self._calculate_portfolio_value(positions, current_prices)
# Get previous value for P&L calculation
cursor.execute("""
SELECT portfolio_value
FROM positions
WHERE job_id = ? AND model = ? AND date < ?
ORDER BY date DESC
LIMIT 1
""", (self.job_id, self.model_sig, self.date))
row = cursor.fetchone()
previous_value = row[0] if row else 10000.0 # Initial portfolio value
daily_profit = total_value - previous_value
daily_return_pct = (daily_profit / previous_value * 100) if previous_value > 0 else 0
# Determine action_id (sequence number for this model)
cursor.execute("""
SELECT COALESCE(MAX(action_id), 0) + 1
FROM positions
WHERE job_id = ? AND model = ?
""", (self.job_id, self.model_sig))
action_id = cursor.fetchone()[0]
# Insert position record
action_type = last_trade.get("action") if last_trade else "no_trade"
symbol = last_trade.get("symbol") if last_trade else None
amount = last_trade.get("amount") if last_trade else None
price = last_trade.get("price") if last_trade else None
cash = positions.get("CASH", 0.0)
from datetime import datetime, timezone
created_at = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
cursor.execute("""
INSERT INTO positions (
job_id, date, model, action_id, action_type, symbol,
amount, price, cash, portfolio_value, daily_profit, daily_return_pct, created_at
)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""", (
self.job_id, self.date, self.model_sig, action_id, action_type,
symbol, amount, price, cash, total_value,
daily_profit, daily_return_pct, created_at
))
position_id = cursor.lastrowid
# Insert holdings
for held_symbol, quantity in positions.items():  # distinct name avoids shadowing `symbol` above
cursor.execute("""
INSERT INTO holdings (position_id, symbol, quantity)
VALUES (?, ?, ?)
""", (position_id, held_symbol, float(quantity)))
# Insert reasoning logs (if available)
if hasattr(agent, 'get_reasoning_steps'):
reasoning_steps = agent.get_reasoning_steps()
for step in reasoning_steps:
cursor.execute("""
INSERT INTO reasoning_logs (
job_id, date, model, step_number, timestamp, content
)
VALUES (?, ?, ?, ?, ?, ?)
""", (
self.job_id, self.date, self.model_sig,
step.get("step"), created_at, step.get("reasoning")
))
# Insert tool usage (if available)
if hasattr(agent, 'get_tool_usage'):
tool_usage = agent.get_tool_usage()
for tool_name, count in tool_usage.items():
cursor.execute("""
INSERT INTO tool_usage (
job_id, date, model, tool_name, call_count
)
VALUES (?, ?, ?, ?, ?)
""", (self.job_id, self.date, self.model_sig, tool_name, count))
conn.commit()
logger.debug(f"Wrote results to DB for {self.model_sig} on {self.date}")
finally:
conn.close()
def _calculate_portfolio_value(
self,
positions: Dict[str, float],
current_prices: Dict[str, float]
) -> float:
"""
Calculate total portfolio value.
Args:
positions: Current holdings (symbol: quantity)
current_prices: Current market prices (symbol: price)
Returns:
Total portfolio value in dollars
"""
total = 0.0
for symbol, quantity in positions.items():
if symbol == "CASH":
total += quantity
else:
price = current_prices.get(symbol, 0.0)
total += quantity * price
return total

api/models.py Normal file

@@ -0,0 +1,459 @@
"""
Pydantic data models for AI-Trader API.
This module defines:
- Request models (input validation)
- Response models (output serialization)
- Nested models for complex data structures
"""
from pydantic import BaseModel, Field
from typing import Optional, List, Dict, Literal, Any
from datetime import datetime
# ==================== Request Models ====================
class TriggerSimulationRequest(BaseModel):
"""Request model for POST /simulate/trigger endpoint."""
config_path: str = Field(
default="configs/default_config.json",
description="Path to configuration file"
)
class Config:
json_schema_extra = {
"example": {
"config_path": "configs/default_config.json"
}
}
class ResultsQueryParams(BaseModel):
"""Query parameters for GET /results endpoint."""
date: str = Field(
...,
pattern=r"^\d{4}-\d{2}-\d{2}$",
description="Date in YYYY-MM-DD format"
)
model: Optional[str] = Field(
None,
description="Model signature filter (optional)"
)
detail: Literal["minimal", "full"] = Field(
default="minimal",
description="Response detail level"
)
class Config:
json_schema_extra = {
"example": {
"date": "2025-01-16",
"model": "gpt-5",
"detail": "minimal"
}
}
# ==================== Nested Response Models ====================
class JobProgress(BaseModel):
"""Progress tracking for simulation jobs."""
total_model_days: int = Field(
...,
description="Total number of model-days to execute"
)
completed: int = Field(
...,
description="Number of model-days completed"
)
failed: int = Field(
...,
description="Number of model-days that failed"
)
current: Optional[Dict[str, str]] = Field(
None,
description="Currently executing model-day (if any)"
)
details: Optional[List[Dict]] = Field(
None,
description="Detailed progress for each model-day"
)
class Config:
json_schema_extra = {
"example": {
"total_model_days": 4,
"completed": 2,
"failed": 0,
"current": {"date": "2025-01-16", "model": "gpt-5"},
"details": [
{
"date": "2025-01-16",
"model": "gpt-5",
"status": "completed",
"duration_seconds": 45.2
}
]
}
}
class DailyPnL(BaseModel):
"""Daily profit and loss metrics."""
profit: float = Field(
...,
description="Daily profit in dollars"
)
return_pct: float = Field(
...,
description="Daily return percentage"
)
portfolio_value: float = Field(
...,
description="Total portfolio value"
)
class Config:
json_schema_extra = {
"example": {
"profit": 150.50,
"return_pct": 1.51,
"portfolio_value": 10150.50
}
}
class Trade(BaseModel):
"""Individual trade record."""
id: int = Field(
...,
description="Trade sequence ID"
)
action: str = Field(
...,
description="Trade action (buy/sell)"
)
symbol: str = Field(
...,
description="Stock symbol"
)
amount: int = Field(
...,
description="Number of shares"
)
price: Optional[float] = Field(
None,
description="Trade price per share"
)
total: Optional[float] = Field(
None,
description="Total trade value"
)
class Config:
json_schema_extra = {
"example": {
"id": 1,
"action": "buy",
"symbol": "AAPL",
"amount": 10,
"price": 255.88,
"total": 2558.80
}
}
class AIReasoning(BaseModel):
"""AI reasoning and decision-making summary."""
total_steps: int = Field(
...,
description="Total reasoning steps taken"
)
stop_signal_received: bool = Field(
...,
description="Whether AI sent stop signal"
)
reasoning_summary: str = Field(
...,
description="Summary of AI reasoning"
)
tool_usage: Dict[str, int] = Field(
...,
description="Tool usage counts"
)
class Config:
json_schema_extra = {
"example": {
"total_steps": 15,
"stop_signal_received": True,
"reasoning_summary": "Market analysis indicates...",
"tool_usage": {
"search": 3,
"get_price": 5,
"math": 2,
"trade": 1
}
}
}
class ModelResult(BaseModel):
"""Simulation results for a single model on a single date."""
model: str = Field(
...,
description="Model signature"
)
positions: Dict[str, float] = Field(
...,
description="Current positions (symbol: quantity)"
)
daily_pnl: DailyPnL = Field(
...,
description="Daily P&L metrics"
)
trades: Optional[List[Trade]] = Field(
None,
description="Trades executed (detail=full only)"
)
ai_reasoning: Optional[AIReasoning] = Field(
None,
description="AI reasoning summary (detail=full only)"
)
log_file_path: Optional[str] = Field(
None,
description="Path to detailed log file (detail=full only)"
)
class Config:
json_schema_extra = {
"example": {
"model": "gpt-5",
"positions": {
"AAPL": 10,
"MSFT": 5,
"CASH": 7500.0
},
"daily_pnl": {
"profit": 150.50,
"return_pct": 1.51,
"portfolio_value": 10150.50
}
}
}
# ==================== Response Models ====================
class TriggerSimulationResponse(BaseModel):
"""Response model for POST /simulate/trigger endpoint."""
job_id: str = Field(
...,
description="Unique job identifier"
)
status: str = Field(
...,
description="Job status (accepted/running)"
)
date_range: List[str] = Field(
...,
description="Dates to be simulated"
)
models: List[str] = Field(
...,
description="Models to execute"
)
created_at: str = Field(
...,
description="Job creation timestamp (ISO 8601)"
)
message: str = Field(
...,
description="Human-readable status message"
)
progress: Optional[JobProgress] = Field(
None,
description="Progress (if job already running)"
)
class Config:
json_schema_extra = {
"example": {
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "accepted",
"date_range": ["2025-01-16", "2025-01-17"],
"models": ["gpt-5", "claude-3.7-sonnet"],
"created_at": "2025-01-20T14:30:00Z",
"message": "Simulation job queued successfully"
}
}
class JobStatusResponse(BaseModel):
"""Response model for GET /simulate/status/{job_id} endpoint."""
job_id: str = Field(
...,
description="Job identifier"
)
status: str = Field(
...,
description="Job status (pending/running/completed/partial/failed)"
)
date_range: List[str] = Field(
...,
description="Dates being simulated"
)
models: List[str] = Field(
...,
description="Models being executed"
)
progress: JobProgress = Field(
...,
description="Execution progress"
)
created_at: str = Field(
...,
description="Job creation timestamp"
)
updated_at: Optional[str] = Field(
None,
description="Last update timestamp"
)
completed_at: Optional[str] = Field(
None,
description="Job completion timestamp"
)
total_duration_seconds: Optional[float] = Field(
None,
description="Total execution duration"
)
class Config:
json_schema_extra = {
"example": {
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "running",
"date_range": ["2025-01-16", "2025-01-17"],
"models": ["gpt-5"],
"progress": {
"total_model_days": 2,
"completed": 1,
"failed": 0,
"current": {"date": "2025-01-17", "model": "gpt-5"}
},
"created_at": "2025-01-20T14:30:00Z"
}
}
class ResultsResponse(BaseModel):
"""Response model for GET /results endpoint."""
date: str = Field(
...,
description="Trading date"
)
results: List[ModelResult] = Field(
...,
description="Results for each model"
)
class Config:
json_schema_extra = {
"example": {
"date": "2025-01-16",
"results": [
{
"model": "gpt-5",
"positions": {"AAPL": 10, "CASH": 7500.0},
"daily_pnl": {
"profit": 150.50,
"return_pct": 1.51,
"portfolio_value": 10150.50
}
}
]
}
}
class HealthCheckResponse(BaseModel):
"""Response model for GET /health endpoint."""
status: str = Field(
...,
description="Overall health status (healthy/unhealthy)"
)
timestamp: str = Field(
...,
description="Health check timestamp"
)
services: Dict[str, Dict] = Field(
...,
description="Status of each service"
)
storage: Dict[str, Any] = Field(
...,
description="Storage status"
)
database: Dict[str, Any] = Field(
...,
description="Database status"
)
class Config:
json_schema_extra = {
"example": {
"status": "healthy",
"timestamp": "2025-01-20T14:30:00Z",
"services": {
"mcp_math": {"status": "up", "url": "http://localhost:8000/mcp"},
"mcp_search": {"status": "up", "url": "http://localhost:8001/mcp"}
},
"storage": {
"data_directory": "/app/data",
"writable": True,
"free_space_mb": 15234
},
"database": {
"status": "connected",
"path": "/app/data/jobs.db"
}
}
}
class ErrorResponse(BaseModel):
"""Standard error response model."""
error: str = Field(
...,
description="Error code/type"
)
message: str = Field(
...,
description="Human-readable error message"
)
details: Optional[Dict] = Field(
None,
description="Additional error details"
)
class Config:
json_schema_extra = {
"example": {
"error": "invalid_date",
"message": "Date must be in YYYY-MM-DD format",
"details": {"provided": "2025/01/16"}
}
}

api/runtime_manager.py Normal file

@@ -0,0 +1,131 @@
"""
Runtime configuration manager for isolated model-day execution.
This module provides:
- Isolated runtime config file creation per model-day
- Prevention of state collisions between concurrent executions
- Automatic cleanup of temporary config files
"""
import os
import json
from pathlib import Path
import logging
logger = logging.getLogger(__name__)
class RuntimeConfigManager:
"""
Manages isolated runtime configuration files for concurrent model execution.
Problem:
Multiple models running concurrently need separate runtime_env.json files
to avoid race conditions on TODAY_DATE, SIGNATURE, IF_TRADE values.
Solution:
Create temporary runtime config file per model-day execution:
- /app/data/runtime_env_{job_id}_{model}_{date}.json
Lifecycle:
1. create_runtime_config() → Creates temp file
2. Executor sets RUNTIME_ENV_PATH env var
3. Agent uses isolated config via get_config_value/write_config_value
4. cleanup_runtime_config() → Deletes temp file
"""
def __init__(self, data_dir: str = "data"):
"""
Initialize RuntimeConfigManager.
Args:
data_dir: Directory for runtime config files (default: "data")
"""
self.data_dir = Path(data_dir)
self.data_dir.mkdir(parents=True, exist_ok=True)
def create_runtime_config(
self,
job_id: str,
model_sig: str,
date: str
) -> str:
"""
Create isolated runtime config file for this execution.
Args:
job_id: Job UUID
model_sig: Model signature
date: Trading date (YYYY-MM-DD)
Returns:
Path to created runtime config file
Example:
config_path = manager.create_runtime_config(
"abc123...",
"gpt-5",
"2025-01-16"
)
# Returns: "data/runtime_env_abc123_gpt-5_2025-01-16.json"
"""
# Generate unique filename (first 8 chars of job_id for brevity; slicing is safe for shorter ids)
job_id_short = job_id[:8]
filename = f"runtime_env_{job_id_short}_{model_sig}_{date}.json"
config_path = self.data_dir / filename
# Initialize with default values
initial_config = {
"TODAY_DATE": date,
"SIGNATURE": model_sig,
"IF_TRADE": False,
"JOB_ID": job_id
}
with open(config_path, "w", encoding="utf-8") as f:
json.dump(initial_config, f, indent=4)
logger.debug(f"Created runtime config: {config_path}")
return str(config_path)
def cleanup_runtime_config(self, config_path: str) -> None:
"""
Delete runtime config file after execution.
Args:
config_path: Path to runtime config file
Note:
Silently ignores if file doesn't exist (already cleaned up)
"""
try:
if os.path.exists(config_path):
os.unlink(config_path)
logger.debug(f"Cleaned up runtime config: {config_path}")
except Exception as e:
logger.warning(f"Failed to cleanup runtime config {config_path}: {e}")
def cleanup_all_runtime_configs(self) -> int:
"""
Cleanup all runtime config files (for maintenance/startup).
Returns:
Number of files deleted
Use case:
- On API startup to clean stale configs from previous runs
- Periodic maintenance
"""
count = 0
for config_file in self.data_dir.glob("runtime_env_*.json"):
try:
config_file.unlink()
count += 1
logger.debug(f"Deleted stale runtime config: {config_file}")
except Exception as e:
logger.warning(f"Failed to delete {config_file}: {e}")
if count > 0:
logger.info(f"Cleaned up {count} stale runtime config files")
return count

api/simulation_worker.py Normal file

@@ -0,0 +1,210 @@
"""
Simulation job orchestration worker.
This module provides:
- Job execution orchestration
- Date-sequential, model-parallel execution
- Progress tracking and status updates
- Error handling and recovery
"""
import logging
from typing import Dict, Any, List
from concurrent.futures import ThreadPoolExecutor, as_completed
from api.job_manager import JobManager
from api.model_day_executor import ModelDayExecutor
logger = logging.getLogger(__name__)
class SimulationWorker:
"""
Orchestrates execution of a simulation job.
Responsibilities:
- Execute all model-day combinations for a job
- Date-sequential execution (one date at a time)
- Model-parallel execution (all models for a date run concurrently)
- Update job status throughout execution
- Handle failures gracefully
Execution Strategy:
For each date in job.date_range:
Execute all models in parallel using ThreadPoolExecutor
Wait for all models to complete before moving to next date
Status Transitions:
pending → running → completed (all succeeded)
→ partial (some failed)
→ failed (job-level error)
"""
def __init__(self, job_id: str, db_path: str = "data/jobs.db", max_workers: int = 4):
"""
Initialize SimulationWorker.
Args:
job_id: Job UUID to execute
db_path: Path to SQLite database
max_workers: Maximum concurrent model executions per date
"""
self.job_id = job_id
self.db_path = db_path
self.max_workers = max_workers
self.job_manager = JobManager(db_path=db_path)
logger.info(f"Initialized worker for job {job_id}")
def run(self) -> Dict[str, Any]:
"""
Execute the simulation job.
Returns:
Result dict with success status and summary
Process:
1. Get job details (dates, models, config)
2. For each date sequentially:
a. Execute all models in parallel
b. Wait for all to complete
c. Update progress
3. Determine final job status
4. Update job with final status
Error Handling:
- Individual model failures: Mark detail as failed, continue with others
- Job-level errors: Mark entire job as failed
"""
try:
# Get job info
job = self.job_manager.get_job(self.job_id)
if not job:
raise ValueError(f"Job {self.job_id} not found")
date_range = job["date_range"]
models = job["models"]
config_path = job["config_path"]
logger.info(f"Starting job {self.job_id}: {len(date_range)} dates, {len(models)} models")
# Execute date-by-date (sequential)
for date in date_range:
logger.info(f"Processing date {date} with {len(models)} models")
self._execute_date(date, models, config_path)
# Job completed - determine final status
progress = self.job_manager.get_job_progress(self.job_id)
if progress["failed"] == 0:
final_status = "completed"
elif progress["completed"] > 0:
final_status = "partial"
else:
final_status = "failed"
# Final job status is maintained automatically by JobManager.update_job_detail_status
# as each model-day detail completes, so no explicit update_job_status call is needed here.
logger.info(f"Job {self.job_id} finished with status: {final_status}")
return {
"success": True,
"job_id": self.job_id,
"status": final_status,
"total_model_days": progress["total_model_days"],
"completed": progress["completed"],
"failed": progress["failed"]
}
except Exception as e:
error_msg = f"Job execution failed: {str(e)}"
logger.error(f"Job {self.job_id}: {error_msg}", exc_info=True)
# Update job to failed
self.job_manager.update_job_status(self.job_id, "failed", error=error_msg)
return {
"success": False,
"job_id": self.job_id,
"error": error_msg
}
def _execute_date(self, date: str, models: List[str], config_path: str) -> None:
"""
Execute all models for a single date in parallel.
Args:
date: Trading date (YYYY-MM-DD)
models: List of model signatures to execute
config_path: Path to configuration file
Uses ThreadPoolExecutor to run all models concurrently for this date.
Waits for all models to complete before returning.
"""
with ThreadPoolExecutor(max_workers=self.max_workers) as executor:
# Submit all model executions for this date
futures = []
for model in models:
future = executor.submit(
self._execute_model_day,
date,
model,
config_path
)
futures.append(future)
# Wait for all to complete
for future in as_completed(futures):
try:
result = future.result()
if result["success"]:
logger.debug(f"Completed {result['model']} on {result['date']}")
else:
logger.warning(f"Failed {result['model']} on {result['date']}: {result.get('error')}")
except Exception as e:
logger.error(f"Exception in model execution: {e}", exc_info=True)
def _execute_model_day(self, date: str, model: str, config_path: str) -> Dict[str, Any]:
"""
Execute a single model for a single date.
Args:
date: Trading date (YYYY-MM-DD)
model: Model signature
config_path: Path to configuration file
Returns:
Execution result dict
"""
try:
executor = ModelDayExecutor(
job_id=self.job_id,
date=date,
model_sig=model,
config_path=config_path,
db_path=self.db_path
)
result = executor.execute()
return result
except Exception as e:
logger.error(f"Failed to execute {model} on {date}: {e}", exc_info=True)
return {
"success": False,
"job_id": self.job_id,
"date": date,
"model": model,
"error": str(e)
}
def get_job_info(self) -> Dict[str, Any]:
"""
Get job information.
Returns:
Job data dict
"""
return self.job_manager.get_job(self.job_id)


@@ -0,0 +1,6 @@
{
"TODAY_DATE": "2025-01-16",
"SIGNATURE": "gpt-5",
"IF_TRADE": false,
"JOB_ID": "test-job-123"
}


@@ -1,9 +1,10 @@
services:
ai-trader:
# Batch mode: Run one-time simulations with config file
ai-trader-batch:
image: ghcr.io/xe138/ai-trader:latest
# Uncomment to build locally instead of pulling:
# build: .
container_name: ai-trader-app
container_name: ai-trader-batch
volumes:
- ${VOLUME_PATH:-.}/data:/app/data
- ${VOLUME_PATH:-.}/logs:/app/logs
@@ -35,4 +36,58 @@ services:
- "${TRADE_HTTP_PORT:-8002}:8002"
- "${GETPRICE_HTTP_PORT:-8003}:8003"
- "${WEB_HTTP_PORT:-8888}:8888"
restart: on-failure:3 # Restart max 3 times on failure, prevents endless loops
restart: "no" # Batch jobs should not auto-restart
profiles:
- batch # Only start with: docker-compose --profile batch up
# API mode: REST API server for Windmill integration
ai-trader-api:
image: ghcr.io/xe138/ai-trader:latest
# Uncomment to build locally instead of pulling:
# build: .
container_name: ai-trader-api
entrypoint: ["./entrypoint-api.sh"]
volumes:
- ${VOLUME_PATH:-.}/data:/app/data
- ${VOLUME_PATH:-.}/logs:/app/logs
- ${VOLUME_PATH:-.}/configs:/app/configs
environment:
# AI Model API Configuration
- OPENAI_API_BASE=${OPENAI_API_BASE}
- OPENAI_API_KEY=${OPENAI_API_KEY}
# Data Source Configuration
- ALPHAADVANTAGE_API_KEY=${ALPHAADVANTAGE_API_KEY}
- JINA_API_KEY=${JINA_API_KEY}
# System Configuration
- RUNTIME_ENV_PATH=/app/data/runtime_env.json
# MCP Service Ports (fixed internally)
- MATH_HTTP_PORT=8000
- SEARCH_HTTP_PORT=8001
- TRADE_HTTP_PORT=8002
- GETPRICE_HTTP_PORT=8003
# API Configuration
- API_PORT=${API_PORT:-8080}
# Agent Configuration
- AGENT_MAX_STEP=${AGENT_MAX_STEP:-30}
ports:
# MCP service ports (internal)
- "${MATH_HTTP_PORT:-8000}:8000"
- "${SEARCH_HTTP_PORT:-8001}:8001"
- "${TRADE_HTTP_PORT:-8002}:8002"
- "${GETPRICE_HTTP_PORT:-8003}:8003"
# API server port (primary interface)
- "${API_PORT:-8080}:8080"
# Web dashboard
- "${WEB_HTTP_PORT:-8888}:8888"
restart: unless-stopped # Keep API server running
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s


@@ -0,0 +1,631 @@
# AI-Trader API Service - Enhanced Specifications Summary
## Changes from Original Specifications
Based on user feedback, the specifications have been enhanced with:
1. **SQLite-backed results storage** (instead of reading position.jsonl on-demand)
2. **Comprehensive Python testing suite** with pytest
3. **Defined testing thresholds** for coverage, performance, and quality gates
---
## Document Index
### Core Specifications (Original)
1. **[api-specification.md](./api-specification.md)** - REST API endpoints and data models
2. **[job-manager-specification.md](./job-manager-specification.md)** - Job tracking and database layer
3. **[worker-specification.md](./worker-specification.md)** - Background worker architecture
4. **[implementation-specifications.md](./implementation-specifications.md)** - Agent, Docker, Windmill integration
### Enhanced Specifications (New)
5. **[database-enhanced-specification.md](./database-enhanced-specification.md)** - SQLite results storage
6. **[testing-specification.md](./testing-specification.md)** - Comprehensive testing suite
### Summary Documents
7. **[README-SPECS.md](./README-SPECS.md)** - Original specifications overview
8. **[ENHANCED-SPECIFICATIONS-SUMMARY.md](./ENHANCED-SPECIFICATIONS-SUMMARY.md)** - This document
---
## Key Enhancement #1: SQLite Results Storage
### What Changed
**Before:**
- `/results` endpoint reads `position.jsonl` files on-demand
- File I/O on every API request
- No support for advanced queries (date ranges, aggregations)
**After:**
- Simulation results written to SQLite during execution
- Fast database queries (10-100x faster than file I/O)
- Advanced analytics: timeseries, leaderboards, aggregations
### New Database Tables
```sql
-- Results storage
CREATE TABLE positions (
id INTEGER PRIMARY KEY,
job_id TEXT,
date TEXT,
model TEXT,
action_id INTEGER,
action_type TEXT,
symbol TEXT,
amount INTEGER,
price REAL,
cash REAL,
portfolio_value REAL,
daily_profit REAL,
daily_return_pct REAL,
cumulative_profit REAL,
cumulative_return_pct REAL,
created_at TEXT,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
CREATE TABLE holdings (
id INTEGER PRIMARY KEY,
position_id INTEGER,
symbol TEXT,
quantity INTEGER,
FOREIGN KEY (position_id) REFERENCES positions(id) ON DELETE CASCADE
);
CREATE TABLE reasoning_logs (
id INTEGER PRIMARY KEY,
job_id TEXT,
date TEXT,
model TEXT,
step_number INTEGER,
timestamp TEXT,
role TEXT,
content TEXT,
tool_name TEXT,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
CREATE TABLE tool_usage (
id INTEGER PRIMARY KEY,
job_id TEXT,
date TEXT,
model TEXT,
tool_name TEXT,
call_count INTEGER,
total_duration_seconds REAL,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
```
### New API Endpoints
```python
# Enhanced results endpoint (now reads from SQLite)
GET /results?date=2025-01-16&model=gpt-5&detail=minimal|full
# New analytics endpoints
GET /portfolio/timeseries?model=gpt-5&start_date=2025-01-01&end_date=2025-01-31
GET /leaderboard?date=2025-01-16 # Rankings by portfolio value
```
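As an illustration of how the new `/leaderboard` endpoint could be served from the `positions` table above, here is a minimal sketch. The ranking SQL and the result shape are assumptions, not the project's actual implementation, and a real query would first select the latest `action_id` per model.

```python
import sqlite3

def get_leaderboard(conn: sqlite3.Connection, date: str) -> list:
    """Rank models by portfolio value for a given date (sketch).
    A real implementation would pick the latest action_id per model."""
    rows = conn.execute(
        """
        SELECT model, portfolio_value, daily_return_pct
        FROM positions
        WHERE date = ?
        ORDER BY portfolio_value DESC
        """,
        (date,),
    ).fetchall()
    return [
        {"rank": i + 1, "model": m, "portfolio_value": v, "daily_return_pct": r}
        for i, (m, v, r) in enumerate(rows)
    ]

# Demo against an in-memory database with the relevant columns
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE positions (model TEXT, date TEXT, "
    "portfolio_value REAL, daily_return_pct REAL)"
)
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?, ?)",
    [
        ("gpt-5", "2025-01-16", 10150.50, 1.51),
        ("claude-3.7-sonnet", "2025-01-16", 10080.00, 0.80),
    ],
)
board = get_leaderboard(conn, "2025-01-16")  # gpt-5 ranks first
```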
### Migration Strategy
**Phase 1:** Dual-write mode
- Agent writes to `position.jsonl` (existing code)
- Executor writes to SQLite after agent completes
- Ensures backward compatibility
**Phase 2:** Verification
- Compare SQLite data vs JSONL data
- Fix any discrepancies
**Phase 3:** Switch over
- `/results` endpoint reads from SQLite
- JSONL writes become optional (can deprecate later)
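The Phase 1 dual-write step can be sketched as follows; the `dual_write` helper, its signature, and the trimmed-down record fields are illustrative assumptions:

```python
import json
import os
import sqlite3
import tempfile

def dual_write(jsonl_path: str, conn: sqlite3.Connection, record: dict) -> None:
    """Phase 1 sketch: append to the legacy position.jsonl AND insert into
    SQLite, so both stores stay in sync until the Phase 3 switch-over."""
    # Legacy JSONL append (existing behavior)
    with open(jsonl_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    # New SQLite write (named placeholders bound from the record dict)
    conn.execute(
        "INSERT INTO positions (job_id, date, model, portfolio_value) "
        "VALUES (:job_id, :date, :model, :portfolio_value)",
        record,
    )
    conn.commit()

# Demo with an in-memory database and a temporary JSONL file
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE positions (job_id TEXT, date TEXT, model TEXT, portfolio_value REAL)"
)
jsonl_path = os.path.join(tempfile.mkdtemp(), "position.jsonl")
dual_write(jsonl_path, conn, {
    "job_id": "j1", "date": "2025-01-16",
    "model": "gpt-5", "portfolio_value": 10150.5,
})
```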
### Performance Improvement
| Operation | Before (JSONL) | After (SQLite) | Speedup |
|-----------|----------------|----------------|---------|
| Get results for 1 date | 200-500ms | 20-50ms | **10x faster** |
| Get timeseries (30 days) | 6-15 seconds | 100-300ms | **50x faster** |
| Get leaderboard | 5-10 seconds | 50-100ms | **100x faster** |
---
## Key Enhancement #2: Comprehensive Testing Suite
### Testing Thresholds
| Metric | Minimum | Target | Enforcement |
|--------|---------|--------|-------------|
| **Code Coverage** | 85% | 90% | CI fails if below |
| **Critical Path Coverage** | 90% | 95% | Manual review |
| **Unit Test Speed** | <10s | <5s | Benchmark tracking |
| **Integration Test Speed** | <60s | <30s | Benchmark tracking |
| **API Response Times** | <500ms | <200ms | Load testing |
### Test Suite Structure
```
tests/
├── unit/ # 80 tests, <10 seconds
│ ├── test_job_manager.py # 95% coverage target
│ ├── test_database.py
│ ├── test_runtime_manager.py
│ ├── test_results_service.py # 95% coverage target
│ └── test_models.py
├── integration/ # 30 tests, <60 seconds
│ ├── test_api_endpoints.py # Full FastAPI testing
│ ├── test_worker.py
│ ├── test_executor.py
│ └── test_end_to_end.py
├── performance/ # 20 tests
│ ├── test_database_benchmarks.py
│ ├── test_api_load.py # Locust load testing
│ └── test_simulation_timing.py
├── security/ # 10 tests
│ ├── test_api_security.py # SQL injection, XSS, path traversal
│ └── test_auth.py # Future: API key validation
└── e2e/ # 10 tests, Docker required
└── test_docker_workflow.py # Full Docker compose scenario
```
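To make the suite concrete, a unit test in this layout can be a plain `test_*` function that pytest discovers automatically; `make_connection` below stands in for the project's database helper and is an assumption:

```python
import sqlite3

def make_connection(path: str = ":memory:") -> sqlite3.Connection:
    """Stand-in for the project's get_db_connection helper (assumed name):
    opens a connection with foreign-key enforcement turned on."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    return conn

def test_foreign_keys_enabled():
    # pytest discovers plain `test_*` functions; no framework import needed
    conn = make_connection()
    assert conn.execute("PRAGMA foreign_keys").fetchone()[0] == 1
    conn.close()

test_foreign_keys_enabled()  # run directly here; pytest would collect it
```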
### Quality Gates
**All PRs must pass:**
1. ✅ All tests passing (unit + integration)
2. ✅ Code coverage ≥ 85%
3. ✅ No critical security vulnerabilities (Bandit scan)
4. ✅ Linting passes (Ruff or Flake8)
5. ✅ Type checking passes (mypy strict mode)
6. ✅ No performance regressions (±10% tolerance)
**Release checklist:**
1. ✅ All quality gates pass
2. ✅ End-to-end tests pass in Docker
3. ✅ Load testing passes (100 concurrent requests)
4. ✅ Security scan passes (OWASP ZAP)
5. ✅ Manual smoke tests complete
### CI/CD Integration
```yaml
# .github/workflows/test.yml
name: Test Suite
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Run unit tests
run: pytest tests/unit/ --cov=api --cov-fail-under=85
- name: Run integration tests
run: pytest tests/integration/
- name: Security scan
run: bandit -r api/ -ll
- name: Upload coverage
uses: codecov/codecov-action@v3
```
### Test Coverage Breakdown
| Component | Minimum | Target | Tests |
|-----------|---------|--------|-------|
| `api/job_manager.py` | 90% | 95% | 25 tests |
| `api/worker.py` | 85% | 90% | 15 tests |
| `api/executor.py` | 85% | 90% | 12 tests |
| `api/results_service.py` | 90% | 95% | 18 tests |
| `api/database.py` | 95% | 100% | 10 tests |
| `api/runtime_manager.py` | 85% | 90% | 8 tests |
| `api/main.py` | 80% | 85% | 20 tests |
| **Total** | **85%** | **90%** | **~150 tests** |
---
## Updated Implementation Plan
### Phase 1: API Foundation (Days 1-2)
- [x] Create `api/` directory structure
- [ ] Implement `api/models.py` with Pydantic models
- [ ] Implement `api/database.py` with **enhanced schema** (6 tables)
- [ ] Implement `api/job_manager.py` with job CRUD operations
- [ ] **NEW:** Write unit tests for job_manager (target: 95% coverage)
- [ ] Test database operations manually
**Testing Deliverables:**
- 25 unit tests for job_manager
- 10 unit tests for database utilities
- 85%+ coverage for Phase 1 code
---
### Phase 2: Worker & Executor (Days 3-4)
- [ ] Implement `api/runtime_manager.py`
- [ ] Implement `api/executor.py` for single model-day execution
- [ ] **NEW:** Add SQLite write logic to executor (`_store_results_to_db()`)
- [ ] Implement `api/worker.py` for job orchestration
- [ ] **NEW:** Write unit tests for worker and executor (target: 85% coverage)
- [ ] Test runtime config isolation
**Testing Deliverables:**
- 15 unit tests for worker
- 12 unit tests for executor
- 8 unit tests for runtime_manager
- 85%+ coverage for Phase 2 code
---
### Phase 3: Results Service & FastAPI Endpoints (Days 5-6)
- [ ] **NEW:** Implement `api/results_service.py` (SQLite-backed)
- [ ] `get_results(date, model, detail)`
- [ ] `get_portfolio_timeseries(model, start_date, end_date)`
- [ ] `get_leaderboard(date)`
- [ ] Implement `api/main.py` with all endpoints
- [ ] `/simulate/trigger` with background tasks
- [ ] `/simulate/status/{job_id}`
- [ ] `/simulate/current`
- [ ] `/results` (now reads from SQLite)
- [ ] **NEW:** `/portfolio/timeseries`
- [ ] **NEW:** `/leaderboard`
- [ ] `/health` with MCP checks
- [ ] **NEW:** Write unit tests for results_service (target: 95% coverage)
- [ ] **NEW:** Write integration tests for API endpoints (target: 80% coverage)
- [ ] Test all endpoints with Postman/curl
**Testing Deliverables:**
- 18 unit tests for results_service
- 20 integration tests for API endpoints
- Performance benchmarks for database queries
- 85%+ coverage for Phase 3 code
---
### Phase 4: Docker Integration (Day 7)
- [ ] Update `Dockerfile`
- [ ] Create `docker-entrypoint-api.sh`
- [ ] Create `requirements-api.txt`
- [ ] Update `docker-compose.yml`
- [ ] Test Docker build
- [ ] Test container startup and health checks
- [ ] **NEW:** Run E2E tests in Docker environment
- [ ] Test end-to-end simulation via API in Docker
**Testing Deliverables:**
- 10 E2E tests with Docker
- Docker health check validation
- Performance testing in containerized environment
---
### Phase 5: Windmill Integration (Days 8-9)
- [ ] Create Windmill scripts (trigger, poll, store)
- [ ] **UPDATED:** Modify `store_simulation_results.py` to use new `/results` endpoint
- [ ] Test scripts locally against Docker API
- [ ] Deploy scripts to Windmill instance
- [ ] Create Windmill workflow
- [ ] Test workflow end-to-end
- [ ] Create Windmill dashboard (using new `/portfolio/timeseries` and `/leaderboard` endpoints)
- [ ] Document Windmill setup process
**Testing Deliverables:**
- Integration tests for Windmill scripts
- End-to-end workflow validation
- Dashboard functionality verification
---
### Phase 6: Testing, Security & Documentation (Day 10)
- [ ] **NEW:** Run full test suite and verify all thresholds met
- [ ] Code coverage ≥ 85%
- [ ] All ~150 tests passing
- [ ] Performance benchmarks within limits
- [ ] **NEW:** Security testing
- [ ] Bandit scan (Python security issues)
- [ ] SQL injection tests
- [ ] Input validation tests
- [ ] OWASP ZAP scan (optional)
- [ ] **NEW:** Load testing with Locust
- [ ] 100 concurrent users
- [ ] API endpoints within performance thresholds
- [ ] Integration tests for complete workflow
- [ ] Update README.md with API usage
- [ ] Create API documentation (Swagger/OpenAPI - auto-generated by FastAPI)
- [ ] Create deployment guide
- [ ] Create troubleshooting guide
- [ ] **NEW:** Generate test coverage report
**Testing Deliverables:**
- Full test suite execution report
- Security scan results
- Load testing results
- Coverage report (HTML + XML)
- CI/CD pipeline configuration
---
## New Files Created
### Database & Results
- `api/results_service.py` - SQLite-backed results retrieval
- `api/import_historical_data.py` - Migration script for existing position.jsonl files
### Testing Suite
- `tests/conftest.py` - Shared pytest fixtures
- `tests/unit/test_job_manager.py` - 25 tests
- `tests/unit/test_database.py` - 10 tests
- `tests/unit/test_runtime_manager.py` - 8 tests
- `tests/unit/test_results_service.py` - 18 tests
- `tests/unit/test_models.py` - 5 tests
- `tests/integration/test_api_endpoints.py` - 20 tests
- `tests/integration/test_worker.py` - 15 tests
- `tests/integration/test_executor.py` - 12 tests
- `tests/integration/test_end_to_end.py` - 5 tests
- `tests/performance/test_database_benchmarks.py` - 10 tests
- `tests/performance/test_api_load.py` - Locust load testing
- `tests/security/test_api_security.py` - 10 tests
- `tests/e2e/test_docker_workflow.py` - 10 tests
- `pytest.ini` - Test configuration
- `requirements-dev.txt` - Testing dependencies
### CI/CD
- `.github/workflows/test.yml` - GitHub Actions workflow
---
## Updated File Structure
```
AI-Trader/
├── api/
│ ├── __init__.py
│ ├── main.py # FastAPI application
│ ├── models.py # Pydantic request/response models
│ ├── job_manager.py # Job lifecycle management
│ ├── database.py # SQLite utilities (enhanced schema)
│ ├── worker.py # Background simulation worker
│ ├── executor.py # Single model-day execution (+ SQLite writes)
│ ├── runtime_manager.py # Runtime config isolation
│ ├── results_service.py # NEW: SQLite-backed results retrieval
│ └── import_historical_data.py # NEW: JSONL → SQLite migration
├── tests/ # NEW: Comprehensive test suite
│ ├── conftest.py
│ ├── unit/ # 80 tests, <10s
│ ├── integration/ # 30 tests, <60s
│ ├── performance/ # 20 tests
│ ├── security/ # 10 tests
│ └── e2e/ # 10 tests
├── docs/
│ ├── api-specification.md
│ ├── job-manager-specification.md
│ ├── worker-specification.md
│ ├── implementation-specifications.md
│ ├── database-enhanced-specification.md # NEW
│ ├── testing-specification.md # NEW
│ ├── README-SPECS.md
│ └── ENHANCED-SPECIFICATIONS-SUMMARY.md # NEW (this file)
├── data/
│ ├── jobs.db # SQLite database (6 tables)
│ ├── runtime_env*.json # Runtime configs (temporary)
│ ├── agent_data/ # Existing position/log data
│ └── merged.jsonl # Existing price data
├── pytest.ini # NEW: Test configuration
├── requirements-dev.txt # NEW: Testing dependencies
├── .github/workflows/test.yml # NEW: CI/CD pipeline
└── ... (existing files)
```
---
## Benefits Summary
### Performance
- **10-100x faster** results queries (SQLite vs file I/O)
- **Advanced analytics** - timeseries, leaderboards, aggregations in milliseconds
- **Optimized indexes** for common queries
### Quality
- **85% minimum coverage** enforced by CI/CD
- **150 comprehensive tests** across unit, integration, performance, security
- **Quality gates** prevent regressions
- **Type safety** with mypy strict mode
### Maintainability
- **SQLite single source of truth** - easier backup, restore, migration
- **Automated testing** catches bugs early
- **CI/CD integration** provides fast feedback on every commit
- **Security scanning** prevents vulnerabilities
### Analytics Capabilities
**New queries enabled by SQLite:**
```
# Portfolio timeseries for charting
GET /portfolio/timeseries?model=gpt-5&start_date=2025-01-01&end_date=2025-01-31
# Model leaderboard
GET /leaderboard?date=2025-01-31
# Advanced filtering (future)
SELECT * FROM positions
WHERE daily_return_pct > 2.0
ORDER BY portfolio_value DESC;
# Aggregations (future)
SELECT model, AVG(daily_return_pct) as avg_return
FROM positions
GROUP BY model
ORDER BY avg_return DESC;
```
---
## Migration from Original Spec
If you've already started implementation based on original specs:
### Step 1: Database Schema Migration
```sql
-- Run enhanced schema creation
-- See database-enhanced-specification.md Section 2.1
```
### Step 2: Add Results Service
```bash
# Create new file
touch api/results_service.py
# Implement as per database-enhanced-specification.md Section 4.1
```
### Step 3: Update Executor
```python
# In api/executor.py, add after agent.run_trading_session():
self._store_results_to_db(job_id, date, model_sig)
```
### Step 4: Update API Endpoints
```python
# In api/main.py, update /results endpoint to use ResultsService
from api.results_service import ResultsService
results_service = ResultsService()
@app.get("/results")
async def get_results(...):
return results_service.get_results(date, model, detail)
```
### Step 5: Add Test Suite
```bash
mkdir -p tests/{unit,integration,performance,security,e2e}
# Create test files as per testing-specification.md Section 4-8
```
### Step 6: Configure CI/CD
```bash
mkdir -p .github/workflows
# Create test.yml as per testing-specification.md Section 10.1
```
---
## Testing Execution Guide
### Run Unit Tests
```bash
pytest tests/unit/ -v --cov=api --cov-report=term-missing
```
### Run Integration Tests
```bash
pytest tests/integration/ -v
```
### Run All Tests (Except E2E)
```bash
pytest tests/ -v --ignore=tests/e2e/ --cov=api --cov-report=html
```
### Run E2E Tests (Requires Docker)
```bash
pytest tests/e2e/ -v -s
```
### Run Performance Benchmarks
```bash
pytest tests/performance/ --benchmark-only
```
### Run Security Tests
```bash
pytest tests/security/ -v
bandit -r api/ -ll
```
### Generate Coverage Report
```bash
pytest tests/unit/ tests/integration/ --cov=api --cov-report=html
open htmlcov/index.html # View in browser
```
### Run Load Tests
```bash
locust -f tests/performance/test_api_load.py --host=http://localhost:8080
# Open http://localhost:8089 for Locust UI
```
---
## Questions & Next Steps
### Review Checklist
Please review:
1. **Enhanced database schema** with 6 tables for comprehensive results storage
2. **Migration strategy** for backward compatibility (dual-write mode)
3. **Testing thresholds** (85% coverage minimum, performance benchmarks)
4. **Test suite structure** (150 tests across 5 categories)
5. **CI/CD integration** with quality gates
6. **Updated implementation plan** (10 days, 6 phases)
### Questions to Consider
1. **Database migration timing:** Start with dual-write mode immediately, or add in Phase 2?
2. **Testing priorities:** Should we implement tests alongside features (TDD) or after each phase?
3. **CI/CD platform:** GitHub Actions (as specified) or different platform?
4. **Performance baselines:** Should we run benchmarks before implementation to track improvement?
5. **Security priorities:** Which security tests are MVP vs nice-to-have?
### Ready to Implement?
**Option A:** Approve specifications and begin Phase 1 implementation
- Create API directory structure
- Implement enhanced database schema
- Write unit tests for database layer
- Target: 2 days, 90%+ coverage for database code
**Option B:** Request modifications to specifications
- Clarify any unclear requirements
- Adjust testing thresholds
- Modify implementation timeline
**Option C:** Implement in parallel workstreams
- Workstream 1: Core API (Phases 1-3)
- Workstream 2: Testing suite (parallel with Phase 1-3)
- Workstream 3: Docker + Windmill (Phases 4-5)
- Benefits: Faster delivery, more parallelization
- Requires: Clear interfaces between components
---
## Summary
**Enhanced specifications** add:
1. 🗄️ **SQLite results storage** - 10-100x faster queries, advanced analytics
2. 🧪 **Comprehensive testing** - 150 tests, 85% coverage, quality gates
3. 🔒 **Security testing** - SQL injection, XSS, input validation
4. ⚡ **Performance benchmarks** - Catch regressions early
5. 🚀 **CI/CD pipeline** - Automated quality checks on every commit
**Total effort:** Still ~10 days, but with significantly higher code quality and confidence in deployments.
**Risk mitigation:** Extensive testing catches bugs before production, preventing costly hotfixes.
**Long-term value:** Maintainable, well-tested codebase enables rapid feature development.
---
Ready to proceed? Please provide feedback or approval to begin implementation!
---
`docs/README-SPECS.md` (new file, 436 lines)
# AI-Trader API Service - Technical Specifications Summary
## Overview
This directory contains comprehensive technical specifications for transforming the AI-Trader batch simulation system into an API service compatible with Windmill automation.
## Specification Documents
### 1. [API Specification](./api-specification.md)
**Purpose:** Defines all API endpoints, request/response formats, and data models
**Key Contents:**
- **5 REST Endpoints:**
- `POST /simulate/trigger` - Queue catch-up simulation job
- `GET /simulate/status/{job_id}` - Poll job progress
- `GET /simulate/current` - Get latest job
- `GET /results` - Retrieve simulation results (minimal/full detail)
- `GET /health` - Service health check
- **Pydantic Models** for type-safe request/response handling
- **Error Handling** strategies and HTTP status codes
- **SQLite Schema** for jobs and job_details tables
- **Configuration Management** via environment variables
**Status Codes:** 200 OK, 202 Accepted, 400 Bad Request, 404 Not Found, 409 Conflict, 503 Service Unavailable
---
### 2. [Job Manager Specification](./job-manager-specification.md)
**Purpose:** Details the job tracking and database layer
**Key Contents:**
- **SQLite Database Schema:**
- `jobs` table - High-level job metadata
- `job_details` table - Per model-day execution tracking
- **JobManager Class Interface:**
- `create_job()` - Create new simulation job
- `get_job()` - Retrieve job by ID
- `update_job_status()` - State transitions (pending → running → completed/partial/failed)
- `get_job_progress()` - Detailed progress metrics
- `can_start_new_job()` - Concurrency control
- **State Machine:** Job status transitions and business logic
- **Concurrency Control:** Single-job execution enforcement
- **Testing Strategy:** Unit tests with temporary databases
**Key Feature:** Independent model execution - one model's failure doesn't block others (results in "partial" status)
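The interface above can be sketched against an in-memory SQLite database. This is a minimal illustration, not the real implementation: the schema here is a reduced subset of the actual `jobs` table, and only the method names follow the list above.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

class JobManager:
    """Minimal sketch of the job-tracking layer described above."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        # Reduced schema: the real jobs table carries more columns.
        self.conn.execute(
            """CREATE TABLE IF NOT EXISTS jobs (
                   job_id TEXT PRIMARY KEY,
                   status TEXT NOT NULL,
                   date_range TEXT NOT NULL,
                   models TEXT NOT NULL,
                   created_at TEXT NOT NULL)"""
        )

    def can_start_new_job(self):
        # Concurrency control: only one pending/running job at a time.
        (count,) = self.conn.execute(
            "SELECT COUNT(*) FROM jobs WHERE status IN ('pending', 'running')"
        ).fetchone()
        return count == 0

    def create_job(self, date_range, models):
        job_id = str(uuid.uuid4())
        self.conn.execute(
            "INSERT INTO jobs VALUES (?, 'pending', ?, ?, ?)",
            (job_id, json.dumps(date_range), json.dumps(models),
             datetime.now(timezone.utc).isoformat()),
        )
        return job_id

    def update_job_status(self, job_id, status):
        self.conn.execute(
            "UPDATE jobs SET status = ? WHERE job_id = ?", (status, job_id)
        )

    def get_job(self, job_id):
        row = self.conn.execute(
            "SELECT job_id, status, date_range, models FROM jobs WHERE job_id = ?",
            (job_id,),
        ).fetchone()
        if row is None:
            return None
        return {"job_id": row[0], "status": row[1],
                "date_range": json.loads(row[2]), "models": json.loads(row[3])}
```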
---
### 3. [Background Worker Specification](./worker-specification.md)
**Purpose:** Defines async job execution architecture
**Key Contents:**
- **Execution Pattern:** Date-sequential, Model-parallel
- All models for Date 1 run in parallel
- Date 2 starts only after all models finish Date 1
- Ensures position.jsonl integrity (no concurrent writes)
- **SimulationWorker Class:**
- Orchestrates job execution
- Manages date sequencing
- Handles job-level errors
- **ModelDayExecutor Class:**
- Executes single model-day simulation
- Updates job_detail status
- Isolates runtime configuration
- **RuntimeConfigManager:**
- Creates temporary runtime_env_{job_id}_{model}_{date}.json files
- Prevents state collisions between concurrent models
- Cleans up after execution
- **Error Handling:** Graceful failure (models continue despite peer failures)
- **Logging:** Structured JSON logging with job/model/date context
**Performance:** 3 models × 5 days = ~7-15 minutes (vs. ~22-45 minutes sequential)
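The date-sequential, model-parallel pattern can be sketched with asyncio. The `run_model_day` body here is a placeholder for the real `ModelDayExecutor` call (`agent.run_trading_session(date)`):

```python
import asyncio

async def run_model_day(model: str, date: str) -> tuple:
    """Stand-in for ModelDayExecutor: one model, one trading day."""
    await asyncio.sleep(0)  # placeholder for the real agent execution
    return (date, model, "completed")

async def run_job(dates: list, models: list) -> list:
    """Date-sequential, model-parallel: a date starts only after
    every model has finished the previous date."""
    results = []
    for date in dates:  # dates strictly in order
        day_results = await asyncio.gather(
            *(run_model_day(m, date) for m in models)  # models in parallel
        )
        results.extend(day_results)
    return results

results = asyncio.run(
    run_job(["2025-01-16", "2025-01-17"], ["claude-3.7-sonnet", "gpt-5"])
)
```

Because each date awaits the full `gather`, no two model-days for different dates ever overlap, which is what protects `position.jsonl` from concurrent writes.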
---
### 4. [Implementation Specification](./implementation-specifications.md)
**Purpose:** Complete implementation guide covering Agent, Docker, and Windmill
**Key Contents:**
#### Part 1: BaseAgent Refactoring
- **Analysis:** Existing `run_trading_session()` already compatible with API mode
- **Required Changes:** ✅ NONE! Existing code works as-is
- **Worker Integration:** Calls `agent.run_trading_session(date)` directly
#### Part 2: Docker Configuration
- **Modified Dockerfile:** Adds FastAPI dependencies, new entrypoint
- **docker-entrypoint-api.sh:** Starts MCP services → launches uvicorn
- **Health Checks:** Verifies MCP services and database connectivity
- **Volume Mounts:** `./data`, `./configs` for persistence
#### Part 3: Windmill Integration
- **Flow 1: trigger_simulation.ts** - Daily cron triggers API
- **Flow 2: poll_simulation_status.ts** - Polls every 5 min until complete
- **Flow 3: store_simulation_results.py** - Stores results in Windmill DB
- **Dashboard:** Charts and tables showing portfolio performance
- **Workflow Orchestration:** Complete YAML workflow definition
#### Part 4: File Structure
- New `api/` directory with 7 modules
- New `windmill/` directory with scripts and dashboard
- New `docs/` directory (this folder)
- `data/jobs.db` for job tracking
#### Part 5: Implementation Checklist
10-day implementation plan broken into 6 phases
---
## Architecture Highlights
### Request Flow
```
1. Windmill → POST /simulate/trigger
2. API creates job in SQLite (status: pending)
3. API queues BackgroundTask
4. API returns 202 Accepted with job_id
5. Worker starts (status: running)
6. For each date sequentially:
For each model in parallel:
- Create isolated runtime config
- Execute agent.run_trading_session(date)
- Update job_detail status
7. Worker finishes (status: completed/partial/failed)
8. Windmill polls GET /simulate/status/{job_id}
9. When complete: Windmill calls GET /results?date=X
10. Windmill stores results in internal DB
11. Windmill dashboard displays performance
```
### Data Flow
```
Input: configs/default_config.json
API: Calculates date_range (last position → today)
Worker: Executes simulations
Output: data/agent_data/{model}/position/position.jsonl
data/agent_data/{model}/log/{date}/log.jsonl
data/jobs.db (job tracking)
API: Reads position.jsonl + calculates P&L
Windmill: Stores in internal DB → Dashboard visualization
```
---
## Key Design Decisions
### 1. Pattern B: Lazy On-Demand Processing
- **Chosen:** Windmill controls simulation timing via API calls
- **Benefit:** Centralized scheduling in Windmill
- **Tradeoff:** First Windmill call of the day triggers long-running job
### 2. SQLite vs. PostgreSQL
- **Chosen:** SQLite for MVP
- **Rationale:** Low concurrency (1 job at a time), simple deployment
- **Future:** PostgreSQL for production with multiple concurrent jobs
### 3. Date-Sequential, Model-Parallel Execution
- **Chosen:** Dates run sequentially, models run in parallel per date
- **Rationale:** Prevents position.jsonl race conditions, faster than fully sequential
- **Performance:** ~3x faster than fully sequential (3 models in parallel)
### 4. Independent Model Failures
- **Chosen:** One model's failure doesn't block others
- **Benefit:** Partial results better than no results
- **Implementation:** Job status becomes "partial" if any model fails
### 5. Minimal BaseAgent Changes
- **Chosen:** No modifications to agent code
- **Rationale:** Existing `run_trading_session()` is already a clean API interface
- **Benefit:** Maintains backward compatibility with batch mode
---
## Implementation Prerequisites
### Required Environment Variables
```bash
OPENAI_API_BASE=...
OPENAI_API_KEY=...
ALPHAADVANTAGE_API_KEY=...
JINA_API_KEY=...
RUNTIME_ENV_PATH=/app/data/runtime_env.json
MATH_HTTP_PORT=8000
SEARCH_HTTP_PORT=8001
TRADE_HTTP_PORT=8002
GETPRICE_HTTP_PORT=8003
API_HOST=0.0.0.0
API_PORT=8080
```
### Required Python Packages (new)
```
fastapi==0.109.0
uvicorn[standard]==0.27.0
pydantic==2.5.3
```
### Docker Requirements
- Docker Engine 20.10+
- Docker Compose 2.0+
- 2GB RAM minimum for container
- 10GB disk space for data
### Windmill Requirements
- Windmill instance (self-hosted or cloud)
- Network access from Windmill to AI-Trader API
- Windmill CLI for deployment (optional)
---
## Testing Strategy
### Unit Tests
- `tests/test_job_manager.py` - Database operations
- `tests/test_worker.py` - Job execution logic
- `tests/test_executor.py` - Model-day execution
### Integration Tests
- `tests/test_api_endpoints.py` - FastAPI endpoint behavior
- `tests/test_end_to_end.py` - Full workflow (trigger → execute → retrieve)
### Manual Testing
- Docker container startup
- Health check endpoint
- Windmill workflow execution
- Dashboard visualization
---
## Performance Expectations
### Single Model-Day Execution
- **Duration:** 30-60 seconds (varies by AI model latency)
- **Bottlenecks:** AI API calls, MCP tool latency
### Multi-Model Job
- **Example:** 3 models × 5 days = 15 model-days
- **Parallel Execution:** ~7-15 minutes
- **Sequential Execution:** ~22-45 minutes
- **Speedup:** ~3x (number of models)
### API Response Times
- `/simulate/trigger`: < 1 second (just queues job)
- `/simulate/status`: < 100ms (SQLite query)
- `/results?detail=minimal`: < 500ms (file read + JSON parsing)
- `/results?detail=full`: < 2 seconds (parse log files)
---
## Security Considerations
### MVP Security
- **Network Isolation:** Docker network (no public exposure)
- **No Authentication:** Assumes Windmill → API is trusted network
### Future Enhancements
- API key authentication (`X-API-Key` header)
- Rate limiting per client
- HTTPS/TLS encryption
- Input sanitization for path traversal prevention
---
## Deployment Steps
### 1. Build Docker Image
```bash
docker-compose build
```
### 2. Start API Service
```bash
docker-compose up -d
```
### 3. Verify Health
```bash
curl http://localhost:8080/health
```
### 4. Test Trigger
```bash
curl -X POST http://localhost:8080/simulate/trigger \
-H "Content-Type: application/json" \
-d '{"config_path": "configs/default_config.json"}'
```
### 5. Deploy Windmill Scripts
```bash
wmill script push windmill/trigger_simulation.ts
wmill script push windmill/poll_simulation_status.ts
wmill script push windmill/store_simulation_results.py
```
### 6. Create Windmill Workflow
- Import `windmill/daily_simulation_workflow.yaml`
- Configure resource `ai_trader_api` with API URL
- Set cron schedule (daily 6 AM)
### 7. Create Windmill Dashboard
- Import `windmill/dashboard.json`
- Verify data visualization
---
## Troubleshooting Guide
### Issue: Health check fails
**Symptoms:** `curl http://localhost:8080/health` returns 503
**Possible Causes:**
1. MCP services not running
2. Database file permission error
3. API server not started
**Solutions:**
```bash
# Check MCP services
docker-compose exec ai-trader curl http://localhost:8000/health
# Check API logs
docker-compose logs -f ai-trader
# Restart container
docker-compose restart
```
### Issue: Job stuck in "running" status
**Symptoms:** Job never completes, status remains "running"
**Possible Causes:**
1. Agent execution crashed
2. Model API timeout
3. Worker process died
**Solutions:**
```bash
# Check job details for error messages
curl http://localhost:8080/simulate/status/{job_id}
# Check container logs
docker-compose logs -f ai-trader
# If API restarted, stale jobs are marked as failed on startup
docker-compose restart
```
### Issue: Windmill can't reach API
**Symptoms:** Connection refused from Windmill scripts
**Solutions:**
- Verify Windmill and AI-Trader on same Docker network
- Check firewall rules
- Use container name (ai-trader) instead of localhost in Windmill resource
- Verify API_PORT environment variable
---
## Migration from Batch Mode
### For Users Currently Running Batch Mode
**Option 1: Dual Mode (Recommended)**
- Keep existing `main.py` for manual testing
- Add new API mode for production automation
- Use different config files for each mode
**Option 2: API-Only**
- Replace batch execution entirely
- All simulations via API calls
- More consistent with production workflow
### Migration Checklist
- [ ] Backup existing `data/` directory
- [ ] Update `.env` with API configuration
- [ ] Test API mode in separate environment first
- [ ] Gradually migrate Windmill workflows
- [ ] Monitor logs for errors
- [ ] Validate results match batch mode output
---
## Next Steps
1. **Review Specifications**
- Read all 4 specification documents
- Ask clarifying questions
- Approve design before implementation
2. **Implementation Phase 1** (Days 1-2)
- Set up `api/` directory structure
- Implement database and job_manager
- Write unit tests
3. **Implementation Phase 2** (Days 3-4)
- Implement worker and executor
- Test with mock agents
4. **Implementation Phase 3** (Days 5-6)
- Implement FastAPI endpoints
- Test with Postman/curl
5. **Implementation Phase 4** (Day 7)
- Docker integration
- End-to-end testing
6. **Implementation Phase 5** (Days 8-9)
- Windmill integration
- Dashboard creation
7. **Implementation Phase 6** (Day 10)
- Final testing
- Documentation
---
## Questions or Feedback?
Please review all specifications and provide feedback on:
1. API endpoint design
2. Database schema
3. Execution pattern (date-sequential, model-parallel)
4. Error handling approach
5. Windmill integration workflow
6. Any concerns or suggested improvements
**Ready to proceed with implementation?** Confirm approval of specifications to begin Phase 1.

---
`docs/api-specification.md` (new file, 837 lines)
# AI-Trader API Service - Technical Specification
## 1. API Endpoints Specification
### 1.1 POST /simulate/trigger
**Purpose:** Trigger a catch-up simulation from the last completed date to the most recent trading day.
**Request:**
```http
POST /simulate/trigger HTTP/1.1
Content-Type: application/json
```
**Response (202 Accepted):**
```json
{
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "accepted",
"date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
"models": ["claude-3.7-sonnet", "gpt-5"],
"created_at": "2025-01-20T14:30:00Z",
"message": "Simulation job queued successfully"
}
```
**Response (200 OK - Job Already Running):**
```json
{
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "running",
"date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
"models": ["claude-3.7-sonnet", "gpt-5"],
"progress": {
"total_model_days": 6,
"completed": 3,
"failed": 0,
"current": {
"date": "2025-01-17",
"model": "gpt-5"
}
},
"created_at": "2025-01-20T14:25:00Z",
"message": "Simulation already in progress"
}
```
**Response (200 OK - Already Up To Date):**
```json
{
"status": "current",
"message": "Simulation already up-to-date",
"last_simulation_date": "2025-01-20",
"next_trading_day": "2025-01-21"
}
```
**Response (409 Conflict):**
```json
{
"error": "conflict",
"message": "Different simulation already running",
"current_job_id": "previous-job-uuid",
"current_date_range": ["2025-01-10", "2025-01-15"]
}
```
**Business Logic:**
1. Load configuration from `config_path` (or default)
2. Determine last completed date from each model's `position.jsonl`
3. Calculate date range: `max(last_dates) + 1 day` → `most_recent_trading_day`
4. Filter for weekdays only (Monday-Friday)
5. If date_range is empty, return "already up-to-date"
6. Check for existing jobs with same date range → return existing job
7. Check for running jobs with different date range → return 409
8. Create new job in SQLite with status=`pending`
9. Queue background task to execute simulation
10. Return 202 with job details
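Steps 3-4 above (catch-up range, weekdays only) can be sketched as follows; note this matches the example `date_range` in the responses above, but does not account for market holidays:

```python
from datetime import date, timedelta

def catchup_date_range(last_completed: date, today: date) -> list:
    """Weekday-only dates from the day after the last completed
    simulation up to and including today."""
    days = []
    d = last_completed + timedelta(days=1)
    while d <= today:
        if d.weekday() < 5:  # Monday=0 .. Friday=4; skip weekends
            days.append(d.isoformat())
        d += timedelta(days=1)
    return days

# 2025-01-15 was a Wednesday; Sat/Sun are skipped.
catchup_date_range(date(2025, 1, 15), date(2025, 1, 20))
# → ['2025-01-16', '2025-01-17', '2025-01-20']
```

An empty result means the system is already up-to-date (step 5).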
---
### 1.2 GET /simulate/status/{job_id}
**Purpose:** Poll the status and progress of a simulation job.
**Request:**
```http
GET /simulate/status/550e8400-e29b-41d4-a716-446655440000 HTTP/1.1
```
**Response (200 OK - Running):**
```json
{
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "running",
"date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
"models": ["claude-3.7-sonnet", "gpt-5"],
"progress": {
"total_model_days": 6,
"completed": 3,
"failed": 0,
"current": {
"date": "2025-01-17",
"model": "gpt-5"
},
"details": [
{"date": "2025-01-16", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 45.2},
{"date": "2025-01-16", "model": "gpt-5", "status": "completed", "duration_seconds": 38.7},
{"date": "2025-01-17", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 42.1},
{"date": "2025-01-17", "model": "gpt-5", "status": "running", "duration_seconds": null}
]
},
"created_at": "2025-01-20T14:25:00Z",
"updated_at": "2025-01-20T14:27:15Z"
}
```
**Response (200 OK - Completed):**
```json
{
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
"models": ["claude-3.7-sonnet", "gpt-5"],
"progress": {
"total_model_days": 6,
"completed": 6,
"failed": 0,
"details": [
{"date": "2025-01-16", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 45.2},
{"date": "2025-01-16", "model": "gpt-5", "status": "completed", "duration_seconds": 38.7},
{"date": "2025-01-17", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 42.1},
{"date": "2025-01-17", "model": "gpt-5", "status": "completed", "duration_seconds": 40.3},
{"date": "2025-01-20", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 43.8},
{"date": "2025-01-20", "model": "gpt-5", "status": "completed", "duration_seconds": 39.1}
]
},
"created_at": "2025-01-20T14:25:00Z",
"completed_at": "2025-01-20T14:29:45Z",
"total_duration_seconds": 285.0
}
```
**Response (200 OK - Partial Failure):**
```json
{
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "partial",
"date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
"models": ["claude-3.7-sonnet", "gpt-5"],
"progress": {
"total_model_days": 6,
"completed": 4,
"failed": 2,
"details": [
{"date": "2025-01-16", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 45.2},
{"date": "2025-01-16", "model": "gpt-5", "status": "completed", "duration_seconds": 38.7},
{"date": "2025-01-17", "model": "claude-3.7-sonnet", "status": "failed", "error": "MCP service timeout after 3 retries", "duration_seconds": null},
{"date": "2025-01-17", "model": "gpt-5", "status": "completed", "duration_seconds": 40.3},
{"date": "2025-01-20", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 43.8},
{"date": "2025-01-20", "model": "gpt-5", "status": "failed", "error": "AI model API timeout", "duration_seconds": null}
]
},
"created_at": "2025-01-20T14:25:00Z",
"completed_at": "2025-01-20T14:29:45Z"
}
```
**Response (404 Not Found):**
```json
{
"error": "not_found",
"message": "Job not found",
"job_id": "invalid-job-id"
}
```
**Business Logic:**
1. Query SQLite jobs table for job_id
2. If not found, return 404
3. Return job metadata + progress from job_details table
4. Status transitions: `pending` → `running` → `completed`/`partial`/`failed`
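The transition rules can be encoded as a small lookup; the `pending → failed` edge is an assumption here, covering stale jobs being marked failed on API restart:

```python
# Allowed job status transitions (terminal states have no outgoing edges).
ALLOWED_TRANSITIONS = {
    "pending": {"running", "failed"},
    "running": {"completed", "partial", "failed"},
    "completed": set(),
    "partial": set(),
    "failed": set(),
}

def can_transition(current: str, new: str) -> bool:
    """Return True if a job may move from `current` to `new`."""
    return new in ALLOWED_TRANSITIONS.get(current, set())
```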
---
### 1.3 GET /simulate/current
**Purpose:** Get the most recent simulation job (for Windmill to discover job_id).
**Request:**
```http
GET /simulate/current HTTP/1.1
```
**Response (200 OK):**
```json
{
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "running",
"date_range": ["2025-01-16", "2025-01-17"],
"models": ["claude-3.7-sonnet", "gpt-5"],
"progress": {
"total_model_days": 4,
"completed": 2,
"failed": 0
},
"created_at": "2025-01-20T14:25:00Z"
}
```
**Response (404 Not Found):**
```json
{
"error": "not_found",
"message": "No simulation jobs found"
}
```
**Business Logic:**
1. Query SQLite: `SELECT * FROM jobs ORDER BY created_at DESC LIMIT 1`
2. Return job details with progress summary
---
### 1.4 GET /results
**Purpose:** Retrieve simulation results for a specific date and model.
**Request:**
```http
GET /results?date=2025-01-15&model=gpt-5&detail=minimal HTTP/1.1
```
**Query Parameters:**
- `date` (required): Trading date in YYYY-MM-DD format
- `model` (optional): Model signature (if omitted, returns all models)
- `detail` (optional): Response detail level
  - `minimal` (default): positions + daily P&L
  - `full`: minimal + trade history + AI reasoning logs + tool usage stats
**Response (200 OK - minimal):**
```json
{
"date": "2025-01-15",
"results": [
{
"model": "gpt-5",
"positions": {
"AAPL": 10,
"MSFT": 5,
"NVDA": 0,
"CASH": 8500.00
},
"daily_pnl": {
"profit": 150.50,
"return_pct": 1.5,
"portfolio_value": 10150.50
}
}
]
}
```
**Response (200 OK - full):**
```json
{
"date": "2025-01-15",
"results": [
{
"model": "gpt-5",
"positions": {
"AAPL": 10,
"MSFT": 5,
"CASH": 8500.00
},
"daily_pnl": {
"profit": 150.50,
"return_pct": 1.5,
"portfolio_value": 10150.50
},
"trades": [
{
"id": 1,
"action": "buy",
"symbol": "AAPL",
"amount": 10,
"price": 255.88,
"total": 2558.80
}
],
"ai_reasoning": {
"total_steps": 15,
"stop_signal_received": true,
"reasoning_summary": "Market analysis indicated strong buy signal for AAPL...",
"tool_usage": {
"search": 3,
"get_price": 5,
"math": 2,
"trade": 1
}
},
"log_file_path": "data/agent_data/gpt-5/log/2025-01-15/log.jsonl"
}
]
}
```
**Response (400 Bad Request):**
```json
{
"error": "invalid_date",
"message": "Date must be in YYYY-MM-DD format"
}
```
**Response (404 Not Found):**
```json
{
"error": "no_data",
"message": "No simulation data found for date 2025-01-15 and model gpt-5"
}
```
**Business Logic:**
1. Validate date format
2. Read `position.jsonl` for specified model(s) and date
3. For `detail=minimal`: Return positions + calculate daily P&L
4. For `detail=full`:
- Parse `log.jsonl` to extract reasoning summary
- Count tool usage from log messages
- Extract trades from position file
5. Return aggregated results
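The minimal-detail path can be sketched as: scan `position.jsonl` for the requested date, then compute the daily P&L fields used in the response. The record field names (`date`, `positions`) are illustrative assumptions, not the confirmed file format:

```python
import json

def last_position_record(lines, date: str):
    """Scan position.jsonl lines (one JSON object per line) and return
    the last record matching `date`. Field names are assumptions."""
    match = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("date") == date:
            match = record  # keep the latest record for that date
    return match

def daily_pnl(portfolio_value: float, prev_value: float) -> dict:
    """Daily P&L in the shape of the `daily_pnl` response field above."""
    profit = portfolio_value - prev_value
    return {
        "profit": round(profit, 2),
        "return_pct": round(100.0 * profit / prev_value, 2),
        "portfolio_value": portfolio_value,
    }
```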
---
### 1.5 GET /health
**Purpose:** Health check endpoint for Docker and monitoring.
**Request:**
```http
GET /health HTTP/1.1
```
**Response (200 OK):**
```json
{
"status": "healthy",
"timestamp": "2025-01-20T14:30:00Z",
"services": {
"mcp_math": {"status": "up", "url": "http://localhost:8000/mcp"},
"mcp_search": {"status": "up", "url": "http://localhost:8001/mcp"},
"mcp_trade": {"status": "up", "url": "http://localhost:8002/mcp"},
"mcp_getprice": {"status": "up", "url": "http://localhost:8003/mcp"}
},
"storage": {
"data_directory": "/app/data",
"writable": true,
"free_space_mb": 15234
},
"database": {
"status": "connected",
"path": "/app/data/jobs.db"
}
}
```
**Response (503 Service Unavailable):**
```json
{
"status": "unhealthy",
"timestamp": "2025-01-20T14:30:00Z",
"services": {
"mcp_math": {"status": "down", "url": "http://localhost:8000/mcp", "error": "Connection refused"},
"mcp_search": {"status": "up", "url": "http://localhost:8001/mcp"},
"mcp_trade": {"status": "up", "url": "http://localhost:8002/mcp"},
"mcp_getprice": {"status": "up", "url": "http://localhost:8003/mcp"}
},
"storage": {
"data_directory": "/app/data",
"writable": true
},
"database": {
"status": "connected"
}
}
```
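The aggregation rule implied by the two responses above (healthy only when every MCP service is up, HTTP 503 otherwise) can be sketched as a pair of helpers; the actual probing of each service URL is left out:

```python
def health_summary(services: dict) -> dict:
    """Aggregate per-service probe results into the /health payload shape.
    `services` maps service name -> {"status": "up"|"down", ...}."""
    healthy = all(s.get("status") == "up" for s in services.values())
    return {"status": "healthy" if healthy else "unhealthy",
            "services": services}

def http_status_for(summary: dict) -> int:
    # 200 when healthy, 503 otherwise, matching the responses above.
    return 200 if summary["status"] == "healthy" else 503
```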
---
## 2. Data Models
### 2.1 SQLite Schema
**Table: jobs**
```sql
CREATE TABLE jobs (
job_id TEXT PRIMARY KEY,
config_path TEXT NOT NULL,
status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
date_range TEXT NOT NULL, -- JSON array of dates
models TEXT NOT NULL, -- JSON array of model signatures
created_at TEXT NOT NULL,
started_at TEXT,
completed_at TEXT,
total_duration_seconds REAL,
error TEXT
);
CREATE INDEX idx_jobs_status ON jobs(status);
CREATE INDEX idx_jobs_created_at ON jobs(created_at DESC);
```
**Table: job_details**
```sql
CREATE TABLE job_details (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL,
model TEXT NOT NULL,
status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
started_at TEXT,
completed_at TEXT,
duration_seconds REAL,
error TEXT,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
CREATE INDEX idx_job_details_job_id ON job_details(job_id);
CREATE INDEX idx_job_details_status ON job_details(status);
```
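One caveat for the `ON DELETE CASCADE` clause above: SQLite does not enforce foreign-key constraints unless `PRAGMA foreign_keys = ON` is set on each connection, so the database module should open connections through a helper along these lines:

```python
import sqlite3

def connect(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open a SQLite connection with foreign-key enforcement enabled.
    Without this PRAGMA, ON DELETE CASCADE on job_details is ignored."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA foreign_keys = ON")
    return conn
```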
### 2.2 Pydantic Models
**Request Models:**
```python
from pydantic import BaseModel, Field
from typing import Optional, Literal
class TriggerSimulationRequest(BaseModel):
config_path: Optional[str] = Field(default="configs/default_config.json", description="Path to configuration file")
class ResultsQueryParams(BaseModel):
date: str = Field(..., pattern=r"^\d{4}-\d{2}-\d{2}$", description="Date in YYYY-MM-DD format")
model: Optional[str] = Field(None, description="Model signature filter")
detail: Literal["minimal", "full"] = Field(default="minimal", description="Response detail level")
```
**Response Models:**
```python
class JobProgress(BaseModel):
total_model_days: int
completed: int
failed: int
current: Optional[dict] = None # {"date": str, "model": str}
details: Optional[list] = None # List of JobDetailResponse
class TriggerSimulationResponse(BaseModel):
job_id: str
status: str
date_range: list[str]
models: list[str]
created_at: str
message: str
progress: Optional[JobProgress] = None
class JobStatusResponse(BaseModel):
job_id: str
status: str
date_range: list[str]
models: list[str]
progress: JobProgress
created_at: str
updated_at: Optional[str] = None
completed_at: Optional[str] = None
total_duration_seconds: Optional[float] = None
class DailyPnL(BaseModel):
profit: float
return_pct: float
portfolio_value: float
class Trade(BaseModel):
id: int
action: str
symbol: str
amount: int
price: Optional[float] = None
total: Optional[float] = None
class AIReasoning(BaseModel):
total_steps: int
stop_signal_received: bool
reasoning_summary: str
tool_usage: dict[str, int]
class ModelResult(BaseModel):
model: str
positions: dict[str, float]
daily_pnl: DailyPnL
trades: Optional[list[Trade]] = None
ai_reasoning: Optional[AIReasoning] = None
log_file_path: Optional[str] = None
class ResultsResponse(BaseModel):
date: str
results: list[ModelResult]
```
---
## 3. Configuration Management
### 3.1 Environment Variables
Required environment variables remain the same as batch mode:
```bash
# OpenAI API Configuration
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_API_KEY=sk-...
# Alpha Vantage API
ALPHAADVANTAGE_API_KEY=...
# Jina Search API
JINA_API_KEY=...
# Runtime Config Path (now shared by API and worker)
RUNTIME_ENV_PATH=/app/data/runtime_env.json
# MCP Service Ports
MATH_HTTP_PORT=8000
SEARCH_HTTP_PORT=8001
TRADE_HTTP_PORT=8002
GETPRICE_HTTP_PORT=8003
# API Server Configuration
API_HOST=0.0.0.0
API_PORT=8080
# Job Configuration
MAX_CONCURRENT_JOBS=1 # Only one simulation job at a time
```
### 3.2 Runtime State Management
**Challenge:** Multiple model-days running concurrently need isolated `runtime_env.json` state.
**Solution:** Per-job runtime config files
- `runtime_env_base.json` - Template
- `runtime_env_{job_id}_{model}_{date}.json` - Job-specific runtime config
- Worker passes custom `RUNTIME_ENV_PATH` to each simulation execution
**Modified `write_config_value()` and `get_config_value()`:**
- Accept optional `runtime_path` parameter
- Worker manages lifecycle: create → use → cleanup
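The injectable-path idea above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: it assumes the runtime config is a flat JSON object, and only the function names (`get_config_value`, `write_config_value`) come from the spec.

```python
import json
import os
from pathlib import Path
from typing import Any, Optional

def get_config_value(key: str, runtime_path: Optional[str] = None) -> Any:
    """Read a value from the runtime config; fall back to RUNTIME_ENV_PATH."""
    path = Path(runtime_path or os.environ["RUNTIME_ENV_PATH"])
    with open(path) as f:
        return json.load(f).get(key)

def write_config_value(key: str, value: Any, runtime_path: Optional[str] = None) -> None:
    """Write a value into the runtime config, creating the file if needed."""
    path = Path(runtime_path or os.environ["RUNTIME_ENV_PATH"])
    data = json.loads(path.read_text()) if path.exists() else {}
    data[key] = value
    path.write_text(json.dumps(data, indent=2))
```

The worker then simply passes its job-specific path (e.g. `runtime_env_{job_id}_{model}_{date}.json`) as `runtime_path`, leaving the env-var default intact for batch mode.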
---
## 4. Error Handling
### 4.1 Error Response Format
All errors follow this structure:
```json
{
"error": "error_code",
"message": "Human-readable error description",
"details": {
// Optional additional context
}
}
```
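One way to keep every handler consistent with this envelope is a small helper that builds the error body. This is a sketch only: the `ERROR_CODES` mapping and the `error_envelope` name are illustrative, not part of the spec.

```python
from typing import Any, Dict, Optional

# Stable machine-readable codes for the "error" field (illustrative mapping)
ERROR_CODES = {
    400: "bad_request",
    404: "not_found",
    409: "conflict",
    500: "internal_error",
    503: "service_unavailable",
}

def error_envelope(status_code: int, message: str,
                   details: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build the standard error body for any failing request."""
    return {
        "error": ERROR_CODES.get(status_code, "internal_error"),
        "message": message,
        "details": details or {},
    }
```

A FastAPI exception handler can then call this helper so `HTTPException`s raised anywhere in the app are serialized into the same shape.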
### 4.2 HTTP Status Codes
- `200 OK` - Successful request
- `202 Accepted` - Job queued successfully
- `400 Bad Request` - Invalid input parameters
- `404 Not Found` - Resource not found (job, results)
- `409 Conflict` - Concurrent job conflict
- `500 Internal Server Error` - Unexpected server error
- `503 Service Unavailable` - Health check failed
### 4.3 Retry Strategy for Workers
Models run independently - failure of one model doesn't block others:
```python
async def run_model_day(job_id: str, date: str, model_config: dict):
try:
# Execute simulation for this model-day
await agent.run_trading_session(date)
update_job_detail_status(job_id, date, model, "completed")
except Exception as e:
# Log error, update status to failed, continue with next model-day
update_job_detail_status(job_id, date, model, "failed", error=str(e))
# Do NOT raise - let other models continue
```
---
## 5. Concurrency & Locking
### 5.1 Job Execution Policy
**Rule:** Maximum 1 running job at a time (configurable via `MAX_CONCURRENT_JOBS`)
**Enforcement:**
```python
def can_start_new_job() -> bool:
running_jobs = db.query(
"SELECT COUNT(*) FROM jobs WHERE status IN ('pending', 'running')"
).fetchone()[0]
return running_jobs < MAX_CONCURRENT_JOBS
```
### 5.2 Position File Concurrency
**Challenge:** Multiple model-days writing to same model's `position.jsonl`
**Solution:** Sequential execution per model
```python
# For each date in date_range:
# For each model in parallel: ← Models run in parallel
# Execute model-day sequentially ← Dates for same model run sequentially
```
**Execution Pattern:**
```
Date 2025-01-16:
- Model A (running)
- Model B (running)
- Model C (running)
Date 2025-01-17: ← Starts only after all models finish 2025-01-16
- Model A (running)
- Model B (running)
- Model C (running)
```
**Rationale:**
- Models write to different position files → No conflict
- Same model's dates run sequentially → No race condition on position.jsonl
- Date-level parallelism across models → Faster overall execution
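The execution pattern above maps naturally onto `asyncio`: iterate dates in order, and fan out the models for each date with `gather`. A minimal sketch (the `run_model_day` callable is assumed to match §4.3, where it swallows per-model failures rather than raising):

```python
import asyncio
from typing import Awaitable, Callable

async def run_job(
    job_id: str,
    dates: list[str],
    models: list[dict],
    run_model_day: Callable[[str, str, dict], Awaitable[None]],
) -> None:
    """Date-sequential, model-parallel orchestration."""
    for date in dates:  # dates strictly in order
        # All models for this date run concurrently; since run_model_day
        # catches its own exceptions, gather never aborts the date.
        await asyncio.gather(*(run_model_day(job_id, date, m) for m in models))
```

Because the `await` sits inside the date loop, date N+1 cannot start until every model has finished date N, which is exactly the position-file safety guarantee above.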
---
## 6. Performance Considerations
### 6.1 Execution Time Estimates
Based on current implementation:
- Single model-day: ~30-60 seconds (depends on AI model latency + tool calls)
- 3 models × 5 days = 15 model-days; with models in parallel and dates sequential, wall-clock time ≈ 5 × (30-60 s) ≈ 2.5-5 minutes (7.5-15 minutes if run fully sequentially)
### 6.2 Timeout Configuration
**API Request Timeout:**
- `/simulate/trigger`: 10 seconds (just queue job)
- `/simulate/status`: 5 seconds (read from DB)
- `/results`: 30 seconds (file I/O + parsing)
**Worker Timeout:**
- Per model-day: 5 minutes (inherited from `max_retries` × `base_delay`)
- Entire job: No timeout (job runs until all model-days complete or fail)
### 6.3 Optimization Opportunities (Future)
1. **Results caching:** Store computed daily_pnl in SQLite to avoid recomputation
2. **Parallel date execution:** If position file locking is implemented, run dates in parallel
3. **Streaming responses:** For `/simulate/status`, use SSE to push updates instead of polling
---
## 7. Logging & Observability
### 7.1 Structured Logging
All API logs use JSON format:
```json
{
"timestamp": "2025-01-20T14:30:00Z",
"level": "INFO",
"logger": "api.worker",
"message": "Starting simulation for model-day",
"job_id": "550e8400-...",
"date": "2025-01-16",
"model": "gpt-5"
}
```
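A `logging.Formatter` that emits this shape might look like the following sketch. The formatter class and the `extra=` convention for context fields are assumptions for illustration, not the project's actual logging setup.

```python
import json
import logging
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log record, merging known context fields."""

    CONTEXT_KEYS = ("job_id", "date", "model")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Context is attached via logger.info(..., extra={"job_id": ..., ...})
        for key in self.CONTEXT_KEYS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)
```

Handlers use it via `handler.setFormatter(JsonFormatter())`, and workers pass context as `logger.info("Starting simulation for model-day", extra={"job_id": job_id, "date": date, "model": model})`.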
### 7.2 Log Levels
- `DEBUG` - Detailed execution flow (tool calls, price fetches)
- `INFO` - Job lifecycle events (created, started, completed)
- `WARNING` - Recoverable errors (retry attempts)
- `ERROR` - Model-day failures (logged but job continues)
- `CRITICAL` - System failures (MCP services down, DB corruption)
### 7.3 Audit Trail
All job state transitions logged to `api_audit.log`:
```json
{
"timestamp": "2025-01-20T14:30:00Z",
"event": "job_created",
"job_id": "550e8400-...",
"user": "windmill-service", // Future: from auth header
"details": {"date_range": [...], "models": [...]}
}
```
---
## 8. Security Considerations
### 8.1 Authentication (Future)
For MVP, API relies on network isolation (Docker network). Future enhancements:
- API key authentication via header: `X-API-Key: <token>`
- JWT tokens for Windmill integration
- Rate limiting per API key
### 8.2 Input Validation
- All date parameters validated with regex: `^\d{4}-\d{2}-\d{2}$`
- Config paths restricted to `configs/` directory (prevent path traversal)
- Model signatures sanitized (alphanumeric + hyphens only)
### 8.3 File Access Controls
- Results API only reads from `data/agent_data/` directory
- Config API only reads from `configs/` directory
- No arbitrary file read via API parameters
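The path-traversal restriction in §8.2 can be enforced with `Path.resolve()` plus a containment check. A sketch (the `resolve_config_path` name is illustrative):

```python
from pathlib import Path

def resolve_config_path(config_path: str, base_dir: str = "configs") -> Path:
    """Resolve a user-supplied config path; raise if it escapes base_dir."""
    base = Path(base_dir).resolve()
    candidate = Path(config_path).resolve()
    try:
        # relative_to raises ValueError when candidate is outside base
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"config_path must stay inside {base_dir}/")
    return candidate
```

Resolving before checking is the important part: it collapses `..` segments, so `configs/../.env` is rejected even though its string prefix looks safe.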
---
## 9. Deployment Configuration
### 9.1 Docker Compose
```yaml
version: '3.8'
services:
ai-trader-api:
build:
context: .
dockerfile: Dockerfile
ports:
- "8080:8080"
volumes:
- ./data:/app/data
- ./configs:/app/configs
env_file:
- .env
environment:
- MODE=api
- API_PORT=8080
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
restart: unless-stopped
```
### 9.2 Dockerfile Modifications
```dockerfile
# ... existing layers ...
# Install API dependencies
COPY requirements-api.txt /app/
RUN pip install --no-cache-dir -r requirements-api.txt
# Copy API application code
COPY api/ /app/api/
# Copy entrypoint script
COPY docker-entrypoint.sh /app/
RUN chmod +x /app/docker-entrypoint.sh
EXPOSE 8080
CMD ["/app/docker-entrypoint.sh"]
```
### 9.3 Entrypoint Script
```bash
#!/bin/bash
set -e
echo "Starting MCP services..."
cd /app/agent_tools
python start_mcp_services.py &
MCP_PID=$!
echo "Waiting for MCP services to be ready..."
sleep 10
echo "Starting API server..."
cd /app
# Register cleanup before the blocking server call, or the trap never takes effect
trap "kill $MCP_PID 2>/dev/null || true" EXIT
uvicorn api.main:app --host ${API_HOST:-0.0.0.0} --port ${API_PORT:-8080} --workers 1
```
---
## 10. API Versioning (Future)
For v2 and beyond:
- URL prefix: `/api/v1/simulate/trigger`, `/api/v2/simulate/trigger`
- Header-based: `Accept: application/vnd.ai-trader.v1+json`
MVP uses unversioned endpoints (implied v1).
---
## Next Steps
After reviewing this specification, we'll proceed to:
1. **Component 2:** Job Manager & SQLite Schema Implementation
2. **Component 3:** Background Worker Architecture
3. **Component 4:** BaseAgent Refactoring for Single-Day Execution
4. **Component 5:** Docker & Deployment Configuration
5. **Component 6:** Windmill Integration Flows
Please review this API specification and provide feedback or approval to continue.

# Enhanced Database Specification - Results Storage in SQLite
## 1. Overview
**Change from Original Spec:** Instead of reading `position.jsonl` on-demand, simulation results are written to SQLite during execution for faster retrieval and queryability.
**Benefits:**
- **Faster `/results` endpoint** - No file I/O on every request
- **Advanced querying** - Filter by date range, model, performance metrics
- **Aggregations** - Portfolio timeseries, leaderboards, statistics
- **Data integrity** - Single source of truth with ACID guarantees
- **Backup/restore** - Single database file instead of scattered JSONL files
**Tradeoff:** Additional database writes during simulation (minimal performance impact)
---
## 2. Enhanced Database Schema
### 2.1 Complete Table Structure
```sql
-- Job tracking tables (from original spec)
CREATE TABLE IF NOT EXISTS jobs (
job_id TEXT PRIMARY KEY,
config_path TEXT NOT NULL,
status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
date_range TEXT NOT NULL,
models TEXT NOT NULL,
created_at TEXT NOT NULL,
started_at TEXT,
completed_at TEXT,
total_duration_seconds REAL,
error TEXT
);
CREATE TABLE IF NOT EXISTS job_details (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL,
model TEXT NOT NULL,
status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
started_at TEXT,
completed_at TEXT,
duration_seconds REAL,
error TEXT,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
-- NEW: Simulation results storage
CREATE TABLE IF NOT EXISTS positions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL,
model TEXT NOT NULL,
action_id INTEGER NOT NULL, -- Sequence number within that day
action_type TEXT CHECK(action_type IN ('buy', 'sell', 'no_trade')),
symbol TEXT,
amount INTEGER,
price REAL,
cash REAL NOT NULL,
portfolio_value REAL NOT NULL,
daily_profit REAL,
daily_return_pct REAL,
cumulative_profit REAL,
cumulative_return_pct REAL,
created_at TEXT NOT NULL,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS holdings (
id INTEGER PRIMARY KEY AUTOINCREMENT,
position_id INTEGER NOT NULL,
symbol TEXT NOT NULL,
quantity INTEGER NOT NULL,
FOREIGN KEY (position_id) REFERENCES positions(id) ON DELETE CASCADE
);
-- NEW: AI reasoning logs (optional - for detail=full)
CREATE TABLE IF NOT EXISTS reasoning_logs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL,
model TEXT NOT NULL,
step_number INTEGER NOT NULL,
timestamp TEXT NOT NULL,
role TEXT CHECK(role IN ('user', 'assistant', 'tool')),
content TEXT,
tool_name TEXT,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
-- NEW: Tool usage statistics
CREATE TABLE IF NOT EXISTS tool_usage (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL,
model TEXT NOT NULL,
tool_name TEXT NOT NULL,
call_count INTEGER NOT NULL DEFAULT 1,
total_duration_seconds REAL,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
-- Indexes for performance
CREATE INDEX IF NOT EXISTS idx_jobs_status ON jobs(status);
CREATE INDEX IF NOT EXISTS idx_jobs_created_at ON jobs(created_at DESC);
CREATE INDEX IF NOT EXISTS idx_job_details_job_id ON job_details(job_id);
CREATE INDEX IF NOT EXISTS idx_job_details_status ON job_details(status);
CREATE UNIQUE INDEX IF NOT EXISTS idx_job_details_unique ON job_details(job_id, date, model);
CREATE INDEX IF NOT EXISTS idx_positions_job_id ON positions(job_id);
CREATE INDEX IF NOT EXISTS idx_positions_date ON positions(date);
CREATE INDEX IF NOT EXISTS idx_positions_model ON positions(model);
CREATE INDEX IF NOT EXISTS idx_positions_date_model ON positions(date, model);
CREATE UNIQUE INDEX IF NOT EXISTS idx_positions_unique ON positions(job_id, date, model, action_id);
CREATE INDEX IF NOT EXISTS idx_holdings_position_id ON holdings(position_id);
CREATE INDEX IF NOT EXISTS idx_holdings_symbol ON holdings(symbol);
CREATE INDEX IF NOT EXISTS idx_reasoning_logs_job_date_model ON reasoning_logs(job_id, date, model);
CREATE INDEX IF NOT EXISTS idx_tool_usage_job_date_model ON tool_usage(job_id, date, model);
```
---
### 2.2 Table Relationships
```
jobs (1) ──┬──> (N) job_details
├──> (N) positions ──> (N) holdings
├──> (N) reasoning_logs
└──> (N) tool_usage
```
---
### 2.3 Data Examples
#### positions table
```
id | job_id | date | model | action_id | action_type | symbol | amount | price | cash | portfolio_value | daily_profit | daily_return_pct | cumulative_profit | cumulative_return_pct | created_at
---|------------|------------|-------|-----------|-------------|--------|--------|--------|---------|-----------------|--------------|------------------|-------------------|----------------------|------------
1 | abc-123... | 2025-01-16 | gpt-5 | 0 | no_trade | NULL | NULL | NULL | 10000.0 | 10000.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2025-01-16T09:30:00Z
2 | abc-123... | 2025-01-16 | gpt-5 | 1 | buy | AAPL | 10 | 255.88 | 7441.2 | 10000.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2025-01-16T09:35:12Z
3 | abc-123... | 2025-01-17 | gpt-5 | 0 | no_trade | NULL | NULL | NULL | 7441.2 | 10150.5 | 150.5 | 1.51 | 150.5 | 1.51 | 2025-01-17T09:30:00Z
4 | abc-123... | 2025-01-17 | gpt-5 | 1 | sell | AAPL | 5 | 262.24 | 8752.4 | 10150.5 | 150.5 | 1.51 | 150.5 | 1.51 | 2025-01-17T09:42:38Z
```
#### holdings table
```
id | position_id | symbol | quantity
---|-------------|--------|----------
1 | 2 | AAPL | 10
2 | 3 | AAPL | 10
3 | 4 | AAPL | 5
```
#### tool_usage table
```
id | job_id | date | model | tool_name | call_count | total_duration_seconds
---|------------|------------|-------|------------|------------|-----------------------
1 | abc-123... | 2025-01-16 | gpt-5 | get_price | 5 | 2.3
2 | abc-123... | 2025-01-16 | gpt-5 | search | 3 | 12.7
3 | abc-123... | 2025-01-16 | gpt-5 | trade | 1 | 0.8
4 | abc-123... | 2025-01-16 | gpt-5 | math | 2 | 0.1
```
---
## 3. Data Migration from position.jsonl
### 3.1 Migration Strategy
**During execution:** Write to BOTH SQLite AND position.jsonl for backward compatibility
**Migration path:**
1. **Phase 1:** Dual-write mode (write to both SQLite and JSONL)
2. **Phase 2:** Verify SQLite data matches JSONL
3. **Phase 3:** Switch `/results` endpoint to read from SQLite
4. **Phase 4:** (Optional) Deprecate JSONL writes
**Import existing data:** One-time migration script to populate SQLite from existing position.jsonl files
---
### 3.2 Import Script
```python
# api/import_historical_data.py
import json
import sqlite3
from pathlib import Path
from datetime import datetime
from api.database import get_db_connection
def import_position_jsonl(
model_signature: str,
position_file: Path,
job_id: str = "historical-import"
) -> int:
"""
Import existing position.jsonl data into SQLite.
Args:
model_signature: Model signature (e.g., "gpt-5")
position_file: Path to position.jsonl
job_id: Job ID to associate with (use "historical-import" for existing data)
Returns:
Number of records imported
"""
conn = get_db_connection()
cursor = conn.cursor()
imported_count = 0
initial_cash = 10000.0
with open(position_file, 'r') as f:
for line in f:
if not line.strip():
continue
record = json.loads(line)
date = record['date']
action_id = record['id']
action = record.get('this_action', {})
positions = record.get('positions', {})
# Extract action details
action_type = action.get('action', 'no_trade')
symbol = action.get('symbol', None)
amount = action.get('amount', None)
price = None # Not stored in original position.jsonl
# Extract holdings
cash = positions.get('CASH', 0.0)
holdings = {k: v for k, v in positions.items() if k != 'CASH' and v > 0}
# Calculate portfolio value (approximate - need price data)
portfolio_value = cash # Base value
# Calculate profits (need previous record)
daily_profit = 0.0
daily_return_pct = 0.0
cumulative_profit = cash - initial_cash # Simplified
cumulative_return_pct = (cumulative_profit / initial_cash) * 100
# Insert position record
cursor.execute("""
INSERT INTO positions (
job_id, date, model, action_id, action_type, symbol, amount, price,
cash, portfolio_value, daily_profit, daily_return_pct,
cumulative_profit, cumulative_return_pct, created_at
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""", (
job_id, date, model_signature, action_id, action_type, symbol, amount, price,
cash, portfolio_value, daily_profit, daily_return_pct,
cumulative_profit, cumulative_return_pct, datetime.utcnow().isoformat() + "Z"
))
position_id = cursor.lastrowid
# Insert holdings
for sym, qty in holdings.items():
cursor.execute("""
INSERT INTO holdings (position_id, symbol, quantity)
VALUES (?, ?, ?)
""", (position_id, sym, qty))
imported_count += 1
conn.commit()
conn.close()
return imported_count
def import_all_historical_data(base_path: Path = Path("data/agent_data")) -> dict:
"""
Import all existing position.jsonl files from data/agent_data/.
Returns:
Summary dict with import counts per model
"""
summary = {}
for model_dir in base_path.iterdir():
if not model_dir.is_dir():
continue
model_signature = model_dir.name
position_file = model_dir / "position" / "position.jsonl"
if not position_file.exists():
continue
print(f"Importing {model_signature}...")
count = import_position_jsonl(model_signature, position_file)
summary[model_signature] = count
print(f" Imported {count} records")
return summary
if __name__ == "__main__":
print("Starting historical data import...")
summary = import_all_historical_data()
print(f"\nImport complete: {summary}")
print(f"Total records: {sum(summary.values())}")
```
---
## 4. Updated Results Service
### 4.1 ResultsService Class
```python
# api/results_service.py
from typing import List, Dict, Optional
from datetime import datetime
from api.database import get_db_connection
class ResultsService:
"""
Service for retrieving simulation results from SQLite.
Replaces on-demand reading of position.jsonl files.
"""
def __init__(self, db_path: str = "data/jobs.db"):
self.db_path = db_path
def get_results(
self,
date: str,
model: Optional[str] = None,
detail: str = "minimal"
) -> Dict:
"""
Get simulation results for specified date and model(s).
Args:
date: Trading date (YYYY-MM-DD)
model: Optional model signature filter
detail: "minimal" or "full"
Returns:
{
"date": str,
"results": [
{
"model": str,
"positions": {...},
"daily_pnl": {...},
"trades": [...], // if detail=full
"ai_reasoning": {...} // if detail=full
}
]
}
"""
conn = get_db_connection(self.db_path)
# Get all models for this date (or specific model)
if model:
models = [model]
else:
cursor = conn.cursor()
cursor.execute("""
SELECT DISTINCT model FROM positions WHERE date = ?
""", (date,))
models = [row[0] for row in cursor.fetchall()]
results = []
for mdl in models:
result = self._get_model_result(conn, date, mdl, detail)
if result:
results.append(result)
conn.close()
return {
"date": date,
"results": results
}
def _get_model_result(
self,
conn,
date: str,
model: str,
detail: str
) -> Optional[Dict]:
"""Get result for single model on single date"""
cursor = conn.cursor()
# Get latest position for this date (highest action_id)
cursor.execute("""
SELECT
cash, portfolio_value, daily_profit, daily_return_pct,
cumulative_profit, cumulative_return_pct
FROM positions
WHERE date = ? AND model = ?
ORDER BY action_id DESC
LIMIT 1
""", (date, model))
row = cursor.fetchone()
if not row:
return None
cash, portfolio_value, daily_profit, daily_return_pct, cumulative_profit, cumulative_return_pct = row
# Get holdings attached to the latest position row for this date
# (a bare LIMIT on the join would mix holdings from earlier positions)
cursor.execute("""
SELECT h.symbol, h.quantity
FROM holdings h
WHERE h.position_id = (
SELECT id FROM positions
WHERE date = ? AND model = ?
ORDER BY action_id DESC
LIMIT 1
)
""", (date, model))
holdings = {row[0]: row[1] for row in cursor.fetchall()}
holdings['CASH'] = cash
result = {
"model": model,
"positions": holdings,
"daily_pnl": {
"profit": daily_profit,
"return_pct": daily_return_pct,
"portfolio_value": portfolio_value
},
"cumulative_pnl": {
"profit": cumulative_profit,
"return_pct": cumulative_return_pct
}
}
# Add full details if requested
if detail == "full":
result["trades"] = self._get_trades(cursor, date, model)
result["ai_reasoning"] = self._get_reasoning(cursor, date, model)
result["tool_usage"] = self._get_tool_usage(cursor, date, model)
return result
def _get_trades(self, cursor, date: str, model: str) -> List[Dict]:
"""Get all trades executed on this date"""
cursor.execute("""
SELECT action_id, action_type, symbol, amount, price
FROM positions
WHERE date = ? AND model = ? AND action_type IN ('buy', 'sell')
ORDER BY action_id
""", (date, model))
trades = []
for row in cursor.fetchall():
trades.append({
"id": row[0],
"action": row[1],
"symbol": row[2],
"amount": row[3],
"price": row[4],
"total": row[3] * row[4] if row[3] and row[4] else None
})
return trades
def _get_reasoning(self, cursor, date: str, model: str) -> Dict:
"""Get AI reasoning summary"""
cursor.execute("""
SELECT COUNT(*) as total_steps,
COUNT(CASE WHEN role = 'assistant' THEN 1 END) as assistant_messages,
COUNT(CASE WHEN role = 'tool' THEN 1 END) as tool_messages
FROM reasoning_logs
WHERE date = ? AND model = ?
""", (date, model))
row = cursor.fetchone()
total_steps = row[0] if row else 0
# Get reasoning summary (last assistant message with FINISH_SIGNAL)
cursor.execute("""
SELECT content FROM reasoning_logs
WHERE date = ? AND model = ? AND role = 'assistant'
AND content LIKE '%<FINISH_SIGNAL>%'
ORDER BY step_number DESC
LIMIT 1
""", (date, model))
row = cursor.fetchone()
reasoning_summary = row[0] if row else "No reasoning summary available"
return {
"total_steps": total_steps,
"stop_signal_received": "<FINISH_SIGNAL>" in reasoning_summary,
"reasoning_summary": reasoning_summary[:500] # Truncate for brevity
}
def _get_tool_usage(self, cursor, date: str, model: str) -> Dict[str, int]:
"""Get tool usage counts"""
cursor.execute("""
SELECT tool_name, call_count
FROM tool_usage
WHERE date = ? AND model = ?
""", (date, model))
return {row[0]: row[1] for row in cursor.fetchall()}
def get_portfolio_timeseries(
self,
model: str,
start_date: Optional[str] = None,
end_date: Optional[str] = None
) -> List[Dict]:
"""
Get portfolio value over time for a model.
Returns:
[
{"date": "2025-01-16", "portfolio_value": 10000.0, "daily_return_pct": 0.0},
{"date": "2025-01-17", "portfolio_value": 10150.5, "daily_return_pct": 1.51},
...
]
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
query = """
SELECT date, portfolio_value, daily_return_pct, cumulative_return_pct
FROM (
SELECT date, portfolio_value, daily_return_pct, cumulative_return_pct,
ROW_NUMBER() OVER (PARTITION BY date ORDER BY action_id DESC) as rn
FROM positions
WHERE model = ?
)
WHERE rn = 1
"""
params = [model]
if start_date:
query += " AND date >= ?"
params.append(start_date)
if end_date:
query += " AND date <= ?"
params.append(end_date)
query += " ORDER BY date ASC"
cursor.execute(query, params)
timeseries = []
for row in cursor.fetchall():
timeseries.append({
"date": row[0],
"portfolio_value": row[1],
"daily_return_pct": row[2],
"cumulative_return_pct": row[3]
})
conn.close()
return timeseries
def get_leaderboard(self, date: Optional[str] = None) -> List[Dict]:
"""
Get model performance leaderboard.
Args:
date: Optional date filter (latest results if not specified)
Returns:
[
{"model": "gpt-5", "portfolio_value": 10500, "cumulative_return_pct": 5.0, "rank": 1},
{"model": "claude-3.7-sonnet", "portfolio_value": 10300, "cumulative_return_pct": 3.0, "rank": 2},
...
]
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
if date:
# Specific date leaderboard
cursor.execute("""
SELECT model, portfolio_value, cumulative_return_pct
FROM (
SELECT model, portfolio_value, cumulative_return_pct,
ROW_NUMBER() OVER (PARTITION BY model ORDER BY action_id DESC) as rn
FROM positions
WHERE date = ?
)
WHERE rn = 1
ORDER BY portfolio_value DESC
""", (date,))
else:
# Latest results for each model
cursor.execute("""
SELECT model, portfolio_value, cumulative_return_pct
FROM (
SELECT model, portfolio_value, cumulative_return_pct,
ROW_NUMBER() OVER (PARTITION BY model ORDER BY date DESC, action_id DESC) as rn
FROM positions
)
WHERE rn = 1
ORDER BY portfolio_value DESC
""")
leaderboard = []
rank = 1
for row in cursor.fetchall():
leaderboard.append({
"rank": rank,
"model": row[0],
"portfolio_value": row[1],
"cumulative_return_pct": row[2]
})
rank += 1
conn.close()
return leaderboard
```
---
## 5. Updated Executor - Write to SQLite
```python
# api/executor.py (additions to existing code)
class ModelDayExecutor:
# ... existing code ...
async def run_model_day(
self,
job_id: str,
date: str,
model_config: Dict[str, Any],
agent_class: type,
config: Dict[str, Any]
) -> None:
"""Execute simulation for one model on one date"""
# ... existing execution code ...
try:
# Execute trading session
await agent.run_trading_session(date)
# NEW: Extract and store results in SQLite
self._store_results_to_db(job_id, date, model_sig)
# Mark as completed
self.job_manager.update_job_detail_status(
job_id, date, model_sig, "completed"
)
except Exception as e:
# ... error handling ...
def _store_results_to_db(self, job_id: str, date: str, model: str) -> None:
"""
Extract data from position.jsonl and log.jsonl, store in SQLite.
This runs after agent.run_trading_session() completes.
"""
from api.database import get_db_connection
from pathlib import Path
from datetime import datetime
import json
conn = get_db_connection()
cursor = conn.cursor()
# Read position.jsonl for this model
position_file = Path(f"data/agent_data/{model}/position/position.jsonl")
if not position_file.exists():
logger.warning(f"Position file not found: {position_file}")
return
# Find records for this date
with open(position_file, 'r') as f:
for line in f:
if not line.strip():
continue
record = json.loads(line)
if record['date'] != date:
continue # Skip other dates
# Extract fields
action_id = record['id']
action = record.get('this_action', {})
positions = record.get('positions', {})
action_type = action.get('action', 'no_trade')
symbol = action.get('symbol')
amount = action.get('amount')
price = None # TODO: Get from price data if needed
cash = positions.get('CASH', 0.0)
holdings = {k: v for k, v in positions.items() if k != 'CASH' and v > 0}
# Calculate portfolio value (simplified - improve with actual prices)
portfolio_value = cash # + sum(holdings value)
# Calculate daily P&L (compare to previous day's closing value)
# TODO: Implement proper P&L calculation
# Insert position
cursor.execute("""
INSERT INTO positions (
job_id, date, model, action_id, action_type, symbol, amount, price,
cash, portfolio_value, daily_profit, daily_return_pct,
cumulative_profit, cumulative_return_pct, created_at
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""", (
job_id, date, model, action_id, action_type, symbol, amount, price,
cash, portfolio_value, 0.0, 0.0, # TODO: Calculate P&L
0.0, 0.0, # TODO: Calculate cumulative P&L
datetime.utcnow().isoformat() + "Z"
))
position_id = cursor.lastrowid
# Insert holdings
for sym, qty in holdings.items():
cursor.execute("""
INSERT INTO holdings (position_id, symbol, quantity)
VALUES (?, ?, ?)
""", (position_id, sym, qty))
# Parse log.jsonl for reasoning (if detail=full is needed later)
# TODO: Implement log parsing and storage in reasoning_logs table
conn.commit()
conn.close()
logger.info(f"Stored results for {model} on {date} in SQLite")
```
---
## 6. Migration Path
### 6.1 Backward Compatibility
**Keep position.jsonl writes** to ensure existing tools/scripts continue working:
```python
# In agent/base_agent/base_agent.py - no changes needed
# position.jsonl writing continues as normal
# In api/executor.py - AFTER position.jsonl is written
await agent.run_trading_session(date) # Writes to position.jsonl
self._store_results_to_db(job_id, date, model_sig) # Copies to SQLite
```
### 6.2 Gradual Migration
**Week 1:** Deploy with dual-write (JSONL + SQLite)
**Week 2:** Verify data consistency, fix any discrepancies
**Week 3:** Switch `/results` endpoint to read from SQLite
**Week 4:** (Optional) Remove JSONL writes
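The Week 2 consistency check could be automated with a script that walks each `position.jsonl` and compares it row-by-row against SQLite. A sketch under stated assumptions: the `verify_model` name is illustrative, and only the cash field is compared here (holdings and P&L checks would follow the same pattern).

```python
import json
import sqlite3
from pathlib import Path

def verify_model(db_path: str, model: str, position_file: Path) -> list[str]:
    """Compare position.jsonl records against SQLite rows; return mismatch notes."""
    conn = sqlite3.connect(db_path)
    mismatches = []
    with open(position_file) as f:
        for line in f:
            if not line.strip():
                continue
            rec = json.loads(line)
            row = conn.execute(
                "SELECT cash FROM positions WHERE model = ? AND date = ? AND action_id = ?",
                (model, rec["date"], rec["id"]),
            ).fetchone()
            if row is None:
                mismatches.append(f"missing row for {rec['date']}#{rec['id']}")
            elif abs(row[0] - rec["positions"].get("CASH", 0.0)) > 1e-6:
                mismatches.append(f"cash differs for {rec['date']}#{rec['id']}")
    conn.close()
    return mismatches
```

An empty return list for every model is the signal that it is safe to move to Week 3 (reading `/results` from SQLite).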
---
## 7. Updated API Endpoints
### 7.1 Enhanced `/results` Endpoint
```python
# api/main.py
from api.results_service import ResultsService
results_service = ResultsService()
@app.get("/results")
async def get_results(
date: str,
model: Optional[str] = None,
detail: str = "minimal"
):
"""Get simulation results from SQLite (fast!)"""
# Validate date format
try:
datetime.strptime(date, "%Y-%m-%d")
except ValueError:
raise HTTPException(status_code=400, detail="Invalid date format (use YYYY-MM-DD)")
results = results_service.get_results(date, model, detail)
if not results["results"]:
raise HTTPException(status_code=404, detail=f"No data found for date {date}")
return results
```
### 7.2 New Endpoints for Advanced Queries
```python
@app.get("/portfolio/timeseries")
async def get_portfolio_timeseries(
model: str,
start_date: Optional[str] = None,
end_date: Optional[str] = None
):
"""Get portfolio value over time for a model"""
timeseries = results_service.get_portfolio_timeseries(model, start_date, end_date)
if not timeseries:
raise HTTPException(status_code=404, detail=f"No data found for model {model}")
return {
"model": model,
"timeseries": timeseries
}
@app.get("/leaderboard")
async def get_leaderboard(date: Optional[str] = None):
"""Get model performance leaderboard"""
leaderboard = results_service.get_leaderboard(date)
return {
"date": date or "latest",
"leaderboard": leaderboard
}
```
---
## 8. Database Maintenance
### 8.1 Cleanup Old Data
```python
# api/job_manager.py (add method)
def cleanup_old_data(self, days: int = 90) -> dict:
"""
Delete jobs and associated data older than specified days.
Returns:
Summary of deleted records
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
cutoff_date = (datetime.utcnow() - timedelta(days=days)).isoformat() + "Z"
# Count records before deletion
cursor.execute("SELECT COUNT(*) FROM jobs WHERE created_at < ?", (cutoff_date,))
jobs_to_delete = cursor.fetchone()[0]
cursor.execute("""
SELECT COUNT(*) FROM positions
WHERE job_id IN (SELECT job_id FROM jobs WHERE created_at < ?)
""", (cutoff_date,))
positions_to_delete = cursor.fetchone()[0]
# Delete (CASCADE will handle related tables)
cursor.execute("DELETE FROM jobs WHERE created_at < ?", (cutoff_date,))
conn.commit()
conn.close()
return {
"cutoff_date": cutoff_date,
"jobs_deleted": jobs_to_delete,
"positions_deleted": positions_to_delete
}
```
### 8.2 Vacuum Database
```python
def vacuum_database(self) -> None:
"""Reclaim disk space after deletes"""
conn = get_db_connection(self.db_path)
conn.execute("VACUUM")
conn.close()
```
---
## Summary
**Enhanced database schema** with 6 tables:
- `jobs`, `job_details` (job tracking)
- `positions`, `holdings` (simulation results)
- `reasoning_logs`, `tool_usage` (AI details)
**Benefits:**
- ⚡ **10-100x faster** `/results` queries (no file I/O)
- 📊 **Advanced analytics** - timeseries, leaderboards, aggregations
- 🔒 **Data integrity** - ACID compliance, foreign keys
- 🗄️ **Single source of truth** - all data in one place
**Migration strategy:** Dual-write (JSONL + SQLite) for backward compatibility
**Next:** Comprehensive testing suite specification

# Implementation Specifications: Agent, Docker, and Windmill Integration
## Part 1: BaseAgent Refactoring
### 1.1 Current State Analysis
**Current `base_agent.py` structure:**
- `run_date_range(init_date, end_date)` - Loops through all dates
- `run_trading_session(today_date)` - Executes single day
- `get_trading_dates()` - Calculates dates from position.jsonl
**What works well:**
- `run_trading_session()` is already isolated for single-day execution ✅
- Agent initialization is separate from execution ✅
- Position tracking via position.jsonl ✅
**What needs modification:**
- `runtime_env.json` management (move to RuntimeConfigManager)
- `get_trading_dates()` logic (move to API layer for date range calculation)
### 1.2 Required Changes
#### Change 1: No modifications needed to core execution logic
**Rationale:** `BaseAgent.run_trading_session(today_date)` already supports single-day execution. The worker will call this method directly.
```python
# Current code (already suitable for API mode):
async def run_trading_session(self, today_date: str) -> None:
"""Run single day trading session"""
# This method is perfect as-is for worker to call
```
**Action:** ✅ No changes needed
---
#### Change 2: Make runtime config path injectable
**Current issue:**
```python
# In base_agent.py, uses global config
from tools.general_tools import get_config_value, write_config_value
```
**Problem:** `get_config_value()` reads from `os.environ["RUNTIME_ENV_PATH"]`, which the worker will override per execution.
**Solution:** Already works! The worker sets `RUNTIME_ENV_PATH` before calling agent methods:
```python
# In executor.py
os.environ["RUNTIME_ENV_PATH"] = runtime_config_path
await agent.run_trading_session(date)
```
**Action:** ✅ No changes needed (env var override is sufficient)
---
#### Change 3: Optional - Separate agent initialization from date-range logic
**Current code in `main.py`:**
```python
# Creates agent
agent = AgentClass(...)
await agent.initialize()
# Runs all dates
await agent.run_date_range(INIT_DATE, END_DATE)
```
**For API mode:**
```python
# Worker creates agent
agent = AgentClass(...)
await agent.initialize()
# Worker calls run_trading_session directly for each date
for date in date_range:
await agent.run_trading_session(date)
```
**Action:** ✅ Worker will not use `run_date_range()` method. No changes needed to agent.
---
### 1.3 Summary: BaseAgent Changes
**Result:** **NO CODE CHANGES REQUIRED** to `base_agent.py`!
The existing architecture is already compatible with the API worker pattern:
- `run_trading_session()` is the perfect interface
- Runtime config is managed via environment variables
- Position tracking works as-is
**Only change needed:** Worker must call `agent.register_agent()` if position file doesn't exist (already handled by `get_trading_dates()` logic).
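The worker-side guard described above might look like this (the `register_agent()` call and position-file check come from the description; the helper itself is hypothetical):

```python
import asyncio
import os

async def ensure_registered(agent, position_file: str) -> bool:
    """Register the agent once if no position file exists yet (hypothetical helper)."""
    if not os.path.exists(position_file):
        await agent.register_agent()
        return True   # fresh agent: registration performed
    return False      # position file present: already registered

# Demo with a stub agent standing in for BaseAgent
class StubAgent:
    def __init__(self):
        self.registered = False
    async def register_agent(self):
        self.registered = True

agent = StubAgent()
did_register = asyncio.run(ensure_registered(agent, "/nonexistent/position.jsonl"))
print(did_register, agent.registered)  # True True
```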
---
## Part 2: Docker Configuration
### 2.1 Current Docker Setup
**Existing files:**
- `Dockerfile` - Multi-stage build for batch mode
- `docker-compose.yml` - Service definition
- `docker-entrypoint.sh` - Launches data fetch + main.py
### 2.2 Modified Dockerfile
```dockerfile
# Existing stages remain the same...
FROM python:3.10-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
curl \
&& rm -rf /var/lib/apt/lists/*
# Copy requirements
COPY requirements.txt requirements-api.txt ./
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir -r requirements-api.txt
# Copy application code
COPY . /app
# Create data directories
RUN mkdir -p /app/data /app/configs
# Copy and set permissions for entrypoint
COPY docker-entrypoint-api.sh /app/
RUN chmod +x /app/docker-entrypoint-api.sh
# Expose API port
EXPOSE 8080
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD curl -f http://localhost:8080/health || exit 1
# Run API service
CMD ["/app/docker-entrypoint-api.sh"]
```
### 2.3 New requirements-api.txt
```
fastapi==0.109.0
uvicorn[standard]==0.27.0
pydantic==2.5.3
pydantic-settings==2.1.0
python-multipart==0.0.6
```
### 2.4 New docker-entrypoint-api.sh
```bash
#!/bin/bash
set -e
echo "=================================="
echo "AI-Trader API Service Starting"
echo "=================================="
# Cleanup stale runtime configs from previous runs
echo "Cleaning up stale runtime configs..."
python3 -c "from api.runtime_manager import RuntimeConfigManager; RuntimeConfigManager().cleanup_all_runtime_configs()"
# Start MCP services in background
echo "Starting MCP services..."
cd /app/agent_tools
python3 start_mcp_services.py &
MCP_PID=$!
# Wait for MCP services to be ready
echo "Waiting for MCP services to initialize..."
sleep 10
# Verify MCP services are running
echo "Verifying MCP services..."
for port in ${MATH_HTTP_PORT:-8000} ${SEARCH_HTTP_PORT:-8001} ${TRADE_HTTP_PORT:-8002} ${GETPRICE_HTTP_PORT:-8003}; do
if ! curl -f -s http://localhost:$port/health > /dev/null 2>&1; then
echo "WARNING: MCP service on port $port not responding"
else
echo "✓ MCP service on port $port is healthy"
fi
done
# Start API server
echo "Starting FastAPI server..."
cd /app
# Use environment variables for host and port
API_HOST=${API_HOST:-0.0.0.0}
API_PORT=${API_PORT:-8080}
echo "API will be available at http://${API_HOST}:${API_PORT}"
echo "=================================="
# Cleanup: stop MCP services when this script exits.
# The trap must be registered BEFORE uvicorn starts, and uvicorn must not be
# exec'd — `exec` replaces the shell, so a trailing trap would never run.
trap "echo 'Shutting down...'; kill $MCP_PID 2>/dev/null || true" EXIT SIGTERM SIGINT
# Start uvicorn with single worker (for simplicity in MVP)
uvicorn api.main:app \
    --host ${API_HOST} \
    --port ${API_PORT} \
    --workers 1 \
    --log-level info
```
### 2.5 Updated docker-compose.yml
```yaml
version: '3.8'
services:
ai-trader:
build:
context: .
dockerfile: Dockerfile
container_name: ai-trader-api
ports:
- "8080:8080"
volumes:
- ./data:/app/data
- ./configs:/app/configs
- ./logs:/app/logs
env_file:
- .env
environment:
- API_HOST=0.0.0.0
- API_PORT=8080
- RUNTIME_ENV_PATH=/app/data/runtime_env.json
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
restart: unless-stopped
networks:
- ai-trader-network
networks:
ai-trader-network:
driver: bridge
```
### 2.6 Environment Variables Reference
```bash
# .env file example for API mode
# OpenAI Configuration
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_API_KEY=sk-...
# API Keys
ALPHAADVANTAGE_API_KEY=your_alpha_vantage_key
JINA_API_KEY=your_jina_key
# MCP Service Ports
MATH_HTTP_PORT=8000
SEARCH_HTTP_PORT=8001
TRADE_HTTP_PORT=8002
GETPRICE_HTTP_PORT=8003
# API Configuration
API_HOST=0.0.0.0
API_PORT=8080
# Runtime Config
RUNTIME_ENV_PATH=/app/data/runtime_env.json
# Job Configuration
MAX_CONCURRENT_JOBS=1
```
### 2.7 Docker Commands Reference
```bash
# Build image
docker-compose build
# Start service
docker-compose up
# Start in background
docker-compose up -d
# View logs
docker-compose logs -f
# Check health
docker-compose ps
# Stop service
docker-compose down
# Restart service
docker-compose restart
# Execute command in running container
docker-compose exec ai-trader python3 -c "from api.job_manager import JobManager; jm = JobManager(); print(jm.get_current_job())"
# Access container shell
docker-compose exec ai-trader bash
```
---
## Part 3: Windmill Integration
### 3.1 Windmill Overview
Windmill (windmill.dev) is a workflow automation platform that can:
- Schedule cron jobs
- Execute TypeScript/Python scripts
- Store state between runs
- Build UI dashboards
**Integration approach:**
1. Windmill cron job triggers simulation daily
2. Windmill polls for job completion
3. Windmill retrieves results and stores in internal database
4. Windmill dashboard displays performance metrics
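The trigger/poll handshake in steps 1-2 can be exercised without Windmill. A minimal Python sketch with injectable HTTP callables so the transport can be stubbed in tests (endpoint paths follow the API spec; the `http_post`/`http_get` signatures are assumptions):

```python
import time

def run_simulation(http_post, http_get, base_url: str,
                   poll_interval: float = 0.0, max_polls: int = 3) -> dict:
    """Trigger a simulation job and poll until it reaches a terminal status."""
    job = http_post(f"{base_url}/simulate/trigger",
                    {"config_path": "configs/default_config.json"})
    if job.get("status") == "current":
        return {"action": "skipped"}          # already up-to-date, nothing to run
    job_id = job["job_id"]
    for _ in range(max_polls):
        status = http_get(f"{base_url}/simulate/status/{job_id}")
        if status["status"] in ("completed", "partial", "failed"):
            return {"action": "finished", "job_id": job_id, "status": status["status"]}
        time.sleep(poll_interval)
    return {"action": "timeout", "job_id": job_id}

# Stubbed transport standing in for the real API
def fake_post(url, body):
    return {"job_id": "j1", "status": "pending"}

responses = iter([{"status": "running"}, {"status": "completed"}])
def fake_get(url):
    return next(responses)

result = run_simulation(fake_post, fake_get, "http://ai-trader:8080")
print(result)  # {'action': 'finished', 'job_id': 'j1', 'status': 'completed'}
```

The TypeScript flows below implement the same handshake against the live service.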
### 3.2 Flow 1: Daily Simulation Trigger
**File:** `windmill/trigger_simulation.ts`
```typescript
import { Resource } from "https://deno.land/x/windmill@v1.0.0/mod.ts";
export async function main(
ai_trader_api: Resource<"ai_trader_api">
) {
const apiUrl = ai_trader_api.base_url; // e.g., "http://ai-trader:8080"
// Trigger simulation
const response = await fetch(`${apiUrl}/simulate/trigger`, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
config_path: "configs/default_config.json"
}),
});
if (!response.ok) {
throw new Error(`API error: ${response.status} ${response.statusText}`);
}
const data = await response.json();
// Handle different response types
if (data.status === "current") {
console.log("Simulation already up-to-date");
return {
action: "skipped",
message: data.message,
last_date: data.last_simulation_date
};
}
// Store job_id in Windmill state for poller to pick up
await Deno.writeTextFile(
`/tmp/current_job_id.txt`,
data.job_id
);
console.log(`Simulation triggered: ${data.job_id}`);
console.log(`Date range: ${data.date_range.join(", ")}`);
console.log(`Models: ${data.models.join(", ")}`);
return {
action: "triggered",
job_id: data.job_id,
date_range: data.date_range,
models: data.models,
status: data.status
};
}
```
**Windmill Resource Configuration:**
```json
{
"resource_type": "ai_trader_api",
"base_url": "http://ai-trader:8080"
}
```
**Schedule:** Every day at 6:00 AM
---
### 3.3 Flow 2: Job Status Poller
**File:** `windmill/poll_simulation_status.ts`
```typescript
import { Resource } from "https://deno.land/x/windmill@v1.0.0/mod.ts";
export async function main(
ai_trader_api: Resource<"ai_trader_api">,
job_id?: string
) {
const apiUrl = ai_trader_api.base_url;
// Get job_id from parameter or from current job file
let jobId = job_id;
if (!jobId) {
try {
jobId = await Deno.readTextFile("/tmp/current_job_id.txt");
} catch {
// No current job
return {
status: "no_job",
message: "No active simulation job"
};
}
}
// Poll status
const response = await fetch(`${apiUrl}/simulate/status/${jobId}`);
if (!response.ok) {
if (response.status === 404) {
return {
status: "not_found",
message: "Job not found",
job_id: jobId
};
}
throw new Error(`API error: ${response.status}`);
}
const data = await response.json();
console.log(`Job ${jobId}: ${data.status}`);
console.log(`Progress: ${data.progress.completed}/${data.progress.total_model_days} model-days`);
// If job is complete, retrieve results
if (data.status === "completed" || data.status === "partial") {
console.log("Job finished, retrieving results...");
const results = [];
for (const date of data.date_range) {
const resultsResponse = await fetch(
`${apiUrl}/results?date=${date}&detail=minimal`
);
if (resultsResponse.ok) {
const dateResults = await resultsResponse.json();
results.push(dateResults);
}
}
// Clean up job_id file
try {
await Deno.remove("/tmp/current_job_id.txt");
} catch {
// Ignore
}
return {
status: data.status,
job_id: jobId,
completed_at: data.completed_at,
duration_seconds: data.total_duration_seconds,
results: results
};
}
// Job still running
return {
status: data.status,
job_id: jobId,
progress: data.progress,
started_at: data.created_at
};
}
```
**Schedule:** Every 5 minutes (will skip if no active job)
---
### 3.4 Flow 3: Results Retrieval and Storage
**File:** `windmill/store_simulation_results.py`
```python
import wmill
from datetime import datetime
def main(
job_results: dict,
database: str = "simulation_results"
):
"""
Store simulation results in Windmill's internal database.
Args:
job_results: Output from poll_simulation_status flow
database: Database name for storage
"""
if job_results.get("status") not in ("completed", "partial"):
return {"message": "Job not complete, skipping storage"}
# Extract results
job_id = job_results["job_id"]
results = job_results.get("results", [])
stored_count = 0
for date_result in results:
date = date_result["date"]
for model_result in date_result["results"]:
model = model_result["model"]
positions = model_result["positions"]
pnl = model_result["daily_pnl"]
# Store in Windmill database
record = {
"job_id": job_id,
"date": date,
"model": model,
"cash": positions.get("CASH", 0),
"portfolio_value": pnl["portfolio_value"],
"daily_profit": pnl["profit"],
"daily_return_pct": pnl["return_pct"],
"stored_at": datetime.utcnow().isoformat()
}
# Use Windmill's internal storage
wmill.set_variable(
path=f"{database}/{model}/{date}",
value=record
)
stored_count += 1
return {
"stored_count": stored_count,
"job_id": job_id,
"message": f"Stored {stored_count} model-day results"
}
```
---
### 3.5 Windmill Dashboard Example
**File:** `windmill/dashboard.json` (Windmill App Builder)
```json
{
"grid": [
{
"type": "table",
"id": "performance_table",
"configuration": {
"title": "Model Performance Summary",
"data_source": {
"type": "script",
"path": "f/simulation_results/get_latest_performance"
},
"columns": [
{"field": "model", "header": "Model"},
{"field": "latest_date", "header": "Latest Date"},
{"field": "portfolio_value", "header": "Portfolio Value"},
{"field": "total_return_pct", "header": "Total Return %"},
{"field": "daily_return_pct", "header": "Daily Return %"}
]
}
},
{
"type": "chart",
"id": "portfolio_chart",
"configuration": {
"title": "Portfolio Value Over Time",
"chart_type": "line",
"data_source": {
"type": "script",
"path": "f/simulation_results/get_timeseries"
},
"x_axis": "date",
"y_axis": "portfolio_value",
"series": "model"
}
}
]
}
```
**Supporting Script:** `windmill/get_latest_performance.py`
```python
import wmill
def main(database: str = "simulation_results"):
"""Get latest performance for each model"""
# Query Windmill variables
all_vars = wmill.list_variables(path_prefix=f"{database}/")
# Group by model
models = {}
for var in all_vars:
parts = var["path"].split("/")
if len(parts) >= 3:
model = parts[1]
date = parts[2]
value = wmill.get_variable(var["path"])
if model not in models:
models[model] = []
models[model].append(value)
# Compute summary for each model
summary = []
for model, records in models.items():
# Sort by date
records.sort(key=lambda x: x["date"], reverse=True)
latest = records[0]
# Calculate total return
initial_value = 10000 # Initial cash
total_return_pct = ((latest["portfolio_value"] - initial_value) / initial_value) * 100
summary.append({
"model": model,
"latest_date": latest["date"],
"portfolio_value": latest["portfolio_value"],
"total_return_pct": round(total_return_pct, 2),
"daily_return_pct": latest["daily_return_pct"]
})
return summary
```
---
### 3.6 Windmill Workflow Orchestration
**Main Workflow:** `windmill/daily_simulation_workflow.yaml`
```yaml
name: Daily AI Trader Simulation
description: Trigger simulation, poll status, and store results
triggers:
- type: cron
schedule: "0 6 * * *" # Every day at 6 AM
steps:
- id: trigger
name: Trigger Simulation
script: f/ai_trader/trigger_simulation
outputs:
- job_id
- action
- id: wait
name: Wait for Job Start
type: sleep
duration: 10s
- id: poll_loop
name: Poll Until Complete
type: loop
max_iterations: 60 # Poll for up to 5 hours (60 × 5min)
interval: 5m
script: f/ai_trader/poll_simulation_status
inputs:
job_id: ${{ steps.trigger.outputs.job_id }}
break_condition: |
${{ steps.poll_loop.outputs.status in ['completed', 'partial', 'failed'] }}
- id: store_results
name: Store Results in Database
script: f/ai_trader/store_simulation_results
inputs:
job_results: ${{ steps.poll_loop.outputs }}
condition: |
${{ steps.poll_loop.outputs.status in ['completed', 'partial'] }}
- id: notify
name: Send Notification
type: email
to: admin@example.com
subject: "AI Trader Simulation Complete"
body: |
Simulation completed for ${{ steps.poll_loop.outputs.job_id }}
Status: ${{ steps.poll_loop.outputs.status }}
Duration: ${{ steps.poll_loop.outputs.duration_seconds }}s
```
---
### 3.7 Testing Windmill Integration Locally
**1. Start AI-Trader API:**
```bash
docker-compose up -d
```
**2. Test trigger endpoint:**
```bash
curl -X POST http://localhost:8080/simulate/trigger \
-H "Content-Type: application/json" \
-d '{"config_path": "configs/default_config.json"}'
```
**3. Test status polling:**
```bash
JOB_ID="<job_id_from_step_2>"
curl http://localhost:8080/simulate/status/$JOB_ID
```
**4. Test results retrieval:**
```bash
curl "http://localhost:8080/results?date=2025-01-16&model=gpt-5&detail=minimal"
```
**5. Deploy to Windmill:**
```bash
# Install Windmill CLI
npm install -g windmill-cli
# Login to your Windmill instance
wmill login https://your-windmill-instance.com
# Deploy scripts
wmill script push windmill/trigger_simulation.ts
wmill script push windmill/poll_simulation_status.ts
wmill script push windmill/store_simulation_results.py
# Deploy workflow
wmill flow push windmill/daily_simulation_workflow.yaml
```
---
## Part 4: Complete File Structure
After implementation, the project structure will be:
```
AI-Trader/
├── api/
│ ├── __init__.py
│ ├── main.py # FastAPI application
│ ├── models.py # Pydantic request/response models
│ ├── job_manager.py # Job lifecycle management
│ ├── database.py # SQLite utilities
│ ├── worker.py # Background simulation worker
│ ├── executor.py # Single model-day execution
│ └── runtime_manager.py # Runtime config isolation
├── docs/
│ ├── api-specification.md
│ ├── job-manager-specification.md
│ ├── worker-specification.md
│ └── implementation-specifications.md
├── windmill/
│ ├── trigger_simulation.ts
│ ├── poll_simulation_status.ts
│ ├── store_simulation_results.py
│ ├── get_latest_performance.py
│ ├── daily_simulation_workflow.yaml
│ └── dashboard.json
├── agent/
│ └── base_agent/
│ └── base_agent.py # NO CHANGES NEEDED
├── agent_tools/
│ └── ... (existing MCP tools)
├── data/
│ ├── jobs.db # SQLite database (created automatically)
│ ├── runtime_env*.json # Runtime configs (temporary)
│ ├── agent_data/ # Existing position/log data
│ └── merged.jsonl # Existing price data
├── Dockerfile # Updated for API mode
├── docker-compose.yml # Updated service definition
├── docker-entrypoint-api.sh # New API entrypoint
├── requirements-api.txt # FastAPI dependencies
├── .env # Environment configuration
└── main.py # Existing (used by worker)
```
---
## Part 5: Implementation Checklist
### Phase 1: API Foundation (Days 1-2)
- [ ] Create `api/` directory structure
- [ ] Implement `api/models.py` with Pydantic models
- [ ] Implement `api/database.py` with SQLite utilities
- [ ] Implement `api/job_manager.py` with job CRUD operations
- [ ] Write unit tests for job_manager
- [ ] Test database operations manually
### Phase 2: Worker & Executor (Days 3-4)
- [ ] Implement `api/runtime_manager.py`
- [ ] Implement `api/executor.py` for single model-day execution
- [ ] Implement `api/worker.py` for job orchestration
- [ ] Test worker with mock agent
- [ ] Test runtime config isolation
### Phase 3: FastAPI Endpoints (Days 5-6)
- [ ] Implement `api/main.py` with all endpoints
- [ ] Implement `/simulate/trigger` with background tasks
- [ ] Implement `/simulate/status/{job_id}`
- [ ] Implement `/simulate/current`
- [ ] Implement `/results` with detail levels
- [ ] Implement `/health` with MCP checks
- [ ] Test all endpoints with Postman/curl
### Phase 4: Docker Integration (Day 7)
- [ ] Update `Dockerfile`
- [ ] Create `docker-entrypoint-api.sh`
- [ ] Create `requirements-api.txt`
- [ ] Update `docker-compose.yml`
- [ ] Test Docker build
- [ ] Test container startup and health checks
- [ ] Test end-to-end simulation via API in Docker
### Phase 5: Windmill Integration (Days 8-9)
- [ ] Create Windmill scripts (trigger, poll, store)
- [ ] Test scripts locally against Docker API
- [ ] Deploy scripts to Windmill instance
- [ ] Create Windmill workflow
- [ ] Test workflow end-to-end
- [ ] Create Windmill dashboard
- [ ] Document Windmill setup process
### Phase 6: Testing & Documentation (Day 10)
- [ ] Integration tests for complete workflow
- [ ] Load testing (multiple concurrent requests)
- [ ] Error scenario testing (MCP down, API timeout)
- [ ] Update README.md with API usage
- [ ] Create API documentation (Swagger/OpenAPI)
- [ ] Create deployment guide
- [ ] Create troubleshooting guide
---
## Summary
This comprehensive specification covers:
1. **BaseAgent Refactoring:** Minimal changes needed (existing code compatible)
2. **Docker Configuration:** API service mode with health checks and proper entrypoint
3. **Windmill Integration:** Complete workflow automation with TypeScript/Python scripts
4. **File Structure:** Clear organization of new API components
5. **Implementation Checklist:** Step-by-step plan for 10-day implementation
**Total estimated implementation time:** 10 working days for MVP
**Next Step:** Review all specifications (api-specification.md, job-manager-specification.md, worker-specification.md, and this document) and approve before beginning implementation.

# Job Manager & Database Specification
## 1. Overview
The Job Manager is responsible for:
1. **Job lifecycle management** - Creating, tracking, updating job status
2. **Database operations** - SQLite CRUD operations for jobs and job_details
3. **Concurrency control** - Ensuring only one simulation runs at a time
4. **State persistence** - Maintaining job state across API restarts
---
## 2. Database Schema
### 2.1 SQLite Database Location
```
data/jobs.db
```
**Rationale:** Co-located with simulation data for easy volume mounting
### 2.2 Table: jobs
**Purpose:** Track high-level job metadata and status
```sql
CREATE TABLE IF NOT EXISTS jobs (
job_id TEXT PRIMARY KEY,
config_path TEXT NOT NULL,
status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
date_range TEXT NOT NULL, -- JSON array: ["2025-01-16", "2025-01-17"]
models TEXT NOT NULL, -- JSON array: ["claude-3.7-sonnet", "gpt-5"]
created_at TEXT NOT NULL, -- ISO 8601: "2025-01-20T14:30:00Z"
started_at TEXT, -- When first model-day started
completed_at TEXT, -- When last model-day finished
total_duration_seconds REAL,
error TEXT -- Top-level error message if job failed
);
-- Indexes for performance
CREATE INDEX IF NOT EXISTS idx_jobs_status ON jobs(status);
CREATE INDEX IF NOT EXISTS idx_jobs_created_at ON jobs(created_at DESC);
```
**Field Details:**
- `job_id`: UUID v4 (e.g., `550e8400-e29b-41d4-a716-446655440000`)
- `status`: Current job state
- `pending`: Job created, not started yet
- `running`: At least one model-day is executing
- `completed`: All model-days succeeded
- `partial`: Some model-days succeeded, some failed
- `failed`: All model-days failed (rare edge case)
- `date_range`: JSON string for easy querying
- `models`: JSON string of enabled model signatures
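Because `date_range` and `models` are stored as JSON text, membership queries can use SQLite's `json_each` table-valued function (available when SQLite is built with the JSON1 extension, the default in recent builds). A sketch:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, models TEXT NOT NULL)")
conn.execute(
    "INSERT INTO jobs VALUES (?, ?)",
    ("550e8400", json.dumps(["gpt-5", "claude-3.7-sonnet"])),
)

# Find jobs whose models array contains a given model
rows = conn.execute(
    "SELECT j.job_id FROM jobs j, json_each(j.models) m WHERE m.value = ?",
    ("gpt-5",),
).fetchall()
print(rows)  # [('550e8400',)]
```

If JSON1 is unavailable, the fallback is to `json.loads` the column in Python after fetching.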
### 2.3 Table: job_details
**Purpose:** Track individual model-day execution status
```sql
CREATE TABLE IF NOT EXISTS job_details (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL, -- "2025-01-16"
model TEXT NOT NULL, -- "gpt-5"
status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
started_at TEXT,
completed_at TEXT,
duration_seconds REAL,
error TEXT, -- Error message if this model-day failed
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
-- Indexes
CREATE INDEX IF NOT EXISTS idx_job_details_job_id ON job_details(job_id);
CREATE INDEX IF NOT EXISTS idx_job_details_status ON job_details(status);
CREATE UNIQUE INDEX IF NOT EXISTS idx_job_details_unique ON job_details(job_id, date, model);
```
**Field Details:**
- Each row represents one model-day (e.g., `gpt-5` on `2025-01-16`)
- `UNIQUE INDEX` prevents duplicate execution entries
- `ON DELETE CASCADE` ensures orphaned records are cleaned up
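`api/database.py` is imported throughout this spec but not defined in it. A plausible minimal sketch (the function name matches the imports above; the body is an assumption) that also enables foreign-key enforcement, without which `ON DELETE CASCADE` is a no-op in SQLite:

```python
# api/database.py (sketch)
import sqlite3

def get_db_connection(db_path: str) -> sqlite3.Connection:
    """Open a SQLite connection with sensible defaults for the job store."""
    conn = sqlite3.connect(db_path)
    # SQLite ships with foreign keys OFF per connection; turn them on so
    # the ON DELETE CASCADE clauses in the schema above actually fire.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn

conn = get_db_connection(":memory:")
fk_enabled = conn.execute("PRAGMA foreign_keys").fetchone()[0]
print(fk_enabled)  # 1
```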
### 2.4 Example Data
**jobs table:**
```
job_id | config_path | status | date_range | models | created_at | started_at | completed_at | total_duration_seconds
--------------------------------------|--------------------------|-----------|-----------------------------------|---------------------------------|----------------------|----------------------|----------------------|----------------------
550e8400-e29b-41d4-a716-446655440000 | configs/default_config.json | completed | ["2025-01-16","2025-01-17"] | ["gpt-5","claude-3.7-sonnet"] | 2025-01-20T14:25:00Z | 2025-01-20T14:25:10Z | 2025-01-20T14:29:45Z | 275.3
```
**job_details table:**
```
id | job_id | date | model | status | started_at | completed_at | duration_seconds | error
---|--------------------------------------|------------|--------------------|-----------|----------------------|----------------------|------------------|------
1 | 550e8400-e29b-41d4-a716-446655440000 | 2025-01-16 | gpt-5 | completed | 2025-01-20T14:25:10Z | 2025-01-20T14:25:48Z | 38.2 | NULL
2 | 550e8400-e29b-41d4-a716-446655440000 | 2025-01-16 | claude-3.7-sonnet | completed | 2025-01-20T14:25:10Z | 2025-01-20T14:25:55Z | 45.1 | NULL
3 | 550e8400-e29b-41d4-a716-446655440000 | 2025-01-17 | gpt-5 | completed | 2025-01-20T14:25:56Z | 2025-01-20T14:26:36Z | 40.0 | NULL
4 | 550e8400-e29b-41d4-a716-446655440000 | 2025-01-17 | claude-3.7-sonnet | completed | 2025-01-20T14:25:56Z | 2025-01-20T14:26:42Z | 46.5 | NULL
```
---
## 3. Job Manager Class
### 3.1 File Structure
```
api/
├── job_manager.py # Core JobManager class
├── database.py # SQLite connection and utilities
└── models.py # Pydantic models
```
### 3.2 JobManager Interface
```python
# api/job_manager.py
from datetime import datetime
from typing import Optional, List, Dict, Tuple
import uuid
import json
from api.database import get_db_connection
class JobManager:
"""Manages simulation job lifecycle and database operations"""
def __init__(self, db_path: str = "data/jobs.db"):
self.db_path = db_path
self._initialize_database()
def _initialize_database(self) -> None:
"""Create tables if they don't exist"""
conn = get_db_connection(self.db_path)
# Execute CREATE TABLE statements from section 2.2 and 2.3
conn.close()
# ========== Job Creation ==========
def create_job(
self,
config_path: str,
date_range: List[str],
models: List[str]
) -> str:
"""
Create a new simulation job.
Args:
config_path: Path to config file
date_range: List of trading dates to simulate
models: List of model signatures to run
Returns:
job_id: UUID of created job
Raises:
ValueError: If another job is already running
"""
# 1. Check if any jobs are currently running
if not self.can_start_new_job():
raise ValueError("Another simulation job is already running")
# 2. Generate job ID
job_id = str(uuid.uuid4())
# 3. Create job record
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
cursor.execute("""
INSERT INTO jobs (
job_id, config_path, status, date_range, models, created_at
) VALUES (?, ?, ?, ?, ?, ?)
""", (
job_id,
config_path,
"pending",
json.dumps(date_range),
json.dumps(models),
datetime.utcnow().isoformat() + "Z"
))
# 4. Create job_details records for each model-day
for date in date_range:
for model in models:
cursor.execute("""
INSERT INTO job_details (
job_id, date, model, status
) VALUES (?, ?, ?, ?)
""", (job_id, date, model, "pending"))
conn.commit()
conn.close()
return job_id
# ========== Job Retrieval ==========
def get_job(self, job_id: str) -> Optional[Dict]:
"""
Get job metadata by ID.
Returns:
Job dict with keys: job_id, config_path, status, date_range (list),
models (list), created_at, started_at, completed_at, total_duration_seconds
Returns None if job not found.
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
cursor.execute("SELECT * FROM jobs WHERE job_id = ?", (job_id,))
row = cursor.fetchone()
conn.close()
if row is None:
return None
return {
"job_id": row[0],
"config_path": row[1],
"status": row[2],
"date_range": json.loads(row[3]),
"models": json.loads(row[4]),
"created_at": row[5],
"started_at": row[6],
"completed_at": row[7],
"total_duration_seconds": row[8],
"error": row[9]
}
def get_current_job(self) -> Optional[Dict]:
"""Get most recent job (for /simulate/current endpoint)"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
cursor.execute("""
SELECT * FROM jobs
ORDER BY created_at DESC
LIMIT 1
""")
row = cursor.fetchone()
conn.close()
if row is None:
return None
return self._row_to_job_dict(row)
def get_running_jobs(self) -> List[Dict]:
"""Get all running or pending jobs"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
cursor.execute("""
SELECT * FROM jobs
WHERE status IN ('pending', 'running')
ORDER BY created_at DESC
""")
rows = cursor.fetchall()
conn.close()
return [self._row_to_job_dict(row) for row in rows]
# ========== Job Status Updates ==========
def update_job_status(
self,
job_id: str,
status: str,
error: Optional[str] = None
) -> None:
"""Update job status (pending → running → completed/partial/failed)"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
updates = {"status": status}
if status == "running" and self.get_job(job_id)["status"] == "pending":
updates["started_at"] = datetime.utcnow().isoformat() + "Z"
if status in ("completed", "partial", "failed"):
updates["completed_at"] = datetime.utcnow().isoformat() + "Z"
# Calculate total duration
job = self.get_job(job_id)
if job["started_at"]:
started = datetime.fromisoformat(job["started_at"].replace("Z", ""))
completed = datetime.utcnow()
updates["total_duration_seconds"] = (completed - started).total_seconds()
if error:
updates["error"] = error
# Build dynamic UPDATE query
set_clause = ", ".join([f"{k} = ?" for k in updates.keys()])
values = list(updates.values()) + [job_id]
cursor.execute(f"""
UPDATE jobs
SET {set_clause}
WHERE job_id = ?
""", values)
conn.commit()
conn.close()
def update_job_detail_status(
self,
job_id: str,
date: str,
model: str,
status: str,
error: Optional[str] = None
) -> None:
"""Update individual model-day status"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
updates = {"status": status}
# Get current detail status to determine if this is a status transition
cursor.execute("""
SELECT status, started_at FROM job_details
WHERE job_id = ? AND date = ? AND model = ?
""", (job_id, date, model))
row = cursor.fetchone()
if row:
current_status = row[0]
if status == "running" and current_status == "pending":
updates["started_at"] = datetime.utcnow().isoformat() + "Z"
if status in ("completed", "failed"):
updates["completed_at"] = datetime.utcnow().isoformat() + "Z"
# Calculate duration if started_at exists
if row[1]: # started_at
started = datetime.fromisoformat(row[1].replace("Z", ""))
completed = datetime.utcnow()
updates["duration_seconds"] = (completed - started).total_seconds()
if error:
updates["error"] = error
# Build UPDATE query
set_clause = ", ".join([f"{k} = ?" for k in updates.keys()])
values = list(updates.values()) + [job_id, date, model]
cursor.execute(f"""
UPDATE job_details
SET {set_clause}
WHERE job_id = ? AND date = ? AND model = ?
""", values)
conn.commit()
conn.close()
# After updating detail, check if overall job status needs update
self._update_job_status_from_details(job_id)
def _update_job_status_from_details(self, job_id: str) -> None:
"""
Recalculate job status based on job_details statuses.
Logic:
- If any detail is 'running' → job is 'running'
- If all details are 'completed' → job is 'completed'
- If some details are 'completed' and some 'failed' → job is 'partial'
- If all details are 'failed' → job is 'failed'
- If all details are 'pending' → job is 'pending'
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
cursor.execute("""
SELECT status, COUNT(*)
FROM job_details
WHERE job_id = ?
GROUP BY status
""", (job_id,))
status_counts = {row[0]: row[1] for row in cursor.fetchall()}
conn.close()
# Determine overall job status
if status_counts.get("running", 0) > 0:
new_status = "running"
elif status_counts.get("pending", 0) > 0:
# Some details still pending, job is either pending or running
current_job = self.get_job(job_id)
new_status = current_job["status"] # Keep current status
elif status_counts.get("failed", 0) > 0 and status_counts.get("completed", 0) > 0:
new_status = "partial"
elif status_counts.get("failed", 0) > 0:
new_status = "failed"
else:
new_status = "completed"
self.update_job_status(job_id, new_status)
# ========== Job Progress ==========
def get_job_progress(self, job_id: str) -> Dict:
"""
Get detailed progress for a job.
Returns:
{
"total_model_days": int,
"completed": int,
"failed": int,
"current": {"date": str, "model": str} | None,
"details": [
{"date": str, "model": str, "status": str, "duration_seconds": float | None, "error": str | None},
...
]
}
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
# Get all details for this job
cursor.execute("""
SELECT date, model, status, started_at, completed_at, duration_seconds, error
FROM job_details
WHERE job_id = ?
ORDER BY date ASC, model ASC
""", (job_id,))
rows = cursor.fetchall()
conn.close()
if not rows:
return {
"total_model_days": 0,
"completed": 0,
"failed": 0,
"current": None,
"details": []
}
total = len(rows)
completed = sum(1 for row in rows if row[2] == "completed")
failed = sum(1 for row in rows if row[2] == "failed")
# Find currently running model-day
current = None
for row in rows:
if row[2] == "running":
current = {"date": row[0], "model": row[1]}
break
# Build details list
details = []
for row in rows:
details.append({
"date": row[0],
"model": row[1],
"status": row[2],
"started_at": row[3],
"completed_at": row[4],
"duration_seconds": row[5],
"error": row[6]
})
return {
"total_model_days": total,
"completed": completed,
"failed": failed,
"current": current,
"details": details
}
# ========== Concurrency Control ==========
def can_start_new_job(self) -> bool:
"""Check if a new job can be started (max 1 concurrent job)"""
running_jobs = self.get_running_jobs()
return len(running_jobs) == 0
def find_job_by_date_range(self, date_range: List[str]) -> Optional[Dict]:
"""Find job with exact matching date range (for idempotency check)"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
# Query recent jobs (last 24 hours)
cursor.execute("""
SELECT * FROM jobs
WHERE created_at > datetime('now', '-1 day')
ORDER BY created_at DESC
""")
rows = cursor.fetchall()
conn.close()
# Check each job's date_range
target_range = set(date_range)
for row in rows:
job_range = set(json.loads(row[3])) # date_range column
if job_range == target_range:
return self._row_to_job_dict(row)
return None
# ========== Utility Methods ==========
def _row_to_job_dict(self, row: tuple) -> Dict:
"""Convert DB row to job dictionary"""
return {
"job_id": row[0],
"config_path": row[1],
"status": row[2],
"date_range": json.loads(row[3]),
"models": json.loads(row[4]),
"created_at": row[5],
"started_at": row[6],
"completed_at": row[7],
"total_duration_seconds": row[8],
"error": row[9]
}
def cleanup_old_jobs(self, days: int = 30) -> int:
"""
Delete jobs older than specified days (cleanup maintenance).
Returns:
Number of jobs deleted
"""
conn = get_db_connection(self.db_path)
cursor = conn.cursor()
cursor.execute("""
DELETE FROM jobs
WHERE created_at < datetime('now', '-' || ? || ' days')
""", (days,))
deleted_count = cursor.rowcount
conn.commit()
conn.close()
return deleted_count
```
---
## 4. Database Utility Module
```python
# api/database.py
import sqlite3
from typing import Optional
import os
def get_db_connection(db_path: str = "data/jobs.db") -> sqlite3.Connection:
"""
Get SQLite database connection.
Ensures:
- Database directory exists
- Foreign keys are enabled
- Row factory returns dict-like objects
"""
    # Ensure the database directory exists (guard against bare filenames,
    # where os.path.dirname() returns "")
    db_dir = os.path.dirname(db_path)
    if db_dir:
        os.makedirs(db_dir, exist_ok=True)
conn = sqlite3.connect(db_path, check_same_thread=False)
conn.execute("PRAGMA foreign_keys = ON") # Enable FK constraints
conn.row_factory = sqlite3.Row # Return rows as dict-like objects
return conn
def initialize_database(db_path: str = "data/jobs.db") -> None:
"""Create database tables if they don't exist"""
conn = get_db_connection(db_path)
cursor = conn.cursor()
# Create jobs table
cursor.execute("""
CREATE TABLE IF NOT EXISTS jobs (
job_id TEXT PRIMARY KEY,
config_path TEXT NOT NULL,
status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
date_range TEXT NOT NULL,
models TEXT NOT NULL,
created_at TEXT NOT NULL,
started_at TEXT,
completed_at TEXT,
total_duration_seconds REAL,
error TEXT
)
""")
# Create indexes
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_jobs_status ON jobs(status)
""")
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_jobs_created_at ON jobs(created_at DESC)
""")
# Create job_details table
cursor.execute("""
CREATE TABLE IF NOT EXISTS job_details (
id INTEGER PRIMARY KEY AUTOINCREMENT,
job_id TEXT NOT NULL,
date TEXT NOT NULL,
model TEXT NOT NULL,
status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
started_at TEXT,
completed_at TEXT,
duration_seconds REAL,
error TEXT,
FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
)
""")
# Create indexes
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_job_details_job_id ON job_details(job_id)
""")
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_job_details_status ON job_details(status)
""")
cursor.execute("""
CREATE UNIQUE INDEX IF NOT EXISTS idx_job_details_unique
ON job_details(job_id, date, model)
""")
conn.commit()
conn.close()
```
---
## 5. State Transitions
### 5.1 Job Status State Machine
```
pending ──> running ──┬──> completed
                      ├──> partial
                      └──> failed
```
**Transition Logic:**
- `pending → running`: When first model-day starts executing
- `running → completed`: When all model-days complete successfully
- `running → partial`: When some model-days succeed, some fail
- `running → failed`: When all model-days fail (rare)
### 5.2 Job Detail Status State Machine
```
pending ──> running ──┬──> completed
                      └──> failed
```
**Transition Logic:**
- `pending → running`: When worker starts executing that model-day
- `running → completed`: When `agent.run_trading_session()` succeeds
- `running → failed`: When `agent.run_trading_session()` raises exception after retries
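The transition logic above can be captured as a pure function over the detail-status counts, mirroring `_update_job_status_from_details` (a sketch; the `current_status` fallback handles the case where some details are still pending):

```python
from collections import Counter

def derive_job_status(detail_statuses: list, current_status: str = "pending") -> str:
    """Map a list of job_detail statuses to the overall job status."""
    counts = Counter(detail_statuses)
    if counts.get("running", 0) > 0:
        return "running"
    if counts.get("pending", 0) > 0:
        # Some details not yet started: keep whatever the job says now
        return current_status
    if counts.get("failed", 0) > 0 and counts.get("completed", 0) > 0:
        return "partial"
    if counts.get("failed", 0) > 0:
        return "failed"
    return "completed"
```

Keeping this derivation in one place means the API never stores a status that contradicts its details.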
---
## 6. Concurrency Scenarios
### 6.1 Scenario: Duplicate Trigger Requests
**Timeline:**
1. Request A: POST /simulate/trigger → Job created with date_range=[2025-01-16, 2025-01-17]
2. Request B (5 seconds later): POST /simulate/trigger → Same date range
**Expected Behavior:**
- Request A: Returns `{"job_id": "abc123", "status": "accepted"}`
- Request B: `find_job_by_date_range()` finds Job abc123
- Request B: Returns `{"job_id": "abc123", "status": "running", ...}` (same job)
**Code:**
```python
# In /simulate/trigger endpoint
existing_job = job_manager.find_job_by_date_range(date_range)
if existing_job:
# Return existing job instead of creating duplicate
return existing_job
```
### 6.2 Scenario: Concurrent Jobs with Different Dates
**Timeline:**
1. Job A running: date_range=[2025-01-01 to 2025-01-10] (started 5 min ago)
2. Request: POST /simulate/trigger with date_range=[2025-01-11 to 2025-01-15]
**Expected Behavior:**
- `can_start_new_job()` returns False (Job A is still running)
- Request returns 409 Conflict with details of Job A
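The 409 path can be sketched as a framework-neutral guard that the trigger endpoint calls before creating a job (illustrative; the helper name and conflict payload fields are assumptions, not part of the spec above):

```python
def check_new_job_allowed(job_manager):
    """Return (allowed, conflict_info); the endpoint maps a conflict to HTTP 409.

    Assumes a JobManager-like object exposing can_start_new_job() and
    get_running_jobs() as defined in the Job Manager spec.
    """
    if job_manager.can_start_new_job():
        return True, None
    running = job_manager.get_running_jobs()[0]
    return False, {
        "error": "Another simulation job is already running",
        "running_job_id": running["job_id"],
        "running_date_range": running["date_range"],
    }
```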
### 6.3 Scenario: Job Cleanup on API Restart
**Problem:** API crashes while job is running. On restart, job stuck in "running" state.
**Solution:** On API startup, detect stale jobs and mark as failed:
```python
# In api/main.py startup event
@app.on_event("startup")
async def startup_event():
job_manager = JobManager()
# Find jobs stuck in 'running' or 'pending' state
stale_jobs = job_manager.get_running_jobs()
for job in stale_jobs:
# Mark as failed with explanation
job_manager.update_job_status(
job["job_id"],
"failed",
error="API restarted while job was running"
)
```
---
## 7. Testing Strategy
### 7.1 Unit Tests
```python
# tests/test_job_manager.py
import pytest
from api.job_manager import JobManager
import tempfile
import os
@pytest.fixture
def job_manager():
# Use temporary database for tests
temp_db = tempfile.NamedTemporaryFile(delete=False, suffix=".db")
temp_db.close()
jm = JobManager(db_path=temp_db.name)
yield jm
# Cleanup
os.unlink(temp_db.name)
def test_create_job(job_manager):
job_id = job_manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16", "2025-01-17"],
models=["gpt-5", "claude-3.7-sonnet"]
)
assert job_id is not None
job = job_manager.get_job(job_id)
assert job["status"] == "pending"
assert job["date_range"] == ["2025-01-16", "2025-01-17"]
# Check job_details created
progress = job_manager.get_job_progress(job_id)
assert progress["total_model_days"] == 4 # 2 dates × 2 models
def test_concurrent_job_blocked(job_manager):
# Create first job
job1_id = job_manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
# Try to create second job while first is pending
with pytest.raises(ValueError, match="Another simulation job is already running"):
job_manager.create_job("configs/test.json", ["2025-01-17"], ["gpt-5"])
# Mark first job as completed
job_manager.update_job_status(job1_id, "completed")
# Now second job should be allowed
job2_id = job_manager.create_job("configs/test.json", ["2025-01-17"], ["gpt-5"])
assert job2_id is not None
def test_job_status_transitions(job_manager):
job_id = job_manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
# Update job detail to running
job_manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
# Job should now be 'running'
job = job_manager.get_job(job_id)
assert job["status"] == "running"
assert job["started_at"] is not None
# Complete the detail
job_manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")
# Job should now be 'completed'
job = job_manager.get_job(job_id)
assert job["status"] == "completed"
assert job["completed_at"] is not None
def test_partial_job_status(job_manager):
job_id = job_manager.create_job(
"configs/test.json",
["2025-01-16"],
["gpt-5", "claude-3.7-sonnet"]
)
# One model succeeds
job_manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
job_manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")
# One model fails
job_manager.update_job_detail_status(job_id, "2025-01-16", "claude-3.7-sonnet", "running")
job_manager.update_job_detail_status(
job_id, "2025-01-16", "claude-3.7-sonnet", "failed",
error="API timeout"
)
# Job should be 'partial'
job = job_manager.get_job(job_id)
assert job["status"] == "partial"
progress = job_manager.get_job_progress(job_id)
assert progress["completed"] == 1
assert progress["failed"] == 1
```
---
## 8. Performance Considerations
### 8.1 Database Indexing
- `idx_jobs_status`: Fast filtering for running jobs
- `idx_jobs_created_at DESC`: Fast retrieval of most recent job
- `idx_job_details_unique`: Prevent duplicate model-day entries
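A quick way to confirm SQLite actually uses one of these indexes is `EXPLAIN QUERY PLAN` against the schema (illustrative check on an in-memory copy; the exact plan wording varies between SQLite versions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE jobs (
        job_id TEXT PRIMARY KEY,
        status TEXT NOT NULL,
        created_at TEXT NOT NULL
    )
""")
conn.execute("CREATE INDEX idx_jobs_status ON jobs(status)")

# The last column of each plan row describes the access strategy
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM jobs WHERE status = 'running'"
).fetchall()
print(plan[0][3])  # mentions idx_jobs_status when the index is used
conn.close()
```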
### 8.2 Connection Pooling
For MVP, using `sqlite3.connect()` per operation is acceptable (low concurrency).
For higher concurrency (future), consider:
- SQLAlchemy ORM with connection pooling
- PostgreSQL for production deployments
### 8.3 Query Optimization
**Avoid N+1 queries:**
```python
# BAD: Separate query for each job's progress
for job in jobs:
progress = job_manager.get_job_progress(job["job_id"])
# GOOD: Join jobs and job_details in a single query
cursor.execute("""
    SELECT
        jobs.*,
        COUNT(job_details.id) AS total,
        SUM(CASE WHEN job_details.status = 'completed' THEN 1 ELSE 0 END) AS completed
    FROM jobs
    LEFT JOIN job_details ON jobs.job_id = job_details.job_id
    GROUP BY jobs.job_id
""")
```
---
## 9. Error Handling
### 9.1 Database Errors
**Scenario:** SQLite database is locked or corrupted
**Handling:**
```python
try:
job_id = job_manager.create_job(...)
except sqlite3.OperationalError as e:
# Database locked - retry with exponential backoff
logger.error(f"Database error: {e}")
raise HTTPException(status_code=503, detail="Database temporarily unavailable")
except sqlite3.IntegrityError as e:
# Constraint violation (e.g., duplicate job_id)
logger.error(f"Integrity error: {e}")
raise HTTPException(status_code=400, detail="Invalid job data")
```
### 9.2 Foreign Key Violations
**Scenario:** Attempt to create job_detail for non-existent job
**Prevention:**
- Always create job record before job_details records
- Use transactions to ensure atomicity
```python
def create_job(self, ...):
conn = get_db_connection(self.db_path)
try:
cursor = conn.cursor()
# Insert job
cursor.execute("INSERT INTO jobs ...")
# Insert job_details
for date in date_range:
for model in models:
cursor.execute("INSERT INTO job_details ...")
conn.commit() # Atomic commit
except Exception as e:
conn.rollback() # Rollback on any error
raise
finally:
conn.close()
```
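The rollback behaviour can be demonstrated end to end with an in-memory database (illustrative; the schema is trimmed to the two columns needed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE job_details (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        job_id TEXT NOT NULL REFERENCES jobs(job_id) ON DELETE CASCADE
    )
""")

try:
    cur = conn.cursor()
    cur.execute("INSERT INTO jobs VALUES ('job-1')")
    # FK violation: parent 'job-2' does not exist
    cur.execute("INSERT INTO job_details (job_id) VALUES ('job-2')")
    conn.commit()
except sqlite3.IntegrityError:
    conn.rollback()

# Nothing was committed: the jobs insert rolled back along with the bad detail
count = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
print(count)  # 0
conn.close()
```

Because Python's `sqlite3` opens an implicit transaction on the first DML statement, the valid `jobs` insert and the failing `job_details` insert share one transaction and roll back together.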
---
## 10. Migration Strategy
### 10.1 Schema Versioning
For future schema changes, use migration scripts:
```
data/
└── migrations/
├── 001_initial_schema.sql
├── 002_add_priority_column.sql
└── ...
```
Track applied migrations in database:
```sql
CREATE TABLE IF NOT EXISTS schema_migrations (
version INTEGER PRIMARY KEY,
applied_at TEXT NOT NULL
);
```
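A minimal migration runner built on that table might look like this (a sketch under the assumptions above; migrations are passed in as `{version: sql}` rather than read from `data/migrations/`):

```python
import sqlite3

def apply_migrations(conn: sqlite3.Connection, migrations: dict) -> list:
    """Apply pending migrations in version order; record each in schema_migrations."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS schema_migrations (
            version INTEGER PRIMARY KEY,
            applied_at TEXT NOT NULL
        )
    """)
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    newly_applied = []
    for version in sorted(migrations):
        if version in applied:
            continue  # idempotent: skip already-applied versions
        conn.executescript(migrations[version])
        conn.execute(
            "INSERT INTO schema_migrations VALUES (?, datetime('now'))", (version,)
        )
        newly_applied.append(version)
    conn.commit()
    return newly_applied
```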
### 10.2 Backward Compatibility
When adding columns:
- Use `ALTER TABLE ADD COLUMN ... DEFAULT ...` for backward compatibility
- Never remove columns (deprecate instead)
- Version API responses to handle schema changes
---
## Summary
The Job Manager provides:
1. **Robust job tracking** with SQLite persistence
2. **Concurrency control** ensuring single-job execution
3. **Granular progress monitoring** at model-day level
4. **Flexible status handling** (completed/partial/failed)
5. **Idempotency** for duplicate trigger requests
Next specification: **Background Worker Architecture**

# Background Worker Architecture Specification
## 1. Overview
The Background Worker executes simulation jobs asynchronously, allowing the API to return immediately (202 Accepted) while simulations run in the background.
**Key Responsibilities:**
1. Execute simulation jobs queued by `/simulate/trigger` endpoint
2. Manage per-model-day execution with status updates
3. Handle errors gracefully (model failures don't block other models)
4. Coordinate runtime configuration for concurrent model execution
5. Update job status in database throughout execution
---
## 2. Worker Architecture
### 2.1 Execution Model
**Pattern:** Date-sequential, Model-parallel execution
```
Job: Simulate 2025-01-16 to 2025-01-18 for models [gpt-5, claude-3.7-sonnet]
Execution flow:
┌─────────────────────────────────────────────────────────────┐
│ Date: 2025-01-16 │
│ ├─ gpt-5 (running) ┐ │
│ └─ claude-3.7-sonnet (running) ┘ Parallel │
└─────────────────────────────────────────────────────────────┘
▼ (both complete)
┌─────────────────────────────────────────────────────────────┐
│ Date: 2025-01-17 │
│ ├─ gpt-5 (running) ┐ │
│ └─ claude-3.7-sonnet (running) ┘ Parallel │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Date: 2025-01-18 │
│ ├─ gpt-5 (running) ┐ │
│ └─ claude-3.7-sonnet (running) ┘ Parallel │
└─────────────────────────────────────────────────────────────┘
```
**Rationale:**
- **Models run in parallel** → Faster total execution (30-60s per model-day, 3 models = ~30-60s per date instead of ~90-180s)
- **Dates run sequentially** → Ensures position.jsonl integrity (no concurrent writes to same file)
- **Independent failure handling** → One model's failure doesn't block other models
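The pattern reduces to a nested loop: an outer sequential loop over dates with an inner `asyncio.gather` over models. A minimal sketch with stand-in coroutines:

```python
import asyncio

async def run_model_day(date: str, model: str, results: list) -> None:
    """Stand-in for one model-day execution."""
    await asyncio.sleep(0)  # simulate async work
    results.append((date, model))

async def run_job(dates: list, models: list) -> list:
    results = []
    for date in dates:              # dates: strictly sequential
        await asyncio.gather(       # models: parallel within a date
            *(run_model_day(date, m, results) for m in models),
            return_exceptions=True,  # one model's failure doesn't stop the rest
        )
    return results

order = asyncio.run(run_job(["2025-01-16", "2025-01-17"], ["gpt-5", "claude-3.7-sonnet"]))
```

All entries for a date complete before the next date starts, which is exactly the guarantee that protects `position.jsonl` from concurrent writes.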
---
### 2.2 File Structure
```
api/
├── worker.py # SimulationWorker class
├── executor.py # Single model-day execution logic
└── runtime_manager.py # Runtime config isolation
```
---
## 3. Worker Implementation
### 3.1 SimulationWorker Class
```python
# api/worker.py
import asyncio
from typing import List, Dict
from datetime import datetime
import logging
from api.job_manager import JobManager
from api.executor import ModelDayExecutor
from main import load_config, get_agent_class
logger = logging.getLogger(__name__)
class SimulationWorker:
"""
Executes simulation jobs in the background.
Manages:
- Date-sequential, model-parallel execution
- Job status updates throughout execution
- Error handling and recovery
"""
def __init__(self, job_manager: JobManager):
self.job_manager = job_manager
self.executor = ModelDayExecutor(job_manager)
async def run_job(self, job_id: str) -> None:
"""
Execute a simulation job.
Args:
job_id: UUID of job to execute
Flow:
1. Load job from database
2. Load configuration file
3. Initialize agents for each model
4. For each date sequentially:
- Run all models in parallel
- Update status after each model-day
5. Mark job as completed/partial/failed
"""
logger.info(f"Starting simulation job {job_id}")
try:
# 1. Load job metadata
job = self.job_manager.get_job(job_id)
if not job:
logger.error(f"Job {job_id} not found")
return
# 2. Update job status to 'running'
self.job_manager.update_job_status(job_id, "running")
# 3. Load configuration
config = load_config(job["config_path"])
# 4. Get enabled models from config
enabled_models = [
m for m in config["models"]
if m.get("signature") in job["models"] and m.get("enabled", True)
]
if not enabled_models:
raise ValueError("No enabled models found in configuration")
# 5. Get agent class
agent_type = config.get("agent_type", "BaseAgent")
AgentClass = get_agent_class(agent_type)
# 6. Execute each date sequentially
for date in job["date_range"]:
logger.info(f"[Job {job_id}] Processing date: {date}")
# Run all models for this date in parallel
tasks = []
for model_config in enabled_models:
task = self.executor.run_model_day(
job_id=job_id,
date=date,
model_config=model_config,
agent_class=AgentClass,
config=config
)
tasks.append(task)
# Wait for all models to complete this date
results = await asyncio.gather(*tasks, return_exceptions=True)
# Log any exceptions (already handled by executor, just for visibility)
for i, result in enumerate(results):
if isinstance(result, Exception):
model_sig = enabled_models[i]["signature"]
logger.error(f"[Job {job_id}] Model {model_sig} failed on {date}: {result}")
logger.info(f"[Job {job_id}] Date {date} completed")
# 7. Job execution finished - final status will be set by job_manager
# based on job_details statuses
logger.info(f"[Job {job_id}] All dates processed")
except Exception as e:
logger.error(f"[Job {job_id}] Fatal error: {e}", exc_info=True)
self.job_manager.update_job_status(job_id, "failed", error=str(e))
```
---
### 3.2 ModelDayExecutor
```python
# api/executor.py
import asyncio
import os
import logging
from typing import Dict, Any
from datetime import datetime
from pathlib import Path
from api.job_manager import JobManager
from api.runtime_manager import RuntimeConfigManager
from tools.general_tools import write_config_value
logger = logging.getLogger(__name__)
class ModelDayExecutor:
"""
Executes a single model-day simulation.
Responsibilities:
- Initialize agent for specific model
- Set up isolated runtime configuration
- Execute trading session
- Update job_detail status
- Handle errors without blocking other models
"""
def __init__(self, job_manager: JobManager):
self.job_manager = job_manager
self.runtime_manager = RuntimeConfigManager()
async def run_model_day(
self,
job_id: str,
date: str,
model_config: Dict[str, Any],
agent_class: type,
config: Dict[str, Any]
) -> None:
"""
Execute simulation for one model on one date.
Args:
job_id: Job UUID
date: Trading date (YYYY-MM-DD)
model_config: Model configuration dict from config file
agent_class: Agent class (e.g., BaseAgent)
config: Full configuration dict
Updates:
- job_details status: pending → running → completed/failed
- Writes to position.jsonl and log.jsonl
"""
model_sig = model_config["signature"]
logger.info(f"[Job {job_id}] Starting {model_sig} on {date}")
# Update status to 'running'
self.job_manager.update_job_detail_status(
job_id, date, model_sig, "running"
)
# Create isolated runtime config for this execution
runtime_config_path = self.runtime_manager.create_runtime_config(
job_id=job_id,
model_sig=model_sig,
date=date
)
try:
# 1. Extract model parameters
basemodel = model_config.get("basemodel")
openai_base_url = model_config.get("openai_base_url")
openai_api_key = model_config.get("openai_api_key")
if not basemodel:
raise ValueError(f"Model {model_sig} missing basemodel field")
# 2. Get agent configuration
agent_config = config.get("agent_config", {})
log_config = config.get("log_config", {})
max_steps = agent_config.get("max_steps", 10)
max_retries = agent_config.get("max_retries", 3)
base_delay = agent_config.get("base_delay", 0.5)
initial_cash = agent_config.get("initial_cash", 10000.0)
log_path = log_config.get("log_path", "./data/agent_data")
# 3. Get stock symbols from prompts
from prompts.agent_prompt import all_nasdaq_100_symbols
# 4. Create agent instance
agent = agent_class(
signature=model_sig,
basemodel=basemodel,
stock_symbols=all_nasdaq_100_symbols,
log_path=log_path,
openai_base_url=openai_base_url,
openai_api_key=openai_api_key,
max_steps=max_steps,
max_retries=max_retries,
base_delay=base_delay,
initial_cash=initial_cash,
init_date=date # Note: This is used for initial registration
)
# 5. Initialize MCP connection and AI model
# (Only do this once per job, not per date - optimization for future)
await agent.initialize()
            # 6. Set runtime configuration for this execution
            # Override RUNTIME_ENV_PATH to use isolated config.
            # CAVEAT: os.environ is process-global, so model-days running in
            # parallel can still race on this value; passing the config path
            # to the agent explicitly would give true isolation.
            original_runtime_path = os.environ.get("RUNTIME_ENV_PATH")
            os.environ["RUNTIME_ENV_PATH"] = runtime_config_path
try:
# Write runtime config values
write_config_value("TODAY_DATE", date)
write_config_value("SIGNATURE", model_sig)
write_config_value("IF_TRADE", False)
# 7. Execute trading session
await agent.run_trading_session(date)
# 8. Mark as completed
self.job_manager.update_job_detail_status(
job_id, date, model_sig, "completed"
)
logger.info(f"[Job {job_id}] Completed {model_sig} on {date}")
finally:
# Restore original runtime path
if original_runtime_path:
os.environ["RUNTIME_ENV_PATH"] = original_runtime_path
else:
os.environ.pop("RUNTIME_ENV_PATH", None)
except Exception as e:
# Log error and update status to 'failed'
error_msg = f"{type(e).__name__}: {str(e)}"
logger.error(
f"[Job {job_id}] Failed {model_sig} on {date}: {error_msg}",
exc_info=True
)
self.job_manager.update_job_detail_status(
job_id, date, model_sig, "failed", error=error_msg
)
finally:
# Cleanup runtime config file
self.runtime_manager.cleanup_runtime_config(runtime_config_path)
```
---
### 3.3 RuntimeConfigManager
```python
# api/runtime_manager.py
import os
import json
import tempfile
from pathlib import Path
import logging
logger = logging.getLogger(__name__)
class RuntimeConfigManager:
"""
Manages isolated runtime configuration files for concurrent model execution.
Problem:
Multiple models running concurrently need separate runtime_env.json files
to avoid race conditions on TODAY_DATE, SIGNATURE, IF_TRADE values.
Solution:
Create temporary runtime config file per model-day execution:
- /app/data/runtime_env_{job_id}_{model}_{date}.json
Lifecycle:
1. create_runtime_config() → Creates temp file
2. Executor sets RUNTIME_ENV_PATH env var
3. Agent uses isolated config via get_config_value/write_config_value
4. cleanup_runtime_config() → Deletes temp file
"""
def __init__(self, data_dir: str = "data"):
self.data_dir = Path(data_dir)
self.data_dir.mkdir(parents=True, exist_ok=True)
def create_runtime_config(
self,
job_id: str,
model_sig: str,
date: str
) -> str:
"""
Create isolated runtime config file for this execution.
Args:
job_id: Job UUID
model_sig: Model signature
date: Trading date
Returns:
Path to created runtime config file
"""
# Generate unique filename
filename = f"runtime_env_{job_id[:8]}_{model_sig}_{date}.json"
config_path = self.data_dir / filename
# Initialize with default values
initial_config = {
"TODAY_DATE": date,
"SIGNATURE": model_sig,
"IF_TRADE": False,
"JOB_ID": job_id
}
with open(config_path, "w", encoding="utf-8") as f:
json.dump(initial_config, f, indent=4)
logger.debug(f"Created runtime config: {config_path}")
return str(config_path)
def cleanup_runtime_config(self, config_path: str) -> None:
"""
Delete runtime config file after execution.
Args:
config_path: Path to runtime config file
"""
try:
if os.path.exists(config_path):
os.unlink(config_path)
logger.debug(f"Cleaned up runtime config: {config_path}")
except Exception as e:
logger.warning(f"Failed to cleanup runtime config {config_path}: {e}")
def cleanup_all_runtime_configs(self) -> int:
"""
Cleanup all runtime config files (for maintenance/startup).
Returns:
Number of files deleted
"""
count = 0
for config_file in self.data_dir.glob("runtime_env_*.json"):
try:
config_file.unlink()
count += 1
except Exception as e:
logger.warning(f"Failed to delete {config_file}: {e}")
if count > 0:
logger.info(f"Cleaned up {count} stale runtime config files")
return count
```
---
## 4. Integration with FastAPI
### 4.1 Background Task Pattern
```python
# api/main.py
from fastapi import FastAPI, BackgroundTasks, HTTPException
from api.job_manager import JobManager
from api.worker import SimulationWorker
from api.models import TriggerSimulationRequest, TriggerSimulationResponse
app = FastAPI(title="AI-Trader API")
# Global instances
job_manager = JobManager()
worker = SimulationWorker(job_manager)
@app.post("/simulate/trigger", response_model=TriggerSimulationResponse)
async def trigger_simulation(
request: TriggerSimulationRequest,
background_tasks: BackgroundTasks
):
"""
Trigger a catch-up simulation job.
Returns:
202 Accepted with job details if new job queued
200 OK with existing job details if already running
"""
# 1. Load configuration
config = load_config(request.config_path)
# 2. Determine date range (last position date → most recent trading day)
date_range = calculate_date_range(config)
if not date_range:
return {
"status": "current",
"message": "Simulation already up-to-date",
"last_simulation_date": get_last_simulation_date(config),
"next_trading_day": get_next_trading_day()
}
# 3. Get enabled models
models = [m["signature"] for m in config["models"] if m.get("enabled", True)]
# 4. Check for existing job with same date range
existing_job = job_manager.find_job_by_date_range(date_range)
if existing_job:
# Return existing job status
progress = job_manager.get_job_progress(existing_job["job_id"])
return {
"job_id": existing_job["job_id"],
"status": existing_job["status"],
"date_range": date_range,
"models": models,
"created_at": existing_job["created_at"],
"message": "Simulation already in progress",
"progress": progress
}
# 5. Create new job
try:
job_id = job_manager.create_job(
config_path=request.config_path,
date_range=date_range,
models=models
)
except ValueError as e:
# Another job is running (different date range)
raise HTTPException(status_code=409, detail=str(e))
# 6. Queue background task
background_tasks.add_task(worker.run_job, job_id)
# 7. Return immediately with job details
return {
"job_id": job_id,
"status": "accepted",
"date_range": date_range,
"models": models,
"created_at": datetime.utcnow().isoformat() + "Z",
"message": "Simulation job queued successfully"
}
```
---
## 5. Agent Initialization Optimization
### 5.1 Current Issue
**Problem:** Each model-day calls `agent.initialize()`, which:
1. Creates new MCP client connections
2. Creates new AI model instance
A 5-day simulation with 3 models makes 15 `initialize()` calls → slow startup overhead
### 5.2 Optimization Strategy (Future Enhancement)
**Option A: Persistent Agent Instances**
Create agent once per model, reuse for all dates:
```python
class SimulationWorker:
async def run_job(self, job_id: str) -> None:
# ... load config ...
# Initialize all agents once
agents = {}
for model_config in enabled_models:
agent = await self._create_and_initialize_agent(
model_config, AgentClass, config
)
agents[model_config["signature"]] = agent
# Execute dates
for date in job["date_range"]:
tasks = []
for model_sig, agent in agents.items():
task = self.executor.run_model_day_with_agent(
job_id, date, agent
)
tasks.append(task)
await asyncio.gather(*tasks, return_exceptions=True)
```
**Benefit:** ~10-15s saved per job (avoid repeated MCP handshakes)
**Tradeoff:** More memory usage (agents kept in memory), more complex error handling
**Recommendation:** Implement in v2 after MVP validation
---
## 6. Error Handling & Recovery
### 6.1 Model-Day Failure Scenarios
**Scenario 1: AI Model API Timeout**
```python
# In executor.run_model_day()
try:
await agent.run_trading_session(date)
except asyncio.TimeoutError:
error_msg = "AI model API timeout after 30s"
self.job_manager.update_job_detail_status(
job_id, date, model_sig, "failed", error=error_msg
)
# Do NOT raise - let other models continue
```
**Scenario 2: MCP Service Down**
```python
# In agent.initialize()
except RuntimeError as e:
if "Failed to initialize MCP client" in str(e):
error_msg = "MCP services unavailable - check agent_tools/start_mcp_services.py"
self.job_manager.update_job_detail_status(
job_id, date, model_sig, "failed", error=error_msg
)
# This likely affects all models - but still don't raise, let job_manager determine final status
```
**Scenario 3: Out of Cash**
```python
# In trade tool
if position["CASH"] < total_cost:
# Trade tool returns error message
# Agent receives error, continues reasoning (might sell other stocks)
# Not a fatal error - trading session completes normally
```
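Scenario 3 in code: the trade tool's cash guard returns a structured error payload instead of raising, so the agent can keep reasoning (a sketch; the function name and payload fields are assumptions, not the actual trade tool API):

```python
def execute_buy(position: dict, symbol: str, qty: int, price: float) -> dict:
    """Reject a buy that exceeds available cash; return an error payload, don't raise."""
    total_cost = qty * price
    if position["CASH"] < total_cost:
        return {
            "status": "error",
            "message": f"Insufficient cash: need {total_cost:.2f}, have {position['CASH']:.2f}",
        }
    position["CASH"] -= total_cost
    position[symbol] = position.get(symbol, 0) + qty
    return {"status": "ok", "position": position}
```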
### 6.2 Job-Level Failure
**When does entire job fail?**
Only if:
1. Configuration file is invalid/missing
2. Agent class import fails
3. Database errors during status updates
In these cases, `worker.run_job()` catches exception and marks job as `failed`.
All other errors (model-day failures) result in `partial` status.
---
## 7. Logging Strategy
### 7.1 Log Levels by Component
**Worker (api/worker.py):**
- `INFO`: Job start/end, date transitions
- `ERROR`: Fatal job errors
**Executor (api/executor.py):**
- `INFO`: Model-day start/completion
- `ERROR`: Model-day failures (with exc_info=True)
**Agent (base_agent.py):**
- Existing logging (step-by-step execution)
### 7.2 Structured Logging Format
```python
import logging
import json
class JSONFormatter(logging.Formatter):
def format(self, record):
log_record = {
"timestamp": self.formatTime(record, self.datefmt),
"level": record.levelname,
"logger": record.name,
"message": record.getMessage(),
}
# Add extra fields if present
if hasattr(record, "job_id"):
log_record["job_id"] = record.job_id
if hasattr(record, "model"):
log_record["model"] = record.model
if hasattr(record, "date"):
log_record["date"] = record.date
return json.dumps(log_record)
# Configure logger
handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logger = logging.getLogger("api")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
```
### 7.3 Log Output Example
```json
{"timestamp": "2025-01-20T14:30:00Z", "level": "INFO", "logger": "api.worker", "message": "Starting simulation job 550e8400-...", "job_id": "550e8400-..."}
{"timestamp": "2025-01-20T14:30:01Z", "level": "INFO", "logger": "api.executor", "message": "Starting gpt-5 on 2025-01-16", "job_id": "550e8400-...", "model": "gpt-5", "date": "2025-01-16"}
{"timestamp": "2025-01-20T14:30:45Z", "level": "INFO", "logger": "api.executor", "message": "Completed gpt-5 on 2025-01-16", "job_id": "550e8400-...", "model": "gpt-5", "date": "2025-01-16"}
```
---
## 8. Testing Strategy
### 8.1 Unit Tests
```python
# tests/test_worker.py
import pytest
from unittest.mock import AsyncMock, MagicMock, patch
from api.worker import SimulationWorker
from api.job_manager import JobManager


@pytest.fixture
def mock_job_manager():
    jm = MagicMock(spec=JobManager)
    jm.get_job.return_value = {
        "job_id": "test-job-123",
        "config_path": "configs/test.json",
        "date_range": ["2025-01-16", "2025-01-17"],
        "models": ["gpt-5"]
    }
    return jm


@pytest.fixture
def worker(mock_job_manager):
    return SimulationWorker(mock_job_manager)


@pytest.mark.asyncio
async def test_run_job_success(worker, mock_job_manager):
    # Mock executor
    worker.executor.run_model_day = AsyncMock(return_value=None)

    await worker.run_job("test-job-123")

    # Verify job status updated to running
    mock_job_manager.update_job_status.assert_any_call("test-job-123", "running")
    # Verify executor called for each model-day
    assert worker.executor.run_model_day.call_count == 2  # 2 dates × 1 model


@pytest.mark.asyncio
async def test_run_job_partial_failure(worker, mock_job_manager):
    # Mock executor - first call succeeds, second fails
    worker.executor.run_model_day = AsyncMock(
        side_effect=[None, Exception("API timeout")]
    )

    await worker.run_job("test-job-123")

    # Job should continue despite one failure
    assert worker.executor.run_model_day.call_count == 2
    # Job status determined by job_manager based on job_details
    # (tested in test_job_manager.py)
```
### 8.2 Integration Tests
```python
# tests/test_integration.py
import time

import pytest
from fastapi.testclient import TestClient

from api.main import app

client = TestClient(app)


def test_trigger_and_poll_simulation():
    # 1. Trigger simulation
    response = client.post("/simulate/trigger", json={
        "config_path": "configs/test.json"
    })
    assert response.status_code == 202
    job_id = response.json()["job_id"]

    # 2. Poll status (may need to wait for background task)
    time.sleep(2)  # Wait for execution to start
    response = client.get(f"/simulate/status/{job_id}")
    assert response.status_code == 200
    assert response.json()["status"] in ("running", "completed")

    # 3. Wait for completion (with timeout)
    max_wait = 60  # seconds
    start_time = time.time()
    while time.time() - start_time < max_wait:
        response = client.get(f"/simulate/status/{job_id}")
        status = response.json()["status"]
        if status in ("completed", "partial", "failed"):
            break
        time.sleep(5)

    assert status in ("completed", "partial")
```
---
## 9. Performance Monitoring
### 9.1 Metrics to Track
**Job-level metrics:**
- Total duration (from trigger to completion)
- Model-day failure rate
- Average model-day duration
**System-level metrics:**
- Concurrent job count (should be ≤ 1)
- Database query latency
- MCP service response times
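Before any Prometheus instrumentation lands, the model-day failure rate can be computed directly from the `job_details` table (a sketch; it assumes the per-model-day `status` column described elsewhere in this spec):

```python
import sqlite3

def model_day_failure_rate(conn: sqlite3.Connection, job_id: str) -> float:
    """Fraction of a job's model-days that ended in 'failed' (0.0 when none ran)."""
    total, failed = conn.execute(
        """
        SELECT COUNT(*),
               SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END)
        FROM job_details WHERE job_id = ?
        """,
        (job_id,),
    ).fetchone()
    # SUM over zero rows is NULL, so coalesce to 0 before dividing
    return (failed or 0) / total if total else 0.0
```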
### 9.2 Instrumentation (Future)
```python
# api/metrics.py
from prometheus_client import Counter, Histogram, Gauge
# Job metrics
job_counter = Counter('simulation_jobs_total', 'Total simulation jobs', ['status'])
job_duration = Histogram('simulation_job_duration_seconds', 'Job execution time')
# Model-day metrics
model_day_counter = Counter('model_days_total', 'Total model-days', ['model', 'status'])
model_day_duration = Histogram('model_day_duration_seconds', 'Model-day execution time', ['model'])
# System metrics
concurrent_jobs = Gauge('concurrent_jobs', 'Number of running jobs')
```
**Usage:**
```python
# In worker.run_job()
with job_duration.time():
    await self._execute_job_logic(job_id)

job_counter.labels(status=final_status).inc()
```
---
## 10. Concurrency Safety
### 10.1 Thread Safety
**FastAPI Background Tasks:**
- Sync functions run in the threadpool; async functions run as asyncio tasks
- For MVP, using asyncio tasks (async functions)
**SQLite Thread Safety:**
- `check_same_thread=False` allows multi-thread access
- Each operation opens new connection → Safe for low concurrency
**File I/O:**
- `position.jsonl` writes are sequential per model → Safe
- Different models write to different files → Safe
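A connection helper consistent with these properties might look like the following (a sketch — the actual `api/database.py` implementation may differ):

```python
import sqlite3

def get_db_connection(db_path: str) -> sqlite3.Connection:
    """Open a short-lived connection; callers close it after each operation."""
    conn = sqlite3.connect(db_path, check_same_thread=False)
    conn.row_factory = sqlite3.Row             # dict-style column access
    conn.execute("PRAGMA foreign_keys = ON")   # enforce cascade deletes per-connection
    return conn
```

Opening a fresh connection per operation trades throughput for simplicity, which matches the "safe for low concurrency" claim above.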
### 10.2 Race Condition Scenarios
**Scenario: Two trigger requests at exact same time**
```
Thread A: Check can_start_new_job() → True
Thread B: Check can_start_new_job() → True
Thread A: Create job → Success
Thread B: Create job → Success (PROBLEM: 2 jobs running)
```
**Mitigation: Database-level locking**
```python
def can_start_new_job(self) -> bool:
    conn = get_db_connection(self.db_path)
    cursor = conn.cursor()
    # SQLite has no SELECT ... FOR UPDATE, so this check-then-act is not
    # atomic on its own; it only narrows the race window.
    cursor.execute("""
        SELECT COUNT(*) FROM jobs
        WHERE status IN ('pending', 'running')
    """)
    count = cursor.fetchone()[0]
    conn.close()
    return count == 0
```
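If the window ever needs closing without moving off SQLite, a `BEGIN IMMEDIATE` transaction can make the check-and-create step atomic, because it acquires the database write lock before the read (a sketch; the `jobs` columns are assumed from the schema described earlier):

```python
import sqlite3

def try_create_job(db_path: str, job_id: str) -> bool:
    """Atomically check for an active job and create a new one.

    Returns False when another job is pending/running.
    """
    # autocommit mode so the transaction is controlled explicitly
    conn = sqlite3.connect(db_path, isolation_level=None, check_same_thread=False)
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM jobs WHERE status IN ('pending', 'running')"
        ).fetchone()
        if count > 0:
            conn.execute("ROLLBACK")
            return False
        conn.execute(
            "INSERT INTO jobs (job_id, status) VALUES (?, 'pending')", (job_id,)
        )
        conn.execute("COMMIT")
        return True
    finally:
        conn.close()
```

A second writer's `BEGIN IMMEDIATE` blocks (or fails with `SQLITE_BUSY`) until the first commits, so two requests can no longer both pass the count check.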
**For MVP:** Accept risk of rare double-job scenario (extremely unlikely with Windmill polling)
**For Production:** Use PostgreSQL with row-level locking or distributed lock (Redis)
---
## Summary
The Background Worker provides:
1. **Async job execution** with FastAPI BackgroundTasks
2. **Parallel model execution** for faster completion
3. **Isolated runtime configs** to prevent state collisions
4. **Graceful error handling** where model failures don't block others
5. **Comprehensive logging** for debugging and monitoring
**Next specification:** BaseAgent Refactoring for Single-Day Execution

63
entrypoint-api.sh Normal file
View File

@@ -0,0 +1,63 @@
#!/bin/bash
set -e # Exit on any error
echo "🚀 Starting AI-Trader API Server..."
# Validate required environment variables
echo "🔍 Validating environment variables..."
MISSING_VARS=()
if [ -z "$OPENAI_API_KEY" ]; then
    MISSING_VARS+=("OPENAI_API_KEY")
fi
if [ -z "$ALPHAADVANTAGE_API_KEY" ]; then
    MISSING_VARS+=("ALPHAADVANTAGE_API_KEY")
fi
if [ -z "$JINA_API_KEY" ]; then
    MISSING_VARS+=("JINA_API_KEY")
fi
if [ ${#MISSING_VARS[@]} -gt 0 ]; then
    echo ""
    echo "❌ ERROR: Missing required environment variables:"
    for var in "${MISSING_VARS[@]}"; do
        echo "  - $var"
    done
    echo ""
    echo "Please set these variables in your .env file:"
    echo "  1. Copy .env.example to .env"
    echo "  2. Edit .env and add your API keys"
    echo "  3. Restart the container"
    echo ""
    exit 1
fi
echo "✅ Environment variables validated"
# Step 1: Initialize database
echo "📊 Initializing database..."
python -c "from api.database import initialize_database; initialize_database('data/jobs.db')"
echo "✅ Database initialized"
# Step 2: Start MCP services in background
echo "🔧 Starting MCP services..."
cd /app
python agent_tools/start_mcp_services.py &
MCP_PID=$!

# Cleanup on exit — register before the blocking server starts, or it never fires
trap "echo '🛑 Stopping services...'; kill $MCP_PID 2>/dev/null; exit 0" EXIT SIGTERM SIGINT

# Step 3: Wait for services to initialize
echo "⏳ Waiting for MCP services to start..."
sleep 3

# Step 4: Start FastAPI server with uvicorn
echo "🌐 Starting FastAPI server on port ${API_PORT:-8080}..."
uvicorn api.main:app \
    --host 0.0.0.0 \
    --port "${API_PORT:-8080}" \
    --log-level info \
    --access-log

45
pytest.ini Normal file
View File

@@ -0,0 +1,45 @@
[pytest]
# Test discovery
python_files = test_*.py
python_classes = Test*
python_functions = test_*
# Output options
addopts =
-v
--strict-markers
--tb=short
--cov=api
--cov-report=term-missing
--cov-report=html:htmlcov
--cov-fail-under=85
# Markers
markers =
unit: Unit tests (fast, isolated)
integration: Integration tests (with real dependencies)
performance: Performance and benchmark tests
security: Security tests
e2e: End-to-end tests (Docker required)
slow: Tests that take >10 seconds
# Test paths
testpaths = tests
# Coverage options
[coverage:run]
source = api
omit =
*/tests/*
*/conftest.py
*/__init__.py
[coverage:report]
exclude_lines =
pragma: no cover
def __repr__
raise AssertionError
raise NotImplementedError
if __name__ == .__main__.:
if TYPE_CHECKING:
@abstractmethod

25
requirements-dev.txt Normal file
View File

@@ -0,0 +1,25 @@
# Development and Testing Dependencies
# Testing framework
pytest==7.4.3
pytest-cov==4.1.0
pytest-asyncio==0.21.1
pytest-benchmark==4.0.0
# Mocking and fixtures
pytest-mock==3.12.0
# Code quality
ruff==0.1.7
black==23.11.0
isort==5.12.0
mypy==1.7.1
# Security
bandit==1.7.5
# Load testing
locust==2.18.3
# Type stubs
types-requests==2.31.0.10

View File

@@ -1,4 +1,7 @@
langchain==1.0.2
langchain-openai==1.0.1
langchain-mcp-adapters>=0.1.0
fastmcp==2.12.5
fastmcp==2.12.5
fastapi>=0.120.0
uvicorn[standard]>=0.27.0
pydantic>=2.0.0

242
scripts/test_api_endpoints.sh Executable file
View File

@@ -0,0 +1,242 @@
#!/bin/bash
# API Endpoint Testing Script
# Tests all REST API endpoints in running Docker container
set -e
echo "=========================================="
echo "AI-Trader API Endpoint Testing"
echo "=========================================="
echo ""
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Configuration
API_BASE_URL=${API_BASE_URL:-http://localhost:8080}
TEST_CONFIG="/app/configs/default_config.json"
# Check if API is running
echo "Checking if API is accessible..."
if ! curl -f "$API_BASE_URL/health" &> /dev/null; then
echo -e "${RED}${NC} API is not accessible at $API_BASE_URL"
echo "Make sure the container is running:"
echo " docker-compose up -d ai-trader-api"
exit 1
fi
echo -e "${GREEN}${NC} API is accessible"
echo ""
# Test 1: Health Check
echo -e "${BLUE}Test 1: GET /health${NC}"
echo "Testing health endpoint..."
HEALTH_RESPONSE=$(curl -s "$API_BASE_URL/health")
HEALTH_STATUS=$(echo $HEALTH_RESPONSE | jq -r '.status' 2>/dev/null || echo "error")
if [ "$HEALTH_STATUS" = "healthy" ]; then
echo -e "${GREEN}${NC} Health check passed"
echo "Response: $HEALTH_RESPONSE" | jq '.' 2>/dev/null || echo "$HEALTH_RESPONSE"
else
echo -e "${RED}${NC} Health check failed"
echo "Response: $HEALTH_RESPONSE"
fi
echo ""
# Test 2: Trigger Simulation
echo -e "${BLUE}Test 2: POST /simulate/trigger${NC}"
echo "Triggering test simulation (2 dates, 1 model)..."
TRIGGER_PAYLOAD=$(cat <<EOF
{
"config_path": "$TEST_CONFIG",
"date_range": ["2025-01-16", "2025-01-17"],
"models": ["gpt-4"]
}
EOF
)
echo "Request payload:"
echo "$TRIGGER_PAYLOAD" | jq '.'
TRIGGER_RESPONSE=$(curl -s -X POST "$API_BASE_URL/simulate/trigger" \
-H "Content-Type: application/json" \
-d "$TRIGGER_PAYLOAD")
JOB_ID=$(echo $TRIGGER_RESPONSE | jq -r '.job_id' 2>/dev/null)
if [ -n "$JOB_ID" ] && [ "$JOB_ID" != "null" ]; then
echo -e "${GREEN}${NC} Simulation triggered successfully"
echo "Job ID: $JOB_ID"
echo "Response: $TRIGGER_RESPONSE" | jq '.' 2>/dev/null || echo "$TRIGGER_RESPONSE"
else
echo -e "${RED}${NC} Failed to trigger simulation"
echo "Response: $TRIGGER_RESPONSE"
exit 1
fi
echo ""
# Test 3: Check Job Status
echo -e "${BLUE}Test 3: GET /simulate/status/{job_id}${NC}"
echo "Checking job status for: $JOB_ID"
echo "Waiting 5 seconds for job to start..."
sleep 5
STATUS_RESPONSE=$(curl -s "$API_BASE_URL/simulate/status/$JOB_ID")
JOB_STATUS=$(echo $STATUS_RESPONSE | jq -r '.status' 2>/dev/null)
if [ -n "$JOB_STATUS" ] && [ "$JOB_STATUS" != "null" ]; then
echo -e "${GREEN}${NC} Job status retrieved"
echo "Job Status: $JOB_STATUS"
echo "Response: $STATUS_RESPONSE" | jq '.' 2>/dev/null || echo "$STATUS_RESPONSE"
else
echo -e "${RED}${NC} Failed to get job status"
echo "Response: $STATUS_RESPONSE"
fi
echo ""
# Test 4: Poll until completion or timeout
echo -e "${BLUE}Test 4: Monitoring job progress${NC}"
echo "Polling job status (max 5 minutes)..."
MAX_POLLS=30
POLL_INTERVAL=10
POLL_COUNT=0
while [ $POLL_COUNT -lt $MAX_POLLS ]; do
STATUS_RESPONSE=$(curl -s "$API_BASE_URL/simulate/status/$JOB_ID")
JOB_STATUS=$(echo $STATUS_RESPONSE | jq -r '.status' 2>/dev/null)
PROGRESS=$(echo $STATUS_RESPONSE | jq -r '.progress' 2>/dev/null)
echo "[$((POLL_COUNT + 1))/$MAX_POLLS] Status: $JOB_STATUS | Progress: $PROGRESS"
if [ "$JOB_STATUS" = "completed" ] || [ "$JOB_STATUS" = "partial" ] || [ "$JOB_STATUS" = "failed" ]; then
echo -e "${GREEN}${NC} Job finished with status: $JOB_STATUS"
echo "Final response:"
echo "$STATUS_RESPONSE" | jq '.' 2>/dev/null || echo "$STATUS_RESPONSE"
break
fi
POLL_COUNT=$((POLL_COUNT + 1))
if [ $POLL_COUNT -lt $MAX_POLLS ]; then
sleep $POLL_INTERVAL
fi
done
if [ $POLL_COUNT -eq $MAX_POLLS ]; then
echo -e "${YELLOW}${NC} Job did not complete within timeout (still $JOB_STATUS)"
echo "Job may still be running. Check status later with:"
echo " curl $API_BASE_URL/simulate/status/$JOB_ID"
fi
echo ""
# Test 5: Query Results
echo -e "${BLUE}Test 5: GET /results${NC}"
echo "Querying results for job: $JOB_ID"
RESULTS_RESPONSE=$(curl -s "$API_BASE_URL/results?job_id=$JOB_ID")
RESULT_COUNT=$(echo $RESULTS_RESPONSE | jq -r '.count' 2>/dev/null)
if [ -n "$RESULT_COUNT" ] && [ "$RESULT_COUNT" != "null" ]; then
echo -e "${GREEN}${NC} Results retrieved"
echo "Result count: $RESULT_COUNT"
if [ "$RESULT_COUNT" -gt 0 ]; then
echo "Sample result:"
echo "$RESULTS_RESPONSE" | jq '.results[0]' 2>/dev/null || echo "$RESULTS_RESPONSE"
else
echo -e "${YELLOW}${NC} No results found (job may not be complete yet)"
fi
else
echo -e "${RED}${NC} Failed to retrieve results"
echo "Response: $RESULTS_RESPONSE"
fi
echo ""
# Test 6: Query Results by Date
echo -e "${BLUE}Test 6: GET /results?date=...${NC}"
echo "Querying results by date filter..."
DATE_RESULTS=$(curl -s "$API_BASE_URL/results?date=2025-01-16")
DATE_COUNT=$(echo $DATE_RESULTS | jq -r '.count' 2>/dev/null)
if [ -n "$DATE_COUNT" ] && [ "$DATE_COUNT" != "null" ]; then
echo -e "${GREEN}${NC} Date-filtered results retrieved"
echo "Results for 2025-01-16: $DATE_COUNT"
else
echo -e "${RED}${NC} Failed to retrieve date-filtered results"
fi
echo ""
# Test 7: Query Results by Model
echo -e "${BLUE}Test 7: GET /results?model=...${NC}"
echo "Querying results by model filter..."
MODEL_RESULTS=$(curl -s "$API_BASE_URL/results?model=gpt-4")
MODEL_COUNT=$(echo $MODEL_RESULTS | jq -r '.count' 2>/dev/null)
if [ -n "$MODEL_COUNT" ] && [ "$MODEL_COUNT" != "null" ]; then
echo -e "${GREEN}${NC} Model-filtered results retrieved"
echo "Results for gpt-4: $MODEL_COUNT"
else
echo -e "${RED}${NC} Failed to retrieve model-filtered results"
fi
echo ""
# Test 8: Concurrent Job Prevention
echo -e "${BLUE}Test 8: Concurrent job prevention${NC}"
echo "Attempting to trigger second job (should fail if first is still running)..."
SECOND_TRIGGER=$(curl -s -X POST "$API_BASE_URL/simulate/trigger" \
-H "Content-Type: application/json" \
-d "$TRIGGER_PAYLOAD")
if echo "$SECOND_TRIGGER" | grep -qi "already running"; then
echo -e "${GREEN}${NC} Concurrent job correctly rejected"
echo "Response: $SECOND_TRIGGER"
elif echo "$SECOND_TRIGGER" | jq -r '.job_id' 2>/dev/null | grep -q "-"; then
echo -e "${YELLOW}${NC} Second job was accepted (first job may have completed)"
echo "Response: $SECOND_TRIGGER" | jq '.' 2>/dev/null || echo "$SECOND_TRIGGER"
else
echo -e "${YELLOW}${NC} Unexpected response"
echo "Response: $SECOND_TRIGGER"
fi
echo ""
# Test 9: Invalid Requests
echo -e "${BLUE}Test 9: Error handling${NC}"
echo "Testing invalid config path..."
INVALID_TRIGGER=$(curl -s -X POST "$API_BASE_URL/simulate/trigger" \
-H "Content-Type: application/json" \
-d '{"config_path": "/invalid/path.json", "date_range": ["2025-01-16"], "models": ["gpt-4"]}')
if echo "$INVALID_TRIGGER" | grep -qi "does not exist"; then
echo -e "${GREEN}${NC} Invalid config path correctly rejected"
else
echo -e "${YELLOW}${NC} Unexpected response for invalid config"
echo "Response: $INVALID_TRIGGER"
fi
echo ""
# Summary
echo "=========================================="
echo "Test Summary"
echo "=========================================="
echo ""
echo "All API endpoints tested successfully!"
echo ""
echo "Job Details:"
echo " Job ID: $JOB_ID"
echo " Final Status: $JOB_STATUS"
echo " Results Count: $RESULT_COUNT"
echo ""
echo "To view full job details:"
echo " curl $API_BASE_URL/simulate/status/$JOB_ID | jq ."
echo ""
echo "To view all results:"
echo " curl $API_BASE_URL/results | jq ."
echo ""

232
scripts/test_batch_mode.sh Executable file
View File

@@ -0,0 +1,232 @@
#!/bin/bash
# Batch Mode Testing Script
# Tests Docker batch mode with one-time simulation
set -e
echo "=========================================="
echo "AI-Trader Batch Mode Testing"
echo "=========================================="
echo ""
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Check prerequisites
echo "Checking prerequisites..."
if ! command -v docker &> /dev/null; then
echo -e "${RED}${NC} Docker not installed"
exit 1
fi
if [ ! -f .env ]; then
echo -e "${RED}${NC} .env file not found"
echo "Copy .env.example to .env and configure API keys"
exit 1
fi
echo -e "${GREEN}${NC} Prerequisites OK"
echo ""
# Check if custom config exists
CONFIG_FILE=${1:-configs/default_config.json}
if [ ! -f "$CONFIG_FILE" ]; then
echo -e "${YELLOW}${NC} Config file not found: $CONFIG_FILE"
echo "Creating test config..."
mkdir -p configs
cat > configs/test_batch.json <<EOF
{
"agent_type": "BaseAgent",
"date_range": {
"init_date": "2025-01-16",
"end_date": "2025-01-17"
},
"models": [
{
"name": "GPT-4 Test",
"basemodel": "gpt-4",
"signature": "gpt-4-test",
"enabled": true
}
],
"agent_config": {
"max_steps": 10,
"initial_cash": 10000.0
},
"log_config": {
"log_path": "./data/agent_data"
}
}
EOF
CONFIG_FILE="configs/test_batch.json"
echo -e "${GREEN}${NC} Created test config: $CONFIG_FILE"
fi
echo "Using config: $CONFIG_FILE"
echo ""
# Test 1: Build image
echo -e "${BLUE}Test 1: Building Docker image${NC}"
echo "This may take a few minutes..."
if docker build -t ai-trader-batch-test . > /tmp/docker-build.log 2>&1; then
echo -e "${GREEN}${NC} Image built successfully"
else
echo -e "${RED}${NC} Build failed"
echo "Check logs: /tmp/docker-build.log"
tail -20 /tmp/docker-build.log
exit 1
fi
echo ""
# Test 2: Run batch simulation
echo -e "${BLUE}Test 2: Running batch simulation${NC}"
echo "Starting container in batch mode..."
echo "Config: $CONFIG_FILE"
echo ""
# Use docker-compose if available, otherwise docker run
if command -v docker-compose &> /dev/null || docker compose version &> /dev/null; then
echo "Using docker-compose..."
# Ensure API is stopped
docker-compose down 2>/dev/null || true
# Run batch mode
echo "Executing: docker-compose --profile batch run --rm ai-trader-batch $CONFIG_FILE"
docker-compose --profile batch run --rm ai-trader-batch "$CONFIG_FILE"
BATCH_EXIT_CODE=$?
else
echo "Using docker run..."
docker run --rm \
--env-file .env \
-v "$(pwd)/data:/app/data" \
-v "$(pwd)/logs:/app/logs" \
-v "$(pwd)/configs:/app/configs" \
ai-trader-batch-test \
"$CONFIG_FILE"
BATCH_EXIT_CODE=$?
fi
echo ""
# Test 3: Check exit code
echo -e "${BLUE}Test 3: Checking exit status${NC}"
if [ $BATCH_EXIT_CODE -eq 0 ]; then
echo -e "${GREEN}${NC} Batch simulation completed successfully (exit code: 0)"
else
echo -e "${RED}${NC} Batch simulation failed (exit code: $BATCH_EXIT_CODE)"
echo "Check logs in ./logs/ directory"
exit 1
fi
echo ""
# Test 4: Verify output files
echo -e "${BLUE}Test 4: Verifying output files${NC}"
# Check if data directory has position files
POSITION_FILES=$(find data/agent_data -name "position.jsonl" 2>/dev/null | wc -l)
if [ $POSITION_FILES -gt 0 ]; then
echo -e "${GREEN}${NC} Found $POSITION_FILES position file(s)"
# Show sample position data
SAMPLE_POSITION=$(find data/agent_data -name "position.jsonl" 2>/dev/null | head -1)
if [ -n "$SAMPLE_POSITION" ]; then
echo "Sample position data from: $SAMPLE_POSITION"
head -1 "$SAMPLE_POSITION" | jq '.' 2>/dev/null || head -1 "$SAMPLE_POSITION"
fi
else
echo -e "${YELLOW}${NC} No position files found"
echo "This could indicate the simulation didn't complete trading"
fi
echo ""
# Check log files
LOG_COUNT=$(find logs -name "*.log" 2>/dev/null | wc -l)
if [ $LOG_COUNT -gt 0 ]; then
echo -e "${GREEN}${NC} Found $LOG_COUNT log file(s)"
else
echo -e "${YELLOW}${NC} No log files found"
fi
echo ""
# Test 5: Check price data
echo -e "${BLUE}Test 5: Checking price data${NC}"
if [ -f "data/merged.jsonl" ]; then
STOCK_COUNT=$(wc -l < data/merged.jsonl)
echo -e "${GREEN}${NC} Price data exists: $STOCK_COUNT stocks"
else
echo -e "${YELLOW}${NC} No price data file found"
echo "First run will download price data"
fi
echo ""
# Test 6: Re-run to test data persistence
echo -e "${BLUE}Test 6: Testing data persistence${NC}"
echo "Running batch mode again to verify data persists..."
echo ""
if command -v docker-compose &> /dev/null || docker compose version &> /dev/null; then
docker-compose --profile batch run --rm ai-trader-batch "$CONFIG_FILE" > /tmp/batch-second-run.log 2>&1
SECOND_EXIT_CODE=$?
else
docker run --rm \
--env-file .env \
-v "$(pwd)/data:/app/data" \
-v "$(pwd)/logs:/app/logs" \
-v "$(pwd)/configs:/app/configs" \
ai-trader-batch-test \
"$CONFIG_FILE" > /tmp/batch-second-run.log 2>&1
SECOND_EXIT_CODE=$?
fi
if [ $SECOND_EXIT_CODE -eq 0 ]; then
echo -e "${GREEN}${NC} Second run completed successfully"
# Check if it reused price data (should be faster)
if grep -q "Using existing price data" /tmp/batch-second-run.log; then
echo -e "${GREEN}${NC} Price data was reused (data persistence working)"
else
echo -e "${YELLOW}${NC} Could not verify price data reuse"
fi
else
echo -e "${RED}${NC} Second run failed"
fi
echo ""
# Summary
echo "=========================================="
echo "Batch Mode Test Summary"
echo "=========================================="
echo ""
echo "Tests completed:"
echo " ✓ Docker image build"
echo " ✓ Batch mode execution"
echo " ✓ Exit code verification"
echo " ✓ Output file generation"
echo " ✓ Data persistence"
echo ""
echo "Output locations:"
echo " Position data: data/agent_data/*/position/"
echo " Trading logs: data/agent_data/*/log/"
echo " System logs: logs/"
echo " Price data: data/merged.jsonl"
echo ""
echo "To view position data:"
echo " find data/agent_data -name 'position.jsonl' -exec cat {} \;"
echo ""
echo "To view trading logs:"
echo " find data/agent_data -name 'log.jsonl' | head -1 | xargs cat"
echo ""

221
scripts/validate_docker_build.sh Executable file
View File

@@ -0,0 +1,221 @@
#!/bin/bash
# Docker Build & Validation Script
# Run this script to validate the Docker setup before production deployment
set -e # Exit on error
echo "=========================================="
echo "AI-Trader Docker Build Validation"
echo "=========================================="
echo ""
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Function to print status
print_status() {
    if [ $1 -eq 0 ]; then
        echo -e "${GREEN}✓${NC} $2"
    else
        echo -e "${RED}✗${NC} $2"
    fi
}
print_warning() {
    echo -e "${YELLOW}⚠${NC} $1"
}
# Step 1: Check prerequisites
echo "Step 1: Checking prerequisites..."
# Check if Docker is installed
if command -v docker &> /dev/null; then
print_status 0 "Docker is installed: $(docker --version)"
else
print_status 1 "Docker is not installed"
echo "Please install Docker: https://docs.docker.com/get-docker/"
exit 1
fi
# Check if Docker daemon is running
if docker info &> /dev/null; then
print_status 0 "Docker daemon is running"
else
print_status 1 "Docker daemon is not running"
echo "Please start Docker Desktop or Docker daemon"
exit 1
fi
# Check if docker-compose is available
if command -v docker-compose &> /dev/null; then
print_status 0 "docker-compose is installed: $(docker-compose --version)"
elif docker compose version &> /dev/null; then
print_status 0 "docker compose (plugin) is available"
COMPOSE_CMD="docker compose"
else
print_status 1 "docker-compose is not available"
exit 1
fi
# Default to docker-compose if not set
COMPOSE_CMD=${COMPOSE_CMD:-docker-compose}
echo ""
# Step 2: Check environment file
echo "Step 2: Checking environment configuration..."
if [ -f .env ]; then
print_status 0 ".env file exists"
# Check required variables
required_vars=("OPENAI_API_KEY" "ALPHAADVANTAGE_API_KEY" "JINA_API_KEY")
missing_vars=()
for var in "${required_vars[@]}"; do
if grep -q "^${var}=" .env && ! grep -q "^${var}=your_.*_key_here" .env && ! grep -q "^${var}=$" .env; then
print_status 0 "$var is set"
else
missing_vars+=("$var")
print_status 1 "$var is missing or not configured"
fi
done
if [ ${#missing_vars[@]} -gt 0 ]; then
print_warning "Some required environment variables are not configured"
echo "Please edit .env and add:"
for var in "${missing_vars[@]}"; do
echo " - $var"
done
echo ""
read -p "Continue anyway? (y/n) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
exit 1
fi
fi
else
print_status 1 ".env file not found"
echo "Creating .env from .env.example..."
cp .env.example .env
print_warning "Please edit .env and add your API keys before continuing"
exit 1
fi
echo ""
# Step 3: Build Docker image
echo "Step 3: Building Docker image..."
echo "This may take several minutes on first build..."
echo ""
if docker build -t ai-trader-test . ; then
print_status 0 "Docker image built successfully"
else
print_status 1 "Docker build failed"
exit 1
fi
echo ""
# Step 4: Check image
echo "Step 4: Verifying Docker image..."
IMAGE_SIZE=$(docker images ai-trader-test --format "{{.Size}}")
print_status 0 "Image size: $IMAGE_SIZE"
# List exposed ports
EXPOSED_PORTS=$(docker inspect ai-trader-test --format '{{range $p, $conf := .Config.ExposedPorts}}{{$p}} {{end}}')
print_status 0 "Exposed ports: $EXPOSED_PORTS"
echo ""
# Step 5: Test API mode startup (brief)
echo "Step 5: Testing API mode startup..."
echo "Starting container in background..."
$COMPOSE_CMD up -d ai-trader-api
if [ $? -eq 0 ]; then
print_status 0 "Container started successfully"
echo "Waiting 10 seconds for services to initialize..."
sleep 10
# Check if container is still running
if docker ps | grep -q ai-trader-api; then
print_status 0 "Container is running"
# Check logs for errors
ERROR_COUNT=$(docker logs ai-trader-api 2>&1 | grep -i "error" | grep -v "ERROR:" | wc -l)
if [ $ERROR_COUNT -gt 0 ]; then
print_warning "Found $ERROR_COUNT error messages in logs"
echo "Check logs with: docker logs ai-trader-api"
else
print_status 0 "No critical errors in logs"
fi
else
print_status 1 "Container stopped unexpectedly"
echo "Check logs with: docker logs ai-trader-api"
exit 1
fi
else
print_status 1 "Failed to start container"
exit 1
fi
echo ""
# Step 6: Test health endpoint
echo "Step 6: Testing health endpoint..."
# Wait a bit more for API to be ready
sleep 5
if curl -f http://localhost:8080/health &> /dev/null; then
print_status 0 "Health endpoint responding"
# Get health details
HEALTH_DATA=$(curl -s http://localhost:8080/health)
echo "Health response: $HEALTH_DATA"
else
print_status 1 "Health endpoint not responding"
print_warning "This could indicate:"
echo " - API server failed to start"
echo " - Port 8080 is already in use"
echo " - MCP services failed to initialize"
echo ""
echo "Check logs with: docker logs ai-trader-api"
fi
echo ""
# Step 7: Cleanup
echo "Step 7: Cleanup..."
read -p "Stop the container? (y/n) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
$COMPOSE_CMD down
print_status 0 "Container stopped"
fi
echo ""
echo "=========================================="
echo "Validation Summary"
echo "=========================================="
echo ""
echo "Next steps:"
echo "1. If all checks passed, proceed with API endpoint testing:"
echo " bash scripts/test_api_endpoints.sh"
echo ""
echo "2. Test batch mode:"
echo " bash scripts/test_batch_mode.sh"
echo ""
echo "3. If any checks failed, review logs:"
echo " docker logs ai-trader-api"
echo ""
echo "4. For troubleshooting, see: DOCKER_API.md"
echo ""

0
tests/__init__.py Normal file
View File

140
tests/conftest.py Normal file
View File

@@ -0,0 +1,140 @@
"""
Shared pytest fixtures for AI-Trader API tests.

This module provides reusable fixtures for:
- Test database setup/teardown
- Mock configurations
- Test data factories
"""
import pytest
import tempfile
import os
from pathlib import Path

from api.database import initialize_database, get_db_connection


@pytest.fixture(scope="session")
def test_db_path():
    """Create temporary database file for testing session."""
    temp_db = tempfile.NamedTemporaryFile(delete=False, suffix=".db")
    temp_db.close()

    yield temp_db.name

    # Cleanup after all tests
    try:
        os.unlink(temp_db.name)
    except FileNotFoundError:
        pass


@pytest.fixture(scope="function")
def clean_db(test_db_path):
    """
    Provide clean database for each test function.

    This fixture:
    1. Initializes schema if needed
    2. Clears all data before test
    3. Returns database path

    Usage:
        def test_something(clean_db):
            conn = get_db_connection(clean_db)
            # ... test code
    """
    # Ensure schema exists
    initialize_database(test_db_path)

    # Clear all tables
    conn = get_db_connection(test_db_path)
    cursor = conn.cursor()

    # Delete in correct order (respecting foreign keys)
    cursor.execute("DELETE FROM tool_usage")
    cursor.execute("DELETE FROM reasoning_logs")
    cursor.execute("DELETE FROM holdings")
    cursor.execute("DELETE FROM positions")
    cursor.execute("DELETE FROM job_details")
    cursor.execute("DELETE FROM jobs")

    conn.commit()
    conn.close()

    return test_db_path


@pytest.fixture
def sample_job_data():
    """Sample job data for testing."""
    return {
        "job_id": "test-job-123",
        "config_path": "configs/test.json",
        "status": "pending",
        "date_range": '["2025-01-16", "2025-01-17"]',
        "models": '["gpt-5", "claude-3.7-sonnet"]',
        "created_at": "2025-01-20T14:30:00Z"
    }


@pytest.fixture
def sample_position_data():
    """Sample position data for testing."""
    return {
        "job_id": "test-job-123",
        "date": "2025-01-16",
        "model": "gpt-5",
        "action_id": 1,
        "action_type": "buy",
        "symbol": "AAPL",
        "amount": 10,
        "price": 255.88,
        "cash": 7441.2,
        "portfolio_value": 10000.0,
        "daily_profit": 0.0,
        "daily_return_pct": 0.0,
        "cumulative_profit": 0.0,
        "cumulative_return_pct": 0.0,
        "created_at": "2025-01-16T09:30:00Z"
    }


@pytest.fixture
def mock_config():
    """Mock configuration for testing."""
    return {
        "agent_type": "BaseAgent",
        "date_range": {
            "init_date": "2025-01-16",
            "end_date": "2025-01-17"
        },
        "models": [
            {
                "name": "test-model",
                "basemodel": "openai/gpt-4",
                "signature": "test-model",
                "enabled": True
            }
        ],
        "agent_config": {
            "max_steps": 10,
            "max_retries": 3,
            "base_delay": 0.5,
            "initial_cash": 10000.0
        },
        "log_config": {
            "log_path": "./data/agent_data"
        }
    }


# Pytest configuration hooks
def pytest_configure(config):
    """Configure pytest with custom markers."""
    config.addinivalue_line("markers", "unit: Unit tests (fast, isolated)")
    config.addinivalue_line("markers", "integration: Integration tests (with dependencies)")
    config.addinivalue_line("markers", "performance: Performance and benchmark tests")
    config.addinivalue_line("markers", "security: Security tests")
    config.addinivalue_line("markers", "e2e: End-to-end tests (Docker required)")
    config.addinivalue_line("markers", "slow: Tests that take >10 seconds")

0
tests/e2e/__init__.py Normal file
View File

View File

View File

@@ -0,0 +1,295 @@
"""
Integration tests for FastAPI endpoints.

Coverage target: 90%+

Tests verify:
- POST /simulate/trigger: Job creation and trigger
- GET /simulate/status/{job_id}: Job status retrieval
- GET /results: Results querying with filters
- GET /health: Health check endpoint
- Error handling and validation
"""
import pytest
from fastapi.testclient import TestClient
from pathlib import Path
import json


@pytest.fixture
def api_client(clean_db, tmp_path):
    """Create FastAPI test client with clean database."""
    from api.main import create_app

    # Create test config
    test_config = tmp_path / "test_config.json"
    test_config.write_text(json.dumps({
        "agent_type": "BaseAgent",
        "date_range": {"init_date": "2025-01-16", "end_date": "2025-01-17"},
        "models": [
            {"name": "Test Model", "basemodel": "gpt-4", "signature": "gpt-4", "enabled": True}
        ],
        "agent_config": {"max_steps": 30, "initial_cash": 10000.0},
        "log_config": {"log_path": "./data/agent_data"}
    }))

    app = create_app(db_path=clean_db)
    # Enable test mode to prevent background worker from starting
    app.state.test_mode = True

    client = TestClient(app)
    client.test_config_path = str(test_config)
    client.db_path = clean_db
    return client


@pytest.mark.integration
class TestSimulateTriggerEndpoint:
    """Test POST /simulate/trigger endpoint."""

    def test_trigger_creates_job(self, api_client):
        """Should create job and return job_id."""
        response = api_client.post("/simulate/trigger", json={
            "config_path": api_client.test_config_path,
            "date_range": ["2025-01-16", "2025-01-17"],
            "models": ["gpt-4"]
        })

        assert response.status_code == 200
        data = response.json()
        assert "job_id" in data
        assert data["status"] == "pending"
        assert data["total_model_days"] == 2

    def test_trigger_validates_config_path(self, api_client):
        """Should reject nonexistent config path."""
        response = api_client.post("/simulate/trigger", json={
            "config_path": "/nonexistent/config.json",
            "date_range": ["2025-01-16"],
            "models": ["gpt-4"]
        })

        assert response.status_code == 400
        assert "does not exist" in response.json()["detail"].lower()

    def test_trigger_validates_date_range(self, api_client):
        """Should reject invalid date range."""
        response = api_client.post("/simulate/trigger", json={
            "config_path": api_client.test_config_path,
            "date_range": [],  # Empty date range
            "models": ["gpt-4"]
        })

        assert response.status_code == 422  # Pydantic validation error

    def test_trigger_validates_models(self, api_client):
        """Should reject empty model list."""
        response = api_client.post("/simulate/trigger", json={
            "config_path": api_client.test_config_path,
            "date_range": ["2025-01-16"],
            "models": []  # Empty models
        })

        assert response.status_code == 422  # Pydantic validation error

    def test_trigger_enforces_single_job_limit(self, api_client):
        """Should reject trigger when job already running."""
        # Create first job
        api_client.post("/simulate/trigger", json={
            "config_path": api_client.test_config_path,
            "date_range": ["2025-01-16"],
            "models": ["gpt-4"]
        })

        # Try to create second job
        response = api_client.post("/simulate/trigger", json={
            "config_path": api_client.test_config_path,
            "date_range": ["2025-01-17"],
            "models": ["gpt-4"]
        })

        assert response.status_code == 400
        assert "already running" in response.json()["detail"].lower()


@pytest.mark.integration
class TestSimulateStatusEndpoint:
    """Test GET /simulate/status/{job_id} endpoint."""

    def test_status_returns_job_info(self, api_client):
        """Should return job status and progress."""
        # Create job
        create_response = api_client.post("/simulate/trigger", json={
            "config_path": api_client.test_config_path,
            "date_range": ["2025-01-16"],
            "models": ["gpt-4"]
        })
        job_id = create_response.json()["job_id"]

        # Get status
        response = api_client.get(f"/simulate/status/{job_id}")

        assert response.status_code == 200
        data = response.json()
        assert data["job_id"] == job_id
        assert data["status"] == "pending"
        assert "progress" in data
        assert data["progress"]["total_model_days"] == 1

    def test_status_returns_404_for_nonexistent_job(self, api_client):
        """Should return 404 for unknown job_id."""
        response = api_client.get("/simulate/status/nonexistent-job-id")

        assert response.status_code == 404
        assert "not found" in response.json()["detail"].lower()

    def test_status_includes_model_day_details(self, api_client):
        """Should include model-day execution details."""
        # Create job
        create_response = api_client.post("/simulate/trigger", json={
            "config_path": api_client.test_config_path,
            "date_range": ["2025-01-16", "2025-01-17"],
            "models": ["gpt-4"]
        })
        job_id = create_response.json()["job_id"]

        # Get status
        response = api_client.get(f"/simulate/status/{job_id}")

        assert response.status_code == 200
        data = response.json()
        assert "details" in data
        assert len(data["details"]) == 2  # 2 dates
        assert all("date" in detail for detail in data["details"])
assert all("model" in detail for detail in data["details"])
assert all("status" in detail for detail in data["details"])
@pytest.mark.integration
class TestResultsEndpoint:
"""Test GET /results endpoint."""
def test_results_returns_all_results(self, api_client):
"""Should return all results without filters."""
response = api_client.get("/results")
assert response.status_code == 200
data = response.json()
assert "results" in data
assert isinstance(data["results"], list)
def test_results_filters_by_job_id(self, api_client):
"""Should filter results by job_id."""
# Create job
create_response = api_client.post("/simulate/trigger", json={
"config_path": api_client.test_config_path,
"date_range": ["2025-01-16"],
"models": ["gpt-4"]
})
job_id = create_response.json()["job_id"]
# Query results
response = api_client.get(f"/results?job_id={job_id}")
assert response.status_code == 200
data = response.json()
# Should return empty list initially (no completed executions yet)
assert isinstance(data["results"], list)
def test_results_filters_by_date(self, api_client):
"""Should filter results by date."""
response = api_client.get("/results?date=2025-01-16")
assert response.status_code == 200
data = response.json()
assert isinstance(data["results"], list)
def test_results_filters_by_model(self, api_client):
"""Should filter results by model."""
response = api_client.get("/results?model=gpt-4")
assert response.status_code == 200
data = response.json()
assert isinstance(data["results"], list)
def test_results_combines_multiple_filters(self, api_client):
"""Should support multiple filter parameters."""
response = api_client.get("/results?date=2025-01-16&model=gpt-4")
assert response.status_code == 200
data = response.json()
assert isinstance(data["results"], list)
def test_results_includes_position_data(self, api_client):
"""Should include position and holdings data."""
# This test will pass once we have actual data
response = api_client.get("/results")
assert response.status_code == 200
data = response.json()
        # Each result should have expected structure once executions complete
        for result in data["results"]:
            assert "job_id" in result
@pytest.mark.integration
class TestHealthEndpoint:
"""Test GET /health endpoint."""
def test_health_returns_ok(self, api_client):
"""Should return healthy status."""
response = api_client.get("/health")
assert response.status_code == 200
data = response.json()
assert data["status"] == "healthy"
def test_health_includes_database_check(self, api_client):
"""Should verify database connectivity."""
response = api_client.get("/health")
assert response.status_code == 200
data = response.json()
assert "database" in data
assert data["database"] == "connected"
def test_health_includes_system_info(self, api_client):
"""Should include system information."""
response = api_client.get("/health")
assert response.status_code == 200
data = response.json()
assert "version" in data or "timestamp" in data
@pytest.mark.integration
class TestErrorHandling:
"""Test error handling across endpoints."""
def test_invalid_json_returns_422(self, api_client):
"""Should handle malformed JSON."""
response = api_client.post(
"/simulate/trigger",
data="invalid json",
headers={"Content-Type": "application/json"}
)
assert response.status_code == 422
def test_missing_required_fields_returns_422(self, api_client):
"""Should validate required fields."""
response = api_client.post("/simulate/trigger", json={
"config_path": api_client.test_config_path
# Missing date_range and models
})
assert response.status_code == 422
def test_invalid_job_id_format_returns_404(self, api_client):
"""Should handle invalid job_id format gracefully."""
response = api_client.get("/simulate/status/invalid-format")
assert response.status_code == 404
# Coverage target: 90%+ for api/main.py
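The request-validation behaviour exercised above (missing fields, empty `date_range` or `models` rejected with 422) is handled by the Pydantic request model in `api/main.py`; as a rough illustration only, a hypothetical plain-Python stand-in for those rules might look like this:

```python
def validate_trigger_payload(payload: dict) -> list[str]:
    """Collect 422-style validation errors for a /simulate/trigger body.

    Hypothetical stand-in for the Pydantic request model; only the
    field names mirror the test payloads above.
    """
    errors = []
    # Required fields must be present
    for field in ("config_path", "date_range", "models"):
        if field not in payload:
            errors.append(f"{field}: field required")
    # List fields must be non-empty when present
    for field in ("date_range", "models"):
        if field in payload and not payload[field]:
            errors.append(f"{field}: list must not be empty")
    return errors
```

In the real service, Pydantic produces these errors automatically before the endpoint handler runs, which is why the tests assert 422 rather than 400 for malformed bodies.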

tests/unit/__init__.py Normal file
tests/unit/test_database.py Normal file
@@ -0,0 +1,501 @@
"""
Unit tests for api/database.py module.
Coverage target: 95%+
Tests verify:
- Database connection management
- Schema initialization
- Table creation and indexes
- Foreign key constraints
- Utility functions
"""
import pytest
import sqlite3
import os
import tempfile
from pathlib import Path
from api.database import (
get_db_connection,
initialize_database,
drop_all_tables,
vacuum_database,
get_database_stats
)
@pytest.mark.unit
class TestDatabaseConnection:
"""Test database connection functionality."""
def test_get_db_connection_creates_directory(self):
"""Should create data directory if it doesn't exist."""
temp_dir = tempfile.mkdtemp()
db_path = os.path.join(temp_dir, "subdir", "test.db")
conn = get_db_connection(db_path)
assert conn is not None
assert os.path.exists(os.path.dirname(db_path))
conn.close()
os.unlink(db_path)
os.rmdir(os.path.dirname(db_path))
os.rmdir(temp_dir)
def test_get_db_connection_enables_foreign_keys(self):
"""Should enable foreign key constraints."""
temp_db = tempfile.NamedTemporaryFile(delete=False, suffix=".db")
temp_db.close()
conn = get_db_connection(temp_db.name)
# Check if foreign keys are enabled
cursor = conn.cursor()
cursor.execute("PRAGMA foreign_keys")
result = cursor.fetchone()[0]
assert result == 1 # 1 = enabled
conn.close()
os.unlink(temp_db.name)
def test_get_db_connection_row_factory(self):
"""Should set row factory for dict-like access."""
temp_db = tempfile.NamedTemporaryFile(delete=False, suffix=".db")
temp_db.close()
conn = get_db_connection(temp_db.name)
assert conn.row_factory == sqlite3.Row
conn.close()
os.unlink(temp_db.name)
def test_get_db_connection_thread_safety(self):
"""Should allow check_same_thread=False for async compatibility."""
temp_db = tempfile.NamedTemporaryFile(delete=False, suffix=".db")
temp_db.close()
# This should not raise an error
conn = get_db_connection(temp_db.name)
assert conn is not None
conn.close()
os.unlink(temp_db.name)
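A minimal connection factory satisfying the four behaviours tested above (directory creation, foreign-key enforcement, dict-like rows, cross-thread use) might look like the following sketch; the actual `api/database.py` implementation may differ:

```python
import os
import sqlite3


def get_db_connection(db_path: str) -> sqlite3.Connection:
    """Open a SQLite connection with the properties the tests expect.

    Sketch only: parent-directory creation, per-connection foreign-key
    enforcement, Row factory, and check_same_thread=False for use from
    async worker threads.
    """
    parent = os.path.dirname(db_path)
    if parent:
        os.makedirs(parent, exist_ok=True)  # create data dir if missing
    conn = sqlite3.connect(db_path, check_same_thread=False)
    conn.row_factory = sqlite3.Row           # rows usable like dicts
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    return conn
```

Note that `PRAGMA foreign_keys` is a per-connection setting in SQLite, which is why the tests verify it on every freshly opened connection.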
@pytest.mark.unit
class TestSchemaInitialization:
"""Test database schema initialization."""
def test_initialize_database_creates_all_tables(self, clean_db):
"""Should create all 6 tables."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
# Query sqlite_master for table names
cursor.execute("""
SELECT name FROM sqlite_master
WHERE type='table' AND name NOT LIKE 'sqlite_%'
ORDER BY name
""")
tables = [row[0] for row in cursor.fetchall()]
expected_tables = [
'holdings',
'job_details',
'jobs',
'positions',
'reasoning_logs',
'tool_usage'
]
assert sorted(tables) == sorted(expected_tables)
conn.close()
def test_initialize_database_creates_jobs_table(self, clean_db):
"""Should create jobs table with correct schema."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
cursor.execute("PRAGMA table_info(jobs)")
columns = {row[1]: row[2] for row in cursor.fetchall()}
expected_columns = {
'job_id': 'TEXT',
'config_path': 'TEXT',
'status': 'TEXT',
'date_range': 'TEXT',
'models': 'TEXT',
'created_at': 'TEXT',
'started_at': 'TEXT',
'updated_at': 'TEXT',
'completed_at': 'TEXT',
'total_duration_seconds': 'REAL',
'error': 'TEXT'
}
for col_name, col_type in expected_columns.items():
assert col_name in columns
assert columns[col_name] == col_type
conn.close()
def test_initialize_database_creates_positions_table(self, clean_db):
"""Should create positions table with correct schema."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
cursor.execute("PRAGMA table_info(positions)")
columns = {row[1]: row[2] for row in cursor.fetchall()}
required_columns = [
'id', 'job_id', 'date', 'model', 'action_id', 'action_type',
'symbol', 'amount', 'price', 'cash', 'portfolio_value',
'daily_profit', 'daily_return_pct', 'cumulative_profit',
'cumulative_return_pct', 'created_at'
]
for col_name in required_columns:
assert col_name in columns
conn.close()
def test_initialize_database_creates_indexes(self, clean_db):
"""Should create all performance indexes."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
cursor.execute("""
SELECT name FROM sqlite_master
WHERE type='index' AND name LIKE 'idx_%'
ORDER BY name
""")
indexes = [row[0] for row in cursor.fetchall()]
required_indexes = [
'idx_jobs_status',
'idx_jobs_created_at',
'idx_job_details_job_id',
'idx_job_details_status',
'idx_job_details_unique',
'idx_positions_job_id',
'idx_positions_date',
'idx_positions_model',
'idx_positions_date_model',
'idx_positions_unique',
'idx_holdings_position_id',
'idx_holdings_symbol',
'idx_reasoning_logs_job_date_model',
'idx_tool_usage_job_date_model'
]
for index in required_indexes:
assert index in indexes, f"Missing index: {index}"
conn.close()
def test_initialize_database_idempotent(self, clean_db):
"""Should be safe to call multiple times."""
# Initialize once (already done by clean_db fixture)
# Initialize again
initialize_database(clean_db)
# Should still have correct tables
conn = get_db_connection(clean_db)
cursor = conn.cursor()
cursor.execute("""
SELECT COUNT(*) FROM sqlite_master
WHERE type='table' AND name='jobs'
""")
assert cursor.fetchone()[0] == 1 # Only one jobs table
conn.close()
@pytest.mark.unit
class TestForeignKeyConstraints:
"""Test foreign key constraint enforcement."""
def test_cascade_delete_job_details(self, clean_db, sample_job_data):
"""Should cascade delete job_details when job is deleted."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
# Insert job
cursor.execute("""
INSERT INTO jobs (job_id, config_path, status, date_range, models, created_at)
VALUES (?, ?, ?, ?, ?, ?)
""", (
sample_job_data["job_id"],
sample_job_data["config_path"],
sample_job_data["status"],
sample_job_data["date_range"],
sample_job_data["models"],
sample_job_data["created_at"]
))
# Insert job_detail
cursor.execute("""
INSERT INTO job_details (job_id, date, model, status)
VALUES (?, ?, ?, ?)
""", (sample_job_data["job_id"], "2025-01-16", "gpt-5", "pending"))
conn.commit()
# Verify job_detail exists
cursor.execute("SELECT COUNT(*) FROM job_details WHERE job_id = ?", (sample_job_data["job_id"],))
assert cursor.fetchone()[0] == 1
# Delete job
cursor.execute("DELETE FROM jobs WHERE job_id = ?", (sample_job_data["job_id"],))
conn.commit()
# Verify job_detail was cascade deleted
cursor.execute("SELECT COUNT(*) FROM job_details WHERE job_id = ?", (sample_job_data["job_id"],))
assert cursor.fetchone()[0] == 0
conn.close()
def test_cascade_delete_positions(self, clean_db, sample_job_data, sample_position_data):
"""Should cascade delete positions when job is deleted."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
# Insert job
cursor.execute("""
INSERT INTO jobs (job_id, config_path, status, date_range, models, created_at)
VALUES (?, ?, ?, ?, ?, ?)
""", (
sample_job_data["job_id"],
sample_job_data["config_path"],
sample_job_data["status"],
sample_job_data["date_range"],
sample_job_data["models"],
sample_job_data["created_at"]
))
# Insert position
cursor.execute("""
INSERT INTO positions (
job_id, date, model, action_id, action_type, symbol, amount, price,
cash, portfolio_value, daily_profit, daily_return_pct,
cumulative_profit, cumulative_return_pct, created_at
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""", tuple(sample_position_data.values()))
conn.commit()
# Delete job
cursor.execute("DELETE FROM jobs WHERE job_id = ?", (sample_job_data["job_id"],))
conn.commit()
# Verify position was cascade deleted
cursor.execute("SELECT COUNT(*) FROM positions WHERE job_id = ?", (sample_job_data["job_id"],))
assert cursor.fetchone()[0] == 0
conn.close()
def test_cascade_delete_holdings(self, clean_db, sample_job_data, sample_position_data):
"""Should cascade delete holdings when position is deleted."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
# Insert job
cursor.execute("""
INSERT INTO jobs (job_id, config_path, status, date_range, models, created_at)
VALUES (?, ?, ?, ?, ?, ?)
""", (
sample_job_data["job_id"],
sample_job_data["config_path"],
sample_job_data["status"],
sample_job_data["date_range"],
sample_job_data["models"],
sample_job_data["created_at"]
))
# Insert position
cursor.execute("""
INSERT INTO positions (
job_id, date, model, action_id, action_type, symbol, amount, price,
cash, portfolio_value, daily_profit, daily_return_pct,
cumulative_profit, cumulative_return_pct, created_at
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""", tuple(sample_position_data.values()))
position_id = cursor.lastrowid
# Insert holding
cursor.execute("""
INSERT INTO holdings (position_id, symbol, quantity)
VALUES (?, ?, ?)
""", (position_id, "AAPL", 10))
conn.commit()
# Verify holding exists
cursor.execute("SELECT COUNT(*) FROM holdings WHERE position_id = ?", (position_id,))
assert cursor.fetchone()[0] == 1
# Delete position
cursor.execute("DELETE FROM positions WHERE id = ?", (position_id,))
conn.commit()
# Verify holding was cascade deleted
cursor.execute("SELECT COUNT(*) FROM holdings WHERE position_id = ?", (position_id,))
assert cursor.fetchone()[0] == 0
conn.close()
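The cascade behaviour these tests rely on comes from `ON DELETE CASCADE` clauses in the schema's foreign-key declarations. A minimal two-table sketch (not the full six-table schema from `api/database.py`) demonstrates the mechanism:

```python
import sqlite3

# Minimal two-table sketch of the cascade wiring verified above.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # must be enabled per connection
conn.executescript("""
    CREATE TABLE jobs (job_id TEXT PRIMARY KEY);
    CREATE TABLE job_details (
        id INTEGER PRIMARY KEY,
        job_id TEXT NOT NULL REFERENCES jobs(job_id) ON DELETE CASCADE
    );
""")
conn.execute("INSERT INTO jobs VALUES ('j1')")
conn.execute("INSERT INTO job_details (job_id) VALUES ('j1')")
conn.execute("DELETE FROM jobs WHERE job_id = 'j1'")
remaining = conn.execute("SELECT COUNT(*) FROM job_details").fetchone()[0]
# remaining is 0: the child row was cascade-deleted with its parent
```

Without the `PRAGMA foreign_keys = ON`, SQLite silently ignores the constraint and the child row would survive the delete, which is exactly the failure mode the connection tests guard against.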
@pytest.mark.unit
class TestUtilityFunctions:
"""Test database utility functions."""
def test_drop_all_tables(self, test_db_path):
"""Should drop all tables when called."""
# Initialize database
initialize_database(test_db_path)
# Verify tables exist
conn = get_db_connection(test_db_path)
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'")
assert cursor.fetchone()[0] == 6
conn.close()
# Drop all tables
drop_all_tables(test_db_path)
# Verify tables are gone
conn = get_db_connection(test_db_path)
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'")
assert cursor.fetchone()[0] == 0
conn.close()
def test_vacuum_database(self, clean_db):
"""Should execute VACUUM command without errors."""
# This should not raise an error
vacuum_database(clean_db)
# Verify database still accessible
conn = get_db_connection(clean_db)
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM jobs")
assert cursor.fetchone()[0] == 0
conn.close()
def test_get_database_stats_empty(self, clean_db):
"""Should return correct stats for empty database."""
stats = get_database_stats(clean_db)
assert "database_size_mb" in stats
assert stats["jobs"] == 0
assert stats["job_details"] == 0
assert stats["positions"] == 0
assert stats["holdings"] == 0
assert stats["reasoning_logs"] == 0
assert stats["tool_usage"] == 0
def test_get_database_stats_with_data(self, clean_db, sample_job_data):
"""Should return correct row counts with data."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
# Insert job
cursor.execute("""
INSERT INTO jobs (job_id, config_path, status, date_range, models, created_at)
VALUES (?, ?, ?, ?, ?, ?)
""", (
sample_job_data["job_id"],
sample_job_data["config_path"],
sample_job_data["status"],
sample_job_data["date_range"],
sample_job_data["models"],
sample_job_data["created_at"]
))
# Insert job_detail
cursor.execute("""
INSERT INTO job_details (job_id, date, model, status)
VALUES (?, ?, ?, ?)
""", (sample_job_data["job_id"], "2025-01-16", "gpt-5", "pending"))
conn.commit()
conn.close()
stats = get_database_stats(clean_db)
assert stats["jobs"] == 1
assert stats["job_details"] == 1
assert stats["database_size_mb"] > 0
@pytest.mark.unit
class TestCheckConstraints:
"""Test CHECK constraints on table columns."""
def test_jobs_status_constraint(self, clean_db):
"""Should reject invalid job status values."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
# Try to insert job with invalid status
with pytest.raises(sqlite3.IntegrityError, match="CHECK constraint failed"):
cursor.execute("""
INSERT INTO jobs (job_id, config_path, status, date_range, models, created_at)
VALUES (?, ?, ?, ?, ?, ?)
""", ("test-job", "configs/test.json", "invalid_status", "[]", "[]", "2025-01-20T00:00:00Z"))
conn.close()
def test_job_details_status_constraint(self, clean_db, sample_job_data):
"""Should reject invalid job_detail status values."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
# Insert valid job first
cursor.execute("""
INSERT INTO jobs (job_id, config_path, status, date_range, models, created_at)
VALUES (?, ?, ?, ?, ?, ?)
""", tuple(sample_job_data.values()))
# Try to insert job_detail with invalid status
with pytest.raises(sqlite3.IntegrityError, match="CHECK constraint failed"):
cursor.execute("""
INSERT INTO job_details (job_id, date, model, status)
VALUES (?, ?, ?, ?)
""", (sample_job_data["job_id"], "2025-01-16", "gpt-5", "invalid_status"))
conn.close()
def test_positions_action_type_constraint(self, clean_db, sample_job_data):
"""Should reject invalid action_type values."""
conn = get_db_connection(clean_db)
cursor = conn.cursor()
# Insert valid job first
cursor.execute("""
INSERT INTO jobs (job_id, config_path, status, date_range, models, created_at)
VALUES (?, ?, ?, ?, ?, ?)
""", tuple(sample_job_data.values()))
# Try to insert position with invalid action_type
with pytest.raises(sqlite3.IntegrityError, match="CHECK constraint failed"):
cursor.execute("""
INSERT INTO positions (
job_id, date, model, action_id, action_type, cash, portfolio_value, created_at
) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
""", (sample_job_data["job_id"], "2025-01-16", "gpt-5", 1, "invalid_action", 10000, 10000, "2025-01-16T00:00:00Z"))
conn.close()
# Coverage target: 95%+ for api/database.py

@@ -0,0 +1,422 @@
"""
Unit tests for api/job_manager.py - Job lifecycle management.
Coverage target: 95%+
Tests verify:
- Job creation and validation
- Status transitions (state machine)
- Progress tracking
- Concurrency control
- Job retrieval and queries
- Cleanup operations
"""
import pytest
import json
from datetime import datetime, timedelta
@pytest.mark.unit
class TestJobCreation:
"""Test job creation and validation."""
def test_create_job_success(self, clean_db):
"""Should create job with pending status."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16", "2025-01-17"],
models=["gpt-5", "claude-3.7-sonnet"]
)
assert job_id is not None
job = manager.get_job(job_id)
assert job["status"] == "pending"
assert job["date_range"] == ["2025-01-16", "2025-01-17"]
assert job["models"] == ["gpt-5", "claude-3.7-sonnet"]
assert job["created_at"] is not None
def test_create_job_with_job_details(self, clean_db):
"""Should create job_details for each model-day."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16", "2025-01-17"],
models=["gpt-5"]
)
progress = manager.get_job_progress(job_id)
assert progress["total_model_days"] == 2 # 2 dates × 1 model
assert progress["completed"] == 0
assert progress["failed"] == 0
def test_create_job_blocks_concurrent(self, clean_db):
"""Should prevent creating second job while first is pending."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job1_id = manager.create_job(
"configs/test.json",
["2025-01-16"],
["gpt-5"]
)
with pytest.raises(ValueError, match="Another simulation job is already running"):
manager.create_job(
"configs/test.json",
["2025-01-17"],
["gpt-5"]
)
def test_create_job_after_completion(self, clean_db):
"""Should allow new job after previous completes."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job1_id = manager.create_job(
"configs/test.json",
["2025-01-16"],
["gpt-5"]
)
manager.update_job_status(job1_id, "completed")
# Now second job should be allowed
job2_id = manager.create_job(
"configs/test.json",
["2025-01-17"],
["gpt-5"]
)
assert job2_id is not None
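Job creation fans each (date, model) pair out into one `job_details` row, which is where `total_model_days` (2 dates × 1 model = 2 above) comes from. A sketch of that expansion, assuming the real `JobManager` does something equivalent internally:

```python
from itertools import product


def expand_model_days(date_range: list[str], models: list[str]) -> list[dict]:
    """Fan a job out into one pending job_details row per (date, model).

    Sketch of the expansion behind total_model_days = dates x models;
    the row shape is illustrative, not the exact DB schema.
    """
    return [
        {"date": date, "model": model, "status": "pending"}
        for date, model in product(date_range, models)
    ]
```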
@pytest.mark.unit
class TestJobStatusTransitions:
"""Test job status state machine."""
def test_pending_to_running(self, clean_db):
"""Should transition from pending to running."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
"configs/test.json",
["2025-01-16"],
["gpt-5"]
)
# Update detail to running
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
job = manager.get_job(job_id)
assert job["status"] == "running"
assert job["started_at"] is not None
def test_running_to_completed(self, clean_db):
"""Should transition to completed when all details complete."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
"configs/test.json",
["2025-01-16"],
["gpt-5"]
)
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")
job = manager.get_job(job_id)
assert job["status"] == "completed"
assert job["completed_at"] is not None
assert job["total_duration_seconds"] is not None
def test_partial_completion(self, clean_db):
"""Should mark as partial when some models fail."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
"configs/test.json",
["2025-01-16"],
["gpt-5", "claude-3.7-sonnet"]
)
# First model succeeds
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")
# Second model fails
manager.update_job_detail_status(job_id, "2025-01-16", "claude-3.7-sonnet", "running")
manager.update_job_detail_status(
job_id, "2025-01-16", "claude-3.7-sonnet", "failed",
error="API timeout"
)
job = manager.get_job(job_id)
assert job["status"] == "partial"
progress = manager.get_job_progress(job_id)
assert progress["completed"] == 1
assert progress["failed"] == 1
@pytest.mark.unit
class TestJobRetrieval:
"""Test job query operations."""
def test_get_nonexistent_job(self, clean_db):
"""Should return None for nonexistent job."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job = manager.get_job("nonexistent-id")
assert job is None
def test_get_current_job(self, clean_db):
"""Should return most recent job."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job1_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
manager.update_job_status(job1_id, "completed")
job2_id = manager.create_job("configs/test.json", ["2025-01-17"], ["gpt-5"])
current = manager.get_current_job()
assert current["job_id"] == job2_id
def test_get_current_job_empty(self, clean_db):
"""Should return None when no jobs exist."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
current = manager.get_current_job()
assert current is None
def test_find_job_by_date_range(self, clean_db):
"""Should find existing job with same date range."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
"configs/test.json",
["2025-01-16", "2025-01-17"],
["gpt-5"]
)
found = manager.find_job_by_date_range(["2025-01-16", "2025-01-17"])
assert found["job_id"] == job_id
def test_find_job_by_date_range_not_found(self, clean_db):
"""Should return None when no matching job exists."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
manager.create_job(
"configs/test.json",
["2025-01-16"],
["gpt-5"]
)
found = manager.find_job_by_date_range(["2025-01-20", "2025-01-21"])
assert found is None
@pytest.mark.unit
class TestJobProgress:
"""Test job progress tracking."""
def test_progress_all_pending(self, clean_db):
"""Should show 0 completed when all pending."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
"configs/test.json",
["2025-01-16", "2025-01-17"],
["gpt-5"]
)
progress = manager.get_job_progress(job_id)
assert progress["total_model_days"] == 2
assert progress["completed"] == 0
assert progress["failed"] == 0
assert progress["current"] is None
def test_progress_with_running(self, clean_db):
"""Should identify currently running model-day."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
"configs/test.json",
["2025-01-16"],
["gpt-5"]
)
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
progress = manager.get_job_progress(job_id)
assert progress["current"] == {"date": "2025-01-16", "model": "gpt-5"}
def test_progress_details(self, clean_db):
"""Should return detailed progress for all model-days."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
"configs/test.json",
["2025-01-16"],
["gpt-5", "claude-3.7-sonnet"]
)
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")
progress = manager.get_job_progress(job_id)
assert len(progress["details"]) == 2
# Find the gpt-5 detail (order may vary)
gpt5_detail = next(d for d in progress["details"] if d["model"] == "gpt-5")
assert gpt5_detail["status"] == "completed"
@pytest.mark.unit
class TestConcurrencyControl:
"""Test concurrency control mechanisms."""
def test_can_start_new_job_when_empty(self, clean_db):
"""Should allow job when none exist."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
assert manager.can_start_new_job() is True
def test_can_start_new_job_blocks_pending(self, clean_db):
"""Should block when job is pending."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
assert manager.can_start_new_job() is False
def test_can_start_new_job_blocks_running(self, clean_db):
"""Should block when job is running."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
manager.update_job_status(job_id, "running")
assert manager.can_start_new_job() is False
def test_can_start_new_job_allows_after_completion(self, clean_db):
"""Should allow new job after previous completes."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
manager.update_job_status(job_id, "completed")
assert manager.can_start_new_job() is True
def test_get_running_jobs(self, clean_db):
"""Should return all running/pending jobs."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job1_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
# Complete first job
manager.update_job_status(job1_id, "completed")
# Create second job
job2_id = manager.create_job("configs/test.json", ["2025-01-17"], ["gpt-5"])
running = manager.get_running_jobs()
assert len(running) == 1
assert running[0]["job_id"] == job2_id
@pytest.mark.unit
class TestJobCleanup:
"""Test maintenance operations."""
def test_cleanup_old_jobs(self, clean_db):
"""Should delete jobs older than threshold."""
from api.job_manager import JobManager
from api.database import get_db_connection
manager = JobManager(db_path=clean_db)
# Create old job (manually set created_at)
conn = get_db_connection(clean_db)
cursor = conn.cursor()
old_date = (datetime.utcnow() - timedelta(days=35)).isoformat() + "Z"
cursor.execute("""
INSERT INTO jobs (job_id, config_path, status, date_range, models, created_at)
VALUES (?, ?, ?, ?, ?, ?)
""", ("old-job", "configs/test.json", "completed", '["2025-01-01"]', '["gpt-5"]', old_date))
conn.commit()
conn.close()
# Create recent job
recent_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
# Cleanup jobs older than 30 days
result = manager.cleanup_old_jobs(days=30)
assert result["jobs_deleted"] == 1
assert manager.get_job("old-job") is None
assert manager.get_job(recent_id) is not None
@pytest.mark.unit
class TestJobUpdateOperations:
"""Test job update methods."""
def test_update_job_status_with_error(self, clean_db):
"""Should record error message when job fails."""
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
manager.update_job_status(job_id, "failed", error="MCP service unavailable")
job = manager.get_job(job_id)
assert job["status"] == "failed"
assert job["error"] == "MCP service unavailable"
def test_update_job_detail_records_duration(self, clean_db):
"""Should calculate duration for completed model-days."""
from api.job_manager import JobManager
import time
manager = JobManager(db_path=clean_db)
job_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
# Start
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
# Small delay
time.sleep(0.1)
# Complete
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")
progress = manager.get_job_progress(job_id)
detail = progress["details"][0]
assert detail["duration_seconds"] is not None
assert detail["duration_seconds"] > 0
# Coverage target: 95%+ for api/job_manager.py

@@ -0,0 +1,481 @@
"""
Unit tests for api/model_day_executor.py - Single model-day execution.
Coverage target: 90%+
Tests verify:
- Executor initialization
- Trading session execution
- Result persistence to SQLite
- Error handling and recovery
- Position tracking
- AI reasoning logs
"""
import pytest
import json
from unittest.mock import Mock, patch, MagicMock
from pathlib import Path
def create_mock_agent(positions=None, last_trade=None, current_prices=None,
reasoning_steps=None, tool_usage=None, session_result=None):
"""Helper to create properly mocked agent."""
mock_agent = Mock()
# Default values
mock_agent.get_positions.return_value = positions or {"CASH": 10000.0}
mock_agent.get_last_trade.return_value = last_trade
mock_agent.get_current_prices.return_value = current_prices or {}
mock_agent.get_reasoning_steps.return_value = reasoning_steps or []
mock_agent.get_tool_usage.return_value = tool_usage or {}
mock_agent.run_trading_session.return_value = session_result or {"success": True}
return mock_agent
@pytest.mark.unit
class TestModelDayExecutorInitialization:
"""Test ModelDayExecutor initialization."""
def test_init_with_required_params(self, clean_db):
"""Should initialize with required parameters."""
from api.model_day_executor import ModelDayExecutor
executor = ModelDayExecutor(
job_id="test-job-123",
date="2025-01-16",
model_sig="gpt-5",
config_path="configs/test.json",
db_path=clean_db
)
assert executor.job_id == "test-job-123"
assert executor.date == "2025-01-16"
assert executor.model_sig == "gpt-5"
assert executor.config_path == "configs/test.json"
def test_init_creates_runtime_config(self, clean_db):
"""Should create isolated runtime config file."""
from api.model_day_executor import ModelDayExecutor
with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
mock_instance = Mock()
mock_instance.create_runtime_config.return_value = "/tmp/runtime_test.json"
mock_runtime.return_value = mock_instance
executor = ModelDayExecutor(
job_id="test-job-123",
date="2025-01-16",
model_sig="gpt-5",
config_path="configs/test.json",
db_path=clean_db
)
# Verify runtime config created
mock_instance.create_runtime_config.assert_called_once_with(
job_id="test-job-123",
model_sig="gpt-5",
date="2025-01-16"
)
@pytest.mark.unit
class TestModelDayExecutorExecution:
"""Test trading session execution."""
def test_execute_success(self, clean_db, sample_job_data):
"""Should execute trading session and write results to DB."""
from api.model_day_executor import ModelDayExecutor
from api.job_manager import JobManager
# Create job and job_detail
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16"],
models=["gpt-5"]
)
# Mock agent execution
mock_agent = create_mock_agent(
positions={"AAPL": 10, "CASH": 7500.0},
current_prices={"AAPL": 250.0},
session_result={"success": True, "total_steps": 15, "stop_signal_received": True}
)
with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
mock_instance = Mock()
mock_instance.create_runtime_config.return_value = "/tmp/runtime_test.json"
mock_runtime.return_value = mock_instance
executor = ModelDayExecutor(
job_id=job_id,
date="2025-01-16",
model_sig="gpt-5",
config_path="configs/test.json",
db_path=clean_db
)
# Mock the _initialize_agent method
with patch.object(executor, '_initialize_agent', return_value=mock_agent):
result = executor.execute()
assert result["success"] is True
assert result["job_id"] == job_id
assert result["date"] == "2025-01-16"
assert result["model"] == "gpt-5"
# Verify job_detail status updated
progress = manager.get_job_progress(job_id)
assert progress["completed"] == 1
def test_execute_failure_updates_status(self, clean_db):
"""Should update status to failed on execution error."""
from api.model_day_executor import ModelDayExecutor
from api.job_manager import JobManager
# Create job
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16"],
models=["gpt-5"]
)
# Mock agent to raise error
with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
mock_instance = Mock()
mock_instance.create_runtime_config.return_value = "/tmp/runtime_test.json"
mock_runtime.return_value = mock_instance
executor = ModelDayExecutor(
job_id=job_id,
date="2025-01-16",
model_sig="gpt-5",
config_path="configs/test.json",
db_path=clean_db
)
# Mock _initialize_agent to raise error
with patch.object(executor, '_initialize_agent', side_effect=Exception("Agent initialization failed")):
result = executor.execute()
assert result["success"] is False
assert "error" in result
# Verify job_detail marked as failed
progress = manager.get_job_progress(job_id)
assert progress["failed"] == 1
@pytest.mark.unit
class TestModelDayExecutorDataPersistence:
"""Test result persistence to SQLite."""
def test_writes_position_to_database(self, clean_db):
"""Should write position record to SQLite."""
from api.model_day_executor import ModelDayExecutor
from api.job_manager import JobManager
from api.database import get_db_connection
# Create job
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16"],
models=["gpt-5"]
)
# Mock successful execution
mock_agent = create_mock_agent(
positions={"AAPL": 10, "CASH": 7500.0},
last_trade={"action": "buy", "symbol": "AAPL", "amount": 10, "price": 250.0},
current_prices={"AAPL": 250.0},
session_result={"success": True, "total_steps": 10}
)
with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
mock_instance = Mock()
mock_instance.create_runtime_config.return_value = "/tmp/runtime_test.json"
mock_runtime.return_value = mock_instance
executor = ModelDayExecutor(
job_id=job_id,
date="2025-01-16",
model_sig="gpt-5",
config_path="configs/test.json",
db_path=clean_db
)
with patch.object(executor, '_initialize_agent', return_value=mock_agent):
executor.execute()
# Verify position written to database
conn = get_db_connection(clean_db)
cursor = conn.cursor()
cursor.execute("""
SELECT job_id, date, model, action_id, action_type
FROM positions
WHERE job_id = ? AND date = ? AND model = ?
""", (job_id, "2025-01-16", "gpt-5"))
row = cursor.fetchone()
assert row is not None
assert row[0] == job_id
assert row[1] == "2025-01-16"
assert row[2] == "gpt-5"
conn.close()
def test_writes_holdings_to_database(self, clean_db):
"""Should write holdings records to SQLite."""
from api.model_day_executor import ModelDayExecutor
from api.job_manager import JobManager
from api.database import get_db_connection
# Create job
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16"],
models=["gpt-5"]
)
# Mock successful execution
mock_agent = create_mock_agent(
positions={"AAPL": 10, "MSFT": 5, "CASH": 7500.0},
current_prices={"AAPL": 250.0, "MSFT": 300.0},
session_result={"success": True}
)
with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
mock_instance = Mock()
mock_instance.create_runtime_config.return_value = "/tmp/runtime_test.json"
mock_runtime.return_value = mock_instance
executor = ModelDayExecutor(
job_id=job_id,
date="2025-01-16",
model_sig="gpt-5",
config_path="configs/test.json",
db_path=clean_db
)
with patch.object(executor, '_initialize_agent', return_value=mock_agent):
executor.execute()
# Verify holdings written
conn = get_db_connection(clean_db)
cursor = conn.cursor()
cursor.execute("""
SELECT h.symbol, h.quantity
FROM holdings h
JOIN positions p ON h.position_id = p.id
WHERE p.job_id = ? AND p.date = ? AND p.model = ?
ORDER BY h.symbol
""", (job_id, "2025-01-16", "gpt-5"))
holdings = cursor.fetchall()
assert len(holdings) == 3
assert holdings[0][0] == "AAPL"
assert holdings[0][1] == 10.0
conn.close()
def test_writes_reasoning_logs(self, clean_db):
"""Should write AI reasoning logs to SQLite."""
from api.model_day_executor import ModelDayExecutor
from api.job_manager import JobManager
from api.database import get_db_connection
# Create job
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16"],
models=["gpt-5"]
)
# Mock execution with reasoning
mock_agent = create_mock_agent(
positions={"CASH": 10000.0},
reasoning_steps=[
{"step": 1, "reasoning": "Analyzing market data"},
{"step": 2, "reasoning": "Evaluating risk"}
],
session_result={
"success": True,
"total_steps": 5,
"stop_signal_received": True,
"reasoning_summary": "Market analysis indicates upward trend"
}
)
with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
mock_instance = Mock()
mock_instance.create_runtime_config.return_value = "/tmp/runtime_test.json"
mock_runtime.return_value = mock_instance
executor = ModelDayExecutor(
job_id=job_id,
date="2025-01-16",
model_sig="gpt-5",
config_path="configs/test.json",
db_path=clean_db
)
with patch.object(executor, '_initialize_agent', return_value=mock_agent):
executor.execute()
# Verify reasoning logs
conn = get_db_connection(clean_db)
cursor = conn.cursor()
cursor.execute("""
SELECT step_number, content
FROM reasoning_logs
WHERE job_id = ? AND date = ? AND model = ?
ORDER BY step_number
""", (job_id, "2025-01-16", "gpt-5"))
logs = cursor.fetchall()
assert len(logs) == 2
assert logs[0][0] == 1
conn.close()
@pytest.mark.unit
class TestModelDayExecutorCleanup:
"""Test cleanup operations."""
def test_cleanup_runtime_config_on_success(self, clean_db):
"""Should cleanup runtime config after successful execution."""
from api.model_day_executor import ModelDayExecutor
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16"],
models=["gpt-5"]
)
mock_agent = create_mock_agent(
positions={"CASH": 10000.0},
session_result={"success": True}
)
with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
mock_instance = Mock()
mock_instance.create_runtime_config.return_value = "/tmp/runtime.json"
mock_runtime.return_value = mock_instance
executor = ModelDayExecutor(
job_id=job_id,
date="2025-01-16",
model_sig="gpt-5",
config_path="configs/test.json",
db_path=clean_db
)
with patch.object(executor, '_initialize_agent', return_value=mock_agent):
executor.execute()
# Verify cleanup called
mock_instance.cleanup_runtime_config.assert_called_once_with("/tmp/runtime.json")
def test_cleanup_runtime_config_on_failure(self, clean_db):
"""Should cleanup runtime config even after failure."""
from api.model_day_executor import ModelDayExecutor
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16"],
models=["gpt-5"]
)
with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
mock_instance = Mock()
mock_instance.create_runtime_config.return_value = "/tmp/runtime.json"
mock_runtime.return_value = mock_instance
executor = ModelDayExecutor(
job_id=job_id,
date="2025-01-16",
model_sig="gpt-5",
config_path="configs/test.json",
db_path=clean_db
)
# Mock _initialize_agent to raise error
with patch.object(executor, '_initialize_agent', side_effect=Exception("Agent failed")):
executor.execute()
# Verify cleanup called even on failure
mock_instance.cleanup_runtime_config.assert_called_once_with("/tmp/runtime.json")
@pytest.mark.unit
class TestModelDayExecutorPositionCalculations:
"""Test position and P&L calculations."""
def test_calculates_portfolio_value(self, clean_db):
"""Should calculate total portfolio value."""
from api.model_day_executor import ModelDayExecutor
from api.job_manager import JobManager
from api.database import get_db_connection
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16"],
models=["gpt-5"]
)
mock_agent = create_mock_agent(
positions={"AAPL": 10, "CASH": 7500.0}, # 10 shares @ $250 = $2500
current_prices={"AAPL": 250.0},
session_result={"success": True}
)
with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
mock_instance = Mock()
mock_instance.create_runtime_config.return_value = "/tmp/runtime_test.json"
mock_runtime.return_value = mock_instance
executor = ModelDayExecutor(
job_id=job_id,
date="2025-01-16",
model_sig="gpt-5",
config_path="configs/test.json",
db_path=clean_db
)
with patch.object(executor, '_initialize_agent', return_value=mock_agent):
executor.execute()
# Verify portfolio value calculated correctly
conn = get_db_connection(clean_db)
cursor = conn.cursor()
cursor.execute("""
SELECT portfolio_value
FROM positions
WHERE job_id = ? AND date = ? AND model = ?
""", (job_id, "2025-01-16", "gpt-5"))
row = cursor.fetchone()
assert row is not None
# Portfolio value should be 2500 (stocks) + 7500 (cash) = 10000
assert row[0] == 10000.0
conn.close()
# Coverage target: 90%+ for api/model_day_executor.py
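The cleanup tests require that `cleanup_runtime_config` runs on both the success and failure paths of `execute()`, which points at a try/except/finally lifecycle. A sketch of that shape, with illustrative function and parameter names (the real executor presumably wires in its agent and runtime manager differently):

```python
def execute_model_day(init_agent, runtime_mgr, config_path, job_id, date, model):
    """Sketch of the execute() lifecycle the cleanup tests pin down:
    the runtime config is removed whether the agent run succeeds or
    raises. Names here are hypothetical."""
    result = {"success": False, "job_id": job_id, "date": date, "model": model}
    try:
        agent = init_agent()
        session = agent.run_trading_session()
        result["success"] = bool(session.get("success"))
    except Exception as exc:
        # Failures are reported in the result dict rather than raised,
        # so the worker can continue with other model-days.
        result["error"] = str(exc)
    finally:
        # Runs on every path, matching test_cleanup_runtime_config_on_failure.
        runtime_mgr.cleanup_runtime_config(config_path)
    return result
```

Putting cleanup in `finally` rather than duplicating it in both branches is what makes the failure-path test pass for free.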

tests/unit/test_models.py

@@ -0,0 +1,381 @@
"""
Unit tests for api/models.py - Pydantic data models.
Coverage target: 90%+
Tests verify:
- Request model validation
- Response model serialization
- Field constraints and types
- Optional vs required fields
"""
import pytest
from pydantic import ValidationError
from datetime import datetime
@pytest.mark.unit
class TestTriggerSimulationRequest:
"""Test TriggerSimulationRequest model."""
def test_valid_request_with_defaults(self):
"""Should accept request with default config_path."""
from api.models import TriggerSimulationRequest
request = TriggerSimulationRequest()
assert request.config_path == "configs/default_config.json"
def test_valid_request_with_custom_path(self):
"""Should accept request with custom config_path."""
from api.models import TriggerSimulationRequest
request = TriggerSimulationRequest(config_path="configs/custom.json")
assert request.config_path == "configs/custom.json"
@pytest.mark.unit
class TestJobProgress:
"""Test JobProgress model."""
def test_valid_progress_minimal(self):
"""Should create progress with minimal fields."""
from api.models import JobProgress
progress = JobProgress(
total_model_days=4,
completed=2,
failed=0
)
assert progress.total_model_days == 4
assert progress.completed == 2
assert progress.failed == 0
assert progress.current is None
assert progress.details is None
def test_valid_progress_with_current(self):
"""Should include current model-day being executed."""
from api.models import JobProgress
progress = JobProgress(
total_model_days=4,
completed=1,
failed=0,
current={"date": "2025-01-16", "model": "gpt-5"}
)
assert progress.current == {"date": "2025-01-16", "model": "gpt-5"}
def test_valid_progress_with_details(self):
"""Should include detailed progress for all model-days."""
from api.models import JobProgress
details = [
{"date": "2025-01-16", "model": "gpt-5", "status": "completed", "duration_seconds": 45.2},
{"date": "2025-01-16", "model": "claude", "status": "running", "duration_seconds": None}
]
progress = JobProgress(
total_model_days=2,
completed=1,
failed=0,
details=details
)
assert len(progress.details) == 2
assert progress.details[0]["status"] == "completed"
@pytest.mark.unit
class TestTriggerSimulationResponse:
"""Test TriggerSimulationResponse model."""
def test_valid_response_accepted(self):
"""Should create accepted response."""
from api.models import TriggerSimulationResponse
response = TriggerSimulationResponse(
job_id="test-job-123",
status="accepted",
date_range=["2025-01-16", "2025-01-17"],
models=["gpt-5"],
created_at="2025-01-20T14:30:00Z",
message="Job queued successfully"
)
assert response.job_id == "test-job-123"
assert response.status == "accepted"
assert len(response.date_range) == 2
assert response.progress is None
def test_valid_response_with_progress(self):
"""Should include progress for running jobs."""
from api.models import TriggerSimulationResponse, JobProgress
progress = JobProgress(
total_model_days=4,
completed=2,
failed=0
)
response = TriggerSimulationResponse(
job_id="test-job-123",
status="running",
date_range=["2025-01-16"],
models=["gpt-5"],
created_at="2025-01-20T14:30:00Z",
message="Simulation in progress",
progress=progress
)
assert response.progress is not None
assert response.progress.completed == 2
@pytest.mark.unit
class TestJobStatusResponse:
"""Test JobStatusResponse model."""
def test_valid_status_running(self):
"""Should create running status response."""
from api.models import JobStatusResponse, JobProgress
progress = JobProgress(
total_model_days=4,
completed=2,
failed=0,
current={"date": "2025-01-16", "model": "gpt-5"}
)
response = JobStatusResponse(
job_id="test-job-123",
status="running",
date_range=["2025-01-16", "2025-01-17"],
models=["gpt-5", "claude"],
progress=progress,
created_at="2025-01-20T14:30:00Z"
)
assert response.status == "running"
assert response.completed_at is None
assert response.total_duration_seconds is None
def test_valid_status_completed(self):
"""Should create completed status response."""
from api.models import JobStatusResponse, JobProgress
progress = JobProgress(
total_model_days=4,
completed=4,
failed=0
)
response = JobStatusResponse(
job_id="test-job-123",
status="completed",
date_range=["2025-01-16"],
models=["gpt-5"],
progress=progress,
created_at="2025-01-20T14:30:00Z",
completed_at="2025-01-20T14:35:00Z",
total_duration_seconds=300.5
)
assert response.status == "completed"
assert response.completed_at == "2025-01-20T14:35:00Z"
assert response.total_duration_seconds == 300.5
@pytest.mark.unit
class TestDailyPnL:
"""Test DailyPnL model."""
def test_valid_pnl(self):
"""Should create P&L with all fields."""
from api.models import DailyPnL
pnl = DailyPnL(
profit=150.50,
return_pct=1.51,
portfolio_value=10150.50
)
assert pnl.profit == 150.50
assert pnl.return_pct == 1.51
assert pnl.portfolio_value == 10150.50
@pytest.mark.unit
class TestTrade:
"""Test Trade model."""
def test_valid_trade_buy(self):
"""Should create buy trade."""
from api.models import Trade
trade = Trade(
id=1,
action="buy",
symbol="AAPL",
amount=10,
price=255.88,
total=2558.80
)
assert trade.action == "buy"
assert trade.symbol == "AAPL"
assert trade.amount == 10
def test_valid_trade_sell(self):
"""Should create sell trade."""
from api.models import Trade
trade = Trade(
id=2,
action="sell",
symbol="MSFT",
amount=5
)
assert trade.action == "sell"
assert trade.price is None # Optional
assert trade.total is None # Optional
@pytest.mark.unit
class TestAIReasoning:
"""Test AIReasoning model."""
def test_valid_reasoning(self):
"""Should create reasoning summary."""
from api.models import AIReasoning
reasoning = AIReasoning(
total_steps=15,
stop_signal_received=True,
reasoning_summary="Market analysis shows...",
tool_usage={"search": 3, "get_price": 5, "trade": 1}
)
assert reasoning.total_steps == 15
assert reasoning.stop_signal_received is True
assert "search" in reasoning.tool_usage
@pytest.mark.unit
class TestModelResult:
"""Test ModelResult model."""
def test_valid_result_minimal(self):
"""Should create minimal result."""
from api.models import ModelResult, DailyPnL
pnl = DailyPnL(profit=150.0, return_pct=1.5, portfolio_value=10150.0)
result = ModelResult(
model="gpt-5",
positions={"AAPL": 10, "CASH": 7500.0},
daily_pnl=pnl
)
assert result.model == "gpt-5"
assert result.positions["AAPL"] == 10
assert result.trades is None
assert result.ai_reasoning is None
def test_valid_result_full(self):
"""Should create full result with all details."""
from api.models import ModelResult, DailyPnL, Trade, AIReasoning
pnl = DailyPnL(profit=150.0, return_pct=1.5, portfolio_value=10150.0)
trades = [Trade(id=1, action="buy", symbol="AAPL", amount=10)]
reasoning = AIReasoning(
total_steps=15,
stop_signal_received=True,
reasoning_summary="...",
tool_usage={"search": 3}
)
result = ModelResult(
model="gpt-5",
positions={"AAPL": 10, "CASH": 7500.0},
daily_pnl=pnl,
trades=trades,
ai_reasoning=reasoning,
log_file_path="data/agent_data/gpt-5/log/2025-01-16/log.jsonl"
)
assert result.trades is not None
assert len(result.trades) == 1
assert result.ai_reasoning is not None
@pytest.mark.unit
class TestResultsResponse:
"""Test ResultsResponse model."""
def test_valid_results_response(self):
"""Should create results response."""
from api.models import ResultsResponse, ModelResult, DailyPnL
pnl = DailyPnL(profit=150.0, return_pct=1.5, portfolio_value=10150.0)
model_result = ModelResult(
model="gpt-5",
positions={"AAPL": 10, "CASH": 7500.0},
daily_pnl=pnl
)
response = ResultsResponse(
date="2025-01-16",
results=[model_result]
)
assert response.date == "2025-01-16"
assert len(response.results) == 1
assert response.results[0].model == "gpt-5"
@pytest.mark.unit
class TestResultsQueryParams:
"""Test ResultsQueryParams model."""
def test_valid_params_minimal(self):
"""Should create params with minimal fields."""
from api.models import ResultsQueryParams
params = ResultsQueryParams(date="2025-01-16")
assert params.date == "2025-01-16"
assert params.model is None
assert params.detail == "minimal"
def test_valid_params_with_filters(self):
"""Should create params with all filters."""
from api.models import ResultsQueryParams
params = ResultsQueryParams(
date="2025-01-16",
model="gpt-5",
detail="full"
)
assert params.model == "gpt-5"
assert params.detail == "full"
def test_invalid_date_format(self):
"""Should reject invalid date format."""
from api.models import ResultsQueryParams
with pytest.raises(ValidationError):
ResultsQueryParams(date="2025/01/16") # Wrong format
def test_invalid_detail_value(self):
"""Should reject invalid detail value."""
from api.models import ResultsQueryParams
with pytest.raises(ValidationError):
ResultsQueryParams(date="2025-01-16", detail="invalid")
# Coverage target: 90%+ for api/models.py
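The last two tests encode the constraints ResultsQueryParams must enforce: an ISO `YYYY-MM-DD` date and a `detail` value restricted to `minimal` or `full`. A stdlib sketch of equivalent validation logic — the real model presumably expresses this with pydantic field validators rather than a hand-rolled check:

```python
import re

VALID_DETAIL = {"minimal", "full"}
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def validate_results_query(date, model=None, detail="minimal"):
    """Illustrative stand-in for ResultsQueryParams validation:
    rejects non-ISO dates and unknown detail levels."""
    if not DATE_RE.match(date):
        raise ValueError(f"date must be YYYY-MM-DD, got {date!r}")
    if detail not in VALID_DETAIL:
        raise ValueError(f"detail must be one of {sorted(VALID_DETAIL)}")
    return {"date": date, "model": model, "detail": detail}
```

Accepting `2025/01/16` here would silently return empty result sets downstream, so failing fast at the boundary is the safer contract.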


@@ -0,0 +1,210 @@
"""
Unit tests for api/runtime_manager.py - Runtime config isolation.
Coverage target: 85%+
Tests verify:
- Isolated runtime config file creation
- Config path uniqueness per model-day
- Cleanup operations
- File lifecycle management
"""
import pytest
import os
import json
from pathlib import Path
import tempfile
@pytest.mark.unit
class TestRuntimeConfigCreation:
"""Test runtime config file creation."""
def test_create_runtime_config(self):
"""Should create unique runtime config file."""
from api.runtime_manager import RuntimeConfigManager
with tempfile.TemporaryDirectory() as temp_dir:
manager = RuntimeConfigManager(data_dir=temp_dir)
config_path = manager.create_runtime_config(
job_id="test-job-123",
model_sig="gpt-5",
date="2025-01-16"
)
# Verify file exists
assert os.path.exists(config_path)
# Verify file is in correct location
assert temp_dir in config_path
# Verify filename contains identifiers
assert "gpt-5" in config_path
assert "2025-01-16" in config_path
def test_create_runtime_config_contents(self):
"""Should initialize config with correct values."""
from api.runtime_manager import RuntimeConfigManager
with tempfile.TemporaryDirectory() as temp_dir:
manager = RuntimeConfigManager(data_dir=temp_dir)
config_path = manager.create_runtime_config(
job_id="test-job-123",
model_sig="gpt-5",
date="2025-01-16"
)
# Read and verify contents
with open(config_path, 'r') as f:
config = json.load(f)
assert config["TODAY_DATE"] == "2025-01-16"
assert config["SIGNATURE"] == "gpt-5"
assert config["IF_TRADE"] is False
assert config["JOB_ID"] == "test-job-123"
def test_create_runtime_config_unique_paths(self):
"""Should create unique paths for different model-days."""
from api.runtime_manager import RuntimeConfigManager
with tempfile.TemporaryDirectory() as temp_dir:
manager = RuntimeConfigManager(data_dir=temp_dir)
path1 = manager.create_runtime_config("job1", "gpt-5", "2025-01-16")
path2 = manager.create_runtime_config("job1", "claude", "2025-01-16")
path3 = manager.create_runtime_config("job1", "gpt-5", "2025-01-17")
# All paths should be different
assert path1 != path2
assert path1 != path3
assert path2 != path3
# All files should exist
assert os.path.exists(path1)
assert os.path.exists(path2)
assert os.path.exists(path3)
def test_create_runtime_config_creates_directory(self):
"""Should create data directory in __init__ if it doesn't exist."""
from api.runtime_manager import RuntimeConfigManager
with tempfile.TemporaryDirectory() as temp_dir:
data_dir = os.path.join(temp_dir, "data")
# Directory shouldn't exist yet
assert not os.path.exists(data_dir)
# Manager creates directory in __init__
manager = RuntimeConfigManager(data_dir=data_dir)
# Directory should be created by __init__
assert os.path.exists(data_dir)
config_path = manager.create_runtime_config("job1", "gpt-5", "2025-01-16")
# Config file should exist
assert os.path.exists(config_path)
@pytest.mark.unit
class TestRuntimeConfigCleanup:
"""Test runtime config cleanup operations."""
def test_cleanup_runtime_config(self):
"""Should delete runtime config file."""
from api.runtime_manager import RuntimeConfigManager
with tempfile.TemporaryDirectory() as temp_dir:
manager = RuntimeConfigManager(data_dir=temp_dir)
config_path = manager.create_runtime_config("job1", "gpt-5", "2025-01-16")
assert os.path.exists(config_path)
# Cleanup
manager.cleanup_runtime_config(config_path)
# File should be deleted
assert not os.path.exists(config_path)
def test_cleanup_nonexistent_file(self):
"""Should handle cleanup of nonexistent file gracefully."""
from api.runtime_manager import RuntimeConfigManager
with tempfile.TemporaryDirectory() as temp_dir:
manager = RuntimeConfigManager(data_dir=temp_dir)
# Should not raise error
manager.cleanup_runtime_config("/nonexistent/path.json")
def test_cleanup_all_runtime_configs(self):
"""Should cleanup all runtime config files."""
from api.runtime_manager import RuntimeConfigManager
with tempfile.TemporaryDirectory() as temp_dir:
manager = RuntimeConfigManager(data_dir=temp_dir)
# Create multiple configs
path1 = manager.create_runtime_config("job1", "gpt-5", "2025-01-16")
path2 = manager.create_runtime_config("job1", "claude", "2025-01-16")
path3 = manager.create_runtime_config("job2", "gpt-5", "2025-01-17")
# Also create a non-runtime file (should not be deleted)
other_file = os.path.join(temp_dir, "other.json")
with open(other_file, 'w') as f:
json.dump({"test": "data"}, f)
# Cleanup all
count = manager.cleanup_all_runtime_configs()
# Runtime configs should be deleted
assert not os.path.exists(path1)
assert not os.path.exists(path2)
assert not os.path.exists(path3)
# Other file should still exist
assert os.path.exists(other_file)
# Should return count of deleted files
assert count == 3
def test_cleanup_all_empty_directory(self):
"""Should handle cleanup when no runtime configs exist."""
from api.runtime_manager import RuntimeConfigManager
with tempfile.TemporaryDirectory() as temp_dir:
manager = RuntimeConfigManager(data_dir=temp_dir)
count = manager.cleanup_all_runtime_configs()
# Should return 0
assert count == 0
@pytest.mark.unit
class TestRuntimeConfigManager:
"""Test RuntimeConfigManager initialization."""
def test_init_with_default_path(self):
"""Should initialize with default data directory."""
from api.runtime_manager import RuntimeConfigManager
manager = RuntimeConfigManager()
assert manager.data_dir == Path("data")
def test_init_with_custom_path(self):
"""Should initialize with custom data directory."""
from api.runtime_manager import RuntimeConfigManager
with tempfile.TemporaryDirectory() as temp_dir:
custom_path = os.path.join(temp_dir, "custom", "path")
manager = RuntimeConfigManager(data_dir=custom_path)
assert manager.data_dir == Path(custom_path)
assert os.path.exists(custom_path) # Should create the directory
# Coverage target: 85%+ for api/runtime_manager.py
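These tests describe RuntimeConfigManager's contract completely: create the data directory in `__init__`, write a per-(job, model, date) JSON file whose name embeds the identifiers, tolerate cleanup of missing files, and bulk-delete only its own files. A minimal stdlib sketch satisfying them — the `runtime_` filename prefix and exact field set are assumptions mirroring what the tests assert:

```python
import json
import os
from pathlib import Path

class RuntimeConfigManager:
    """Minimal sketch of the runtime-config isolation layer these
    tests specify; not the project's actual implementation."""

    def __init__(self, data_dir="data"):
        self.data_dir = Path(data_dir)
        # Directory is created eagerly, per
        # test_create_runtime_config_creates_directory.
        self.data_dir.mkdir(parents=True, exist_ok=True)

    def create_runtime_config(self, job_id, model_sig, date):
        # Unique per (job, model, date); the prefix marks files we own.
        path = self.data_dir / f"runtime_{job_id}_{model_sig}_{date}.json"
        config = {
            "TODAY_DATE": date,
            "SIGNATURE": model_sig,
            "IF_TRADE": False,
            "JOB_ID": job_id,
        }
        path.write_text(json.dumps(config))
        return str(path)

    def cleanup_runtime_config(self, config_path):
        try:
            os.remove(config_path)
        except FileNotFoundError:
            pass  # nonexistent files are ignored, not errors

    def cleanup_all_runtime_configs(self):
        # Glob on the prefix so unrelated files like other.json survive.
        count = 0
        for path in self.data_dir.glob("runtime_*.json"):
            path.unlink()
            count += 1
        return count
```

Scoping the bulk delete to the `runtime_*.json` glob is what keeps `test_cleanup_all_runtime_configs` from wiping the unrelated file it plants in the same directory.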


@@ -0,0 +1,277 @@
"""
Unit tests for api/simulation_worker.py - Job orchestration.
Coverage target: 90%+
Tests verify:
- Worker initialization
- Job execution orchestration
- Date-sequential, model-parallel execution
- Error handling and partial completion
- Job status updates
"""
import pytest
from unittest.mock import Mock, patch, call
from datetime import datetime
@pytest.mark.unit
class TestSimulationWorkerInitialization:
"""Test SimulationWorker initialization."""
def test_init_with_job_id(self, clean_db):
"""Should initialize with job ID."""
from api.simulation_worker import SimulationWorker
worker = SimulationWorker(job_id="test-job-123", db_path=clean_db)
assert worker.job_id == "test-job-123"
assert worker.db_path == clean_db
@pytest.mark.unit
class TestSimulationWorkerExecution:
"""Test job execution orchestration."""
def test_run_executes_all_model_days(self, clean_db):
"""Should execute all model-day combinations."""
from api.simulation_worker import SimulationWorker
from api.job_manager import JobManager
# Create job with 2 dates and 2 models = 4 model-days
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16", "2025-01-17"],
models=["gpt-5", "claude-3.7-sonnet"]
)
worker = SimulationWorker(job_id=job_id, db_path=clean_db)
# Mock ModelDayExecutor
with patch("api.simulation_worker.ModelDayExecutor") as mock_executor_class:
mock_executor = Mock()
mock_executor.execute.return_value = {"success": True}
mock_executor_class.return_value = mock_executor
worker.run()
# Should have created 4 executors (2 dates × 2 models)
assert mock_executor_class.call_count == 4
def test_run_date_sequential_execution(self, clean_db):
"""Should execute dates sequentially, models in parallel."""
from api.simulation_worker import SimulationWorker
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16", "2025-01-17"],
models=["gpt-5", "claude-3.7-sonnet"]
)
worker = SimulationWorker(job_id=job_id, db_path=clean_db)
execution_order = []
def track_execution(job_id, date, model_sig, config_path, db_path):
executor = Mock()
execution_order.append((date, model_sig))
executor.execute.return_value = {"success": True}
return executor
with patch("api.simulation_worker.ModelDayExecutor", side_effect=track_execution):
worker.run()
# All 2025-01-16 executions should come before 2025-01-17
date_16_executions = [e for e in execution_order if e[0] == "2025-01-16"]
date_17_executions = [e for e in execution_order if e[0] == "2025-01-17"]
assert len(date_16_executions) == 2
assert len(date_17_executions) == 2
# Find last index of date 16 and first index of date 17
last_16_idx = max(i for i, e in enumerate(execution_order) if e[0] == "2025-01-16")
first_17_idx = min(i for i, e in enumerate(execution_order) if e[0] == "2025-01-17")
assert last_16_idx < first_17_idx
def test_run_updates_job_status_to_completed(self, clean_db):
"""Should update job status to completed on success."""
from api.simulation_worker import SimulationWorker
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16"],
models=["gpt-5"]
)
worker = SimulationWorker(job_id=job_id, db_path=clean_db)
with patch("api.simulation_worker.ModelDayExecutor") as mock_executor_class:
mock_executor = Mock()
mock_executor.execute.return_value = {"success": True}
mock_executor_class.return_value = mock_executor
worker.run()
# Check job status
job = manager.get_job(job_id)
assert job["status"] == "completed"
def test_run_handles_partial_failure(self, clean_db):
"""Should mark job as partial when some models fail."""
from api.simulation_worker import SimulationWorker
from api.job_manager import JobManager
manager = JobManager(db_path=clean_db)
job_id = manager.create_job(
config_path="configs/test.json",
date_range=["2025-01-16"],
models=["gpt-5", "claude-3.7-sonnet"]
)
worker = SimulationWorker(job_id=job_id, db_path=clean_db)
call_count = 0
def mixed_results(*args, **kwargs):
nonlocal call_count
executor = Mock()
# First model succeeds, second fails
executor.execute.return_value = {"success": call_count == 0}
call_count += 1
return executor
with patch("api.simulation_worker.ModelDayExecutor", side_effect=mixed_results):
worker.run()
# Check job status
job = manager.get_job(job_id)
assert job["status"] == "partial"
@pytest.mark.unit
class TestSimulationWorkerErrorHandling:
    """Test error handling."""

    def test_run_continues_on_single_model_failure(self, clean_db):
        """Should continue executing other models if one fails."""
        from api.simulation_worker import SimulationWorker
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
            config_path="configs/test.json",
            date_range=["2025-01-16"],
            models=["gpt-5", "claude-3.7-sonnet", "gemini"]
        )
        worker = SimulationWorker(job_id=job_id, db_path=clean_db)

        execution_count = 0

        def counting_executor(*args, **kwargs):
            nonlocal execution_count
            execution_count += 1
            executor = Mock()
            # Second model fails
            if execution_count == 2:
                executor.execute.return_value = {"success": False, "error": "Model failed"}
            else:
                executor.execute.return_value = {"success": True}
            return executor

        with patch("api.simulation_worker.ModelDayExecutor", side_effect=counting_executor):
            worker.run()

        # All 3 models should have been executed
        assert execution_count == 3
    def test_run_updates_job_to_failed_on_exception(self, clean_db):
        """Should update job to failed on unexpected exception."""
        from api.simulation_worker import SimulationWorker
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
            config_path="configs/test.json",
            date_range=["2025-01-16"],
            models=["gpt-5"]
        )
        worker = SimulationWorker(job_id=job_id, db_path=clean_db)

        with patch("api.simulation_worker.ModelDayExecutor", side_effect=Exception("Unexpected error")):
            worker.run()

        # Check job status
        job = manager.get_job(job_id)
        assert job["status"] == "failed"
        assert "Unexpected error" in job["error"]
@pytest.mark.unit
class TestSimulationWorkerConcurrency:
    """Test concurrent execution handling."""

    def test_run_with_threading(self, clean_db):
        """Should use threading for parallel model execution."""
        from api.simulation_worker import SimulationWorker
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
            config_path="configs/test.json",
            date_range=["2025-01-16"],
            models=["gpt-5", "claude-3.7-sonnet"]
        )
        worker = SimulationWorker(job_id=job_id, db_path=clean_db)

        with patch("api.simulation_worker.ModelDayExecutor") as mock_executor_class:
            mock_executor = Mock()
            mock_executor.execute.return_value = {"success": True}
            mock_executor_class.return_value = mock_executor

            # Mock ThreadPoolExecutor to verify it's being used
            with patch("api.simulation_worker.ThreadPoolExecutor") as mock_pool:
                mock_pool_instance = Mock()
                mock_pool.return_value.__enter__.return_value = mock_pool_instance
                mock_pool_instance.submit.return_value = Mock(result=lambda: {"success": True})

                worker.run()

                # Verify ThreadPoolExecutor was used
                mock_pool.assert_called_once()
@pytest.mark.unit
class TestSimulationWorkerJobRetrieval:
    """Test job information retrieval."""

    def test_get_job_info(self, clean_db):
        """Should retrieve job information."""
        from api.simulation_worker import SimulationWorker
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
            config_path="configs/test.json",
            date_range=["2025-01-16", "2025-01-17"],
            models=["gpt-5"]
        )
        worker = SimulationWorker(job_id=job_id, db_path=clean_db)

        job_info = worker.get_job_info()

        assert job_info["job_id"] == job_id
        assert job_info["date_range"] == ["2025-01-16", "2025-01-17"]
        assert job_info["models"] == ["gpt-5"]


# Coverage target: 90%+ for api/simulation_worker.py
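Taken together, these tests pin down a small contract for `SimulationWorker.run()`: all models for a date execute in parallel via `ThreadPoolExecutor`, and the job's final status is `completed`, `partial`, or `failed` depending on how many models succeeded. A minimal, self-contained sketch of that contract (hypothetical helper names — `run_day` and `overall_status` are illustrative, not the real `api/simulation_worker.py` API):

```python
from concurrent.futures import ThreadPoolExecutor

def run_day(models, make_executor):
    """Execute every model for one date in parallel; return per-model results.

    make_executor is any callable returning an object with an .execute() method,
    mirroring how the tests above substitute ModelDayExecutor with mocks.
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {m: pool.submit(make_executor(m).execute) for m in models}
        return {m: f.result() for m, f in futures.items()}

def overall_status(results):
    """Map per-model results onto the job statuses asserted in the tests."""
    successes = sum(1 for r in results.values() if r.get("success"))
    if successes == len(results):
        return "completed"
    return "partial" if successes else "failed"
```

One subtlety this sketch makes visible: because each model's result is collected independently, a single failing model never prevents the others from running, which is exactly what `test_run_continues_on_single_model_failure` asserts.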