feat: transform to REST API service with SQLite persistence (v0.3.0)

Major architecture transformation from batch-only to API service with database persistence for Windmill integration. ## REST API Implementation - POST /simulate/trigger - Start simulation jobs - GET /simulate/status/{job_id} - Monitor job progress - GET /results - Query results with filters (job_id, date, model) - GET /health - Service health checks ## Database Layer - SQLite persistence with 6 tables (jobs, job_details, positions, holdings, reasoning_logs, tool_usage) - Foreign key constraints with cascade deletes - Replaces JSONL file storage ## Backend Components - JobManager: Job lifecycle management with concurrency control - RuntimeConfigManager: Thread-safe isolated runtime configs - ModelDayExecutor: Single model-day execution engine - SimulationWorker: Date-sequential, model-parallel orchestration ## Testing - 102 unit and integration tests (85% coverage) - Database: 98% coverage - Job manager: 98% coverage - API endpoints: 81% coverage - Pydantic models: 100% coverage - TDD approach throughout ## Docker Deployment - Dual-mode: API server (persistent) + batch (one-time) - Health checks with 30s interval - Volume persistence for database and logs - Separate entrypoints for each mode ## Validation Tools - scripts/validate_docker_build.sh - Build validation - scripts/test_api_endpoints.sh - Complete API testing - scripts/test_batch_mode.sh - Batch mode validation - DOCKER_API.md - Deployment guide - TESTING_GUIDE.md - Testing procedures ## Configuration - API_PORT environment variable (default: 8080) - Backwards compatible with existing configs - FastAPI, uvicorn, pydantic>=2.0 dependencies Co-Authored-By: AI Assistant <noreply@example.com>
2026-04-01 17:17:24 -04:00 · 2025-10-31 11:47:10 -04:00
parent 5da02b4ba0
commit fb9583b374
45 changed files with 13775 additions and 18 deletions
--- a/docs/api-specification.md
+++ b/docs/api-specification.md
@@ -0,0 +1,837 @@
+# AI-Trader API Service - Technical Specification
+
+## 1. API Endpoints Specification
+
+### 1.1 POST /simulate/trigger
+
+**Purpose:** Trigger a catch-up simulation from the last completed date to the most recent trading day.
+
+**Request:**
+```http
+POST /simulate/trigger HTTP/1.1
+Content-Type: application/json
+
+```
+
+**Response (202 Accepted):**
+```json
+{
+  "job_id": "550e8400-e29b-41d4-a716-446655440000",
+  "status": "accepted",
+  "date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
+  "models": ["claude-3.7-sonnet", "gpt-5"],
+  "created_at": "2025-01-20T14:30:00Z",
+  "message": "Simulation job queued successfully"
+}
+```
+
+**Response (200 OK - Job Already Running):**
+```json
+{
+  "job_id": "550e8400-e29b-41d4-a716-446655440000",
+  "status": "running",
+  "date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
+  "models": ["claude-3.7-sonnet", "gpt-5"],
+  "progress": {
+    "total_model_days": 6,
+    "completed": 3,
+    "failed": 0,
+    "current": {
+      "date": "2025-01-17",
+      "model": "gpt-5"
+    }
+  },
+  "created_at": "2025-01-20T14:25:00Z",
+  "message": "Simulation already in progress"
+}
+```
+
+**Response (200 OK - Already Up To Date):**
+```json
+{
+  "status": "current",
+  "message": "Simulation already up-to-date",
+  "last_simulation_date": "2025-01-20",
+  "next_trading_day": "2025-01-21"
+}
+```
+
+**Response (409 Conflict):**
+```json
+{
+  "error": "conflict",
+  "message": "Different simulation already running",
+  "current_job_id": "previous-job-uuid",
+  "current_date_range": ["2025-01-10", "2025-01-15"]
+}
+```
+
+**Business Logic:**
+1. Load configuration from `config_path` (or default)
+2. Determine last completed date from each model's `position.jsonl`
+3. Calculate date range: `max(last_dates) + 1 day` → `most_recent_trading_day`
+4. Filter for weekdays only (Monday-Friday)
+5. If date_range is empty, return "already up-to-date"
+6. Check for existing jobs with same date range → return existing job
+7. Check for running jobs with different date range → return 409
+8. Create new job in SQLite with status=`pending`
+9. Queue background task to execute simulation
+10. Return 202 with job details
+
+---
+
+### 1.2 GET /simulate/status/{job_id}
+
+**Purpose:** Poll the status and progress of a simulation job.
+
+**Request:**
+```http
+GET /simulate/status/550e8400-e29b-41d4-a716-446655440000 HTTP/1.1
+```
+
+**Response (200 OK - Running):**
+```json
+{
+  "job_id": "550e8400-e29b-41d4-a716-446655440000",
+  "status": "running",
+  "date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
+  "models": ["claude-3.7-sonnet", "gpt-5"],
+  "progress": {
+    "total_model_days": 6,
+    "completed": 3,
+    "failed": 0,
+    "current": {
+      "date": "2025-01-17",
+      "model": "gpt-5"
+    },
+    "details": [
+      {"date": "2025-01-16", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 45.2},
+      {"date": "2025-01-16", "model": "gpt-5", "status": "completed", "duration_seconds": 38.7},
+      {"date": "2025-01-17", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 42.1},
+      {"date": "2025-01-17", "model": "gpt-5", "status": "running", "duration_seconds": null}
+    ]
+  },
+  "created_at": "2025-01-20T14:25:00Z",
+  "updated_at": "2025-01-20T14:27:15Z"
+}
+```
+
+**Response (200 OK - Completed):**
+```json
+{
+  "job_id": "550e8400-e29b-41d4-a716-446655440000",
+  "status": "completed",
+  "date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
+  "models": ["claude-3.7-sonnet", "gpt-5"],
+  "progress": {
+    "total_model_days": 6,
+    "completed": 6,
+    "failed": 0,
+    "details": [
+      {"date": "2025-01-16", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 45.2},
+      {"date": "2025-01-16", "model": "gpt-5", "status": "completed", "duration_seconds": 38.7},
+      {"date": "2025-01-17", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 42.1},
+      {"date": "2025-01-17", "model": "gpt-5", "status": "completed", "duration_seconds": 40.3},
+      {"date": "2025-01-20", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 43.8},
+      {"date": "2025-01-20", "model": "gpt-5", "status": "completed", "duration_seconds": 39.1}
+    ]
+  },
+  "created_at": "2025-01-20T14:25:00Z",
+  "completed_at": "2025-01-20T14:29:45Z",
+  "total_duration_seconds": 285.0
+}
+```
+
+**Response (200 OK - Partial Failure):**
+```json
+{
+  "job_id": "550e8400-e29b-41d4-a716-446655440000",
+  "status": "partial",
+  "date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
+  "models": ["claude-3.7-sonnet", "gpt-5"],
+  "progress": {
+    "total_model_days": 6,
+    "completed": 4,
+    "failed": 2,
+    "details": [
+      {"date": "2025-01-16", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 45.2},
+      {"date": "2025-01-16", "model": "gpt-5", "status": "completed", "duration_seconds": 38.7},
+      {"date": "2025-01-17", "model": "claude-3.7-sonnet", "status": "failed", "error": "MCP service timeout after 3 retries", "duration_seconds": null},
+      {"date": "2025-01-17", "model": "gpt-5", "status": "completed", "duration_seconds": 40.3},
+      {"date": "2025-01-20", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 43.8},
+      {"date": "2025-01-20", "model": "gpt-5", "status": "failed", "error": "AI model API timeout", "duration_seconds": null}
+    ]
+  },
+  "created_at": "2025-01-20T14:25:00Z",
+  "completed_at": "2025-01-20T14:29:45Z"
+}
+```
+
+**Response (404 Not Found):**
+```json
+{
+  "error": "not_found",
+  "message": "Job not found",
+  "job_id": "invalid-job-id"
+}
+```
+
+**Business Logic:**
+1. Query SQLite jobs table for job_id
+2. If not found, return 404
+3. Return job metadata + progress from job_details table
+4. Status transitions: `pending` → `running` → `completed`/`partial`/`failed`
+
+---
+
+### 1.3 GET /simulate/current
+
+**Purpose:** Get the most recent simulation job (for Windmill to discover job_id).
+
+**Request:**
+```http
+GET /simulate/current HTTP/1.1
+```
+
+**Response (200 OK):**
+```json
+{
+  "job_id": "550e8400-e29b-41d4-a716-446655440000",
+  "status": "running",
+  "date_range": ["2025-01-16", "2025-01-17"],
+  "models": ["claude-3.7-sonnet", "gpt-5"],
+  "progress": {
+    "total_model_days": 4,
+    "completed": 2,
+    "failed": 0
+  },
+  "created_at": "2025-01-20T14:25:00Z"
+}
+```
+
+**Response (404 Not Found):**
+```json
+{
+  "error": "not_found",
+  "message": "No simulation jobs found"
+}
+```
+
+**Business Logic:**
+1. Query SQLite: `SELECT * FROM jobs ORDER BY created_at DESC LIMIT 1`
+2. Return job details with progress summary
+
+---
+
+### 1.4 GET /results
+
+**Purpose:** Retrieve simulation results for a specific date and model.
+
+**Request:**
+```http
+GET /results?date=2025-01-15&model=gpt-5&detail=minimal HTTP/1.1
+```
+
+**Query Parameters:**
+- `date` (required): Trading date in YYYY-MM-DD format
+- `model` (optional): Model signature (if omitted, returns all models)
+- `detail` (optional): Response detail level
+  - `minimal` (default): Positions + daily P&L
+  - `full`: + trade history + AI reasoning logs + tool usage stats
+
+**Response (200 OK - minimal):**
+```json
+{
+  "date": "2025-01-15",
+  "results": [
+    {
+      "model": "gpt-5",
+      "positions": {
+        "AAPL": 10,
+        "MSFT": 5,
+        "NVDA": 0,
+        "CASH": 8500.00
+      },
+      "daily_pnl": {
+        "profit": 150.50,
+        "return_pct": 1.5,
+        "portfolio_value": 10150.50
+      }
+    }
+  ]
+}
+```
+
+**Response (200 OK - full):**
+```json
+{
+  "date": "2025-01-15",
+  "results": [
+    {
+      "model": "gpt-5",
+      "positions": {
+        "AAPL": 10,
+        "MSFT": 5,
+        "CASH": 8500.00
+      },
+      "daily_pnl": {
+        "profit": 150.50,
+        "return_pct": 1.5,
+        "portfolio_value": 10150.50
+      },
+      "trades": [
+        {
+          "id": 1,
+          "action": "buy",
+          "symbol": "AAPL",
+          "amount": 10,
+          "price": 255.88,
+          "total": 2558.80
+        }
+      ],
+      "ai_reasoning": {
+        "total_steps": 15,
+        "stop_signal_received": true,
+        "reasoning_summary": "Market analysis indicated strong buy signal for AAPL...",
+        "tool_usage": {
+          "search": 3,
+          "get_price": 5,
+          "math": 2,
+          "trade": 1
+        }
+      },
+      "log_file_path": "data/agent_data/gpt-5/log/2025-01-15/log.jsonl"
+    }
+  ]
+}
+```
+
+**Response (400 Bad Request):**
+```json
+{
+  "error": "invalid_date",
+  "message": "Date must be in YYYY-MM-DD format"
+}
+```
+
+**Response (404 Not Found):**
+```json
+{
+  "error": "no_data",
+  "message": "No simulation data found for date 2025-01-15 and model gpt-5"
+}
+```
+
+**Business Logic:**
+1. Validate date format
+2. Read `position.jsonl` for specified model(s) and date
+3. For `detail=minimal`: Return positions + calculate daily P&L
+4. For `detail=full`:
+   - Parse `log.jsonl` to extract reasoning summary
+   - Count tool usage from log messages
+   - Extract trades from position file
+5. Return aggregated results
+
+---
+
+### 1.5 GET /health
+
+**Purpose:** Health check endpoint for Docker and monitoring.
+
+**Request:**
+```http
+GET /health HTTP/1.1
+```
+
+**Response (200 OK):**
+```json
+{
+  "status": "healthy",
+  "timestamp": "2025-01-20T14:30:00Z",
+  "services": {
+    "mcp_math": {"status": "up", "url": "http://localhost:8000/mcp"},
+    "mcp_search": {"status": "up", "url": "http://localhost:8001/mcp"},
+    "mcp_trade": {"status": "up", "url": "http://localhost:8002/mcp"},
+    "mcp_getprice": {"status": "up", "url": "http://localhost:8003/mcp"}
+  },
+  "storage": {
+    "data_directory": "/app/data",
+    "writable": true,
+    "free_space_mb": 15234
+  },
+  "database": {
+    "status": "connected",
+    "path": "/app/data/jobs.db"
+  }
+}
+```
+
+**Response (503 Service Unavailable):**
+```json
+{
+  "status": "unhealthy",
+  "timestamp": "2025-01-20T14:30:00Z",
+  "services": {
+    "mcp_math": {"status": "down", "url": "http://localhost:8000/mcp", "error": "Connection refused"},
+    "mcp_search": {"status": "up", "url": "http://localhost:8001/mcp"},
+    "mcp_trade": {"status": "up", "url": "http://localhost:8002/mcp"},
+    "mcp_getprice": {"status": "up", "url": "http://localhost:8003/mcp"}
+  },
+  "storage": {
+    "data_directory": "/app/data",
+    "writable": true
+  },
+  "database": {
+    "status": "connected"
+  }
+}
+```
+
+---
+
+## 2. Data Models
+
+### 2.1 SQLite Schema
+
+**Table: jobs**
+```sql
+CREATE TABLE jobs (
+    job_id TEXT PRIMARY KEY,
+    config_path TEXT NOT NULL,
+    status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
+    date_range TEXT NOT NULL,  -- JSON array of dates
+    models TEXT NOT NULL,      -- JSON array of model signatures
+    created_at TEXT NOT NULL,
+    started_at TEXT,
+    completed_at TEXT,
+    total_duration_seconds REAL,
+    error TEXT
+);
+
+CREATE INDEX idx_jobs_status ON jobs(status);
+CREATE INDEX idx_jobs_created_at ON jobs(created_at DESC);
+```
+
+**Table: job_details**
+```sql
+CREATE TABLE job_details (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    job_id TEXT NOT NULL,
+    date TEXT NOT NULL,
+    model TEXT NOT NULL,
+    status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
+    started_at TEXT,
+    completed_at TEXT,
+    duration_seconds REAL,
+    error TEXT,
+    FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
+);
+
+CREATE INDEX idx_job_details_job_id ON job_details(job_id);
+CREATE INDEX idx_job_details_status ON job_details(status);
+```
+
+### 2.2 Pydantic Models
+
+**Request Models:**
+```python
+from pydantic import BaseModel, Field
+from typing import Optional, Literal
+
+class TriggerSimulationRequest(BaseModel):
+    config_path: Optional[str] = Field(default="configs/default_config.json", description="Path to configuration file")
+
+class ResultsQueryParams(BaseModel):
+    date: str = Field(..., pattern=r"^\d{4}-\d{2}-\d{2}$", description="Date in YYYY-MM-DD format")
+    model: Optional[str] = Field(None, description="Model signature filter")
+    detail: Literal["minimal", "full"] = Field(default="minimal", description="Response detail level")
+```
+
+**Response Models:**
+```python
+class JobProgress(BaseModel):
+    total_model_days: int
+    completed: int
+    failed: int
+    current: Optional[dict] = None  # {"date": str, "model": str}
+    details: Optional[list] = None  # List of JobDetailResponse
+
+class TriggerSimulationResponse(BaseModel):
+    job_id: str
+    status: str
+    date_range: list[str]
+    models: list[str]
+    created_at: str
+    message: str
+    progress: Optional[JobProgress] = None
+
+class JobStatusResponse(BaseModel):
+    job_id: str
+    status: str
+    date_range: list[str]
+    models: list[str]
+    progress: JobProgress
+    created_at: str
+    updated_at: Optional[str] = None
+    completed_at: Optional[str] = None
+    total_duration_seconds: Optional[float] = None
+
+class DailyPnL(BaseModel):
+    profit: float
+    return_pct: float
+    portfolio_value: float
+
+class Trade(BaseModel):
+    id: int
+    action: str
+    symbol: str
+    amount: int
+    price: Optional[float] = None
+    total: Optional[float] = None
+
+class AIReasoning(BaseModel):
+    total_steps: int
+    stop_signal_received: bool
+    reasoning_summary: str
+    tool_usage: dict[str, int]
+
+class ModelResult(BaseModel):
+    model: str
+    positions: dict[str, float]
+    daily_pnl: DailyPnL
+    trades: Optional[list[Trade]] = None
+    ai_reasoning: Optional[AIReasoning] = None
+    log_file_path: Optional[str] = None
+
+class ResultsResponse(BaseModel):
+    date: str
+    results: list[ModelResult]
+```
+
+---
+
+## 3. Configuration Management
+
+### 3.1 Environment Variables
+
+Required environment variables remain the same as batch mode:
+```bash
+# OpenAI API Configuration
+OPENAI_API_BASE=https://api.openai.com/v1
+OPENAI_API_KEY=sk-...
+
+# Alpha Vantage API
+ALPHAADVANTAGE_API_KEY=...
+
+# Jina Search API
+JINA_API_KEY=...
+
+# Runtime Config Path (now shared by API and worker)
+RUNTIME_ENV_PATH=/app/data/runtime_env.json
+
+# MCP Service Ports
+MATH_HTTP_PORT=8000
+SEARCH_HTTP_PORT=8001
+TRADE_HTTP_PORT=8002
+GETPRICE_HTTP_PORT=8003
+
+# API Server Configuration
+API_HOST=0.0.0.0
+API_PORT=8080
+
+# Job Configuration
+MAX_CONCURRENT_JOBS=1  # Only one simulation job at a time
+```
+
+### 3.2 Runtime State Management
+
+**Challenge:** Multiple model-days running concurrently need isolated `runtime_env.json` state.
+
+**Solution:** Per-job runtime config files
+- `runtime_env_base.json` - Template
+- `runtime_env_{job_id}_{model}_{date}.json` - Job-specific runtime config
+- Worker passes custom `RUNTIME_ENV_PATH` to each simulation execution
+
+**Modified `write_config_value()` and `get_config_value()`:**
+- Accept optional `runtime_path` parameter
+- Worker manages lifecycle: create → use → cleanup
+
+---
+
+## 4. Error Handling
+
+### 4.1 Error Response Format
+
+All errors follow this structure:
+```json
+{
+  "error": "error_code",
+  "message": "Human-readable error description",
+  "details": {
+    // Optional additional context
+  }
+}
+```
+
+### 4.2 HTTP Status Codes
+
+- `200 OK` - Successful request
+- `202 Accepted` - Job queued successfully
+- `400 Bad Request` - Invalid input parameters
+- `404 Not Found` - Resource not found (job, results)
+- `409 Conflict` - Concurrent job conflict
+- `500 Internal Server Error` - Unexpected server error
+- `503 Service Unavailable` - Health check failed
+
+### 4.3 Retry Strategy for Workers
+
+Models run independently - failure of one model doesn't block others:
+```python
+async def run_model_day(job_id: str, date: str, model_config: dict):
+    try:
+        # Execute simulation for this model-day
+        await agent.run_trading_session(date)
+        update_job_detail_status(job_id, date, model, "completed")
+    except Exception as e:
+        # Log error, update status to failed, continue with next model-day
+        update_job_detail_status(job_id, date, model, "failed", error=str(e))
+        # Do NOT raise - let other models continue
+```
+
+---
+
+## 5. Concurrency & Locking
+
+### 5.1 Job Execution Policy
+
+**Rule:** Maximum 1 running job at a time (configurable via `MAX_CONCURRENT_JOBS`)
+
+**Enforcement:**
+```python
+def can_start_new_job() -> bool:
+    running_jobs = db.query(
+        "SELECT COUNT(*) FROM jobs WHERE status IN ('pending', 'running')"
+    ).fetchone()[0]
+    return running_jobs < MAX_CONCURRENT_JOBS
+```
+
+### 5.2 Position File Concurrency
+
+**Challenge:** Multiple model-days writing to same model's `position.jsonl`
+
+**Solution:** Sequential execution per model
+```python
+# For each date in date_range:
+#   For each model in parallel:  ← Models run in parallel
+#     Execute model-day sequentially  ← Dates for same model run sequentially
+```
+
+**Execution Pattern:**
+```
+Date 2025-01-16:
+  - Model A (running)
+  - Model B (running)
+  - Model C (running)
+
+Date 2025-01-17:  ← Starts only after all models finish 2025-01-16
+  - Model A (running)
+  - Model B (running)
+  - Model C (running)
+```
+
+**Rationale:**
+- Models write to different position files → No conflict
+- Same model's dates run sequentially → No race condition on position.jsonl
+- Date-level parallelism across models → Faster overall execution
+
+---
+
+## 6. Performance Considerations
+
+### 6.1 Execution Time Estimates
+
+Based on current implementation:
+- Single model-day: ~30-60 seconds (depends on AI model latency + tool calls)
+- 3 models × 5 days = 15 model-days ≈ 7.5-15 minutes (parallel execution)
+
+### 6.2 Timeout Configuration
+
+**API Request Timeout:**
+- `/simulate/trigger`: 10 seconds (just queue job)
+- `/simulate/status`: 5 seconds (read from DB)
+- `/results`: 30 seconds (file I/O + parsing)
+
+**Worker Timeout:**
+- Per model-day: 5 minutes (inherited from `max_retries` × `base_delay`)
+- Entire job: No timeout (job runs until all model-days complete or fail)
+
+### 6.3 Optimization Opportunities (Future)
+
+1. **Results caching:** Store computed daily_pnl in SQLite to avoid recomputation
+2. **Parallel date execution:** If position file locking is implemented, run dates in parallel
+3. **Streaming responses:** For `/simulate/status`, use SSE to push updates instead of polling
+
+---
+
+## 7. Logging & Observability
+
+### 7.1 Structured Logging
+
+All API logs use JSON format:
+```json
+{
+  "timestamp": "2025-01-20T14:30:00Z",
+  "level": "INFO",
+  "logger": "api.worker",
+  "message": "Starting simulation for model-day",
+  "job_id": "550e8400-...",
+  "date": "2025-01-16",
+  "model": "gpt-5"
+}
+```
+
+### 7.2 Log Levels
+
+- `DEBUG` - Detailed execution flow (tool calls, price fetches)
+- `INFO` - Job lifecycle events (created, started, completed)
+- `WARNING` - Recoverable errors (retry attempts)
+- `ERROR` - Model-day failures (logged but job continues)
+- `CRITICAL` - System failures (MCP services down, DB corruption)
+
+### 7.3 Audit Trail
+
+All job state transitions logged to `api_audit.log`:
+```json
+{
+  "timestamp": "2025-01-20T14:30:00Z",
+  "event": "job_created",
+  "job_id": "550e8400-...",
+  "user": "windmill-service",  // Future: from auth header
+  "details": {"date_range": [...], "models": [...]}
+}
+```
+
+---
+
+## 8. Security Considerations
+
+### 8.1 Authentication (Future)
+
+For MVP, API relies on network isolation (Docker network). Future enhancements:
+- API key authentication via header: `X-API-Key: <token>`
+- JWT tokens for Windmill integration
+- Rate limiting per API key
+
+### 8.2 Input Validation
+
+- All date parameters validated with regex: `^\d{4}-\d{2}-\d{2}$`
+- Config paths restricted to `configs/` directory (prevent path traversal)
+- Model signatures sanitized (alphanumeric + hyphens only)
+
+### 8.3 File Access Controls
+
+- Results API only reads from `data/agent_data/` directory
+- Config API only reads from `configs/` directory
+- No arbitrary file read via API parameters
+
+---
+
+## 9. Deployment Configuration
+
+### 9.1 Docker Compose
+
+```yaml
+version: '3.8'
+
+services:
+  ai-trader-api:
+    build:
+      context: .
+      dockerfile: Dockerfile
+    ports:
+      - "8080:8080"
+    volumes:
+      - ./data:/app/data
+      - ./configs:/app/configs
+    env_file:
+      - .env
+    environment:
+      - MODE=api
+      - API_PORT=8080
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      start_period: 40s
+    restart: unless-stopped
+```
+
+### 9.2 Dockerfile Modifications
+
+```dockerfile
+# ... existing layers ...
+
+# Install API dependencies
+COPY requirements-api.txt /app/
+RUN pip install --no-cache-dir -r requirements-api.txt
+
+# Copy API application code
+COPY api/ /app/api/
+
+# Copy entrypoint script
+COPY docker-entrypoint.sh /app/
+RUN chmod +x /app/docker-entrypoint.sh
+
+EXPOSE 8080
+
+CMD ["/app/docker-entrypoint.sh"]
+```
+
+### 9.3 Entrypoint Script
+
+```bash
+#!/bin/bash
+set -e
+
+echo "Starting MCP services..."
+cd /app/agent_tools
+python start_mcp_services.py &
+MCP_PID=$!
+
+echo "Waiting for MCP services to be ready..."
+sleep 10
+
+echo "Starting API server..."
+cd /app
+uvicorn api.main:app --host ${API_HOST:-0.0.0.0} --port ${API_PORT:-8080} --workers 1
+
+# Cleanup on exit
+trap "kill $MCP_PID 2>/dev/null || true" EXIT
+```
+
+---
+
+## 10. API Versioning (Future)
+
+For v2 and beyond:
+- URL prefix: `/api/v1/simulate/trigger`, `/api/v2/simulate/trigger`
+- Header-based: `Accept: application/vnd.ai-trader.v1+json`
+
+MVP uses unversioned endpoints (implied v1).
+
+---
+
+## Next Steps
+
+After reviewing this specification, we'll proceed to:
+1. **Component 2:** Job Manager & SQLite Schema Implementation
+2. **Component 3:** Background Worker Architecture
+3. **Component 4:** BaseAgent Refactoring for Single-Day Execution
+4. **Component 5:** Docker & Deployment Configuration
+5. **Component 6:** Windmill Integration Flows
+
+Please review this API specification and provide feedback or approval to continue.
+5. **Component 6:** Windmill Integration Flows
+
+Please review this API specification and provide feedback or approval to continue.