mirror of
https://github.com/Xe138/AI-Trader.git
synced 2026-04-01 17:17:24 -04:00
docs: restructure documentation for improved clarity and navigation
Reorganize documentation into user-focused, developer-focused, and deployment-focused sections.

**New structure:**
- Root: README.md (streamlined), QUICK_START.md, API_REFERENCE.md
- docs/user-guide/: configuration, API usage, integrations, troubleshooting
- docs/developer/: contributing, development setup, testing, architecture
- docs/deployment/: Docker deployment, production checklist, monitoring
- docs/reference/: environment variables, MCP tools, data formats

**Changes:**
- Streamline README.md from 831 to 469 lines
- Create QUICK_START.md for 5-minute onboarding
- Create API_REFERENCE.md as single source of truth for API
- Remove 9 outdated specification docs (v0.2.0 API design)
- Remove DOCKER_API.md (content consolidated into new structure)
- Remove docs/plans/ directory with old design documents
- Update CLAUDE.md with documentation structure guide
- Remove orchestration-specific references

**Benefits:**
- Clear entry points for different audiences
- No content duplication
- Better discoverability through logical hierarchy
- All content reflects current v0.3.0 API
739
API_REFERENCE.md
Normal file
@@ -0,0 +1,739 @@
# AI-Trader API Reference

Complete reference for the AI-Trader REST API service.

**Base URL:** `http://localhost:8080` (default)

**API Version:** 1.0.0

---

## Endpoints

### POST /simulate/trigger

Trigger a new simulation job for a specified date range and models.

**Request Body:**

```json
{
  "start_date": "2025-01-16",
  "end_date": "2025-01-17",
  "models": ["gpt-4", "claude-3.7-sonnet"]
}
```

**Parameters:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `start_date` | string | Yes | Start date in YYYY-MM-DD format |
| `end_date` | string | No | End date in YYYY-MM-DD format. If omitted, simulates a single day (uses `start_date`) |
| `models` | array[string] | No | Model signatures to run. If omitted, uses all enabled models from the server config |

**Response (200 OK):**

```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "total_model_days": 4,
  "message": "Simulation job created with 2 trading dates"
}
```

**Response Fields:**

| Field | Type | Description |
|-------|------|-------------|
| `job_id` | string | Unique UUID for this simulation job |
| `status` | string | Job status: `pending`, `running`, `completed`, `partial`, or `failed` |
| `total_model_days` | integer | Total number of model-day combinations to execute |
| `message` | string | Human-readable status message |

**Error Responses:**

**400 Bad Request** - Invalid parameters or validation failure
```json
{
  "detail": "Invalid date format: 2025-1-16. Expected YYYY-MM-DD"
}
```

**400 Bad Request** - Another job is already running
```json
{
  "detail": "Another simulation job is already running or pending. Please wait for it to complete."
}
```

**500 Internal Server Error** - Server configuration issue
```json
{
  "detail": "Server configuration file not found: configs/default_config.json"
}
```

**503 Service Unavailable** - Price data download failed
```json
{
  "detail": "Failed to download any price data. Check ALPHAADVANTAGE_API_KEY."
}
```

**Validation Rules:**

- **Date format:** Must be YYYY-MM-DD
- **Date validity:** Must be valid calendar dates
- **Date order:** `start_date` must be <= `end_date`
- **Future dates:** Cannot simulate future dates (must be <= today)
- **Date range limit:** Maximum 30 days (configurable via `MAX_SIMULATION_DAYS`)
- **Model signatures:** Must match models defined in server configuration
- **Concurrency:** Only one simulation job can run at a time

**Behavior:**

1. Validates date range and parameters
2. Determines which models to run (from request or server config)
3. Checks for missing price data in date range
4. Downloads missing data if `AUTO_DOWNLOAD_PRICE_DATA=true` (default)
5. Identifies trading dates with complete price data (all symbols available)
6. Creates job in database with status `pending`
7. Starts background worker thread
8. Returns immediately with job ID

**Examples:**

Single day, single model:
```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -H "Content-Type: application/json" \
  -d '{
    "start_date": "2025-01-16",
    "models": ["gpt-4"]
  }'
```

Date range, all enabled models:
```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -H "Content-Type: application/json" \
  -d '{
    "start_date": "2025-01-16",
    "end_date": "2025-01-20"
  }'
```

---
### GET /simulate/status/{job_id}

Get status and progress of a simulation job.

**URL Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `job_id` | string | Job UUID from trigger response |

**Response (200 OK):**

```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "running",
  "progress": {
    "total_model_days": 4,
    "completed": 2,
    "failed": 0,
    "pending": 2
  },
  "date_range": ["2025-01-16", "2025-01-17"],
  "models": ["gpt-4", "claude-3.7-sonnet"],
  "created_at": "2025-01-16T10:00:00Z",
  "started_at": "2025-01-16T10:00:05Z",
  "completed_at": null,
  "total_duration_seconds": null,
  "error": null,
  "details": [
    {
      "model_signature": "gpt-4",
      "trading_date": "2025-01-16",
      "status": "completed",
      "start_time": "2025-01-16T10:00:05Z",
      "end_time": "2025-01-16T10:05:23Z",
      "duration_seconds": 318.5,
      "error": null
    },
    {
      "model_signature": "claude-3.7-sonnet",
      "trading_date": "2025-01-16",
      "status": "completed",
      "start_time": "2025-01-16T10:05:24Z",
      "end_time": "2025-01-16T10:10:12Z",
      "duration_seconds": 288.0,
      "error": null
    },
    {
      "model_signature": "gpt-4",
      "trading_date": "2025-01-17",
      "status": "running",
      "start_time": "2025-01-16T10:10:13Z",
      "end_time": null,
      "duration_seconds": null,
      "error": null
    },
    {
      "model_signature": "claude-3.7-sonnet",
      "trading_date": "2025-01-17",
      "status": "pending",
      "start_time": null,
      "end_time": null,
      "duration_seconds": null,
      "error": null
    }
  ]
}
```

**Response Fields:**

| Field | Type | Description |
|-------|------|-------------|
| `job_id` | string | Job UUID |
| `status` | string | Overall job status |
| `progress` | object | Progress summary |
| `progress.total_model_days` | integer | Total model-day combinations |
| `progress.completed` | integer | Successfully completed model-days |
| `progress.failed` | integer | Failed model-days |
| `progress.pending` | integer | Not yet started model-days |
| `date_range` | array[string] | Trading dates in this job |
| `models` | array[string] | Model signatures in this job |
| `created_at` | string | ISO 8601 timestamp when job was created |
| `started_at` | string | ISO 8601 timestamp when execution began |
| `completed_at` | string | ISO 8601 timestamp when job finished |
| `total_duration_seconds` | float | Total execution time in seconds |
| `error` | string | Error message if job failed |
| `details` | array[object] | Per model-day execution details |

**Job Status Values:**

| Status | Description |
|--------|-------------|
| `pending` | Job created, waiting to start |
| `running` | Job currently executing |
| `completed` | All model-days completed successfully |
| `partial` | Some model-days completed, some failed |
| `failed` | All model-days failed |

**Model-Day Status Values:**

| Status | Description |
|--------|-------------|
| `pending` | Not started yet |
| `running` | Currently executing |
| `completed` | Finished successfully |
| `failed` | Execution failed (see `error` field) |

**Error Response:**

**404 Not Found** - Job doesn't exist
```json
{
  "detail": "Job 550e8400-e29b-41d4-a716-446655440000 not found"
}
```

**Example:**

```bash
curl http://localhost:8080/simulate/status/550e8400-e29b-41d4-a716-446655440000
```

**Polling Recommendation:**

Poll every 10-30 seconds until `status` is `completed`, `partial`, or `failed`.

---
### GET /results

Query simulation results with optional filters.

**Query Parameters:**

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `job_id` | string | No | Filter by job UUID |
| `date` | string | No | Filter by trading date (YYYY-MM-DD) |
| `model` | string | No | Filter by model signature |

**Response (200 OK):**

```json
{
  "results": [
    {
      "id": 1,
      "job_id": "550e8400-e29b-41d4-a716-446655440000",
      "date": "2025-01-16",
      "model": "gpt-4",
      "action_id": 1,
      "action_type": "buy",
      "symbol": "AAPL",
      "amount": 10,
      "price": 250.50,
      "cash": 7495.00,
      "portfolio_value": 10000.00,
      "daily_profit": 0.00,
      "daily_return_pct": 0.00,
      "created_at": "2025-01-16T10:05:23Z",
      "holdings": [
        {"symbol": "AAPL", "quantity": 10},
        {"symbol": "CASH", "quantity": 7495.00}
      ]
    },
    {
      "id": 2,
      "job_id": "550e8400-e29b-41d4-a716-446655440000",
      "date": "2025-01-16",
      "model": "gpt-4",
      "action_id": 2,
      "action_type": "buy",
      "symbol": "MSFT",
      "amount": 5,
      "price": 380.20,
      "cash": 5594.00,
      "portfolio_value": 10105.00,
      "daily_profit": 105.00,
      "daily_return_pct": 1.05,
      "created_at": "2025-01-16T10:05:23Z",
      "holdings": [
        {"symbol": "AAPL", "quantity": 10},
        {"symbol": "MSFT", "quantity": 5},
        {"symbol": "CASH", "quantity": 5594.00}
      ]
    }
  ],
  "count": 2
}
```

**Response Fields:**

| Field | Type | Description |
|-------|------|-------------|
| `results` | array[object] | Array of position records |
| `count` | integer | Number of results returned |

**Position Record Fields:**

| Field | Type | Description |
|-------|------|-------------|
| `id` | integer | Unique position record ID |
| `job_id` | string | Job UUID this belongs to |
| `date` | string | Trading date (YYYY-MM-DD) |
| `model` | string | Model signature |
| `action_id` | integer | Action sequence number (1, 2, 3...) for this model-day |
| `action_type` | string | Action taken: `buy`, `sell`, or `hold` |
| `symbol` | string | Stock symbol traded (or null for `hold`) |
| `amount` | integer | Quantity traded (or null for `hold`) |
| `price` | float | Price per share (or null for `hold`) |
| `cash` | float | Cash balance after this action |
| `portfolio_value` | float | Total portfolio value (cash + holdings) |
| `daily_profit` | float | Profit/loss for this trading day |
| `daily_return_pct` | float | Return percentage for this day |
| `created_at` | string | ISO 8601 timestamp when recorded |
| `holdings` | array[object] | Current holdings after this action |

**Holdings Object:**

| Field | Type | Description |
|-------|------|-------------|
| `symbol` | string | Stock symbol or "CASH" |
| `quantity` | float | Shares owned (or cash amount) |

**Examples:**

All results for a specific job:
```bash
curl "http://localhost:8080/results?job_id=550e8400-e29b-41d4-a716-446655440000"
```

Results for a specific date:
```bash
curl "http://localhost:8080/results?date=2025-01-16"
```

Results for a specific model:
```bash
curl "http://localhost:8080/results?model=gpt-4"
```

Combine filters:
```bash
curl "http://localhost:8080/results?job_id=550e8400-e29b-41d4-a716-446655440000&date=2025-01-16&model=gpt-4"
```
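Because each record is a running snapshot, the record with the highest `action_id` for a given model and date carries that model-day's end-of-day cash and portfolio value. A small client-side helper (a sketch for post-processing, not part of the API) can reduce a `/results` payload to those final snapshots:

```python
def final_snapshots(results):
    """Keep only the highest-action_id record per (model, date).

    Each /results record is a running snapshot, so the last action of a
    model-day reflects the end-of-day portfolio state.
    """
    latest = {}
    for record in results:
        key = (record["model"], record["date"])
        if key not in latest or record["action_id"] > latest[key]["action_id"]:
            latest[key] = record
    return latest

# With the two gpt-4 records documented above (fields trimmed), the
# action_id=2 record wins, giving the 10105.00 end-of-day value.
rows = [
    {"model": "gpt-4", "date": "2025-01-16", "action_id": 1, "portfolio_value": 10000.00},
    {"model": "gpt-4", "date": "2025-01-16", "action_id": 2, "portfolio_value": 10105.00},
]
print(final_snapshots(rows)[("gpt-4", "2025-01-16")]["portfolio_value"])
```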
---
### GET /health

Health check endpoint for monitoring and orchestration services.

**Response (200 OK):**

```json
{
  "status": "healthy",
  "database": "connected",
  "timestamp": "2025-01-16T10:00:00Z"
}
```

**Response Fields:**

| Field | Type | Description |
|-------|------|-------------|
| `status` | string | Overall service health: `healthy` or `unhealthy` |
| `database` | string | Database connection status: `connected` or `disconnected` |
| `timestamp` | string | ISO 8601 timestamp of health check |

**Example:**

```bash
curl http://localhost:8080/health
```

**Usage:**

- Docker health checks: `HEALTHCHECK CMD curl -f http://localhost:8080/health`
- Monitoring systems: Poll every 30-60 seconds
- Orchestration services: Verify availability before triggering simulations
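The Docker health-check usage above can also be expressed in a Compose file. A minimal sketch — the service name, image tag, and intervals here are illustrative assumptions, not taken from this project's compose file:

```yaml
services:
  ai-trader-api:             # service name assumed; match your compose file
    image: ai-trader:latest  # assumed image tag
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s          # matches the 30-60s polling guidance
      timeout: 5s
      retries: 3
      start_period: 15s      # allow the MCP services to come up first
```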
---
## Common Workflows

### Trigger and Monitor a Simulation

1. **Trigger simulation:**
   ```bash
   RESPONSE=$(curl -X POST http://localhost:8080/simulate/trigger \
     -H "Content-Type: application/json" \
     -d '{"start_date": "2025-01-16", "end_date": "2025-01-17", "models": ["gpt-4"]}')

   JOB_ID=$(echo "$RESPONSE" | jq -r '.job_id')
   echo "Job ID: $JOB_ID"
   ```

2. **Poll for completion:**
   ```bash
   while true; do
     STATUS=$(curl -s http://localhost:8080/simulate/status/$JOB_ID | jq -r '.status')
     echo "Status: $STATUS"

     if [[ "$STATUS" == "completed" ]] || [[ "$STATUS" == "partial" ]] || [[ "$STATUS" == "failed" ]]; then
       break
     fi

     sleep 10
   done
   ```

3. **Retrieve results:**
   ```bash
   curl "http://localhost:8080/results?job_id=$JOB_ID" | jq '.'
   ```

### Scheduled Daily Simulations

Use a scheduler (cron, Airflow, etc.) to trigger simulations:

```bash
#!/bin/bash
# daily_simulation.sh

# Calculate yesterday's date (GNU date; on macOS use: date -v-1d +%Y-%m-%d)
DATE=$(date -d "yesterday" +%Y-%m-%d)

# Trigger simulation
curl -X POST http://localhost:8080/simulate/trigger \
  -H "Content-Type: application/json" \
  -d "{\"start_date\": \"$DATE\", \"models\": [\"gpt-4\"]}"
```

Add to crontab:
```
0 6 * * * /path/to/daily_simulation.sh
```

---
## Error Handling

All endpoints return consistent error responses with HTTP status codes and detail messages.

### Common Error Codes

| Code | Meaning | Common Causes |
|------|---------|---------------|
| 400 | Bad Request | Invalid date format, invalid parameters, concurrent job running |
| 404 | Not Found | Job ID doesn't exist |
| 500 | Internal Server Error | Server misconfiguration, missing config file |
| 503 | Service Unavailable | Price data download failed, database unavailable |

### Error Response Format

```json
{
  "detail": "Human-readable error message"
}
```

### Retry Recommendations

- **400 errors:** Fix the request parameters; don't retry
- **404 errors:** Verify the job ID; don't retry
- **500 errors:** Check server logs and investigate before retrying
- **503 errors:** Retry with exponential backoff (wait 1s, 2s, 4s, etc.)
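The 503 backoff recommendation can be sketched as a small, transport-agnostic helper. This is an illustrative pattern, not part of the API; the function name and defaults are assumptions:

```python
import time

def call_with_backoff(do_request, max_retries=4, sleep=time.sleep):
    """Retry a request on 503 with exponential backoff (1s, 2s, 4s, ...).

    do_request is any zero-argument callable returning (status_code, body);
    sleep is injectable so the schedule can be tested without waiting.
    """
    for attempt in range(max_retries + 1):
        status, body = do_request()
        if status != 503:           # only 503s are worth retrying
            return status, body
        if attempt < max_retries:
            sleep(2 ** attempt)     # 1, 2, 4, 8 seconds
    return status, body
```

For example, `do_request` could POST to `/simulate/trigger` with `requests` and return `(response.status_code, response.json())`; 400 and 404 responses pass through immediately, as recommended above.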
---
## Rate Limits and Constraints

### Concurrency

- **Maximum concurrent jobs:** 1 (configurable via `MAX_CONCURRENT_JOBS`)
- **Attempting to start a second job returns:** 400 Bad Request

### Date Range Limits

- **Maximum date range:** 30 days (configurable via `MAX_SIMULATION_DAYS`)
- **Attempting a longer range returns:** 400 Bad Request

### Price Data

- **Alpha Vantage API rate limit:** 5 requests/minute (free tier), 75 requests/minute (premium)
- **Automatic download:** Enabled by default (`AUTO_DOWNLOAD_PRICE_DATA=true`)
- **Behavior when rate limited:** Partial data is downloaded and the simulation continues with the available dates

---
## Data Persistence

All simulation data is stored in a SQLite database at `data/jobs.db`.

### Database Tables

- **jobs** - Job metadata and status
- **job_details** - Per model-day execution details
- **positions** - Trading position records
- **holdings** - Portfolio holdings breakdown
- **reasoning_logs** - AI decision reasoning (if enabled)
- **tool_usage** - MCP tool usage statistics
- **price_data** - Historical price data cache
- **price_coverage** - Data availability tracking

### Data Retention

- Job data persists indefinitely by default
- Results can be queried at any time after job completion
- Manual cleanup: delete rows from the `jobs` table (deletes cascade to related tables)
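Manual cleanup can be sketched with Python's built-in `sqlite3` module. Note that SQLite only honors `ON DELETE CASCADE` when the `foreign_keys` pragma is enabled on the connection; the helper name and the assumption that `jobs` is keyed by `job_id` are illustrative, not verified against this project's schema:

```python
import sqlite3

def delete_job(db_path, job_id):
    """Delete one job; related rows cascade if the schema declares
    ON DELETE CASCADE foreign keys (assumed here). Returns rows deleted."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("PRAGMA foreign_keys = ON")  # cascades need this pragma
        deleted = conn.execute(
            "DELETE FROM jobs WHERE job_id = ?", (job_id,)
        ).rowcount
        conn.commit()
        return deleted
    finally:
        conn.close()
```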
---
## Configuration

API behavior is controlled via environment variables and a server configuration file.

### Environment Variables

See [docs/reference/environment-variables.md](docs/reference/environment-variables.md) for the complete reference.

**Key variables:**

- `API_PORT` - API server port (default: 8080)
- `MAX_CONCURRENT_JOBS` - Maximum concurrent simulations (default: 1)
- `MAX_SIMULATION_DAYS` - Maximum date range (default: 30)
- `AUTO_DOWNLOAD_PRICE_DATA` - Auto-download missing data (default: true)
- `ALPHAADVANTAGE_API_KEY` - Alpha Vantage API key (required)

### Server Configuration File

The server loads model definitions from a configuration file (default: `configs/default_config.json`).

**Example config:**
```json
{
  "models": [
    {
      "name": "GPT-4",
      "basemodel": "openai/gpt-4",
      "signature": "gpt-4",
      "enabled": true
    },
    {
      "name": "Claude 3.7 Sonnet",
      "basemodel": "anthropic/claude-3.7-sonnet",
      "signature": "claude-3.7-sonnet",
      "enabled": true
    }
  ],
  "agent_config": {
    "max_steps": 30,
    "initial_cash": 10000.0
  }
}
```

**Model fields:**

- `signature` - Unique identifier used in API requests
- `enabled` - Whether the model runs when no models are specified in the request
- `basemodel` - Model identifier for the AI provider
- `openai_base_url` - Optional custom API endpoint
- `openai_api_key` - Optional model-specific API key

---
## OpenAPI / Swagger Documentation

Interactive API documentation is available at:

- Swagger UI: `http://localhost:8080/docs`
- ReDoc: `http://localhost:8080/redoc`
- OpenAPI JSON: `http://localhost:8080/openapi.json`

---
## Client Libraries

### Python

```python
import requests
import time

class AITraderClient:
    def __init__(self, base_url="http://localhost:8080"):
        self.base_url = base_url

    def trigger_simulation(self, start_date, end_date=None, models=None):
        """Trigger a simulation job."""
        payload = {"start_date": start_date}
        if end_date:
            payload["end_date"] = end_date
        if models:
            payload["models"] = models

        response = requests.post(
            f"{self.base_url}/simulate/trigger",
            json=payload
        )
        response.raise_for_status()
        return response.json()

    def get_status(self, job_id):
        """Get job status."""
        response = requests.get(f"{self.base_url}/simulate/status/{job_id}")
        response.raise_for_status()
        return response.json()

    def wait_for_completion(self, job_id, poll_interval=10):
        """Poll until job completes."""
        while True:
            status = self.get_status(job_id)
            if status["status"] in ["completed", "partial", "failed"]:
                return status
            time.sleep(poll_interval)

    def get_results(self, job_id=None, date=None, model=None):
        """Query results with optional filters."""
        params = {}
        if job_id:
            params["job_id"] = job_id
        if date:
            params["date"] = date
        if model:
            params["model"] = model

        response = requests.get(f"{self.base_url}/results", params=params)
        response.raise_for_status()
        return response.json()

# Usage
client = AITraderClient()
job = client.trigger_simulation("2025-01-16", models=["gpt-4"])
result = client.wait_for_completion(job["job_id"])
results = client.get_results(job_id=job["job_id"])
```
### TypeScript/JavaScript

```typescript
class AITraderClient {
  constructor(private baseUrl: string = "http://localhost:8080") {}

  async triggerSimulation(
    startDate: string,
    endDate?: string,
    models?: string[]
  ) {
    const body: any = { start_date: startDate };
    if (endDate) body.end_date = endDate;
    if (models) body.models = models;

    const response = await fetch(`${this.baseUrl}/simulate/trigger`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body)
    });

    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
  }

  async getStatus(jobId: string) {
    const response = await fetch(
      `${this.baseUrl}/simulate/status/${jobId}`
    );
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
  }

  async waitForCompletion(jobId: string, pollInterval: number = 10000) {
    while (true) {
      const status = await this.getStatus(jobId);
      if (["completed", "partial", "failed"].includes(status.status)) {
        return status;
      }
      await new Promise(resolve => setTimeout(resolve, pollInterval));
    }
  }

  async getResults(filters: {
    jobId?: string;
    date?: string;
    model?: string;
  } = {}) {
    const params = new URLSearchParams();
    if (filters.jobId) params.set("job_id", filters.jobId);
    if (filters.date) params.set("date", filters.date);
    if (filters.model) params.set("model", filters.model);

    const response = await fetch(
      `${this.baseUrl}/results?${params.toString()}`
    );
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
  }
}

// Usage
const client = new AITraderClient();
const job = await client.triggerSimulation("2025-01-16", undefined, ["gpt-4"]);
const result = await client.waitForCompletion(job.job_id);
const results = await client.getResults({ jobId: job.job_id });
```
42
CLAUDE.md
@@ -303,6 +303,48 @@ When modifying agent behavior or adding tools:
4. Verify position updates in `position/position.jsonl`
5. Use `main.sh` only for full end-to-end testing

See [docs/developer/testing.md](docs/developer/testing.md) for the complete testing guide.

## Documentation Structure

The project uses a well-organized documentation structure:

### Root Level (User-facing)
- **README.md** - Project overview, quick start, API overview
- **QUICK_START.md** - 5-minute getting started guide
- **API_REFERENCE.md** - Complete API endpoint documentation
- **CHANGELOG.md** - Release notes and version history
- **TESTING_GUIDE.md** - Testing and validation procedures

### docs/user-guide/
- `configuration.md` - Environment setup and model configuration
- `using-the-api.md` - Common workflows and best practices
- `integration-examples.md` - Python, TypeScript, automation examples
- `troubleshooting.md` - Common issues and solutions

### docs/developer/
- `CONTRIBUTING.md` - Contribution guidelines
- `development-setup.md` - Local development without Docker
- `testing.md` - Running tests and validation
- `architecture.md` - System design and components
- `database-schema.md` - SQLite table reference
- `adding-models.md` - How to add custom AI models

### docs/deployment/
- `docker-deployment.md` - Production Docker setup
- `production-checklist.md` - Pre-deployment verification
- `monitoring.md` - Health checks, logging, metrics
- `scaling.md` - Multiple instances and load balancing

### docs/reference/
- `environment-variables.md` - Configuration reference
- `mcp-tools.md` - Trading tool documentation
- `data-formats.md` - File formats and schemas

### docs/ (Maintainer docs)
- `DOCKER.md` - Docker deployment details
- `RELEASING.md` - Release process for maintainers

## Common Issues

**MCP Services Not Running:**
@@ -1,6 +0,0 @@
We provide QR codes for joining the HKUDS discussion groups on WeChat and Feishu.

You can join by scanning the QR codes below:

<img src="https://github.com/HKUDS/.github/blob/main/profile/QR.png" alt="WeChat QR Code" width="400"/>
347
DOCKER_API.md
@@ -1,347 +0,0 @@
|
||||
# Docker API Server Deployment
|
||||
|
||||
This guide explains how to run AI-Trader as a persistent REST API server using Docker for Windmill.dev integration.
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Environment Setup
|
||||
|
||||
```bash
|
||||
# Copy environment template
|
||||
cp .env.example .env
|
||||
|
||||
# Edit .env and add your API keys:
|
||||
# - OPENAI_API_KEY
|
||||
# - ALPHAADVANTAGE_API_KEY
|
||||
# - JINA_API_KEY
|
||||
```
|
||||
|
||||
### 2. Start API Server
|
||||
|
||||
```bash
|
||||
# Start in API mode (default)
|
||||
docker-compose up -d ai-trader-api
|
||||
|
||||
# View logs
|
||||
docker-compose logs -f ai-trader-api
|
||||
|
||||
# Check health
|
||||
curl http://localhost:8080/health
|
||||
```
|
||||
|
||||
### 3. Test API Endpoints
|
||||
|
||||
```bash
|
||||
# Health check
|
||||
curl http://localhost:8080/health
|
||||
|
||||
# Trigger simulation
|
||||
curl -X POST http://localhost:8080/simulate/trigger \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"config_path": "/app/configs/default_config.json",
|
||||
"date_range": ["2025-01-16", "2025-01-17"],
|
||||
"models": ["gpt-4"]
|
||||
}'
|
||||
|
||||
# Check job status (replace JOB_ID)
|
||||
curl http://localhost:8080/simulate/status/JOB_ID
|
||||
|
||||
# Query results
|
||||
curl http://localhost:8080/results?date=2025-01-16
|
||||
```
|
||||
|
||||
## Architecture
|
||||
|
||||
### Two Deployment Modes
|
||||
|
||||
**API Server Mode** (Windmill integration):
|
||||
- REST API on port 8080
|
||||
- Background job execution
|
||||
- Persistent SQLite database
|
||||
- Continuous uptime with health checks
|
||||
- Start with: `docker-compose up -d ai-trader-api`
|
||||
|
||||
**Batch Mode** (one-time simulation):
|
||||
- Command-line execution
|
||||
- Runs to completion then exits
|
||||
- Config file driven
|
||||
- Start with: `docker-compose --profile batch up ai-trader-batch`
|
||||
|
||||
### Port Configuration
|
||||
|
||||
| Service | Internal Port | Default Host Port | Environment Variable |
|
||||
|---------|--------------|-------------------|---------------------|
|
||||
| API Server | 8080 | 8080 | `API_PORT` |
|
||||
| Math MCP | 8000 | 8000 | `MATH_HTTP_PORT` |
|
||||
| Search MCP | 8001 | 8001 | `SEARCH_HTTP_PORT` |
|
||||
| Trade MCP | 8002 | 8002 | `TRADE_HTTP_PORT` |
|
| Price MCP | 8003 | 8003 | `GETPRICE_HTTP_PORT` |
| Web Dashboard | 8888 | 8888 | `WEB_HTTP_PORT` |

## API Endpoints

### POST /simulate/trigger

Trigger a new simulation job.

**Request:**

```json
{
  "config_path": "/app/configs/default_config.json",
  "date_range": ["2025-01-16", "2025-01-17"],
  "models": ["gpt-4", "claude-3.7-sonnet"]
}
```

**Response:**

```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "total_model_days": 4,
  "message": "Simulation job created and started"
}
```

### GET /simulate/status/{job_id}

Get job progress and status.

**Response:**

```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "running",
  "progress": {
    "total_model_days": 4,
    "completed": 2,
    "failed": 0,
    "pending": 2
  },
  "date_range": ["2025-01-16", "2025-01-17"],
  "models": ["gpt-4", "claude-3.7-sonnet"],
  "created_at": "2025-01-16T10:00:00Z",
  "details": [
    {
      "date": "2025-01-16",
      "model": "gpt-4",
      "status": "completed",
      "started_at": "2025-01-16T10:00:05Z",
      "completed_at": "2025-01-16T10:05:23Z",
      "duration_seconds": 318.5
    }
  ]
}
```

### GET /results

Query simulation results with optional filters.

**Parameters:**

- `job_id` (optional): Filter by job UUID
- `date` (optional): Filter by trading date (YYYY-MM-DD)
- `model` (optional): Filter by model signature

**Response:**

```json
{
  "results": [
    {
      "id": 1,
      "job_id": "550e8400-e29b-41d4-a716-446655440000",
      "date": "2025-01-16",
      "model": "gpt-4",
      "action_id": 1,
      "action_type": "buy",
      "symbol": "AAPL",
      "amount": 10,
      "price": 250.50,
      "cash": 7495.00,
      "portfolio_value": 10000.00,
      "daily_profit": 0.00,
      "daily_return_pct": 0.00,
      "holdings": [
        {"symbol": "AAPL", "quantity": 10},
        {"symbol": "CASH", "quantity": 7495.00}
      ]
    }
  ],
  "count": 1
}
```
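Clients typically post-process this payload; as a minimal sketch (not part of the API service itself), the rows can be grouped per model and reduced to an action count plus the latest reported portfolio value. Field names follow the response shown above; the sample payload is illustrative.

```python
def summarize_results(payload):
    """Group /results rows by model and keep the latest portfolio value."""
    summary = {}
    for row in payload.get("results", []):
        model = row["model"]
        entry = summary.setdefault(model, {"actions": 0, "portfolio_value": None})
        entry["actions"] += 1
        # Rows arrive in insertion order, so the last row wins.
        entry["portfolio_value"] = row["portfolio_value"]
    return summary

# Illustrative payload shaped like the /results response above
payload = {
    "results": [
        {"model": "gpt-4", "action_type": "buy", "symbol": "AAPL",
         "portfolio_value": 10000.00},
        {"model": "gpt-4", "action_type": "sell", "symbol": "AAPL",
         "portfolio_value": 10120.50},
    ],
    "count": 2,
}
print(summarize_results(payload))
```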

### GET /health

Service health check.

**Response:**

```json
{
  "status": "healthy",
  "database": "connected",
  "timestamp": "2025-01-16T10:00:00Z"
}
```

## Volume Mounts

Data persists across container restarts via volume mounts:

```yaml
volumes:
  - ./data:/app/data        # SQLite database, price data
  - ./logs:/app/logs        # Application logs
  - ./configs:/app/configs  # Configuration files
```

**Key files:**

- `/app/data/jobs.db` - SQLite database with job history and results
- `/app/data/merged.jsonl` - Cached price data (fetched on first run)
- `/app/logs/` - Application and MCP service logs

## Configuration

### Custom Config File

Place config files in the `./configs/` directory:

```json
{
  "agent_type": "BaseAgent",
  "date_range": {
    "init_date": "2025-01-01",
    "end_date": "2025-01-31"
  },
  "models": [
    {
      "name": "GPT-4",
      "basemodel": "gpt-4",
      "signature": "gpt-4",
      "enabled": true
    }
  ],
  "agent_config": {
    "max_steps": 30,
    "initial_cash": 10000.0
  }
}
```

Reference it in API calls as `/app/configs/your_config.json`.

## Troubleshooting

### Check Container Status

```bash
docker-compose ps
docker-compose logs ai-trader-api
```

### Health Check Failing

```bash
# Check if services started
docker exec ai-trader-api ps aux

# Test internal health
docker exec ai-trader-api curl http://localhost:8080/health

# Check MCP services
docker exec ai-trader-api curl http://localhost:8000/health
```

### Database Issues

```bash
# View database
docker exec ai-trader-api sqlite3 data/jobs.db ".tables"

# Reset database (WARNING: deletes all data)
rm ./data/jobs.db
docker-compose restart ai-trader-api
```

### Port Conflicts

If ports are already in use, edit `.env`:

```bash
API_PORT=9080  # Change to available port
```

## Windmill Integration

Example Windmill workflow step:

```python
import httpx


def trigger_simulation(
    api_url: str,
    config_path: str,
    start_date: str,
    end_date: str,
    models: list[str]
):
    """Trigger AI trading simulation via API."""
    response = httpx.post(
        f"{api_url}/simulate/trigger",
        json={
            "config_path": config_path,
            "date_range": [start_date, end_date],
            "models": models
        },
        timeout=30.0
    )
    response.raise_for_status()
    return response.json()


def check_status(api_url: str, job_id: str):
    """Check simulation job status."""
    response = httpx.get(
        f"{api_url}/simulate/status/{job_id}",
        timeout=10.0
    )
    response.raise_for_status()
    return response.json()
```
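A workflow usually polls `check_status` until the job leaves the `pending`/`running` states. A minimal sketch of that loop follows; the status fetcher is injected as a callable so the loop stays easy to test, and the function names and polling budget are illustrative, not part of the API.

```python
import time


def wait_for_completion(fetch_status, poll_interval=0.0, max_polls=100):
    """Poll a status fetcher until the job reaches a terminal state."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("status") not in ("pending", "running"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("job did not finish within the polling budget")


# Stubbed fetcher standing in for: lambda: check_status(api_url, job_id)
responses = iter([
    {"status": "pending"},
    {"status": "running"},
    {"status": "completed"},
])
final = wait_for_completion(lambda: next(responses))
print(final["status"])  # → completed
```

In a real workflow the `poll_interval` would be several seconds to avoid hammering the API.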

## Production Deployment

### Use Docker Hub Image

```yaml
# docker-compose.yml
services:
  ai-trader-api:
    image: ghcr.io/xe138/ai-trader:latest
    # ... rest of config
```

### Build Locally

```yaml
# docker-compose.yml
services:
  ai-trader-api:
    build: .
    # ... rest of config
```

### Environment Security

- Never commit `.env` to version control
- Use secrets management in production (Docker secrets, Kubernetes secrets, etc.)
- Rotate API keys regularly
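As one option, Docker Compose file-based secrets keep keys out of both the image and the plain environment. The sketch below is illustrative only: the secret name and file path are assumptions, and the application would need to support reading the key from the `*_FILE` path rather than the plain variable.

```yaml
# Sketch: file-based secrets with Docker Compose (names are illustrative)
services:
  ai-trader-api:
    image: ghcr.io/xe138/ai-trader:latest
    secrets:
      - openai_api_key
    environment:
      # Assumes the app can read the key from this file path
      OPENAI_API_KEY_FILE: /run/secrets/openai_api_key

secrets:
  openai_api_key:
    file: ./secrets/openai_api_key.txt
```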

## Monitoring

### Prometheus Metrics (Future)

A metrics endpoint is planned: `GET /metrics`

### Log Aggregation

- Container logs: `docker-compose logs -f`
- Application logs: `./logs/api.log`
- MCP service logs: `./logs/mcp_*.log`

## Scaling Considerations

- Single-job concurrency enforced by database lock
- For parallel simulations, deploy multiple instances with separate databases
- Consider a load balancer for high-availability setups
- Database size grows with the number of simulations (plan for cleanup/archival)
373
QUICK_START.md
Normal file
@@ -0,0 +1,373 @@
# Quick Start Guide

Get AI-Trader running in under 5 minutes using Docker.

---

## Prerequisites

- **Docker** and **Docker Compose** installed
  - [Install Docker Desktop](https://www.docker.com/products/docker-desktop/) (includes both)
- **API Keys:**
  - OpenAI API key ([get one here](https://platform.openai.com/api-keys))
  - Alpha Vantage API key ([free tier](https://www.alphavantage.co/support/#api-key))
  - Jina AI API key ([free tier](https://jina.ai/))
- **System Requirements:**
  - 2GB free disk space
  - Internet connection

---

## Step 1: Clone Repository

```bash
git clone https://github.com/Xe138/AI-Trader.git
cd AI-Trader
```

---

## Step 2: Configure Environment

Create a `.env` file with your API keys:

```bash
cp .env.example .env
```

Edit `.env` and add your keys:

```bash
# Required API Keys
OPENAI_API_KEY=sk-your-openai-key-here
ALPHAADVANTAGE_API_KEY=your-alpha-vantage-key-here
JINA_API_KEY=your-jina-key-here

# Optional: Custom OpenAI endpoint
# OPENAI_API_BASE=https://api.openai.com/v1

# Optional: API server port (default: 8080)
# API_PORT=8080
```

**Save the file.**

---

## Step 3: Start the API Server

```bash
docker-compose up -d
```

This will:
- Build the Docker image (~5-10 minutes first time)
- Start the AI-Trader API service
- Start internal MCP services (math, search, trade, price)
- Initialize the SQLite database

**Wait for startup:**

```bash
# View logs
docker logs -f ai-trader

# Wait for this message:
# "Application startup complete"
# Press Ctrl+C to stop viewing logs
```

---

## Step 4: Verify Service is Running

```bash
curl http://localhost:8080/health
```

**Expected response:**

```json
{
  "status": "healthy",
  "database": "connected",
  "timestamp": "2025-01-16T10:00:00Z"
}
```

If you see `"status": "healthy"`, you're ready!

---

## Step 5: Run Your First Simulation

Trigger a simulation for a single day with GPT-4:

```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -H "Content-Type: application/json" \
  -d '{
    "start_date": "2025-01-16",
    "models": ["gpt-4"]
  }'
```

**Response:**

```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "total_model_days": 1,
  "message": "Simulation job created with 1 trading dates"
}
```

**Save the `job_id`** - you'll need it to check status.

---

## Step 6: Monitor Progress

```bash
# Replace with your job_id from Step 5
JOB_ID="550e8400-e29b-41d4-a716-446655440000"

curl http://localhost:8080/simulate/status/$JOB_ID
```

**While running:**

```json
{
  "job_id": "550e8400-...",
  "status": "running",
  "progress": {
    "total_model_days": 1,
    "completed": 0,
    "failed": 0,
    "pending": 1
  },
  ...
}
```

**When complete:**

```json
{
  "job_id": "550e8400-...",
  "status": "completed",
  "progress": {
    "total_model_days": 1,
    "completed": 1,
    "failed": 0,
    "pending": 0
  },
  ...
}
```

**Typical execution time:** 2-5 minutes for a single model-day.

---

## Step 7: View Results

```bash
curl "http://localhost:8080/results?job_id=$JOB_ID" | jq '.'
```

**Example output:**

```json
{
  "results": [
    {
      "id": 1,
      "job_id": "550e8400-...",
      "date": "2025-01-16",
      "model": "gpt-4",
      "action_type": "buy",
      "symbol": "AAPL",
      "amount": 10,
      "price": 250.50,
      "cash": 7495.00,
      "portfolio_value": 10000.00,
      "daily_profit": 0.00,
      "holdings": [
        {"symbol": "AAPL", "quantity": 10},
        {"symbol": "CASH", "quantity": 7495.00}
      ]
    }
  ],
  "count": 1
}
```

You can see:
- What the AI decided to buy/sell
- Portfolio value and cash balance
- All current holdings

---

## Success! What's Next?

### Run Multiple Days

```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -H "Content-Type: application/json" \
  -d '{
    "start_date": "2025-01-16",
    "end_date": "2025-01-20"
  }'
```

This simulates every trading day in the range (weekdays only).
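The expansion from a calendar range into trading days can be sketched by skipping weekends (illustrative only; the service also checks price-data availability and its own calendar):

```python
from datetime import date, timedelta


def weekday_trading_dates(start: str, end: str) -> list[str]:
    """List ISO dates between start and end (inclusive), weekdays only."""
    current = date.fromisoformat(start)
    stop = date.fromisoformat(end)
    dates = []
    while current <= stop:
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            dates.append(current.isoformat())
        current += timedelta(days=1)
    return dates


print(weekday_trading_dates("2025-01-16", "2025-01-20"))
# → ['2025-01-16', '2025-01-17', '2025-01-20']
```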

### Run Multiple Models

```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -H "Content-Type: application/json" \
  -d '{
    "start_date": "2025-01-16",
    "models": ["gpt-4", "claude-3.7-sonnet"]
  }'
```

**Note:** Models must be defined and enabled in `configs/default_config.json`.

### Query Specific Results

```bash
# All results for a specific date
curl "http://localhost:8080/results?date=2025-01-16"

# All results for a specific model
curl "http://localhost:8080/results?model=gpt-4"

# Combine filters
curl "http://localhost:8080/results?date=2025-01-16&model=gpt-4"
```

---

## Troubleshooting

### Service won't start

```bash
# Check logs
docker logs ai-trader

# Common issues:
# - Missing API keys in .env
# - Port 8080 already in use
# - Docker not running
```

**Fix port conflicts:**

Edit `.env` and change `API_PORT`:

```bash
API_PORT=8889
```

Then restart:

```bash
docker-compose down
docker-compose up -d
```

### Health check returns error

```bash
# Check if container is running
docker ps | grep ai-trader

# Restart service
docker-compose restart

# Check for errors in logs
docker logs ai-trader | grep -i error
```

### Job stays "pending"

The simulation might still be downloading price data on first run.

```bash
# Watch logs in real-time
docker logs -f ai-trader

# Look for messages like:
# "Downloading missing price data..."
# "Starting simulation for model-day..."
```

First run can take 10-15 minutes while downloading historical price data.

### "No trading dates with complete price data"

This means price data is missing for the requested date range.

**Solution 1:** Try a different date range (recent dates work best).

**Solution 2:** Manually download price data:

```bash
docker exec -it ai-trader bash
cd data
python get_daily_price.py
python merge_jsonl.py
exit
```

---

## Common Commands

```bash
# View logs
docker logs -f ai-trader

# Stop service
docker-compose down

# Start service
docker-compose up -d

# Restart service
docker-compose restart

# Check health
curl http://localhost:8080/health

# Access container shell
docker exec -it ai-trader bash

# View database
docker exec -it ai-trader sqlite3 /app/data/jobs.db
```

---

## Next Steps

- **Full API Reference:** [API_REFERENCE.md](API_REFERENCE.md)
- **Configuration Guide:** [docs/user-guide/configuration.md](docs/user-guide/configuration.md)
- **Integration Examples:** [docs/user-guide/integration-examples.md](docs/user-guide/integration-examples.md)
- **Troubleshooting:** [docs/user-guide/troubleshooting.md](docs/user-guide/troubleshooting.md)

---

## Need Help?

- Check [docs/user-guide/troubleshooting.md](docs/user-guide/troubleshooting.md)
- Review logs: `docker logs ai-trader`
- Open an issue: [GitHub Issues](https://github.com/Xe138/AI-Trader/issues)
12
ROADMAP.md
@@ -66,6 +66,18 @@ This document outlines planned features and improvements for the AI-Trader project
- Chart library (Plotly.js, Chart.js, or Recharts)
- Served alongside API (single container deployment)

#### Development Infrastructure
- **Migration to uv Package Manager** - Modern Python package management
  - Replace pip with uv for dependency management
  - Create pyproject.toml with project metadata and dependencies
  - Update Dockerfile to use uv for faster, more reliable builds
  - Update development documentation and workflows
  - Benefits:
    - 10-100x faster dependency resolution and installation
    - Better dependency locking and reproducibility
    - Unified tool for virtual environments and package management
    - Drop-in pip replacement with improved UX

## Contributing

We welcome contributions to any of these planned features! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

@@ -1,631 +0,0 @@
# AI-Trader API Service - Enhanced Specifications Summary

## Changes from Original Specifications

Based on user feedback, the specifications have been enhanced with:

1. **SQLite-backed results storage** (instead of reading position.jsonl on-demand)
2. **Comprehensive Python testing suite** with pytest
3. **Defined testing thresholds** for coverage, performance, and quality gates

---

## Document Index

### Core Specifications (Original)
1. **[api-specification.md](./api-specification.md)** - REST API endpoints and data models
2. **[job-manager-specification.md](./job-manager-specification.md)** - Job tracking and database layer
3. **[worker-specification.md](./worker-specification.md)** - Background worker architecture
4. **[implementation-specifications.md](./implementation-specifications.md)** - Agent, Docker, Windmill integration

### Enhanced Specifications (New)
5. **[database-enhanced-specification.md](./database-enhanced-specification.md)** - SQLite results storage
6. **[testing-specification.md](./testing-specification.md)** - Comprehensive testing suite

### Summary Documents
7. **[README-SPECS.md](./README-SPECS.md)** - Original specifications overview
8. **[ENHANCED-SPECIFICATIONS-SUMMARY.md](./ENHANCED-SPECIFICATIONS-SUMMARY.md)** - This document

---

## Key Enhancement #1: SQLite Results Storage

### What Changed

**Before:**
- `/results` endpoint reads `position.jsonl` files on-demand
- File I/O on every API request
- No support for advanced queries (date ranges, aggregations)

**After:**
- Simulation results written to SQLite during execution
- Fast database queries (10-100x faster than file I/O)
- Advanced analytics: timeseries, leaderboards, aggregations

### New Database Tables

```sql
-- Results storage
CREATE TABLE positions (
    id INTEGER PRIMARY KEY,
    job_id TEXT,
    date TEXT,
    model TEXT,
    action_id INTEGER,
    action_type TEXT,
    symbol TEXT,
    amount INTEGER,
    price REAL,
    cash REAL,
    portfolio_value REAL,
    daily_profit REAL,
    daily_return_pct REAL,
    cumulative_profit REAL,
    cumulative_return_pct REAL,
    created_at TEXT,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id)
);

CREATE TABLE holdings (
    id INTEGER PRIMARY KEY,
    position_id INTEGER,
    symbol TEXT,
    quantity INTEGER,
    FOREIGN KEY (position_id) REFERENCES positions(id)
);

CREATE TABLE reasoning_logs (
    id INTEGER PRIMARY KEY,
    job_id TEXT,
    date TEXT,
    model TEXT,
    step_number INTEGER,
    timestamp TEXT,
    role TEXT,
    content TEXT,
    tool_name TEXT,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id)
);

CREATE TABLE tool_usage (
    id INTEGER PRIMARY KEY,
    job_id TEXT,
    date TEXT,
    model TEXT,
    tool_name TEXT,
    call_count INTEGER,
    total_duration_seconds REAL,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id)
);
```
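To make the write path concrete, here is a hedged sketch of how an executor step might insert one position row plus its holdings into this schema. It is not the project's actual `_store_results_to_db()` implementation: the `jobs` table and foreign keys are omitted for brevity, an in-memory database stands in for `data/jobs.db`, and the sample values are illustrative.

```python
import sqlite3

# Trimmed copy of the positions/holdings tables from the schema above
SCHEMA = """
CREATE TABLE positions (
    id INTEGER PRIMARY KEY,
    job_id TEXT, date TEXT, model TEXT,
    action_id INTEGER, action_type TEXT, symbol TEXT, amount INTEGER,
    price REAL, cash REAL, portfolio_value REAL,
    daily_profit REAL, daily_return_pct REAL,
    cumulative_profit REAL, cumulative_return_pct REAL,
    created_at TEXT
);
CREATE TABLE holdings (
    id INTEGER PRIMARY KEY,
    position_id INTEGER,
    symbol TEXT, quantity INTEGER,
    FOREIGN KEY (position_id) REFERENCES positions(id)
);
"""

conn = sqlite3.connect(":memory:")  # a real deployment would open data/jobs.db
conn.executescript(SCHEMA)

cur = conn.execute(
    "INSERT INTO positions (job_id, date, model, action_type, symbol, amount,"
    " price, cash, portfolio_value) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("550e8400", "2025-01-16", "gpt-4", "buy", "AAPL", 10, 250.50, 7495.0, 10000.0),
)
position_id = cur.lastrowid  # link holdings rows to the new position
conn.executemany(
    "INSERT INTO holdings (position_id, symbol, quantity) VALUES (?, ?, ?)",
    [(position_id, "AAPL", 10), (position_id, "CASH", 7495)],
)
conn.commit()

rows = conn.execute(
    "SELECT symbol, quantity FROM holdings WHERE position_id = ?", (position_id,)
).fetchall()
print(rows)  # → [('AAPL', 10), ('CASH', 7495)]
```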

### New API Endpoints

```
# Enhanced results endpoint (now reads from SQLite)
GET /results?date=2025-01-16&model=gpt-5&detail=minimal|full

# New analytics endpoints
GET /portfolio/timeseries?model=gpt-5&start_date=2025-01-01&end_date=2025-01-31
GET /leaderboard?date=2025-01-16  # Rankings by portfolio value
```

### Migration Strategy

**Phase 1:** Dual-write mode
- Agent writes to `position.jsonl` (existing code)
- Executor writes to SQLite after agent completes
- Ensures backward compatibility

**Phase 2:** Verification
- Compare SQLite data vs JSONL data
- Fix any discrepancies

**Phase 3:** Switch over
- `/results` endpoint reads from SQLite
- JSONL writes become optional (can deprecate later)
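Phase 2's verification can be as simple as indexing both stores by their shared keys and reporting records present in one but not the other. A hedged sketch follows; the key fields mirror the positions schema, but the helper name and sample data are illustrative, and a real check would read from `jobs.db` and the JSONL files.

```python
def diff_rows(jsonl_rows, sqlite_rows, keys=("date", "model", "action_id")):
    """Report records present in one store but missing from the other."""
    def index(rows):
        return {tuple(r[k] for k in keys): r for r in rows}

    a, b = index(jsonl_rows), index(sqlite_rows)
    return {
        "missing_in_sqlite": sorted(set(a) - set(b)),
        "missing_in_jsonl": sorted(set(b) - set(a)),
    }


# Illustrative records from the two stores
jsonl = [{"date": "2025-01-16", "model": "gpt-4", "action_id": 1}]
sqlite_data = [{"date": "2025-01-16", "model": "gpt-4", "action_id": 1},
               {"date": "2025-01-16", "model": "gpt-4", "action_id": 2}]
print(diff_rows(jsonl, sqlite_data))
```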

### Performance Improvement

| Operation | Before (JSONL) | After (SQLite) | Speedup |
|-----------|----------------|----------------|---------|
| Get results for 1 date | 200-500ms | 20-50ms | **10x faster** |
| Get timeseries (30 days) | 6-15 seconds | 100-300ms | **50x faster** |
| Get leaderboard | 5-10 seconds | 50-100ms | **100x faster** |

---

## Key Enhancement #2: Comprehensive Testing Suite

### Testing Thresholds

| Metric | Minimum | Target | Enforcement |
|--------|---------|--------|-------------|
| **Code Coverage** | 85% | 90% | CI fails if below |
| **Critical Path Coverage** | 90% | 95% | Manual review |
| **Unit Test Speed** | <10s | <5s | Benchmark tracking |
| **Integration Test Speed** | <60s | <30s | Benchmark tracking |
| **API Response Times** | <500ms | <200ms | Load testing |

### Test Suite Structure

```
tests/
├── unit/                            # 80 tests, <10 seconds
│   ├── test_job_manager.py          # 95% coverage target
│   ├── test_database.py
│   ├── test_runtime_manager.py
│   ├── test_results_service.py      # 95% coverage target
│   └── test_models.py
│
├── integration/                     # 30 tests, <60 seconds
│   ├── test_api_endpoints.py        # Full FastAPI testing
│   ├── test_worker.py
│   ├── test_executor.py
│   └── test_end_to_end.py
│
├── performance/                     # 20 tests
│   ├── test_database_benchmarks.py
│   ├── test_api_load.py             # Locust load testing
│   └── test_simulation_timing.py
│
├── security/                        # 10 tests
│   ├── test_api_security.py         # SQL injection, XSS, path traversal
│   └── test_auth.py                 # Future: API key validation
│
└── e2e/                             # 10 tests, Docker required
    └── test_docker_workflow.py      # Full Docker compose scenario
```

### Quality Gates

**All PRs must pass:**
1. ✅ All tests passing (unit + integration)
2. ✅ Code coverage ≥ 85%
3. ✅ No critical security vulnerabilities (Bandit scan)
4. ✅ Linting passes (Ruff or Flake8)
5. ✅ Type checking passes (mypy strict mode)
6. ✅ No performance regressions (±10% tolerance)

**Release checklist:**
1. ✅ All quality gates pass
2. ✅ End-to-end tests pass in Docker
3. ✅ Load testing passes (100 concurrent requests)
4. ✅ Security scan passes (OWASP ZAP)
5. ✅ Manual smoke tests complete

### CI/CD Integration

```yaml
# .github/workflows/test.yml
name: Test Suite

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run unit tests
        run: pytest tests/unit/ --cov=api --cov-fail-under=85
      - name: Run integration tests
        run: pytest tests/integration/
      - name: Security scan
        run: bandit -r api/ -ll
      - name: Upload coverage
        uses: codecov/codecov-action@v3
```

### Test Coverage Breakdown

| Component | Minimum | Target | Tests |
|-----------|---------|--------|-------|
| `api/job_manager.py` | 90% | 95% | 25 tests |
| `api/worker.py` | 85% | 90% | 15 tests |
| `api/executor.py` | 85% | 90% | 12 tests |
| `api/results_service.py` | 90% | 95% | 18 tests |
| `api/database.py` | 95% | 100% | 10 tests |
| `api/runtime_manager.py` | 85% | 90% | 8 tests |
| `api/main.py` | 80% | 85% | 20 tests |
| **Total** | **85%** | **90%** | **~150 tests** |

---
|
||||
|
||||
## Updated Implementation Plan
|
||||
|
||||
### Phase 1: API Foundation (Days 1-2)
|
||||
- [x] Create `api/` directory structure
|
||||
- [ ] Implement `api/models.py` with Pydantic models
|
||||
- [ ] Implement `api/database.py` with **enhanced schema** (6 tables)
|
||||
- [ ] Implement `api/job_manager.py` with job CRUD operations
|
||||
- [ ] **NEW:** Write unit tests for job_manager (target: 95% coverage)
|
||||
- [ ] Test database operations manually
|
||||
|
||||
**Testing Deliverables:**
|
||||
- 25 unit tests for job_manager
|
||||
- 10 unit tests for database utilities
|
||||
- 85%+ coverage for Phase 1 code
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: Worker & Executor (Days 3-4)
|
||||
- [ ] Implement `api/runtime_manager.py`
|
||||
- [ ] Implement `api/executor.py` for single model-day execution
|
||||
- [ ] **NEW:** Add SQLite write logic to executor (`_store_results_to_db()`)
|
||||
- [ ] Implement `api/worker.py` for job orchestration
|
||||
- [ ] **NEW:** Write unit tests for worker and executor (target: 85% coverage)
|
||||
- [ ] Test runtime config isolation
|
||||
|
||||
**Testing Deliverables:**
|
||||
- 15 unit tests for worker
|
||||
- 12 unit tests for executor
|
||||
- 8 unit tests for runtime_manager
|
||||
- 85%+ coverage for Phase 2 code
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: Results Service & FastAPI Endpoints (Days 5-6)
|
||||
- [ ] **NEW:** Implement `api/results_service.py` (SQLite-backed)
|
||||
- [ ] `get_results(date, model, detail)`
|
||||
- [ ] `get_portfolio_timeseries(model, start_date, end_date)`
|
||||
- [ ] `get_leaderboard(date)`
|
||||
- [ ] Implement `api/main.py` with all endpoints
|
||||
- [ ] `/simulate/trigger` with background tasks
|
||||
- [ ] `/simulate/status/{job_id}`
|
||||
- [ ] `/simulate/current`
|
||||
- [ ] `/results` (now reads from SQLite)
|
||||
- [ ] **NEW:** `/portfolio/timeseries`
|
||||
- [ ] **NEW:** `/leaderboard`
|
||||
- [ ] `/health` with MCP checks
|
||||
- [ ] **NEW:** Write unit tests for results_service (target: 95% coverage)
|
||||
- [ ] **NEW:** Write integration tests for API endpoints (target: 80% coverage)
|
||||
- [ ] Test all endpoints with Postman/curl
|
||||
|
||||
**Testing Deliverables:**
|
||||
- 18 unit tests for results_service
|
||||
- 20 integration tests for API endpoints
|
||||
- Performance benchmarks for database queries
|
||||
- 85%+ coverage for Phase 3 code
|
||||
|
||||
---
|
||||
|
||||
### Phase 4: Docker Integration (Day 7)
|
||||
- [ ] Update `Dockerfile`
|
||||
- [ ] Create `docker-entrypoint-api.sh`
|
||||
- [ ] Create `requirements-api.txt`
|
||||
- [ ] Update `docker-compose.yml`
|
||||
- [ ] Test Docker build
|
||||
- [ ] Test container startup and health checks
|
||||
- [ ] **NEW:** Run E2E tests in Docker environment
|
||||
- [ ] Test end-to-end simulation via API in Docker
|
||||
|
||||
**Testing Deliverables:**
|
||||
- 10 E2E tests with Docker
|
||||
- Docker health check validation
|
||||
- Performance testing in containerized environment
|
||||
|
||||
---
|
||||
|
||||
### Phase 5: Windmill Integration (Days 8-9)
|
||||
- [ ] Create Windmill scripts (trigger, poll, store)
|
||||
- [ ] **UPDATED:** Modify `store_simulation_results.py` to use new `/results` endpoint
|
||||
- [ ] Test scripts locally against Docker API
|
||||
- [ ] Deploy scripts to Windmill instance
|
||||
- [ ] Create Windmill workflow
|
||||
- [ ] Test workflow end-to-end
|
||||
- [ ] Create Windmill dashboard (using new `/portfolio/timeseries` and `/leaderboard` endpoints)
|
||||
- [ ] Document Windmill setup process
|
||||
|
||||
**Testing Deliverables:**
|
||||
- Integration tests for Windmill scripts
|
||||
- End-to-end workflow validation
|
||||
- Dashboard functionality verification
|
||||
|
||||
---
|
||||
|
||||
### Phase 6: Testing, Security & Documentation (Day 10)
|
||||
- [ ] **NEW:** Run full test suite and verify all thresholds met
|
||||
- [ ] Code coverage ≥ 85%
|
||||
- [ ] All ~150 tests passing
|
||||
- [ ] Performance benchmarks within limits
|
||||
- [ ] **NEW:** Security testing
|
||||
- [ ] Bandit scan (Python security issues)
|
||||
- [ ] SQL injection tests
|
||||
- [ ] Input validation tests
|
||||
- [ ] OWASP ZAP scan (optional)
|
||||
- [ ] **NEW:** Load testing with Locust
|
||||
- [ ] 100 concurrent users
|
||||
- [ ] API endpoints within performance thresholds
|
||||
- [ ] Integration tests for complete workflow
|
||||
- [ ] Update README.md with API usage
|
||||
- [ ] Create API documentation (Swagger/OpenAPI - auto-generated by FastAPI)
|
||||
- [ ] Create deployment guide
|
||||
- [ ] Create troubleshooting guide
|
||||
- [ ] **NEW:** Generate test coverage report
|
||||
|
||||
**Testing Deliverables:**
|
||||
- Full test suite execution report
|
||||
- Security scan results
|
||||
- Load testing results
|
||||
- Coverage report (HTML + XML)
|
||||
- CI/CD pipeline configuration
|
||||
|
||||
---
|
||||
|
||||
## New Files Created
|
||||
|
||||
### Database & Results
|
||||
- `api/results_service.py` - SQLite-backed results retrieval
|
||||
- `api/import_historical_data.py` - Migration script for existing position.jsonl files
|
||||
|
||||
### Testing Suite
|
||||
- `tests/conftest.py` - Shared pytest fixtures
|
||||
- `tests/unit/test_job_manager.py` - 25 tests
|
||||
- `tests/unit/test_database.py` - 10 tests
|
||||
- `tests/unit/test_runtime_manager.py` - 8 tests
|
||||
- `tests/unit/test_results_service.py` - 18 tests
|
||||
- `tests/unit/test_models.py` - 5 tests
|
||||
- `tests/integration/test_api_endpoints.py` - 20 tests
|
||||
- `tests/integration/test_worker.py` - 15 tests
|
||||
- `tests/integration/test_executor.py` - 12 tests
|
||||
- `tests/integration/test_end_to_end.py` - 5 tests
|
||||
- `tests/performance/test_database_benchmarks.py` - 10 tests
|
||||
- `tests/performance/test_api_load.py` - Locust load testing
|
||||
- `tests/security/test_api_security.py` - 10 tests
|
||||
- `tests/e2e/test_docker_workflow.py` - 10 tests
|
||||
- `pytest.ini` - Test configuration
|
||||
- `requirements-dev.txt` - Testing dependencies
|
||||
|
||||
### CI/CD
|
||||
- `.github/workflows/test.yml` - GitHub Actions workflow
|
||||
|
||||
---
|
||||
|
||||
## Updated File Structure
|
||||
|
||||
```
|
||||
AI-Trader/
|
||||
├── api/
|
||||
│ ├── __init__.py
|
||||
│ ├── main.py # FastAPI application
|
||||
│ ├── models.py # Pydantic request/response models
|
||||
│ ├── job_manager.py # Job lifecycle management
|
||||
│ ├── database.py # SQLite utilities (enhanced schema)
|
||||
│ ├── worker.py # Background simulation worker
|
||||
│ ├── executor.py # Single model-day execution (+ SQLite writes)
|
||||
│ ├── runtime_manager.py # Runtime config isolation
|
||||
│ ├── results_service.py # NEW: SQLite-backed results retrieval
|
||||
│ └── import_historical_data.py # NEW: JSONL → SQLite migration
|
||||
│
|
||||
├── tests/ # NEW: Comprehensive test suite
|
||||
│ ├── conftest.py
|
||||
│ ├── unit/ # 80 tests, <10s
|
||||
│ ├── integration/ # 30 tests, <60s
|
||||
│ ├── performance/ # 20 tests
|
||||
│ ├── security/ # 10 tests
|
||||
│ └── e2e/ # 10 tests
|
||||
│
|
||||
├── docs/
|
||||
│ ├── api-specification.md
|
||||
│ ├── job-manager-specification.md
|
||||
│ ├── worker-specification.md
|
||||
│ ├── implementation-specifications.md
|
||||
│ ├── database-enhanced-specification.md # NEW
|
||||
│ ├── testing-specification.md # NEW
|
||||
│ ├── README-SPECS.md
|
||||
│ └── ENHANCED-SPECIFICATIONS-SUMMARY.md # NEW (this file)
|
||||
│
|
||||
├── data/
|
||||
│ ├── jobs.db # SQLite database (6 tables)
|
||||
│ ├── runtime_env*.json # Runtime configs (temporary)
|
||||
│ ├── agent_data/ # Existing position/log data
|
||||
│ └── merged.jsonl # Existing price data
|
||||
│
|
||||
├── pytest.ini # NEW: Test configuration
|
||||
├── requirements-dev.txt # NEW: Testing dependencies
|
||||
├── .github/workflows/test.yml # NEW: CI/CD pipeline
|
||||
└── ... (existing files)
|
||||
```
|
||||
|
||||
---

## Benefits Summary

### Performance
- **10-100x faster** results queries (SQLite vs file I/O)
- **Advanced analytics** - timeseries, leaderboards, aggregations in milliseconds
- **Optimized indexes** for common queries

### Quality
- **85% minimum coverage** enforced by CI/CD
- **150 comprehensive tests** across unit, integration, performance, security
- **Quality gates** prevent regressions
- **Type safety** with mypy strict mode

### Maintainability
- **SQLite single source of truth** - easier backup, restore, migration
- **Automated testing** catches bugs early
- **CI/CD integration** provides fast feedback on every commit
- **Security scanning** prevents vulnerabilities

### Analytics Capabilities

**New queries enabled by SQLite:**

```
# Portfolio timeseries for charting
GET /portfolio/timeseries?model=gpt-5&start_date=2025-01-01&end_date=2025-01-31

# Model leaderboard
GET /leaderboard?date=2025-01-31

# Advanced filtering (future)
SELECT * FROM positions
WHERE daily_return_pct > 2.0
ORDER BY portfolio_value DESC;

# Aggregations (future)
SELECT model, AVG(daily_return_pct) as avg_return
FROM positions
GROUP BY model
ORDER BY avg_return DESC;
```

---
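The aggregation query above can be exercised directly with Python's built-in `sqlite3`. This sketch assumes a simplified `positions` table containing only the columns the query touches; the real schema has more fields:

```python
import sqlite3

# In-memory database with a simplified positions table (assumed columns only)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE positions (model TEXT, daily_return_pct REAL, portfolio_value REAL)"
)
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?)",
    [
        ("gpt-5", 1.5, 10150.50),
        ("gpt-5", -0.3, 10120.05),
        ("claude-3.7-sonnet", 2.1, 10210.00),
        ("claude-3.7-sonnet", 0.4, 10250.84),
    ],
)

# Same aggregation as the "future" leaderboard query above
rows = conn.execute(
    "SELECT model, AVG(daily_return_pct) AS avg_return "
    "FROM positions GROUP BY model ORDER BY avg_return DESC"
).fetchall()
for model, avg_return in rows:
    print(model, round(avg_return, 2))
```

Because the aggregation runs inside SQLite rather than over JSONL files, it stays fast even as the positions history grows.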
## Migration from Original Spec

If you've already started implementation based on the original specs:

### Step 1: Database Schema Migration
```sql
-- Run enhanced schema creation
-- See database-enhanced-specification.md Section 2.1
```

### Step 2: Add Results Service
```bash
# Create new file
touch api/results_service.py
# Implement as per database-enhanced-specification.md Section 4.1
```

### Step 3: Update Executor
```python
# In api/executor.py, add after agent.run_trading_session():
self._store_results_to_db(job_id, date, model_sig)
```

### Step 4: Update API Endpoints
```python
# In api/main.py, update /results endpoint to use ResultsService
from api.results_service import ResultsService
results_service = ResultsService()

@app.get("/results")
async def get_results(...):
    return results_service.get_results(date, model, detail)
```

### Step 5: Add Test Suite
```bash
mkdir -p tests/{unit,integration,performance,security,e2e}
# Create test files as per testing-specification.md Sections 4-8
```

### Step 6: Configure CI/CD
```bash
mkdir -p .github/workflows
# Create test.yml as per testing-specification.md Section 10.1
```

---
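The `_store_results_to_db` helper referenced in Step 3 is not defined in this summary. A minimal sketch of what it might do, assuming the simplified `positions` columns used by the analytics examples (the real column set lives in database-enhanced-specification.md):

```python
import json
import sqlite3

def store_results_to_db(db_path: str, model: str, date: str, position_file: str) -> None:
    """Copy the latest position record for one model-day into SQLite.

    The positions schema here is an assumption based on the analytics
    examples (model, date, return_pct, portfolio_value).
    """
    with open(position_file) as f:
        records = [json.loads(line) for line in f if line.strip()]
    latest = records[-1]  # position.jsonl is append-only; last line is current

    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS positions "
        "(model TEXT, date TEXT, daily_return_pct REAL, portfolio_value REAL)"
    )
    conn.execute(
        "INSERT INTO positions VALUES (?, ?, ?, ?)",
        (model, date, latest.get("return_pct"), latest.get("portfolio_value")),
    )
    conn.commit()
    conn.close()
```

In dual-write mode the executor would call this after writing `position.jsonl`, so the file remains the fallback while SQLite serves queries.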
## Testing Execution Guide

### Run Unit Tests
```bash
pytest tests/unit/ -v --cov=api --cov-report=term-missing
```

### Run Integration Tests
```bash
pytest tests/integration/ -v
```

### Run All Tests (Except E2E)
```bash
pytest tests/ -v --ignore=tests/e2e/ --cov=api --cov-report=html
```

### Run E2E Tests (Requires Docker)
```bash
pytest tests/e2e/ -v -s
```

### Run Performance Benchmarks
```bash
pytest tests/performance/ --benchmark-only
```

### Run Security Tests
```bash
pytest tests/security/ -v
bandit -r api/ -ll
```

### Generate Coverage Report
```bash
pytest tests/unit/ tests/integration/ --cov=api --cov-report=html
open htmlcov/index.html  # View in browser
```

### Run Load Tests
```bash
locust -f tests/performance/test_api_load.py --host=http://localhost:8080
# Open http://localhost:8089 for Locust UI
```

---
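The commands above assume test markers and coverage defaults wired up through `pytest.ini`. A minimal sketch of what that file might contain; the exact options in the repository may differ:

```ini
[pytest]
testpaths = tests
addopts = --strict-markers --cov-fail-under=85
markers =
    performance: benchmark tests
    security: security tests
    e2e: end-to-end tests requiring Docker
```

With `--cov-fail-under=85`, any local or CI run enforces the same minimum coverage threshold as the quality gates.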
## Questions & Next Steps

### Review Checklist

Please review:
1. ✅ **Enhanced database schema** with 6 tables for comprehensive results storage
2. ✅ **Migration strategy** for backward compatibility (dual-write mode)
3. ✅ **Testing thresholds** (85% coverage minimum, performance benchmarks)
4. ✅ **Test suite structure** (150 tests across 5 categories)
5. ✅ **CI/CD integration** with quality gates
6. ✅ **Updated implementation plan** (10 days, 6 phases)

### Questions to Consider

1. **Database migration timing:** Start with dual-write mode immediately, or add it in Phase 2?
2. **Testing priorities:** Should we implement tests alongside features (TDD) or after each phase?
3. **CI/CD platform:** GitHub Actions (as specified) or a different platform?
4. **Performance baselines:** Should we run benchmarks before implementation to track improvement?
5. **Security priorities:** Which security tests are MVP vs nice-to-have?

### Ready to Implement?

**Option A:** Approve specifications and begin Phase 1 implementation
- Create API directory structure
- Implement enhanced database schema
- Write unit tests for database layer
- Target: 2 days, 90%+ coverage for database code

**Option B:** Request modifications to specifications
- Clarify any unclear requirements
- Adjust testing thresholds
- Modify implementation timeline

**Option C:** Implement in parallel workstreams
- Workstream 1: Core API (Phases 1-3)
- Workstream 2: Testing suite (parallel with Phases 1-3)
- Workstream 3: Docker + Windmill (Phases 4-5)
- Benefits: Faster delivery, more parallelization
- Requires: Clear interfaces between components

---
## Summary

**Enhanced specifications** add:
1. 🗄️ **SQLite results storage** - 10-100x faster queries, advanced analytics
2. 🧪 **Comprehensive testing** - 150 tests, 85% coverage, quality gates
3. 🔒 **Security testing** - SQL injection, XSS, input validation
4. ⚡ **Performance benchmarks** - Catch regressions early
5. 🚀 **CI/CD pipeline** - Automated quality checks on every commit

**Total effort:** Still ~10 days, but with significantly higher code quality and confidence in deployments.

**Risk mitigation:** Extensive testing catches bugs before production, preventing costly hotfixes.

**Long-term value:** A maintainable, well-tested codebase enables rapid feature development.

---

Ready to proceed? Please provide feedback or approval to begin implementation!
@@ -1,436 +0,0 @@

# AI-Trader API Service - Technical Specifications Summary

## Overview

This directory contains comprehensive technical specifications for transforming the AI-Trader batch simulation system into an API service compatible with Windmill automation.

## Specification Documents

### 1. [API Specification](./api-specification.md)
**Purpose:** Defines all API endpoints, request/response formats, and data models

**Key Contents:**
- **5 REST Endpoints:**
  - `POST /simulate/trigger` - Queue catch-up simulation job
  - `GET /simulate/status/{job_id}` - Poll job progress
  - `GET /simulate/current` - Get latest job
  - `GET /results` - Retrieve simulation results (minimal/full detail)
  - `GET /health` - Service health check
- **Pydantic Models** for type-safe request/response handling
- **Error Handling** strategies and HTTP status codes
- **SQLite Schema** for jobs and job_details tables
- **Configuration Management** via environment variables

**Status Codes:** 200 OK, 202 Accepted, 400 Bad Request, 404 Not Found, 409 Conflict, 503 Service Unavailable

---
### 2. [Job Manager Specification](./job-manager-specification.md)
**Purpose:** Details the job tracking and database layer

**Key Contents:**
- **SQLite Database Schema:**
  - `jobs` table - High-level job metadata
  - `job_details` table - Per model-day execution tracking
- **JobManager Class Interface:**
  - `create_job()` - Create new simulation job
  - `get_job()` - Retrieve job by ID
  - `update_job_status()` - State transitions (pending → running → completed/partial/failed)
  - `get_job_progress()` - Detailed progress metrics
  - `can_start_new_job()` - Concurrency control
- **State Machine:** Job status transitions and business logic
- **Concurrency Control:** Single-job execution enforcement
- **Testing Strategy:** Unit tests with temporary databases

**Key Feature:** Independent model execution - one model's failure doesn't block others (results in "partial" status)

---
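The interface listed above can be sketched in a few lines over `sqlite3`; the method signatures and column set here are assumptions for illustration, not the spec's final definitions:

```python
import json
import sqlite3
import uuid

class JobManager:
    """Sketch of the JobManager interface; signatures and columns are assumed."""

    def __init__(self, db_path: str = ":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs (job_id TEXT PRIMARY KEY, "
            "status TEXT, date_range TEXT, models TEXT)"
        )

    def create_job(self, date_range: list, models: list) -> str:
        job_id = str(uuid.uuid4())
        self.conn.execute(
            "INSERT INTO jobs VALUES (?, 'pending', ?, ?)",
            (job_id, json.dumps(date_range), json.dumps(models)),
        )
        return job_id

    def get_job(self, job_id: str):
        return self.conn.execute(
            "SELECT * FROM jobs WHERE job_id = ?", (job_id,)
        ).fetchone()

    def update_job_status(self, job_id: str, status: str) -> None:
        self.conn.execute(
            "UPDATE jobs SET status = ? WHERE job_id = ?", (status, job_id)
        )

    def can_start_new_job(self) -> bool:
        # Single-job concurrency: nothing may be pending or running
        (count,) = self.conn.execute(
            "SELECT COUNT(*) FROM jobs WHERE status IN ('pending', 'running')"
        ).fetchone()
        return count == 0
```

The `can_start_new_job()` check is what enforces the single-job execution rule: the API refuses a new trigger while any job row is still pending or running.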
### 3. [Background Worker Specification](./worker-specification.md)
**Purpose:** Defines async job execution architecture

**Key Contents:**
- **Execution Pattern:** Date-sequential, Model-parallel
  - All models for Date 1 run in parallel
  - Date 2 starts only after all models finish Date 1
  - Ensures position.jsonl integrity (no concurrent writes)
- **SimulationWorker Class:**
  - Orchestrates job execution
  - Manages date sequencing
  - Handles job-level errors
- **ModelDayExecutor Class:**
  - Executes single model-day simulation
  - Updates job_detail status
  - Isolates runtime configuration
- **RuntimeConfigManager:**
  - Creates temporary runtime_env_{job_id}_{model}_{date}.json files
  - Prevents state collisions between concurrent models
  - Cleans up after execution
- **Error Handling:** Graceful failure (models continue despite peer failures)
- **Logging:** Structured JSON logging with job/model/date context

**Performance:** 3 models × 5 days = ~7-15 minutes (vs. ~22-45 minutes sequential)

---
### 4. [Implementation Specification](./implementation-specifications.md)
**Purpose:** Complete implementation guide covering Agent, Docker, and Windmill

**Key Contents:**

#### Part 1: BaseAgent Refactoring
- **Analysis:** Existing `run_trading_session()` already compatible with API mode
- **Required Changes:** ✅ NONE! Existing code works as-is
- **Worker Integration:** Calls `agent.run_trading_session(date)` directly

#### Part 2: Docker Configuration
- **Modified Dockerfile:** Adds FastAPI dependencies, new entrypoint
- **docker-entrypoint-api.sh:** Starts MCP services → launches uvicorn
- **Health Checks:** Verifies MCP services and database connectivity
- **Volume Mounts:** `./data`, `./configs` for persistence

#### Part 3: Windmill Integration
- **Flow 1: trigger_simulation.ts** - Daily cron triggers API
- **Flow 2: poll_simulation_status.ts** - Polls every 5 min until complete
- **Flow 3: store_simulation_results.py** - Stores results in Windmill DB
- **Dashboard:** Charts and tables showing portfolio performance
- **Workflow Orchestration:** Complete YAML workflow definition

#### Part 4: File Structure
- New `api/` directory with 7 modules
- New `windmill/` directory with scripts and dashboard
- New `docs/` directory (this folder)
- `data/jobs.db` for job tracking

#### Part 5: Implementation Checklist
10-day implementation plan broken into 6 phases

---
## Architecture Highlights

### Request Flow

```
1. Windmill → POST /simulate/trigger
2. API creates job in SQLite (status: pending)
3. API queues BackgroundTask
4. API returns 202 Accepted with job_id
        ↓
5. Worker starts (status: running)
6. For each date sequentially:
     For each model in parallel:
       - Create isolated runtime config
       - Execute agent.run_trading_session(date)
       - Update job_detail status
7. Worker finishes (status: completed/partial/failed)
        ↓
8. Windmill polls GET /simulate/status/{job_id}
9. When complete: Windmill calls GET /results?date=X
10. Windmill stores results in internal DB
11. Windmill dashboard displays performance
```

### Data Flow

```
Input:   configs/default_config.json
           ↓
API:     Calculates date_range (last position → today)
           ↓
Worker:  Executes simulations
           ↓
Output:  data/agent_data/{model}/position/position.jsonl
         data/agent_data/{model}/log/{date}/log.jsonl
         data/jobs.db (job tracking)
           ↓
API:     Reads position.jsonl + calculates P&L
           ↓
Windmill: Stores in internal DB → Dashboard visualization
```

---
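Steps 2-4 of the request flow (create the job row, queue the worker, return immediately) can be sketched without the web framework. This illustration uses a plain thread where the real service would hand the callable to FastAPI's `BackgroundTasks`; names are illustrative:

```python
import sqlite3
import threading
import uuid

def trigger_simulation(conn: sqlite3.Connection, run_job) -> dict:
    """Create a pending job row, queue the worker, and return a 202-style body."""
    job_id = str(uuid.uuid4())
    conn.execute("INSERT INTO jobs (job_id, status) VALUES (?, 'pending')", (job_id,))
    conn.commit()
    # FastAPI's BackgroundTasks plays this role in the real service
    threading.Thread(target=run_job, args=(job_id,), daemon=True).start()
    return {"job_id": job_id, "status": "accepted"}

conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, status TEXT)")

done = threading.Event()
body = trigger_simulation(conn, lambda jid: done.set())
done.wait(timeout=5)
print(body["status"])  # → accepted
```

The key property is that the caller gets its `job_id` back in under a second while the long-running simulation proceeds out of band, exactly as steps 5-7 describe.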
## Key Design Decisions

### 1. Pattern B: Lazy On-Demand Processing
- **Chosen:** Windmill controls simulation timing via API calls
- **Benefit:** Centralized scheduling in Windmill
- **Tradeoff:** First Windmill call of the day triggers a long-running job

### 2. SQLite vs. PostgreSQL
- **Chosen:** SQLite for MVP
- **Rationale:** Low concurrency (1 job at a time), simple deployment
- **Future:** PostgreSQL for production with multiple concurrent jobs

### 3. Date-Sequential, Model-Parallel Execution
- **Chosen:** Dates run sequentially, models run in parallel per date
- **Rationale:** Prevents position.jsonl race conditions, faster than fully sequential
- **Performance:** ~50% faster than sequential (3 models in parallel)

### 4. Independent Model Failures
- **Chosen:** One model's failure doesn't block others
- **Benefit:** Partial results are better than no results
- **Implementation:** Job status becomes "partial" if any model fails

### 5. Minimal BaseAgent Changes
- **Chosen:** No modifications to agent code
- **Rationale:** Existing `run_trading_session()` already serves as the API interface
- **Benefit:** Maintains backward compatibility with batch mode

---
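Decision 3's execution pattern can be sketched with `asyncio`: each date's models are gathered in parallel, and the loop advances to the next date only when the whole batch has finished. The `run_model_day` body is a stand-in for the real `agent.run_trading_session(date)` call:

```python
import asyncio

async def run_model_day(model: str, date: str, results: list) -> None:
    await asyncio.sleep(0)  # stands in for agent.run_trading_session(date)
    results.append((date, model))

async def run_job(dates: list, models: list) -> list:
    """Dates sequentially; all models for one date in parallel."""
    results = []
    for date in dates:
        # gather() returns only after every model finishes this date,
        # so no two dates ever write position.jsonl concurrently
        await asyncio.gather(*(run_model_day(m, date, results) for m in models))
    return results

results = asyncio.run(
    run_job(["2025-01-16", "2025-01-17"], ["gpt-5", "claude-3.7-sonnet"])
)
```

Because the barrier sits between dates rather than between models, the wall-clock time per date is bounded by the slowest model, not the sum of all models.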
## Implementation Prerequisites

### Required Environment Variables
```bash
OPENAI_API_BASE=...
OPENAI_API_KEY=...
ALPHAADVANTAGE_API_KEY=...
JINA_API_KEY=...
RUNTIME_ENV_PATH=/app/data/runtime_env.json
MATH_HTTP_PORT=8000
SEARCH_HTTP_PORT=8001
TRADE_HTTP_PORT=8002
GETPRICE_HTTP_PORT=8003
API_HOST=0.0.0.0
API_PORT=8080
```

### Required Python Packages (new)
```
fastapi==0.109.0
uvicorn[standard]==0.27.0
pydantic==2.5.3
```

### Docker Requirements
- Docker Engine 20.10+
- Docker Compose 2.0+
- 2GB RAM minimum for container
- 10GB disk space for data

### Windmill Requirements
- Windmill instance (self-hosted or cloud)
- Network access from Windmill to AI-Trader API
- Windmill CLI for deployment (optional)

---
## Testing Strategy

### Unit Tests
- `tests/test_job_manager.py` - Database operations
- `tests/test_worker.py` - Job execution logic
- `tests/test_executor.py` - Model-day execution

### Integration Tests
- `tests/test_api_endpoints.py` - FastAPI endpoint behavior
- `tests/test_end_to_end.py` - Full workflow (trigger → execute → retrieve)

### Manual Testing
- Docker container startup
- Health check endpoint
- Windmill workflow execution
- Dashboard visualization

---
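The unit tests for the database layer follow the temporary-database strategy mentioned earlier: each test gets a throwaway SQLite file so tests never touch `data/jobs.db`. A minimal sketch, with the schema reduced to the columns the assertion needs:

```python
import os
import sqlite3
import tempfile

def make_temp_db() -> str:
    """Create a throwaway jobs database, as each unit test would."""
    path = os.path.join(tempfile.mkdtemp(), "jobs.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, status TEXT)")
    conn.commit()
    conn.close()
    return path

def test_create_job():
    db = make_temp_db()
    conn = sqlite3.connect(db)
    conn.execute("INSERT INTO jobs VALUES ('job-1', 'pending')")
    status = conn.execute(
        "SELECT status FROM jobs WHERE job_id = 'job-1'"
    ).fetchone()[0]
    assert status == "pending"

test_create_job()  # also runnable without pytest
```

Under pytest, `make_temp_db` would typically become a fixture (e.g. built on `tmp_path`) so cleanup is automatic.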
## Performance Expectations

### Single Model-Day Execution
- **Duration:** 30-60 seconds (varies by AI model latency)
- **Bottlenecks:** AI API calls, MCP tool latency

### Multi-Model Job
- **Example:** 3 models × 5 days = 15 model-days
- **Parallel Execution:** ~7-15 minutes
- **Sequential Execution:** ~22-45 minutes
- **Speedup:** ~3x (number of models)

### API Response Times
- `/simulate/trigger`: < 1 second (just queues job)
- `/simulate/status`: < 100ms (SQLite query)
- `/results?detail=minimal`: < 500ms (file read + JSON parsing)
- `/results?detail=full`: < 2 seconds (parse log files)

---
## Security Considerations

### MVP Security
- **Network Isolation:** Docker network (no public exposure)
- **No Authentication:** Assumes the Windmill → API link is a trusted network

### Future Enhancements
- API key authentication (`X-API-Key` header)
- Rate limiting per client
- HTTPS/TLS encryption
- Input sanitization for path traversal prevention

---
## Deployment Steps

### 1. Build Docker Image
```bash
docker-compose build
```

### 2. Start API Service
```bash
docker-compose up -d
```

### 3. Verify Health
```bash
curl http://localhost:8080/health
```

### 4. Test Trigger
```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -H "Content-Type: application/json" \
  -d '{"config_path": "configs/default_config.json"}'
```

### 5. Deploy Windmill Scripts
```bash
wmill script push windmill/trigger_simulation.ts
wmill script push windmill/poll_simulation_status.ts
wmill script push windmill/store_simulation_results.py
```

### 6. Create Windmill Workflow
- Import `windmill/daily_simulation_workflow.yaml`
- Configure resource `ai_trader_api` with API URL
- Set cron schedule (daily 6 AM)

### 7. Create Windmill Dashboard
- Import `windmill/dashboard.json`
- Verify data visualization

---
## Troubleshooting Guide

### Issue: Health check fails
**Symptoms:** `curl http://localhost:8080/health` returns 503

**Possible Causes:**
1. MCP services not running
2. Database file permission error
3. API server not started

**Solutions:**
```bash
# Check MCP services
docker-compose exec ai-trader curl http://localhost:8000/health

# Check API logs
docker-compose logs -f ai-trader

# Restart container
docker-compose restart
```

### Issue: Job stuck in "running" status
**Symptoms:** Job never completes, status remains "running"

**Possible Causes:**
1. Agent execution crashed
2. Model API timeout
3. Worker process died

**Solutions:**
```bash
# Check job details for error messages
curl http://localhost:8080/simulate/status/{job_id}

# Check container logs
docker-compose logs -f ai-trader

# If API restarted, stale jobs are marked as failed on startup
docker-compose restart
```

### Issue: Windmill can't reach API
**Symptoms:** Connection refused from Windmill scripts

**Solutions:**
- Verify Windmill and AI-Trader are on the same Docker network
- Check firewall rules
- Use the container name (ai-trader) instead of localhost in the Windmill resource
- Verify the API_PORT environment variable

---
## Migration from Batch Mode

### For Users Currently Running Batch Mode

**Option 1: Dual Mode (Recommended)**
- Keep existing `main.py` for manual testing
- Add new API mode for production automation
- Use different config files for each mode

**Option 2: API-Only**
- Replace batch execution entirely
- All simulations via API calls
- More consistent with production workflow

### Migration Checklist
- [ ] Backup existing `data/` directory
- [ ] Update `.env` with API configuration
- [ ] Test API mode in a separate environment first
- [ ] Gradually migrate Windmill workflows
- [ ] Monitor logs for errors
- [ ] Validate results match batch mode output

---
## Next Steps

1. **Review Specifications**
   - Read all 4 specification documents
   - Ask clarifying questions
   - Approve design before implementation

2. **Implementation Phase 1** (Days 1-2)
   - Set up `api/` directory structure
   - Implement database and job_manager
   - Write unit tests

3. **Implementation Phase 2** (Days 3-4)
   - Implement worker and executor
   - Test with mock agents

4. **Implementation Phase 3** (Days 5-6)
   - Implement FastAPI endpoints
   - Test with Postman/curl

5. **Implementation Phase 4** (Day 7)
   - Docker integration
   - End-to-end testing

6. **Implementation Phase 5** (Days 8-9)
   - Windmill integration
   - Dashboard creation

7. **Implementation Phase 6** (Day 10)
   - Final testing
   - Documentation

---
## Questions or Feedback?

Please review all specifications and provide feedback on:
1. API endpoint design
2. Database schema
3. Execution pattern (date-sequential, model-parallel)
4. Error handling approach
5. Windmill integration workflow
6. Any concerns or suggested improvements

**Ready to proceed with implementation?** Confirm approval of specifications to begin Phase 1.
@@ -1,837 +0,0 @@

# AI-Trader API Service - Technical Specification

## 1. API Endpoints Specification

### 1.1 POST /simulate/trigger

**Purpose:** Trigger a catch-up simulation from the last completed date to the most recent trading day.

**Request:**
```http
POST /simulate/trigger HTTP/1.1
Content-Type: application/json

```

**Response (202 Accepted):**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "accepted",
  "date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
  "models": ["claude-3.7-sonnet", "gpt-5"],
  "created_at": "2025-01-20T14:30:00Z",
  "message": "Simulation job queued successfully"
}
```

**Response (200 OK - Job Already Running):**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "running",
  "date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
  "models": ["claude-3.7-sonnet", "gpt-5"],
  "progress": {
    "total_model_days": 6,
    "completed": 3,
    "failed": 0,
    "current": {
      "date": "2025-01-17",
      "model": "gpt-5"
    }
  },
  "created_at": "2025-01-20T14:25:00Z",
  "message": "Simulation already in progress"
}
```

**Response (200 OK - Already Up To Date):**
```json
{
  "status": "current",
  "message": "Simulation already up-to-date",
  "last_simulation_date": "2025-01-20",
  "next_trading_day": "2025-01-21"
}
```

**Response (409 Conflict):**
```json
{
  "error": "conflict",
  "message": "Different simulation already running",
  "current_job_id": "previous-job-uuid",
  "current_date_range": ["2025-01-10", "2025-01-15"]
}
```

**Business Logic:**
1. Load configuration from `config_path` (or default)
2. Determine last completed date from each model's `position.jsonl`
3. Calculate date range: `max(last_dates) + 1 day` → `most_recent_trading_day`
4. Filter for weekdays only (Monday-Friday)
5. If date_range is empty, return "already up-to-date"
6. Check for existing jobs with same date range → return existing job
7. Check for running jobs with different date range → return 409
8. Create new job in SQLite with status=`pending`
9. Queue background task to execute simulation
10. Return 202 with job details

---
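Steps 3-4 of the business logic (every day after the last completed date, weekdays only) can be sketched as:

```python
from datetime import date, timedelta

def catch_up_range(last_completed: date, today: date) -> list:
    """Weekday dates from the day after last_completed through today."""
    days = []
    d = last_completed + timedelta(days=1)
    while d <= today:
        if d.weekday() < 5:  # Monday=0 .. Friday=4
            days.append(d.isoformat())
        d += timedelta(days=1)
    return days

print(catch_up_range(date(2025, 1, 15), date(2025, 1, 20)))
# → ['2025-01-16', '2025-01-17', '2025-01-20']
```

The example reproduces the `date_range` used throughout the responses above (the 18th and 19th fall on a weekend). Note this matches the spec's weekday-only rule; it does not exclude market holidays.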
### 1.2 GET /simulate/status/{job_id}

**Purpose:** Poll the status and progress of a simulation job.

**Request:**
```http
GET /simulate/status/550e8400-e29b-41d4-a716-446655440000 HTTP/1.1
```

**Response (200 OK - Running):**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "running",
  "date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
  "models": ["claude-3.7-sonnet", "gpt-5"],
  "progress": {
    "total_model_days": 6,
    "completed": 3,
    "failed": 0,
    "current": {
      "date": "2025-01-17",
      "model": "gpt-5"
    },
    "details": [
      {"date": "2025-01-16", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 45.2},
      {"date": "2025-01-16", "model": "gpt-5", "status": "completed", "duration_seconds": 38.7},
      {"date": "2025-01-17", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 42.1},
      {"date": "2025-01-17", "model": "gpt-5", "status": "running", "duration_seconds": null}
    ]
  },
  "created_at": "2025-01-20T14:25:00Z",
  "updated_at": "2025-01-20T14:27:15Z"
}
```

**Response (200 OK - Completed):**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
  "models": ["claude-3.7-sonnet", "gpt-5"],
  "progress": {
    "total_model_days": 6,
    "completed": 6,
    "failed": 0,
    "details": [
      {"date": "2025-01-16", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 45.2},
      {"date": "2025-01-16", "model": "gpt-5", "status": "completed", "duration_seconds": 38.7},
      {"date": "2025-01-17", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 42.1},
      {"date": "2025-01-17", "model": "gpt-5", "status": "completed", "duration_seconds": 40.3},
      {"date": "2025-01-20", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 43.8},
      {"date": "2025-01-20", "model": "gpt-5", "status": "completed", "duration_seconds": 39.1}
    ]
  },
  "created_at": "2025-01-20T14:25:00Z",
  "completed_at": "2025-01-20T14:29:45Z",
  "total_duration_seconds": 285.0
}
```

**Response (200 OK - Partial Failure):**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "partial",
  "date_range": ["2025-01-16", "2025-01-17", "2025-01-20"],
  "models": ["claude-3.7-sonnet", "gpt-5"],
  "progress": {
    "total_model_days": 6,
    "completed": 4,
    "failed": 2,
    "details": [
      {"date": "2025-01-16", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 45.2},
      {"date": "2025-01-16", "model": "gpt-5", "status": "completed", "duration_seconds": 38.7},
      {"date": "2025-01-17", "model": "claude-3.7-sonnet", "status": "failed", "error": "MCP service timeout after 3 retries", "duration_seconds": null},
      {"date": "2025-01-17", "model": "gpt-5", "status": "completed", "duration_seconds": 40.3},
      {"date": "2025-01-20", "model": "claude-3.7-sonnet", "status": "completed", "duration_seconds": 43.8},
      {"date": "2025-01-20", "model": "gpt-5", "status": "failed", "error": "AI model API timeout", "duration_seconds": null}
    ]
  },
  "created_at": "2025-01-20T14:25:00Z",
  "completed_at": "2025-01-20T14:29:45Z"
}
```

**Response (404 Not Found):**
```json
{
  "error": "not_found",
  "message": "Job not found",
  "job_id": "invalid-job-id"
}
```

**Business Logic:**
1. Query SQLite jobs table for job_id
2. If not found, return 404
3. Return job metadata + progress from job_details table
4. Status transitions: `pending` → `running` → `completed`/`partial`/`failed`

---
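A caller such as a Windmill flow polls this endpoint until the job leaves the non-terminal states. A minimal client-side sketch, where the `fetch_status` callable stands in for the HTTP GET:

```python
import time

TERMINAL = {"completed", "partial", "failed"}

def wait_for_job(fetch_status, job_id: str, interval: float = 0.0,
                 max_polls: int = 100) -> dict:
    """Poll until the job reaches a terminal status or the poll budget runs out."""
    for _ in range(max_polls):
        status = fetch_status(job_id)  # e.g. GET /simulate/status/{job_id}
        if status["status"] in TERMINAL:
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")

# Fake server for illustration: running twice, then completed
responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed"},
])
final = wait_for_job(lambda job_id: next(responses), "550e8400")
print(final["status"])  # → completed
```

In production the interval would be minutes rather than zero (the Windmill flow described earlier polls every 5 minutes), and the poll budget bounds how long a stuck job can block the workflow.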
### 1.3 GET /simulate/current

**Purpose:** Get the most recent simulation job (for Windmill to discover the job_id).

**Request:**
```http
GET /simulate/current HTTP/1.1
```

**Response (200 OK):**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "running",
  "date_range": ["2025-01-16", "2025-01-17"],
  "models": ["claude-3.7-sonnet", "gpt-5"],
  "progress": {
    "total_model_days": 4,
    "completed": 2,
    "failed": 0
  },
  "created_at": "2025-01-20T14:25:00Z"
}
```

**Response (404 Not Found):**
```json
{
  "error": "not_found",
  "message": "No simulation jobs found"
}
```

**Business Logic:**
1. Query SQLite: `SELECT * FROM jobs ORDER BY created_at DESC LIMIT 1`
2. Return job details with progress summary

---
### 1.4 GET /results
|
||||
|
||||
**Purpose:** Retrieve simulation results for a specific date and model.
|
||||
|
||||
**Request:**
|
||||
```http
|
||||
GET /results?date=2025-01-15&model=gpt-5&detail=minimal HTTP/1.1
|
||||
```
|
||||
|
||||
**Query Parameters:**
|
||||
- `date` (required): Trading date in YYYY-MM-DD format
|
||||
- `model` (optional): Model signature (if omitted, returns all models)
|
||||
- `detail` (optional): Response detail level
|
||||
- `minimal` (default): Positions + daily P&L
|
||||
- `full`: + trade history + AI reasoning logs + tool usage stats
|
||||
|
||||
**Response (200 OK - minimal):**
|
||||
```json
|
||||
{
|
||||
"date": "2025-01-15",
|
||||
"results": [
|
||||
{
|
||||
"model": "gpt-5",
|
||||
"positions": {
|
||||
"AAPL": 10,
|
||||
"MSFT": 5,
|
||||
"NVDA": 0,
|
||||
"CASH": 8500.00
|
||||
},
|
||||
"daily_pnl": {
|
||||
"profit": 150.50,
|
||||
"return_pct": 1.5,
|
||||
"portfolio_value": 10150.50
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Response (200 OK - full):**
|
||||
```json
|
||||
{
|
||||
"date": "2025-01-15",
|
||||
"results": [
|
||||
{
|
||||
"model": "gpt-5",
|
||||
"positions": {
|
||||
"AAPL": 10,
|
||||
"MSFT": 5,
|
||||
"CASH": 8500.00
|
||||
},
|
||||
"daily_pnl": {
|
||||
"profit": 150.50,
|
||||
"return_pct": 1.5,
|
||||
"portfolio_value": 10150.50
|
||||
},
|
||||
"trades": [
|
||||
{
|
||||
"id": 1,
|
||||
"action": "buy",
|
||||
"symbol": "AAPL",
|
||||
"amount": 10,
|
||||
"price": 255.88,
|
||||
"total": 2558.80
|
||||
}
|
||||
],
|
||||
"ai_reasoning": {
|
||||
"total_steps": 15,
|
||||
"stop_signal_received": true,
|
||||
"reasoning_summary": "Market analysis indicated strong buy signal for AAPL...",
|
||||
"tool_usage": {
|
||||
"search": 3,
|
||||
"get_price": 5,
|
||||
"math": 2,
|
||||
"trade": 1
|
||||
}
|
||||
},
|
||||
"log_file_path": "data/agent_data/gpt-5/log/2025-01-15/log.jsonl"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Response (400 Bad Request):**
|
||||
```json
|
||||
{
|
||||
"error": "invalid_date",
|
||||
"message": "Date must be in YYYY-MM-DD format"
|
||||
}
|
||||
```
|
||||
|
||||
**Response (404 Not Found):**
|
||||
```json
|
||||
{
|
||||
"error": "no_data",
|
||||
"message": "No simulation data found for date 2025-01-15 and model gpt-5"
|
||||
}
|
||||
```
|
||||
|
||||
**Business Logic:**

1. Validate date format
2. Read `position.jsonl` for specified model(s) and date
3. For `detail=minimal`: Return positions + calculate daily P&L
4. For `detail=full`:
   - Parse `log.jsonl` to extract reasoning summary
   - Count tool usage from log messages
   - Extract trades from position file
5. Return aggregated results

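The tool-usage count in step 4 can be tallied directly from the log file. A minimal sketch, assuming `log.jsonl` holds one JSON object per line and that tool messages carry `role` and `tool_name` fields (the exact field names are an assumption, not confirmed by this spec):

```python
import json
from collections import Counter
from pathlib import Path

def count_tool_usage(log_path: Path) -> dict:
    """Tally tool calls from a log.jsonl file (one JSON object per line)."""
    counts = Counter()
    for line in log_path.read_text().splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # Assumed shape: tool messages have role == "tool" and a "tool_name" field.
        if record.get("role") == "tool" and record.get("tool_name"):
            counts[record["tool_name"]] += 1
    return dict(counts)
```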
---

### 1.5 GET /health

**Purpose:** Health check endpoint for Docker and monitoring.

**Request:**

```http
GET /health HTTP/1.1
```

**Response (200 OK):**

```json
{
  "status": "healthy",
  "timestamp": "2025-01-20T14:30:00Z",
  "services": {
    "mcp_math": {"status": "up", "url": "http://localhost:8000/mcp"},
    "mcp_search": {"status": "up", "url": "http://localhost:8001/mcp"},
    "mcp_trade": {"status": "up", "url": "http://localhost:8002/mcp"},
    "mcp_getprice": {"status": "up", "url": "http://localhost:8003/mcp"}
  },
  "storage": {
    "data_directory": "/app/data",
    "writable": true,
    "free_space_mb": 15234
  },
  "database": {
    "status": "connected",
    "path": "/app/data/jobs.db"
  }
}
```

**Response (503 Service Unavailable):**

```json
{
  "status": "unhealthy",
  "timestamp": "2025-01-20T14:30:00Z",
  "services": {
    "mcp_math": {"status": "down", "url": "http://localhost:8000/mcp", "error": "Connection refused"},
    "mcp_search": {"status": "up", "url": "http://localhost:8001/mcp"},
    "mcp_trade": {"status": "up", "url": "http://localhost:8002/mcp"},
    "mcp_getprice": {"status": "up", "url": "http://localhost:8003/mcp"}
  },
  "storage": {
    "data_directory": "/app/data",
    "writable": true
  },
  "database": {
    "status": "connected"
  }
}
```

---

## 2. Data Models

### 2.1 SQLite Schema

**Table: jobs**

```sql
CREATE TABLE jobs (
    job_id TEXT PRIMARY KEY,
    config_path TEXT NOT NULL,
    status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
    date_range TEXT NOT NULL,             -- JSON array of dates
    models TEXT NOT NULL,                 -- JSON array of model signatures
    created_at TEXT NOT NULL,
    started_at TEXT,
    completed_at TEXT,
    total_duration_seconds REAL,
    error TEXT
);

CREATE INDEX idx_jobs_status ON jobs(status);
CREATE INDEX idx_jobs_created_at ON jobs(created_at DESC);
```

**Table: job_details**

```sql
CREATE TABLE job_details (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT NOT NULL,
    date TEXT NOT NULL,
    model TEXT NOT NULL,
    status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
    started_at TEXT,
    completed_at TEXT,
    duration_seconds REAL,
    error TEXT,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);

CREATE INDEX idx_job_details_job_id ON job_details(job_id);
CREATE INDEX idx_job_details_status ON job_details(status);
```

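The schema above can be applied at startup with the standard-library `sqlite3` module. A sketch (the `init_db` helper name is hypothetical); note that foreign-key enforcement must be enabled per connection for the `ON DELETE CASCADE` clauses to take effect:

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id TEXT PRIMARY KEY,
    config_path TEXT NOT NULL,
    status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
    date_range TEXT NOT NULL,
    models TEXT NOT NULL,
    created_at TEXT NOT NULL,
    started_at TEXT,
    completed_at TEXT,
    total_duration_seconds REAL,
    error TEXT
);
CREATE TABLE IF NOT EXISTS job_details (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT NOT NULL,
    date TEXT NOT NULL,
    model TEXT NOT NULL,
    status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
    started_at TEXT,
    completed_at TEXT,
    duration_seconds REAL,
    error TEXT,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
"""

def init_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE CASCADE
    conn.executescript(SCHEMA)
    return conn
```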
### 2.2 Pydantic Models

**Request Models:**

```python
from pydantic import BaseModel, Field
from typing import Optional, Literal

class TriggerSimulationRequest(BaseModel):
    config_path: Optional[str] = Field(
        default="configs/default_config.json",
        description="Path to configuration file",
    )

class ResultsQueryParams(BaseModel):
    date: str = Field(..., pattern=r"^\d{4}-\d{2}-\d{2}$", description="Date in YYYY-MM-DD format")
    model: Optional[str] = Field(None, description="Model signature filter")
    detail: Literal["minimal", "full"] = Field(default="minimal", description="Response detail level")
```

**Response Models:**

```python
class JobProgress(BaseModel):
    total_model_days: int
    completed: int
    failed: int
    current: Optional[dict] = None  # {"date": str, "model": str}
    details: Optional[list] = None  # List of JobDetailResponse

class TriggerSimulationResponse(BaseModel):
    job_id: str
    status: str
    date_range: list[str]
    models: list[str]
    created_at: str
    message: str
    progress: Optional[JobProgress] = None

class JobStatusResponse(BaseModel):
    job_id: str
    status: str
    date_range: list[str]
    models: list[str]
    progress: JobProgress
    created_at: str
    updated_at: Optional[str] = None
    completed_at: Optional[str] = None
    total_duration_seconds: Optional[float] = None

class DailyPnL(BaseModel):
    profit: float
    return_pct: float
    portfolio_value: float

class Trade(BaseModel):
    id: int
    action: str
    symbol: str
    amount: int
    price: Optional[float] = None
    total: Optional[float] = None

class AIReasoning(BaseModel):
    total_steps: int
    stop_signal_received: bool
    reasoning_summary: str
    tool_usage: dict[str, int]

class ModelResult(BaseModel):
    model: str
    positions: dict[str, float]
    daily_pnl: DailyPnL
    trades: Optional[list[Trade]] = None
    ai_reasoning: Optional[AIReasoning] = None
    log_file_path: Optional[str] = None

class ResultsResponse(BaseModel):
    date: str
    results: list[ModelResult]
```

---

## 3. Configuration Management

### 3.1 Environment Variables

Required environment variables remain the same as batch mode:

```bash
# OpenAI API Configuration
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_API_KEY=sk-...

# Alpha Vantage API
ALPHAADVANTAGE_API_KEY=...

# Jina Search API
JINA_API_KEY=...

# Runtime Config Path (now shared by API and worker)
RUNTIME_ENV_PATH=/app/data/runtime_env.json

# MCP Service Ports
MATH_HTTP_PORT=8000
SEARCH_HTTP_PORT=8001
TRADE_HTTP_PORT=8002
GETPRICE_HTTP_PORT=8003

# API Server Configuration
API_HOST=0.0.0.0
API_PORT=8080

# Job Configuration
MAX_CONCURRENT_JOBS=1  # Only one simulation job at a time
```

### 3.2 Runtime State Management

**Challenge:** Multiple model-days running concurrently need isolated `runtime_env.json` state.

**Solution:** Per-job runtime config files

- `runtime_env_base.json` - Template
- `runtime_env_{job_id}_{model}_{date}.json` - Job-specific runtime config
- Worker passes custom `RUNTIME_ENV_PATH` to each simulation execution

**Modified `write_config_value()` and `get_config_value()`:**

- Accept optional `runtime_path` parameter
- Worker manages lifecycle: create → use → cleanup

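The create step of that lifecycle can be sketched as a small helper that copies the template to a job-specific path. This is an illustrative sketch; the `make_runtime_env_path` name and the plain file-copy semantics are assumptions, not part of the spec:

```python
import shutil
from pathlib import Path

def make_runtime_env_path(base: Path, job_id: str, model: str, date: str) -> Path:
    """Derive an isolated runtime config for one model-day (hypothetical helper).

    Copies runtime_env_base.json to runtime_env_{job_id}_{model}_{date}.json
    so concurrent model-days never share mutable state.
    """
    target = base.parent / f"runtime_env_{job_id}_{model}_{date}.json"
    shutil.copyfile(base, target)
    return target
```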
---

## 4. Error Handling

### 4.1 Error Response Format

All errors follow this structure:

```json
{
  "error": "error_code",
  "message": "Human-readable error description",
  "details": {
    // Optional additional context
  }
}
```

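A tiny helper keeps every endpoint's error payload in this shape (the `error_response` name is hypothetical):

```python
from typing import Optional

def error_response(error: str, message: str, details: Optional[dict] = None) -> dict:
    """Build an error payload matching the documented structure."""
    body = {"error": error, "message": message}
    if details is not None:
        body["details"] = details  # optional additional context
    return body
```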
### 4.2 HTTP Status Codes

- `200 OK` - Successful request
- `202 Accepted` - Job queued successfully
- `400 Bad Request` - Invalid input parameters
- `404 Not Found` - Resource not found (job, results)
- `409 Conflict` - Concurrent job conflict
- `500 Internal Server Error` - Unexpected server error
- `503 Service Unavailable` - Health check failed

### 4.3 Retry Strategy for Workers

Models run independently - failure of one model doesn't block others:

```python
async def run_model_day(job_id: str, date: str, model_config: dict):
    model = model_config["signature"]  # model identifier taken from the config
    try:
        # Execute simulation for this model-day
        await agent.run_trading_session(date)
        update_job_detail_status(job_id, date, model, "completed")
    except Exception as e:
        # Log error, update status to failed, continue with next model-day
        update_job_detail_status(job_id, date, model, "failed", error=str(e))
        # Do NOT raise - let other models continue
```

---

## 5. Concurrency & Locking

### 5.1 Job Execution Policy

**Rule:** Maximum 1 running job at a time (configurable via `MAX_CONCURRENT_JOBS`)

**Enforcement:**

```python
def can_start_new_job() -> bool:
    running_jobs = db.query(
        "SELECT COUNT(*) FROM jobs WHERE status IN ('pending', 'running')"
    ).fetchone()[0]
    return running_jobs < MAX_CONCURRENT_JOBS
```

### 5.2 Position File Concurrency

**Challenge:** Multiple model-days writing to the same model's `position.jsonl`

**Solution:** Sequential execution per model

```python
# For each date in date_range:
#     For each model in parallel:   ← Models run in parallel
#         Execute model-day         ← Dates for the same model run sequentially
```

**Execution Pattern:**

```
Date 2025-01-16:
  - Model A (running)
  - Model B (running)
  - Model C (running)

Date 2025-01-17:   ← Starts only after all models finish 2025-01-16
  - Model A (running)
  - Model B (running)
  - Model C (running)
```

**Rationale:**

- Models write to different position files → No conflict
- Same model's dates run sequentially → No race condition on position.jsonl
- Date-level parallelism across models → Faster overall execution

---

## 6. Performance Considerations

### 6.1 Execution Time Estimates

Based on the current implementation:

- Single model-day: ~30-60 seconds (depends on AI model latency + tool calls)
- 3 models × 5 days = 15 model-days; with models running in parallel per date (see §5.2), wall-clock time ≈ 5 × (30-60 s) ≈ 2.5-5 minutes, versus 7.5-15 minutes fully sequential

### 6.2 Timeout Configuration

**API Request Timeout:**

- `/simulate/trigger`: 10 seconds (just queue job)
- `/simulate/status`: 5 seconds (read from DB)
- `/results`: 30 seconds (file I/O + parsing)

**Worker Timeout:**

- Per model-day: 5 minutes (inherited from `max_retries` × `base_delay`)
- Entire job: No timeout (job runs until all model-days complete or fail)

### 6.3 Optimization Opportunities (Future)

1. **Results caching:** Store computed daily_pnl in SQLite to avoid recomputation
2. **Parallel date execution:** If position file locking is implemented, run dates in parallel
3. **Streaming responses:** For `/simulate/status`, use SSE to push updates instead of polling

---

## 7. Logging & Observability

### 7.1 Structured Logging

All API logs use JSON format:

```json
{
  "timestamp": "2025-01-20T14:30:00Z",
  "level": "INFO",
  "logger": "api.worker",
  "message": "Starting simulation for model-day",
  "job_id": "550e8400-...",
  "date": "2025-01-16",
  "model": "gpt-5"
}
```

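One way to produce such records with the standard `logging` module is a custom formatter. A sketch; the context fields (`job_id`, `date`, `model`) are assumed to arrive via the `extra=` argument of the logging calls:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render log records as single-line JSON matching the format above."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Context passed as logger.info("...", extra={"job_id": ..., ...})
        for key in ("job_id", "date", "model"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)
```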
### 7.2 Log Levels

- `DEBUG` - Detailed execution flow (tool calls, price fetches)
- `INFO` - Job lifecycle events (created, started, completed)
- `WARNING` - Recoverable errors (retry attempts)
- `ERROR` - Model-day failures (logged but job continues)
- `CRITICAL` - System failures (MCP services down, DB corruption)

### 7.3 Audit Trail

All job state transitions logged to `api_audit.log`:

```json
{
  "timestamp": "2025-01-20T14:30:00Z",
  "event": "job_created",
  "job_id": "550e8400-...",
  "user": "windmill-service", // Future: from auth header
  "details": {"date_range": [...], "models": [...]}
}
```

---

## 8. Security Considerations

### 8.1 Authentication (Future)

For MVP, the API relies on network isolation (Docker network). Future enhancements:

- API key authentication via header: `X-API-Key: <token>`
- JWT tokens for Windmill integration
- Rate limiting per API key

### 8.2 Input Validation

- All date parameters validated with regex: `^\d{4}-\d{2}-\d{2}$`
- Config paths restricted to `configs/` directory (prevent path traversal)
- Model signatures sanitized (alphanumeric + hyphens only)

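The path-traversal restriction can be enforced by resolving the supplied path and checking containment. A sketch (the helper name is hypothetical; `Path.is_relative_to` requires Python 3.9+):

```python
from pathlib import Path

CONFIG_ROOT = Path("configs").resolve()

def safe_config_path(user_value: str) -> Path:
    """Resolve a user-supplied config path, rejecting escapes from configs/."""
    candidate = (CONFIG_ROOT / user_value).resolve()
    if not candidate.is_relative_to(CONFIG_ROOT):
        raise ValueError("config_path must stay inside the configs/ directory")
    return candidate
```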
### 8.3 File Access Controls

- Results API only reads from `data/agent_data/` directory
- Config API only reads from `configs/` directory
- No arbitrary file read via API parameters

---

## 9. Deployment Configuration

### 9.1 Docker Compose

```yaml
version: '3.8'

services:
  ai-trader-api:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    volumes:
      - ./data:/app/data
      - ./configs:/app/configs
    env_file:
      - .env
    environment:
      - MODE=api
      - API_PORT=8080
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    restart: unless-stopped
```

### 9.2 Dockerfile Modifications

```dockerfile
# ... existing layers ...

# Install API dependencies
COPY requirements-api.txt /app/
RUN pip install --no-cache-dir -r requirements-api.txt

# Copy API application code
COPY api/ /app/api/

# Copy entrypoint script
COPY docker-entrypoint.sh /app/
RUN chmod +x /app/docker-entrypoint.sh

EXPOSE 8080

CMD ["/app/docker-entrypoint.sh"]
```

### 9.3 Entrypoint Script

```bash
#!/bin/bash
set -e

echo "Starting MCP services..."
cd /app/agent_tools
python start_mcp_services.py &
MCP_PID=$!

# Register cleanup before the blocking server call so MCP services
# are stopped whenever the script exits
trap "kill $MCP_PID 2>/dev/null || true" EXIT

echo "Waiting for MCP services to be ready..."
sleep 10

echo "Starting API server..."
cd /app
uvicorn api.main:app --host ${API_HOST:-0.0.0.0} --port ${API_PORT:-8080} --workers 1
```

---

## 10. API Versioning (Future)

For v2 and beyond:

- URL prefix: `/api/v1/simulate/trigger`, `/api/v2/simulate/trigger`
- Header-based: `Accept: application/vnd.ai-trader.v1+json`

MVP uses unversioned endpoints (implied v1).

---

## Next Steps

After reviewing this specification, we'll proceed to:

1. **Component 2:** Job Manager & SQLite Schema Implementation
2. **Component 3:** Background Worker Architecture
3. **Component 4:** BaseAgent Refactoring for Single-Day Execution
4. **Component 5:** Docker & Deployment Configuration
5. **Component 6:** Windmill Integration Flows

Please review this API specification and provide feedback or approval to continue.
@@ -1,911 +0,0 @@
# Enhanced Database Specification - Results Storage in SQLite

## 1. Overview

**Change from Original Spec:** Instead of reading `position.jsonl` on-demand, simulation results are written to SQLite during execution for faster retrieval and queryability.

**Benefits:**

- **Faster `/results` endpoint** - No file I/O on every request
- **Advanced querying** - Filter by date range, model, performance metrics
- **Aggregations** - Portfolio timeseries, leaderboards, statistics
- **Data integrity** - Single source of truth with ACID guarantees
- **Backup/restore** - Single database file instead of scattered JSONL files

**Tradeoff:** Additional database writes during simulation (minimal performance impact)

---

## 2. Enhanced Database Schema

### 2.1 Complete Table Structure

```sql
-- Job tracking tables (from original spec)
CREATE TABLE IF NOT EXISTS jobs (
    job_id TEXT PRIMARY KEY,
    config_path TEXT NOT NULL,
    status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
    date_range TEXT NOT NULL,
    models TEXT NOT NULL,
    created_at TEXT NOT NULL,
    started_at TEXT,
    completed_at TEXT,
    total_duration_seconds REAL,
    error TEXT
);

CREATE TABLE IF NOT EXISTS job_details (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT NOT NULL,
    date TEXT NOT NULL,
    model TEXT NOT NULL,
    status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
    started_at TEXT,
    completed_at TEXT,
    duration_seconds REAL,
    error TEXT,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);

-- NEW: Simulation results storage
CREATE TABLE IF NOT EXISTS positions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT NOT NULL,
    date TEXT NOT NULL,
    model TEXT NOT NULL,
    action_id INTEGER NOT NULL,           -- Sequence number within that day
    action_type TEXT CHECK(action_type IN ('buy', 'sell', 'no_trade')),
    symbol TEXT,
    amount INTEGER,
    price REAL,
    cash REAL NOT NULL,
    portfolio_value REAL NOT NULL,
    daily_profit REAL,
    daily_return_pct REAL,
    cumulative_profit REAL,
    cumulative_return_pct REAL,
    created_at TEXT NOT NULL,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);

CREATE TABLE IF NOT EXISTS holdings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    position_id INTEGER NOT NULL,
    symbol TEXT NOT NULL,
    quantity INTEGER NOT NULL,
    FOREIGN KEY (position_id) REFERENCES positions(id) ON DELETE CASCADE
);

-- NEW: AI reasoning logs (optional - for detail=full)
CREATE TABLE IF NOT EXISTS reasoning_logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT NOT NULL,
    date TEXT NOT NULL,
    model TEXT NOT NULL,
    step_number INTEGER NOT NULL,
    timestamp TEXT NOT NULL,
    role TEXT CHECK(role IN ('user', 'assistant', 'tool')),
    content TEXT,
    tool_name TEXT,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);

-- NEW: Tool usage statistics
CREATE TABLE IF NOT EXISTS tool_usage (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT NOT NULL,
    date TEXT NOT NULL,
    model TEXT NOT NULL,
    tool_name TEXT NOT NULL,
    call_count INTEGER NOT NULL DEFAULT 1,
    total_duration_seconds REAL,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);

-- Indexes for performance
CREATE INDEX IF NOT EXISTS idx_jobs_status ON jobs(status);
CREATE INDEX IF NOT EXISTS idx_jobs_created_at ON jobs(created_at DESC);
CREATE INDEX IF NOT EXISTS idx_job_details_job_id ON job_details(job_id);
CREATE INDEX IF NOT EXISTS idx_job_details_status ON job_details(status);
CREATE UNIQUE INDEX IF NOT EXISTS idx_job_details_unique ON job_details(job_id, date, model);

CREATE INDEX IF NOT EXISTS idx_positions_job_id ON positions(job_id);
CREATE INDEX IF NOT EXISTS idx_positions_date ON positions(date);
CREATE INDEX IF NOT EXISTS idx_positions_model ON positions(model);
CREATE INDEX IF NOT EXISTS idx_positions_date_model ON positions(date, model);
CREATE UNIQUE INDEX IF NOT EXISTS idx_positions_unique ON positions(job_id, date, model, action_id);

CREATE INDEX IF NOT EXISTS idx_holdings_position_id ON holdings(position_id);
CREATE INDEX IF NOT EXISTS idx_holdings_symbol ON holdings(symbol);

CREATE INDEX IF NOT EXISTS idx_reasoning_logs_job_date_model ON reasoning_logs(job_id, date, model);
CREATE INDEX IF NOT EXISTS idx_tool_usage_job_date_model ON tool_usage(job_id, date, model);
```

---

### 2.2 Table Relationships

```
jobs (1) ──┬──> (N) job_details
           │
           ├──> (N) positions ──> (N) holdings
           │
           ├──> (N) reasoning_logs
           │
           └──> (N) tool_usage
```

---

### 2.3 Data Examples

#### positions table

```
id | job_id     | date       | model | action_id | action_type | symbol | amount | price  | cash    | portfolio_value | daily_profit | daily_return_pct | cumulative_profit | cumulative_return_pct | created_at
---|------------|------------|-------|-----------|-------------|--------|--------|--------|---------|-----------------|--------------|------------------|-------------------|-----------------------|----------------------
1  | abc-123... | 2025-01-16 | gpt-5 | 0         | no_trade    | NULL   | NULL   | NULL   | 10000.0 | 10000.0         | 0.0          | 0.0              | 0.0               | 0.0                   | 2025-01-16T09:30:00Z
2  | abc-123... | 2025-01-16 | gpt-5 | 1         | buy         | AAPL   | 10     | 255.88 | 7441.2  | 10000.0         | 0.0          | 0.0              | 0.0               | 0.0                   | 2025-01-16T09:35:12Z
3  | abc-123... | 2025-01-17 | gpt-5 | 0         | no_trade    | NULL   | NULL   | NULL   | 7441.2  | 10150.5         | 150.5        | 1.51             | 150.5             | 1.51                  | 2025-01-17T09:30:00Z
4  | abc-123... | 2025-01-17 | gpt-5 | 1         | sell        | AAPL   | 5      | 262.24 | 8752.4  | 10150.5         | 150.5        | 1.51             | 150.5             | 1.51                  | 2025-01-17T09:42:38Z
```

#### holdings table

```
id | position_id | symbol | quantity
---|-------------|--------|----------
1  | 2           | AAPL   | 10
2  | 3           | AAPL   | 10
3  | 4           | AAPL   | 5
```

#### tool_usage table

```
id | job_id     | date       | model | tool_name | call_count | total_duration_seconds
---|------------|------------|-------|-----------|------------|-----------------------
1  | abc-123... | 2025-01-16 | gpt-5 | get_price | 5          | 2.3
2  | abc-123... | 2025-01-16 | gpt-5 | search    | 3          | 12.7
3  | abc-123... | 2025-01-16 | gpt-5 | trade     | 1          | 0.8
4  | abc-123... | 2025-01-16 | gpt-5 | math      | 2          | 0.1
```

---

## 3. Data Migration from position.jsonl

### 3.1 Migration Strategy

**During execution:** Write to BOTH SQLite AND position.jsonl for backward compatibility

**Migration path:**

1. **Phase 1:** Dual-write mode (write to both SQLite and JSONL)
2. **Phase 2:** Verify SQLite data matches JSONL
3. **Phase 3:** Switch `/results` endpoint to read from SQLite
4. **Phase 4:** (Optional) Deprecate JSONL writes

**Import existing data:** One-time migration script to populate SQLite from existing position.jsonl files

---

### 3.2 Import Script

```python
# api/import_historical_data.py

import json
import sqlite3
from pathlib import Path
from datetime import datetime
from api.database import get_db_connection


def import_position_jsonl(
    model_signature: str,
    position_file: Path,
    job_id: str = "historical-import"
) -> int:
    """
    Import existing position.jsonl data into SQLite.

    Args:
        model_signature: Model signature (e.g., "gpt-5")
        position_file: Path to position.jsonl
        job_id: Job ID to associate with (use "historical-import" for existing data)

    Returns:
        Number of records imported
    """
    conn = get_db_connection()
    cursor = conn.cursor()

    imported_count = 0
    initial_cash = 10000.0

    with open(position_file, 'r') as f:
        for line in f:
            if not line.strip():
                continue

            record = json.loads(line)
            date = record['date']
            action_id = record['id']
            action = record.get('this_action', {})
            positions = record.get('positions', {})

            # Extract action details
            action_type = action.get('action', 'no_trade')
            symbol = action.get('symbol', None)
            amount = action.get('amount', None)
            price = None  # Not stored in original position.jsonl

            # Extract holdings
            cash = positions.get('CASH', 0.0)
            holdings = {k: v for k, v in positions.items() if k != 'CASH' and v > 0}

            # Calculate portfolio value (approximate - need price data)
            portfolio_value = cash  # Base value

            # Calculate profits (need previous record)
            daily_profit = 0.0
            daily_return_pct = 0.0
            cumulative_profit = cash - initial_cash  # Simplified
            cumulative_return_pct = (cumulative_profit / initial_cash) * 100

            # Insert position record
            cursor.execute("""
                INSERT INTO positions (
                    job_id, date, model, action_id, action_type, symbol, amount, price,
                    cash, portfolio_value, daily_profit, daily_return_pct,
                    cumulative_profit, cumulative_return_pct, created_at
                ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
            """, (
                job_id, date, model_signature, action_id, action_type, symbol, amount, price,
                cash, portfolio_value, daily_profit, daily_return_pct,
                cumulative_profit, cumulative_return_pct, datetime.utcnow().isoformat() + "Z"
            ))

            position_id = cursor.lastrowid

            # Insert holdings
            for sym, qty in holdings.items():
                cursor.execute("""
                    INSERT INTO holdings (position_id, symbol, quantity)
                    VALUES (?, ?, ?)
                """, (position_id, sym, qty))

            imported_count += 1

    conn.commit()
    conn.close()

    return imported_count


def import_all_historical_data(base_path: Path = Path("data/agent_data")) -> dict:
    """
    Import all existing position.jsonl files from data/agent_data/.

    Returns:
        Summary dict with import counts per model
    """
    summary = {}

    for model_dir in base_path.iterdir():
        if not model_dir.is_dir():
            continue

        model_signature = model_dir.name
        position_file = model_dir / "position" / "position.jsonl"

        if not position_file.exists():
            continue

        print(f"Importing {model_signature}...")
        count = import_position_jsonl(model_signature, position_file)
        summary[model_signature] = count
        print(f"  Imported {count} records")

    return summary


if __name__ == "__main__":
    print("Starting historical data import...")
    summary = import_all_historical_data()
    print(f"\nImport complete: {summary}")
    print(f"Total records: {sum(summary.values())}")
```

---

## 4. Updated Results Service

### 4.1 ResultsService Class

```python
# api/results_service.py

from typing import List, Dict, Optional
from datetime import datetime
from api.database import get_db_connection

class ResultsService:
    """
    Service for retrieving simulation results from SQLite.

    Replaces on-demand reading of position.jsonl files.
    """

    def __init__(self, db_path: str = "data/jobs.db"):
        self.db_path = db_path

    def get_results(
        self,
        date: str,
        model: Optional[str] = None,
        detail: str = "minimal"
    ) -> Dict:
        """
        Get simulation results for specified date and model(s).

        Args:
            date: Trading date (YYYY-MM-DD)
            model: Optional model signature filter
            detail: "minimal" or "full"

        Returns:
            {
                "date": str,
                "results": [
                    {
                        "model": str,
                        "positions": {...},
                        "daily_pnl": {...},
                        "trades": [...],       # if detail=full
                        "ai_reasoning": {...}  # if detail=full
                    }
                ]
            }
        """
        conn = get_db_connection(self.db_path)

        # Get all models for this date (or specific model)
        if model:
            models = [model]
        else:
            cursor = conn.cursor()
            cursor.execute("""
                SELECT DISTINCT model FROM positions WHERE date = ?
            """, (date,))
            models = [row[0] for row in cursor.fetchall()]

        results = []

        for mdl in models:
            result = self._get_model_result(conn, date, mdl, detail)
            if result:
                results.append(result)

        conn.close()

        return {
            "date": date,
            "results": results
        }

    def _get_model_result(
        self,
        conn,
        date: str,
        model: str,
        detail: str
    ) -> Optional[Dict]:
        """Get result for single model on single date"""
        cursor = conn.cursor()

        # Get latest position for this date (highest action_id)
        cursor.execute("""
            SELECT
                cash, portfolio_value, daily_profit, daily_return_pct,
                cumulative_profit, cumulative_return_pct
            FROM positions
            WHERE date = ? AND model = ?
            ORDER BY action_id DESC
            LIMIT 1
        """, (date, model))

        row = cursor.fetchone()
        if not row:
            return None

        cash, portfolio_value, daily_profit, daily_return_pct, cumulative_profit, cumulative_return_pct = row

        # Get holdings for latest position
        cursor.execute("""
            SELECT h.symbol, h.quantity
            FROM holdings h
            JOIN positions p ON h.position_id = p.id
            WHERE p.date = ? AND p.model = ?
            ORDER BY p.action_id DESC
            LIMIT 100  -- One position worth of holdings
        """, (date, model))

        holdings = {row[0]: row[1] for row in cursor.fetchall()}
        holdings['CASH'] = cash

        result = {
            "model": model,
            "positions": holdings,
            "daily_pnl": {
                "profit": daily_profit,
                "return_pct": daily_return_pct,
                "portfolio_value": portfolio_value
            },
            "cumulative_pnl": {
                "profit": cumulative_profit,
                "return_pct": cumulative_return_pct
            }
        }

        # Add full details if requested
        if detail == "full":
            result["trades"] = self._get_trades(cursor, date, model)
            result["ai_reasoning"] = self._get_reasoning(cursor, date, model)
            result["tool_usage"] = self._get_tool_usage(cursor, date, model)

        return result

    def _get_trades(self, cursor, date: str, model: str) -> List[Dict]:
        """Get all trades executed on this date"""
        cursor.execute("""
            SELECT action_id, action_type, symbol, amount, price
            FROM positions
            WHERE date = ? AND model = ? AND action_type IN ('buy', 'sell')
            ORDER BY action_id
        """, (date, model))

        trades = []
        for row in cursor.fetchall():
            trades.append({
                "id": row[0],
                "action": row[1],
                "symbol": row[2],
                "amount": row[3],
                "price": row[4],
                "total": row[3] * row[4] if row[3] and row[4] else None
            })

        return trades

    def _get_reasoning(self, cursor, date: str, model: str) -> Dict:
        """Get AI reasoning summary"""
        cursor.execute("""
            SELECT COUNT(*) as total_steps,
                   COUNT(CASE WHEN role = 'assistant' THEN 1 END) as assistant_messages,
                   COUNT(CASE WHEN role = 'tool' THEN 1 END) as tool_messages
            FROM reasoning_logs
            WHERE date = ? AND model = ?
        """, (date, model))

        row = cursor.fetchone()
        total_steps = row[0] if row else 0

        # Get reasoning summary (last assistant message with FINISH_SIGNAL)
        cursor.execute("""
            SELECT content FROM reasoning_logs
            WHERE date = ? AND model = ? AND role = 'assistant'
              AND content LIKE '%<FINISH_SIGNAL>%'
            ORDER BY step_number DESC
            LIMIT 1
        """, (date, model))

        row = cursor.fetchone()
        reasoning_summary = row[0] if row else "No reasoning summary available"

        return {
            "total_steps": total_steps,
            "stop_signal_received": "<FINISH_SIGNAL>" in reasoning_summary,
            "reasoning_summary": reasoning_summary[:500]  # Truncate for brevity
        }

    def _get_tool_usage(self, cursor, date: str, model: str) -> Dict[str, int]:
        """Get tool usage counts"""
        cursor.execute("""
            SELECT tool_name, call_count
            FROM tool_usage
            WHERE date = ? AND model = ?
        """, (date, model))

        return {row[0]: row[1] for row in cursor.fetchall()}

    def get_portfolio_timeseries(
        self,
        model: str,
        start_date: Optional[str] = None,
        end_date: Optional[str] = None
    ) -> List[Dict]:
        """
        Get portfolio value over time for a model.

        Returns:
            [
                {"date": "2025-01-16", "portfolio_value": 10000.0, "daily_return_pct": 0.0},
                {"date": "2025-01-17", "portfolio_value": 10150.5, "daily_return_pct": 1.51},
                ...
            ]
        """
        conn = get_db_connection(self.db_path)
        cursor = conn.cursor()

        query = """
            SELECT date, portfolio_value, daily_return_pct, cumulative_return_pct
            FROM (
                SELECT date, portfolio_value, daily_return_pct, cumulative_return_pct,
                       ROW_NUMBER() OVER (PARTITION BY date ORDER BY action_id DESC) as rn
|
||||
FROM positions
|
||||
WHERE model = ?
|
||||
)
|
||||
WHERE rn = 1
|
||||
"""
|
||||
|
||||
params = [model]
|
||||
|
||||
if start_date:
|
||||
query += " AND date >= ?"
|
||||
params.append(start_date)
|
||||
if end_date:
|
||||
query += " AND date <= ?"
|
||||
params.append(end_date)
|
||||
|
||||
query += " ORDER BY date ASC"
|
||||
|
||||
cursor.execute(query, params)
|
||||
|
||||
timeseries = []
|
||||
for row in cursor.fetchall():
|
||||
timeseries.append({
|
||||
"date": row[0],
|
||||
"portfolio_value": row[1],
|
||||
"daily_return_pct": row[2],
|
||||
"cumulative_return_pct": row[3]
|
||||
})
|
||||
|
||||
conn.close()
|
||||
return timeseries
|
||||
|
||||
def get_leaderboard(self, date: Optional[str] = None) -> List[Dict]:
|
||||
"""
|
||||
Get model performance leaderboard.
|
||||
|
||||
Args:
|
||||
date: Optional date filter (latest results if not specified)
|
||||
|
||||
Returns:
|
||||
[
|
||||
{"model": "gpt-5", "portfolio_value": 10500, "cumulative_return_pct": 5.0, "rank": 1},
|
||||
{"model": "claude-3.7-sonnet", "portfolio_value": 10300, "cumulative_return_pct": 3.0, "rank": 2},
|
||||
...
|
||||
]
|
||||
"""
|
||||
conn = get_db_connection(self.db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
if date:
|
||||
# Specific date leaderboard
|
||||
cursor.execute("""
|
||||
SELECT model, portfolio_value, cumulative_return_pct
|
||||
FROM (
|
||||
SELECT model, portfolio_value, cumulative_return_pct,
|
||||
ROW_NUMBER() OVER (PARTITION BY model ORDER BY action_id DESC) as rn
|
||||
FROM positions
|
||||
WHERE date = ?
|
||||
)
|
||||
WHERE rn = 1
|
||||
ORDER BY portfolio_value DESC
|
||||
""", (date,))
|
||||
else:
|
||||
# Latest results for each model
|
||||
cursor.execute("""
|
||||
SELECT model, portfolio_value, cumulative_return_pct
|
||||
FROM (
|
||||
SELECT model, portfolio_value, cumulative_return_pct,
|
||||
ROW_NUMBER() OVER (PARTITION BY model ORDER BY date DESC, action_id DESC) as rn
|
||||
FROM positions
|
||||
)
|
||||
WHERE rn = 1
|
||||
ORDER BY portfolio_value DESC
|
||||
""")
|
||||
|
||||
leaderboard = []
|
||||
rank = 1
|
||||
for row in cursor.fetchall():
|
||||
leaderboard.append({
|
||||
"rank": rank,
|
||||
"model": row[0],
|
||||
"portfolio_value": row[1],
|
||||
"cumulative_return_pct": row[2]
|
||||
})
|
||||
rank += 1
|
||||
|
||||
conn.close()
|
||||
return leaderboard
|
||||
```
|
||||
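
The ranking step in `get_leaderboard()` is an ordered enumeration over rows sorted by portfolio value. A minimal standalone sketch of that logic (the model names and values here are illustrative):

```python
def rank_models(rows):
    """Rank (model, portfolio_value, cumulative_return_pct) tuples, best first."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    return [
        {"rank": i, "model": m, "portfolio_value": pv, "cumulative_return_pct": ret}
        for i, (m, pv, ret) in enumerate(ordered, start=1)
    ]

board = rank_models([("gpt-4", 10300.0, 3.0), ("claude-3.7-sonnet", 10500.0, 5.0)])
print(board[0]["model"], board[0]["rank"])  # claude-3.7-sonnet 1
```
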

---

## 5. Updated Executor - Write to SQLite

```python
# api/executor.py (additions to existing code)

class ModelDayExecutor:
    # ... existing code ...

    async def run_model_day(
        self,
        job_id: str,
        date: str,
        model_config: Dict[str, Any],
        agent_class: type,
        config: Dict[str, Any]
    ) -> None:
        """Execute simulation for one model on one date."""

        # ... existing execution code ...

        try:
            # Execute trading session
            await agent.run_trading_session(date)

            # NEW: Extract and store results in SQLite
            self._store_results_to_db(job_id, date, model_sig)

            # Mark as completed
            self.job_manager.update_job_detail_status(
                job_id, date, model_sig, "completed"
            )

        except Exception as e:
            # ... error handling ...
            raise

    def _store_results_to_db(self, job_id: str, date: str, model: str) -> None:
        """
        Extract data from position.jsonl and log.jsonl, store in SQLite.

        This runs after agent.run_trading_session() completes.
        """
        import json
        from datetime import datetime
        from pathlib import Path

        from api.database import get_db_connection

        conn = get_db_connection()
        cursor = conn.cursor()

        # Read position.jsonl for this model
        position_file = Path(f"data/agent_data/{model}/position/position.jsonl")

        if not position_file.exists():
            logger.warning(f"Position file not found: {position_file}")
            return

        # Find records for this date
        with open(position_file, 'r') as f:
            for line in f:
                if not line.strip():
                    continue

                record = json.loads(line)
                if record['date'] != date:
                    continue  # Skip other dates

                # Extract fields
                action_id = record['id']
                action = record.get('this_action', {})
                positions = record.get('positions', {})

                action_type = action.get('action', 'no_trade')
                symbol = action.get('symbol')
                amount = action.get('amount')
                price = None  # TODO: Get from price data if needed

                cash = positions.get('CASH', 0.0)
                holdings = {k: v for k, v in positions.items() if k != 'CASH' and v > 0}

                # Calculate portfolio value (simplified - improve with actual prices)
                portfolio_value = cash  # + sum(holdings value)

                # Calculate daily P&L (compare to previous day's closing value)
                # TODO: Implement proper P&L calculation

                # Insert position
                cursor.execute("""
                    INSERT INTO positions (
                        job_id, date, model, action_id, action_type, symbol, amount, price,
                        cash, portfolio_value, daily_profit, daily_return_pct,
                        cumulative_profit, cumulative_return_pct, created_at
                    ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
                """, (
                    job_id, date, model, action_id, action_type, symbol, amount, price,
                    cash, portfolio_value, 0.0, 0.0,  # TODO: Calculate daily P&L
                    0.0, 0.0,  # TODO: Calculate cumulative P&L
                    datetime.utcnow().isoformat() + "Z"
                ))

                position_id = cursor.lastrowid

                # Insert holdings
                for sym, qty in holdings.items():
                    cursor.execute("""
                        INSERT INTO holdings (position_id, symbol, quantity)
                        VALUES (?, ?, ?)
                    """, (position_id, sym, qty))

        # Parse log.jsonl for reasoning (if detail=full is needed later)
        # TODO: Implement log parsing and storage in reasoning_logs table

        conn.commit()
        conn.close()

        logger.info(f"Stored results for {model} on {date} in SQLite")
```
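
For reference, a sketch of the record shape `_store_results_to_db()` expects from `position.jsonl`. Real records may carry additional fields; this hypothetical example only uses the keys the parser above reads:

```python
import json

# Hypothetical position.jsonl line using only the keys the parser reads
line = json.dumps({
    "id": 3,
    "date": "2025-01-16",
    "this_action": {"action": "buy", "symbol": "AAPL", "amount": 10},
    "positions": {"CASH": 7600.0, "AAPL": 10, "MSFT": 0},
})

record = json.loads(line)
positions = record.get("positions", {})
cash = positions.get("CASH", 0.0)
holdings = {k: v for k, v in positions.items() if k != "CASH" and v > 0}
print(cash, holdings)  # 7600.0 {'AAPL': 10}
```

Note that zero-quantity holdings (`MSFT` above) are dropped, matching the `v > 0` filter in the executor.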

---

## 6. Migration Path

### 6.1 Backward Compatibility

**Keep position.jsonl writes** so that existing tools and scripts continue working:

```python
# In agent/base_agent/base_agent.py - no changes needed;
# position.jsonl writing continues as normal.

# In api/executor.py - AFTER position.jsonl is written:
await agent.run_trading_session(date)               # Writes to position.jsonl
self._store_results_to_db(job_id, date, model_sig)  # Copies to SQLite
```

### 6.2 Gradual Migration

- **Week 1:** Deploy with dual-write (JSONL + SQLite)
- **Week 2:** Verify data consistency, fix any discrepancies
- **Week 3:** Switch `/results` endpoint to read from SQLite
- **Week 4:** (Optional) Remove JSONL writes
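
The Week 2 consistency check can be reduced to comparing per-(date, model) cash balances extracted from both stores. A sketch, assuming hypothetical loaders have already produced the two mappings:

```python
def find_discrepancies(jsonl_cash, sqlite_cash, tolerance=0.01):
    """Compare {(date, model): cash} mappings from JSONL files and SQLite."""
    issues = []
    for key in sorted(set(jsonl_cash) | set(sqlite_cash)):
        a, b = jsonl_cash.get(key), sqlite_cash.get(key)
        if a is None or b is None:
            issues.append((key, "missing", a, b))
        elif abs(a - b) > tolerance:
            issues.append((key, "mismatch", a, b))
    return issues

jsonl = {("2025-01-16", "gpt-4"): 7600.0, ("2025-01-17", "gpt-4"): 7400.0}
sqlite = {("2025-01-16", "gpt-4"): 7600.0}
print(find_discrepancies(jsonl, sqlite))  # [(('2025-01-17', 'gpt-4'), 'missing', 7400.0, None)]
```
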
---

## 7. Updated API Endpoints

### 7.1 Enhanced `/results` Endpoint

```python
# api/main.py

from api.results_service import ResultsService

results_service = ResultsService()

@app.get("/results")
async def get_results(
    date: str,
    model: Optional[str] = None,
    detail: str = "minimal"
):
    """Get simulation results from SQLite (fast!)."""
    # Validate date format
    try:
        datetime.strptime(date, "%Y-%m-%d")
    except ValueError:
        raise HTTPException(status_code=400, detail="Invalid date format (use YYYY-MM-DD)")

    results = results_service.get_results(date, model, detail)

    if not results["results"]:
        raise HTTPException(status_code=404, detail=f"No data found for date {date}")

    return results
```
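
The endpoint's date check can be factored into a small helper; note that `strptime` with `%Y-%m-%d` rejects impossible dates as well as wrong separators:

```python
from datetime import datetime

def is_valid_date(value: str) -> bool:
    """YYYY-MM-DD validation, matching the endpoint's check."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False

print(is_valid_date("2025-01-16"))  # True
print(is_valid_date("2025-13-01"))  # False (no month 13)
print(is_valid_date("01/16/2025"))  # False (wrong separator)
```
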

### 7.2 New Endpoints for Advanced Queries

```python
@app.get("/portfolio/timeseries")
async def get_portfolio_timeseries(
    model: str,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None
):
    """Get portfolio value over time for a model."""
    timeseries = results_service.get_portfolio_timeseries(model, start_date, end_date)

    if not timeseries:
        raise HTTPException(status_code=404, detail=f"No data found for model {model}")

    return {
        "model": model,
        "timeseries": timeseries
    }


@app.get("/leaderboard")
async def get_leaderboard(date: Optional[str] = None):
    """Get model performance leaderboard."""
    leaderboard = results_service.get_leaderboard(date)

    return {
        "date": date or "latest",
        "leaderboard": leaderboard
    }
```

---

## 8. Database Maintenance

### 8.1 Cleanup Old Data

```python
# api/job_manager.py (add method)

def cleanup_old_data(self, days: int = 90) -> dict:
    """
    Delete jobs and associated data older than the specified number of days.

    Returns:
        Summary of deleted records
    """
    conn = get_db_connection(self.db_path)
    cursor = conn.cursor()

    cutoff_date = (datetime.utcnow() - timedelta(days=days)).isoformat() + "Z"

    # Count records before deletion
    cursor.execute("SELECT COUNT(*) FROM jobs WHERE created_at < ?", (cutoff_date,))
    jobs_to_delete = cursor.fetchone()[0]

    cursor.execute("""
        SELECT COUNT(*) FROM positions
        WHERE job_id IN (SELECT job_id FROM jobs WHERE created_at < ?)
    """, (cutoff_date,))
    positions_to_delete = cursor.fetchone()[0]

    # Delete (CASCADE will handle related tables)
    cursor.execute("DELETE FROM jobs WHERE created_at < ?", (cutoff_date,))

    conn.commit()
    conn.close()

    return {
        "cutoff_date": cutoff_date,
        "jobs_deleted": jobs_to_delete,
        "positions_deleted": positions_to_delete
    }
```
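
The `created_at < cutoff` comparison works because ISO-8601 timestamps with a fixed layout sort lexicographically, so a string comparison behaves like a chronological one. The cutoff computation in isolation:

```python
from datetime import datetime, timedelta

def cutoff_iso(now: datetime, days: int = 90) -> str:
    """Cutoff in the same ISO-8601 'Z' format stored in created_at."""
    return (now - timedelta(days=days)).isoformat() + "Z"

print(cutoff_iso(datetime(2025, 4, 1, 12, 0, 0), days=90))  # 2025-01-01T12:00:00Z
```
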

### 8.2 Vacuum Database

```python
def vacuum_database(self) -> None:
    """Reclaim disk space after deletes."""
    conn = get_db_connection(self.db_path)
    conn.execute("VACUUM")
    conn.close()
```

---

## Summary

**Enhanced database schema** with six tables:
- `jobs`, `job_details` (job tracking)
- `positions`, `holdings` (simulation results)
- `reasoning_logs`, `tool_usage` (AI details)

**Benefits:**
- ⚡ **10-100x faster** `/results` queries (no file I/O)
- 📊 **Advanced analytics** - timeseries, leaderboards, aggregations
- 🔒 **Data integrity** - ACID compliance, foreign keys
- 🗄️ **Single source of truth** - all data in one place

**Migration strategy:** Dual-write (JSONL + SQLite) for backward compatibility

**Next:** Comprehensive testing suite specification
docs/deployment/docker-deployment.md (new file, 95 lines)
@@ -0,0 +1,95 @@
# Docker Deployment

Production Docker deployment guide.

---

## Quick Deployment

```bash
git clone https://github.com/Xe138/AI-Trader.git
cd AI-Trader
cp .env.example .env
# Edit .env with API keys
docker-compose up -d
```

---

## Production Configuration

### Use Pre-built Image

```yaml
# docker-compose.yml
services:
  ai-trader:
    image: ghcr.io/xe138/ai-trader:latest
    # ... rest of config
```

### Build Locally

```yaml
# docker-compose.yml
services:
  ai-trader:
    build: .
    # ... rest of config
```

---

## Volume Persistence

Ensure data persists across restarts:

```yaml
volumes:
  - ./data:/app/data        # Required: database and cache
  - ./logs:/app/logs        # Recommended: application logs
  - ./configs:/app/configs  # Required: model configurations
```

---

## Environment Security

- Never commit `.env` to version control
- Use secrets management (Docker secrets, Kubernetes secrets)
- Rotate API keys regularly
- Restrict network access to the API port

---

## Health Checks

Docker automatically restarts unhealthy containers:

```yaml
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 40s
```

---

## Monitoring

```bash
# Container status
docker ps

# Resource usage
docker stats ai-trader

# Logs
docker logs -f ai-trader
```

---

See [monitoring.md](monitoring.md) and [production-checklist.md](production-checklist.md) for related deployment documentation.
docs/deployment/monitoring.md (new file, 49 lines)
@@ -0,0 +1,49 @@
# Monitoring

Health checks, logging, and metrics.

---

## Health Checks

```bash
# Manual check
curl http://localhost:8080/health

# Automated monitoring (cron)
*/5 * * * * curl -f http://localhost:8080/health || echo "API down" | mail -s "Alert" admin@example.com
```

---

## Logging

```bash
# View logs
docker logs -f ai-trader

# Filter errors
docker logs ai-trader 2>&1 | grep -i error

# Export logs
docker logs ai-trader > ai-trader.log 2>&1
```

---

## Database Monitoring

```bash
# Database size
docker exec ai-trader du -h /app/data/jobs.db

# Job statistics
docker exec ai-trader sqlite3 /app/data/jobs.db \
  "SELECT status, COUNT(*) FROM jobs GROUP BY status;"
```
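
The same job statistics can be pulled from Python with the standard `sqlite3` module; a sketch against an in-memory stand-in for `jobs.db` (the real table has more columns):

```python
import sqlite3

# In-memory stand-in for /app/data/jobs.db
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO jobs VALUES (?, ?)",
                 [("j1", "completed"), ("j2", "completed"), ("j3", "failed")])

stats = dict(conn.execute("SELECT status, COUNT(*) FROM jobs GROUP BY status"))
print(stats)  # {'completed': 2, 'failed': 1}
conn.close()
```
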

---

## Metrics (Future)

Prometheus metrics are planned for v0.4.0.
docs/deployment/production-checklist.md (new file, 50 lines)
@@ -0,0 +1,50 @@
# Production Deployment Checklist

Pre-deployment verification.

---

## Pre-Deployment

- [ ] API keys configured in `.env`
- [ ] Environment variables reviewed
- [ ] Model configuration validated
- [ ] Port availability confirmed
- [ ] Volume mounts configured
- [ ] Health checks enabled
- [ ] Restart policy set

---

## Testing

- [ ] `bash scripts/validate_docker_build.sh` passes
- [ ] `bash scripts/test_api_endpoints.sh` passes
- [ ] Health endpoint responds correctly
- [ ] Sample simulation completes successfully

---

## Monitoring

- [ ] Log aggregation configured
- [ ] Health check monitoring enabled
- [ ] Alerting configured for failures
- [ ] Database backup strategy defined

---

## Security

- [ ] API keys stored securely (not in code)
- [ ] `.env` excluded from version control
- [ ] Network access restricted
- [ ] SSL/TLS configured (if exposing publicly)

---

## Documentation

- [ ] Runbook created for operations team
- [ ] Escalation procedures documented
- [ ] Recovery procedures tested
docs/deployment/scaling.md (new file, 46 lines)
@@ -0,0 +1,46 @@
# Scaling

Running multiple instances and load balancing.

---

## Current Limitations

- Maximum 1 concurrent job per instance
- No built-in load balancing
- Single SQLite database per instance

---

## Multi-Instance Deployment

For parallel simulations, deploy multiple instances:

```yaml
# docker-compose.yml
services:
  ai-trader-1:
    image: ghcr.io/xe138/ai-trader:latest
    ports:
      - "8081:8080"
    volumes:
      - ./data1:/app/data

  ai-trader-2:
    image: ghcr.io/xe138/ai-trader:latest
    ports:
      - "8082:8080"
    volumes:
      - ./data2:/app/data
```

**Note:** Each instance needs its own database and data volumes.
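
Without a load balancer, clients have to spread work themselves. A minimal round-robin dispatcher sketch; the instance URLs assume the port mappings shown above, and a production version would also check each instance's `/health` before dispatching:

```python
from itertools import cycle

# Hypothetical instance URLs matching the compose file's port mappings
instances = cycle(["http://localhost:8081", "http://localhost:8082"])

def next_instance() -> str:
    """Pick the next instance in round-robin order."""
    return next(instances)

print(next_instance())  # http://localhost:8081
print(next_instance())  # http://localhost:8082
print(next_instance())  # http://localhost:8081
```
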

---

## Load Balancing (Future)

Planned for v0.4.0:
- Shared PostgreSQL database
- Job queue with multiple workers
- Horizontal scaling support
docs/developer/CONTRIBUTING.md (new file, 48 lines)
@@ -0,0 +1,48 @@
# Contributing to AI-Trader

Guidelines for contributing to the project.

---

## Development Setup

See [development-setup.md](development-setup.md).

---

## Pull Request Process

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/my-feature`
3. Make changes
4. Run tests: `pytest tests/`
5. Update documentation
6. Commit: `git commit -m "Add feature: description"`
7. Push: `git push origin feature/my-feature`
8. Create a Pull Request

---

## Code Style

- Follow PEP 8 for Python
- Use type hints
- Add docstrings to public functions
- Keep functions focused and small

---

## Testing Requirements

- Unit tests for new functionality
- Integration tests for API changes
- Maintain test coverage >80%

---

## Documentation

- Update README.md for new features
- Add entries to CHANGELOG.md
- Update API_REFERENCE.md for endpoint changes
- Include examples in relevant guides
docs/developer/adding-models.md (new file, 69 lines)
@@ -0,0 +1,69 @@
# Adding Custom AI Models

How to add and configure custom AI models.

---

## Basic Setup

Edit `configs/default_config.json`:

```json
{
  "models": [
    {
      "name": "Your Model Name",
      "basemodel": "provider/model-id",
      "signature": "unique-identifier",
      "enabled": true
    }
  ]
}
```

---

## Examples

### OpenAI Models

```json
{
  "name": "GPT-4",
  "basemodel": "openai/gpt-4",
  "signature": "gpt-4",
  "enabled": true
}
```

### Anthropic Claude

```json
{
  "name": "Claude 3.7 Sonnet",
  "basemodel": "anthropic/claude-3.7-sonnet",
  "signature": "claude-3.7-sonnet",
  "enabled": true,
  "openai_base_url": "https://api.anthropic.com/v1",
  "openai_api_key": "your-anthropic-key"
}
```

### Via OpenRouter

```json
{
  "name": "DeepSeek",
  "basemodel": "deepseek/deepseek-chat",
  "signature": "deepseek",
  "enabled": true,
  "openai_base_url": "https://openrouter.ai/api/v1",
  "openai_api_key": "your-openrouter-key"
}
```
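
A quick sanity check before launching a simulation can catch malformed entries. A validation sketch for the fields shown above; the rules here are assumptions for illustration, not the loader's actual behavior:

```python
REQUIRED_FIELDS = {"name", "basemodel", "signature", "enabled"}

def validate_model_config(cfg: dict) -> list:
    """Return a list of problems with a model entry; empty means valid."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - cfg.keys())]
    if "enabled" in cfg and not isinstance(cfg["enabled"], bool):
        problems.append("'enabled' must be a boolean")
    if "basemodel" in cfg and "/" not in cfg["basemodel"]:
        problems.append("'basemodel' should look like 'provider/model-id'")
    return problems

print(validate_model_config({"name": "GPT-4", "basemodel": "openai/gpt-4",
                             "signature": "gpt-4", "enabled": True}))  # []
```
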

---

## Field Reference

See [docs/user-guide/configuration.md](../user-guide/configuration.md#model-configuration-fields) for complete field descriptions.
docs/developer/architecture.md (new file, 68 lines)
@@ -0,0 +1,68 @@
# Architecture

System design and component overview.

---

## Component Diagram

See README.md for the architecture diagram.

---

## Key Components

### FastAPI Server (`api/main.py`)
- REST API endpoints
- Request validation
- Response formatting

### Job Manager (`api/job_manager.py`)
- Job lifecycle management
- SQLite operations
- Concurrency control

### Simulation Worker (`api/simulation_worker.py`)
- Background job execution
- Date-sequential, model-parallel orchestration
- Error handling

### Model-Day Executor (`api/model_day_executor.py`)
- Single model-day execution
- Runtime config isolation
- Agent invocation

### Base Agent (`agent/base_agent/base_agent.py`)
- Trading session execution
- MCP tool integration
- Position management

### MCP Services (`agent_tools/`)
- Math, Search, Trade, Price tools
- Internal HTTP servers
- Localhost-only access

---

## Data Flow

1. API receives trigger request
2. Job Manager validates and creates the job
3. Worker starts background execution
4. For each date (sequential):
   - For each model (parallel):
     - Executor creates an isolated runtime config
     - Agent executes the trading session
     - Results are stored in the database
5. Job status updated
6. Results available via API
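
The date-sequential, model-parallel loop in step 4 can be sketched with `asyncio`; the function names here are illustrative, not the executor's real API:

```python
import asyncio

async def run_model_day(date: str, model: str, results: list) -> None:
    """Stand-in for one model-day execution."""
    await asyncio.sleep(0)  # real trading-session work happens here
    results.append((date, model))

async def run_job(dates, models):
    results = []
    for date in dates:  # dates are strictly sequential
        # all models for this date run concurrently
        await asyncio.gather(*(run_model_day(date, m, results) for m in models))
    return results

out = asyncio.run(run_job(["2025-01-16", "2025-01-17"], ["gpt-4", "claude"]))
print(len(out))  # 4
```

The outer `for` loop enforces that no model starts a new date until every model has finished the previous one, which is what keeps the anti-look-ahead guarantees intact.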

---

## Anti-Look-Ahead Controls

- `TODAY_DATE` in the runtime config limits data access
- Price queries filter by date
- Search results are filtered by publication date

See [CLAUDE.md](../../CLAUDE.md) for implementation details.
docs/developer/database-schema.md (new file, 94 lines)
@@ -0,0 +1,94 @@
# Database Schema

SQLite database schema reference.

---

## Tables

### jobs
Job metadata and overall status.

```sql
CREATE TABLE jobs (
    job_id TEXT PRIMARY KEY,
    config_path TEXT NOT NULL,
    status TEXT CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
    date_range TEXT,  -- JSON array
    models TEXT,      -- JSON array
    created_at TEXT,
    started_at TEXT,
    completed_at TEXT,
    total_duration_seconds REAL,
    error TEXT
);
```

### job_details
Per model-day execution details.

```sql
CREATE TABLE job_details (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT,
    model_signature TEXT,
    trading_date TEXT,
    status TEXT CHECK(status IN ('pending', 'running', 'completed', 'failed')),
    start_time TEXT,
    end_time TEXT,
    duration_seconds REAL,
    error TEXT,
    FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);
```

### positions
Trading position records with P&L.

```sql
CREATE TABLE positions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT,
    date TEXT,
    model TEXT,
    action_id INTEGER,
    action_type TEXT,
    symbol TEXT,
    amount INTEGER,
    price REAL,
    cash REAL,
    portfolio_value REAL,
    daily_profit REAL,
    daily_return_pct REAL,
    created_at TEXT
);
```

### holdings
Portfolio holdings breakdown per position.

```sql
CREATE TABLE holdings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    position_id INTEGER,
    symbol TEXT,
    quantity REAL,
    FOREIGN KEY (position_id) REFERENCES positions(id) ON DELETE CASCADE
);
```
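
One detail worth noting: SQLite only enforces `ON DELETE CASCADE` when foreign keys are enabled on the connection. A minimal demonstration with trimmed-down versions of the two tables above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default; required for CASCADE
conn.execute("CREATE TABLE positions (id INTEGER PRIMARY KEY AUTOINCREMENT, symbol TEXT)")
conn.execute("""
    CREATE TABLE holdings (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        position_id INTEGER,
        symbol TEXT,
        quantity REAL,
        FOREIGN KEY (position_id) REFERENCES positions(id) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO positions (symbol) VALUES ('AAPL')")
conn.execute("INSERT INTO holdings (position_id, symbol, quantity) VALUES (1, 'AAPL', 10)")
conn.execute("DELETE FROM positions WHERE id = 1")  # cascades into holdings
print(conn.execute("SELECT COUNT(*) FROM holdings").fetchone()[0])  # 0
```
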

### price_data
Cached historical price data.

### price_coverage
Data availability tracking per symbol.

### reasoning_logs
AI decision reasoning (when enabled).

### tool_usage
MCP tool usage statistics.

---

See `api/database.py` for complete schema definitions.
docs/developer/development-setup.md (new file, 71 lines)
@@ -0,0 +1,71 @@
# Development Setup

Local development without Docker.

---

## Prerequisites

- Python 3.10+
- pip
- virtualenv

---

## Setup Steps

### 1. Clone Repository

```bash
git clone https://github.com/Xe138/AI-Trader.git
cd AI-Trader
```

### 2. Create Virtual Environment

```bash
python3 -m venv venv
source venv/bin/activate  # Linux/Mac
# venv\Scripts\activate   # Windows
```

### 3. Install Dependencies

```bash
pip install -r requirements.txt
```

### 4. Configure Environment

```bash
cp .env.example .env
# Edit .env with your API keys
```

### 5. Start MCP Services

```bash
cd agent_tools
python start_mcp_services.py &
cd ..
```

### 6. Start API Server

```bash
python -m uvicorn api.main:app --reload --port 8080
```

---

## Running Tests

```bash
pytest tests/ -v
```

---

## Project Structure

See [CLAUDE.md](../../CLAUDE.md) for the complete project structure.
docs/developer/testing.md (new file, 64 lines)
@@ -0,0 +1,64 @@
# Testing Guide

Guide for testing AI-Trader during development.

---

## Automated Testing

### Docker Build Validation

```bash
chmod +x scripts/*.sh
bash scripts/validate_docker_build.sh
```

Validates:
- Docker installation
- Environment configuration
- Image build
- Container startup
- Health endpoint

### API Endpoint Testing

```bash
bash scripts/test_api_endpoints.sh
```

Tests all API endpoints with real simulations.

---

## Unit Tests

```bash
# Install dependencies
pip install -r requirements.txt

# Run tests
pytest tests/ -v

# With coverage
pytest tests/ -v --cov=api --cov-report=term-missing

# Specific test file
pytest tests/unit/test_job_manager.py -v
```

---

## Integration Tests

```bash
# Run integration tests only
pytest tests/integration/ -v

# Test with a real API server
docker-compose up -d
pytest tests/integration/test_api_endpoints.py -v
```

---

For detailed testing procedures, see the root [TESTING_GUIDE.md](../../TESTING_GUIDE.md).
@@ -1,873 +0,0 @@
# Implementation Specifications: Agent, Docker, and Windmill Integration
|
||||
|
||||
## Part 1: BaseAgent Refactoring
|
||||
|
||||
### 1.1 Current State Analysis
|
||||
|
||||
**Current `base_agent.py` structure:**
|
||||
- `run_date_range(init_date, end_date)` - Loops through all dates
|
||||
- `run_trading_session(today_date)` - Executes single day
|
||||
- `get_trading_dates()` - Calculates dates from position.jsonl

**What works well:**

- `run_trading_session()` is already isolated for single-day execution ✅
- Agent initialization is separate from execution ✅
- Position tracking via position.jsonl ✅

**What needs modification:**

- `runtime_env.json` management (move to RuntimeConfigManager)
- `get_trading_dates()` logic (move to API layer for date range calculation)

### 1.2 Required Changes

#### Change 1: No modifications needed to core execution logic

**Rationale:** `BaseAgent.run_trading_session(today_date)` already supports single-day execution. The worker will call this method directly.

```python
# Current code (already suitable for API mode):
async def run_trading_session(self, today_date: str) -> None:
    """Run single day trading session"""
    # This method is suitable as-is for the worker to call
```

**Action:** ✅ No changes needed

---

#### Change 2: Make runtime config path injectable

**Current issue:**
```python
# In base_agent.py, uses global config
from tools.general_tools import get_config_value, write_config_value
```

**Problem:** `get_config_value()` reads from `os.environ["RUNTIME_ENV_PATH"]`, which the worker will override per execution.

**Solution:** This already works. The worker sets `RUNTIME_ENV_PATH` before calling agent methods:

```python
# In executor.py
os.environ["RUNTIME_ENV_PATH"] = runtime_config_path
await agent.run_trading_session(date)
```

**Action:** ✅ No changes needed (env var override is sufficient)

---

#### Change 3: Optional - Separate agent initialization from date-range logic

**Current code in `main.py`:**
```python
# Creates agent
agent = AgentClass(...)
await agent.initialize()

# Runs all dates
await agent.run_date_range(INIT_DATE, END_DATE)
```

**For API mode:**
```python
# Worker creates agent
agent = AgentClass(...)
await agent.initialize()

# Worker calls run_trading_session directly for each date
for date in date_range:
    await agent.run_trading_session(date)
```

**Action:** ✅ The worker will not use the `run_date_range()` method. No changes needed to the agent.

---

### 1.3 Summary: BaseAgent Changes

**Result:** **NO CODE CHANGES REQUIRED** to `base_agent.py`!

The existing architecture is already compatible with the API worker pattern:
- `run_trading_session()` is the right interface
- Runtime config is managed via environment variables
- Position tracking works as-is

**Only change needed:** The worker must call `agent.register_agent()` if the position file doesn't exist (already handled by the `get_trading_dates()` logic).

---
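The sequence above — point the config helpers at this job's runtime file, register the agent on first run, then execute each trading day — can be sketched as a single worker routine. This is an illustrative sketch only: `run_job`, its parameters, and the file paths are assumptions, not part of the existing codebase; only `register_agent()` and `run_trading_session()` come from the spec.

```python
import os
from pathlib import Path

async def run_job(agent, dates, runtime_config_path, position_file):
    """Hypothetical worker routine tying together the steps from Part 1."""
    # Point the agent's config helpers at this job's isolated runtime file
    os.environ["RUNTIME_ENV_PATH"] = runtime_config_path

    # First-ever run for this agent: create its position.jsonl
    if not Path(position_file).exists():
        await agent.register_agent()

    # Run each trading day in sequence
    for date in dates:
        await agent.run_trading_session(date)
```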
## Part 2: Docker Configuration

### 2.1 Current Docker Setup

**Existing files:**
- `Dockerfile` - Multi-stage build for batch mode
- `docker-compose.yml` - Service definition
- `docker-entrypoint.sh` - Launches data fetch + main.py

### 2.2 Modified Dockerfile

```dockerfile
# Existing stages remain the same...
FROM python:3.10-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements
COPY requirements.txt requirements-api.txt ./
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir -r requirements-api.txt

# Copy application code
COPY . /app

# Create data directories
RUN mkdir -p /app/data /app/configs

# Copy and set permissions for entrypoint
COPY docker-entrypoint-api.sh /app/
RUN chmod +x /app/docker-entrypoint-api.sh

# Expose API port
EXPOSE 8080

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

# Run API service
CMD ["/app/docker-entrypoint-api.sh"]
```

### 2.3 New requirements-api.txt

```
fastapi==0.109.0
uvicorn[standard]==0.27.0
pydantic==2.5.3
pydantic-settings==2.1.0
python-multipart==0.0.6
```
### 2.4 New docker-entrypoint-api.sh

```bash
#!/bin/bash
set -e

echo "=================================="
echo "AI-Trader API Service Starting"
echo "=================================="

# Cleanup stale runtime configs from previous runs
echo "Cleaning up stale runtime configs..."
python3 -c "from api.runtime_manager import RuntimeConfigManager; RuntimeConfigManager().cleanup_all_runtime_configs()"

# Start MCP services in background
echo "Starting MCP services..."
cd /app/agent_tools
python3 start_mcp_services.py &
MCP_PID=$!

# Register cleanup now, before the long-running server starts.
# (The trap must be set before uvicorn runs, and uvicorn must not be
# exec'd, or the trap would never fire.)
trap "echo 'Shutting down...'; kill $MCP_PID 2>/dev/null || true" EXIT SIGTERM SIGINT

# Wait for MCP services to be ready
echo "Waiting for MCP services to initialize..."
sleep 10

# Verify MCP services are running
echo "Verifying MCP services..."
for port in ${MATH_HTTP_PORT:-8000} ${SEARCH_HTTP_PORT:-8001} ${TRADE_HTTP_PORT:-8002} ${GETPRICE_HTTP_PORT:-8003}; do
    if ! curl -f -s http://localhost:$port/health > /dev/null 2>&1; then
        echo "WARNING: MCP service on port $port not responding"
    else
        echo "✓ MCP service on port $port is healthy"
    fi
done

# Start API server
echo "Starting FastAPI server..."
cd /app

# Use environment variables for host and port
API_HOST=${API_HOST:-0.0.0.0}
API_PORT=${API_PORT:-8080}

echo "API will be available at http://${API_HOST}:${API_PORT}"
echo "=================================="

# Start uvicorn with a single worker (for simplicity in the MVP)
uvicorn api.main:app \
    --host ${API_HOST} \
    --port ${API_PORT} \
    --workers 1 \
    --log-level info
```
### 2.5 Updated docker-compose.yml

```yaml
version: '3.8'

services:
  ai-trader:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: ai-trader-api
    ports:
      - "8080:8080"
    volumes:
      - ./data:/app/data
      - ./configs:/app/configs
      - ./logs:/app/logs
    env_file:
      - .env
    environment:
      - API_HOST=0.0.0.0
      - API_PORT=8080
      - RUNTIME_ENV_PATH=/app/data/runtime_env.json
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    restart: unless-stopped
    networks:
      - ai-trader-network

networks:
  ai-trader-network:
    driver: bridge
```
### 2.6 Environment Variables Reference

```bash
# .env file example for API mode

# OpenAI Configuration
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_API_KEY=sk-...

# API Keys
ALPHAADVANTAGE_API_KEY=your_alpha_vantage_key
JINA_API_KEY=your_jina_key

# MCP Service Ports
MATH_HTTP_PORT=8000
SEARCH_HTTP_PORT=8001
TRADE_HTTP_PORT=8002
GETPRICE_HTTP_PORT=8003

# API Configuration
API_HOST=0.0.0.0
API_PORT=8080

# Runtime Config
RUNTIME_ENV_PATH=/app/data/runtime_env.json

# Job Configuration
MAX_CONCURRENT_JOBS=1
```
### 2.7 Docker Commands Reference

```bash
# Build image
docker-compose build

# Start service
docker-compose up

# Start in background
docker-compose up -d

# View logs
docker-compose logs -f

# Check health
docker-compose ps

# Stop service
docker-compose down

# Restart service
docker-compose restart

# Execute command in running container
docker-compose exec ai-trader python3 -c "from api.job_manager import JobManager; jm = JobManager(); print(jm.get_current_job())"

# Access container shell
docker-compose exec ai-trader bash
```

---
## Part 3: Windmill Integration

### 3.1 Windmill Overview

Windmill (windmill.dev) is a workflow automation platform that can:
- Schedule cron jobs
- Execute TypeScript/Python scripts
- Store state between runs
- Build UI dashboards

**Integration approach:**
1. Windmill cron job triggers the simulation daily
2. Windmill polls for job completion
3. Windmill retrieves results and stores them in its internal database
4. Windmill dashboard displays performance metrics

### 3.2 Flow 1: Daily Simulation Trigger

**File:** `windmill/trigger_simulation.ts`

```typescript
import { Resource } from "https://deno.land/x/windmill@v1.0.0/mod.ts";

export async function main(
  ai_trader_api: Resource<"ai_trader_api">
) {
  const apiUrl = ai_trader_api.base_url; // e.g., "http://ai-trader:8080"

  // Trigger simulation
  const response = await fetch(`${apiUrl}/simulate/trigger`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      config_path: "configs/default_config.json"
    }),
  });

  if (!response.ok) {
    throw new Error(`API error: ${response.status} ${response.statusText}`);
  }

  const data = await response.json();

  // Handle different response types
  if (data.status === "current") {
    console.log("Simulation already up-to-date");
    return {
      action: "skipped",
      message: data.message,
      last_date: data.last_simulation_date
    };
  }

  // Store job_id in Windmill state for the poller to pick up
  await Deno.writeTextFile(
    `/tmp/current_job_id.txt`,
    data.job_id
  );

  console.log(`Simulation triggered: ${data.job_id}`);
  console.log(`Date range: ${data.date_range.join(", ")}`);
  console.log(`Models: ${data.models.join(", ")}`);

  return {
    action: "triggered",
    job_id: data.job_id,
    date_range: data.date_range,
    models: data.models,
    status: data.status
  };
}
```

**Windmill Resource Configuration:**
```json
{
  "resource_type": "ai_trader_api",
  "base_url": "http://ai-trader:8080"
}
```

**Schedule:** Every day at 6:00 AM

---
### 3.3 Flow 2: Job Status Poller

**File:** `windmill/poll_simulation_status.ts`

```typescript
import { Resource } from "https://deno.land/x/windmill@v1.0.0/mod.ts";

export async function main(
  ai_trader_api: Resource<"ai_trader_api">,
  job_id?: string
) {
  const apiUrl = ai_trader_api.base_url;

  // Get job_id from the parameter or from the current-job file
  let jobId = job_id;
  if (!jobId) {
    try {
      jobId = await Deno.readTextFile("/tmp/current_job_id.txt");
    } catch {
      // No current job
      return {
        status: "no_job",
        message: "No active simulation job"
      };
    }
  }

  // Poll status
  const response = await fetch(`${apiUrl}/simulate/status/${jobId}`);

  if (!response.ok) {
    if (response.status === 404) {
      return {
        status: "not_found",
        message: "Job not found",
        job_id: jobId
      };
    }
    throw new Error(`API error: ${response.status}`);
  }

  const data = await response.json();

  console.log(`Job ${jobId}: ${data.status}`);
  console.log(`Progress: ${data.progress.completed}/${data.progress.total_model_days} model-days`);

  // If the job is complete, retrieve results
  if (data.status === "completed" || data.status === "partial") {
    console.log("Job finished, retrieving results...");

    const results = [];
    for (const date of data.date_range) {
      const resultsResponse = await fetch(
        `${apiUrl}/results?date=${date}&detail=minimal`
      );

      if (resultsResponse.ok) {
        const dateResults = await resultsResponse.json();
        results.push(dateResults);
      }
    }

    // Clean up the job_id file
    try {
      await Deno.remove("/tmp/current_job_id.txt");
    } catch {
      // Ignore
    }

    return {
      status: data.status,
      job_id: jobId,
      completed_at: data.completed_at,
      duration_seconds: data.total_duration_seconds,
      results: results
    };
  }

  // Job still running
  return {
    status: data.status,
    job_id: jobId,
    progress: data.progress,
    started_at: data.created_at
  };
}
```

**Schedule:** Every 5 minutes (skips when there is no active job)

---
### 3.4 Flow 3: Results Retrieval and Storage

**File:** `windmill/store_simulation_results.py`

```python
import wmill
from datetime import datetime

def main(
    job_results: dict,
    database: str = "simulation_results"
):
    """
    Store simulation results in Windmill's internal database.

    Args:
        job_results: Output from the poll_simulation_status flow
        database: Database name for storage
    """
    if job_results.get("status") not in ("completed", "partial"):
        return {"message": "Job not complete, skipping storage"}

    # Extract results
    job_id = job_results["job_id"]
    results = job_results.get("results", [])

    stored_count = 0

    for date_result in results:
        date = date_result["date"]

        for model_result in date_result["results"]:
            model = model_result["model"]
            positions = model_result["positions"]
            pnl = model_result["daily_pnl"]

            # Store in Windmill database
            record = {
                "job_id": job_id,
                "date": date,
                "model": model,
                "cash": positions.get("CASH", 0),
                "portfolio_value": pnl["portfolio_value"],
                "daily_profit": pnl["profit"],
                "daily_return_pct": pnl["return_pct"],
                "stored_at": datetime.utcnow().isoformat()
            }

            # Use Windmill's internal storage
            wmill.set_variable(
                path=f"{database}/{model}/{date}",
                value=record
            )

            stored_count += 1

    return {
        "stored_count": stored_count,
        "job_id": job_id,
        "message": f"Stored {stored_count} model-day results"
    }
```

---
### 3.5 Windmill Dashboard Example

**File:** `windmill/dashboard.json` (Windmill App Builder)

```json
{
  "grid": [
    {
      "type": "table",
      "id": "performance_table",
      "configuration": {
        "title": "Model Performance Summary",
        "data_source": {
          "type": "script",
          "path": "f/simulation_results/get_latest_performance"
        },
        "columns": [
          {"field": "model", "header": "Model"},
          {"field": "latest_date", "header": "Latest Date"},
          {"field": "portfolio_value", "header": "Portfolio Value"},
          {"field": "total_return_pct", "header": "Total Return %"},
          {"field": "daily_return_pct", "header": "Daily Return %"}
        ]
      }
    },
    {
      "type": "chart",
      "id": "portfolio_chart",
      "configuration": {
        "title": "Portfolio Value Over Time",
        "chart_type": "line",
        "data_source": {
          "type": "script",
          "path": "f/simulation_results/get_timeseries"
        },
        "x_axis": "date",
        "y_axis": "portfolio_value",
        "series": "model"
      }
    }
  ]
}
```

**Supporting Script:** `windmill/get_latest_performance.py`

```python
import wmill

def main(database: str = "simulation_results"):
    """Get the latest performance for each model"""

    # Query Windmill variables
    all_vars = wmill.list_variables(path_prefix=f"{database}/")

    # Group by model
    models = {}
    for var in all_vars:
        parts = var["path"].split("/")
        if len(parts) >= 3:
            model = parts[1]

            value = wmill.get_variable(var["path"])

            if model not in models:
                models[model] = []
            models[model].append(value)

    # Compute a summary for each model
    summary = []
    for model, records in models.items():
        # Sort by date, newest first
        records.sort(key=lambda x: x["date"], reverse=True)
        latest = records[0]

        # Calculate total return
        initial_value = 10000  # Initial cash
        total_return_pct = ((latest["portfolio_value"] - initial_value) / initial_value) * 100

        summary.append({
            "model": model,
            "latest_date": latest["date"],
            "portfolio_value": latest["portfolio_value"],
            "total_return_pct": round(total_return_pct, 2),
            "daily_return_pct": latest["daily_return_pct"]
        })

    return summary
```

---
### 3.6 Windmill Workflow Orchestration

**Main Workflow:** `windmill/daily_simulation_workflow.yaml`

```yaml
name: Daily AI Trader Simulation
description: Trigger simulation, poll status, and store results

triggers:
  - type: cron
    schedule: "0 6 * * *"  # Every day at 6 AM

steps:
  - id: trigger
    name: Trigger Simulation
    script: f/ai_trader/trigger_simulation
    outputs:
      - job_id
      - action

  - id: wait
    name: Wait for Job Start
    type: sleep
    duration: 10s

  - id: poll_loop
    name: Poll Until Complete
    type: loop
    max_iterations: 60  # Poll for up to 5 hours (60 × 5min)
    interval: 5m
    script: f/ai_trader/poll_simulation_status
    inputs:
      job_id: ${{ steps.trigger.outputs.job_id }}
    break_condition: |
      ${{ steps.poll_loop.outputs.status in ['completed', 'partial', 'failed'] }}

  - id: store_results
    name: Store Results in Database
    script: f/ai_trader/store_simulation_results
    inputs:
      job_results: ${{ steps.poll_loop.outputs }}
    condition: |
      ${{ steps.poll_loop.outputs.status in ['completed', 'partial'] }}

  - id: notify
    name: Send Notification
    type: email
    to: admin@example.com
    subject: "AI Trader Simulation Complete"
    body: |
      Simulation completed for ${{ steps.poll_loop.outputs.job_id }}
      Status: ${{ steps.poll_loop.outputs.status }}
      Duration: ${{ steps.poll_loop.outputs.duration_seconds }}s
```

---
### 3.7 Testing Windmill Integration Locally

**1. Start AI-Trader API:**
```bash
docker-compose up -d
```

**2. Test trigger endpoint:**
```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -H "Content-Type: application/json" \
  -d '{"config_path": "configs/default_config.json"}'
```

**3. Test status polling:**
```bash
JOB_ID="<job_id_from_step_2>"
curl http://localhost:8080/simulate/status/$JOB_ID
```

**4. Test results retrieval:**
```bash
curl "http://localhost:8080/results?date=2025-01-16&model=gpt-5&detail=minimal"
```

**5. Deploy to Windmill:**
```bash
# Install Windmill CLI
npm install -g windmill-cli

# Log in to your Windmill instance
wmill login https://your-windmill-instance.com

# Deploy scripts
wmill script push windmill/trigger_simulation.ts
wmill script push windmill/poll_simulation_status.ts
wmill script push windmill/store_simulation_results.py

# Deploy workflow
wmill flow push windmill/daily_simulation_workflow.yaml
```

---
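Outside Windmill, the same poll-until-done pattern from step 3 can be scripted directly. A minimal illustrative client using only the Python standard library (the endpoint path follows this spec; the function name, interval, and timeout are arbitrary choices, not part of the API):

```python
import json
import time
import urllib.request

def wait_for_job(base_url: str, job_id: str,
                 interval: int = 30, timeout: int = 3600) -> dict:
    """Poll /simulate/status/{job_id} until the job leaves the active states."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        with urllib.request.urlopen(f"{base_url}/simulate/status/{job_id}") as resp:
            status = json.load(resp)
        # 'pending' and 'running' are the only non-terminal states in this spec
        if status["status"] not in ("pending", "running"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {timeout}s")
```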
## Part 4: Complete File Structure

After implementation, the project structure will be:

```
AI-Trader/
├── api/
│   ├── __init__.py
│   ├── main.py               # FastAPI application
│   ├── models.py             # Pydantic request/response models
│   ├── job_manager.py        # Job lifecycle management
│   ├── database.py           # SQLite utilities
│   ├── worker.py             # Background simulation worker
│   ├── executor.py           # Single model-day execution
│   └── runtime_manager.py    # Runtime config isolation
│
├── docs/
│   ├── api-specification.md
│   ├── job-manager-specification.md
│   ├── worker-specification.md
│   └── implementation-specifications.md
│
├── windmill/
│   ├── trigger_simulation.ts
│   ├── poll_simulation_status.ts
│   ├── store_simulation_results.py
│   ├── get_latest_performance.py
│   ├── daily_simulation_workflow.yaml
│   └── dashboard.json
│
├── agent/
│   └── base_agent/
│       └── base_agent.py     # NO CHANGES NEEDED
│
├── agent_tools/
│   └── ... (existing MCP tools)
│
├── data/
│   ├── jobs.db               # SQLite database (created automatically)
│   ├── runtime_env*.json     # Runtime configs (temporary)
│   ├── agent_data/           # Existing position/log data
│   └── merged.jsonl          # Existing price data
│
├── Dockerfile                # Updated for API mode
├── docker-compose.yml        # Updated service definition
├── docker-entrypoint-api.sh  # New API entrypoint
├── requirements-api.txt      # FastAPI dependencies
├── .env                      # Environment configuration
└── main.py                   # Existing (used by worker)
```

---
## Part 5: Implementation Checklist

### Phase 1: API Foundation (Days 1-2)
- [ ] Create `api/` directory structure
- [ ] Implement `api/models.py` with Pydantic models
- [ ] Implement `api/database.py` with SQLite utilities
- [ ] Implement `api/job_manager.py` with job CRUD operations
- [ ] Write unit tests for job_manager
- [ ] Test database operations manually

### Phase 2: Worker & Executor (Days 3-4)
- [ ] Implement `api/runtime_manager.py`
- [ ] Implement `api/executor.py` for single model-day execution
- [ ] Implement `api/worker.py` for job orchestration
- [ ] Test worker with mock agent
- [ ] Test runtime config isolation

### Phase 3: FastAPI Endpoints (Days 5-6)
- [ ] Implement `api/main.py` with all endpoints
- [ ] Implement `/simulate/trigger` with background tasks
- [ ] Implement `/simulate/status/{job_id}`
- [ ] Implement `/simulate/current`
- [ ] Implement `/results` with detail levels
- [ ] Implement `/health` with MCP checks
- [ ] Test all endpoints with Postman/curl

### Phase 4: Docker Integration (Day 7)
- [ ] Update `Dockerfile`
- [ ] Create `docker-entrypoint-api.sh`
- [ ] Create `requirements-api.txt`
- [ ] Update `docker-compose.yml`
- [ ] Test Docker build
- [ ] Test container startup and health checks
- [ ] Test end-to-end simulation via API in Docker

### Phase 5: Windmill Integration (Days 8-9)
- [ ] Create Windmill scripts (trigger, poll, store)
- [ ] Test scripts locally against Docker API
- [ ] Deploy scripts to Windmill instance
- [ ] Create Windmill workflow
- [ ] Test workflow end-to-end
- [ ] Create Windmill dashboard
- [ ] Document Windmill setup process

### Phase 6: Testing & Documentation (Day 10)
- [ ] Integration tests for the complete workflow
- [ ] Load testing (multiple concurrent requests)
- [ ] Error scenario testing (MCP down, API timeout)
- [ ] Update README.md with API usage
- [ ] Create API documentation (Swagger/OpenAPI)
- [ ] Create deployment guide
- [ ] Create troubleshooting guide

---

## Summary

This specification covers:

1. **BaseAgent Refactoring:** Minimal changes needed (the existing code is compatible)
2. **Docker Configuration:** API service mode with health checks and a proper entrypoint
3. **Windmill Integration:** Complete workflow automation with TypeScript/Python scripts
4. **File Structure:** Clear organization of the new API components
5. **Implementation Checklist:** Step-by-step plan for a 10-day implementation

**Total estimated implementation time:** 10 working days for the MVP

**Next Step:** Review all specifications (api-specification.md, job-manager-specification.md, worker-specification.md, and this document) and approve before beginning implementation.
# Job Manager & Database Specification

## 1. Overview

The Job Manager is responsible for:
1. **Job lifecycle management** - Creating, tracking, and updating job status
2. **Database operations** - SQLite CRUD operations for jobs and job_details
3. **Concurrency control** - Ensuring only one simulation runs at a time
4. **State persistence** - Maintaining job state across API restarts

---

## 2. Database Schema

### 2.1 SQLite Database Location

```
data/jobs.db
```

**Rationale:** Co-located with simulation data for easy volume mounting
### 2.2 Table: jobs

**Purpose:** Track high-level job metadata and status

```sql
CREATE TABLE IF NOT EXISTS jobs (
    job_id TEXT PRIMARY KEY,
    config_path TEXT NOT NULL,
    status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
    date_range TEXT NOT NULL,       -- JSON array: ["2025-01-16", "2025-01-17"]
    models TEXT NOT NULL,           -- JSON array: ["claude-3.7-sonnet", "gpt-5"]
    created_at TEXT NOT NULL,       -- ISO 8601: "2025-01-20T14:30:00Z"
    started_at TEXT,                -- When the first model-day started
    completed_at TEXT,              -- When the last model-day finished
    total_duration_seconds REAL,
    error TEXT                      -- Top-level error message if the job failed
);

-- Indexes for performance
CREATE INDEX IF NOT EXISTS idx_jobs_status ON jobs(status);
CREATE INDEX IF NOT EXISTS idx_jobs_created_at ON jobs(created_at DESC);
```

**Field Details:**
- `job_id`: UUID v4 (e.g., `550e8400-e29b-41d4-a716-446655440000`)
- `status`: Current job state
  - `pending`: Job created, not started yet
  - `running`: At least one model-day is executing
  - `completed`: All model-days succeeded
  - `partial`: Some model-days succeeded, some failed
  - `failed`: All model-days failed (rare edge case)
- `date_range`: JSON string for easy querying
- `models`: JSON string of enabled model signatures
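The status values above imply a small state machine: jobs move from `pending` to `running`, then to one of the three terminal states. An illustrative guard (not part of the spec's API; the names here are hypothetical) could enforce those transitions:

```python
# Legal transitions implied by the status list above.
JOB_TRANSITIONS = {
    "pending": {"running", "failed"},
    "running": {"completed", "partial", "failed"},
    # Terminal states: no further transitions allowed
    "completed": set(),
    "partial": set(),
    "failed": set(),
}

def check_transition(old: str, new: str) -> None:
    """Raise if a status update would violate the job state machine."""
    if new not in JOB_TRANSITIONS.get(old, set()):
        raise ValueError(f"illegal job transition: {old} -> {new}")
```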
### 2.3 Table: job_details

**Purpose:** Track individual model-day execution status

```sql
CREATE TABLE IF NOT EXISTS job_details (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT NOT NULL,
    date TEXT NOT NULL,             -- "2025-01-16"
    model TEXT NOT NULL,            -- "gpt-5"
    status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
    started_at TEXT,
    completed_at TEXT,
    duration_seconds REAL,
    error TEXT,                     -- Error message if this model-day failed
    FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
);

-- Indexes
CREATE INDEX IF NOT EXISTS idx_job_details_job_id ON job_details(job_id);
CREATE INDEX IF NOT EXISTS idx_job_details_status ON job_details(status);
CREATE UNIQUE INDEX IF NOT EXISTS idx_job_details_unique ON job_details(job_id, date, model);
```

**Field Details:**
- Each row represents one model-day (e.g., `gpt-5` on `2025-01-16`)
- The `UNIQUE INDEX` prevents duplicate execution entries
- `ON DELETE CASCADE` ensures orphaned records are cleaned up
|
||||
|
||||
**jobs table:**
|
||||
```
|
||||
job_id | config_path | status | date_range | models | created_at | started_at | completed_at | total_duration_seconds
|
||||
--------------------------------------|--------------------------|-----------|-----------------------------------|---------------------------------|----------------------|----------------------|----------------------|----------------------
|
||||
550e8400-e29b-41d4-a716-446655440000 | configs/default_config.json | completed | ["2025-01-16","2025-01-17"] | ["gpt-5","claude-3.7-sonnet"] | 2025-01-20T14:25:00Z | 2025-01-20T14:25:10Z | 2025-01-20T14:29:45Z | 275.3
|
||||
```
|
||||
|
||||
**job_details table:**
|
||||
```
|
||||
id | job_id | date | model | status | started_at | completed_at | duration_seconds | error
|
||||
---|--------------------------------------|------------|--------------------|-----------|----------------------|----------------------|------------------|------
|
||||
1 | 550e8400-e29b-41d4-a716-446655440000 | 2025-01-16 | gpt-5 | completed | 2025-01-20T14:25:10Z | 2025-01-20T14:25:48Z | 38.2 | NULL
|
||||
2 | 550e8400-e29b-41d4-a716-446655440000 | 2025-01-16 | claude-3.7-sonnet | completed | 2025-01-20T14:25:10Z | 2025-01-20T14:25:55Z | 45.1 | NULL
|
||||
3 | 550e8400-e29b-41d4-a716-446655440000 | 2025-01-17 | gpt-5 | completed | 2025-01-20T14:25:56Z | 2025-01-20T14:26:36Z | 40.0 | NULL
|
||||
4 | 550e8400-e29b-41d4-a716-446655440000 | 2025-01-17 | claude-3.7-sonnet | completed | 2025-01-20T14:25:56Z | 2025-01-20T14:26:42Z | 46.5 | NULL
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Job Manager Class

### 3.1 File Structure

```
api/
├── job_manager.py   # Core JobManager class
├── database.py      # SQLite connection and utilities
└── models.py        # Pydantic models
```
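The JobManager code below imports `get_db_connection` from `api/database.py`, which this spec does not show. A plausible minimal version is sketched here (an assumption, not the actual implementation); the `PRAGMA foreign_keys` line matters because SQLite otherwise ignores the `ON DELETE CASCADE` clause from section 2.3:

```python
import sqlite3

def get_db_connection(db_path: str) -> sqlite3.Connection:
    """Open a SQLite connection with the settings the schema relies on."""
    conn = sqlite3.connect(db_path)
    # Foreign-key enforcement is off by default and must be enabled
    # per-connection for ON DELETE CASCADE to take effect.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn
```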
### 3.2 JobManager Interface

```python
# api/job_manager.py

from datetime import datetime
from typing import Optional, List, Dict, Tuple
import uuid
import json
from api.database import get_db_connection

class JobManager:
    """Manages simulation job lifecycle and database operations"""

    def __init__(self, db_path: str = "data/jobs.db"):
        self.db_path = db_path
        self._initialize_database()

    def _initialize_database(self) -> None:
        """Create tables if they don't exist"""
        conn = get_db_connection(self.db_path)
        # Execute the CREATE TABLE statements from sections 2.2 and 2.3
        conn.close()

    # ========== Job Creation ==========

    def create_job(
        self,
        config_path: str,
        date_range: List[str],
        models: List[str]
    ) -> str:
        """
        Create a new simulation job.

        Args:
            config_path: Path to config file
            date_range: List of trading dates to simulate
            models: List of model signatures to run

        Returns:
            job_id: UUID of created job

        Raises:
            ValueError: If another job is already running
        """
        # 1. Check if any jobs are currently running
        if not self.can_start_new_job():
            raise ValueError("Another simulation job is already running")

        # 2. Generate job ID
        job_id = str(uuid.uuid4())

        # 3. Create job record
        conn = get_db_connection(self.db_path)
        cursor = conn.cursor()

        cursor.execute("""
            INSERT INTO jobs (
                job_id, config_path, status, date_range, models, created_at
            ) VALUES (?, ?, ?, ?, ?, ?)
        """, (
            job_id,
            config_path,
            "pending",
            json.dumps(date_range),
            json.dumps(models),
            datetime.utcnow().isoformat() + "Z"
        ))

        # 4. Create job_details records for each model-day
        for date in date_range:
            for model in models:
                cursor.execute("""
                    INSERT INTO job_details (
                        job_id, date, model, status
                    ) VALUES (?, ?, ?, ?)
                """, (job_id, date, model, "pending"))

        conn.commit()
        conn.close()

        return job_id

    # ========== Job Retrieval ==========

    def get_job(self, job_id: str) -> Optional[Dict]:
        """
        Get job metadata by ID.

        Returns:
            Job dict with keys: job_id, config_path, status, date_range (list),
            models (list), created_at, started_at, completed_at, total_duration_seconds

        Returns None if the job is not found.
        """
        conn = get_db_connection(self.db_path)
        cursor = conn.cursor()

        cursor.execute("SELECT * FROM jobs WHERE job_id = ?", (job_id,))
        row = cursor.fetchone()
        conn.close()

        if row is None:
            return None

        return {
            "job_id": row[0],
            "config_path": row[1],
            "status": row[2],
            "date_range": json.loads(row[3]),
            "models": json.loads(row[4]),
            "created_at": row[5],
            "started_at": row[6],
            "completed_at": row[7],
            "total_duration_seconds": row[8],
            "error": row[9]
        }

    def get_current_job(self) -> Optional[Dict]:
        """Get the most recent job (for the /simulate/current endpoint)"""
        conn = get_db_connection(self.db_path)
        cursor = conn.cursor()

        cursor.execute("""
            SELECT * FROM jobs
            ORDER BY created_at DESC
            LIMIT 1
        """)
        row = cursor.fetchone()
        conn.close()

        if row is None:
            return None

        return self._row_to_job_dict(row)

    def get_running_jobs(self) -> List[Dict]:
        """Get all running or pending jobs"""
        conn = get_db_connection(self.db_path)
        cursor = conn.cursor()

        cursor.execute("""
            SELECT * FROM jobs
            WHERE status IN ('pending', 'running')
            ORDER BY created_at DESC
        """)
        rows = cursor.fetchall()
        conn.close()

        return [self._row_to_job_dict(row) for row in rows]

    # ========== Job Status Updates ==========

    def update_job_status(
        self,
        job_id: str,
        status: str,
        error: Optional[str] = None
    ) -> None:
        """Update job status (pending → running → completed/partial/failed)"""
|
||||
conn = get_db_connection(self.db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
updates = {"status": status}
|
||||
|
||||
if status == "running" and self.get_job(job_id)["status"] == "pending":
|
||||
updates["started_at"] = datetime.utcnow().isoformat() + "Z"
|
||||
|
||||
if status in ("completed", "partial", "failed"):
|
||||
updates["completed_at"] = datetime.utcnow().isoformat() + "Z"
|
||||
# Calculate total duration
|
||||
job = self.get_job(job_id)
|
||||
if job["started_at"]:
|
||||
started = datetime.fromisoformat(job["started_at"].replace("Z", ""))
|
||||
completed = datetime.utcnow()
|
||||
updates["total_duration_seconds"] = (completed - started).total_seconds()
|
||||
|
||||
if error:
|
||||
updates["error"] = error
|
||||
|
||||
# Build dynamic UPDATE query
|
||||
set_clause = ", ".join([f"{k} = ?" for k in updates.keys()])
|
||||
values = list(updates.values()) + [job_id]
|
||||
|
||||
cursor.execute(f"""
|
||||
UPDATE jobs
|
||||
SET {set_clause}
|
||||
WHERE job_id = ?
|
||||
""", values)
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
|
||||
def update_job_detail_status(
|
||||
self,
|
||||
job_id: str,
|
||||
date: str,
|
||||
model: str,
|
||||
status: str,
|
||||
error: Optional[str] = None
|
||||
) -> None:
|
||||
"""Update individual model-day status"""
|
||||
conn = get_db_connection(self.db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
updates = {"status": status}
|
||||
|
||||
# Get current detail status to determine if this is a status transition
|
||||
cursor.execute("""
|
||||
SELECT status, started_at FROM job_details
|
||||
WHERE job_id = ? AND date = ? AND model = ?
|
||||
""", (job_id, date, model))
|
||||
row = cursor.fetchone()
|
||||
|
||||
if row:
|
||||
current_status = row[0]
|
||||
|
||||
if status == "running" and current_status == "pending":
|
||||
updates["started_at"] = datetime.utcnow().isoformat() + "Z"
|
||||
|
||||
if status in ("completed", "failed"):
|
||||
updates["completed_at"] = datetime.utcnow().isoformat() + "Z"
|
||||
# Calculate duration if started_at exists
|
||||
if row[1]: # started_at
|
||||
started = datetime.fromisoformat(row[1].replace("Z", ""))
|
||||
completed = datetime.utcnow()
|
||||
updates["duration_seconds"] = (completed - started).total_seconds()
|
||||
|
||||
if error:
|
||||
updates["error"] = error
|
||||
|
||||
# Build UPDATE query
|
||||
set_clause = ", ".join([f"{k} = ?" for k in updates.keys()])
|
||||
values = list(updates.values()) + [job_id, date, model]
|
||||
|
||||
cursor.execute(f"""
|
||||
UPDATE job_details
|
||||
SET {set_clause}
|
||||
WHERE job_id = ? AND date = ? AND model = ?
|
||||
""", values)
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
|
||||
# After updating detail, check if overall job status needs update
|
||||
self._update_job_status_from_details(job_id)
|
||||
|
||||
def _update_job_status_from_details(self, job_id: str) -> None:
|
||||
"""
|
||||
Recalculate job status based on job_details statuses.
|
||||
|
||||
Logic:
|
||||
- If any detail is 'running' → job is 'running'
|
||||
- If all details are 'completed' → job is 'completed'
|
||||
- If some details are 'completed' and some 'failed' → job is 'partial'
|
||||
- If all details are 'failed' → job is 'failed'
|
||||
- If all details are 'pending' → job is 'pending'
|
||||
"""
|
||||
conn = get_db_connection(self.db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
cursor.execute("""
|
||||
SELECT status, COUNT(*)
|
||||
FROM job_details
|
||||
WHERE job_id = ?
|
||||
GROUP BY status
|
||||
""", (job_id,))
|
||||
|
||||
status_counts = {row[0]: row[1] for row in cursor.fetchall()}
|
||||
conn.close()
|
||||
|
||||
# Determine overall job status
|
||||
if status_counts.get("running", 0) > 0:
|
||||
new_status = "running"
|
||||
elif status_counts.get("pending", 0) > 0:
|
||||
# Some details still pending, job is either pending or running
|
||||
current_job = self.get_job(job_id)
|
||||
new_status = current_job["status"] # Keep current status
|
||||
elif status_counts.get("failed", 0) > 0 and status_counts.get("completed", 0) > 0:
|
||||
new_status = "partial"
|
||||
elif status_counts.get("failed", 0) > 0:
|
||||
new_status = "failed"
|
||||
else:
|
||||
new_status = "completed"
|
||||
|
||||
self.update_job_status(job_id, new_status)
|
||||
|
||||
# ========== Job Progress ==========
|
||||
|
||||
def get_job_progress(self, job_id: str) -> Dict:
|
||||
"""
|
||||
Get detailed progress for a job.
|
||||
|
||||
Returns:
|
||||
{
|
||||
"total_model_days": int,
|
||||
"completed": int,
|
||||
"failed": int,
|
||||
"current": {"date": str, "model": str} | None,
|
||||
"details": [
|
||||
{"date": str, "model": str, "status": str, "duration_seconds": float | None, "error": str | None},
|
||||
...
|
||||
]
|
||||
}
|
||||
"""
|
||||
conn = get_db_connection(self.db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
# Get all details for this job
|
||||
cursor.execute("""
|
||||
SELECT date, model, status, started_at, completed_at, duration_seconds, error
|
||||
FROM job_details
|
||||
WHERE job_id = ?
|
||||
ORDER BY date ASC, model ASC
|
||||
""", (job_id,))
|
||||
|
||||
rows = cursor.fetchall()
|
||||
conn.close()
|
||||
|
||||
if not rows:
|
||||
return {
|
||||
"total_model_days": 0,
|
||||
"completed": 0,
|
||||
"failed": 0,
|
||||
"current": None,
|
||||
"details": []
|
||||
}
|
||||
|
||||
total = len(rows)
|
||||
completed = sum(1 for row in rows if row[2] == "completed")
|
||||
failed = sum(1 for row in rows if row[2] == "failed")
|
||||
|
||||
# Find currently running model-day
|
||||
current = None
|
||||
for row in rows:
|
||||
if row[2] == "running":
|
||||
current = {"date": row[0], "model": row[1]}
|
||||
break
|
||||
|
||||
# Build details list
|
||||
details = []
|
||||
for row in rows:
|
||||
details.append({
|
||||
"date": row[0],
|
||||
"model": row[1],
|
||||
"status": row[2],
|
||||
"started_at": row[3],
|
||||
"completed_at": row[4],
|
||||
"duration_seconds": row[5],
|
||||
"error": row[6]
|
||||
})
|
||||
|
||||
return {
|
||||
"total_model_days": total,
|
||||
"completed": completed,
|
||||
"failed": failed,
|
||||
"current": current,
|
||||
"details": details
|
||||
}
|
||||
|
||||
# ========== Concurrency Control ==========
|
||||
|
||||
def can_start_new_job(self) -> bool:
|
||||
"""Check if a new job can be started (max 1 concurrent job)"""
|
||||
running_jobs = self.get_running_jobs()
|
||||
return len(running_jobs) == 0
|
||||
|
||||
def find_job_by_date_range(self, date_range: List[str]) -> Optional[Dict]:
|
||||
"""Find job with exact matching date range (for idempotency check)"""
|
||||
conn = get_db_connection(self.db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
# Query recent jobs (last 24 hours)
|
||||
cursor.execute("""
|
||||
SELECT * FROM jobs
|
||||
WHERE created_at > datetime('now', '-1 day')
|
||||
ORDER BY created_at DESC
|
||||
""")
|
||||
|
||||
rows = cursor.fetchall()
|
||||
conn.close()
|
||||
|
||||
# Check each job's date_range
|
||||
target_range = set(date_range)
|
||||
for row in rows:
|
||||
job_range = set(json.loads(row[3])) # date_range column
|
||||
if job_range == target_range:
|
||||
return self._row_to_job_dict(row)
|
||||
|
||||
return None
|
||||
|
||||
# ========== Utility Methods ==========
|
||||
|
||||
def _row_to_job_dict(self, row: tuple) -> Dict:
|
||||
"""Convert DB row to job dictionary"""
|
||||
return {
|
||||
"job_id": row[0],
|
||||
"config_path": row[1],
|
||||
"status": row[2],
|
||||
"date_range": json.loads(row[3]),
|
||||
"models": json.loads(row[4]),
|
||||
"created_at": row[5],
|
||||
"started_at": row[6],
|
||||
"completed_at": row[7],
|
||||
"total_duration_seconds": row[8],
|
||||
"error": row[9]
|
||||
}
|
||||
|
||||
def cleanup_old_jobs(self, days: int = 30) -> int:
|
||||
"""
|
||||
Delete jobs older than specified days (cleanup maintenance).
|
||||
|
||||
Returns:
|
||||
Number of jobs deleted
|
||||
"""
|
||||
conn = get_db_connection(self.db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
cursor.execute("""
|
||||
DELETE FROM jobs
|
||||
WHERE created_at < datetime('now', '-' || ? || ' days')
|
||||
""", (days,))
|
||||
|
||||
deleted_count = cursor.rowcount
|
||||
conn.commit()
|
||||
conn.close()
|
||||
|
||||
return deleted_count
|
||||
```
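
The rollup rules in `_update_job_status_from_details` can also be factored into a pure function, which makes the status logic easy to unit-test without a database. This is a sketch; `derive_job_status` is a hypothetical helper, not part of the module above:

```python
def derive_job_status(status_counts: dict, current_status: str = "pending") -> str:
    """Map job_details status counts to an overall job status."""
    if status_counts.get("running", 0) > 0:
        return "running"
    if status_counts.get("pending", 0) > 0:
        # Some model-days have not started yet; keep the job's current status
        return current_status
    if status_counts.get("failed", 0) > 0 and status_counts.get("completed", 0) > 0:
        return "partial"
    if status_counts.get("failed", 0) > 0:
        return "failed"
    return "completed"
```

Keeping this decision table separate from the SQL makes the partial/failed boundary cases trivial to cover in tests.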

---

## 4. Database Utility Module

```python
# api/database.py

import sqlite3
from typing import Optional
import os

def get_db_connection(db_path: str = "data/jobs.db") -> sqlite3.Connection:
    """
    Get SQLite database connection.

    Ensures:
    - Database directory exists
    - Foreign keys are enabled
    - Row factory returns dict-like objects
    """
    # Ensure the data directory exists
    os.makedirs(os.path.dirname(db_path), exist_ok=True)

    conn = sqlite3.connect(db_path, check_same_thread=False)
    conn.execute("PRAGMA foreign_keys = ON")  # Enable FK constraints
    conn.row_factory = sqlite3.Row  # Return rows as dict-like objects

    return conn

def initialize_database(db_path: str = "data/jobs.db") -> None:
    """Create database tables if they don't exist"""
    conn = get_db_connection(db_path)
    cursor = conn.cursor()

    # Create jobs table
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS jobs (
            job_id TEXT PRIMARY KEY,
            config_path TEXT NOT NULL,
            status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'partial', 'failed')),
            date_range TEXT NOT NULL,
            models TEXT NOT NULL,
            created_at TEXT NOT NULL,
            started_at TEXT,
            completed_at TEXT,
            total_duration_seconds REAL,
            error TEXT
        )
    """)

    # Create indexes
    cursor.execute("""
        CREATE INDEX IF NOT EXISTS idx_jobs_status ON jobs(status)
    """)
    cursor.execute("""
        CREATE INDEX IF NOT EXISTS idx_jobs_created_at ON jobs(created_at DESC)
    """)

    # Create job_details table
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS job_details (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            job_id TEXT NOT NULL,
            date TEXT NOT NULL,
            model TEXT NOT NULL,
            status TEXT NOT NULL CHECK(status IN ('pending', 'running', 'completed', 'failed')),
            started_at TEXT,
            completed_at TEXT,
            duration_seconds REAL,
            error TEXT,
            FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE
        )
    """)

    # Create indexes
    cursor.execute("""
        CREATE INDEX IF NOT EXISTS idx_job_details_job_id ON job_details(job_id)
    """)
    cursor.execute("""
        CREATE INDEX IF NOT EXISTS idx_job_details_status ON job_details(status)
    """)
    cursor.execute("""
        CREATE UNIQUE INDEX IF NOT EXISTS idx_job_details_unique
        ON job_details(job_id, date, model)
    """)

    conn.commit()
    conn.close()
```

---

## 5. State Transitions

### 5.1 Job Status State Machine

```
pending ──────> running ──────> completed
                   │
                   ├──────────> partial
                   │
                   └──────────> failed
```

**Transition Logic:**
- `pending → running`: When the first model-day starts executing
- `running → completed`: When all model-days complete successfully
- `running → partial`: When some model-days succeed and some fail
- `running → failed`: When all model-days fail (rare)

### 5.2 Job Detail Status State Machine

```
pending ──────> running ──────> completed
                   │
                   └──────────> failed
```

**Transition Logic:**
- `pending → running`: When the worker starts executing that model-day
- `running → completed`: When `agent.run_trading_session()` succeeds
- `running → failed`: When `agent.run_trading_session()` raises an exception after retries
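
These transition rules can be enforced with a small guard table so that illegal updates (e.g. `completed → running`) are rejected early. This is an illustrative sketch; `VALID_TRANSITIONS` and `assert_transition` are hypothetical names, not part of the codebase:

```python
# Legal status transitions for a job_details row, per the state machine above
VALID_TRANSITIONS = {
    "pending": {"running"},
    "running": {"completed", "failed"},
    "completed": set(),  # terminal
    "failed": set(),     # terminal
}

def assert_transition(current: str, new: str) -> None:
    """Raise ValueError if moving from `current` to `new` is not allowed."""
    if new not in VALID_TRANSITIONS.get(current, set()):
        raise ValueError(f"Illegal transition: {current} -> {new}")
```

A guard like this could be called at the top of `update_job_detail_status` to catch out-of-order worker updates.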

---

## 6. Concurrency Scenarios

### 6.1 Scenario: Duplicate Trigger Requests

**Timeline:**
1. Request A: POST /simulate/trigger → Job created with date_range=[2025-01-16, 2025-01-17]
2. Request B (5 seconds later): POST /simulate/trigger → Same date range

**Expected Behavior:**
- Request A: Returns `{"job_id": "abc123", "status": "accepted"}`
- Request B: `find_job_by_date_range()` finds Job abc123
- Request B: Returns `{"job_id": "abc123", "status": "running", ...}` (same job)

**Code:**
```python
# In the /simulate/trigger endpoint
existing_job = job_manager.find_job_by_date_range(date_range)
if existing_job:
    # Return the existing job instead of creating a duplicate
    return existing_job
```

### 6.2 Scenario: Concurrent Jobs with Different Dates

**Timeline:**
1. Job A running: date_range=[2025-01-01 to 2025-01-10] (started 5 min ago)
2. Request: POST /simulate/trigger with date_range=[2025-01-11 to 2025-01-15]

**Expected Behavior:**
- `can_start_new_job()` returns False (Job A is still running)
- Request returns 409 Conflict with details of Job A
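
The 409 decision can be isolated from the web framework as a pure function that returns the HTTP status and body the endpoint should send. This is a sketch with assumed response shapes; the real endpoint payloads may differ:

```python
def trigger_response(running_jobs: list) -> tuple:
    """Return (http_status, body) for a trigger request, given currently running jobs."""
    if running_jobs:
        blocking = running_jobs[0]
        return 409, {
            "error": "Another simulation job is already running",
            "job_id": blocking["job_id"],
            "status": blocking["status"],
        }
    return 202, {"status": "accepted"}
```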

### 6.3 Scenario: Job Cleanup on API Restart

**Problem:** The API crashes while a job is running. On restart, the job is stuck in the "running" state.

**Solution:** On API startup, detect stale jobs and mark them as failed:
```python
# In api/main.py startup event
@app.on_event("startup")
async def startup_event():
    job_manager = JobManager()

    # Find jobs stuck in 'running' or 'pending' state
    stale_jobs = job_manager.get_running_jobs()

    for job in stale_jobs:
        # Mark as failed with an explanation
        job_manager.update_job_status(
            job["job_id"],
            "failed",
            error="API restarted while job was running"
        )
```

---

## 7. Testing Strategy

### 7.1 Unit Tests

```python
# tests/test_job_manager.py

import pytest
from api.job_manager import JobManager
import tempfile
import os

@pytest.fixture
def job_manager():
    # Use a temporary database for tests
    temp_db = tempfile.NamedTemporaryFile(delete=False, suffix=".db")
    temp_db.close()

    jm = JobManager(db_path=temp_db.name)
    yield jm

    # Cleanup
    os.unlink(temp_db.name)

def test_create_job(job_manager):
    job_id = job_manager.create_job(
        config_path="configs/test.json",
        date_range=["2025-01-16", "2025-01-17"],
        models=["gpt-5", "claude-3.7-sonnet"]
    )

    assert job_id is not None
    job = job_manager.get_job(job_id)
    assert job["status"] == "pending"
    assert job["date_range"] == ["2025-01-16", "2025-01-17"]

    # Check job_details created
    progress = job_manager.get_job_progress(job_id)
    assert progress["total_model_days"] == 4  # 2 dates × 2 models

def test_concurrent_job_blocked(job_manager):
    # Create first job
    job1_id = job_manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])

    # Try to create a second job while the first is pending
    with pytest.raises(ValueError, match="Another simulation job is already running"):
        job_manager.create_job("configs/test.json", ["2025-01-17"], ["gpt-5"])

    # Mark the first job as completed
    job_manager.update_job_status(job1_id, "completed")

    # Now a second job should be allowed
    job2_id = job_manager.create_job("configs/test.json", ["2025-01-17"], ["gpt-5"])
    assert job2_id is not None

def test_job_status_transitions(job_manager):
    job_id = job_manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])

    # Update job detail to running
    job_manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")

    # Job should now be 'running'
    job = job_manager.get_job(job_id)
    assert job["status"] == "running"
    assert job["started_at"] is not None

    # Complete the detail
    job_manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")

    # Job should now be 'completed'
    job = job_manager.get_job(job_id)
    assert job["status"] == "completed"
    assert job["completed_at"] is not None

def test_partial_job_status(job_manager):
    job_id = job_manager.create_job(
        "configs/test.json",
        ["2025-01-16"],
        ["gpt-5", "claude-3.7-sonnet"]
    )

    # One model succeeds
    job_manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
    job_manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")

    # One model fails
    job_manager.update_job_detail_status(job_id, "2025-01-16", "claude-3.7-sonnet", "running")
    job_manager.update_job_detail_status(
        job_id, "2025-01-16", "claude-3.7-sonnet", "failed",
        error="API timeout"
    )

    # Job should be 'partial'
    job = job_manager.get_job(job_id)
    assert job["status"] == "partial"

    progress = job_manager.get_job_progress(job_id)
    assert progress["completed"] == 1
    assert progress["failed"] == 1
```

---

## 8. Performance Considerations

### 8.1 Database Indexing

- `idx_jobs_status`: Fast filtering for running jobs
- `idx_jobs_created_at DESC`: Fast retrieval of the most recent job
- `idx_job_details_unique`: Prevents duplicate model-day entries

### 8.2 Connection Pooling

For the MVP, opening a connection with `sqlite3.connect()` per operation is acceptable (low concurrency).

For higher concurrency (future), consider:
- SQLAlchemy ORM with connection pooling
- PostgreSQL for production deployments

### 8.3 Query Optimization

**Avoid N+1 queries:**
```python
# BAD: Separate query for each job's progress
for job in jobs:
    progress = job_manager.get_job_progress(job["job_id"])
```

```sql
-- GOOD: Join jobs and job_details in a single query
SELECT
    jobs.*,
    COUNT(job_details.id) AS total,
    SUM(CASE WHEN job_details.status = 'completed' THEN 1 ELSE 0 END) AS completed
FROM jobs
LEFT JOIN job_details ON jobs.job_id = job_details.job_id
GROUP BY jobs.job_id
```
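
The aggregate join can be verified against an in-memory SQLite database. This is a self-contained sketch with simplified schemas, not the production DDL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (job_id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE job_details (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        job_id TEXT, status TEXT
    );
    INSERT INTO jobs VALUES ('j1', 'running');
    INSERT INTO job_details (job_id, status) VALUES
        ('j1', 'completed'), ('j1', 'completed'), ('j1', 'failed');
""")

# One round trip yields the per-job totals that would otherwise need N+1 queries
row = conn.execute("""
    SELECT jobs.job_id,
           COUNT(job_details.id) AS total,
           SUM(CASE WHEN job_details.status = 'completed' THEN 1 ELSE 0 END) AS completed
    FROM jobs
    LEFT JOIN job_details ON jobs.job_id = job_details.job_id
    GROUP BY jobs.job_id
""").fetchone()
print(row)  # ('j1', 3, 2)
```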

---

## 9. Error Handling

### 9.1 Database Errors

**Scenario:** The SQLite database is locked or corrupted.

**Handling:**
```python
try:
    job_id = job_manager.create_job(...)
except sqlite3.OperationalError as e:
    # Database locked - retry with exponential backoff
    logger.error(f"Database error: {e}")
    raise HTTPException(status_code=503, detail="Database temporarily unavailable")
except sqlite3.IntegrityError as e:
    # Constraint violation (e.g., duplicate job_id)
    logger.error(f"Integrity error: {e}")
    raise HTTPException(status_code=400, detail="Invalid job data")
```

### 9.2 Foreign Key Violations

**Scenario:** An attempt to create a job_detail for a non-existent job.

**Prevention:**
- Always create the job record before the job_details records
- Use transactions to ensure atomicity

```python
def create_job(self, ...):
    conn = get_db_connection(self.db_path)
    try:
        cursor = conn.cursor()

        # Insert job
        cursor.execute("INSERT INTO jobs ...")

        # Insert job_details
        for date in date_range:
            for model in models:
                cursor.execute("INSERT INTO job_details ...")

        conn.commit()  # Atomic commit
    except Exception:
        conn.rollback()  # Roll back on any error
        raise
    finally:
        conn.close()
```

---

## 10. Migration Strategy

### 10.1 Schema Versioning

For future schema changes, use migration scripts:

```
data/
└── migrations/
    ├── 001_initial_schema.sql
    ├── 002_add_priority_column.sql
    └── ...
```

Track applied migrations in the database:
```sql
CREATE TABLE IF NOT EXISTS schema_migrations (
    version INTEGER PRIMARY KEY,
    applied_at TEXT NOT NULL
);
```
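
A minimal runner that applies pending migrations in version order could look like the following. This is a sketch; `apply_migrations` and the inline SQL list are illustrative, not part of the codebase:

```python
import sqlite3
from datetime import datetime, timezone

def apply_migrations(conn: sqlite3.Connection, migrations: dict) -> list:
    """Apply migrations whose version is not yet recorded; return applied versions."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS schema_migrations (
            version INTEGER PRIMARY KEY,
            applied_at TEXT NOT NULL
        )
    """)
    done = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    applied = []
    for version in sorted(migrations):
        if version in done:
            continue
        conn.executescript(migrations[version])
        conn.execute(
            "INSERT INTO schema_migrations VALUES (?, ?)",
            (version, datetime.now(timezone.utc).isoformat()),
        )
        applied.append(version)
    conn.commit()
    return applied

conn = sqlite3.connect(":memory:")
migrations = {
    1: "CREATE TABLE jobs (job_id TEXT PRIMARY KEY);",
    2: "ALTER TABLE jobs ADD COLUMN priority INTEGER DEFAULT 0;",
}
print(apply_migrations(conn, migrations))  # [1, 2]
print(apply_migrations(conn, migrations))  # [] (idempotent)
```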

### 10.2 Backward Compatibility

When adding columns:
- Use `ALTER TABLE ADD COLUMN ... DEFAULT ...` for backward compatibility
- Never remove columns (deprecate them instead)
- Version API responses to handle schema changes
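
The additive pattern can be demonstrated directly: existing rows pick up the new column's default, so older readers and writers keep working. A self-contained sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO jobs VALUES ('old-job')")

# Additive change: rows inserted before the migration get the default value
conn.execute("ALTER TABLE jobs ADD COLUMN priority INTEGER DEFAULT 0")

row = conn.execute("SELECT job_id, priority FROM jobs").fetchone()
print(row)  # ('old-job', 0)
```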

---

## Summary

The Job Manager provides:
1. **Robust job tracking** with SQLite persistence
2. **Concurrency control** ensuring single-job execution
3. **Granular progress monitoring** at the model-day level
4. **Flexible status handling** (completed/partial/failed)
5. **Idempotency** for duplicate trigger requests

Next specification: **Background Worker Architecture**
@@ -1,197 +0,0 @@
# Data Cache Reuse Design

**Date:** 2025-10-30
**Status:** Approved

## Problem Statement

Docker containers currently fetch all 103 NASDAQ 100 tickers from Alpha Vantage on every startup, even when price data is volume-mounted and already cached in `./data`. This causes:
- Slow startup times (103 API calls)
- Unnecessary API quota consumption
- Rate limit risks during frequent development iterations

## Solution Overview

Implement staleness-based data refresh with a configurable age threshold. The container checks all `daily_prices_*.json` files and only refetches if any file is missing or older than `MAX_DATA_AGE_DAYS`.

## Design Decisions

### Architecture Choice
**Selected:** Check all `daily_prices_*.json` files individually
**Rationale:** Ensures data integrity by detecting partial/missing files, not just stale merged data

### Implementation Location
**Selected:** Bash wrapper logic in `entrypoint.sh`
**Rationale:** Keeps the data fetching scripts unchanged; adds orchestration at the container startup layer

### Staleness Threshold
**Selected:** Configurable via the `MAX_DATA_AGE_DAYS` environment variable (default: 7 days)
**Rationale:** Balances freshness with API usage; flexible for different use cases (development vs. production)

## Technical Design

### Components

#### 1. Staleness Check Function
Location: `entrypoint.sh` (after environment validation, before data fetch)

```bash
should_refresh_data() {
    MAX_AGE=${MAX_DATA_AGE_DAYS:-7}

    # Check if at least one price file exists
    if ! ls /app/data/daily_prices_*.json >/dev/null 2>&1; then
        echo "📭 No price data found"
        return 0  # Need refresh
    fi

    # Find any files older than MAX_AGE days
    STALE_COUNT=$(find /app/data -name "daily_prices_*.json" -mtime +"$MAX_AGE" | wc -l)
    TOTAL_COUNT=$(ls /app/data/daily_prices_*.json 2>/dev/null | wc -l)

    if [ "$STALE_COUNT" -gt 0 ]; then
        echo "📅 Found $STALE_COUNT stale files (>$MAX_AGE days old)"
        return 0  # Need refresh
    fi

    echo "✅ All $TOTAL_COUNT price files are fresh (<$MAX_AGE days old)"
    return 1  # Skip refresh
}
```

**Logic:**
- Uses `find -mtime +N` to detect files modified more than N days ago
- Returns shell exit codes: 0 (refresh needed), 1 (skip refresh)
- Logs informative messages for debugging
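
The same staleness rule can be expressed in Python, which is convenient for unit-testing the threshold logic outside a container. A sketch; the file-name pattern mirrors the bash function above and the function name is illustrative:

```python
import time
from pathlib import Path

def should_refresh_data(data_dir: str, max_age_days: int = 7) -> bool:
    """True if price data is absent or any daily_prices_*.json is older than max_age_days."""
    files = list(Path(data_dir).glob("daily_prices_*.json"))
    if not files:
        return True  # No price data at all -> fetch
    cutoff = time.time() - max_age_days * 86400
    return any(f.stat().st_mtime < cutoff for f in files)
```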

#### 2. Conditional Data Fetch
Location: `entrypoint.sh` lines 40-46 (replaces the existing unconditional fetch)

```bash
# Step 1: Data preparation (conditional fetch, unconditional merge)
echo "📊 Checking price data freshness..."

cd /app/data
if should_refresh_data; then
    echo "🔄 Fetching price data..."
    python /app/scripts/get_daily_price.py
else
    echo "⏭️ Skipping data fetch (using cached data)"
fi

# Always re-run the merge so a missing/corrupt merged.jsonl is rebuilt
python /app/scripts/merge_jsonl.py
cd /app
```

#### 3. Environment Configuration
**docker-compose.yml:**
```yaml
environment:
  - MAX_DATA_AGE_DAYS=${MAX_DATA_AGE_DAYS:-7}
```

**.env.example:**
```bash
# Data Refresh Configuration
MAX_DATA_AGE_DAYS=7  # Refresh price data older than N days (0=always refresh)
```

### Data Flow

1. **Container Startup** → entrypoint.sh begins execution
2. **Environment Validation** → Check required API keys (existing logic)
3. **Staleness Check** → `should_refresh_data()` scans `/app/data/daily_prices_*.json`
   - No files found → Return 0 (refresh)
   - Any file older than `MAX_DATA_AGE_DAYS` → Return 0 (refresh)
   - All files fresh → Return 1 (skip)
4. **Conditional Fetch** → Run get_daily_price.py only if a refresh is needed
5. **Merge Data** → Always run merge_jsonl.py (handles a missing merged.jsonl)
6. **MCP Services** → Start services (existing logic)
7. **Trading Agent** → Begin trading (existing logic)

### Edge Cases

| Scenario | Behavior |
|----------|----------|
| **First run (no data)** | Detects no files → triggers full fetch |
| **Restart within 7 days** | All files fresh → skips fetch (fast startup) |
| **Restart after 7 days** | Files stale → refreshes all data |
| **Partial data (some files missing)** | Missing files treated as infinitely old → triggers refresh |
| **Corrupt merged.jsonl but fresh price files** | Skips fetch, re-runs merge to rebuild merged.jsonl |
| **MAX_DATA_AGE_DAYS=0** | Always refresh (useful for testing/production) |
| **MAX_DATA_AGE_DAYS unset** | Defaults to 7 days |
| **Alpha Vantage rate limit** | get_daily_price.py handles with a warning (existing behavior) |

## Configuration Options

| Variable | Default | Purpose |
|----------|---------|---------|
| `MAX_DATA_AGE_DAYS` | 7 | Days before price data is considered stale |

**Special Values:**
- `0` → Always refresh (force fresh data)
- `999` → Never refresh (use cached data indefinitely)

## User Experience

### Scenario 1: Fresh Container
```
🚀 Starting AI-Trader...
🔍 Validating environment variables...
✅ Environment variables validated
📊 Checking price data freshness...
📭 No price data found
🔄 Fetching and merging price data...
  ✓ Fetched NVDA
  ✓ Fetched MSFT
  ...
```

### Scenario 2: Restart Within 7 Days
```
🚀 Starting AI-Trader...
🔍 Validating environment variables...
✅ Environment variables validated
📊 Checking price data freshness...
✅ All 103 price files are fresh (<7 days old)
⏭️ Skipping data fetch (using cached data)
🔧 Starting MCP services...
```

### Scenario 3: Restart After 7 Days
```
🚀 Starting AI-Trader...
🔍 Validating environment variables...
✅ Environment variables validated
📊 Checking price data freshness...
📅 Found 103 stale files (>7 days old)
🔄 Fetching and merging price data...
  ✓ Fetched NVDA
  ✓ Fetched MSFT
  ...
```

## Testing Plan

1. **Test fresh container:** Delete `./data/daily_prices_*.json`, start the container → should fetch all
2. **Test cached data:** Restart immediately → should skip the fetch
3. **Test staleness:** `touch -d "8 days ago" ./data/daily_prices_AAPL.json`, restart → should refresh
4. **Test partial data:** Delete 10 random price files → should refresh all
5. **Test MAX_DATA_AGE_DAYS=0:** Restart with the env var set → should always fetch
6. **Test MAX_DATA_AGE_DAYS=30:** Restart with 8-day-old data → should skip

## Documentation Updates

Files requiring updates:
- `entrypoint.sh` → Add the function and conditional logic
- `docker-compose.yml` → Add the MAX_DATA_AGE_DAYS environment variable
- `.env.example` → Document MAX_DATA_AGE_DAYS with its default value
- `CLAUDE.md` → Update the "Docker Deployment" section with the new env var
- `docs/DOCKER.md` (if it exists) → Explain the data caching behavior

## Benefits

- **Development:** Instant container restarts during iteration
- **API Quota:** ~103 fewer API calls per restart
- **Reliability:** No rate limit risks during frequent testing
- **Flexibility:** Configurable threshold for different use cases
- **Consistency:** Checks all files to ensure complete data
@@ -1,491 +0,0 @@
# Docker Deployment and CI/CD Design

**Date:** 2025-10-30
**Status:** Approved
**Target:** Development/local testing environment

## Overview

Package AI-Trader as a Docker container with docker-compose orchestration and automated image builds via GitHub Actions on release tags. Focus on simplicity and ease of use for researchers and developers.

## Requirements

- **Primary Use Case:** Development and local testing
- **Deployment Target:** Single monolithic container (all MCP services + trading agent)
- **Secrets Management:** Environment variables (no mounted .env file)
- **Data Strategy:** Fetch price data on container startup
- **Container Registry:** GitHub Container Registry (ghcr.io)
- **Trigger:** Build images automatically on release tag push (`v*` pattern)

## Architecture

### Components

1. **Dockerfile** - Builds Python 3.10 image with all dependencies
2. **docker-compose.yml** - Orchestrates container with volume mounts and environment config
3. **entrypoint.sh** - Sequential startup script (data fetch → MCP services → trading agent)
4. **GitHub Actions Workflow** - Automated image build and push on release tags
5. **.dockerignore** - Excludes unnecessary files from image
6. **Documentation** - Docker usage guide and examples

### Execution Flow

```
Container Start
    ↓
entrypoint.sh
    ↓
1. Fetch/merge price data (get_daily_price.py → merge_jsonl.py)
    ↓
2. Start MCP services in background (start_mcp_services.py)
    ↓
3. Wait 3 seconds for service stabilization
    ↓
4. Run trading agent (main.py with config)
    ↓
Container Exit → Cleanup MCP services
```

## Detailed Design

### 1. Dockerfile

**Multi-stage build:**

```dockerfile
# Base stage
FROM python:3.10-slim AS base

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application stage
FROM base

WORKDIR /app

# Copy application code
COPY . .

# Create necessary directories
RUN mkdir -p data logs data/agent_data

# Make entrypoint executable
RUN chmod +x entrypoint.sh

# Expose MCP service ports
EXPOSE 8000 8001 8002 8003

# Set Python to run unbuffered
ENV PYTHONUNBUFFERED=1

# Use entrypoint script
ENTRYPOINT ["./entrypoint.sh"]
CMD ["configs/default_config.json"]
```

**Key Features:**
- `python:3.10-slim` base for smaller image size
- Multi-stage for dependency caching
- Non-root user NOT included (dev/testing focus, can add later)
- Unbuffered Python output for real-time logs
- Default config path with override support

### 2. docker-compose.yml

```yaml
version: '3.8'

services:
  ai-trader:
    build: .
    container_name: ai-trader-app
    volumes:
      - ./data:/app/data
      - ./logs:/app/logs
    environment:
      - OPENAI_API_BASE=${OPENAI_API_BASE}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ALPHAADVANTAGE_API_KEY=${ALPHAADVANTAGE_API_KEY}
      - JINA_API_KEY=${JINA_API_KEY}
      - RUNTIME_ENV_PATH=/app/data/runtime_env.json
      - MATH_HTTP_PORT=${MATH_HTTP_PORT:-8000}
      - SEARCH_HTTP_PORT=${SEARCH_HTTP_PORT:-8001}
      - TRADE_HTTP_PORT=${TRADE_HTTP_PORT:-8002}
      - GETPRICE_HTTP_PORT=${GETPRICE_HTTP_PORT:-8003}
      - AGENT_MAX_STEP=${AGENT_MAX_STEP:-30}
    ports:
      - "8000:8000"
      - "8001:8001"
      - "8002:8002"
      - "8003:8003"
      - "8888:8888"  # Optional: web dashboard
    restart: unless-stopped
```

**Key Features:**
- Volume mounts for data/logs persistence
- Environment variables interpolated from `.env` file (Docker Compose reads it automatically)
- No `.env` file mounted into container (cleaner separation)
- Default port values with override support
- Restart policy for recovery

### 3. entrypoint.sh

```bash
#!/bin/bash
set -e  # Exit on any error

echo "🚀 Starting AI-Trader..."

# Step 1: Data preparation
echo "📊 Fetching and merging price data..."
cd /app/data
python get_daily_price.py
python merge_jsonl.py
cd /app

# Step 2: Start MCP services in background
echo "🔧 Starting MCP services..."
cd /app/agent_tools
python start_mcp_services.py &
MCP_PID=$!
cd /app

# Register cleanup before running the agent, so MCP services are stopped
# even if main.py fails (set -e would otherwise exit before the trap is set)
trap "echo '🛑 Stopping MCP services...'; kill $MCP_PID 2>/dev/null" EXIT

# Step 3: Wait for services to initialize
echo "⏳ Waiting for MCP services to start..."
sleep 3

# Step 4: Run trading agent with config file
echo "🤖 Starting trading agent..."
CONFIG_FILE="${1:-configs/default_config.json}"
python main.py "$CONFIG_FILE"
```

**Key Features:**
- Sequential execution with clear logging
- MCP services run in background with PID capture
- Trap ensures cleanup on container exit
- Config file path as argument (defaults to `configs/default_config.json`)
- Fail-fast with `set -e`

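The fixed `sleep 3` is the weakest part of this flow; a hedged alternative is to poll the MCP ports until they accept TCP connections. The function name, timeout, and port list below are illustrative, and the sketch assumes bash is available for `/dev/tcp`:

```shell
wait_for_ports() {
  # wait_for_ports TIMEOUT_SECONDS PORT [PORT...]
  # Polls each port on localhost once per second until it accepts a
  # connection or the timeout elapses.
  local timeout="$1"; shift
  local port waited
  for port in "$@"; do
    waited=0
    until bash -c "exec 3<>/dev/tcp/127.0.0.1/$port" 2>/dev/null; do
      sleep 1
      waited=$((waited + 1))
      if [ "$waited" -ge "$timeout" ]; then
        echo "Port $port not ready after ${timeout}s" >&2
        return 1
      fi
    done
  done
}
```

In `entrypoint.sh` this would replace the `sleep 3` with something like `wait_for_ports 30 8000 8001 8002 8003`, failing fast (via `set -e`) if a service never comes up.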
### 4. GitHub Actions Workflow

**File:** `.github/workflows/docker-release.yml`

```yaml
name: Build and Push Docker Image

on:
  push:
    tags:
      - 'v*'  # Triggers on v1.0.0, v2.1.3, etc.
  workflow_dispatch:  # Manual trigger option

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract version from tag
        id: meta
        run: |
          VERSION=${GITHUB_REF#refs/tags/v}
          echo "version=$VERSION" >> $GITHUB_OUTPUT

      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: |
            ghcr.io/${{ github.repository_owner }}/ai-trader:${{ steps.meta.outputs.version }}
            ghcr.io/${{ github.repository_owner }}/ai-trader:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max
```

**Key Features:**
- Triggers on `v*` tags (e.g., `git tag v1.0.0 && git push origin v1.0.0`)
- Manual dispatch option for testing
- Uses `GITHUB_TOKEN` (automatically provided, no secrets needed)
- Builds with caching for faster builds
- Tags both version and `latest`
- Multi-platform support possible by adding `platforms: linux/amd64,linux/arm64`

### 5. .dockerignore

```
# Version control
.git/
.gitignore

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
venv/
env/
ENV/

# IDE
.vscode/
.idea/
*.swp
*.swo

# Environment and secrets
.env
.env.*
!.env.example

# Data files (fetched at runtime)
data/*.json
data/agent_data/
data/merged.jsonl

# Logs
logs/
*.log

# Runtime state
runtime_env.json

# Documentation (not needed in image)
*.md
docs/
!README.md

# CI/CD
.github/
```

**Purpose:**
- Reduces image size
- Keeps secrets out of image
- Excludes generated files
- Keeps only necessary source code and scripts

## Documentation Updates

### New File: docs/DOCKER.md

Create comprehensive Docker usage guide including:

1. **Quick Start**
   ```bash
   cp .env.example .env
   # Edit .env with your API keys
   docker-compose up
   ```

2. **Configuration**
   - Required environment variables
   - Optional configuration overrides
   - Custom config file usage

3. **Usage Examples**
   ```bash
   # Run with default config
   docker-compose up

   # Run with custom config
   docker-compose run ai-trader configs/my_config.json

   # View logs
   docker-compose logs -f

   # Stop and clean up
   docker-compose down
   ```

4. **Data Persistence**
   - How volume mounts work
   - Where data is stored
   - How to backup/restore

5. **Troubleshooting**
   - MCP services not starting → Check logs, verify ports available
   - Missing API keys → Check .env file
   - Data fetch failures → API rate limits or invalid keys
   - Permission issues → Volume mount permissions

6. **Using Pre-built Images**
   ```bash
   docker pull ghcr.io/hkuds/ai-trader:latest
   docker run --env-file .env -v $(pwd)/data:/app/data ghcr.io/hkuds/ai-trader:latest
   ```

### Update .env.example

Add/clarify Docker-specific variables:

```bash
# AI Model API Configuration
OPENAI_API_BASE=https://your-openai-proxy.com/v1
OPENAI_API_KEY=your_openai_key

# Data Source Configuration
ALPHAADVANTAGE_API_KEY=your_alpha_vantage_key
JINA_API_KEY=your_jina_api_key

# System Configuration (Docker defaults)
RUNTIME_ENV_PATH=/app/data/runtime_env.json

# MCP Service Ports
MATH_HTTP_PORT=8000
SEARCH_HTTP_PORT=8001
TRADE_HTTP_PORT=8002
GETPRICE_HTTP_PORT=8003

# Agent Configuration
AGENT_MAX_STEP=30
```

### Update Main README.md

Add Docker section after "Quick Start":

````markdown
## Docker Deployment

### Using Docker Compose (Recommended)

```bash
# Setup environment
cp .env.example .env
# Edit .env with your API keys

# Run with docker-compose
docker-compose up
```

### Using Pre-built Images

```bash
# Pull latest image
docker pull ghcr.io/hkuds/ai-trader:latest

# Run container
docker run --env-file .env \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/logs:/app/logs \
  ghcr.io/hkuds/ai-trader:latest
```

See [docs/DOCKER.md](docs/DOCKER.md) for detailed Docker usage guide.
````

## Release Process

### For Maintainers

1. **Prepare release:**
   ```bash
   # Ensure main branch is ready
   git checkout main
   git pull origin main
   ```

2. **Create and push tag:**
   ```bash
   git tag v1.0.0
   git push origin v1.0.0
   ```

3. **GitHub Actions automatically:**
   - Builds Docker image
   - Tags with version and `latest`
   - Pushes to `ghcr.io/hkuds/ai-trader`

4. **Verify build:**
   - Check Actions tab for build status
   - Test pull: `docker pull ghcr.io/hkuds/ai-trader:v1.0.0`

5. **Optional: Create GitHub Release**
   - Add release notes
   - Include Docker pull command

### For Users

```bash
# Pull specific version
docker pull ghcr.io/hkuds/ai-trader:v1.0.0

# Or always get latest
docker pull ghcr.io/hkuds/ai-trader:latest
```

## Implementation Checklist

- [ ] Create Dockerfile with multi-stage build
- [ ] Create docker-compose.yml with volume mounts and environment config
- [ ] Create entrypoint.sh with sequential startup logic
- [ ] Create .dockerignore to exclude unnecessary files
- [ ] Create .github/workflows/docker-release.yml for CI/CD
- [ ] Create docs/DOCKER.md with comprehensive usage guide
- [ ] Update .env.example with Docker-specific variables
- [ ] Update main README.md with Docker deployment section
- [ ] Test local build: `docker-compose build`
- [ ] Test local run: `docker-compose up`
- [ ] Test with custom config
- [ ] Verify data persistence across container restarts
- [ ] Test GitHub Actions workflow (create test tag)
- [ ] Verify image pushed to ghcr.io
- [ ] Test pulling and running pre-built image
- [ ] Update CLAUDE.md with Docker commands

## Future Enhancements

Possible improvements for production use:

1. **Multi-container Architecture**
   - Separate containers for each MCP service
   - Better isolation and independent scaling
   - More complex orchestration

2. **Security Hardening**
   - Non-root user in container
   - Docker secrets for production
   - Read-only filesystem where possible

3. **Monitoring**
   - Health checks for MCP services
   - Prometheus metrics export
   - Logging aggregation

4. **Optimization**
   - Multi-platform builds (ARM64 support)
   - Smaller base image (alpine)
   - Layer caching optimization

5. **Development Tools**
   - docker-compose.dev.yml with hot reload
   - Debug container with additional tools
   - Integration test container

These are deferred to keep the initial implementation simple and focused on development/testing use cases.

File diff suppressed because it is too large
@@ -1,102 +0,0 @@

Docker Build Test Results
==========================
Date: 2025-10-30
Branch: docker-deployment
Working Directory: /home/bballou/AI-Trader/.worktrees/docker-deployment

Test 1: Docker Image Build
---------------------------
Command: docker-compose build
Status: SUCCESS
Result: Successfully built image 7b36b8f4c0e9

Build Output Summary:
- Base image: python:3.10-slim
- Build stages: Multi-stage build (base + application)
- Dependencies installed successfully from requirements.txt
- Application code copied
- Directories created: data, logs, data/agent_data
- Entrypoint script made executable
- Ports exposed: 8000, 8001, 8002, 8003, 8888
- Environment: PYTHONUNBUFFERED=1 set
- Image size: 266MB
- Build time: ~2 minutes (including dependency installation)

Key packages installed:
- langchain==1.0.2
- langchain-openai==1.0.1
- langchain-mcp-adapters>=0.1.0
- fastmcp==2.12.5
- langgraph<1.1.0,>=1.0.0
- pydantic<3.0.0,>=2.7.4
- openai<3.0.0,>=1.109.1
- All dependencies resolved without conflicts

Test 2: Image Verification
---------------------------
Command: docker images | grep ai-trader
Status: SUCCESS
Result: docker-deployment_ai-trader latest 7b36b8f4c0e9 9 seconds ago 266MB

Image Details:
- Repository: docker-deployment_ai-trader
- Tag: latest
- Image ID: 7b36b8f4c0e9
- Created: Just now
- Size: 266MB (reasonable for Python 3.10 + ML dependencies)

Test 3: Configuration Parsing (Dry-Run)
----------------------------------------
Command: docker-compose --env-file .env.test config
Status: SUCCESS
Result: Configuration parsed correctly without errors

Test .env.test contents:
OPENAI_API_KEY=test
ALPHAADVANTAGE_API_KEY=test
JINA_API_KEY=test
RUNTIME_ENV_PATH=/app/data/runtime_env.json

Parsed Configuration:
- Service name: ai-trader
- Container name: ai-trader-app
- Build context: /home/bballou/AI-Trader/.worktrees/docker-deployment
- Environment variables correctly injected:
  * AGENT_MAX_STEP: '30' (default)
  * ALPHAADVANTAGE_API_KEY: test
  * GETPRICE_HTTP_PORT: '8003' (default)
  * JINA_API_KEY: test
  * MATH_HTTP_PORT: '8000' (default)
  * OPENAI_API_BASE: '' (not set, defaulted to blank)
  * OPENAI_API_KEY: test
  * RUNTIME_ENV_PATH: /app/data/runtime_env.json
  * SEARCH_HTTP_PORT: '8001' (default)
  * TRADE_HTTP_PORT: '8002' (default)
- Ports correctly mapped: 8000, 8001, 8002, 8003, 8888
- Volumes correctly configured:
  * ./data:/app/data:rw
  * ./logs:/app/logs:rw
- Restart policy: unless-stopped
- Docker Compose version: 3.8

Summary
-------
All Docker build tests PASSED successfully:
✓ Docker image builds without errors
✓ Image created with reasonable size (266MB)
✓ Multi-stage build optimizes layer caching
✓ All Python dependencies install correctly
✓ Configuration parsing works with test environment
✓ Environment variables properly injected
✓ Volume mounts configured correctly
✓ Port mappings set up correctly
✓ Restart policy configured

No issues encountered during local Docker build testing.
The Docker deployment is ready for use.

Next Steps:
1. Test actual container startup with valid API keys
2. Verify MCP services start correctly in container
3. Test trading agent execution
4. Consider creating test tag for GitHub Actions CI/CD verification

docs/reference/data-formats.md (new file, 30 lines)
@@ -0,0 +1,30 @@

# Data Formats

File formats and schemas used by AI-Trader.

---

## Position File (`position.jsonl`)

```jsonl
{"date": "2025-01-16", "id": 1, "this_action": {"action": "buy", "symbol": "AAPL", "amount": 10}, "positions": {"AAPL": 10, "CASH": 9500.0}}
{"date": "2025-01-17", "id": 2, "this_action": {"action": "sell", "symbol": "AAPL", "amount": 5}, "positions": {"AAPL": 5, "CASH": 10750.0}}
```

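A minimal sketch of replaying this format — the field names come from the sample records above; the helper name is illustrative, not part of the codebase:

```python
import json


def latest_positions(path):
    """Return the `positions` dict from the last record of a position.jsonl file.

    Each line is a standalone JSON object, so the file can be scanned
    line by line; the final record holds the current holdings.
    """
    last = None
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line:
                last = json.loads(line)
    return last["positions"] if last else {}
```

Applied to the sample file above, this would return the `2025-01-17` holdings (`AAPL: 5`, `CASH: 10750.0`).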
---

## Price Data (`merged.jsonl`)

```jsonl
{"Meta Data": {"2. Symbol": "AAPL", "3. Last Refreshed": "2025-01-16"}, "Time Series (Daily)": {"2025-01-16": {"1. buy price": "250.50", "2. high": "252.00", "3. low": "249.00", "4. sell price": "251.50", "5. volume": "50000000"}}}
```

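A sketch of pulling one symbol's series out of this format — key names follow the sample record above, and the assumption that `"4. sell price"` is the daily close is inferred from the Alpha Vantage-style layout, not confirmed by the codebase:

```python
import json


def closing_prices(path, symbol):
    """Return {date: close} for one symbol from merged.jsonl.

    Each line is one symbol's full record; the first matching line wins.
    """
    with open(path) as fh:
        for line in fh:
            rec = json.loads(line)
            if rec["Meta Data"]["2. Symbol"] == symbol:
                series = rec["Time Series (Daily)"]
                return {d: float(v["4. sell price"]) for d, v in series.items()}
    return {}
```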
---

## Log Files (`log.jsonl`)

Contains the complete AI reasoning and tool usage for each trading session.

---

See the database schema in [docs/developer/database-schema.md](../developer/database-schema.md) for SQLite formats.

docs/reference/environment-variables.md (new file, 32 lines)
@@ -0,0 +1,32 @@

# Environment Variables Reference

Complete list of configuration variables.

---

See [docs/user-guide/configuration.md](../user-guide/configuration.md#environment-variables) for detailed descriptions.

---

## Required

- `OPENAI_API_KEY`
- `ALPHAADVANTAGE_API_KEY`
- `JINA_API_KEY`

---

## Optional

- `API_PORT` (default: 8080)
- `API_HOST` (default: 0.0.0.0)
- `OPENAI_API_BASE`
- `MAX_CONCURRENT_JOBS` (default: 1)
- `MAX_SIMULATION_DAYS` (default: 30)
- `AUTO_DOWNLOAD_PRICE_DATA` (default: true)
- `AGENT_MAX_STEP` (default: 30)
- `VOLUME_PATH` (default: .)
- `MATH_HTTP_PORT` (default: 8000)
- `SEARCH_HTTP_PORT` (default: 8001)
- `TRADE_HTTP_PORT` (default: 8002)
- `GETPRICE_HTTP_PORT` (default: 8003)

docs/reference/mcp-tools.md (new file, 39 lines)
@@ -0,0 +1,39 @@

# MCP Tools Reference

Model Context Protocol tools available to AI agents.

---

## Available Tools

### Math Tool (Port 8000)
Mathematical calculations and analysis.

### Search Tool (Port 8001)
Market intelligence via Jina AI search.
- News articles
- Analyst reports
- Financial data

### Trade Tool (Port 8002)
Buy/sell execution.
- Place orders
- Check balances
- View positions

### Price Tool (Port 8003)
Historical and current price data.
- OHLCV data
- Multiple symbols
- Date filtering

---

## Usage

AI agents access tools automatically through the MCP protocol.
Tools are localhost-only and not exposed to the external network.

---

See the `agent_tools/` directory for implementations.

File diff suppressed because it is too large

docs/user-guide/configuration.md (new file, 327 lines)
@@ -0,0 +1,327 @@

# Configuration Guide

Complete guide to configuring AI-Trader.

---

## Environment Variables

Set in the `.env` file in the project root.

### Required Variables

```bash
# OpenAI API (or compatible endpoint)
OPENAI_API_KEY=sk-your-key-here

# Alpha Vantage (price data)
ALPHAADVANTAGE_API_KEY=your-key-here

# Jina AI (market intelligence search)
JINA_API_KEY=your-key-here
```

### Optional Variables

```bash
# API Server Configuration
API_PORT=8080                  # Host port mapping (default: 8080)
API_HOST=0.0.0.0               # Bind address (default: 0.0.0.0)

# OpenAI Configuration
OPENAI_API_BASE=https://api.openai.com/v1  # Custom endpoint

# Simulation Limits
MAX_CONCURRENT_JOBS=1          # Max simultaneous jobs (default: 1)
MAX_SIMULATION_DAYS=30         # Max date range per job (default: 30)

# Price Data Management
AUTO_DOWNLOAD_PRICE_DATA=true  # Auto-fetch missing data (default: true)

# Agent Configuration
AGENT_MAX_STEP=30              # Max reasoning steps per day (default: 30)

# Volume Paths
VOLUME_PATH=.                  # Base directory for data (default: .)

# MCP Service Ports (usually don't need to change)
MATH_HTTP_PORT=8000
SEARCH_HTTP_PORT=8001
TRADE_HTTP_PORT=8002
GETPRICE_HTTP_PORT=8003
```

---

## Model Configuration

Edit `configs/default_config.json` to define available AI models.

### Configuration Structure

```json
{
  "agent_type": "BaseAgent",
  "date_range": {
    "init_date": "2025-01-01",
    "end_date": "2025-01-31"
  },
  "models": [
    {
      "name": "GPT-4",
      "basemodel": "openai/gpt-4",
      "signature": "gpt-4",
      "enabled": true
    }
  ],
  "agent_config": {
    "max_steps": 30,
    "max_retries": 3,
    "initial_cash": 10000.0
  },
  "log_config": {
    "log_path": "./data/agent_data"
  }
}
```

### Model Configuration Fields

| Field | Required | Description |
|-------|----------|-------------|
| `name` | Yes | Display name for the model |
| `basemodel` | Yes | Model identifier (e.g., `openai/gpt-4`, `anthropic/claude-3.7-sonnet`) |
| `signature` | Yes | Unique identifier used in API requests and the database |
| `enabled` | Yes | Whether this model runs when no models are specified in the API request |
| `openai_base_url` | No | Custom API endpoint for this model |
| `openai_api_key` | No | Model-specific API key (overrides the `OPENAI_API_KEY` env var) |

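The `enabled` semantics above can be illustrated with a small selection helper. This is a sketch of the documented behavior, not the service's actual code — the function name is hypothetical:

```python
def models_to_run(config, requested=None):
    """Select models per the documented rule: an explicit request wins;
    otherwise every model with "enabled": true runs."""
    models = config["models"]
    if requested:
        wanted = set(requested)
        return [m for m in models if m["signature"] in wanted]
    return [m for m in models if m["enabled"]]
```

So a request naming `["gpt-3.5-turbo"]` runs that model even if its config says `"enabled": false`, while a request with no models runs only the enabled ones.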
### Adding Custom Models

**Example: Add Claude 3.7 Sonnet**

```json
{
  "models": [
    {
      "name": "Claude 3.7 Sonnet",
      "basemodel": "anthropic/claude-3.7-sonnet",
      "signature": "claude-3.7-sonnet",
      "enabled": true,
      "openai_base_url": "https://api.anthropic.com/v1",
      "openai_api_key": "your-anthropic-key"
    }
  ]
}
```

**Example: Add DeepSeek via OpenRouter**

```json
{
  "models": [
    {
      "name": "DeepSeek",
      "basemodel": "deepseek/deepseek-chat",
      "signature": "deepseek",
      "enabled": true,
      "openai_base_url": "https://openrouter.ai/api/v1",
      "openai_api_key": "your-openrouter-key"
    }
  ]
}
```

### Agent Configuration

| Field | Description | Default |
|-------|-------------|---------|
| `max_steps` | Maximum reasoning iterations per trading day | 30 |
| `max_retries` | Retry attempts on API failures | 3 |
| `initial_cash` | Starting capital per model | 10000.0 |

---

## Port Configuration

### Default Ports

| Service | Internal Port | Host Port (configurable) |
|---------|---------------|--------------------------|
| API Server | 8080 | `API_PORT` (default: 8080) |
| MCP Math | 8000 | Not exposed to host |
| MCP Search | 8001 | Not exposed to host |
| MCP Trade | 8002 | Not exposed to host |
| MCP Price | 8003 | Not exposed to host |

### Changing API Port

If port 8080 is already in use:

```bash
# Add to .env
echo "API_PORT=8889" >> .env

# Restart
docker-compose down
docker-compose up -d

# Access on new port
curl http://localhost:8889/health
```

---

## Volume Configuration

Docker volumes persist data across container restarts:

```yaml
volumes:
  - ./data:/app/data        # Database, price data, agent data
  - ./configs:/app/configs  # Configuration files
  - ./logs:/app/logs        # Application logs
```

### Data Directory Structure

```
data/
├── jobs.db                  # SQLite database
├── merged.jsonl             # Cached price data
├── daily_prices_*.json      # Individual stock data
├── price_coverage.json      # Data availability tracking
└── agent_data/              # Agent execution data
    └── {signature}/
        ├── position/
        │   └── position.jsonl   # Trading positions
        └── log/
            └── {date}/
                └── log.jsonl    # Trading logs
```

---

## API Key Setup

### OpenAI API Key

1. Visit [platform.openai.com/api-keys](https://platform.openai.com/api-keys)
2. Create a new key
3. Add to `.env`:
   ```bash
   OPENAI_API_KEY=sk-...
   ```

### Alpha Vantage API Key

1. Visit [alphavantage.co/support/#api-key](https://www.alphavantage.co/support/#api-key)
2. Get a free key (5 req/min) or premium (75 req/min)
3. Add to `.env`:
   ```bash
   ALPHAADVANTAGE_API_KEY=...
   ```

### Jina AI API Key

1. Visit [jina.ai](https://jina.ai/)
2. Sign up for the free tier
3. Add to `.env`:
   ```bash
   JINA_API_KEY=...
   ```

---

## Configuration Examples

### Development Setup

```bash
# .env
API_PORT=8080
MAX_CONCURRENT_JOBS=1
MAX_SIMULATION_DAYS=5         # Limit for faster testing
AUTO_DOWNLOAD_PRICE_DATA=true
AGENT_MAX_STEP=10             # Fewer steps for faster iteration
```

### Production Setup

```bash
# .env
API_PORT=8080
MAX_CONCURRENT_JOBS=1
MAX_SIMULATION_DAYS=30
AUTO_DOWNLOAD_PRICE_DATA=true
AGENT_MAX_STEP=30
```

### Multi-Model Competition

```json
// configs/default_config.json
{
  "models": [
    {
      "name": "GPT-4",
      "basemodel": "openai/gpt-4",
      "signature": "gpt-4",
      "enabled": true
    },
    {
      "name": "Claude 3.7",
      "basemodel": "anthropic/claude-3.7-sonnet",
      "signature": "claude-3.7",
      "enabled": true,
      "openai_base_url": "https://api.anthropic.com/v1",
      "openai_api_key": "anthropic-key"
    },
    {
      "name": "GPT-3.5 Turbo",
      "basemodel": "openai/gpt-3.5-turbo",
      "signature": "gpt-3.5-turbo",
      "enabled": false  // Not run by default
    }
  ]
}
```

---

## Environment Variable Priority

When the same configuration exists in multiple places:

1. **API request parameters** (highest priority)
2. **Model-specific config** (`openai_base_url`, `openai_api_key` in model config)
3. **Environment variables** (`.env` file)
4. **Default values** (lowest priority)

Example:
```json
// If model config has:
{
  "openai_api_key": "model-specific-key"
}

// This overrides OPENAI_API_KEY from .env
```

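The priority chain can be sketched as a first-non-empty lookup. This is an illustrative helper, not the service's actual resolver — the function and parameter names are assumptions:

```python
import os


def resolve_api_key(request_params, model_config, default=None):
    """Resolve per the documented order:
    request parameter > model-specific config > environment variable > default."""
    return (
        request_params.get("openai_api_key")
        or model_config.get("openai_api_key")
        or os.environ.get("OPENAI_API_KEY")
        or default
    )
```

The same pattern would apply to `openai_base_url` and any other value that exists at more than one level.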
---

## Validation

After configuration changes:

```bash
# Restart service
docker-compose down
docker-compose up -d

# Verify health
curl http://localhost:8080/health

# Check logs for errors
docker logs ai-trader | grep -i error
```

docs/user-guide/integration-examples.md (new file, 197 lines)
@@ -0,0 +1,197 @@

# Integration Examples

Examples for integrating AI-Trader with external systems.

---

## Python

See the complete Python client in [API_REFERENCE.md](../../API_REFERENCE.md#client-libraries).

### Async Client

```python
import aiohttp
import asyncio


class AsyncAITraderClient:
    def __init__(self, base_url="http://localhost:8080"):
        self.base_url = base_url

    async def trigger_simulation(self, start_date, end_date=None, models=None):
        payload = {"start_date": start_date}
        if end_date:
            payload["end_date"] = end_date
        if models:
            payload["models"] = models

        async with aiohttp.ClientSession() as session:
            async with session.post(
                f"{self.base_url}/simulate/trigger",
                json=payload
            ) as response:
                response.raise_for_status()
                return await response.json()

    async def wait_for_completion(self, job_id, poll_interval=10):
        async with aiohttp.ClientSession() as session:
            while True:
                async with session.get(
                    f"{self.base_url}/simulate/status/{job_id}"
                ) as response:
                    status = await response.json()

                if status["status"] in ["completed", "partial", "failed"]:
                    return status

                await asyncio.sleep(poll_interval)


# Usage
async def main():
    client = AsyncAITraderClient()
    job = await client.trigger_simulation("2025-01-16", models=["gpt-4"])
    result = await client.wait_for_completion(job["job_id"])
    print(f"Simulation completed: {result['status']}")

asyncio.run(main())
```

---

## TypeScript/JavaScript

See the complete TypeScript client in [API_REFERENCE.md](../../API_REFERENCE.md#client-libraries).

---

## Bash/Shell Scripts

### Daily Automation

```bash
#!/bin/bash
# daily_simulation.sh

API_URL="http://localhost:8080"
DATE=$(date -d "yesterday" +%Y-%m-%d)

echo "Triggering simulation for $DATE"

# Trigger
RESPONSE=$(curl -s -X POST $API_URL/simulate/trigger \
  -H "Content-Type: application/json" \
  -d "{\"start_date\": \"$DATE\", \"models\": [\"gpt-4\"]}")

JOB_ID=$(echo $RESPONSE | jq -r '.job_id')
echo "Job ID: $JOB_ID"

# Poll
while true; do
  STATUS=$(curl -s $API_URL/simulate/status/$JOB_ID | jq -r '.status')
  echo "Status: $STATUS"

  if [[ "$STATUS" == "completed" ]] || [[ "$STATUS" == "partial" ]] || [[ "$STATUS" == "failed" ]]; then
    break
  fi

  sleep 30
done

# Get results
curl -s "$API_URL/results?job_id=$JOB_ID" | jq '.' > results_$DATE.json
echo "Results saved to results_$DATE.json"
```

Add to crontab:
```bash
0 6 * * * /path/to/daily_simulation.sh >> /var/log/ai-trader.log 2>&1
```

|
||||
---
|
||||
|
||||
## Apache Airflow
|
||||
|
||||
```python
|
||||
from airflow import DAG
|
||||
from airflow.operators.python import PythonOperator
|
||||
from datetime import datetime, timedelta
|
||||
import requests
|
||||
import time
|
||||
|
||||
def trigger_simulation(**context):
|
||||
response = requests.post(
|
||||
"http://ai-trader:8080/simulate/trigger",
|
||||
json={"start_date": "{{ ds }}", "models": ["gpt-4"]}
|
||||
)
|
||||
response.raise_for_status()
|
||||
return response.json()["job_id"]
|
||||
|
||||
def wait_for_completion(**context):
|
||||
job_id = context["task_instance"].xcom_pull(task_ids="trigger")
|
||||
|
||||
while True:
|
||||
response = requests.get(f"http://ai-trader:8080/simulate/status/{job_id}")
|
||||
status = response.json()
|
||||
|
||||
if status["status"] in ["completed", "partial", "failed"]:
|
||||
return status
|
||||
|
||||
time.sleep(30)
|
||||
|
||||
def fetch_results(**context):
|
||||
job_id = context["task_instance"].xcom_pull(task_ids="trigger")
|
||||
response = requests.get(f"http://ai-trader:8080/results?job_id={job_id}")
|
||||
return response.json()
|
||||
|
||||
default_args = {
|
||||
"owner": "airflow",
|
||||
"depends_on_past": False,
|
||||
"start_date": datetime(2025, 1, 1),
|
||||
"retries": 1,
|
||||
"retry_delay": timedelta(minutes=5),
|
||||
}
|
||||
|
||||
dag = DAG(
|
||||
"ai_trader_simulation",
|
||||
default_args=default_args,
|
||||
schedule_interval="0 6 * * *", # Daily at 6 AM
|
||||
catchup=False
|
||||
)
|
||||
|
||||
trigger_task = PythonOperator(
|
||||
task_id="trigger",
|
||||
python_callable=trigger_simulation,
|
||||
dag=dag
|
||||
)
|
||||
|
||||
wait_task = PythonOperator(
|
||||
task_id="wait",
|
||||
python_callable=wait_for_completion,
|
||||
dag=dag
|
||||
)
|
||||
|
||||
fetch_task = PythonOperator(
|
||||
task_id="fetch_results",
|
||||
python_callable=fetch_results,
|
||||
dag=dag
|
||||
)
|
||||
|
||||
trigger_task >> wait_task >> fetch_task
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Generic Workflow Automation
|
||||
|
||||
Any HTTP-capable automation service can integrate with AI-Trader:
|
||||
|
||||
1. **Trigger:** POST to `/simulate/trigger`
|
||||
2. **Poll:** GET `/simulate/status/{job_id}` every 10-30 seconds
|
||||
3. **Retrieve:** GET `/results?job_id={job_id}` when complete
|
||||
4. **Store:** Save results to your database/warehouse
|
||||
|
||||
**Key considerations:**
|
||||
- Handle 400 errors (concurrent jobs) gracefully
|
||||
- Implement exponential backoff for retries
|
||||
- Monitor health endpoint before triggering
|
||||
- Store job_id for tracking and debugging
|
||||
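The four steps and considerations above can be sketched as a single Python function. This is illustrative only: the endpoint paths and terminal job states come from this page, while `BASE_URL`, the backoff schedule, and the output filename are assumptions.

```python
import json
import time

BASE_URL = "http://localhost:8080"
TERMINAL_STATES = {"completed", "partial", "failed"}

def is_terminal(status):
    """A job is done once it reaches a terminal state."""
    return status in TERMINAL_STATES

def backoff_delays(retries, base=1.0):
    """Exponential backoff schedule: base, 2*base, 4*base, ..."""
    return [base * (2 ** i) for i in range(retries)]

def run_workflow(start_date, models=None, poll_interval=30):
    import requests  # imported here so the helpers above stay dependency-free

    # 1. Health gate before triggering
    requests.get(f"{BASE_URL}/health", timeout=10).raise_for_status()

    # 2. Trigger, backing off if another job holds the slot (HTTP 400)
    payload = {"start_date": start_date}
    if models:
        payload["models"] = models
    for delay in backoff_delays(4):
        resp = requests.post(f"{BASE_URL}/simulate/trigger", json=payload)
        if resp.status_code != 400:
            break
        time.sleep(delay)
    resp.raise_for_status()
    job_id = resp.json()["job_id"]

    # 3. Poll until a terminal state
    while True:
        status = requests.get(f"{BASE_URL}/simulate/status/{job_id}").json()["status"]
        if is_terminal(status):
            break
        time.sleep(poll_interval)

    # 4. Retrieve and store results (here: a local JSON file)
    results = requests.get(f"{BASE_URL}/results", params={"job_id": job_id}).json()
    with open(f"results_{start_date}.json", "w") as fh:
        json.dump(results, fh, indent=2)
    return job_id, status
```

`run_workflow("2025-01-16", models=["gpt-4"])` blocks until the job reaches a terminal state and writes the results to disk; the saved `job_id` can be reused for later debugging.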
488
docs/user-guide/troubleshooting.md
Normal file
@@ -0,0 +1,488 @@
# Troubleshooting Guide

Common issues and solutions for AI-Trader.

---

## Container Issues

### Container Won't Start

**Symptoms:**
- `docker ps` shows no ai-trader container
- Container exits immediately after starting

**Debug:**
```bash
# Check logs
docker logs ai-trader

# Check if the container exists (stopped)
docker ps -a | grep ai-trader
```

**Common Causes & Solutions:**

**1. Missing API Keys**
```bash
# Verify the .env file
grep -E "OPENAI_API_KEY|ALPHAADVANTAGE_API_KEY|JINA_API_KEY" .env

# Should show all three keys with values
```

**Solution:** Add the missing keys to `.env`

**2. Port Already in Use**
```bash
# Check what's using port 8080
sudo lsof -i :8080             # Linux/Mac
netstat -ano | findstr :8080   # Windows
```

**Solution:** Change the port in `.env`:
```bash
echo "API_PORT=8889" >> .env
docker-compose down
docker-compose up -d
```

**3. Volume Permission Issues**
```bash
# Fix permissions
chmod -R 755 data logs configs
```

---

### Health Check Fails

**Symptoms:**
- `curl http://localhost:8080/health` returns an error or an HTML page
- Container is running but the API is not responding

**Debug:**
```bash
# Check if the API process is running
docker exec ai-trader ps aux | grep uvicorn

# Test internal health (always port 8080 inside the container)
docker exec ai-trader curl http://localhost:8080/health

# Check the configured port
grep API_PORT .env
```

**Solutions:**

**If you get an HTML 404 page:**
Another service is using your configured port.

```bash
# Find the conflicting service
sudo lsof -i :8080

# Change AI-Trader's port
echo "API_PORT=8889" >> .env
docker-compose down
docker-compose up -d

# Now use the new port
curl http://localhost:8889/health
```

**If MCP services didn't start:**
```bash
# Check MCP processes
docker exec ai-trader ps aux | grep python

# Should see 4 MCP services on ports 8000-8003
```

**If the database has issues:**
```bash
# Check the database file
docker exec ai-trader ls -l /app/data/jobs.db

# If missing, restart to recreate
docker-compose restart
```

---

## Simulation Issues

### Job Stays in "Pending" Status

**Symptoms:**
- Job triggered but never progresses to "running"
- Status remains "pending" indefinitely

**Debug:**
```bash
# Check worker logs
docker logs ai-trader | grep -i "worker\|simulation"

# Check the database
docker exec ai-trader sqlite3 /app/data/jobs.db "SELECT * FROM job_details;"

# Check MCP service accessibility
docker exec ai-trader curl http://localhost:8000/health
```

**Solutions:**

```bash
# Restart the container (jobs resume automatically)
docker-compose restart

# Check a specific job's status with details
curl http://localhost:8080/simulate/status/$JOB_ID | jq '.details'
```

---

### Job Takes Too Long / Timeouts

**Symptoms:**
- Jobs taking longer than expected
- Test scripts timing out

**Expected Execution Times:**
- Single model-day: 2-5 minutes (with cached price data)
- First run with data download: 10-15 minutes
- 2-date, 2-model job: 10-20 minutes

**Solutions:**

**Increase the poll timeout in monitoring:**
```bash
# Instead of a fixed number of polls, loop until a terminal state
while true; do
  STATUS=$(curl -s http://localhost:8080/simulate/status/$JOB_ID | jq -r '.status')
  echo "$(date): Status = $STATUS"

  if [[ "$STATUS" == "completed" ]] || [[ "$STATUS" == "partial" ]] || [[ "$STATUS" == "failed" ]]; then
    break
  fi

  sleep 30
done
```
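The same loop in Python, with an explicit overall deadline instead of polling forever. This is a sketch: the terminal states come from this guide, while the interval and timeout values are assumptions, and `fetch_status` is a caller-supplied function (for example, one that GETs `/simulate/status/{job_id}` and returns the `status` field).

```python
import time

TERMINAL_STATES = {"completed", "partial", "failed"}

def poll_until_done(fetch_status, poll_interval=30, timeout=3600,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll fetch_status() until it returns a terminal state or the deadline passes."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"job did not reach a terminal state within {timeout}s")
        sleep(poll_interval)
```

Injecting `clock` and `sleep` keeps the helper testable without waiting on real time.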
**Check if the agent is stuck:**
```bash
# View real-time logs
docker logs -f ai-trader

# Look for repeated errors or infinite loops
```

---

### "No trading dates with complete price data"

**Error Message:**
```
No trading dates with complete price data in range 2025-01-16 to 2025-01-17.
All symbols must have data for a date to be tradeable.
```

**Cause:** Missing price data for the requested dates.

**Solutions:**

**Option 1: Try Recent Dates**

Use more recent dates where data is more likely available:
```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -H "Content-Type: application/json" \
  -d '{"start_date": "2024-12-15", "models": ["gpt-4"]}'
```

**Option 2: Manually Download Data**

```bash
docker exec -it ai-trader bash
cd data
python get_daily_price.py   # Downloads latest data
python merge_jsonl.py       # Merges into database
exit

# Retry simulation
```

**Option 3: Check the Auto-Download Setting**

```bash
# Ensure auto-download is enabled
grep AUTO_DOWNLOAD_PRICE_DATA .env

# Should be: AUTO_DOWNLOAD_PRICE_DATA=true
```

---

### Rate Limit Errors

**Symptoms:**
- Logs show "rate limit" messages
- Partial data downloaded

**Cause:** Alpha Vantage API rate limits (5 req/min free tier, 75 req/min premium)

**Solutions:**

**For the free tier:**
- Simulations automatically continue with available data
- The next simulation resumes downloads
- Consider upgrading to a premium API key

**Workaround:**
```bash
# Pre-download data in batches
docker exec -it ai-trader bash
cd data

# Download in stages (wait 1 min between runs)
python get_daily_price.py
sleep 60
python get_daily_price.py
sleep 60
python get_daily_price.py

python merge_jsonl.py
exit
```

---

## API Issues

### 400 Bad Request: Another Job Running

**Error:**
```json
{
  "detail": "Another simulation job is already running or pending. Please wait for it to complete."
}
```

**Cause:** AI-Trader allows only 1 concurrent job by default.

**Solutions:**

**Check current jobs:**
```bash
# Verify the API is up
curl http://localhost:8080/health

# Query recent jobs (requires checking the database)
docker exec ai-trader sqlite3 /app/data/jobs.db \
  "SELECT job_id, status FROM jobs ORDER BY created_at DESC LIMIT 5;"
```

**Wait for completion:**
```bash
# Get the blocking job's status
curl http://localhost:8080/simulate/status/{job_id}
```

**Force-stop a stuck job (last resort):**
```bash
# Update the job status in the database
docker exec ai-trader sqlite3 /app/data/jobs.db \
  "UPDATE jobs SET status='failed' WHERE status IN ('pending', 'running');"

# Restart the service
docker-compose restart
```

---

### Invalid Date Format Errors

**Error:**
```json
{
  "detail": "Invalid date format: 2025-1-16. Expected YYYY-MM-DD"
}
```

**Solution:** Use zero-padded dates:

```bash
# Wrong
{"start_date": "2025-1-16"}

# Correct
{"start_date": "2025-01-16"}
```

---

### Date Range Too Large

**Error:**
```json
{
  "detail": "Date range too large: 45 days. Maximum allowed: 30 days"
}
```

**Solution:** Split into smaller batches:

```bash
# Instead of 2025-01-01 to 2025-02-15 (45 days),
# run as two jobs:

# Job 1: Jan 1-30
curl -X POST http://localhost:8080/simulate/trigger \
  -d '{"start_date": "2025-01-01", "end_date": "2025-01-30"}'

# Job 2: Jan 31 - Feb 15
curl -X POST http://localhost:8080/simulate/trigger \
  -d '{"start_date": "2025-01-31", "end_date": "2025-02-15"}'
```

---

## Data Issues

### Database Corruption

**Symptoms:**
- "database disk image is malformed"
- Unexpected SQL errors

**Solutions:**

**Backup and rebuild:**
```bash
# Stop the service
docker-compose down

# Backup the current database
cp data/jobs.db data/jobs.db.backup

# Try recovery (the plain alpine image has no sqlite3 binary, so install it first)
docker run --rm -v $(pwd)/data:/data alpine \
  sh -c "apk add --no-cache sqlite && sqlite3 /data/jobs.db 'PRAGMA integrity_check;'"

# If corrupted, delete and restart (loses job history)
rm data/jobs.db
docker-compose up -d
```
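The same integrity check can be run from Python's standard library, which is convenient for scripting. A minimal sketch; the database path is an assumption, and the check should be run against a stopped service or a copy of the file.

```python
import sqlite3

def integrity_ok(db_path: str) -> bool:
    """Run SQLite's built-in integrity check; True when the report is 'ok'."""
    conn = sqlite3.connect(db_path)
    try:
        # PRAGMA integrity_check returns one row per problem, or a single 'ok' row.
        (report,) = conn.execute("PRAGMA integrity_check;").fetchone()
    finally:
        conn.close()
    return report == "ok"

# Example: integrity_ok("data/jobs.db.backup")
```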
---

### Missing Price Data Files

**Symptoms:**
- Errors about missing `merged.jsonl`
- Price query failures

**Solution:**

```bash
# Re-download price data
docker exec -it ai-trader bash
cd data
python get_daily_price.py
python merge_jsonl.py
ls -lh merged.jsonl   # Should exist
exit
```

---

## Performance Issues

### Slow Simulation Execution

**Typical speeds:**
- Single model-day: 2-5 minutes
- With cold start (first time): +3-5 minutes

**Causes & Solutions:**

**1. AI model API is slow**
- Check the AI provider's status page
- Try a different model
- Increase the timeout in config

**2. Network latency**
- Check the internet connection
- The Jina Search API might be slow

**3. MCP services overloaded**
```bash
# Check CPU usage
docker stats ai-trader
```

---

### High Memory Usage

**Normal:** 500MB - 1GB during simulation

**If higher:**
```bash
# Check memory
docker stats ai-trader

# Restart if needed
docker-compose restart
```

---

## Diagnostic Commands

```bash
# Container status
docker ps | grep ai-trader

# Real-time logs
docker logs -f ai-trader

# Check errors only
docker logs ai-trader 2>&1 | grep -i error

# Container resource usage
docker stats ai-trader

# Access container shell
docker exec -it ai-trader bash

# Database inspection
docker exec -it ai-trader sqlite3 /app/data/jobs.db
sqlite> SELECT * FROM jobs ORDER BY created_at DESC LIMIT 5;
sqlite> SELECT status, COUNT(*) FROM jobs GROUP BY status;
sqlite> .quit

# Check file permissions
docker exec ai-trader ls -la /app/data

# Test API connectivity
curl -v http://localhost:8080/health

# View all environment variables
docker exec ai-trader env | sort
```

---

## Getting More Help

If your issue isn't covered here:

1. **Check logs** for specific error messages
2. **Review** [API_REFERENCE.md](../../API_REFERENCE.md) for correct usage
3. **Search** [GitHub Issues](https://github.com/Xe138/AI-Trader/issues)
4. **Open a new issue** with:
   - Error messages from logs
   - Steps to reproduce
   - Environment details (OS, Docker version)
   - Relevant config files (redact API keys)
182
docs/user-guide/using-the-api.md
Normal file
@@ -0,0 +1,182 @@
# Using the API

Common workflows and best practices for the AI-Trader API.

---

## Basic Workflow

### 1. Trigger Simulation

```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -H "Content-Type: application/json" \
  -d '{
    "start_date": "2025-01-16",
    "end_date": "2025-01-17",
    "models": ["gpt-4"]
  }'
```

Save the `job_id` from the response.

### 2. Poll for Completion

```bash
JOB_ID="your-job-id-here"

while true; do
  STATUS=$(curl -s http://localhost:8080/simulate/status/$JOB_ID | jq -r '.status')
  echo "Status: $STATUS"

  if [[ "$STATUS" == "completed" ]] || [[ "$STATUS" == "partial" ]] || [[ "$STATUS" == "failed" ]]; then
    break
  fi

  sleep 10
done
```

### 3. Retrieve Results

```bash
curl "http://localhost:8080/results?job_id=$JOB_ID" | jq '.'
```

---

## Common Patterns

### Single-Day Simulation

Omit `end_date` to simulate just one day:

```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -d '{"start_date": "2025-01-16", "models": ["gpt-4"]}'
```

### All Enabled Models

Omit `models` to run all enabled models from the config:

```bash
curl -X POST http://localhost:8080/simulate/trigger \
  -d '{"start_date": "2025-01-16", "end_date": "2025-01-20"}'
```

### Filter Results

```bash
# By date
curl "http://localhost:8080/results?date=2025-01-16"

# By model
curl "http://localhost:8080/results?model=gpt-4"

# Combined
curl "http://localhost:8080/results?job_id=$JOB_ID&date=2025-01-16&model=gpt-4"
```

---

## Best Practices

### 1. Check Health Before Triggering

```bash
curl http://localhost:8080/health

# Only proceed if status is "healthy"
```

### 2. Use Exponential Backoff for Retries

```python
import time
import requests

def trigger_with_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.post(
                "http://localhost:8080/simulate/trigger",
                json={"start_date": "2025-01-16"}
            )
            response.raise_for_status()
            return response.json()
        except requests.HTTPError as e:
            if e.response.status_code == 400:
                # Don't retry on validation errors
                raise
            wait = 2 ** attempt  # 1s, 2s, 4s
            time.sleep(wait)

    raise Exception("Max retries exceeded")
```

### 3. Handle Concurrent Job Conflicts

```python
import requests

response = requests.post(
    "http://localhost:8080/simulate/trigger",
    json={"start_date": "2025-01-16"}
)

if response.status_code == 400 and "already running" in response.json()["detail"]:
    print("Another job is running. Waiting...")
    # Wait and retry, or query the existing job's status
```
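The conflict check above can be wrapped in a helper that waits for the single job slot to free up. A sketch, not part of the API itself: `post` is a caller-supplied function returning `(status_code, body_dict)`, and the retry timing is an assumption.

```python
import time

def trigger_when_free(post, payload, retry_interval=60, max_wait=3600, sleep=time.sleep):
    """Retry the trigger while another job holds the slot; raise on other errors."""
    waited = 0
    while True:
        code, body = post(payload)
        if code == 400 and "already running" in body.get("detail", ""):
            # The single job slot is occupied; wait and try again.
            if waited >= max_wait:
                raise TimeoutError("gave up waiting for the job slot")
            sleep(retry_interval)
            waited += retry_interval
            continue
        if code >= 400:
            raise RuntimeError(f"trigger failed: {code} {body}")
        return body["job_id"]
```

In production, `post` could wrap `requests.post(...)` and return `(resp.status_code, resp.json())`; keeping it injectable makes the retry logic easy to test.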
### 4. Monitor Progress with Details

```python
import requests

def get_detailed_progress(job_id):
    response = requests.get(f"http://localhost:8080/simulate/status/{job_id}")
    status = response.json()

    print(f"Overall: {status['status']}")
    print(f"Progress: {status['progress']['completed']}/{status['progress']['total_model_days']}")

    # Show per-model-day status
    for detail in status['details']:
        print(f"  {detail['trading_date']} {detail['model_signature']}: {detail['status']}")
```

---

## Error Handling

### Validation Errors (400)

```python
import requests

try:
    response = requests.post(
        "http://localhost:8080/simulate/trigger",
        json={"start_date": "2025-1-16"}  # Wrong format
    )
    response.raise_for_status()
except requests.HTTPError as e:
    if e.response.status_code == 400:
        print(f"Validation error: {e.response.json()['detail']}")
        # Fix the input and retry
```

### Service Unavailable (503)

```python
import requests

try:
    response = requests.post(
        "http://localhost:8080/simulate/trigger",
        json={"start_date": "2025-01-16"}
    )
    response.raise_for_status()
except requests.HTTPError as e:
    if e.response.status_code == 503:
        print("Service unavailable (likely the price data download failed)")
        # Retry later or check ALPHAADVANTAGE_API_KEY
```

---

See [API_REFERENCE.md](../../API_REFERENCE.md) for complete endpoint documentation.
@@ -1,900 +0,0 @@
|
||||
# Background Worker Architecture Specification
|
||||
|
||||
## 1. Overview
|
||||
|
||||
The Background Worker executes simulation jobs asynchronously, allowing the API to return immediately (202 Accepted) while simulations run in the background.
|
||||
|
||||
**Key Responsibilities:**
|
||||
1. Execute simulation jobs queued by `/simulate/trigger` endpoint
|
||||
2. Manage per-model-day execution with status updates
|
||||
3. Handle errors gracefully (model failures don't block other models)
|
||||
4. Coordinate runtime configuration for concurrent model execution
|
||||
5. Update job status in database throughout execution
|
||||
|
||||
---
|
||||
|
||||
## 2. Worker Architecture
|
||||
|
||||
### 2.1 Execution Model
|
||||
|
||||
**Pattern:** Date-sequential, Model-parallel execution
|
||||
|
||||
```
|
||||
Job: Simulate 2025-01-16 to 2025-01-18 for models [gpt-5, claude-3.7-sonnet]
|
||||
|
||||
Execution flow:
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ Date: 2025-01-16 │
|
||||
│ ├─ gpt-5 (running) ┐ │
|
||||
│ └─ claude-3.7-sonnet (running) ┘ Parallel │
|
||||
└─────────────────────────────────────────────────────────────┘
|
||||
│
|
||||
▼ (both complete)
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ Date: 2025-01-17 │
|
||||
│ ├─ gpt-5 (running) ┐ │
|
||||
│ └─ claude-3.7-sonnet (running) ┘ Parallel │
|
||||
└─────────────────────────────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ Date: 2025-01-18 │
|
||||
│ ├─ gpt-5 (running) ┐ │
|
||||
│ └─ claude-3.7-sonnet (running) ┘ Parallel │
|
||||
└─────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Rationale:**
|
||||
- **Models run in parallel** → Faster total execution (30-60s per model-day, 3 models = ~30-60s per date instead of ~90-180s)
|
||||
- **Dates run sequentially** → Ensures position.jsonl integrity (no concurrent writes to same file)
|
||||
- **Independent failure handling** → One model's failure doesn't block other models
|
||||
|
||||
---
|
||||
|
||||
### 2.2 File Structure
|
||||
|
||||
```
|
||||
api/
|
||||
├── worker.py # SimulationWorker class
|
||||
├── executor.py # Single model-day execution logic
|
||||
└── runtime_manager.py # Runtime config isolation
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Worker Implementation
|
||||
|
||||
### 3.1 SimulationWorker Class
|
||||
|
||||
```python
|
||||
# api/worker.py
|
||||
|
||||
import asyncio
|
||||
from typing import List, Dict
|
||||
from datetime import datetime
|
||||
import logging
|
||||
from api.job_manager import JobManager
|
||||
from api.executor import ModelDayExecutor
|
||||
from main import load_config, get_agent_class
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
class SimulationWorker:
|
||||
"""
|
||||
Executes simulation jobs in the background.
|
||||
|
||||
Manages:
|
||||
- Date-sequential, model-parallel execution
|
||||
- Job status updates throughout execution
|
||||
- Error handling and recovery
|
||||
"""
|
||||
|
||||
def __init__(self, job_manager: JobManager):
|
||||
self.job_manager = job_manager
|
||||
self.executor = ModelDayExecutor(job_manager)
|
||||
|
||||
async def run_job(self, job_id: str) -> None:
|
||||
"""
|
||||
Execute a simulation job.
|
||||
|
||||
Args:
|
||||
job_id: UUID of job to execute
|
||||
|
||||
Flow:
|
||||
1. Load job from database
|
||||
2. Load configuration file
|
||||
3. Initialize agents for each model
|
||||
4. For each date sequentially:
|
||||
- Run all models in parallel
|
||||
- Update status after each model-day
|
||||
5. Mark job as completed/partial/failed
|
||||
"""
|
||||
logger.info(f"Starting simulation job {job_id}")
|
||||
|
||||
try:
|
||||
# 1. Load job metadata
|
||||
job = self.job_manager.get_job(job_id)
|
||||
if not job:
|
||||
logger.error(f"Job {job_id} not found")
|
||||
return
|
||||
|
||||
# 2. Update job status to 'running'
|
||||
self.job_manager.update_job_status(job_id, "running")
|
||||
|
||||
# 3. Load configuration
|
||||
config = load_config(job["config_path"])
|
||||
|
||||
# 4. Get enabled models from config
|
||||
enabled_models = [
|
||||
m for m in config["models"]
|
||||
if m.get("signature") in job["models"] and m.get("enabled", True)
|
||||
]
|
||||
|
||||
if not enabled_models:
|
||||
raise ValueError("No enabled models found in configuration")
|
||||
|
||||
# 5. Get agent class
|
||||
agent_type = config.get("agent_type", "BaseAgent")
|
||||
AgentClass = get_agent_class(agent_type)
|
||||
|
||||
# 6. Execute each date sequentially
|
||||
for date in job["date_range"]:
|
||||
logger.info(f"[Job {job_id}] Processing date: {date}")
|
||||
|
||||
# Run all models for this date in parallel
|
||||
tasks = []
|
||||
for model_config in enabled_models:
|
||||
task = self.executor.run_model_day(
|
||||
job_id=job_id,
|
||||
date=date,
|
||||
model_config=model_config,
|
||||
agent_class=AgentClass,
|
||||
config=config
|
||||
)
|
||||
tasks.append(task)
|
||||
|
||||
# Wait for all models to complete this date
|
||||
results = await asyncio.gather(*tasks, return_exceptions=True)
|
||||
|
||||
# Log any exceptions (already handled by executor, just for visibility)
|
||||
for i, result in enumerate(results):
|
||||
if isinstance(result, Exception):
|
||||
model_sig = enabled_models[i]["signature"]
|
||||
logger.error(f"[Job {job_id}] Model {model_sig} failed on {date}: {result}")
|
||||
|
||||
logger.info(f"[Job {job_id}] Date {date} completed")
|
||||
|
||||
# 7. Job execution finished - final status will be set by job_manager
|
||||
# based on job_details statuses
|
||||
logger.info(f"[Job {job_id}] All dates processed")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[Job {job_id}] Fatal error: {e}", exc_info=True)
|
||||
self.job_manager.update_job_status(job_id, "failed", error=str(e))
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 3.2 ModelDayExecutor
|
||||
|
||||
```python
|
||||
# api/executor.py
|
||||
|
||||
import asyncio
|
||||
import os
|
||||
import logging
|
||||
from typing import Dict, Any
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from api.job_manager import JobManager
|
||||
from api.runtime_manager import RuntimeConfigManager
|
||||
from tools.general_tools import write_config_value
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
class ModelDayExecutor:
|
||||
"""
|
||||
Executes a single model-day simulation.
|
||||
|
||||
Responsibilities:
|
||||
- Initialize agent for specific model
|
||||
- Set up isolated runtime configuration
|
||||
- Execute trading session
|
||||
- Update job_detail status
|
||||
- Handle errors without blocking other models
|
||||
"""
|
||||
|
||||
def __init__(self, job_manager: JobManager):
|
||||
self.job_manager = job_manager
|
||||
self.runtime_manager = RuntimeConfigManager()
|
||||
|
||||
async def run_model_day(
|
||||
self,
|
||||
job_id: str,
|
||||
date: str,
|
||||
model_config: Dict[str, Any],
|
||||
agent_class: type,
|
||||
config: Dict[str, Any]
|
||||
) -> None:
|
||||
"""
|
||||
Execute simulation for one model on one date.
|
||||
|
||||
Args:
|
||||
job_id: Job UUID
|
||||
date: Trading date (YYYY-MM-DD)
|
||||
model_config: Model configuration dict from config file
|
||||
agent_class: Agent class (e.g., BaseAgent)
|
||||
config: Full configuration dict
|
||||
|
||||
Updates:
|
||||
- job_details status: pending → running → completed/failed
|
||||
- Writes to position.jsonl and log.jsonl
|
||||
"""
|
||||
model_sig = model_config["signature"]
|
||||
logger.info(f"[Job {job_id}] Starting {model_sig} on {date}")
|
||||
|
||||
# Update status to 'running'
|
||||
self.job_manager.update_job_detail_status(
|
||||
job_id, date, model_sig, "running"
|
||||
)
|
||||
|
||||
# Create isolated runtime config for this execution
|
||||
runtime_config_path = self.runtime_manager.create_runtime_config(
|
||||
job_id=job_id,
|
||||
model_sig=model_sig,
|
||||
date=date
|
||||
)
|
||||
|
||||
try:
|
||||
# 1. Extract model parameters
|
||||
basemodel = model_config.get("basemodel")
|
||||
openai_base_url = model_config.get("openai_base_url")
|
||||
openai_api_key = model_config.get("openai_api_key")
|
||||
|
||||
if not basemodel:
|
||||
raise ValueError(f"Model {model_sig} missing basemodel field")
|
||||
|
||||
# 2. Get agent configuration
|
        agent_config = config.get("agent_config", {})
        log_config = config.get("log_config", {})

        max_steps = agent_config.get("max_steps", 10)
        max_retries = agent_config.get("max_retries", 3)
        base_delay = agent_config.get("base_delay", 0.5)
        initial_cash = agent_config.get("initial_cash", 10000.0)
        log_path = log_config.get("log_path", "./data/agent_data")

        # 3. Get stock symbols from prompts
        from prompts.agent_prompt import all_nasdaq_100_symbols

        # 4. Create agent instance
        agent = agent_class(
            signature=model_sig,
            basemodel=basemodel,
            stock_symbols=all_nasdaq_100_symbols,
            log_path=log_path,
            openai_base_url=openai_base_url,
            openai_api_key=openai_api_key,
            max_steps=max_steps,
            max_retries=max_retries,
            base_delay=base_delay,
            initial_cash=initial_cash,
            init_date=date  # Note: used for initial registration
        )

        # 5. Initialize MCP connection and AI model
        # (Only do this once per job, not per date - optimization for future)
        await agent.initialize()

        # 6. Set runtime configuration for this execution:
        # override RUNTIME_ENV_PATH to use the isolated config
        original_runtime_path = os.environ.get("RUNTIME_ENV_PATH")
        os.environ["RUNTIME_ENV_PATH"] = runtime_config_path

        try:
            # Write runtime config values
            write_config_value("TODAY_DATE", date)
            write_config_value("SIGNATURE", model_sig)
            write_config_value("IF_TRADE", False)

            # 7. Execute trading session
            await agent.run_trading_session(date)

            # 8. Mark as completed
            self.job_manager.update_job_detail_status(
                job_id, date, model_sig, "completed"
            )

            logger.info(f"[Job {job_id}] Completed {model_sig} on {date}")

        finally:
            # Restore original runtime path
            if original_runtime_path:
                os.environ["RUNTIME_ENV_PATH"] = original_runtime_path
            else:
                os.environ.pop("RUNTIME_ENV_PATH", None)

    except Exception as e:
        # Log the error and update status to 'failed'
        error_msg = f"{type(e).__name__}: {str(e)}"
        logger.error(
            f"[Job {job_id}] Failed {model_sig} on {date}: {error_msg}",
            exc_info=True
        )

        self.job_manager.update_job_detail_status(
            job_id, date, model_sig, "failed", error=error_msg
        )

    finally:
        # Cleanup runtime config file
        self.runtime_manager.cleanup_runtime_config(runtime_config_path)
```

---

### 3.3 RuntimeConfigManager

```python
# api/runtime_manager.py

import json
import logging
import os
from pathlib import Path

logger = logging.getLogger(__name__)


class RuntimeConfigManager:
    """
    Manages isolated runtime configuration files for concurrent model execution.

    Problem:
        Multiple models running concurrently need separate runtime_env.json files
        to avoid race conditions on the TODAY_DATE, SIGNATURE, and IF_TRADE values.

    Solution:
        Create a temporary runtime config file per model-day execution:
        /app/data/runtime_env_{job_id}_{model}_{date}.json

    Lifecycle:
        1. create_runtime_config() → creates the temp file
        2. Executor sets the RUNTIME_ENV_PATH env var
        3. Agent uses the isolated config via get_config_value/write_config_value
        4. cleanup_runtime_config() → deletes the temp file
    """

    def __init__(self, data_dir: str = "data"):
        self.data_dir = Path(data_dir)
        self.data_dir.mkdir(parents=True, exist_ok=True)

    def create_runtime_config(
        self,
        job_id: str,
        model_sig: str,
        date: str
    ) -> str:
        """
        Create an isolated runtime config file for this execution.

        Args:
            job_id: Job UUID
            model_sig: Model signature
            date: Trading date

        Returns:
            Path to the created runtime config file
        """
        # Generate unique filename
        filename = f"runtime_env_{job_id[:8]}_{model_sig}_{date}.json"
        config_path = self.data_dir / filename

        # Initialize with default values
        initial_config = {
            "TODAY_DATE": date,
            "SIGNATURE": model_sig,
            "IF_TRADE": False,
            "JOB_ID": job_id
        }

        with open(config_path, "w", encoding="utf-8") as f:
            json.dump(initial_config, f, indent=4)

        logger.debug(f"Created runtime config: {config_path}")
        return str(config_path)

    def cleanup_runtime_config(self, config_path: str) -> None:
        """
        Delete the runtime config file after execution.

        Args:
            config_path: Path to the runtime config file
        """
        try:
            if os.path.exists(config_path):
                os.unlink(config_path)
                logger.debug(f"Cleaned up runtime config: {config_path}")
        except Exception as e:
            logger.warning(f"Failed to cleanup runtime config {config_path}: {e}")

    def cleanup_all_runtime_configs(self) -> int:
        """
        Cleanup all runtime config files (for maintenance/startup).

        Returns:
            Number of files deleted
        """
        count = 0
        for config_file in self.data_dir.glob("runtime_env_*.json"):
            try:
                config_file.unlink()
                count += 1
            except Exception as e:
                logger.warning(f"Failed to delete {config_file}: {e}")

        if count > 0:
            logger.info(f"Cleaned up {count} stale runtime config files")

        return count
```
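
The lifecycle from the docstring can be sketched end to end. This is illustrative only: the class below is a trimmed stand-in with the same create/cleanup surface as `RuntimeConfigManager`, so the snippet runs on its own; the job ID, model signature, and date are made-up values.

```python
import json
import os
import tempfile
from pathlib import Path

# Trimmed stand-in for RuntimeConfigManager, just enough for the lifecycle demo.
class Manager:
    def __init__(self, data_dir):
        self.data_dir = Path(data_dir)
        self.data_dir.mkdir(parents=True, exist_ok=True)

    def create_runtime_config(self, job_id, model_sig, date):
        path = self.data_dir / f"runtime_env_{job_id[:8]}_{model_sig}_{date}.json"
        path.write_text(json.dumps(
            {"TODAY_DATE": date, "SIGNATURE": model_sig,
             "IF_TRADE": False, "JOB_ID": job_id}, indent=4))
        return str(path)

    def cleanup_runtime_config(self, path):
        if os.path.exists(path):
            os.unlink(path)

# Lifecycle: create (step 1) → point the agent at it via env var (step 2) →
# cleanup (step 4). Step 3 is the agent reading the isolated config.
manager = Manager(tempfile.mkdtemp())
config_path = manager.create_runtime_config("550e8400-e29b", "gpt-5", "2025-01-16")
os.environ["RUNTIME_ENV_PATH"] = config_path
try:
    pass  # the model-day runs here, reading the isolated config
finally:
    os.environ.pop("RUNTIME_ENV_PATH", None)
    manager.cleanup_runtime_config(config_path)
```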

---

## 4. Integration with FastAPI

### 4.1 Background Task Pattern

```python
# api/main.py

from datetime import datetime

from fastapi import FastAPI, BackgroundTasks, HTTPException

from api.job_manager import JobManager
from api.worker import SimulationWorker
from api.models import TriggerSimulationRequest, TriggerSimulationResponse

# load_config, calculate_date_range, get_last_simulation_date, and
# get_next_trading_day are helpers defined elsewhere in the API package.

app = FastAPI(title="AI-Trader API")

# Global instances
job_manager = JobManager()
worker = SimulationWorker(job_manager)

@app.post("/simulate/trigger", response_model=TriggerSimulationResponse)
async def trigger_simulation(
    request: TriggerSimulationRequest,
    background_tasks: BackgroundTasks
):
    """
    Trigger a catch-up simulation job.

    Returns:
        202 Accepted with job details if a new job was queued
        200 OK with existing job details if one is already running
    """
    # 1. Load configuration
    config = load_config(request.config_path)

    # 2. Determine date range (last position date → most recent trading day)
    date_range = calculate_date_range(config)

    if not date_range:
        return {
            "status": "current",
            "message": "Simulation already up-to-date",
            "last_simulation_date": get_last_simulation_date(config),
            "next_trading_day": get_next_trading_day()
        }

    # 3. Get enabled models
    models = [m["signature"] for m in config["models"] if m.get("enabled", True)]

    # 4. Check for an existing job with the same date range
    existing_job = job_manager.find_job_by_date_range(date_range)
    if existing_job:
        # Return existing job status
        progress = job_manager.get_job_progress(existing_job["job_id"])
        return {
            "job_id": existing_job["job_id"],
            "status": existing_job["status"],
            "date_range": date_range,
            "models": models,
            "created_at": existing_job["created_at"],
            "message": "Simulation already in progress",
            "progress": progress
        }

    # 5. Create new job
    try:
        job_id = job_manager.create_job(
            config_path=request.config_path,
            date_range=date_range,
            models=models
        )
    except ValueError as e:
        # Another job is running (different date range)
        raise HTTPException(status_code=409, detail=str(e))

    # 6. Queue background task
    background_tasks.add_task(worker.run_job, job_id)

    # 7. Return immediately with job details
    return {
        "job_id": job_id,
        "status": "accepted",
        "date_range": date_range,
        "models": models,
        "created_at": datetime.utcnow().isoformat() + "Z",
        "message": "Simulation job queued successfully"
    }
```

---

## 5. Agent Initialization Optimization

### 5.1 Current Issue

**Problem:** Each model-day calls `agent.initialize()`, which:

1. Creates new MCP client connections
2. Creates a new AI model instance

For a 5-day simulation with 3 models, that is 15 `initialize()` calls → slow.

### 5.2 Optimization Strategy (Future Enhancement)

**Option A: Persistent Agent Instances**

Create each agent once per model and reuse it for all dates:

```python
class SimulationWorker:
    async def run_job(self, job_id: str) -> None:
        # ... load config ...

        # Initialize all agents once
        agents = {}
        for model_config in enabled_models:
            agent = await self._create_and_initialize_agent(
                model_config, AgentClass, config
            )
            agents[model_config["signature"]] = agent

        # Execute dates
        for date in job["date_range"]:
            tasks = []
            for model_sig, agent in agents.items():
                task = self.executor.run_model_day_with_agent(
                    job_id, date, agent
                )
                tasks.append(task)

            await asyncio.gather(*tasks, return_exceptions=True)
```

**Benefit:** ~10-15s saved per job (avoids repeated MCP handshakes)

**Tradeoff:** Higher memory usage (agents kept in memory) and more complex error handling

**Recommendation:** Implement in v2 after MVP validation

---

## 6. Error Handling & Recovery

### 6.1 Model-Day Failure Scenarios

**Scenario 1: AI Model API Timeout**

```python
# In executor.run_model_day()
try:
    await agent.run_trading_session(date)
except asyncio.TimeoutError:
    error_msg = "AI model API timeout after 30s"
    self.job_manager.update_job_detail_status(
        job_id, date, model_sig, "failed", error=error_msg
    )
    # Do NOT raise - let other models continue
```

**Scenario 2: MCP Service Down**

```python
# In agent.initialize()
except RuntimeError as e:
    if "Failed to initialize MCP client" in str(e):
        error_msg = "MCP services unavailable - check agent_tools/start_mcp_services.py"
        self.job_manager.update_job_detail_status(
            job_id, date, model_sig, "failed", error=error_msg
        )
        # This likely affects all models - but still don't raise;
        # let job_manager determine the final status
```

**Scenario 3: Out of Cash**

```python
# In trade tool
if position["CASH"] < total_cost:
    # The trade tool returns an error message; the agent receives it and
    # continues reasoning (it might sell other stocks instead).
    # Not a fatal error - the trading session completes normally.
    ...
```

### 6.2 Job-Level Failure

**When does the entire job fail?**

Only if:

1. The configuration file is invalid or missing
2. The agent class import fails
3. A database error occurs during status updates

In these cases, `worker.run_job()` catches the exception and marks the job as `failed`.

All other errors (model-day failures) result in `partial` status.
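
The shape of that top-level guard in `worker.run_job()` is roughly the following. This is a sketch, not the actual implementation: `StubJobManager` stands in for the real `JobManager`, and `load_config` is injected so the two failure modes can be exercised.

```python
import asyncio

class StubJobManager:
    """Stand-in for api/job_manager.py, recording only the final status."""
    def __init__(self):
        self.status = None
    def update_job_status(self, job_id, status):
        self.status = status

async def run_job(job_id, job_manager, load_config):
    try:
        config = load_config()  # invalid/missing config raises here → job fails
        # ... per-date, per-model execution goes here; individual model-day
        # failures only update job_details and never raise out of this block ...
        job_manager.update_job_status(job_id, "completed")
    except Exception:
        # Fatal: bad config, agent import failure, or database errors
        job_manager.update_job_status(job_id, "failed")

def bad_config():
    raise ValueError("missing config file")

jm = StubJobManager()
asyncio.run(run_job("job-1", jm, load_config=lambda: {}))
print(jm.status)  # completed

jm2 = StubJobManager()
asyncio.run(run_job("job-2", jm2, load_config=bad_config))
print(jm2.status)  # failed
```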

---

## 7. Logging Strategy

### 7.1 Log Levels by Component

**Worker (`api/worker.py`):**
- `INFO`: Job start/end, date transitions
- `ERROR`: Fatal job errors

**Executor (`api/executor.py`):**
- `INFO`: Model-day start/completion
- `ERROR`: Model-day failures (with `exc_info=True`)

**Agent (`base_agent.py`):**
- Existing logging (step-by-step execution)
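
These levels can be applied per component through the standard library's logger hierarchy. The logger names below are assumed to mirror the module paths above; adjust to the real hierarchy:

```python
import logging

# Assumed logger names, mirroring the module paths in this spec.
logging.getLogger("api.worker").setLevel(logging.INFO)
logging.getLogger("api.executor").setLevel(logging.INFO)

# Child loggers such as "api.worker.jobs" inherit these levels
# unless explicitly overridden.
```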

### 7.2 Structured Logging Format

```python
import logging
import json

class JSONFormatter(logging.Formatter):
    def format(self, record):
        log_record = {
            "timestamp": self.formatTime(record, self.datefmt),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }

        # Add extra fields if present
        if hasattr(record, "job_id"):
            log_record["job_id"] = record.job_id
        if hasattr(record, "model"):
            log_record["model"] = record.model
        if hasattr(record, "date"):
            log_record["date"] = record.date

        return json.dumps(log_record)

# Configure logger
handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logger = logging.getLogger("api")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
```
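
The `job_id`/`model`/`date` attributes the formatter checks for arrive via the standard `extra` keyword on logging calls, which attaches the mapping's entries as attributes on the `LogRecord`. A self-contained demonstration (formatter trimmed from the version above; values are illustrative):

```python
import io
import json
import logging

# Minimal repeat of the formatter above, just enough to show the mechanism.
class JSONFormatter(logging.Formatter):
    def format(self, record):
        out = {"level": record.levelname, "message": record.getMessage()}
        for key in ("job_id", "model", "date"):
            if hasattr(record, key):
                out[key] = getattr(record, key)
        return json.dumps(out)

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JSONFormatter())
logger = logging.getLogger("api.demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# The `extra` mapping becomes attributes on the emitted LogRecord:
logger.info("Starting gpt-5 on 2025-01-16",
            extra={"job_id": "550e8400", "model": "gpt-5", "date": "2025-01-16"})

print(stream.getvalue().strip())
```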

### 7.3 Log Output Example

```json
{"timestamp": "2025-01-20T14:30:00Z", "level": "INFO", "logger": "api.worker", "message": "Starting simulation job 550e8400-...", "job_id": "550e8400-..."}
{"timestamp": "2025-01-20T14:30:01Z", "level": "INFO", "logger": "api.executor", "message": "Starting gpt-5 on 2025-01-16", "job_id": "550e8400-...", "model": "gpt-5", "date": "2025-01-16"}
{"timestamp": "2025-01-20T14:30:45Z", "level": "INFO", "logger": "api.executor", "message": "Completed gpt-5 on 2025-01-16", "job_id": "550e8400-...", "model": "gpt-5", "date": "2025-01-16"}
```

---

## 8. Testing Strategy

### 8.1 Unit Tests

```python
# tests/test_worker.py

import pytest
from unittest.mock import AsyncMock, MagicMock

from api.worker import SimulationWorker
from api.job_manager import JobManager

@pytest.fixture
def mock_job_manager():
    jm = MagicMock(spec=JobManager)
    jm.get_job.return_value = {
        "job_id": "test-job-123",
        "config_path": "configs/test.json",
        "date_range": ["2025-01-16", "2025-01-17"],
        "models": ["gpt-5"]
    }
    return jm

@pytest.fixture
def worker(mock_job_manager):
    return SimulationWorker(mock_job_manager)

@pytest.mark.asyncio
async def test_run_job_success(worker, mock_job_manager):
    # Mock executor
    worker.executor.run_model_day = AsyncMock(return_value=None)

    await worker.run_job("test-job-123")

    # Verify job status updated to running
    mock_job_manager.update_job_status.assert_any_call("test-job-123", "running")

    # Verify executor called for each model-day
    assert worker.executor.run_model_day.call_count == 2  # 2 dates × 1 model

@pytest.mark.asyncio
async def test_run_job_partial_failure(worker, mock_job_manager):
    # Mock executor - first call succeeds, second fails
    worker.executor.run_model_day = AsyncMock(
        side_effect=[None, Exception("API timeout")]
    )

    await worker.run_job("test-job-123")

    # Job should continue despite one failure
    assert worker.executor.run_model_day.call_count == 2

    # Job status determined by job_manager based on job_details
    # (tested in test_job_manager.py)
```

### 8.2 Integration Tests

```python
# tests/test_integration.py

import time

from fastapi.testclient import TestClient

from api.main import app

client = TestClient(app)

def test_trigger_and_poll_simulation():
    # 1. Trigger simulation
    response = client.post("/simulate/trigger", json={
        "config_path": "configs/test.json"
    })
    assert response.status_code == 202
    job_id = response.json()["job_id"]

    # 2. Poll status (may need to wait for the background task)
    time.sleep(2)  # Wait for execution to start

    response = client.get(f"/simulate/status/{job_id}")
    assert response.status_code == 200
    assert response.json()["status"] in ("running", "completed")

    # 3. Wait for completion (with timeout)
    max_wait = 60  # seconds
    start_time = time.time()
    status = None
    while time.time() - start_time < max_wait:
        response = client.get(f"/simulate/status/{job_id}")
        status = response.json()["status"]
        if status in ("completed", "partial", "failed"):
            break
        time.sleep(5)

    assert status in ("completed", "partial")
```

---

## 9. Performance Monitoring

### 9.1 Metrics to Track

**Job-level metrics:**
- Total duration (from trigger to completion)
- Model-day failure rate
- Average model-day duration

**System-level metrics:**
- Concurrent job count (should be ≤ 1)
- Database query latency
- MCP service response times
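
For example, the model-day failure rate can be derived directly from the per-model-day status rows that `job_manager` maintains. The `job_details` schema below is an assumption for illustration, as are the sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_details (job_id TEXT, date TEXT, model TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO job_details VALUES (?, ?, ?, ?)",
    [
        ("j1", "2025-01-16", "gpt-5", "completed"),
        ("j1", "2025-01-16", "claude", "failed"),
        ("j1", "2025-01-17", "gpt-5", "completed"),
        ("j1", "2025-01-17", "claude", "completed"),
    ],
)

# SQLite evaluates (status = 'failed') as 0/1, so SUM counts the failures.
failed, total = conn.execute(
    "SELECT SUM(status = 'failed'), COUNT(*) FROM job_details WHERE job_id = ?",
    ("j1",),
).fetchone()
print(failed / total)  # 0.25
```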

### 9.2 Instrumentation (Future)

```python
# api/metrics.py

from prometheus_client import Counter, Histogram, Gauge

# Job metrics
job_counter = Counter('simulation_jobs_total', 'Total simulation jobs', ['status'])
job_duration = Histogram('simulation_job_duration_seconds', 'Job execution time')

# Model-day metrics
model_day_counter = Counter('model_days_total', 'Total model-days', ['model', 'status'])
model_day_duration = Histogram('model_day_duration_seconds', 'Model-day execution time', ['model'])

# System metrics
concurrent_jobs = Gauge('concurrent_jobs', 'Number of running jobs')
```

**Usage:**

```python
# In worker.run_job()
with job_duration.time():
    await self._execute_job_logic(job_id)
job_counter.labels(status=final_status).inc()
```

---

## 10. Concurrency Safety

### 10.1 Thread Safety

**FastAPI Background Tasks:**
- Run in a threadpool (sync functions) or as asyncio tasks (async functions)
- For the MVP, async functions are used, so tasks run on the event loop

**SQLite Thread Safety:**
- `check_same_thread=False` allows multi-thread access
- Each operation opens a new connection → safe for low concurrency
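
A plausible shape for the `get_db_connection` helper used in 10.2 below, under exactly these assumptions (fresh short-lived connection per operation, cross-thread access allowed); this body is a sketch, not the actual implementation:

```python
import sqlite3

def get_db_connection(db_path: str) -> sqlite3.Connection:
    # A fresh, short-lived connection per operation; check_same_thread=False
    # lets connections be used from FastAPI worker threads.
    conn = sqlite3.connect(db_path, check_same_thread=False)
    conn.row_factory = sqlite3.Row           # dict-style column access
    conn.execute("PRAGMA journal_mode=WAL")  # readers don't block the writer
    return conn

conn = get_db_connection(":memory:")
print(conn.execute("SELECT 1 AS x").fetchone()["x"])  # 1
```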

**File I/O:**
- `position.jsonl` writes are sequential per model → safe
- Different models write to different files → safe

### 10.2 Race Condition Scenarios

**Scenario: Two trigger requests at the exact same time**

```
Thread A: Check can_start_new_job() → True
Thread B: Check can_start_new_job() → True
Thread A: Create job → Success
Thread B: Create job → Success (PROBLEM: 2 jobs running)
```

**Mitigation: Database-level locking**

```python
def can_start_new_job(self) -> bool:
    conn = get_db_connection(self.db_path)
    cursor = conn.cursor()

    # SELECT ... FOR UPDATE row locking is not supported in SQLite.
    # Instead, use a UNIQUE constraint on (status, created_at) for
    # pending/running jobs.
    cursor.execute("""
        SELECT COUNT(*) FROM jobs
        WHERE status IN ('pending', 'running')
    """)

    count = cursor.fetchone()[0]
    conn.close()

    return count == 0
```

**For MVP:** Accept the risk of the rare double-job scenario (extremely unlikely with Windmill polling)

**For Production:** Use PostgreSQL with row-level locking, or a distributed lock (Redis)
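
If the double-job risk ever matters before a PostgreSQL move, SQLite can make the check-and-insert atomic with `BEGIN IMMEDIATE`, which acquires the write lock up front. The sketch below is not part of the current spec: it assumes a `jobs(job_id, status)` table and a connection opened with `isolation_level=None` so the explicit `BEGIN` is honored.

```python
import sqlite3

def try_create_job(conn: sqlite3.Connection, job_id: str) -> bool:
    """Atomically check for active jobs and insert a new one, or refuse."""
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the write lock now, not at first write
        (active,) = conn.execute(
            "SELECT COUNT(*) FROM jobs WHERE status IN ('pending', 'running')"
        ).fetchone()
        if active:
            conn.rollback()
            return False
        conn.execute(
            "INSERT INTO jobs (job_id, status) VALUES (?, 'pending')", (job_id,)
        )
        conn.commit()
        return True
    except sqlite3.OperationalError:  # lock contention / busy timeout
        conn.rollback()
        return False

# Demo: isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN above is honored instead of fighting implicit transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE jobs (job_id TEXT, status TEXT)")
print(try_create_job(conn, "job-a"))  # True
print(try_create_job(conn, "job-b"))  # False (job-a is still pending)
```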

---

## Summary

The Background Worker provides:

1. **Async job execution** with FastAPI BackgroundTasks
2. **Parallel model execution** for faster completion
3. **Isolated runtime configs** to prevent state collisions
4. **Graceful error handling** where model failures don't block others
5. **Comprehensive logging** for debugging and monitoring

**Next specification:** BaseAgent Refactoring for Single-Day Execution