mirror of https://github.com/Xe138/AI-Trader.git synced 2026-04-01 17:17:24 -04:00

Files

Bill a9dd346b35 fix: correct test suite failures for async price download

Fixed two test issues:
1. test_config_override.py: Updated hardcoded worktree path from config-override-system to async-price-download
2. test_dev_database.py: Added thread-local connection cleanup to prevent SQLite file locking issues

All tests now pass:
- Unit tests: 200 tests
- Integration tests: 47 tests (46 passed, 1 skipped)
- E2E tests: 3 tests
- Total: 250 tests collected

2025-11-02 07:00:19 -05:00

56 KiB

Raw Blame History

Async Price Data Download Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Move price data downloading from synchronous API endpoint to background worker thread, enabling <1s API responses and better progress visibility.

Architecture: Refactor /simulate/trigger to create jobs immediately without downloading data. SimulationWorker prepares data (downloads if needed) before executing trades. Add "downloading_data" status and warnings field for visibility.

Tech Stack: FastAPI, SQLite, Python 3.12, pytest

Task 1: Add Database Support for New Status and Warnings

Files:

Modify: api/database.py:88
Modify: api/database.py:96
Test: tests/unit/test_database_schema.py

Step 1: Write failing test for "downloading_data" status

# In tests/unit/test_database_schema.py (create if doesn't exist)
import pytest
import sqlite3
from api.database import initialize_database, get_db_connection

def test_jobs_table_allows_downloading_data_status(tmp_path):
    """Test that jobs table accepts downloading_data status."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)

    conn = get_db_connection(db_path)
    cursor = conn.cursor()

    # Should not raise constraint violation
    cursor.execute("""
        INSERT INTO jobs (job_id, config_path, status, date_range, models, created_at)
        VALUES ('test-123', 'config.json', 'downloading_data', '[]', '[]', '2025-11-01T00:00:00Z')
    """)
    conn.commit()

    # Verify it was inserted
    cursor.execute("SELECT status FROM jobs WHERE job_id = 'test-123'")
    result = cursor.fetchone()
    assert result[0] == "downloading_data"

    conn.close()

Step 2: Run test to verify it fails

Run: ../../venv/bin/python -m pytest tests/unit/test_database_schema.py::test_jobs_table_allows_downloading_data_status -v

Expected: FAIL with constraint violation (status not in allowed values)

Step 3: Add "downloading_data" to status enum

In api/database.py:88, update CHECK constraint:

status TEXT NOT NULL CHECK(status IN ('pending', 'downloading_data', 'running', 'completed', 'partial', 'failed')),

Step 4: Run test to verify it passes

Run: ../../venv/bin/python -m pytest tests/unit/test_database_schema.py::test_jobs_table_allows_downloading_data_status -v

Expected: PASS

Step 5: Write failing test for warnings column

Add to tests/unit/test_database_schema.py:

def test_jobs_table_has_warnings_column(tmp_path):
    """Test that jobs table has warnings TEXT column."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)

    conn = get_db_connection(db_path)
    cursor = conn.cursor()

    # Insert job with warnings
    cursor.execute("""
        INSERT INTO jobs (job_id, config_path, status, date_range, models, created_at, warnings)
        VALUES ('test-456', 'config.json', 'completed', '[]', '[]', '2025-11-01T00:00:00Z', '["Warning 1", "Warning 2"]')
    """)
    conn.commit()

    # Verify warnings can be retrieved
    cursor.execute("SELECT warnings FROM jobs WHERE job_id = 'test-456'")
    result = cursor.fetchone()
    assert result[0] == '["Warning 1", "Warning 2"]'

    conn.close()

Step 6: Run test to verify it fails

Run: ../../venv/bin/python -m pytest tests/unit/test_database_schema.py::test_jobs_table_has_warnings_column -v

Expected: FAIL (no such column: warnings)

Step 7: Add warnings column to jobs table

In api/database.py:96 (after error column), add:

error TEXT,
warnings TEXT

Step 8: Run test to verify it passes

Run: ../../venv/bin/python -m pytest tests/unit/test_database_schema.py::test_jobs_table_has_warnings_column -v

Expected: PASS

Step 9: Commit database schema changes

git add api/database.py tests/unit/test_database_schema.py
git commit -m "feat(db): add downloading_data status and warnings column

Add support for:
- downloading_data job status for visibility during data prep
- warnings TEXT column for storing job-level warnings (JSON array)

Co-Authored-By: Claude <noreply@anthropic.com>"

Task 2: Add JobManager Method for Warnings

Files:

Modify: api/job_manager.py
Test: tests/unit/test_job_manager.py

Step 1: Write failing test for add_job_warnings

Add to tests/unit/test_job_manager.py:

import json

def test_add_job_warnings(tmp_path):
    """Test adding warnings to a job."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)
    job_manager = JobManager(db_path=db_path)

    # Create a job
    job_id = job_manager.create_job(
        config_path="config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )

    # Add warnings
    warnings = ["Rate limit reached", "Skipped 2 dates"]
    job_manager.add_job_warnings(job_id, warnings)

    # Verify warnings were stored
    job = job_manager.get_job(job_id)
    stored_warnings = json.loads(job["warnings"])
    assert stored_warnings == warnings

Step 2: Run test to verify it fails

Run: ../../venv/bin/python -m pytest tests/unit/test_job_manager.py::test_add_job_warnings -v

Expected: FAIL (AttributeError: 'JobManager' has no attribute 'add_job_warnings')

Step 3: Implement add_job_warnings method

In api/job_manager.py, add method (find appropriate location after existing update methods):

def add_job_warnings(self, job_id: str, warnings: List[str]) -> None:
    """
    Store warnings for a job.

    Args:
        job_id: Job UUID
        warnings: List of warning messages
    """
    import json

    conn = get_db_connection(self.db_path)
    cursor = conn.cursor()

    warnings_json = json.dumps(warnings)

    cursor.execute("""
        UPDATE jobs
        SET warnings = ?
        WHERE job_id = ?
    """, (warnings_json, job_id))

    conn.commit()
    conn.close()

    logger.info(f"Added {len(warnings)} warnings to job {job_id}")

Step 4: Add import for List if not present

At top of api/job_manager.py, ensure:

from typing import Dict, Any, List, Optional

Step 5: Run test to verify it passes

Run: ../../venv/bin/python -m pytest tests/unit/test_job_manager.py::test_add_job_warnings -v

Expected: PASS

Step 6: Commit JobManager changes

git add api/job_manager.py tests/unit/test_job_manager.py
git commit -m "feat(api): add JobManager.add_job_warnings method

Store job warnings as JSON array in database.

Co-Authored-By: Claude <noreply@anthropic.com>"

Task 3: Add Response Model Warnings Fields

Files:

Modify: api/main.py:336 (SimulateTriggerResponse)
Modify: api/main.py:353 (JobStatusResponse)
Test: tests/unit/test_response_models.py

Step 1: Write failing test for response model warnings

Create tests/unit/test_response_models.py:

from api.main import SimulateTriggerResponse, JobStatusResponse, JobProgress

def test_simulate_trigger_response_accepts_warnings():
    """Test SimulateTriggerResponse accepts warnings field."""
    response = SimulateTriggerResponse(
        job_id="test-123",
        status="completed",
        total_model_days=10,
        message="Job completed",
        deployment_mode="DEV",
        is_dev_mode=True,
        warnings=["Rate limited", "Skipped 2 dates"]
    )

    assert response.warnings == ["Rate limited", "Skipped 2 dates"]

def test_job_status_response_accepts_warnings():
    """Test JobStatusResponse accepts warnings field."""
    response = JobStatusResponse(
        job_id="test-123",
        status="completed",
        progress=JobProgress(total_model_days=10, completed=10, failed=0, pending=0),
        date_range=["2025-10-01"],
        models=["gpt-5"],
        created_at="2025-11-01T00:00:00Z",
        details=[],
        deployment_mode="DEV",
        is_dev_mode=True,
        warnings=["Rate limited"]
    )

    assert response.warnings == ["Rate limited"]

Step 2: Run tests to verify they fail

Run: ../../venv/bin/python -m pytest tests/unit/test_response_models.py -v

Expected: FAIL (pydantic validation error - unexpected field)

Step 3: Add warnings to SimulateTriggerResponse

In api/main.py, find SimulateTriggerResponse class (around line 68) and add:

class SimulateTriggerResponse(BaseModel):
    """Response body for POST /simulate/trigger."""
    job_id: str
    status: str
    total_model_days: int
    message: str
    deployment_mode: str
    is_dev_mode: bool
    preserve_dev_data: Optional[bool] = None
    warnings: Optional[List[str]] = None  # NEW

Step 4: Add warnings to JobStatusResponse

In api/main.py, find JobStatusResponse class (around line 87) and add:

class JobStatusResponse(BaseModel):
    """Response body for GET /simulate/status/{job_id}."""
    job_id: str
    status: str
    progress: JobProgress
    date_range: List[str]
    models: List[str]
    created_at: str
    started_at: Optional[str] = None
    completed_at: Optional[str] = None
    total_duration_seconds: Optional[float] = None
    error: Optional[str] = None
    details: List[Dict[str, Any]]
    deployment_mode: str
    is_dev_mode: bool
    preserve_dev_data: Optional[bool] = None
    warnings: Optional[List[str]] = None  # NEW

Step 5: Run tests to verify they pass

Run: ../../venv/bin/python -m pytest tests/unit/test_response_models.py -v

Expected: PASS

Step 6: Commit response model changes

git add api/main.py tests/unit/test_response_models.py
git commit -m "feat(api): add warnings field to response models

Add optional warnings field to:
- SimulateTriggerResponse
- JobStatusResponse

Co-Authored-By: Claude <noreply@anthropic.com>"

Task 4: Implement SimulationWorker Helper Methods

Files:

Modify: api/simulation_worker.py
Test: tests/unit/test_simulation_worker.py

Step 4.1: Add _download_price_data helper

Step 1: Write failing test

Add to tests/unit/test_simulation_worker.py:

from unittest.mock import Mock, MagicMock
from api.simulation_worker import SimulationWorker

def test_download_price_data_success(tmp_path, monkeypatch):
    """Test successful price data download."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)

    worker = SimulationWorker(job_id="test-123", db_path=db_path)

    # Mock price manager
    mock_price_manager = Mock()
    mock_price_manager.download_missing_data_prioritized.return_value = {
        "downloaded": ["AAPL", "MSFT"],
        "failed": [],
        "rate_limited": False
    }

    warnings = []
    missing_coverage = {"AAPL": {"2025-10-01"}, "MSFT": {"2025-10-01"}}

    worker._download_price_data(mock_price_manager, missing_coverage, ["2025-10-01"], warnings)

    # Verify download was called
    mock_price_manager.download_missing_data_prioritized.assert_called_once()

    # No warnings for successful download
    assert len(warnings) == 0

def test_download_price_data_rate_limited(tmp_path):
    """Test price download with rate limit."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)

    worker = SimulationWorker(job_id="test-456", db_path=db_path)

    # Mock price manager
    mock_price_manager = Mock()
    mock_price_manager.download_missing_data_prioritized.return_value = {
        "downloaded": ["AAPL"],
        "failed": ["MSFT"],
        "rate_limited": True
    }

    warnings = []
    missing_coverage = {"AAPL": {"2025-10-01"}, "MSFT": {"2025-10-01"}}

    worker._download_price_data(mock_price_manager, missing_coverage, ["2025-10-01"], warnings)

    # Should add rate limit warning
    assert len(warnings) == 1
    assert "Rate limit" in warnings[0]

Step 2: Run tests to verify they fail

Run: ../../venv/bin/python -m pytest tests/unit/test_simulation_worker.py::test_download_price_data_success -v

Expected: FAIL (AttributeError: no attribute '_download_price_data')

Step 3: Implement _download_price_data method

In api/simulation_worker.py, add method after _execute_model_day:

def _download_price_data(
    self,
    price_manager,
    missing_coverage: Dict[str, Set[str]],
    requested_dates: List[str],
    warnings: List[str]
) -> None:
    """Download missing price data with progress logging."""
    from typing import Set, Dict

    logger.info(f"Job {self.job_id}: Starting prioritized download...")

    requested_dates_set = set(requested_dates)

    download_result = price_manager.download_missing_data_prioritized(
        missing_coverage,
        requested_dates_set
    )

    downloaded = len(download_result["downloaded"])
    failed = len(download_result["failed"])
    total = downloaded + failed

    logger.info(
        f"Job {self.job_id}: Download complete - "
        f"{downloaded}/{total} symbols succeeded"
    )

    if download_result["rate_limited"]:
        msg = f"Rate limit reached - downloaded {downloaded}/{total} symbols"
        warnings.append(msg)
        logger.warning(f"Job {self.job_id}: {msg}")

    if failed > 0 and not download_result["rate_limited"]:
        msg = f"{failed} symbols failed to download"
        warnings.append(msg)
        logger.warning(f"Job {self.job_id}: {msg}")

Step 4: Add necessary imports at top of file

Ensure api/simulation_worker.py has:

from typing import Dict, Any, List, Set

Step 5: Run tests to verify they pass

Run: ../../venv/bin/python -m pytest tests/unit/test_simulation_worker.py::test_download_price_data_success tests/unit/test_simulation_worker.py::test_download_price_data_rate_limited -v

Expected: PASS

Step 6: Commit _download_price_data

git add api/simulation_worker.py tests/unit/test_simulation_worker.py
git commit -m "feat(worker): add _download_price_data helper method

Handle price data download with rate limit detection and warning generation.

Co-Authored-By: Claude <noreply@anthropic.com>"

Step 4.2: Add _filter_completed_dates helper

Step 1: Write failing test

Add to tests/unit/test_simulation_worker.py:

def test_filter_completed_dates_all_new(tmp_path, monkeypatch):
    """Test filtering when no dates are completed."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)

    worker = SimulationWorker(job_id="test-789", db_path=db_path)

    # Mock job_manager to return empty completed dates
    mock_job_manager = Mock()
    mock_job_manager.get_completed_model_dates.return_value = {}
    worker.job_manager = mock_job_manager

    available_dates = ["2025-10-01", "2025-10-02"]
    models = ["gpt-5"]

    result = worker._filter_completed_dates(available_dates, models)

    # All dates should be returned
    assert result == available_dates

def test_filter_completed_dates_some_completed(tmp_path):
    """Test filtering when some dates are completed."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)

    worker = SimulationWorker(job_id="test-abc", db_path=db_path)

    # Mock job_manager to return one completed date
    mock_job_manager = Mock()
    mock_job_manager.get_completed_model_dates.return_value = {
        "gpt-5": ["2025-10-01"]
    }
    worker.job_manager = mock_job_manager

    available_dates = ["2025-10-01", "2025-10-02", "2025-10-03"]
    models = ["gpt-5"]

    result = worker._filter_completed_dates(available_dates, models)

    # Should exclude completed date
    assert result == ["2025-10-02", "2025-10-03"]

Step 2: Run tests to verify they fail

Run: ../../venv/bin/python -m pytest tests/unit/test_simulation_worker.py::test_filter_completed_dates_all_new -v

Expected: FAIL (AttributeError: no attribute '_filter_completed_dates')

Step 3: Implement _filter_completed_dates method

In api/simulation_worker.py, add method after _download_price_data:

def _filter_completed_dates(
    self,
    available_dates: List[str],
    models: List[str]
) -> List[str]:
    """
    Filter out dates that are already completed for all models.

    Implements idempotent job behavior - skip model-days that already
    have completed data.

    Args:
        available_dates: List of dates with complete price data
        models: List of model signatures

    Returns:
        List of dates that need processing
    """
    if not available_dates:
        return []

    # Get completed dates from job_manager
    start_date = available_dates[0]
    end_date = available_dates[-1]

    completed_dates = self.job_manager.get_completed_model_dates(
        models,
        start_date,
        end_date
    )

    # Build list of dates that need processing
    dates_to_process = []
    for date in available_dates:
        # Check if any model needs this date
        needs_processing = False
        for model in models:
            if date not in completed_dates.get(model, []):
                needs_processing = True
                break

        if needs_processing:
            dates_to_process.append(date)

    return dates_to_process

Step 4: Run tests to verify they pass

Run: ../../venv/bin/python -m pytest tests/unit/test_simulation_worker.py::test_filter_completed_dates_all_new tests/unit/test_simulation_worker.py::test_filter_completed_dates_some_completed -v

Expected: PASS

Step 5: Commit _filter_completed_dates

git add api/simulation_worker.py tests/unit/test_simulation_worker.py
git commit -m "feat(worker): add _filter_completed_dates helper method

Implement idempotent behavior by skipping already-completed model-days.

Co-Authored-By: Claude <noreply@anthropic.com>"

Step 4.3: Add _add_job_warnings helper

Step 1: Write failing test

Add to tests/unit/test_simulation_worker.py:

def test_add_job_warnings(tmp_path):
    """Test adding warnings to job via worker."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)
    job_manager = JobManager(db_path=db_path)

    # Create job
    job_id = job_manager.create_job(
        config_path="config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )

    worker = SimulationWorker(job_id=job_id, db_path=db_path)

    # Add warnings
    warnings = ["Warning 1", "Warning 2"]
    worker._add_job_warnings(warnings)

    # Verify warnings were stored
    import json
    job = job_manager.get_job(job_id)
    assert job["warnings"] is not None
    stored_warnings = json.loads(job["warnings"])
    assert stored_warnings == warnings

Step 2: Run test to verify it fails

Run: ../../venv/bin/python -m pytest tests/unit/test_simulation_worker.py::test_add_job_warnings -v

Expected: FAIL (AttributeError: no attribute '_add_job_warnings')

Step 3: Implement _add_job_warnings method

In api/simulation_worker.py, add method after _filter_completed_dates:

def _add_job_warnings(self, warnings: List[str]) -> None:
    """Store warnings in job metadata."""
    self.job_manager.add_job_warnings(self.job_id, warnings)

Step 4: Run test to verify it passes

Run: ../../venv/bin/python -m pytest tests/unit/test_simulation_worker.py::test_add_job_warnings -v

Expected: PASS

Step 5: Commit _add_job_warnings

git add api/simulation_worker.py tests/unit/test_simulation_worker.py
git commit -m "feat(worker): add _add_job_warnings helper method

Delegate to JobManager.add_job_warnings for storing warnings.

Co-Authored-By: Claude <noreply@anthropic.com>"

Task 5: Implement SimulationWorker._prepare_data

Files:

Modify: api/simulation_worker.py
Test: tests/unit/test_simulation_worker.py

Step 1: Write failing test for _prepare_data

Add to tests/unit/test_simulation_worker.py:

def test_prepare_data_no_missing_data(tmp_path, monkeypatch):
    """Test prepare_data when all data is available."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)
    job_manager = JobManager(db_path=db_path)

    # Create job
    job_id = job_manager.create_job(
        config_path="config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )

    worker = SimulationWorker(job_id=job_id, db_path=db_path)

    # Mock PriceDataManager
    mock_price_manager = Mock()
    mock_price_manager.get_missing_coverage.return_value = {}  # No missing data
    mock_price_manager.get_available_trading_dates.return_value = ["2025-10-01"]

    # Patch PriceDataManager import
    def mock_pdm_init(db_path):
        return mock_price_manager

    monkeypatch.setattr("api.simulation_worker.PriceDataManager", mock_pdm_init)

    # Mock get_completed_model_dates
    worker.job_manager.get_completed_model_dates = Mock(return_value={})

    # Execute
    available_dates, warnings = worker._prepare_data(
        requested_dates=["2025-10-01"],
        models=["gpt-5"],
        config_path="config.json"
    )

    # Verify results
    assert available_dates == ["2025-10-01"]
    assert len(warnings) == 0

    # Verify status was updated to running
    job = job_manager.get_job(job_id)
    assert job["status"] == "running"

def test_prepare_data_with_download(tmp_path, monkeypatch):
    """Test prepare_data when data needs downloading."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)
    job_manager = JobManager(db_path=db_path)

    job_id = job_manager.create_job(
        config_path="config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )

    worker = SimulationWorker(job_id=job_id, db_path=db_path)

    # Mock PriceDataManager
    mock_price_manager = Mock()
    mock_price_manager.get_missing_coverage.return_value = {"AAPL": {"2025-10-01"}}
    mock_price_manager.download_missing_data_prioritized.return_value = {
        "downloaded": ["AAPL"],
        "failed": [],
        "rate_limited": False
    }
    mock_price_manager.get_available_trading_dates.return_value = ["2025-10-01"]

    def mock_pdm_init(db_path):
        return mock_price_manager

    monkeypatch.setattr("api.simulation_worker.PriceDataManager", mock_pdm_init)
    worker.job_manager.get_completed_model_dates = Mock(return_value={})

    # Execute
    available_dates, warnings = worker._prepare_data(
        requested_dates=["2025-10-01"],
        models=["gpt-5"],
        config_path="config.json"
    )

    # Verify download was called
    mock_price_manager.download_missing_data_prioritized.assert_called_once()

    # Verify status transitions
    job = job_manager.get_job(job_id)
    assert job["status"] == "running"

Step 2: Run tests to verify they fail

Run: ../../venv/bin/python -m pytest tests/unit/test_simulation_worker.py::test_prepare_data_no_missing_data -v

Expected: FAIL (AttributeError: no attribute '_prepare_data')

Step 3: Implement _prepare_data method

In api/simulation_worker.py, add method after _add_job_warnings, before get_job_info:

def _prepare_data(
    self,
    requested_dates: List[str],
    models: List[str],
    config_path: str
) -> tuple:
    """
    Prepare price data for simulation.

    Steps:
    1. Update job status to "downloading_data"
    2. Check what data is missing
    3. Download missing data (with rate limit handling)
    4. Determine available trading dates
    5. Filter out already-completed model-days (idempotent)
    6. Update job status to "running"

    Args:
        requested_dates: All dates requested for simulation
        models: Model signatures to simulate
        config_path: Path to configuration file

    Returns:
        Tuple of (available_dates, warnings)
    """
    from api.price_data_manager import PriceDataManager

    warnings = []

    # Update status
    self.job_manager.update_job_status(self.job_id, "downloading_data")
    logger.info(f"Job {self.job_id}: Checking price data availability...")

    # Initialize price manager
    price_manager = PriceDataManager(db_path=self.db_path)

    # Check missing coverage
    start_date = requested_dates[0]
    end_date = requested_dates[-1]
    missing_coverage = price_manager.get_missing_coverage(start_date, end_date)

    # Download if needed
    if missing_coverage:
        logger.info(f"Job {self.job_id}: Missing data for {len(missing_coverage)} symbols")
        self._download_price_data(price_manager, missing_coverage, requested_dates, warnings)
    else:
        logger.info(f"Job {self.job_id}: All price data available")

    # Get available dates after download
    available_dates = price_manager.get_available_trading_dates(start_date, end_date)

    # Warn about skipped dates
    skipped = set(requested_dates) - set(available_dates)
    if skipped:
        warnings.append(f"Skipped {len(skipped)} dates due to incomplete price data: {sorted(list(skipped))}")
        logger.warning(f"Job {self.job_id}: {warnings[-1]}")

    # Filter already-completed model-days (idempotent behavior)
    available_dates = self._filter_completed_dates(available_dates, models)

    # Update to running
    self.job_manager.update_job_status(self.job_id, "running")
    logger.info(f"Job {self.job_id}: Starting execution - {len(available_dates)} dates, {len(models)} models")

    return available_dates, warnings

Step 4: Run tests to verify they pass

Run: ../../venv/bin/python -m pytest tests/unit/test_simulation_worker.py::test_prepare_data_no_missing_data tests/unit/test_simulation_worker.py::test_prepare_data_with_download -v

Expected: PASS

Step 5: Commit _prepare_data

git add api/simulation_worker.py tests/unit/test_simulation_worker.py
git commit -m "feat(worker): add _prepare_data method

Orchestrate data preparation phase:
- Check missing data
- Download if needed
- Filter completed dates
- Update job status

Co-Authored-By: Claude <noreply@anthropic.com>"

Task 6: Integrate _prepare_data into SimulationWorker.run()

Files:

Modify: api/simulation_worker.py:59-133
Test: tests/integration/test_async_download.py

Step 1: Write integration test for async download flow

Create tests/integration/test_async_download.py:

import pytest
import time
from api.database import initialize_database
from api.job_manager import JobManager
from api.simulation_worker import SimulationWorker
from unittest.mock import Mock, patch

def test_worker_prepares_data_before_execution(tmp_path):
    """Test that worker calls _prepare_data before executing trades."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)
    job_manager = JobManager(db_path=db_path)

    # Create job
    job_id = job_manager.create_job(
        config_path="configs/default_config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )

    worker = SimulationWorker(job_id=job_id, db_path=db_path)

    # Mock _prepare_data to track call
    original_prepare = worker._prepare_data
    prepare_called = []

    def mock_prepare(*args, **kwargs):
        prepare_called.append(True)
        return (["2025-10-01"], [])  # Return available dates, no warnings

    worker._prepare_data = mock_prepare

    # Mock _execute_date to avoid actual execution
    worker._execute_date = Mock()

    # Run worker
    result = worker.run()

    # Verify _prepare_data was called
    assert len(prepare_called) == 1
    assert result["success"] is True

def test_worker_handles_no_available_dates(tmp_path):
    """Test worker fails gracefully when no dates are available."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)
    job_manager = JobManager(db_path=db_path)

    job_id = job_manager.create_job(
        config_path="configs/default_config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )

    worker = SimulationWorker(job_id=job_id, db_path=db_path)

    # Mock _prepare_data to return empty dates
    worker._prepare_data = Mock(return_value=([], []))

    # Run worker
    result = worker.run()

    # Should fail with descriptive error
    assert result["success"] is False
    assert "No trading dates available" in result["error"]

    # Job should be marked as failed
    job = job_manager.get_job(job_id)
    assert job["status"] == "failed"

def test_worker_stores_warnings(tmp_path):
    """Test worker stores warnings from prepare_data."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)
    job_manager = JobManager(db_path=db_path)

    job_id = job_manager.create_job(
        config_path="configs/default_config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )

    worker = SimulationWorker(job_id=job_id, db_path=db_path)

    # Mock _prepare_data to return warnings
    warnings = ["Rate limited", "Skipped 1 date"]
    worker._prepare_data = Mock(return_value=(["2025-10-01"], warnings))
    worker._execute_date = Mock()

    # Run worker
    result = worker.run()

    # Verify warnings in result
    assert result["warnings"] == warnings

    # Verify warnings stored in database
    import json
    job = job_manager.get_job(job_id)
    stored_warnings = json.loads(job["warnings"])
    assert stored_warnings == warnings

Step 2: Run tests to verify they fail

Run: ../../venv/bin/python -m pytest tests/integration/test_async_download.py::test_worker_prepares_data_before_execution -v

Expected: FAIL (run() doesn't call _prepare_data)

Step 3: Modify SimulationWorker.run() to call _prepare_data

In api/simulation_worker.py, modify the run() method (around line 59):

def run(self) -> Dict[str, Any]:
    """
    Execute the simulation job.

    Returns:
        Result dict with success status and summary

    Process:
        1. Get job details (dates, models, config)
        2. Prepare data (download if needed)
        3. For each date sequentially:
            a. Execute all models in parallel
            b. Wait for all to complete
            c. Update progress
        4. Determine final job status
        5. Store warnings if any

    Error Handling:
        - Individual model failures: Mark detail as failed, continue with others
        - Job-level errors: Mark entire job as failed
    """
    try:
        # Get job info
        job = self.job_manager.get_job(self.job_id)
        if not job:
            raise ValueError(f"Job {self.job_id} not found")

        date_range = job["date_range"]
        models = job["models"]
        config_path = job["config_path"]

        logger.info(f"Starting job {self.job_id}: {len(date_range)} dates, {len(models)} models")

        # NEW: Prepare price data (download if needed)
        available_dates, warnings = self._prepare_data(date_range, models, config_path)

        if not available_dates:
            error_msg = "No trading dates available after price data preparation"
            self.job_manager.update_job_status(self.job_id, "failed", error=error_msg)
            return {"success": False, "error": error_msg}

        # Execute available dates only
        for date in available_dates:
            logger.info(f"Processing date {date} with {len(models)} models")
            self._execute_date(date, models, config_path)

        # Job completed - determine final status
        progress = self.job_manager.get_job_progress(self.job_id)

        if progress["failed"] == 0:
            final_status = "completed"
        elif progress["completed"] > 0:
            final_status = "partial"
        else:
            final_status = "failed"

        # Add warnings if any dates were skipped
        if warnings:
            self._add_job_warnings(warnings)

        # Note: Job status is already updated by model_day_executor's detail status updates
        # We don't need to explicitly call update_job_status here as it's handled automatically
        # by the status transition logic in JobManager.update_job_detail_status

        logger.info(f"Job {self.job_id} finished with status: {final_status}")

        return {
            "success": True,
            "job_id": self.job_id,
            "status": final_status,
            "total_model_days": progress["total_model_days"],
            "completed": progress["completed"],
            "failed": progress["failed"],
            "warnings": warnings
        }

    except Exception as e:
        error_msg = f"Job execution failed: {str(e)}"
        logger.error(f"Job {self.job_id}: {error_msg}", exc_info=True)

        # Update job to failed
        self.job_manager.update_job_status(self.job_id, "failed", error=error_msg)

        return {
            "success": False,
            "job_id": self.job_id,
            "error": error_msg
        }

Step 4: Run tests to verify they pass

Run: ../../venv/bin/python -m pytest tests/integration/test_async_download.py -v

Expected: PASS

Step 5: Commit worker run() integration

git add api/simulation_worker.py tests/integration/test_async_download.py
git commit -m "feat(worker): integrate data preparation into run() method

Call _prepare_data before executing trades:
- Download missing data if needed
- Filter completed dates
- Store warnings
- Handle empty date scenarios

Co-Authored-By: Claude <noreply@anthropic.com>"

Task 7: Simplify /simulate/trigger Endpoint

Files:

Modify: api/main.py:144-405
Test: tests/integration/test_api_endpoints.py

Step 1: Write integration test for fast endpoint response

Add to tests/integration/test_api_endpoints.py:

import time

def test_trigger_endpoint_fast_response(test_client):
    """Test that /simulate/trigger responds quickly without downloading data."""
    start_time = time.time()

    response = test_client.post("/simulate/trigger", json={
        "start_date": "2025-10-01",
        "end_date": "2025-10-01",
        "models": ["gpt-5"]
    })

    elapsed = time.time() - start_time

    # Should respond in less than 2 seconds (allowing for DB operations)
    assert elapsed < 2.0
    assert response.status_code == 200
    assert "job_id" in response.json()

def test_trigger_endpoint_no_price_download(test_client, monkeypatch):
    """Test that endpoint doesn't call price download."""
    download_called = []

    # Patch PriceDataManager to track if download is called
    original_pdm = __import__('api.price_data_manager', fromlist=['PriceDataManager']).PriceDataManager

    class MockPDM:
        def __init__(self, *args, **kwargs):
            pass

        def download_missing_data_prioritized(self, *args, **kwargs):
            download_called.append(True)
            return {"downloaded": [], "failed": [], "rate_limited": False}

    monkeypatch.setattr("api.main.PriceDataManager", MockPDM)

    response = test_client.post("/simulate/trigger", json={
        "start_date": "2025-10-01",
        "end_date": "2025-10-01",
        "models": ["gpt-5"]
    })

    # Download should NOT be called in endpoint
    assert len(download_called) == 0
    assert response.status_code == 200

Step 2: Run tests to verify they fail

Run: ../../venv/bin/python -m pytest tests/integration/test_api_endpoints.py::test_trigger_endpoint_no_price_download -v

Expected: FAIL (download is currently called in endpoint)

Step 3: Remove price download logic from /simulate/trigger

In api/main.py, modify trigger_simulation function (around line 144):

Find and remove these sections:

Lines 228-273: Price data download logic
Lines 276-287: Available dates check
Lines 290-333: Idempotent filtering

Replace with simpler logic:

@app.post("/simulate/trigger", response_model=SimulateTriggerResponse, status_code=200)
async def trigger_simulation(request: SimulateTriggerRequest):
    """
    Trigger a new simulation job.

    Validates date range and creates job. Price data is downloaded
    in background by SimulationWorker.

    Supports:
    - Single date: start_date == end_date
    - Date range: start_date < end_date
    - Resume: start_date is null (each model resumes from its last completed date)

    Raises:
        HTTPException 400: Validation errors, running job, or invalid dates
    """
    try:
        # Use config path from app state
        config_path = app.state.config_path

        # Validate config path exists
        if not Path(config_path).exists():
            raise HTTPException(
                status_code=500,
                detail=f"Server configuration file not found: {config_path}"
            )

        end_date = request.end_date

        # Determine which models to run
        import json
        with open(config_path, 'r') as f:
            config = json.load(f)

        if request.models is not None:
            # Use models from request (explicit override)
            models_to_run = request.models
        else:
            # Use enabled models from config
            models_to_run = [
                model["signature"]
                for model in config.get("models", [])
                if model.get("enabled", False)
            ]

            if not models_to_run:
                raise HTTPException(
                    status_code=400,
                    detail="No enabled models found in config. Either enable models in config or specify them in request."
                )

        job_manager = JobManager(db_path=app.state.db_path)

        # Handle resume logic (start_date is null)
        if request.start_date is None:
            # Resume mode: determine start date per model
            from datetime import timedelta
            model_start_dates = {}

            for model in models_to_run:
                last_date = job_manager.get_last_completed_date_for_model(model)

                if last_date is None:
                    # Cold start: use end_date as single-day simulation
                    model_start_dates[model] = end_date
                else:
                    # Resume from next day after last completed
                    last_dt = datetime.strptime(last_date, "%Y-%m-%d")
                    next_dt = last_dt + timedelta(days=1)
                    model_start_dates[model] = next_dt.strftime("%Y-%m-%d")

            # For validation purposes, use earliest start date
            earliest_start = min(model_start_dates.values())
            start_date = earliest_start
        else:
            # Explicit start date provided
            start_date = request.start_date
            model_start_dates = {model: start_date for model in models_to_run}

        # Validate date range
        max_days = get_max_simulation_days()
        validate_date_range(start_date, end_date, max_days=max_days)

        # Check if can start new job
        if not job_manager.can_start_new_job():
            raise HTTPException(
                status_code=400,
                detail="Another simulation job is already running or pending. Please wait for it to complete."
            )

        # Get all weekdays in range (worker will filter based on data availability)
        all_dates = expand_date_range(start_date, end_date)

        # Create job immediately with all requested dates
        # Worker will handle data download and filtering
        job_id = job_manager.create_job(
            config_path=config_path,
            date_range=all_dates,
            models=models_to_run,
            model_day_filter=None  # Worker will filter based on available data
        )

        # Start worker in background thread (only if not in test mode)
        if not getattr(app.state, "test_mode", False):
            def run_worker():
                worker = SimulationWorker(job_id=job_id, db_path=app.state.db_path)
                worker.run()

            thread = threading.Thread(target=run_worker, daemon=True)
            thread.start()

        logger.info(f"Triggered simulation job {job_id} for {len(all_dates)} dates, {len(models_to_run)} models")

        # Build response message
        message = f"Simulation job created for {len(all_dates)} dates, {len(models_to_run)} models"

        if request.start_date is None:
            message += " (resume mode)"

        # Get deployment mode info
        deployment_info = get_deployment_mode_dict()

        response = SimulateTriggerResponse(
            job_id=job_id,
            status="pending",
            total_model_days=len(all_dates) * len(models_to_run),
            message=message,
            **deployment_info
        )

        return response

    except HTTPException:
        raise
    except ValueError as e:
        logger.error(f"Validation error: {e}")
        raise HTTPException(status_code=400, detail=str(e))
    except Exception as e:
        logger.error(f"Failed to trigger simulation: {e}", exc_info=True)
        raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")

Step 4: Remove unused imports from api/main.py

Remove these imports if no longer used:

from api.price_data_manager import PriceDataManager
from api.date_utils import expand_date_range (if only used in removed code)

Keep:

from api.date_utils import validate_date_range, expand_date_range, get_max_simulation_days

Step 5: Run tests to verify they pass

Run: ../../venv/bin/python -m pytest tests/integration/test_api_endpoints.py::test_trigger_endpoint_fast_response tests/integration/test_api_endpoints.py::test_trigger_endpoint_no_price_download -v

Expected: PASS

Step 6: Commit endpoint simplification

git add api/main.py tests/integration/test_api_endpoints.py
git commit -m "refactor(api): remove price download from /simulate/trigger

Move data preparation to background worker:
- Fast endpoint response (<1s)
- No blocking downloads
- Worker handles data download and filtering
- Maintains backwards compatibility

Co-Authored-By: Claude <noreply@anthropic.com>"

Task 8: Update /simulate/status to Return Warnings

Files:

Modify: api/main.py:407-465
Test: tests/integration/test_api_endpoints.py

Step 1: Write test for warnings in status response

Add to tests/integration/test_api_endpoints.py:

def test_status_endpoint_returns_warnings(test_client, tmp_path):
    """Test that /simulate/status returns warnings field."""
    # Create job with warnings
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)
    job_manager = JobManager(db_path=db_path)

    job_id = job_manager.create_job(
        config_path="config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )

    # Add warnings
    warnings = ["Rate limited", "Skipped 1 date"]
    job_manager.add_job_warnings(job_id, warnings)

    # Mock app.state.db_path
    test_client.app.state.db_path = db_path

    # Get status
    response = test_client.get(f"/simulate/status/{job_id}")

    assert response.status_code == 200
    data = response.json()
    assert "warnings" in data
    assert data["warnings"] == warnings

Step 2: Run test to verify it fails

Run: ../../venv/bin/python -m pytest tests/integration/test_api_endpoints.py::test_status_endpoint_returns_warnings -v

Expected: FAIL (warnings not in response or None)

Step 3: Modify get_job_status to include warnings

In api/main.py, modify get_job_status function (around line 407):

@app.get("/simulate/status/{job_id}", response_model=JobStatusResponse)
async def get_job_status(job_id: str):
    """
    Get status and progress of a simulation job.

    Args:
        job_id: Job UUID

    Returns:
        Job status, progress, model-day details, and warnings

    Raises:
        HTTPException 404: If job not found
    """
    try:
        job_manager = JobManager(db_path=app.state.db_path)

        # Get job info
        job = job_manager.get_job(job_id)
        if not job:
            raise HTTPException(status_code=404, detail=f"Job {job_id} not found")

        # Get progress
        progress = job_manager.get_job_progress(job_id)

        # Get model-day details
        details = job_manager.get_job_details(job_id)

        # Calculate pending (total - completed - failed)
        pending = progress["total_model_days"] - progress["completed"] - progress["failed"]

        # Parse warnings from JSON if present
        import json
        warnings = None
        if job.get("warnings"):
            try:
                warnings = json.loads(job["warnings"])
            except (json.JSONDecodeError, TypeError):
                logger.warning(f"Failed to parse warnings for job {job_id}")

        # Get deployment mode info
        deployment_info = get_deployment_mode_dict()

        return JobStatusResponse(
            job_id=job["job_id"],
            status=job["status"],
            progress=JobProgress(
                total_model_days=progress["total_model_days"],
                completed=progress["completed"],
                failed=progress["failed"],
                pending=pending
            ),
            date_range=job["date_range"],
            models=job["models"],
            created_at=job["created_at"],
            started_at=job.get("started_at"),
            completed_at=job.get("completed_at"),
            total_duration_seconds=job.get("total_duration_seconds"),
            error=job.get("error"),
            details=details,
            warnings=warnings,  # NEW
            **deployment_info
        )

    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Failed to get job status: {e}", exc_info=True)
        raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")

Step 4: Run test to verify it passes

Run: ../../venv/bin/python -m pytest tests/integration/test_api_endpoints.py::test_status_endpoint_returns_warnings -v

Expected: PASS

Step 5: Commit status endpoint update

git add api/main.py tests/integration/test_api_endpoints.py
git commit -m "feat(api): return warnings in /simulate/status response

Parse and return job warnings from database.

Co-Authored-By: Claude <noreply@anthropic.com>"

Task 9: End-to-End Integration Test

Files:

Create: tests/e2e/test_async_download_flow.py

Step 1: Write end-to-end test

Create tests/e2e/test_async_download_flow.py:

"""
End-to-end test for async price download flow.

Tests the complete flow:
1. POST /simulate/trigger (fast response)
2. Worker downloads data in background
3. GET /simulate/status shows downloading_data → running → completed
4. Warnings are captured and returned
"""

import pytest
import time
from unittest.mock import patch, Mock
from api.main import create_app
from api.database import initialize_database
from fastapi.testclient import TestClient

@pytest.fixture
def test_app(tmp_path):
    """Create test app with isolated database."""
    db_path = str(tmp_path / "test.db")
    initialize_database(db_path)

    app = create_app(db_path=db_path, config_path="configs/default_config.json")
    app.state.test_mode = True  # Disable background worker

    yield app

@pytest.fixture
def test_client(test_app):
    """Create test client."""
    return TestClient(test_app)

def test_complete_async_download_flow(test_client, monkeypatch):
    """Test complete flow from trigger to completion with async download."""

    # Mock PriceDataManager for predictable behavior
    class MockPriceManager:
        def __init__(self, db_path):
            self.db_path = db_path

        def get_missing_coverage(self, start, end):
            return {"AAPL": {"2025-10-01"}}  # Simulate missing data

        def download_missing_data_prioritized(self, missing, requested):
            return {
                "downloaded": ["AAPL"],
                "failed": [],
                "rate_limited": False
            }

        def get_available_trading_dates(self, start, end):
            return ["2025-10-01"]

    monkeypatch.setattr("api.simulation_worker.PriceDataManager", MockPriceManager)

    # Mock execution to avoid actual trading
    def mock_execute_date(self, date, models, config_path):
        pass

    monkeypatch.setattr("api.simulation_worker.SimulationWorker._execute_date", mock_execute_date)

    # Step 1: Trigger simulation
    start_time = time.time()
    response = test_client.post("/simulate/trigger", json={
        "start_date": "2025-10-01",
        "end_date": "2025-10-01",
        "models": ["gpt-5"]
    })
    elapsed = time.time() - start_time

    # Should respond quickly
    assert elapsed < 2.0
    assert response.status_code == 200

    data = response.json()
    job_id = data["job_id"]
    assert data["status"] == "pending"

    # Step 2: Run worker manually (since test_mode=True)
    from api.simulation_worker import SimulationWorker
    worker = SimulationWorker(job_id=job_id, db_path=test_client.app.state.db_path)
    result = worker.run()

    # Step 3: Check final status
    status_response = test_client.get(f"/simulate/status/{job_id}")
    assert status_response.status_code == 200

    status_data = status_response.json()
    assert status_data["status"] == "completed"
    assert status_data["job_id"] == job_id

def test_flow_with_rate_limit_warning(test_client, monkeypatch):
    """Test flow when rate limit is hit during download."""

    class MockPriceManagerRateLimited:
        def __init__(self, db_path):
            self.db_path = db_path

        def get_missing_coverage(self, start, end):
            return {"AAPL": {"2025-10-01"}, "MSFT": {"2025-10-01"}}

        def download_missing_data_prioritized(self, missing, requested):
            return {
                "downloaded": ["AAPL"],
                "failed": ["MSFT"],
                "rate_limited": True
            }

        def get_available_trading_dates(self, start, end):
            return []  # No complete dates due to rate limit

    monkeypatch.setattr("api.simulation_worker.PriceDataManager", MockPriceManagerRateLimited)

    # Trigger
    response = test_client.post("/simulate/trigger", json={
        "start_date": "2025-10-01",
        "end_date": "2025-10-01",
        "models": ["gpt-5"]
    })

    job_id = response.json()["job_id"]

    # Run worker
    from api.simulation_worker import SimulationWorker
    worker = SimulationWorker(job_id=job_id, db_path=test_client.app.state.db_path)
    result = worker.run()

    # Should fail due to no available dates
    assert result["success"] is False

    # Check status has error
    status_response = test_client.get(f"/simulate/status/{job_id}")
    status_data = status_response.json()
    assert status_data["status"] == "failed"
    assert "No trading dates available" in status_data["error"]

def test_flow_with_partial_data(test_client, monkeypatch):
    """Test flow when some dates are skipped due to incomplete data."""

    class MockPriceManagerPartial:
        def __init__(self, db_path):
            self.db_path = db_path

        def get_missing_coverage(self, start, end):
            return {}  # No missing data

        def get_available_trading_dates(self, start, end):
            # Only 2 out of 3 dates available
            return ["2025-10-01", "2025-10-03"]

    monkeypatch.setattr("api.simulation_worker.PriceDataManager", MockPriceManagerPartial)

    def mock_execute_date(self, date, models, config_path):
        pass

    monkeypatch.setattr("api.simulation_worker.SimulationWorker._execute_date", mock_execute_date)

    # Trigger with 3 dates
    response = test_client.post("/simulate/trigger", json={
        "start_date": "2025-10-01",
        "end_date": "2025-10-03",
        "models": ["gpt-5"]
    })

    job_id = response.json()["job_id"]

    # Run worker
    from api.simulation_worker import SimulationWorker
    worker = SimulationWorker(job_id=job_id, db_path=test_client.app.state.db_path)
    result = worker.run()

    # Should complete with warnings
    assert result["success"] is True
    assert len(result["warnings"]) > 0
    assert "Skipped" in result["warnings"][0]

    # Check status returns warnings
    status_response = test_client.get(f"/simulate/status/{job_id}")
    status_data = status_response.json()
    assert status_data["status"] == "completed"
    assert status_data["warnings"] is not None
    assert len(status_data["warnings"]) > 0

Step 2: Run e2e test

Run: ../../venv/bin/python -m pytest tests/e2e/test_async_download_flow.py -v

Expected: PASS (all scenarios work end-to-end)

Step 3: Commit e2e tests

git add tests/e2e/test_async_download_flow.py
git commit -m "test: add end-to-end tests for async download flow

Test complete flow:
- Fast API response
- Background data download
- Status transitions
- Warning capture and display

Co-Authored-By: Claude <noreply@anthropic.com>"

Task 10: Update API Documentation

Files:

Modify: API_REFERENCE.md
Modify: docs/user-guide/using-the-api.md

Step 1: Update API_REFERENCE.md with new status

In API_REFERENCE.md, find the job status documentation and update:

### Job Status Values

- `pending`: Job created, waiting to start
- `downloading_data`: Preparing price data (downloading if needed)  ← NEW
- `running`: Executing trading simulations
- `completed`: All model-days completed successfully
- `partial`: Some model-days completed, some failed
- `failed`: Job failed to execute

### Response Fields (New)

#### warnings

Optional array of warning messages:
- Rate limit reached during data download
- Dates skipped due to incomplete price data
- Other non-fatal issues

Example:
```json
{
  "job_id": "019a426b...",
  "status": "completed",
  "warnings": [
    "Rate limit reached - downloaded 12/15 symbols",
    "Skipped 2 dates due to incomplete price data: ['2025-10-02', '2025-10-05']"
  ]
}


**Step 2: Update using-the-api.md with async behavior**

In `docs/user-guide/using-the-api.md`, add section:

```markdown
## Async Data Download

The `/simulate/trigger` endpoint responds immediately (<1 second), even when price data needs to be downloaded.

### Flow

1. **POST /simulate/trigger** - Returns `job_id` immediately
2. **Background worker** - Downloads missing data automatically
3. **Poll /simulate/status** - Track progress through status transitions

### Status Progression

pending → downloading_data → running → completed


### Monitoring Progress

Use `docker logs -f` to monitor download progress in real-time:

```bash
docker logs -f ai-trader-server

# Example output:
# Job 019a426b: Checking price data availability...
# Job 019a426b: Missing data for 15 symbols
# Job 019a426b: Starting prioritized download...
# Job 019a426b: Download complete - 12/15 symbols succeeded
# Job 019a426b: Rate limit reached - proceeding with available dates
# Job 019a426b: Starting execution - 8 dates, 1 models

Handling Warnings

Check the warnings field in status response:

import requests
import time

# Trigger simulation
response = requests.post("http://localhost:8080/simulate/trigger", json={
    "start_date": "2025-10-01",
    "end_date": "2025-10-10",
    "models": ["gpt-5"]
})

job_id = response.json()["job_id"]

# Poll until complete
while True:
    status = requests.get(f"http://localhost:8080/simulate/status/{job_id}").json()

    if status["status"] in ["completed", "partial", "failed"]:
        # Check for warnings
        if status.get("warnings"):
            print("Warnings:", status["warnings"])
        break

    time.sleep(2)


**Step 3: Commit documentation updates**

```bash
git add API_REFERENCE.md docs/user-guide/using-the-api.md
git commit -m "docs: update API docs for async download behavior

Document:
- New downloading_data status
- Warnings field in responses
- Async flow and monitoring
- Example usage patterns

Co-Authored-By: Claude <noreply@anthropic.com>"

Task 11: Run Full Test Suite

Files:

None (verification only)

Step 1: Run all unit tests

Run: ../../venv/bin/python -m pytest tests/unit/ -v

Expected: All tests PASS

Step 2: Run all integration tests

Run: ../../venv/bin/python -m pytest tests/integration/ -v

Expected: All tests PASS

Step 3: Run e2e tests

Run: ../../venv/bin/python -m pytest tests/e2e/ -v

Expected: All tests PASS

Step 4: If any tests fail, fix them

Review failures and fix. Commit fixes:

git add <fixed-files>
git commit -m "fix: address test failures

<describe fixes>

Co-Authored-By: Claude <noreply@anthropic.com>"

Step 5: Final verification - run complete suite

Run: ../../venv/bin/python -m pytest tests/ -v --tb=short

Expected: All tests PASS

Completion

All tasks complete!

The async price download feature is now fully implemented:

✅ Database schema supports downloading_data status and warnings ✅ SimulationWorker prepares data before execution ✅ API endpoint responds quickly without blocking ✅ Warnings are captured and displayed ✅ End-to-end tests validate complete flow ✅ Documentation updated

Next Steps:

Use superpowers:finishing-a-development-branch to:

Review all changes
Create pull request or merge to main
Clean up worktree

56 KiB Raw Blame History

Async Price Data Download Implementation Plan

Task 1: Add Database Support for New Status and Warnings

Task 2: Add JobManager Method for Warnings

Task 3: Add Response Model Warnings Fields

Task 4: Implement SimulationWorker Helper Methods

Task 5: Implement SimulationWorker._prepare_data

Task 6: Integrate _prepare_data into SimulationWorker.run()

Task 7: Simplify /simulate/trigger Endpoint

Task 8: Update /simulate/status to Return Warnings

Task 9: End-to-End Integration Test

Task 10: Update API Documentation

Handling Warnings

Task 11: Run Full Test Suite

Completion

56 KiB

Raw Blame History