mirror of
https://github.com/Xe138/AI-Trader.git
synced 2026-04-02 01:27:24 -04:00
Compare commits
17 Commits
v0.4.2-alp...v0.4.3-alp
| SHA1 |
|---|
| 31d6818130 |
| 4638c073e3 |
| 96f61cf347 |
| 0eb5fcc940 |
| bee6afe531 |
| f1f76b9a99 |
| 277714f664 |
| db1341e204 |
| e5b83839ad |
| 4629bb1522 |
| f175139863 |
| 75a76bbb48 |
| fbe383772a |
| 406bb281b2 |
| 6ddc5abede |
| 5c73f30583 |
| b73d88ca8f |
34
CHANGELOG.md
@@ -7,12 +7,38 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [0.4.2] - 2025-11-07

### Fixed

- **Critical:** Fixed negative cash position bug where trades were calculated from initial capital instead of the accumulated position
  - Root cause: MCP tools return `CallToolResult` objects with position data in the `structuredContent` field, but `ContextInjector` was checking `isinstance(result, dict)`, which always failed
  - Impact: Each trade checked cash against the initial $10,000 instead of the cumulative position, allowing over-spending and resulting in negative cash balances (e.g., -$8,768.68 after 11 trades totaling $18,768.68)
  - Solution: Updated `ContextInjector` to extract the position dict from `CallToolResult.structuredContent` before validation
  - The fix ensures proper intra-day position tracking, with cumulative cash checks preventing over-trading
  - Updated unit tests to mock `CallToolResult` objects matching production MCP behavior
  - Locations: `agent/context_injector.py:95-109`, `tests/unit/test_context_injector.py:26-53`
- Enabled MCP service logging by redirecting stdout/stderr from `/dev/null` to the main process for better debugging
  - Previously, all MCP tool debug output was silently discarded
  - Now visible in docker logs for diagnosing parameter injection and trade execution issues
  - Location: `agent_tools/start_mcp_services.py:81-88`

- **Critical:** Fixed stale jobs blocking new jobs after Docker container restart
  - Root cause: Jobs with status 'pending', 'downloading_data', or 'running' remained in the database after container shutdown, preventing new job creation
  - Solution: Added a `cleanup_stale_jobs()` method that runs on FastAPI startup to mark interrupted jobs as 'failed' or 'partial' based on completion percentage
  - Intelligent status determination: Uses existing progress tracking (completed/total model-days) to distinguish failed (0% complete) from partial (>0% complete) jobs
  - Detailed error messages include the original status and completion counts (e.g., "Job interrupted by container restart (was running, 3/10 model-days completed)")
  - Incomplete job_details are automatically marked 'failed' with clear error messages
  - Deployment-aware: Skips cleanup in DEV mode when the database is reset; always runs in PROD mode
  - Comprehensive test coverage: 6 new unit tests covering all cleanup scenarios
  - Locations: `api/job_manager.py:702-779`, `api/main.py:164-168`, `tests/unit/test_job_manager.py:451-609`
- Fixed Pydantic validation errors when using DeepSeek models via OpenRouter
  - Root cause 1: DeepSeek returns tool_calls in a non-standard format with an `args` field directly, bypassing LangChain's `parse_tool_call()`
  - Root cause 2: LangChain's `parse_tool_call()` sometimes returns `args` as a JSON string instead of a parsed dict object
  - Solution: Added a `ToolCallArgsParsingWrapper` that:
    1. Patches `parse_tool_call()` to detect string args and parse them into a dict
    2. Normalizes non-standard tool_call formats (e.g., `{name, args, id}` → `{function: {name, arguments}, id}`)
  - Includes diagnostic logging to identify format inconsistencies across providers
  - The wrapper is defensive and only acts when needed, ensuring compatibility with all AI providers
  - Fixes validation error: `tool_calls.0.args: Input should be a valid dictionary [type=dict_type, input_value='...', input_type=str]`

## [0.4.1] - 2025-11-06

28
CLAUDE.md
@@ -202,6 +202,34 @@ bash main.sh

- Search results: News filtered by publication date
- All tools enforce temporal boundaries via `TODAY_DATE` from `runtime_env.json`

### Duplicate Simulation Prevention

**Automatic Skip Logic:**
- `JobManager.create_job()` checks the database for already-completed model-day pairs
- Skips completed simulations automatically
- Returns warnings list with skipped pairs
- Raises `ValueError` if all requested simulations are already completed

**Example:**
```python
result = job_manager.create_job(
    config_path="config.json",
    date_range=["2025-10-15", "2025-10-16"],
    models=["model-a"],
    model_day_filter=[("model-a", "2025-10-15")]  # Already completed
)

# result = {
#     "job_id": "new-job-uuid",
#     "warnings": ["Skipped model-a/2025-10-15 - already completed"]
# }
```

**Cross-Job Portfolio Continuity:**
- `get_current_position_from_db()` queries across ALL jobs for a given model
- Enables portfolio continuity even when new jobs are created with overlapping dates
- Starting position = most recent trading_day's ending_cash + holdings where date < current_date

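The continuity query above can be exercised against a throwaway SQLite database. This is a minimal sketch — the table and column names come from the schema in this repo, but the helper name `starting_cash` and the abbreviated column list are hypothetical:

```python
import sqlite3

def starting_cash(conn, model, date, initial_cash=10000.0):
    """Most recent ending_cash before `date`, across ALL jobs (no job_id filter)."""
    row = conn.execute(
        "SELECT ending_cash FROM trading_days "
        "WHERE model = ? AND date < ? ORDER BY date DESC LIMIT 1",
        (model, date),
    ).fetchone()
    return row[0] if row else initial_cash

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trading_days (job_id TEXT, model TEXT, date TEXT, ending_cash REAL)")
conn.execute("INSERT INTO trading_days VALUES ('job-1', 'model-a', '2025-10-15', 9500.0)")
print(starting_cash(conn, "model-a", "2025-10-16"))  # 9500.0 — continues from job-1
print(starting_cash(conn, "model-a", "2025-10-15"))  # 10000.0 — no prior day, initial cash
```

Because the query deliberately omits `job_id`, a later job picks up wherever the most recent prior trading day left off, regardless of which job wrote it.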
## Configuration File Format
@@ -32,70 +32,35 @@ class ToolCallArgsParsingWrapper:

            # Model doesn't have this method (e.g., MockChatModel), skip patching
            return

-       # CRITICAL: Also patch parse_tool_call to see what it's returning
-       from langchain_core.output_parsers import openai_tools
-       original_parse_tool_call = openai_tools.parse_tool_call
+       # CRITICAL: Patch parse_tool_call in base.py's namespace (not in openai_tools module!)
+       from langchain_openai.chat_models import base as langchain_base
+       original_parse_tool_call = langchain_base.parse_tool_call

        def patched_parse_tool_call(raw_tool_call, *, partial=False, strict=False, return_id=True):
-           """Patched parse_tool_call to log what it returns"""
+           """Patched parse_tool_call to fix string args bug"""
            result = original_parse_tool_call(raw_tool_call, partial=partial, strict=strict, return_id=return_id)
-           if result:
-               args_type = type(result.get('args', None)).__name__
-               print(f"[DIAGNOSTIC] parse_tool_call returned: args type = {args_type}")
-               if args_type == 'str':
-                   print(f"[DIAGNOSTIC] ⚠️ BUG FOUND! parse_tool_call returned STRING args: {result['args']}")
+           if result and isinstance(result.get('args'), str):
+               # FIX: parse_tool_call sometimes returns string args instead of dict
+               # This is a known LangChain bug - parse the string to dict
+               try:
+                   result['args'] = json.loads(result['args'])
+               except (json.JSONDecodeError, TypeError):
+                   # Leave as string if we can't parse it - will fail validation
+                   # but at least we tried
+                   pass
            return result

-       # Replace globally
-       openai_tools.parse_tool_call = patched_parse_tool_call
+       # Replace in base.py's namespace (where _convert_dict_to_message uses it)
+       langchain_base.parse_tool_call = patched_parse_tool_call

        original_create_chat_result = self.wrapped_model._create_chat_result

        @wraps(original_create_chat_result)
        def patched_create_chat_result(response: Any, generation_info: Optional[Dict] = None):
-           """Patched version with diagnostic logging and args parsing"""
-           import traceback
+           """Patched version that normalizes non-standard tool_call formats"""
            response_dict = response if isinstance(response, dict) else response.model_dump()

-           # DIAGNOSTIC: Log response structure for debugging
-           print(f"\n[DIAGNOSTIC] _create_chat_result called")
-           print(f"  Response type: {type(response)}")
-           print(f"  Call stack:")
-           for line in traceback.format_stack()[-5:-1]:  # Show last 4 stack frames
-               print(f"    {line.strip()}")
-           print(f"\n[DIAGNOSTIC] Response structure:")
-           print(f"  Response keys: {list(response_dict.keys())}")
-
-           if 'choices' in response_dict and response_dict['choices']:
-               choice = response_dict['choices'][0]
-               print(f"  Choice keys: {list(choice.keys())}")
-
-               if 'message' in choice:
-                   message = choice['message']
-                   print(f"  Message keys: {list(message.keys())}")
-
-                   # Check for raw tool_calls in message (before parse_tool_call processing)
-                   if 'tool_calls' in message:
-                       tool_calls_value = message['tool_calls']
-                       print(f"  message['tool_calls'] type: {type(tool_calls_value)}")
-
-                       if tool_calls_value:
-                           print(f"  tool_calls count: {len(tool_calls_value)}")
-                           for i, tc in enumerate(tool_calls_value):  # Show ALL
-                               print(f"  tool_calls[{i}] type: {type(tc)}")
-                               print(f"  tool_calls[{i}] keys: {list(tc.keys()) if isinstance(tc, dict) else 'N/A'}")
-                               if isinstance(tc, dict):
-                                   if 'function' in tc:
-                                       print(f"    function keys: {list(tc['function'].keys())}")
-                                       if 'arguments' in tc['function']:
-                                           args = tc['function']['arguments']
-                                           print(f"    function.arguments type: {type(args).__name__}")
-                                           print(f"    function.arguments value: {str(args)[:100]}")
-                                   if 'args' in tc:
-                                       print(f"    ALSO HAS 'args' KEY: type={type(tc['args']).__name__}")
-                                       print(f"    args value: {str(tc['args'])[:100]}")
-
-           # Fix tool_calls: Normalize to OpenAI format if needed
+           # Normalize tool_calls to OpenAI standard format if needed
            if 'choices' in response_dict:
                for choice in response_dict['choices']:
                    if 'message' not in choice:
@@ -103,13 +68,11 @@ class ToolCallArgsParsingWrapper:

            message = choice['message']

-           # Fix tool_calls: Ensure standard OpenAI format
+           # Fix tool_calls: Convert non-standard {name, args, id} to {function: {name, arguments}, id}
            if 'tool_calls' in message and message['tool_calls']:
-               print(f"[DIAGNOSTIC] Processing {len(message['tool_calls'])} tool_calls...")
-               for idx, tool_call in enumerate(message['tool_calls']):
+               for tool_call in message['tool_calls']:
                    # Check if this is non-standard format (has 'args' directly)
                    if 'args' in tool_call and 'function' not in tool_call:
-                       print(f"[DIAGNOSTIC] tool_calls[{idx}] has non-standard format (direct args)")
                        # Convert to standard OpenAI format
                        args = tool_call['args']
                        tool_call['function'] = {
@@ -121,36 +84,19 @@ class ToolCallArgsParsingWrapper:

                        del tool_call['name']
                    if 'args' in tool_call:
                        del tool_call['args']
-                       print(f"[DIAGNOSTIC] Converted tool_calls[{idx}] to standard OpenAI format")

-           # Fix invalid_tool_calls: dict args -> string
+           # Fix invalid_tool_calls: Ensure args is JSON string (not dict)
            if 'invalid_tool_calls' in message and message['invalid_tool_calls']:
-               print(f"[DIAGNOSTIC] Checking invalid_tool_calls for dict-to-string conversion...")
-               for idx, invalid_call in enumerate(message['invalid_tool_calls']):
-                   if 'args' in invalid_call:
-                       args = invalid_call['args']
-                       # Convert dict arguments to JSON string
-                       if isinstance(args, dict):
-                           try:
-                               invalid_call['args'] = json.dumps(args)
-                               print(f"[DIAGNOSTIC] Converted invalid_tool_calls[{idx}].args from dict to string")
-                           except (TypeError, ValueError) as e:
-                               print(f"[DIAGNOSTIC] Failed to serialize invalid_tool_calls[{idx}].args: {e}")
-                               # Keep as-is if serialization fails
+               for invalid_call in message['invalid_tool_calls']:
+                   if 'args' in invalid_call and isinstance(invalid_call['args'], dict):
+                       try:
+                           invalid_call['args'] = json.dumps(invalid_call['args'])
+                       except (TypeError, ValueError):
+                           # Keep as-is if serialization fails
+                           pass

-           # Call original method with fixed response
-           print(f"[DIAGNOSTIC] Calling original_create_chat_result...")
-           result = original_create_chat_result(response_dict, generation_info)
-           print(f"[DIAGNOSTIC] original_create_chat_result returned successfully")
-           print(f"[DIAGNOSTIC] Result type: {type(result)}")
-           if hasattr(result, 'generations') and result.generations:
-               gen = result.generations[0]
-               if hasattr(gen, 'message') and hasattr(gen.message, 'tool_calls'):
-                   print(f"[DIAGNOSTIC] Result has {len(gen.message.tool_calls)} tool_calls")
-                   if gen.message.tool_calls:
-                       tc = gen.message.tool_calls[0]
-                       print(f"[DIAGNOSTIC] tool_calls[0]['args'] type in result: {type(tc['args'])}")
-           return result
+           # Call original method with normalized response
+           return original_create_chat_result(response_dict, generation_info)

        # Replace the method
        self.wrapped_model._create_chat_result = patched_create_chat_result

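For reference, the normalization step in the hunk above can be written as a standalone helper. This is a hypothetical sketch for illustration (the function name is not part of the diff), assuming the non-standard shape `{name, args, id}` described in the changelog:

```python
import json

def normalize_tool_call(tc: dict) -> dict:
    """Convert a non-standard {name, args, id} tool_call into the
    OpenAI-standard {function: {name, arguments}, id} shape, in place."""
    if 'args' in tc and 'function' not in tc:
        args = tc.pop('args')
        tc['function'] = {
            'name': tc.pop('name', None),
            # OpenAI expects arguments serialized as a JSON string
            'arguments': args if isinstance(args, str) else json.dumps(args),
        }
    return tc
```

Already-standard tool_calls pass through untouched, matching the wrapper's defensive, only-act-when-needed behavior.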
@@ -88,9 +88,17 @@ class ContextInjector:

        # Update position state after successful trade
        if request.name in ["buy", "sell"]:
-           # Check if result is a valid position dict (not an error)
-           if isinstance(result, dict) and "error" not in result and "CASH" in result:
+           # Extract position dict from MCP result
+           # MCP tools return CallToolResult objects with structuredContent field
+           position_dict = None
+           if hasattr(result, 'structuredContent') and result.structuredContent:
+               position_dict = result.structuredContent
+           elif isinstance(result, dict):
+               position_dict = result
+
+           # Check if position dict is valid (not an error) and update state
+           if position_dict and "error" not in position_dict and "CASH" in position_dict:
                # Update our tracked position with the new state
-               self._current_position = result.copy()
+               self._current_position = position_dict.copy()

        return result

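The extraction logic added above can be exercised in isolation. A minimal sketch, using a hypothetical stand-in for the MCP SDK's `CallToolResult` (only the `structuredContent` attribute from the diff is assumed):

```python
class FakeCallToolResult:
    """Hypothetical stand-in for the MCP SDK's CallToolResult."""
    def __init__(self, structuredContent=None):
        self.structuredContent = structuredContent

def extract_position(result):
    """Prefer structuredContent on CallToolResult-like objects,
    fall back to a plain dict (e.g., in unit tests)."""
    if hasattr(result, 'structuredContent') and result.structuredContent:
        return result.structuredContent
    if isinstance(result, dict):
        return result
    return None
```

The old `isinstance(result, dict)` check returned `False` for `CallToolResult` objects, which is why the position was never updated; this extraction handles both shapes.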
@@ -78,10 +78,11 @@ class MCPServiceManager:

        env['PYTHONPATH'] = str(Path.cwd())

        # Start service process (output goes to Docker logs)
+       # Enable stdout/stderr for debugging (previously sent to DEVNULL)
        process = subprocess.Popen(
            [sys.executable, str(script_path)],
-           stdout=subprocess.DEVNULL,
-           stderr=subprocess.DEVNULL,
+           stdout=sys.stdout,  # Redirect to main process stdout
+           stderr=sys.stderr,  # Redirect to main process stderr
            cwd=Path.cwd(),  # Use current working directory (/app)
            env=env  # Pass environment with PYTHONPATH
        )

@@ -34,8 +34,11 @@ def get_current_position_from_db(

    Returns ending holdings and cash from that previous day, which becomes the
    starting position for the current day.

+   NOTE: Searches across ALL jobs for the given model, enabling portfolio continuity
+   even when new jobs are created with overlapping date ranges.
+
    Args:
-       job_id: Job UUID
+       job_id: Job UUID (kept for compatibility but not used in query)
        model: Model signature
        date: Current trading date (will query for date < this)
        initial_cash: Initial cash if no prior data (first trading day)

@@ -51,13 +54,14 @@ def get_current_position_from_db(

    try:
        # Query most recent trading_day BEFORE current date (previous day's ending position)
+       # NOTE: Removed job_id filter to enable cross-job continuity
        cursor.execute("""
            SELECT id, ending_cash
            FROM trading_days
-           WHERE job_id = ? AND model = ? AND date < ?
+           WHERE model = ? AND date < ?
            ORDER BY date DESC
            LIMIT 1
-       """, (job_id, model, date))
+       """, (model, date))

        row = cursor.fetchone()

@@ -611,6 +611,10 @@ class Database:

        Handles weekends/holidays by finding actual previous trading day.

+       NOTE: Queries across ALL jobs for the given model to enable portfolio
+       continuity even when new jobs are created with overlapping date ranges.
+       The job_id parameter is kept for API compatibility but not used in the query.
+
        Returns:
            dict with keys: id, date, ending_cash, ending_portfolio_value
            or None if no previous day exists
@@ -619,11 +623,11 @@ class Database:

            """
            SELECT id, date, ending_cash, ending_portfolio_value
            FROM trading_days
-           WHERE job_id = ? AND model = ? AND date < ?
+           WHERE model = ? AND date < ?
            ORDER BY date DESC
            LIMIT 1
            """,
-           (job_id, model, current_date)
+           (model, current_date)
        )

        row = cursor.fetchone()

@@ -657,6 +661,9 @@ class Database:

    def get_starting_holdings(self, trading_day_id: int) -> list:
        """Get starting holdings from previous day's ending holdings.

+       NOTE: Queries across ALL jobs for the given model to enable portfolio
+       continuity even when new jobs are created with overlapping date ranges.
+
        Returns:
            List of dicts with keys: symbol, quantity
            Empty list if first trading day
@@ -667,7 +674,6 @@ class Database:

            SELECT td_prev.id
            FROM trading_days td_current
            JOIN trading_days td_prev ON
-               td_prev.job_id = td_current.job_id AND
                td_prev.model = td_current.model AND
                td_prev.date < td_current.date
            WHERE td_current.id = ?

@@ -55,8 +55,9 @@ class JobManager:

        config_path: str,
        date_range: List[str],
        models: List[str],
-       model_day_filter: Optional[List[tuple]] = None
-   ) -> str:
+       model_day_filter: Optional[List[tuple]] = None,
+       skip_completed: bool = True
+   ) -> Dict[str, Any]:
        """
        Create new simulation job.

@@ -66,12 +67,16 @@ class JobManager:

            models: List of model signatures to execute
            model_day_filter: Optional list of (model, date) tuples to limit job_details.
                If None, creates job_details for all model-date combinations.
+           skip_completed: If True (default), skips already-completed simulations.
+               If False, includes all requested simulations regardless of completion status.

        Returns:
-           job_id: UUID of created job
+           Dict with:
+           - job_id: UUID of created job
+           - warnings: List of warning messages for skipped simulations

        Raises:
-           ValueError: If another job is already running/pending
+           ValueError: If another job is already running/pending or if all simulations are already completed (when skip_completed=True)
        """
        if not self.can_start_new_job():
            raise ValueError("Another simulation job is already running or pending")
@@ -83,6 +88,49 @@ class JobManager:

        cursor = conn.cursor()

        try:
+           # Determine which model-day pairs to check
+           if model_day_filter is not None:
+               pairs_to_check = model_day_filter
+           else:
+               pairs_to_check = [(model, date) for date in date_range for model in models]
+
+           # Check for already-completed simulations (only if skip_completed=True)
+           skipped_pairs = []
+           pending_pairs = []
+
+           if skip_completed:
+               # Perform duplicate checking
+               for model, date in pairs_to_check:
+                   cursor.execute("""
+                       SELECT COUNT(*)
+                       FROM job_details
+                       WHERE model = ? AND date = ? AND status = 'completed'
+                   """, (model, date))
+
+                   count = cursor.fetchone()[0]
+
+                   if count > 0:
+                       skipped_pairs.append((model, date))
+                       logger.info(f"Skipping {model}/{date} - already completed in previous job")
+                   else:
+                       pending_pairs.append((model, date))
+
+               # If all simulations are already completed, raise error
+               if len(pending_pairs) == 0:
+                   warnings = [
+                       f"Skipped {model}/{date} - already completed"
+                       for model, date in skipped_pairs
+                   ]
+                   raise ValueError(
+                       f"All requested simulations are already completed. "
+                       f"Skipped {len(skipped_pairs)} model-day pair(s). "
+                       f"Details: {warnings}"
+                   )
+           else:
+               # skip_completed=False: include ALL pairs (no duplicate checking)
+               pending_pairs = pairs_to_check
+               logger.info(f"Including all {len(pending_pairs)} model-day pairs (skip_completed=False)")

            # Insert job
            cursor.execute("""
                INSERT INTO jobs (
@@ -98,34 +146,32 @@ class JobManager:

                created_at
            ))

-           # Create job_details based on filter
-           if model_day_filter is not None:
-               # Only create job_details for specified model-day pairs
-               for model, date in model_day_filter:
-                   cursor.execute("""
-                       INSERT INTO job_details (
-                           job_id, date, model, status
-                       )
-                       VALUES (?, ?, ?, ?)
-                   """, (job_id, date, model, "pending"))
-
-               logger.info(f"Created job {job_id} with {len(model_day_filter)} model-day tasks (filtered)")
-           else:
-               # Create job_details for all model-day combinations
-               for date in date_range:
-                   for model in models:
-                       cursor.execute("""
-                           INSERT INTO job_details (
-                               job_id, date, model, status
-                           )
-                           VALUES (?, ?, ?, ?)
-                       """, (job_id, date, model, "pending"))
-
-           logger.info(f"Created job {job_id} with {len(date_range)} dates and {len(models)} models")
+           # Create job_details only for pending pairs
+           for model, date in pending_pairs:
+               cursor.execute("""
+                   INSERT INTO job_details (
+                       job_id, date, model, status
+                   )
+                   VALUES (?, ?, ?, ?)
+               """, (job_id, date, model, "pending"))
+
+           logger.info(f"Created job {job_id} with {len(pending_pairs)} model-day tasks")
+
+           if skipped_pairs:
+               logger.info(f"Skipped {len(skipped_pairs)} already-completed simulations")

            conn.commit()

-           return job_id
+           # Prepare warnings
+           warnings = [
+               f"Skipped {model}/{date} - already completed"
+               for model, date in skipped_pairs
+           ]
+
+           return {
+               "job_id": job_id,
+               "warnings": warnings
+           }

        finally:
            conn.close()
@@ -699,6 +745,85 @@ class JobManager:

        finally:
            conn.close()

+   def cleanup_stale_jobs(self) -> Dict[str, int]:
+       """
+       Clean up stale jobs from container restarts.
+
+       Marks jobs with status 'pending', 'downloading_data', or 'running' as
+       'failed' or 'partial' based on completion percentage.
+
+       Called on application startup to reset interrupted jobs.
+
+       Returns:
+           Dict with jobs_cleaned count and details
+       """
+       conn = get_db_connection(self.db_path)
+       cursor = conn.cursor()
+
+       try:
+           # Find all stale jobs
+           cursor.execute("""
+               SELECT job_id, status
+               FROM jobs
+               WHERE status IN ('pending', 'downloading_data', 'running')
+           """)
+
+           stale_jobs = cursor.fetchall()
+           cleaned_count = 0
+
+           for job_id, original_status in stale_jobs:
+               # Get progress to determine if partially completed
+               cursor.execute("""
+                   SELECT
+                       COUNT(*) as total,
+                       SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) as completed,
+                       SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) as failed
+                   FROM job_details
+                   WHERE job_id = ?
+               """, (job_id,))
+
+               total, completed, failed = cursor.fetchone()
+               completed = completed or 0
+               failed = failed or 0
+
+               # Determine final status based on completion
+               if completed > 0:
+                   new_status = "partial"
+                   error_msg = f"Job interrupted by container restart (was {original_status}, {completed}/{total} model-days completed)"
+               else:
+                   new_status = "failed"
+                   error_msg = f"Job interrupted by container restart (was {original_status}, no progress made)"
+
+               # Mark incomplete job_details as failed
+               cursor.execute("""
+                   UPDATE job_details
+                   SET status = 'failed', error = 'Container restarted before completion'
+                   WHERE job_id = ? AND status IN ('pending', 'running')
+               """, (job_id,))
+
+               # Update job status
+               updated_at = datetime.utcnow().isoformat() + "Z"
+               cursor.execute("""
+                   UPDATE jobs
+                   SET status = ?, error = ?, completed_at = ?, updated_at = ?
+                   WHERE job_id = ?
+               """, (new_status, error_msg, updated_at, updated_at, job_id))
+
+               logger.warning(f"Cleaned up stale job {job_id}: {original_status} → {new_status} ({completed}/{total} completed)")
+               cleaned_count += 1
+
+           conn.commit()
+
+           if cleaned_count > 0:
+               logger.warning(f"⚠️ Cleaned up {cleaned_count} stale job(s) from previous container session")
+           else:
+               logger.info("✅ No stale jobs found")
+
+           return {"jobs_cleaned": cleaned_count}
+
+       finally:
+           conn.close()
+
    def cleanup_old_jobs(self, days: int = 30) -> Dict[str, int]:
        """
        Delete jobs older than threshold.

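The partial/failed decision inside `cleanup_stale_jobs()` is a small pure rule; factored out here as a hypothetical helper (the function name is illustrative, not in the diff):

```python
def stale_job_outcome(original_status: str, completed: int, total: int):
    """Mirror of the determination above: 'partial' if any model-day
    finished before the restart, otherwise 'failed'."""
    if completed > 0:
        return ("partial",
                f"Job interrupted by container restart (was {original_status}, "
                f"{completed}/{total} model-days completed)")
    return ("failed",
            f"Job interrupted by container restart (was {original_status}, no progress made)")
```

Keeping the rule pure makes the six cleanup-scenario unit tests mentioned in the changelog straightforward to write without a database.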
28
api/main.py
@@ -134,25 +134,39 @@ def create_app(

    @asynccontextmanager
    async def lifespan(app: FastAPI):
        """Initialize database on startup, cleanup on shutdown if needed"""
-       from tools.deployment_config import is_dev_mode, get_db_path
+       from tools.deployment_config import is_dev_mode, get_db_path, should_preserve_dev_data
        from api.database import initialize_dev_database, initialize_database

        # Startup - use closure to access db_path from create_app scope
        logger.info("🚀 FastAPI application starting...")
        logger.info("📊 Initializing database...")

+       should_cleanup_stale_jobs = False
+
        if is_dev_mode():
            # Initialize dev database (reset unless PRESERVE_DEV_DATA=true)
            logger.info("  🔧 DEV mode detected - initializing dev database")
            dev_db_path = get_db_path(db_path)
            initialize_dev_database(dev_db_path)
            log_dev_mode_startup_warning()
+
+           # Only cleanup stale jobs if preserving dev data (otherwise DB is fresh)
+           if should_preserve_dev_data():
+               should_cleanup_stale_jobs = True
        else:
            # Ensure production database schema exists
            logger.info("  🏭 PROD mode - ensuring database schema exists")
            initialize_database(db_path)
+           should_cleanup_stale_jobs = True

        logger.info("✅ Database initialized")

+       # Clean up stale jobs from previous container session
+       if should_cleanup_stale_jobs:
+           logger.info("🧹 Checking for stale jobs from previous session...")
+           job_manager = JobManager(get_db_path(db_path) if is_dev_mode() else db_path)
+           job_manager.cleanup_stale_jobs()
+
        logger.info("🌐 API server ready to accept requests")

        yield
@@ -266,12 +280,19 @@ def create_app(

        # Create job immediately with all requested dates
        # Worker will handle data download and filtering
-       job_id = job_manager.create_job(
+       result = job_manager.create_job(
            config_path=config_path,
            date_range=all_dates,
            models=models_to_run,
-           model_day_filter=None  # Worker will filter based on available data
+           model_day_filter=None,  # Worker will filter based on available data
+           skip_completed=(not request.replace_existing)  # Skip if replace_existing=False
        )
+       job_id = result["job_id"]
+       warnings = result.get("warnings", [])
+
+       # Log warnings if any simulations were skipped
+       if warnings:
+           logger.warning(f"Job {job_id} created with {len(warnings)} skipped simulations: {warnings}")

        # Start worker in background thread (only if not in test mode)
        if not getattr(app.state, "test_mode", False):
@@ -298,6 +319,7 @@ def create_app(

            status="pending",
            total_model_days=len(all_dates) * len(models_to_run),
            message=message,
+           warnings=warnings if warnings else None,
            **deployment_info
        )

@@ -66,3 +66,28 @@ See README.md for architecture diagram.

- Search results filtered by publication date

See [CLAUDE.md](../../CLAUDE.md) for implementation details.

---

## Position Tracking Across Jobs

**Design:** Portfolio state is tracked per-model across all jobs, not per-job.

**Query Logic:**
```sql
-- Get starting position for current trading day
SELECT id, ending_cash FROM trading_days
WHERE model = ? AND date < ?  -- No job_id filter
ORDER BY date DESC
LIMIT 1
```

**Benefits:**
- Portfolio continuity when creating new jobs with overlapping dates
- Prevents accidental portfolio resets
- Enables flexible job scheduling (resume, rerun, backfill)

**Example:**
- Job 1: Runs 2025-10-13 to 2025-10-15 for model-a
- Job 2: Runs 2025-10-16 to 2025-10-20 for model-a
- Job 2 starts with Job 1's ending position from 2025-10-15

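The Job 1 → Job 2 handoff described above can be demonstrated end to end with an in-memory database. A sketch under stated assumptions: the column list is abbreviated from the real `trading_days` schema, and the job/model names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trading_days (job_id TEXT, model TEXT, date TEXT, ending_cash REAL)")

# Job 1 runs 2025-10-13..2025-10-15 for model-a
for date, cash in [("2025-10-13", 10200.0), ("2025-10-14", 10150.0), ("2025-10-15", 10400.0)]:
    conn.execute("INSERT INTO trading_days VALUES ('job-1', 'model-a', ?, ?)", (date, cash))

# Job 2 starts on 2025-10-16: the query ignores job_id, so it sees job-1's last day
row = conn.execute(
    "SELECT ending_cash FROM trading_days WHERE model = ? AND date < ? "
    "ORDER BY date DESC LIMIT 1",
    ("model-a", "2025-10-16"),
).fetchone()
print(row[0])  # 10400.0 — Job 2 inherits Job 1's 2025-10-15 ending cash
```

Had the query kept the `job_id` filter, Job 2 would have found no prior row and silently reset the portfolio to initial cash.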
1172
docs/plans/2025-11-07-fix-duplicate-simulation-bugs.md
Normal file
File diff suppressed because it is too large
@@ -405,11 +405,12 @@ class TestAsyncDownload:

        db_path = api_client.db_path
        job_manager = JobManager(db_path=db_path)

-       job_id = job_manager.create_job(
+       job_result = job_manager.create_job(
            config_path="config.json",
            date_range=["2025-10-01"],
            models=["gpt-5"]
        )
+       job_id = job_result["job_id"]

        # Add warnings
        warnings = ["Rate limited", "Skipped 1 date"]

@@ -12,11 +12,12 @@ def test_worker_prepares_data_before_execution(tmp_path):

    job_manager = JobManager(db_path=db_path)

    # Create job
-   job_id = job_manager.create_job(
+   job_result = job_manager.create_job(
        config_path="configs/default_config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )
+   job_id = job_result["job_id"]

    worker = SimulationWorker(job_id=job_id, db_path=db_path)

@@ -46,11 +47,12 @@ def test_worker_handles_no_available_dates(tmp_path):

    initialize_database(db_path)
    job_manager = JobManager(db_path=db_path)

-   job_id = job_manager.create_job(
+   job_result = job_manager.create_job(
        config_path="configs/default_config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )
+   job_id = job_result["job_id"]

    worker = SimulationWorker(job_id=job_id, db_path=db_path)

@@ -74,11 +76,12 @@ def test_worker_stores_warnings(tmp_path):

    initialize_database(db_path)
    job_manager = JobManager(db_path=db_path)

-   job_id = job_manager.create_job(
+   job_result = job_manager.create_job(
        config_path="configs/default_config.json",
        date_range=["2025-10-01"],
        models=["gpt-5"]
    )
+   job_id = job_result["job_id"]

    worker = SimulationWorker(job_id=job_id, db_path=db_path)

278
tests/integration/test_duplicate_simulation_prevention.py
Normal file
@@ -0,0 +1,278 @@
"""Integration test for duplicate simulation prevention."""
import pytest
import tempfile
import os
import json
from pathlib import Path
from api.job_manager import JobManager
from api.model_day_executor import ModelDayExecutor
from api.database import get_db_connection


pytestmark = pytest.mark.integration


@pytest.fixture
def temp_env(tmp_path):
    """Create temporary environment with db and config."""
    # Create temp database
    db_path = str(tmp_path / "test_jobs.db")

    # Initialize database
    conn = get_db_connection(db_path)
    cursor = conn.cursor()

    # Create schema
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS jobs (
            job_id TEXT PRIMARY KEY,
            config_path TEXT NOT NULL,
            status TEXT NOT NULL,
            date_range TEXT NOT NULL,
            models TEXT NOT NULL,
            created_at TEXT NOT NULL,
            started_at TEXT,
            updated_at TEXT,
            completed_at TEXT,
            total_duration_seconds REAL,
            error TEXT,
            warnings TEXT
        )
    """)

    cursor.execute("""
        CREATE TABLE IF NOT EXISTS job_details (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            job_id TEXT NOT NULL,
            date TEXT NOT NULL,
            model TEXT NOT NULL,
            status TEXT NOT NULL,
            started_at TEXT,
            completed_at TEXT,
            duration_seconds REAL,
            error TEXT,
            FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE,
            UNIQUE(job_id, date, model)
        )
    """)

    cursor.execute("""
        CREATE TABLE IF NOT EXISTS trading_days (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            job_id TEXT NOT NULL,
            model TEXT NOT NULL,
            date TEXT NOT NULL,
            starting_cash REAL NOT NULL,
            ending_cash REAL NOT NULL,
            profit REAL NOT NULL,
            return_pct REAL NOT NULL,
            portfolio_value REAL NOT NULL,
            reasoning_summary TEXT,
            reasoning_full TEXT,
            completed_at TEXT,
            session_duration_seconds REAL,
            UNIQUE(job_id, model, date)
        )
    """)

    cursor.execute("""
        CREATE TABLE IF NOT EXISTS holdings (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            trading_day_id INTEGER NOT NULL,
            symbol TEXT NOT NULL,
            quantity INTEGER NOT NULL,
            FOREIGN KEY (trading_day_id) REFERENCES trading_days(id) ON DELETE CASCADE
        )
    """)

    cursor.execute("""
        CREATE TABLE IF NOT EXISTS actions (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            trading_day_id INTEGER NOT NULL,
            action_type TEXT NOT NULL,
            symbol TEXT NOT NULL,
            quantity INTEGER NOT NULL,
            price REAL NOT NULL,
            created_at TEXT NOT NULL,
            FOREIGN KEY (trading_day_id) REFERENCES trading_days(id) ON DELETE CASCADE
        )
    """)

    conn.commit()
    conn.close()

    # Create mock config
    config_path = str(tmp_path / "test_config.json")
    config = {
        "models": [
            {
                "signature": "test-model",
                "basemodel": "mock/model",
                "enabled": True
            }
        ],
        "agent_config": {
            "max_steps": 10,
            "initial_cash": 10000.0
        },
        "log_config": {
            "log_path": str(tmp_path / "logs")
        },
        "date_range": {
            "init_date": "2025-10-13"
        }
    }

    with open(config_path, 'w') as f:
        json.dump(config, f)

    yield {
        "db_path": db_path,
        "config_path": config_path,
        "data_dir": str(tmp_path)
    }


def test_duplicate_simulation_is_skipped(temp_env):
    """Test that overlapping job skips already-completed simulation."""
    manager = JobManager(db_path=temp_env["db_path"])

    # Create first job
    result_1 = manager.create_job(
        config_path=temp_env["config_path"],
        date_range=["2025-10-15"],
        models=["test-model"]
    )
    job_id_1 = result_1["job_id"]

    # Simulate completion by manually inserting trading_day record
    conn = get_db_connection(temp_env["db_path"])
    cursor = conn.cursor()

    cursor.execute("""
        INSERT INTO trading_days (
            job_id, model, date, starting_cash, ending_cash,
            profit, return_pct, portfolio_value, completed_at
        )
        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
    """, (
        job_id_1,
        "test-model",
        "2025-10-15",
        10000.0,
        9500.0,
        -500.0,
        -5.0,
        9500.0,
        "2025-11-07T01:00:00Z"
    ))

    conn.commit()
    conn.close()

    # Mark job_detail as completed
    manager.update_job_detail_status(
        job_id_1,
        "2025-10-15",
        "test-model",
        "completed"
    )

    # Try to create second job with same model-day
    result_2 = manager.create_job(
        config_path=temp_env["config_path"],
        date_range=["2025-10-15", "2025-10-16"],
        models=["test-model"]
    )

    # Should have warnings about skipped simulation
    assert len(result_2["warnings"]) == 1
    assert "2025-10-15" in result_2["warnings"][0]

    # Should only create job_detail for 2025-10-16
    details = manager.get_job_details(result_2["job_id"])
    assert len(details) == 1
    assert details[0]["date"] == "2025-10-16"


def test_portfolio_continues_from_previous_job(temp_env):
    """Test that new job continues portfolio from previous job's last day."""
    manager = JobManager(db_path=temp_env["db_path"])

    # Create and complete first job
    result_1 = manager.create_job(
        config_path=temp_env["config_path"],
        date_range=["2025-10-13"],
        models=["test-model"]
    )
    job_id_1 = result_1["job_id"]

    # Insert completed trading_day with holdings
    conn = get_db_connection(temp_env["db_path"])
    cursor = conn.cursor()

    cursor.execute("""
        INSERT INTO trading_days (
            job_id, model, date, starting_cash, ending_cash,
            profit, return_pct, portfolio_value, completed_at
        )
        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
    """, (
        job_id_1,
        "test-model",
        "2025-10-13",
        10000.0,
        5000.0,
        0.0,
        0.0,
        15000.0,
        "2025-11-07T01:00:00Z"
    ))

    trading_day_id = cursor.lastrowid

    cursor.execute("""
        INSERT INTO holdings (trading_day_id, symbol, quantity)
        VALUES (?, ?, ?)
    """, (trading_day_id, "AAPL", 10))

    conn.commit()

    # Mark as completed
    manager.update_job_detail_status(job_id_1, "2025-10-13", "test-model", "completed")
    manager.update_job_status(job_id_1, "completed")

    # Create second job for next day
    result_2 = manager.create_job(
        config_path=temp_env["config_path"],
        date_range=["2025-10-14"],
        models=["test-model"]
    )
    job_id_2 = result_2["job_id"]

    # Get starting position for 2025-10-14
    from agent_tools.tool_trade import get_current_position_from_db
    import agent_tools.tool_trade as trade_module
    original_get_db_connection = trade_module.get_db_connection

    def mock_get_db_connection(path):
        return get_db_connection(temp_env["db_path"])

    trade_module.get_db_connection = mock_get_db_connection

    try:
        position, _ = get_current_position_from_db(
            job_id=job_id_2,
            model="test-model",
            date="2025-10-14",
            initial_cash=10000.0
        )

        # Should continue from job 1's ending position
        assert position["CASH"] == 5000.0
        assert position["AAPL"] == 10
    finally:
        # Restore original function
        trade_module.get_db_connection = original_get_db_connection

    conn.close()
@@ -2,6 +2,7 @@

import pytest
from agent.context_injector import ContextInjector
from unittest.mock import Mock


@pytest.fixture
@@ -22,27 +23,34 @@ class MockRequest:
        self.args = args or {}


def create_mcp_result(position_dict):
    """Create a mock MCP CallToolResult object matching production behavior."""
    result = Mock()
    result.structuredContent = position_dict
    return result


async def mock_handler_success(request):
    """Mock handler that returns a successful position update."""
    """Mock handler that returns a successful position update as MCP CallToolResult."""
    # Simulate a successful trade returning updated position
    if request.name == "sell":
        return {
        return create_mcp_result({
            "CASH": 1100.0,
            "AAPL": 7,
            "MSFT": 5
        }
        })
    elif request.name == "buy":
        return {
        return create_mcp_result({
            "CASH": 50.0,
            "AAPL": 7,
            "MSFT": 12
        }
    return {}
        })
    return create_mcp_result({})


async def mock_handler_error(request):
    """Mock handler that returns an error."""
    return {"error": "Insufficient cash"}
    """Mock handler that returns an error as MCP CallToolResult."""
    return create_mcp_result({"error": "Insufficient cash"})


@pytest.mark.asyncio
@@ -68,17 +76,17 @@ async def test_context_injector_injects_parameters(injector):
    """Test that context parameters are injected into buy/sell requests."""
    request = MockRequest("buy", {"symbol": "AAPL", "amount": 10})

    # Mock handler that just returns the request args
    # Mock handler that returns MCP result containing the request args
    async def handler(req):
        return req.args
        return create_mcp_result(req.args)

    result = await injector(request, handler)

    # Verify context was injected
    assert result["signature"] == "test-model"
    assert result["today_date"] == "2025-01-15"
    assert result["job_id"] == "test-job-123"
    assert result["trading_day_id"] == 1
    # Verify context was injected (result is MCP CallToolResult object)
    assert result.structuredContent["signature"] == "test-model"
    assert result.structuredContent["today_date"] == "2025-01-15"
    assert result.structuredContent["job_id"] == "test-job-123"
    assert result.structuredContent["trading_day_id"] == 1


@pytest.mark.asyncio
@@ -132,7 +140,7 @@ async def test_context_injector_does_not_update_position_on_error(injector):

    # Verify position was NOT updated
    assert injector._current_position == original_position
    assert "error" in result
    assert "error" in result.structuredContent


@pytest.mark.asyncio
@@ -146,7 +154,7 @@ async def test_context_injector_does_not_inject_position_for_non_trade_tools(inj

    async def verify_no_injection_handler(req):
        assert "_current_position" not in req.args
        return {"results": []}
        return create_mcp_result({"results": []})

    await injector(request, verify_no_injection_handler)

@@ -164,7 +172,7 @@ async def test_context_injector_full_trading_session_simulation(injector):
    async def handler1(req):
        # First trade should NOT have injected position
        assert req.args.get("_current_position") is None
        return {"CASH": 1100.0, "AAPL": 7}
        return create_mcp_result({"CASH": 1100.0, "AAPL": 7})

    result1 = await injector(request1, handler1)
    assert injector._current_position == {"CASH": 1100.0, "AAPL": 7}
@@ -176,7 +184,7 @@ async def test_context_injector_full_trading_session_simulation(injector):
        # Second trade SHOULD have injected position from trade 1
        assert req.args["_current_position"]["CASH"] == 1100.0
        assert req.args["_current_position"]["AAPL"] == 7
        return {"CASH": 50.0, "AAPL": 7, "MSFT": 7}
        return create_mcp_result({"CASH": 50.0, "AAPL": 7, "MSFT": 7})

    result2 = await injector(request2, handler2)
    assert injector._current_position == {"CASH": 50.0, "AAPL": 7, "MSFT": 7}
@@ -185,7 +193,7 @@ async def test_context_injector_full_trading_session_simulation(injector):
    request3 = MockRequest("buy", {"symbol": "GOOGL", "amount": 100})

    async def handler3(req):
        return {"error": "Insufficient cash", "cash_available": 50.0}
        return create_mcp_result({"error": "Insufficient cash", "cash_available": 50.0})

    result3 = await injector(request3, handler3)
    # Position should remain unchanged after failed trade

229
tests/unit/test_cross_job_position_continuity.py
Normal file
@@ -0,0 +1,229 @@
"""Test portfolio continuity across multiple jobs."""
import pytest
import tempfile
import os
from agent_tools.tool_trade import get_current_position_from_db
from api.database import get_db_connection


@pytest.fixture
def temp_db():
    """Create temporary database with schema."""
    fd, path = tempfile.mkstemp(suffix='.db')
    os.close(fd)

    conn = get_db_connection(path)
    cursor = conn.cursor()

    # Create trading_days table
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS trading_days (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            job_id TEXT NOT NULL,
            model TEXT NOT NULL,
            date TEXT NOT NULL,
            starting_cash REAL NOT NULL,
            ending_cash REAL NOT NULL,
            profit REAL NOT NULL,
            return_pct REAL NOT NULL,
            portfolio_value REAL NOT NULL,
            reasoning_summary TEXT,
            reasoning_full TEXT,
            completed_at TEXT,
            session_duration_seconds REAL,
            UNIQUE(job_id, model, date)
        )
    """)

    # Create holdings table
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS holdings (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            trading_day_id INTEGER NOT NULL,
            symbol TEXT NOT NULL,
            quantity INTEGER NOT NULL,
            FOREIGN KEY (trading_day_id) REFERENCES trading_days(id) ON DELETE CASCADE
        )
    """)

    conn.commit()
    conn.close()

    yield path

    if os.path.exists(path):
        os.remove(path)


def test_position_continuity_across_jobs(temp_db):
    """Test that position queries see history from previous jobs."""
    # Insert trading_day from job 1
    conn = get_db_connection(temp_db)
    cursor = conn.cursor()

    cursor.execute("""
        INSERT INTO trading_days (
            job_id, model, date, starting_cash, ending_cash,
            profit, return_pct, portfolio_value, completed_at
        )
        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
    """, (
        "job-1-uuid",
        "deepseek-chat-v3.1",
        "2025-10-14",
        10000.0,
        5121.52,  # Negative cash from buying
        0.0,
        0.0,
        14993.945,
        "2025-11-07T01:52:53Z"
    ))

    trading_day_id = cursor.lastrowid

    # Insert holdings from job 1
    holdings = [
        ("ADBE", 5),
        ("AVGO", 5),
        ("CRWD", 5),
        ("GOOGL", 20),
        ("META", 5),
        ("MSFT", 5),
        ("NVDA", 10)
    ]

    for symbol, quantity in holdings:
        cursor.execute("""
            INSERT INTO holdings (trading_day_id, symbol, quantity)
            VALUES (?, ?, ?)
        """, (trading_day_id, symbol, quantity))

    conn.commit()
    conn.close()

    # Mock get_db_connection to return our test db
    import agent_tools.tool_trade as trade_module
    original_get_db_connection = trade_module.get_db_connection

    def mock_get_db_connection(path):
        return get_db_connection(temp_db)

    trade_module.get_db_connection = mock_get_db_connection

    try:
        # Now query position for job 2 on next trading day
        position, _ = get_current_position_from_db(
            job_id="job-2-uuid",  # Different job
            model="deepseek-chat-v3.1",
            date="2025-10-15",
            initial_cash=10000.0
        )

        # Should see job 1's ending position, NOT initial $10k
        assert position["CASH"] == 5121.52
        assert position["ADBE"] == 5
        assert position["AVGO"] == 5
        assert position["CRWD"] == 5
        assert position["GOOGL"] == 20
        assert position["META"] == 5
        assert position["MSFT"] == 5
        assert position["NVDA"] == 10
    finally:
        # Restore original function
        trade_module.get_db_connection = original_get_db_connection


def test_position_returns_initial_state_for_first_day(temp_db):
    """Test that first trading day returns initial cash."""
    # Mock get_db_connection to return our test db
    import agent_tools.tool_trade as trade_module
    original_get_db_connection = trade_module.get_db_connection

    def mock_get_db_connection(path):
        return get_db_connection(temp_db)

    trade_module.get_db_connection = mock_get_db_connection

    try:
        # No previous trading days exist
        position, _ = get_current_position_from_db(
            job_id="new-job-uuid",
            model="new-model",
            date="2025-10-13",
            initial_cash=10000.0
        )

        # Should return initial position
        assert position == {"CASH": 10000.0}
    finally:
        # Restore original function
        trade_module.get_db_connection = original_get_db_connection


def test_position_uses_most_recent_prior_date(temp_db):
    """Test that position query uses the most recent date before current."""
    conn = get_db_connection(temp_db)
    cursor = conn.cursor()

    # Insert two trading days
    cursor.execute("""
        INSERT INTO trading_days (
            job_id, model, date, starting_cash, ending_cash,
            profit, return_pct, portfolio_value, completed_at
        )
        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
    """, (
        "job-1",
        "model-a",
        "2025-10-13",
        10000.0,
        9500.0,
        -500.0,
        -5.0,
        9500.0,
        "2025-11-07T01:00:00Z"
    ))

    cursor.execute("""
        INSERT INTO trading_days (
            job_id, model, date, starting_cash, ending_cash,
            profit, return_pct, portfolio_value, completed_at
        )
        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
    """, (
        "job-2",
        "model-a",
        "2025-10-14",
        9500.0,
        12000.0,
        2500.0,
        26.3,
        12000.0,
        "2025-11-07T02:00:00Z"
    ))

    conn.commit()
    conn.close()

    # Mock get_db_connection to return our test db
    import agent_tools.tool_trade as trade_module
    original_get_db_connection = trade_module.get_db_connection

    def mock_get_db_connection(path):
        return get_db_connection(temp_db)

    trade_module.get_db_connection = mock_get_db_connection

    try:
        # Query for 2025-10-15 should use 2025-10-14's ending position
        position, _ = get_current_position_from_db(
            job_id="job-3",
            model="model-a",
            date="2025-10-15",
            initial_cash=10000.0
        )

        assert position["CASH"] == 12000.0  # From 2025-10-14, not 2025-10-13
    finally:
        # Restore original function
        trade_module.get_db_connection = original_get_db_connection
@@ -130,6 +130,44 @@ class TestDatabaseHelpers:
        assert previous is not None
        assert previous["date"] == "2025-01-17"

    def test_get_previous_trading_day_across_jobs(self, db):
        """Test retrieving previous trading day from different job (cross-job continuity)."""
        # Setup: Create two jobs
        db.connection.execute(
            "INSERT INTO jobs (job_id, status, config_path, date_range, models, created_at) VALUES (?, ?, ?, ?, ?, ?)",
            ("job-1", "completed", "config.json", "2025-10-07,2025-10-07", "deepseek-chat-v3.1", "2025-11-07T00:00:00Z")
        )
        db.connection.execute(
            "INSERT INTO jobs (job_id, status, config_path, date_range, models, created_at) VALUES (?, ?, ?, ?, ?, ?)",
            ("job-2", "running", "config.json", "2025-10-08,2025-10-08", "deepseek-chat-v3.1", "2025-11-07T01:00:00Z")
        )

        # Day 1 in job-1
        db.create_trading_day(
            job_id="job-1",
            model="deepseek-chat-v3.1",
            date="2025-10-07",
            starting_cash=10000.0,
            starting_portfolio_value=10000.0,
            daily_profit=214.58,
            daily_return_pct=2.15,
            ending_cash=123.59,
            ending_portfolio_value=10214.58
        )

        # Test: Get previous day from job-2 on next date
        # Should find job-1's record (cross-job continuity)
        previous = db.get_previous_trading_day(
            job_id="job-2",
            model="deepseek-chat-v3.1",
            current_date="2025-10-08"
        )

        assert previous is not None
        assert previous["date"] == "2025-10-07"
        assert previous["ending_cash"] == 123.59
        assert previous["ending_portfolio_value"] == 10214.58

    def test_get_ending_holdings(self, db):
        """Test retrieving ending holdings for a trading day."""
        db.connection.execute(
@@ -224,6 +262,59 @@ class TestDatabaseHelpers:
        assert holdings[0]["symbol"] == "AAPL"
        assert holdings[0]["quantity"] == 10

    def test_get_starting_holdings_across_jobs(self, db):
        """Test starting holdings retrieval across different jobs (cross-job continuity)."""
        # Setup: Create two jobs
        db.connection.execute(
            "INSERT INTO jobs (job_id, status, config_path, date_range, models, created_at) VALUES (?, ?, ?, ?, ?, ?)",
            ("job-1", "completed", "config.json", "2025-10-07,2025-10-07", "deepseek-chat-v3.1", "2025-11-07T00:00:00Z")
        )
        db.connection.execute(
            "INSERT INTO jobs (job_id, status, config_path, date_range, models, created_at) VALUES (?, ?, ?, ?, ?, ?)",
            ("job-2", "running", "config.json", "2025-10-08,2025-10-08", "deepseek-chat-v3.1", "2025-11-07T01:00:00Z")
        )

        # Day 1 in job-1 with holdings
        day1_id = db.create_trading_day(
            job_id="job-1",
            model="deepseek-chat-v3.1",
            date="2025-10-07",
            starting_cash=10000.0,
            starting_portfolio_value=10000.0,
            daily_profit=214.58,
            daily_return_pct=2.15,
            ending_cash=329.825,
            ending_portfolio_value=10666.135
        )
        db.create_holding(day1_id, "AAPL", 10)
        db.create_holding(day1_id, "AMD", 4)
        db.create_holding(day1_id, "MSFT", 8)
        db.create_holding(day1_id, "NVDA", 12)
        db.create_holding(day1_id, "TSLA", 1)

        # Day 2 in job-2 (different job)
        day2_id = db.create_trading_day(
            job_id="job-2",
            model="deepseek-chat-v3.1",
            date="2025-10-08",
            starting_cash=329.825,
            starting_portfolio_value=10609.475,
            daily_profit=-56.66,
            daily_return_pct=-0.53,
            ending_cash=33.62,
            ending_portfolio_value=329.825
        )

        # Test: Day 2 should get Day 1's holdings from different job
        holdings = db.get_starting_holdings(day2_id)

        assert len(holdings) == 5
        assert {"symbol": "AAPL", "quantity": 10} in holdings
        assert {"symbol": "AMD", "quantity": 4} in holdings
        assert {"symbol": "MSFT", "quantity": 8} in holdings
        assert {"symbol": "NVDA", "quantity": 12} in holdings
        assert {"symbol": "TSLA", "quantity": 1} in holdings

    def test_create_action(self, db):
        """Test creating an action record."""
        db.connection.execute(

@@ -26,11 +26,12 @@ class TestJobCreation:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
        job_result = manager.create_job(
            config_path="configs/test.json",
            date_range=["2025-01-16", "2025-01-17"],
            models=["gpt-5", "claude-3.7-sonnet"]
        )
        job_id = job_result["job_id"]

        assert job_id is not None
        job = manager.get_job(job_id)
@@ -44,11 +45,12 @@ class TestJobCreation:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
        job_result = manager.create_job(
            config_path="configs/test.json",
            date_range=["2025-01-16", "2025-01-17"],
            models=["gpt-5"]
        )
        job_id = job_result["job_id"]

        progress = manager.get_job_progress(job_id)
        assert progress["total_model_days"] == 2  # 2 dates × 1 model
@@ -60,11 +62,12 @@ class TestJobCreation:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job1_id = manager.create_job(
        job1_result = manager.create_job(
            "configs/test.json",
            ["2025-01-16"],
            ["gpt-5"]
        )
        job1_id = job1_result["job_id"]

        with pytest.raises(ValueError, match="Another simulation job is already running"):
            manager.create_job(
@@ -78,20 +81,22 @@ class TestJobCreation:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job1_id = manager.create_job(
        job1_result = manager.create_job(
            "configs/test.json",
            ["2025-01-16"],
            ["gpt-5"]
        )
        job1_id = job1_result["job_id"]

        manager.update_job_status(job1_id, "completed")

        # Now second job should be allowed
        job2_id = manager.create_job(
        job2_result = manager.create_job(
            "configs/test.json",
            ["2025-01-17"],
            ["gpt-5"]
        )
        job2_id = job2_result["job_id"]
        assert job2_id is not None


@@ -104,11 +109,12 @@ class TestJobStatusTransitions:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
        job_result = manager.create_job(
            "configs/test.json",
            ["2025-01-16"],
            ["gpt-5"]
        )
        job_id = job_result["job_id"]

        # Update detail to running
        manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
@@ -122,11 +128,12 @@ class TestJobStatusTransitions:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
        job_result = manager.create_job(
            "configs/test.json",
            ["2025-01-16"],
            ["gpt-5"]
        )
        job_id = job_result["job_id"]

        manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
        manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")
@@ -141,11 +148,12 @@ class TestJobStatusTransitions:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
        job_result = manager.create_job(
            "configs/test.json",
            ["2025-01-16"],
            ["gpt-5", "claude-3.7-sonnet"]
        )
        job_id = job_result["job_id"]

        # First model succeeds
        manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
@@ -183,10 +191,12 @@ class TestJobRetrieval:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job1_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job1_result = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job1_id = job1_result["job_id"]
        manager.update_job_status(job1_id, "completed")

        job2_id = manager.create_job("configs/test.json", ["2025-01-17"], ["gpt-5"])
        job2_result = manager.create_job("configs/test.json", ["2025-01-17"], ["gpt-5"])
        job2_id = job2_result["job_id"]

        current = manager.get_current_job()
        assert current["job_id"] == job2_id
@@ -204,11 +214,12 @@ class TestJobRetrieval:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
        job_result = manager.create_job(
            "configs/test.json",
            ["2025-01-16", "2025-01-17"],
            ["gpt-5"]
        )
        job_id = job_result["job_id"]

        found = manager.find_job_by_date_range(["2025-01-16", "2025-01-17"])
        assert found["job_id"] == job_id
@@ -237,11 +248,12 @@ class TestJobProgress:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
        job_result = manager.create_job(
            "configs/test.json",
            ["2025-01-16", "2025-01-17"],
            ["gpt-5"]
        )
        job_id = job_result["job_id"]

        progress = manager.get_job_progress(job_id)
        assert progress["total_model_days"] == 2
@@ -254,11 +266,12 @@ class TestJobProgress:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
        job_result = manager.create_job(
            "configs/test.json",
            ["2025-01-16"],
            ["gpt-5"]
        )
        job_id = job_result["job_id"]

        manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")

@@ -270,11 +283,12 @@ class TestJobProgress:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job(
        job_result = manager.create_job(
            "configs/test.json",
            ["2025-01-16"],
            ["gpt-5", "claude-3.7-sonnet"]
        )
        job_id = job_result["job_id"]

        manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")

@@ -311,7 +325,8 @@ class TestConcurrencyControl:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job_result = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job_id = job_result["job_id"]
        manager.update_job_status(job_id, "running")

        assert manager.can_start_new_job() is False
@@ -321,7 +336,8 @@ class TestConcurrencyControl:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job_result = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job_id = job_result["job_id"]
        manager.update_job_status(job_id, "completed")

        assert manager.can_start_new_job() is True
@@ -331,13 +347,15 @@ class TestConcurrencyControl:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job1_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job1_result = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job1_id = job1_result["job_id"]

        # Complete first job
        manager.update_job_status(job1_id, "completed")

        # Create second job
        job2_id = manager.create_job("configs/test.json", ["2025-01-17"], ["gpt-5"])
        job2_result = manager.create_job("configs/test.json", ["2025-01-17"], ["gpt-5"])
        job2_id = job2_result["job_id"]

        running = manager.get_running_jobs()
        assert len(running) == 1
@@ -368,12 +386,13 @@ class TestJobCleanup:
        conn.close()

        # Create recent job
        recent_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        recent_result = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        recent_id = recent_result["job_id"]

        # Cleanup jobs older than 30 days
        result = manager.cleanup_old_jobs(days=30)
        cleanup_result = manager.cleanup_old_jobs(days=30)

        assert result["jobs_deleted"] == 1
        assert cleanup_result["jobs_deleted"] == 1
        assert manager.get_job("old-job") is None
        assert manager.get_job(recent_id) is not None

@@ -387,7 +406,8 @@ class TestJobUpdateOperations:
        from api.job_manager import JobManager

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job_result = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job_id = job_result["job_id"]

        manager.update_job_status(job_id, "failed", error="MCP service unavailable")

@@ -401,7 +421,8 @@ class TestJobUpdateOperations:
        import time

        manager = JobManager(db_path=clean_db)
        job_id = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job_result = manager.create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
        job_id = job_result["job_id"]

        # Start
        manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
@@ -432,11 +453,12 @@ class TestJobWarnings:
        job_manager = JobManager(db_path=clean_db)

        # Create a job
        job_id = job_manager.create_job(
        job_result = job_manager.create_job(
            config_path="config.json",
            date_range=["2025-10-01"],
            models=["gpt-5"]
        )
        job_id = job_result["job_id"]
|
||||
|
||||
# Add warnings
|
||||
warnings = ["Rate limit reached", "Skipped 2 dates"]
|
||||
@@ -448,4 +470,172 @@ class TestJobWarnings:
|
||||
assert stored_warnings == warnings
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
class TestStaleJobCleanup:
|
||||
"""Test cleanup of stale jobs from container restarts."""
|
||||
|
||||
def test_cleanup_stale_pending_job(self, clean_db):
|
||||
"""Should mark pending job as failed with no progress."""
|
||||
from api.job_manager import JobManager
|
||||
|
||||
manager = JobManager(db_path=clean_db)
|
||||
job_result = manager.create_job(
|
||||
config_path="configs/test.json",
|
||||
date_range=["2025-01-16", "2025-01-17"],
|
||||
models=["gpt-5"]
|
||||
)
|
||||
job_id = job_result["job_id"]
|
||||
|
||||
# Job is pending - simulate container restart
|
||||
result = manager.cleanup_stale_jobs()
|
||||
|
||||
assert result["jobs_cleaned"] == 1
|
||||
job = manager.get_job(job_id)
|
||||
assert job["status"] == "failed"
|
||||
assert "container restart" in job["error"].lower()
|
||||
assert "pending" in job["error"]
|
||||
assert "no progress" in job["error"]
|
||||
|
||||
def test_cleanup_stale_running_job_with_partial_progress(self, clean_db):
|
||||
"""Should mark running job as partial if some model-days completed."""
|
||||
from api.job_manager import JobManager
|
||||
|
||||
manager = JobManager(db_path=clean_db)
|
||||
job_result = manager.create_job(
|
||||
config_path="configs/test.json",
|
||||
date_range=["2025-01-16", "2025-01-17"],
|
||||
models=["gpt-5"]
|
||||
)
|
||||
job_id = job_result["job_id"]
|
||||
|
||||
# Mark job as running and complete one model-day
|
||||
manager.update_job_status(job_id, "running")
|
||||
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")
|
||||
|
||||
# Simulate container restart
|
||||
result = manager.cleanup_stale_jobs()
|
||||
|
||||
assert result["jobs_cleaned"] == 1
|
||||
job = manager.get_job(job_id)
|
||||
assert job["status"] == "partial"
|
||||
assert "container restart" in job["error"].lower()
|
||||
assert "1/2" in job["error"] # 1 out of 2 model-days completed
|
||||
|
||||
def test_cleanup_stale_downloading_data_job(self, clean_db):
|
||||
"""Should mark downloading_data job as failed."""
|
||||
from api.job_manager import JobManager
|
||||
|
||||
manager = JobManager(db_path=clean_db)
|
||||
job_result = manager.create_job(
|
||||
config_path="configs/test.json",
|
||||
date_range=["2025-01-16"],
|
||||
models=["gpt-5"]
|
||||
)
|
||||
job_id = job_result["job_id"]
|
||||
|
||||
# Mark as downloading data
|
||||
manager.update_job_status(job_id, "downloading_data")
|
||||
|
||||
# Simulate container restart
|
||||
result = manager.cleanup_stale_jobs()
|
||||
|
||||
assert result["jobs_cleaned"] == 1
|
||||
job = manager.get_job(job_id)
|
||||
assert job["status"] == "failed"
|
||||
assert "downloading_data" in job["error"]
|
||||
|
||||
def test_cleanup_marks_incomplete_job_details_as_failed(self, clean_db):
|
||||
"""Should mark incomplete job_details as failed."""
|
||||
from api.job_manager import JobManager
|
||||
|
||||
manager = JobManager(db_path=clean_db)
|
||||
job_result = manager.create_job(
|
||||
config_path="configs/test.json",
|
||||
date_range=["2025-01-16", "2025-01-17"],
|
||||
models=["gpt-5"]
|
||||
)
|
||||
job_id = job_result["job_id"]
|
||||
|
||||
# Mark job as running, one detail running, one pending
|
||||
manager.update_job_status(job_id, "running")
|
||||
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "running")
|
||||
|
||||
# Simulate container restart
|
||||
manager.cleanup_stale_jobs()
|
||||
|
||||
# Check job_details were marked as failed
|
||||
progress = manager.get_job_progress(job_id)
|
||||
assert progress["failed"] == 2 # Both model-days marked failed
|
||||
assert progress["pending"] == 0
|
||||
|
||||
details = manager.get_job_details(job_id)
|
||||
for detail in details:
|
||||
assert detail["status"] == "failed"
|
||||
assert "container restarted" in detail["error"].lower()
|
||||
|
||||
def test_cleanup_no_stale_jobs(self, clean_db):
|
||||
"""Should report 0 cleaned jobs when none are stale."""
|
||||
from api.job_manager import JobManager
|
||||
|
||||
manager = JobManager(db_path=clean_db)
|
||||
job_result = manager.create_job(
|
||||
config_path="configs/test.json",
|
||||
date_range=["2025-01-16"],
|
||||
models=["gpt-5"]
|
||||
)
|
||||
job_id = job_result["job_id"]
|
||||
|
||||
# Complete the job
|
||||
manager.update_job_detail_status(job_id, "2025-01-16", "gpt-5", "completed")
|
||||
|
||||
# Simulate container restart
|
||||
result = manager.cleanup_stale_jobs()
|
||||
|
||||
assert result["jobs_cleaned"] == 0
|
||||
job = manager.get_job(job_id)
|
||||
assert job["status"] == "completed"
|
||||
|
||||
def test_cleanup_multiple_stale_jobs(self, clean_db):
|
||||
"""Should clean up multiple stale jobs."""
|
||||
from api.job_manager import JobManager
|
||||
|
||||
manager = JobManager(db_path=clean_db)
|
||||
|
||||
# Create first job
|
||||
job1_result = manager.create_job(
|
||||
config_path="configs/test.json",
|
||||
date_range=["2025-01-16"],
|
||||
models=["gpt-5"]
|
||||
)
|
||||
job1_id = job1_result["job_id"]
|
||||
manager.update_job_status(job1_id, "running")
|
||||
manager.update_job_status(job1_id, "completed")
|
||||
|
||||
# Create second job (pending)
|
||||
job2_result = manager.create_job(
|
||||
config_path="configs/test.json",
|
||||
date_range=["2025-01-17"],
|
||||
models=["gpt-5"]
|
||||
)
|
||||
job2_id = job2_result["job_id"]
|
||||
|
||||
# Create third job (running)
|
||||
manager.update_job_status(job2_id, "completed")
|
||||
job3_result = manager.create_job(
|
||||
config_path="configs/test.json",
|
||||
date_range=["2025-01-18"],
|
||||
models=["gpt-5"]
|
||||
)
|
||||
job3_id = job3_result["job_id"]
|
||||
manager.update_job_status(job3_id, "running")
|
||||
|
||||
# Simulate container restart
|
||||
result = manager.cleanup_stale_jobs()
|
||||
|
||||
assert result["jobs_cleaned"] == 1 # Only job3 is running
|
||||
assert manager.get_job(job1_id)["status"] == "completed"
|
||||
assert manager.get_job(job2_id)["status"] == "completed"
|
||||
assert manager.get_job(job3_id)["status"] == "failed"
|
||||
|
||||
|
||||
# Coverage target: 95%+ for api/job_manager.py
|
||||
|
||||
tests/unit/test_job_manager_duplicate_detection.py (new file, 256 lines)
@@ -0,0 +1,256 @@
+"""Test duplicate detection in job creation."""
+import pytest
+import tempfile
+import os
+from pathlib import Path
+from api.job_manager import JobManager
+
+
+@pytest.fixture
+def temp_db():
+    """Create temporary database for testing."""
+    fd, path = tempfile.mkstemp(suffix='.db')
+    os.close(fd)
+
+    # Initialize schema
+    from api.database import get_db_connection
+    conn = get_db_connection(path)
+    cursor = conn.cursor()
+
+    # Create jobs table
+    cursor.execute("""
+        CREATE TABLE IF NOT EXISTS jobs (
+            job_id TEXT PRIMARY KEY,
+            config_path TEXT NOT NULL,
+            status TEXT NOT NULL,
+            date_range TEXT NOT NULL,
+            models TEXT NOT NULL,
+            created_at TEXT NOT NULL,
+            started_at TEXT,
+            updated_at TEXT,
+            completed_at TEXT,
+            total_duration_seconds REAL,
+            error TEXT,
+            warnings TEXT
+        )
+    """)
+
+    # Create job_details table
+    cursor.execute("""
+        CREATE TABLE IF NOT EXISTS job_details (
+            id INTEGER PRIMARY KEY AUTOINCREMENT,
+            job_id TEXT NOT NULL,
+            date TEXT NOT NULL,
+            model TEXT NOT NULL,
+            status TEXT NOT NULL,
+            started_at TEXT,
+            completed_at TEXT,
+            duration_seconds REAL,
+            error TEXT,
+            FOREIGN KEY (job_id) REFERENCES jobs(job_id) ON DELETE CASCADE,
+            UNIQUE(job_id, date, model)
+        )
+    """)
+
+    conn.commit()
+    conn.close()
+
+    yield path
+
+    # Cleanup
+    if os.path.exists(path):
+        os.remove(path)
+
+
+def test_create_job_with_filter_skips_completed_simulations(temp_db):
+    """Test that job creation with model_day_filter skips already-completed pairs."""
+    manager = JobManager(db_path=temp_db)
+
+    # Create first job and mark model-day as completed
+    result_1 = manager.create_job(
+        config_path="test_config.json",
+        date_range=["2025-10-15", "2025-10-16"],
+        models=["deepseek-chat-v3.1"],
+        model_day_filter=[("deepseek-chat-v3.1", "2025-10-15")]
+    )
+    job_id_1 = result_1["job_id"]
+
+    # Mark as completed
+    manager.update_job_detail_status(
+        job_id_1,
+        "2025-10-15",
+        "deepseek-chat-v3.1",
+        "completed"
+    )
+
+    # Try to create second job with overlapping date
+    model_day_filter = [
+        ("deepseek-chat-v3.1", "2025-10-15"),  # Already completed
+        ("deepseek-chat-v3.1", "2025-10-16")  # Not yet completed
+    ]
+
+    result_2 = manager.create_job(
+        config_path="test_config.json",
+        date_range=["2025-10-15", "2025-10-16"],
+        models=["deepseek-chat-v3.1"],
+        model_day_filter=model_day_filter
+    )
+    job_id_2 = result_2["job_id"]
+
+    # Get job details for second job
+    details = manager.get_job_details(job_id_2)
+
+    # Should only have 2025-10-16 (2025-10-15 was skipped as already completed)
+    assert len(details) == 1
+    assert details[0]["date"] == "2025-10-16"
+    assert details[0]["model"] == "deepseek-chat-v3.1"
+
+
+def test_create_job_without_filter_skips_all_completed_simulations(temp_db):
+    """Test that job creation without filter skips all completed model-day pairs."""
+    manager = JobManager(db_path=temp_db)
+
+    # Create first job and complete some model-days
+    result_1 = manager.create_job(
+        config_path="test_config.json",
+        date_range=["2025-10-15"],
+        models=["model-a", "model-b"]
+    )
+    job_id_1 = result_1["job_id"]
+
+    # Mark model-a/2025-10-15 as completed
+    manager.update_job_detail_status(job_id_1, "2025-10-15", "model-a", "completed")
+    # Mark model-b/2025-10-15 as failed to complete the job
+    manager.update_job_detail_status(job_id_1, "2025-10-15", "model-b", "failed")
+
+    # Create second job with same date range and models
+    result_2 = manager.create_job(
+        config_path="test_config.json",
+        date_range=["2025-10-15", "2025-10-16"],
+        models=["model-a", "model-b"]
+    )
+    job_id_2 = result_2["job_id"]
+
+    # Get job details for second job
+    details = manager.get_job_details(job_id_2)
+
+    # Should have 3 entries (skip only completed model-a/2025-10-15):
+    # - model-b/2025-10-15 (failed in job 1, so not skipped - retry)
+    # - model-a/2025-10-16 (new date)
+    # - model-b/2025-10-16 (new date)
+    assert len(details) == 3
+
+    dates_models = [(d["date"], d["model"]) for d in details]
+    assert ("2025-10-15", "model-a") not in dates_models  # Skipped (completed)
+    assert ("2025-10-15", "model-b") in dates_models  # NOT skipped (failed, not completed)
+    assert ("2025-10-16", "model-a") in dates_models
+    assert ("2025-10-16", "model-b") in dates_models
+
+
+def test_create_job_returns_warnings_for_skipped_simulations(temp_db):
+    """Test that skipped simulations are returned as warnings."""
+    manager = JobManager(db_path=temp_db)
+
+    # Create and complete first simulation
+    result_1 = manager.create_job(
+        config_path="test_config.json",
+        date_range=["2025-10-15"],
+        models=["model-a"]
+    )
+    job_id_1 = result_1["job_id"]
+    manager.update_job_detail_status(job_id_1, "2025-10-15", "model-a", "completed")
+
+    # Try to create job with overlapping date (one completed, one new)
+    result = manager.create_job(
+        config_path="test_config.json",
+        date_range=["2025-10-15", "2025-10-16"],  # Add new date
+        models=["model-a"]
+    )
+
+    # Result should be a dict with job_id and warnings
+    assert isinstance(result, dict)
+    assert "job_id" in result
+    assert "warnings" in result
+    assert len(result["warnings"]) == 1
+    assert "model-a" in result["warnings"][0]
+    assert "2025-10-15" in result["warnings"][0]
+
+    # Verify job_details only has the new date
+    details = manager.get_job_details(result["job_id"])
+    assert len(details) == 1
+    assert details[0]["date"] == "2025-10-16"
+
+
+def test_create_job_raises_error_when_all_simulations_completed(temp_db):
+    """Test that ValueError is raised when ALL requested simulations are already completed."""
+    manager = JobManager(db_path=temp_db)
+
+    # Create and complete first simulation
+    result_1 = manager.create_job(
+        config_path="test_config.json",
+        date_range=["2025-10-15", "2025-10-16"],
+        models=["model-a", "model-b"]
+    )
+    job_id_1 = result_1["job_id"]
+
+    # Mark all model-days as completed
+    manager.update_job_detail_status(job_id_1, "2025-10-15", "model-a", "completed")
+    manager.update_job_detail_status(job_id_1, "2025-10-15", "model-b", "completed")
+    manager.update_job_detail_status(job_id_1, "2025-10-16", "model-a", "completed")
+    manager.update_job_detail_status(job_id_1, "2025-10-16", "model-b", "completed")
+
+    # Try to create job with same date range and models (all already completed)
+    with pytest.raises(ValueError) as exc_info:
+        manager.create_job(
+            config_path="test_config.json",
+            date_range=["2025-10-15", "2025-10-16"],
+            models=["model-a", "model-b"]
+        )
+
+    # Verify error message contains expected text
+    error_message = str(exc_info.value)
+    assert "All requested simulations are already completed" in error_message
+    assert "Skipped 4 model-day pair(s)" in error_message
+
+
+def test_create_job_with_skip_completed_false_includes_all_simulations(temp_db):
+    """Test that skip_completed=False includes ALL simulations, even already-completed ones."""
+    manager = JobManager(db_path=temp_db)
+
+    # Create first job and complete some model-days
+    result_1 = manager.create_job(
+        config_path="test_config.json",
+        date_range=["2025-10-15", "2025-10-16"],
+        models=["model-a", "model-b"]
+    )
+    job_id_1 = result_1["job_id"]
+
+    # Mark all model-days as completed
+    manager.update_job_detail_status(job_id_1, "2025-10-15", "model-a", "completed")
+    manager.update_job_detail_status(job_id_1, "2025-10-15", "model-b", "completed")
+    manager.update_job_detail_status(job_id_1, "2025-10-16", "model-a", "completed")
+    manager.update_job_detail_status(job_id_1, "2025-10-16", "model-b", "completed")
+
+    # Create second job with skip_completed=False
+    result_2 = manager.create_job(
+        config_path="test_config.json",
+        date_range=["2025-10-15", "2025-10-16"],
+        models=["model-a", "model-b"],
+        skip_completed=False
+    )
+    job_id_2 = result_2["job_id"]
+
+    # Get job details for second job
+    details = manager.get_job_details(job_id_2)
+
+    # Should have ALL 4 model-day pairs (no skipping)
+    assert len(details) == 4
+
+    dates_models = [(d["date"], d["model"]) for d in details]
+    assert ("2025-10-15", "model-a") in dates_models
+    assert ("2025-10-15", "model-b") in dates_models
+    assert ("2025-10-16", "model-a") in dates_models
+    assert ("2025-10-16", "model-b") in dates_models
+
+    # Verify no warnings were returned
+    assert result_2.get("warnings") == []
@@ -41,11 +41,12 @@ class TestSkipStatusDatabase:
     def test_skipped_status_allowed_in_job_details(self, job_manager):
         """Test job_details accepts 'skipped' status without constraint violation."""
         # Create job
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-01", "2025-10-02"],
             models=["test-model"]
         )
+        job_id = job_result["job_id"]
 
         # Mark a detail as skipped - should not raise constraint violation
         job_manager.update_job_detail_status(
@@ -70,11 +71,12 @@ class TestJobCompletionWithSkipped:
    def test_job_completes_with_all_dates_skipped(self, job_manager):
         """Test job transitions to completed when all dates are skipped."""
         # Create job with 3 dates
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-01", "2025-10-02", "2025-10-03"],
             models=["test-model"]
         )
+        job_id = job_result["job_id"]
 
         # Mark all as skipped
         for date in ["2025-10-01", "2025-10-02", "2025-10-03"]:
@@ -93,11 +95,12 @@ class TestJobCompletionWithSkipped:
 
     def test_job_completes_with_mixed_completed_and_skipped(self, job_manager):
         """Test job completes when some dates completed, some skipped."""
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-01", "2025-10-02", "2025-10-03"],
             models=["test-model"]
         )
+        job_id = job_result["job_id"]
 
         # Mark some completed, some skipped
         job_manager.update_job_detail_status(
@@ -119,11 +122,12 @@ class TestJobCompletionWithSkipped:
 
     def test_job_partial_with_mixed_completed_failed_skipped(self, job_manager):
         """Test job status 'partial' when some failed, some completed, some skipped."""
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-01", "2025-10-02", "2025-10-03"],
             models=["test-model"]
         )
+        job_id = job_result["job_id"]
 
         # Mix of statuses
         job_manager.update_job_detail_status(
@@ -145,11 +149,12 @@ class TestJobCompletionWithSkipped:
 
     def test_job_remains_running_with_pending_dates(self, job_manager):
         """Test job stays running when some dates are still pending."""
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-01", "2025-10-02", "2025-10-03"],
             models=["test-model"]
         )
+        job_id = job_result["job_id"]
 
         # Only mark some as terminal states
         job_manager.update_job_detail_status(
@@ -173,11 +178,12 @@ class TestProgressTrackingWithSkipped:
 
     def test_progress_includes_skipped_count(self, job_manager):
         """Test get_job_progress returns skipped count."""
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-01", "2025-10-02", "2025-10-03", "2025-10-04"],
             models=["test-model"]
         )
+        job_id = job_result["job_id"]
 
         # Set various statuses
         job_manager.update_job_detail_status(
@@ -205,11 +211,12 @@ class TestProgressTrackingWithSkipped:
 
     def test_progress_all_skipped(self, job_manager):
         """Test progress when all dates are skipped."""
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-01", "2025-10-02"],
             models=["test-model"]
         )
+        job_id = job_result["job_id"]
 
         # Mark all as skipped
         for date in ["2025-10-01", "2025-10-02"]:
@@ -231,11 +238,12 @@ class TestMultiModelSkipHandling:
 
     def test_different_models_different_skip_states(self, job_manager):
         """Test that different models can have different skip states for same date."""
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-01", "2025-10-02"],
             models=["model-a", "model-b"]
         )
+        job_id = job_result["job_id"]
 
         # Model A: 10/1 skipped (already completed), 10/2 completed
         job_manager.update_job_detail_status(
@@ -276,11 +284,12 @@ class TestMultiModelSkipHandling:
 
     def test_job_completes_with_per_model_skips(self, job_manager):
         """Test job completes when different models have different skip patterns."""
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-01", "2025-10-02"],
             models=["model-a", "model-b"]
         )
+        job_id = job_result["job_id"]
 
         # Model A: one skipped, one completed
         job_manager.update_job_detail_status(
@@ -318,11 +327,12 @@ class TestSkipReasons:
 
     def test_skip_reason_already_completed(self, job_manager):
         """Test 'Already completed' skip reason is stored."""
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-01"],
             models=["test-model"]
         )
+        job_id = job_result["job_id"]
 
         job_manager.update_job_detail_status(
             job_id=job_id, date="2025-10-01", model="test-model",
@@ -334,11 +344,12 @@ class TestSkipReasons:
 
     def test_skip_reason_incomplete_price_data(self, job_manager):
         """Test 'Incomplete price data' skip reason is stored."""
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="test_config.json",
             date_range=["2025-10-04"],
             models=["test-model"]
         )
+        job_id = job_result["job_id"]
 
         job_manager.update_job_detail_status(
             job_id=job_id, date="2025-10-04", model="test-model",
@@ -112,11 +112,12 @@ class TestModelDayExecutorExecution:
 
         # Create job and job_detail
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path=str(config_path),
             date_range=["2025-01-16"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         # Mock agent execution
         mock_agent = create_mock_agent(
@@ -156,11 +157,12 @@ class TestModelDayExecutorExecution:
 
         # Create job
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         # Mock agent to raise error
         with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
@@ -212,11 +214,12 @@ class TestModelDayExecutorDataPersistence:
 
         # Create job
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path=str(config_path),
             date_range=["2025-01-16"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         # Mock successful execution (no trades)
         mock_agent = create_mock_agent(
@@ -269,11 +272,12 @@ class TestModelDayExecutorDataPersistence:
 
         # Create job
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         # Mock execution with reasoning
         mock_agent = create_mock_agent(
@@ -320,11 +324,12 @@ class TestModelDayExecutorCleanup:
         from api.job_manager import JobManager
 
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         mock_agent = create_mock_agent(
             session_result={"success": True}
@@ -355,11 +360,12 @@ class TestModelDayExecutorCleanup:
         from api.job_manager import JobManager
 
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         with patch("api.model_day_executor.RuntimeConfigManager") as mock_runtime:
             mock_instance = Mock()
@@ -41,11 +41,12 @@ class TestSimulationWorkerExecution:
 
         # Create job with 2 dates and 2 models = 4 model-days
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16", "2025-01-17"],
             models=["gpt-5", "claude-3.7-sonnet"]
         )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=clean_db)
 
@@ -73,11 +74,12 @@ class TestSimulationWorkerExecution:
         from api.job_manager import JobManager
 
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16", "2025-01-17"],
             models=["gpt-5", "claude-3.7-sonnet"]
         )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=clean_db)
 
@@ -118,11 +120,12 @@ class TestSimulationWorkerExecution:
         from api.job_manager import JobManager
 
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=clean_db)
 
@@ -159,11 +162,12 @@ class TestSimulationWorkerExecution:
         from api.job_manager import JobManager
 
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16"],
             models=["gpt-5", "claude-3.7-sonnet"]
         )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=clean_db)
 
@@ -214,11 +218,12 @@ class TestSimulationWorkerErrorHandling:
         from api.job_manager import JobManager
 
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16"],
             models=["gpt-5", "claude-3.7-sonnet", "gemini"]
         )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=clean_db)
 
@@ -259,11 +264,12 @@ class TestSimulationWorkerErrorHandling:
         from api.job_manager import JobManager
 
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=clean_db)
 
@@ -289,11 +295,12 @@ class TestSimulationWorkerConcurrency:
         from api.job_manager import JobManager
 
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16"],
             models=["gpt-5", "claude-3.7-sonnet"]
         )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=clean_db)
 
@@ -335,11 +342,12 @@ class TestSimulationWorkerJobRetrieval:
         from api.job_manager import JobManager
 
         manager = JobManager(db_path=clean_db)
-        job_id = manager.create_job(
+        job_result = manager.create_job(
             config_path="configs/test.json",
             date_range=["2025-01-16", "2025-01-17"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=clean_db)
         job_info = worker.get_job_info()
@@ -469,11 +477,12 @@ class TestSimulationWorkerHelperMethods:
         job_manager = JobManager(db_path=db_path)
 
         # Create job
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="config.json",
             date_range=["2025-10-01"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=db_path)
 
@@ -498,11 +507,12 @@ class TestSimulationWorkerHelperMethods:
         job_manager = JobManager(db_path=db_path)
 
         # Create job
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="config.json",
             date_range=["2025-10-01"],
             models=["gpt-5"]
         )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=db_path)
 
@@ -545,11 +555,12 @@ class TestSimulationWorkerHelperMethods:
         initialize_database(db_path)
         job_manager = JobManager(db_path=db_path)
 
-        job_id = job_manager.create_job(
+        job_result = job_manager.create_job(
             config_path="config.json",
             date_range=["2025-10-01"],
             models=["gpt-5"]
        )
+        job_id = job_result["job_id"]
 
         worker = SimulationWorker(job_id=job_id, db_path=db_path)
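The one change repeated throughout this diff is a contract change: `JobManager.create_job` no longer returns the job id as a bare string but a dict carrying the id plus warnings about skipped model-day pairs, so every call site now unpacks `job_result["job_id"]`. A minimal sketch of the new call pattern (the stub below is hypothetical; the real implementation lives in `api/job_manager.py`):

```python
import uuid

def create_job(config_path, date_range, models, skip_completed=True):
    """Hypothetical stand-in for JobManager.create_job after this change:
    it returns a dict instead of a bare job-id string."""
    return {"job_id": str(uuid.uuid4()), "warnings": []}

# Callers now unpack the id from the result dict, the same two-line
# pattern every test in this diff was updated to use:
job_result = create_job("configs/test.json", ["2025-01-16"], ["gpt-5"])
job_id = job_result["job_id"]
assert isinstance(job_id, str)
assert isinstance(job_result["warnings"], list)
```

Returning a dict lets the API surface non-fatal information (skipped already-completed simulations) without changing the success path, at the cost of touching every caller once.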