358 Commits

Author SHA1 Message Date
2b040537b1 docs: update changelog for v0.5.0 release
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
v0.5.0-alpha.1
2025-11-07 21:04:00 -05:00
14cf88f642 test: improve test coverage from 61% to 84.81%
Major improvements:
- Fixed all 42 broken tests (database connection leaks)
- Added db_connection() context manager for proper cleanup
- Created comprehensive test suites for undertested modules

New test coverage:
- tools/general_tools.py: 26 tests (97% coverage)
- tools/price_tools.py: 11 tests (validates NASDAQ symbols, date handling)
- api/price_data_manager.py: 12 tests (85% coverage)
- api/routes/results_v2.py: 3 tests (98% coverage)
- agent/reasoning_summarizer.py: 2 tests (87% coverage)
- api/routes/period_metrics.py: 2 edge case tests (100% coverage)
- agent/mock_provider: 1 test (100% coverage)

Database fixes:
- Added db_connection() context manager to prevent leaks
- Updated 16+ test files to use context managers
- Fixed drop_all_tables() to match new schema
- Added CHECK constraint for action_type
- Added ON DELETE CASCADE to trading_days foreign key

Test improvements:
- Updated SQL INSERT statements with all required fields
- Fixed date parameter handling in API integration tests
- Added edge case tests for validation functions
- Fixed import errors across test suite

Results:
- Total coverage: 84.81% (was 61%)
- Tests passing: 406 (was 364 with 42 failures)
- Total lines covered: 6364 of 7504

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-07 21:02:38 -05:00
61baf3f90f test: fix remaining integration test for new results endpoint
Update test_results_filters_by_job_id to expect 404 when no data exists,
aligning with the new endpoint behavior where queries with no matching
data return 404 instead of 200 with empty results.

Also add design and implementation plan documents for reference.
2025-11-07 19:46:49 -05:00
dd99912ec7 test: update integration test for new results endpoint behavior 2025-11-07 19:43:57 -05:00
58937774bf test: update e2e test to use new results endpoint parameters 2025-11-07 19:40:15 -05:00
5475ac7e47 docs: add changelog entry for date range support breaking change 2025-11-07 19:36:29 -05:00
ebbd2c35b7 docs: add DEFAULT_RESULTS_LOOKBACK_DAYS environment variable 2025-11-07 19:35:40 -05:00
c62c01e701 docs: update /results endpoint documentation for date range support
Update API_REFERENCE.md to reflect the new date range query functionality
in the /results endpoint:

- Replace 'date' parameter with 'start_date' and 'end_date'
- Document single-date vs date range response formats
- Add period metrics calculations (period return, annualized return)
- Document default behavior (last 30 days)
- Update error responses for new validation rules
- Update Python and TypeScript client examples
- Add edge trimming behavior documentation
2025-11-07 19:34:43 -05:00
2612b85431 feat: implement date range support with period metrics in results endpoint
- Replace deprecated `date` parameter with `start_date`/`end_date`
- Return single-date format (detailed) when dates are equal
- Return range format (lightweight with period metrics) when dates differ
- Add period metrics: period_return_pct, annualized_return_pct, calendar_days, trading_days
- Default to last 30 days when no dates provided
- Group results by model for date range queries
- Add comprehensive test coverage for both response formats
- Implement automatic edge trimming for date ranges
- Add 404 error handling for empty result sets
- Include 422 error for deprecated `date` parameter usage
2025-11-07 19:26:06 -05:00
5c95180941 feat: add date validation and resolution for results endpoint 2025-11-07 19:18:35 -05:00
29c326a31f feat: add period metrics calculation for date range queries 2025-11-07 19:14:10 -05:00
8f09fa5501 release: v0.4.3 - fix cross-job portfolio continuity v0.4.3 2025-11-07 17:02:02 -05:00
31d6818130 fix: enable cross-job portfolio continuity in get_starting_holdings
Remove job_id filter from get_starting_holdings() SQL JOIN to enable
holdings continuity across jobs. This completes the cross-job portfolio
continuity fix started in the previous commit.

Root cause: get_starting_holdings() joined on job_id, preventing it from
finding previous day's holdings when queried from a different job. This
caused starting_position.holdings to be empty in API results for new jobs
even though starting_cash was correctly retrieved.

Changes:
- api/database.py: Remove job_id from JOIN condition in get_starting_holdings()
- tests/unit/test_database_helpers.py: Add test for cross-job holdings retrieval

Together with the previous commit fixing get_previous_trading_day(), this
ensures complete portfolio continuity (both cash and holdings) across jobs.
v0.4.3-alpha.2
2025-11-07 16:38:33 -05:00
4638c073e3 fix: enable cross-job portfolio continuity in get_previous_trading_day
Remove job_id filter from get_previous_trading_day() SQL query to enable
portfolio continuity across jobs. Previously, new jobs would reset to
initial $10,000 cash instead of continuing from previous job's ending
position.

Root cause: get_previous_trading_day() filtered by job_id, while
get_current_position_from_db() correctly queries across all jobs.
This inconsistency caused starting_cash to default to initial_cash
when no previous day was found within the same job.

Changes:
- api/database.py: Remove job_id filter from SQL WHERE clause
- tests/unit/test_database_helpers.py: Add test for cross-job continuity

Fixes position tracking bug where subsequent jobs on consecutive dates
would not recognize previous day's holdings from different job.
v0.4.3-alpha.1
2025-11-07 16:13:28 -05:00
96f61cf347 release: v0.4.2 - fix critical negative cash position bug
Remove debug logging and update CHANGELOG for v0.4.2 release.

Fixed critical bug where trades calculated from initial $10,000 capital
instead of accumulating, allowing over-spending and negative cash balances.

Key changes:
- Extract position dict from CallToolResult.structuredContent
- Enable MCP service logging for better debugging
- Update tests to match production MCP behavior

All tests passing. Ready for production release.
v0.4.2
2025-11-07 15:41:28 -05:00
0eb5fcc940 debug: enable stdout/stderr for MCP services to diagnose parameter injection
MCP services were started with stdout/stderr redirected to DEVNULL, making
debug logs invisible. This prevented diagnosing why _current_position parameter
is not being received by buy() function.

Changed subprocess.Popen to redirect MCP service output to main process
stdout/stderr, allowing [DEBUG buy] logs to be visible in docker logs.

This will help identify whether:
1. _current_position is being sent by ContextInjector but not received
2. MCP HTTP transport filters underscore-prefixed parameters
3. Parameter serialization is failing

Related to negative cash bug where final position shows -$3,049.83 instead
of +$727.92 tracked by ContextInjector.
v0.4.2-alpha.12
2025-11-07 14:56:48 -05:00
bee6afe531 test: update ContextInjector tests to match production MCP behavior
Update unit tests to mock CallToolResult objects instead of plain dicts,
matching actual MCP tool behavior in production.

Changes:
- Add create_mcp_result() helper to create mock CallToolResult objects
- Update all mock handlers to return MCP result objects
- Update assertions to access result.structuredContent field
- Maintains test coverage while accurately reflecting production behavior

This ensures tests validate the actual code path used in production,
where MCP tools return CallToolResult objects with structuredContent
field containing the position dict.
v0.4.2-alpha.11
2025-11-07 14:32:20 -05:00
f1f76b9a99 fix: extract position dict from CallToolResult.structuredContent
Fix negative cash bug where ContextInjector._current_position never updated.

Root cause: MCP tools return mcp.types.CallToolResult objects, not plain
dicts. The isinstance(result, dict) check always failed, preventing
_current_position from accumulating trades within a session.

This caused all trades to calculate from initial $10,000 position instead
of previous trade's ending position, resulting in negative cash balances
when total purchases exceeded $10,000.

Solution: Extract position dict from CallToolResult.structuredContent field
before validating. Maintains backward compatibility by handling both
CallToolResult objects (production) and plain dicts (unit tests).

Impact:
- Fixes negative cash positions (e.g., -$8,768.68 after 11 trades)
- Enables proper intra-day position tracking
- Validates sufficient cash before each trade based on cumulative position
- Trade tool responses now properly accumulate all holdings

Testing:
- All existing unit tests pass (handle plain dict results)
- Production logs confirm structuredContent extraction works
- Debug logging shows _current_position now updates after each trade
v0.4.2-alpha.10
2025-11-07 14:24:48 -05:00
277714f664 debug: add comprehensive logging for position tracking bug investigation
Add debug logging to diagnose negative cash position issue where trades
calculate from initial $10,000 instead of accumulating.

Issue: After 11 trades, final cash shows -$8,768.68. Each trade appears
to calculate from $10,000 starting position instead of previous trade's
ending position.

Hypothesis: ContextInjector._current_position not updating after trades,
possibly due to MCP result type mismatch in isinstance(result, dict) check.

Debug logging added:
- agent/context_injector.py: Log MCP result type, content, and whether
  _current_position updates after each trade
- agent_tools/tool_trade.py: Log whether injected position is used vs
  DB query, and full contents of returned position dict

This will help identify:
1. What type is returned by MCP tool (dict vs other)
2. Whether _current_position is None on subsequent trades
3. What keys are present in returned position dicts

Related to issue where reasoning summary claims no trades executed
despite 4 sell orders being recorded.
v0.4.2-alpha.9
2025-11-07 14:16:30 -05:00
db1341e204 feat: implement replace_existing parameter to allow re-running completed simulations
Add skip_completed parameter to JobManager.create_job() to control duplicate detection:
- When skip_completed=True (default), skips already-completed simulations (existing behavior)
- When skip_completed=False, includes ALL requested simulations regardless of completion status

API endpoint now uses request.replace_existing to control skip_completed parameter:
- replace_existing=false (default): skip_completed=True (skip duplicates)
- replace_existing=true: skip_completed=False (force re-run all simulations)

This allows users to force re-running completed simulations when needed.
v0.4.2-alpha.8
2025-11-07 13:39:51 -05:00
e5b83839ad docs: document duplicate prevention and cross-job continuity
Added documentation for:
- Duplicate simulation prevention in JobManager.create_job()
- Cross-job portfolio continuity in position tracking
- Updated CLAUDE.md with Duplicate Simulation Prevention section
- Updated docs/developer/architecture.md with Position Tracking Across Jobs section
2025-11-07 13:28:26 -05:00
4629bb1522 test: add integration tests for duplicate prevention and cross-job continuity
- Test duplicate simulation detection and skipping
- Test portfolio continuity across multiple jobs
- Verify warnings are returned for skipped simulations
- Use database mocking for isolated test environments
2025-11-07 13:26:34 -05:00
f175139863 fix: enable cross-job portfolio continuity
- Remove job_id filter from get_current_position_from_db()
- Position queries now search across all jobs for the model
- Prevents portfolio reset when new jobs run overlapping dates
- Add test coverage for cross-job position continuity
2025-11-07 13:15:06 -05:00
75a76bbb48 fix: address code review issues for Task 1
- Add test for ValueError when all simulations completed
- Include warnings in API response for user visibility
- Improve error message validation in tests
2025-11-07 13:11:09 -05:00
fbe383772a feat: add duplicate detection to job creation
- Skip already-completed model-day pairs in create_job()
- Return warnings for skipped simulations
- Raise error if all simulations are already completed
- Update create_job() return type from str to Dict[str, Any]
- Update all callers to handle new dict return type
- Add comprehensive test coverage for duplicate detection
- Log warnings when simulations are skipped
2025-11-07 13:03:31 -05:00
406bb281b2 fix: cleanup stale jobs on container restart to unblock new job creation
When a Docker container is shutdown and restarted, jobs with status
'pending', 'downloading_data', or 'running' remained in the database,
preventing new jobs from starting due to concurrency control checks.

This commit adds automatic cleanup of stale jobs during FastAPI startup:

- New cleanup_stale_jobs() method in JobManager (api/job_manager.py:702-779)
- Integrated into FastAPI lifespan startup (api/main.py:164-168)
- Intelligent status determination based on completion percentage:
  - 'partial' if any model-days completed (preserves progress data)
  - 'failed' if no progress made
- Detailed error messages with original status and completion counts
- Marks incomplete job_details as 'failed' with clear error messages
- Deployment-aware: skips cleanup in DEV mode when DB is reset
- Comprehensive logging at warning level for visibility

Testing:
- 6 new unit tests covering all cleanup scenarios (451-609)
- All 30 existing job_manager tests still pass
- Tests verify pending, running, downloading_data, partial progress,
  no stale jobs, and multiple stale jobs scenarios

Resolves issue where container restarts left stale jobs blocking the
can_start_new_job() concurrency check.
v0.4.2-alpha.7
2025-11-06 21:24:45 -05:00
6ddc5abede fix: resolve DeepSeek tool_calls validation errors (production ready)
After extensive systematic debugging, identified and fixed LangChain bug
where parse_tool_call() returns string args instead of dict.

**Root Cause:**
LangChain's parse_tool_call() has intermittent bug returning unparsed
JSON string for 'args' field instead of dict object, violating AIMessage
Pydantic schema.

**Solution:**
ToolCallArgsParsingWrapper provides two-layer fix:
1. Patches parse_tool_call() to detect string args and parse to dict
2. Normalizes non-standard tool_call formats to OpenAI standard

**Implementation:**
- Patches parse_tool_call in langchain_openai.chat_models.base namespace
- Defensive approach: only acts when string args detected
- Handles edge cases: invalid JSON, non-standard formats, invalid_tool_calls
- Minimal performance impact: lightweight type checks
- Thread-safe: patches apply at wrapper initialization

**Testing:**
- Confirmed fix working in production with DeepSeek Chat v3.1
- All tool calls now process successfully without validation errors
- No impact on other AI providers (OpenAI, Anthropic, etc.)

**Impact:**
- Enables DeepSeek models via OpenRouter
- Maintains backward compatibility
- Future-proof against similar issues from other providers

Closes systematic debugging investigation that spanned 6 alpha releases.

Fixes: tool_calls.0.args validation error [type=dict_type, input_type=str]
2025-11-06 20:49:11 -05:00
5c73f30583 fix: patch parse_tool_call bug that returns string args instead of dict
Root cause identified: langchain_core's parse_tool_call() sometimes returns
tool_calls with 'args' as a JSON string instead of parsed dict object.

This violates AIMessage's Pydantic schema which expects args to be dict.

Solution: Wrapper now detects when parse_tool_call returns string args
and immediately converts them to dict using json.loads().

This is a workaround for what appears to be a LangChain bug where
parse_tool_call's json.loads() call either:
1. Fails silently without raising exception, or
2. Succeeds but result is not being assigned to args field

The fix ensures AIMessage always receives properly parsed dict args,
resolving Pydantic validation errors for all DeepSeek tool calls.
v0.4.2-alpha.6
2025-11-06 17:58:41 -05:00
b73d88ca8f fix: normalize DeepSeek non-standard tool_calls format
Systematic debugging revealed DeepSeek returns tool_calls in non-standard
format that bypasses LangChain's parse_tool_call():

**Root Cause:**
- OpenAI standard: {function: {name, arguments}, id}
- DeepSeek format: {name, args, id}
- LangChain's parse_tool_call() returns None when no 'function' key
- Result: Raw tool_call with string args → Pydantic validation error

**Solution:**
- ToolCallArgsParsingWrapper detects non-standard format
- Normalizes to OpenAI standard before LangChain processing
- Converts {name, args, id} → {function: {name, arguments}, id}
- Added diagnostic logging to identify format variations

**Impact:**
- DeepSeek models now work via OpenRouter
- No breaking changes to other providers (defensive design)
- Diagnostic logs help debug future format issues

Fixes validation errors:
  tool_calls.0.args: Input should be a valid dictionary
  [type=dict_type, input_value='{"symbol": "GILD", ...}', input_type=str]
v0.4.2-alpha.5
2025-11-06 17:51:33 -05:00
d199b093c1 debug: patch parse_tool_call to identify source of string args
Added global monkey-patch of langchain_core's parse_tool_call to log
the type of 'args' it returns. This will definitively show whether:
1. parse_tool_call is returning string args (bug in langchain_core)
2. Something else is modifying the result after parse_tool_call returns
3. AIMessage construction is getting tool_calls from a different source

This is the critical diagnostic to find the root cause.
v0.4.2-alpha.4
2025-11-06 17:42:33 -05:00
483621f9b7 debug: add comprehensive diagnostics to trace error location
Adding detailed logging to:
1. Show call stack when _create_chat_result is called
2. Verify our wrapper is being executed
3. Check result after _convert_dict_to_message processes tool_calls
4. Identify exact point where string args become the problem

This will help determine if error occurs during response processing
or if there's a separate code path bypassing our wrapper.
v0.4.2-alpha.3
2025-11-06 12:10:29 -05:00
e8939be04e debug: enhance diagnostic logging to detect args field in tool_calls
Added more detailed logging to identify if DeepSeek responses include
both 'function.arguments' and 'args' fields, or if tool_calls are
objects vs dicts, to understand why parse_tool_call isn't converting
string args to dict as expected.
v0.4.2-alpha.2
2025-11-06 12:00:08 -05:00
2e0cf4d507 docs: add v0.5.0 roadmap for performance metrics and status APIs
Added new pre-v1.0 release (v0.5.0) with two new API endpoints:

1. Performance Metrics API (GET /metrics/performance)
   - Query model performance over custom date ranges
   - Returns total return, trade count, win rate, daily P&L stats
   - Enables model comparison and strategy evaluation

2. Status & Coverage Endpoint (GET /status)
   - Comprehensive system status in single endpoint
   - Price data coverage (symbols, date ranges, gaps)
   - Model simulation progress (date ranges, completion %)
   - System health (database, MCP services, disk usage)

Updated version history:
- Added v0.4.0 (current release)
- Added v0.5.0 (planned)
- Renamed v1.3.0 to "Advanced performance metrics"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-06 11:41:21 -05:00
7b35394ce7 fix: normalize DeepSeek non-standard tool_calls format
Systematic debugging revealed DeepSeek returns tool_calls in non-standard
format that bypasses LangChain's parse_tool_call():

**Root Cause:**
- OpenAI standard: {function: {name, arguments}, id}
- DeepSeek format: {name, args, id}
- LangChain's parse_tool_call() returns None when no 'function' key
- Result: Raw tool_call with string args → Pydantic validation error

**Solution:**
- ToolCallArgsParsingWrapper detects non-standard format
- Normalizes to OpenAI standard before LangChain processing
- Converts {name, args, id} → {function: {name, arguments}, id}
- Added diagnostic logging to identify format variations

**Impact:**
- DeepSeek models now work via OpenRouter
- No breaking changes to other providers (defensive design)
- Diagnostic logs help debug future format issues

Fixes validation errors:
  tool_calls.0.args: Input should be a valid dictionary
  [type=dict_type, input_value='{"symbol": "GILD", ...}', input_type=str]
v0.4.2-alpha.1
2025-11-06 11:38:35 -05:00
2d41717b2b docs: update v0.4.1 changelog (IF_TRADE fix only)
Reverted ChatDeepSeek integration approach as it conflicts with
OpenRouter unified gateway architecture.

The system uses OPENAI_API_BASE (OpenRouter) with a single
OPENAI_API_KEY for all AI providers, not direct provider connections.

v0.4.1 now only includes the IF_TRADE initialization fix.
v0.4.1 v0.4.1-alpha.4
2025-11-06 11:20:22 -05:00
7c4874715b fix: initialize IF_TRADE to True (trades expected by default)
Root cause: IF_TRADE was initialized to False and never updated when
trades executed, causing 'No trading' message to always display.

Design documents (2025-02-11-complete-schema-migration) specify
IF_TRADE should start as True, with trades setting it to False only
after completion.

Fixes sporadic issue where all trading sessions reported 'No trading'
despite successful buy/sell actions.
2025-11-06 07:33:33 -05:00
6d30244fc9 test: remove wrapper entirely to test if it's causing issues
Hypothesis: The ToolCallArgsParsingWrapper might be interfering with
LangChain's tool binding or response parsing in unexpected ways.

Testing with direct ChatOpenAI usage (no wrapper) to see if errors persist.

This is Phase 3 of systematic debugging - testing minimal change hypothesis.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
v0.4.1-alpha.2
2025-11-05 21:26:20 -05:00
0641ce554a fix: remove incorrect tool_calls conversion logic
Systematic debugging revealed the root cause of Pydantic validation errors:
- DeepSeek correctly returns tool_calls.arguments as JSON strings
- My wrapper was incorrectly converting strings to dicts
- This caused LangChain's parse_tool_call() to fail (json.loads(dict) error)
- Failure created invalid_tool_calls with dict args (should be string)
- Result: Pydantic validation error on invalid_tool_calls

Solution: Remove all conversion logic. DeepSeek format is already correct.

ToolCallArgsParsingWrapper now acts as a simple passthrough proxy.
Trading session completes successfully with no errors.

Fixes the systematic-debugging investigation that identified the
issue was in our fix attempt, not in the original API response.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
v0.4.1-alpha.1
2025-11-05 21:18:54 -05:00
0c6de5b74b debug: remove conversion logic to see original response structure
Removed all argument conversion code to see what DeepSeek actually returns.
This will help identify if the problem is with our conversion or with the
original API response format.

Phase 1 continued - gathering evidence about original response structure.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 21:12:48 -05:00
0f49977700 debug: add diagnostic logging to understand response structure
Added detailed logging to patched_create_chat_result to investigate why
invalid_tool_calls.args conversion is not working. This will show:
- Response structure and keys
- Whether invalid_tool_calls exists
- Type and value of args before/after conversion
- Whether conversion is actually executing

This is Phase 1 (Root Cause Investigation) of systematic debugging.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 21:08:11 -05:00
27a824f4a6 fix: handle invalid_tool_calls args normalization for DeepSeek
Extended ToolCallArgsParsingWrapper to handle both tool_calls and
invalid_tool_calls args formatting inconsistencies from DeepSeek:

- tool_calls.args: string -> dict (for successful calls)
- invalid_tool_calls.args: dict -> string (for failed calls)

The wrapper now normalizes both types before AIMessage construction,
preventing Pydantic validation errors in both success and error cases.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 21:03:48 -05:00
3e50868a4d fix: resolve DeepSeek tool_calls args parsing validation error
Added ToolCallArgsParsingWrapper to handle AI providers (like DeepSeek)
that return tool_calls.args as JSON strings instead of dictionaries.

The wrapper monkey-patches ChatOpenAI's _create_chat_result method to
parse string arguments before AIMessage construction, preventing
Pydantic validation errors.

Changes:
- New: agent/chat_model_wrapper.py - Wrapper implementation
- Modified: agent/base_agent/base_agent.py - Wrap model during init
- Modified: CHANGELOG.md - Document fix as v0.4.1
- New: tests/unit/test_chat_model_wrapper.py - Unit tests

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 20:57:17 -05:00
e20dce7432 fix: enable intra-day position tracking for sell-then-buy trades
Resolves issue where sell proceeds were not immediately available for
subsequent buy orders within the same trading session.

Problem:
- Both buy() and sell() independently queried database for starting position
- Multiple trades within same day all saw pre-trade cash balance
- Agents couldn't rebalance portfolios (sell + buy) in single session

Solution:
- ContextInjector maintains in-memory position state during trading session
- Position updates accumulate after each successful trade
- Position state injected into buy/sell via _current_position parameter
- Reset position state at start of each trading day

Changes:
- agent/context_injector.py: Add position tracking with reset_position()
- agent_tools/tool_trade.py: Accept _current_position in buy/sell functions
- agent/base_agent/base_agent.py: Reset position state daily
- tests: Add 13 comprehensive tests for position tracking

All new tests pass. Backward compatible, no schema changes required.
v0.4.0 v0.4.0-alpha.12
2025-11-05 06:56:54 -05:00
462de3adeb fix: extract tool messages before checking FINISH_SIGNAL
**Critical Bug:**
When agent returns FINISH_SIGNAL, the code breaks immediately (line 640)
BEFORE extracting tool messages (lines 642-650). This caused tool messages
to never be captured when agent completes in single step.

**Timeline:**
1. Agent calls buy tools (MSFT, AMZN, NVDA)
2. Agent returns response with <FINISH_SIGNAL>
3. Code detects signal → break (line 640)
4. Lines 642-650 NEVER EXECUTE
5. Tool messages not captured → summarizer sees 0 tools

**Evidence from logs:**
- Console: 'Bought NVDA 10 shares'
- API: 3 trades executed (MSFT 5, AMZN 15, NVDA 10)
- Debug: 'Tool messages: 0' 

**Fix:**
Move tool extraction BEFORE stop signal check.
Agent can call tools AND return FINISH_SIGNAL in same response,
so we must process tools first.

**Impact:**
Now tool messages will be captured even when agent finishes in
single step. Summarizer will see actual trades executed.

This is the true root cause of empty tool messages in conversation_history.
v0.4.0-alpha.11
2025-11-05 00:57:22 -05:00
31e346ecbb debug: add logging to verify conversation history capture
Added debug output to confirm:
- How many messages are in conversation_history
- How many assistant vs tool messages
- Preview of first assistant message content
- What the summarizer receives

This will verify that the full detailed reasoning (like portfolio
analysis, trade execution details) is being captured and passed
to the summarizer.

Output will show:
[DEBUG] Generating summary from N messages
[DEBUG] Assistant messages: X, Tool messages: Y
[DEBUG] First assistant message preview: ...
[DEBUG ReasoningSummarizer] Formatting N messages
[DEBUG ReasoningSummarizer] Breakdown: X assistant, Y tool
v0.4.0-alpha.10
2025-11-05 00:46:30 -05:00
abb9cd0726 fix: capture tool messages in conversation history for summarizer
**Root Cause:**
The summarizer was not receiving tool execution results (buy/sell trades)
because they were never captured to conversation_history.

**What was captured:**
- User: 'Please analyze positions'
- Assistant: 'I will buy/sell...'
- Assistant: 'Done <FINISH_SIGNAL>'

**What was MISSING:**
- Tool: buy 14 NVDA at $185.24
- Tool: sell 1 GOOGL at $245.15

**Changes:**
- Added tool message capture in trading loop (line 649)
- Extract tool_name and tool_content from each tool message
- Capture to conversation_history before processing
- Changed message['tool_name'] to message['name'] for consistency

**Impact:**
Now the summarizer sees the actual tool results, not just the AI's
intentions. Combined with alpha.8's prompt improvements, summaries
will accurately reflect executed trades.

Fixes reasoning summaries that contradicted actual trades.
v0.4.0-alpha.9
2025-11-05 00:44:24 -05:00
6d126db03c fix: improve reasoning summary to explicitly mention trades
The reasoning summary was not accurately reflecting actual trades.
For example, 2 sell trades were summarized as 'maintain core holdings'.

Changes:
- Updated prompt to require explicit mention of trades executed
- Added emphasis on buy/sell tool calls in formatted log
- Trades now highlighted at top of log with TRADES EXECUTED section
- Prompt instructs: state specific trades (symbols, quantities, action)

Example before: 'chose to maintain core holdings'
Example after: 'sold 1 GOOGL and 1 AMZN to reduce exposure'

This ensures reasoning field accurately describes what the AI actually did.
v0.4.0-alpha.8
2025-11-05 00:41:59 -05:00
1e7bdb509b chore: remove debug logging from ContextInjector
Removed noisy debug print statements that were added during
troubleshooting. The context injection is now working correctly
and no longer needs diagnostic output.

Cleaned up:
- Entry point logging
- Before/after injection logging
- Tool name and args logging
2025-11-05 00:31:16 -05:00
a8d912bb4b fix: calculate final holdings from actions instead of querying database
**Problem:**
Final positions showed empty holdings despite executing 15+ trades.
The issue persisted even after fixing the get_current_position_from_db query.

**Root Cause:**
At end of trading day, base_agent.py line 672 called
_get_current_portfolio_state() which queried the database for current
position. On the FIRST trading day, this query returns empty holdings
because there's no previous day's record.

**Why the Previous Fix Wasn't Enough:**
The previous fix (date < instead of date <=) correctly retrieves
STARTING position for subsequent days, but didn't address END-OF-DAY
position calculation, which needs to account for trades executed
during the current session.

**Solution:**
Added new method _calculate_final_position_from_actions() that:
1. Gets starting holdings from previous day (via get_starting_holdings)
2. Gets all actions from actions table for current trading day
3. Applies each buy/sell to calculate final state:
   - Buy: holdings[symbol] += qty, cash -= qty * price
   - Sell: holdings[symbol] -= qty, cash += qty * price
4. Returns accurate final holdings and cash

**Impact:**
- First trading day: Correctly saves all executed trades as final holdings
- Subsequent days: Final position reflects all trades from that day
- Holdings now persist correctly across all trading days

**Tests:**
- test_calculate_final_position_first_day_with_trades: 15 trades on first day
- test_calculate_final_position_with_previous_holdings: Multi-day scenario
- test_calculate_final_position_no_trades: No-trade edge case

All tests pass 
v0.4.0-alpha.6
2025-11-04 23:51:54 -05:00
aa16480158 fix: query previous day's holdings instead of current day
**Problem:**
Subsequent trading days were not retrieving starting holdings correctly.
The API showed empty starting_position and final_position even after
executing multiple buy trades.

**Root Cause:**
get_current_position_from_db() used `date <= ?` which returned the
CURRENT day's trading_day record instead of the PREVIOUS day's ending.
Since holdings are written at END of trading day, querying the current
day's record would return incomplete/empty holdings.

**Timeline on Day 1 (2025-10-02):**
1. Start: Create trading_day with empty holdings
2. Trade: Execute 8 buy trades (recorded in actions table)
3. End: Call get_current_position_from_db(date='2025-10-02')
   - Query: `date <= 2025-10-02` returns TODAY's record
   - Holdings: EMPTY (not written yet)
   - Saves: Empty holdings to database 

**Solution:**
Changed query to use `date < ?` to retrieve PREVIOUS day's ending
position, which becomes the current day's starting position.

**Impact:**
- Day 1: Correctly saves ending holdings after trades
- Day 2+: Correctly retrieves previous day's ending as starting position
- Holdings now persist between trading days as expected

**Tests Added:**
- test_get_position_retrieves_previous_day_not_current: Verifies query
  returns previous day when multiple days exist
- Updated existing tests to align with new behavior

Fixes holdings persistence bug identified in API response showing
empty starting_position/final_position despite successful trades.
v0.4.0-alpha.5
2025-11-04 23:29:30 -05:00