Commit Graph

32 Commits

Author SHA1 Message Date
406bb281b2 fix: cleanup stale jobs on container restart to unblock new job creation
When a Docker container is shut down and restarted, jobs with status
'pending', 'downloading_data', or 'running' remained in the database,
preventing new jobs from starting due to concurrency control checks.

This commit adds automatic cleanup of stale jobs during FastAPI startup:

- New cleanup_stale_jobs() method in JobManager (api/job_manager.py:702-779)
- Integrated into FastAPI lifespan startup (api/main.py:164-168)
- Intelligent status determination based on completion percentage:
  - 'partial' if any model-days completed (preserves progress data)
  - 'failed' if no progress made
- Detailed error messages with original status and completion counts
- Marks incomplete job_details as 'failed' with clear error messages
- Deployment-aware: skips cleanup in DEV mode when DB is reset
- Comprehensive logging at warning level for visibility

Testing:
- 6 new unit tests covering all cleanup scenarios (451-609)
- All 30 existing job_manager tests still pass
- Tests verify pending, running, downloading_data, partial progress,
  no stale jobs, and multiple stale jobs scenarios

Resolves issue where container restarts left stale jobs blocking the
can_start_new_job() concurrency check.
2025-11-06 21:24:45 -05:00
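
The status rule described in the commit above ('partial' if any model-days completed, otherwise 'failed') can be sketched roughly as follows. This is a minimal illustration, not the code in api/job_manager.py: the `Job` dataclass, its field names, and `resolve_stale_status` are hypothetical stand-ins, and the real method works against the database.

```python
from dataclasses import dataclass
from typing import Optional

# Statuses the commit message names as "stale" after a restart.
STALE_STATUSES = {"pending", "downloading_data", "running"}


@dataclass
class Job:
    # Hypothetical in-memory stand-in for a row in the jobs table.
    job_id: str
    status: str
    completed_model_days: int
    total_model_days: int
    error: Optional[str] = None


def resolve_stale_status(job: Job) -> Job:
    """Mark a stale job 'partial' or 'failed'; leave other jobs untouched."""
    if job.status not in STALE_STATUSES:
        return job
    original = job.status
    # Any completed model-day means progress data exists and is preserved.
    job.status = "partial" if job.completed_model_days > 0 else "failed"
    job.error = (
        f"Job interrupted by container restart (was '{original}', "
        f"{job.completed_model_days}/{job.total_model_days} model-days completed)"
    )
    return job
```

Once every stale job has left the three blocking statuses, a concurrency check such as can_start_new_job() would no longer see them as active.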
6ddc5abede fix: resolve DeepSeek tool_calls validation errors (production ready)
After extensive systematic debugging, identified and fixed a LangChain bug
where parse_tool_call() returns string args instead of a dict.

**Root Cause:**
LangChain's parse_tool_call() has an intermittent bug: it returns an unparsed
JSON string for the 'args' field instead of a dict object, violating the
AIMessage Pydantic schema.

**Solution:**
ToolCallArgsParsingWrapper provides two-layer fix:
1. Patches parse_tool_call() to detect string args and parse to dict
2. Normalizes non-standard tool_call formats to OpenAI standard

**Implementation:**
- Patches parse_tool_call in langchain_openai.chat_models.base namespace
- Defensive approach: only acts when string args detected
- Handles edge cases: invalid JSON, non-standard formats, invalid_tool_calls
- Minimal performance impact: lightweight type checks
- Thread-safe: patches apply at wrapper initialization

**Testing:**
- Confirmed fix working in production with DeepSeek Chat v3.1
- All tool calls now process successfully without validation errors
- No impact on other AI providers (OpenAI, Anthropic, etc.)

**Impact:**
- Enables DeepSeek models via OpenRouter
- Maintains backward compatibility
- Future-proof against similar issues from other providers

Closes systematic debugging investigation that spanned 6 alpha releases.

Fixes: tool_calls.0.args validation error [type=dict_type, input_type=str]
2025-11-06 20:49:11 -05:00
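
The defensive "only act when string args are detected" approach from the commit above might look something like this. It is a standalone sketch, not the actual ToolCallArgsParsingWrapper: `coerce_tool_call_args` is a hypothetical helper operating on a plain dict, and it does not patch or call any LangChain function.

```python
import json


def coerce_tool_call_args(tool_call: dict) -> dict:
    """Return a copy with string 'args' parsed to a dict, when that is safe."""
    args = tool_call.get("args")
    # Lightweight type check: dict args (the normal case) pass straight through.
    if not isinstance(args, str):
        return tool_call
    try:
        parsed = json.loads(args)
    except json.JSONDecodeError:
        # Invalid JSON: leave the call untouched so the caller can report it.
        return tool_call
    # Valid JSON that is not an object (e.g. a list) is also left alone.
    if not isinstance(parsed, dict):
        return tool_call
    return {**tool_call, "args": parsed}
```

Returning the input unchanged in every doubtful case keeps the helper a no-op for providers that already send dict args.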
7b35394ce7 fix: normalize DeepSeek non-standard tool_calls format
Systematic debugging revealed DeepSeek returns tool_calls in non-standard
format that bypasses LangChain's parse_tool_call():

**Root Cause:**
- OpenAI standard: {function: {name, arguments}, id}
- DeepSeek format: {name, args, id}
- LangChain's parse_tool_call() returns None when no 'function' key
- Result: Raw tool_call with string args → Pydantic validation error

**Solution:**
- ToolCallArgsParsingWrapper detects non-standard format
- Normalizes to OpenAI standard before LangChain processing
- Converts {name, args, id} → {function: {name, arguments}, id}
- Added diagnostic logging to identify format variations

**Impact:**
- DeepSeek models now work via OpenRouter
- No breaking changes to other providers (defensive design)
- Diagnostic logs help debug future format issues

Fixes validation errors:
  tool_calls.0.args: Input should be a valid dictionary
  [type=dict_type, input_value='{"symbol": "GILD", ...}', input_type=str]
2025-11-06 11:38:35 -05:00
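
The format conversion described in the commit above, {name, args, id} → {function: {name, arguments}, id}, can be illustrated with a small function. This is a sketch under assumptions: `normalize_tool_call` is a hypothetical name, the input is treated as a plain dict, and the wrapper's real detection logic and logging are not reproduced.

```python
import json


def normalize_tool_call(raw: dict) -> dict:
    """Convert a {name, args, id} tool call to {function: {name, arguments}, id}."""
    # Already in the standard shape, or not recognizable: return unchanged.
    if "function" in raw or "name" not in raw:
        return raw
    args = raw.get("args", {})
    # In the standard shape 'arguments' is a JSON string, so serialize dicts
    # and pass strings through as they are.
    arguments = args if isinstance(args, str) else json.dumps(args)
    return {
        "id": raw.get("id"),
        "function": {"name": raw["name"], "arguments": arguments},
    }
```

After normalization the call carries a 'function' key, which is the case the commit says parse_tool_call() otherwise skipped.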
2d41717b2b docs: update v0.4.1 changelog (IF_TRADE fix only)
Reverted ChatDeepSeek integration approach as it conflicts with
OpenRouter unified gateway architecture.

The system uses OPENAI_API_BASE (OpenRouter) with a single
OPENAI_API_KEY for all AI providers, not direct provider connections.

v0.4.1 now only includes the IF_TRADE initialization fix.
2025-11-06 11:20:22 -05:00
0641ce554a fix: remove incorrect tool_calls conversion logic
Systematic debugging revealed the root cause of Pydantic validation errors:
- DeepSeek correctly returns tool_calls.arguments as JSON strings
- My wrapper was incorrectly converting strings to dicts
- This caused LangChain's parse_tool_call() to fail (json.loads(dict) error)
- Failure created invalid_tool_calls with dict args (should be string)
- Result: Pydantic validation error on invalid_tool_calls

Solution: Remove all conversion logic. DeepSeek format is already correct.

ToolCallArgsParsingWrapper now acts as a simple passthrough proxy.
Trading session completes successfully with no errors.

Closes the systematic-debugging investigation, which found that the
issue was in our fix attempt, not in the original API response.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 21:18:54 -05:00
27a824f4a6 fix: handle invalid_tool_calls args normalization for DeepSeek
Extended ToolCallArgsParsingWrapper to handle both tool_calls and
invalid_tool_calls args formatting inconsistencies from DeepSeek:

- tool_calls.args: string -> dict (for successful calls)
- invalid_tool_calls.args: dict -> string (for failed calls)

The wrapper now normalizes both types before AIMessage construction,
preventing Pydantic validation errors in both success and error cases.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 21:03:48 -05:00
3e50868a4d fix: resolve DeepSeek tool_calls args parsing validation error
Added ToolCallArgsParsingWrapper to handle AI providers (like DeepSeek)
that return tool_calls.args as JSON strings instead of dictionaries.

The wrapper monkey-patches ChatOpenAI's _create_chat_result method to
parse string arguments before AIMessage construction, preventing
Pydantic validation errors.

Changes:
- New: agent/chat_model_wrapper.py - Wrapper implementation
- Modified: agent/base_agent/base_agent.py - Wrap model during init
- Modified: CHANGELOG.md - Document fix as v0.4.1
- New: tests/unit/test_chat_model_wrapper.py - Unit tests

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 20:57:17 -05:00
481126ceca chore: release v0.4.0
Major version bump due to breaking changes:
- Schema migration from action-centric to day-centric model
- Old database tables removed (trading_sessions, positions, reasoning_logs)
- /reasoning endpoint removed (replaced by /results)
- Accurate daily P&L calculation system implemented
2025-11-04 10:59:27 -05:00
3c7ee0d423 docs: add breaking changes for schema migration to CHANGELOG
- Document removal of trading_sessions, positions, reasoning_logs tables
- Document removal of /reasoning endpoint with migration guide
- Add migration instructions for production databases
- Document response structure changes between old and new endpoints
- Update Changed section with trade tools and executor modifications
2025-11-04 10:37:16 -05:00
faa2135668 docs: update changelog for daily P&L and results API refactor
2025-11-04 08:06:44 -05:00
f005571c9f chore: reduce healthcheck interval to 1h to minimize log noise
Healthcheck now runs once per hour instead of every 30 seconds,
reducing log spam while still maintaining startup verification
during the 40s start_period.

Benefits:
- Minimal log noise (1 check/hour vs every 30s)
- Maintains startup verification
- Compatible with Docker orchestration tools
2025-11-03 22:56:24 -05:00
0669bd1bab chore: release v0.3.1
Critical bug fixes for position tracking:
- Fixed cash reset between trading days
- Fixed positions lost over weekends
- Fixed profit calculation accuracy

Plus standardized testing infrastructure.
2025-11-03 21:45:56 -05:00
84320ab8a5 docs: update changelog and schema docs for position tracking fixes
Document the critical bug fixes for position tracking:
- Cash reset to initial value each day
- Positions lost over weekends
- Incorrect profit calculations treating trades as losses

Update database schema documentation to explain the corrected
profit calculation logic that compares to start-of-day portfolio
value instead of previous day's final value.
2025-11-03 21:34:34 -05:00
1095798320 docs: finalize v0.3.0 changelog for release
Consolidated all unreleased changes into v0.3.0 release dated 2025-11-03.

Key additions:
- Development Mode with mock AI provider
- Config Override System for Docker
- Async Price Download (non-blocking)
- Resume Mode (idempotent execution)
- Reasoning Logs API (GET /reasoning)
- Project rebrand to AI-Trader-Server

Includes comprehensive bug fixes for context injection, simulation
re-runs, database reliability, and configuration handling.
2025-11-03 00:19:12 -05:00
73c0fcd908 fix: ensure DEV mode warning appears in Docker logs on startup
- Add FastAPI @app.on_event("startup") handler to display warning
- Previously only appeared when running directly (not via uvicorn)
- Add DEPLOYMENT_MODE and PRESERVE_DEV_DATA to docker-compose.yml
- Update CHANGELOG.md with fix documentation

Fixes issue where dev mode banner wasn't visible in Docker logs
because uvicorn imports app without executing __main__ block.
2025-11-01 13:40:15 -04:00
163cc3c463 docs: rebrand CHANGELOG.md to AI-Trader-Server
Update CHANGELOG.md with AI-Trader-Server rebrand:
- Project name: AI-Trader → AI-Trader-Server
- Repository URLs: Xe138/AI-Trader → Xe138/AI-Trader-Server
- Docker images: ghcr.io/xe138/ai-trader → ghcr.io/xe138/ai-trader-server
- Docker service name: ai-trader → ai-trader-server
2025-11-01 11:32:14 -04:00
02c8a48b37 docs: improve CHANGELOG to reflect actual v0.2.0 baseline
Clarify that v0.3.0 is the first version with REST API functionality,
and remove misleading "API Request Format Changed" entries that implied
the API existed in v0.2.0.

Key improvements:
- Remove "API Request Format Changed" from Changed section (API is new)
- Remove "Model Selection" and "API Interface" items (API design, not changes)
- Clarify batch mode removal context (v0.2.0 had batch, v0.3.0 adds API)
- Update test counts to reflect new tests (175 total, up from 102)
- Add coverage details for new test files (date_utils, price_data_manager)
- Update test execution time estimate (~12 seconds for full suite)

Breaking changes now correctly identify what changed from v0.2.0:
- Batch execution replaced with REST API (new capability)
- Price data storage moved from JSONL to SQLite (migration required)
- Configuration variables added/removed for new features

v0.2.0 was Docker-focused with batch execution
v0.3.0 adds REST API, on-demand downloads, and database storage

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-31 17:15:50 -04:00
1bfcdd78b8 feat: complete v0.3.0 database migration and configuration
Final phase of v0.3.0 implementation - all core features complete.

Price Tools Migration:
- Update get_open_prices() to query price_data table
- Update get_yesterday_open_and_close_price() to query database
- Remove merged.jsonl file I/O (replaced with SQLite queries)
- Maintain backward-compatible function signatures
- Add db_path parameter (default: data/jobs.db)

Configuration:
- Add AUTO_DOWNLOAD_PRICE_DATA to .env.example (default: true)
- Add MAX_SIMULATION_DAYS to .env.example (default: 30)
- Document new configuration options

Documentation:
- Comprehensive CHANGELOG updates for v0.3.0
- Document all breaking changes (API format, data storage, config)
- Document new features (on-demand downloads, date ranges, database)
- Document migration path (scripts/migrate_price_data.py)
- Clear upgrade instructions

Breaking Changes (v0.3.0):
1. API request format: date_range -> start_date/end_date
2. Data storage: merged.jsonl -> price_data table
3. Config variables: removed RUNTIME_ENV_PATH, MCP ports, WEB_HTTP_PORT
4. Added AUTO_DOWNLOAD_PRICE_DATA, MAX_SIMULATION_DAYS

Migration Steps:
1. Run: python scripts/migrate_price_data.py
2. Update API clients to use new date format
3. Update .env with new variables
4. Remove old config variables

Status: v0.3.0 implementation complete
Ready for: Testing, deployment, and release
2025-10-31 16:44:46 -04:00
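
As a rough picture of what replacing merged.jsonl file I/O with SQLite queries involves, the sketch below looks up opening prices from a price_data table. The column names (symbol, date, open) and the function `fetch_open_prices` are assumptions for illustration; the commit does not state the schema, and the project's get_open_prices() takes a db_path rather than a connection.

```python
import sqlite3


def fetch_open_prices(conn: sqlite3.Connection, date: str, symbols: list) -> dict:
    """Return {symbol: open price} for one date, from an assumed price_data schema."""
    if not symbols:
        return {}
    # One '?' placeholder per symbol keeps the query parameterized.
    placeholders = ",".join("?" for _ in symbols)
    rows = conn.execute(
        f"SELECT symbol, open FROM price_data "
        f"WHERE date = ? AND symbol IN ({placeholders})",
        [date, *symbols],
    ).fetchall()
    return {symbol: price for symbol, price in rows}


if __name__ == "__main__":
    # In-memory database so the example does not touch data/jobs.db.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE price_data (symbol TEXT, date TEXT, open REAL)")
    conn.execute("INSERT INTO price_data VALUES ('GILD', '2025-01-16', 91.5)")
    print(fetch_open_prices(conn, "2025-01-16", ["GILD", "AAPL"]))
```

Symbols with no row for the requested date are simply absent from the result, which a caller would need to handle.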
8e7e80807b refactor: remove config_path from API interface
Makes config_path an internal server detail rather than an API parameter.

Changes:
- Remove config_path from SimulateTriggerRequest
- Add config_path parameter to create_app() with default
- Store in app.state.config_path for internal use
- Update trigger endpoint to use internal config path
- Change missing config error from 400 to 500 (server error)

API calls now only need to specify date_range (and optionally models):
  POST /simulate/trigger
  {"date_range": ["2025-01-16"]}

The server uses configs/default_config.json by default.
This simplifies the API and hides implementation details from clients.
2025-10-31 15:18:56 -04:00
ec2a37e474 feat: use enabled field from config to determine which models run
Changed the API to respect the 'enabled' field in model configurations,
rather than requiring models to be explicitly specified in API requests.

Changes:
- Make 'models' parameter optional in POST /simulate/trigger
- If models not provided, read config and use enabled models
- If models provided, use as explicit override (for testing)
- Raise error if no enabled models found and none specified
- Update response message to show model count

Behavior:
- Default: Only runs models with "enabled": true in config
- Override: Can still specify models in request for manual testing
- Safety: Prevents accidental execution of disabled/expensive models

Example before (required):
  POST /simulate/trigger
  {"config_path": "...", "date_range": [...], "models": ["gpt-4"]}

Example after (optional):
  POST /simulate/trigger
  {"config_path": "...", "date_range": [...]}
  # Uses models where enabled: true

This makes the config file the source of truth for which models
should run, while still allowing ad-hoc overrides for testing.
2025-10-31 15:12:11 -04:00
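
The default/override behavior from the commit above could be expressed along these lines. This is a sketch only: `select_models`, the top-level "models" list, and the "signature" key are assumptions about the config layout, which the commit message does not show; only the "enabled" field is taken from it.

```python
from typing import Optional


def select_models(config: dict, requested: Optional[list] = None) -> list:
    """Pick which models to run: explicit request wins, else enabled ones."""
    # Override: an explicit list in the request is used as-is (for testing).
    if requested:
        return list(requested)
    # Default: only models marked "enabled": true in the config.
    enabled = [
        m["signature"]
        for m in config.get("models", [])
        if m.get("enabled", False)
    ]
    if not enabled:
        raise ValueError("No enabled models in config and none specified")
    return enabled
```

Treating a missing "enabled" key as false is a conservative choice made here so that a model never runs unless the config opts it in.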
246dbd1b34 refactor: remove unused web UI port configuration
The web UI (docs/index.html, portfolio.html) exists but is not served
in API mode. Removing the port configuration to eliminate confusion.

Changes:
- Remove port 8888 mapping from docker-compose.yml
- Remove WEB_HTTP_PORT from .env.example
- Update Dockerfile EXPOSE to only port 8080
- Update CHANGELOG.md to document removal

Technical details:
- Web UI static files remain in docs/ folder (legacy from batch mode)
- These were designed for JSONL file format, not the new SQLite database
- No web server was ever started in entrypoint.sh for API mode
- Port 8888 was exposed but nothing listened on it

Result:
- Cleaner configuration (1 fewer port mapping)
- Only REST API (8080) is exposed
- Eliminates user confusion about non-functional web UI
2025-10-31 14:54:10 -04:00
47b9df6b82 docs: merge unreleased configuration changes into v0.3.0
Consolidated the configuration simplification changes (RUNTIME_ENV_PATH
removal, API_PORT cleanup, MCP port removal) into the v0.3.0 release
notes under the 'Changed - Configuration' section.

This ensures all v0.3.0 changes are documented together in a single
release entry rather than split across Unreleased and v0.3.0 sections.
2025-10-31 14:43:31 -04:00
d587a5f213 refactor: remove unnecessary MCP service port configuration
MCP services are completely internal to the container and accessed
only via localhost. They should not be configurable or exposed.

Changes:
- Remove MATH_HTTP_PORT, SEARCH_HTTP_PORT, TRADE_HTTP_PORT,
  GETPRICE_HTTP_PORT from docker-compose.yml environment
- Remove MCP service port mappings from docker-compose.yml
- Remove MCP port configuration from .env.example
- Update README.md to remove MCP port configuration
- Update CLAUDE.md to clarify MCP services use fixed internal ports
- Update CHANGELOG.md with these simplifications

Technical details:
- MCP services are hardcoded to ports 8000-8003 via os.getenv() defaults
- Services only accessed via localhost URLs within container:
  - http://localhost:8000/mcp (math)
  - http://localhost:8001/mcp (search)
  - http://localhost:8002/mcp (trade)
  - http://localhost:8003/mcp (price)
- No external access needed or desired for these services
- Only API (8080) and web dashboard (8888) should be exposed

Benefits:
- Simpler configuration (4 fewer environment variables)
- Reduced attack surface (4 fewer exposed ports)
- Clearer architecture (internal vs external services)
- Prevents accidental misconfiguration of internal services
2025-10-31 14:41:07 -04:00
c929080960 fix: remove API_PORT from container environment variables
The API_PORT variable was incorrectly included in the container's
environment section. It should only be used for host port mapping
in docker-compose.yml, not passed into the container.

Changes:
- Remove API_PORT from environment section in docker-compose.yml
- Container always uses port 8080 internally (hardcoded in entrypoint.sh)
- API_PORT in .env/.env.example only controls the host-side mapping:
  ports: "${API_PORT:-8080}:8080" (host:container)

Why this matters:
- Prevents confusion about whether API_PORT changes internal port
- Clarifies that entrypoint.sh hardcodes --port 8080
- Simplifies container environment (one less unused variable)
- More explicit about the port mapping behavior

No functional change - the container was already ignoring this variable.
2025-10-31 14:38:53 -04:00
849e7bffa2 refactor: remove unnecessary RUNTIME_ENV_PATH environment variable
Simplifies deployment configuration by removing the RUNTIME_ENV_PATH
environment variable, which is no longer needed for API mode.

Changes:
- Remove RUNTIME_ENV_PATH from docker-compose.yml
- Remove RUNTIME_ENV_PATH from .env.example
- Update CLAUDE.md to reflect API-managed runtime configs
- Update README.md to remove RUNTIME_ENV_PATH from config examples
- Update CHANGELOG.md with this simplification

Technical details:
- API mode dynamically creates isolated runtime config files via
  RuntimeConfigManager (data/runtime_env_{job_id}_{model}_{date}.json)
- tools/general_tools.py already handles missing RUNTIME_ENV_PATH
  gracefully, returning empty dict and warning on writes
- No functional impact - all tests pass without this variable set
- Reduces configuration complexity for new deployments

Breaking change: None - variable was vestigial from batch mode era
2025-10-31 14:37:00 -04:00
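
The per-run file naming mentioned in the commit above (data/runtime_env_{job_id}_{model}_{date}.json) can be sketched as a small path helper. `runtime_config_path` is a hypothetical function, not a RuntimeConfigManager method; only the filename pattern comes from the commit message.

```python
from pathlib import Path


def runtime_config_path(job_id: str, model: str, date: str,
                        data_dir: str = "data") -> Path:
    """Build the isolated runtime config path for one job/model/date run."""
    # A distinct file per (job, model, date) means concurrent runs never
    # share a runtime config, so no global RUNTIME_ENV_PATH is needed.
    return Path(data_dir) / f"runtime_env_{job_id}_{model}_{date}.json"
```

For example, `runtime_config_path("abc", "gpt-4", "2025-01-16")` yields a path ending in `runtime_env_abc_gpt-4_2025-01-16.json` under the data directory.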
cf6b56247e docs: merge unreleased changes into v0.3.0 release notes
- Consolidated batch mode removal into v0.3.0
- Updated deployment description to API-only
- Added breaking changes section
- Documented port configuration enhancements
- Added system dependencies (curl, procps)
- Removed outdated dual-mode references
- Ready for v0.3.0 release
2025-10-31 14:21:56 -04:00
357e561b1f refactor: remove batch mode, simplify to API-only deployment
Removes dual-mode deployment complexity, focusing on REST API service only.

Changes:
- Removed batch mode from docker-compose.yml (now single ai-trader service)
- Deleted scripts/test_batch_mode.sh validation script
- Renamed entrypoint-api.sh to entrypoint.sh (now default)
- Simplified Dockerfile (single entrypoint, removed CMD)
- Updated validation scripts to use 'ai-trader' service name
- Updated documentation (README.md, TESTING_GUIDE.md, CHANGELOG.md)

Benefits:
- Eliminates port conflicts between batch and API services
- Simpler configuration and deployment
- API-first architecture aligned with Windmill integration
- Reduced maintenance complexity

Breaking Changes:
- Batch mode no longer available
- All simulations must use REST API endpoints
2025-10-31 13:54:14 -04:00
fb9583b374 feat: transform to REST API service with SQLite persistence (v0.3.0)
Major architecture transformation from batch-only to API service with
database persistence for Windmill integration.

## REST API Implementation
- POST /simulate/trigger - Start simulation jobs
- GET /simulate/status/{job_id} - Monitor job progress
- GET /results - Query results with filters (job_id, date, model)
- GET /health - Service health checks

## Database Layer
- SQLite persistence with 6 tables (jobs, job_details, positions,
  holdings, reasoning_logs, tool_usage)
- Foreign key constraints with cascade deletes
- Replaces JSONL file storage

## Backend Components
- JobManager: Job lifecycle management with concurrency control
- RuntimeConfigManager: Thread-safe isolated runtime configs
- ModelDayExecutor: Single model-day execution engine
- SimulationWorker: Date-sequential, model-parallel orchestration

## Testing
- 102 unit and integration tests (85% coverage)
- Database: 98% coverage
- Job manager: 98% coverage
- API endpoints: 81% coverage
- Pydantic models: 100% coverage
- TDD approach throughout

## Docker Deployment
- Dual-mode: API server (persistent) + batch (one-time)
- Health checks with 30s interval
- Volume persistence for database and logs
- Separate entrypoints for each mode

## Validation Tools
- scripts/validate_docker_build.sh - Build validation
- scripts/test_api_endpoints.sh - Complete API testing
- scripts/test_batch_mode.sh - Batch mode validation
- DOCKER_API.md - Deployment guide
- TESTING_GUIDE.md - Testing procedures

## Configuration
- API_PORT environment variable (default: 8080)
- Backwards compatible with existing configs
- FastAPI, uvicorn, pydantic>=2.0 dependencies

Co-Authored-By: AI Assistant <noreply@example.com>
2025-10-31 11:47:10 -04:00
5da02b4ba0 docs: update CHANGELOG.md for v0.2.0 release
Update changelog with comprehensive release notes including:
- All features added during alpha testing phase
- Configuration improvements and new documentation
- Bug fixes and stability improvements
- Corrected release date to 2025-10-31

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-31 00:28:13 -04:00
e4b7e197d3 Update repository URLs for fork
Change HKUDS/AI-Trader to Xe138/AI-Trader
Update all documentation links and references
2025-10-30 20:26:18 -04:00
928f5fb53f Update CHANGELOG for v0.2.0 release
Set version to 0.2.0 for Docker deployment feature
Release date: 2025-10-30
Update comparison links for proper version tracking
2025-10-30 20:24:46 -04:00
46582d38bb Add CHANGELOG.md
Document all changes including Docker deployment feature
Follow Keep a Changelog format with semantic versioning
Include template for future releases
2025-10-30 20:22:10 -04:00