Compare commits

..

10 Commits

Author SHA1 Message Date
1785f9b06f feat: automate merged.jsonl creation during price fetching
Streamline the data preparation workflow by having get_daily_price.py
automatically invoke merge_jsonl.py after fetching all stock prices.

Changes:
- Modified get_daily_price.py to call merge_jsonl.py automatically
- Updated entrypoint.sh to remove redundant merge_jsonl.py call
- Updated main.sh to remove redundant merge_jsonl.py call
- Fixed import order for linting compliance

Benefits:
- Single command now handles both fetching and merging
- Ensures merged.jsonl is always created after price updates
- Simplifies Docker container startup process
- Prevents missing merged.jsonl errors in production
2025-10-30 23:48:12 -04:00
55206549c7 feat: add configurable volume path for persistent data
Add VOLUME_PATH environment variable to customize storage location for
all persistent data (data/, logs/, configs/ subdirectories).

Changes:
- .env.example: Add VOLUME_PATH variable with documentation
- docker-compose.yml: Use ${VOLUME_PATH:-.} for all volume mounts
- docs/DOCKER.md: Document custom volume location feature

Benefits:
- Store data outside project directory (e.g., on separate disk)
- Easier backup/restore (single directory)
- Clean separation of code and data
- Support for absolute or relative paths

Usage:
  # In .env file
  VOLUME_PATH=/home/user/trading-data

  # Results in mounts:
  /home/user/trading-data/data:/app/data
  /home/user/trading-data/logs:/app/logs
  /home/user/trading-data/configs:/app/configs

Defaults to current directory (.) if not set for backward compatibility.
2025-10-30 23:40:21 -04:00
9e05ce0891 fix: prevent price data overwrite on container restart
Preserve existing merged.jsonl to avoid data loss and API rate limits.
Only fetch new data if merged.jsonl is missing or empty.

Problem:
- Entrypoint always fetched fresh data from Alpha Vantage on every start
- Overwrote existing mounted data directory
- Caused API rate limit issues and data inconsistencies
- Lost historical data needed for backtesting specific date ranges

Solution:
- Check if merged.jsonl exists and has content before fetching
- Display stock count when using existing data
- Provide manual refresh instructions for when updates are needed

Benefits:
- Faster container startup (no API calls if data exists)
- Avoids Alpha Vantage rate limits (5 calls/min, 500/day)
- Preserves user's existing historical datasets
- Enables reliable backtesting with consistent data

To refresh data: rm data/merged.jsonl && docker-compose restart
2025-10-30 23:27:58 -04:00
2f2c1d6ea2 feat: simplify Docker config file selection with convention over configuration
Implement automatic detection of custom_config.json for simpler Docker usage.
No environment variables or command-line arguments needed for most users.

Changes:
- entrypoint.sh: Add smart config detection (custom_config.json > CLI arg > default_config.json)
- docker-compose.yml: Add configs volume mount for editing without rebuilds
- docs/DOCKER.md: Update documentation with simplified workflow
- .gitignore: Add custom_config.json to prevent accidental commits

Usage:
  cp configs/default_config.json configs/custom_config.json
  nano configs/custom_config.json
  docker-compose up  # Automatically uses custom_config.json

Simplifies user experience by following convention over configuration principle.
2025-10-30 22:34:22 -04:00
e9c571402a fix: set PYTHONPATH for MCP services to resolve import errors
Root cause: Python adds script directory (not working directory)
to sys.path. Services in /app/agent_tools/ couldn't import from
/app/tools/ because sys.path[0] was /app/agent_tools.

Solution: Set PYTHONPATH=/app when starting services so /app is
always in the Python import path.

Fixes: ModuleNotFoundError: No module named 'tools' in price.log
2025-10-30 21:47:58 -04:00
79d14444ed docs: add data cache reuse design document
Captures design for staleness-based data refresh to avoid
re-fetching all 103 NASDAQ tickers on every container restart.

Key features:
- Check all daily_prices_*.json files for staleness
- Configurable MAX_DATA_AGE_DAYS threshold (default: 7)
- Bash wrapper logic in entrypoint.sh
- Handles edge cases (partial data, missing files, rate limits)
2025-10-30 21:46:05 -04:00
12ecb1e6b6 docs: clarify OPENAI_API_BASE can be left empty
Add comment explaining OPENAI_API_BASE can be left empty
to use the default OpenAI endpoint, or set to a custom
proxy URL if needed.

Sets default value to empty instead of placeholder URL.
2025-10-30 21:34:16 -04:00
203b60b252 Revert "fix: improve MCP service startup reliability"
This reverts commit d70362b9d4.
2025-10-30 21:33:12 -04:00
d70362b9d4 fix: improve MCP service startup reliability
- Clarify OPENAI_API_BASE can be left empty for default endpoint
- Increase initial wait time from 3s to 5s
- Add health checking for all MCP services (ports 8000-8003)
- Retry up to 10 times with 1s intervals
- Install netcat for port checking
- Display warnings if services aren't ready

Helps diagnose and prevent MCP client initialization errors.
2025-10-30 21:32:42 -04:00
5ec7977b47 fix: resolve module import error for MCP services
Run MCP service manager from /app instead of /app/agent_tools
to ensure services can import from tools module.

Changes:
- entrypoint.sh: Run start_mcp_services.py from /app directory
- start_mcp_services.py: Use absolute paths for service scripts
- start_mcp_services.py: Set working directory to /app for services

Fixes ModuleNotFoundError: No module named 'tools' in price.log
2025-10-30 21:22:42 -04:00
9 changed files with 366 additions and 42 deletions

View File

@@ -5,7 +5,8 @@
# Docker Compose automatically reads .env from project root
# AI Model API Configuration
OPENAI_API_BASE=https://your-openai-proxy.com/v1
# OPENAI_API_BASE: Leave empty to use default OpenAI endpoint, or set to custom proxy URL
OPENAI_API_BASE=
OPENAI_API_KEY=your_openai_key_here # https://platform.openai.com/api-keys
# Data Source Configuration
@@ -29,3 +30,9 @@ WEB_HTTP_PORT=8888
# Agent Configuration
AGENT_MAX_STEP=30
# Data Volume Configuration
# Base directory for all persistent data (will contain data/, logs/, configs/ subdirectories)
# Use relative paths (./volumes) or absolute paths (/home/user/ai-trader-volumes)
# Defaults to current directory (.) if not set
VOLUME_PATH=.

1
.gitignore vendored
View File

@@ -57,6 +57,7 @@ delete.py
refresh_data.sh
# Config files (optional - uncomment if needed)
configs/custom_config.json
configs/day_config.json
configs/hour_config.json
configs/test_config.json

View File

@@ -27,32 +27,33 @@ class MCPServiceManager:
'price': int(os.getenv('GETPRICE_HTTP_PORT', '8003'))
}
# Service configurations
# Service configurations (paths relative to /app)
self.agent_tools_dir = Path(__file__).parent.resolve()
self.service_configs = {
'math': {
'script': 'tool_math.py',
'script': self.agent_tools_dir / 'tool_math.py',
'name': 'Math',
'port': self.ports['math']
},
'search': {
'script': 'tool_jina_search.py',
'script': self.agent_tools_dir / 'tool_jina_search.py',
'name': 'Search',
'port': self.ports['search']
},
'trade': {
'script': 'tool_trade.py',
'script': self.agent_tools_dir / 'tool_trade.py',
'name': 'TradeTools',
'port': self.ports['trade']
},
'price': {
'script': 'tool_get_price_local.py',
'script': self.agent_tools_dir / 'tool_get_price_local.py',
'name': 'LocalPrices',
'port': self.ports['price']
}
}
# Create logs directory
self.log_dir = Path('../logs')
self.log_dir = Path('logs')
self.log_dir.mkdir(exist_ok=True)
# Set signal handlers
@@ -70,20 +71,26 @@ class MCPServiceManager:
script_path = config['script']
service_name = config['name']
port = config['port']
if not Path(script_path).exists():
if not script_path.exists():
print(f"❌ Script file not found: {script_path}")
return False
try:
# Start service process
log_file = self.log_dir / f"{service_id}.log"
# Set PYTHONPATH to /app so services can import from tools module
env = os.environ.copy()
env['PYTHONPATH'] = str(Path.cwd())
with open(log_file, 'w') as f:
process = subprocess.Popen(
[sys.executable, script_path],
[sys.executable, str(script_path)],
stdout=f,
stderr=subprocess.STDOUT,
cwd=os.getcwd()
cwd=Path.cwd(), # Use current working directory (/app)
env=env # Pass environment with PYTHONPATH
)
self.services[service_id] = {

View File

@@ -1,8 +1,12 @@
import requests
import os
from dotenv import load_dotenv
load_dotenv()
import json
import os
import subprocess
import sys
import requests
from dotenv import load_dotenv
load_dotenv()
all_nasdaq_100_symbols = [
@@ -42,4 +46,15 @@ if __name__ == "__main__":
for symbol in all_nasdaq_100_symbols:
get_daily_price(symbol)
get_daily_price("QQQ")
get_daily_price("QQQ")
# Automatically run merge after fetching
print("\n📦 Merging price data...")
try:
script_dir = os.path.dirname(os.path.abspath(__file__))
merge_script = os.path.join(script_dir, "merge_jsonl.py")
subprocess.run([sys.executable, merge_script], check=True)
print("✅ Price data merged successfully")
except Exception as e:
print(f"⚠️ Failed to merge data: {e}")
print(" Please run 'python merge_jsonl.py' manually")

View File

@@ -5,8 +5,9 @@ services:
# build: .
container_name: ai-trader-app
volumes:
- ./data:/app/data
- ./logs:/app/logs
- ${VOLUME_PATH:-.}/data:/app/data
- ${VOLUME_PATH:-.}/logs:/app/logs
- ${VOLUME_PATH:-.}/configs:/app/configs
environment:
# AI Model API Configuration
- OPENAI_API_BASE=${OPENAI_API_BASE}

View File

@@ -53,10 +53,30 @@ AGENT_MAX_STEP=30
### Custom Trading Configuration
Pass a custom config file:
**Simple Method (Recommended):**
Create a `configs/custom_config.json` file - it will be automatically used:
```bash
docker-compose run ai-trader configs/my_config.json
# Copy default config as starting point
cp configs/default_config.json configs/custom_config.json
# Edit your custom config
nano configs/custom_config.json
# Run normally - custom_config.json is automatically detected!
docker-compose up
```
**Priority order:**
1. `configs/custom_config.json` (if exists) - **Highest priority**
2. Command-line argument: `docker-compose run ai-trader configs/other.json`
3. `configs/default_config.json` (fallback)
**Advanced: Use a different config file name:**
```bash
docker-compose run ai-trader configs/my_special_config.json
```
## Usage Examples
@@ -92,16 +112,43 @@ docker-compose up
### Volume Mounts
Docker Compose mounts two volumes:
Docker Compose mounts three volumes for persistent data. By default, these are stored in the project directory:
- `./data:/app/data` - Price data and trading records
- `./logs:/app/logs` - MCP service logs
- `./configs:/app/configs` - Configuration files (allows editing configs without rebuilding)
Data persists across container restarts. To reset:
### Custom Volume Location
You can change where data is stored by setting `VOLUME_PATH` in your `.env` file:
```bash
# Store data in a different location
VOLUME_PATH=/home/user/trading-data
# Or use a relative path
VOLUME_PATH=./volumes
```
This will store data in:
- `/home/user/trading-data/data/`
- `/home/user/trading-data/logs/`
- `/home/user/trading-data/configs/`
**Note:** The directory structure is automatically created. You'll need to copy your existing configs:
```bash
# After changing VOLUME_PATH
mkdir -p /home/user/trading-data/configs
cp configs/custom_config.json /home/user/trading-data/configs/
```
### Reset Data
To reset all trading data:
```bash
docker-compose down
rm -rf data/agent_data/* logs/*
rm -rf ${VOLUME_PATH:-.}/data/agent_data/* ${VOLUME_PATH:-.}/logs/*
docker-compose up
```
@@ -227,13 +274,45 @@ Services exposed on host:
### Test Different Configurations
```bash
# Create test config
cp configs/default_config.json configs/test_config.json
# Edit test_config.json
**Method 1: Use the standard custom_config.json**
# Run with test config
docker-compose run ai-trader configs/test_config.json
```bash
# Create and edit your config
cp configs/default_config.json configs/custom_config.json
nano configs/custom_config.json
# Run - automatically uses custom_config.json
docker-compose up
```
**Method 2: Test multiple configs with different names**
```bash
# Create multiple test configs
cp configs/default_config.json configs/conservative.json
cp configs/default_config.json configs/aggressive.json
# Edit each config...
# Test conservative strategy
docker-compose run ai-trader configs/conservative.json
# Test aggressive strategy
docker-compose run ai-trader configs/aggressive.json
```
**Method 3: Temporarily switch configs**
```bash
# Temporarily rename your custom config
mv configs/custom_config.json configs/custom_config.json.backup
cp configs/test_strategy.json configs/custom_config.json
# Run with test strategy
docker-compose up
# Restore original
mv configs/custom_config.json.backup configs/custom_config.json
```
## Production Deployment

View File

@@ -0,0 +1,197 @@
# Data Cache Reuse Design
**Date:** 2025-10-30
**Status:** Approved
## Problem Statement
Docker containers currently fetch all 103 NASDAQ 100 tickers from Alpha Vantage on every startup, even when price data is volume-mounted and already cached in `./data`. This causes:
- Slow startup times (103 API calls)
- Unnecessary API quota consumption
- Rate limit risks during frequent development iterations
## Solution Overview
Implement staleness-based data refresh with configurable age threshold. Container checks all `daily_prices_*.json` files and only refetches if any file is missing or older than `MAX_DATA_AGE_DAYS`.
## Design Decisions
### Architecture Choice
**Selected:** Check all `daily_prices_*.json` files individually
**Rationale:** Ensures data integrity by detecting partial/missing files, not just stale merged data
### Implementation Location
**Selected:** Bash wrapper logic in `entrypoint.sh`
**Rationale:** Keeps data fetching scripts unchanged, adds orchestration at container startup layer
### Staleness Threshold
**Selected:** Configurable via `MAX_DATA_AGE_DAYS` environment variable (default: 7 days)
**Rationale:** Balances freshness with API usage; flexible for different use cases (development vs production)
## Technical Design
### Components
#### 1. Staleness Check Function
Location: `entrypoint.sh` (after environment validation, before data fetch)
```bash
should_refresh_data() {
MAX_AGE=${MAX_DATA_AGE_DAYS:-7}
# Check if at least one price file exists
if ! ls /app/data/daily_prices_*.json >/dev/null 2>&1; then
echo "📭 No price data found"
return 0 # Need refresh
fi
# Find any files older than MAX_AGE days
STALE_COUNT=$(find /app/data -name "daily_prices_*.json" -mtime +$MAX_AGE | wc -l)
TOTAL_COUNT=$(ls /app/data/daily_prices_*.json 2>/dev/null | wc -l)
if [ $STALE_COUNT -gt 0 ]; then
echo "📅 Found $STALE_COUNT stale files (>$MAX_AGE days old)"
return 0 # Need refresh
fi
echo "✅ All $TOTAL_COUNT price files are fresh (<$MAX_AGE days old)"
return 1 # Skip refresh
}
```
**Logic:**
- Uses `find -mtime +N` to detect files modified more than N days ago
- Returns shell exit codes: 0 (refresh needed), 1 (skip refresh)
- Logs informative messages for debugging
#### 2. Conditional Data Fetch
Location: `entrypoint.sh` lines 40-46 (replace existing unconditional fetch)
```bash
# Step 1: Data preparation (conditional)
echo "📊 Checking price data freshness..."
if should_refresh_data; then
echo "🔄 Fetching and merging price data..."
cd /app/data
python /app/scripts/get_daily_price.py
python /app/scripts/merge_jsonl.py
cd /app
else
echo "⏭️ Skipping data fetch (using cached data)"
fi
```
#### 3. Environment Configuration
**docker-compose.yml:**
```yaml
environment:
- MAX_DATA_AGE_DAYS=${MAX_DATA_AGE_DAYS:-7}
```
**.env.example:**
```bash
# Data Refresh Configuration
MAX_DATA_AGE_DAYS=7 # Refresh price data older than N days (0=always refresh)
```
### Data Flow
1. **Container Startup** → entrypoint.sh begins execution
2. **Environment Validation** → Check required API keys (existing logic)
3. **Staleness Check**`should_refresh_data()` scans `/app/data/daily_prices_*.json`
- No files found → Return 0 (refresh)
- Any file older than `MAX_DATA_AGE_DAYS` → Return 0 (refresh)
- All files fresh → Return 1 (skip)
4. **Conditional Fetch** → Run get_daily_price.py only if refresh needed
5. **Merge Data** → Always run merge_jsonl.py (handles missing merged.jsonl)
6. **MCP Services** → Start services (existing logic)
7. **Trading Agent** → Begin trading (existing logic)
### Edge Cases
| Scenario | Behavior |
|----------|----------|
| **First run (no data)** | Detects no files → triggers full fetch |
| **Restart within 7 days** | All files fresh → skips fetch (fast startup) |
| **Restart after 7 days** | Files stale → refreshes all data |
| **Partial data (some files missing)** | Missing files treated as infinitely old → triggers refresh |
| **Corrupt merged.jsonl but fresh price files** | Skips fetch, re-runs merge to rebuild merged.jsonl |
| **MAX_DATA_AGE_DAYS=0** | Always refresh (useful for testing/production) |
| **MAX_DATA_AGE_DAYS unset** | Defaults to 7 days |
| **Alpha Vantage rate limit** | get_daily_price.py handles with warning (existing behavior) |
## Configuration Options
| Variable | Default | Purpose |
|----------|---------|---------|
| `MAX_DATA_AGE_DAYS` | 7 | Days before price data considered stale |
**Special Values:**
- `0` → Always refresh (force fresh data)
- `999` → Never refresh (use cached data indefinitely)
## User Experience
### Scenario 1: Fresh Container
```
🚀 Starting AI-Trader...
🔍 Validating environment variables...
✅ Environment variables validated
📊 Checking price data freshness...
📭 No price data found
🔄 Fetching and merging price data...
✓ Fetched NVDA
✓ Fetched MSFT
...
```
### Scenario 2: Restart Within 7 Days
```
🚀 Starting AI-Trader...
🔍 Validating environment variables...
✅ Environment variables validated
📊 Checking price data freshness...
✅ All 103 price files are fresh (<7 days old)
⏭️ Skipping data fetch (using cached data)
🔧 Starting MCP services...
```
### Scenario 3: Restart After 7 Days
```
🚀 Starting AI-Trader...
🔍 Validating environment variables...
✅ Environment variables validated
📊 Checking price data freshness...
📅 Found 103 stale files (>7 days old)
🔄 Fetching and merging price data...
✓ Fetched NVDA
✓ Fetched MSFT
...
```
## Testing Plan
1. **Test fresh container:** Delete `./data/daily_prices_*.json`, start container → should fetch all
2. **Test cached data:** Restart immediately → should skip fetch
3. **Test staleness:** `touch -d "8 days ago" ./data/daily_prices_AAPL.json`, restart → should refresh
4. **Test partial data:** Delete 10 random price files → should refresh all
5. **Test MAX_DATA_AGE_DAYS=0:** Restart with env var set → should always fetch
6. **Test MAX_DATA_AGE_DAYS=30:** Restart with 8-day-old data → should skip
## Documentation Updates
Files requiring updates:
- `entrypoint.sh` → Add function and conditional logic
- `docker-compose.yml` → Add MAX_DATA_AGE_DAYS environment variable
- `.env.example` → Document MAX_DATA_AGE_DAYS with default value
- `CLAUDE.md` → Update "Docker Deployment" section with new env var
- `docs/DOCKER.md` (if exists) → Explain data caching behavior
## Benefits
- **Development:** Instant container restarts during iteration
- **API Quota:** ~103 fewer API calls per restart
- **Reliability:** No rate limit risks during frequent testing
- **Flexibility:** Configurable threshold for different use cases
- **Consistency:** Checks all files to ensure complete data

View File

@@ -38,19 +38,24 @@ fi
echo "✅ Environment variables validated"
# Step 1: Data preparation
echo "📊 Fetching and merging price data..."
# Run scripts from /app/scripts but output to /app/data
cd /app/data
python /app/scripts/get_daily_price.py
python /app/scripts/merge_jsonl.py
cd /app
echo "📊 Checking price data..."
if [ -f "/app/data/merged.jsonl" ] && [ -s "/app/data/merged.jsonl" ]; then
echo "✅ Using existing price data ($(wc -l < /app/data/merged.jsonl) stocks)"
echo " To refresh data, delete /app/data/merged.jsonl and restart"
else
echo "📊 Fetching and merging price data..."
# Run script from /app/scripts but output to /app/data
# Note: get_daily_price.py now automatically calls merge_jsonl.py after fetching
cd /app/data
python /app/scripts/get_daily_price.py
cd /app
fi
# Step 2: Start MCP services in background
echo "🔧 Starting MCP services..."
cd /app/agent_tools
python start_mcp_services.py &
MCP_PID=$!
cd /app
python agent_tools/start_mcp_services.py &
MCP_PID=$!
# Step 3: Wait for services to initialize
echo "⏳ Waiting for MCP services to start..."
@@ -58,7 +63,19 @@ sleep 3
# Step 4: Run trading agent with config file
echo "🤖 Starting trading agent..."
CONFIG_FILE="${1:-configs/default_config.json}"
# Smart config selection: custom_config.json takes precedence if it exists
if [ -f "configs/custom_config.json" ]; then
CONFIG_FILE="configs/custom_config.json"
echo "✅ Using custom configuration: configs/custom_config.json"
elif [ -n "$1" ]; then
CONFIG_FILE="$1"
echo "✅ Using specified configuration: $CONFIG_FILE"
else
CONFIG_FILE="configs/default_config.json"
echo "✅ Using default configuration: configs/default_config.json"
fi
python main.py "$CONFIG_FILE"
# Cleanup on exit

View File

@@ -10,8 +10,8 @@ echo "🚀 Launching AI Trader Environment..."
echo "📊 Now getting and merging price data..."
cd ./data
# Note: get_daily_price.py now automatically calls merge_jsonl.py after fetching
python get_daily_price.py
python merge_jsonl.py
cd ../
echo "🔧 Now starting MCP services..."