fix: prevent price data overwrite on container restart

Preserve existing merged.jsonl to avoid data loss and API rate limits.
Only fetch new data if merged.jsonl is missing or empty.

Problem:
- Entrypoint fetched fresh data from Alpha Vantage on every container start
- Overwrote existing data in the mounted data directory
- Caused API rate limit issues and data inconsistencies
- Lost historical data needed for backtesting specific date ranges

Solution:
- Check if merged.jsonl exists and has content before fetching
- Display stock count when using existing data
- Provide manual refresh instructions for when updates are needed

Benefits:
- Faster container startup (no API calls if data exists)
- Avoids Alpha Vantage rate limits (5 calls/min, 500/day)
- Preserves user's existing historical datasets
- Enables reliable backtesting with consistent data

To refresh data: rm data/merged.jsonl && docker-compose restart
2025-10-30 23:27:58 -04:00
parent 2f2c1d6ea2
commit 9e05ce0891

@@ -38,12 +38,18 @@ fi
 echo "✅ Environment variables validated"
 
 # Step 1: Data preparation
-echo "📊 Fetching and merging price data..."
-# Run scripts from /app/scripts but output to /app/data
-cd /app/data
-python /app/scripts/get_daily_price.py
-python /app/scripts/merge_jsonl.py
-cd /app
+echo "📊 Checking price data..."
+if [ -f "/app/data/merged.jsonl" ] && [ -s "/app/data/merged.jsonl" ]; then
+    echo "✅ Using existing price data ($(wc -l < /app/data/merged.jsonl) stocks)"
+    echo " To refresh data, delete /app/data/merged.jsonl and restart"
+else
+    echo "📊 Fetching and merging price data..."
+    # Run scripts from /app/scripts but output to /app/data
+    cd /app/data
+    python /app/scripts/get_daily_price.py
+    python /app/scripts/merge_jsonl.py
+    cd /app
+fi
 
 # Step 2: Start MCP services in background
 echo "🔧 Starting MCP services..."
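
The guard in this change can be tried outside the container. Below is a minimal sketch, assuming a POSIX shell; the function names has_price_data and data_action are hypothetical and do not appear in the entrypoint. It shows the three cases the check distinguishes: a missing file, an empty file, and a file with content.

```shell
#!/bin/sh
# Hypothetical helper (not part of the entrypoint): succeeds only when
# the given path is a regular file with a size greater than zero.
has_price_data() {
    [ -f "$1" ] && [ -s "$1" ]
}

# Hypothetical helper mirroring the entrypoint's branch: prints which
# path the startup script would take for the given data file.
data_action() {
    if has_price_data "$1"; then
        echo "use-existing"
    else
        echo "fetch"
    fi
}

# Demonstration against a throwaway directory.
demo_dir=$(mktemp -d)
data_action "$demo_dir/merged.jsonl"        # file missing

: > "$demo_dir/merged.jsonl"
data_action "$demo_dir/merged.jsonl"        # file exists but is empty

printf '{"symbol":"EXAMPLE"}\n' > "$demo_dir/merged.jsonl"
data_action "$demo_dir/merged.jsonl"        # file has content

rm -rf "$demo_dir"
```

Note that the check only looks at file size, so a partially written merged.jsonl from an interrupted fetch would still be treated as existing data. Likewise, the stock count printed by the entrypoint is a line count from wc -l, which matches the number of stocks only if merged.jsonl holds one record per line.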