Files
xer-mcp/specs/002-direct-db-access/tasks.md

738 lines
22 KiB
Markdown

# Implementation Tasks: Direct Database Access for Scripts
**Branch**: `002-direct-db-access` | **Spec**: [spec.md](./spec.md) | **Plan**: [plan.md](./plan.md)
## Quick Reference
- **Feature**: Direct Database Access for Scripts
- **Version**: 0.2.0
- **Test Command**: `pytest`
- **Lint Command**: `ruff check .`
---
## Phase 1: Setup
### Task 1.1: Create Feature Branch and Documentation Structure [X]
**Type**: Setup
**Why**: Establish isolated workspace for feature development
**Steps**:
1. Verify branch `002-direct-db-access` exists (already created during spec/plan)
2. Verify all design artifacts exist:
- `specs/002-direct-db-access/spec.md`
- `specs/002-direct-db-access/plan.md`
- `specs/002-direct-db-access/research.md`
- `specs/002-direct-db-access/data-model.md`
- `specs/002-direct-db-access/contracts/mcp-tools.json`
- `specs/002-direct-db-access/quickstart.md`
**Verification**: `git branch --show-current` shows `002-direct-db-access`
**Acceptance**: All design artifacts present and branch is active
---
## Phase 2: Foundational - DatabaseManager Extension
### Task 2.1: Add File-Based Database Support to DatabaseManager [X]
**Type**: Implementation
**Why**: Foundation for all persistent database features
**Dependencies**: None
**Contract Reference**: `contracts/mcp-tools.json` - DatabaseInfo schema
**Test First** (TDD):
```python
# tests/unit/test_db_manager.py
def test_initialize_with_memory_by_default():
"""Default initialization uses in-memory database."""
dm = DatabaseManager()
dm.initialize()
assert dm.db_path == ":memory:"
assert dm.is_persistent is False
def test_initialize_with_file_path():
"""Can initialize with explicit file path."""
dm = DatabaseManager()
dm.initialize(db_path="/tmp/test.db")
assert dm.db_path == "/tmp/test.db"
assert dm.is_persistent is True
# Cleanup
os.unlink("/tmp/test.db")
def test_initialize_with_empty_string_auto_generates_path():
"""Empty string db_path with source_file auto-generates path."""
dm = DatabaseManager()
dm.initialize(db_path="", source_file="/path/to/schedule.xer")
assert dm.db_path == "/path/to/schedule.sqlite"
assert dm.is_persistent is True
def test_file_database_persists_after_close():
"""File-based database persists after connection close."""
dm = DatabaseManager()
dm.initialize(db_path="/tmp/persist_test.db")
# Insert test data would go here
dm.close()
assert os.path.exists("/tmp/persist_test.db")
# Cleanup
os.unlink("/tmp/persist_test.db")
```
**Implementation**:
- Modify `src/xer_mcp/db/__init__.py`:
- Add `db_path` property to track current database path
- Add `is_persistent` property (True if file-based, False if in-memory)
- Add `source_file` property to track loaded XER file
- Add `loaded_at` property (datetime when data was loaded)
- Modify `initialize()` to accept optional `db_path` and `source_file` parameters
- If `db_path` is empty string and `source_file` provided, derive path from source file
- Use WAL mode for file-based databases: `PRAGMA journal_mode=WAL`
**Files Changed**:
- `src/xer_mcp/db/__init__.py`
- `tests/unit/test_db_manager.py` (new)
**Verification**: `pytest tests/unit/test_db_manager.py -v`
**Acceptance**: All unit tests pass; DatabaseManager supports both in-memory and file-based modes
---
### Task 2.2: Implement Atomic Write Pattern [X]
**Type**: Implementation
**Why**: Prevents corrupted database files if process interrupted during load
**Dependencies**: Task 2.1
**Contract Reference**: `research.md` - Atomic Write Strategy
**Test First** (TDD):
```python
# tests/unit/test_db_manager.py
def test_atomic_write_creates_temp_file_first():
"""Database is created at .tmp path first, then renamed."""
dm = DatabaseManager()
target = "/tmp/atomic_test.db"
# During initialization, temp file should exist
# After completion, only target should exist
dm.initialize(db_path=target)
assert os.path.exists(target)
assert not os.path.exists(target + ".tmp")
os.unlink(target)
def test_atomic_write_removes_temp_on_failure():
"""Temp file is cleaned up if initialization fails."""
# Test with invalid schema or similar failure scenario
pass
```
**Implementation**:
- In `initialize()` method when `db_path` is not `:memory:`:
1. Create connection to `{db_path}.tmp`
2. Execute schema creation
3. Close connection
4. Rename `{db_path}.tmp` to `{db_path}` (atomic on POSIX)
5. Reopen connection to final path
- Handle cleanup of `.tmp` file on failure
**Files Changed**:
- `src/xer_mcp/db/__init__.py`
- `tests/unit/test_db_manager.py`
**Verification**: `pytest tests/unit/test_db_manager.py -v`
**Acceptance**: Atomic write tests pass; no partial database files created on failure
---
### Task 2.3: Add Schema Introspection Query [X]
**Type**: Implementation
**Why**: Required for returning schema information in responses
**Dependencies**: Task 2.1
**Contract Reference**: `contracts/mcp-tools.json` - SchemaInfo, TableInfo, ColumnInfo schemas
**Test First** (TDD):
```python
# tests/unit/test_db_manager.py
def test_get_schema_info_returns_all_tables():
"""Schema info includes all database tables."""
dm = DatabaseManager()
dm.initialize()
schema = dm.get_schema_info()
assert schema["version"] == "0.2.0"
table_names = [t["name"] for t in schema["tables"]]
assert "projects" in table_names
assert "activities" in table_names
assert "relationships" in table_names
assert "wbs" in table_names
assert "calendars" in table_names
def test_get_schema_info_includes_column_details():
"""Schema info includes column names, types, and nullable."""
dm = DatabaseManager()
dm.initialize()
schema = dm.get_schema_info()
activities_table = next(t for t in schema["tables"] if t["name"] == "activities")
column_names = [c["name"] for c in activities_table["columns"]]
assert "task_id" in column_names
assert "task_name" in column_names
# Check column details
task_id_col = next(c for c in activities_table["columns"] if c["name"] == "task_id")
assert task_id_col["type"] == "TEXT"
assert task_id_col["nullable"] is False
def test_get_schema_info_includes_row_counts():
"""Schema info includes row counts for each table."""
dm = DatabaseManager()
dm.initialize()
schema = dm.get_schema_info()
for table in schema["tables"]:
assert "row_count" in table
assert isinstance(table["row_count"], int)
```
**Implementation**:
- Add `get_schema_info()` method to DatabaseManager:
- Query `sqlite_master` for table names
- Use `PRAGMA table_info(table_name)` for column details
- Use `PRAGMA foreign_key_list(table_name)` for foreign keys
- Query `SELECT COUNT(*) FROM table` for row counts
- Return SchemaInfo structure matching contract
**Files Changed**:
- `src/xer_mcp/db/__init__.py`
- `tests/unit/test_db_manager.py`
**Verification**: `pytest tests/unit/test_db_manager.py -v`
**Acceptance**: Schema introspection returns accurate table/column information
---
## Phase 3: User Story 1 - Load XER to Persistent Database (P1)
### Task 3.1: Extend load_xer Tool with db_path Parameter [X]
**Type**: Implementation
**Why**: Core feature - enables persistent database creation
**Dependencies**: Task 2.1, Task 2.2
**Contract Reference**: `contracts/mcp-tools.json` - load_xer inputSchema
**Test First** (TDD):
```python
# tests/contract/test_load_xer.py
@pytest.mark.asyncio
async def test_load_xer_with_db_path_creates_file(tmp_path, sample_xer_file):
"""load_xer with db_path creates persistent database file."""
db_file = tmp_path / "schedule.db"
result = await load_xer(sample_xer_file, db_path=str(db_file))
assert result["success"] is True
assert db_file.exists()
assert result["database"]["db_path"] == str(db_file)
assert result["database"]["is_persistent"] is True
@pytest.mark.asyncio
async def test_load_xer_with_empty_db_path_auto_generates(sample_xer_file):
"""load_xer with empty db_path generates path from XER filename."""
result = await load_xer(sample_xer_file, db_path="")
assert result["success"] is True
expected_db = sample_xer_file.replace(".xer", ".sqlite")
assert result["database"]["db_path"] == expected_db
assert result["database"]["is_persistent"] is True
# Cleanup
os.unlink(expected_db)
@pytest.mark.asyncio
async def test_load_xer_without_db_path_uses_memory(sample_xer_file):
"""load_xer without db_path uses in-memory database (backward compatible)."""
result = await load_xer(sample_xer_file)
assert result["success"] is True
assert result["database"]["db_path"] == ":memory:"
assert result["database"]["is_persistent"] is False
@pytest.mark.asyncio
async def test_load_xer_database_contains_all_data(tmp_path, sample_xer_file):
"""Persistent database contains all parsed data."""
db_file = tmp_path / "schedule.db"
result = await load_xer(sample_xer_file, db_path=str(db_file))
# Verify data via direct SQL
import sqlite3
conn = sqlite3.connect(str(db_file))
cursor = conn.execute("SELECT COUNT(*) FROM activities")
count = cursor.fetchone()[0]
conn.close()
assert count == result["activity_count"]
```
**Implementation**:
- Modify `src/xer_mcp/tools/load_xer.py`:
- Add `db_path: str | None = None` parameter
- Pass `db_path` and `file_path` (as source_file) to `db.initialize()`
- Include `database` field in response with DatabaseInfo
**Files Changed**:
- `src/xer_mcp/tools/load_xer.py`
- `tests/contract/test_load_xer.py`
**Verification**: `pytest tests/contract/test_load_xer.py -v`
**Acceptance**: load_xer creates persistent database when db_path provided
---
### Task 3.2: Add Database Info to load_xer Response [X]
**Type**: Implementation
**Why**: Response must include all info needed to connect to database
**Dependencies**: Task 3.1, Task 2.3
**Contract Reference**: `contracts/mcp-tools.json` - load_xer outputSchema.database
**Test First** (TDD):
```python
# tests/contract/test_load_xer.py
@pytest.mark.asyncio
async def test_load_xer_response_includes_database_info(tmp_path, sample_xer_file):
"""load_xer response includes complete database info."""
db_file = tmp_path / "schedule.db"
result = await load_xer(sample_xer_file, db_path=str(db_file))
assert "database" in result
db_info = result["database"]
assert "db_path" in db_info
assert "is_persistent" in db_info
assert "source_file" in db_info
assert "loaded_at" in db_info
assert "schema" in db_info
@pytest.mark.asyncio
async def test_load_xer_response_schema_includes_tables(tmp_path, sample_xer_file):
"""load_xer response schema includes table information."""
db_file = tmp_path / "schedule.db"
result = await load_xer(sample_xer_file, db_path=str(db_file))
schema = result["database"]["schema"]
assert "version" in schema
assert "tables" in schema
table_names = [t["name"] for t in schema["tables"]]
assert "activities" in table_names
assert "relationships" in table_names
```
**Implementation**:
- Modify `src/xer_mcp/tools/load_xer.py`:
- After successful load, call `db.get_schema_info()`
- Build DatabaseInfo response structure
- Include in return dictionary
**Files Changed**:
- `src/xer_mcp/tools/load_xer.py`
- `tests/contract/test_load_xer.py`
**Verification**: `pytest tests/contract/test_load_xer.py -v`
**Acceptance**: load_xer response includes complete DatabaseInfo with schema
---
### Task 3.3: Register db_path Parameter with MCP Server [X]
**Type**: Implementation
**Why**: MCP server must expose the new parameter to clients
**Dependencies**: Task 3.1
**Contract Reference**: `contracts/mcp-tools.json` - load_xer inputSchema
**Test First** (TDD):
```python
# tests/contract/test_load_xer.py
def test_load_xer_tool_schema_includes_db_path():
"""MCP tool schema includes db_path parameter."""
from xer_mcp.server import server
tools = server.list_tools()
load_xer_tool = next(t for t in tools if t.name == "load_xer")
props = load_xer_tool.inputSchema["properties"]
assert "db_path" in props
assert props["db_path"]["type"] == "string"
```
**Implementation**:
- Modify MCP tool registration in `src/xer_mcp/server.py`:
- Add `db_path` to load_xer tool inputSchema
- Update tool handler to pass db_path to load_xer function
**Files Changed**:
- `src/xer_mcp/server.py`
- `tests/contract/test_load_xer.py`
**Verification**: `pytest tests/contract/test_load_xer.py -v`
**Acceptance**: MCP server exposes db_path parameter for load_xer tool
---
### Task 3.4: Handle Database Write Errors [X]
**Type**: Implementation
**Why**: Clear error messages for write failures (FR-008)
**Dependencies**: Task 3.1
**Contract Reference**: `contracts/mcp-tools.json` - Error schema
**Test First** (TDD):
```python
# tests/contract/test_load_xer.py
@pytest.mark.asyncio
async def test_load_xer_error_on_unwritable_path(sample_xer_file):
"""load_xer returns error for unwritable path."""
result = await load_xer(sample_xer_file, db_path="/root/forbidden.db")
assert result["success"] is False
assert result["error"]["code"] == "FILE_NOT_WRITABLE"
@pytest.mark.asyncio
async def test_load_xer_error_on_invalid_path(sample_xer_file):
"""load_xer returns error for invalid path."""
result = await load_xer(sample_xer_file, db_path="/nonexistent/dir/file.db")
assert result["success"] is False
assert result["error"]["code"] == "FILE_NOT_WRITABLE"
```
**Implementation**:
- Add error handling in load_xer for database creation failures:
- Catch PermissionError → FILE_NOT_WRITABLE
- Catch OSError with ENOSPC → DISK_FULL
- Catch other sqlite3 errors → DATABASE_ERROR
**Files Changed**:
- `src/xer_mcp/tools/load_xer.py`
- `src/xer_mcp/errors.py` (add new error classes if needed)
- `tests/contract/test_load_xer.py`
**Verification**: `pytest tests/contract/test_load_xer.py -v`
**Acceptance**: Clear error messages returned for all database write failures
---
## Phase 4: User Story 2 - Retrieve Database Connection Information (P1)
### Task 4.1: Create get_database_info Tool [X]
**Type**: Implementation
**Why**: Allows retrieval of database info without reloading
**Dependencies**: Task 2.3
**Contract Reference**: `contracts/mcp-tools.json` - get_database_info
**Test First** (TDD):
```python
# tests/contract/test_get_database_info.py
@pytest.mark.asyncio
async def test_get_database_info_returns_current_database(tmp_path, sample_xer_file):
"""get_database_info returns info about currently loaded database."""
db_file = tmp_path / "schedule.db"
await load_xer(sample_xer_file, db_path=str(db_file))
result = await get_database_info()
assert "database" in result
assert result["database"]["db_path"] == str(db_file)
assert result["database"]["is_persistent"] is True
@pytest.mark.asyncio
async def test_get_database_info_error_when_no_database():
"""get_database_info returns error when no database loaded."""
# Reset database state
from xer_mcp.db import db
db.close()
result = await get_database_info()
assert "error" in result
assert result["error"]["code"] == "NO_FILE_LOADED"
@pytest.mark.asyncio
async def test_get_database_info_includes_schema(tmp_path, sample_xer_file):
"""get_database_info includes schema information."""
db_file = tmp_path / "schedule.db"
await load_xer(sample_xer_file, db_path=str(db_file))
result = await get_database_info()
assert "schema" in result["database"]
assert "tables" in result["database"]["schema"]
```
**Implementation**:
- Create `src/xer_mcp/tools/get_database_info.py`:
- Check if database is initialized
- Return NO_FILE_LOADED error if not
- Return DatabaseInfo structure with schema
**Files Changed**:
- `src/xer_mcp/tools/get_database_info.py` (new)
- `tests/contract/test_get_database_info.py` (new)
**Verification**: `pytest tests/contract/test_get_database_info.py -v`
**Acceptance**: get_database_info returns complete DatabaseInfo or appropriate error
---
### Task 4.2: Register get_database_info with MCP Server [X]
**Type**: Implementation
**Why**: MCP server must expose the new tool to clients
**Dependencies**: Task 4.1
**Contract Reference**: `contracts/mcp-tools.json` - get_database_info
**Test First** (TDD):
```python
# tests/contract/test_get_database_info.py
def test_get_database_info_tool_registered():
"""get_database_info tool is registered with MCP server."""
from xer_mcp.server import server
tools = server.list_tools()
tool_names = [t.name for t in tools]
assert "get_database_info" in tool_names
```
**Implementation**:
- Modify `src/xer_mcp/server.py`:
- Import get_database_info function
- Add tool definition with empty inputSchema
- Add handler for get_database_info calls
**Files Changed**:
- `src/xer_mcp/server.py`
- `tests/contract/test_get_database_info.py`
**Verification**: `pytest tests/contract/test_get_database_info.py -v`
**Acceptance**: get_database_info tool accessible via MCP
---
## Phase 5: User Story 3 - Query Database Schema Information (P2)
### Task 5.1: Add Primary Key Information to Schema [X]
**Type**: Implementation
**Why**: Helps developers understand table structure for queries
**Dependencies**: Task 2.3
**Contract Reference**: `contracts/mcp-tools.json` - TableInfo.primary_key
**Test First** (TDD):
```python
# tests/unit/test_db_manager.py
def test_schema_info_includes_primary_keys():
"""Schema info includes primary key for each table."""
dm = DatabaseManager()
dm.initialize()
schema = dm.get_schema_info()
activities_table = next(t for t in schema["tables"] if t["name"] == "activities")
assert "primary_key" in activities_table
assert "task_id" in activities_table["primary_key"]
```
**Implementation**:
- Enhance `get_schema_info()` in DatabaseManager:
- Use `PRAGMA table_info()` to identify PRIMARY KEY columns
- Add to TableInfo structure
**Files Changed**:
- `src/xer_mcp/db/__init__.py`
- `tests/unit/test_db_manager.py`
**Verification**: `pytest tests/unit/test_db_manager.py -v`
**Acceptance**: Primary key information included in schema response
---
### Task 5.2: Add Foreign Key Information to Schema [X]
**Type**: Implementation
**Why**: Documents table relationships for complex queries
**Dependencies**: Task 5.1
**Contract Reference**: `contracts/mcp-tools.json` - ForeignKeyInfo
**Test First** (TDD):
```python
# tests/unit/test_db_manager.py
def test_schema_info_includes_foreign_keys():
"""Schema info includes foreign key relationships."""
dm = DatabaseManager()
dm.initialize()
schema = dm.get_schema_info()
activities_table = next(t for t in schema["tables"] if t["name"] == "activities")
assert "foreign_keys" in activities_table
# activities.proj_id -> projects.proj_id
fk = next((fk for fk in activities_table["foreign_keys"]
if fk["column"] == "proj_id"), None)
assert fk is not None
assert fk["references_table"] == "projects"
assert fk["references_column"] == "proj_id"
```
**Implementation**:
- Enhance `get_schema_info()` in DatabaseManager:
- Use `PRAGMA foreign_key_list(table_name)` to get FK relationships
- Add to TableInfo structure
**Files Changed**:
- `src/xer_mcp/db/__init__.py`
- `tests/unit/test_db_manager.py`
**Verification**: `pytest tests/unit/test_db_manager.py -v`
**Acceptance**: Foreign key information included in schema response
---
## Phase 6: Polish
### Task 6.1: Integration Test - External Script Access [X]
**Type**: Testing
**Why**: Validates end-to-end workflow matches quickstart documentation
**Dependencies**: All previous tasks
**Contract Reference**: `quickstart.md` - Python Example
**Test**:
```python
# tests/integration/test_direct_db_access.py
@pytest.mark.asyncio
async def test_external_script_can_query_database(tmp_path, sample_xer_file):
"""External script can query database using returned path."""
db_file = tmp_path / "schedule.db"
result = await load_xer(sample_xer_file, db_path=str(db_file))
# Simulate external script access (as shown in quickstart.md)
import sqlite3
db_path = result["database"]["db_path"]
conn = sqlite3.connect(db_path)
conn.row_factory = sqlite3.Row
# Query milestones
cursor = conn.execute("""
SELECT task_code, task_name, target_start_date, milestone_type
FROM activities
WHERE task_type IN ('TT_Mile', 'TT_FinMile')
ORDER BY target_start_date
""")
milestones = cursor.fetchall()
conn.close()
assert len(milestones) > 0
assert all(row["task_code"] for row in milestones)
```
**Files Changed**:
- `tests/integration/test_direct_db_access.py` (new)
**Verification**: `pytest tests/integration/test_direct_db_access.py -v`
**Acceptance**: External script workflow matches quickstart documentation
---
### Task 6.2: Update Version to 0.2.0 [X]
**Type**: Configuration
**Why**: Semantic versioning for new feature release
**Dependencies**: All previous tasks
**Steps**:
1. Update version in `pyproject.toml` to `0.2.0`
2. Update version in schema introspection response to `0.2.0`
3. Update any version references in documentation
**Files Changed**:
- `pyproject.toml`
- `src/xer_mcp/db/__init__.py` (SCHEMA_VERSION constant)
**Verification**: `grep -r "0.2.0" pyproject.toml src/`
**Acceptance**: Version consistently shows 0.2.0 across project
---
### Task 6.3: Run Full Test Suite and Linting [X]
**Type**: Verification
**Why**: Ensure all tests pass and code meets standards
**Dependencies**: All previous tasks
**Steps**:
1. Run `pytest` - all tests must pass
2. Run `ruff check .` - no linting errors
3. Run `ruff format --check .` - code properly formatted
**Verification**:
```bash
pytest
ruff check .
ruff format --check .
```
**Acceptance**: All tests pass, no linting errors, code properly formatted
---
### Task 6.4: Commit and Prepare for Merge [X]
**Type**: Git Operations
**Why**: Prepare feature for merge to main branch
**Dependencies**: Task 6.3
**Steps**:
1. Review all changes with `git diff main`
2. Commit any uncommitted changes with descriptive messages
3. Verify branch is ready for PR/merge
**Verification**: `git status` shows clean working directory
**Acceptance**: All changes committed, branch ready for merge
---
## Summary
| Phase | Tasks | Focus |
|-------|-------|-------|
| Phase 1 | 1 | Setup and verification |
| Phase 2 | 3 | DatabaseManager foundation |
| Phase 3 | 4 | US1 - Load to persistent DB (P1) |
| Phase 4 | 2 | US2 - Retrieve DB info (P1) |
| Phase 5 | 2 | US3 - Schema information (P2) |
| Phase 6 | 4 | Integration, versioning, polish |
**Total Tasks**: 16
**Estimated Test Count**: ~25 new tests