# Implementation Tasks: Direct Database Access for Scripts

**Branch**: `002-direct-db-access` | **Spec**: [spec.md](./spec.md) | **Plan**: [plan.md](./plan.md)

## Quick Reference

- **Feature**: Direct Database Access for Scripts
- **Version**: 0.2.0
- **Test Command**: `pytest`
- **Lint Command**: `ruff check .`

---

## Phase 1: Setup

### Task 1.1: Create Feature Branch and Documentation Structure [X]

**Type**: Setup
**Why**: Establish isolated workspace for feature development

**Steps**:
1. Verify branch `002-direct-db-access` exists (already created during spec/plan)
2. Verify all design artifacts exist:
   - `specs/002-direct-db-access/spec.md`
   - `specs/002-direct-db-access/plan.md`
   - `specs/002-direct-db-access/research.md`
   - `specs/002-direct-db-access/data-model.md`
   - `specs/002-direct-db-access/contracts/mcp-tools.json`
   - `specs/002-direct-db-access/quickstart.md`

**Verification**: `git branch --show-current` shows `002-direct-db-access`

**Acceptance**: All design artifacts present and branch is active

---

## Phase 2: Foundational - DatabaseManager Extension

### Task 2.1: Add File-Based Database Support to DatabaseManager [X]

**Type**: Implementation
**Why**: Foundation for all persistent database features
**Dependencies**: None
**Contract Reference**: `contracts/mcp-tools.json` - DatabaseInfo schema

**Test First** (TDD):
```python
# tests/unit/test_db_manager.py

def test_initialize_with_memory_by_default():
    """Default initialization uses in-memory database."""
    dm = DatabaseManager()
    dm.initialize()
    assert dm.db_path == ":memory:"
    assert dm.is_persistent is False

def test_initialize_with_file_path():
    """Can initialize with explicit file path."""
    dm = DatabaseManager()
    dm.initialize(db_path="/tmp/test.db")
    assert dm.db_path == "/tmp/test.db"
    assert dm.is_persistent is True
    # Cleanup
    os.unlink("/tmp/test.db")

def test_initialize_with_empty_string_auto_generates_path():
    """Empty string db_path with source_file auto-generates path."""
    dm = DatabaseManager()
    dm.initialize(db_path="", source_file="/path/to/schedule.xer")
    assert dm.db_path == "/path/to/schedule.sqlite"
    assert dm.is_persistent is True

def test_file_database_persists_after_close():
    """File-based database persists after connection close."""
    dm = DatabaseManager()
    dm.initialize(db_path="/tmp/persist_test.db")
    # Insert test data would go here
    dm.close()
    assert os.path.exists("/tmp/persist_test.db")
    # Cleanup
    os.unlink("/tmp/persist_test.db")
```

**Implementation**:
- Modify `src/xer_mcp/db/__init__.py`:
  - Add `db_path` property to track current database path
  - Add `is_persistent` property (True if file-based, False if in-memory)
  - Add `source_file` property to track loaded XER file
  - Add `loaded_at` property (datetime when data was loaded)
  - Modify `initialize()` to accept optional `db_path` and `source_file` parameters
  - If `db_path` is empty string and `source_file` provided, derive path from source file
  - Use WAL mode for file-based databases: `PRAGMA journal_mode=WAL`

**Files Changed**:
- `src/xer_mcp/db/__init__.py`
- `tests/unit/test_db_manager.py` (new)

**Verification**: `pytest tests/unit/test_db_manager.py -v`

**Acceptance**: All unit tests pass; DatabaseManager supports both in-memory and file-based modes

---

### Task 2.2: Implement Atomic Write Pattern [X]

**Type**: Implementation
**Why**: Prevents corrupted database files if process interrupted during load
**Dependencies**: Task 2.1
**Contract Reference**: `research.md` - Atomic Write Strategy

**Test First** (TDD):
```python
# tests/unit/test_db_manager.py

def test_atomic_write_creates_temp_file_first():
    """Database is created at .tmp path first, then renamed."""
    dm = DatabaseManager()
    target = "/tmp/atomic_test.db"
    # During initialization, temp file should exist
    # After completion, only target should exist
    dm.initialize(db_path=target)
    assert os.path.exists(target)
    assert not os.path.exists(target + ".tmp")
    os.unlink(target)

def test_atomic_write_removes_temp_on_failure():
    """Temp file is cleaned up if initialization fails."""
    # Test with invalid schema or similar failure scenario
    pass
```

**Implementation**:
- In `initialize()` method when `db_path` is not `:memory:`:
  1. Create connection to `{db_path}.tmp`
  2. Execute schema creation
  3. Close connection
  4. Rename `{db_path}.tmp` to `{db_path}` (atomic on POSIX)
  5. Reopen connection to final path
- Handle cleanup of `.tmp` file on failure

**Files Changed**:
- `src/xer_mcp/db/__init__.py`
- `tests/unit/test_db_manager.py`

**Verification**: `pytest tests/unit/test_db_manager.py -v`

**Acceptance**: Atomic write tests pass; no partial database files created on failure

---

### Task 2.3: Add Schema Introspection Query [X]

**Type**: Implementation
**Why**: Required for returning schema information in responses
**Dependencies**: Task 2.1
**Contract Reference**: `contracts/mcp-tools.json` - SchemaInfo, TableInfo, ColumnInfo schemas

**Test First** (TDD):
```python
# tests/unit/test_db_manager.py

def test_get_schema_info_returns_all_tables():
    """Schema info includes all database tables."""
    dm = DatabaseManager()
    dm.initialize()
    schema = dm.get_schema_info()
    assert schema["version"] == "0.2.0"
    table_names = [t["name"] for t in schema["tables"]]
    assert "projects" in table_names
    assert "activities" in table_names
    assert "relationships" in table_names
    assert "wbs" in table_names
    assert "calendars" in table_names

def test_get_schema_info_includes_column_details():
    """Schema info includes column names, types, and nullable."""
    dm = DatabaseManager()
    dm.initialize()
    schema = dm.get_schema_info()
    activities_table = next(t for t in schema["tables"] if t["name"] == "activities")
    column_names = [c["name"] for c in activities_table["columns"]]
    assert "task_id" in column_names
    assert "task_name" in column_names
    # Check column details
    task_id_col = next(c for c in activities_table["columns"] if c["name"] == "task_id")
    assert task_id_col["type"] == "TEXT"
    assert task_id_col["nullable"] is False

def test_get_schema_info_includes_row_counts():
    """Schema info includes row counts for each table."""
    dm = DatabaseManager()
    dm.initialize()
    schema = dm.get_schema_info()
    for table in schema["tables"]:
        assert "row_count" in table
        assert isinstance(table["row_count"], int)
```

**Implementation**:
- Add `get_schema_info()` method to DatabaseManager:
  - Query `sqlite_master` for table names
  - Use `PRAGMA table_info(table_name)` for column details
  - Use `PRAGMA foreign_key_list(table_name)` for foreign keys
  - Query `SELECT COUNT(*) FROM table` for row counts
  - Return SchemaInfo structure matching contract

**Files Changed**:
- `src/xer_mcp/db/__init__.py`
- `tests/unit/test_db_manager.py`

**Verification**: `pytest tests/unit/test_db_manager.py -v`

**Acceptance**: Schema introspection returns accurate table/column information

---

## Phase 3: User Story 1 - Load XER to Persistent Database (P1)

### Task 3.1: Extend load_xer Tool with db_path Parameter [X]

**Type**: Implementation
**Why**: Core feature - enables persistent database creation
**Dependencies**: Task 2.1, Task 2.2
**Contract Reference**: `contracts/mcp-tools.json` - load_xer inputSchema

**Test First** (TDD):
```python
# tests/contract/test_load_xer.py

@pytest.mark.asyncio
async def test_load_xer_with_db_path_creates_file(tmp_path, sample_xer_file):
    """load_xer with db_path creates persistent database file."""
    db_file = tmp_path / "schedule.db"
    result = await load_xer(sample_xer_file, db_path=str(db_file))
    assert result["success"] is True
    assert db_file.exists()
    assert result["database"]["db_path"] == str(db_file)
    assert result["database"]["is_persistent"] is True

@pytest.mark.asyncio
async def test_load_xer_with_empty_db_path_auto_generates(sample_xer_file):
    """load_xer with empty db_path generates path from XER filename."""
    result = await load_xer(sample_xer_file, db_path="")
    assert result["success"] is True
    expected_db = sample_xer_file.replace(".xer", ".sqlite")
    assert result["database"]["db_path"] == expected_db
    assert result["database"]["is_persistent"] is True
    # Cleanup
    os.unlink(expected_db)

@pytest.mark.asyncio
async def test_load_xer_without_db_path_uses_memory(sample_xer_file):
    """load_xer without db_path uses in-memory database (backward compatible)."""
    result = await load_xer(sample_xer_file)
    assert result["success"] is True
    assert result["database"]["db_path"] == ":memory:"
    assert result["database"]["is_persistent"] is False

@pytest.mark.asyncio
async def test_load_xer_database_contains_all_data(tmp_path, sample_xer_file):
    """Persistent database contains all parsed data."""
    db_file = tmp_path / "schedule.db"
    result = await load_xer(sample_xer_file, db_path=str(db_file))
    # Verify data via direct SQL
    import sqlite3
    conn = sqlite3.connect(str(db_file))
    cursor = conn.execute("SELECT COUNT(*) FROM activities")
    count = cursor.fetchone()[0]
    conn.close()
    assert count == result["activity_count"]
```

**Implementation**:
- Modify `src/xer_mcp/tools/load_xer.py`:
  - Add `db_path: str | None = None` parameter
  - Pass `db_path` and `file_path` (as source_file) to `db.initialize()`
  - Include `database` field in response with DatabaseInfo

**Files Changed**:
- `src/xer_mcp/tools/load_xer.py`
- `tests/contract/test_load_xer.py`

**Verification**: `pytest tests/contract/test_load_xer.py -v`

**Acceptance**: load_xer creates persistent database when db_path provided

---

### Task 3.2: Add Database Info to load_xer Response [X]

**Type**: Implementation
**Why**: Response must include all info needed to connect to database
**Dependencies**: Task 3.1, Task 2.3
**Contract Reference**: `contracts/mcp-tools.json` - load_xer outputSchema.database

**Test First** (TDD):
```python
# tests/contract/test_load_xer.py

@pytest.mark.asyncio
async def test_load_xer_response_includes_database_info(tmp_path, sample_xer_file):
    """load_xer response includes complete database info."""
    db_file = tmp_path / "schedule.db"
    result = await load_xer(sample_xer_file, db_path=str(db_file))
    assert "database" in result
    db_info = result["database"]
    assert "db_path" in db_info
    assert "is_persistent" in db_info
    assert "source_file" in db_info
    assert "loaded_at" in db_info
    assert "schema" in db_info

@pytest.mark.asyncio
async def test_load_xer_response_schema_includes_tables(tmp_path, sample_xer_file):
    """load_xer response schema includes table information."""
    db_file = tmp_path / "schedule.db"
    result = await load_xer(sample_xer_file, db_path=str(db_file))
    schema = result["database"]["schema"]
    assert "version" in schema
    assert "tables" in schema
    table_names = [t["name"] for t in schema["tables"]]
    assert "activities" in table_names
    assert "relationships" in table_names
```

**Implementation**:
- Modify `src/xer_mcp/tools/load_xer.py`:
  - After successful load, call `db.get_schema_info()`
  - Build DatabaseInfo response structure
  - Include in return dictionary

**Files Changed**:
- `src/xer_mcp/tools/load_xer.py`
- `tests/contract/test_load_xer.py`

**Verification**: `pytest tests/contract/test_load_xer.py -v`

**Acceptance**: load_xer response includes complete DatabaseInfo with schema

---

### Task 3.3: Register db_path Parameter with MCP Server [X]

**Type**: Implementation
**Why**: MCP server must expose the new parameter to clients
**Dependencies**: Task 3.1
**Contract Reference**: `contracts/mcp-tools.json` - load_xer inputSchema

**Test First** (TDD):
```python
# tests/contract/test_load_xer.py

def test_load_xer_tool_schema_includes_db_path():
    """MCP tool schema includes db_path parameter."""
    from xer_mcp.server import server
    tools = server.list_tools()
    load_xer_tool = next(t for t in tools if t.name == "load_xer")
    props = load_xer_tool.inputSchema["properties"]
    assert "db_path" in props
    assert props["db_path"]["type"] == "string"
```

**Implementation**:
- Modify MCP tool registration in `src/xer_mcp/server.py`:
  - Add `db_path` to load_xer tool inputSchema
  - Update tool handler to pass db_path to load_xer function

**Files Changed**:
- `src/xer_mcp/server.py`
- `tests/contract/test_load_xer.py`

**Verification**: `pytest tests/contract/test_load_xer.py -v`

**Acceptance**: MCP server exposes db_path parameter for load_xer tool

---

### Task 3.4: Handle Database Write Errors [X]

**Type**: Implementation
**Why**: Clear error messages for write failures (FR-008)
**Dependencies**: Task 3.1
**Contract Reference**: `contracts/mcp-tools.json` - Error schema

**Test First** (TDD):
```python
# tests/contract/test_load_xer.py

@pytest.mark.asyncio
async def test_load_xer_error_on_unwritable_path(sample_xer_file):
    """load_xer returns error for unwritable path."""
    result = await load_xer(sample_xer_file, db_path="/root/forbidden.db")
    assert result["success"] is False
    assert result["error"]["code"] == "FILE_NOT_WRITABLE"

@pytest.mark.asyncio
async def test_load_xer_error_on_invalid_path(sample_xer_file):
    """load_xer returns error for invalid path."""
    result = await load_xer(sample_xer_file, db_path="/nonexistent/dir/file.db")
    assert result["success"] is False
    assert result["error"]["code"] == "FILE_NOT_WRITABLE"
```

**Implementation**:
- Add error handling in load_xer for database creation failures:
  - Catch PermissionError → FILE_NOT_WRITABLE
  - Catch OSError with ENOSPC → DISK_FULL
  - Catch other sqlite3 errors → DATABASE_ERROR

**Files Changed**:
- `src/xer_mcp/tools/load_xer.py`
- `src/xer_mcp/errors.py` (add new error classes if needed)
- `tests/contract/test_load_xer.py`

**Verification**: `pytest tests/contract/test_load_xer.py -v`

**Acceptance**: Clear error messages returned for all database write failures

---

## Phase 4: User Story 2 - Retrieve Database Connection Information (P1)

### Task 4.1: Create get_database_info Tool [X]

**Type**: Implementation
**Why**: Allows retrieval of database info without reloading
**Dependencies**: Task 2.3
**Contract Reference**: `contracts/mcp-tools.json` - get_database_info

**Test First** (TDD):
```python
# tests/contract/test_get_database_info.py

@pytest.mark.asyncio
async def test_get_database_info_returns_current_database(tmp_path, sample_xer_file):
    """get_database_info returns info about currently loaded database."""
    db_file = tmp_path / "schedule.db"
    await load_xer(sample_xer_file, db_path=str(db_file))
    result = await get_database_info()
    assert "database" in result
    assert result["database"]["db_path"] == str(db_file)
    assert result["database"]["is_persistent"] is True

@pytest.mark.asyncio
async def test_get_database_info_error_when_no_database():
    """get_database_info returns error when no database loaded."""
    # Reset database state
    from xer_mcp.db import db
    db.close()
    result = await get_database_info()
    assert "error" in result
    assert result["error"]["code"] == "NO_FILE_LOADED"

@pytest.mark.asyncio
async def test_get_database_info_includes_schema(tmp_path, sample_xer_file):
    """get_database_info includes schema information."""
    db_file = tmp_path / "schedule.db"
    await load_xer(sample_xer_file, db_path=str(db_file))
    result = await get_database_info()
    assert "schema" in result["database"]
    assert "tables" in result["database"]["schema"]
```

**Implementation**:
- Create `src/xer_mcp/tools/get_database_info.py`:
  - Check if database is initialized
  - Return NO_FILE_LOADED error if not
  - Return DatabaseInfo structure with schema

**Files Changed**:
- `src/xer_mcp/tools/get_database_info.py` (new)
- `tests/contract/test_get_database_info.py` (new)

**Verification**: `pytest tests/contract/test_get_database_info.py -v`

**Acceptance**: get_database_info returns complete DatabaseInfo or appropriate error

---

### Task 4.2: Register get_database_info with MCP Server [X]

**Type**: Implementation
**Why**: MCP server must expose the new tool to clients
**Dependencies**: Task 4.1
**Contract Reference**: `contracts/mcp-tools.json` - get_database_info

**Test First** (TDD):
```python
# tests/contract/test_get_database_info.py

def test_get_database_info_tool_registered():
    """get_database_info tool is registered with MCP server."""
    from xer_mcp.server import server
    tools = server.list_tools()
    tool_names = [t.name for t in tools]
    assert "get_database_info" in tool_names
```

**Implementation**:
- Modify `src/xer_mcp/server.py`:
  - Import get_database_info function
  - Add tool definition with empty inputSchema
  - Add handler for get_database_info calls

**Files Changed**:
- `src/xer_mcp/server.py`
- `tests/contract/test_get_database_info.py`

**Verification**: `pytest tests/contract/test_get_database_info.py -v`

**Acceptance**: get_database_info tool accessible via MCP

---

## Phase 5: User Story 3 - Query Database Schema Information (P2)

### Task 5.1: Add Primary Key Information to Schema [X]

**Type**: Implementation
**Why**: Helps developers understand table structure for queries
**Dependencies**: Task 2.3
**Contract Reference**: `contracts/mcp-tools.json` - TableInfo.primary_key

**Test First** (TDD):
```python
# tests/unit/test_db_manager.py

def test_schema_info_includes_primary_keys():
    """Schema info includes primary key for each table."""
    dm = DatabaseManager()
    dm.initialize()
    schema = dm.get_schema_info()
    activities_table = next(t for t in schema["tables"] if t["name"] == "activities")
    assert "primary_key" in activities_table
    assert "task_id" in activities_table["primary_key"]
```

**Implementation**:
- Enhance `get_schema_info()` in DatabaseManager:
  - Use `PRAGMA table_info()` to identify PRIMARY KEY columns
  - Add to TableInfo structure

**Files Changed**:
- `src/xer_mcp/db/__init__.py`
- `tests/unit/test_db_manager.py`

**Verification**: `pytest tests/unit/test_db_manager.py -v`

**Acceptance**: Primary key information included in schema response

---

### Task 5.2: Add Foreign Key Information to Schema [X]

**Type**: Implementation
**Why**: Documents table relationships for complex queries
**Dependencies**: Task 5.1
**Contract Reference**: `contracts/mcp-tools.json` - ForeignKeyInfo

**Test First** (TDD):
```python
# tests/unit/test_db_manager.py

def test_schema_info_includes_foreign_keys():
    """Schema info includes foreign key relationships."""
    dm = DatabaseManager()
    dm.initialize()
    schema = dm.get_schema_info()
    activities_table = next(t for t in schema["tables"] if t["name"] == "activities")
    assert "foreign_keys" in activities_table
    # activities.proj_id -> projects.proj_id
    fk = next((fk for fk in activities_table["foreign_keys"] if fk["column"] == "proj_id"), None)
    assert fk is not None
    assert fk["references_table"] == "projects"
    assert fk["references_column"] == "proj_id"
```

**Implementation**:
- Enhance `get_schema_info()` in DatabaseManager:
  - Use `PRAGMA foreign_key_list(table_name)` to get FK relationships
  - Add to TableInfo structure

**Files Changed**:
- `src/xer_mcp/db/__init__.py`
- `tests/unit/test_db_manager.py`

**Verification**: `pytest tests/unit/test_db_manager.py -v`

**Acceptance**: Foreign key information included in schema response

---

## Phase 6: Polish

### Task 6.1: Integration Test - External Script Access [X]

**Type**: Testing
**Why**: Validates end-to-end workflow matches quickstart documentation
**Dependencies**: All previous tasks
**Contract Reference**: `quickstart.md` - Python Example

**Test**:
```python
# tests/integration/test_direct_db_access.py

@pytest.mark.asyncio
async def test_external_script_can_query_database(tmp_path, sample_xer_file):
    """External script can query database using returned path."""
    db_file = tmp_path / "schedule.db"
    result = await load_xer(sample_xer_file, db_path=str(db_file))

    # Simulate external script access (as shown in quickstart.md)
    import sqlite3
    db_path = result["database"]["db_path"]
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row

    # Query milestones
    cursor = conn.execute("""
        SELECT task_code, task_name, target_start_date, milestone_type
        FROM activities
        WHERE task_type IN ('TT_Mile', 'TT_FinMile')
        ORDER BY target_start_date
    """)
    milestones = cursor.fetchall()
    conn.close()

    assert len(milestones) > 0
    assert all(row["task_code"] for row in milestones)
```

**Files Changed**:
- `tests/integration/test_direct_db_access.py` (new)

**Verification**: `pytest tests/integration/test_direct_db_access.py -v`

**Acceptance**: External script workflow matches quickstart documentation

---

### Task 6.2: Update Version to 0.2.0 [X]

**Type**: Configuration
**Why**: Semantic versioning for new feature release
**Dependencies**: All previous tasks

**Steps**:
1. Update version in `pyproject.toml` to `0.2.0`
2. Update version in schema introspection response to `0.2.0`
3. Update any version references in documentation

**Files Changed**:
- `pyproject.toml`
- `src/xer_mcp/db/__init__.py` (SCHEMA_VERSION constant)

**Verification**: `grep -r "0.2.0" pyproject.toml src/`

**Acceptance**: Version consistently shows 0.2.0 across project

---

### Task 6.3: Run Full Test Suite and Linting [X]

**Type**: Verification
**Why**: Ensure all tests pass and code meets standards
**Dependencies**: All previous tasks

**Steps**:
1. Run `pytest` - all tests must pass
2. Run `ruff check .` - no linting errors
3. Run `ruff format --check .` - code properly formatted

**Verification**:
```bash
pytest
ruff check .
ruff format --check .
```

**Acceptance**: All tests pass, no linting errors, code properly formatted

---

### Task 6.4: Commit and Prepare for Merge [X]

**Type**: Git Operations
**Why**: Prepare feature for merge to main branch
**Dependencies**: Task 6.3

**Steps**:
1. Review all changes with `git diff main`
2. Commit any uncommitted changes with descriptive messages
3. Verify branch is ready for PR/merge

**Verification**: `git status` shows clean working directory

**Acceptance**: All changes committed, branch ready for merge

---

## Summary

| Phase | Tasks | Focus |
|-------|-------|-------|
| Phase 1 | 1 | Setup and verification |
| Phase 2 | 3 | DatabaseManager foundation |
| Phase 3 | 4 | US1 - Load to persistent DB (P1) |
| Phase 4 | 2 | US2 - Retrieve DB info (P1) |
| Phase 5 | 2 | US3 - Schema information (P2) |
| Phase 6 | 4 | Integration, versioning, polish |

**Total Tasks**: 16
**Estimated Test Count**: ~25 new tests
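
**Reference sketch for Task 2.2 (atomic write)**: The five steps listed under Task 2.2 could look roughly like the following. This is a minimal sketch under stated assumptions, not the project's `DatabaseManager.initialize()`: the function name `create_database_atomically` and the `schema_sql` parameter are hypothetical, and WAL mode is enabled only after the rename so that no `-wal`/`-shm` side files are tied to the temporary name.

```python
import os
import sqlite3


def create_database_atomically(db_path: str, schema_sql: str) -> sqlite3.Connection:
    """Build the database at `<db_path>.tmp`, then rename it into place.

    Hypothetical helper for illustration; names are not taken from the project.
    """
    tmp_path = db_path + ".tmp"

    # Remove a stale temp file left behind by an interrupted earlier run.
    if os.path.exists(tmp_path):
        os.unlink(tmp_path)

    # Steps 1-2: connect to the temp path and create the schema there.
    conn = sqlite3.connect(tmp_path)
    try:
        conn.executescript(schema_sql)
        conn.commit()
    except Exception:
        # Failure path: close, delete the partial temp file, re-raise.
        conn.close()
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # Step 3: close the connection before renaming.
    conn.close()

    # Step 4: rename over the target (atomic on POSIX).
    os.replace(tmp_path, db_path)

    # Step 5: reopen at the final path and switch to WAL mode.
    final_conn = sqlite3.connect(db_path)
    final_conn.execute("PRAGMA journal_mode=WAL")
    return final_conn
```

With this shape, a reader of `db_path` sees either no file or a file whose schema was fully created, which is the property the Task 2.2 tests check for.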