# Research: Project Schedule Tools

**Date**: 2026-01-06
**Branch**: `001-schedule-tools`

## XER File Format

### Decision: Parse tab-delimited format with %T table headers

**Rationale**: XER is Primavera P6's native export format. It uses a simple text-based structure that's straightforward to parse without external libraries.

**Format Structure**:

```
ERMHDR ...header info...
%T TABLE_NAME
%F field1 field2 field3 ...
%R value1 value2 value3 ...
%R value1 value2 value3 ...
%T NEXT_TABLE
...
%E
```

**Key Tables for Schedule Tools**:

| Table | Purpose | Key Fields |
|-------|---------|------------|
| PROJECT | Project metadata | proj_id, proj_short_name, plan_start_date, plan_end_date |
| TASK | Activities | task_id, task_code, task_name, task_type, target_start_date, target_end_date, act_start_date, act_end_date, driving_path_flag |
| TASKPRED | Relationships | task_pred_id, task_id, pred_task_id, pred_type, lag_hr_cnt |
| PROJWBS | WBS structure | wbs_id, wbs_short_name, wbs_name, parent_wbs_id, proj_id |
| CALENDAR | Work calendars | clndr_id, clndr_name, day_hr_cnt |

**Alternatives Considered**:

- XML export: More complex to parse, larger file sizes
- Database direct access: Requires P6 installation, not portable

## MCP Python SDK

### Decision: Use `mcp` package with stdio transport

**Rationale**: The official MCP Python SDK provides a clean async interface for building MCP servers. Stdio transport is simplest for local tools.

**Implementation Pattern**:

```python
from mcp.server.fastmcp import FastMCP

# FastMCP is the SDK's high-level server class; it provides the tool() decorator.
server = FastMCP("xer-mcp")


@server.tool()
async def load_xer(file_path: str, project_id: str | None = None) -> dict:
    """Load an XER file and optionally select a project."""
    ...


if __name__ == "__main__":
    # run() starts the event loop itself and serves over stdin/stdout.
    server.run(transport="stdio")
```

**Key Considerations**:

- All tools are async functions decorated with `@server.tool()`
- `@server.tool()` is provided by the SDK's `FastMCP` class; the low-level `mcp.server.Server` class has no `tool()` decorator and registers tools through `list_tools()` and `call_tool()` handlers instead
- Tool parameters use Python type hints for JSON schema generation
- Return values are automatically serialized to JSON
- Errors should raise `McpError` with appropriate error codes

**Alternatives Considered**:

- SSE transport: Adds complexity, not needed for local use
- Custom protocol: Would break MCP compatibility

## SQLite Schema Design

### Decision: In-memory SQLite with normalized tables

**Rationale**: SQLite in-memory mode provides fast queries without file I/O overhead. Normalized tables map directly to XER structure while enabling efficient JOINs for relationship queries.

**Schema Design**:

```sql
-- Project table
CREATE TABLE projects (
    proj_id TEXT PRIMARY KEY,
    proj_short_name TEXT NOT NULL,
    plan_start_date TEXT,  -- ISO8601
    plan_end_date TEXT,
    loaded_at TEXT NOT NULL
);

-- Activities table
CREATE TABLE activities (
    task_id TEXT PRIMARY KEY,
    proj_id TEXT NOT NULL REFERENCES projects(proj_id),
    wbs_id TEXT,
    task_code TEXT NOT NULL,
    task_name TEXT NOT NULL,
    task_type TEXT,  -- TT_Task, TT_Mile, TT_LOE, etc.
    target_start_date TEXT,
    target_end_date TEXT,
    act_start_date TEXT,
    act_end_date TEXT,
    early_start_date TEXT,  -- early dates are used by the driving-relationship computation
    early_end_date TEXT,
    total_float_hr_cnt REAL,
    driving_path_flag TEXT,  -- 'Y' or 'N'
    status_code TEXT
);

-- Relationships table
CREATE TABLE relationships (
    task_pred_id TEXT PRIMARY KEY,
    task_id TEXT NOT NULL REFERENCES activities(task_id),
    pred_task_id TEXT NOT NULL REFERENCES activities(task_id),
    pred_type TEXT NOT NULL,  -- PR_FS, PR_SS, PR_FF, PR_SF
    lag_hr_cnt REAL DEFAULT 0
);

-- WBS table
CREATE TABLE wbs (
    wbs_id TEXT PRIMARY KEY,
    proj_id TEXT NOT NULL REFERENCES projects(proj_id),
    parent_wbs_id TEXT REFERENCES wbs(wbs_id),
    wbs_short_name TEXT NOT NULL,
    wbs_name TEXT
);

-- Indexes for common queries
CREATE INDEX idx_activities_proj ON activities(proj_id);
CREATE INDEX idx_activities_wbs ON activities(wbs_id);
CREATE INDEX idx_activities_type ON activities(task_type);
CREATE INDEX idx_activities_dates ON activities(target_start_date, target_end_date);
CREATE INDEX idx_relationships_task ON relationships(task_id);
CREATE INDEX idx_relationships_pred ON relationships(pred_task_id);
CREATE INDEX idx_wbs_parent ON wbs(parent_wbs_id);
```

**Query Patterns**:

- Pagination: `LIMIT ? OFFSET ?` with `COUNT(*)` for total
- Date filtering: `WHERE target_start_date >= ? AND target_end_date <= ?`
- Critical path: `WHERE driving_path_flag = 'Y'`
- Predecessors: `SELECT * FROM relationships WHERE task_id = ?`
- Successors: `SELECT * FROM relationships WHERE pred_task_id = ?`

**Alternatives Considered**:

- File-based SQLite: Adds complexity for file management, not needed for single-session use
- In-memory dictionaries: Would require custom indexing for efficient queries
- DuckDB: Overkill for this use case, larger dependency

## XER Parsing Strategy

### Decision: Streaming line-by-line parser with table handler registry

**Rationale**: XER files can be large (50K+ activities). Streaming avoids loading the entire file into memory. A table handler registry enables extensibility per the constitution.
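
As a rough illustration of this decision, the sketch below streams a file line by line and dispatches each `%R` record to a handler registered per table. The names `detect_encoding`, `parse_xer`, and `handlers` are hypothetical, and the tab delimiter and the UTF-8-then-Windows-1252 fallback are assumptions taken from the notes in this document, not a tested implementation.

```python
from collections.abc import Callable
from pathlib import Path

# Hypothetical handler type: receives one record as a field-name -> value mapping.
Handler = Callable[[dict[str, str]], None]


def detect_encoding(path: Path) -> str:
    """Return 'utf-8' if the whole file decodes as UTF-8, otherwise 'cp1252'."""
    try:
        with path.open(encoding="utf-8") as fh:
            for _ in fh:  # decode line by line without keeping the content
                pass
        return "utf-8"
    except UnicodeDecodeError:
        return "cp1252"


def parse_xer(path: Path, handlers: dict[str, Handler]) -> None:
    """Stream an XER file and pass each record to the handler for its table."""
    table: str | None = None
    fields: list[str] = []
    encoding = detect_encoding(path)
    # errors="replace" keeps parsing if a byte is undefined in the fallback codec.
    with path.open(encoding=encoding, errors="replace") as fh:
        for raw in fh:
            parts = raw.rstrip("\r\n").split("\t")
            tag = parts[0]
            if tag == "%T":  # start of a new table
                table = parts[1] if len(parts) > 1 else None
                fields = []
            elif tag == "%F":  # field names for the current table
                fields = parts[1:]
            elif tag == "%R" and table in handlers:  # one record
                handlers[table](dict(zip(fields, parts[1:])))
            elif tag == "%E":  # end-of-file marker
                break
```

Tables without a registered handler are skipped, so supporting a new table only means adding an entry to `handlers`. The detection pass reads the file twice; that trades some I/O for never holding the full content in memory.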

**Implementation Approach**:

1. Read file line by line
2. Track current table context (%T lines)
3. Parse %F lines as field headers
4. Parse %R lines as records using current field map
5. Dispatch to registered table handler
6. Handler converts to model and inserts into SQLite

**Encoding Handling**:

- XER files use Windows-1252 encoding by default
- Attempt UTF-8 first, fall back to Windows-1252
- Log encoding detection result

## Pagination Implementation

### Decision: Offset-based pagination with metadata

**Rationale**: Simple to implement with SQLite's LIMIT/OFFSET. Metadata enables clients to navigate results.

**Response Format**:

```python
from dataclasses import dataclass


# Defined first so PaginatedResponse can reference it in its annotations.
@dataclass
class PaginationMetadata:
    total_count: int
    offset: int
    limit: int
    has_more: bool


@dataclass
class PaginatedResponse:
    items: list[dict]
    pagination: PaginationMetadata
```

**Default Limit**: 100 items (per spec clarification)

## Error Handling

### Decision: Structured MCP errors with codes

**Rationale**: MCP protocol defines error format. Consistent error codes help clients handle failures.

**Error Codes**:

| Code | Name | When Used |
|------|------|-----------|
| -32001 | FILE_NOT_FOUND | XER file path doesn't exist |
| -32002 | PARSE_ERROR | XER file is malformed |
| -32003 | NO_FILE_LOADED | Query attempted before load |
| -32004 | PROJECT_SELECTION_REQUIRED | Multi-project file without selection |
| -32005 | ACTIVITY_NOT_FOUND | Requested activity ID doesn't exist |
| -32006 | INVALID_PARAMETER | Bad filter/pagination parameters |

## Driving Relationship Flag

**Research Date**: 2026-01-06

### Question: What field in the XER TASKPRED table contains the driving relationship flag?

**Finding**: The TASKPRED table in P6 XER files does NOT contain a direct `driving_flag` field.

**Evidence**: Analysis of sample XER file (S48019R - Proposal Schedule):

```
%F task_pred_id task_id pred_task_id proj_id pred_proj_id pred_type lag_hr_cnt comments float_path aref arls
```

Fields available:

- `task_pred_id` - Unique relationship identifier
- `task_id` - Successor activity ID
- `pred_task_id` - Predecessor activity ID
- `proj_id` / `pred_proj_id` - Project identifiers
- `pred_type` - Relationship type (PR_FS, PR_SS, PR_FF, PR_SF)
- `lag_hr_cnt` - Lag duration in hours
- `comments` - User comments
- `float_path` - Float path indicator (contains dates, not boolean)
- `aref` / `arls` - Activity reference dates

### Question: Where is driving/critical path information stored in P6 XER files?

**Finding**: The `driving_path_flag` is stored at the ACTIVITY level on the TASK table, not on individual relationships.

**Evidence**:

```
TASK table includes: driving_path_flag (Y/N)
```

This flag indicates whether an activity is on the driving/critical path, but does not indicate which specific predecessor relationship is driving that activity's dates.

### Question: Can driving relationships be derived from available data?

**Finding**: Yes, driving relationships can be computed using schedule date comparison logic.

A relationship is "driving" when the successor activity's early start is constrained by the predecessor's completion. For a Finish-to-Start (FS) relationship:

```
driving = (predecessor.early_end_date + lag_hours ≈ successor.early_start_date)
```

### Decision: Compute driving flag at query time using early dates

**Rationale**:

1. P6 does not export a pre-computed driving flag per relationship
2. The driving relationship determination can be computed from activity dates
3. This matches how P6 itself determines driving relationships in the UI

**Implementation Approach**:

1. Early dates (`early_start_date`, `early_end_date`) are already parsed from TASK table
2. When querying relationships, compute `driving` by comparing dates
3. For FS: Compare `pred.early_end_date + lag` to `succ.early_start_date`
4. Use 1-hour tolerance for floating point date arithmetic

**Alternatives Considered**:

1. **Static flag from XER**: Not available in standard exports
2. **Always false**: Would not provide value to users
3. **Require user to specify**: Adds complexity, not aligned with P6 behavior

### Schema Impact

No schema changes needed for relationships table. Required activity date columns are already present:

- `activities.early_start_date` - Already in schema ✓
- `activities.early_end_date` - Already in schema ✓

The driving flag will be computed at query time via JOIN on activity dates.

### Validation Plan

- [ ] Verify early_start_date and early_end_date are parsed correctly from TASK table
- [ ] Test driving computation against known P6 schedules
- [ ] Confirm results match P6 "show driving" feature where possible