Add audit subagent and restructure skill for progressive disclosure

- Add audit subagent that runs automatically after bill entries
- Create download-attachment.sh for retrieving invoice/receipt PDFs
- Create verify-pdf.py for PDF extraction with OCR fallback
- Restructure SKILL.md from 856 to 177 lines using reference files
- Move detailed content to references/:
  - schema.md: table schemas
  - workflows.md: code examples
  - queries.md: SQL queries and financial reports
  - audit.md: audit queries and remediation steps
2026-01-12 12:45:32 -05:00
parent 70ac6be681
commit 4ebc19408c
7 changed files with 1215 additions and 568 deletions

679
SKILL.md

@@ -5,55 +5,46 @@ description: Use when working with the Grist double-entry accounting system - re
# Grist Double-Entry Accounting System
## Overview
Double-entry accounting for a sole proprietorship. Every transaction creates balanced journal entries (debits = credits).
A complete double-entry accounting system for sole proprietorship service businesses. Every transaction creates balanced journal entries (debits = credits). Account balances roll up through the parent-child hierarchy.
## Quick Reference
| Task | Action |
|------|--------|
| Record vendor invoice | Create Bill + BillLines + Transaction + TransactionLines, then audit |
| Record payment | Create payment Transaction + BillPayment, update Bill status |
| Query balances | Use `sql_query` on Accounts table |
| Generate reports | See [queries.md](references/queries.md) |
## Recording Transactions: Decision Guide
| Source Document | What to Create |
|-----------------|----------------|
| **Invoice/Bill PDF from vendor** | Bill + BillLines + Transaction + TransactionLines |
| **Invoice/Bill from vendor** | Bill + BillLines + Transaction + TransactionLines |
| **Receipt showing payment** | BillPayment + attach Receipt to existing Bill |
| **Bank statement entry** | Transaction + TransactionLines only |
| **Journal adjustment** | Transaction + TransactionLines only |
**Key Rule:** If there's a vendor invoice number, always create a Bill record—not just a transaction. Bills provide:
- Vendor tracking and AP aging
- Invoice/Receipt attachment storage
- Payment status tracking (Open/Partial/Paid)
**Key Rule:** If there's a vendor invoice number, always create a Bill record.
### Quick Reference: Vendor Invoice Paid by Owner
When recording an invoice that was already paid by the owner:
1. Upload Invoice attachment → get attachment_id
2. Create Bill (Status: "Paid", Invoice: ["L", attachment_id])
3. Create BillLine(s) with expense account and amount
4. Create Transaction (debit Expense, credit Due to Owner)
5. Create TransactionLines
6. Update Bill.EntryTransaction to link to transaction
7. Create BillPayment record
8. Upload Receipt attachment if available
## MCP Tools Available
## MCP Tools
| Tool | Purpose |
|------|---------|
| `mcp__grist-accounting__list_documents` | List accessible Grist documents |
| `mcp__grist-accounting__list_tables` | List tables in a document |
| `mcp__grist-accounting__describe_table` | Get column schema for a table |
| `mcp__grist-accounting__get_records` | Fetch records (with optional filter, sort, limit) |
| `mcp__grist-accounting__add_records` | Insert new records, returns `{"inserted_ids": [...]}` |
| `mcp__grist-accounting__update_records` | Update existing records by ID |
| `mcp__grist-accounting__delete_records` | Delete records by ID |
| `mcp__grist-accounting__sql_query` | Run read-only SQL queries |
| `list_documents` | List accessible Grist documents |
| `list_tables` | List tables in a document |
| `describe_table` | Get column schema |
| `get_records` | Fetch records (filter, sort, limit) |
| `add_records` | Insert records, returns `{"inserted_ids": [...]}` |
| `update_records` | Update by ID |
| `delete_records` | Delete by ID |
| `sql_query` | Read-only SQL |
The document name is `accounting` for all operations.
Document name: `accounting`
## Date Handling
All date fields use **Unix timestamps** (seconds since 1970-01-01 UTC).
All dates use **Unix timestamps** (seconds since epoch).
| Date | Timestamp |
|------|-----------|
@@ -62,92 +53,6 @@ All date fields use **Unix timestamps** (seconds since 1970-01-01 UTC).
| Dec 1, 2025 | 1764547200 |
| Jan 1, 2026 | 1767225600 |
Python: `int(datetime(2025, 10, 1).timestamp())`
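A timezone-naive `datetime` is interpreted in the machine's local time, so the resulting timestamp can shift by several hours depending on where the code runs. The following is a minimal sketch, assuming dates are meant to be stored as midnight UTC; the helper names `to_unix_utc` and `from_unix_utc` are hypothetical and not part of the skill.

```python
from datetime import datetime, timezone


def to_unix_utc(year: int, month: int, day: int) -> int:
    """Return the Unix timestamp (seconds) for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def from_unix_utc(ts: int) -> str:
    """Render a stored Unix timestamp as an ISO date in UTC, for spot checks."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
```

Round-tripping a value through `from_unix_utc` before inserting it is a cheap way to catch off-by-one-day mistakes.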
## Complete Table Schemas
### Accounts
| Column | Type | Notes |
|--------|------|-------|
| Code | Text | Account number (e.g., "5080") |
| Name | Text | Account name |
| Type | Choice | "Asset", "Liability", "Equity", "Income", "Expense" |
| Parent | Ref:Accounts | Parent account for hierarchy (0 = top-level) |
| Description | Text | |
| IsActive | Bool | |
| Balance | Formula | Calculated from transactions |
### Vendors
| Column | Type | Notes |
|--------|------|-------|
| Name | Text | Vendor name |
| DefaultExpenseAccount | Ref:Accounts | Auto-fills on bill lines |
| PaymentTerms | Choice | "Due on Receipt", "Net 15", "Net 30" |
| Notes | Text | |
| IsActive | Bool | |
| Balance | Formula | Sum of unpaid bills |
### Items
| Column | Type | Notes |
|--------|------|-------|
| Name | Text | Item name (e.g., "Software Subscription") |
| DefaultAccount | Ref:Accounts | Expense account for this item |
| DefaultDescription | Text | Auto-fills on bill lines |
| IsActive | Bool | |
### Bills
| Column | Type | Notes |
|--------|------|-------|
| Vendor | Ref:Vendors | Required |
| BillNumber | Text | Invoice number from vendor |
| BillDate | Date | Unix timestamp |
| DueDate | Date | Unix timestamp |
| Status | Choice | "Open", "Partial", "Paid" |
| Memo | Text | |
| EntryTransaction | Ref:Transactions | Link to journal entry |
| Invoice | Attachments | Vendor invoice document (use `["L", id]` format) |
| Receipt | Attachments | Payment receipt/confirmation (use `["L", id]` format) |
| Amount | Formula | Sum of BillLines.Amount |
| AmountPaid | Formula | Sum of BillPayments.Amount |
| AmountDue | Formula | Amount - AmountPaid |
### BillLines
| Column | Type | Notes |
|--------|------|-------|
| Bill | Ref:Bills | Required |
| Item | Ref:Items | Optional - auto-fills Account/Description |
| Account | Ref:Accounts | Expense account |
| Description | Text | |
| Amount | Numeric | Line item amount |
### Transactions
| Column | Type | Notes |
|--------|------|-------|
| Date | Date | Unix timestamp |
| Description | Text | Transaction description |
| Reference | Text | Check number, invoice reference, etc. |
| Status | Choice | "Draft", "Posted", "Cleared" |
| Memo | Text | |
| Total | Formula | Sum of debits |
| IsBalanced | Formula | True if debits = credits |
### TransactionLines
| Column | Type | Notes |
|--------|------|-------|
| Transaction | Ref:Transactions | Required |
| Account | Ref:Accounts | Required |
| Debit | Numeric | Debit amount (or 0) |
| Credit | Numeric | Credit amount (or 0) |
| Memo | Text | |
### BillPayments
| Column | Type | Notes |
|--------|------|-------|
| Bill | Ref:Bills | Required |
| Transaction | Ref:Transactions | Payment journal entry |
| Amount | Numeric | Payment amount |
| PaymentDate | Date | Unix timestamp |
## Key Account IDs
| ID | Code | Name | Type |
@@ -157,478 +62,116 @@ Python: `int(datetime(2025, 10, 1).timestamp())`
| 22 | 2203 | Due to Owner | Liability |
| 36 | 5080 | Software & Subscriptions | Expense |
Query all accounts:
```sql
SELECT id, Code, Name, Type FROM Accounts WHERE IsActive = true ORDER BY Code
```
Query all: `SELECT id, Code, Name, Type FROM Accounts WHERE IsActive = true ORDER BY Code`
## Account Types
| Type | Normal Balance | Increases With | Examples |
|------|----------------|----------------|----------|
| Asset | Debit | Debit | Cash, AR, Prepaid |
| Liability | Credit | Credit | AP, Credit Cards, Due to Owner |
| Equity | Credit | Credit | Owner's Investment, Draws, Retained Earnings |
| Income | Credit | Credit | Service Revenue, Interest Income |
| Expense | Debit | Debit | Rent, Utilities, Office Supplies |
## Complete Workflows
### Create a Vendor
```python
add_records("Vendors", [{
"Name": "Acme Corp",
"DefaultExpenseAccount": 36, # Software & Subscriptions
"PaymentTerms": "Due on Receipt",
"Notes": "Software vendor",
"IsActive": True
}])
# Returns: {"inserted_ids": [vendor_id]}
```
### Create Items for Common Purchases
```python
add_records("Items", [{
"Name": "Monthly Software",
"DefaultAccount": 36,
"DefaultDescription": "Monthly SaaS subscription",
"IsActive": True
}])
```
### Complete Bill Entry (5 Steps)
**Step 1: Create Bill Header**
```python
add_records("Bills", [{
"Vendor": 1, # vendor_id
"BillNumber": "INV-001",
"BillDate": 1759708800, # Unix timestamp
"DueDate": 1759708800,
"Status": "Open",
"Memo": "October services"
}])
# Returns: {"inserted_ids": [bill_id]}
```
**Step 2: Create Bill Line(s)**
```python
add_records("BillLines", [{
"Bill": 1, # bill_id from step 1
"Item": 1, # optional - auto-fills Account/Description
"Account": 36, # expense account
"Description": "Monthly subscription",
"Amount": 100.00
}])
```
**Step 3: Create Journal Entry**
```python
# Transaction header
add_records("Transactions", [{
"Date": 1759708800,
"Description": "Acme Corp - October services",
"Reference": "INV-001",
"Status": "Posted"
}])
# Returns: {"inserted_ids": [txn_id]}
# Transaction lines: Debit expense, Credit AP
add_records("TransactionLines", [
{"Transaction": 1, "Account": 36, "Debit": 100.00, "Credit": 0, "Memo": "Monthly subscription"},
{"Transaction": 1, "Account": 4, "Debit": 0, "Credit": 100.00, "Memo": "Monthly subscription"}
])
```
**Step 4: Link Bill to Transaction**
```python
update_records("Bills", [{"id": 1, "fields": {"EntryTransaction": 1}}])
```
**Step 5: Upload Invoice (if available)**
If an invoice PDF is available, upload and link it to the Invoice field:
```bash
# Get session token, then upload to Invoice field
bash /path/to/scripts/upload-attachment.sh invoice.pdf Bills 1 $TOKEN Invoice
```
Or for batch uploads, use a script (see Batch Operations).
### Pay Bill from Checking Account
```python
# Step 1: Create payment transaction
add_records("Transactions", [{
"Date": 1760832000,
"Description": "Payment - Acme Corp INV-001",
"Reference": "Check #1001",
"Status": "Cleared"
}])
# Returns: {"inserted_ids": [txn_id]}
# Step 2: Debit AP, Credit Checking
add_records("TransactionLines", [
{"Transaction": 2, "Account": 4, "Debit": 100.00, "Credit": 0, "Memo": "Pay INV-001"},
{"Transaction": 2, "Account": 14, "Debit": 0, "Credit": 100.00, "Memo": "Pay INV-001"}
])
# Step 3: Create BillPayment record
add_records("BillPayments", [{
"Bill": 1,
"Transaction": 2,
"Amount": 100.00,
"PaymentDate": 1760832000
}])
# Step 4: Update bill status
update_records("Bills", [{"id": 1, "fields": {"Status": "Paid"}}])
# Step 5: Upload receipt (if available) from a shell, not Python:
#   bash /path/to/scripts/upload-attachment.sh receipt.pdf Bills 1 $TOKEN Receipt
```
### Pay Bill via Owner Reimbursement
When the owner pays a business expense personally:
```python
# Step 1: Create payment transaction
add_records("Transactions", [{
"Date": 1760832000,
"Description": "Owner payment - Acme Corp INV-001",
"Reference": "Owner Reimb",
"Status": "Posted"
}])
# Step 2: Debit AP, Credit Due to Owner (not Checking)
add_records("TransactionLines", [
{"Transaction": 2, "Account": 4, "Debit": 100.00, "Credit": 0, "Memo": "Pay INV-001"},
{"Transaction": 2, "Account": 22, "Debit": 0, "Credit": 100.00, "Memo": "Owner paid"}
])
# Step 3: Create BillPayment record
add_records("BillPayments", [{
"Bill": 1,
"Transaction": 2,
"Amount": 100.00,
"PaymentDate": 1760832000
}])
# Step 4: Update bill status
update_records("Bills", [{"id": 1, "fields": {"Status": "Paid"}}])
# Step 5: Upload receipt (if available) from a shell, not Python:
#   bash /path/to/scripts/upload-attachment.sh receipt.pdf Bills 1 $TOKEN Receipt
```
### Reimburse Owner
When business pays back the owner:
```python
add_records("Transactions", [{
"Date": 1762041600,
"Description": "Owner reimbursement",
"Reference": "Transfer",
"Status": "Cleared"
}])
add_records("TransactionLines", [
{"Transaction": 3, "Account": 22, "Debit": 500.00, "Credit": 0, "Memo": "Reimburse owner"},
{"Transaction": 3, "Account": 14, "Debit": 0, "Credit": 500.00, "Memo": "Reimburse owner"}
])
```
## Batch Operations
To enter multiple bills efficiently:
1. **Create all Bills first** → collect inserted IDs
2. **Create all BillLines** referencing bill IDs
3. **Create all Transactions** → collect inserted IDs
4. **Create all TransactionLines** referencing transaction IDs
5. **Update all Bills** with EntryTransaction links in one call
6. (If paying) Create payment transactions, lines, and BillPayments
7. **Upload invoice attachments** if files are available
### Batch Attachment Uploads
When invoice files are available, upload them after bill entry:
1. Request session token with write permission (1 hour TTL for batch work)
2. Create a mapping of bill_id → invoice file path
3. Loop: upload each file, link to corresponding bill
```bash
# Example batch upload pattern for invoices (outline; the MCP steps are not shell commands)
# 1. Obtain TOKEN via the MCP request_session_token tool, with write permission
# 2. For each (bill_id, invoice_path) pair, upload the file:
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -F "file=@$invoice_path" \
  https://grist-mcp.bballou.com/api/v1/attachments
# Returns attachment_id
# 3. Link the attachment via the MCP update_records tool:
#    update_records("Bills", [{"id": bill_id, "fields": {"Invoice": ["L", attachment_id]}}])
# For receipts (after payment):
#    update_records("Bills", [{"id": bill_id, "fields": {"Receipt": ["L", attachment_id]}}])
```
Example batch update:
```python
update_records("Bills", [
{"id": 1, "fields": {"EntryTransaction": 1}},
{"id": 2, "fields": {"EntryTransaction": 2}},
{"id": 3, "fields": {"EntryTransaction": 3}}
])
```
## Common Queries
### Unpaid Bills by Vendor
```sql
SELECT v.Name, b.BillNumber, b.BillDate, b.Amount, b.AmountDue
FROM Bills b
JOIN Vendors v ON b.Vendor = v.id
WHERE b.Status IN ('Open', 'Partial')
ORDER BY b.DueDate
```
### Bills Summary by Vendor
```sql
SELECT v.Name as Vendor, COUNT(b.id) as Bills, SUM(b.Amount) as Total, SUM(b.AmountDue) as Due
FROM Bills b
JOIN Vendors v ON b.Vendor = v.id
GROUP BY v.Name
ORDER BY Total DESC
```
### Account Balances (Non-Zero)
```sql
SELECT Code, Name, Type, Balance
FROM Accounts
WHERE Balance != 0
ORDER BY Code
```
### Owner Reimbursement Balance
```sql
SELECT Balance FROM Accounts WHERE Code = '2203'
```
### Expense Summary by Account
```sql
SELECT a.Code, a.Name, a.Balance
FROM Accounts a
WHERE a.Type = 'Expense' AND a.Balance != 0
ORDER BY a.Balance DESC
```
### Transaction History for Account
```sql
SELECT t.Date, t.Description, t.Reference, tl.Debit, tl.Credit
FROM TransactionLines tl
JOIN Transactions t ON tl.Transaction = t.id
WHERE tl.Account = 36
ORDER BY t.Date DESC
```
### Verify All Transactions Balance
```sql
SELECT id, Description, Total, IsBalanced
FROM Transactions
WHERE IsBalanced = false
```
## Financial Reports
### Balance Sheet
Shows Assets = Liabilities + Equity at a point in time.
**Important:** Parent accounts roll up child balances. Query only top-level parents (Parent = 0) to avoid double-counting.
```sql
-- Assets, Liabilities, Equity (top-level only)
SELECT Code, Name, Type, Balance
FROM Accounts
WHERE Type IN ('Asset', 'Liability', 'Equity')
AND Parent = 0
ORDER BY Type, Code
```
```sql
-- Net Income (for Equity section)
-- Query leaf income and expense accounts only (no children)
SELECT
COALESCE(SUM(CASE WHEN Type = 'Income' THEN Balance ELSE 0 END), 0) -
COALESCE(SUM(CASE WHEN Type = 'Expense' THEN Balance ELSE 0 END), 0) as NetIncome
FROM Accounts
WHERE Type IN ('Income', 'Expense')
AND id NOT IN (SELECT DISTINCT Parent FROM Accounts WHERE Parent != 0)
```
**Presentation:**
| **Assets** | |
|---|---:|
| Cash & Bank Accounts | $X.XX |
| Accounts Receivable | $X.XX |
| **Total Assets** | **$X.XX** |
| **Liabilities** | |
|---|---:|
| Accounts Payable | $X.XX |
| Due to Owner | $X.XX |
| **Total Liabilities** | **$X.XX** |
| **Equity** | |
|---|---:|
| Retained Earnings | $X.XX |
| Net Income (Loss) | $X.XX |
| **Total Equity** | **$X.XX** |
| **Total Liabilities + Equity** | **$X.XX** |
### Income Statement
Shows Revenue - Expenses = Net Income for a period.
```sql
-- All income and expense accounts (leaf accounts only)
SELECT Code, Name, Type, Balance
FROM Accounts
WHERE Type IN ('Income', 'Expense')
AND Balance != 0
AND id NOT IN (SELECT DISTINCT Parent FROM Accounts WHERE Parent != 0)
ORDER BY Type DESC, Code
```
**Presentation:**
| **Income** | |
|---|---:|
| Service Revenue | $X.XX |
| **Total Income** | **$X.XX** |
| **Expenses** | |
|---|---:|
| Software & Subscriptions | $X.XX |
| Professional Services | $X.XX |
| **Total Expenses** | **$X.XX** |
| **Net Income (Loss)** | **$X.XX** |
### Trial Balance
Lists all accounts with non-zero balances. Debits should equal Credits.
```sql
SELECT
Code,
Name,
Type,
CASE WHEN Type IN ('Asset', 'Expense') THEN Balance ELSE 0 END as Debit,
CASE WHEN Type IN ('Liability', 'Equity', 'Income') THEN Balance ELSE 0 END as Credit
FROM Accounts
WHERE Balance != 0
AND id NOT IN (SELECT DISTINCT Parent FROM Accounts WHERE Parent != 0)
ORDER BY Code
```
### Accounts Payable Aging
```sql
SELECT
v.Name as Vendor,
b.BillNumber,
b.BillDate,
b.DueDate,
b.AmountDue,
CASE
WHEN b.DueDate >= strftime('%s', 'now') THEN 'Current'
WHEN b.DueDate >= strftime('%s', 'now') - 2592000 THEN '1-30 Days'
WHEN b.DueDate >= strftime('%s', 'now') - 5184000 THEN '31-60 Days'
ELSE '60+ Days'
END as Aging
FROM Bills b
JOIN Vendors v ON b.Vendor = v.id
WHERE b.Status IN ('Open', 'Partial')
ORDER BY b.DueDate
```
| Type | Normal Balance | Increases With |
|------|----------------|----------------|
| Asset | Debit | Debit |
| Liability | Credit | Credit |
| Equity | Credit | Credit |
| Income | Credit | Credit |
| Expense | Debit | Debit |
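The table above can be read as a sign rule: a posting raises a balance when it falls on the account's normal side and lowers it otherwise. The sketch below is illustrative only; `balance_change` is a hypothetical helper, and it assumes balances are expressed as positive numbers in the normal direction.

```python
DEBIT_NORMAL = {"Asset", "Expense"}
CREDIT_NORMAL = {"Liability", "Equity", "Income"}


def balance_change(account_type: str, debit: float, credit: float) -> float:
    """Return how much one line changes an account's balance.

    Positive means the balance grows in its normal direction.
    """
    if account_type in DEBIT_NORMAL:
        return debit - credit
    if account_type in CREDIT_NORMAL:
        return credit - debit
    raise ValueError(f"Unknown account type: {account_type!r}")
```

For example, debiting a Liability account (paying down AP) yields a negative change, which is why such lines are valid even though they run against the normal balance.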
## Bill Entry Workflow (6 Steps)
1. Create Bill header with Vendor, BillNumber, BillDate, DueDate, Status="Open"
2. Create BillLine(s) with expense Account and Amount
3. Create Transaction + TransactionLines (Dr Expense, Cr AP)
4. Link Bill.EntryTransaction to transaction ID
5. Upload Invoice attachment if available
6. **Run post-entry audit (REQUIRED)**
For detailed code: see [workflows.md](references/workflows.md)
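Step 3 of the workflow above can be sketched as a small helper that builds the TransactionLines payload: one debit per expense line and a single AP credit for the total. This is a sketch under assumptions: `bill_entry_lines` is a hypothetical helper, and the AP account id of 4 is taken from the skill's own examples, so it should be confirmed against the Accounts table before use.

```python
from decimal import Decimal

AP_ACCOUNT_ID = 4  # Accounts Payable (code 2000) in the skill's examples; verify first


def bill_entry_lines(txn_id, bill_lines, memo=""):
    """Build balanced TransactionLines for a bill entry (Dr Expense, Cr AP).

    bill_lines is a list of (expense_account_id, amount) pairs.
    """
    lines = []
    total = Decimal("0")
    for account_id, amount in bill_lines:
        amount = Decimal(str(amount))  # Decimal avoids float rounding drift
        total += amount
        lines.append({"Transaction": txn_id, "Account": account_id,
                      "Debit": float(amount), "Credit": 0, "Memo": memo})
    # A single credit to AP for the sum of all expense debits
    lines.append({"Transaction": txn_id, "Account": AP_ACCOUNT_ID,
                  "Debit": 0, "Credit": float(total), "Memo": memo})
    return lines
```

The returned list has the same shape as the records passed to `add_records("TransactionLines", ...)`.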
## Payment Workflows
**Pay from Checking:**
1. Create Transaction (Dr AP, Cr Checking)
2. Create BillPayment record
3. Update Bill.Status = "Paid"
4. Upload Receipt if available
5. **Run post-payment audit**
**Owner pays personally:**
Same as above but Cr Due to Owner (id=22) instead of Checking
For detailed code: see [workflows.md](references/workflows.md)
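The two payment variants differ only in the credited account, which the following sketch makes explicit. The helper `payment_lines` is hypothetical; the account ids (4 for AP, 14 for Checking, 22 for Due to Owner) are the ones used in the skill's examples and are assumptions to verify against the live document.

```python
AP_ACCOUNT_ID = 4             # Accounts Payable
CHECKING_ACCOUNT_ID = 14      # Checking, as used in the workflow examples
DUE_TO_OWNER_ACCOUNT_ID = 22  # Due to Owner (code 2203)


def payment_lines(txn_id, amount, paid_by_owner=False, memo=""):
    """Build TransactionLines for a bill payment (Dr AP, Cr Checking or Due to Owner)."""
    credit_account = DUE_TO_OWNER_ACCOUNT_ID if paid_by_owner else CHECKING_ACCOUNT_ID
    return [
        {"Transaction": txn_id, "Account": AP_ACCOUNT_ID,
         "Debit": amount, "Credit": 0, "Memo": memo},
        {"Transaction": txn_id, "Account": credit_account,
         "Debit": 0, "Credit": amount, "Memo": memo},
    ]
```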
## Validation Checklist
After entering bills, verify:
- [ ] Total bills match expected: `SELECT SUM(Amount) FROM Bills`
- [ ] All transactions balanced: `SELECT * FROM Transactions WHERE IsBalanced = false`
- [ ] AP balance correct: `SELECT Balance FROM Accounts WHERE Code = '2000'`
- [ ] Expense accounts increased appropriately
- [ ] Vendor balances reflect unpaid bills
- [ ] Invoices attached: `SELECT id, BillNumber FROM Bills WHERE Invoice IS NULL`
- [ ] Receipts attached for paid bills: `SELECT id, BillNumber FROM Bills WHERE Status = 'Paid' AND Receipt IS NULL`
After entering bills:
- [ ] `SELECT * FROM Transactions WHERE IsBalanced = false` returns empty
- [ ] `SELECT Balance FROM Accounts WHERE Code = '2000'` shows correct AP
- [ ] `SELECT id, BillNumber FROM Bills WHERE Invoice IS NULL` lists bills without an invoice; upload any that are missing
## Common Mistakes
| Mistake | Fix |
|---------|-----|
| Transaction not balanced | Ensure SUM(Debit) = SUM(Credit) before saving |
| Wrong debit/credit direction | Assets/Expenses increase with debit; Liabilities/Equity/Income increase with credit |
| Posting to parent account | Post to leaf accounts (1001 Checking, not 1000 Cash) |
| Forgetting AP entry for bills | Bills need both the expense debit AND the AP credit |
| Missing EntryTransaction link | Always update Bill.EntryTransaction after creating journal entry |
| Bill status not updated | Manually set Status to "Paid" after full payment |
| Using string dates | Dates must be Unix timestamps (seconds), not strings |
| Missing invoice/receipt | Upload invoice after bill entry, receipt after payment |
| Transaction not balanced | Ensure SUM(Debit) = SUM(Credit) |
| Wrong debit/credit direction | Assets/Expenses: debit increases; Liabilities/Equity/Income: credit increases |
| Posting to parent account | Post to leaf accounts (1001 not 1000) |
| Missing EntryTransaction link | Always link Bill to Transaction |
| Using string dates | Use Unix timestamps |
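The "transaction not balanced" mistake can be caught before anything is written. Below is a minimal sketch of a pre-insert check; `is_balanced` is a hypothetical helper, and it sums with `Decimal` because repeated float addition can make equal totals compare as unequal.

```python
from decimal import Decimal


def is_balanced(lines) -> bool:
    """Return True when SUM(Debit) equals SUM(Credit) across the given lines."""
    total_debit = sum(Decimal(str(line.get("Debit", 0))) for line in lines)
    total_credit = sum(Decimal(str(line.get("Credit", 0))) for line in lines)
    return total_debit == total_credit
```

Calling this on the list destined for `add_records("TransactionLines", ...)` complements, but does not replace, the `IsBalanced` formula check after insertion.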
## Uploading Attachments
Attachments (invoices, receipts) are uploaded via the HTTP proxy endpoint, not MCP tools. This is efficient for binary files.
### Workflow
1. **Request session token** with write permission via MCP
2. **Upload file** via `POST /api/v1/attachments` with multipart/form-data
3. **Link attachment** to record via `update_records`
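The linking step uses Grist's `["L", attachment_id]` list encoding for attachment columns. As a sketch, the update record can be built with a small helper; `attachment_link_update` is a hypothetical name, and the two column names are the ones defined on the Bills table.

```python
def attachment_link_update(bill_id: int, attachment_id: int, column: str = "Invoice") -> dict:
    """Build one update_records entry that links an uploaded attachment to a bill."""
    if column not in ("Invoice", "Receipt"):
        raise ValueError(f"Unsupported attachment column: {column!r}")
    return {"id": bill_id, "fields": {column: ["L", attachment_id]}}
```

The result is intended to be passed inside the list argument of `update_records("Bills", [...])`.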
### Upload Script
Use `scripts/upload-attachment.sh` in this skill directory:
```bash
# Get session token first (via MCP request_session_token tool)
# Then run:
bash scripts/upload-attachment.sh invoice.pdf Bills 13 # Invoice column (default)
bash scripts/upload-attachment.sh invoice.pdf Bills 13 $TOKEN # With token
bash scripts/upload-attachment.sh receipt.pdf Bills 13 $TOKEN Receipt # Receipt column
# Environment variable for custom endpoint:
GRIST_MCP_URL=https://custom.example.com bash scripts/upload-attachment.sh ...
# Get session token, then:
bash scripts/upload-attachment.sh invoice.pdf Bills {id} $TOKEN Invoice
bash scripts/upload-attachment.sh receipt.pdf Bills {id} $TOKEN Receipt
```
Run `bash scripts/upload-attachment.sh` without arguments for full usage.
## Audit Subagent
### Linking Attachments Manually
**REQUIRED:** Run audit checks after every bill entry before considering the entry complete.
Grist attachment columns use format: `["L", attachment_id]`
### Behavior
```python
# Link invoice to bill
update_records("Bills", [{"id": 13, "fields": {"Invoice": ["L", 1]}}])
Claude MUST run post-entry audit checks. The audit:
1. Executes independently from entry workflow
2. Validates all aspects of newly created records
3. Reports findings in structured format
4. Does not auto-correct - alerts user to take action
# Link receipt to bill (after payment)
update_records("Bills", [{"id": 13, "fields": {"Receipt": ["L", 2]}}])
### Audit Categories
| Category | Severity | Description |
|----------|----------|-------------|
| Transaction Balance | Critical | Debits must equal credits |
| Account Usage | Error | Correct account types |
| Bill Linkage | Error | EntryTransaction and Vendor set |
| Amount Match | Error | Bill.Amount matches transaction |
| PDF Verification | Warning | Document values match database |
| Missing Attachments | Warning | Invoice/Receipt attached |
### Quick Audit
```sql
-- Check transaction balanced
SELECT IsBalanced FROM Transactions WHERE id = {txn_id}
-- Check bill integrity
SELECT id, Vendor, EntryTransaction, Amount FROM Bills WHERE id = {bill_id}
```
## Formula Columns (Auto-Calculated)
### Output Format
| Table.Column | Description |
|--------------|-------------|
| Accounts.Balance | OwnBalance + ChildrenBalance |
| Transactions.IsBalanced | True if sum of debits = sum of credits |
| Transactions.Total | Sum of debit amounts |
| Bills.Amount | Sum of BillLines.Amount |
| Bills.AmountPaid | Sum of BillPayments.Amount |
| Bills.AmountDue | Amount - AmountPaid |
| Vendors.Balance | Sum of AmountDue for unpaid bills |
| Check | Status | Details |
|-------|--------|---------|
| Transaction Balanced | PASS/FAIL | ... |
| Bill Integrity | PASS/FAIL | ... |
| PDF Verification | PASS/WARN/SKIP | ... |
For full audit queries and remediation: see [audit.md](references/audit.md)
## Reference Files
| File | Contents |
|------|----------|
| [references/schema.md](references/schema.md) | Complete table schemas |
| [references/workflows.md](references/workflows.md) | Detailed code examples |
| [references/queries.md](references/queries.md) | SQL queries and financial reports |
| [references/audit.md](references/audit.md) | Audit queries and remediation |

161
references/audit.md Normal file

@@ -0,0 +1,161 @@
# Audit Reference
Detailed audit queries, workflows, and remediation steps.
## Contents
- [Audit SQL Queries](#audit-sql-queries)
- [PDF Verification Workflow](#pdf-verification-workflow)
- [Post-Entry Audit Checklist](#post-entry-audit-checklist)
- [Output Formats](#output-formats)
- [Remediation Steps](#remediation-steps)
## Audit SQL Queries
### Check 1: Unbalanced Transactions
```sql
-- Simple check using formula column
SELECT id, Date, Description, IsBalanced FROM Transactions WHERE IsBalanced = 0
-- Detailed check with actual sums (for debugging)
SELECT t.id, t.Description,
(SELECT SUM(Debit) FROM TransactionLines WHERE Transaction = t.id) as TotalDebit,
(SELECT SUM(Credit) FROM TransactionLines WHERE Transaction = t.id) as TotalCredit
FROM Transactions t
WHERE t.IsBalanced = 0
```
### Check 2: Invalid Account Usage
Detect debits to non-debit-normal accounts or credits to non-credit-normal accounts:
```sql
SELECT t.id, t.Description, tl.id as LineId, a.Name, a.Type,
tl.Debit, tl.Credit
FROM TransactionLines tl
JOIN Transactions t ON tl.Transaction = t.id
JOIN Accounts a ON tl.Account = a.id
WHERE (tl.Debit > 0 AND a.Type NOT IN ('Asset', 'Expense'))
OR (tl.Credit > 0 AND a.Type NOT IN ('Liability', 'Equity', 'Income'))
```
Note: This query flags unusual patterns. Some are valid (e.g., paying down AP debits a Liability). Review flagged items manually.
### Check 3: Bills Missing Required Links
```sql
SELECT b.id, b.BillNumber, b.Status, b.EntryTransaction, b.Vendor
FROM Bills b
WHERE b.EntryTransaction IS NULL
OR b.Vendor IS NULL
```
### Check 4: Bill/Transaction Amount Mismatch
```sql
SELECT b.id, b.BillNumber, b.Amount as BillAmount,
(SELECT SUM(tl.Debit) FROM TransactionLines tl
JOIN Accounts a ON tl.Account = a.id
WHERE tl.Transaction = b.EntryTransaction AND a.Type = 'Expense') as TxnExpense
FROM Bills b
WHERE b.EntryTransaction IS NOT NULL
AND b.Amount != (SELECT SUM(tl.Debit) FROM TransactionLines tl
JOIN Accounts a ON tl.Account = a.id
WHERE tl.Transaction = b.EntryTransaction AND a.Type = 'Expense')
```
### Check 5: Paid Bills Without BillPayments
```sql
SELECT b.id, b.BillNumber, b.Amount, b.AmountPaid, b.Status
FROM Bills b
WHERE b.Status = 'Paid' AND (b.AmountPaid IS NULL OR b.AmountPaid = 0)
```
### Check 6: Bills Missing Attachments
```sql
SELECT b.id, b.BillNumber, b.Invoice, b.Receipt, b.Status
FROM Bills b
WHERE b.Invoice IS NULL
OR (b.Status = 'Paid' AND b.Receipt IS NULL)
```
## PDF Verification Workflow
When an invoice attachment exists, verify its contents match the bill record:
1. **Download attachment**
```bash
bash scripts/download-attachment.sh <attachment_id> /tmp/invoice.pdf $TOKEN
```
2. **Extract invoice data**
```bash
python scripts/verify-pdf.py /tmp/invoice.pdf --json
```
3. **Compare extracted values**
- Invoice number vs Bill.BillNumber
- Date vs Bill.BillDate (allow 1 day tolerance)
- Amount vs Bill.Amount (must match within $0.01)
- Vendor name vs Vendors.Name (fuzzy match)
4. **Report discrepancies** with severity levels
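The comparison rules in step 3 can be sketched as follows. Everything about the extracted record is an assumption: the keys `invoice_number`, `date`, `amount`, and `vendor` are hypothetical (the actual output of `verify-pdf.py` is not shown here), dates are assumed to be Unix timestamps, and the 0.8 similarity threshold for the fuzzy vendor match is an arbitrary starting point.

```python
from difflib import SequenceMatcher

SECONDS_PER_DAY = 86400


def compare_invoice(extracted: dict, bill: dict, vendor_name: str,
                    name_threshold: float = 0.8) -> list:
    """Return the names of fields that disagree; an empty list means a match."""
    issues = []
    if extracted["invoice_number"] != bill["BillNumber"]:
        issues.append("invoice number")
    # Allow one day of tolerance on the date
    if abs(extracted["date"] - bill["BillDate"]) > SECONDS_PER_DAY:
        issues.append("date")
    # Compare in whole cents so the $0.01 tolerance is exact
    if abs(round(extracted["amount"] * 100) - round(bill["Amount"] * 100)) > 1:
        issues.append("amount")
    ratio = SequenceMatcher(None, extracted["vendor"].lower(), vendor_name.lower()).ratio()
    if ratio < name_threshold:
        issues.append("vendor")
    return issues
```

Each returned name would then be reported at the severity the audit assigns to that check.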
## Post-Entry Audit Checklist
After completing a bill entry, run these checks on the newly created records:
**Step 1: Transaction Balance**
```sql
SELECT IsBalanced FROM Transactions WHERE id = {txn_id}
```
Expected: `true`
**Step 2: Account Usage**
Verify the transaction lines use correct accounts:
- Expense account (Type = 'Expense') for the debit
- AP account (id=4, code 2000) for the credit
**Step 3: Bill Integrity**
```sql
SELECT id, Vendor, EntryTransaction, Amount
FROM Bills WHERE id = {bill_id}
```
Expected: All fields populated, Amount > 0
**Step 4: PDF Verification** (if Invoice attachment exists)
Run the PDF verification workflow above.
## Output Formats
### Single Entry Audit
| Check | Status | Details |
|-------|--------|---------|
| Transaction Balanced | PASS | Debits = Credits = $X.XX |
| Account Usage | PASS | Expense: 5080, AP: 2000 |
| Bill Integrity | PASS | All required fields set |
| PDF Verification | WARN | Date mismatch: PDF shows 10/5, Bill has 10/6 |
### Full Audit Report
| Category | Severity | Count | Details |
|----------|----------|-------|---------|
| Unbalanced Transactions | Critical | 0 | None |
| Account Misuse | Warning | 2 | Txn #5, #12 |
| Missing Bill Links | Error | 1 | Bill #123 |
| Amount Mismatches | Error | 0 | None |
| PDF Discrepancies | Warning | 3 | Bills #1, #2, #5 |
| Missing Attachments | Warning | 5 | Bills #3, #4, #6, #7, #8 |
### Severity Levels
- **PASS**: Check passed
- **WARN**: Minor discrepancy, review recommended
- **ERROR**: Significant issue, correction required
- **CRITICAL**: Data integrity problem, must fix immediately
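When several checks are combined into one result, one reasonable convention is to report the worst status seen. The sketch below assumes the four levels above are ordered from least to most severe; `overall_status` is a hypothetical helper, and any other status string raises a `KeyError`.

```python
SEVERITY_RANK = {"PASS": 0, "WARN": 1, "ERROR": 2, "CRITICAL": 3}


def overall_status(statuses) -> str:
    """Return the most severe status in the list, or PASS when the list is empty."""
    return max(statuses, key=SEVERITY_RANK.__getitem__, default="PASS")
```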
## Remediation Steps
| Issue | Remediation |
|-------|-------------|
| Unbalanced transaction | Review TransactionLines, add/adjust lines until SUM(Debit) = SUM(Credit) |
| Wrong account type | Update TransactionLine.Account to correct account |
| Missing EntryTransaction | Link bill to transaction: `update_records("Bills", [{"id": X, "fields": {"EntryTransaction": Y}}])` |
| Missing Vendor | Set Bill.Vendor to appropriate vendor ID |
| Amount mismatch | Review bill lines and transaction lines, correct the discrepancy |
| PDF mismatch | Verify source document, update bill fields if database is wrong |
| Missing attachment | Upload invoice/receipt using `scripts/upload-attachment.sh` |

181
references/queries.md Normal file

@@ -0,0 +1,181 @@
# SQL Queries and Reports
Common queries and financial report templates.
## Contents
- [Common Queries](#common-queries)
- [Financial Reports](#financial-reports)
- [Balance Sheet](#balance-sheet)
- [Income Statement](#income-statement)
- [Trial Balance](#trial-balance)
- [AP Aging](#accounts-payable-aging)
## Common Queries
### Unpaid Bills by Vendor
```sql
SELECT v.Name, b.BillNumber, b.BillDate, b.Amount, b.AmountDue
FROM Bills b
JOIN Vendors v ON b.Vendor = v.id
WHERE b.Status IN ('Open', 'Partial')
ORDER BY b.DueDate
```
### Bills Summary by Vendor
```sql
SELECT v.Name as Vendor, COUNT(b.id) as Bills, SUM(b.Amount) as Total, SUM(b.AmountDue) as Due
FROM Bills b
JOIN Vendors v ON b.Vendor = v.id
GROUP BY v.Name
ORDER BY Total DESC
```
### Account Balances (Non-Zero)
```sql
SELECT Code, Name, Type, Balance
FROM Accounts
WHERE Balance != 0
ORDER BY Code
```
### Owner Reimbursement Balance
```sql
SELECT Balance FROM Accounts WHERE Code = '2203'
```
### Expense Summary by Account
```sql
SELECT a.Code, a.Name, a.Balance
FROM Accounts a
WHERE a.Type = 'Expense' AND a.Balance != 0
ORDER BY a.Balance DESC
```
### Transaction History for Account
```sql
SELECT t.Date, t.Description, t.Reference, tl.Debit, tl.Credit
FROM TransactionLines tl
JOIN Transactions t ON tl.Transaction = t.id
WHERE tl.Account = {account_id}
ORDER BY t.Date DESC
```
### Verify All Transactions Balance
```sql
SELECT id, Description, Total, IsBalanced
FROM Transactions
WHERE IsBalanced = false
```
## Financial Reports
### Balance Sheet
Shows Assets = Liabilities + Equity at a point in time.
**Important:** Parent accounts roll up child balances. Query only top-level parents (Parent = 0) to avoid double-counting.
```sql
-- Assets, Liabilities, Equity (top-level only)
SELECT Code, Name, Type, Balance
FROM Accounts
WHERE Type IN ('Asset', 'Liability', 'Equity')
AND Parent = 0
ORDER BY Type, Code
```
```sql
-- Net Income (for Equity section)
SELECT
COALESCE(SUM(CASE WHEN Type = 'Income' THEN Balance ELSE 0 END), 0) -
COALESCE(SUM(CASE WHEN Type = 'Expense' THEN Balance ELSE 0 END), 0) as NetIncome
FROM Accounts
WHERE Type IN ('Income', 'Expense')
AND id NOT IN (SELECT DISTINCT Parent FROM Accounts WHERE Parent != 0)
```
**Presentation:**

| **Assets** | |
|---|---:|
| Cash & Bank Accounts | $X.XX |
| Accounts Receivable | $X.XX |
| **Total Assets** | **$X.XX** |

| **Liabilities** | |
|---|---:|
| Accounts Payable | $X.XX |
| Due to Owner | $X.XX |
| **Total Liabilities** | **$X.XX** |

| **Equity** | |
|---|---:|
| Retained Earnings | $X.XX |
| Net Income (Loss) | $X.XX |
| **Total Equity** | **$X.XX** |
| **Total Liabilities + Equity** | **$X.XX** |

### Income Statement
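Shows Revenue - Expenses = Net Income. The query below reads each account's cumulative `Balance`, so it covers all recorded activity rather than a date-bounded period.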
```sql
SELECT Code, Name, Type, Balance
FROM Accounts
WHERE Type IN ('Income', 'Expense')
AND Balance != 0
AND id NOT IN (SELECT DISTINCT Parent FROM Accounts WHERE Parent != 0)
ORDER BY Type DESC, Code
```
**Presentation:**

| **Income** | |
|---|---:|
| Service Revenue | $X.XX |
| **Total Income** | **$X.XX** |

| **Expenses** | |
|---|---:|
| Software & Subscriptions | $X.XX |
| Professional Services | $X.XX |
| **Total Expenses** | **$X.XX** |

| **Net Income (Loss)** | **$X.XX** |
|---|---:|

### Trial Balance
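Lists all leaf accounts with non-zero balances; parent accounts are excluded because their balances already include their children. Total debits should equal total credits.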
```sql
SELECT
Code,
Name,
Type,
CASE WHEN Type IN ('Asset', 'Expense') THEN Balance ELSE 0 END as Debit,
CASE WHEN Type IN ('Liability', 'Equity', 'Income') THEN Balance ELSE 0 END as Credit
FROM Accounts
WHERE Balance != 0
AND id NOT IN (SELECT DISTINCT Parent FROM Accounts WHERE Parent != 0)
ORDER BY Code
```
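To confirm that the trial balance actually balances, the Debit and Credit columns can be totaled. A minimal sketch, assuming the rows come back as dicts with `Debit` and `Credit` keys; `trial_balance_totals` is a hypothetical helper name.

```python
from decimal import Decimal


def trial_balance_totals(rows):
    """Return (total_debit, total_credit, difference) for trial balance rows."""
    debit = sum((Decimal(str(r["Debit"])) for r in rows), Decimal("0"))
    credit = sum((Decimal(str(r["Credit"])) for r in rows), Decimal("0"))
    return debit, credit, debit - credit


sample = [
    {"Code": "1001", "Debit": 900.00, "Credit": 0},
    {"Code": "5080", "Debit": 100.00, "Credit": 0},
    {"Code": "2000", "Debit": 0, "Credit": 1000.00},
]
debit, credit, difference = trial_balance_totals(sample)
print(debit, credit, difference)
```

A non-zero difference means at least one transaction is unbalanced or an account type is misclassified.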
### Accounts Payable Aging
```sql
SELECT
v.Name as Vendor,
b.BillNumber,
b.BillDate,
b.DueDate,
b.AmountDue,
CASE
WHEN b.DueDate >= CAST(strftime('%s', 'now') AS INTEGER) THEN 'Current'
WHEN b.DueDate >= CAST(strftime('%s', 'now') AS INTEGER) - 2592000 THEN '1-30 Days'
WHEN b.DueDate >= CAST(strftime('%s', 'now') AS INTEGER) - 5184000 THEN '31-60 Days'
ELSE '60+ Days'
END as Aging
FROM Bills b
JOIN Vendors v ON b.Vendor = v.id
WHERE b.Status IN ('Open', 'Partial')
ORDER BY b.DueDate
```
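The constants in the aging query are day counts in seconds: 2592000 is 30 days and 5184000 is 60 days. The same bucketing can be sketched in Python, for example to label rows after fetching them; `aging_bucket` is a hypothetical helper, and `now_ts` is injectable so the logic can be checked deterministically.

```python
import time

SECONDS_PER_DAY = 86400


def aging_bucket(due_ts, now_ts=None):
    """Mirror the CASE expression: classify a bill by how far past due it is."""
    if now_ts is None:
        now_ts = int(time.time())
    if due_ts >= now_ts:
        return "Current"
    if due_ts >= now_ts - 30 * SECONDS_PER_DAY:  # 2592000 seconds
        return "1-30 Days"
    if due_ts >= now_ts - 60 * SECONDS_PER_DAY:  # 5184000 seconds
        return "31-60 Days"
    return "60+ Days"


now = 1760832000
print(aging_bucket(now - 45 * SECONDS_PER_DAY, now_ts=now))
```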

97
references/schema.md Normal file

@@ -0,0 +1,97 @@
# Database Schema
Complete table schemas for the Grist accounting system.
## Accounts
| Column | Type | Notes |
|--------|------|-------|
| Code | Text | Account number (e.g., "5080") |
| Name | Text | Account name |
| Type | Choice | "Asset", "Liability", "Equity", "Income", "Expense" |
| Parent | Ref:Accounts | Parent account for hierarchy (0 = top-level) |
| Description | Text | |
| IsActive | Bool | |
| Balance | Formula | Calculated from transactions |
## Vendors
| Column | Type | Notes |
|--------|------|-------|
| Name | Text | Vendor name |
| DefaultExpenseAccount | Ref:Accounts | Auto-fills on bill lines |
| PaymentTerms | Choice | "Due on Receipt", "Net 15", "Net 30" |
| Notes | Text | |
| IsActive | Bool | |
| Balance | Formula | Sum of unpaid bills |
## Items
| Column | Type | Notes |
|--------|------|-------|
| Name | Text | Item name (e.g., "Software Subscription") |
| DefaultAccount | Ref:Accounts | Expense account for this item |
| DefaultDescription | Text | Auto-fills on bill lines |
| IsActive | Bool | |
## Bills
| Column | Type | Notes |
|--------|------|-------|
| Vendor | Ref:Vendors | Required |
| BillNumber | Text | Invoice number from vendor |
| BillDate | Date | Unix timestamp |
| DueDate | Date | Unix timestamp |
| Status | Choice | "Open", "Partial", "Paid" |
| Memo | Text | |
| EntryTransaction | Ref:Transactions | Link to journal entry |
| Invoice | Attachments | Vendor invoice document (use `["L", id]` format) |
| Receipt | Attachments | Payment receipt/confirmation (use `["L", id]` format) |
| Amount | Formula | Sum of BillLines.Amount |
| AmountPaid | Formula | Sum of BillPayments.Amount |
| AmountDue | Formula | Amount - AmountPaid |
## BillLines
| Column | Type | Notes |
|--------|------|-------|
| Bill | Ref:Bills | Required |
| Item | Ref:Items | Optional - auto-fills Account/Description |
| Account | Ref:Accounts | Expense account |
| Description | Text | |
| Amount | Numeric | Line item amount |
## Transactions
| Column | Type | Notes |
|--------|------|-------|
| Date | Date | Unix timestamp |
| Description | Text | Transaction description |
| Reference | Text | Check number, invoice reference, etc. |
| Status | Choice | "Draft", "Posted", "Cleared" |
| Memo | Text | |
| Total | Formula | Sum of debits |
| IsBalanced | Formula | True if debits = credits |
## TransactionLines
| Column | Type | Notes |
|--------|------|-------|
| Transaction | Ref:Transactions | Required |
| Account | Ref:Accounts | Required |
| Debit | Numeric | Debit amount (or 0) |
| Credit | Numeric | Credit amount (or 0) |
| Memo | Text | |
## BillPayments
| Column | Type | Notes |
|--------|------|-------|
| Bill | Ref:Bills | Required |
| Transaction | Ref:Transactions | Payment journal entry |
| Amount | Numeric | Payment amount |
| PaymentDate | Date | Unix timestamp |
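Several Date columns above hold Unix timestamps. The sketch below converts a calendar date to such a value, assuming dates are stored as midnight UTC; that assumption matches the sample value 1759708800 used in the workflow examples (2025-10-06 00:00:00 UTC), but it is not stated by the schema. `to_unix_date` is a hypothetical helper name.

```python
from datetime import date, datetime, timezone


def to_unix_date(d):
    """Convert a calendar date to a Unix timestamp at midnight UTC."""
    return int(datetime(d.year, d.month, d.day, tzinfo=timezone.utc).timestamp())


print(to_unix_date(date(2025, 10, 6)))  # 1759708800
```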
## Formula Columns (Auto-Calculated)
| Table.Column | Description |
|--------------|-------------|
| Accounts.Balance | OwnBalance + ChildrenBalance |
| Transactions.IsBalanced | True if sum of debits = sum of credits |
| Transactions.Total | Sum of debit amounts |
| Bills.Amount | Sum of BillLines.Amount |
| Bills.AmountPaid | Sum of BillPayments.Amount |
| Bills.AmountDue | Amount - AmountPaid |
| Vendors.Balance | Sum of AmountDue for unpaid bills |
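The roll-up behind `Accounts.Balance` (OwnBalance + ChildrenBalance) can be illustrated with a short recursive sketch. This is not the Grist formula itself; it assumes own balances and parent links are available as plain dicts, and `rolled_up_balance` is a hypothetical name.

```python
def rolled_up_balance(account_id, own_balance, parent_of):
    """Return an account's own balance plus the rolled-up balances of its children.

    own_balance: {account_id: balance from the account's own transaction lines}
    parent_of:   {account_id: parent account id, or 0 for top-level accounts}
    """
    total = own_balance.get(account_id, 0)
    for child_id, parent_id in parent_of.items():
        if parent_id == account_id:
            total += rolled_up_balance(child_id, own_balance, parent_of)
    return total


# Account 1 is top-level; 2 and 3 are its children; 4 is a child of 3.
parents = {1: 0, 2: 1, 3: 1, 4: 3}
own = {1: 0, 2: 100, 3: 50, 4: 25}
print(rolled_up_balance(1, own, parents))  # 175
```

Because a parent already contains its children, summing parent and child rows together would double-count.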

229
references/workflows.md Normal file

@@ -0,0 +1,229 @@
# Workflow Examples
Detailed code examples for common accounting operations.
## Contents
- [Create a Vendor](#create-a-vendor)
- [Create Items](#create-items-for-common-purchases)
- [Complete Bill Entry](#complete-bill-entry-6-steps)
- [Pay Bill from Checking](#pay-bill-from-checking-account)
- [Pay Bill via Owner](#pay-bill-via-owner-reimbursement)
- [Reimburse Owner](#reimburse-owner)
- [Batch Operations](#batch-operations)
## Create a Vendor
```python
add_records("Vendors", [{
"Name": "Acme Corp",
"DefaultExpenseAccount": 36, # Software & Subscriptions
"PaymentTerms": "Due on Receipt",
"Notes": "Software vendor",
"IsActive": True
}])
# Returns: {"inserted_ids": [vendor_id]}
```
## Create Items for Common Purchases
```python
add_records("Items", [{
"Name": "Monthly Software",
"DefaultAccount": 36,
"DefaultDescription": "Monthly SaaS subscription",
"IsActive": True
}])
```
## Complete Bill Entry (6 Steps)
**Step 1: Create Bill Header**
```python
add_records("Bills", [{
"Vendor": 1, # vendor_id
"BillNumber": "INV-001",
"BillDate": 1759708800, # Unix timestamp
"DueDate": 1759708800,
"Status": "Open",
"Memo": "October services"
}])
# Returns: {"inserted_ids": [bill_id]}
```
**Step 2: Create Bill Line(s)**
```python
add_records("BillLines", [{
"Bill": 1, # bill_id from step 1
"Item": 1, # optional - auto-fills Account/Description
"Account": 36, # expense account
"Description": "Monthly subscription",
"Amount": 100.00
}])
```
**Step 3: Create Journal Entry**
```python
# Transaction header
add_records("Transactions", [{
"Date": 1759708800,
"Description": "Acme Corp - October services",
"Reference": "INV-001",
"Status": "Posted"
}])
# Returns: {"inserted_ids": [txn_id]}
# Transaction lines: Debit expense, Credit AP
add_records("TransactionLines", [
{"Transaction": 1, "Account": 36, "Debit": 100.00, "Credit": 0, "Memo": "Monthly subscription"},
{"Transaction": 1, "Account": 4, "Debit": 0, "Credit": 100.00, "Memo": "Monthly subscription"}
])
```
**Step 4: Link Bill to Transaction**
```python
update_records("Bills", [{"id": 1, "fields": {"EntryTransaction": 1}}])
```
**Step 5: Upload Invoice (if available)**
```bash
bash scripts/upload-attachment.sh invoice.pdf Bills 1 $TOKEN Invoice
```
**Step 6: Post-Entry Audit (REQUIRED)**
Run audit checks before concluding. See [Audit Reference](audit.md) for details.
```sql
-- Check 1: Transaction balanced
SELECT IsBalanced FROM Transactions WHERE id = {txn_id}
-- Expected: true
-- Check 2: Bill integrity
SELECT id, Vendor, EntryTransaction, Amount FROM Bills WHERE id = {bill_id}
-- Expected: All fields populated, Amount > 0
```
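For bills with several lines, the journal entry in Step 3 needs one debit per bill line and a single credit to Accounts Payable for the total. A minimal sketch of building that payload follows; `build_bill_entry_lines` is a hypothetical helper, and the AP account ID (4 in the examples above) is passed in rather than assumed.

```python
from decimal import Decimal


def build_bill_entry_lines(txn_id, bill_lines, ap_account_id):
    """Build balanced TransactionLines records for a bill's journal entry."""
    records = []
    total = Decimal("0")
    for line in bill_lines:
        amount = Decimal(str(line["Amount"]))
        total += amount
        # Debit the expense account of each bill line.
        records.append({
            "Transaction": txn_id,
            "Account": line["Account"],
            "Debit": float(amount),
            "Credit": 0,
            "Memo": line.get("Description", ""),
        })
    # Credit Accounts Payable once for the bill total.
    records.append({
        "Transaction": txn_id,
        "Account": ap_account_id,
        "Debit": 0,
        "Credit": float(total),
        "Memo": "Accounts Payable",
    })
    return records


lines = build_bill_entry_lines(
    txn_id=1,
    bill_lines=[
        {"Account": 36, "Description": "Monthly subscription", "Amount": 60.00},
        {"Account": 36, "Description": "Extra seats", "Amount": 40.00},
    ],
    ap_account_id=4,
)
print(len(lines), sum(r["Debit"] for r in lines), sum(r["Credit"] for r in lines))
```

The returned list can be passed to `add_records("TransactionLines", ...)`; the audit in Step 6 should still be run afterwards.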
## Pay Bill from Checking Account
```python
# Step 1: Create payment transaction
add_records("Transactions", [{
"Date": 1760832000,
"Description": "Payment - Acme Corp INV-001",
"Reference": "Check #1001",
"Status": "Cleared"
}])
# Returns: {"inserted_ids": [txn_id]}
# Step 2: Debit AP, Credit Checking
add_records("TransactionLines", [
{"Transaction": 2, "Account": 4, "Debit": 100.00, "Credit": 0, "Memo": "Pay INV-001"},
{"Transaction": 2, "Account": 14, "Debit": 0, "Credit": 100.00, "Memo": "Pay INV-001"}
])
# Step 3: Create BillPayment record
add_records("BillPayments", [{
"Bill": 1,
"Transaction": 2,
"Amount": 100.00,
"PaymentDate": 1760832000
}])
# Step 4: Update bill status
update_records("Bills", [{"id": 1, "fields": {"Status": "Paid"}}])
# Step 5: Upload receipt (if available)
# bash scripts/upload-attachment.sh receipt.pdf Bills 1 $TOKEN Receipt
# Step 6: Post-Payment Audit (REQUIRED)
# Verify payment transaction balances and bill status updated correctly
```
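Step 4 sets the status to "Paid" because the payment covers the whole bill. For other cases, the status could be derived from the bill's `Amount` and `AmountPaid`. The sketch below assumes "Partial" means some, but not all, of the bill has been paid; that mapping is inferred from the schema's choice values, and `bill_status` is a hypothetical helper.

```python
from decimal import Decimal


def bill_status(amount, amount_paid):
    """Pick a Bills.Status value from the billed and paid totals."""
    amount = Decimal(str(amount))
    paid = Decimal(str(amount_paid))
    if paid <= 0:
        return "Open"
    if paid < amount:
        return "Partial"
    return "Paid"


print(bill_status(100.00, 40.00))
```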
## Pay Bill via Owner Reimbursement
When the owner pays a business expense personally:
```python
# Step 1: Create payment transaction
add_records("Transactions", [{
"Date": 1760832000,
"Description": "Owner payment - Acme Corp INV-001",
"Reference": "Owner Reimb",
"Status": "Posted"
}])
# Step 2: Debit AP, Credit Due to Owner (not Checking)
add_records("TransactionLines", [
{"Transaction": 2, "Account": 4, "Debit": 100.00, "Credit": 0, "Memo": "Pay INV-001"},
{"Transaction": 2, "Account": 22, "Debit": 0, "Credit": 100.00, "Memo": "Owner paid"}
])
# Step 3: Create BillPayment record
add_records("BillPayments", [{
"Bill": 1,
"Transaction": 2,
"Amount": 100.00,
"PaymentDate": 1760832000
}])
# Step 4: Update bill status
update_records("Bills", [{"id": 1, "fields": {"Status": "Paid"}}])
# Step 5: Upload receipt (if available)
# bash scripts/upload-attachment.sh receipt.pdf Bills 1 $TOKEN Receipt
# Step 6: Post-Payment Audit (REQUIRED)
```
## Reimburse Owner
When business pays back the owner:
```python
add_records("Transactions", [{
"Date": 1762041600,
"Description": "Owner reimbursement",
"Reference": "Transfer",
"Status": "Cleared"
}])
add_records("TransactionLines", [
{"Transaction": 3, "Account": 22, "Debit": 500.00, "Credit": 0, "Memo": "Reimburse owner"},
{"Transaction": 3, "Account": 14, "Debit": 0, "Credit": 500.00, "Memo": "Reimburse owner"}
])
```
## Batch Operations
When entering multiple bills efficiently:
1. **Create all Bills first** → collect inserted IDs
2. **Create all BillLines** referencing bill IDs
3. **Create all Transactions** → collect inserted IDs
4. **Create all TransactionLines** referencing transaction IDs
5. **Update all Bills** with EntryTransaction links in one call
6. (If paying) Create payment transactions, lines, and BillPayments
7. **Upload invoice attachments** if files are available
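Steps 1 through 5 above can be sketched as a single function. This is an illustration under stated assumptions: `add_records` and `update_records` are passed in as callables standing in for the MCP tools, `add_records` is assumed to return `inserted_ids` in the same order as the input records, and the shape of each `batch` entry is hypothetical.

```python
def enter_bills(batch, add_records, update_records):
    """Enter several bills with one call per table.

    batch: list of dicts with keys "bill", "lines", "txn", "txn_lines"
           (a hypothetical structure used only for this sketch).
    """
    # Step 1: create all bill headers and collect their IDs.
    bill_ids = add_records("Bills", [b["bill"] for b in batch])["inserted_ids"]
    # Step 2: create all bill lines, each pointing at its bill.
    add_records("BillLines", [
        {**line, "Bill": bill_id}
        for bill_id, b in zip(bill_ids, batch)
        for line in b["lines"]
    ])
    # Step 3: create all transaction headers and collect their IDs.
    txn_ids = add_records("Transactions", [b["txn"] for b in batch])["inserted_ids"]
    # Step 4: create all transaction lines, each pointing at its transaction.
    add_records("TransactionLines", [
        {**line, "Transaction": txn_id}
        for txn_id, b in zip(txn_ids, batch)
        for line in b["txn_lines"]
    ])
    # Step 5: link every bill to its journal entry in one call.
    update_records("Bills", [
        {"id": bill_id, "fields": {"EntryTransaction": txn_id}}
        for bill_id, txn_id in zip(bill_ids, txn_ids)
    ])
    return bill_ids, txn_ids
```

Passing the tool functions in as arguments keeps the sketch testable without a live Grist document.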
### Batch Attachment Uploads
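```bash
# Example batch upload pattern.
# TOKEN: session token from the request_session_token MCP call (write permission).
# uploads.txt is a hypothetical input file with one "<bill_id> <invoice_path>" pair per line.
while read -r bill_id invoice_path; do
  echo "Uploading $invoice_path for bill $bill_id"
  curl -X POST -H "Authorization: Bearer $TOKEN" \
    -F "file=@$invoice_path" \
    https://grist-mcp.bballou.com/api/v1/attachments
  # The response returns attachment_id. Link it to the bill with the MCP tool call:
  # update_records("Bills", [{"id": bill_id, "fields": {"Invoice": ["L", attachment_id]}}])
done < uploads.txt
```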
### Batch Update Example
```python
update_records("Bills", [
{"id": 1, "fields": {"EntryTransaction": 1}},
{"id": 2, "fields": {"EntryTransaction": 2}},
{"id": 3, "fields": {"EntryTransaction": 3}}
])
```


@@ -0,0 +1,56 @@
#!/bin/bash
# download-attachment.sh - Download attachment from Grist via MCP proxy
# Usage: ./download-attachment.sh <attachment_id> <output_file> [token]
#
# Examples:
# ./download-attachment.sh 11 invoice.pdf # prompts for token
# ./download-attachment.sh 11 invoice.pdf sess_abc123... # with token
set -e
ATTACHMENT_ID="$1"
OUTPUT_FILE="$2"
TOKEN="$3"
if [[ -z "$ATTACHMENT_ID" || -z "$OUTPUT_FILE" ]]; then
    echo "Usage: $0 <attachment_id> <output_file> [token]"
    echo ""
    echo "Arguments:"
    echo " attachment_id ID of the attachment to download"
    echo " output_file Path to save the downloaded file"
    echo " token Session token (optional, will prompt if not provided)"
    echo ""
    echo "Examples:"
    echo " $0 11 invoice.pdf # Download attachment 11"
    echo " $0 11 invoice.pdf \$TOKEN # With pre-obtained token"
    echo ""
    echo "To get attachment IDs, query the Bills table:"
    echo " SELECT id, BillNumber, Invoice FROM Bills"
    exit 1
fi
# Get token if not provided
if [[ -z "$TOKEN" ]]; then
    echo "Paste session token (from request_session_token MCP call with read permission):"
    read -r TOKEN
fi

# Base URL for the grist-mcp proxy
BASE_URL="${GRIST_MCP_URL:-https://grist-mcp.bballou.com}"

# Download attachment
echo "Downloading attachment $ATTACHMENT_ID to $OUTPUT_FILE..."
HTTP_CODE=$(curl -s -w "%{http_code}" -o "$OUTPUT_FILE" \
    -H "Authorization: Bearer $TOKEN" \
    "$BASE_URL/api/v1/attachments/$ATTACHMENT_ID")

if [[ "$HTTP_CODE" -eq 200 ]]; then
    FILE_SIZE=$(stat -f%z "$OUTPUT_FILE" 2>/dev/null || stat -c%s "$OUTPUT_FILE" 2>/dev/null)
    echo "Success! Downloaded $FILE_SIZE bytes to $OUTPUT_FILE"
else
    echo "Download failed with HTTP $HTTP_CODE"
    echo "Response:"
    cat "$OUTPUT_FILE"
    rm -f "$OUTPUT_FILE"
    exit 1
fi

380
scripts/verify-pdf.py Normal file

@@ -0,0 +1,380 @@
#!/usr/bin/env python3
"""
verify-pdf.py - Extract and verify invoice data from PDF files
Usage:
python verify-pdf.py <pdf_file> [--bill-id N] [--json]
Examples:
python verify-pdf.py invoice.pdf
python verify-pdf.py invoice.pdf --bill-id 1
python verify-pdf.py invoice.pdf --json
Dependencies:
pip install pdfplumber pytesseract pillow pdf2image python-dateutil
System packages (for OCR):
tesseract-ocr poppler-utils
"""
import argparse
import json
import re
import sys
from datetime import datetime
from decimal import Decimal, InvalidOperation
from pathlib import Path
# PDF extraction
try:
    import pdfplumber
    HAS_PDFPLUMBER = True
except ImportError:
    HAS_PDFPLUMBER = False

# OCR fallback
try:
    import pytesseract
    from pdf2image import convert_from_path
    HAS_OCR = True
except ImportError:
    HAS_OCR = False

# Date parsing
try:
    from dateutil import parser as dateparser
    HAS_DATEUTIL = True
except ImportError:
    HAS_DATEUTIL = False


def extract_text_pdfplumber(pdf_path: str) -> str:
    """Extract text from PDF using pdfplumber (fast, text-based PDFs)."""
    if not HAS_PDFPLUMBER:
        return ""
    text_parts = []
    try:
        with pdfplumber.open(pdf_path) as pdf:
            for page in pdf.pages:
                page_text = page.extract_text()
                if page_text:
                    text_parts.append(page_text)
    except Exception as e:
        print(f"pdfplumber error: {e}", file=sys.stderr)
        return ""
    return "\n".join(text_parts)


def extract_text_ocr(pdf_path: str) -> str:
    """Extract text from PDF using OCR (slower, handles scanned documents)."""
    if not HAS_OCR:
        return ""
    text_parts = []
    try:
        # Render each page to an image, then run Tesseract on it
        images = convert_from_path(pdf_path, dpi=200)
        for image in images:
            page_text = pytesseract.image_to_string(image)
            if page_text:
                text_parts.append(page_text)
    except Exception as e:
        print(f"OCR error: {e}", file=sys.stderr)
        return ""
    return "\n".join(text_parts)


def extract_text(pdf_path: str) -> tuple[str, str]:
    """
    Extract text from PDF with OCR fallback.
    Returns (text, method) where method is 'pdfplumber', 'ocr', or 'none'.
    """
    # Try text extraction first (fast)
    text = extract_text_pdfplumber(pdf_path)
    if len(text.strip()) >= 50:
        return text, "pdfplumber"
    # Fall back to OCR for scanned documents
    text = extract_text_ocr(pdf_path)
    if text.strip():
        return text, "ocr"
    return "", "none"


def parse_invoice_number(text: str) -> str | None:
    """Extract invoice number from text."""
    # The value must contain at least one digit. Without this, "Invoice #123"
    # matched the "Inv" alternative and captured "oice", and "Invoice Date"
    # captured "Date".
    value = r'(?=[-A-Z0-9]*\d)([A-Z0-9][-A-Z0-9]{3,})'
    patterns = [
        # Longer labels come first so they are tried before their prefixes
        r'(?:Invoice\s*(?:#|Number|No\.?)|Invoice|Inv)[:\s]*' + value,
        r'(?:Order\s*(?:#|Number)|Order)[:\s]*' + value,
        r'(?:Ref\s*#|Reference|Ref)[:\s]*' + value,
        r'#\s*([A-Z0-9][-A-Z0-9]{5,})',  # Generic # followed by alphanumeric
    ]
    for pattern in patterns:
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            return match.group(1).strip()
    return None


def parse_date(text: str) -> tuple[str | None, int | None]:
    """
    Extract date from text.
    Returns (date_string, unix_timestamp) or (None, None).
    """
    if not HAS_DATEUTIL:
        return None, None
    # Look for labeled dates first
    date_patterns = [
        r'(?:Invoice\s*Date|Date|Issued)[:\s]*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})',
        r'(?:Invoice\s*Date|Date|Issued)[:\s]*(\w+\s+\d{1,2},?\s+\d{4})',
        r'(?:Invoice\s*Date|Date|Issued)[:\s]*(\d{4}[/-]\d{1,2}[/-]\d{1,2})',
    ]
    for pattern in date_patterns:
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            date_str = match.group(1)
            try:
                parsed = dateparser.parse(date_str)
                if parsed:
                    return date_str, int(parsed.timestamp())
            except (ValueError, OverflowError, OSError):
                pass  # Unparseable candidate; try the next pattern
    # Try to find any date-like pattern
    generic_patterns = [
        r'(\d{1,2}[/-]\d{1,2}[/-]\d{4})',
        r'(\d{4}[/-]\d{1,2}[/-]\d{1,2})',
        r'(\w+\s+\d{1,2},?\s+\d{4})',
    ]
    for pattern in generic_patterns:
        matches = re.findall(pattern, text)
        for date_str in matches[:3]:  # Check first 3 matches
            try:
                parsed = dateparser.parse(date_str)
                if parsed and 2020 <= parsed.year <= 2030:
                    return date_str, int(parsed.timestamp())
            except (ValueError, OverflowError, OSError):
                pass  # Not a real date; keep looking
    return None, None


def parse_amount(text: str) -> tuple[str | None, Decimal | None]:
    """
    Extract total amount from text.
    Returns (amount_string, decimal_value) or (None, None).
    """
    # Look for labeled totals (prioritize these).
    # The leading \b keeps "Subtotal" from matching as "Total".
    total_patterns = [
        r'\b(?:Total|Amount\s*Due|Grand\s*Total|Balance\s*Due|Total\s*Due)[:\s]*\$?([\d,]+\.?\d*)',
        r'\b(?:Total|Amount\s*Due|Grand\s*Total|Balance\s*Due|Total\s*Due)[:\s]*USD?\s*([\d,]+\.?\d*)',
    ]
    for pattern in total_patterns:
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            amount_str = match.group(1).replace(',', '')
            try:
                return match.group(1), Decimal(amount_str)
            except InvalidOperation:
                pass
    # Look for currency amounts (less reliable)
    currency_pattern = r'\$\s*([\d,]+\.\d{2})'
    matches = re.findall(currency_pattern, text)
    if matches:
        # Return the largest amount found (likely the total)
        amounts = []
        for m in matches:
            try:
                amounts.append((m, Decimal(m.replace(',', ''))))
            except InvalidOperation:
                pass
        if amounts:
            amounts.sort(key=lambda x: x[1], reverse=True)
            return amounts[0]
    return None, None


def parse_vendor(text: str) -> str | None:
    """
    Extract vendor name from text.
    Usually appears in the header/letterhead area.
    """
    lines = text.split('\n')[:10]  # Check first 10 lines
    # Filter out common non-vendor lines
    skip_patterns = [
        r'^invoice',
        r'^date',
        r'^bill\s*to',
        r'^ship\s*to',
        r'^\d',
        r'^page',
        r'^total',
    ]
    for line in lines:
        line = line.strip()
        if not line or len(line) < 3 or len(line) > 100:
            continue
        # Skip lines matching patterns
        skip = False
        for pattern in skip_patterns:
            if re.match(pattern, line, re.IGNORECASE):
                skip = True
                break
        if skip:
            continue
        # Return first substantial line (likely company name)
        if re.match(r'^[A-Z]', line) and len(line) >= 3:
            return line
    return None


def extract_invoice_data(pdf_path: str) -> dict:
    """Extract all invoice data from a PDF file."""
    result = {
        'file': pdf_path,
        'extraction_method': None,
        'invoice_number': None,
        'date_string': None,
        'date_timestamp': None,
        'amount_string': None,
        'amount_decimal': None,
        'vendor': None,
        'raw_text_preview': None,
        'errors': [],
    }
    # Check file exists
    if not Path(pdf_path).exists():
        result['errors'].append(f"File not found: {pdf_path}")
        return result
    # Extract text
    text, method = extract_text(pdf_path)
    result['extraction_method'] = method
    result['raw_text_preview'] = text[:500] if text else None
    if not text:
        result['errors'].append("Could not extract text from PDF")
        return result
    # Parse fields
    result['invoice_number'] = parse_invoice_number(text)
    result['date_string'], result['date_timestamp'] = parse_date(text)
    result['amount_string'], amount = parse_amount(text)
    # Compare against None so a parsed total of 0 is not reported as missing
    result['amount_decimal'] = float(amount) if amount is not None else None
    result['vendor'] = parse_vendor(text)
    return result


def compare_with_bill(extracted: dict, bill: dict) -> list[dict]:
    """
    Compare extracted PDF data with bill record.
    Returns list of discrepancies.
    """
    issues = []
    # Compare invoice number
    if extracted.get('invoice_number') and bill.get('BillNumber'):
        if extracted['invoice_number'].upper() != bill['BillNumber'].upper():
            issues.append({
                'field': 'invoice_number',
                'severity': 'WARNING',
                'pdf_value': extracted['invoice_number'],
                'bill_value': bill['BillNumber'],
                'message': f"Invoice number mismatch: PDF has '{extracted['invoice_number']}', bill has '{bill['BillNumber']}'"
            })
    # Compare amount
    if extracted.get('amount_decimal') and bill.get('Amount'):
        pdf_amount = Decimal(str(extracted['amount_decimal']))
        bill_amount = Decimal(str(bill['Amount']))
        if abs(pdf_amount - bill_amount) > Decimal('0.01'):
            issues.append({
                'field': 'amount',
                'severity': 'ERROR',
                'pdf_value': float(pdf_amount),
                'bill_value': float(bill_amount),
                'message': f"Amount mismatch: PDF has ${pdf_amount}, bill has ${bill_amount}"
            })
    # Compare date (allow 1 day tolerance)
    if extracted.get('date_timestamp') and bill.get('BillDate'):
        pdf_ts = extracted['date_timestamp']
        bill_ts = bill['BillDate']
        diff_days = abs(pdf_ts - bill_ts) / 86400
        if diff_days > 1:
            issues.append({
                'field': 'date',
                'severity': 'WARNING',
                'pdf_value': extracted['date_string'],
                'bill_value': datetime.fromtimestamp(bill_ts).strftime('%Y-%m-%d'),
                'message': f"Date mismatch: PDF has '{extracted['date_string']}', bill has {datetime.fromtimestamp(bill_ts).strftime('%Y-%m-%d')}"
            })
    return issues


def main():
    parser = argparse.ArgumentParser(description='Extract and verify invoice data from PDF')
    parser.add_argument('pdf_file', help='Path to the PDF file')
    parser.add_argument('--bill-id', type=int, help='Bill ID to compare against (for future use)')
    parser.add_argument('--json', action='store_true', help='Output as JSON')
    args = parser.parse_args()
    # Check dependencies
    missing = []
    if not HAS_PDFPLUMBER:
        missing.append('pdfplumber')
    if not HAS_DATEUTIL:
        missing.append('python-dateutil')
    if missing:
        print(f"Warning: Missing packages: {', '.join(missing)}", file=sys.stderr)
        print("Install with: pip install " + ' '.join(missing), file=sys.stderr)
    if not HAS_OCR:
        print("Note: OCR support unavailable (install pytesseract, pdf2image)", file=sys.stderr)
    # Extract data
    result = extract_invoice_data(args.pdf_file)
    if args.json:
        print(json.dumps(result, indent=2, default=str))
    else:
        print(f"File: {result['file']}")
        print(f"Extraction method: {result['extraction_method']}")
        print(f"Invoice #: {result['invoice_number'] or 'NOT FOUND'}")
        print(f"Date: {result['date_string'] or 'NOT FOUND'}")
        # Compare against None so an amount of 0.00 is still printed
        if result['amount_decimal'] is not None:
            print(f"Amount: ${result['amount_decimal']:.2f}")
        else:
            print("Amount: NOT FOUND")
        print(f"Vendor: {result['vendor'] or 'NOT FOUND'}")
        if result['errors']:
            print("\nErrors:")
            for err in result['errors']:
                print(f" - {err}")
        if result['raw_text_preview']:
            print(f"\nText preview:\n{'-' * 40}")
            print(result['raw_text_preview'][:300])


if __name__ == '__main__':
    main()