Files
windmill-git-sync/CLAUDE.md
Bill c838fa568c Initial commit: Windmill Git Sync service
Add containerized service for syncing Windmill workspaces to Git repositories.

Features:
- Flask webhook server for triggering syncs from Windmill
- wmill CLI integration for pulling workspace content
- Automated Git commits and push to remote repository
- Network-isolated (only accessible within Docker network)
- Designed to integrate with existing Windmill docker-compose files

Key components:
- Docker container with Python 3.11, wmill CLI, Git, and Flask
- Sync engine with error handling and logging
- External volume support for persistent workspace data
- Comprehensive documentation (README.md and CLAUDE.md)
2025-11-08 18:40:26 -05:00

7.0 KiB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a containerized service for synchronizing Windmill workspaces to Git repositories. The service provides a Flask webhook server that Windmill can call to trigger automated backups of workspace content to a remote Git repository.

Architecture

The system consists of three main components:

  1. Flask Web Server (app/server.py): Lightweight HTTP server that exposes webhook endpoints for triggering syncs and health checks. Only accessible within the Docker network (not exposed to host).

  2. Sync Engine (app/sync.py): Core logic that orchestrates the sync process:

    • Pulls workspace content from Windmill using the wmill CLI
    • Manages Git repository state (init on first run, subsequent updates)
    • Commits changes and pushes to remote Git repository with PAT authentication
    • Handles error cases and provides detailed logging
  3. Docker Container: Bundles Python 3.11, wmill CLI, Git, and the Flask application. Uses volume mounts for persistent workspace storage.

Key Design Decisions

  • Integrated with Windmill docker-compose: This service is designed to be added as an additional service in your existing Windmill docker-compose file. It shares the same Docker network and can reference Windmill services directly (e.g., windmill_server).
  • Network isolation: Service uses expose instead of ports - accessible only within Docker network, not from host machine. No authentication needed since it's isolated.
  • Webhook-only triggering: Sync happens only when explicitly triggered via HTTP POST to /sync. This gives Windmill full control over backup timing via scheduled flows.
  • HTTPS + Personal Access Token: Git authentication uses PAT injected into HTTPS URL (format: https://TOKEN@github.com/user/repo.git). No SSH key management required.
  • Stateless operation: Each sync is independent. The container can be restarted without losing state (workspace data persists in Docker volume).
  • Single workspace focus: Designed to sync one Windmill workspace per container instance. For multiple workspaces, run multiple containers with different configurations.

Common Development Commands

Build and Run

# Build the Docker image
docker-compose build

# Start the service
docker-compose up -d

# View logs
docker-compose logs -f

# Stop the service
docker-compose down

Testing

# Test the sync manually (from inside container)
docker-compose exec windmill-git-sync python app/sync.py

# Test webhook endpoint (from another container in the network)
docker-compose exec windmill_server curl -X POST http://windmill-git-sync:8080/sync

# Health check (from another container in the network)
docker-compose exec windmill_server curl http://windmill-git-sync:8080/health

Development Workflow

# Edit code locally, rebuild and restart
docker-compose down
docker-compose up -d --build

# View live logs during testing
docker-compose logs -f windmill-git-sync

# Access container shell for debugging
docker-compose exec windmill-git-sync /bin/bash

# Inspect workspace directory
docker-compose exec windmill-git-sync ls -la /workspace

Environment Configuration

All configuration is done via .env file (copy from .env.example). Required variables:

  • WINDMILL_TOKEN: API token from Windmill for workspace access
  • WORKSPACE_VOLUME: External Docker volume name for persistent workspace storage (default: windmill-workspace-data)
  • GIT_REMOTE_URL: HTTPS URL of Git repository (e.g., https://github.com/user/repo.git)
  • GIT_TOKEN: Personal Access Token with repo write permissions

Docker Compose Integration

The docker-compose.yml file contains a service definition meant to be added to your existing Windmill docker-compose file, not run standalone. The service:

  • Does not declare its own network (uses the implicit network from the parent compose file)
  • Assumes a Windmill service named windmill_server exists in the same compose file
  • Uses depends_on: windmill_server to ensure proper startup order
  • Requires an external Docker volume specified in WORKSPACE_VOLUME env var (created via docker volume create windmill-workspace-data)

Code Structure

app/
├── server.py       # Flask application with /health and /sync endpoints
└── sync.py         # Core sync logic (wmill pull → git commit → push)

Important Functions

  • sync.sync_windmill_to_git(): Main entry point for sync operation. Returns dict with success bool and message string.
  • sync.validate_config(): Checks required env vars are set. Raises ValueError if missing.
  • sync.run_wmill_sync(): Executes wmill sync pull command with proper environment variables.
  • sync.commit_and_push_changes(): Stages all changes, commits with automated message, and pushes to remote.

Error Handling

The sync engine uses a try/except pattern that always returns a result dict, never raises to the web server. This ensures webhook requests always get a proper HTTP response with error details in JSON.

Git Workflow

When making changes to this codebase:

  1. Changes are tracked in the project's own Git repository (not the Windmill workspace backup repo)
  2. The service manages commits to the remote backup repository specified in GIT_REMOTE_URL
  3. Commits to the backup repo use the automated format: "Automated Windmill workspace backup - {workspace_name}"

Network Architecture

This service is designed to be added to your existing Windmill docker-compose file. When added, all services share the same Docker Compose network automatically.

Expected service topology within the same docker-compose file:

Services in docker-compose.yml:
├── windmill_server (Windmill API server on port 8000)
├── windmill_worker (Windmill workers)
├── postgres (Database)
└── windmill-git-sync (this service on port 8080)

The service references windmill_server via WINDMILL_BASE_URL=http://windmill_server:8000. If your Windmill server service has a different name, update WINDMILL_BASE_URL in .env.

Extending the Service

Adding Scheduled Syncs

To add cron-based scheduling in addition to webhooks:

  1. Install APScheduler in requirements.txt
  2. Add scheduler initialization in server.py
  3. Update configuration to support SYNC_SCHEDULE env var (e.g., 0 */6 * * * for every 6 hours)

Adding Slack/Discord Notifications

To notify on sync completion:

  1. Add slack-sdk or discord-webhook to requirements.txt
  2. Add notification function in sync.py
  3. Call notification function in sync_windmill_to_git() after successful push
  4. Add webhook URL as env var in .env and docker-compose.yml

Supporting SSH Authentication

To support SSH keys instead of PAT:

  1. Update docker-compose.yml to mount SSH key: ~/.ssh/id_rsa:/root/.ssh/id_rsa:ro
  2. Add logic in sync.get_authenticated_url() to detect SSH vs HTTPS URLs
  3. Configure Git to use SSH: git config core.sshCommand "ssh -i /root/.ssh/id_rsa"