Add MIT License and Docker build validation script

- Add MIT License with William Ballou as copyright holder
- Create scripts/validate_docker_build.sh for testing Docker builds independently
- Update documentation to reflect API-based secret configuration model
- Refactor sync.py to accept config via function parameters instead of env vars
- Update server.py to parse JSON payloads and validate required fields
- Improve security by removing secrets from environment variables
2025-11-09 20:29:52 -05:00
parent c838fa568c
commit 0509c44497
8 changed files with 419 additions and 122 deletions


@@ -1,14 +1,16 @@
-# Windmill Configuration
-WINDMILL_BASE_URL=http://windmill_server:8000
-WINDMILL_TOKEN=your-windmill-token-here
-WINDMILL_WORKSPACE=home
+# ===================================================
+# Add these lines to your existing Windmill .env file
+# ===================================================
+#
+# This service is designed to integrate with your existing Windmill
+# docker-compose setup. The configuration below should be added to
+# the same .env file that contains your existing Windmill settings
+# (like WINDMILL_DATA_PATH).
+#
+# Required: Your existing .env should already have:
+# WINDMILL_DATA_PATH=/path/to/windmill/data
+#
+# ===================================================

-# Workspace Volume (external Docker volume name)
-WORKSPACE_VOLUME=windmill-workspace-data
-
-# Git Configuration
-GIT_REMOTE_URL=https://github.com/username/repo.git
-GIT_TOKEN=your-github-pat-here
-GIT_BRANCH=main
-GIT_USER_NAME=Windmill Git Sync
-GIT_USER_EMAIL=windmill@example.com
+# windmill-git-sync Service Configuration
+WINDMILL_BASE_URL=http://windmill_server:8000


@@ -6,13 +6,16 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
This is a containerized service for synchronizing Windmill workspaces to Git repositories. The service provides a Flask webhook server that Windmill can call to trigger automated backups of workspace content to a remote Git repository. This is a containerized service for synchronizing Windmill workspaces to Git repositories. The service provides a Flask webhook server that Windmill can call to trigger automated backups of workspace content to a remote Git repository.
**Key Security Design**: Secrets (Windmill tokens, Git tokens) are NOT stored in environment variables or docker-compose files. Instead, they are passed dynamically via JSON payload in API requests from Windmill, which manages them in its own secret store.
### Architecture ### Architecture
The system consists of three main components: The system consists of three main components:
1. **Flask Web Server** (`app/server.py`): Lightweight HTTP server that exposes webhook endpoints for triggering syncs and health checks. Only accessible within the Docker network (not exposed to host). 1. **Flask Web Server** (`app/server.py`): Lightweight HTTP server that exposes webhook endpoints for triggering syncs and health checks. Parses JSON payloads containing secrets and configuration for each sync request. Only accessible within the Docker network (not exposed to host).
2. **Sync Engine** (`app/sync.py`): Core logic that orchestrates the sync process: 2. **Sync Engine** (`app/sync.py`): Core logic that orchestrates the sync process:
- Accepts configuration as function parameters (not from environment variables)
- Pulls workspace content from Windmill using the `wmill` CLI - Pulls workspace content from Windmill using the `wmill` CLI
- Manages Git repository state (init on first run, subsequent updates) - Manages Git repository state (init on first run, subsequent updates)
- Commits changes and pushes to remote Git repository with PAT authentication - Commits changes and pushes to remote Git repository with PAT authentication
@@ -22,12 +25,14 @@ The system consists of three main components:
### Key Design Decisions ### Key Design Decisions
- **API-based configuration**: Secrets and sync parameters are passed via JSON payload in each API request. Only infrastructure settings (WINDMILL_BASE_URL, volume names) are in environment variables.
- **Security-first**: No secrets in `.env` files or docker-compose.yml. All sensitive data managed by Windmill and passed per-request.
- **Flexible**: Same container can sync different workspaces to different repositories without reconfiguration or restart.
- **Integrated with Windmill docker-compose**: This service is designed to be added as an additional service in your existing Windmill docker-compose file. It shares the same Docker network and can reference Windmill services directly (e.g., `windmill_server`). - **Integrated with Windmill docker-compose**: This service is designed to be added as an additional service in your existing Windmill docker-compose file. It shares the same Docker network and can reference Windmill services directly (e.g., `windmill_server`).
- **Network isolation**: Service uses `expose` instead of `ports` - accessible only within Docker network, not from host machine. No authentication needed since it's isolated. - **Network isolation**: Service uses `expose` instead of `ports` - accessible only within Docker network, not from host machine. No authentication needed since it's isolated.
- **Webhook-only triggering**: Sync happens only when explicitly triggered via HTTP POST to `/sync`. This gives Windmill full control over backup timing via scheduled flows. - **Webhook-only triggering**: Sync happens only when explicitly triggered via HTTP POST to `/sync` with JSON payload. This gives Windmill full control over backup timing and configuration via scheduled flows.
- **HTTPS + Personal Access Token**: Git authentication uses PAT injected into HTTPS URL (format: `https://TOKEN@github.com/user/repo.git`). No SSH key management required. - **HTTPS + Personal Access Token**: Git authentication uses PAT injected into HTTPS URL (format: `https://TOKEN@github.com/user/repo.git`). No SSH key management required.
- **Stateless operation**: Each sync is independent. The container can be restarted without losing state (workspace data persists in Docker volume). - **Stateless operation**: Each sync is independent. The container can be restarted without losing state (workspace data persists in Docker volume).
- **Single workspace focus**: Designed to sync one Windmill workspace per container instance. For multiple workspaces, run multiple containers with different configurations.
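The HTTPS + Personal Access Token decision in the list above (format `https://TOKEN@github.com/user/repo.git`) can be sketched in Python roughly as follows. This is only an illustration: `inject_token` is a hypothetical helper, and the project's own `get_authenticated_url` is not shown here and may behave differently.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def inject_token(url: str, token: str) -> str:
    """Hypothetical helper: return an HTTPS URL with the token as userinfo."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        # Assumption: non-HTTPS remotes (e.g. SSH) are returned unchanged
        return url
    host = parts.hostname
    if parts.port:
        host = f"{host}:{parts.port}"
    # Percent-encode the token so characters like '/' or '@' cannot break the URL
    netloc = f"{quote(token, safe='')}@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Because the resulting URL embeds the secret, it should be kept out of log messages; logging the original, unauthenticated URL is the safer choice.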
## Common Development Commands ## Common Development Commands
@@ -50,11 +55,21 @@ docker-compose down
### Testing ### Testing
```bash ```bash
# Test the sync manually (from inside container) # Test the sync manually (from inside container) - requires env vars for testing
docker-compose exec windmill-git-sync python app/sync.py docker-compose exec windmill-git-sync python app/sync.py
# Test webhook endpoint (from another container in the network) # Test webhook endpoint with JSON payload (from another container in the network)
docker-compose exec windmill_server curl -X POST http://windmill-git-sync:8080/sync docker-compose exec windmill_server curl -X POST http://windmill-git-sync:8080/sync \
-H "Content-Type: application/json" \
-d '{
"windmill_token": "your-token",
"git_remote_url": "https://github.com/user/repo.git",
"git_token": "your-git-token",
"workspace": "admins",
"git_branch": "main",
"git_user_name": "Windmill Git Sync",
"git_user_email": "windmill@example.com"
}'
# Health check (from another container in the network) # Health check (from another container in the network)
docker-compose exec windmill_server curl http://windmill-git-sync:8080/health docker-compose exec windmill_server curl http://windmill-git-sync:8080/health
@@ -79,12 +94,29 @@ docker-compose exec windmill-git-sync ls -la /workspace
## Environment Configuration ## Environment Configuration
All configuration is done via `.env` file (copy from `.env.example`). Required variables: Configuration is split between infrastructure (`.env` file) and secrets (API payload):
- `WINDMILL_TOKEN`: API token from Windmill for workspace access ### Infrastructure Configuration (.env file)
- `WORKSPACE_VOLUME`: External Docker volume name for persistent workspace storage (default: `windmill-workspace-data`)
- `GIT_REMOTE_URL`: HTTPS URL of Git repository (e.g., `https://github.com/user/repo.git`) **Integration Approach:** This service's configuration should be **added to your existing Windmill `.env` file**, not maintained as a separate file. The `.env.example` file shows what to add.
- `GIT_TOKEN`: Personal Access Token with repo write permissions
Required in your Windmill `.env` file:
- `WINDMILL_DATA_PATH`: Path to Windmill data directory (should already exist in your Windmill setup)
- `WINDMILL_BASE_URL`: URL of Windmill instance (default: `http://windmill_server:8000`)
The docker-compose service uses `${WINDMILL_DATA_PATH}/workspace` as the volume mount path for workspace data.
### Secrets Configuration (API Payload)
Secrets are passed in the JSON body of POST requests to `/sync`:
- `windmill_token` (required): Windmill API token for workspace access
- `git_remote_url` (required): HTTPS URL of Git repository (e.g., `https://github.com/user/repo.git`)
- `git_token` (required): Personal Access Token with repo write permissions
- `workspace` (optional): Workspace name to sync (default: `admins`)
- `git_branch` (optional): Branch to push to (default: `main`)
- `git_user_name` (optional): Git commit author name (default: `Windmill Git Sync`)
- `git_user_email` (optional): Git commit author email (default: `windmill@example.com`)
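As a rough illustration of how the optional fields above fall back to their documented defaults, the sketch below merges a request payload over a defaults table. `OPTIONAL_DEFAULTS` and `with_defaults` are hypothetical names used only for this example; the service itself applies defaults with `config.get(...)` calls inside each function.

```python
from typing import Any, Dict

# Defaults as documented for the optional payload fields
OPTIONAL_DEFAULTS: Dict[str, str] = {
    "workspace": "admins",
    "git_branch": "main",
    "git_user_name": "Windmill Git Sync",
    "git_user_email": "windmill@example.com",
}


def with_defaults(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: fill in defaults for optional fields the caller omitted."""
    merged: Dict[str, Any] = dict(OPTIONAL_DEFAULTS)
    merged.update(payload)  # values supplied by the caller win over defaults
    return merged
```

A payload that contains only the three required fields would then sync the `admins` workspace to the `main` branch.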
### Docker Compose Integration ### Docker Compose Integration
@@ -92,7 +124,9 @@ The `docker-compose.yml` file contains a service definition meant to be **added
 - Does not declare its own network (uses the implicit network from the parent compose file)
 - Assumes a Windmill service named `windmill_server` exists in the same compose file
 - Uses `depends_on: windmill_server` to ensure proper startup order
-- Requires an external Docker volume specified in `WORKSPACE_VOLUME` env var (created via `docker volume create windmill-workspace-data`)
+- Mounts workspace directory from existing Windmill data path: `${WINDMILL_DATA_PATH}/workspace:/workspace`
+- Only exposes infrastructure config as environment variables (no secrets)
+- Reads from the same `.env` file as your Windmill services

 ## Code Structure
@@ -104,13 +138,18 @@ app/
### Important Functions ### Important Functions
- `sync.sync_windmill_to_git()`: Main entry point for sync operation. Returns dict with `success` bool and `message` string. - `sync.sync_windmill_to_git(config: Dict[str, Any])`: Main entry point for sync operation. Accepts config dictionary with secrets and parameters. Returns dict with `success` bool and `message` string.
- `sync.validate_config()`: Checks required env vars are set. Raises ValueError if missing. - `sync.validate_config(config: Dict[str, Any])`: Validates required fields are present in config dict. Raises ValueError if missing required fields (windmill_token, git_remote_url, git_token).
- `sync.run_wmill_sync()`: Executes `wmill sync pull` command with proper environment variables. - `sync.run_wmill_sync(config: Dict[str, Any])`: Executes `wmill sync pull` command using config parameters, not environment variables.
- `sync.commit_and_push_changes()`: Stages all changes, commits with automated message, and pushes to remote. - `sync.commit_and_push_changes(repo: Repo, config: Dict[str, Any])`: Stages all changes, commits with automated message, and pushes to remote using config parameters.
### Error Handling ### Error Handling
The server validates JSON payloads and returns appropriate HTTP status codes:
- **400 Bad Request**: Missing required fields or invalid JSON
- **200 OK**: Sync succeeded (returns success dict)
- **500 Internal Server Error**: Sync failed (returns error dict)
The sync engine uses a try/except pattern that always returns a result dict, never raises to the web server. This ensures webhook requests always get a proper HTTP response with error details in JSON. The sync engine uses a try/except pattern that always returns a result dict, never raises to the web server. This ensures webhook requests always get a proper HTTP response with error details in JSON.
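The pattern described above (always return a result dict, then map it to an HTTP status) might look roughly like this. It is a simplified sketch under stated assumptions, not the project's code: `run_safely` and `status_for` are hypothetical names, and the sync step is passed in as a plain callable so the example stays self-contained.

```python
from typing import Any, Callable, Dict


def run_safely(sync_fn: Callable[[Dict[str, Any]], str], config: Dict[str, Any]) -> Dict[str, Any]:
    """Call sync_fn and convert any exception into a result dict."""
    try:
        message = sync_fn(config)
        return {"success": True, "message": message}
    except Exception as exc:  # deliberately broad: nothing may escape to the web server
        return {"success": False, "message": f"Sync failed: {exc}"}


def status_for(result: Dict[str, Any]) -> int:
    """Map a result dict to the documented status: 200 on success, 500 on failure."""
    return 200 if result["success"] else 500
```

The 400 case is handled earlier, during payload validation, so it never reaches this code path.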
## Git Workflow ## Git Workflow
@@ -118,7 +157,7 @@ The sync engine uses a try/except pattern that always returns a result dict, nev
 When making changes to this codebase:

 1. Changes are tracked in the project's own Git repository (not the Windmill workspace backup repo)
-2. The service manages commits to the **remote backup repository** specified in `GIT_REMOTE_URL`
+2. The service manages commits to the **remote backup repository** specified in the API payload's `git_remote_url`
 3. Commits to the backup repo use the automated format: "Automated Windmill workspace backup - {workspace_name}"

 ## Network Architecture

21
LICENSE Normal file

@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 William Ballou

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

167
README.md

@@ -6,9 +6,13 @@ A containerized service for syncing Windmill workspaces to Git repositories via
This service provides automated backup of Windmill workspaces to Git. It runs a lightweight Flask web server that responds to webhook requests from Windmill, syncing the workspace content using the `wmill` CLI and pushing changes to a remote Git repository. This service provides automated backup of Windmill workspaces to Git. It runs a lightweight Flask web server that responds to webhook requests from Windmill, syncing the workspace content using the `wmill` CLI and pushing changes to a remote Git repository.
**Security Model**: Secrets are managed by Windmill and passed via API requests, not stored in environment variables or docker-compose files.
## Features ## Features
- **Webhook-triggered sync**: Windmill can trigger backups via HTTP POST requests - **Webhook-triggered sync**: Windmill can trigger backups via HTTP POST requests with dynamic configuration
- **Secure by default**: No secrets in environment variables - all sensitive data passed via API payload
- **Flexible**: Same container can sync different workspaces to different repositories per request
- **Dockerized**: Runs as a container in the same network as Windmill - **Dockerized**: Runs as a container in the same network as Windmill
- **Git integration**: Automatic commits and pushes to remote repository - **Git integration**: Automatic commits and pushes to remote repository
- **Authentication**: Supports Personal Access Token (PAT) authentication for Git - **Authentication**: Supports Personal Access Token (PAT) authentication for Git
@@ -16,72 +20,163 @@ This service provides automated backup of Windmill workspaces to Git. It runs a
## Quick Start ## Quick Start
This service is designed to be added to your existing Windmill docker-compose file. This service is designed to be added to your existing Windmill docker-compose setup.
1. Copy the example environment file: ### Prerequisites
- An existing Windmill docker-compose installation with a `.env` file that includes `WINDMILL_DATA_PATH`
### Installation Steps
1. **Add configuration to your Windmill `.env` file:**
Add the configuration from `.env.example` to your existing Windmill `.env` file:
```bash ```bash
cp .env.example .env # Add to your existing Windmill .env file
WINDMILL_BASE_URL=http://windmill_server:8000
``` ```
2. Edit `.env` with your configuration: Your `.env` should already have `WINDMILL_DATA_PATH` defined (e.g., `WINDMILL_DATA_PATH=/mnt/user/appdata/windmill`).
- Set `WINDMILL_TOKEN` to your Windmill API token
- Set `GIT_REMOTE_URL` to your Git repository URL
- Set `GIT_TOKEN` to your Git Personal Access Token
- Set `WORKSPACE_VOLUME` to an external Docker volume name
3. Create the external volume: 2. **Add the service to your docker-compose file:**
```bash
docker volume create windmill-workspace-data
```
4. Add the `windmill-git-sync` service block from `docker-compose.yml` to your existing Windmill docker-compose file. Add the `windmill-git-sync` service block from `docker-compose.yml` to your existing Windmill docker-compose file.
5. Build and start the service: 3. **Build and start the service:**
```bash ```bash
docker-compose up -d windmill-git-sync docker-compose up -d windmill-git-sync
``` ```
6. Trigger a sync from Windmill (see Integration section below) or test from another container: 4. **Configure secrets in Windmill:**
```bash
docker-compose exec windmill_server curl -X POST http://windmill-git-sync:8080/sync Store your tokens in Windmill's variable/resource system and trigger syncs from Windmill flows (see Integration section below).
```
## Configuration ## Configuration
All configuration is done via environment variables in `.env`: Configuration is split between infrastructure settings (in `.env`) and secrets (passed via API):
| Variable | Required | Description | ### Infrastructure Configuration (.env file)
|----------|----------|-------------|
| `WINDMILL_BASE_URL` | Yes | URL of Windmill instance (e.g., `http://windmill:8000`) | **Note:** These settings should be added to your existing Windmill `.env` file.
| `WINDMILL_TOKEN` | Yes | Windmill API token for authentication |
| `WINDMILL_WORKSPACE` | No | Workspace name (default: `default`) | | Variable | Required | Default | Description |
| `WORKSPACE_VOLUME` | Yes | External Docker volume name for workspace data | |----------|----------|---------|-------------|
| `GIT_REMOTE_URL` | Yes | HTTPS Git repository URL | | `WINDMILL_BASE_URL` | No | `http://windmill_server:8000` | URL of Windmill instance |
| `GIT_TOKEN` | Yes | Git Personal Access Token | | `WINDMILL_DATA_PATH` | Yes | - | Path to Windmill data directory (should already exist in your Windmill .env) |
| `GIT_BRANCH` | No | Branch to push to (default: `main`) |
| `GIT_USER_NAME` | No | Git commit author name | ### Secrets Configuration (API payload)
| `GIT_USER_EMAIL` | No | Git commit author email |
Secrets are **not stored in environment variables**. Instead, they are passed in the JSON payload of each `/sync` request:
| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `windmill_token` | Yes | - | Windmill API token for authentication |
| `git_remote_url` | Yes | - | HTTPS Git repository URL (e.g., `https://github.com/user/repo.git`) |
| `git_token` | Yes | - | Git Personal Access Token with write access |
| `workspace` | No | `admins` | Windmill workspace name to sync |
| `git_branch` | No | `main` | Git branch to push to |
| `git_user_name` | No | `Windmill Git Sync` | Git commit author name |
| `git_user_email` | No | `windmill@example.com` | Git commit author email |
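To make the Required column concrete, here is a small Python sketch of the kind of check the table implies: any required field that is absent or empty is reported by name. `REQUIRED_FIELDS` and `missing_required` are illustrative names chosen for this example and should be read as assumptions, not as the service's public API.

```python
from typing import Any, Dict, List

# The three fields marked "Yes" in the table above
REQUIRED_FIELDS = ("windmill_token", "git_remote_url", "git_token")


def missing_required(payload: Dict[str, Any]) -> List[str]:
    """Return the required field names that are absent or empty in the payload."""
    return [name for name in REQUIRED_FIELDS if not payload.get(name)]


def validation_message(payload: Dict[str, Any]) -> str:
    """Build an error message for missing fields, or an empty string if none are missing."""
    missing = missing_required(payload)
    return f"Missing required fields: {', '.join(missing)}" if missing else ""
```

Treating empty strings as missing means a caller cannot satisfy the check by sending a blank token.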
## API Endpoints ## API Endpoints
This service is only accessible within the Docker network (not exposed to the host). This service is only accessible within the Docker network (not exposed to the host).
- `GET /health` - Health check endpoint ### `GET /health`
- `POST /sync` - Trigger a workspace sync to Git
Health check endpoint.
**Response:**
```json
{
"status": "healthy"
}
```
### `POST /sync`
Trigger a workspace sync to Git.
**Request Body (JSON):**
```json
{
"windmill_token": "your-windmill-token",
"git_remote_url": "https://github.com/username/repo.git",
"git_token": "ghp_your_github_token",
"workspace": "my-workspace",
"git_branch": "main",
"git_user_name": "Windmill Git Sync",
"git_user_email": "windmill@example.com"
}
```
**Success Response (200):**
```json
{
"success": true,
"message": "Successfully synced workspace 'my-workspace' to Git"
}
```
**Validation Error Response (400):**
```json
{
"success": false,
"message": "Missing required fields: windmill_token, git_remote_url"
}
```
**Sync Error Response (500):**
```json
{
"success": false,
"message": "Git push failed: authentication error"
}
```
## Integration with Windmill ## Integration with Windmill
Create a scheduled flow or script in Windmill to trigger backups: Create a scheduled flow or script in Windmill to trigger backups. Store secrets in Windmill's variable/resource system:
```typescript ```typescript
export async function main() { type Windmill = {
token: string;
}
type Github = {
token: string;
}
export async function main(
windmill: Windmill,
github: Github
) {
const response = await fetch('http://windmill-git-sync:8080/sync', { const response = await fetch('http://windmill-git-sync:8080/sync', {
method: 'POST' method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
windmill_token: windmill.token,
git_remote_url: 'https://github.com/username/repo.git',
git_token: github.token,
workspace: 'my-workspace', // optional, defaults to 'admins'
git_branch: 'main', // optional, defaults to 'main'
git_user_name: 'Windmill Git Sync', // optional
git_user_email: 'windmill@example.com' // optional
})
}); });
return await response.json(); return await response.json();
} }
``` ```
**Setting up Windmill Resources:**
1. In Windmill, create a Variable or Resource for your Windmill token
2. Create another Variable or Resource for your GitHub PAT
3. Schedule the above script to run on your desired backup schedule (e.g., hourly, daily)
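For callers that prefer Python, the same request could be assembled along these lines. This is a hedged sketch using only the standard library: `build_sync_request` is a hypothetical helper, the URL is the service name and port used in the examples above, and how the tokens are fetched from Windmill's variable/resource system is left out.

```python
import json
import urllib.request
from typing import Any, Dict

# Service name and port as used in the examples above
SYNC_URL = "http://windmill-git-sync:8080/sync"


def build_sync_request(
    windmill_token: str,
    git_token: str,
    git_remote_url: str,
    **optional: Any,
) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) the POST /sync request."""
    payload: Dict[str, Any] = {
        "windmill_token": windmill_token,
        "git_remote_url": git_remote_url,
        "git_token": git_token,
        **optional,  # e.g. workspace="my-workspace", git_branch="main"
    }
    return urllib.request.Request(
        SYNC_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(req)` only works from inside the Docker network, and `urlopen` raises `urllib.error.HTTPError` for the 400 and 500 responses, so the JSON error body has to be read from the exception in those cases.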
## Development ## Development
See [CLAUDE.md](CLAUDE.md) for development instructions and architecture details. See [CLAUDE.md](CLAUDE.md) for development instructions and architecture details.


@@ -4,7 +4,7 @@ Flask server for receiving webhook triggers from Windmill to sync workspace to G
 Internal service - not exposed outside Docker network.
 """
 import logging
-from flask import Flask, jsonify
+from flask import Flask, jsonify, request
 from sync import sync_windmill_to_git

 # Configure logging
@@ -28,11 +28,54 @@ def trigger_sync():
""" """
Trigger a sync from Windmill workspace to Git repository. Trigger a sync from Windmill workspace to Git repository.
This endpoint is only accessible within the Docker network. This endpoint is only accessible within the Docker network.
Expected JSON payload:
{
"windmill_token": "string (required)",
"git_remote_url": "string (required)",
"git_token": "string (required)",
"workspace": "string (optional, default: 'admins')",
"git_branch": "string (optional, default: 'main')",
"git_user_name": "string (optional, default: 'Windmill Git Sync')",
"git_user_email": "string (optional, default: 'windmill@example.com')"
}
""" """
logger.info("Sync triggered via webhook") logger.info("Sync triggered via webhook")
# Parse JSON payload
try: try:
result = sync_windmill_to_git() config = request.get_json(force=True)
if not config:
return jsonify({
'success': False,
'message': 'Request body must be valid JSON'
}), 400
except Exception as e:
logger.error(f"Failed to parse JSON payload: {str(e)}")
return jsonify({
'success': False,
'message': f'Invalid JSON payload: {str(e)}'
}), 400
# Validate required fields
required_fields = ['windmill_token', 'git_remote_url', 'git_token']
missing_fields = [field for field in required_fields if not config.get(field)]
if missing_fields:
error_message = f"Missing required fields: {', '.join(missing_fields)}"
logger.error(error_message)
return jsonify({
'success': False,
'message': error_message
}), 400
# Log configuration (without exposing secrets)
workspace = config.get('workspace', 'admins')
git_branch = config.get('git_branch', 'main')
logger.info(f"Sync configuration - workspace: {workspace}, branch: {git_branch}")
try:
result = sync_windmill_to_git(config)
if result['success']: if result['success']:
logger.info(f"Sync completed successfully: {result['message']}") logger.info(f"Sync completed successfully: {result['message']}")


@@ -6,35 +6,31 @@ import os
import subprocess import subprocess
import logging import logging
from pathlib import Path from pathlib import Path
from typing import Dict, Any
from git import Repo, GitCommandError from git import Repo, GitCommandError
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
# Configuration from environment variables # Configuration from environment variables (infrastructure only)
WORKSPACE_DIR = Path('/workspace') WORKSPACE_DIR = Path('/workspace')
WINDMILL_BASE_URL = os.getenv('WINDMILL_BASE_URL', 'http://windmill:8000') WINDMILL_BASE_URL = os.getenv('WINDMILL_BASE_URL', 'http://windmill_server:8000')
WINDMILL_TOKEN = os.getenv('WINDMILL_TOKEN', '')
WINDMILL_WORKSPACE = os.getenv('WINDMILL_WORKSPACE', 'default')
GIT_REMOTE_URL = os.getenv('GIT_REMOTE_URL', '')
GIT_TOKEN = os.getenv('GIT_TOKEN', '')
GIT_BRANCH = os.getenv('GIT_BRANCH', 'main')
GIT_USER_NAME = os.getenv('GIT_USER_NAME', 'Windmill Git Sync')
GIT_USER_EMAIL = os.getenv('GIT_USER_EMAIL', 'windmill@example.com')
def validate_config(): def validate_config(config: Dict[str, Any]) -> None:
"""Validate required configuration is present.""" """
missing = [] Validate required configuration is present in the provided config dict.
if not WINDMILL_TOKEN: Args:
missing.append('WINDMILL_TOKEN') config: Configuration dictionary with sync parameters
if not GIT_REMOTE_URL:
missing.append('GIT_REMOTE_URL') Raises:
if not GIT_TOKEN: ValueError: If required fields are missing
missing.append('GIT_TOKEN') """
required_fields = ['windmill_token', 'git_remote_url', 'git_token']
missing = [field for field in required_fields if not config.get(field)]
if missing: if missing:
raise ValueError(f"Missing required environment variables: {', '.join(missing)}") raise ValueError(f"Missing required fields: {', '.join(missing)}")
def get_authenticated_url(url: str, token: str) -> str: def get_authenticated_url(url: str, token: str) -> str:
@@ -45,14 +41,28 @@ def get_authenticated_url(url: str, token: str) -> str:
return url return url
def run_wmill_sync(): def run_wmill_sync(config: Dict[str, Any]) -> bool:
"""Run wmill sync to pull workspace from Windmill.""" """
logger.info(f"Syncing Windmill workspace '{WINDMILL_WORKSPACE}' from {WINDMILL_BASE_URL}") Run wmill sync to pull workspace from Windmill.
Args:
config: Configuration dictionary containing windmill_token and workspace
Returns:
bool: True if sync was successful
Raises:
RuntimeError: If wmill sync command fails
"""
workspace = config.get('workspace', 'admins')
windmill_token = config['windmill_token']
logger.info(f"Syncing Windmill workspace '{workspace}' from {WINDMILL_BASE_URL}")
env = os.environ.copy() env = os.environ.copy()
env['WM_BASE_URL'] = WINDMILL_BASE_URL env['WM_BASE_URL'] = WINDMILL_BASE_URL
env['WM_TOKEN'] = WINDMILL_TOKEN env['WM_TOKEN'] = windmill_token
env['WM_WORKSPACE'] = WINDMILL_WORKSPACE env['WM_WORKSPACE'] = workspace
try: try:
# Run wmill sync in the workspace directory # Run wmill sync in the workspace directory
@@ -75,8 +85,19 @@ def run_wmill_sync():
raise RuntimeError(f"Failed to sync from Windmill: {e.stderr}") raise RuntimeError(f"Failed to sync from Windmill: {e.stderr}")
def init_or_update_git_repo(): def init_or_update_git_repo(config: Dict[str, Any]) -> Repo:
"""Initialize Git repository or open existing one.""" """
Initialize Git repository or open existing one.
Args:
config: Configuration dictionary containing optional git_user_name and git_user_email
Returns:
Repo: GitPython repository object
"""
git_user_name = config.get('git_user_name', 'Windmill Git Sync')
git_user_email = config.get('git_user_email', 'windmill@example.com')
git_dir = WORKSPACE_DIR / '.git' git_dir = WORKSPACE_DIR / '.git'
if git_dir.exists(): if git_dir.exists():
@@ -87,14 +108,31 @@ def init_or_update_git_repo():
repo = Repo.init(WORKSPACE_DIR) repo = Repo.init(WORKSPACE_DIR)
# Configure user # Configure user
repo.config_writer().set_value("user", "name", GIT_USER_NAME).release() repo.config_writer().set_value("user", "name", git_user_name).release()
repo.config_writer().set_value("user", "email", GIT_USER_EMAIL).release() repo.config_writer().set_value("user", "email", git_user_email).release()
return repo return repo
def commit_and_push_changes(repo: Repo): def commit_and_push_changes(repo: Repo, config: Dict[str, Any]) -> bool:
"""Commit changes and push to remote Git repository.""" """
Commit changes and push to remote Git repository.
Args:
repo: GitPython Repo object
config: Configuration dictionary containing git_remote_url, git_token, git_branch, and workspace
Returns:
bool: True if changes were committed and pushed, False if no changes
Raises:
RuntimeError: If git push fails
"""
workspace = config.get('workspace', 'admins')
git_remote_url = config['git_remote_url']
git_token = config['git_token']
git_branch = config.get('git_branch', 'main')
# Check if there are any changes # Check if there are any changes
if not repo.is_dirty(untracked_files=True): if not repo.is_dirty(untracked_files=True):
logger.info("No changes to commit") logger.info("No changes to commit")
@@ -104,12 +142,12 @@ def commit_and_push_changes(repo: Repo):
repo.git.add(A=True) repo.git.add(A=True)
# Create commit # Create commit
commit_message = f"Automated Windmill workspace backup - {WINDMILL_WORKSPACE}" commit_message = f"Automated Windmill workspace backup - {workspace}"
repo.index.commit(commit_message) repo.index.commit(commit_message)
logger.info(f"Created commit: {commit_message}") logger.info(f"Created commit: {commit_message}")
# Configure remote with authentication # Configure remote with authentication
authenticated_url = get_authenticated_url(GIT_REMOTE_URL, GIT_TOKEN) authenticated_url = get_authenticated_url(git_remote_url, git_token)
try: try:
# Check if remote exists # Check if remote exists
@@ -120,8 +158,8 @@ def commit_and_push_changes(repo: Repo):
origin = repo.create_remote('origin', authenticated_url) origin = repo.create_remote('origin', authenticated_url)
# Push to remote # Push to remote
logger.info(f"Pushing to {GIT_REMOTE_URL} (branch: {GIT_BRANCH})") logger.info(f"Pushing to {git_remote_url} (branch: {git_branch})")
origin.push(refspec=f'HEAD:{GIT_BRANCH}', force=False) origin.push(refspec=f'HEAD:{git_branch}', force=False)
logger.info("Push completed successfully") logger.info("Push completed successfully")
return True return True
@@ -131,28 +169,40 @@ def commit_and_push_changes(repo: Repo):
         raise RuntimeError(f"Failed to push to Git remote: {str(e)}")
-def sync_windmill_to_git():
+def sync_windmill_to_git(config: Dict[str, Any]) -> Dict[str, Any]:
     """
     Main sync function: pulls from Windmill, commits, and pushes to Git.
+    Args:
+        config: Configuration dictionary with the following keys:
+            - windmill_token (required): Windmill API token
+            - git_remote_url (required): Git repository URL
+            - git_token (required): Git authentication token
+            - workspace (optional): Windmill workspace name (default: "admins")
+            - git_branch (optional): Git branch to push to (default: "main")
+            - git_user_name (optional): Git commit author name (default: "Windmill Git Sync")
+            - git_user_email (optional): Git commit author email (default: "windmill@example.com")
     Returns:
         dict: Result with 'success' boolean and 'message' string
     """
     try:
         # Validate configuration
-        validate_config()
+        validate_config(config)
+        workspace = config.get('workspace', 'admins')
         # Pull from Windmill
-        run_wmill_sync()
+        run_wmill_sync(config)
         # Initialize/update Git repo
-        repo = init_or_update_git_repo()
+        repo = init_or_update_git_repo(config)
         # Commit and push changes
-        has_changes = commit_and_push_changes(repo)
+        has_changes = commit_and_push_changes(repo, config)
         if has_changes:
-            message = f"Successfully synced workspace '{WINDMILL_WORKSPACE}' to Git"
+            message = f"Successfully synced workspace '{workspace}' to Git"
         else:
             message = "Sync completed - no changes to commit"
@@ -170,7 +220,17 @@ def sync_windmill_to_git():
 if __name__ == '__main__':
-    # Allow running sync directly for testing
+    # Allow running sync directly for testing with environment variables
     logging.basicConfig(level=logging.INFO)
-    result = sync_windmill_to_git()
+    # For testing: load config from environment variables
+    test_config = {
+        'windmill_token': os.getenv('WINDMILL_TOKEN', ''),
+        'git_remote_url': os.getenv('GIT_REMOTE_URL', ''),
+        'git_token': os.getenv('GIT_TOKEN', ''),
+        'workspace': os.getenv('WINDMILL_WORKSPACE', 'admins'),
+        'git_branch': os.getenv('GIT_BRANCH', 'main')
+    }
+    result = sync_windmill_to_git(test_config)
     print(result)
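
The diff calls `validate_config(config)` but does not show its body. As a minimal sketch of what such a check could look like, assuming only the required and optional keys listed in the `sync_windmill_to_git` docstring: the helper names `missing_required_keys` and `apply_defaults` are hypothetical and are not taken from the commit. Empty strings are treated as missing because the `__main__` test block falls back to `''` when an environment variable is unset.

```python
from typing import Any, Dict, List

# Keys and defaults copied from the sync_windmill_to_git docstring.
REQUIRED_KEYS = ("windmill_token", "git_remote_url", "git_token")
OPTIONAL_DEFAULTS = {
    "workspace": "admins",
    "git_branch": "main",
    "git_user_name": "Windmill Git Sync",
    "git_user_email": "windmill@example.com",
}


def missing_required_keys(config: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: names of required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def apply_defaults(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: copy of config with the documented defaults filled in."""
    merged = dict(OPTIONAL_DEFAULTS)
    # Only non-empty values override a default.
    merged.update({k: v for k, v in config.items() if v not in (None, "")})
    return merged


example = {
    "windmill_token": "",  # unset env var in the __main__ test block
    "git_remote_url": "https://github.com/username/repo.git",
    "git_token": "your-github-pat-here",
}
print(missing_required_keys(example))
print(apply_defaults(example)["git_branch"])
```

A real `validate_config` would presumably raise an error when the list of missing keys is non-empty; how the commit reports that failure is not visible in this hunk.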


@@ -7,19 +7,9 @@ services:
     expose:
       - "8080"
     volumes:
-      - ${WORKSPACE_VOLUME}:/workspace
+      - ${WINDMILL_DATA_PATH}/workspace:/workspace
     environment:
-      # Windmill connection
-      - WINDMILL_BASE_URL=http://windmill_server:8000
-      - WINDMILL_TOKEN=${WINDMILL_TOKEN}
-      - WINDMILL_WORKSPACE=${WINDMILL_WORKSPACE:-default}
-      # Git configuration
-      - GIT_REMOTE_URL=${GIT_REMOTE_URL}
-      - GIT_TOKEN=${GIT_TOKEN}
-      - GIT_BRANCH=${GIT_BRANCH:-main}
-      - GIT_USER_NAME=${GIT_USER_NAME:-Windmill Git Sync}
-      - GIT_USER_EMAIL=${GIT_USER_EMAIL:-windmill@example.com}
+      - WINDMILL_BASE_URL=${WINDMILL_BASE_URL:-http://windmill_server:8000}
     restart: unless-stopped
     depends_on:
       - windmill_server
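
With the token and Git variables removed from the container environment, the caller has to send them in the request body instead. The following is a rough sketch of how a caller might build that JSON payload, not the commit's actual client code. The host name `windmill-git-sync` and the `/sync` path are assumptions made for illustration; only port 8080 and the config keys come from the diff.

```python
import json
import urllib.request
from typing import Any, Dict


def build_sync_request(base_url: str, config: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a JSON POST carrying the sync configuration."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/sync",  # "/sync" is a hypothetical endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request(
    "http://windmill-git-sync:8080",  # hypothetical service host; port from `expose`
    {
        "windmill_token": "your-windmill-token-here",
        "git_remote_url": "https://github.com/username/repo.git",
        "git_token": "your-github-pat-here",
        "workspace": "admins",
        "git_branch": "main",
    },
)
print(request.get_method(), request.full_url)
# Passing `request` to urllib.request.urlopen would actually send it.
```

Keeping the secrets in the payload means they are never written to the service's `.env` file or shown in `docker inspect` output, which is the security motivation stated in the commit message.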


@@ -0,0 +1,47 @@
#!/bin/bash
# Script to validate Docker build without requiring full Windmill docker-compose setup
set -e
echo "=== Validating Docker Build for Windmill Git Sync ==="
echo ""
# Build the Docker image
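echo "Building Docker image..."
# Test the build command directly. With `set -e`, a failed build would abort the script
# before any later `$?` check, and `$?` after an echo reports the echo's status, not the build's.
if docker build -t windmill-git-sync:test .; then
echo ""
echo "=== Build Status ==="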
echo "✓ Docker image built successfully"
# Show image details
echo ""
echo "=== Image Details ==="
docker images windmill-git-sync:test --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}\t{{.CreatedAt}}"
# Verify Python dependencies are installed
echo ""
echo "=== Verifying Python Dependencies ==="
docker run --rm windmill-git-sync:test pip list | grep -E "(Flask|GitPython|requests|python-dotenv)"
# Verify wmill CLI is installed
echo ""
echo "=== Verifying wmill CLI ==="
docker run --rm windmill-git-sync:test wmill --version
# Verify Git is installed
echo ""
echo "=== Verifying Git ==="
docker run --rm windmill-git-sync:test git --version
echo ""
echo "=== Validation Complete ==="
echo "✓ All checks passed"
echo ""
echo "To clean up the test image, run:"
echo " docker rmi windmill-git-sync:test"
else
echo "✗ Docker build failed"
exit 1
fi