ClickHouse Migrations with Alembic: Announcing clickhouse-migrate

2026-01-25 - 9 min read
Daniel Young
Founder, DRYCodeWorks

'Did we run that ALTER on staging?' 'Which version of the schema is in prod?' If you're managing ClickHouse schemas with raw DDL and prayer, we built something better. clickhouse-migrate brings Alembic-style migrations to ClickHouse—with the zero-downtime patterns we learned the hard way.

You've chosen ClickHouse for your analytics workload. The queries are fast, the storage is efficient, and the real-time ingestion keeps pace with your data volume. Then you need to add a column to a table with 500 million rows.

In PostgreSQL, you'd write an Alembic migration, run alembic upgrade head, and move on. In ClickHouse, you're writing raw DDL, copy-pasting it between environments, and hoping you remember which version of the schema is actually running in production.

We learned this the hard way while building IoT analytics infrastructure on ClickHouse Cloud. Our "system" was a folder of SQL files, a spreadsheet tracking what had been applied where, and a lot of Slack messages asking "did anyone run that ALTER on staging?" When a schema change broke our ingestion pipeline at 2am—because someone ran the wrong version of a migration—we decided to build proper tooling.

The result is clickhouse-migrate: an Alembic-based migration tool designed specifically for ClickHouse. It tracks state, manages environments, and handles the patterns that make ClickHouse schema changes tricky. Today we're releasing it as open source.

Why Not Just Use Raw DDL?

You could manage ClickHouse schemas with raw SQL files and bash scripts. We did that for a while. Here's what you're signing up for:

  • No state tracking. Which migrations have been applied to staging? Production? You'll need a spreadsheet—or a very good memory.
  • Manual ordering. Run migrations out of order and you'll get cryptic errors. Hope you remembered the dependency chain.
  • No rollback path. Something went wrong? Write a reversal script from scratch, under pressure, at 2am.
  • Environment drift. Dev has columns that staging doesn't. Production has an index nobody remembers adding. Good luck debugging.
  • No audit trail. "Who changed this table and when?" becomes an archaeological expedition through Slack history.

If you're running ClickHouse in production with more than one environment, you'll hit these problems eventually. clickhouse-migrate solves them the same way Alembic solves them for PostgreSQL—with version-controlled, tracked, reversible migrations.

What is clickhouse-migrate?

clickhouse-migrate - A CLI tool that brings Alembic-style migrations to ClickHouse, with multi-environment support, SSM secret management, and patterns for zero-downtime schema changes.

Bootstrap - The process of creating a database, roles, and users with appropriate permissions. clickhouse-migrate makes this idempotent and repeatable.

The tool handles the full lifecycle:

  • State tracking - Alembic tracks exactly which migrations have been applied to each environment
  • Rollback support - Every migration has a downgrade path for when things go wrong
  • Database bootstrapping - Creates databases, roles, and users with proper grants (idempotent)
  • Multi-environment config - Separate dev, staging, and production configurations in YAML
  • SSM integration - Store production credentials in AWS Parameter Store

The typical workflow: ch-migrate init (project setup) and ch-migrate bootstrap (DB + roles) run once per project; ch-migrate new (create migration), ch-migrate up (apply changes), and ch-migrate down (rollback) form the loop you repeat as the schema evolves.

Quick Start

Install the CLI globally:

bash
# Install with uv (recommended)
uv tool install git+https://github.com/DRYCodeWorks/clickhouse-migrate.git

# Verify installation
ch-migrate --version

Initialize a new project:

bash
mkdir my-clickhouse-project && cd my-clickhouse-project
ch-migrate init --name my_project

This creates a structured project:

text
my-clickhouse-project/
├── config.yaml              # ClickHouse hosts and settings
├── .env.local.example       # Template for local secrets
├── alembic.ini
└── migrations/
    ├── env.py
    ├── script.py.mako
    ├── versions/            # Migration files
    └── sql/
        └── history/         # Versioned SQL organized by object
            ├── tables/
            ├── views/
            └── dictionaries/

Configuration

The config.yaml file defines your environments. Here's an example for a multi-region deployment on ClickHouse Cloud:

config.yaml
yaml
project:
  name: my_analytics

defaults:
  port: 8443
  secure: true
  admin_user: default

environments:
  # Local development with Docker
  dev:
    host: localhost
    port: 8123
    secure: false
    database: default
    migration_user: default

  # Staging (us-west-2)
  staging:
    host: abc123xyz.us-west-2.aws.clickhouse.cloud
    database: default
    migration_user: migration_staging
    aws_region: us-west-2
    ssm:
      admin_password: /database/clickhouse_credentials#password
      migration_password: /database/clickhouse_credentials#password

  # Production us-west-2
  production_usw2:
    host: def456xyz.us-west-2.aws.clickhouse.cloud
    database: default
    migration_user: migration_production
    aws_region: us-west-2
    ssm:
      admin_password: /database/clickhouse_credentials#password
      migration_password: /database/clickhouse_credentials#password

  # Production us-east-1 (multi-region)
  production_use1:
    host: ghi789xyz.us-east-1.aws.clickhouse.cloud
    database: default
    migration_user: migration_production
    aws_region: us-east-1
    ssm:
      admin_password: /database/clickhouse_credentials#password
      migration_password: /database/clickhouse_credentials#password

For local development, copy .env.local.example to .env.local and set your passwords. For production, credentials come from SSM Parameter Store automatically.

Bootstrap Your Database

Before running migrations, bootstrap the database with proper roles and users:

bash
# Preview what will be created
ch-migrate bootstrap staging --dry-run

# Create database, roles, and migration user
ch-migrate bootstrap staging

Bootstrap is idempotent—running it multiple times is safe and will only create missing objects.
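
Under the hood, bootstrap boils down to idempotent DDL along these lines. This is a sketch only: the exact statements, role names, and grants depend on your config, and the real tool pulls passwords from SSM or .env.local rather than hardcoding them.

sql
-- Illustrative bootstrap DDL; names and grants are examples, not the tool's exact output
CREATE DATABASE IF NOT EXISTS my_analytics;

CREATE ROLE IF NOT EXISTS migration_role;
GRANT CREATE TABLE, CREATE VIEW, DROP TABLE, ALTER, INSERT, SELECT ON my_analytics.* TO migration_role;

CREATE USER IF NOT EXISTS migration_staging IDENTIFIED BY 'example-password';
GRANT migration_role TO migration_staging;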

Writing Migrations

Basic Table Creation

Create a migration:

bash
ch-migrate new dev create_events_table

This generates a timestamped Python file in migrations/versions/. Write your schema in a SQL file for version tracking:

migrations/sql/history/tables/events/001_baseline.sql
sql
-- Baseline events table schema
-- Optimized codecs and indexes for time-series analytics

CREATE TABLE IF NOT EXISTS {db}.events (
  -- Timestamps
  event_time DateTime64(3) CODEC(Delta(8), ZSTD(1)),
  event_date Date DEFAULT toDate(event_time),

  -- Event data
  event_type LowCardinality(String),
  event_data String CODEC(ZSTD(2)),

  -- Dimensions
  user_id String CODEC(ZSTD(1)),
  session_id String CODEC(ZSTD(1)),

  -- Indexes for common queries
  INDEX idx_event_type event_type TYPE set(100) GRANULARITY 2,
  INDEX idx_user_id user_id TYPE bloom_filter(0.001) GRANULARITY 1
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_time)
ORDER BY (event_type, event_date, user_id)
SETTINGS index_granularity = 8192;

Reference it in your migration:

migrations/versions/2026_01_08_1438_baseline_events_table.py
python
"""baseline_events_table

Create baseline events table for analytics.

Revision ID: f3ca35d7eb74
"""
from alembic import op
from clickhouse_alembic import get_db, read_sql

revision = "f3ca35d7eb74"
down_revision = None

def upgrade() -> None:
  db = get_db()
  op.execute(read_sql("history/tables/events/001_baseline.sql", db=db))

def downgrade() -> None:
  db = get_db()
  op.execute(f"DROP TABLE IF EXISTS {db}.events")

The {db} placeholder is replaced with the environment's database name at runtime.
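
After a migration runs, it's worth confirming what ClickHouse actually stored. Against the dev environment above (which uses the default database):

sql
-- Inspect the table as ClickHouse sees it after the migration
SHOW CREATE TABLE default.events;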

Zero-Downtime Schema Changes

ClickHouse doesn't support transactional DDL, and some schema changes, such as rewriting a table's ORDER BY key, can't be expressed as an ALTER at all. Rebuilding a table while data is still being ingested risks losing the rows that arrive mid-change. The solution is a shadow table pattern using ClickHouse's EXCHANGE TABLES feature:

EXCHANGE TABLES - A ClickHouse command that atomically swaps two tables in a single operation. The swap is instant regardless of table size—no data is copied, only metadata pointers change. This enables schema changes without downtime when combined with a shadow table.

Here's the pattern in action:

migrations/versions/2026_01_08_1854_reorder_events_table.py
python
"""reorder_events_table

Change ORDER BY key to optimize user queries.
Uses EXCHANGE TABLES with gap-fill for minimal data loss.

Revision ID: 18b9ba3e02c5
"""
from alembic import op
from sqlalchemy import text
from clickhouse_alembic import get_db, read_sql

revision = "18b9ba3e02c5"
down_revision = "ef2e86ca3404"

def upgrade() -> None:
  db = get_db()

  # Step 0: Clean up leftover from previous rollback (idempotent)
  op.execute(f"DROP TABLE IF EXISTS {db}.events_shadow")

  # Step 1: Create shadow table with new ORDER BY
  op.execute(
      read_sql("history/tables/events_shadow/002_reordered.sql", db=db)
  )

  # Step 2: Capture timestamp before bulk copy
  result = op.get_bind().execute(
      text(f"SELECT toString(max(event_time)) FROM {db}.events")
  )
  t1 = result.scalar()

  # Step 3: Copy all existing data
  op.execute(f"INSERT INTO {db}.events_shadow SELECT * FROM {db}.events")

  # Step 4: Atomic swap
  op.execute(f"EXCHANGE TABLES {db}.events AND {db}.events_shadow")
  # Now: events = new schema, events_shadow = old data

  # Step 5: Backfill gap (events that arrived during copy)
  if t1 is not None:
      op.execute(f"""
          INSERT INTO {db}.events
          SELECT * FROM {db}.events_shadow
          WHERE event_time >= toDateTime64('{t1}', 3)
      """)

  # events_shadow preserved for rollback capability

This pattern:

  1. Creates a shadow table with the new schema
  2. Records the current max timestamp
  3. Bulk copies all data to the shadow table
  4. Atomically swaps the tables (instant, no downtime)
  5. Backfills any rows that arrived during the copy

The gap-fill step uses >= to avoid data loss at the boundary. This may insert some rows twice, but that's safe: if your table uses ReplacingMergeTree (common for event data), duplicates are automatically removed during background merges. For other table engines, ensure your downstream queries handle potential duplicates, or add a deduplication step after the migration.
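
If you'd rather not wait for background merges, or your table engine doesn't deduplicate at all, ClickHouse can force a deduplicating merge. Treat this as an optional cleanup step; it can be expensive on large tables:

sql
-- Force a merge that drops rows identical in every column
-- (substitute your environment's database; shown here for dev)
OPTIMIZE TABLE default.events FINAL DEDUPLICATE;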

Managing Users and Roles

Migrations aren't just for tables. Here's a migration that creates users and roles:

python
"""create_readonly_role

Create a read-only role for reporting users.
"""
import os
from alembic import op
from clickhouse_alembic import get_db

def upgrade() -> None:
  db = get_db()

  # Create read-only role
  op.execute("CREATE ROLE IF NOT EXISTS readonly_role")

  # Grant SELECT on all tables in the database
  op.execute(f"GRANT SELECT ON {db}.* TO readonly_role")

  # Create a reporting user (password from environment)
  password = os.environ.get("CLICKHOUSE_READONLY_PASSWORD", "changeme")
  op.execute(f"""
      CREATE USER IF NOT EXISTS reporting_user
      IDENTIFIED BY '{password}'
      DEFAULT ROLE readonly_role
  """)

  # Grant the role to the user
  op.execute("GRANT readonly_role TO reporting_user")

def downgrade() -> None:
  op.execute("DROP USER IF EXISTS reporting_user")
  op.execute("DROP ROLE IF EXISTS readonly_role")

Running Migrations

bash
# Check current status
ch-migrate status staging

# Apply all pending migrations
ch-migrate up staging

# Apply specific revision
ch-migrate up staging --revision abc123

# Rollback last migration
ch-migrate down staging

# Show migration history
ch-migrate history staging

SQL History Pattern

One feature that's saved us debugging time: object-centric SQL history. Instead of burying SQL in Python files, we store each version of a table/view/dictionary as a separate SQL file:

text
migrations/sql/history/
├── tables/
│   ├── events/
│   │   ├── 001_f3ca35d7eb74_baseline.sql
│   │   ├── 002_ef2e86ca3404_add_session_id.sql
│   │   └── 003_18b9ba3e02c5_reorder.sql
│   └── events_shadow/
│       └── 003_18b9ba3e02c5_reorder.sql
├── views/
│   └── events_by_user/
│       └── 001_abc123.sql
└── roles/
    └── readonly_role/
        └── 001_baseline.sql

This gives you a complete history of every object's schema evolution. When debugging "what changed?", you can diff the SQL files directly.

Local Development with Docker

For local development, run ClickHouse in Docker:

docker-compose.yml
yaml
services:
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"  # HTTP
      - "9000:9000"  # Native
    volumes:
      - clickhouse-data:/var/lib/clickhouse
    healthcheck:
      test: ["CMD", "clickhouse-client", "--query", "SELECT 1"]
      interval: 5s
      timeout: 5s
      retries: 5

volumes:
  clickhouse-data:

bash
# Start ClickHouse
docker compose up -d

# Wait for healthy
docker compose exec clickhouse clickhouse-client --query "SELECT 1"

# Run migrations against local
ch-migrate up dev

Common Gotchas

A few things we learned the hard way:

Bootstrap Is Idempotent—But Be Careful with Users

Running ch-migrate bootstrap multiple times is safe for databases and roles. But if you change a user's password in config, bootstrap will update it. Make sure your application configs are ready for the new credentials before running bootstrap in production.

EXCHANGE TABLES Requires Both Tables to Exist

If your shadow table creation fails mid-migration, EXCHANGE TABLES will error. The migrations clean up shadow tables at the start (DROP TABLE IF EXISTS ... _shadow) to handle this, but if you're debugging a failed migration, check for orphaned shadow tables.
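
A quick way to spot those leftovers, assuming your shadow tables follow the _shadow naming convention used above:

sql
-- Find shadow tables left behind by an interrupted migration
SELECT database, name
FROM system.tables
WHERE endsWith(name, '_shadow');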

SSM Permission Errors

If you see AccessDeniedException when running against staging/production, your AWS credentials need ssm:GetParameter permission for the paths in your config. The error message will tell you which parameter failed—check your IAM policy.

ClickHouse Cloud Connection Timeouts

ClickHouse Cloud services can take a few seconds to wake from idle. If you see connection timeouts on the first command, try again—subsequent commands will be fast. For CI/CD, add a retry or a brief health check before running migrations.

Materialized Views and Schema Changes

If you're changing a table that feeds a materialized view, you may need to drop and recreate the MV. ClickHouse MVs are triggers on INSERT—they don't automatically adapt to schema changes on the source table. Plan your migration order accordingly.
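
The usual order is: drop the materialized view, migrate the source table, then recreate the view against the new schema. Here's a sketch of that sequence (the view definition is illustrative, not taken from the tool):

sql
-- 1. Drop the dependent materialized view
DROP VIEW IF EXISTS default.events_by_user;

-- 2. Apply the source table change (e.g. the shadow-table swap shown earlier)

-- 3. Recreate the view against the new schema
--    Note: without POPULATE, the view only sees rows inserted after this point
CREATE MATERIALIZED VIEW default.events_by_user
ENGINE = SummingMergeTree
ORDER BY user_id
AS
SELECT user_id, count() AS events
FROM default.events
GROUP BY user_id;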

Get Started

The tool is available on GitHub:

bash
# Install
uv tool install git+https://github.com/DRYCodeWorks/clickhouse-migrate.git

# Or pin to a version
uv tool install git+https://github.com/DRYCodeWorks/clickhouse-migrate.git@v0.1.0

The README has full documentation, including:

  • Complete CLI reference
  • Dictionary creation with auto-grants
  • CI/CD integration patterns
  • Troubleshooting common issues

If you're running ClickHouse in production and managing schemas manually, give clickhouse-migrate a try. It's the tool we wished existed when we started, so we built it.

Have questions or feature requests? Open an issue on GitHub. We're actively developing the tool and welcome contributions.