Migrating 1.3 Billion IoT Rows to ClickHouse Cloud

2025-12-08 - 25 min read
Daniel Young
Founder, DRYCodeWorks

Dashboard timeouts during winter storms. 45-second queries when road crews need real-time data. We moved 1.3 billion IoT rows from SQL Server to ClickHouse Cloud—and cut query times from seconds to milliseconds while saving $2,200/month. Here's exactly how.

Picture thousands of road weather sensors transmitting temperature, humidity, and surface conditions every minute—24/7, 365 days a year. Now imagine querying three years of that data for pattern analysis while your dashboard times out during a winter storm, precisely when road crews need it most. That was our client's reality with SQL Server.

Fast forward to today: We're running 1.3+ billion transmission rows and nearly 4 billion forecast rows in ClickHouse Cloud. Dashboard queries that once took 45 seconds now return in under 2 seconds. Time-series queries for a single device? Sub-20 milliseconds. Bulk data exports? Under 500 milliseconds—at that point, network transfer is the bottleneck, not the database. And we're doing all this for $2,200 less per month than the old SQL Server setup.

These aren't benchmark numbers from a lab environment. This is production data from a live road weather information system (RWIS) serving municipal road crews and departments of transportation across North America. In this post, I'll walk through the architecture decisions, migration process, performance optimizations, and hard-learned lessons from moving billions of IoT rows to ClickHouse Cloud.

When SQL Server Becomes the Bottleneck

Our client operates a comprehensive road weather monitoring network. Think of it as a distributed sensor mesh that helps road maintenance crews make data-driven decisions about salting, plowing, and route prioritization during winter weather events.

The data characteristics are what you'd expect from high-frequency IoT telemetry:

  • Volume: 1.3+ billion transmission rows, growing by ~1.7 million daily
  • Schema: DeviceID, CaptureDateTimeUTC, SurfaceTemp, AirTemp, DewPoint, Humidity, HeaterTemp, AmbientLight
  • Additional data: Tens of millions of camera images with computer vision classification scores
  • Query patterns: Time-range aggregations for dashboards, sensor-by-sensor historical analysis, anomaly detection

The device fleet includes:

  • ~2,800 road weather sensors measuring surface/air temperature and humidity
  • ~500 cameras with computer vision for road condition assessment
  • ~140 snow depth sensors
  • ~2,400 unique devices actively transmitting around the clock

The SQL Server environment was struggling under the load—but the problems went deeper than just volume:

  • 6 TB database storing years of historical sensor data
  • $4,000/month total RDS spend (compute + storage + automated snapshots)
  • RDS SQL Server r6i.xlarge instance maxed out on aggregation queries
  • Columnstore indexes helped, but not enough for three-year historical queries
  • Dashboard timeouts during winter storm events—exactly when road crews needed real-time visibility

When we arrived, the schema had a fundamental problem: the time-series tables used composite primary keys that didn't match the actual query patterns. The transmissions table's primary key started with a unique reading ID, which SQL Server makes the clustered index by default. That meant data was physically laid out on disk in the order readings arrived, not by device or time range.

Think of it like a library where books are shelved by acquisition date instead of by subject or author. When a dashboard asks "show me the last 7 days for device 1682," SQL Server had to scatter-read across the entire table, hunting for that device's readings interleaved with millions of rows from other devices. The query optimizer couldn't help—the physical layout was working against every query.
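
To make the mismatch concrete, here's a rough approximation of that layout in T-SQL. This is illustrative, not the client's actual DDL:

sql
-- Approximation of the original layout (illustrative only).
-- The surrogate key is the clustered index, so rows land on disk in arrival order.
CREATE TABLE dbo.Transmissions (
  ID BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
  DeviceID INT NOT NULL,
  CaptureDateTimeUTC DATETIME2 NOT NULL,
  SurfaceTemp DECIMAL(6,2) NULL
  -- ...remaining sensor columns
);
-- A layout that matched the query pattern would cluster on (DeviceID, CaptureDateTimeUTC),
-- which is exactly what the ClickHouse ORDER BY key does later in this article.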

The breaking point came as the forecast data grew. Weather forecasts for each sensor location, updated hourly, had accumulated tens of millions of rows. SQL Server's Indexed Views offered pre-computation, but created 1-to-1 data replication—a storage explosion we couldn't sustain. We needed a database designed for this workload from the ground up.

Why ClickHouse for IoT Telemetry

When evaluating time-series databases, we considered TimescaleDB, InfluxDB, and ClickHouse. Each has strengths, but our decision criteria narrowed the field quickly:

  • Query performance: Sub-second aggregations over billions of rows
  • Operational complexity: Managed service preferred over self-hosting
  • Cost efficiency: Better price/performance than RDS SQL Server
  • SQL compatibility: Team already knows SQL, minimize learning curve
  • Ecosystem maturity: Production-ready tooling and documentation
  • Developer experience: Quality of console, observability tooling, and debugging workflows

ClickHouse Cloud won on all counts—and the developer experience deserves special mention. The console is genuinely well-designed: an intelligent query editor with autocomplete, a visual analyzer showing exactly which parts are slow and why, real-time metrics dashboards, and documentation that doesn't assume you're already an expert. When you're debugging a slow query at 2am, tooling quality matters. ClickHouse Cloud feels like a product built by engineers who use it themselves.

Here's why it was the right fit for our IoT telemetry use case:

Columnar storage is perfect for aggregating temperature readings across time. Traditional row-based databases store each record together: DeviceID, timestamp, temperature, humidity, all in one place. Columnar databases flip this—all temperatures stored contiguously, then all timestamps, then all humidity values. When your query is "average surface temperature for the last 30 days," ClickHouse reads just the temperature and timestamp columns, skipping humidity, device IDs, and everything else entirely. It's like having a filing cabinet where one drawer contains only temperatures from every sensor ever—you pull that drawer and you're done.

The architecture is fundamentally different from traditional databases. ClickHouse Cloud stores data in object storage (S3) rather than local disks, separating compute from storage. This means you're not paying for idle disk capacity, and compute can scale independently. Instead of B-tree indexes that maintain sorted pointers to every row, ClickHouse uses sparse primary indexes—storing one index entry per block of rows (typically 8,192). For our 1.3 billion transmission rows, that's ~160,000 index entries instead of 1.3 billion. The tradeoff: in theory, point lookups are slower than in PostgreSQL, while analytical scans over millions of rows are dramatically faster. In practice, point lookups (common for things like image metadata in our use case) came back in under 10ms when we supplied the full ORDER BY key, since the query narrows to a single granule. That comfortably outperformed SQL Server.

Compression is exceptional. Instead of storing "28.3, 28.4, 28.5, 28.6" for slowly-changing temperatures, Delta encoding stores "28.3, +0.1, +0.1, +0.1"—a massive space savings. Weather forecast data with slowly-changing values compresses dramatically. Even sensor readings with more environmental noise compress well. Total storage for 5.3+ billion rows: approximately 19 GB. (Full compression breakdown in the performance section below.)

Native time-series functions. Functions like toStartOfMonth(), toDate(), and time bucketing are first-class citizens. ASOF JOINs—crucial for correlating sensor readings with forecasts at the closest preceding time—are built into the query language.

ClickHouse Cloud's managed offering eliminates operational overhead. The SharedMergeTree engine handles replication and distributed storage automatically. No manual sharding, no replica management, no 3am pages about disk space.

Before diving deeper, here are some ClickHouse-specific terms you'll encounter throughout this article:

MergeTree - ClickHouse's primary table engine family, optimized for high-volume inserts and analytical queries. Think of it like a filing system where new documents go into an inbox, and a background worker periodically sorts and consolidates them into organized folders.

ReplacingMergeTree - A MergeTree variant that deduplicates rows with the same primary key, keeping only the latest version. Think of a voicemail system that keeps only your most recent message from each caller—duplicates pile up temporarily, but during periodic cleanup, older messages from the same person get purged. Essential for IoT data where sensors occasionally retransmit readings.

Materialized View - A view that physically stores its results and updates automatically when source data changes. Think of it as a live scoreboard that updates itself whenever the game state changes—you don't recalculate from play-by-play logs every time someone asks the score. Unlike SQL Server indexed views, ClickHouse MVs can transform data and write to separate target tables.

ASOF JOIN - A time-series-aware join that matches each row to the closest preceding row in another table. Like checking your bank balance—you get the most recent statement before today's date, not the one from next week. Perfect for correlating sensor readings with the most recent forecast.

Codec - A compression algorithm applied at the column level. ClickHouse supports Delta (for sequential values), ZSTD, LZ4, and others—often stacked for maximum compression.

Architecture Overview

Our data pipeline looks like this:

[Pipeline diagram: ~2,800 RWIS sensors and ~500 cameras publish to Particle Cloud; webhook events trigger AWS Lambda, which batch-inserts 1,000 rows at a time into ClickHouse Cloud over HTTPS. ~1.74M rows/day, 5.3B+ rows total.]
  1. Sensor → Particle Cloud: Devices publish data via Particle's cellular network
  2. Particle → Lambda: Webhook events trigger a Lambda function for each message batch
  3. Lambda → ClickHouse Cloud: Batched inserts via HTTPS (us-east-2 region)

The ClickHouse architecture includes:

  • SharedReplacingMergeTree tables for handling duplicate/late-arriving sensor data
  • Materialized Views for real-time pre-computation: _mv_forecasts_deduplicator (deduplicates forecast data), _mv_transmissions_forecast_joiner (ASOF JOIN of measured + predicted values), get_latest_transmissions (current reading per device), and pair_transmissions_to_latest_images (aligns camera images with sensor readings)
  • Dictionary tables for device metadata (HTTP source, auto-refreshed every 10 minutes)

Why SharedReplacingMergeTree? ClickHouse Cloud uses the Shared engine family, which separates compute from storage and handles replication automatically. For IoT data where sensors occasionally retransmit the same reading, ReplacingMergeTree deduplicates based on a version column (our case: CaptureDateTimeUTC).

Migrating 1.3 Billion Rows Without Downtime

Database migrations at this scale require careful planning. A naive "stop writes, export, import, resume" approach would have meant hours of downtime—unacceptable for a real-time monitoring system. Here's how we did it.

Phase 1: Schema Design

The foundation of good performance in ClickHouse is schema design. We optimized for our query patterns: time-range filtering and aggregation by device.

Table structure decisions:

  • Engine: SharedReplacingMergeTree for ClickHouse Cloud (handles replication automatically)
  • Partitioning: Monthly partitions using PARTITION BY toStartOfMonth(CaptureDateTimeUTC). Think of it like organizing a warehouse by month—when you need November data, you go straight to the November section without walking past January through October. This allows efficient pruning for time-range queries.
  • Order key: ORDER BY (DeviceID, CaptureDateTimeUTC)—within each monthly partition, data is sorted like a phone book: first by device, then by time. This makes "show me data for device X over this time range" extremely fast.
  • Codecs: Delta encoding + ZSTD for timestamps, T64 for IDs, LowCardinality for strings

Here's the actual production DDL for the transmissions table:

transmissions_table.sql
CREATE TABLE transmissions (
  ID UInt64 CODEC(Delta(8), ZSTD(1)),
  DeviceID UInt32 CODEC(T64),
  VendorDeviceID LowCardinality(String),
  CaptureTimestampUTC DateTime CODEC(Delta(4), ZSTD(1)),
  CaptureDateTimeUTC DateTime64(3) CODEC(Delta(8), ZSTD(1)),
  SurfaceTemp Decimal(6, 2),
  AirTemp Decimal(5, 2),
  DewPoint Decimal(5, 2),
  Humidity Decimal(4, 2),
  CreatedDateTimeUTC DateTime64(3) CODEC(Delta(8), ZSTD(1)),
  HeaterTemp Decimal(6, 2),
  AmbientLight Int32
) ENGINE = SharedReplacingMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
PARTITION BY toStartOfMonth(CaptureDateTimeUTC)
ORDER BY (DeviceID, CaptureDateTimeUTC)

Why these codec choices?

  • Timestamps: CODEC(Delta(8), ZSTD(1)) - Sequential timestamps compress extremely well with delta encoding
  • Device IDs: CODEC(T64) - Efficient for sorted unsigned integers
  • Strings: LowCardinality(String) - Dictionary encoding for columns with few unique values (like device serial numbers)
  • Temperature decimals: No special codec—ZSTD default handles these efficiently

The forecasts table follows a similar pattern but includes additional weather fields:

forecasts_table.sql
CREATE TABLE forecasts (
  DeviceID UInt32,
  CaptureDateTimeUTC DateTime64(3) CODEC(Delta(8), ZSTD(1)),
  ForecastDateTimeUTC DateTime64(3) CODEC(Delta(8), ZSTD(1)),
  SurfaceTemp Decimal(5, 2) CODEC(ZSTD(3)),
  AirTemp Decimal(5, 2) CODEC(ZSTD(3)),
  DewPoint Decimal(4, 2) CODEC(ZSTD(3)),
  Humidity Decimal(4, 2) CODEC(ZSTD(3)),
  SurfaceGrip Decimal(2, 2) CODEC(ZSTD(3)),
  RoadCondition Int8,
  PrecipType Int8,
  PrecipRate Decimal(5, 2) CODEC(ZSTD(3)),
  WindDirection Int16 CODEC(Delta(2), ZSTD(1)),
  WindSpeed Decimal(5, 2) CODEC(ZSTD(3))
) ENGINE = SharedMergeTree
PARTITION BY toStartOfMonth(CaptureDateTimeUTC)
ORDER BY (DeviceID, CaptureDateTimeUTC, ForecastDateTimeUTC)

Notice the order key includes ForecastDateTimeUTC because queries often need "forecasts made at this time for this future time."

Phase 2: Dual-Write and Feature-Flagged Reads

Before touching historical data, we set up dual writes. New sensor readings went to both SQL Server and ClickHouse simultaneously—this let us validate ClickHouse on live production data before committing to the full migration.

The clever part was the read path. We implemented a feature flag that routed queries based on timestamp:

  • Data after the cutover date → read from ClickHouse
  • Data before the cutover date → fall back to SQL Server

This gave us a live A/B test. We could verify ClickHouse was returning correct results for recent data while SQL Server continued serving historical queries. No big-bang cutover risk.
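
Here's a minimal sketch of that router, in the same Python style as our Lambda code. The cutover constant and the two query helpers (query_clickhouse, query_sql_server) are illustrative names, not our production code:

python
from datetime import datetime, timezone

# Illustrative cutover date; in production this moved backward as history was backfilled.
CUTOVER_DATE = datetime(2025, 6, 1, tzinfo=timezone.utc)

def query_transmissions(device_id, start, end):
    """Route reads by timestamp: ClickHouse after the cutover, SQL Server before it."""
    if start >= CUTOVER_DATE:
        return query_clickhouse(device_id, start, end)    # hypothetical helper
    if end < CUTOVER_DATE:
        return query_sql_server(device_id, start, end)    # hypothetical helper
    # Range straddles the cutover: query both stores and stitch the results.
    older = query_sql_server(device_id, start, CUTOVER_DATE)
    newer = query_clickhouse(device_id, CUTOVER_DATE, end)
    return older + newer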

Phase 3: Historical Data Migration

With dual writes proven stable, we backfilled historical data into ClickHouse. As each month's data landed, the feature flag's cutover date automatically expanded backward—gradually shifting more query traffic to ClickHouse without any manual intervention.

Export strategy:

  • Custom export jobs that partitioned data by month for parallel processing
  • Exported to Parquet format—columnar on both ends meant no schema translation headaches
  • Parquet's built-in compression saved bandwidth during transfer

Load strategy:

  • ClickHouse Cloud's native Parquet import handled the heavy lifting
  • Parallel loads by partition kept throughput high (see the load statement below)
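
Each month's load reduces to a single statement against the s3 table function; the bucket path and credentials below are placeholders:

sql
-- Load one exported month of Parquet straight into the transmissions table.
INSERT INTO transmissions
SELECT *
FROM s3(
  'https://migration-bucket.s3.us-east-2.amazonaws.com/export/transmissions/2024-01/*.parquet',
  'AWS_ACCESS_KEY', 'AWS_SECRET_KEY',
  'Parquet'
)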

Validation:

  • Row count comparisons between source and destination (query sketch below)
  • Spot-check queries on known aggregations to verify data integrity
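
Here's a sketch of the ClickHouse side of that check; the matching GROUP BY runs against SQL Server and the results get diffed:

sql
-- Per-month row counts plus a spot-check aggregate on the ClickHouse side.
SELECT
  toStartOfMonth(CaptureDateTimeUTC) AS month,
  count() AS row_count,
  round(avg(SurfaceTemp), 2) AS avg_surface_temp
FROM transmissions
GROUP BY month
ORDER BY month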

Phase 4: Cutover

After validating data consistency, we flipped the switch:

  1. Dashboard migration: Updated API endpoints to query ClickHouse instead of SQL Server
  2. Canary deployment: 10% of traffic to ClickHouse, monitored for 48 hours
  3. Full cutover: 100% of traffic to ClickHouse
  4. SQL Server retention: Kept SQL Server read-only for 60 days as a safety net

Rollback plan: If ClickHouse had failed, we could revert the API endpoints in under 5 minutes. We never needed it.

Final validation:

  • Spot-checked aggregation queries for numerical accuracy (match SQL Server results within rounding tolerance)
  • Tested dashboard load times under concurrent user load
  • Verified materialized views were populating correctly from live data

Enabling Features That Were Impossible in SQL Server

In SQL Server, our only option for pre-computed views was Indexed Views, which created 1-to-1 data replication—a storage explosion waiting to happen. ASOF JOINs to align different time series? Impossible. Aligning a photo with a transmission at read time? Unrealistic at scale.

ClickHouse Materialized Views changed everything. We now have 8 MVs running in production that enable features we couldn't build before:

[Diagram: forecasts (3.98B rows) feed _mv_forecasts_deduplicator, which writes best_known_forecasts (44.7M rows, deduped); transmissions (1.32B rows) and best_known_forecasts feed _mv_transmissions_forecast_joiner (ASOF JOIN), which writes transmissions_with_forecasts.]

Pattern 1: Time-Series Fusion (Measured + Predicted)

The _mv_transmissions_forecast_joiner creates a "fusion" table combining actual sensor readings with forecasted values:

fusion_materialized_view.sql
-- Simplified from production
CREATE MATERIALIZED VIEW _mv_transmissions_forecast_joiner
TO transmissions_with_forecasts AS
SELECT
  t.DeviceID,
  t.CaptureDateTimeUTC,
  t.SurfaceTemp AS SurfaceTemp_F,        -- Measured
  t.AirTemp AS AirTemp_F,                 -- Measured
  f.ForecastSurfaceTemp_F,                -- Predicted
  f.ForecastAirTemp_F,                    -- Predicted
  f.PrecipType,
  f.RoadCondition,
  -- How old was the forecast when we measured?
  toUInt32(abs(t.CaptureDateTimeUTC - f.ForecastDateTimeUTC)) AS ForecastAge_Seconds
FROM transmissions AS t
ASOF LEFT JOIN best_known_forecasts AS f
  ON t.DeviceID = f.DeviceID
  AND t.CaptureDateTimeUTC >= f.ForecastDateTimeUTC

Why this matters: Customers can now see "What did we predict vs. what actually happened?" for every sensor reading. This powers forecast accuracy dashboards and model improvement workflows. In SQL Server, this query would time out on 1.3 billion rows—and Indexed Views don't support ASOF JOINs or complex subqueries.

The ASOF JOIN is doing the heavy lifting here. It finds the most recent forecast before each transmission timestamp. If a sensor reading came in at 14:37, and we had forecasts at 14:00 and 15:00, the ASOF JOIN picks the 14:00 forecast. This is the correct semantics for "what were we predicting when this measurement was taken?"

[Diagram: ASOF JOIN matches each transmission to the closest preceding forecast. Forecasts F1-F4 arrive at 14:00, 15:00, 16:00, and 17:00; transmissions at 14:37, 15:22, and 16:45 pair with F1 (37 min old), F2 (22 min old), and F3 (45 min old).]

Pattern 2: Photo + Sensor Pairing

The pair_transmissions_to_latest_images MV aligns camera images with their corresponding transmission:

image_pairing_mv.sql
-- Pairs each camera image with sensor readings at that moment
CREATE MATERIALIZED VIEW pair_transmissions_to_latest_images
TO transmissions_with_images AS
SELECT
  i.DeviceID,
  i.ImageUrl,
  i.CaptureDateTimeUTC AS ImageCaptureDateTimeUTC,
  t.SurfaceTemp, t.AirTemp, t.Humidity,
  i.SnowOnRoad, i.WetPavement, i.ClearPavement  -- CV scores
FROM latest_images AS i
INNER JOIN transmissions AS t
  ON t.DeviceID = i.DeviceID
  AND t.CaptureDateTimeUTC = i.TransmissionCaptureDateTimeUTC

Why this matters: Road crews see a photo with "Surface: 28°F, Snow probability: 87%" overlaid directly in the dashboard. Previously, aligning images with readings at read time was too slow—we'd have to query tens of millions of image records and over a billion transmission records, then join in application code. Now it's pre-computed.

Pattern 3: Latest Value Caching

Multiple get_latest_* MVs maintain current state per device:

  • get_latest_transmissions: Current readings for all ~2,400 devices
  • get_latest_images: Most recent photo per camera
  • get_latest_cv_images: Latest CV-analyzed image (Version = 3 only)
  • get_latest_sds_readings: Current snow depth (Error = 0 only for quality filtering)

These feed the real-time dashboard—200ms to show all ~2,400 devices, including their current temperature, humidity, last photo, and road condition.

latest_transmissions_mv.sql
CREATE MATERIALIZED VIEW get_latest_transmissions
ENGINE = ReplacingMergeTree(CaptureDateTimeUTC)
ORDER BY DeviceID AS
SELECT
  DeviceID,
  CaptureDateTimeUTC,
  SurfaceTemp,
  AirTemp,
  DewPoint,
  Humidity,
  HeaterTemp,
  AmbientLight
FROM transmissions

The ReplacingMergeTree engine automatically keeps only the row with the maximum CaptureDateTimeUTC for each DeviceID. During background merges, older rows are discarded. For dashboard queries, we use FINAL to force immediate deduplication:

sql
SELECT * FROM get_latest_transmissions FINAL
WHERE DeviceID IN (1234, 5678, 9012)

Pattern 4: Smart Filtering

_mv_last_snow_forecasted captures "when did we last forecast snow?"—filtering for PrecipType = 3 and PrecipRate > 0. This enables alerting workflows ("notify me when snow is predicted in the next 6 hours") without scanning billions of forecast rows.

snow_forecast_mv.sql
CREATE MATERIALIZED VIEW _mv_last_snow_forecasted
ENGINE = ReplacingMergeTree(ForecastDateTimeUTC)
ORDER BY DeviceID AS
SELECT
  DeviceID,
  ForecastDateTimeUTC,
  CaptureDateTimeUTC,
  PrecipType,
  PrecipRate
FROM forecasts
WHERE PrecipType = 3 AND PrecipRate > 0

The SQL Server Alternative (and why it failed)

| Feature | SQL Server Indexed Views | ClickHouse MVs |
|---|---|---|
| Storage overhead | 1:1 replication (doubles storage) | Incremental only |
| ASOF JOIN support | ❌ Not supported | ✅ Native |
| Complex WHERE clauses | ❌ Very limited | ✅ Full SQL |
| Update on INSERT | ⚠️ Performance hit on writes | ✅ Designed for high-throughput writes |
| Aggregates in view | ❌ Limited (no GROUP BY with joins) | ✅ Full support |

SQL Server's Indexed Views did double our storage footprint—that's how we got to 6 TB. They were our stopgap for the inefficient clustered indexes: we couldn't fix the underlying index structure without downtime on our SQL Server license tier, so we threw indexed views at the problem. It helped query performance but exploded storage. ClickHouse MVs are fundamentally different—they only store the result of the transformation, not a duplicate of the source data.

Pattern 5: Efficient Bulk Exports to S3

One pattern that's worked exceptionally well: daily exports directly to our S3 bucket using ClickHouse's S3 table engine. No intermediate steps, no export jobs to maintain—just INSERT INTO a table that happens to be an S3 path.

daily_export.sql
-- Export yesterday's transmissions to S3 as Parquet
INSERT INTO FUNCTION s3(
  'https://our-bucket.s3.us-east-2.amazonaws.com/exports/transmissions/{_partition_id}.parquet',
  'AWS_ACCESS_KEY', 'AWS_SECRET_KEY',
  'Parquet'
)
-- PARTITION BY fills the {_partition_id} placeholder in the path (one file per day here)
PARTITION BY toYYYYMMDD(CaptureDateTimeUTC)
SELECT *
FROM transmissions
WHERE CaptureDateTimeUTC >= today() - 1
AND CaptureDateTimeUTC < today()

This runs as a scheduled query and completes in seconds—exporting millions of rows directly to compressed Parquet files. The same data export from SQL Server required a multi-step ETL pipeline with staging tables. ClickHouse treats S3 as just another storage destination.

We use this for:

  • Daily backups to a separate AWS account for disaster recovery
  • Data lake integration for downstream analytics in Spark/Athena
  • Customer data exports when clients need raw data extracts

Performance Optimization Deep-Dive

Good schema design got us 80% of the way there. The remaining 20% came from codec strategy and understanding how ClickHouse executes queries.

Codec Strategy and Compression

ClickHouse's compression results on our production data:

| Table | Compressed | Uncompressed | Ratio |
|---|---|---|---|
| transmissions | 11.09 GB | 68.73 GB | 6.2:1 |
| forecasts | 5.17 GB | 266.6 GB | 51.5:1 |
| images | 2.35 GB | 9.8 GB | 4.2:1 |
| best_known_forecasts | 414 MB | 2.78 GB | 6.7:1 |
| Total | ~19 GB | ~348 GB | ~18.3:1 average |

Why 315x better than SQL Server? The 6 TB SQL Server database included indexes, transaction logs, automated snapshots, and row-based storage overhead. ClickHouse's columnar storage eliminates row headers, and codecs like Delta+ZSTD achieve 50:1 on repetitive forecast data. Even the "worst" compression (6.2:1 on transmissions) beats SQL Server's typical 3:1 rowstore compression.
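
If you want to pull numbers like the table above for your own tables, system.parts has what you need:

sql
-- Per-table compressed vs. uncompressed bytes for active parts in the current database.
SELECT
  table,
  formatReadableSize(sum(data_compressed_bytes)) AS compressed,
  formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
  round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 1) AS ratio
FROM system.parts
WHERE active AND database = currentDatabase()
GROUP BY table
ORDER BY sum(data_compressed_bytes) DESC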

Codec deep-dive:

The forecasts table achieves 51.5:1 compression because weather data is highly repetitive:

  • Surface temperatures change slowly (28.3°F, 28.4°F, 28.5°F, ...) → Delta encoding is extremely efficient
  • Many devices share similar forecasts (cold front affects 500 devices simultaneously)
  • Wind direction values cluster (NW, NW, NNW, NW) → ZSTD dictionary compression excels

Transmissions compress "only" 6.2:1 because sensor readings are more varied and include noise from environmental factors.

Query Performance Patterns

ClickHouse excels at certain query patterns. Understanding them is key to fast queries.

Pattern 1: Time-range filtering with monthly partitions

sql
SELECT DeviceID, avg(SurfaceTemp), count()
FROM transmissions
WHERE CaptureDateTimeUTC >= '2025-11-01' AND CaptureDateTimeUTC < '2025-12-01'
GROUP BY DeviceID

ClickHouse only reads the November 2025 partition—data pruning at the partition level. Query time: ~50ms for 52 million rows (one month of data).

Pattern 2: Per-device time-series queries

sql
SELECT CaptureDateTimeUTC, SurfaceTemp, AirTemp
FROM transmissions
WHERE DeviceID = 1682
AND CaptureDateTimeUTC >= now() - INTERVAL 7 DAY
ORDER BY CaptureDateTimeUTC

Because the ORDER BY key is (DeviceID, CaptureDateTimeUTC), this query reads a contiguous range of data on disk. Query time: <20ms for ~10,000 rows.

Pattern 3: Aggregation with materialized views

sql
SELECT DeviceID, SurfaceTemp_F, ForecastSurfaceTemp_F
FROM transmissions_with_forecasts
WHERE CaptureDateTimeUTC >= now() - INTERVAL 1 DAY
AND DeviceID IN (SELECT ID FROM devices WHERE GroupID = 42)

This query uses the pre-computed transmissions_with_forecasts table (populated by _mv_transmissions_forecast_joiner). The ASOF JOIN has already been executed incrementally as data arrived. Query time: <1 second for ~1.7 million rows (one day × ~2,400 devices).

Connection Management from Lambda

Lambda functions are ephemeral, but ClickHouse connections have overhead. Our optimization:

lambda_clickhouse.py
import os

import clickhouse_connect

# Global variable for connection reuse in warm containers
client = None

def lambda_handler(event, context):
  global client

  if client is None:
      client = clickhouse_connect.get_client(
          host='abc123.us-east-2.aws.clickhouse.cloud',
          port=8443,
          username='default',
          password=os.environ['CLICKHOUSE_PASSWORD'],
          secure=True
      )

  # Synchronous insert - no batching on Lambda side
  rows = parse_iot_messages(event)
  client.insert('transmissions', rows)

  return {'statusCode': 200}

Key points:

  • Connection reuse in warm containers (global variable persists across invocations)
  • Synchronous writes, not async. Each Lambda invocation creates a new ClickHouse session. Async inserts create a buffer per session—at our scale, that's memory overhead on ClickHouse we don't need.
  • ClickHouse Cloud handles HTTPS connections well—no need for connection pooling
  • If we need client-side batching in the future, we'd use Kinesis with a ClickPipe to ingest—keeping the batching logic on our infrastructure, not ClickHouse's.

The Numbers: Before and After

[Diagram: SQL Server at 6 TB including indexes and logs, $4,000/month; after migration, ClickHouse Cloud at ~19 GB with Delta + ZSTD columnar storage, $1,500/month. 315x smaller, $2,200/month saved.]

Here's the side-by-side comparison:

| Metric | SQL Server | ClickHouse Cloud | Improvement |
|---|---|---|---|
| Time-series query (single device, 7 days) | Several seconds | <20ms | 100x+ |
| Bulk data export (100s of MBs) | Timeout or very slow | <500ms (network-bound) | Network is the bottleneck, not the DB |
| "Latest reading" for all devices (2,391) | 12s | 200ms | 60x |
| Storage size | 6 TB | ~19 GB | 315x smaller |
| Monthly cost | $4,000 (RDS compute + storage + snapshots) | $1,800 ($1,500 ClickHouse Cloud + $300 remaining SQL Server for OLTP) | $2,200/month savings |

Query performance numbers are from production. The most common query pattern—sampling time-series data for a device or device group over a date range—returns in milliseconds, not seconds. Bulk exports returning hundreds of megabytes complete in under 500ms, where network transfer is the bottleneck, not ClickHouse.

The Real Win: Cost Trajectory

The $2,200/month savings tells only part of the story. The bigger win is what happens as data grows.

With SQL Server, costs scaled linearly with data volume. Every new sensor, every additional year of history, every new data type meant more storage, more compute, more money. The 6 TB database was heading toward 10 TB within a year, and RDS pricing would have followed.

ClickHouse's architecture changes this equation entirely. Under the hood, ClickHouse uses LSM Trees (Log-Structured Merge Trees)—a data structure designed for write-heavy workloads. Instead of updating data in place like traditional B-trees, LSM Trees batch writes into sorted runs and merge them in the background. This is why ClickHouse handles 35 million daily inserts without breaking a sweat, and why compression ratios stay excellent as data grows.

The practical result: our costs stay flat as data scales. We're ingesting 35 million rows daily into a system that has massive headroom at its current tier. The forecast data that was exploding SQL Server storage? It compresses 51:1 in ClickHouse. Adding another year of sensor history doesn't meaningfully change our bill.

For a system that's been running for years and will run for years more, this trajectory change matters far more than the monthly savings snapshot. We're not just paying less—we've stepped off the linear cost curve entirely.

Sample Dashboard Query (Real Production)

dashboard_query.sql
SELECT
  DeviceID,
  toDate(CaptureDateTimeUTC) as date,
  avg(SurfaceTemp) as avg_surface_temp,
  avg(AirTemp) as avg_air_temp,
  min(SurfaceTemp) as min_temp,
  max(SurfaceTemp) as max_temp,
  count() as readings
FROM transmissions
WHERE CaptureDateTimeUTC >= now() - INTERVAL 30 DAY
GROUP BY DeviceID, date
ORDER BY DeviceID, date
-- Returns ~72,000 rows (~2,400 devices × 30 days) in ~1.5s

Real Daily Sensor Data Example

Here's actual data from the system (sanitized):

| DeviceID | Date | Avg Surface (°F) | Avg Air (°F) | Readings |
|---|---|---|---|---|
| 1682 | 2025-11-09 | 41.4 | 38.3 | 1,433 |
| 1682 | 2025-11-10 | 46.0 | 45.8 | 1,441 |
| 4186 | 2025-12-07 | 28.7 | 25.3 | 3,156 |

Notice the reading counts vary (1,433 vs 3,156 per day)—this reflects different transmission intervals for different device types.

Query Concurrency

ClickHouse Cloud handles multiple concurrent dashboard users without degradation. We ran the migration in summer, so we simulated peak winter load—50 concurrent users hammering dashboard queries. P95 latency stayed under 2 seconds. SQL Server would have ground to a halt under that load.

ClickHouse Gotchas to Watch For

ClickHouse is powerful, but it has quirks that can trip you up. Here's what to watch for:

ReplacingMergeTree Quirks

Data isn't deduplicated until background merges run. If you insert a duplicate row and immediately query, you might see both. For "latest value" queries, you need the FINAL keyword or explicit deduplication logic.

sql
-- Without FINAL: may return multiple rows per DeviceID
SELECT * FROM get_latest_transmissions WHERE DeviceID = 1234

-- With FINAL: guarantees one row per DeviceID
SELECT * FROM get_latest_transmissions FINAL WHERE DeviceID = 1234

FINAL forces deduplication at query time. For tables with millions of rows, this can be slow. We mitigated by keeping the "latest values" tables small (one row per device).

ClickHouse Isn't PostgreSQL

ClickHouse is OLAP, not OLTP. Adjustments we had to make:

  • No UPDATE or DELETE in the traditional sense. Use ALTER TABLE ... DELETE (async mutation) or insert new rows with higher version numbers for ReplacingMergeTree. (See the sketch after this list.)
  • NULL handling is explicit. Columns are NOT NULL by default. Use Nullable(Type) explicitly if you need nulls.
  • Case-sensitive identifiers. DeviceID is different from deviceid. Stick to a convention.
  • DateTime64(3) for millisecond precision, not TIMESTAMP. We initially used DateTime and lost sub-second precision.
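
A couple of these in practice (BatteryVoltage is a made-up column for illustration, not part of our schema):

sql
-- Deletion is an asynchronous mutation, not a transactional DELETE.
ALTER TABLE transmissions DELETE WHERE DeviceID = 9999 AND CaptureDateTimeUTC < '2023-01-01'

-- Columns are NOT NULL by default; nullable columns must be declared explicitly.
ALTER TABLE transmissions ADD COLUMN IF NOT EXISTS BatteryVoltage Nullable(Decimal(4, 2))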

If you're coming from PostgreSQL or SQL Server, the ClickHouse learning modules are well worth your time. They cover these differences systematically and saved us from several early mistakes.

Lambda Connection Management

Each Lambda cold start opens a new HTTPS connection. ClickHouse Cloud handles this well, but we saw occasional spikes in connection establishment latency during traffic bursts. Solution: reuse connections in warm containers (global variable in Python) and implement exponential backoff on connection failures.
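
The backoff itself is nothing exotic: a hypothetical wrapper along these lines around the insert call from the handler above.

python
import time

def insert_with_backoff(client, table, rows, max_attempts=4):
    """Retry an insert with exponential backoff: 1s, 2s, 4s between attempts."""
    for attempt in range(max_attempts):
        try:
            client.insert(table, rows)
            return
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)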

Materialized View Gotchas

Materialized Views only process new inserts, not historical data. When we created _mv_forecasts_deduplicator, the best_known_forecasts table was empty. We had to backfill with:

sql
INSERT INTO best_known_forecasts
SELECT DeviceID, ForecastDateTimeUTC, CaptureDateTimeUTC, ...
FROM forecasts

MV errors are silent by default. If the MV's SELECT query fails (e.g., type mismatch), the insert to the source table succeeds but the MV doesn't populate. We added monitoring on system.query_log to catch MV failures.
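
The monitoring boils down to a scheduled query against system.query_log, something along these lines (adapt the table filter to your schema):

sql
-- Recent insert-time exceptions touching the forecast tables.
SELECT event_time, query_kind, exception, substring(query, 1, 200) AS query_head
FROM system.query_log
WHERE type IN ('ExceptionBeforeStart', 'ExceptionWhileProcessing')
  AND event_time >= now() - INTERVAL 1 HOUR
  AND (query ILIKE '%forecasts%' OR query ILIKE '%best_known_forecasts%')
ORDER BY event_time DESC
LIMIT 20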

Be extra careful tuning subqueries in MVs. Memory usage can grow in opaque ways that don't show up immediately. Our multi-sensor fusion MVs had a suboptimal join that worked fine early in each month, but as the partition grew, memory usage climbed until we hit OOM errors in the back half of the month. The fix was straightforward once diagnosed, but the failure mode was subtle.

Dictionary Tables for Device Metadata

Initially, we tried querying device metadata from SQL Server at read time via linked server connections—network latency killed performance (200ms+ per query).

Solution: ClickHouse Dictionary with HTTP source that fetches device metadata from an API endpoint, auto-refreshed every 10 minutes:

devices_dictionary.sql
CREATE DICTIONARY devices (
  ID UInt32,
  GroupID UInt64,
  Name String,
  Latitude Float32,
  Longitude Float32,
  DeviceType String,
  DeviceState String
)
PRIMARY KEY ID
SOURCE(HTTP(URL 'https://api.example.com/v1/devices' FORMAT JSONEachRow))
LIFETIME(MIN 0 MAX 600)
LAYOUT(FLAT())
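
Queries then enrich results with dictGet() calls instead of reaching back to SQL Server; for example:

sql
-- Latest readings with device names pulled from the dictionary.
SELECT
  DeviceID,
  dictGet('devices', 'Name', toUInt64(DeviceID)) AS DeviceName,
  dictGet('devices', 'DeviceType', toUInt64(DeviceID)) AS DeviceType,
  CaptureDateTimeUTC,
  SurfaceTemp
FROM get_latest_transmissions FINAL
ORDER BY DeviceName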

Gotcha: Dictionary reloads can briefly serve stale data. While a refresh is in flight, queries might see old device names. Acceptable for our use case, but worth knowing.

ClickHouse Cloud Pricing

ClickHouse Cloud pricing is driven primarily by compute, not storage. Long-running analytical queries (full table scans) can spike costs. We addressed this by:

  • Query timeouts: SET max_execution_time = 60 in application layer
  • Monitoring system.query_log for expensive queries (query sketch after this list)
  • Dashboard query optimization: pre-computed MVs instead of on-the-fly joins
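
The expensive-query sweep is another small query_log report, roughly:

sql
-- Slowest queries over the last 24 hours, with rows and bytes read.
SELECT
  query_duration_ms,
  read_rows,
  formatReadableSize(read_bytes) AS read_bytes_h,
  substring(query, 1, 120) AS query_head
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time >= now() - INTERVAL 1 DAY
ORDER BY query_duration_ms DESC
LIMIT 20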

Our ClickHouse Cloud bill has been consistent at ~$1,500/month despite growing data volume, thanks to these optimizations.

ASOF JOIN Edge Cases

ASOF JOIN requires sorted data on the right side. If the best_known_forecasts table isn't sorted by (DeviceID, ForecastDateTimeUTC), ASOF JOIN produces unexpected NULLs.

Also, if forecast data has gaps (e.g., device offline for 12 hours), ASOF JOIN will match transmissions to stale forecasts. We added fallback logic in the MV to populate ForecastAge_Seconds and filter out forecasts older than 6 hours in dashboard queries.
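
The dashboard-side guard is a simple predicate on the fused table, for example:

sql
-- Ignore fused rows whose matched forecast was more than 6 hours old.
SELECT DeviceID, CaptureDateTimeUTC, SurfaceTemp_F, ForecastSurfaceTemp_F
FROM transmissions_with_forecasts
WHERE CaptureDateTimeUTC >= now() - INTERVAL 1 DAY
  AND ForecastAge_Seconds <= 6 * 3600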

When ClickHouse Is (and Isn't) the Right Choice

ClickHouse transformed our IoT analytics platform. But it's not a universal solution. Here's our decision framework:

ClickHouse is excellent for:

  • High-volume time-series data. We're at 5.3 billion rows and growing by 35 million daily—ClickHouse handles this effortlessly.
  • Read-heavy analytical workloads. Dashboards, reports, aggregations—this is ClickHouse's sweet spot.
  • Aggregation-heavy queries. GROUP BY device, date, hour—columnar storage shines here.
  • Mostly immutable data. Sensor readings don't change after they're recorded.
  • Real-time materialized views and ASOF JOINs. These features enable analytics that were impossible in SQL Server.

Consider alternatives if:

  • You need ACID transactions across multiple rows. ClickHouse isn't built for transactional workloads. Use PostgreSQL or MySQL.
  • Frequent point updates to existing rows. If you're constantly updating individual rows, PostgreSQL is a better fit.
  • Complex many-to-many JOINs are your primary pattern. ClickHouse can do JOINs, but it's optimized for denormalized data and columnar scans.
  • You need sub-millisecond ingestion latency. ClickHouse batches writes for efficiency. For real-time streaming, consider Kafka + ClickHouse.

The Broader Lesson: Right Tool for the Scale You're At

SQL Server served our client well for the first million rows. When the dataset grew to billions and query patterns shifted toward time-series analytics, ClickHouse was the right evolution.

This isn't about ClickHouse being "better"—it's about matching database architecture to workload characteristics. SQL Server excels at transactional workloads with complex relationships. ClickHouse excels at analytical queries over append-only time-series data. Use both, where appropriate.

The results speak for themselves:

  • 1.3+ billion transmission rows, ~4 billion forecast rows in production
  • Sub-20ms time-series queries, under 500ms bulk exports (network-bound)
  • 315x storage reduction: 6 TB SQL Server to approximately 19 GB ClickHouse
  • $2,200/month savings: $4K RDS to $1.5K ClickHouse Cloud (approximately $26K/year)

What's next for us? We're building anomaly detection queries using ClickHouse's array functions, expanding the computer vision image analysis pipeline, and evaluating projections for common query patterns.

If your IoT platform, sensor network, or time-series application is outgrowing traditional relational databases, now might be the time to consider ClickHouse. At DRYCodeWorks, we've designed and migrated these systems for clients across industries—from road weather monitoring to industrial sensors to financial tick data.

Get in touch to discuss your architecture. We offer technical audits to evaluate whether ClickHouse (or another time-series database) is the right fit for your scale and query patterns.
