Config-Driven Data Platform for a US-Based Theme Park Operator

34 theme parks
united by the merger of two operators

~50 million guests
visiting the theme parks annually

2,720 source tables
with 10M+ rows migrated

Business Case

When two leading park operators merged, each of the 34 venues still ran its own SQL Server instance. Data teams faced three problems:

Source Fragmentation

2,720 tables lived on separate park servers, so teams pulled numbers from different snapshots

Schema Drift

New or dropped columns crashed hard-coded jobs, triggering late-night fixes

Zero Audit Trail

No run-level log showed row counts or errors, so failures stayed hidden until reports looked wrong

Impact Delivered

90% faster
onboarding

New park feeds move from request
to production in under two days

0 manual fixes

Schema drift handled automatically;
engineers focus on analysis

Full transparency

Every run logs row counts and status,
so executives trust the numbers

Our Solution

We built a config-driven pipeline that consolidates data from all 34 park servers into a single Snowflake warehouse.

Central Config Table

All mappings, S3 paths, and load flags now sit in a Snowflake table that analysts can edit
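
A minimal sketch of what that config table might look like, created from Python via the Snowflake connector. The table and column names (etl_config, source_table, s3_path, load_enabled, and so on) are illustrative assumptions, not the production schema:

```python
import snowflake.connector

# Hypothetical schema for the central config table; names are illustrative.
CONFIG_DDL = """
CREATE TABLE IF NOT EXISTS etl_config (
    source_table  VARCHAR,   -- fully qualified table on the park's SQL Server
    target_table  VARCHAR,   -- destination table in Snowflake
    s3_path       VARCHAR,   -- staging prefix, e.g. s3://lake/park_01/orders/
    file_format   VARCHAR,   -- named Snowflake file format used by COPY INTO
    load_enabled  BOOLEAN    -- flag analysts flip to pause or resume a feed
)
"""

conn = snowflake.connector.connect()  # connection parameters omitted
conn.cursor().execute(CONFIG_DDL)
```

Onboarding a new park feed then reduces to inserting a row and setting load_enabled, which is what makes the two-day turnaround possible.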

Dynamic COPY Procedure

A stored procedure builds COPY INTO commands for each table and writes outcomes to an audit log
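
The write-up doesn't include the procedure itself; the sketch below shows the same pattern in Python against the hypothetical tables above (the production version is a Snowflake stored procedure): read enabled config rows, build a COPY INTO per table, and record each outcome in an audit log.

```python
def run_dynamic_copy(cur):
    """Build and run one COPY INTO per enabled config row, logging each outcome.

    `cur` is a snowflake.connector cursor; etl_config and etl_audit_log are
    the hypothetical tables sketched above.
    """
    cur.execute(
        "SELECT target_table, s3_path, file_format "
        "FROM etl_config WHERE load_enabled"
    )
    for target_table, s3_path, file_format in cur.fetchall():
        copy_sql = (
            f"COPY INTO {target_table} FROM '{s3_path}' "
            f"FILE_FORMAT = (FORMAT_NAME = '{file_format}')"
        )
        try:
            cur.execute(copy_sql)
            status, error = "SUCCESS", None
        except Exception as exc:
            status, error = "FAILED", str(exc)
        # Record the outcome so no failure stays hidden.
        cur.execute(
            "INSERT INTO etl_audit_log (table_name, status, error_message, logged_at) "
            "VALUES (%s, %s, %s, CURRENT_TIMESTAMP)",
            (target_table, status, error),
        )
```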

Object-Level Refresh Tracking

Start time, end time, status, and duration are captured for every table, every run
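
One way to capture those four fields is a small timing wrapper around each table load; refresh_tracking and its columns are assumed names, not the actual log table:

```python
import time
from contextlib import contextmanager

@contextmanager
def track_refresh(cur, run_id, table_name):
    """Record start time, end time, status, and duration for one table load."""
    start = time.time()
    status = "SUCCESS"
    try:
        yield
    except Exception:
        status = "FAILED"
        raise
    finally:
        # Write the tracking row whether the load succeeded or failed.
        end = time.time()
        cur.execute(
            "INSERT INTO refresh_tracking "
            "(run_id, table_name, started_at, ended_at, status, duration_s) "
            "VALUES (%s, %s, TO_TIMESTAMP(%s), TO_TIMESTAMP(%s), %s, %s)",
            (run_id, table_name, start, end, status, end - start),
        )
```

Wrapping a load in `with track_refresh(cur, run_id, "DIM_GUEST"):` records the row even when the load raises.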

Airflow-Orchestrated Loads

Python tasks move files from SQL Server to S3, call the dynamic COPY, then archive each batch
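
A skeleton of that three-step DAG, assuming the Airflow 2.x TaskFlow API; task bodies are elided because the extract and archive logic is site-specific:

```python
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def park_ingest():
    @task
    def extract_to_s3():
        # Pull rows from the park's SQL Server and land files
        # under the s3_path recorded in the config table.
        ...

    @task
    def run_copy():
        # Call the dynamic COPY procedure in Snowflake.
        ...

    @task
    def archive_batch():
        # Move processed files to an archive prefix in S3.
        ...

    # Extract, load, then archive, in order.
    extract_to_s3() >> run_copy() >> archive_batch()

park_ingest()
```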

Row-Count Verification

Source and target totals are compared on the fly. Any mismatch triggers an alert
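
A sketch of that check, assuming one cursor per system; raising an exception here stands in for whatever alerting channel the team actually uses, since the orchestrator will fail the task and notify on-call:

```python
def verify_counts(src_cur, sf_cur, source_table, target_table):
    """Compare source (SQL Server) and target (Snowflake) row counts."""
    src_cur.execute(f"SELECT COUNT(*) FROM {source_table}")
    sf_cur.execute(f"SELECT COUNT(*) FROM {target_table}")
    src_total = src_cur.fetchone()[0]
    tgt_total = sf_cur.fetchone()[0]
    if src_total != tgt_total:
        # Any mismatch stops the run rather than silently loading bad data.
        raise ValueError(
            f"Row-count mismatch for {target_table}: "
            f"source={src_total}, target={tgt_total}"
        )
    return src_total
```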

Tech Stack

S3

Python

Snowflake

Apache Airflow

dbt

SQL Stored Procedures