Config-Driven Data Platform for a
US-Based Theme Park Operator
34 theme parks
created by a theme park merger
~50 million guests
visiting the theme parks annually
2,720 source tables
with 10M+ rows migrated
Business Case
When two leading park operators merged, each of the 34 venues still ran its own SQL Server. Data teams faced three problems:
Source Fragmentation
2,720 tables lived on separate park servers, so teams pulled numbers from different snapshots
Schema Drift
New or dropped columns crashed hard-coded jobs, triggering late-night fixes
Zero Audit Trail
No run-level log showed row counts or errors, so failures stayed hidden until reports looked wrong
90% faster
onboarding
New park feeds move from request
to production in under two days
0 manual fixes
Schema drift handled automatically;
engineers focus on analysis
Full transparency
Every run logs row counts and status,
so executives trust the numbers
Our Solution
We built a config-driven pipeline that gathers data from all 34 park servers into one Snowflake warehouse.
Central Config Table
All mappings, S3 paths, and load flags now sit in a Snowflake table that analysts can edit
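A minimal sketch of what one row in that config table might carry, expressed in Python. The column names and values here are illustrative assumptions, not the actual schema:

```python
# One hypothetical row from the central config table. Analysts edit rows
# like this instead of touching pipeline code; all names are assumptions.
CONFIG_ROW = {
    "source_table": "PARK_07.dbo.TicketSales",     # hypothetical source table
    "target_table": "RAW.TICKET_SALES",            # hypothetical Snowflake target
    "s3_path": "s3://lake/park_07/ticket_sales/",  # hypothetical S3 prefix
    "file_format": "CSV_GZIP",                     # named Snowflake file format
    "load_enabled": True,                          # flag analysts can flip
}

def enabled_tables(rows):
    """Return only the source tables flagged for loading."""
    return [r["source_table"] for r in rows if r["load_enabled"]]
```

Flipping `load_enabled` is all it takes to pause or resume a feed, which is what keeps onboarding fast.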
Dynamic COPY Procedure
A stored procedure builds COPY INTO commands for each table and writes outcomes to an audit log
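The statement-building step can be sketched in Python; the real version runs as a Snowflake stored procedure, and the config keys here are the same hypothetical names as above:

```python
def build_copy_sql(cfg):
    """Assemble a Snowflake COPY INTO statement from one config row.
    cfg keys (target_table, s3_path, file_format) are assumed names."""
    return (
        f"COPY INTO {cfg['target_table']} "
        f"FROM '{cfg['s3_path']}' "
        f"FILE_FORMAT = (FORMAT_NAME = {cfg['file_format']}) "
        f"ON_ERROR = 'ABORT_STATEMENT'"
    )
```

Because the SQL is generated from config at run time, a new or dropped column in the source never requires a code change, only a config update.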
Object-Level Refresh Tracking
Start time, end time, status, and duration are captured for every table, every run
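A sketch of the audit wrapper, assuming a hypothetical record shape; the actual platform writes these fields to a Snowflake audit table:

```python
import time

def run_with_audit(table, load_fn):
    """Run one table load and capture start, end, status, and duration."""
    record = {"table": table, "start": time.time()}
    try:
        load_fn()
        record["status"] = "SUCCESS"
    except Exception as exc:
        record["status"] = f"FAILED: {exc}"  # failure is recorded, not hidden
    record["end"] = time.time()
    record["duration_s"] = round(record["end"] - record["start"], 3)
    return record
```

Every run produces a record like this, so a silent failure can no longer hide until a report looks wrong.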
Airflow-Orchestrated Loads
Python tasks move files from SQL Server to S3, call the dynamic COPY, then archive each batch
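The three stages can be sketched as plain callables; in the real DAG each would be an Airflow PythonOperator wired extract >> load >> archive, and the bodies here are placeholder stand-ins:

```python
# Placeholder stage functions; real versions would pull from SQL Server,
# invoke the dynamic COPY procedure, and move files to an archive prefix.
def extract_to_s3(table, log):
    log.append(f"extract:{table}")

def copy_into_snowflake(table, log):
    log.append(f"copy:{table}")

def archive_batch(table, log):
    log.append(f"archive:{table}")

def run_batch(table):
    """Run the three stages in order for one table's batch."""
    log = []
    for step in (extract_to_s3, copy_into_snowflake, archive_batch):
        step(table, log)
    return log
```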
Row-Count Verification
Source and target totals are compared on the fly. Any mismatch triggers an alert
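The comparison itself is simple; a sketch, assuming the counts have already been fetched from source and target:

```python
def verify_counts(source_count, target_count, table):
    """Compare source and target row totals for one table.
    A non-zero diff is what would trigger the alert."""
    diff = source_count - target_count
    return {"table": table, "ok": diff == 0, "diff": diff}
```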