34+
Parks merged
~50M
Annual visitors
$3.1B+
Revenue in 2025
Following the merger of two operating entities into a single group of multiple parks, the organization’s data infrastructure was fragmented across separate SQL Servers, ticketing platforms, payroll systems, and APIs with no shared data model and no centralized warehouse. Pipeline cycles ran for two to three hours with no observability, sales and promotion data was unavailable in consolidated form, and Snowpipe had no orchestration or failure detection. Inferenz rebuilt the data platform from the ground up on Snowflake, replaced Snowpipe with a fully orchestrated Apache Airflow framework, and delivered a config-driven engine that onboards new source systems in four to six weeks with no code changes required.
Following the merger of two operating entities, the group's data infrastructure was fragmented, unobservable, and unable to support a unified business.
Two independent source systems, including SQL Servers, REST APIs, payroll platforms, and separate ticketing infrastructure, with no shared data model or centralized warehouse. Twelve source systems ranging from 15 GB to 550 GB had no unified foundation.
Real-time ticket volumes, daily sales, and promotion redemptions were unavailable in consolidated form across US markets. The most data-intensive systems ran for two to three hours per pipeline cycle with no observability into whether they completed successfully.
Snowpipe had no orchestration, no deterministic completion tracking, and no table-level metadata. Failures were silent and root cause analysis was slow and entirely manual, with no way to trace which tables had loaded or why runs had stalled.
Legacy SQL Server infrastructure buckled under post-merger analytical load. Reports were stale and distrusted. With no lineage in place, teams could not trace numbers back to source or diagnose anomalies when figures conflicted across systems.
Inferenz rebuilt the group's data platform from the ground up, replacing Snowpipe with a fully orchestrated Apache Airflow framework and centralizing all data into a governed Snowflake warehouse.
A config-driven framework defines each source system's tables, load type, and schedule in configuration rather than code: no DAG changes required to onboard a new source, cutting integration time to four to six weeks regardless of system size, from a 4-table 15 GB system to 100+ table systems carrying 550 GB. Unmanaged Snowpipe was replaced with a fully orchestrated Apache Airflow pipeline on MWAA, making every run deterministic, tracked, and observable, with structured run status and table-level alerting replacing silent failures.

Dynamic dbt DAG generation standardized the transformation layer across all 12 systems with a consistent, auto-generated execution pattern, removing manual maintenance of individual transformation logic. Per-system concurrency isolation eliminated pipeline collisions, enabling safe parallel processing, daily dbt pipeline run time dropped from approximately one hour to approximately 10 minutes as a direct result.

A comprehensive audit layer captures every run across 50 to 60 tracked tables: status, row counts, load type, and watermark per table, giving teams full visibility into what loaded, when, and whether it completed within SLA, replacing complete operational blindness.

Ticketing, sales, and promotion data were unified in a Snowflake Consumption Layer purpose-built for Power BI, with SLA-compliant dashboards surfacing daily pipeline health, sales performance, and promotion redemption data across all US markets in consolidated form for the first time.





Faster daily pipeline
Daily pipeline run time reduced from 2 to 3 hours to 30 to 50 minutes via parallel dbt execution with per-system isolation.
Sales and promotion visibility
Ticketing, daily sales, and promotion redemption data consolidated into a single consumption layer across all US markets for the first time.
Managed via config only
12 source systems and 600+ tables onboarded and governed through configuration updates alone, with no code changes required for new sources.
Audit trail per run
Table-level status, row counts, load type, and watermark captured across 50 to 60 tracked tables on every pipeline run.
Whether you’re starting with data modernization or exploring AI copilots, we’re here to help.
Contact Us