Why was the Fivetran-dbt merger all but inevitable?
Fivetran & dbt Labs announced their merger yesterday. The all-stock deal combines two companies into an entity approaching $600 million in ARR.
The beauty of the modern data stack was the explosion in choice. As the cloud exploded onto the scene, the legacy data warehouse was replaced by a collection of fast-moving platforms. In that era, specialization won. The pendulum is now swinging back towards consolidation.
Why? The answer lies in compute economics & revenue scale asymmetry, as the table below shows.
| Category | Snowflake | Databricks | Fivetran + dbt | Est. Category Revenue¹ |
|---|---|---|---|---|
| Ingestion (ETL) | Openflow (Apache NiFi via Datavolo) | LakeFlow Connect (Arcion CDC) | Fivetran | ~$2.5B |
| Transformation | dbt Projects on Snowflake (native dbt) | Delta Live Tables + dbt via Workflows | dbt Core/Cloud | ~$500M |
| Compute | Virtual Warehouses (owns margin pool) | Clusters (owns margin pool) | Runs on platform compute (no margin capture) | ~$7.6B |
There are three different categories of software within this subset of the ecosystem:
- Ingestion takes data from software & moves it into a cloud data warehouse. Snowflake acquired Datavolo, which commercializes the open source product Apache NiFi, calling it Openflow. Databricks acquired Arcion for ingestion through change data capture, calling it LakeFlow Connect. Fivetran focuses exclusively on this layer.
- Transformation means reformatting the data within the cloud data warehouse. Snowflake launched native dbt Projects on Snowflake. Databricks offers Delta Live Tables, native SQL, & Python, plus supports hosted dbt through Databricks Workflows. dbt Core/Cloud is the leading independent transformation tool.
- Compute revenue is generated when we ask questions of our data. Snowflake remains one of the leaders in structured data analysis with their cloud data warehouse. Databricks’ compute is their own as well.
Here’s the asymmetry in one number. Compute represents 72% of the overall market ($7.6B of $10.6B). As a result of their massive operations, Snowflake & Databricks exert significant gravity within the ecosystem. They have expanded beyond compute into ingestion & transformation to capture incremental revenue from their existing customers, pressuring independent vendors.
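The 72% figure can be reproduced from the footnote's estimates. The sketch below is a back-of-the-envelope check, not a market model: the dictionary and variable names are illustrative, and the inputs mix fiscal-year revenue with an ARR figure, exactly as the footnote does.

```python
# Category revenue estimates in $B, copied from the footnote.
ingestion = {"Informatica": 1.64, "Fivetran": 0.30, "Talend": 0.35, "others": 0.20}
transformation = {"dbt Labs": 0.30, "others": 0.20}
compute = {"Snowflake": 3.6, "Databricks": 4.0}

# Sum each category, then express it as a share of the combined total.
categories = {
    "Ingestion": sum(ingestion.values()),
    "Transformation": sum(transformation.values()),
    "Compute": sum(compute.values()),
}
total = sum(categories.values())
shares = {name: revenue / total for name, revenue in categories.items()}

for name, revenue in categories.items():
    print(f"{name:<15} ${revenue:.2f}B  {shares[name]:.0%}")
print(f"{'Total':<15} ${total:.2f}B")
```

Ingestion sums to $2.49B before rounding, so the total is $10.59B rather than $10.6B. Compute's share rounds to 72% either way.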
That’s not to say these components are independent. George Fraser analyzed Snowflake workloads in September 2024, finding transformation represents 40-45% of total Snowflake compute, which means even at smaller scales, startups can have significant impact on these behemoth businesses.
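To put a rough dollar figure on that dependence, here is a hedged sketch. It assumes, purely for illustration, that transformation's share of Snowflake compute maps one-to-one onto Snowflake's $3.6B revenue figure from the footnote. That assumption is mine, not a claim from Fraser's analysis, and the variable names are illustrative.

```python
# ASSUMPTION: share of compute workloads ~= share of revenue (illustrative only).
snowflake_revenue_b = 3.6             # $B, FY2025 figure from the footnote
transformation_share = (0.40, 0.45)   # range attributed to Fraser's analysis
transformation_category_b = 0.5       # $B, table's estimate for transformation tools

# Implied Snowflake revenue driven by transformation workloads, low & high ends.
low, high = (snowflake_revenue_b * share for share in transformation_share)

print(f"Implied transformation-driven compute: ${low:.2f}B to ${high:.2f}B")
print(f"Multiple of transformation tool revenue: "
      f"{low / transformation_category_b:.1f}x to {high / transformation_category_b:.1f}x")
```

Under that assumption, the compute driven by transformation workloads on Snowflake alone would be several times the revenue of the entire transformation tooling category.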
The Fivetran-dbt merger is an inevitable evolution of a maturing market. Two unicorns must partner to compete against two decacorns. They solve two of the three customer problems. But not yet compute.
One could surmise this consolidation signals the end of the modern data stack. I view it differently. The MDS has succeeded beyond our expectations. The stakes are higher now. Broad platforms, fast growth, & AI-native architectures define the next phase. Expect more consolidation.
¹ Category revenue estimates based on public disclosures & company filings. Ingestion: Informatica ($1.64B FY2024), Fivetran ($300M est.), Talend ($350M), others ($200M est.). Transformation: dbt Labs ($300M est.), others ($200M est.). Compute: Snowflake ($3.6B FY2025), Databricks ($4.0B ARR Aug 2025).