In early and mid-stage data programs, ADF often worked well. Teams could orchestrate batch ingestion, trigger Databricks notebooks, and coordinate dependencies across systems with minimal upfront engineering effort. For organizations standardizing on Azure, ADF felt like the natural control plane for data movement and orchestration.
However, as data platforms have evolved, particularly for organizations that have made Databricks the center of their analytics and AI stack, many teams are encountering real limitations with ADF:
- Pipeline logic fragmented across ADF, Databricks jobs, and external scripts
- Brittle dependencies and complex retry logic as pipelines scale
- Limited observability into end-to-end data lineage and state
- CI/CD friction when managing JSON-based pipelines at scale
- Rising orchestration and compute costs driven by pipeline sprawl
These challenges are not theoretical. They surface most acutely in Databricks-centric architectures that rely on Delta Lake, Unity Catalog, streaming ingestion, and advanced transformation patterns.
This is where Databricks Lakeflow enters the picture. Lakeflow represents a shift from external orchestration toward a native, data-aware pipeline layer built directly into the Databricks platform. By unifying ingestion, transformation, orchestration, lineage, and governance, Lakeflow enables teams to simplify architectures while improving reliability, performance, and operational clarity.
Entrada has worked closely with customers navigating this transition. The remainder of this post draws on those experiences to share practical lessons from ADF-to-Lakeflow migrations across industries.
Entrada’s Experience with ADF-to-Lakeflow Migrations
A Databricks-First Perspective
Entrada is a Databricks-focused systems integrator and strategic partner, with extensive experience delivering large-scale data platform transformations across financial services, healthcare, life sciences, retail, and digital-native organizations. Many of these customers came to Entrada with existing Azure Data Factory investments, often deeply embedded in their operating model.
Rather than treating ADF as “wrong,” Entrada’s approach has been to assess where ADF still adds value and where it introduces unnecessary complexity in a modern lakehouse architecture.
Across dozens of engagements, a clear pattern has emerged:
When Databricks becomes the primary execution and analytics platform, external orchestration increasingly becomes a liability rather than an asset.
Common ADF Challenges Observed in the Field
In ADF-heavy environments that rely on Databricks, Entrada repeatedly encountered similar issues:
- Brittle pipeline dependencies: Notebook chains managed via ADF activities were tightly coupled and difficult to evolve without regressions.
- Operational overhead at scale: Customers running hundreds of pipelines struggled with monitoring, troubleshooting, and replaying partial failures.
- CI/CD limitations: Managing and versioning large numbers of ADF JSON definitions introduced friction, especially when teams wanted to apply software engineering best practices.
- Disconnected lineage and governance: Even with tools like Azure Purview, lineage across ADF → Databricks → Delta tables was often incomplete or delayed.
- Cost opacity: Orchestration costs, job cluster spin-up, and inefficient retries made it difficult to attribute and optimize spend.
These challenges were most pronounced in environments with:
- Event-driven ingestion or CDC
- Regulatory or audit requirements
- Growing AI and ML workloads
How Entrada Approached the Migration
Entrada does not advocate for a “big bang” rewrite. Successful migrations followed a structured, incremental approach:
1. Pipeline Mapping and Classification
   Existing ADF pipelines were analyzed and categorized as:
   - Pure orchestration (ADF calling Databricks only)
   - Ingestion-heavy (Copy Activities, snapshots)
   - Cross-platform coordination (ADF + Functions + Logic Apps)
2. DAG Re-architecture Inside Databricks
   For Databricks-centric pipelines, orchestration logic was moved into:
   - Databricks Workflows for scheduling and dependencies
   - Delta Live Tables (DLT) for declarative transformations
   - Lakeflow Connect / Auto Loader for ingestion (see the ingestion sketch after this list)
3. Incremental Cutover
   Pipelines were migrated domain by domain, allowing old and new patterns to coexist safely during the transition.
4. Governance by Design
   Unity Catalog was introduced early to ensure lineage, access control, and auditability were native, not bolted on later.
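As a concrete illustration of the ingestion piece of step 2, the sketch below shows an Auto Loader stream that incrementally picks up files an ADF Copy Activity previously moved in batch. It is a minimal sketch: the landing path, schema location, and target table name are illustrative assumptions, not taken from any specific customer environment.

```python
# Minimal Auto Loader sketch: incremental file ingestion into a Delta table.
# Paths and table names below are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # on Databricks this returns the active session

(
    spark.readStream.format("cloudFiles")                     # Auto Loader source
    .option("cloudFiles.format", "json")                      # raw file format
    .option("cloudFiles.schemaLocation", "/Volumes/raw/orders/_schema")
    .load("/Volumes/raw/orders/landing")                      # cloud landing path
    .writeStream
    .option("checkpointLocation", "/Volumes/raw/orders/_checkpoint")
    .trigger(availableNow=True)                               # process new files, then stop
    .toTable("main.bronze.orders_raw")                        # managed Delta target
)
```

Because Auto Loader tracks processed files in its checkpoint, reruns pick up only new data, which is the behavior teams often tried to approximate in ADF with watermarks and custom control tables.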
This approach reduced risk while allowing teams to realize value quickly.
Comparative Value: Lakeflow vs. ADF
The most compelling insights from Entrada’s work come from post-migration outcomes. While every environment is different, several consistent benefits emerged.
1. Simplified Orchestration and Native Integration
With Lakeflow, orchestration is no longer external to execution.
Instead of:
- ADF triggering Databricks notebooks
- Notebooks triggering other notebooks
- External scripts managing retries and state
Teams define pipelines directly in Databricks, where the platform:
- Builds dependency graphs automatically
- Manages execution state and checkpoints
- Resumes from the correct point on failure
For example, one enterprise consolidated more than 150 ADF pipelines into fewer than 40 Lakeflow pipelines, dramatically reducing operational complexity.
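The fragment below is a minimal Delta Live Tables (DLT) sketch of that pattern; the table names, path, and quality rule are illustrative. Because the silver table reads the bronze table by name, the dependency graph, execution state, and restart behavior are handled by the platform rather than by an external orchestrator.

```python
# Declarative pipeline sketch (Delta Live Tables).
# Table names, paths, and the expectation below are illustrative assumptions;
# `spark` is provided by the pipeline runtime.
import dlt
from pyspark.sql import functions as F


@dlt.table(comment="Raw orders ingested with Auto Loader")
def orders_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/raw/orders/landing")
    )


@dlt.table(comment="Cleaned orders")
@dlt.expect_or_drop("valid_amount", "amount > 0")   # declarative data quality rule
def orders_silver():
    # Reading orders_bronze here is what lets the platform infer the dependency graph.
    return dlt.read_stream("orders_bronze").withColumn("ingested_at", F.current_timestamp())
```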
2. Improved Observability, Lineage, and Governance
ADF pipelines provide activity-level visibility, but they lack native understanding of data state.
Lakeflow, combined with Unity Catalog, provides:
- Table- and column-level lineage
- Built-in audit logs
- Unified access control across data and pipelines
- Clear impact analysis for downstream consumers
In regulated industries, this shift reduced audit preparation effort and improved confidence in data correctness.
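Where lineage system tables are enabled, this visibility is also queryable directly. The snippet below is a sketch, assuming the system.access.table_lineage system table is available in the workspace and using an illustrative target table name.

```python
# Sketch: list recent upstream sources for a table via Unity Catalog lineage
# system tables (assumes system.access.table_lineage is enabled; the target
# table name is illustrative).
upstream = spark.sql("""
    SELECT source_table_full_name,
           target_table_full_name,
           event_time
    FROM system.access.table_lineage
    WHERE target_table_full_name = 'main.silver.orders_silver'
    ORDER BY event_time DESC
    LIMIT 20
""")
upstream.show(truncate=False)
```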
3. Streamlined DevOps and CI/CD Practices
ADF pipelines are often managed as large JSON artifacts, which do not lend themselves well to code review, testing, or reuse.
By contrast, Lakeflow assets are:
- Defined as code (SQL, Python)
- Versioned alongside application logic
- Tested using standard software engineering workflows
Entrada helped several customers integrate Lakeflow pipelines into Git-based CI/CD pipelines, enabling:
- Automated testing
- Safer deployments
- Faster iteration cycles
Teams reported fewer deployment-related failures and greater developer autonomy.
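As a sketch of what "tested using standard software engineering workflows" can look like in practice, the following pytest example exercises a hypothetical transformation function against a local Spark session; the function name and rule are illustrative.

```python
# Sketch: unit test for a pipeline transformation, run in CI before deployment.
# The transformation and its rule are illustrative assumptions.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def clean_orders(df):
    """Example transformation under test: drop non-positive amounts."""
    return df.filter(F.col("amount") > 0)


@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


def test_clean_orders_drops_invalid_rows(spark):
    df = spark.createDataFrame([(1, 10.0), (2, -5.0)], ["order_id", "amount"])
    result = clean_orders(df)
    assert result.count() == 1
    assert result.first()["order_id"] == 1
```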
4. Cost and Performance Efficiency
From a cost perspective, a key distinction emerged:
- ADF charges for orchestrating work
- Lakeflow charges primarily for doing work
By consolidating orchestration and execution:
- Clusters were reused more effectively
- Redundant job spin-up was eliminated
- Retry logic became more targeted
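One way this consolidation shows up in practice is a single Workflows job whose tasks share one job cluster and carry their own retry settings. The sketch below uses the Databricks SDK for Python; the notebook paths, node type, runtime version, and job name are illustrative assumptions.

```python
# Sketch: one Workflows job, two tasks sharing a job cluster, per-task retries.
# Paths, node type, runtime version, and names are illustrative assumptions.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute, jobs

w = WorkspaceClient()

shared = jobs.JobCluster(
    job_cluster_key="shared_etl",
    new_cluster=compute.ClusterSpec(
        spark_version="15.4.x-scala2.12",
        node_type_id="Standard_D4ds_v5",
        num_workers=2,
    ),
)

ingest = jobs.Task(
    task_key="ingest",
    job_cluster_key="shared_etl",
    notebook_task=jobs.NotebookTask(notebook_path="/Repos/etl/ingest"),
    max_retries=2,  # retry only the failed task, not the whole pipeline
)

transform = jobs.Task(
    task_key="transform",
    job_cluster_key="shared_etl",
    depends_on=[jobs.TaskDependency(task_key="ingest")],
    notebook_task=jobs.NotebookTask(notebook_path="/Repos/etl/transform"),
)

job = w.jobs.create(name="orders_pipeline", job_clusters=[shared], tasks=[ingest, transform])
print(f"Created job {job.job_id}")
```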
In multiple cases, customers saw:
- 50–80% reductions in pipeline runtime on large tables
- 60–70% reductions in run costs for high-volume workloads
- Fewer failed jobs and manual restarts
These gains came not from processing less data but from removing orchestration and execution inefficiencies.
Strategic Takeaways
When Does Migration Make Sense?
Based on Entrada’s experience, organizations should strongly consider migrating from ADF to Lakeflow when:
- Databricks is the primary data and analytics platform
- ADF is used mainly to orchestrate Databricks jobs
- Pipeline complexity is increasing faster than data volume
- Governance, lineage, and auditability are business-critical
- Teams want stronger CI/CD and engineering discipline
Conversely, ADF may still make sense for:
- Broad cross-platform orchestration
- Lightweight, Azure-native integrations where Databricks plays a minor role
The Future of Data Orchestration
Entrada’s perspective is that data orchestration is evolving away from task-based control planes toward data-aware execution engines.
Lakeflow reflects this shift:
- Declarative pipelines instead of imperative steps
- Platform-managed state instead of custom logic
- Governance embedded in execution, not layered on top
As organizations push further into real-time analytics, AI, and GenAI use cases, this model becomes increasingly important.
Closing Thoughts
Migrating from Azure Data Factory to Databricks Lakeflow is not about replacing one tool with another. It is about simplifying architectures, improving reliability, and aligning data engineering practices with the realities of modern analytics platforms.
Entrada’s customer successes show that when migrations are approached thoughtfully, grounded in real engineering trade-offs and business outcomes, the results can be transformative.
If your organization is evaluating Azure Data Factory migration, Databricks Lakeflow, or broader data pipeline modernization, Entrada can help assess your current state, identify high-impact migration candidates, and design a pragmatic roadmap forward.
The right orchestration strategy is no longer just about scheduling jobs; it is about enabling trustworthy, scalable, and future-ready data products.