For most of the last decade, the goal of a data platform was simple: make the data available. Land it, govern it, and let the humans take it from there. That goal is no longer enough.

In 2026, the consumer of your enterprise data is increasingly likely to be something other than a human. It is an agent. Agent Bricks, Genie Spaces, and Genie Code are reading your tables, interpreting your column names, following your foreign keys, and making decisions based on the answers they produce. And here is the uncomfortable truth I keep running into with clients: if your data is not modeled correctly, your agents will hallucinate. Confidently, fluently, and at scale.

The Lakehouse must now be structured for machine understanding, not just human consumption. That is what I mean by agent-ready.

Why Data Modeling Matters More in the AI Era

In my first article, The Lost Art of Data Modeling in the Age of AI and the Lakehouse, I argued that the discipline of modeling has been quietly discarded by teams that confused the cheapness of storage with design freedom. Agentic AI removes the last excuse to keep skipping the design phase.

A human analyst can compensate for a badly modeled table. They can read between the lines, ask a colleague what cust_v2_id actually means, and notice when a number looks off. An agent cannot. It will take your column name at face value, join on a key that nobody validated, calculate a “revenue” number that excludes refunds because no one told it otherwise, and return a polished answer with full confidence. The Databricks team put a number on this recently: agents grounded in proper Unity Catalog semantics deliver 70% higher accuracy than standard RAG, and 30% better performance on multi-step workflows.

Modeling is not dead. Modeling is now the prerequisite for trustworthy AI.

What Makes a Lakehouse “Agent-Ready” on Databricks

[Figure: Databricks medallion architecture. Data flows from Bronze (raw ingestion from cloud storage, Kafka, and Salesforce) through Silver (cleaned and validated customer and transaction data) into Gold (enriched, business-ready tables organized within Unity Catalog schemas).]
The medallion architecture organizes data into Bronze (raw), Silver (validated), and Gold (enriched) layers, each progressively closer to business consumption. For an Agent-Ready Lakehouse, the Gold layer is where semantic clarity matters most: it’s the layer agents actually read.
Source: Databricks

An agent-ready lakehouse is not just governed and scalable. It is semantically clear. Three pillars hold it up.

1. Well-Modeled Business Entities

This is the foundation I covered in the Lost Art piece. Clear conceptual, logical, and physical layers. A consistent grain in your fact tables. Meaningful relationships expressed as primary and foreign keys. Unity Catalog treats these constraints as informational rather than enforced, so add the RELY keyword only where your pipelines guarantee the keys hold; the optimizer can then trust them for join elimination on Photon-enabled compute. Liquid Clustering on the columns the business actually filters and joins on.
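
As a rough sketch of what that looks like on the tables used later in this article: the constraint names below are placeholders of my own, and the clustering columns are an assumption about how the business filters and joins. Check the RELY option against your runtime version, since support for it on foreign keys arrived later than on primary keys.

```sql
-- Primary key columns must be NOT NULL before the constraint can be added.
ALTER TABLE prod.gold.dim_customers
  ALTER COLUMN customer_sk SET NOT NULL;

-- Declare the primary key. RELY asks the optimizer to trust it,
-- so the pipeline must guarantee customer_sk is unique.
-- (pk_dim_customers is a hypothetical constraint name.)
ALTER TABLE prod.gold.dim_customers
  ADD CONSTRAINT pk_dim_customers PRIMARY KEY (customer_sk) RELY;

-- Declare the relationship from the fact table to the customer dimension.
-- (fk_sales_customer is a hypothetical constraint name.)
ALTER TABLE prod.gold.fact_sales
  ADD CONSTRAINT fk_sales_customer
  FOREIGN KEY (customer_sk) REFERENCES prod.gold.dim_customers (customer_sk);

-- Liquid Clustering on the columns assumed to drive filters and joins.
ALTER TABLE prod.gold.fact_sales
  CLUSTER BY (order_date, customer_sk);
```

Declared keys do double duty: they help the optimizer, and they give an agent an explicit join path instead of a guess based on column names.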

If an agent cannot tell which table holds the authoritative customer, or whether orders.amount is gross or net, no amount of prompt engineering will save you. The agent inherits whatever ambiguity you left in the model.
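
One low-cost way to remove that ambiguity is to write the answer into the catalog itself. The sketch below assumes, purely for illustration, that amount is net of refunds and that dim_customers holds one row per customer_sk; substitute whatever your business actually agreed on.

```sql
-- Document the meaning of the measure column.
-- The "net of refunds" definition is an assumption for this example.
ALTER TABLE prod.gold.fact_sales
  ALTER COLUMN amount
  COMMENT 'Net order amount after refunds and discounts (assumed definition).';

-- State which table is authoritative and what its grain is.
-- The grain described here is likewise an assumption.
COMMENT ON TABLE prod.gold.dim_customers IS
  'Authoritative customer dimension. One row per customer_sk.';
```

Comments like these are part of the metadata an agent can read, so the definition travels with the table rather than living in someone's head.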

2. Trusted Semantic Context

This is the layer that did not really exist three years ago. Unity Catalog is now the place where business meaning lives alongside the physical data.

Governed Tags (GA in March 2026) give you an account-level vocabulary for describing what your data actually is. They are enforced key-value pairs like sensitivity = confidential or pii = ssn that drive Attribute-Based Access Control (ABAC) policies and feed agents the metadata they need to reason responsibly.

SQL:
-- Apply a governed tag to a column
ALTER TABLE prod.gold.dim_customers
ALTER COLUMN ssn SET TAGS ('pii' = 'ssn');

-- Tag the table with a business domain
ALTER TABLE prod.gold.fact_sales
SET TAGS ('domain' = 'revenue', 'certified' = 'true');
[Figure: Unity Catalog hierarchy. The Metastore sits at the top and contains Catalogs, which contain Schemas, which contain Tables, Views, Volumes, Functions, and ML Models.]
Unity Catalog’s three-level namespace (catalog.schema.table) is the foundation of semantic clarity on Databricks. It’s where business meaning – governed tags, certifications, ownership, and metric definitions – lives alongside the physical data.
Source: Databricks
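
Once tags are applied, they can be read back through the information schema, which is a useful sanity check before pointing an agent at a schema. A minimal sketch, assuming the prod catalog and gold schema used in the tagging statements above:

```sql
-- List every column in prod.gold that carries a 'pii' tag,
-- together with the tag value that was assigned.
SELECT table_name,
       column_name,
       tag_value
FROM system.information_schema.column_tags
WHERE catalog_name = 'prod'
  AND schema_name = 'gold'
  AND tag_name = 'pii'
ORDER BY table_name, column_name;
```

If a column you know to be sensitive is missing from the result, the tag vocabulary has not reached it yet, and neither have the policies that depend on it.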

Unity Catalog Metric Views are the semantic layer the lakehouse always needed. You define a business metric once – Total Revenue, Active Customers, Average Order Value – and every consumer (Genie, dashboards, agents, external BI through JDBC) gets the same answer from the same definition.

YAML:
version: 1.1
source: prod.gold.fact_sales
joins:
  - name: customer
    source: prod.gold.dim_customers
    on: source.customer_sk = customer.customer_sk
dimensions:
  - name: Order Date
    expr: order_date
  - name: Customer Segment
    expr: customer.segment
measures:
  - name: Total Revenue
    expr: SUM(amount)
  - name: Average Order Value
    expr: SUM(amount) / COUNT(DISTINCT order_id)

This is what stops an agent from inventing its own definition of “revenue.” When a user asks Genie, “What was Q1 revenue by segment?”, the agent calls MEASURE(`Total Revenue`) against the metric view, and the answer matches what finance reports.
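
The query behind that question might look roughly like the sketch below. It assumes the YAML definition above has been registered as a metric view named prod.gold.sales_metrics (a hypothetical name), and that “Q1” means calendar Q1 of 2026; neither detail comes from a real environment.

```sql
-- Q1 revenue by customer segment, using the governed measure
-- rather than a hand-written SUM over the fact table.
-- Names containing spaces must be quoted with backticks.
SELECT `Customer Segment`,
       MEASURE(`Total Revenue`) AS total_revenue
FROM prod.gold.sales_metrics        -- hypothetical metric view name
WHERE `Order Date` >= DATE'2026-01-01'
  AND `Order Date` <  DATE'2026-04-01'
GROUP BY `Customer Segment`
ORDER BY total_revenue DESC;
```

The point is what the query does not contain: no join logic and no aggregation formula. Both live in the metric view, so every consumer inherits them.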

3. Governed AI Access

The right data, exposed the right way, to the right agent, under the right identity. Agent Bricks (now GA) enforces this through on-behalf-of token passing: agents inherit the user identity, so they can only access what the requesting user is authorized to see. The same row filters, column masks, and ABAC policies that protect a SQL analyst extend to every agent interaction.

The agent is not a privileged service account. It is a delegated extension of the user. Cut that connection, and your governance posture collapses the moment an agent goes to production.
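
To make that concrete, here is a sketch of the kind of column mask and row filter an agent would inherit along with the user's identity. The function names and the groups pii_readers and enterprise_sales are hypothetical; the pattern, not the specifics, is what matters.

```sql
-- Column mask: only members of an assumed 'pii_readers' group
-- see the raw value; everyone else gets a redacted string.
CREATE OR REPLACE FUNCTION prod.gold.mask_ssn(ssn STRING)
RETURNS STRING
RETURN CASE
  WHEN is_account_group_member('pii_readers') THEN ssn
  ELSE '***-**-****'
END;

ALTER TABLE prod.gold.dim_customers
  ALTER COLUMN ssn SET MASK prod.gold.mask_ssn;

-- Row filter: rows in the 'Enterprise' segment are visible only to
-- an assumed 'enterprise_sales' group. IS DISTINCT FROM keeps rows
-- with a NULL segment visible instead of silently dropping them.
CREATE OR REPLACE FUNCTION prod.gold.segment_filter(segment STRING)
RETURNS BOOLEAN
RETURN is_account_group_member('enterprise_sales')
    OR segment IS DISTINCT FROM 'Enterprise';

ALTER TABLE prod.gold.dim_customers
  SET ROW FILTER prod.gold.segment_filter ON (segment);
```

Because the functions evaluate group membership for whoever is running the query, an agent acting on behalf of a user sees exactly what that user would see in a SQL editor.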

The Architect’s Role: Where We Make the Difference

Platform capability is not the same as platform readiness. Databricks gives you the tools. Architects turn those tools into a context that an agent can actually use.

In practice, on every engagement I am doing the same things:

  • Modernizing legacy estates with modeling discipline at the front, not the end. No lift-and-shifting tables nobody understands.
  • Designing the medallion architecture so the gold layer is genuinely consumable. Gold belongs to the business, and now to the agents that act on its behalf.
  • Standing up the semantic layer. Metric views, governed tags, and Genie Space instructions are first-class deliverables alongside the pipelines, not a “phase two.”
  • Aligning architecture with business outcomes. Modeling decisions get made with finance, analytics, and compliance in the same room, because those are the people whose questions the agents will eventually answer.

The architect is the one who translates raw platform capability into the contextual map an AI can navigate.

Common Failure Points

I have seen the same patterns derail agent projects across financial services, healthcare, and retail engagements:

  • Legacy schemas moved “as-is.” Shaky foundations in a faster engine.
  • Cryptic naming and undocumented tables. Bad for analysts. Catastrophic for agents.
  • Inconsistent definitions across domains. Without a metric view to arbitrate, the agent picks one – or invents a fourth.
  • Weak metadata and missing semantic tags. The agent has nothing to ground itself on.
  • Governed access without a business context. Permissions are correct, but the agent still does not know which of the seven customer tables to use.
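
Several of these failure points can be detected before an agent ever runs. The sketch below audits the prod.gold schema used in this article for undocumented columns and for tables missing a domain tag; the choice of 'domain' as the required tag is an assumption carried over from the earlier tagging example.

```sql
-- Columns in the gold schema that have no comment at all.
SELECT table_name,
       column_name
FROM system.information_schema.columns
WHERE table_catalog = 'prod'
  AND table_schema = 'gold'
  AND (comment IS NULL OR trim(comment) = '')
ORDER BY table_name, ordinal_position;

-- Gold tables that carry no 'domain' tag.
SELECT t.table_name
FROM system.information_schema.tables AS t
LEFT ANTI JOIN system.information_schema.table_tags AS tt
  ON  tt.catalog_name = t.table_catalog
  AND tt.schema_name  = t.table_schema
  AND tt.table_name   = t.table_name
  AND tt.tag_name     = 'domain'
WHERE t.table_catalog = 'prod'
  AND t.table_schema = 'gold'
ORDER BY t.table_name;
```

An empty result from both queries does not prove the model is good, but a long one is a reliable sign that the agent will be guessing.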

AI problems on Databricks almost always start as architecture problems. The agent is the symptom. The model is the cause.

Final Words: The Lakehouse as Context Layer for AI

The future of AI on Databricks will not be defined only by better models. It will be defined by whether architects build lakehouses that AI can actually understand.

Mosaic AI, Agent Bricks, Genie, and Genie Code are powerful capabilities. They are also unforgiving. They expose every modeling shortcut, every undocumented column, every inconsistent definition you ever let slide. The teams that win the next wave of AI projects are not the ones with the cleverest prompts. They are the ones who treated Unity Catalog as the semantic foundation it was designed to be.

Is your Lakehouse ready for agents, or just ready for analysts?

Contact us for an Agent Readiness Assessment of your current Databricks environment. Let us help you build the foundation that turns Mosaic AI from a science experiment into a production capability.
