Governance Atlas: Unity Catalog, Genie & Lakebase On Databricks

May 5, 2026

Governance Atlas: Databricks-Native Data Governance with Unity Catalog, Genie, and Lakebase

Every serious governance project eventually reaches the same uncomfortable moment: the platform has the metadata, but the organization still does not have a product. There is a catalog. There are tags. There are comments, owners, lineage events, audit rows, dashboards, policies, and a dozen local rituals around who is allowed to change what. Yet when a steward asks, “Can I safely change this field?”, the answer still arrives as a meeting, a spreadsheet, and a prayer.

Conceptual hero image for Entrada Governance Atlas representing Databricks-native data governance with Unity Catalog, Genie, and Lakebase - a glowing shield and lock over a circuit board symbolizing protected, governed metadata.

Author

Skyler Myers

That gap is where Governance Atlas started to become interesting.

In the original write-up on the Databricks governance experience, we introduced the thesis: Databricks can host more than metadata screens. With Unity Catalog, system tables, Databricks SQL, Delta, Databricks Apps, and bundles, it can host an internal metadata product in the same control plane where the data already lives. That first version proved the architecture was plausible.

The newer version had a harder job. It had to survive contact with real workspace behavior: cold warehouses, partial lineage, permissions that change the answer, empty-but-not-really-empty inventories, governance state that must be mutable but auditable, and AI that cannot be allowed to sound confident when the evidence is thin.

That is the technical lesson behind Governance Atlas: a metadata product is not a prettier catalog. It is a control plane for decisions.

What Changed In A Short Time

Area	Change
Presentation layer	The early accelerator shell has moved into a compiled React experience served by the Databricks App backend. The app now behaves like a product surface rather than a notebook-era utility.
Runtime contract	The supported path is explicit: `app.yaml` to `run_app.py` to `runtime_app.py` to a built frontend bundle. That matters because installability and promotion depend on boring, repeatable packaging.
Truth model	Non-authoritative payloads, mocked evidence, empty capability claims, and unscoped fallback data are rejected or rendered as unavailable rather than turned into product-looking numbers.
Lineage workflow	Lineage is no longer only a graph. The selected asset and selected column feed an impact inspector that frames downstream consumers, column paths, jobs, dashboards, quality state, owners, approvals, and evidence boundaries.
Genie integration	Atlas AI is backed by a curated Databricks Genie space over governed metadata views, rather than a detached chat box over arbitrary application text.
Lakebase integration	Operational current-state tables can be mirrored into Lakebase while historical and analytical evidence stays in Delta and Unity Catalog.

The Product Shift

The most important change since the original post is not a single feature. It is a change in posture.

The first version asked: can we put a governance experience inside Databricks? The current version asks: can that experience make the next governance decision easier, safer, and faster than the native surfaces alone?

That moves the center of gravity away from browsing metadata and toward operating on it. Discovery is not just search. It is permission-aware search with governed trust signals. Stewardship is not a static task list. It is the place where control-plane work becomes visible. Lineage is not a topology poster. It is the beginning of an impact brief.

A Databricks-Native Metadata Product Has Different Physics

A metadata app built outside the platform can be architecturally elegant and still lose the trust of engineers. The moment it has a different owner model, a different lineage interpretation, a different tag source, or a different permission story, users start treating it as another opinion.

Governance Atlas takes the opposite bet. The physical metadata plane remains Unity Catalog. Runtime lineage comes from Databricks lineage capture and the lineage system tables. Queries run through a SQL warehouse. The application surface is a Databricks App. Mutable governance state lives inside the Databricks boundary. Deployment is expressed in bundle configuration.

That does not mean every signal is magically available. In fact, the product has to be more careful because it is native. Unity Catalog lineage is permissioned. System lineage tables represent captured events, not a universal graph of every possible data relationship. Column lineage is powerful, but it has known limitations when transformations are opaque, path-based, or outside supported capture patterns. Quality and policy signals must be backed by real runs or real control records. If the source is not there, the UI has to say so.
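
One way to make that rule concrete is to wrap every displayed value in a small envelope that carries its source and availability. The sketch below is illustrative only: `Signal`, `Availability`, and `render_signal` are hypothetical names chosen for this example, not Atlas internals.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Availability(Enum):
    AVAILABLE = "available"
    TRUNCATED = "truncated"      # the source answered, but the result was cut off
    UNAVAILABLE = "unavailable"  # no backing source, or it could not be read


@dataclass(frozen=True)
class Signal:
    """A displayed value plus the evidence that backs it (hypothetical shape)."""
    name: str
    value: Optional[float]
    source: Optional[str]  # for example "system.access.table_lineage"
    state: Availability


def render_signal(signal: Signal) -> str:
    """Return display text; never show a number that has no named source."""
    missing = signal.value is None or not signal.source
    if signal.state is Availability.UNAVAILABLE or missing:
        return f"{signal.name}: unavailable"
    suffix = " (truncated)" if signal.state is Availability.TRUNCATED else ""
    return f"{signal.name}: {signal.value:g}{suffix} [source: {signal.source}]"
```

The point of the design is that a value with no named source can never reach the screen as a number, even if some upstream code marked it as available.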

Discovery Is Where Trust Starts

The discovery surface looks simple, but it encodes one of the hardest rules in governance UX: do not make the user guess whether an empty result means “no data exists,” “you cannot see it,” “the warehouse is still warming,” or “your filters excluded everything.” Those are different states, and collapsing them into the same blank table is how metadata products lose credibility.
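
A minimal sketch of that distinction, assuming the backend can report warehouse readiness and match counts before and after permission and filter checks. The function name, its parameters, and the message wording are hypothetical.

```python
from enum import Enum
from typing import Optional


class EmptyReason(Enum):
    WAREHOUSE_WARMING = "The SQL warehouse is still starting; results are not ready."
    NOT_VISIBLE = "Matching assets exist, but you do not have permission to see them."
    FILTERED_OUT = "Your filters excluded every visible asset."
    NO_DATA = "No matching assets exist."


def classify_empty_result(
    warehouse_ready: bool,
    total_matches: int,
    visible_matches: int,
    visible_after_filters: int,
) -> Optional[EmptyReason]:
    """Explain why a result list is empty; return None when it is not empty."""
    if not warehouse_ready:
        # Nothing can be concluded about the data until the query has run.
        return EmptyReason.WAREHOUSE_WARMING
    if visible_after_filters > 0:
        return None
    if total_matches == 0:
        return EmptyReason.NO_DATA
    if visible_matches == 0:
        return EmptyReason.NOT_VISIBLE
    return EmptyReason.FILTERED_OUT
```

Whether the `NOT_VISIBLE` message may reveal that hidden matches exist is a policy decision; a stricter deployment could fold it into a generic "nothing visible to you" message while still keeping it separate from the warming and filtered states.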

Governance Atlas treats search as an actor-visible, governed workflow. Results are ranked with available trust signals, filtered through live visibility, and paired with adjacent governance views and Atlas AI recommendations when the evidence source is available. The UI is not trying to entertain the user. It is trying to keep them from making a bad assumption.

Entrada Governance Atlas discovery interface on Databricks: governed search results with trust scores, certification filters, Unity Catalog asset metadata, and Genie-powered Atlas AI recommendations for stewardship workflows. — *Discovery is search first, permission aware, and paired with evidence backed Atlas AI recommendations when Genie returns usable rows.*

Stewardship Is The Test Of Whether Governance Is Real

A catalog without work is a library. Useful, but passive. Governance becomes operational when someone can look at an asset, see what is wrong, understand who owns the next move, and leave an auditable trail of the decision.

The workbench in Atlas is deliberately grounded in control-plane rows. It can show governance requests, affected assets, trigger/source fields, comments, resolution paths, and the exact places where SLA evidence is still unavailable. That last part matters. A mature product does not fill missing SLA timers with invented urgency just to make the screen feel alive.

`Entrada Governance Atlas stewardship workbench on Databricks showing 20 open governance work items with Unity Catalog affected assets, trigger sources, audit evidence, and resolution paths backed by Lakebase operational state.` — *The stewardship workbench turns governance state into visible work: requests, affected assets, routing context, comments, and resolution paths.*

The Glossary Is Not A Dictionary

Glossary work is often treated as documentation. In practice, it is a reconciliation problem. Business terms need owners, review state, relationships, linked assets, and lineage context. They need to survive the difference between what the business says and what the platform can prove.

The current Glossary and CDE surfaces are moving toward that model: term hierarchy, review status, source records, asset associations, and CDE-aware browsing. The important design choice is that glossary terms do not float above the lakehouse. They attach back to source-of-record assets and line up with the same governance evidence used elsewhere in the app.

Entrada Governance Atlas glossary and Critical Data Element registry on Databricks showing business term hierarchy, review status, ownership, and associations linked to Unity Catalog source-of-record assets with governance evidence. — *Glossary and CDE surfaces keep business meaning attached to source-of-record assets instead of floating in a separate documentation layer.*

Genie Is Not The Button. Genie Is The Contract.

The easiest way to misuse AI in a governance product is to attach a chat widget to a complex metadata estate and hope the answers sound useful. That is not what technical users need, and it is not what technical reviewers respect.

The stronger pattern is to curate the semantic room first. Governance Atlas builds a Databricks Genie space around views that are already shaped for governance questions: current assets, governance work, glossary state, quality evidence, audit events, and lineage edges. The prompt is not “ask the app anything.” The prompt is “reason over the governed views we are willing to stand behind.”

That distinction changes the architecture. Genie is not a decorative assistant. It becomes the conversational layer over a governed metadata model. When the evidence is missing, Atlas AI should degrade to an explicit unavailable state instead of improvising an answer. When it answers, the answer should be tied to rows, assets, and returned evidence.
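
The gating rule can be sketched in a few lines. This is an assumption-laden illustration: `AssistantResult` is a simplified, hypothetical shape and not the Genie API response, and `GOVERNED_VIEWS` holds example view names only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AssistantResult:
    """Hypothetical, simplified result shape; not the Genie API response."""
    text: Optional[str]
    rows: List[dict] = field(default_factory=list)
    source_views: List[str] = field(default_factory=list)


# Views the product is willing to stand behind (example names only).
GOVERNED_VIEWS = {"atlas_ai.atlas_ai_lineage_edges"}


def to_recommendation(result: AssistantResult) -> dict:
    """Surface an answer only when it is tied to rows from governed views."""
    grounded = (
        bool(result.text)
        and bool(result.rows)
        and bool(result.source_views)
        and set(result.source_views) <= GOVERNED_VIEWS
    )
    if not grounded:
        # Degrade to an explicit state instead of passing along fluent text.
        return {"state": "unavailable", "text": None, "evidence": []}
    return {
        "state": "available",
        "text": result.text,
        "evidence": result.rows,
        "sources": sorted(result.source_views),
        "origin": "Genie",
    }
```

An answer that arrives with text but no rows, or with rows from a view outside the governed set, is treated the same as no answer at all.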

Lakebase Makes The Control Plane More Interesting

Delta remains the right place for historical evidence, lineage-adjacent audit trails, quality runs, and analytical projections. But not every piece of application state wants to behave like an append-heavy analytical table. Governance products also have current operational state: notifications, receipts, identity entries, threads, tasks, custom property assignments, and review queues.

Lakebase gives Databricks Apps a native Postgres-backed resource for that kind of state. In Atlas, the current pattern is conservative: Delta remains authoritative for the governance store while selected operational tables can be shadow-written into Lakebase. That lets us explore lower-latency app workflows without moving the entire governance evidence model into an external service.

The result is a useful split: Lakebase for mutable operational current state, Delta and Unity Catalog for durable governance evidence, history, and analytics. That is a much better fit than pretending one storage model is perfect for every part of the product.

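# Excerpt from the bundle configuration: the resources attached to the app.
# Other app-level fields are omitted here.
resources:
  apps:
    atlas:
      resources:
        # SQL warehouse that runs the metadata queries
        - name: sql-warehouse
          sql_warehouse:
            id: ${var.warehouse_id}
            permission: CAN_USE
        # Curated Genie space behind Atlas AI
        - name: atlas-genie-space
          genie_space:
            name: Governance Atlas Metadata Room
            space_id: ${var.genie_space_id}
            permission: CAN_RUN
        # Lakebase (Postgres) database for mutable operational state
        - name: atlas-lakebase
          postgres:
            branch: ${var.lakebase_branch}
            database: ${var.lakebase_database_resource}
            permission: CAN_CONNECT_AND_CREATE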
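
The shadow-write split described above can be sketched as a thin wrapper: write to the authoritative store first, then mirror selected tables on a best-effort basis. This is a minimal sketch under stated assumptions, not the Atlas implementation. `ShadowWriter` and the two `put` callables are hypothetical stand-ins for a Delta write and a Lakebase write.

```python
import logging
from typing import Callable, Dict, Set

logger = logging.getLogger("atlas.shadow")

Row = Dict[str, object]
PutFn = Callable[[str, str, Row], None]  # (table, key, row) -> None


class ShadowWriter:
    """Authoritative write first; mirror selected tables best-effort."""

    def __init__(self, authoritative_put: PutFn, shadow_put: PutFn,
                 shadow_tables: Set[str]) -> None:
        self._authoritative_put = authoritative_put
        self._shadow_put = shadow_put
        self._shadow_tables = set(shadow_tables)

    def upsert(self, table: str, key: str, row: Row) -> bool:
        """Write the row; return True only if it was also mirrored."""
        # A failure here propagates: the authoritative store defines success.
        self._authoritative_put(table, key, row)
        if table not in self._shadow_tables:
            return False
        try:
            self._shadow_put(table, key, row)
        except Exception:
            # The mirror is an optimization, so its failure must not fail
            # the operation. It is logged so drift can be reconciled later.
            logger.warning("shadow write failed for %s/%s", table, key)
            return False
        return True
```

Because the mirror can silently fall behind under this design, a real deployment would also need some reconciliation path from the authoritative store; that part is out of scope for the sketch.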

Lineage Became A Decision Packet

Lineage is the page where product ambition is easiest to fake and hardest to earn. Anyone can draw nodes and edges. The useful question is what happens after a user clicks a column.

The current Lineage Atlas moves toward the workflow that engineers, stewards, and executives actually need: if this table or column changes, what breaks, who owns it, what controls are affected, which approvals matter, and what evidence can I export? The graph is still there, but it is not the whole product. The selected asset feeds an impact inspector. Selected columns trace direct upstream and downstream paths when Unity Catalog returns them. Missing transformation SQL is called out rather than invented.

That is the right kind of boring. In governance, boring often means trustworthy.
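
At its core, the "what breaks if this changes" question is a bounded graph walk over captured lineage edges. The sketch below assumes edges arrive as `(source, target)` pairs of fully qualified names; `downstream_impact` and its `max_depth` parameter are hypothetical names for illustration.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Optional, Tuple

Edge = Tuple[Optional[str], Optional[str]]


def downstream_impact(
    edges: Iterable[Edge],
    start: str,
    max_depth: int = 5,
) -> Tuple[Dict[str, int], bool]:
    """Breadth-first walk of (source, target) edges from `start`.

    Returns ({asset: hops_from_start}, truncated). `truncated` is True when
    the walk stopped at `max_depth` with further assets left unexplored.
    """
    children = defaultdict(set)
    for source, target in edges:
        if source and target:  # skip events with a missing endpoint
            children[source].add(target)

    depths: Dict[str, int] = {}
    truncated = False
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        for child in sorted(children.get(node, ())):
            if child in seen:
                continue  # already reached by a shorter or equal path
            if depth + 1 > max_depth:
                truncated = True  # report the boundary instead of hiding it
                continue
            seen.add(child)
            depths[child] = depth + 1
            queue.append((child, depth + 1))
    return depths, truncated
```

Returning the `truncated` flag alongside the result lets the inspector say that the walk was cut off, rather than presenting a partial list as the full blast radius.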

Entrada Governance Atlas lineage view on Databricks showing permission-aware Unity Catalog impact analysis with downstream consumers, column paths, jobs, owners, approvals, and governance evidence boundaries in the Impact Inspector decision packet. — *Lineage is most useful when it becomes an impact brief: downstream consumers, column paths, jobs, quality, owners, approvals, and evidence boundaries.*

Entrada Governance Atlas column lineage view on Databricks showing first-class Unity Catalog evidence with upstream and downstream column paths, and explicit unavailable states for missing transformation SQL in the governance Impact Inspector. — *Column lineage is treated as first class evidence. Missing transformation SQL stays explicitly unavailable rather than fabricated.*

A Small But Important SQL Pattern

One of the useful implementation patterns is to make Genie and product surfaces consume views that already encode the governance contract. For example, lineage edges are projected into a curated view instead of giving the AI layer a raw table and asking it to infer product semantics every time.

This kind of view does not expose proprietary business logic, but it shows the discipline: normalize names, filter to actor-visible assets where possible, preserve availability, and keep the source table obvious.

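-- Curated projection of Unity Catalog table lineage for Genie and product surfaces.
CREATE OR REPLACE VIEW atlas_ai.atlas_ai_lineage_edges AS
SELECT
  -- Normalize endpoint names to one vocabulary used across the app
  source_table_full_name AS source_asset_fqn,
  target_table_full_name AS target_asset_fqn,
  COALESCE(source_type, entity_type, 'uc_lineage') AS lineage_source,
  event_time,
  -- Every row here comes from a captured lineage event
  'available' AS availability_state
FROM system.access.table_lineage
-- Keeps any event with at least one named endpoint, so rows with only a
-- source or only a target are included.
WHERE source_table_full_name IS NOT NULL
   OR target_table_full_name IS NOT NULL;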

What Databricks Gives You, What OpenMetadata Teaches You, And Where Atlas Fits

Databricks gives you the most important thing: the platform truth. Unity Catalog, system tables, SQL warehouses, Databricks Apps, Genie, Lakebase, and bundles are not sidecar infrastructure. They are the operating environment.

OpenMetadata teaches a different lesson: workflow depth matters. Engineers and governance teams need discovery, entity pages, lineage, glossary, conversations, tasks, ownership, quality, and stewardship to feel like one product. A native platform screen can be accurate and still not be the place where governance work happens.

Governance Atlas is the attempt to combine those strengths: Databricks-native truth with the workflow depth of a metadata product. It should not try to out-Catalog-Explorer Catalog Explorer, and it should not clone OpenMetadata chrome. The valuable middle ground is a Databricks App that turns platform metadata into operating decisions.

The Hard Part Is Being Honest

The surprising engineering lesson is that the product gets more compelling when it refuses to lie. If quality evidence is unavailable, say unavailable. If lineage is truncated, say truncated. If a recommendation came from Genie, say Genie. If a row is backed by Unity Catalog, say Unity Catalog. If a button cannot safely mutate state, disable it or explain the unavailable state.

Experienced Databricks engineers notice this immediately. They know when a governance screen is pretending. They also know that a product willing to show its evidence boundaries is more useful than a polished dashboard full of unverifiable counters.
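
The "disable it or explain" rule for mutating controls can be expressed as a small guard that always returns a reason when it says no. This is a hypothetical sketch: `ActionState`, `approve_action_state`, and the three inputs are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionState:
    enabled: bool
    reason: Optional[str] = None  # shown next to the disabled control


def approve_action_state(
    can_modify: bool,
    store_writable: bool,
    evidence_state: str,
) -> ActionState:
    """Decide whether an 'approve change' control may be offered."""
    if not can_modify:
        return ActionState(False, "You do not have permission to approve this request.")
    if not store_writable:
        return ActionState(False, "The governance store is not writable right now.")
    if evidence_state != "available":
        # Approving on top of missing or truncated evidence is not safe.
        return ActionState(False, f"Impact evidence is {evidence_state}; approval is disabled.")
    return ActionState(True)
```

Keeping the reason in the return value means the UI never has to guess why a control is greyed out.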

Where This Goes Next

The near future is obvious and difficult in the best way: precomputed metadata snapshots to reduce cold query latency, richer quality run execution, better background work, governed notification fanout, stronger approval workflows, more complete policy evidence, and deeper integrations with BI and service management systems.

The Genie path also gets more interesting as the governed corpus gets better. A curated room over Unity Catalog assets, governance work, glossary state, quality evidence, audit history, and lineage edges can become more than a recommendation panel. It can become a steward-facing analyst that drafts impact briefs, explains evidence gaps, proposes owner notifications, and helps route approvals without leaving Databricks.

Lakebase opens a parallel path: faster operational state, branchable development workflows, and application-native tables that still live inside the Databricks platform story. The product does not need a new universe. It needs the right state in the right native layer.

Entrada Governance Atlas insights dashboard on Databricks combining Unity Catalog signals, governance maturity scores, metadata coverage, and Genie-powered Atlas AI strategic recommendations with explicit unavailable states for unbacked metrics. — *Governance Insights combines measurable platform signals with strategic recommendations and honest unavailable states where signals are not backed yet.*

Closing

The point of Governance Atlas is not to prove that a Databricks App can render a nicer metadata table.

The point is to show that Databricks can be the place where governance work happens: search, understand, trace, decide, request, approve, and explain, all while staying close to the metadata and permissions that engineers already trust.

That is the difference between a catalog screen and a control plane. It is also the reason this project has become more interesting with every hard edge we have found.