Nº21 · Governance
OpenMetadata
The open catalog to discover and trace the lineage of your data.
What is it?
OpenMetadata is an open-source data catalog platform that centralizes discovery and lineage of all data assets in an organization. Think of it as the "map" of your data ecosystem: where each table lives, how data flows through the system, and who owns each asset.
What is it for?
- Data discovery: indexes tables, dashboards, pipelines, and ML models from dozens of connectors (Snowflake, BigQuery, dbt, Airflow, Superset…) and surfaces them through semantic search with filters by owner, tag, or domain.
- End-to-end lineage: automatically traces the source → transformation → consumption chain across pipelines, tables, and dashboards, making it straightforward to assess the impact of upstream changes.
- Data quality: define and run quality tests directly on tables (uniqueness, nulls, value ranges) and assign owners per asset, so you know who to ask when something changes.
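To make the impact-assessment idea concrete, here is a minimal sketch in plain Python. It does not use the OpenMetadata API: the `LINEAGE` dictionary and its asset names are hypothetical, and the traversal only illustrates how a lineage graph answers "what breaks downstream if this table changes?".

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the assets listed in its value.
LINEAGE = {
    "raw.orders": ["staging.orders_clean"],
    "staging.orders_clean": ["marts.daily_revenue", "marts.customer_ltv"],
    "marts.daily_revenue": ["dashboard.revenue_overview"],
    "marts.customer_ltv": [],
    "dashboard.revenue_overview": [],
}


def downstream_assets(lineage, start):
    """Return every asset reachable downstream of `start`, breadth-first."""
    seen = {start}
    queue = deque([start])
    impacted = []
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:  # guard against revisiting shared children
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


if __name__ == "__main__":
    for asset in downstream_assets(LINEAGE, "staging.orders_clean"):
        print(asset)
```

A catalog stores these edges for you and keeps them current; the sketch only shows why having them turns impact analysis into a graph walk.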
When to use it / when not
Use it when your organization runs multiple tools (several warehouses, orchestrators, BI platforms) and the data team loses time asking "where is that table?" or "who maintains it?". It is the natural fit if you already use dbt, Airflow, or Trino and want automatic lineage without manual instrumentation.
Think twice if your stack is small — a single database and a two-person team — because the deployment and maintenance overhead of OpenMetadata can outweigh the benefit. In that scenario, a well-maintained schema comment convention or DataHub Lite may be enough. If you only need dbt lineage, the built-in dbt docs site is considerably lighter.
Get started in 1 minute
The fastest path is spinning up the full stack with Docker:
git clone https://github.com/open-metadata/OpenMetadata
cd OpenMetadata/docker/development
docker compose up -d
Within a few minutes, the UI will be available at http://localhost:8585. Default credentials: admin / admin (recent releases expect the full username admin@open-metadata.org).
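Since the containers take a while to start, a small polling helper can tell you when the server is ready. The sketch below uses only the standard library. The `/api/v1/system/version` path is an assumption about the server's REST API, and `fetch_version` and `wait_until_ready` are hypothetical helper names, so check the path against the API reference for your release.

```python
import json
import time
import urllib.request


def fetch_version(base_url):
    """GET the (assumed) version endpoint and return the parsed JSON body."""
    url = f"{base_url}/api/v1/system/version"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_until_ready(base_url, fetch=fetch_version, attempts=30, delay=5.0,
                     sleep=time.sleep):
    """Call `fetch` until it succeeds; raise TimeoutError after `attempts` tries."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch(base_url)
        except OSError as exc:  # URLError and connection errors subclass OSError
            last_error = exc
            sleep(delay)
    raise TimeoutError(f"{base_url} not ready after {attempts} attempts") from last_error


# Usage (needs the Docker stack running):
#   print(wait_until_ready("http://localhost:8585"))
```

The `fetch` and `sleep` parameters are injectable so the retry logic can be exercised without a live server.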
From the interface, navigate to Settings → Services → Add Service to connect your first data source. The connectors guide covers each integration in detail.
# Minimal example: read metadata via the Python SDK
# Install the SDK first (shell command): pip install openmetadata-ingestion
from metadata.generated.schema.entity.data.table import Table
from metadata.generated.schema.entity.services.connections.metadata.openMetadataConnection import (
    AuthProvider,
    OpenMetadataConnection,
)
from metadata.ingestion.ometa.ometa_api import OpenMetadata

# Point the client at the local server and authenticate with a JWT
server_config = OpenMetadataConnection(
    hostPort="http://localhost:8585/api",
    authProvider=AuthProvider.openmetadata,
    securityConfig={"jwtToken": "<your-jwt>"},
)
metadata = OpenMetadata(server_config)

# List indexed tables (list_entities returns one page of results)
tables = metadata.list_entities(entity=Table)
for table in tables.entities:
    # On SDK versions built on pydantic v2, use .root instead of .__root__
    print(table.fullyQualifiedName.__root__)
The full API reference and Python SDK docs are at docs.open-metadata.org.
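Once you have a list of fully qualified names, a little post-processing helps you see how tables are spread across schemas. The sketch below assumes table names follow the service.database.schema.table layout and contain no quoted dots. `group_by_schema` and the sample names are hypothetical, and the function works on plain strings, so it runs without a server.

```python
from collections import defaultdict


def group_by_schema(fqns):
    """Group table names under their service.database.schema prefix."""
    grouped = defaultdict(list)
    for fqn in fqns:
        parts = fqn.split(".")
        if len(parts) != 4:
            # Names with quoted dots or other shapes are skipped in this sketch.
            continue
        service, database, schema, table = parts
        grouped[f"{service}.{database}.{schema}"].append(table)
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical names standing in for the output of the SDK loop above.
    sample = [
        "warehouse.analytics.sales.orders",
        "warehouse.analytics.sales.customers",
        "warehouse.analytics.finance.invoices",
    ]
    for schema, tables in group_by_schema(sample).items():
        print(schema, tables)
```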
Official documentation
The source of truth lives there. Here we orient you; the depth is up to you.
Open official docs ↗
Nº21 · Updated 2026-06-08