Nº10 · Orchestration
Dagster
Pipeline orchestration centered on the data (assets), not just on tasks.
What is it?
Dagster is a modern, asset-centric data pipeline orchestrator. Instead of thinking only in terms of tasks that run in order, you define in Python the data assets you produce — a table, a model, a dataset — and their dependencies. The execution graph is inferred from those relationships, and with it come types, tests, and observability out of the box.
That shift (from "tasks" to "assets") is what sets it apart from traditional orchestrators: the system knows what data each step produces, not just that something ran.
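To make the idea of an inferred graph concrete, here is a toy sketch in plain Python. It is not Dagster's implementation: the helpers `infer_graph` and `run_in_order` are hypothetical names invented for this illustration, and the only assumption carried over from Dagster is that a function's parameter names identify its upstream assets.

```python
import inspect
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def sales():
    # Source "asset": takes no parameters, so it has no upstream dependencies
    return [{"country": "PE", "amount": 150}, {"country": "CL", "amount": 80}]


def total_by_country(sales):
    # Downstream "asset": the parameter name `sales` marks the dependency
    totals = {}
    for row in sales:
        totals[row["country"]] = totals.get(row["country"], 0) + row["amount"]
    return totals


def infer_graph(functions):
    """Hypothetical helper: map each function name to its parameter names."""
    return {fn.__name__: set(inspect.signature(fn).parameters) for fn in functions}


def run_in_order(functions):
    """Hypothetical helper: run every function after its dependencies."""
    by_name = {fn.__name__: fn for fn in functions}
    results = {}
    for name in TopologicalSorter(infer_graph(functions)).static_order():
        fn = by_name[name]
        args = [results[param] for param in inspect.signature(fn).parameters]
        results[name] = fn(*args)
    return results


# The declaration order does not matter; the order is derived from the graph
graph = infer_graph([total_by_country, sales])
results = run_in_order([total_by_country, sales])
```

Nothing here states the execution order explicitly: it falls out of the declared inputs, which is the property the asset-centric model relies on.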
What is it for?
- Modeling pipelines as a graph of assets. You declare each asset and its dependencies; Dagster builds the DAG and keeps the lineage.
- Local development + UI. You run and debug everything on your machine, and the web interface shows the graph, the runs, and the state of each asset.
- Stack integration. It connects naturally with dbt, Spark, warehouses, and data sources under one observable graph.
- Schedules and sensors. Schedule materializations by time or trigger them on events.
When to use it / when not
Use it when you want a data-aware orchestrator with good developer experience: pipelines with several sources and dependencies between tables, local testing, lineage, and observability without wiring them by hand.
Think twice for:
- A single simple cron (one script once a day): standing up an orchestrator is overkill — a cron is enough.
- Teams already invested in Airflow: Airflow is more traditional and task-centric, with a huge ecosystem of operators, so the choice depends on the team and the use case, not on which is "better" in the abstract.
Get started in 1 minute
Define an asset that produces data and another that consumes it — the heart of Dagster's model.
pip install dagster dagster-webserver
# pipeline.py
from dagster import asset


@asset
def sales():
    # Source asset: produces the data
    return [{"country": "PE", "amount": 150}, {"country": "CL", "amount": 80}]


@asset
def total_by_country(sales):
    # Downstream asset: depends on `sales` via the parameter name
    totals = {}
    for row in sales:
        totals[row["country"]] = totals.get(row["country"], 0) + row["amount"]
    return totals
Launch the UI with dagster dev -f pipeline.py and open http://localhost:3000: you'll see the sales → total_by_country graph and can materialize it with one click.
Official documentation
The source of truth lives there. Here we orient you; the depth is up to you.
Open official docs ↗
Nº10 · Updated 2026-06-26