Open-source curation · Python-first · in Spanish & English

Nº13 · Infrastructure

Docker

Package any stack tool into a reproducible container.

Infrastructure · Intro · Data Engineer

What is it?

Docker is a container platform: it packages an application together with all its dependencies (libraries, binaries, configuration) into a portable unit that runs the same way on any machine with a container runtime. Unlike a virtual machine, a container shares the host operating system's kernel, which makes it far lighter and faster to start.

In data work, this solves the classic "works on my machine" problem: if the container runs locally, it runs the same way in production.

What is it used for?

  • Spin up stack tools instantly. PostgreSQL, Airflow, Superset, Trino, and virtually any tool in the ecosystem have ready-made images on Docker Hub, published either as Docker Official Images or by the projects themselves. A single command gets them running, without installing anything directly on your system.
  • Reproducible environments. A Dockerfile can pin the base image and the exact version of every dependency your pipeline needs, so any team member can rebuild the same environment in minutes.
  • Consistent deployment. The same image you tested locally is what gets deployed to the server, eliminating discrepancies between development and production.
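
An image is only as reproducible as the versions actually written down. As a minimal sketch, assuming the image installs its Python dependencies from a pip-style requirements file (this page does not say how dependencies are declared), the check below lists entries that lack an exact == pin. The function name and the sample package list are hypothetical.

```python
def unpinned_requirements(lines):
    """Return requirement lines that lack an exact '==' version pin."""
    unpinned = []
    for raw in lines:
        # Drop inline comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r base.txt".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Hypothetical requirements for a small pipeline image.
example = [
    "pandas==2.2.2",
    "sqlalchemy>=2.0   # a range, not a pin",
    "requests",
    "",
    "# comment only",
]
print(unpinned_requirements(example))  # prints ['sqlalchemy>=2.0', 'requests']
```

Running a check like this before building is one option. The sketch splits on # to strip comments, so it would mishandle URL requirements that contain a fragment.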

When to use it / when not to?

Use it when:

  • You need to try or integrate a data stack tool (databases, orchestrators, data catalogs) without cluttering your system.
  • You want the development environment to be reproducible across the entire team.
  • You are deploying a data service and need predictable behavior regardless of the host.

Think twice if:

  • You only have a trivial Python script: a virtual environment (uv, venv) is simpler and sufficient.
  • You need to orchestrate workloads at scale across multiple nodes; that is what Kubernetes is for. Docker is the packaging unit, not a large-scale production orchestrator.

Up and running in 1 minute

Start a PostgreSQL instance ready for local testing:

docker run --rm \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  postgres:16

Connect with any client at localhost:5432, user postgres, password secret. Stop the container with Ctrl+C and Docker removes it, along with any data it stored, because of --rm.
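
PostgreSQL needs a few seconds after the container starts before it accepts connections, so a script that connects immediately may fail. The following is a rough sketch of one way to wait from Python, not the only approach. It assumes the container was started with an extra --name pg-local flag (a hypothetical name, not part of the command above) and relies on the pg_isready utility that ships with PostgreSQL. The helper names wait_until and postgres_is_ready are made up for illustration.

```python
import subprocess
import time


def wait_until(check, timeout=30.0, interval=0.5):
    """Poll check() until it returns True; return False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def postgres_is_ready(container="pg-local"):
    """Run pg_isready inside the container; exit code 0 means it accepts connections.

    "pg-local" is a hypothetical name: it assumes `--name pg-local` was added
    to the docker run command. The check goes over TCP (-h 127.0.0.1) rather
    than the default Unix socket, which is intended to avoid a false "ready"
    while the image is still running its first-start initialization.
    """
    result = subprocess.run(
        ["docker", "exec", container, "pg_isready", "-h", "127.0.0.1", "-U", "postgres"],
        capture_output=True,
    )
    return result.returncode == 0
```

Calling wait_until(postgres_is_ready) right after starting the container returns True once the server answers, or False if 30 seconds pass first. Keeping the polling loop separate from the Docker call means the same loop can wait on any other service check.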

To browse data stack images: hub.docker.com
Official docs: docs.docker.com

Official documentation

The source of truth lives at docs.docker.com. Here we orient you; the depth is up to you.

Nº13 · Updated 2026-06-08