Open-source curation · Python-first · in Spanish & English

The catalogue

Nº09 · Storage

Ceph

Distributed storage at production scale: objects, blocks and files.

Storage · Intermediate · Data Engineer

What is it?

Ceph is an open-source distributed storage platform built for production scale. Over a single cluster it offers three interfaces: object (S3-compatible), block (disks for machines/containers), and filesystem. It replicates data across nodes, self-heals on disk failures, and scales to petabytes.
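Replication has a direct cost in capacity, which is worth seeing in numbers. The sketch below is a back-of-the-envelope calculation, not a sizing tool. It assumes a replicated pool that keeps 3 copies of every object (Ceph's default for replicated pools) and ignores the free headroom a real cluster needs. The function name and the 1200 TB figure are illustrative.

```python
def usable_capacity_tb(raw_tb: float, replicas: int = 3) -> float:
    """Rough usable capacity of a replicated pool.

    Every object is stored `replicas` times, so usable space is the raw
    space divided by the number of copies. Headroom for recovery and
    near-full thresholds is deliberately ignored here.
    """
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return raw_tb / replicas


if __name__ == "__main__":
    raw = 1200.0  # hypothetical cluster: 10 nodes x 12 disks x 10 TB each
    print(usable_capacity_tb(raw))              # 400.0 with 3 copies
    print(usable_capacity_tb(raw, replicas=2))  # 600.0 with 2 copies
```

In other words, with three copies roughly a third of the raw disk space is available for data, which is part of what "production scale" costs.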

It is the choice when storage itself is the serious problem: a private cloud, large on-premise infrastructure, persistent storage for Kubernetes. That power comes at a cost: operating it is complex and usually requires a dedicated team.

What is it for?

  • Private storage cloud. A single platform for object, block, and files in your own datacenter.
  • On-premise lake at scale. Its RADOS Gateway (RGW) exposes an S3-compatible API, so your lake (Parquet, Iceberg) can live on Ceph much as it would on S3.
  • Persistent storage for Kubernetes. With Rook, Ceph provides volumes for containerized workloads.
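Because the gateway speaks the S3 API, a standard S3 client can usually be pointed at it by overriding the endpoint. The following is a minimal sketch using boto3 (a third-party package), not an official recipe. The endpoint URL, the credentials, the region placeholder and both function names are hypothetical; use the values your own gateway gives you.

```python
def rgw_client_kwargs(endpoint_url: str, access_key: str, secret_key: str,
                      region_name: str = "us-east-1") -> dict:
    """Build keyword arguments for boto3.client("s3", ...).

    `endpoint_url` redirects the client from AWS to your own gateway.
    `region_name` is a placeholder used for request signing; adjust it
    if your gateway is configured to expect something else.
    """
    return {
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region_name,
    }


def list_bucket_names(endpoint_url: str, access_key: str, secret_key: str) -> list:
    """List bucket names through the S3 API of the given endpoint."""
    import boto3  # third-party: pip install boto3

    s3 = boto3.client("s3", **rgw_client_kwargs(endpoint_url, access_key, secret_key))
    return [bucket["Name"] for bucket in s3.list_buckets().get("Buckets", [])]


# Example call (hypothetical endpoint and keys, needs a reachable gateway):
# list_bucket_names("http://rgw.example.internal:8080", "ACCESS_KEY", "SECRET_KEY")
```

The same idea carries over to other S3-aware tools: they keep using the S3 protocol and only the endpoint and credentials change.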

When to use it / when not

Use it when the challenge is scale and operations: petabytes, fault tolerance, multi-protocol, your own infrastructure with people to run it. It is the "serious" on-premise storage option.

Think twice (and for a beginner, at least one of these will almost always apply) when:

  • You're learning the object storage concept: MinIO gives it to you in a minute, S3-compatible, with no operational complexity.
  • Your scale is modest or you work locally/single-node: Ceph's overhead isn't justified.
  • You're in the cloud with managed S3/GCS/Azure: you already have storage at scale without operating it yourself.

Get started in 1 minute

Let's be honest: Ceph does not stand up in a minute. It is a cluster system that you deploy and operate carefully. There are two paths, depending on your goal:

  • To understand the concept of object storage (buckets, S3 API), start with MinIO: one container and you're set.
  • For a real Ceph trial, the official path is cephadm on a dedicated host (not your work laptop):
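# On a dedicated Linux test host, not your main machine.
# cephadm needs Python 3 and a container runtime (Podman or Docker).
# Download the standalone cephadm script (this URL pins release 18.2.0 for EL9):
curl -sLO https://download.ceph.com/rpm-18.2.0/el9/noarch/cephadm
# Bootstrap a one-node cluster; replace <HOST-IP> with this host's own IP address.
sudo python3 cephadm bootstrap --mon-ip <HOST-IP>
# This brings up a minimal cluster plus the web dashboard; from there you add disks (OSDs).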
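
Once the cluster is up, a natural first check is its overall health. The sketch below reads the JSON form of the status report, assuming you capture it with something like `sudo cephadm shell -- ceph status --format json`. The sample string is hand-written and trimmed to the single field used here; it was not captured from a real cluster, and the function names are illustrative.

```python
import json


def health_status(status_json: str) -> str:
    """Return the overall health string from `ceph status --format json` output.

    Ceph reports one of HEALTH_OK, HEALTH_WARN or HEALTH_ERR.
    """
    return json.loads(status_json)["health"]["status"]


def is_healthy(status_json: str) -> bool:
    """True only when the cluster reports HEALTH_OK."""
    return health_status(status_json) == "HEALTH_OK"


if __name__ == "__main__":
    # Hand-written sample, reduced to the one field this sketch reads.
    sample = '{"health": {"status": "HEALTH_WARN"}}'
    print(health_status(sample))  # HEALTH_WARN
    print(is_healthy(sample))     # False
```

A warning state is not unusual on a freshly bootstrapped test cluster that has no disks (OSDs) yet, so treat the check as a starting point for investigation rather than a pass/fail gate.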

The full deployment and operations guide is in the official documentation.

Official documentation

The source of truth lives there. Here we orient you; the depth is up to you.

Nº09 · Updated 2026-06-25