Nº17 · Analysis

Jupyter

The interactive notebook where data analysis takes shape.

Environment · Intro · Data Scientist · Base / cross-cutting · python

What is it?

Jupyter is an environment for interactive notebooks: documents that mix code cells (usually Python), their output (tables, charts, numbers), and explanatory text written in Markdown. You run the notebook cell by cell and see each result directly below its cell, which makes it the natural place to explore data iteratively.
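To make "a document that mixes cells" concrete, here is a minimal sketch of what a notebook looks like on disk. It assumes the JSON layout of notebook format version 4 (the `.ipynb` format); the cell contents are made up for illustration, and real notebooks carry more metadata than shown here.

```python
import json

# A minimal notebook, assuming the nbformat 4 layout: one Markdown cell, one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Sales by country\n", "A quick look at the totals."],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # stays None until the cell has been run
            "metadata": {},
            "outputs": [],            # filled in by the kernel after execution
            "source": ["1 + 1"],
        },
    ],
}

# Serialize to the text that would be stored in a .ipynb file.
text = json.dumps(notebook, indent=1)

cell_types = [cell["cell_type"] for cell in json.loads(text)["cells"]]
print(cell_types)  # ['markdown', 'code']
```

Because the file is plain JSON, code, outputs, and prose travel together in one document, which is also why notebook diffs in version control tend to be noisy.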

The name is a nod to Julia, Python, and R, the three core languages the project set out to support. Today the ecosystem includes JupyterLab (the modern interface) and JupyterHub (multi-user notebooks on a server).

What is it for?

Data exploration. Load a dataset with pandas, inspect it, plot, and tweak — all in the same document, without restarting anything.
Prototyping and experimentation. Try a model idea, a transformation, or a hypothesis before turning it into production code.
Communicating analysis. A well-crafted notebook tells a story: context in Markdown, the code behind it, and the charts that prove it, in order.
Teaching and learning. It is the standard format for data courses and tutorials thanks to its blend of explanation and runnable code.
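As a rough illustration of the load, inspect, and tweak loop described under data exploration, the sketch below uses a tiny in-memory CSV. The data and column names are hypothetical, and plotting is left out to keep the example dependency-light. In a notebook, each step would normally sit in its own cell.

```python
import io

import pandas as pd

# Hypothetical data: a tiny CSV kept in memory so the sketch runs without any file.
raw = io.StringIO("country,amount\nPE,100\nPE,50\nCL,80\nCL,\n")

# Load.
df = pd.read_csv(raw)

# Inspect: size of the table and how many amounts are missing.
n_rows, n_cols = df.shape
missing = df["amount"].isna().sum()

# Tweak: drop the incomplete row, then summarize per country.
clean = df.dropna(subset=["amount"])
summary = clean.groupby("country")["amount"].agg(["count", "sum", "mean"])
summary  # as the last expression in a cell, this would render as a table
```

Each intermediate variable stays in memory between cells, so you can adjust the last step and re-run it without reloading the data.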

When to use it / when not

Use it for exploratory analysis, prototypes, ad-hoc visualization, and documenting a line of reasoning step by step. It is the Data Scientist's workbench.

Think twice before using it for:

Production code. Reusable logic belongs in versioned, tested .py modules, not a notebook. Refactor what works into scripts.
Scheduled pipelines. To run something on a schedule with retries, use an orchestrator such as Airflow rather than running a notebook by hand.
Strict reproducibility. Hidden state (cells run out of order) is a classic trap; restart the kernel and run top to bottom before trusting the result.
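The hidden-state trap can be simulated outside Jupyter. In the sketch below, each string stands in for one notebook cell and a plain dictionary plays the role of the kernel's memory; the cell contents and the `run` helper are hypothetical, chosen only to show how execution order changes the result.

```python
# Each string stands in for one notebook cell, listed in the order it appears on the page.
cells = {
    "A": "amounts = [100, 50, 80]",
    "B": "threshold = 60",
    "C": "kept = [a for a in amounts if a >= threshold]",
    "D": "threshold = 90",  # a later experiment further down the notebook
}


def run(order):
    """Execute the cells in the given order against a fresh 'kernel' namespace."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["kept"]


clean_run = run(["A", "B", "C", "D"])  # restart, then run top to bottom
stale_run = run(["A", "B", "D", "C"])  # cell C re-run after cell D

print(clean_run)  # [100, 80]
print(stale_run)  # [100]
```

Both runs use exactly the same cells, yet the second result depends on a value set further down the page. Restarting the kernel and running top to bottom is what guarantees the first result.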

Get started in 1 minute

Install JupyterLab and open it in the browser:

pip install jupyterlab
jupyter lab        # opens the interface in the browser, by default at http://localhost:8888

Create a new notebook and, in a cell, try the typical exploration flow:

import pandas as pd

df = pd.DataFrame({"country": ["PE", "PE", "CL"], "amount": [100, 50, 80]})
df.groupby("country")["amount"].sum()   # the output appears under the cell
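Once a cell like the one above proves useful, the same logic can be moved into a plain function that lives in a .py module and gets a test. The sketch below is one possible shape for that refactor; the function name `total_by_country` is hypothetical.

```python
import pandas as pd


def total_by_country(df: pd.DataFrame) -> pd.Series:
    """Sum the `amount` column per `country` (same logic as the exploration cell)."""
    return df.groupby("country")["amount"].sum()


# A small check that could live in a test file next to the module.
sample = pd.DataFrame({"country": ["PE", "PE", "CL"], "amount": [100, 50, 80]})
totals = total_by_country(sample)
assert totals["PE"] == 150
assert totals["CL"] == 80
```

The notebook can then import the function instead of redefining it, which keeps the exploration short and the reusable part versioned and tested.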

Official documentation

The source of truth is the official Jupyter documentation. This page only orients you; the depth is up to you.

What to learn next

Python

The lingua franca of the data stack: from scripts to pipelines and ML.

Intro · python

Nº22 · Analysis

pandas

The Swiss Army knife for manipulating and analyzing tabular data in Python.

Intro · python

Nº20 · Analysis

NumPy

Python's numeric foundation: fast, vectorized arrays.

Intro · python

Nº17 · Updated 2026-06-25