Jupyter Notebook 101

What are Jupyter Notebooks?

A Jupyter Notebook is a document for creating and sharing computational content. Just as Google Docs is an online counterpart to Microsoft Word, a Jupyter Notebook is a document that can execute code online. Notebooks use the file extension .ipynb (Interactive PYthon NOtebook), and each cell of code is executed in a REPL (Read-Eval-Print-Loop) fashion.

What is a REPL? A Read-Eval-Print-Loop is an interactive programming environment:

  1. Read: The user writes some code and sends it to be processed.
  2. Eval: The code is evaluated.
  3. Print: The results are printed out for the user.
  4. Loop: Steps 1-3 can be repeated iteratively.
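In a notebook, each cell is one pass through this loop: you type code, run the cell, and the evaluated result appears beneath it. A minimal sketch of the same cycle in plain Python:

```python
# Each notebook cell behaves like one pass through the REPL loop.
expression = "2 ** 10"     # 1. Read: the user's input
result = eval(expression)  # 2. Eval: the code is evaluated
print(result)              # 3. Print: the result appears below the cell -- prints 1024
# 4. Loop: running another cell repeats steps 1-3
```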

Jupyter is an open source project. Many organizations and universities use it for data science, data discovery, and visualization workflows, and it is the most popular data science interface for code execution.

Accessed via a browser, it has a kernel (in the video below, Python) and a UI for accessing the file system:

{% embed url="https://www.youtube.com/watch?index=1&list=PLG7TPzTSJYkdIy0aSBcPR6CYD8l2z7utq&pp=gAQBiAQB&t=10s&v=WFHMJ726aP0" %}

How does a Jupyter Notebook work?

Jupyter Notebooks are built on the IPython kernel, famous for its REPL (Read-Eval-Print-Loop) capabilities. A REPL interface takes user input, executes the code, and presents the result to the user. This feedback loop can be repeated many times in a Notebook. IPython itself is a command-line terminal through which you can interactively execute Python commands.

Jupyter Notebooks contain cells, and each cell holds an atomic command. Each cell can be executed to get a result from the programming environment. For example, a cell might make a request to a third-party API and return the result.
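Such a cell might look like the sketch below. The endpoint URL is hypothetical, so the live request is left commented out and a canned JSON response is parsed instead:

```python
import json
# from urllib.request import urlopen  # stdlib; 'requests' is a common alternative

# Hypothetical third-party endpoint -- substitute an API you actually use.
url = "https://api.example.com/v1/status"

# In a live notebook the cell would fetch and parse the response:
#   payload = json.loads(urlopen(url).read())
# Here we parse a canned response so the sketch is self-contained.
payload = json.loads('{"status": "ok", "region": "ap-southeast-2"}')
print(payload["status"])  # the cell's result is shown below it
```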

screenshot of a Jupyter task

A cell with one line of code. The evaluation is printed below the cell.

{% embed url="https://www.youtube.com/watch?index=2&list=PLG7TPzTSJYkdIy0aSBcPR6CYD8l2z7utq&pp=gAQBiAQB&v=-v50jsIpyLU" %}

In addition to code cells, you can add text content (in Markdown format!).

Notebook markdown cell

Markdown cell

{% embed url="https://www.youtube.com/watch?index=3&list=PLG7TPzTSJYkdIy0aSBcPR6CYD8l2z7utq&pp=gAQBiAQB&v=JNrepLxDRPQ" %}

When the Markdown cell is run, it displays the rendered text inside the Notebook:

Rendering the markdown inside the notebook

Building on previous cells

Variables created in a cell are stored in the notebook kernel, and are available for use in subsequent cells:

Variable in the first cell can be referred to in the second cell
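The same behaviour in code. Each comment marks a separate cell; the kernel keeps the state between them (the variable names and values here are illustrative):

```python
# --- Cell 1: define variables; they are stored in the notebook kernel ---
city = "Auckland"
temperatures = [18, 21, 19]

# --- Cell 2: a later cell can refer to names defined earlier ---
average = sum(temperatures) / len(temperatures)
print(f"{city}: {average:.1f} degrees")  # prints Auckland: 19.3 degrees
```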

Cells in the notebook build on the results of previously run cells. One way to think of a cell is as a microservice. The microservices are called in order, and complete a full application when the Notebook is completed.

In the video below, we access a JSON file for an API key, and then make an API call for weather data:

{% embed url="https://www.youtube.com/watch?index=4&list=PLG7TPzTSJYkdIy0aSBcPR6CYD8l2z7utq&pp=gAQBiAQB&v=Dt-ANGHETfM" %}
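A sketch of those two steps, assuming the key lives in a JSON file as in the video. The file is created here so the example is self-contained, and the weather endpoint is hypothetical, so the live call is left commented out:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical credentials file -- keeping the API key in JSON means it
# never appears in the notebook itself. Created here for the sketch.
secrets = Path(tempfile.mkdtemp()) / "secrets.json"
secrets.write_text(json.dumps({"api_key": "demo-key-123"}))

# Cell 1: load the key from disk
api_key = json.loads(secrets.read_text())["api_key"]

# Cell 2: the live weather call (endpoint is illustrative):
# from urllib.request import urlopen
# url = f"https://api.example-weather.com/current?q=Auckland&key={api_key}"
# weather = json.loads(urlopen(url).read())
print(api_key)  # prints demo-key-123
```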

Once we have collected the API data, we can load it into a dataframe and visualize the data.

{% embed url="https://www.youtube.com/watch?index=5&list=PLG7TPzTSJYkdIy0aSBcPR6CYD8l2z7utq&pp=gAQBiAQB&v=d3QQhNCSO60" %}
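A minimal sketch of that step, assuming pandas is installed; the records stand in for the collected weather data (the values are illustrative):

```python
import pandas as pd

# Sample records standing in for the collected API data (hypothetical values).
records = [
    {"city": "Auckland", "temp_c": 18},
    {"city": "Wellington", "temp_c": 14},
    {"city": "Christchurch", "temp_c": 11},
]
df = pd.DataFrame(records)
print(df)

# One-line visualization (requires matplotlib):
# df.plot.bar(x="city", y="temp_c")
```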

Why use Jupyter Notebooks for runbook automation?

DevOps/SRE groups create Runbooks to automate their workflows. A Runbook comprises the steps that must be completed, in order, to finish a task. Many of these steps are accompanied by scripts that help in completing the step. By placing the Runbook inside a Notebook, the code can actually be evaluated while the Runbook is being used!

Applying the Jupyter Notebook concept to automate infrastructure workflows simplifies the task of a DevOps/SRE engineer. It also aids in decoupling and debugging various systems quickly.

PS: CloudOps uses Jupyter Notebook under the hood. You can build on, or consume, the knowledge shared by many engineers in the cloudops repo and run it via the Docker container.

Crash Course with Jupyter Notebook

Installation

There are many ways to install and run Jupyter Notebooks. Over the years, cloud platforms and several new-age startups have implemented Jupyter Notebooks: Google Colab, Deepnote, and Naas.ai.

I’m using the Anaconda distribution of Jupyter Notebook. Search for “anaconda download”; the first link you find is probably from Anaconda.com, which distributes Jupyter Notebook along with several other products, such as Anaconda Server.

  1. Download and install the Anaconda open-source distribution; it fits our use case of building and running a basic Jupyter Notebook.
  2. Open the Anaconda Navigator.
screenshot of Anaconda manager

  3. Launch Jupyter Notebook! (Make sure you are not launching JupyterLab.)
  4. A browser tab will open at http://localhost:8889/tree
Jupyter screen shot

Building a notebook

Next up, hit New -> Python 3 (ipykernel)

creating a new notebook

Each Jupyter notebook contains multiple cells of Python code; the code in a cell is executed when you run that cell. Cells can also be configured as text (Markdown!) so you can add the instructions and guidance needed for the code that follows.

cell menu showing configuration options

Please note that each cell carries the programming context from the cells above it.

Running it!

A notebook (containing multiple cells) can be run in one go using Cell -> Run Cells (as shown in the image above).

Alternatively, if you are a command-line fan, you can run a specific notebook with the command below.

> jupyter notebook notebook.ipynb

CloudOps and Jupyter Notebooks

As discussed above, CloudOps uses open-source Jupyter Notebooks under the hood and provides a seamless way of debugging and triggering complex infrastructure scripts.

CloudOps has many open-source runbooks (aka notebooks) at the cloudops GitHub repository. So give us a star and raise an issue if you feel we are missing something.

Resources

A few more resources for getting started with Jupyter Notebooks:

Running via Papermill

Papermill executes Jupyter notebooks as parameterised batch jobs — no browser required. This is the recommended method for CI pipelines and scheduled runs.

Install

uv pip install papermill
# Already included in nnthanh101/runbooks devcontainer

Basic Execution

# Run a notebook with parameter injection
uv run papermill \
  notebooks/inventory/ec2-inventory-analysis.ipynb \
  /tmp/ec2-output.ipynb \
  -p AWS_PROFILE "vams-nz-elec-inbound-sec-ReadOnlyAccess" \
  -p AWS_REGION "ap-southeast-2"

The output notebook at /tmp/ec2-output.ipynb contains all cell outputs embedded. Convert to HTML for sharing:

jupyter nbconvert --to html /tmp/ec2-output.ipynb

Parameter Cells

Papermill injects parameters by finding the cell tagged parameters in the notebook metadata. In JupyterLab:

  1. Select the first code cell
  2. Open Property Inspector (right panel)
  3. Add tag: parameters

# Cell tagged: parameters
AWS_PROFILE = "default"       # overridden by papermill -p
AWS_REGION = "ap-southeast-2" # overridden by papermill -p
TOP_N = 20                    # overridden by papermill -p

Error Handling

Papermill exits non-zero on cell failure. Use --no-progress-bar for clean CI output:

uv run papermill \
  notebooks/inventory/ec2-inventory-analysis.ipynb \
  /tmp/output.ipynb \
  -p AWS_PROFILE "$AWS_OPERATIONS_PROFILE" \
  --no-progress-bar 2>&1
echo "Exit: $?"

Taskfile Integration

The cloudops project Taskfile provides notebook tasks that wrap papermill and JupyterLab, eliminating the need to remember long CLI flags.

Available Tasks

# Show all notebook-related tasks
task --list | grep notebook

# Tier 1: Structural validation — no AWS required (2–5 seconds)
task notebook:test

# Tier 2: Live execution with real READONLY profiles
task notebook:test:live

# Start JupyterLab — auto-detects port conflict with devcontainer
task jupyter

Task: notebook:test (Tier 1)

Validates all .ipynb files for:

  • Valid JSON structure (HARD FAIL if corrupt)
  • nbformat metadata present (HARD FAIL if missing)
  • No DRY_RUN=True, mock_ec2, or sample data patterns (WARN — SAMPLE_DATA_IN_NOTEBOOKS anti-pattern)

cd /path/to/cloudops
task notebook:test
# Output: "Tier 1 Results: 12/12 structurally valid, 0 DRY_RUN warnings, 0 hard failures"

Task: jupyter (Port-Aware)

Starts JupyterLab and automatically selects port 8889 if 8888 is occupied by a running devcontainer:

task jupyter
# Port 8888 in use (devcontainer?) — using 8889
# Starting JupyterLab — http://localhost:8889

Adding a New Notebook to the Pipeline

  1. Create notebooks/<domain>/your-notebook.ipynb
  2. Add a parameters tagged cell with AWS_PROFILE, AWS_REGION, and any domain-specific params
  3. Run task notebook:test to verify structural integrity
  4. Add a notebook:run:<name> task to the Taskfile for one-command execution
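A task of that shape might look like the sketch below. The task name `notebook:run:ec2-inventory` and the variable defaults are illustrative, not part of the actual Taskfile; the notebook path and papermill flags match the examples earlier in this page:

```yaml
# Sketch of a notebook:run:<name> task (name and variables are illustrative)
notebook:run:ec2-inventory:
  desc: Execute the EC2 inventory notebook via papermill
  cmds:
    - >-
      uv run papermill
      notebooks/inventory/ec2-inventory-analysis.ipynb
      /tmp/ec2-output.ipynb
      -p AWS_PROFILE "{{.AWS_PROFILE}}"
      -p AWS_REGION "{{.AWS_REGION}}"
      --no-progress-bar
```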