Two Killer Jupyter Hacks That Are Guaranteed To Save You Hours Of Work Time | by Avi Chawla | Nov, 2022
The Moment You Start Using Them
Jupyter Notebooks, because of their simple, streamlined, beginner-friendly, and sleek design, are almost indispensable to any Python-oriented task today.
Thinking retrospectively, I cannot even imagine my life without an Interactive Python (IPython) tool like Jupyter.
Essentially, the most significant advantage of IPython is that it reduces the friction of re-running scripts by keeping objects in memory as long as the kernel is active.
Additionally, Jupyter is also preferred for typical prototyping purposes.
This makes tasks like data cleaning, transformation and visualization, numerical simulation, statistical modeling, machine learning, and many more relatively easier.
However, because of this simplicity, developers often make some common mistakes (unintentionally) that cost them both time and computation, two fundamental pillars in a project.
Thus, in this blog, I will share two mistakes that almost every Jupyter user has made. With that, I will also present an elegant solution to them that will save you tons of time.
Let’s begin 🚀!
Have you ever been in a situation where you wrote some code in Jupyter but realized after computation that you forgot to assign it to a variable?
In such situations, one has to unwillingly execute the cell again and generate the results to assign them to a variable.
I can relate to that feeling as I have been there myself.
Solution
What if I told you there is a clever solution to this?
When you execute a cell in Jupyter, you get to see something like `In [2]:` beside the cell, don’t you?
Likewise, you also get to see something beside the output panel of the cell. More specifically, it is denoted as `Out[3]:`.
In IPython, `Out` is a standard Python dictionary that stores the mapping of output-id to cell-output. `In` is a Python list that stores the code executed in order.
Their type can be verified as follows:
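A minimal sketch of that check follows. Inside a notebook, the two `print` calls are all you need; the `try`/`except` fallback is an assumption added here only so the snippet also runs as a plain Python script, where `In` and `Out` do not exist.

```python
# Inside Jupyter/IPython, `In` and `Out` are created for you.
# The fallback below is only a stand-in so this snippet also runs as a plain
# Python script; it is not how IPython builds these objects.
try:
    In, Out
except NameError:
    In, Out = [""], {}

print(type(In))   # `In`: list of executed cell sources, indexed by execution count
print(type(Out))  # `Out`: dict mapping execution count to the cell's output
```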
Thus, if you forgot to assign the output to some variable, you can use the `Out` dictionary and pass the output-id that appears beside the output panel.
For instance, in the `groupby` output above, you can use `Out[3]` to retrieve the results.
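In a notebook cell, the recovery itself is a one-liner such as `grouped = Out[3]`. The sketch below wraps that lookup in a small helper, `recover_output`, which is a hypothetical name and not part of IPython; `demo_out` is a stand-in dictionary, assumed here so the example runs outside a notebook.

```python
def recover_output(out_history, output_id=None):
    """Return a cached cell output from an Out-style dict.

    With no output_id, fall back to the most recent recorded output.
    """
    if not out_history:
        raise LookupError("no cell outputs have been recorded yet")
    if output_id is None:
        output_id = max(out_history)  # highest execution count that has an output
    return out_history[output_id]


# Stand-in for IPython's `Out` (assumption, for illustration only).
# In a notebook you would pass the real `Out` instead.
demo_out = {1: 4, 3: {"a": 10, "b": 20}}

result = recover_output(demo_out, 3)  # same idea as `result = Out[3]`
latest = recover_output(demo_out)     # most recent output, here also entry 3
print(result)
print(latest)
```

Keep in mind that only cells whose last expression evaluates to something other than `None` get an entry in `Out`, so a cell that ends with an assignment or a bare `print` call has nothing to recover.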
Isn’t that cool?
While working in a Jupyter Notebook, one may want to restart the kernel for several reasons. But before restarting, one often dumps data objects to disk to avoid recomputing them in the subsequent run.
This is a time-consuming process. Also, storing each important data object individually is quite a hassle.
Solution
The `%store` magic command serves as an ideal solution to this. With it, you can obtain a previously computed value even after restarting your kernel.
What’s more, you never need to go through the hassle of dumping the object to disk yourself, because IPython persists the stored variable for you.
This is demonstrated in the video below:
As shown above, the store magic command allows you to retrieve a previously computed value even after restarting your kernel.
To summarize, these are the steps:
Step 1: Store the variable using `%store`.
Step 2: After restarting the kernel, use `%store` with the `-r` option.
Note that you can also store multiple values using a single `%store` command.
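In a notebook, the steps above are simply `%store features labels` before the restart and `%store -r features labels` afterwards. Since magics are not valid syntax in a plain Python file, the sketch below routes them through `run_line_magic`; the helper `run_store` and the variables `features` and `labels` are hypothetical names chosen for illustration.

```python
def run_store(*args):
    """Run `%store <args>` when an IPython shell is available.

    Returns True if the magic was executed, False when running as a plain
    Python script (where `get_ipython` is not defined).
    """
    try:
        shell = get_ipython()  # injected by IPython/Jupyter only
    except NameError:
        return False
    if shell is None:
        return False
    shell.run_line_magic("store", " ".join(args))
    return True


# Step 1 (before restarting the kernel): same as `%store features labels`
features = [1.0, 2.5, 4.0]
labels = [0, 1, 1]
stored = run_store("features", "labels")

# Step 2 (in a fresh cell after the restart): same as `%store -r features labels`
# restored = run_store("-r", "features", "labels")
```

As far as I know, stored values are pickled into IPython's profile database, so this works for picklable objects and the data does end up on disk; you just do not have to manage the file yourself.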