The Pros and Cons of Using Jupyter Notebooks as Your Editor for Data Science Work

TL;DR: PyCharm’s probably better

Steffen Sjursen

Published in Better Programming

5 min read · Mar 1, 2020

Jupyter notebooks have three particularly strong benefits:

They’re great for showcasing your work. You can see both the code and the results. The notebooks at Kaggle are a particularly good example of this.
It’s easy to use other people’s work as a starting point. You can run the code cell by cell to get a better understanding of what it does.
They’re very easy to host server-side, which is useful for security purposes. A lot of data is sensitive and should be protected, and one of the steps toward that is ensuring no data is stored on local machines. A server-side Jupyter Notebook setup gives you that for free.

When prototyping, the cell-based approach of Jupyter notebooks is great. But you quickly end up programming a sequence of steps instead of using object-oriented programming.

Downsides of Jupyter notebooks

When you write code in cells instead of functions, classes, and objects, you quickly end up with duplicate code that does the same thing, which is very hard to maintain.

You also don’t get the support of a powerful IDE.
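To make the duplication problem concrete, here is a minimal sketch. The data and the names `sales_2019`, `sales_2020`, and `average_ignoring_missing` are hypothetical, made up for illustration. The first half mimics a cleaning step pasted into two cells; the second half shows the same logic written once as a function.

```python
from statistics import mean

# Hypothetical sample data standing in for two datasets loaded in a notebook.
sales_2019 = [120.0, None, 95.5, 130.0]
sales_2020 = [140.0, 110.0, None, None, 150.5]

# Cell-style code: the same cleaning step pasted once per dataset.
# Any change to the cleaning rule has to be repeated in every copy.
clean_2019 = [x for x in sales_2019 if x is not None]
avg_2019 = mean(clean_2019)

clean_2020 = [x for x in sales_2020 if x is not None]
avg_2020 = mean(clean_2020)


def average_ignoring_missing(values):
    """Return the mean of the entries in `values` that are not None."""
    present = [x for x in values if x is not None]
    if not present:
        raise ValueError("no non-missing values to average")
    return mean(present)


# Function-style code: one definition, reused for every dataset.
averages = {
    "2019": average_ignoring_missing(sales_2019),
    "2020": average_ignoring_missing(sales_2020),
}
```

Both versions give the same numbers, but only the function version has a single place to change when the cleaning rule changes, and it can be moved into a module and imported from several notebooks.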

Consequences of duplicate code:

It’s hard to actually collaborate on code with Jupyter. Because we’re copying snippets from each other, it’s very easy to get out of sync.
It’s hard to maintain one version of the truth. Which one of these notebooks has the one true solution to the number of xyz?

There’s also a tricky problem related to plotting. How are you sharing plots outside of the data science team? At first, Jupyter Notebook is a great way of sharing plots: just share the notebook! But how do you ensure the data in it is fresh? Easy, just have them run the notebook.

But in large organizations, you might run into a lot of issues, as you don’t want too many users to have direct access to the underlying data (because of GDPR or for other reasons). In practice, in a workplace, we’ve noticed that plots from Jupyter typically get shared by copying and pasting them into PowerPoint. It’s highly inefficient to have your data scientists do copy/paste…
