Want to Be a Better Data Scientist? Write Coding Tutorials! | by Conor O’Sullivan | Apr, 2023

By Jessie Hobb On Apr 15, 2023

Improve your technical and communication skills, learn to write readable code and start writing about anything

When I first started writing, I could hardly code. What little Python experience I had came from university projects. At the same time, I thought I was a great writer! Now, looking back, I cringe at some of my earlier articles.

This shows me that writing tutorials has improved both my technical and communication skills. Ultimately, it has made me a better data scientist. I want to share my experience and discuss some specific benefits. These include learning to write readable code and taking the first step towards writing about difficult topics.

Technical skills are obviously important to developers and data scientists. At the same time, we work in fields that are constantly changing, so we need to keep up with the latest tools. This can be difficult and time-consuming, and it takes a real commitment to learn continuously.

Writing has motivated me to consistently learn new technical skills and to develop a deep knowledge of those topics.

Writing has been a great motivator. Not only can I learn new things but I can share those lessons with the world. It is addictive — having people read and interact with your work. The desire to consistently post new articles has pushed me to consistently learn new things.

Not only that but I find writing is the best way to learn. Often it’s only after I’ve tried to explain something that I’ve realised I don’t actually understand it. On top of this, I always aim to add value to a topic. Introducing novelty requires a good understanding. In this way, writing has been a check on my knowledge. It has pushed me to truly understand the technical concepts I write about.

Writing readable code

A related skill is writing readable code. Good code should explain itself. Similarly, a tutorial based on readable code is easier to write. I spend less time explaining what the code does, and often I don’t have to explain it at all. I’ve found this benefit extends to collaborating in industry.

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

— Martin Fowler

An example comes from a recent article of mine — Using SHAP to Debug a PyTorch Image Regression Model. In the tutorial, I worked with SHAP values given in a 5-dimensional array. I wanted to visualise this array using one of the package’s functions. To do this, the array first needed to be reordered. Originally, the code looked like this:

# Reshape shap values for plotting
shap_numpy = [np.swapaxes(np.swapaxes(s, 1, -1), 1, 2) for s in shap_values]

The nested swapaxes functions were not fun to explain. The TLDR is that the code moved the array’s 3rd dimension to the last (5th) position. I found myself writing a lengthy paragraph about how the code was doing this. So, instead, I rewrote the code:

# Reshape shap values for plotting
shap_numpy = list(np.array(shap_values).transpose(0,1,3,4,2))

The transpose function is more intuitive and the code was easier to explain. When writing code for tutorials, I am always looking out for examples like these. Readable code makes the writing process smoother and the tutorial more digestible.
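To check that the two versions really do the same thing, here is a minimal sketch that runs both on a dummy array. The shapes and axis names (batch, channels, height, width) are assumptions for illustration only. They are not taken from the original tutorial, where shap_values came from the SHAP package.

```python
import numpy as np

# Hypothetical stand-in for the SHAP output: a list of 2 arrays,
# each assumed to be shaped (batch=3, channels=4, height=5, width=6).
rng = np.random.default_rng(0)
shap_values = [rng.normal(size=(3, 4, 5, 6)) for _ in range(2)]

# Original approach: two nested axis swaps applied to each array
nested = [np.swapaxes(np.swapaxes(s, 1, -1), 1, 2) for s in shap_values]

# Rewritten approach: a single transpose over the stacked 5-dimensional array
transposed = list(np.array(shap_values).transpose(0, 1, 3, 4, 2))

# Both move the second axis of each array to the end:
# (3, 4, 5, 6) -> (3, 5, 6, 4)
shapes_match = all(a.shape == b.shape for a, b in zip(nested, transposed))
values_match = all(np.array_equal(a, b) for a, b in zip(nested, transposed))
print(nested[0].shape, shapes_match, values_match)
```

In the transpose call, the tuple lists the old axes in their new order, so a reader can see at a glance that axis 2 ends up last. The nested swaps hide that same result behind two intermediate steps.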

I’ve found this skill particularly useful when collaborating with other professionals. You can never get away from handovers and code reviews. Explaining code during these processes is no different to explaining it in an online tutorial. Ultimately, writing readable code means I’ve spent less time explaining that code to colleagues.

When I first started working, I quickly realised that non-technical skills were also critical. You not only need to get results but also explain them and your process. I had to produce emails, presentations or technical documents on a daily basis. There was no getting away from writing!

…traditionally, different people follow different paths in their careers―some are more technical, others are more creative and communicative. A data scientist has to have both.

―Monica Rogati

Writing coding tutorials is great practice for this. The more I wrote, the better I became at communicating at work. Complex analysis became easier to explain and I had to clarify my work less often. I quickly saw the tangible benefits. Yet, this was just the start.

Using tutorials as a starting point for your writing

Compared to other articles, coding tutorials are easy to write. There are no deep thoughts or nuanced arguments. You just explain the code. This means they can be a way to get into more difficult writing.

Writing is something you need to practise to get good at. Tutorials helped me take my first steps. I started to experiment by adding catchy stories to the introductions and conclusions. Stumbling through these helped me to move into different types of articles.

Taking more confident strides, I moved on to pieces based on my experience as a data scientist. These were articles about more abstract concepts, my personal experience and my feelings. I found these far harder to write than any coding tutorial. Yet, I would not be at this point if not for those first tutorials.

Looking back, there have been many benefits from writing coding tutorials. I’ve improved my technical knowledge and learned to write readable code. Writing tutorials has also brought me to the point where I feel confident to write about nearly anything. I haven’t even mentioned the benefits for my reputation and the financial compensation.

So, if you are interested in writing, coding tutorials are a great way to start.