What is data storytelling? Why data science needs storytellers

Jun 13, 2022 - 10 min read
Erica Vartanian

Each day, business decisions big and small are driven by data. And today, we’re equipped with more data than ever before. From Fitbits to satellites, we have access to a vast mass of complex data sets, broadly referred to as big data.

To infer and extract actionable insights from all this data, business leaders and decision makers look to data science professionals, who leverage advanced technologies and methods to analyze and interpret that data. Then, the data science professionals must convey their findings back to leadership to inform data-driven decision-making.

Problems can often arise at this stage of translation. The reason why might surprise you: poor storytelling.

Here is Forbes on the importance of storytelling in data science: “Many of the heavily-recruited individuals with advanced degrees in economics, mathematics, or statistics struggle with communicating their insights to others effectively—essentially, telling the story of their numbers.”

Storytelling has been identified as one of the most important non-technical skills for data scientists, and companies have been upskilling their data professionals in data storytelling.

If you’re a storyteller, you already possess one of the most important non-technical skills in data science. If you’re a data scientist, data storytelling skills will be crucial to help you drive change and communicate business intelligence to your audience.

Today, we’ll discuss why data science needs storytellers to help data make a strong impact.


Data scientist, data storyteller

Not every role in data science requires storytelling skills, but those that convey key insights from data analysis to decision makers do. In most cases, it’s the role of the data scientist to communicate these key insights to stakeholders.

To extract insights from complex data sets, data professionals leverage a wealth of technical tools, such as code, statistical analysis, and machine learning algorithms. However, to effectively convey their findings to their audience, they have to communicate them in a way that their audience can understand.

Data experts must assume that their audience won’t share their technical background in data science. In fact, a technical summary of their findings may be completely lost in translation. Instead, data experts can communicate their findings effectively by presenting them in the form of a data story: a story told to communicate the findings and actionable insights found in the data.

Stories are essential to driving data-driven action for many reasons, two of which are:

  • Narratives are more memorable than numbers
  • Stories can connect to the emotions of the reader, while numbers can’t

The second point is important, as neuroscience research suggests that most decisions are driven by emotion rather than logic.

Failing to present data as a story threatens your ability to persuade your audience to take action. And the cost of failed data storytelling can be tragic.

One unfortunate example of failed data storytelling can be found in the story of Ignaz Semmelweis. Semmelweis was an obstetrician who set out to investigate why one of the maternity clinics in which he worked had a high death rate among admitted mothers. At the time, these deaths were attributed to a condition known as childbed fever (today, we know the cause was poor hygiene).

Semmelweis studied the problem closely and hypothesized that having all doctors wash their hands before working in the maternity clinic would reduce the deaths. He introduced a handwashing policy, which, sure enough, led to a dramatic drop in deaths from childbed fever (including a few months with absolutely none).

However, Semmelweis’ solution didn’t last long. Many student doctors neglected the policy. Eventually, the doctors took the handwashing policy personally, and the resulting drama cost not only Semmelweis his position but, more importantly, the lives of many more mothers as childbed fever returned to the clinic in full force.

Brent Dykes took a close look at this historical case to draw out lessons on the importance of data storytelling.

According to Dykes, some of the effective data storytelling practices that Semmelweis missed were:

  • Using effective visualizations to illustrate data (he used data tables, but he could have used graphs to demonstrate trends in growth and decline)
  • Being attentive to the knowledge of the audience (had he met them where they were, he could have helped clarify misunderstandings and gotten his message through)

And, importantly, Dykes reflects: “One of the biggest mistakes Semmelweis made was he failed to tell a story with his data. Interesting statistics alone won’t persuade skeptical minds… Imagine if he was able to have his fellow obstetricians think of their own mothers and the critical role they played in their lives.”

Data doesn’t speak for itself, and this is why we must present it through stories to truly make an impact.

Get hands-on with data science today.

Try one of our 300+ courses and learning paths: Data Science for Non-Programmers.


What is data storytelling?

Data storytelling is a data science skill that leverages data, visuals, and narrative to persuade a group of decision makers to take a certain action.

Data analysis

This is the most technical part of the process, and involves coding and statistical analysis to analyze data sets. Many parts of this process might be performed by a data analyst, but data scientists also handle these tasks. The analysis reveals relationships and patterns, which are the data insights that inform business intelligence.

When analyzing a data set, it’s important to obtain accurate results, as they’ll be the foundation for your narrative. There are many important practices that help ensure that analysis is accurate, including:

  • Using complete data sets, including outliers
  • Data cleaning to maintain accuracy
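
As a rough illustration of these two practices, here is a minimal Python sketch that uses only the standard library. The readings, the helper names (clean, flag_outliers), and the 1.5 × IQR rule are assumptions chosen for the example, not a method prescribed by this article. It drops entries that can’t be used, but flags outliers instead of deleting them, so the data set stays complete.

```python
from statistics import quantiles

# Hypothetical raw readings; None and "n/a" stand in for missing entries.
raw_readings = [12.0, 11.5, None, 12.3, "n/a", 11.9, 48.0, 12.1]

def clean(values):
    """Keep only the entries that are usable numbers."""
    return [float(v) for v in values if isinstance(v, (int, float))]

def flag_outliers(values):
    """Pair each value with True if it lies more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [(v, v < low or v > high) for v in values]

cleaned = clean(raw_readings)
flagged = flag_outliers(cleaned)  # every cleaned value is still present
outliers = [v for v, is_outlier in flagged if is_outlier]
print(f"{len(cleaned)} usable readings, outliers kept and flagged: {outliers}")
```

Flagging rather than removing leaves the decision about each unusual value to the analyst, who can then explain it as part of the narrative.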

Data visualization

Thinking back to Semmelweis’ attempts to enforce handwashing, we know that one of his shortcomings was presenting data tables to his audience rather than a visual. A table is not an effective data visualization. It requires the audience to do the work of reading the entries in each cell, and to identify for themselves the relationship between the data points. That’s a lot of work to leave to the audience, and frankly, we can’t count on them to do it!

Rather, it’s up to the data storyteller to support their insights with effective data visualizations.
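
To make the contrast between a table and a visual concrete, the sketch below prints the same made-up numbers twice: once as a table and once as simple text bars. The monthly figures and the function names are invented for illustration only; they are not Semmelweis’ data.

```python
# Hypothetical monthly rates (percent), invented for illustration only.
monthly_rate = {"Jan": 12, "Feb": 11, "Mar": 13, "Apr": 4, "May": 2, "Jun": 2}

def as_table(data):
    """Render the data as a plain two-column table."""
    return [f"{month}  {value}" for month, value in data.items()]

def as_bars(data, symbol="#"):
    """Render the same data as horizontal bars, one symbol per unit."""
    return [f"{month}  {symbol * value} {value}" for month, value in data.items()]

print("\n".join(as_table(monthly_rate)))
print()
print("\n".join(as_bars(monthly_rate)))
```

In the bar version, the drop between March and April is visible at a glance, while the table leaves the reader to compare the numbers cell by cell.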

Data visualizations can take many forms, including:

  • Graphs
  • Pie charts
  • Scatter plots
  • Images
  • Videos

There are many different types of relationships you can present with data visualizations, and no single visualization is fit for every type of job. Therefore, it’s important to select the most effective visualization for your given situation.

For instance, a pie chart has its own dos and don’ts:

  • Do use them to show parts in relation to a whole
  • Do use them to illustrate that one part of the whole is comparatively large or small
  • Don’t use them if the parts don’t add up to 100%
  • Don’t use them to make precise comparisons between the sizes of individual parts
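
These guidelines can be turned into a quick sanity check before you draw a chart. The following is only a sketch: the function name pie_chart_problems, the 0.5-point tolerance, and the 3-point “too close to compare” gap are assumptions made for this example, and the survey numbers are invented.

```python
from itertools import combinations

def pie_chart_problems(shares, tolerance=0.5, min_gap=3.0):
    """Return reasons the data may not suit a pie chart (empty list if none)."""
    problems = []
    total = sum(shares.values())
    # Parts of a pie must add up to the whole.
    if abs(total - 100) > tolerance:
        problems.append(f"parts sum to {total:g}%, not 100%")
    # Slices of nearly equal size are hard to tell apart by eye.
    for (name_a, a), (name_b, b) in combinations(shares.items(), 2):
        if abs(a - b) < min_gap:
            problems.append(f"{name_a} and {name_b} are too close to compare by eye")
    return problems

# Hypothetical survey results (percent of respondents).
single_choice = {"Email": 70, "Phone": 20, "Chat": 10}
multi_choice = {"Email": 70, "Phone": 45, "Chat": 44}

print(pie_chart_problems(single_choice))
print(pie_chart_problems(multi_choice))
```

The first data set passes both checks. The second fails both, which suggests a bar chart would serve it better.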

Data narrative

The narrative you tell will connect the key insights of your data, provide context, and come to an actionable conclusion. As with any good fiction story, a good data story should:

  • Be memorable
  • Connect to the audience
  • Have a beginning, middle, and end
  • Have characters, a problem/conflict, and resolution

But unlike the many roles a fiction story can play, a data story’s main aim is to persuade decision-makers to take a certain course of action.

To that end, a good data story should:

  • Persuade the audience toward a clear goal
  • Stay simple and focused

5 steps to tell a data story

There are various frameworks for data storytelling. No single way is the right way, but here’s one approach you can take.

1. Define your problem

An effective data story should focus on one clear problem that will eventually be met with a resolution.

2. Look to the data for patterns and insights

Analysis will help you see the correlations in your data. Then, you’ll need to consider causal relationships behind the data, and hypothesize a change that might bring about a solution to your problem.
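
As a small example of what “seeing the correlations” can look like in practice, this sketch computes a Pearson correlation coefficient by hand in Python. The weekly figures and variable names are invented for illustration, and Pearson’s coefficient is just one common choice of measure, assumed here for simplicity.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly figures, invented for illustration.
training_hours = [1, 2, 3, 4, 5, 6]
support_tickets = [30, 27, 25, 20, 18, 15]

r = pearson(training_hours, support_tickets)
print(f"correlation: {r:.2f}")
```

A value close to -1 or 1 indicates a strong linear relationship, but it says nothing about cause. Deciding whether more training actually reduces tickets is the interpretive step that follows the calculation.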

3. Get to know your audience

What does your audience know, and what do they care about?

Understanding your audience will be crucial to help you connect to them emotionally, and effectively persuade them into action.

4. Create a data narrative

This is where you take your numerical understanding and put it into words that are easily understood by your audience.

Your narrative will need to have a clear story arc, including:

  • Beginning: Providing context and introducing your characters
  • Middle: Introducing the problem, and the tension it creates for the characters and audience
  • End: A clear call-to-action for the audience that helps resolve the problem

You’ll also need to define:

  • Setting: This helps your audience immerse themselves in your story, as you can’t assume they have all the context
  • Characters: Your audience will connect to the people in your story more than the data alone

5. Create effective visuals

The data holds the patterns and insights, but the visuals will be what engage and connect to your audience.

To make a data story memorable and engaging, try to offer a visual for the beginning, middle, and end of your narrative.

And of course, be sure that the visual is best suited for the relationship you’re trying to show.

Each type of visualization has its specialty. For instance, when showing a trend of growth, line graphs are effective; pie charts are not.


Storytelling with care

To a beginner in the topic, data may seem like it should be objective information. However, there’s a high level of interpretation involved between the correlation that calculations identify, and the causation and narrative interpreted by a scientist and/or storyteller.

Because their interpretations drive leadership decisions, data storytellers should be mindful to do their work with an appropriate sense of responsibility.

A careful storyteller should be able to create narratives from data while remaining:

  • Empathetic
  • Unbiased (as much as possible)
  • Ethical

In practice, this means the data analysis should be accurate, and the visualizations shouldn’t be misleading.

An example of a misleading graphic comes from a figure published by Venngage, in which only certain data points were included to portray a misleading trend of growth.
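
To see how leaving out data points can reverse the apparent trend, consider this short Python sketch. The quarterly numbers are invented for illustration and are not taken from the Venngage figure; the slope function is an ordinary least-squares slope written out by hand.

```python
def slope(points):
    """Least-squares slope of y against x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    return numerator / denominator

# Hypothetical quarterly values as (quarter, value) pairs, invented for illustration.
all_quarters = [(1, 50), (2, 42), (3, 47), (4, 38), (5, 44), (6, 35), (7, 41), (8, 33)]

# Showing only quarters 6 and 7 hides the overall decline.
cherry_picked = [point for point in all_quarters if point[0] in (6, 7)]

print(f"trend using every quarter: {slope(all_quarters):+.2f} per quarter")
print(f"trend using selected quarters: {slope(cherry_picked):+.2f} per quarter")
```

The full series slopes downward, while the selected points slope upward. Both charts would be built from “true” numbers, which is why the choice of what to include carries ethical weight.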

Storytelling is similar to making an argument, and it’s possible to construct two different arguments based on one set of premises or “truths.” To that end, it’s crucial that we have thoughtful storytellers who also consider the complexity and context of the real world when looking at data – and have a sense of responsibility when interpreting the numbers in front of them.


Breaking into data science as a storyteller

If you’re an avid storyteller, you already have one of the most important non-technical skills required in data science. And if you’d enjoy the opportunity to problem-solve with numbers and code as well, data science is a career path rich in opportunity in which you can combine all these elements.

To get started in data science, the foundations you’ll need to master include math, statistics, programming languages, and other technologies. Some of the most popular languages used for data science are Python (a multipurpose language) and R (used especially for statistical computation).

If you’re feeling up to the task, you can start honing the technical skills you’ll need to become a data storyteller and scientist today with our course, Data Science for Non-Programmers. This course provides a gentle introduction to data science foundations, and will get you hands-on experience applying these concepts in Python, an industry standard for data science.

Happy learning!


Copyright ©2022 Educative, Inc. All rights reserved.
