Each day, business decisions big and small are driven by data. And today, we’re equipped with more data than ever before. From Fitbits to satellites, we have access to a vast mass of complex data sets, broadly referred to as big data.
To infer and extract actionable insights from all this data, business leaders and decision makers look to data science professionals, who leverage advanced technologies and methods to analyze and interpret that data. Then, the data science professionals must convey their findings back to leadership to inform data-driven decision-making.
Problems can often arise at this stage of translation. The reason why might surprise you: poor storytelling.
Here is Forbes on the importance of storytelling in data science: “Many of the heavily-recruited individuals with advanced degrees in economics, mathematics, or statistics struggle with communicating their insights to others effectively—essentially, telling the story of their numbers.”
Storytelling has been identified as one of the most important non-technical skills for data scientists, and companies have been upskilling their data professionals to teach them the skill of data storytelling.
If you’re a storyteller, you already possess one of the most important non-technical skills in data science. If you’re a data scientist, data storytelling skills will be crucial to help you drive change and communicate business intelligence to your audience.
Today, we’ll discuss why data science needs storytellers to help data make a strong impact:
Not every role in data science requires storytelling skills, but those that convey key insights from data analysis to decision makers do. In most cases, it’s the role of the data scientist to communicate these key insights to stakeholders.
To extract insights from complex data sets, data professionals leverage a wealth of technical tools, such as code, statistical analysis, and machine learning algorithms. However, to effectively convey their findings to their audience, they have to communicate them in a way that their audience can understand.
Data experts must assume that their audience won’t share their technical background in data science. In fact, the technical summary of their findings may be completely lost in translation. Instead, data experts can communicate their findings effectively by presenting them in the form of a data story – a story told to communicate the findings and actionable insights found in the data.
Stories are essential to driving data-driven action for many reasons, two of which being:
The last point is important, as neuroscience research suggests that most decisions are driven by emotion rather than logic.
A failure to present data as a story threatens one’s ability to persuade their audience to take action. And the cost of failed data storytelling can be tragic.
One unfortunate example of failed data storytelling can be found in the story of Ignaz Semmelweis. Semmelweis was an obstetrician who set out to investigate why one of the maternity clinics in which he worked witnessed a high rate of deaths in admitted mothers. At the time, these deaths were understood as a condition known as childbed fever (today, we know the cause was bad hygiene).
Semmelweis had observed the problem closely, and hypothesized that a potential solution would be to have all doctors wash their hands before working in the maternity clinic. He launched a handwashing policy, which, sure enough, led to a dramatic drop in deaths from childbed fever (including a few months with absolutely none).
However, Semmelweis’ solution didn’t last long. Many student doctors neglected the policy. Eventually, the doctors took the handwashing policy personally, and the resulting drama cost Semmelweis his position and, more importantly, cost many more mothers their lives as childbed fever returned to the clinic in full force.
Brent Dykes took a close look at this historical case to extract key takeaways about the importance of data storytelling.
According to Dykes, some of the effective data storytelling practices that Semmelweis missed were:
And, importantly, Dykes reflects: “One of the biggest mistakes Semmelweis made was he failed to tell a story with his data. Interesting statistics alone won’t persuade skeptical minds… Imagine if he was able to have his fellow obstetricians think of their own mothers and the critical role they played in their lives.”
Data doesn’t speak for itself, and this is why we must present it through stories to truly make an impact.
Try one of our 300+ courses and learning paths: Data Science for Non-Programmers.
This is the most technical part of the process, and involves coding and statistical analysis to analyze data sets. Many parts of this process might be performed by a data analyst, but data scientists also handle these tasks. The analysis surfaces relationships and patterns in the data; these are the data insights that inform business intelligence.
When analyzing a data set, it’s important to obtain accurate results, as they’ll be the foundation for your narrative. There are many important practices that help ensure that analysis is accurate, including:
Thinking back to Semmelweis’ attempts to enforce handwashing, we know that one of his shortcomings was presenting data tables to his audience rather than a visual. A table is not an effective data visualization. It requires the audience to do the work of reading the entries in each cell, and to identify for themselves the relationship between the data points. That’s a lot of work to leave to the audience, and frankly, we can’t count on them to do it!
Rather, it’s up to the data storyteller to support their insights with effective data visualizations.
Data visualizations can take many forms, including:
There are many different types of relationships you can present with data visualizations, and no single visualization is fit for every type of job. Therefore, it’s important to select the most effective visualization for the given situation.
For instance, a pie chart has its own dos and don’ts:
The narrative you tell will connect the key insights of your data, provide context, and come to an actionable conclusion. As with any good fiction story, a good data story should:
But unlike the many roles a fiction story can play, a data story’s main aim is to persuade decision-makers to take a certain course of action.
To that end, a good data story should:
There are various frameworks for data storytelling. No single way is the right way, but here’s one approach you can take.
An effective data story should focus on one clear problem that will eventually be met with a resolution.
Analysis will help you see the correlations in your data. Then, you’ll need to consider causal relationships behind the data, and hypothesize a change that might bring about a solution to your problem.
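As a minimal sketch of what "seeing the correlations" can look like in practice, the snippet below computes a Pearson correlation coefficient using only the Python standard library. The `pearson_r` helper and the `ad_spend` and `signups` series are hypothetical, made up purely for illustration. A value close to 1 or -1 indicates a strong linear relationship, but it says nothing about cause; that is the part the storyteller still has to reason about.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical numbers, invented for illustration only.
ad_spend = [10, 20, 30, 40, 50]
signups = [12, 25, 31, 48, 55]

r = pearson_r(ad_spend, signups)
print(f"Pearson r = {r:.2f}")
```

A high coefficient here would only tell you that the two series move together. Whether spending drives signups, or both follow a seasonal pattern, is a hypothesis you would still need to test.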
What does your audience know, and what do they care about?
Understanding your audience will be crucial to help you connect to them emotionally, and effectively persuade them into action.
This is where you take your numerical understanding and put it into words that are easily understood by your audience.
Your narrative will need to have a clear story arc, including:
You’ll also need to define:
The data holds the patterns and insights, but the visuals will be what engage and connect to your audience.
To make a data story memorable and engaging, try to offer a visual for the beginning, middle, and end of your narrative.
And of course, be sure that the visual is best-suited for the relationship you’re trying to show.
Each type of visualization has its speciality. For instance, when showing a trend of growth, line graphs are effective, while pie charts are not.
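To illustrate why a trend reads better as a picture than as a list of numbers, here is a rough sketch that draws a text-only chart with the standard library. The `text_chart` helper and the quarterly figures are hypothetical and exist only for this example; in real work you would more likely reach for a plotting library.

```python
def text_chart(labels, values, width=30):
    """Return one bar of '#' characters per label, scaled to the largest value."""
    largest = max(values)
    lines = []
    for label, value in zip(labels, values):
        # Integer arithmetic keeps bar lengths deterministic.
        bar = "#" * (value * width // largest)
        lines.append(f"{label:<4}{bar} {value}")
    return "\n".join(lines)


# Hypothetical quarterly user counts, invented for illustration only.
quarters = ["Q1", "Q2", "Q3", "Q4"]
users = [120, 180, 260, 400]

chart = text_chart(quarters, users)
print(chart)
```

Even in this crude form, the lengthening bars make the growth obvious at a glance, which is the job a trend visual has to do. A pie chart of the same four numbers would show shares of a total and hide the order in time.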
To a beginner in the topic, data may seem like it should be objective information. However, there’s a high level of interpretation involved between the correlation that calculations identify, and the causation and narrative interpreted by a scientist and/or storyteller.
Because their interpretations drive leadership decisions, data storytellers should be mindful to do their work with an appropriate sense of responsibility.
A careful storyteller should be able to create narratives from data while remaining:
That said, the data analysis should be accurate, and the visualizations shouldn’t be misleading.
An example of a misleading graphic can be seen in the following figure from Venngage, where only certain data points were included to portray a misleading trend of growth:
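The effect of including only certain data points can be reproduced with a few numbers. The sketch below uses hypothetical monthly sales figures (invented for illustration, not taken from the Venngage graphic) and a hypothetical `percent_change` helper to compare the full series against a hand-picked subset.

```python
def percent_change(values):
    """Percent change from the first to the last value in a series."""
    first, last = values[0], values[-1]
    return (last - first) / first * 100


# Hypothetical monthly sales figures, invented for illustration only.
monthly_sales = [100, 140, 90, 150, 95, 160, 98]

# Cherry-picked view: keep only every second month, which happens to be the peaks.
cherry_picked = monthly_sales[1::2]

full_change = percent_change(monthly_sales)
picked_change = percent_change(cherry_picked)

print(f"All months:      {full_change:+.1f}%")
print(f"Selected months: {picked_change:+.1f}%")
```

With these made-up numbers, the full series ends slightly below where it started, while the selected months suggest steady growth. Both charts would be drawn from real values, which is what makes this kind of omission easy to miss.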
Storytelling is similar to making an argument, and it’s possible to construct two different arguments based on one set of premises or “truths.” To that end, it’s crucial that we have thoughtful storytellers who also consider the complexity and context of the real world when looking at data – and have a sense of responsibility when interpreting the numbers in front of them.
If you’re an avid storyteller, you already have one of the most important non-technical skills required in data science. And if you’d enjoy the opportunity to problem-solve with numbers and code as well, data science is a promising career path in which you can combine all these elements.
To get started in data science, the foundations you’ll need to master include math, statistics, programming languages, and other technologies. Some of the most popular languages used for data science are Python (a multipurpose language) and R (used especially for statistical computation).
If you’re feeling up to the task, you can start honing the technical skills you’ll need to become a data storyteller and scientist today with our course, Data Science for Non-Programmers. This course provides a gentle introduction to data science foundations, and will get you hands-on experience applying these concepts in Python, an industry standard for data science.