What Is Data Science?

Understand data science with real-world examples, core pillars, key roles, and how it’s used today.

Data is constantly generated: app usage logs, website clicks, and sensor readings. But raw data alone doesn’t tell us much. It’s often messy, unstructured, and hard to interpret without context. The value of data comes from how we process and analyze it. We can extract insights that inform real decisions by asking the right questions, identifying patterns, and applying statistical or machine learning techniques.

This course is about learning how to turn raw data into something useful, step by step, using the tools and techniques of data science.

Introduction to data science

Data science is a multidisciplinary field focused on making sense of large and complex data. At its core, it’s about uncovering meaningful insights that help people and organizations make smarter decisions. Whether it’s analyzing text, numbers, images, or clicks, the goal is to turn raw data into knowledge we can act on.

What makes data science especially powerful is how it blends different areas of expertise. It draws from mathematics to build models, statistics to interpret uncertainty, computer science to process massive datasets, and domain knowledge to ensure that the insights matter in a specific context, like health care, finance, sports, or entertainment.

[Figure: How data science works]

Say we’re working with raw text data from customer reviews or social media posts. This kind of text isn’t organized into rows and columns like a spreadsheet or database, so it’s unstructured and messy. But with data science, we can use simple models to find patterns in the words people use. For instance, we might notice that the words “fast” and “delivery” often appear together in positive reviews. This kind of pattern can help businesses understand what their customers care about. That’s the power of data science: it helps us find useful meaning in what first looks like a jumble of words.
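The word-pair pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not a production text-analysis method: the sample reviews and the helper names `tokenize` and `pair_counts` are invented for this example, and the code simply counts how many reviews contain both words of a pair.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical sample reviews, invented for illustration only.
positive_reviews = [
    "Fast delivery and great packaging",
    "Delivery was fast, very happy",
    "Great price, fast delivery",
]

def tokenize(text):
    """Lowercase the text and return the set of distinct words in it."""
    return set(re.findall(r"[a-z']+", text.lower()))

def pair_counts(reviews):
    """Count how many reviews contain each unordered pair of words."""
    counts = Counter()
    for review in reviews:
        # Sorting makes each pair appear in one consistent (alphabetical) order.
        words = sorted(tokenize(review))
        counts.update(combinations(words, 2))
    return counts

counts = pair_counts(positive_reviews)
print(counts[("delivery", "fast")])  # 3: the pair appears in all three reviews
```

With real review data, we would typically also remove very common words such as “and” or “was” before counting, so that the most frequent pairs are the informative ones.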

Data science makes sense of such information by applying statistical and machine learning tools to real-world data. Whether it’s predicting trends, identifying key themes, or generating recommendations based on past behavior, data science turns raw data into practical solutions.

Where did you bump into data science today?

Think about your morning today. Did you...

  • Check Google Maps for the fastest route?

  • See a product you might like while browsing?

  • Scroll past a video that felt weirdly relevant?

If you answered yes to any of the above, you’ve already encountered data science today.

Data science in big tech

Data science is the quiet powerhouse behind the world’s top tech companies. While startups spotlight it, industry giants have been using it all along—refining searches, personalizing feeds, and predicting what you’ll need next. It’s not just helpful; it’s how they lead. Let’s explore how they do it.

  • Google: Google collects a lot of data about how people search—what they type, what they click, and how long they stay on pages. It uses natural language processing (NLP) to understand these searches and suggest what we might be looking for next.

[Figure: Typeahead suggestions in the search bar]
  • Amazon: Every time we browse, click, or buy something, Amazon learns a little more about our preferences. It uses this information to recommend products we might like. It also uses data to figure out which items should be stored near us so deliveries are faster, and it adjusts prices based on supply, demand, and what others are doing.

  • Netflix: Netflix keeps track of what we watch, when we pause, or what we skip. It uses this data to find patterns in our behavior. Then, it applies machine learning—a way for computers to learn from past examples—to recommend shows we will enjoy next.

[Figure: Netflix’s personalized recommendation system]
  • OpenAI: OpenAI trains its AI tools, like ChatGPT and DALL·E, on huge collections of text and images from books, websites, and more. These tools learn to recognize patterns and generate new content based on what they’ve seen, like writing a helpful answer or drawing a picture from a description. Newer versions also improve as OpenAI learns from how people interact with them.
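To make the recommendation idea from the Amazon and Netflix examples concrete, here is a toy sketch. It is not how any of these companies actually build their systems. It assumes a tiny, invented set of viewing histories and a hypothetical `recommend` function that scores unseen titles by how much other viewers’ histories overlap with ours.

```python
from collections import Counter

# Hypothetical viewing histories, invented for illustration only.
histories = {
    "ana": {"Show A", "Show B", "Show C"},
    "ben": {"Show A", "Show B", "Show D"},
    "cy": {"Show E"},
}

def recommend(user, histories, top_n=1):
    """Suggest unseen titles, weighted by overlap with other viewers."""
    seen = histories[user]
    scores = Counter()
    for other, titles in histories.items():
        if other == user:
            continue
        # The number of shared titles acts as a simple similarity weight.
        overlap = len(seen & titles)
        for title in titles - seen:
            scores[title] += overlap
    # Keep only titles backed by at least one similar viewer.
    return [title for title, score in scores.most_common(top_n) if score > 0]

print(recommend("ana", histories))  # ['Show D']
```

Real recommendation systems learn from far more signals, such as pauses, skips, and time of day, but the core idea is the same: past behavior of similar users helps predict what we might enjoy next.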

These tech giants don’t just use data science—they run on it. It’s the invisible engine behind the scenes, optimizing, personalizing, and innovating at scale. By studying these examples, we see how data science isn’t just theory—it’s shaping our daily experiences.

Pillars of data science

As we dive deeper into data science, we must understand the key building blocks that support everything we do. These pillars show up in nearly every project—from exploring datasets to building predictive models—and grasping each helps us make smarter, more effective decisions with data. Let’s take a closer look at the foundations that make data science work:

[Figure: The data science pyramid]
  • Statistics and probability: We use statistics and probability to understand what our data is telling us. They help us spot patterns, measure variability, and test whether what we see in our data is meaningful or just random noise. These tools guide our thinking and help us make informed decisions under uncertainty.

  • Linear algebra and basic math: Much of our data is organized in tables or matrices—rows and columns of numbers. That’s where linear algebra comes in. It gives us the structure to work with and transform our data, especially when we move into machine learning. Even simple math concepts help us effectively clean, scale, and reshape data.

  • Machine learning and AI: This is where things get exciting. We use machine learning to train computers to recognize patterns and make predictions based on data. From recommending products to detecting spam, these models learn from past information and help us automate tasks that would be hard to code by hand. As we go deeper, we also explore deep learning, a branch of machine learning within the broader field of artificial intelligence (AI). Deep learning uses neural networks with many layers to understand complex patterns in text, speech, images, and video.

  • Programming and data tools: We rely on programming languages like Python and tools like pandas and SQL to do the work. These tools let us pull data from databases, clean it up, explore it, and build models. Learning how to use them turns our ideas into real-world results.

  • Domain knowledge: Understanding our problem is as important as the data itself. Whether we’re working in healthcare, marketing, or climate science, knowing the context helps us ask the right questions, interpret results accurately, and deliver solutions that make a difference.
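As a small taste of the statistics pillar, the sketch below asks whether a difference between two groups could be just random noise. It uses an exact permutation test, which is one of several ways to approach the question. The delivery times and the function name `exact_permutation_p_value` are invented for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical delivery times in minutes, invented for illustration only.
old_route = [12, 14, 15, 16]
new_route = [8, 9, 10, 11]

def exact_permutation_p_value(a, b):
    """One-sided p-value for mean(a) - mean(b) over every possible regrouping."""
    observed = mean(a) - mean(b)
    pooled = a + b
    indices = range(len(pooled))
    extreme = total = 0
    # Try every way of splitting the pooled values into two groups of the same sizes.
    for chosen in combinations(indices, len(a)):
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen]
        total += 1
        if mean(group_a) - mean(group_b) >= observed:
            extreme += 1
    return extreme / total

p_value = exact_permutation_p_value(old_route, new_route)
print(f"p-value: {p_value:.4f}")  # p-value: 0.0143
```

A small p-value means that very few random regroupings produce a gap as large as the one we observed, so the difference is unlikely to be noise alone. Domain knowledge still matters here: we would want to know how the times were collected before trusting the result.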

Roles in data science

Data science is a team sport. We don’t work alone. It helps to know the different roles to see where a data scientist fits in and how everyone contributes to turning data into insights.

  • Data analysts dig into data, clean it up, and create reports that make the story behind the numbers clear and easy to understand.

  • Data engineers build and manage the systems that collect and prepare the needed data. They make sure data flows smoothly and is ready for us to use.

  • Machine learning engineers take cleaned data and build models that learn from it. These models help automate tasks and make predictions we can trust.

  • Data scientists—the role we’ll focus on—bring all these pieces together. We explore the data, build models, find meaningful insights, and communicate those insights clearly to help others make decisions. We need technical skills and a good sense of the business’s needs.

What’s next?

In this course, we’ll walk you through the essential skills and tools you need to become a data scientist—from understanding where data comes from, to cleaning and organizing it, exploring it visually, finding patterns with statistics and machine learning, and finally communicating your insights through clear, impactful stories.

Before we begin, make sure you’re comfortable with basic Python, basic math, and logical thinking. A curiosity for problem-solving and real-world questions will take you far.