Hello, Data! Meet SQL
Explore the basics of SQL to fetch, filter, and organize structured data stored in relational databases. Learn essential commands such as SELECT, FROM, WHERE, ORDER BY, and LIMIT to retrieve meaningful insights, handle complex filtering with logical conditions, and shape your data for analysis or reporting.
Spreadsheets are great, CSV files are common, and APIs stream data from countless sources—but when companies need speed, structure, and scale, they often rely on relational databases. Think of them as powerful digital filing cabinets where data lives in neatly organized tables: rows and columns designed for fast retrieval and analysis.
But tables alone aren’t enough. To explore them, ask questions, and extract insights, we need a universal language: SQL (Structured Query Language).
SQL is not a programming language—it’s a declarative language used to request data from relational databases. It helps you filter, organize, and shape data for analysis, reporting, or machine learning pipelines.
How SQL communicates with databases
SQL is how we talk to relational databases. It’s how we ask for the data we want, filter the noise, and shape results into meaningful answers. Whether you're analyzing employee salaries, tracking customer orders, or prepping data for machine learning models—SQL is a tool you'll reach for again and again.
Fun fact: The first version of SQL was developed at IBM in the 1970s. It originally stood for "Structured English Query Language" (SEQUEL)—which is why some people still pronounce it “sequel” today!
Here’s a simple diagram to show how SQL communicates with a database behind the scenes:
This lesson focuses on core SQL building blocks: SELECT, FROM, WHERE, ORDER BY, and LIMIT.
Sample data: The Employees table
To understand how SQL works, let’s start with a simple example table called employees. It stores information about employees in a company, including their ID, name, department, hire date, and salary.
EmployeeID | Name | Department | HireDate | Salary |
1 | Alice Johnson | Engineering | 2020-03-15 | 85,000.00 |
2 | Bob Smith | Marketing | 2019-07-22 | 65,000.00 |
3 | Carol Martinez | Sales | 2021-01-08 | 72,000.00 |
4 | David Liu | Engineering | 2018-11-02 | 95,000.00 |
5 | Eva Gómez | HR | 2022-05-16 | 58,000.00 |
6 | Frank O’Connor | Finance | 2017-09-30 | 78,000.00 |
7 | Grace Patel | Engineering | 2023-02-12 | 80,000.00 |
8 | Hiro Tanaka | Support | 2020-12-01 | 60,000.00 |
9 | Isabella Rossi | Marketing | 2021-06-25 | 67,000.00 |
10 | Ayra Stark | Sales | 2019-04-10 | 71,000.00 |
This table will help us explore the core building blocks of SQL.
Pulling what you need: SELECT and FROM
When we start querying a database, two SQL clauses are essential: SELECT and FROM. They work hand-in-hand to tell the database what data we want and where to find it.
Fun fact: SQL is case-insensitive, but people often write keywords in UPPERCASE to make queries easier to read.
SELECT is how we specify the exact columns—or pieces of information—we want to see in our results. For example, if we only care about employee names and salaries, we tell SQL to select just those columns.
But data doesn’t float in space—it lives inside tables. That’s where FROM comes in. It tells SQL which table to pull the data from. Since databases often contain many tables, we have to be specific about the source.
Together, SELECT and FROM answer two key questions:
What data do we want? (
SELECT)Where do we get it from? (
FROM)
Basic syntax
The following is the syntax to use SELECT and FROM clauses:
SELECT column1, column2, ...FROM table_name;
column1, column2, ...—the columns we want to retrievetable_name—the table where the data resides
Lines 1—3: Give us the
NameandSalarycolumns.Line 4: Get that data from the
employeestable.
This asks the database to fetch just the names and salaries from the employees table. Clean, clear, and efficient.
If we want to select all columns from a table, we can use * instead of listing each column name individually. For example, SELECT * FROM employees; retrieves every column in the employees table.
Filtering rows: The WHERE clause
When we're exploring data, we rarely want to look at every row in the table. Instead, we usually have a specific question in mind: Which employees work in Engineering? Who earns more than $80,000? Who was hired before 2020? That’s where the WHERE clause becomes essential.
The WHERE clause lets us filter rows so we only get the ones that match certain conditions—giving us control to focus on the most relevant slice of data.
Syntax
The following is the syntax to use WHERE clause:
SELECT column1, column2, ...FROM table_nameWHERE condition;
This is the basic structure. Here’s what each part does:
SELECT: Choose the columns we want to view.FROM: Point to the table we're querying.WHERE: Add a condition that filters the rows.
Example 1: Filter by department
Assume we’re interested in analyzing the Engineering team. Here’s how we’d narrow it down:
Strings in SQL go in single quotes, like 'Engineering', not double quotes.
This query returns only employees from the Engineering department. It’s a great way to isolate a team before doing deeper analysis—like comparing average salaries or evaluating hiring patterns.
Example 2: Filter by salary
Now let’s say we want to look at employees earning over $80,000—maybe for a compensation analysis.
This helps us find high earners and begin understanding salary distributions across the organization—essential for tasks like pay equity reviews or anomaly detection.
Example 3: Filter by hire date
If we’re studying employee retention or tenure, we might want to find those who were hired before a certain year:
Fun fact: Even if a date looks like a string, relational databases often store it in a special DATE or DATETIME format for sorting and comparisons.
This focuses our query on longer-serving employees, useful for cohort analysis or calculating average tenure.
Next, we’ll see how to combine multiple conditions to ask more nuanced questions.
Get specific: Combining conditions with AND, OR, and ()
As data engineer, we often want to ask layered questions. Maybe we want to find engineers hired after a certain date and earning above a certain salary. To do this, we can combine multiple conditions using logical operators like AND, OR, and parentheses ().
This approach helps us narrow down rows even further and add nuance to our data exploration.
Syntax
The following is the syntax to use AND, OR, and parentheses ().
SELECT column1, column2, ...FROM table_nameWHERE condition1 AND condition2;-- ORSELECT column1, column2, ...FROM table_nameWHERE condition1 OR condition2;-- Use parentheses for claritySELECT ...FROM ...WHERE (condition1 OR condition2) AND condition3;
Use parentheses to control logic when combining multiple ANDs and ORs—just like math!
Example 1: Use AND for precise filters
Let’s say we want employees in the Engineering department that also earns more than $80,000:
This filters the dataset down to employees who satisfy both conditions. This is helpful when we’re segmenting specific roles for performance or budget analysis.
Example 2: Use OR for flexible filters
Now imagine we’re interested in employees from either Marketing or Sales—perhaps to compare how customer-facing teams are compensated.
The OR keyword expands our filter to include multiple groups—ideal for combining related departments, regions, or project teams.
Logical conditions give us a powerful way to sharpen our filters and ask more complex, data-driven questions. Whether we're profiling talent, analyzing churn risk, or preparing data for modeling, understanding how to layer conditions is a must-have skill.
Fun fact: SQL follows Boolean logic, just like computer science fundamentals. You’re basically writing mini math problems for your data.
Make it digestible: ORDER BY and LIMIT
Once we’ve filtered our data, the next step is to present it in a way that’s clear and useful. As data engineer, we often need to answer questions like:
Who are the top earners in the company?
Which departments hired most recently?
What’s the latest record in our dataset?
SQL gives us two simple tools to shape our result set: ORDER BY for sorting, and LIMIT for restricting how many rows we see.
Sorting with ORDER BY
The ORDER BY clause lets us sort rows by one or more columns. By default, it sorts in ascending order, but we can use DESC for descending.
This query gives us the highest-paid employees first. Sorting is especially helpful when identifying trends, ranking metrics, or preparing data for visualization.
ORDER BY always comes after WHERE, and LIMIT comes last in the query.
We can also sort by multiple columns:
Here, we group by department and show the most recently hired employees within each one.
Limiting rows with LIMIT
Sometimes we don’t need every record—just the top few. That’s where LIMIT comes in handy:
This returns only the top three earners. It’s perfect for dashboards, summaries, or performance snapshots.
Fun fact: LIMIT is super useful when exploring large datasets—it keeps your result set manageable while you test different queries.
Combining ORDER BY and LIMIT helps us make results more digestible and relevant—whether we’re scanning for outliers, surfacing highlights, or building fast queries for large datasets.
Summary
As data engineer, understanding SQL is essential for retrieving the right information from relational databases. In this lesson, we explored how to use SELECT, FROM, WHERE, ORDER BY, and LIMIT—the core building blocks for querying structured data. These tools help us pull, filter, and organize data so we can turn it into actionable insight. Mastering these basics is the first step toward writing more advanced and powerful queries.
To reinforce these concepts, test your understanding with a quick quiz.
Quiz!
Which SQL keyword is used to specify the table to retrieve data from?
GET
SELECT
FROM
TABLE