Make Some Changes
Explore how to use SQL data manipulation commands INSERT, UPDATE, and DELETE to maintain accurate and fresh data in pipelines. Learn when and how to apply these commands to keep data reliable and ready for downstream analytics, ensuring your data engineering processes run smoothly and efficiently.
As a data engineer, your job isn’t just to move data around—you’re responsible for keeping it accurate, fresh, and in top shape. Whether you’re prepping a dataset for a data pipeline, scrubbing outdated records from a staging table, or loading a fresh batch into a warehouse, DML commands (INSERT, UPDATE, DELETE) are your go-to SQL tools.
These aren’t just boring commands—they’re like the surgical tools in a data engineer’s kit. Use them right, and your data systems hum along like a well-oiled machine. From inserting new records to cleaning up stale ones, here’s how DML commands keep data flowing through the pipeline:
Sample table
Before we dive in, meet the table we’ll be working with—think of it as a temporary parking spot for incoming product data before it’s transformed and pushed to production.
| product_id | name         | price | stock | discontinued |
|------------|--------------|-------|-------|--------------|
| 1          | T-shirt      | 19.99 | 50    | false        |
| 2          | Jeans        | 49.99 | 20    | false        |
| 3          | Baseball cap | 15.00 | 0     | true         |
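If you'd like to follow along, here's one way this staging table might be created and seeded. The column types and constraints are assumptions for illustration; your warehouse's dialect may differ:

```sql
-- Hypothetical DDL for the products staging table (types are assumed)
CREATE TABLE products (
    product_id   INTEGER PRIMARY KEY,
    name         VARCHAR(100) NOT NULL,
    price        DECIMAL(10, 2),
    stock        INTEGER DEFAULT 0,
    discontinued BOOLEAN DEFAULT FALSE
);

-- Seed the rows shown in the sample table
INSERT INTO products (product_id, name, price, stock, discontinued) VALUES
    (1, 'T-shirt',      19.99, 50, FALSE),
    (2, 'Jeans',        49.99, 20, FALSE),
    (3, 'Baseball cap', 15.00,  0, TRUE);
```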
1. INSERT
Imagine a new product batch just landed, and your pipeline needs to load that data into the products table. Enter the INSERT command.
Syntax
The syntax of the INSERT command is:
INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);
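Applied to the products staging table above, loading a newly arrived product could look like this (the product values here are made up for illustration):

```sql
-- Add a single new product to the staging table
INSERT INTO products (product_id, name, price, stock, discontinued)
VALUES (4, 'Hoodie', 39.99, 35, FALSE);
```

Listing the columns explicitly, as shown, keeps the statement valid even if the table later gains new columns with defaults.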