Introduction

Learn about SQL and its use in data analysis.

History of SQL

The Structured Query Language (SQL) was developed at IBM in the early 1970s, building on Edgar F. Codd's work on the relational model. It is the most popular language for querying data stored in a relational database management system (RDBMS). SQL became an ANSI standard in 1986 (SQL-86). The 1992 revision (SQL-92, often called ANSI-92) is the one that modern-day SQL most closely resembles.

Most commercial and popular open-source database engines, such as PostgreSQL, MySQL, Oracle, and SQL Server, follow the ANSI-92 standard. However, there are still some subtle differences and features that are unique to each engine.

In this course, we are going to focus on PostgreSQL, a popular open-source database. However, the concepts can be adapted and applied to other ANSI-compliant database engines as well.

PostgreSQL

PostgreSQL descends from the Berkeley POSTGRES project, which began in 1986. The open-source release was first named Postgres95, and the team renamed it PostgreSQL in 1996.

PostgreSQL is free and open-source. Contributors from all over the world, across many different industries and disciplines, work tirelessly on developing PostgreSQL and its many extensions. Over the past few years, PostgreSQL has become one of the fastest-growing database engines.

Trend of top 5 RDBMS databases in the past 10 years

SQL for data analysis

With data pouring in from many different sources, processing all of it has become a real challenge. This challenge sparked new technologies focused on processing large amounts of data, such as NoSQL databases and libraries like pandas.

These technologies have valid use cases, but more often than not, a lot can be done more quickly and efficiently directly in the database. Many developers and data scientists have lost track of what they can do with the database they already have, so in this course, we are going to focus on analyzing data directly in the database using SQL.
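As a rough illustration of what "analyzing data directly in the database" can look like, here is a minimal sketch in which the database does the grouping and summing, so only the summary rows reach the client. It makes several assumptions that are not from the course: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL so that it runs anywhere, and the sales table and its rows are hypothetical. The SELECT statement itself uses only standard SQL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL so the sketch is
# self-contained. The "sales" table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120), ("north", 80), ("south", 200), ("south", 50), ("east", 75)],
)

# The database groups and sums the rows; only one row per region
# is sent back to the client.
summary = conn.execute(
    """
    SELECT region, COUNT(*) AS orders, SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY total DESC
    """
).fetchall()

for region, orders, total in summary:
    print(region, orders, total)

conn.close()
```

The alternative would be to fetch every row and aggregate it in application code, which moves more data over the connection and repeats work the database engine is already built to do.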

What we’ll learn

In this course, we're going to focus on analyzing data using SQL. We are going to tackle real-life problems we encounter on a daily basis.

  • Explore data: How to get familiar with a fresh dataset using various SQL functions.
  • Clean data: How to clean messy data and get it ready for further analysis.
  • Transform data: How to transform data in different ways and gain actionable insight from it.
  • Visualize data: How to visualize data to quickly identify trends and other features in it.