SQL databases have been around since the 1970s. Some of the smartest people in the world worked on making it easy to slice, dice, fetch and manipulate data quickly and efficiently. SQL databases have come such a long way that many developers and data scientists have lost track of what they can do with plain SQL.

In this course, you will learn advanced concepts and techniques for analyzing data using SQL. You will gain hands-on experience in producing descriptive statistics, breaking a large query into multiple steps and samples, and cleaning and preparing data for analysis. You will learn how to produce aggregate results and subtotals, analyze a time series using cumulative aggregation and window frames, handle and fill missing data, and produce buckets and histograms.

After completing this course, you will be able to gain actionable insight from your data using nothing but SQL!

Practical Data Analysis with SQL

Extracting a small subset of a table is often called **sampling**. There are various reasons to use sampling, for example:

1. Performing estimations on large datasets: When working on large tables, we are sometimes willing to compromise accuracy in favor of speed. By sampling a portion of the table we can produce less accurate results more quickly.

2. Producing a training set: When doing data analysis using machine learning models, it is often necessary to train the model on a portion of the data. This portion is known as a training set. The training set can be produced by sampling the table.

## Sampling with `LIMIT`

A simple way to fetch a random portion of a table is to combine `random` with `LIMIT`.
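As a minimal sketch of this approach, the query below sorts the rows by a random key and keeps only the first few. It assumes a PostgreSQL-style `random()` function and `generate_series`. The table `users`, its columns, and the sample size of 10 are hypothetical and exist only to make the example self-contained.

```sql
-- Hypothetical table, created only for illustration.
CREATE TABLE users (
    id   integer PRIMARY KEY,
    name text
);

-- Fill the table with 1,000 rows (generate_series is PostgreSQL-specific).
INSERT INTO users (id, name)
SELECT n, 'user ' || n
FROM generate_series(1, 1000) AS n;

-- Give every row a random sort key, then keep the first 10 rows.
-- Each execution is expected to return a different set of rows.
SELECT *
FROM users
ORDER BY random()
LIMIT 10;
```

Note that the database still has to read every row and assign it a random sort key before applying `LIMIT`, so the cost of this query grows with the size of the table rather than with the size of the sample.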


Learn to sample a subset of a table using SQL.
