A window function in SQL is a very basic and essential utility. Window functions are a way to perform a variety of mathematical and analytical operations on a single row or multiple rows. The driving idea behind the purpose of the window function is to enable the user to work concurrently with multiple values of the result set. It is also the accessibility of window functions in SQL that makes it a very productive language to perform statistical operations like ranking and aggregates over a specific window (which we call frame).
Window functions work by accessing a specific window of the data in the input rows. We call this particular window frame.
SQL provides two major types of functions that can be performed on the selected frame. These functions are:
Aggregate window functions
Ranking window functions
We’ll discuss aggregate window functions in detail.
The functions that perform computations on multiple values/rows and return a single value are called aggregate window functions. Some of these functions are MIN()
, MAX()
, SUM()
, COUNT()
and AVERAGE()
.
We can use array_agg()
as our aggregate function. It takes aggregates of the selected frame and outputs the content in the desired format. It has the following definition:
select x, array_agg(x) over (order by x)
The over(order by x)
actually means over (order by x rows between unbounded preceding and current row)
. We can also give the function definition as follows:
select x, array_agg(x) over (order by x rows between unbounded preceding and current row)
We can also opt not to use a specific frame. In this case, the aggregate window function would have the following definition:
select x, array_agg(x) over() as frame
Let’s analyze an example that calculates the average and sum of the whole set of rows:
Performing Window Aggregate Functions with no frame specification*/select x,array_agg(x) over () as frame,sum(x) over () as sum,avg(x) over () as avg,max(x) over () as max,min(x) over () as min,x::float/sum(x) over() as percentagefrom generate_series(1, 5) as t(x);
The output of this sample would be as follows:
We can see that the aggregate window function can very easily perform aggregate functions on a whole set of rows. We can easily do calculations like percentage and standard deviation using a whole set of rows.
We can also use frames to perform these functions on selected values in rows. The following example performs the same.
/* Performing Window Aggregate Functions with frame specification*/select x,array_agg(x) over (order by x),sum(x) over (order by x) as sum,avg(x) over (order by x) as avg,max(x) over (order by x) as max,min(x) over (order by x) as min,x::float/sum(x) over(order by x) as percentagefrom generate_series(1, 3) as t(x);
The output would be as follows:
We can also try to implement different window aggregate functions in the playground provided below:
select x,array_agg(x) over () as frame,sum(x) over () as sum,avg(x) over () as avg,max(x) over () as max,min(x) over () as min,x::float/sum(x) over() as percentagefrom generate_series(1, 5) as t(x);
In this exercise, we see that the sum, average, maximum, minimum, and percentage operations are performed on the whole set of rows. This method is useful if we want to calculate the sum, average, maximum, and minimum.
select x,array_agg(x) over (order by x),sum(x) over (order by x) as sum,avg(x) over (order by x) as avg,max(x) over (order by x) as max,min(x) over (order by x) as min,x::float/sum(x) over(order by x) as percentagefrom generate_series(1, 5) as t(x);
This method is also useful if we want to calculate the percentage fractions.