Difference between FIRST_VALUE(), LAST_VALUE(), and NTH_VALUE()
A window function is used to calculate values over a set of rows. It lets you perform calculations across any number of table rows that are linked to the current row. A window in SQL refers to the set of rows on which the window function operates. Window frames specify the range of rows within a partition and they allow us to do more precise calculations. Partitions are used to group rows together within the same partition before applying the window function. A partition can be defined using the PARTITION BY clause. The window functions can be predefined or user-defined. Window functions in SQL can be broadly categorized into the following three categories:
Ranking window functions
Analytic window functions
In SQL, analytic window functions assign values to the rows in the result set based on the values in the other rows. Analytic window functions are also called value window functions. These functions include LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE(), and NTH_VALUE().
The window functions FIRST_VALUE(), LAST_VALUE(), and NTH_VALUE() are available in various relational database management systems (RDBMS), including PostgreSQL, that support advanced SQL functionalities.
FIRST_VALUE(): Returns the first value in the data set.LAST_VALUE(): Returns the last value in the data set.NTH_VALUE(): Returns the Nth value in the data set.
Syntax of a window function
Following is the syntax for a window function:
window_function ()OVER ( [PARTITION BY clause][ORDER BY clause][window_frame clause])
The
window_function()represents the window function. We replace it with the actual window function to perform the operation.We create partitions by defining the column/condition in
PARTITION BY clause. This clause becomes optional if there’s no need to create partitions. The window function will treat the dataset as a single window in such cases.The
ORDER BY clausespecifies the order within each partition.The
window_frame clauseis used to define the range of rows considered for the calculations.
Coding example
Let’s understand each of the window functions through coding examples:
Example table
Let’s suppose we have the following data on the student’s exam results. We’re using the SELECT query to view the records in the Exam_result table.
SELECT * FROM Exam_result;
The FIRST_VALUE() window function in PostgreSQL
The window function FIRST_VALUE() will return the value from the first row in the window frame based on the specified order. Following is the syntax for FIRST_VALUE():
FIRST_VALUE ( [expression] )OVER ( [PARTITION BY clause][ORDER BY clause] )
The
FIRST_VALUE([expression])represents the window function with the expression value.We create partitions by defining the column/condition in
PARTITION BY clause. This clause becomes optional if there’s no need to create partitions. The window function will treat the dataset as a single window in such cases.The
ORDER BY clausespecifies the order within each partition.
We can use this function to retrieve the lowest marks in a subject. The coding example is:
SELECTstd_id,std_name,marks,subject_name,FIRST_VALUE(marks)OVER(PARTITION BY subject_nameORDER BY marks ASC)FROMExam_result;
Lines 1–5: We use the
SELECTstatement to read specific columns from theExam_resulttable, includingstd_id,std_name,marks, andsubject_name.Lines 6–10: The
FIRST_VALUE()window function is used to calculate the first value of themarkscolumn in each partition defined by thesubject_namecolumn. The function is applied to each row within the partition, and the result is stored in theFIRST_VALUE(marks)column in ascending order as written inORDER BYclause.Lines 11–12: The
FROM clausespecifies the source tableExam_resultfrom which the data is being queried.
The LAST_VALUE() window function in PostgreSQL
The window function LAST_VALUE() will return the value in the last row in the window frame based on the specified order. Following is the syntax for LAST_VALUE():
LAST_VALUE ( [expression] )OVER ( [PARTITION BY clause][ORDER BY clause][RANGE clause] )
The
LAST_VALUE([expression])represents the window function with the expression value.We create partitions by defining the column/condition in
PARTITION BY clause. This is the same argument as given forFIRST_VALUE().The
ORDER BY clausespecifies the order within each partition.The
RANGE clausewith the syntaxRANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWINGmeans that the window frame for each row in the partition includes all rows from the first row to the last row of the partition, without regard to the actual values of the rows but based on their order.
The coding example is:
SELECTstd_id,std_name,marks,subject_name,LAST_VALUE(marks)OVER(PARTITION BY subject_nameORDER BY marks ASC)FROMExam_result;
Lines 1–5: We use the
SELECTstatement to read specific columns from theExam_resulttable, includingstd_id,std_name,marks, andsubject_name.Lines 6–10: The
LAST_VALUE()window function is used to calculate the last value of themarkscolumn in each partition defined by thesubject_namecolumn. The function is applied to each row within the partition, and the result is stored in theLAST_VALUE(marks)column in ascending order as written inORDER BYclause.Lines 11–12: The
FROM clausespecifies the source tableExam_resultfrom which the data is being queried.
We can see that the marks and last_value columns output the same values. We can use this function to retrieve the highest marks in a subject, as demonstrated in the example below:
SELECTstd_id,std_name,marks,subject_name,LAST_VALUE(marks)OVER(PARTITION BY subject_nameORDER BY marks ASCROWS BETWEENUNBOUNDED PRECEDING ANDUNBOUNDED FOLLOWING)FROMExam_result;
Lines 1–5: We use the
SELECTstatement to read specific columns from theExam_resulttable, includingstd_id,std_name,marks, andsubject_name.Lines 6–13: The
LAST_VALUE()window function is used to calculate the last value of themarkscolumn in each partition defined by thesubject_namecolumn. The function is applied to each row within the partition, and the result is stored in theLAST_VALUE(marks)column. The output is sorted in ascending order ofmarkscolumn. TheROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWINGclause specifies that the window frame for each row in the partition includes all rows from the first row to the last row of the partition, without regard to the actual values of the rows but based on their order. This means that for each row, the function considers all rows in the partition for calculating the result.Lines 14–15: The
FROM clausespecifies the source tableExam_resultfrom which the data is being queried.
The NTH_VALUE() window function in PostgreSQL
The window function NTH_VALUE() returns the value in the nth row in the window frame based on the specified order. Following is the syntax for NTH_VALUE():
NTH_VALUE ( [expression, n] )OVER ( [PARTITION BY clause][ORDER BY clause][RANGE clause] )
The
FIRST_VALUE([expression])represents the window function with the expression value.We create partitions by defining the column/condition in
PARTITION BY clause. This is the same argument as given forFIRST_VALUE()andLAST_VALUE().The
ORDER BY clausespecifies the order within each partition.The
RANGE clausehas already been explained in theLAST_VALUE()section.
This function requires an additional argument specifying the position n within the window frame. Let’s retrieve the second-highest marks in a subject as follows:
SELECTstd_id,std_name,marks,subject_name,NTH_VALUE(marks, 2)OVER(PARTITION BY subject_nameORDER BY marks DESCRANGE BETWEENUNBOUNDED PRECEDING ANDUNBOUNDED FOLLOWING)FROMExam_result;
Lines 1–5: We use the
SELECTstatement to read specific columns from theExam_resulttable, includingstd_id,std_name,marks, andsubject_name.Lines 6–13: The
NTH_VALUE()window function is used to calculate the value of themarkscolumn in each partition defined by thesubject_namecolumn. The function is applied to each row within the partition, and the result is stored in theNTH_VALUE(marks)column. The output is sorted in descending order ofmarkscolumn. TheROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING clausespecifies that the window frame for each row in the partition includes all rows from the first row to the last row of the partition. The output changes the value ofNULLrecord. You may try removing this clause to see how the output looks without this clause.Lines 14–15: The
FROM clausespecifies the source tableExam_resultfrom which the data is being queried.
Comparing FIRST_VALUE(), LAST_VALUE(), and NTH_VALUE()
The window functions FIRST_VALUE(), LAST_VALUE(), and NTH_VALUE() are used to retrieve specific values based on the order. Let’s use all three functions in a single query. We’re using FIRST_VALUE() function to retrieve the highest marks, LAST_VALUE() to retrieve the lowest marks, and NTH_VALUE() function to retrieve the second-highest value in a subject:
SELECTstd_id,std_name,marks,subject_name,FIRST_VALUE(marks)OVER(PARTITION BY subject_nameORDER BY marks DESC),LAST_VALUE(marks)OVER(PARTITION BY subject_nameORDER BY marks DESC),NTH_VALUE(marks, 2)OVER(PARTITION BY subject_nameORDER BY marks DESCRANGE BETWEENUNBOUNDED PRECEDING ANDUNBOUNDED FOLLOWING)FROMExam_result;
Lines 1–5: We use the
SELECTstatement to retrieve specific columnsstd_id,std_name,marks, andsubject_namefrom theExam_resulttable.Lines 6–23: The
FIRST_VALUE(),LAST_VALUE(), andNTH_VALUE()window functions are used to calculate certain values within each partition defined by thesubject_namecolumn.FIRST_VALUE(marks)calculates the first value of themarkscolumn in each partition, ordered bymarksin descending order.LAST_VALUE(marks)calculates the last value of themarkscolumn in each partition, ordered bymarksin descending order.NTH_VALUE(marks, 2)calculates the value of themarkscolumn that corresponds to the second row in each partition, ordered bymarksin descending order. TheRANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING clausespecifies that the window frame for each row in the partition includes all rows from the first row to the last row of the partition.
Lines 24–25: The
FROM clausespecifies the source tableExam_resultfrom which the data is being queried.
Conclusion
The window functions FIRST_VALUE(), LAST_VALUE(), and NTH_VALUE() are used for data analysis. These functions enable users to retrieve precise values from specified positions within a window frame based on ordered partitions.
Free Resources