SQL window functions are a well-known feature, especially in the data warehousing space. They offer a way to perform calculations over a set of rows related to the current row, enabling more advanced and flexible analysis. Instead of applying an aggregate function to the entire result set, window functions let you define a window, a subset of rows, to which the function is applied.
Syntax
Window functions are invoked with an OVER clause in SQL queries and can compute aggregations such as SUM, AVG, ROW_NUMBER, RANK, etc.
<window fxn> OVER (PARTITION BY <partition expression> ORDER BY <ordering expression> <frame spec>)
Key components and structure
To use window functions effectively in SQL queries and perform advanced calculations on specific subsets of data, users need to understand the key components of a window function.
- Partitioning
The PARTITION BY clause divides the row set into groups based on a specified column's values. Each partition is treated as a separate dataset for the window function calculation. For instance, to calculate the average headcount per county, partition on the county column.
- Ordering
The ORDER BY clause specifies the order in which the rows within each partition are processed by the window function. It determines the sequence of calculations or aggregations performed. For instance, to rank the average headcount per county, order by the headcount column.
- Framing
The frame specification, an optional parameter and the final refinement to a window function, further defines the subset of rows within each partition to include in the calculation, effectively setting the window boundaries. There are two frame types: ROWS and RANGE.
- The ROWS frame specifies a fixed number of preceding or following rows.
- The RANGE frame defines a range of values based on the ordering expression.
The choice between ROWS and RANGE depends on the analysis at hand.
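To make the three components concrete, here is a minimal sketch using SQLite's built-in window function support (SQLite 3.25+); the table and column names are invented for illustration. A ROWS frame computes a 3-row moving average within each partition:

```python
import sqlite3

# Hypothetical table: per-office headcounts within a county.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE offices (county TEXT, headcount INTEGER)")
con.executemany(
    "INSERT INTO offices VALUES (?, ?)",
    [("Kent", 10), ("Kent", 20), ("Kent", 30), ("Kent", 40)],
)

# ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING: each row's window is itself
# plus at most one neighbor on each side, within its partition.
rows = con.execute("""
    SELECT headcount,
           AVG(headcount) OVER (
               PARTITION BY county
               ORDER BY headcount
               ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
           ) AS moving_avg
    FROM offices
    ORDER BY headcount
""").fetchall()
print(rows)  # [(10, 15.0), (20, 20.0), (30, 30.0), (40, 35.0)]
```

Note how the first and last rows average over only two values, because the frame is clipped at the partition boundaries.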
Common classifications of SQL window functions
Aggregate functions
Aggregating window functions will perform calculations on a window of the rowset and return a single value for each row in the result set.
| Function | Functionality |
| --- | --- |
| SUM | Calculates SUM(column) within a window |
| AVG | Calculates AVG(column) within a window |
| MIN | Returns MIN(column) within a window |
| MAX | Returns MAX(column) within a window |
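For instance, SUM with an ORDER BY inside the OVER clause yields a running total per partition. A small sketch with SQLite and invented sales data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100), ("east", 200), ("west", 50), ("west", 75)],
)

# With ORDER BY in the window, SUM becomes a running total per region.
rows = con.execute("""
    SELECT region, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY amount) AS running_sum
    FROM sales
    ORDER BY region, amount
""").fetchall()
print(rows)
# [('east', 100, 100), ('east', 200, 300), ('west', 50, 50), ('west', 75, 125)]
```

Dropping the ORDER BY from the OVER clause would instead give each row the full per-region total.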
Ranking functions
Ranking window functions will assign a unique rank to each row within a window based on the order specified.
| Function | Functionality |
| --- | --- |
| ROW_NUMBER | Assigns a unique sequential number to each row within the window |
| RANK | Assigns a rank to each row, with the same rank for tied rows and gaps in the sequence after ties |
| DENSE_RANK | Assigns a rank to each row, with the same rank for tied rows and no gaps in the sequence |
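The difference between the three is easiest to see on tied data. In this illustrative sketch (invented names and scores, run on SQLite), two rows tie on the top score:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
con.executemany("INSERT INTO scores VALUES (?, ?)",
                [("a", 90), ("b", 90), ("c", 80)])

# a and b tie on 90: RANK leaves a gap after the tie (1, 1, 3),
# DENSE_RANK does not (1, 1, 2); ROW_NUMBER is always unique.
rows = con.execute("""
    SELECT name,
           ROW_NUMBER() OVER (ORDER BY score DESC) AS rn,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS drnk
    FROM scores
    ORDER BY name
""").fetchall()
for name, rn, rnk, drnk in rows:
    print(name, rn, rnk, drnk)
```

Note that ROW_NUMBER's assignment between tied rows is arbitrary unless the ordering expression is made unambiguous.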
Analytic/OLAP functions
Analytic/OLAP window functions compute and return a value for each row based on a group of rows within a window.
| Function | Functionality |
| --- | --- |
| LAG | Returns a value from the previous row within the window; NULL for the first row (or a specified default) |
| LEAD | Returns a value from the subsequent row within the window; NULL for the last row (or a specified default) |
| FIRST_VALUE | Returns the first value within the window frame |
| LAST_VALUE | Returns the last value within the window frame |
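LAG and LEAD are the workhorses here for comparing each row to its neighbors. A minimal sketch with invented daily listener counts, run on SQLite:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE daily (day TEXT, listeners INTEGER)")
con.executemany("INSERT INTO daily VALUES (?, ?)",
                [("2023-08-01", 100), ("2023-08-02", 120), ("2023-08-03", 90)])

# LAG looks one row back, LEAD one row forward; both are NULL (None in
# Python) where no such row exists, unless a default is supplied.
rows = con.execute("""
    SELECT day, listeners,
           LAG(listeners)  OVER (ORDER BY day) AS prev,
           LEAD(listeners) OVER (ORDER BY day) AS next
    FROM daily
    ORDER BY day
""").fetchall()
print(rows)
# [('2023-08-01', 100, None, 120), ('2023-08-02', 120, 100, 90),
#  ('2023-08-03', 90, 120, None)]
```

One caveat worth knowing: with the default frame, LAST_VALUE returns the current row's value; extend the frame to UNBOUNDED FOLLOWING to get the partition's true last value.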
Cumulative functions
Cumulative window functions calculate the cumulative distribution or relative ranking of values within a window.
| Function | Functionality |
| --- | --- |
| CUME_DIST | Calculates the cumulative distribution of a value within a window |
| PERCENT_RANK | Calculates the relative rank of a value within a window |
| NTILE | Divides the rows within a window into a specified number of roughly equal groups and returns the group number each row falls into |
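These three are easy to see side by side on a small ordered set. In this sketch (invented data, run on SQLite), CUME_DIST steps through k/n, PERCENT_RANK through (rank-1)/(n-1), and NTILE(2) splits the rows into two buckets:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE vals (v INTEGER)")
con.executemany("INSERT INTO vals VALUES (?)", [(1,), (2,), (3,), (4,)])

rows = con.execute("""
    SELECT v,
           CUME_DIST()    OVER (ORDER BY v) AS cd,
           PERCENT_RANK() OVER (ORDER BY v) AS pr,
           NTILE(2)       OVER (ORDER BY v) AS bucket
    FROM vals
    ORDER BY v
""").fetchall()
for v, cd, pr, bucket in rows:
    print(v, cd, pr, bucket)
# 1 -> cd 0.25, pr 0.0,   bucket 1
# 2 -> cd 0.5,  pr 0.333, bucket 1
# 3 -> cd 0.75, pr 0.667, bucket 2
# 4 -> cd 1.0,  pr 1.0,   bucket 2
```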
Examples of the above mentioned window functions are detailed in the Imply docs.
Data sketches in window functions
Before going down the rabbit hole of data sketches in window functions, let us get a primer on what data sketches are.
Data sketches
Data sketches are probabilistic data structures and algorithms designed to provide approximate answers to a range of queries at a greatly reduced memory cost. Sketches are particularly useful when querying massive datasets, such as in Druid, where the cost-versus-performance trade-off is key for customers.
Some common types of sketches used in Druid are:
- Theta
- HLL (HyperLogLog)
- Quantiles
- Tuple
- T-Digest
Details about each sketch type and functions can be found in the official Druid documentation.
Data sketches can be used in window functions in a manner similar to aggregate functions.
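To build intuition before the Druid examples, here is a deliberately simplified HyperLogLog-style estimator in plain Python. This is an illustrative toy, not Druid's DataSketches implementation: a kilobyte of one-byte registers stands in for an exact set, and the estimate lands within a few percent of the true distinct count.

```python
import hashlib
import math

class TinyHLL:
    """Toy HyperLogLog: approximate distinct count in 2**p byte-sized registers."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p           # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash of the item.
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)               # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64-p bits
        # rank = position of the leftmost 1-bit in the remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        # Small-range correction (linear counting) when many registers are empty.
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)
        return raw

hll = TinyHLL()
for i in range(10000):
    hll.add(f"user-{i}")
print(round(hll.estimate()))  # close to 10000, within a few percent
```

Because each register only keeps a maximum, adding the same item repeatedly changes nothing, which is exactly why sketches make distinct counting cheap. Druid's sketch columns apply the same principle, with far more engineering behind them.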
Customer example
A typical real-world use for data sketches is computing approximate count distinct, quantile estimates, and so on. The computation differs with each sketch algorithm: in Druid, APPROX_COUNT_DISTINCT_DS_HLL differs from APPROX_COUNT_DISTINCT_DS_THETA, although both compute an approximate count distinct.
In both cases, the result is always an approximation. However, the accuracy of the approximation can be tuned in certain cases.
For instance, consider a Druid table of approximately 25 TB spanning 3 years of data, appropriately partitioned and compacted, with right-sized segments. Queries against this table usually cover a time interval of 1 day, 28 days, 3 months, or 12 months, and may carry several filters or only a time filter.
Even with all the right optimizations to the table, a vanilla COUNT DISTINCT query can take well over 30 seconds.
But there is a solution that maintains subsecond query response times. A user can simply add a sketch during ingestion by converting an existing column to a sketch object with a complex data type. At query time, they can run a sketch algorithm on that sketch object, and their queries should complete in milliseconds.
Query 1: A sample sketch query
In this query, the original column, listeners, was converted to a sketch object during ingestion and named listeners_sketch. The APPROX_COUNT_DISTINCT_DS_HLL sketch function computes the approximate distinct count over a year's worth of data. Thanks to the sketch, this query responds in 430 ms.
SELECT
APPROX_COUNT_DISTINCT_DS_HLL(listeners_sketch, 18, 'HLL_4') AS "listeners",
SUM(stream_count) AS "streams"
FROM blog_stream_listeners
WHERE __time >= '2022-06-12' AND __time <= '2023-06-18'
AND id = 'a5e6a1bf774242d49ce8f'
Query 2: A sample sketch query using window function
In this query, a LAG function is used to find the previous row's value of the sketch object listeners_sketch within each day-level partition, TIME_FLOOR(__time, 'P1D'), stepping across the country_code values.
SELECT
TIME_FLOOR(__time, 'P1D') AS dayLvl
, country_code
, APPROX_COUNT_DISTINCT_DS_HLL("listeners_sketch", 18, 'HLL_4') AS "value"
, LAG(ROUND(HLL_SKETCH_ESTIMATE(DS_HLL("listeners_sketch")), 0), 1, 0) OVER (PARTITION BY TIME_FLOOR(__time, 'P1D') ORDER BY TIME_FLOOR(__time, 'P1D')) AS prevVal
FROM "blog_listeners"
WHERE __time >= '2023-07-28' AND __time <= '2023-08-10'
AND "id" = '43e2fa4422a646a79f4e586a4203f510' AND "country_code" IN ('IN', 'US', 'UK', 'AU', 'SW', 'FI')
GROUP BY 1, 2
Result set
"dayLvl","country_code","value","prevVal"
"2023-07-28T00:00:00.000Z","AU","12974","null"
"2023-07-28T00:00:00.000Z","FI","1534","12960"
"2023-07-28T00:00:00.000Z","IN","10426","1568"
"2023-07-28T00:00:00.000Z","US","212970","10471"
"2023-07-29T00:00:00.000Z","AU","6983","null"
"2023-07-29T00:00:00.000Z","FI","759","7006"
"2023-07-29T00:00:00.000Z","IN","5244","760"
"2023-07-29T00:00:00.000Z","US","127995","5341"
Query 3: A sample sketch query using window functions, a bit more complex
In this query, multiple things are computed using window functions and sketches: the per-day value using HLL_SKETCH_ESTIMATE for the blog id over the queried time interval and, more interestingly, the cumulative value (cumeVal) and the cumulative unique value (cumeUniqueVal), partitioned by country_code and ordered by day level (dayLvl).
SELECT
TIME_FLOOR(__time, 'P1D') AS dayLvl
, country_code
, ROUND(HLL_SKETCH_ESTIMATE(DS_HLL("listeners_sketch", 18, 'HLL_4')), 0) AS "value"
, ROUND(HLL_SKETCH_ESTIMATE(DS_HLL(DS_HLL("listeners_sketch", 18, 'HLL_4'), 18, 'HLL_4') OVER (PARTITION BY country_code ORDER BY TIME_FLOOR(__time, 'P1D'))), 0) AS cumeUniqueVal
, SUM(APPROX_COUNT_DISTINCT_DS_HLL("listeners_sketch", 18, 'HLL_4')) OVER (PARTITION BY country_code ORDER BY TIME_FLOOR(__time, 'P1D')) AS cumeVal
FROM "blog_listeners"
WHERE __time >= '2023-08-02' AND __time <= '2023-08-11'
AND "id" = '0003cb24683a448a8f6fe0a0573857cd' AND "country_code" IN ('IN')
GROUP BY 1, 2
Result set
"dayLvl","country_code","value","cumeUniqueVal","cumeVal"
"2023-08-02T00:00:00.000Z","IN","492","492","492"
"2023-08-03T00:00:00.000Z","IN","452","895","944"
"2023-08-04T00:00:00.000Z","IN","488","1306","1432"
"2023-08-05T00:00:00.000Z","IN","413","1630","1845"
"2023-08-06T00:00:00.000Z","IN","407","1931","2252"
"2023-08-07T00:00:00.000Z","IN","369","2198","2621"
"2023-08-08T00:00:00.000Z","IN","359","2438","2980"
"2023-08-09T00:00:00.000Z","IN","379","2680","3359"
"2023-08-10T00:00:00.000Z","IN","313","2873","3672"
"2023-08-11T00:00:00.000Z","IN","260","3032","3932"
Conclusion
When used alongside data sketches, window functions are powerful tools for tackling complex real-world use cases in the real-time analytics space, provided careful guard rails are in place. By delivering subsecond results through approximation, window functions with sketches offer a compelling price-for-performance trade-off, with only a small compromise in accuracy.