What Are Window Functions?
Window functions perform calculations across a set of rows that are related to the current row — without collapsing them into a single aggregate like GROUP BY does. They are one of the most powerful tools in a data analyst's SQL toolkit, enabling ranking, running totals, moving averages, and period-over-period comparisons without self-joins or subqueries.
The syntax uses an OVER() clause that defines the "window" of rows to consider for each output row.
-- Basic structure
SELECT
column1,
FUNCTION() OVER (
PARTITION BY partition_column -- optional: reset per group
ORDER BY order_column -- optional: defines row order
ROWS BETWEEN ... AND ... -- optional: frame definition
) AS result
FROM table;
Window Function Categories
| Category | Functions | What They Return |
|---|---|---|
| Ranking | ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(n) | A position number for each row within its partition |
| Offset | LAG(col, n), LEAD(col, n), FIRST_VALUE(col), LAST_VALUE(col), NTH_VALUE(col, n) | A value from another row relative to the current row |
| Aggregate | SUM(), AVG(), MIN(), MAX(), COUNT() used with OVER() | Running or windowed aggregate without collapsing rows |
| Statistical | PERCENT_RANK(), CUME_DIST(), PERCENTILE_CONT(), PERCENTILE_DISC() | Relative rank or percentile within the partition |
Ranking Functions
Ranking functions assign a sequential number to each row within a partition. The three main variants differ in how they handle ties.
-- Compare ROW_NUMBER, RANK, and DENSE_RANK
SELECT
salesperson,
region,
revenue,
ROW_NUMBER() OVER (PARTITION BY region ORDER BY revenue DESC) AS row_num,
RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS rank,
DENSE_RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS dense_rank
FROM sales;
-- Result for two tied rows with revenue = 10000:
-- row_num=1, row_num=2 (always unique)
-- rank=1, rank=1 (then skips to 3)
-- dense_rank=1, dense_rank=1 (then goes to 2, no skip)
-- Top 3 products per category (practical use of ROW_NUMBER)
SELECT * FROM (
SELECT
product_name, category, revenue,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS rn
FROM products
) t
WHERE rn <= 3;
Offset Functions: LAG and LEAD
LAG and LEAD let you access values from previous or following rows without a self-join, making period-over-period comparisons clean and efficient.
-- Month-over-month revenue growth
SELECT
month,
revenue,
LAG(revenue, 1) OVER (ORDER BY month) AS prev_month_revenue,
revenue - LAG(revenue, 1) OVER (ORDER BY month) AS mom_change,
ROUND(
100.0 * (revenue - LAG(revenue, 1) OVER (ORDER BY month))
/ NULLIF(LAG(revenue, 1) OVER (ORDER BY month), 0), 1
) AS mom_pct_change
FROM monthly_revenue;
-- Year-over-year: LAG with offset=12 for monthly data
SELECT
month,
revenue,
LAG(revenue, 12) OVER (ORDER BY month) AS same_month_last_year,
ROUND(100.0 * (revenue - LAG(revenue, 12) OVER (ORDER BY month))
/ NULLIF(LAG(revenue, 12) OVER (ORDER BY month), 0), 1) AS yoy_pct
FROM monthly_revenue;
-- Time to next purchase per customer
SELECT
customer_id,
order_date,
LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS next_order_date,
DATEDIFF(LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date), order_date) AS days_to_next
FROM orders;
Running Totals and Moving Averages
-- Running (cumulative) total
SELECT
order_date,
daily_revenue,
SUM(daily_revenue) OVER (ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM daily_sales;
-- 7-day moving average (current row + 6 preceding rows)
SELECT
order_date,
daily_revenue,
AVG(daily_revenue) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS ma_7d
FROM daily_sales;
-- Running total reset per month (PARTITION BY)
SELECT
order_date,
DATE_TRUNC('month', order_date) AS month,
daily_revenue,
SUM(daily_revenue) OVER (
PARTITION BY DATE_TRUNC('month', order_date)
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total_mtd
FROM daily_sales;
Frame Definitions: ROWS vs RANGE
| Frame Clause | Meaning | Use Case |
|---|---|---|
| ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW | All rows from the start of the partition to the current row | Cumulative sum, running count |
| ROWS BETWEEN 6 PRECEDING AND CURRENT ROW | Current row and the 6 rows before it | 7-day moving average |
| ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING | Centred window of 7 rows | Smoothing with equal past/future context |
| ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING | Current row to the last row in partition | Reverse cumulative sum |
| RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW | All rows with ORDER BY value <= current row (includes ties) | When tied rows must share the same cumulative value |
Percentiles and Distributions
-- Percentile rank of each customer by revenue
SELECT
customer_id,
total_revenue,
PERCENT_RANK() OVER (ORDER BY total_revenue) AS pct_rank, -- 0 to 1
NTILE(10) OVER (ORDER BY total_revenue DESC) AS decile -- 1=top 10%
FROM customer_summary;
-- Median revenue per product category (PERCENTILE_CONT)
SELECT
category,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY revenue) AS median_revenue,
PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY revenue) AS p25,
PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY revenue) AS p75
FROM orders
GROUP BY category;
-- Running percentage of total
SELECT
product_name,
revenue,
ROUND(100.0 * revenue / SUM(revenue) OVER (), 2) AS pct_of_total,
ROUND(100.0 * SUM(revenue) OVER (ORDER BY revenue DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
/ SUM(revenue) OVER (), 2) AS cumulative_pct
FROM product_sales
ORDER BY revenue DESC;
Practical Pattern: Sessionisation
Sessionisation groups user events into sessions based on inactivity gaps. This is a classic window function pattern used in web and product analytics.
-- Define a new session whenever the gap since the last event > 30 minutes
WITH events_with_gap AS (
SELECT
user_id,
event_time,
LAG(event_time) OVER (PARTITION BY user_id ORDER BY event_time) AS prev_event_time,
DATEDIFF(MINUTE,
LAG(event_time) OVER (PARTITION BY user_id ORDER BY event_time),
event_time
) AS gap_minutes
FROM user_events
),
session_starts AS (
SELECT *,
CASE WHEN gap_minutes > 30 OR gap_minutes IS NULL THEN 1 ELSE 0 END AS is_new_session
FROM events_with_gap
)
SELECT
user_id,
event_time,
SUM(is_new_session) OVER (PARTITION BY user_id ORDER BY event_time
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS session_number
FROM session_starts;
Common Mistakes to Avoid
| Mistake | What Goes Wrong | Fix |
|---|---|---|
| Using WHERE to filter on window function results | Window functions are evaluated after WHERE; the filter won't see them | Wrap in a subquery or CTE and filter in the outer query |
| Omitting ORDER BY in ranking functions | ROW_NUMBER() without ORDER BY returns arbitrary order — results are non-deterministic | Always include ORDER BY inside OVER() for ranking and offset functions |
| Confusing ROWS and RANGE frames | RANGE includes all tied values; ROWS is strictly positional | Use ROWS for moving averages; RANGE when ties must share the same cumulative value |
| Applying window functions inside aggregations | You cannot nest a window function inside SUM() or AVG() | Compute window functions first in a CTE, then aggregate in the outer query |
Summary
Window functions unlock a class of analytical queries that would otherwise require multiple self-joins, correlated subqueries, or application-side processing. Mastering PARTITION BY and ORDER BY inside OVER() is the foundation; understanding frame clauses (ROWS vs RANGE, PRECEDING, FOLLOWING) lets you build any moving window calculation. The most frequently used patterns in analytics are: ranking with ROW_NUMBER for top-N per group, LAG/LEAD for period-over-period comparison, running SUM for cumulative totals, and moving AVG for smoothed trend lines. Practice these on your own datasets and they will quickly become the first tool you reach for in analytical SQL.
Create a free reader account to keep reading.