SQL Window Functions for Data Analysts

What Are Window Functions?

Window functions perform calculations across a set of rows related to the current row — without collapsing those rows into a single output row the way GROUP BY aggregations do. The "window" is the set of rows defined by an OVER() clause. This makes window functions essential for rankings, running totals, moving averages, and period-over-period comparisons.

Window Function Syntax

The general syntax is:

function_name(expression) OVER ( [PARTITION BY partition_expression] [ORDER BY sort_expression] [frame_clause] )

Clause	Purpose	Example
PARTITION BY	Divides rows into groups; resets the window for each group	PARTITION BY department_id
ORDER BY	Defines row order within each partition	ORDER BY sale_date ASC
ROWS/RANGE frame	Narrows the window to a subset of rows around the current row	ROWS BETWEEN 6 PRECEDING AND CURRENT ROW

Ranking Functions

Ranking functions assign a position to each row within its partition.

Function	Ties Handled By	Gaps After Ties	Use Case
ROW_NUMBER()	Arbitrary ordering	N/A	Unique sequential ID per partition
RANK()	Same rank for ties	Yes (1,1,3)	Olympic-style ranking
DENSE_RANK()	Same rank for ties	No (1,1,2)	Ranking without gaps
NTILE(n)	Distributes rows into n buckets	N/A	Quartiles, deciles, percentiles
PERCENT_RANK()	Relative rank 0–1	N/A	Percentile position of each row
CUME_DIST()	Cumulative distribution 0–1	N/A	What fraction of rows are ≤ this value

Example — rank customers by total spend within each region:

SELECT customer_id, region, total_spend, RANK() OVER (PARTITION BY region ORDER BY total_spend DESC) AS spend_rank FROM customer_summary;

Aggregate Window Functions

Standard aggregate functions (SUM, AVG, COUNT, MIN, MAX) become window functions when paired with OVER(). Unlike GROUP BY, they return a value for every row while still computing across the window.

Pattern	SQL	Result Per Row
Running total	SUM(amount) OVER (ORDER BY date)	Cumulative sum up to current row
Partition total	SUM(amount) OVER (PARTITION BY dept)	Total for the department (same on every row)
% of partition	amount / SUM(amount) OVER (PARTITION BY dept)	Row's share of department total
7-day moving avg	AVG(amount) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)	Average of last 7 rows
Cumulative count	COUNT(*) OVER (PARTITION BY user_id ORDER BY event_time)	Number of events so far per user

Value / Offset Functions

These functions access a value from a different row in the window without a self-join.

Function	Description	Example
LAG(col, n)	Value from n rows before the current row	LAG(revenue, 1) — previous period revenue
LEAD(col, n)	Value from n rows after the current row	LEAD(revenue, 1) — next period revenue
FIRST_VALUE(col)	First value in the window frame	First sale date in the cohort
LAST_VALUE(col)	Last value in the window frame	Most recent status per user
NTH_VALUE(col, n)	nth value in the window frame	Second-highest score in a group

Example — month-over-month revenue change:

SELECT month, revenue, LAG(revenue, 1) OVER (ORDER BY month) AS prev_revenue, revenue - LAG(revenue, 1) OVER (ORDER BY month) AS mom_change, ROUND(100.0 * (revenue - LAG(revenue, 1) OVER (ORDER BY month)) / NULLIF(LAG(revenue, 1) OVER (ORDER BY month), 0), 2) AS mom_pct FROM monthly_revenue;

Frame Clauses

The frame clause restricts which rows within the partition are included in the calculation. There are two modes:

Mode	Behavior
ROWS	Physical row positions — fast and unambiguous
RANGE	Logical value ranges — includes all rows with the same ORDER BY value

Frame Boundary	Meaning
UNBOUNDED PRECEDING	From the first row of the partition
n PRECEDING	n rows before current row
CURRENT ROW	The current row itself
n FOLLOWING	n rows after current row
UNBOUNDED FOLLOWING	To the last row of the partition

Default frame when ORDER BY is present: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. When ORDER BY is absent: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

Common Analytical Patterns

Use Case	Window Function Approach
Top N per group	ROW_NUMBER() OVER (PARTITION BY group ORDER BY metric DESC), filter WHERE rn <= N
Deduplication	ROW_NUMBER() OVER (PARTITION BY key ORDER BY updated_at DESC), keep rn = 1
Running total	SUM(amount) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING)
Cohort comparison	AVG(metric) OVER (PARTITION BY cohort) vs individual row value
Session detection	SUM(is_new_session) OVER (PARTITION BY user ORDER BY event_time) gives session ID
Gap and island	ROW_NUMBER() minus ROW_NUMBER() OVER different partitions to find consecutive sequences

Performance Considerations

Window functions are computed after WHERE, GROUP BY, and HAVING but before ORDER BY and LIMIT. This means you cannot filter on a window function result in the same query — you must wrap it in a subquery or CTE.

-- Correct: filter on window function using CTE WITH ranked AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn FROM employees ) SELECT * FROM ranked WHERE rn = 1;

Each distinct OVER() clause defines a separate window computation. Queries with many different OVER() clauses can be slow on large tables — consolidate partitions where possible and ensure the ORDER BY columns are indexed.

Dialect Differences

Feature	PostgreSQL	BigQuery	Snowflake	MySQL 8+
ROWS/RANGE frames	Yes	Yes	Yes	Yes
GROUPS frame	Yes (v11+)	No	No	No
NTH_VALUE	Yes	Yes	Yes	Yes
IGNORE NULLS in LAG/LEAD	No	Yes	Yes	No
WINDOW keyword (named window)	Yes	Yes	Yes	Yes

Named Windows

If the same OVER() clause appears multiple times, define a named window once using the WINDOW keyword to keep the query readable:

SELECT sale_date, amount, SUM(amount) OVER w AS running_total, AVG(amount) OVER w AS running_avg, COUNT(*) OVER w AS running_count FROM sales WINDOW w AS (ORDER BY sale_date ROWS UNBOUNDED PRECEDING);

Summary

SQL window functions give data analysts row-level detail and aggregate insight simultaneously. Mastering PARTITION BY, ORDER BY, and frame clauses unlocks a wide range of analytical patterns — rankings, running totals, moving averages, period comparisons, deduplication — that would otherwise require complex self-joins or application-layer logic. Because window functions execute after filtering and grouping, wrapping them in CTEs or subqueries is the standard pattern for filtering on their results.