Window Functions
Apply rank, lag, lead, rolling and cumulative aggregations as new columns. Partition by any categorical column without writing code.
Window Functions add new columns computed via row-level calculations across the dataset or within partitioned groups. Unlike Group By, window functions preserve every original row - the new column appears alongside the existing data. You can configure multiple window definitions in one operation, each targeting any column.
Window functions add new columns. Original rows and columns are unchanged. Row count stays the same.
When to Use Window Functions
- Rank within group. Rank
revenuewithin eachregionto identify top performers per territory. - Lag for period-over-period comparison. Shift
revenueby 1 row to compare each order against the previous one. - 7-day rolling average. Apply Rolling Mean with window size 7 on
revenueto smooth daily volatility. - Running total. Cumulative Sum on
revenueordered byorder_dateto track year-to-date revenue. - Lead for next-row preview. Shift
delivery_dateforward by 1 to show the next scheduled delivery alongside each order.
Sample Dataset
The examples in this doc use the Edilitics sample orders dataset. Download it to follow along in your own workspace.
edilitics_sample_orders.csv
Sample orders dataset for hands-on practice · 500 rows
Relevant columns for Window Functions examples:
Prop
Type
Supported Functions
11 window functions across 4 groups:
| Group | Function | Label | Source column type | Extra config |
|---|---|---|---|---|
| Ranking | rank | Rank | All | Rank method (optional) |
| Offset | lag | Lag | Numerical | Offset (rows to shift back) |
| Offset | lead | Lead | Numerical | Offset (rows to shift forward) |
| Rolling | rolling_mean | Rolling Mean | Numerical | Window size |
| Rolling | rolling_sum | Rolling Sum | Numerical | Window size |
| Rolling | rolling_min | Rolling Min | Numerical | Window size |
| Rolling | rolling_max | Rolling Max | Numerical | Window size |
| Rolling | rolling_std | Rolling Std Dev | Numerical | Window size |
| Cumulative | cumsum | Cumulative Sum | Numerical | None |
| Cumulative | cummax | Cumulative Max | Numerical | None |
| Cumulative | cummin | Cumulative Min | Numerical | None |
Lag, Lead, Rolling, and Cumulative functions require a numerical source column. Rank accepts any column type.
Rank methods
| Method | Behaviour | Example (tied values at position 2) |
|---|---|---|
| Average | Tied rows share the mean rank | 1, 2.5, 2.5, 4 |
| Min | Tied rows all get the lowest rank | 1, 2, 2, 4 |
| Max | Tied rows all get the highest rank | 1, 3, 3, 4 |
| Dense | No gaps between ranks | 1, 2, 2, 3 |
| Ordinal | Every row gets a unique rank | 1, 2, 3, 4 |
How to Apply Window Functions
Open the Window Functions operation
In your Transform pipeline, click Add Operation and select Window Functions from the operation list.
One blank definition appears (Definition 1).
Select the Function
In the Function dropdown, choose one of the 11 functions. The Source Column dropdown updates to show only compatible column types: numerical-only for Lag, Lead, Rolling, and Cumulative functions; all columns for Rank.
Select the Source Column
Choose the column to compute the window over. Only columns compatible with the chosen function appear.
Configure function-specific parameters
- Rolling functions (Rolling Mean, Rolling Sum, Rolling Min, Rolling Max, Rolling Std Dev): Enter a Window Size. Minimum 1. Example: 7 for a 7-row rolling window.
- Lag / Lead: Enter an Offset. Minimum 1. Lag shifts values back (previous rows). Lead shifts values forward (next rows). Offset 1 means one row back or forward.
- Rank: Optionally select a Rank Method. Default is Average.
Set Partition By (optional)
In the Partition By multi-select, choose one or more categorical columns. The window calculation resets independently for each unique combination of partition values.
Example: partition by region to rank revenue separately within each territory.
Leave empty to apply the function across the full dataset.
Name the output column
Type a name in the Output Column Name field. No auto-generated name - you must enter one.
The name must:
- Contain only letters, numbers, and underscores
- Not start with
__or a number - Not already exist in the dataset
Add more definitions (optional)
Click Add another window definition to add a second definition block. Each definition is independent - different function, source column, partition, and output name. All definitions are applied together when you save.
Click Save & Preview
Click Save & Preview. Edilitics applies all window definitions and adds the new columns. The success toast confirms: "Window functions applied. New columns added."
Verify in the preview
Check the new columns. Rolling functions produce null for the first N-1 rows (insufficient preceding rows for the window). Lag produces null for the first N rows. Lead produces null for the last N rows.
Before and After
Rolling Mean on revenue with window size 3, no partition. Result column: revenue_rolling_3.
Input:
| order_id | region | revenue |
|---|---|---|
| ORD-2024-0001 | North America | 12450.00 |
| ORD-2024-0047 | Europe | 8320.50 |
| ORD-2024-0099 | North America | 83841.00 |
| ORD-2024-0312 | Latin America | 3750.75 |
| ORD-2024-0401 | Europe | 15200.00 |
Output:
| order_id | region | revenue | revenue_rolling_3 |
|---|---|---|---|
| ORD-2024-0001 | North America | 12450.00 | null |
| ORD-2024-0047 | Europe | 8320.50 | null |
| ORD-2024-0099 | North America | 83841.00 | 34870.50 |
| ORD-2024-0312 | Latin America | 3750.75 | 31970.75 |
| ORD-2024-0401 | Europe | 15200.00 | 34264.00 |
Original columns unchanged. One new column added. First 2 rows null (window size 3 requires 3 rows).
Code Equivalent
-- Rank revenue within region (PostgreSQL / BigQuery / Snowflake / DuckDB)
SELECT
*,
RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS revenue_rank
FROM orders;
-- 7-row rolling mean on revenue
SELECT
*,
AVG(revenue) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS revenue_rolling_7
FROM orders;
-- Lag by 1 row
SELECT
*,
LAG(revenue, 1) OVER (ORDER BY order_date) AS revenue_lag_1
FROM orders;import polars as pl
# Rank within partition
df = df.with_columns(
pl.col("revenue")
.rank(method="average", descending=True)
.over("region")
.alias("revenue_rank")
)
# 7-row rolling mean
df = df.with_columns(
pl.col("revenue")
.rolling_mean(window_size=7)
.alias("revenue_rolling_7")
)
# Lag by 1
df = df.with_columns(
pl.col("revenue")
.shift(1)
.alias("revenue_lag_1")
)
# Cumulative sum
df = df.with_columns(
pl.col("revenue")
.cum_sum()
.alias("revenue_cumsum")
)After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.
After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.
Operation Reference
Prop
Type
Frequently Asked Questions
Next Steps
Group By
Collapse rows into group summaries. Use after window functions to aggregate ranked or rolling columns.
Filter
Filter to top-N rows by rank after applying a window rank column.
Conditional Column
Flag rows where a rolling value crosses a threshold or where lag exceeds a limit.
Sort / Order By
Sort the dataset by order_date before applying rolling or cumulative functions for correct ordering.
Need help? Email support@edilitics.com with your workspace, job ID, and context. We reply within one business day.
Last updated on
Pivot / Unpivot
Pivot reshapes long to wide: category values become headers. Unpivot reshapes wide to long: columns stack into key-value rows. Configure aggregations visually.
Flatten
Expand nested JSON, struct, and list columns into flat tabular columns. Configure max depth and whether to keep the original column. No code needed.