Flatten
Expand nested JSON, struct, and list columns into flat tabular columns. Configure max depth and whether to keep the original column. No code needed.
Flatten expands nested columns into separate flat columns. Object and Struct columns are unnested into individual fields. List columns are exploded into one row per element. New columns are named using the pattern originalcolumn_fieldname. The source column can be kept or dropped. You can flatten all eligible columns at once or select specific columns.
Only columns with Object, List, or Struct data types appear in the Flatten operation. String, numerical, and datetime columns are not shown. If no eligible columns exist, an info toast appears and the operation closes automatically.
When to Use Flatten
- Unpack JSON metadata. Expand a
metadataJSON column into separatemetadata_source,metadata_channelcolumns for filtering. - Explode list fields. Convert an array-valued column into one row per value for counting or joining.
- Prepare for Group By. Flatten nested struct fields before applying Group By on extracted sub-fields.
- Normalize API response data. Expand deeply nested API payload columns into queryable flat fields.
- Fix Group By warning. Group By warns when Object or List columns exist. Flatten them first to avoid data loss in aggregations.
Sample Dataset
The examples in this doc use the Edilitics sample orders dataset. Download it to follow along in your own workspace.
edilitics_sample_orders.csv
Sample orders dataset for hands-on practice · 500 rows
Relevant column for Flatten examples:
Prop
Type
How to Apply Flatten
Open the Flatten operation
In your Transform pipeline, click Add Operation and select Flatten from the operation list.
If no Object, List, or Struct columns exist, an info toast appears: "No columns with objects and lists found!" and the operation closes automatically.
Choose column selection mode
By default, Select All Columns is checked. All eligible nested columns are pre-loaded into the list with default settings.
Uncheck Select All Columns to switch to custom mode. In custom mode, start with one blank row and add more using Add Column.
Review or configure columns
Each row has three settings:
| Field | Default | Description |
|---|---|---|
| Flatten Column | (all eligible columns in all-columns mode) | The nested column to flatten. In custom mode, select from Object/List/Struct columns. |
| Max Depth | 3 | How many levels of nesting to recurse into. Options: 1, 2, 3, 4, 5. |
| Drop Original Column | On (toggled) | When on, the source nested column is removed after flattening. When off, it remains alongside the new flat columns. |
Exclude columns (all-columns mode only)
In all-columns mode, click the remove icon next to any column to exclude it from flattening. Click the undo icon to re-include it. At least one column must remain included.
Add more columns (custom mode only)
Click Add Column to add another row. The button is disabled until the current row has a column selected and a valid Max Depth.
Click Save & Preview
Click Save & Preview. Edilitics flattens the selected columns and adds the new expanded columns to the dataset. The success toast confirms: "Flattening applied. Selected nested fields were expanded successfully."
Verify in the preview
Check the new columns. Struct fields appear as originalcolumn_fieldname. List columns are exploded - each element becomes its own row, increasing the row count.
Before and After
Flatten metadata (Object column) with Max Depth 2, Drop Original Column on.
Input:
| order_id | revenue | metadata |
|---|---|---|
| ORD-2024-0001 | 12450.00 | {"source": "web", "channel": "organic"} |
| ORD-2024-0047 | 8320.50 | {"source": "mobile", "channel": "paid"} |
| ORD-2024-0099 | 83841.00 | {"source": "web", "channel": "direct"} |
Output:
| order_id | revenue | metadata_source | metadata_channel |
|---|---|---|---|
| ORD-2024-0001 | 12450.00 | web | organic |
| ORD-2024-0047 | 8320.50 | mobile | paid |
| ORD-2024-0099 | 83841.00 | web | direct |
Original metadata column dropped. Two new flat columns added. Row count unchanged.
How Flatten Works by Column Type
| Column type | What happens |
|---|---|
| Struct | Unnested using Polars unnest. Each field becomes a new column named originalcolumn_fieldname. |
| Object / JSON string | Parsed as JSON, converted to Struct, then unnested the same way. |
| List | Exploded using Polars explode. Each list element becomes its own row. Row count increases. |
Flattening a List column multiplies rows - one row per list element. A dataset with 500 rows where the list column averages 3 elements per row will produce approximately 1,500 rows after exploding. Check row count in the preview.
Max Depth
Max Depth controls how many levels of nesting the operation recurses into. Default is 3.
| Max Depth | Effect |
|---|---|
| 1 | Only the top-level fields of the struct are extracted. Nested structs within those fields remain nested. |
| 3 (default) | Recurses into 3 levels of nesting. Covers most real-world JSON payloads. |
| 5 | Maximum. Use for deeply nested API responses or complex Parquet schemas. |
New column names at each level append the field name: metadata_user_address_city at depth 3.
Code Equivalent
-- PostgreSQL: expand JSON object column
SELECT
order_id,
revenue,
metadata->>'source' AS metadata_source,
metadata->>'channel' AS metadata_channel
FROM orders;
-- BigQuery: flatten STRUCT column
SELECT
order_id,
revenue,
metadata.source AS metadata_source,
metadata.channel AS metadata_channel
FROM orders;import polars as pl
# Flatten struct column
df = df.with_columns(
pl.col("metadata").str.json_decode()
).unnest("metadata").rename({
"source": "metadata_source",
"channel": "metadata_channel"
})
# Explode list column
df = df.explode("tags")After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.
After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.
Operation Reference
Prop
Type
Frequently Asked Questions
Next Steps
Group By
Group by the new flat columns after expanding nested fields.
Filter
Filter rows by values in the newly expanded flat columns.
Drop / Rename Columns
Remove or rename flattened columns to clean up the schema after expanding.
Join
Join the flattened dataset with another table using the expanded key fields.
Need help? Email support@edilitics.com with your workspace, job ID, and context. We reply within one business day.
Last updated on
Window Functions
Apply rank, lag, lead, rolling and cumulative aggregations as new columns. Partition by any categorical column without writing code.
Code Editor
Write Python scripts directly inside your Transform pipeline. Polars, NumPy, SciPy, math, and re preloaded. Test before saving. No imports required.