Edilitics | Data to Decisions

Flatten

Expand nested JSON, struct, and list columns into flat tabular columns. Configure max depth and whether to keep the original column. No code needed.

Flatten expands nested columns into separate flat columns. Object and Struct columns are unnested into individual fields. List columns are exploded into one row per element. New columns are named using the pattern originalcolumn_fieldname. The source column can be kept or dropped. You can flatten all eligible columns at once or select specific columns.

Only columns with Object, List, or Struct data types appear in the Flatten operation. String, numerical, and datetime columns are not shown. If no eligible columns exist, an info toast appears and the operation closes automatically.


When to Use Flatten

  • Unpack JSON metadata. Expand a metadata JSON column into separate metadata_source, metadata_channel columns for filtering.
  • Explode list fields. Convert an array-valued column into one row per value for counting or joining.
  • Prepare for Group By. Flatten nested struct fields before applying Group By on extracted sub-fields.
  • Normalize API response data. Expand deeply nested API payload columns into queryable flat fields.
  • Fix Group By warning. Group By warns when Object or List columns exist. Flatten them first to avoid data loss in aggregations.

Sample Dataset

The examples in this doc use the Edilitics sample orders dataset. Download it to follow along in your own workspace.

edilitics_sample_orders.csv

Sample orders dataset for hands-on practice · 500 rows

Download

Relevant column for Flatten examples:

Prop

Type


How to Apply Flatten

Open the Flatten operation

In your Transform pipeline, click Add Operation and select Flatten from the operation list.

If no Object, List, or Struct columns exist, an info toast appears: "No columns with objects and lists found!" and the operation closes automatically.

Choose column selection mode

By default, Select All Columns is checked. All eligible nested columns are pre-loaded into the list with default settings.

Uncheck Select All Columns to switch to custom mode. In custom mode, start with one blank row and add more using Add Column.

Review or configure columns

Each row has three settings:

FieldDefaultDescription
Flatten Column(all eligible columns in all-columns mode)The nested column to flatten. In custom mode, select from Object/List/Struct columns.
Max Depth3How many levels of nesting to recurse into. Options: 1, 2, 3, 4, 5.
Drop Original ColumnOn (toggled)When on, the source nested column is removed after flattening. When off, it remains alongside the new flat columns.

Exclude columns (all-columns mode only)

In all-columns mode, click the remove icon next to any column to exclude it from flattening. Click the undo icon to re-include it. At least one column must remain included.

Add more columns (custom mode only)

Click Add Column to add another row. The button is disabled until the current row has a column selected and a valid Max Depth.

Click Save & Preview

Click Save & Preview. Edilitics flattens the selected columns and adds the new expanded columns to the dataset. The success toast confirms: "Flattening applied. Selected nested fields were expanded successfully."

Verify in the preview

Check the new columns. Struct fields appear as originalcolumn_fieldname. List columns are exploded - each element becomes its own row, increasing the row count.


Before and After

Flatten metadata (Object column) with Max Depth 2, Drop Original Column on.

Input:

order_idrevenuemetadata
ORD-2024-000112450.00{"source": "web", "channel": "organic"}
ORD-2024-00478320.50{"source": "mobile", "channel": "paid"}
ORD-2024-009983841.00{"source": "web", "channel": "direct"}

Output:

order_idrevenuemetadata_sourcemetadata_channel
ORD-2024-000112450.00weborganic
ORD-2024-00478320.50mobilepaid
ORD-2024-009983841.00webdirect

Original metadata column dropped. Two new flat columns added. Row count unchanged.


How Flatten Works by Column Type

Column typeWhat happens
StructUnnested using Polars unnest. Each field becomes a new column named originalcolumn_fieldname.
Object / JSON stringParsed as JSON, converted to Struct, then unnested the same way.
ListExploded using Polars explode. Each list element becomes its own row. Row count increases.

Flattening a List column multiplies rows - one row per list element. A dataset with 500 rows where the list column averages 3 elements per row will produce approximately 1,500 rows after exploding. Check row count in the preview.


Max Depth

Max Depth controls how many levels of nesting the operation recurses into. Default is 3.

Max DepthEffect
1Only the top-level fields of the struct are extracted. Nested structs within those fields remain nested.
3 (default)Recurses into 3 levels of nesting. Covers most real-world JSON payloads.
5Maximum. Use for deeply nested API responses or complex Parquet schemas.

New column names at each level append the field name: metadata_user_address_city at depth 3.


Code Equivalent

-- PostgreSQL: expand JSON object column
SELECT
  order_id,
  revenue,
  metadata->>'source' AS metadata_source,
  metadata->>'channel' AS metadata_channel
FROM orders;

-- BigQuery: flatten STRUCT column
SELECT
  order_id,
  revenue,
  metadata.source AS metadata_source,
  metadata.channel AS metadata_channel
FROM orders;
import polars as pl

# Flatten struct column
df = df.with_columns(
    pl.col("metadata").str.json_decode()
).unnest("metadata").rename({
    "source": "metadata_source",
    "channel": "metadata_channel"
})

# Explode list column
df = df.explode("tags")

After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.


After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.


Operation Reference

Prop

Type


Frequently Asked Questions


Next Steps

Need help? Email support@edilitics.com with your workspace, job ID, and context. We reply within one business day.

Last updated on

On this page