Edilitics | Data to Decisions

Code Editor

Write Python scripts directly inside your Transform pipeline. Polars, NumPy, SciPy, math, and re preloaded. Test before saving. No imports required.

Code Editor (Advanced Mode) lets you write Python scripts directly inside your Transform pipeline. The dataframe is available as df. Polars, NumPy, SciPy, math, and re are preloaded - no imports required. Write your transformation below the read-only preamble, test it, then save.

Code Editor operations can be placed anywhere in a pipeline alongside no-code operations. The output of each step becomes the input for the next, regardless of whether the step is a code or no-code operation.


When to Use Code Editor

  • Multi-condition logic. Classify rows using pl.when().then().otherwise() chains that no-code Filter cannot express.
  • Custom column derivation. Create columns from arithmetic, string manipulation, or pattern matching across multiple existing columns.
  • Regex extraction. Extract substrings, domains, or codes from text columns using re directly on string series.
  • Numerical operations. Apply NumPy or SciPy functions to columns for statistical scoring, normalization, or scientific computation.
  • Edge cases beyond no-code. Any transformation that requires logic branching, cross-column calculation, or library functions not available in the no-code operations.

Sample Dataset

The examples in this doc use the Edilitics sample orders dataset. Download it to follow along in your own workspace.

edilitics_sample_orders.csv

Sample orders dataset for hands-on practice · 500 rows


Editor Setup

When the Code Editor opens, a read-only preamble block sits at the top of the editor and cannot be edited:

import re
import math
import scipy
import numpy as np
import polars as pl

# Transform your dataframe using Polars syntax.
# The dataframe object is available as: df
# Example: df = df.with_columns([df["column_name"].cast(pl.Int64)])

# Notes:
# - No imports are required or supported - Polars is preloaded.
# - Only whitelisted packages/functions are allowed.
# - Code runs on a 16,600-row preview sample only.

# Begin your script below:
#dataframe object = df

Write your transformation code below this block. The preamble handles all library loading. Reassign df to apply changes.


Restrictions

The editor blocks a set of Python keywords and patterns on every keystroke. Blocked terms trigger a "Not Allowed" toast and the change is rejected.

Blocked                                              Reason
def, class, return                                   Function and class definitions not permitted
import                                               All imports handled by preamble
os, exec, eval, open, subprocess                     System access not permitted
print, assert, raise, yield                          Output and control flow keywords blocked
to_csv, to_json, to_parquet, and other write methods Data export not permitted from editor
to_pandas                                            Pandas conversion not permitted

Write expression-level code using df = assignments. Do not define functions or use control flow keywords.

The editor runs on a 16,600-row preview sample only. Production pipeline execution uses the full dataset.


How to Use Code Editor

Open the Code Editor operation

In your Transform pipeline, click Add Operation and select Code Editor (Advanced Mode) from the operation list.

Select a language

The language dropdown at the top left shows the available runtime. Select Python - polars (the default and only active option).

Write your script

Click below the read-only preamble block and write your transformation. Use df = to reassign the dataframe with your changes. All preloaded libraries (pl, np, scipy, math, re) are available without importing.

Example - add a revenue band column:

df = df.with_columns(
    pl.when(pl.col("revenue") > 1000)
      .then(pl.lit("High"))
      .when(pl.col("revenue") > 500)
      .then(pl.lit("Medium"))
      .otherwise(pl.lit("Low"))
      .alias("revenue_band")
)

Click Test

Click the Test button (top right of the editor) to run the script against the 16,600-row preview sample.

  • Success: Toast confirms "Script executed successfully. Preview has been refreshed." The output pane closes. Save & Preview becomes enabled.
  • Error: Toast shows "Script error detected. Check the right pane for details." An output pane opens on the right showing the error message. Fix the script and Test again.

Click Save & Preview

Save & Preview is enabled only after a successful Test. Click it to apply the script to the pipeline. The success toast confirms: "Script saved. Preview has been updated with the latest transformation output."


Before and After

Rep performance score with percentile rank

Score each sales rep by combining their revenue share within region and discount efficiency - a multi-step calculation requiring window functions, cross-column arithmetic, and percentile ranking in a single pass. No combination of no-code operations can produce this in one step.

Input (first 3 rows of edilitics_sample_orders):

order_id        sales_rep        region          revenue   discount_pct   units_sold
ORD-2024-0001   Marcus Williams  North America    780.62           0.25            3
ORD-2024-0002   Yuki Tanaka      North America   4703.29           0.25           10
ORD-2024-0003   Tomás Herrera    Europe          7347.74           0.15           27

Script:

df = df.with_columns([
    (pl.col("revenue") / pl.col("revenue").sum().over("region")).alias("revenue_share_in_region"),
    (pl.col("revenue") / pl.col("units_sold")).alias("revenue_per_unit"),
])

df = df.with_columns(
    (
        pl.col("revenue_share_in_region") * 0.6
        + (1 - pl.col("discount_pct")) * 0.4
    ).alias("rep_score_raw")
)

df = df.with_columns(
    (pl.col("rep_score_raw").rank("average") / pl.col("rep_score_raw").len())
    .alias("rep_score_pct_rank")
)

Output (first 3 rows):

order_id        sales_rep        region          revenue   rep_score_raw   rep_score_pct_rank
ORD-2024-0001   Marcus Williams  North America    780.62           0.487                 0.31
ORD-2024-0002   Yuki Tanaka      North America   4703.29           0.612                 0.58
ORD-2024-0003   Tomás Herrera    Europe          7347.74           0.754                 0.82

Output: 500 rows, 17 columns (13 original + 4 derived).


Example Scripts

Z-score normalisation per region

Normalise revenue within each region to identify outliers. Requires per-partition mean and standard deviation - not available in any no-code operation.

df = df.with_columns([
    pl.col("revenue").mean().over("region").alias("region_revenue_mean"),
    pl.col("revenue").std().over("region").alias("region_revenue_std"),
])

df = df.with_columns(
    ((pl.col("revenue") - pl.col("region_revenue_mean")) / pl.col("region_revenue_std"))
    .alias("revenue_zscore")
)

df = df.filter(pl.col("revenue_zscore").abs() < 3)

Extract structured fields from a JSON metadata column

The metadata column contains JSON strings like {"source": "web", "campaign": "q1_inbound"}. Extract both fields into separate columns. Requires row-level string parsing - not possible with String Extract or any no-code operation.

df = df.with_columns([
    pl.col("metadata").map_elements(
        lambda x: re.search(r'"source":\s*"([^"]+)"', x).group(1) if x and re.search(r'"source":\s*"([^"]+)"', x) else None,
        return_dtype=pl.String
    ).alias("lead_source"),
    pl.col("metadata").map_elements(
        lambda x: re.search(r'"campaign":\s*"([^"]+)"', x).group(1) if x and re.search(r'"campaign":\s*"([^"]+)"', x) else None,
        return_dtype=pl.String
    ).alias("campaign"),
])

Days to delivery and SLA breach flag

Compute the number of days between order_date and delivery_date, then flag rows where delivery took longer than 30 days. Requires date casting, cross-column arithmetic, and a derived boolean - not achievable in a single no-code step.

df = df.with_columns([
    pl.col("order_date").cast(pl.Date).alias("order_date_dt"),
    pl.col("delivery_date").cast(pl.Date).alias("delivery_date_dt"),
])

df = df.with_columns(
    (pl.col("delivery_date_dt") - pl.col("order_date_dt"))
    .dt.total_days()
    .alias("days_to_delivery")
)

df = df.with_columns(
    (pl.col("days_to_delivery") > 30).alias("sla_breach")
)

Cumulative revenue milestone flag per sales rep

Compute a running total of revenue per sales rep ordered by date, then flag the row where cumulative revenue first crosses 50,000. Requires an ordered cumulative sum partitioned by rep plus first-crossing detection - beyond what the Window Functions no-code operation supports.

df = df.sort(["sales_rep", "order_date"])

df = df.with_columns(
    pl.col("revenue").cum_sum().over("sales_rep").alias("cumulative_revenue_by_rep")
)

df = df.with_columns(
    (
        (pl.col("cumulative_revenue_by_rep") >= 50000)
        # fill_null(0) so each rep's first row compares against 0 rather than null
        & (
            pl.col("cumulative_revenue_by_rep").shift(1).over("sales_rep").fill_null(0)
            < 50000
        )
    ).alias("crossed_50k_milestone")
)

After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.




Next Steps

Need help? Email support@edilitics.com with your workspace, job ID, and context. We reply within one business day.
