Code Editor
Write Python scripts directly inside your Transform pipeline. Polars, NumPy, SciPy, math, and re preloaded. Test before saving. No imports required.
Code Editor (Advanced Mode) lets you write Python scripts directly inside your Transform pipeline. The dataframe is available as df. Polars, NumPy, SciPy, math, and re are preloaded - no imports required. Write your transformation below the read-only preamble, test it, then save.
Code Editor operations can be placed anywhere in a pipeline alongside no-code operations. The output of each step becomes the input for the next, regardless of whether the step is a code or no-code operation.
When to Use Code Editor
- Multi-condition logic. Classify rows using pl.when().then().otherwise() chains that no-code Filter cannot express.
- Custom column derivation. Create columns from arithmetic, string manipulation, or pattern matching across multiple existing columns.
- Regex extraction. Extract substrings, domains, or codes from text columns using re directly on string series.
- Numerical operations. Apply NumPy or SciPy functions to columns for statistical scoring, normalization, or scientific computation (see the sketch after this list).
- Edge cases beyond no-code. Any transformation that requires logic branching, cross-column calculation, or library functions not available in the no-code operations.
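For example, a minimal sketch of the NumPy case, assuming the sample dataset's revenue column (np and pl are preloaded by the preamble, so no imports are needed):
# Log-scale revenue with NumPy and attach the result as a new column.
df = df.with_columns(
    pl.Series("revenue_log", np.log1p(df["revenue"].to_numpy()))
)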
Sample Dataset
The examples in this doc use the Edilitics sample orders dataset. Download it to follow along in your own workspace.
edilitics_sample_orders.csv
Sample orders dataset for hands-on practice · 500 rows
Editor Setup
When the Code Editor opens, the top of the editor contains a read-only preamble block. This block cannot be edited:
import re
import math
import scipy
import numpy as np
import polars as pl
# Transform your dataframe using Polars syntax.
# The dataframe object is available as: df
# Example: df = df.with_columns([df["column_name"].cast(pl.Int64)])
# Notes:
# - No imports are required or supported - Polars is preloaded.
# - Only whitelisted packages/functions are allowed.
# - Code runs on a 16,600-row preview sample only.
# Begin your script below:
#dataframe object = df
Write your transformation code below this block. The preamble handles all library loading. Reassign df to apply changes.
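For example, a one-line reassignment - a minimal sketch assuming the sample dataset's revenue column:
# Round revenue to two decimals and reassign df so the change takes effect.
df = df.with_columns(pl.col("revenue").round(2))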
Restrictions
The editor blocks a set of Python keywords and patterns on every keystroke. Blocked terms trigger a "Not Allowed" toast and the change is rejected.
| Blocked | Reason |
|---|---|
| def, class, return | Function and class definitions not permitted |
| import | All imports handled by preamble |
| os, exec, eval, open, subprocess | System access not permitted |
| print, assert, raise, yield | Output and control flow keywords blocked |
| to_csv, to_json, to_parquet, and other write methods | Data export not permitted from editor |
| to_pandas | Pandas conversion not permitted |
Write expression-level code using df = assignments. Do not define functions or use control flow keywords.
The editor runs on a 16,600-row preview sample only. Production pipeline execution uses the full dataset.
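Because function definitions and control flow keywords are blocked, logic that would otherwise live in a helper function is written as a chained Polars expression and reassigned to df. A minimal sketch, assuming the sample dataset's revenue and discount_pct columns:
# No helper function needed - derive net revenue as one chained expression.
df = df.with_columns(
    (pl.col("revenue") * (1 - pl.col("discount_pct")))
    .round(2)
    .alias("net_revenue")
)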
How to Use Code Editor
Open the Code Editor operation
In your Transform pipeline, click Add Operation and select Code Editor (Advanced Mode) from the operation list.
Select a language
The language dropdown at the top left shows the available runtime. Select Python - polars (the default and only active option).
Write your script
Click below the read-only preamble block and write your transformation. Use df = to reassign the dataframe with your changes. All preloaded libraries (pl, np, scipy, math, re) are available without importing.
Example - add a revenue band column:
df = df.with_columns(
pl.when(pl.col("revenue") > 1000)
.then(pl.lit("High"))
.when(pl.col("revenue") > 500)
.then(pl.lit("Medium"))
.otherwise(pl.lit("Low"))
.alias("revenue_band")
)
Click Test
Click the Test button (top right of the editor) to run the script against the 16,600-row preview sample.
- Success: Toast confirms "Script executed successfully. Preview has been refreshed." The output pane closes. Save & Preview becomes enabled.
- Error: Toast shows "Script error detected. Check the right pane for details." An output pane opens on the right showing the error message. Fix the script and Test again.
Click Save & Preview
Save & Preview is enabled only after a successful Test. Click it to apply the script to the pipeline. The success toast confirms: "Script saved. Preview has been updated with the latest transformation output."
Before and After
Rep performance score with percentile rank
Score each sales rep by combining their revenue share within region and discount efficiency - a multi-step calculation requiring window functions, cross-column arithmetic, and percentile ranking in a single pass. No combination of no-code operations can produce this in one step.
Input (first 3 rows of edilitics_sample_orders):
| order_id | sales_rep | region | revenue | discount_pct | units_sold |
|---|---|---|---|---|---|
| ORD-2024-0001 | Marcus Williams | North America | 780.62 | 0.25 | 3 |
| ORD-2024-0002 | Yuki Tanaka | North America | 4703.29 | 0.25 | 10 |
| ORD-2024-0003 | Tomás Herrera | Europe | 7347.74 | 0.15 | 27 |
Script:
df = df.with_columns([
(pl.col("revenue") / pl.col("revenue").sum().over("region")).alias("revenue_share_in_region"),
(pl.col("revenue") / pl.col("units_sold")).alias("revenue_per_unit"),
])
df = df.with_columns(
(
pl.col("revenue_share_in_region") * 0.6
+ (1 - pl.col("discount_pct")) * 0.4
).alias("rep_score_raw")
)
df = df.with_columns(
(pl.col("rep_score_raw").rank("average") / pl.col("rep_score_raw").len())
.alias("rep_score_pct_rank")
)
Output (first 3 rows):
| order_id | sales_rep | region | revenue | rep_score_raw | rep_score_pct_rank |
|---|---|---|---|---|---|
| ORD-2024-0001 | Marcus Williams | North America | 780.62 | 0.487 | 0.31 |
| ORD-2024-0002 | Yuki Tanaka | North America | 4703.29 | 0.612 | 0.58 |
| ORD-2024-0003 | Tomás Herrera | Europe | 7347.74 | 0.754 | 0.82 |
Output: 500 rows, 17 columns (13 original + 4 derived).
Example Scripts
Z-score normalisation per region
Normalise revenue within each region, then drop rows that fall more than three standard deviations from the regional mean. Requires per-partition mean and standard deviation - not available in any no-code operation.
df = df.with_columns([
pl.col("revenue").mean().over("region").alias("region_revenue_mean"),
pl.col("revenue").std().over("region").alias("region_revenue_std"),
])
df = df.with_columns(
((pl.col("revenue") - pl.col("region_revenue_mean")) / pl.col("region_revenue_std"))
.alias("revenue_zscore")
)
df = df.filter(pl.col("revenue_zscore").abs() < 3)
Extract structured fields from a JSON metadata column
The metadata column contains JSON strings like {"source": "web", "campaign": "q1_inbound"}. Extract both fields into separate columns. Requires row-level string parsing - not possible with String Extract or any no-code operation.
df = df.with_columns([
pl.col("metadata").map_elements(
lambda x: re.search(r'"source":\s*"([^"]+)"', x).group(1) if x and re.search(r'"source":\s*"([^"]+)"', x) else None,
return_dtype=pl.String
).alias("lead_source"),
pl.col("metadata").map_elements(
lambda x: re.search(r'"campaign":\s*"([^"]+)"', x).group(1) if x and re.search(r'"campaign":\s*"([^"]+)"', x) else None,
return_dtype=pl.String
).alias("campaign"),
])
Days to delivery and SLA breach flag
Compute the number of days between order_date and delivery_date, then flag rows where delivery took longer than 30 days. Requires date casting, cross-column arithmetic, and a derived boolean - not achievable in a single no-code step.
df = df.with_columns([
pl.col("order_date").cast(pl.Date).alias("order_date_dt"),
pl.col("delivery_date").cast(pl.Date).alias("delivery_date_dt"),
])
df = df.with_columns(
(pl.col("delivery_date_dt") - pl.col("order_date_dt"))
.dt.total_days()
.alias("days_to_delivery")
)
df = df.with_columns(
(pl.col("days_to_delivery") > 30).alias("sla_breach")
)
Cumulative revenue milestone flag per sales rep
Compute a running total of revenue per sales rep ordered by date, then flag the row where cumulative revenue crosses 50,000 for the first time. Requires an ordered cumulative sum partitioned by rep plus a lag comparison - milestone detection like this is not available in the no-code Window Functions operation.
df = df.sort(["sales_rep", "order_date"])
df = df.with_columns(
pl.col("revenue").cum_sum().over("sales_rep").alias("cumulative_revenue_by_rep")
)
df = df.with_columns(
(
(pl.col("cumulative_revenue_by_rep") >= 50000)
& (pl.col("cumulative_revenue_by_rep").shift(1).over("sales_rep") < 50000)
).alias("crossed_50k_milestone")
)
After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.
Next Steps
- Filter - Use no-code Filter before or after a code step to narrow the dataset.
- Group By - Aggregate the output of a code step using no-code Group By.
- Drop / Rename Columns - Remove or rename columns added by a code step.
- Cast Data Types - Cast column types produced by a code step to match downstream requirements.
Need help? Email support@edilitics.com with your workspace, job ID, and context. We reply within one business day.