Edilitics | Data to Decisions

Filter

Keep rows that match conditions and remove the rest. Filter numeric columns by range, strings by value, and datetimes by time. Combine with AND logic.

Filter keeps rows that match your conditions and removes all others. Each filter targets one column, applies a condition (equal, not equal, greater than, etc.), and specifies the value to compare against. Add multiple filters to narrow further. Every filter must be satisfied for a row to survive (AND logic).

All column types are supported: numeric, categorical (string), and datetime.


When to Use Filter

  • Isolating a segment. Keep only orders from a specific region, product category, or customer tier for segment-level analysis.
  • Removing test or invalid data. Exclude rows where status = "test" or revenue <= 0 before running aggregations.
  • Date range scoping. Keep only rows within a specific date range before exporting or joining to another dataset.
  • Excluding known bad values. Remove rows where a category column contains a deprecated label you are not yet ready to replace with Find & Replace.
  • Narrowing before Group By. Filter to the relevant subset before aggregating to avoid computing over the full dataset.

Sample Dataset

The examples in this doc use the Edilitics sample orders dataset. Download it to follow along in your own workspace.

edilitics_sample_orders.csv

Sample orders dataset for hands-on practice · 500 rows

Download

Relevant columns for Filter examples:

Prop

Type


Conditions by Column Type

The available conditions change depending on the type of column selected.

String columns support two conditions:

ConditionKeeps rows where
Equal tothe column value matches any of the selected values
Not Equal tothe column value does not match any of the selected values

The Target Value(s) field is a multi-select dropdown pre-populated with the actual values in the column. Select one or more. You can also type a value not in the list to create it.

Example: region Equal to East, West keeps all rows where region is East or West.

Numeric columns support six conditions:

ConditionKeeps rows where
Equal tovalue matches any of the entered values
Not Equal tovalue does not match any of the entered values
Greater Thanvalue is strictly greater than the entered number
Less Thanvalue is strictly less than the entered number
Greater Than Equal tovalue is greater than or equal to the entered number
Less Than Equal tovalue is less than or equal to the entered number

For Equal to and Not Equal to, you can enter multiple values. For all comparison conditions (Greater Than, Less Than, etc.), enter a single number.

Datetime columns show a date picker and/or time picker instead of a text input. The pickers available depend on the column's exact type:

  • Date column: date picker only
  • Time column: time picker only (HH:mm:ss format)
  • Datetime column: both date picker and time picker

All six conditions apply: Equal to, Not Equal to, Greater Than, Less Than, Greater Than Equal to, Less Than Equal to.

Use Greater Than Equal to + Less Than Equal to in two separate filter rows to create a date range (see Multiple Filters section below).


Multiple Filters and AND Logic

When you add more than one filter row, all filters are applied sequentially. A row must pass every filter to appear in the output. This is AND logic.

Example: two filters applied together:

  • revenue Greater Than Equal to 1000
  • region Equal to East

Only rows where revenue is 1000 or above AND region is East are kept. Rows matching only one condition are removed.

To implement OR logic (keep rows matching either condition), use the Code Editor operation with a Polars expression: df.filter((pl.col("region") == "East") | (pl.col("region") == "West")).


How to Apply Filter

Open the Filter operation

In your Transform pipeline, click Add Operation and select Filter from the operation list.

Select a column

Choose the column to filter on from the Filter Column dropdown. Numeric, categorical, and datetime columns are all available.

Choose a condition

Select a condition from the Condition dropdown. The options shown depend on the column type. The Condition dropdown is disabled until a column is selected.

  • Categorical columns: Equal to, Not Equal to
  • Numeric columns: Equal to, Not Equal to, Greater Than, Less Than, Greater Than Equal to, Less Than Equal to
  • Datetime columns: same six as numeric, with date/time pickers for value entry

Enter target value(s)

For categorical and numeric Equal to / Not Equal to: the Target Value(s) field shows a multi-select dropdown pre-populated with actual column values. Select one or more, or type to add a value not in the list.

For comparison conditions (Greater Than, Less Than, etc.): enter a single number.

For datetime columns: use the date picker and/or time picker that appear.

The Target Value(s) field is disabled until a condition is selected.

Add more filters (optional)

Click Add Column to add another filter row for a different column. All filters combine as AND logic. The Add Column button is disabled until the current row has all three fields filled.

Remove a filter row (optional)

Click the remove icon on any row. The icon only appears when more than one row exists.

Save the operation

Click Save & Preview. Edilitics applies all filters in sequence. A success toast confirms: "Filter applied. Preview updated to reflect filtered rows."

Verify in the preview

Check the row count and spot-check values to confirm the expected rows were kept. If too many or too few rows remain, remove the operation and reconfigure.


Before and After

Two filters applied in one operation:

  • revenue Greater Than Equal to 1000
  • status Not Equal to cancelled

Input:

regionrevenuestatus
East500completed
East1200completed
West800cancelled
West3400completed
East150completed

Output (rows passing both conditions):

regionrevenuestatus
East1200completed
West3400completed

3 rows removed: two for revenue below 1000, one for cancelled status.


Code Equivalent

-- AND logic: both conditions must be true
SELECT *
FROM orders
WHERE revenue >= 1000
  AND status != 'cancelled';

-- Date range filter
SELECT *
FROM orders
WHERE order_date >= '2024-01-01'
  AND order_date <= '2024-06-30';
import polars as pl

# AND logic: chain .filter() calls
df = (
    df
    .filter(pl.col("revenue").ge(1000))
    .filter(~pl.col("status").is_in(["cancelled"]))
)

# Equal to multiple values
df = df.filter(pl.col("region").is_in(["East", "West"]))

# Date range
df = df.filter(
    pl.col("order_date").ge(pl.lit("2024-01-01").cast(pl.Date))
).filter(
    pl.col("order_date").le(pl.lit("2024-06-30").cast(pl.Date))
)

After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.


After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.


Operation Reference

Prop

Type


Frequently Asked Questions


Next Steps

Need help? Email support@edilitics.com with your workspace, job ID, and context. We reply within one business day.

Last updated on

On this page