Can I join tables from different databases?

Yes. The Database dropdown shows all connected integrations. Both tables do not need to be in the same database or warehouse.

What happens when both tables have a column with the same name?

The column from the second table is automatically renamed with a _right suffix. The join key column from the second table is dropped from the output.

Can I join on columns with different names?

Yes. Select the Left Key and Right Key independently. They do not need to share the same name, only compatible values.

Can I join more than two tables?

Not in a single operation. Add a second Join operation after the first to join a third table.

Why is Save & Preview disabled?

Save & Preview requires all five fields: Left Key, join type, Database, Table, and Right Key. Check that all dropdowns have a selection.

Joins | Edilitics Docs

Merge two tables using a matching key column. Left, Right, Inner, and Outer joins supported. Join across any connected database. No code needed.

Joins merges the current dataset with a table from any connected integration using a shared key column. Four join types: Left, Right, Inner, and Outer. Select the Left Key from the current dataset, choose a join type, then select the database, table, and Right Key from the second table. Conflicting column names in the joined table get a _right suffix automatically.

The second table can come from a different database or integration than the current pipeline source. No need to land both datasets in a shared warehouse first.

Only one table can be joined per operation. Chain multiple Join operations to join more than two tables.

Cross-Database Joins

Most ETL and transformation tools require both tables to exist in the same warehouse before a join is possible. Edilitics does not.

Join any two tables across any two connected integrations in a single operation - PostgreSQL to BigQuery, MySQL to Snowflake, MongoDB to DuckDB. No staging required, no intermediate landing zone, no SQL to write.

What it enables	Details
Join across integrations	Combine tables from any two connected databases in one step
No shared warehouse required	Data does not need to be co-located before joining
Same workflow, any source	Cross-database joins use the same UI as single-source joins

This is a meaningful capability gap versus warehouse-first tools like dbt, Fivetran, Airbyte, and Matillion - all of which require data to be landed in a common warehouse before any join can be performed.

When to Use Joins

Enrich orders with customer data. Left join orders to customers on customer_id to add name, email, and region to every order row.
Find matching records only. Inner join transactions to accounts to return only transactions with a valid matching account.
Retain all records from a reference table. Right join orders to a product_catalog to keep every product, even those with no orders.
Build a full combined view. Outer join two regional sales tables to see all rows from both, with nulls where no match exists.
Join across databases. Left join a PostgreSQL orders table to a BigQuery metadata table without moving either dataset.

Sample Dataset

The examples in this doc use two sample datasets. Download both to follow along in your own workspace.

edilitics_sample_orders.csv

Primary orders dataset - use as the current pipeline source · 500 rows

Download

edilitics_sample_products.csv

Product catalog - use as the second table to join on product_id · 4 rows

Download

Join Types

Type	Rows returned	Use when
Left Join	All rows from the current dataset. Matched columns from the second table added; nulls where no match.	Enrich the current dataset while keeping all existing rows.
Right Join	All rows from the second table. Matched columns from the current dataset added; nulls where no match.	The second table is the authoritative list; current dataset is supplementary.
Inner Join	Only rows with a matching key in both tables.	Return only confirmed matches between both datasets.
Outer Join	All rows from both tables. Nulls where no match on either side.	Build a complete view combining all rows from both sources.

How to Apply a Join

Open the Join operation

In your Transform pipeline, click Add Operation and select Joins from the operation list.

Select the Left Key

In the left panel, select the key column from the current dataset to join on. Columns with object or list datatypes are excluded from the dropdown.

Select a join type

In the centre panel, click one of the four join type buttons:

Left Join - Keep all rows from the current dataset (default).
Right Join - Keep all rows from the second table.
Inner Join - Keep only rows with a match in both tables.
Outer Join - Keep all rows from both tables.

Left Join is selected by default.

Select the Database

In the right panel, select any connected integration from the Database dropdown. This can be the same integration as the current source or a different one. Tables already used by previous Join or Concat operations in this pipeline are excluded.

Select the Table

In the right panel, select the table to join from the Table dropdown.

Select the Right Key

Select the key column from the second table to join on. This column must match in value with the Left Key selected in the left panel.

Click Save & Preview

Save & Preview is enabled when all four fields are set: Left Key, join type, Database, Table, and Right Key. Click it to run the join. The success toast confirms: "Join applied. Sample preview reflects merged dataset."

Column Conflict Handling

When the second table contains a column with the same name as a column in the current dataset (other than the join key), the column from the second table is renamed automatically with a _right suffix.

The join key column from the second table (Right Key) is dropped from the output. Only the Left Key column is retained.

Scenario	Output
Column `revenue` exists in both tables	Current dataset: `revenue`. Second table: `revenue_right`.
Join key `order_id` selected as Right Key	`order_id` dropped from second table output. Left table `order_id` retained.
No overlapping column names	All columns from both tables included as-is.

Before and After

Left Join

Left join edilitics_sample_orders (500 rows, 13 columns) to edilitics_sample_products (4 rows, 7 columns) on product_id. All 500 order rows retained. Matching product columns added. Orders with a product_id not in the products table return null for product columns.

Left Key: product_id (orders). Right Key: product_id (products).

Input (current dataset, first 3 rows):

order_id	customer_name	product_id	revenue	product_tier
ORD-2024-0001	Topaz Analytics	PROD-001	780.62	Growth
ORD-2024-0002	Bravo Analytics	PROD-002	4703.29	null
ORD-2024-0003	Silverstone Tech	PROD-001	7347.74	Starter

Second table (edilitics_sample_products):

product_id	product_name	product_tier	base_price_usd	support_level	max_users	storage_gb
PROD-001	Edilitics Starter	Starter	299.00	Community	5	10
PROD-002	Edilitics Growth	Growth	799.00	Standard	25	100
PROD-003	Edilitics Enterprise	Enterprise	2499.00	Priority	100	500
PROD-004	Edilitics Enterprise Plus	Enterprise	4999.00	Dedicated	9999	2000

Output (first 3 rows):

order_id	customer_name	product_id	revenue	product_tier	product_name	base_price_usd	support_level	max_users	storage_gb
ORD-2024-0001	Topaz Analytics	PROD-001	780.62	Growth	Edilitics Starter	299.00	Community	5	10
ORD-2024-0002	Bravo Analytics	PROD-002	4703.29	null	Edilitics Growth	799.00	Standard	25	100
ORD-2024-0003	Silverstone Tech	PROD-001	7347.74	Starter	Edilitics Starter	299.00	Community	5	10

Output: 500 rows, 19 columns (13 original + 6 from products; product_id from right table dropped, product_tier from products renamed product_tier_right to avoid conflict).

Inner Join

Same tables, Inner Join on product_id. Only order rows with a matching product_id in edilitics_sample_products are returned. All 500 orders reference PROD-001 through PROD-004 - all match - so output remains 500 rows. Orders referencing a discontinued or unknown product_id would be excluded.

Code Equivalent

-- Left Join
SELECT o.*, p.product_name, p.category, p.list_price
FROM orders o
LEFT JOIN product_catalog p ON o.product_id = p.product_id;

-- Inner Join
SELECT o.*, p.product_name, p.category, p.list_price
FROM orders o
INNER JOIN product_catalog p ON o.product_id = p.product_id;

-- Right Join
SELECT o.order_id, o.revenue, p.*
FROM orders o
RIGHT JOIN product_catalog p ON o.product_id = p.product_id;

-- Outer Join
SELECT o.*, p.product_name, p.category, p.list_price
FROM orders o
FULL OUTER JOIN product_catalog p ON o.product_id = p.product_id;

import polars as pl

# Left Join
df_result = df_orders.join(
    df_catalog,
    left_on="product_id",
    right_on="product_id",
    how="left"
)

# Inner Join
df_result = df_orders.join(
    df_catalog,
    left_on="product_id",
    right_on="product_id",
    how="inner"
)

# Right Join (flip left join for full right-table retention)
df_result = df_catalog.join(
    df_orders,
    left_on="product_id",
    right_on="product_id",
    how="left"
)

# Outer Join
df_result = df_orders.join(
    df_catalog,
    left_on="product_id",
    right_on="product_id",
    how="outer"
)

After Save & Preview, the pipeline shows a DQ delta badge on this step - green if the table score improved, red if it dropped. See Data Quality Scoring for how scores are calculated.

Joins

Cross-Database Joins

When to Use Joins

Sample Dataset

Join Types

How to Apply a Join

Column Conflict Handling

Before and After

Left Join

Inner Join

Code Equivalent

Operation Reference

Frequently Asked Questions

Next Steps

Drop / Rename Columns

Filter

Manage Nulls

Concat

On this page