The foundation your AI answers depend on.

Your AI is only as good as the data it reasons on.
60% of AI projects fail to get that right.

Integrate connects 22 live data sources in-place, automatically scores data quality on every column, and builds a governed semantic layer your team validates and locks permanently. Organizations that prioritize semantic modeling increase AI accuracy by 80% and cut the cost of maintaining those tools by 60%. Build that foundation today.

60%

of AI projects abandoned due to bad data foundations

60% of AI projects are abandoned due to lack of AI-ready data. The root cause is almost always the same: fragmented sources, unvalidated metrics, and no single governed definition of truth.

That missing definition of truth has a cost: delayed decisions, eroded trust between your CFO and CMO, and a data team that exists to reconcile instead of analyze. Integrate exists to end it, permanently.

Data Quality Engine

Your data, graded before the AI ever sees it.

The moment you connect a source, Integrate profiles every column in every table - automatically. No setup, no configuration. A statistically rigorous DQ score is computed before a single dashboard is built or question is asked.

DQ Scoring Formula

Every column receives a 0–100 score built from three dimensions. ID and timestamp columns are weighted 3× heavier - they're structurally critical.

Completeness: 50% (inverse null rate)
Uniqueness: 25% (cardinality ratio)
Compliance: 25% (inverse noncompliant rate)
Formula
DQ = (0.50 × completeness) + (0.25 × uniqueness) + (0.25 × compliance)
Sampled at 16,600 rows · 99% confidence · <1% margin of error
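The published formula, including the 3× weighting for ID and timestamp columns at table rollup, can be sketched in a few lines. Function names and input shapes here are illustrative, not Edilitics' internal API.

```python
def column_dq(null_rate, distinct_count, row_count, noncompliant_rate):
    """Column DQ on a 0-100 scale: 50% completeness (inverse null rate),
    25% uniqueness (cardinality ratio), 25% compliance (inverse noncompliant rate)."""
    completeness = 1.0 - null_rate
    uniqueness = (distinct_count / row_count) if row_count else 0.0
    compliance = 1.0 - noncompliant_rate
    return 100.0 * (0.50 * completeness + 0.25 * uniqueness + 0.25 * compliance)

def table_dq(column_scores):
    """Roll column scores up to a table score; ID and timestamp columns
    count 3x, reflecting their structural criticality."""
    total = weight_sum = 0.0
    for score, is_key in column_scores:  # is_key: ID or timestamp column
        w = 3.0 if is_key else 1.0
        total += w * score
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0
```

For example, a column with a 20% null rate, 50% cardinality, and full compliance scores 100 × (0.4 + 0.125 + 0.25) = 77.5.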

A–F Grade Scale

Table scores roll up from column scores. Grades D and F surface an advisory warning before an AskEdi session - so poor data quality is never silently passed to the AI.

A (Pristine): 90–100%
B (Strong): 75–89%
C (Average): 60–74%
D (Poor): 45–59% · Advisory
F (Failing): 0–44% · Advisory

Grades D and F both show an advisory warning before an AskEdi session - cautioning about data reliability. You can still proceed. No grade hard-blocks session creation.
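The band-to-grade mapping and the advisory rule reduce to a small lookup; this is a sketch of the documented behavior, not the product's actual code.

```python
ADVISORY = {"D", "F"}

def grade(score):
    """Map a 0-100 table score to the A-F scale. D and F carry an advisory
    flag shown before an AskEdi session; no grade hard-blocks the session."""
    for cutoff, letter in ((90, "A"), (75, "B"), (60, "C"), (45, "D")):
        if score >= cutoff:
            return letter, letter in ADVISORY
    return "F", True
```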

What Gets Profiled Per Column

Seven statistical metrics are computed for every column across every table - automatically, the moment connection succeeds.

Null Count: empty / null values
Distinct Count: unique value cardinality
Noncompliant: datatype violations
Min Value: numeric & date columns
Max Value: numeric & date columns
Category: DateTime · Scalar · String · List
Top Values: frequency distribution
ZERO RAW ROWS STORED — SAMPLE ONLY
16,600-ROW CAP · REPRODUCIBLE SAMPLING
AUTOMATED PROFILING ON CONNECTION
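The seven metrics above can all be computed in one pass over the sampled values. The sketch below is illustrative: the type check, the hardcoded category, and the function signature are assumptions, not Edilitics' internals.

```python
from collections import Counter

def profile_column(values, expected_type=float):
    """Compute the seven per-column metrics on a reproducible sample.
    None marks a null; any non-None value of the wrong type is noncompliant."""
    non_null = [v for v in values if v is not None]
    typed = [v for v in non_null if isinstance(v, expected_type)]
    return {
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "noncompliant": len(non_null) - len(typed),   # datatype violations
        "min_value": min(typed) if typed else None,    # numeric & date columns
        "max_value": max(typed) if typed else None,
        "category": "Scalar",                          # DateTime / Scalar / String / List
        "top_values": Counter(non_null).most_common(3),
    }
```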

AI Readiness Score

Clean data isn't enough. The AI needs context too.

DQ tells you how healthy your data is. AIR tells you how ready it is for AI. A table can be statistically pristine and still produce unreliable answers if the AI has no idea what its columns mean.

A semantic layer is only valuable if the data inside it is trustworthy. AIR scores tell you exactly that - before AskEdi answers a single question. The foundation is ready when AIR says it is.

Structural Health: DQ Score × 0.50 (null rate · cardinality · compliance)
+
Semantic Health: Validation × 0.50 (human-validated descriptions)
=
AIR Score: 0–100, graded A–F. AskEdi weights its reasoning accordingly.
0.0 pts · Undocumented
Column has no description - the AI has no business context to reason against.
AIR contribution: 0%

0.2 pts · AI-Generated
LLM produced a description from schema + DQ stats. Useful baseline, not yet trusted.
AIR contribution: 10%

1.0 pts · Human-Validated
Your team reviewed, edited, and locked the description. Permanently protected from AI overrides.
AIR contribution: 50%
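Putting the two halves together, the AIR calculation is a straight 50/50 blend of the DQ score and the validation tier. A minimal sketch, with tier names and the function signature as illustrative assumptions:

```python
VALIDATION_POINTS = {"undocumented": 0.0, "ai_generated": 0.2, "human_validated": 1.0}

def air_score(dq_score, validation_state):
    """AIR = 50% structural health (the DQ score, 0-100) + 50% semantic
    health (the validation tier, scaled to 0-100)."""
    return 0.50 * dq_score + 0.50 * 100.0 * VALIDATION_POINTS[validation_state]
```

A statistically pristine column (DQ 100) with no description still only reaches AIR 50: clean data without context is half the story.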

Sector-Aware Metadata Generation

When the AI generates column descriptions, it knows your industry. A column named amt becomes Transaction Amount in Finance - and Dosage Amount in Healthcare. Business vocabulary is baked in from day one.

Finance
Healthcare
Retail
SaaS
Logistics
+ more
Focus Sector: not sent in Private Mode

Semantic Layer

What AskEdi reasons on - not your raw data.

Integrate builds a natural-language documentation layer over your schema. Your team validates it. AskEdi consumes the locked result - so every answer is traceable to business logic your team approved, not an LLM's best guess.

At every step of this process - AI generation, human review, and permanent locking - zero raw rows from your database ever reach the LLM. The AI reasons on schema structure and validated descriptions only. That is the architectural guarantee that makes AskEdi's answers trustworthy.

AI-Generated · Automatic

On connection, an async job calls your selected LLM with schema metadata and DQ statistics - never raw rows. Every column gets a natural-language description in seconds.

Human Review · Your Team

Open the Metadata Viewer to read AI-generated descriptions alongside column statistics. Edit any description inline. Mark it Human Validated when it's right.

Locked Forever · Protected

Human-validated descriptions are permanently marked ai_generated=False. Every future DQ refresh skips them - your institutional knowledge survives schema drift.
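The lock rule described above is simple to express: a refresh may regenerate AI-owned descriptions but must skip anything marked `ai_generated=False`. A sketch under that assumption (the dict shape and `generate` callback are illustrative):

```python
def refresh_descriptions(columns, generate):
    """Apply the lock rule on a DQ refresh: re-evaluate AI-generated
    descriptions, never touch human-validated ones."""
    for col in columns:
        if col["ai_generated"]:                # AI-owned: safe to regenerate
            col["description"] = generate(col["name"])
        # ai_generated=False: human-validated, permanently locked
    return columns
```

This is why institutional knowledge survives schema drift: the refresh path has no branch that can overwrite a locked description.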

Metadata Viewer - public.transactions · AI Insights Active

tx_id · UUID · DQ: A · AIR: A
Unique identifier for each transaction record.
Validated

tx_amount · numeric(10,2) · DQ: A · AIR: A
Settled transaction amount in USD.
Validated

status · varchar · DQ: B · AIR: B
Current transaction lifecycle state (SETTLED, PENDING, FAILED).
AI

created_at · timestamptz · DQ: A · AIR: D
No description - AIR score penalized
HUMAN-VALIDATED = PERMANENTLY LOCKED, NEVER OVERWRITTEN
AI-GENERATED = RE-EVALUATED ON EACH DQ REFRESH
ZERO RAW ROWS EVER REACH THE LLM

Adaptive AI Compliance Controls

AI Context Without the Privacy Compromise.

Three strict privacy postures determine exactly what context is transmitted to the AI during metadata generation. In all three modes, zero raw customer records are ever transmitted.

Max Compliance

Private Mode

Column names are fully anonymized before reaching the AI. The LLM receives only anonymous aliases, actual table names, SQL data types, and DQ statistics.

Column names anonymized (col_1, col_2...).
Backend reverse-maps aliases post-inference.
Ensures zero schema leakage to external AI.
Recommended

Balanced Mode

The enterprise standard. AI receives actual table and column names, data types, DQ statistics, and your Focus Sector for maximum context without row data.

Real schema names shared for accuracy.
Focus Sector primes domain vocabulary use.
Ensures zero live row data is transmitted.
Max Intelligence

Full Context Mode

Maximum reasoning power. Adds top-frequency sample values per column on top of Balanced Mode, enabling the AI to understand your unique internal taxonomy.

Balanced context + top categorical values.
Enables taxonomy-aware AI descriptions.
Ideal for heavily categorical data analysis.
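Private Mode's alias-and-reverse-map step is the key mechanism across these postures. A minimal sketch, with function and variable names as illustrative assumptions:

```python
def anonymize_columns(column_names):
    """Private Mode sketch: real column names become anonymous aliases
    (col_1, col_2, ...) before the LLM call; the backend keeps a reverse
    map to restore the real names after inference."""
    forward = {name: f"col_{i + 1}" for i, name in enumerate(column_names)}
    reverse = {alias: name for name, alias in forward.items()}
    return forward, reverse
```

Because only the forward aliases ever leave the backend, the external AI sees no schema names at all, yet the generated descriptions land on the right columns after reverse-mapping.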
MULTI-SALT AES ENCRYPTION
BYOK: OPENAI, ANTHROPIC, DEEPMIND
ZERO RAW RECORD TRANSMISSION

Observability & Lineage

Always in sync. Always traceable.

Integrate watches your data sources continuously. Schema drift is detected automatically. Every downstream dependency is visible before you make a change - so nothing breaks silently in production.

Automated Schema Drift Detection

A daily background cron targets integrations stale for 7+ days whose underlying data has grown by ≥5%. Schema is re-hashed, new or deleted columns are detected, and all DQ and AIR scores are recalculated - without overwriting any human-validated descriptions.

[cron 02:00 UTC] Scanning stale integrations...
INFO integration prod_postgres - stale 8d · row growth +12.3% → refresh triggered
INFO schema re-hash: 2 new columns detected in public.orders
INFO resampling 16,600 rows (reproducible) · recalculating DQ + AIR...
Human-validated descriptions preserved (4 columns skipped)
Manual refresh rate-limited: once per 24h
Auto-triggered on any credential edit
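The trigger conditions above compose into one predicate: the daily cron fires on staleness plus growth, while manual refreshes are rate-limited. A sketch with illustrative parameter names:

```python
def should_refresh(days_stale, row_growth_pct, manual=False, hours_since_manual=None):
    """Cron rule: refresh when an integration is stale 7+ days AND its
    underlying data grew >=5%. Manual refreshes: at most once per 24h."""
    if manual:
        return hours_since_manual is None or hours_since_manual >= 24
    return days_stale >= 7 and row_growth_pct >= 5.0
```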

Live Status Pulse

Real-time badges show exactly what the backend is doing. Editing is disabled while any badge is active.

Profiling: schema scan in progress
Generating AI Insights: LLM generating descriptions
Refreshing: DQ & AIR recalculating
Active: live pipeline running

End-to-End Dependency Lineage

Every integration card shows a Powers row - icons and counts for every downstream asset consuming that source. Before you edit or delete anything, you know exactly what breaks. If an active Transform flow run is in progress, deletion is hard-blocked. Otherwise, you see the full impact and choose what to do.

Force-delete soft-removes the integration and all dependent assets simultaneously - nothing is silently orphaned.

prod_postgres · Active
Transform Pipelines: 5
Visualize Dashboards: 8
AskEdi Sessions: 12
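The deletion rules above amount to one guard: hard-block while a Transform flow run is active, otherwise report the full blast radius of a force-delete. A sketch under those assumptions (the data shapes are illustrative):

```python
def delete_check(active_transform_run, dependents):
    """Lineage-aware deletion guard: block during an active Transform flow
    run; otherwise surface the downstream impact so nothing is silently
    orphaned by a force-delete."""
    if active_transform_run:
        return False, "blocked: Transform flow run in progress"
    total = sum(dependents.values())
    return True, f"force-delete soft-removes {total} dependent assets"
```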

Your connectors are live. Your data is scored. Your semantic layer is validated and locked. The governed foundation that makes every AI answer trustworthy is ready to build today.

Enterprise Credential Security

Zero Friction. Zero Data Storage.

Designed for regulated industries constrained by strict InfoSec policies. Edilitics secures your credentials while executing computation directly against your infrastructure.

Zero Source Storage

Edilitics never stores customer data rows. Only structural configuration and aggregated semantic statistics are persisted. Your physical records remain entirely within your own native infrastructure for maximum data security.

Multi-Salt AES Encryption

Connection credentials are encrypted at rest using domain-level and user-level salt isolation. Even in the event of a breach, a single compromised instance cannot decrypt payloads across independent workspaces.
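To make the isolation property concrete, here is an illustrative multi-salt key derivation. This is not Edilitics' actual scheme; it only shows why mixing a domain-level and a user-level salt into the KDF means a key recovered from one workspace is useless against any other.

```python
import hashlib

def derive_workspace_key(master_secret, domain_salt, user_salt):
    """Hypothetical multi-salt derivation: the AES key for a workspace is
    bound to both its domain-level and user-level salts, so no two
    workspaces ever share a decryption key."""
    return hashlib.pbkdf2_hmac(
        "sha256", master_secret, domain_salt + b"|" + user_salt, 200_000
    )
```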

Non-Destructive Target Validation

When linking a new source, the backend fires asynchronous validation requests that verify connection health, proxy configuration, and SSL enforcement - without ever executing unpredictable or destructive queries against the target database.

Isolated File Sandboxing

Flat files uploaded into the module are immediately vaulted in isolated Cloud Storage buckets and virus-scanned in the background before any metric processing or data profiling begins.

Every security constraint here enforces one guarantee: zero raw rows ever reach the AI. Architecture, not policy.

Collaboration & Governance

Share access. Never share credentials.

Integrations can be shared with any verified workspace member - they get full analytical access without ever seeing the underlying database credentials. You stay in control of who can modify, share, or delete.

Capability · Owner · Shared
View integration & DQ / AIR scores
Browse schema & table explorer
View AI Insights & metadata
Use as pipeline or AskEdi source
Generate / Refresh AI Insights
Refresh DQ & AIR Scores
Edit connection credentials
Share with additional users
Delete integration

Domain Guard

Sharing is restricted to verified organizational domains. 22 personal and generic email providers are blocked - including Gmail, Yahoo, Outlook, Hotmail, and more. No shared integration can ever be exposed to a non-org account.
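The Domain Guard check is a straight denylist lookup on the email's domain. A sketch (only four of the 22 blocked providers are listed here, for illustration):

```python
BLOCKED_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}  # 22 in the product

def can_share_with(email):
    """Domain Guard sketch: refuse sharing to personal/generic email
    providers so no integration is exposed to a non-org account."""
    return email.rsplit("@", 1)[-1].lower() not in BLOCKED_DOMAINS
```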

Credentials Stay Hidden

Shared users get reuse access at the API level. Credential fields are encrypted and never returned in any shared-user response - not hostname, not password, not connection URL. Access grants capability, not visibility.

Workspace Superadmins

Superadmins can manage all integrations across the organization regardless of creator - view, edit, share, and delete. Every operation is logged with user, timestamp, and action for full audit traceability.

Full audit log + CSV export
Org-wide visibility
COMMON QUESTIONS

Everything you need to know before you decide.

No sales call needed. If you have a question we haven't answered here, reach out directly.

THE NEXT LEVEL

Build your governed data foundation today.

22 live connectors. Automatic Data Quality scoring. AI-generated semantic layer validated by your team. The trust engine that makes every AI answer certain.