Edilitics | Data to Decisions

AI Column Insights

How Edilitics generates and validates column descriptions so AskEdi and Auto Generate Charts understand what your data means.

AskEdi and Auto Generate Charts rely on knowing what your columns mean before they can give you a reliable answer. A column named revenue might be gross or net. A column named status might have six possible values with specific business meanings. Without that context, the AI makes its best guess. AI Column Insights replaces guessing with verified, business-specific descriptions for every column in your integration.


How It Works

AI Column Insights works in two stages. Both matter, and only the second is optional.

Stage 1: Generation. When you enable AI Column Insights, Edilitics sends structural information about your schema to a large language model: column names, data types, DQ statistics, and optionally your organisation's industry sector. The model writes one description per column, grounded in everything it can observe about the data. This runs automatically in the background and requires no input from you beyond the initial setup.

Stage 2: Validation. The generated descriptions are a starting point. The AI does not know that your revenue column is net of refunds. It does not know that status has a value called on_hold that only appears for B2B orders. Your team does. Validation is the step where a person reads each description, corrects anything wrong, and marks it as confirmed. A validated description is locked permanently against future automated changes and contributes fully to the AIR Score.

Generation is automatic. Validation is the work that matters. An integration with only AI-generated descriptions gives AskEdi a reasonable starting point. An integration where your team has validated the key columns gives AskEdi verified, business-specific context for every question it answers.


Privacy Modes

Before generation runs, you choose a privacy mode. This controls what information about your schema is shared with the LLM. In all three modes, zero raw data rows are ever transmitted.

Private. Sends table names, column names, data types, and DQ statistics to the LLM. Does not send your organisation's Focus Sector or any value-level data.

Use Private when your column names themselves contain sensitive identifiers and you want to minimise what leaves your environment. Descriptions will be structurally accurate but will not reflect your industry context.

Balanced. Sends everything in Private, plus your organisation's configured Focus Sector. Does not send value-level data.

Balanced is the right choice for most integrations. A column named amt in a Finance workspace will be described as a transaction or revenue measure. The same column in a Healthcare workspace will be described as a dosage or billing amount. The descriptions are industry-aware without any value-level data leaving your environment.

Full Context. Sends everything in Balanced, plus the most frequently occurring values per column drawn from the DQ profiling sample.

Use Full Context for heavily categorical datasets where the actual values matter for accurate descriptions. If a column named status has frequent values like pending_review, escalated, and resolved, Full Context lets the model use that vocabulary in the description. Frequent values are statistical summaries from the profiling sample, not individual records from your source.
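The three modes are cumulative, which a short sketch can make concrete. The Python below is illustrative only and is not Edilitics code: `PrivacyMode`, `ColumnProfile`, `build_payload`, and every field name are hypothetical, chosen to mirror the descriptions above. It shows which pieces of schema metadata each mode would include.

```python
from dataclasses import dataclass, field
from enum import Enum


class PrivacyMode(Enum):
    # Hypothetical names mirroring the three modes described above.
    PRIVATE = "private"
    BALANCED = "balanced"
    FULL_CONTEXT = "full_context"


@dataclass
class ColumnProfile:
    # Hypothetical container for what profiling knows about one column.
    table: str
    name: str
    data_type: str
    dq_stats: dict
    frequent_values: list = field(default_factory=list)


def build_payload(columns, mode, focus_sector=None):
    """Assemble the metadata a given mode would share. No raw rows in any mode."""
    # Private baseline: table names, column names, data types, DQ statistics.
    payload = {
        "columns": [
            {
                "table": c.table,
                "name": c.name,
                "data_type": c.data_type,
                "dq_stats": dict(c.dq_stats),
            }
            for c in columns
        ]
    }
    # Balanced adds the organisation's Focus Sector.
    if mode in (PrivacyMode.BALANCED, PrivacyMode.FULL_CONTEXT):
        payload["focus_sector"] = focus_sector
    # Full Context adds the most frequent values per column.
    if mode is PrivacyMode.FULL_CONTEXT:
        for entry, c in zip(payload["columns"], columns):
            entry["frequent_values"] = list(c.frequent_values)
    return payload
```

Calling `build_payload` with the same columns under each mode returns a strictly larger payload each time, which is the property to keep in mind when choosing a mode.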


Enabling AI Column Insights

At Connection Time

The connection form includes an AI Driven Column Insights toggle. Enabling it opens a two-step flow:

1. A Disclaimer appears confirming that descriptions are generated using schema metadata only, not raw data. Check "Do not show again" to skip this screen on future connections, then confirm to continue.

2. The Privacy and Context screen appears. Select your LLM provider and privacy mode, then click Verify and Proceed.

Once the connection is saved, generation runs in the background. An amber Generating AI Insights pulse badge appears on the integration card while the job runs. All card actions are disabled during this time. When generation completes, the badge clears, the card shows the initial AIR Score, and you receive a notification.

Enabling AI Column Insights is a one-way decision. Once generation has run for an integration, the feature cannot be disabled. You can update individual descriptions at any time, but you cannot remove AI Column Insights from a connected integration.

After Connection

If you skipped AI Column Insights when you first connected, you can enable it at any point:

1. Hover over the integration card to reveal the action menu.
2. Click Generate AI Insights.
3. Select your LLM provider and privacy mode, then click Verify and Proceed.

Generation runs in the background. The amber pulse badge appears on the card while it is running. You will receive a notification when it is complete.


What the AI Generates

The AI writes one description per column, between 200 and 300 characters, grounded in the column name, data type, DQ statistics, and Focus Sector if your selected mode includes it. It does not paraphrase the column name. It uses the statistics to infer cardinality, completeness, and value patterns, and writes a description that explains the column's business role.

| Column | AI-Generated Description |
| --- | --- |
| order_id | Unique record identifier, always populated, with high cardinality indicating one entry per order. Functions as the primary reference key across order-related reporting. |
| revenue | Gross order revenue in the transaction currency, spanning a wide numeric range. Used as the primary financial measure in sales and performance reporting. |
| status | Order fulfilment status with a small number of distinct categorical values observed. Drives filtering and segmentation in operational and delivery reporting. |
| created_at | UTC timestamp recording when the order was placed, always populated. Used as the primary date dimension for trend and cohort analysis. |

These are strong starting points, not finished definitions. The AI can see that revenue is a wide numeric measure. It cannot see that your revenue is net of refunds and excludes VAT. It can see that status is categorical with a small distinct count. It cannot see that on_hold is a value used exclusively for B2B orders. That knowledge belongs to your team. Validation is where you put it into the system.


Viewing and Validating Descriptions

Once insights are generated, open the Metadata Viewer via View AI Insights in the integration hover menu. The viewer shows every column across your tables, with its AI-generated description, validation status, and DQ score.

Two actions are available on every description:

  • Approve: you have read the description and it is accurate as written. The column is immediately marked as validated. Approval is permanent and per-column.
  • Edit: opens the Column Metadata modal where you can read the column's full DQ breakdown and rewrite the description. Your edits across multiple columns in a table accumulate locally and are submitted together when you click Save. Saving any edit also marks the column as validated.
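The behaviour of the two actions can be sketched as a small state model. This is an illustration under stated assumptions, not the product's implementation: `ColumnDescription`, `MetadataSession`, and all method names are hypothetical. It models approval marking a column validated immediately, edits accumulating locally until Save, and validated descriptions being locked against automated changes.

```python
from dataclasses import dataclass


@dataclass
class ColumnDescription:
    # Hypothetical record for one column's description.
    column: str
    text: str
    validated: bool = False


class MetadataSession:
    """Hypothetical model of the viewer's Approve / Edit / Save behaviour."""

    def __init__(self, descriptions):
        self.descriptions = {d.column: d for d in descriptions}
        self.pending_edits = {}  # edits held locally until save()

    def approve(self, column):
        # Approve: the description is accepted as written, validated at once.
        self.descriptions[column].validated = True

    def edit(self, column, new_text):
        # Edit: accumulate locally; nothing is validated until save().
        self.pending_edits[column] = new_text

    def save(self):
        # Save: submit all pending edits together; each edited column
        # becomes validated.
        for column, text in self.pending_edits.items():
            description = self.descriptions[column]
            description.text = text
            description.validated = True
        self.pending_edits.clear()

    def apply_automated_update(self, column, generated_text):
        # Validated descriptions are locked against automated changes.
        description = self.descriptions[column]
        if description.validated:
            return False
        description.text = generated_text
        return True
```

In this model, an edited description keeps its AI-generated text and unvalidated status until `save()` runs, after which `apply_automated_update` leaves it untouched.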

For the full detail on the viewer layout, the Edit modal, character constraints, and how validation affects the AIR Score, see AI Readiness (AIR) Score.


Where to Focus First

Not every column has the same impact on AI output quality. Start with the columns AskEdi uses most.

Date and timestamp columns (created_at, order_date, anything ending in _at or _date). AskEdi uses these for every time-based question. A description that confirms the timezone, the event being recorded, and how the column is used in reporting directly improves trend and cohort analysis.

Your primary measures (revenue, quantity, cost). AskEdi aggregates these in almost every answer. If a measure's description is wrong about its unit or scope, every answer involving that measure is wrong.

Categorical columns with specific values (status, region, type). The AI's generated description will note that the column is categorical but may not list all the values or explain what they mean. A validated description that maps the values to their business meaning prevents misinterpretation in every filter and grouping.

Validating ten columns that appear in your most common AskEdi queries delivers more improvement than validating fifty rarely used ones. Start with what you actually ask questions about.
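If you have a long column list, a rough first pass at a validation queue can be scripted. The sketch below is a hypothetical helper, not part of Edilitics: it classifies columns by name only, using the example names and suffixes mentioned above, and moves matching columns to the front. The measure and categorical name sets are assumptions you would replace with your own, and name matching is no substitute for knowing which columns your team actually queries.

```python
import re

# Assumed patterns, taken from the examples in this section.
DATE_SUFFIX = re.compile(r"(_at|_date)$")
PRIMARY_MEASURES = {"revenue", "quantity", "cost"}
CATEGORICAL_NAMES = {"status", "region", "type"}


def classify(column_name):
    """Return a high-impact group label for a column name, or None."""
    name = column_name.lower()
    if DATE_SUFFIX.search(name):
        return "date"
    if name in PRIMARY_MEASURES:
        return "measure"
    if name in CATEGORICAL_NAMES:
        return "categorical"
    return None


def order_for_validation(column_names):
    """Put high-impact columns first, preserving original order within each group."""
    # sorted() is stable, so ties keep their input order.
    return sorted(column_names, key=lambda name: classify(name) is None)
```

For example, `order_for_validation(["notes", "status", "created_at", "revenue"])` moves `notes` to the end and leaves the other three in their original order.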


Need help? Email support@edilitics.com with your workspace, job ID, and context. We reply within one business day.
