
The interrogation phase: data quality

1. The interrogation phase: data quality

You've defined your task and understood your data. Now comes the part that often eats hours of an analyst's life: checking the data for issues before you trust any insight that comes out of it. This is the interrogation phase — and AI changes both how fast you can do it, and what's actually possible.

2. The usual suspects

You know the standard data quality issues: missing values, duplicate records, data type mismatches. AI can handle many of these as quickly as manual workflows, but the more interesting story is what it lets you tackle that you couldn't easily do before.

3. The harder problems AI unlocks

There are three classes of issues that have always been painful to handle, and where AI is genuinely transformative. Each one is doable without AI, but only slowly and with brittle rules.

4. Fuzzy duplicates

"Apple Inc" and "Apple Incorporated" are the same company — but a text-match query treats them as separate. Traditionally, you'd hand-craft fuzzy-match rules and keep adding to them as new variants appear. AI takes both pieces off your plate: it generates the matching logic for you, and it brings something string-similarity alone can't — a semantic understanding of what these entities actually are. The output is a query or a proposed grouping you can read and adjust.
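To make the traditional approach concrete, here is a minimal sketch of hand-crafted fuzzy matching using only Python's standard library. The `SUFFIXES` map, the function names, and the 0.9 threshold are illustrative assumptions, not anything prescribed by the course. Note that this is pure string similarity: it has none of the semantic understanding described above, which is exactly why such rules need constant additions.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation map; a real one would grow as new variants appear.
SUFFIXES = {"inc": "incorporated", "corp": "corporation", "ltd": "limited"}


def normalize(name: str) -> str:
    """Lowercase, strip punctuation, and expand known suffix abbreviations."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(SUFFIXES.get(token, token) for token in tokens)


def similarity(a: str, b: str) -> float:
    """String similarity between two normalized names, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_fuzzy_duplicates(names, threshold=0.9):
    """Greedily put each name in the first group whose first member is similar enough."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            # No existing group matched, so this name starts a new one.
            groups.append([name])
    return groups


companies = ["Apple Inc", "Apple Incorporated", "Apple Inc.", "Microsoft Corp"]
groups = group_fuzzy_duplicates(companies)
```

Whether the grouping comes from rules like these or from AI, the result is a proposed grouping that you can read and adjust before merging any records.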

5. Time-order violations

Some data issues are about logic across columns: for example, a product shipped before it was ordered, or an invoice paid before it was issued. Traditionally, you'd write SQL that compares timestamps with ordering constraints, once you've worked out which fields to compare. The SQL is rarely the painful part; identifying the right rule is. With AI, you describe what shouldn't happen, and it both writes the query and flags the rows that break the rule.
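As an illustration of the kind of query involved, here is a small sketch that runs a time-order check against an in-memory SQLite database. The `orders` table, its column names, and the sample rows are hypothetical, and dates are assumed to be stored as ISO 8601 text so that string comparison matches chronological order.

```python
import sqlite3

# Hypothetical table and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, ordered_at TEXT, shipped_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2024-03-01", "2024-03-03"),  # valid: shipped after ordering
        (2, "2024-03-05", "2024-03-04"),  # violation: shipped before ordering
        (3, "2024-03-07", None),          # not shipped yet, so nothing to compare
    ],
)

# The rule "a product cannot ship before it is ordered", written as a filter.
# Rows with a NULL shipped_at are skipped because the comparison yields NULL.
violations = conn.execute(
    """
    SELECT order_id, ordered_at, shipped_at
    FROM orders
    WHERE shipped_at < ordered_at
    ORDER BY order_id
    """
).fetchall()

for order_id, ordered_at, shipped_at in violations:
    print(f"Order {order_id}: shipped {shipped_at}, ordered {ordered_at}")

conn.close()
```

The query itself is one line of logic; the work is in deciding that `shipped_at` and `ordered_at` are the right pair of fields to compare.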

6. Logical inconsistencies

Then there's the kind of error AI is uniquely positioned to spot: logical inconsistencies. The city is "London" but the country is "Japan". Or the product category is "Dairy" but the product is "Bread". A traditional check requires reference tables of valid combinations, and someone to maintain them. AI brings its own world knowledge and flags combinations that don't make sense, with no reference data required.
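For comparison, the traditional reference-table check looks roughly like the sketch below. The `VALID_COUNTRIES` table and the function name are made up for illustration, and the table is deliberately tiny. Any city missing from it is simply not checked, which is the maintenance burden described above.

```python
# Hypothetical reference table mapping each city to the countries it can belong to.
# London appears in both the United Kingdom and Canada (London, Ontario).
VALID_COUNTRIES = {
    "London": {"United Kingdom", "Canada"},
    "Tokyo": {"Japan"},
}


def find_inconsistent_rows(rows, valid=VALID_COUNTRIES):
    """Return the indexes of rows whose city/country pair is not in the table."""
    flagged = []
    for index, row in enumerate(rows):
        allowed = valid.get(row["city"])
        # Cities missing from the reference table cannot be checked at all.
        if allowed is not None and row["country"] not in allowed:
            flagged.append(index)
    return flagged


records = [
    {"city": "London", "country": "Japan"},    # inconsistent
    {"city": "London", "country": "Canada"},   # valid, though easy to misflag
    {"city": "Tokyo", "country": "Japan"},     # valid
    {"city": "Springfield", "country": "USA"}, # not in the table, so unchecked
]
flagged = find_inconsistent_rows(records)
```

A small table like this can also serve as a spot check on combinations that AI has flagged, such as the London, Canada row, where a flag would be a false positive.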

7. Interrogate the AI's work

Here's the important part: AI isn't doing magic. In most cases it's generating a query, a piece of Python, or a set of rules and running them against your data. That work can be wrong — a bad regex, an incorrect timestamp comparison, a hallucinated business rule. Before you trust any finding, look under the hood. Read the query. Check the logic. Verify on a sample.
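One way to "verify on a sample" is to pull a small, reproducible random subset of the flagged rows and check each one by hand. The sketch below assumes the flagged rows are identified by integer IDs; the function name, sample size, and seed are illustrative choices.

```python
import random


def sample_for_review(flagged_ids, k=5, seed=0):
    """Pick up to k flagged IDs for manual inspection, reproducibly."""
    ids = sorted(flagged_ids)
    if len(ids) <= k:
        # Few enough findings to review every one of them.
        return ids
    # A fixed seed means a colleague can draw the same sample later.
    rng = random.Random(seed)
    return sorted(rng.sample(ids, k))


# Hypothetical IDs of rows that AI flagged as problematic.
flagged_ids = [101, 104, 117, 120, 133, 142, 150, 158, 163, 171]
to_review = sample_for_review(flagged_ids)
```

If several rows in the sample turn out to be false positives, that is a signal to reread the generated query or rule before trusting the rest of the findings.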

8. Fixing the issues

Once you've validated what AI has found, fixing things depends on your platform. In this course, you'll have AI generate cleaned data that you could re-upload to your BI stack. In your own workflow, integrations and account tiers determine whether AI can write changes back directly. The findings transfer; how you fix them depends on where you work.

9. Let's practice!

Time to start interrogating your data!
