Database design and optimization with AI
1. Database design and optimization with AI
Welcome back! This will be an exciting video. Now that we've tuned Atlas into a performant, well-maintained toolkit, it's time to plug in our real production data.2. Plugging in real data
First, we create a dedicated folder under Atlas's /usecase directory to host our tourism dataset. If we simply point Atlas to this new data, we hit an error: sqlite3 OperationalError: no such table: activity events. At first glance, it looks like the table is missing. But if we take a closer look, we realize the table exists, but it is stored under a different name. This happens because our database schema does not match the schema Atlas expects.3. AI for schema mapping
We could manually rename tables and columns until everything lines up, but that's tedious, error-prone work. This "inspect and match" task is perfect for an AI model, which can map and align schemas much faster than a human could. To do this, we could use a prompt like the one shown.4. AI maps the schema
In seconds, the model proposes a clear mapping and generates the migration steps, so we can plug our data into the Atlas pipeline.5. Real-world data updates
Now let's talk about real-world data updates. Wayfarer ingests booking events from partners like GetYourGuide and similar marketplaces. Those partners often send duplicate events, cancellation corrections, and route adjustments. If new data arrives or older data is corrected, the Atlas KPIs become inaccurate unless we first reconcile the database. That's why Wayfarer runs controlled reconciliation through CRUD operations.6. AI-assisted duplicate detection
For example, imagine a partner sends a new batch of booking events. Because multiple providers may scrape from the same upstream API, duplicates are common. Before we insert anything into production, we need to detect whether entries in the incoming batch already exist.7. Designing the detection query
An AI model can help us design the query to enforce this type of data quality. Again, instead of asking the model to "find duplicates" directly, we ask it to generate a reusable SQL query, so that we can run that very same query for every new batch.8. Query performance optimization
Next, AI can help us analyze and optimize query performance. Let's say we run this query multiple times a day to compute a key metric. For example, "conversion rate by city over the last 30 days." The query works, but it's starting to get slow as the dataset grows. We can ask the model to profile the query and propose indexing strategies.9. AI profiles the query
We can see the model identifying expensive joins, recommending targeted indexes, and rewriting the query to reduce scans and unnecessary computations.10. Database design at scale
Finally, we can use AI for higher-level design recommendations: normalization, partitioning, and scaling. As Wayfarer's data grows, we need to decide what belongs in a single table, what should be normalized into separate entities, and when we should partition or split storage to keep performance predictable.11. AI-driven design decisions
We can see how the model generates an action plan based on best impact-to-effort. With that, we’re not just writing queries. We are leveraging AI to design a database that stays fast, reliable, and maintainable as Atlas moves into real production workloads.12. Let's practice!
Now it's your turn to get hands-on and put all of this into practice!Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.