How is a raven like a writing desk? Actually, I have no idea. But I do know how an LLM is like a genie.
It’s all about wishes. Remember all those tales about wishes gone awry? “The Fisherman and His Wife” and “The Monkey’s Paw” are but two examples. In both stories, wishes that are not framed very carefully lead to outcomes that range from comedically awkward to horrific. Asking a genie to grant a wish is fraught with risk.
And now LLMs can reproduce that magic. All you have to do is ask.
Specifically, all you have to do is ask the LLM to generate SQL. According to research posted on arXiv, roughly 37% of the SQL generated by today’s LLMs is wrong. But that’s only the beginning. The real problem is when the LLM helpfully invents data to go along with its SQL. After all, your wish is its command.
If We’re Careful, It Can’t Hallucinate. Right?
One Fortune-100 finance team’s logic was simple: if the model generates valid SQL and shows the raw results, there’s no room for hallucinations. Right? Right? Wrong.
LLMs plan. They are very good at planning. They figure out what they expect to see. After all, what is an LLM but a highly evolved text prediction machine? So they predict. They expect. And when reality doesn’t match their expectations? They create. They fill in the blanks to satisfy their own plans.
Updated your schema? Changed “customer_revenue” to “net_revenue”? Ask for customer_revenue and the LLM will create numbers for customer_revenue. While gratifying to the ego, perhaps not the best information to base projections on. That’s what the finance team discovered. Awkward.
This isn’t an isolated case. Research posted on arXiv estimates that it happens with almost 40% of text-to-SQL queries.
Give an LLM an empty cell and before long you’ll have a very convincing-looking world. Unreal. Literally unreal. But convincing.
At this point, you might wish for a better solution.
Beware of Great Expectations
The problem is that data hallucinations aren’t a software bug in the traditional sense. Rather, they’re an expectation bug. What the LLM expects is what you get. Control the expectations and you control the result.
SWIRL does just that. It did just that for the aforementioned Fortune-100 finance team.
Crafting a Better Wish
Sitting between your LLMs and your data, SWIRL applies the security roles and schema rules you already know and trust. Every query passes through SWIRL’s orchestration layer. Before the query executes, SWIRL checks requested columns against the live schema and user permissions. If the query is asking for something that doesn’t exist, SWIRL refines your wish. Clarifies it. Makes sure it is asking for something real.
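To make the idea concrete, here is a minimal sketch of that pre-execution check in Python. This is not SWIRL’s actual code; the function and variable names are hypothetical, and a real orchestration layer would parse the generated SQL and pull the schema and permissions from live sources rather than hard-coded sets.

```python
# Hypothetical sketch of pre-execution validation (not SWIRL's implementation).
def validate_request(requested_columns, live_schema, user_columns):
    """Flag columns that don't exist in the schema or aren't permitted."""
    missing = [c for c in requested_columns if c not in live_schema]
    forbidden = [c for c in requested_columns
                 if c in live_schema and c not in user_columns]
    return missing, forbidden

# Illustrative schema: "customer_revenue" was renamed to "net_revenue".
live_schema = {"net_revenue", "customer_id", "region"}
user_columns = {"net_revenue", "region"}  # what this user may query

missing, forbidden = validate_request(
    ["customer_revenue", "net_revenue"], live_schema, user_columns)
# "customer_revenue" no longer exists, so the wish gets refined before
# execution instead of letting the LLM invent numbers to fill it.
```

The point of checking before execution is that a nonexistent column is caught as a schema mismatch, never handed back to the model as a blank to fill.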
After running the query, SWIRL further checks the results. SWIRL makes the LLM explain why each returned column is present. SWIRL will even make the LLM apologize if it makes a mistake. It’s not clear if this causes LLMs to become more polite, but it does get them to retry until results and reality align (try that with a genie!).
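The retry behavior can be sketched the same way. Again, this is an illustrative stand-in, not SWIRL’s code: `generate` is a stub in place of a real LLM call, and the verification step here simply checks returned columns against the live schema.

```python
# Hypothetical retry loop: keep querying until results match reality.
def verified(results, live_schema):
    """Every returned column must exist in the live schema."""
    return all(col in live_schema for col in results)

def run_with_retries(generate, live_schema, max_retries=3):
    for attempt in range(max_retries):
        results = generate(attempt)
        if verified(results, live_schema):
            return results
    raise RuntimeError("results never matched the live schema")

live_schema = {"net_revenue", "region"}
# First attempt hallucinates a column; the retry corrects it.
attempts = [{"customer_revenue": 1.0}, {"net_revenue": 1.0}]
result = run_with_retries(lambda i: attempts[i], live_schema)
```

The loop terminates either with results that agree with the schema or with an explicit error, so a hallucinated column can never reach the user silently.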
When the dataset is sparse, SWIRL switches the LLM from “answer” to “analysis.” That way it’ll tell you what’s uncertain. No made-up data. No false expectations. No awkward conversations with Compliance.
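The mode switch itself amounts to a small decision rule. The threshold and mode names below are illustrative assumptions, not SWIRL’s actual configuration:

```python
# Hypothetical sketch of sparse-data mode switching.
def choose_mode(result_rows, min_rows=10):
    """Sparse results trigger "analysis" mode, where the model describes
    what is uncertain instead of producing an authoritative answer."""
    return "analysis" if len(result_rows) < min_rows else "answer"

mode_empty = choose_mode([])           # no data: surface uncertainty
mode_full = choose_mode([{}] * 500)    # plenty of data: answer directly
```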
For our Fortune-100 finance team, SWIRL delivered sub-second responses with zero hallucinated metrics. No surprise columns. No phantom rows. Just truthful answers. A wish that worked.
Aladdin could have benefitted from SWIRL.
To find out how SWIRL can help you get real answers from your data, request a demo, download our white papers, or contact us for more information.