Hitting 80% on BIRD is a huge milestone for text-to-SQL, but let’s be real: production environments are messy. Real-world schemas aren't as clean as benchmark schemas, and hallucinations in generated complex joins are still a nightmare. I’m curious whether this handles enterprise-level security and row-level permissions out of the box, or if we’re still stuck building custom guardrails.
It's definitely a step forward for natural language interfaces, but I’d bet most devs will still prefer an ORM over letting an LLM write raw SQL queries directly against their production DB. 🤖