It started with what I thought was a casual 1:1 with my new manager.
I had just joined a new company as a Python backend dev. Two weeks in, still figuring out our codebase, I was expecting the usual onboarding chatter. But then he says:
"We've been thinking about building an AI assistant for internal data. Something that can turn plain English into SQL."
I smiled, nodded… and inside I was like: "Oh no. I don't know the first thing about building that."
But I also really wanted to build it.
So I did what any decent dev does: I googled my way through it. And this is the story of how I, a non-AI-specialist, built a working internal AI SQL assistant from scratch using Python, open-source tools, and a lot of trial and error.
Step 1: Don't Build an Assistant. Just Answer One Question.
The phrase "AI Assistant" sounds intimidating. So I didn't build one.
Instead, I broke it down. What the team really wanted was to ask questions like:
- "List all employees who joined last year"
- "Show me average salary by department"
…and get a usable SQL query in return. That's it.
So I set myself one goal: Take a sentence in plain English → return a working SQL query.
No chatbot. No user accounts. Just a Flask app with a box to type in and a box to show output.
Step 2: Pick a Model That Actually Knows SQL
There are hundreds of large language models out there, but most are trained on general text. I needed one that understood SQL deeply.
I tested:
- OpenAI's GPT-3.5: surprisingly good, but $$$ and privacy concerns.
- T5 fine-tuned on WikiSQL: decent for basics, broke easily on custom schemas.
- Defog's `sqlcoder-7b-2` (💡 my winner): open-source, SQL-native, and surprisingly good with schema-aware prompting.
And here's the best part: you can run it locally using 4-bit quantization with bitsandbytes, so you don't need a $10k GPU setup.
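To see why quantization matters, here's a quick back-of-envelope calculation (assuming weights dominate memory use, which they do for inference) of roughly how much VRAM a 7B-parameter model needs at different precisions:

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold model weights, in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3

fp16 = weight_memory_gib(7e9, 16)  # full half-precision weights: ~13 GiB
int4 = weight_memory_gib(7e9, 4)   # 4-bit quantized weights: ~3.3 GiB

print(f"fp16: ~{fp16:.1f} GiB, 4-bit: ~{int4:.1f} GiB")
```

At 4 bits, the weights fit comfortably on a consumer GPU (the KV cache and activations add a bit on top of this).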
Step 3: Build a Tiny Flask App to Run Local Inference
I didn't need anything fancy. I just wanted a simple playground. Here's a mini version of the app's core logic:
```python
from flask import Flask, request, jsonify
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

app = Flask(__name__)

model_name = "defog/sqlcoder-7b-2"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_4bit=True,  # requires bitsandbytes; fits the 7B model in a few GB of VRAM
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

@app.route("/generate", methods=["POST"])
def generate_sql():
    data = request.json
    question = data["question"]
    prompt = f"""You are a SQL expert. Use the following schema:

Table: employees
Columns: id, name, department, salary, date_joined

### Convert the following question into a SQL query:
{question}

SQL:
"""
    output = generator(prompt, max_new_tokens=150, do_sample=False)[0]["generated_text"]
    return jsonify({"sql": output.split("SQL:")[-1].strip()})
```

Send a POST request with `{ "question": "List employees in marketing" }`
...and out comes a SQL query.
Step 4: Prompting Is the Hidden Superpower
At first, I let users type questions and passed them directly to the model. The results were… creative. The model hallucinated tables and columns like revenue_data_2022.
Turns out, prompting matters a lot.
By including a schema definition in the prompt every time, I gave the model enough grounding to stay accurate. Here's my working template:
You are a SQL expert.
Use the following schema:
Table: employees
Columns: id, name, department, salary, date_joined
Convert the following question into a SQL query:
[question goes here]

This one change improved results by about 70%.
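To avoid copy-pasting the schema everywhere, I keep the template in one small helper. A minimal sketch (the `build_prompt` name and its arguments are my own convention, not part of any library):

```python
def build_prompt(table: str, columns: list[str], question: str) -> str:
    """Render the schema-grounded prompt template for one question."""
    return (
        "You are a SQL expert.\n"
        "Use the following schema:\n"
        f"Table: {table}\n"
        f"Columns: {', '.join(columns)}\n"
        "Convert the following question into a SQL query:\n"
        f"{question}\n"
        "SQL:"
    )

prompt = build_prompt(
    "employees",
    ["id", "name", "department", "salary", "date_joined"],
    "Show me average salary by department",
)
```

Keeping the schema in data rather than hard-coded text also makes it easy to support more tables later: just pass a different table and column list.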
Step 5: Make It Better (But Only When It Breaks)
The coolest part? Once I shared the tool internally, other devs started using it — and breaking it. That's when the real learning happened.
I added:
- A feedback button ("Was this SQL correct?")
- A log of all questions + generated queries
- A filter to remove unsafe queries (`DROP`, `DELETE`, etc.)
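The filter itself can be as simple as a keyword check. Here's one way to sketch it (the keyword list beyond `DROP` and `DELETE` is my own extension, and a real deployment should also run queries under a read-only database role):

```python
import re

# Statements the assistant should never be allowed to produce for execution.
FORBIDDEN_KEYWORDS = ("DROP", "DELETE", "TRUNCATE", "ALTER", "UPDATE", "INSERT")

def is_safe_sql(sql: str) -> bool:
    """Reject queries containing destructive keywords (whole-word, case-insensitive)."""
    return not any(
        re.search(rf"\b{kw}\b", sql, flags=re.IGNORECASE)
        for kw in FORBIDDEN_KEYWORDS
    )
```

The word-boundary regex matters: a naive substring check would flag harmless column names, while `\bUPDATE\b` won't trip on a column like `date_joined`.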
I didn't try to turn it into a polished product. I just made small improvements when the current version failed.
What I Learned (That Might Help You)
- Start small. Build a playground, not a platform.
- Use schema-aware prompts. LLMs aren't mind readers.
- Test on real questions. You'll find edge cases fast.
- Let usage guide improvements. Feedback beats guessing.
Final Thoughts
I still don't consider myself an AI engineer.
But this project showed me that you don't need to be one to build something useful with AI — especially when you know how to break a problem down, ask good questions, and iterate fast.
If you're building something similar, or thinking about it, feel free to reach out. Or if you want the base Flask app as a starting point, just drop a comment — I'll happily share it.