Ever stared blankly at a search bar, trying to remember the right keyword to find that perfect travel backpack with USB charging, waterproof lining, and secret compartments?

Yeah, same.

That's why at Nobs, we built an AI that doesn't just search; it understands. Here's how we trained our bot to go from "just keywords" to "shopping buddy with brains."

🧠 Step 1: Give the Bot Some Context

Product listings are messy: you've got names, prices, specs, and descriptions. We mashed all the good stuff together into one rich, contextual string:

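import pandas as pd

def combine_attributes(row):
    # Gather the descriptive fields; SellingPrice may be absent from some rows
    attributes = [row['Name'], row['Category'], row.get('SellingPrice', ''),
                  row['Description'], row['Specification']]
    # Drop NaN and empty values so they don't pollute the combined string
    attributes = [str(attr) for attr in attributes if pd.notna(attr) and attr != ""]
    return " ".join(attributes)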

Now, instead of just searching by product name, the AI learns from the entire product story.
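To make the "one rich string" idea concrete, here is a small dependency-free sketch. It uses a plain dict in place of a DataFrame row, and the sample product and the `is_missing` helper are invented for illustration, so treat it as an approximation of the function above rather than the production code.

```python
import math

def is_missing(value):
    """True for None, empty strings, and float NaN (how pandas marks blank cells)."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)

def combine_attributes(row):
    # Same field order as the function above; missing fields are skipped.
    fields = ["Name", "Category", "SellingPrice", "Description", "Specification"]
    values = [row.get(field) for field in fields]
    return " ".join(str(v) for v in values if not is_missing(v))

# Hypothetical product, made up for this example.
sample = {
    "Name": "TrailPack 40L",
    "Category": "Backpacks",
    "SellingPrice": 2499,
    "Description": "Waterproof hiking backpack with USB charging port",
    "Specification": float("nan"),  # simulates a blank spreadsheet cell
}

combined = combine_attributes(sample)
print(combined)
```

The blank specification simply drops out, so the embedding model only sees text that actually describes the product.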

📦 Step 2: Teach It to Remember

We used OpenAI's text-embedding-ada-002 model to turn each product into a 1536-dimension vector. Think of it like compressing each product into a "meaningful fingerprint" of its essence:

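import openai  # legacy (pre-1.0) SDK interface

# `chunk` is one batch of product strings; `embeddings` collects vectors across batches
response = openai.Embedding.create(
    model="text-embedding-ada-002",
    input=chunk
)
embeddings.extend([result["embedding"] for result in response["data"]])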

We cached the embeddings to avoid burning through credits every time someone searched for "Bluetooth earphones under 2k with mic."

🔍 Step 3: Understand Queries Like a Human

When a user types "Need something rugged and waterproof for mountain hiking," most search engines throw up their hands. Ours doesn't.

We built a little GPT-based assistant to refine and extract meaningful keywords from such natural queries:

# Legacy (pre-1.0) openai SDK; the system prompt tells the model to keep only search terms
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Extract key search terms for e-commerce."},
        {"role": "user", "content": query}
    ]
)
enhanced_query = response.choices[0].message.content.strip()

The AI essentially rephrases vague questions into sharp, searchable terms.

🔗 Step 4: Match Like a Pro

We match the user query embedding against all product embeddings using good ol' cosine similarity:

from sklearn.metrics.pairwise import cosine_similarity
similarities = cosine_similarity([query_embedding], chunk_embeddings).flatten()

Then we sort products by similarity and keep the top results, and boom, relevant recommendations appear like magic ✨
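For readers who want to see the ranking step end to end, here is a minimal pure-Python sketch of cosine similarity plus a top-k sort. The `top_k` helper and the toy 3-dimensional vectors are assumptions for illustration; real ada-002 embeddings have 1536 dimensions, and a vectorized library call is the better choice at scale.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k(query_embedding, product_embeddings, k=3):
    """Return (product_index, score) pairs for the k most similar products."""
    scores = [cosine_similarity(query_embedding, e) for e in product_embeddings]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k]]

# Toy vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
products = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction, different length
    [1.0, 1.0, 0.0],  # 45 degrees away
]

results = top_k(query, products, k=2)
for index, score in results:
    print(index, round(score, 4))
```

Note that the second product wins even though its vector is twice as long as the query: cosine similarity compares direction only, which is why it suits embeddings.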

🧪 Bonus: Caching, Chunking & Smart Retries

We wrapped the heavy lifting in a retry loop to survive API hiccups, used chunking to dodge rate limits, and pickle-dumped embeddings to disk for performance:

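import pickle

try:
    with open(embeddings_file, 'rb') as f:
        product_embeddings = pickle.load(f)
except (FileNotFoundError, EOFError, pickle.UnpicklingError):
    # Cache missing or unreadable: regenerate instead of hiding unrelated errors
    product_embeddings = generate_embeddings_in_chunks(product_descriptions)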

No drama, no downtime, just AI doing what it does best.
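The chunking and retry logic isn't shown in the post, so here is one way it could look. This is a sketch under assumptions: `chunked`, `with_retries`, and `embed_in_chunks` are hypothetical names, the embedding call is passed in as `embed_fn` so the example runs without an API key, and the exponential backoff schedule is our choice rather than a confirmed detail.

```python
import time

def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def with_retries(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**n and retry, re-raising at the end."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))

def embed_in_chunks(texts, embed_fn, chunk_size=100, sleep=time.sleep):
    """Embed texts batch by batch, retrying each batch independently."""
    embeddings = []
    for chunk in chunked(texts, chunk_size):
        embeddings.extend(with_retries(lambda: embed_fn(chunk), sleep=sleep))
    return embeddings

# Fake embedder that fails once, standing in for a rate-limited API call.
calls = {"count": 0}

def flaky_embed(chunk):
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("simulated rate limit")
    return [[float(len(text))] for text in chunk]

vectors = embed_in_chunks(
    ["usb backpack", "earphones", "tent"],
    flaky_embed,
    chunk_size=2,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(vectors, calls["count"])
```

Retrying per batch means one failed request only repeats that batch, not the whole catalog.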

🚀 Final Thoughts

We didn't reinvent search. We just made it smarter. More human. More helpful. That's the Nobs way: no BS, just results.

Got a messy product catalog or confusing customer queries? Let's talk.

Have questions? Reach out directly at allvaluenobs@gmail.com. We're here to help you get the most from your data.

NoBS is a data science and machine learning company leveraging smart, affordable talent to deliver real impact without inflated costs. We focus on outcomes, not jargon: if we don't deliver value, you don't pay. Specializing in practical, scalable solutions for small and mid-sized companies, we use open-source tools and a no-nonsense approach to keep things simple and cost-effective.

With over 100 solutions built across industries, we're honest, affordable, and focused on results that truly matter.

🌐 Website: www.no-bs.in