A Retail Product Mapping Story About Why Agentic AI Requires Iteration, Not Assembly

For almost three decades, enterprise systems followed a comforting pattern: gather requirements, design components, assemble modules, test workflows, deploy, and then… the system stays predictable until someone changes the code.

But agentic AI refuses to play by those rules.

The moment you ask an AI agent to reason, interpret, and choose — especially in a chaotic real-world domain like global retail product mapping — you're no longer dealing with deterministic software. You're dealing with an evolving system. And evolving systems aren't built once. They are grown through cycles of use, failure, feedback, and refinement.

Let's ground this idea in a real scenario: a global shoe retailer operating in 100+ countries, sourcing data from dozens of vendors, each with their own naming conventions, spreadsheets, marketplaces, and internal codes. Every day, a team manually maps thousands of vendor SKUs to a global product master. Some arrive with clean names; many arrive with mystery text, missing fields, or the kind of noise that makes you wonder if the vendor is trolling you.

This is exactly the world where an agentic AI system can shine — but only if we accept that its development is not a one-time engineering exercise. It is an iterative intelligence-building process.

Below is a complete, narrative walkthrough structured around five steps of building an agentic AI system, expressed through a retail product-mapping use case.

A. Narrow the Use Case + Set Up the Data (Step 1: Frame a Manageable Decision Task)

Agentic AI does not succeed when you throw the entire enterprise at it. It succeeds when you start with a small, painful, repetitive decision task that humans currently solve by hand, over and over.

In our retailer's case, we scope the POC to just two types of vendor product-name problems:

1. The "Unknown or Placeholder" Name Problem

These are rows where the vendor gives you almost no usable signal:

  • NULL
  • "N/A"
  • "Shoes"
  • "TBD"
  • "Style 123"
  • "Product 1"
  • "Sample"

You can't trust these. You must rely on attributes (brand, gender, size, colour, GTIN/EAN).

2. The "Noisy or Decorated" Name Problem

These rows contain a product name — but buried under layers of promotions, tags, channel indicators, or vendor codes:

  • BEST SELLER NIKE_AIR_ZOOM_XYZ123_RED_42_EU
  • "Air Zoom Pegasus 40 (Pack of 2) EU 42 Red"
  • [AMAZON FR SALE] Nike Pegasus 40 NEW

Marketing fluff, emojis, platform tags, and embedded technical details all distort the actual identity of the product.

To make the agent's job visible and measurable, we create a tight data model in Unity Catalog:

Golden Truth Tables

product_master: A clean, canonical representation of every global shoe product.

Fields like: product_id, brand, model_name, category, gender, colour, size_scale, size_value, gtin_ean, description_canonical.

product_alias: Every known vendor spelling, nickname, noisy title, or historic name.

Raw Input Table

raw_product_ingest: Exactly what vendors send — untouched, unfiltered.

Fields like: vendor_id, vendor_sku, product_name_raw, brand_raw, size_raw, colour_raw, gtin_ean_raw, description_raw, country_code, price, channel.
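To make the shape of the problem concrete, here is a hypothetical raw_product_ingest row as a Python dict (field names come from the table above; every value is invented for illustration):

```python
# Hypothetical example row for raw_product_ingest; all values are invented.
raw_row = {
    "vendor_id": "V-001",
    "vendor_sku": "XYZ123",
    "product_name_raw": "[AMAZON FR SALE] Nike Pegasus 40 NEW",
    "brand_raw": "NIKE",
    "size_raw": "42 EU",
    "colour_raw": "RED",
    "gtin_ean_raw": "0001234567890",
    "description_raw": None,
    "country_code": "FR",
    "price": 129.99,
    "channel": "AMAZON_FR",
}

# The POC question, restated in code terms: map this row to a product_id
# in product_master, or decide that it is a new product.
```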

The POC Question

Once the data is cleanly defined, the POC becomes a crisp scientific question:

"Given this raw vendor SKU row, can an agent clean the name or compensate for missing names, and map it to a global product_id — or determine that it's a new product — using only a small set of tools?"

This is the first step of agentic system development: start small, start narrow, and start somewhere you can observe change.

B. Build a Tiny Toolset (Step 2: Create a Minimal Functional Loop)

A classic mistake teams make is building twenty tiny helper functions and hoping the agent figures them out. Instead, we build four 'fat' Unity Catalog functions, each capturing a major stage of human reasoning.

These functions don't "solve the problem"; they structure the problem so the agent has reliable primitives.

1. fn_clean_product_name — Turning Vendor Noise Into Meaning

This function removes clutter from product_name_raw: emojis, promotions, channel tags, pack indicators, vendor codes.

It normalizes spacing, case, brand placement, and returns:

  • a clean_name
  • extracted decorations_found (e.g. SALE, AMAZON, MULTIPACK)
  • a name_quality signal (STRONG, WEAK, UNKNOWN)

This is the agent's first lens.
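A minimal Python sketch of what this lens might do (the real implementation is a Unity Catalog function; the decoration patterns and the quality rule below are illustrative assumptions, not the production logic):

```python
import re

# Illustrative decoration patterns; a production cleaner would be vendor-tuned.
DECORATIONS = {
    "SALE": r"\bSALE\b",
    "AMAZON": r"\bAMAZON\b",
    "MULTIPACK": r"\(pack of \d+\)",
    "BEST_SELLER": r"\bBEST SELLER\b",
    "NEW": r"\bNEW\b",
}

def clean_product_name(raw):
    """Strip decorations from a raw vendor name and grade what remains."""
    if not raw or not raw.strip():
        return {"clean_name": "", "decorations_found": [], "name_quality": "UNKNOWN"}

    text, found = raw, []
    for tag, pattern in DECORATIONS.items():
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(tag)
            text = re.sub(pattern, " ", text, flags=re.IGNORECASE)

    text = re.sub(r"[\[\]]", " ", text)       # drop bracket tags
    text = re.sub(r"_+", " ", text)           # underscores -> spaces
    text = re.sub(r"\s+", " ", text).strip()  # normalize whitespace

    # Crude quality signal: does enough text remain to look like a real title?
    quality = "STRONG" if len(text.split()) >= 3 else "WEAK" if text else "UNKNOWN"
    return {"clean_name": text, "decorations_found": found, "name_quality": quality}
```

Running it on the noisy example from earlier, `"[AMAZON FR SALE] Nike Pegasus 40 NEW"`, strips the channel and promo tags while keeping the recoverable title text.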

2. fn_detect_placeholder_name — Knowing When a Name Is Useless

This function identifies when the raw product name is essentially a placeholder:

  • NULL, UNKNOWN, SHOES, FOOTWEAR, PRODUCT 1

It returns:

  • a placeholder_flag
  • a placeholder_type

Now the agent knows when not to trust the name at all.
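A hedged sketch of the detection logic, using the placeholder examples from the text (the pattern list and type names here are assumptions; the real UC function would grow them from vendor data):

```python
import re

# Illustrative placeholder patterns; real vendor data would extend this list.
PLACEHOLDER_PATTERNS = {
    "MISSING": r"N/?A|NULL|NONE|TBD|UNKNOWN",
    "GENERIC_CATEGORY": r"SHOES?|FOOTWEAR",
    "SEQUENCE": r"(PRODUCT|STYLE|SAMPLE|ITEM)\s*\d*",
}

def detect_placeholder_name(raw):
    """Flag names that carry no usable product signal."""
    text = (raw or "").strip().upper()
    if not text:
        return {"placeholder_flag": True, "placeholder_type": "MISSING"}
    for ptype, pattern in PLACEHOLDER_PATTERNS.items():
        if re.fullmatch(pattern, text):
            return {"placeholder_flag": True, "placeholder_type": ptype}
    return {"placeholder_flag": False, "placeholder_type": None}
```

A genuine title such as "Air Zoom Pegasus 40" passes straight through, while "Style 123" and "N/A" are flagged before they can pollute the search step.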

3. fn_candidate_product_search — Searching for the Right Product

This is a multi-strategy retrieval function combining:

  • GTIN/EAN matching
  • Brand/category/gender filtering
  • Alias lookups
  • Fuzzy matching
  • Semantic similarity between cleaned name and canonical model_name / description

It returns a ranked list of plausible products, each with title/description/attribute scores.
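The multi-strategy idea can be sketched with a toy in-memory master and difflib fuzzy matching standing in for the real semantic search (the weights, the two sample products, and the scoring formula are all illustrative assumptions):

```python
from difflib import SequenceMatcher

# Toy product master; in the real system this is the product_master table.
PRODUCT_MASTER = [
    {"product_id": "P18902", "model_name": "Air Zoom Pegasus 40",
     "brand": "NIKE", "gtin_ean": "0001234567890"},
    {"product_id": "P20001", "model_name": "Gel-Kayano 30",
     "brand": "ASICS", "gtin_ean": "0009876543210"},
]

def candidate_product_search(clean_name, brand=None, gtin=None, top_k=5):
    """Rank master products by exact-GTIN, brand, and fuzzy-title signals."""
    candidates = []
    for product in PRODUCT_MASTER:
        title_score = SequenceMatcher(
            None, clean_name.lower(), product["model_name"].lower()).ratio()
        gtin_score = 1.0 if gtin and gtin == product["gtin_ean"] else 0.0
        brand_score = 1.0 if brand and brand.upper() == product["brand"] else 0.0
        # Illustrative weighting: GTIN dominates, then title, then brand.
        overall = 0.5 * gtin_score + 0.35 * title_score + 0.15 * brand_score
        candidates.append({**product,
                           "title_score": round(title_score, 3),
                           "gtin_score": gtin_score,
                           "overall": round(overall, 3)})
    return sorted(candidates, key=lambda c: c["overall"], reverse=True)[:top_k]
```

Even this toy version shows the key design choice: GTIN is a near-proof identity signal, so it outweighs a fuzzy title, which in turn outweighs brand alone.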

4. fn_confidence_and_decision — The Judge

Given the raw row and the candidates, this function evaluates:

  • structural alignment
  • attribute alignment
  • similarity signals
  • placeholder indicators

It returns:

  • chosen_product_id
  • decision (AUTO_ACCEPT / NEEDS_REVIEW / NO_MATCH)
  • confidence_score
  • human-readable reasoning

This is where the agent forms an output that humans can trust — or override. Together, these four functions create a minimal, end-to-end reasoning loop.
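A minimal sketch of the judging step, assuming candidates ranked with an `overall` score as in the search stage (the thresholds and the placeholder penalty are illustrative; in practice they would be calibrated from feedback):

```python
# Illustrative thresholds; in practice these would be calibrated per vendor
# from the feedback loop described later in the article.
AUTO_ACCEPT_THRESHOLD = 0.90
REVIEW_THRESHOLD = 0.60

def confidence_and_decision(candidates, placeholder_flag=False):
    """Pick the top candidate and turn its score into a reviewable decision."""
    if not candidates:
        return {"chosen_product_id": None, "decision": "NO_MATCH",
                "confidence_score": 0.0,
                "reasoning": "No plausible candidates returned by search."}

    best = candidates[0]
    confidence = best["overall"]
    # A placeholder name means the title signal is unreliable: be cautious.
    if placeholder_flag:
        confidence *= 0.8

    if confidence >= AUTO_ACCEPT_THRESHOLD:
        decision = "AUTO_ACCEPT"
    elif confidence >= REVIEW_THRESHOLD:
        decision = "NEEDS_REVIEW"
    else:
        decision = "NO_MATCH"

    return {"chosen_product_id": best["product_id"] if decision != "NO_MATCH" else None,
            "decision": decision,
            "confidence_score": round(confidence, 3),
            "reasoning": f"Top candidate {best['product_id']} scored {confidence:.2f}."}
```

Note the asymmetry: a placeholder name never blocks a match outright, it just demands more corroborating evidence before the row can be auto-accepted.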

This is Step 2 of agentic development: build only the tools needed for reasoning, not a large library of helpers.

C. Introduce Agentic Flow (Step 3: Add Orchestration + Reasoning)

Agentic AI is not about one function calling another. It's about deciding which functions to call, and why — like a junior merchandiser reasoning through each SKU.

We replicate that through three roles:

Knowledge Assistant — understanding the case

It reads the raw row and classifies it:

  • NAME_UNKNOWN → rely on GTIN + attributes
  • NOISY_NAME → clean name → search
  • ATTRIBUTE_ONLY → strong GTIN/size/colour, weak title

It picks the right playbook.

Supervisor — executing the playbook

Depending on the case, it sequences the UC functions.

For NOISY_NAME, for example:

  1. clean name
  2. detect placeholder
  3. candidate search
  4. decision

For NAME_UNKNOWN, it may skip cleaning entirely and rely purely on attributes.
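The classify-then-sequence pattern can be sketched as a plain dispatcher; the case names come from the text, while the step names and classification heuristics stand in for the real UC functions and are assumptions:

```python
# Playbooks: which steps run, in order, for each case type.
# Step names stand in for the UC functions described earlier.
PLAYBOOKS = {
    "NOISY_NAME": ["clean_name", "detect_placeholder", "candidate_search", "decision"],
    "NAME_UNKNOWN": ["candidate_search", "decision"],   # skip cleaning entirely
    "ATTRIBUTE_ONLY": ["candidate_search", "decision"],
}

def classify_case(row):
    """Knowledge Assistant: pick the playbook for a raw vendor row."""
    name = (row.get("product_name_raw") or "").strip()
    if not name or name.upper() in {"N/A", "TBD", "UNKNOWN", "SHOES"}:
        return "NAME_UNKNOWN"
    if row.get("gtin_ean_raw") and len(name.split()) < 2:
        return "ATTRIBUTE_ONLY"
    return "NOISY_NAME"

def run_playbook(row, tools):
    """Supervisor: execute the chosen playbook step by step."""
    case = classify_case(row)
    context = {"row": row, "case": case}
    for step in PLAYBOOKS[case]:
        context[step] = tools[step](context)  # each tool sees prior results
    return context
```

The important property is that the sequence is chosen per row, not hard-coded: a NAME_UNKNOWN row never wastes a cleaning call, and a noisy row never skips placeholder detection.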

Genie — the execution engine

Genie executes the SQL functions and returns a rich structured result:

{
  "vendor_sku": "XYZ123",
  "proposed_product_id": "P18902",
  "confidence_score": 0.982,
  "machine_reason": "GTIN match + size/colour aligned; title weak but consistent"
}

This is Step 3 of agentic development: the agent doesn't just compute — it chooses, reasons, and adapts its approach.

D. Human-in-the-Loop Review (Step 4: Put Humans in the Middle)

Agentic AI isn't meant to eliminate humans; it's meant to amplify them.

A Product Mapping Review App becomes the workspace where merchandisers validate, correct, or override agent decisions. The app shows:

  • raw vendor data
  • cleaned signals
  • agent proposal
  • reasoning + confidence
  • detected decorations
  • placeholder flags

Users can:

  • approve the mapping
  • reject with reason
  • choose a different product
  • create a new product in the master

Every action becomes a learning opportunity.

This is Step 4: The iteration loop requires humans not as "approvers" but as "teachers".

E. Feedback + Learning Loop (Step 5: Let the System Evolve)

The fifth step is the heart of agentic system development: closing the loop.

Every decision — automatic or manual — is logged in product_mapping_feedback. This includes:

  • cleaned name
  • candidate list
  • human corrections
  • confidence patterns
  • vendor-level quirks
  • attribute mismatches
  • decoration types

This feedback becomes the source of truth for improvement:

  • add new aliases
  • expand cleaning rules
  • update placeholder patterns
  • adjust auto-accept thresholds
  • improve semantic search
  • vendor-wise calibration
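One concrete instance of this loop is threshold calibration: a hedged sketch that recomputes the auto-accept cut-off from logged human verdicts (the field names follow the product_mapping_feedback description above; the calibration rule itself is an illustrative assumption):

```python
def calibrate_auto_accept_threshold(feedback_rows, target_precision=0.98):
    """Find the lowest confidence cut-off whose auto-accepts would have been
    at least target_precision correct, judged against human verdicts.

    feedback_rows: dicts with 'confidence_score' and 'human_approved' (bool),
    as logged in product_mapping_feedback.
    """
    rows = sorted(feedback_rows, key=lambda r: r["confidence_score"], reverse=True)
    best_threshold = 1.0  # fall back to "never auto-accept" if nothing qualifies
    approved = total = 0
    for row in rows:
        total += 1
        approved += 1 if row["human_approved"] else 0
        # Lower the cut-off only while cumulative precision stays acceptable.
        if approved / total >= target_precision:
            best_threshold = row["confidence_score"]
    return best_threshold
```

This is refinement without new code in the agent itself: the same decision function simply runs against a threshold that the feedback data has earned.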

Week after week, the system stabilizes. Not by adding more code, but by refining behavior.

That's the true nature of agentic AI: It becomes accurate because it lives long enough to learn.

Final Thoughts — You Don't Build an Agent. You Grow One.

When you zoom out across A to E, a clear story emerges:

  1. You framed a narrow, observable decision task.
  2. You built only the essential tools.
  3. You added reasoning and orchestration.
  4. You wrapped it with human expertise.
  5. You fed every correction back into the loop.

This is not "project delivery". It's not "prompt engineering". It's not "LLM integration".

It is the practice of growing an intelligent system.

And for a global retailer buried in inconsistent vendor data, this shift is transformative. You're no longer maintaining a brittle rule engine. You're cultivating a self-improving product classification ecosystem — one that becomes faster, smarter, and more aligned with reality every single day.