The Chunking Dilemma
Remember when we talked about chunking? We had a problem:
Small chunks:
"Einstein published his theory in 1905."
- ✅ Precise matching (easy to find with "Einstein 1905")
- ❌ Missing context (which theory? what was the impact?)

Large chunks:
[Entire 2000-word biography section]
- ✅ Full context
- ❌ Poor matching (query about "1905" buried in text)
We want both! Precise search + complete context.
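To see the trade-off concretely, here's a toy splitter — a naive period-based chunker for illustration only, not the `build_store` strategy used later:

```python
def chunk_text(text, sentences_per_chunk):
    """Naive chunker: split on '. ' and group sentences."""
    sentences = [s.strip() for s in text.split(". ") if s.strip()]
    return [". ".join(sentences[i:i + sentences_per_chunk])
            for i in range(0, len(sentences), sentences_per_chunk)]

bio = ("Einstein was born in 1879. He published relativity in 1905. "
       "The theory transformed physics. He won the Nobel Prize in 1921.")

small_chunks = chunk_text(bio, sentences_per_chunk=1)  # precise match, thin context
large_chunks = chunk_text(bio, sentences_per_chunk=4)  # rich context, diluted match
```

A query for "1905" pinpoints `small_chunks[1]` exactly, but that chunk alone can't say which theory or why it mattered; `large_chunks[0]` has the full story, but the "1905" signal is diluted across four sentences.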
The Parent-Child Solution
The idea: Index small chunks, but return large parent sections.
Parent Document: "Einstein Biography - Chapter 3"
├─ Child Chunk 1: "Einstein was born in 1879..."
├─ Child Chunk 2: "He published relativity in 1905..." ← Match!
└─ Child Chunk 3: "Won Nobel Prize in 1921..."
Query: "Einstein 1905 theory"
Search Result: Child Chunk 2 (precise match)
LLM Receives: Entire Chapter 3 (full context)

How It Works
Step 1: Create Parent-Child Structure
When chunking your documents:
```python
{
    "doc_id": "einstein_bio",              # Parent
    "chunk_id": "einstein_bio:::2",        # Child (2nd chunk)
    "text": "He published relativity in 1905...",
    "meta": {
        "source": "biography.pdf",
        "parent_id": "einstein_bio"        # Link to parent
    }
}
```

Step 2: Index the Children

```python
# Build index with SMALL chunks (1-2 sentences)
chunks = build_store(docs, strategy="sentence", sentences_per_chunk=2)
dense_idx.build(chunks)
```

Step 3: Retrieve Children, Return Parents
```python
from collections import defaultdict

def attach_parent_sections(retrieved_chunks, all_chunks, max_chars=1200):
    """For each retrieved chunk, find and attach parent context."""
    # Group all chunks by parent document
    by_parent = defaultdict(list)
    for chunk in all_chunks:
        parent_id = chunk.get("meta", {}).get("parent_id")
        if parent_id:
            by_parent[parent_id].append(chunk)

    enriched = []
    for chunk in retrieved_chunks:
        # Get parent document ID
        parent_id = chunk.get("meta", {}).get("parent_id")
        if not parent_id:
            # No parent, just return the original chunk
            enriched.append(chunk)
            continue
        # Get all sibling chunks from the same parent
        siblings = by_parent.get(parent_id, [])
        # Combine sibling chunks into parent context
        parent_text = " ".join(s["text"] for s in siblings)
        # Truncate if too long
        if len(parent_text) > max_chars:
            parent_text = parent_text[:max_chars] + "..."
        # Attach parent context
        chunk["parent_context"] = parent_text
        enriched.append(chunk)
    return enriched
```

Visual Example
Original Document Structure:
Document: "Machine Learning Tutorial"
Chapter 1: Introduction
├─ Chunk 1-1: "ML is a subset of AI..."
├─ Chunk 1-2: "It learns from data..."
└─ Chunk 1-3: "Common applications include..."
Chapter 2: Supervised Learning
├─ Chunk 2-1: "Supervised learning uses labeled data..."
├─ Chunk 2-2: "Examples include classification..."
└─ Chunk 2-3: "Linear regression is the simplest..."

Query: "What is supervised learning?"
Without Parent-Child:
Retrieved: Chunk 2-1 "Supervised learning uses labeled data..."
LLM Sees: Just that one sentence
Answer: Limited, might miss important details

With Parent-Child:
Retrieved: Chunk 2-1 (matched query)
LLM Sees: All of Chapter 2 (Chunks 2-1, 2-2, 2-3 combined)
Answer: Complete explanation with examples and context ✅

Sentence Window: A Simpler Alternative
If you don't have a clear parent-child structure, use sentence window expansion:
```python
from nltk.tokenize import sent_tokenize  # requires nltk's "punkt" data

def sentence_window_expand(retrieved_chunks, full_document, window_sentences=2):
    """Expand each chunk by N sentences before/after within the full document."""
    # Split the full document into sentences once
    sentences = sent_tokenize(full_document)
    expanded = []
    for chunk in retrieved_chunks:
        chunk_sentences = sent_tokenize(chunk["text"])
        # Find where the chunk starts in the document
        try:
            start_idx = sentences.index(chunk_sentences[0])
        except ValueError:
            # Chunk text not found verbatim; keep it unexpanded
            expanded.append(chunk)
            continue
        # Take a window of sentences around the chunk
        window_start = max(0, start_idx - window_sentences)
        window_end = min(len(sentences),
                         start_idx + len(chunk_sentences) + window_sentences)
        chunk["window_context"] = " ".join(sentences[window_start:window_end])
        expanded.append(chunk)
    return expanded
```

Example:
Full Text: "Sentence 1. Sentence 2. Sentence 3. Sentence 4. Sentence 5."
Retrieved Chunk: "Sentence 3"
Window (±2): "Sentence 1. Sentence 2. Sentence 3. Sentence 4. Sentence 5."

Multi-Vector Documents: Advanced Approach
The problem: Documents have different parts — title, summary, body. Queries might match different parts.
Example:
Document: "Python Programming Guide"
├─ Title: "Python Programming Guide"
├─ Summary: "Learn Python basics, syntax, and best practices"
└─ Body: [5000 words of detailed content]
Query: "Python guide"
- Matches title strongly (short, distinctive)
- Would match body weakly (buried in long text)

Solution: Create multiple embeddings per document:
```python
def multi_vector_views(document):
    """Create multiple searchable views of one document."""
    doc_id = document["doc_id"]
    views = []
    # View 1: Title (for broad matching)
    views.append({
        "chunk_id": f"{doc_id}:::title",
        "text": document["title"],
        "view_type": "title",
        "parent_id": doc_id,
        "returns": "body"  # If matched, return the body
    })
    # View 2: Summary (for detailed matching)
    views.append({
        "chunk_id": f"{doc_id}:::summary",
        "text": document["summary"],
        "view_type": "summary",
        "parent_id": doc_id,
        "returns": "body"
    })
    # View 3: Body (for specific facts)
    body_chunks = chunk_by_sentence(document["body"], sentences_per_chunk=3)
    for i, chunk_text in enumerate(body_chunks):
        views.append({
            "chunk_id": f"{doc_id}:::body_{i}",
            "text": chunk_text,
            "view_type": "body",
            "parent_id": doc_id,
            "returns": "self"
        })
    return views
```

Why this works:
Query: "Python programming tutorial"
Matches:
1. Title view: "Python Programming Guide" (score: 0.92)
2. Summary view: "Learn Python basics…" (score: 0.85)
3. Body chunk 47: "…advanced Python features…" (score: 0.71)
All point to the same document → Boost that document's rank
Return: Full body content (what the user actually needs)
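That boosting step can be sketched as follows — `boost_by_parent` and the 0.05 bonus per extra matching view are illustrative choices for this article, not a standard formula:

```python
from collections import defaultdict

def boost_by_parent(view_hits):
    """Aggregate view-level scores per parent document.

    view_hits: list of {"parent_id": ..., "score": ...} from the view index.
    Ranks parents by their best view score plus a small bonus for each
    additional view that also matched.
    """
    by_parent = defaultdict(list)
    for hit in view_hits:
        by_parent[hit["parent_id"]].append(hit["score"])
    ranked = []
    for parent_id, scores in by_parent.items():
        combined = max(scores) + 0.05 * (len(scores) - 1)
        ranked.append((parent_id, combined))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked

hits = [
    {"parent_id": "py_guide", "score": 0.92},  # title view
    {"parent_id": "py_guide", "score": 0.85},  # summary view
    {"parent_id": "py_guide", "score": 0.71},  # body chunk
    {"parent_id": "other_doc", "score": 0.90},
]
ranking = boost_by_parent(hits)
# "py_guide" outranks "other_doc" because three views agree
```

Because three views of `py_guide` matched, its combined score beats the single strong match on `other_doc`.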
| Approach | Best For | Complexity |
| --- | --- | --- |
| Parent-Child | Structured docs (books, manuals with chapters) | Medium |
| Sentence Window | Unstructured text (articles, blogs) | Low |
| Multi-Vector | Documents with distinct sections (papers, reports) | High |

Implementation Tips
1. How big should parent sections be?

```python
max_parent_chars = 200     # ❌ Too small (defeats the purpose)
max_parent_chars = 10000   # ❌ Too large (exceeds LLM context)
max_parent_chars = 1200    # ✅ Just right (roughly 1000-1500 characters)
```

2. Should you always return the parent?
No! Sometimes the child chunk is enough:
```python
def should_expand_context(chunk, query):
    # If the chunk is already long, no need
    if len(chunk["text"]) > 500:
        return False
    # If the query matches a specific fact, the chunk is sufficient
    if is_factoid_query(query):  # "When was X born?"
        return False
    # Otherwise, expand
    return True
```

3. Deduplication
If multiple child chunks from the same parent are retrieved, don't repeat the parent context:
```python
seen_parents = set()
for chunk in retrieved:
    parent_id = chunk.get("meta", {}).get("parent_id")
    if parent_id in seen_parents:
        chunk["parent_context"] = "[See previous result]"
    else:
        chunk["parent_context"] = get_parent_text(parent_id)
        seen_parents.add(parent_id)
```

Real-World Example
```python
# Your RAG pipeline with parent-child
def rag_with_context_expansion(query):
    # Step 1: Retrieve small chunks (precise)
    candidates = hybrid_search(query, dense_idx, sparse_idx, k=100)
    # Step 2: Rerank
    top_chunks = cross_encoder_rerank(query, candidates[:50], top_k=10)
    # Step 3: Expand context (return large parent sections)
    expanded = attach_parent_sections(
        top_chunks,
        all_chunks=store["chunks"],
        max_chars=1200
    )
    # Step 4: Generate with expanded context
    answer = openai_generate(
        query,
        [{"text": c["parent_context"]} for c in expanded]
    )
    return answer
```

Before (without expansion):
Query: "How does photosynthesis work?"
Retrieved: "Plants convert light to energy."
LLM Answer: "Plants convert light energy to chemical energy."
Quality: ⭐⭐⭐ (vague, missing details)

After (with expansion):
Query: "How does photosynthesis work?"
Retrieved: "Plants convert light to energy." (small chunk)
Expanded: [Full paragraph about chlorophyll, light reactions, Calvin cycle...]
LLM Answer: "Photosynthesis occurs in two stages. First, in the light-dependent reactions, chlorophyll absorbs photons..."
Quality: ⭐⭐⭐⭐⭐ (detailed, accurate, complete)

What's Next
You now know how to:
- Index small chunks for precise retrieval
- Return a large context for complete answers
- Handle structured and unstructured documents
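The core mechanic from this article — index small children, match one, hand back the combined parent — fits in a self-contained sketch (the keyword matcher is a toy stand-in for the real hybrid retriever):

```python
# Toy corpus: two child chunks sharing one parent
chunks = [
    {"chunk_id": "bio:::1", "text": "Einstein was born in 1879.",
     "meta": {"parent_id": "bio"}},
    {"chunk_id": "bio:::2", "text": "He published relativity in 1905.",
     "meta": {"parent_id": "bio"}},
]

def keyword_search(query, chunks):
    """Stand-in retriever: return chunks sharing a word with the query."""
    terms = set(query.lower().split())
    return [c for c in chunks if terms & set(c["text"].lower().split())]

def attach_parent(hit, chunks):
    """Attach all sibling text from the same parent as context."""
    pid = hit["meta"]["parent_id"]
    siblings = [c["text"] for c in chunks if c["meta"]["parent_id"] == pid]
    hit["parent_context"] = " ".join(siblings)
    return hit

hits = keyword_search("relativity 1905", chunks)
result = attach_parent(hits[0], chunks)
# result["parent_context"] now spans both child chunks
```

Only the second chunk matches the query, but the context handed to the LLM covers the whole parent.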
But what if someone asks about "recent news" or "last quarter's reports"? Time matters!
In the next article, we'll cover Time-Based Filtering and Freshness Boosting — how to prioritize recent documents and filter by date ranges. Essential for news, updates, and time-sensitive queries!