When Legacy Code Meets Modern Infrastructure: A Wake-Up Call for CISOs

On December 26, 2025, a critical vulnerability was discovered in makeindex 2.17, a 40-year-old indexing tool embedded in TeX Live 2025 — the LaTeX typesetting system used by millions of academic institutions, publishing houses, and technical documentation platforms worldwide. The vulnerability (CVSS 5.5 MEDIUM, CWE-674) causes complete denial of service when processing specially crafted index files, resulting from uncontrolled recursion in a custom sorting algorithm that predates modern security practices. While extensive testing ruled out code execution, the 100% reproducible crash poses real business risk: automated build pipelines fail silently, CI/CD systems stall, thesis compilation servers crash, and documentation generation grinds to a halt. The attack vector is deceptively simple — a malicious .idx file delivered via email, committed to a shared repository, or embedded in a LaTeX template.

What makes this discovery significant isn't just the technical flaw — it's what it reveals about invisible dependencies in critical infrastructure. How many of your organization's documentation systems, academic publishing workflows, or automated build processes rely on TeX Live without your security team's awareness? This vulnerability was discovered accidentally during legitimate use (building a course index from Excel notes), not through targeted fuzzing, which raises an uncomfortable question: how many similar flaws exist in the legacy tools we trust implicitly? The attack scenarios are straightforward: compromise an upstream documentation repository to inject malicious index generation, upload a weaponized LaTeX package to CTAN (the TeX package repository), or social-engineer targets with malicious templates. No privilege escalation required, no complex exploit chains — just a crafted file that triggers algorithmic failure in code written before stack canaries and ASLR were standard protections.

For CISOs, the strategic takeaway extends beyond patching a single tool. This incident underscores the need for software bill of materials (SBOM) visibility into academic and research toolchains, security audits of legacy dependencies in automated systems, and input validation at pipeline boundaries even for "trusted" file formats. Immediate mitigations include wrapping makeindex execution with timeouts and auditing which systems in your infrastructure rely on TeX Live. Long-term remediation requires vendor patches and replacing vulnerable sorting algorithms with modern implementations. The lesson: when critical infrastructure relies on decades-old code maintained by academic volunteers, security becomes everyone's responsibility — not just when vulnerabilities are announced, but before they're discovered.

Discovery Story — The Bug That Found Me

When Your Course Study Notes Crash a 40-Year-Old Program

December 26, 2025. While most security researchers were winding down from the holidays, I was deep in GXPN (GIAC Exploit Researcher and Advanced Penetration Tester) course materials — a SANS course I've since completed — preparing study notes for the certification exam: 371 rows of Excel notes covering advanced exploitation techniques, reverse engineering, and vulnerability research. The irony of what happened next isn't lost on me: while studying how to find vulnerabilities, I accidentally discovered one in the tools I was using to organize my notes!

My workflow was simple: convert Excel course notes into a beautiful, indexed PDF using LaTeX. I wrote a Python script (excel_to_idx_improved.py) that transformed my spreadsheet into 2,171 LaTeX index entries—topics like "buffer overflow protection," "ROP chain construction," and "kernel exploitation primitives." The script worked flawlessly. The conversion completed without errors. I ran makeindex main_categorized.idx expecting a sorted index file, ready for PDF compilation.
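
The original excel_to_idx_improved.py isn't reproduced here, but the idea is simple enough to sketch. The snippet below is a hypothetical minimal version: it assumes the spreadsheet has been exported to a notes.csv with "topic" and "page" columns, and it emits the \indexentry{term}{page} lines that makeindex consumes.

import csv

def rows_to_idx(csv_path, idx_path):
    """Convert exported course notes (topic, page) into makeindex
    \\indexentry lines; returns the number of entries written."""
    written = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(idx_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            topic, page = row["topic"].strip(), row["page"].strip()
            if topic and page:
                dst.write(f"\\indexentry{{{topic}}}{{{page}}}\n")
                written += 1
    return written

print(rows_to_idx("notes.csv", "main_categorized.idx"), "entries written")

Running makeindex on the file it produces should yield nothing more dramatic than a sorted .ind index.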

Instead, I got this:

$ makeindex main_categorized.idx
This is makeindex, version 2.17 [TeX Live 2025]
Scanning input file.....done (1958 entries accepted, 104 rejected).
Sorting entries....
Segmentation fault: 11

My first thought: "I broke LaTeX." My second thought: "Wait, that's impossible." Exit code 139 (128 + signal 11) meant SIGSEGV — a segmentation fault during the sorting phase. Not during parsing. Not during output. During sorting. A sorting algorithm crashing on 1,958 entries? That felt wrong. Modern computers sort millions of records in milliseconds. Something deeper was happening.

The Investigation Begins: From Frustration to Fascination

I did what any developer does: I Googled the error. Nothing. I checked LaTeX forums. Silence. I tried smaller files — they worked. I tried the same data in different orders — no crash. I removed duplicate entries — suddenly, no crash. That's when frustration turned into curiosity. This wasn't a corrupted file or a syntax error. This was reproducible, data-dependent behavior. This was a vulnerability.

I created a version of my index file with duplicates removed: 1,955 unique entries. It processed perfectly. The difference? Exactly 35 duplicate entries. Adding those 35 duplicates back to the working file reproduced the crash 100% of the time. I had stumbled into something rare in security research: an organically discovered vulnerability with a perfect proof-of-concept file generated during completely legitimate use.

But I needed answers. Why 1,990 entries? Why these specific duplicates? Why did the original ordering crash while reversed or shuffled versions worked fine? I pulled out my security research toolkit and started systematic analysis.

Phase 1: Binary Search Fuzzing — Finding the Crash Boundary

I wrote a binary search fuzzer to isolate the exact crash threshold (with assistance from Claude AI for rapid prototyping):

def binary_search_crash_point(entries):
    """Return the smallest prefix length of `entries` whose .idx file
    still crashes makeindex (crashes_with_entries() is sketched below)."""
    low, high = 0, len(entries)
    while low < high:
        mid = (low + high) // 2
        if crashes_with_entries(entries[:mid]):
            high = mid       # the crash reproduces with fewer entries
        else:
            low = mid + 1    # need more entries to trigger the crash
    return low
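
The crashes_with_entries() helper isn't shown above. A plausible minimal implementation is sketched below (file handling and cleanup details are illustrative): it writes the candidate entries to a temporary .idx file, runs makeindex on it, and treats death by SIGSEGV (reported by Python's subprocess as return code -11) as a crash.

import os
import subprocess
import tempfile

def crashes_with_entries(entries, timeout=30):
    """Write the given \\indexentry lines to a temp .idx file, run
    makeindex on it, and report whether the process died with SIGSEGV."""
    fd, path = tempfile.mkstemp(suffix=".idx")
    try:
        with os.fdopen(fd, "w") as f:
            f.write("\n".join(entries) + "\n")
        proc = subprocess.run(["makeindex", path],
                              capture_output=True, timeout=timeout)
        # On POSIX, a negative return code means death by signal;
        # SIGSEGV is signal 11, so -11 means the sort crashed.
        return proc.returncode == -11
    finally:
        # makeindex writes its .ind/.ilg output next to the input by default
        for leftover in (path, path[:-4] + ".ind", path[:-4] + ".ilg"):
            if os.path.exists(leftover):
                os.remove(leftover)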

Result: The crash occurred at exactly 1,990 entries. Not 1,989. Not 1,991. Precisely 1,990. But here's where it got weird: when I created synthetic files with 2,000, 5,000, even 10,000 random entries, they all processed successfully. This wasn't a simple threshold bug — it was content-specific. The vulnerability required both the right number of entries AND the right data pattern.

Phase 2: Delta Debugging — The 35 Duplicates

I compared my crashing file (1,990 entries) with the working file (1,955 unique entries). The difference: 35 duplicate entries spread throughout the file. Entries like "Scapy: Citrix Provisioning services TFTP" appeared 10 times, "PowerShell Modules" appeared 7 times, "PowerShell Runas" appeared 6 times, and various other GXPN course topics were repeated with different page references.

I wrote a delta debugger to isolate the minimal crash case. When I added those exact 35 duplicates back to the working file, maintaining their original positions and ordering, the crash returned. Remove any one of them? No crash. Change their order? No crash. This vulnerability required surgical precision — specific duplicates in specific positions.
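
The delta debugger itself was nothing exotic; the sketch below shows the reduction idea, reusing the hypothetical crashes_with_entries() helper from earlier. It is a greedy 1-minimal reducer: try deleting each entry in turn, and keep the deletion only if the remaining file still crashes.

def minimize_crashing_input(entries):
    """Greedy 1-minimal reduction of a crashing entry list."""
    assert crashes_with_entries(entries), "starting input must crash"
    reduced = list(entries)
    changed = True
    while changed:                   # repeat until no deletion survives
        changed = False
        i = 0
        while i < len(reduced):
            candidate = reduced[:i] + reduced[i + 1:]
            if crashes_with_entries(candidate):
                reduced = candidate  # this entry was not needed for the crash
                changed = True
            else:
                i += 1               # this entry is essential, keep it
    return reduced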

Phase 3: Order Dependency — The Smoking Gun

This test sealed it:

import random

original = load_entries('crashing.idx')
reversed_entries = list(reversed(original))
shuffled = original.copy()
random.shuffle(shuffled)
print(f"Original: {test_crash(original)}")          # CRASH
print(f"Reversed: {test_crash(reversed_entries)}")  # OK
print(f"Shuffled: {test_crash(shuffled)}")          # OK

Only the original ordering triggered the crash. Reverse the file? No crash. Shuffle it randomly? No crash. This ruled out buffer overflows, heap corruption, and memory safety bugs. Those vulnerabilities don't care about input order — they care about input size or content. Order dependency screams one thing: algorithmic vulnerability.

Phase 4: Pattern Analysis — The Clustering Discovery

I analyzed the 35 duplicate entries and found a disturbing pattern:

  • 14 unique entries were duplicated (some appeared 2–10 times)
  • 62.9% of duplicates appeared within 10 lines of their original
  • Average gap between duplicate and original: 7.6 lines
  • Duplicates formed clusters in alphabetically sorted regions

This wasn't random duplication from sloppy data entry. My Excel-to-LaTeX conversion script had legitimately created these duplicates because the same GXPN topics appeared on multiple pages.

But that clustering — duplicates appearing close together in the input file — was the trigger.

The sorting algorithm was choking on clustered duplicates in a specific order.
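
For completeness, here is a sketch of how clustering statistics like those above can be derived (the helper name and exact definitions are illustrative): for every repeated entry it measures the line gap back to the first occurrence, "the original."

from collections import defaultdict

def duplicate_clustering(entries, window=10):
    """Summarize how tightly repeated entries cluster around their
    first occurrence in the ordered entry list."""
    first_seen = {}
    gaps = []
    dup_counts = defaultdict(int)
    for lineno, entry in enumerate(entries):
        if entry in first_seen:
            gaps.append(lineno - first_seen[entry])
            dup_counts[entry] += 1
        else:
            first_seen[entry] = lineno
    within = sum(1 for g in gaps if g <= window)
    return {
        "duplicate_lines": len(gaps),
        "unique_entries_duplicated": len(dup_counts),
        "avg_gap_lines": sum(gaps) / len(gaps) if gaps else 0.0,
        "pct_within_window": 100.0 * within / len(gaps) if gaps else 0.0,
    }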

The Realization: I Found a Real Vulnerability

At this point, I wasn't just debugging my LaTeX workflow anymore. I was conducting professional vulnerability research. I had:

  • 100% reproducible crash with proof-of-concept file
  • Isolated trigger conditions (1,990 entries + 35 duplicates + specific order)
  • Ruled out common bug classes (buffer overflow, heap corruption)
  • Identified root cause category (algorithmic vulnerability in sorting)

What started as "Why won't my course notes compile?" had become "I need to contact the TeX Live Security Team." But first, I needed to understand the root cause at the source code level and assess exploitability. Was this just a denial-of-service bug, or could an attacker leverage it for code execution?

The next phase would require diving into 149 MB of TeX Live source code, analyzing 40-year-old C implementations of quicksort, and testing every exploitation vector I'd been studying in my GXPN materials. The student had become the researcher, and the course materials that triggered the bug would now guide me in analyzing it.

Special thanks to SEC660 Authors James Shewmaker and Barnett Darnell for doing an awesome job in explaining those concepts.

Technical Analysis — Anatomy of a 40-Year-Old Algorithm Failure

The Journey into TeX Live Source Code

After confirming the crash was algorithmic, I downloaded the complete TeX Live source code — 149 MB of C code, makefiles, and documentation spanning decades of academic software development. I needed to find the sorting implementation and understand why it was failing.

The crash message pointed me to the right place: Sorting entries.... followed by SIGSEGV. I traced the execution flow:

1. Entry Point: sort_idx() in sortid.c

void sort_idx(void)
{
    MESSAGE("Sorting entries...");
    idx_dc = 0;
    idx_gc = 0L;
    // Custom quicksort - this crashes
    qqsort(idx_key, (size_t)idx_gt, sizeof(FIELD_PTR), compare);
    MESSAGE1("done (%ld comparisons).\n", idx_gc);
}

Not qsort() from the standard library. qqsort()—a custom implementation. That's when alarm bells went off. Custom implementations of foundational algorithms often contain subtle bugs that standard library versions solved decades ago. I dove into qsort.c.

The Vulnerable Quicksort: A 1970s Algorithm Without Modern Safeguards

The core vulnerability lives in qst() (quicksort recursive function) in qsort.c. Here's the critical section:

static void qst(char *base, char *max)
{
    register char *i, *j, *jj, *mid;
    register int ii;
    register char c;
    void *tmp;
    int lo, hi;
    lo = max - base;  // Size of the range in bytes (divided by qsz below)
    do {
        // 1. PIVOT SELECTION (median-of-three)
        mid = i = base + qsz * ((unsigned)(lo / qsz) >> 1);
        if (lo >= mthresh) {
            j = ((*qcmp)(base, i) > 0 ? base : i);
            if ((*qcmp)(j, (tmp = max - qsz)) > 0) {
                j = (j == base ? i : base);
                if ((*qcmp)(j, tmp) < 0)
                    j = tmp;
            }
            // Swap median into middle position
        }
        // 2. PARTITIONING (2-way) - THE PROBLEM
        for (i = base, j = max - qsz;;) {
            while (i < mid && (*qcmp)(i, mid) <= 0)
                i += qsz;
            while (j > mid) {
                if ((*qcmp)(mid, j) <= 0) {
                    j -= qsz;
                    continue;
                }
                // Swap logic...
            }
        }
        // 3. RECURSION - NO DEPTH LIMIT!
        i = (j = mid) + qsz;
        if ((lo = j - base) <= (hi = max - i)) {
            if (lo >= thresh)
                qst(base, j);  // Recurse on smaller partition
            base = i;
            lo = hi;
        } else {
            if (hi >= thresh)
                qst(i, max);  // Recurse on smaller partition
            max = j;
        }
    } while (lo >= thresh);
}

Three critical design flaws converge to create this vulnerability:

Flaw #1: No Recursion Depth Limit

Notice what's missing? No recursion counter. No depth check. No safety valve. Modern quicksort implementations limit recursion depth to prevent stack exhaustion. The standard library qsort() uses iterative approaches or bounded recursion. This implementation assumes partitioning will always create balanced divisions. That assumption is fatal.

Flaw #2: 2-Way Partitioning with Duplicate Mishandling

The partitioning loop uses (*qcmp)(i, mid) <= 0 (less than or equal). When entries compare as equal (duplicates return 0 from the comparison function), they're treated as "less than or equal to the pivot" and placed in the left partition.

Classical 2-way quicksort creates:

[elements < pivot] | [pivot] | [elements > pivot]

But with duplicates returning compare() == 0, it becomes:

[elements <= pivot (including ALL duplicates)] | [pivot] | [elements > pivot]

All duplicates go to one side. With 35 duplicates of the pivot value, that's 35 entries guaranteed to create an unbalanced partition.
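
To make the imbalance concrete, here is a small standalone model (not makeindex's code) of a 2-way partition that, like qst(), keeps everything comparing less than or equal to the pivot on the left. With distinct keys the recursion depth stays small; with a long run of one repeated key it grows linearly, which is exactly the degradation described above.

import random
import sys

def two_way_depth(keys, depth=1):
    """Maximum recursion depth of a simplified 2-way quicksort that,
    like qst(), sends every duplicate of the pivot to the <= side."""
    if len(keys) <= 1:
        return depth
    pivot = keys[len(keys) // 2]
    left  = [k for k in keys if k < pivot]
    equal = [k for k in keys if k == pivot]
    right = [k for k in keys if k > pivot]
    # All duplicates of the pivot ride along into the left recursion;
    # only the single pivot copy is removed per level.
    return max(two_way_depth(left + equal[1:], depth + 1),
               two_way_depth(right, depth + 1))

sys.setrecursionlimit(100_000)
distinct = random.sample(range(10_000), 1_990)
print("distinct keys:    depth", two_way_depth(distinct))      # small, roughly logarithmic
print("one repeated key: depth", two_way_depth([42] * 1_990))  # ~n levels, like the crash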

Flaw #3: The Comparison Function Returns 0 for True Duplicates

In sortid.c, the compare_page() function handles entries that match in all fields:

static int compare_page(const FIELD_PTR *a, const FIELD_PTR *b)
{
    int m = 0;
    short i = 0;
    // Compare page number arrays
    while ((i < (*a)->count) && (i < (*b)->count) &&
           ((m = (*a)->npg[i] - (*b)->npg[i]) == 0))
    {
        i++;
    }
    if (m == 0) {
        if ((i == (*a)->count) && (i == (*b)->count)) {
            // Page-number arrays are IDENTICAL
            // (branches handling differing encap fields elided)
            if (STREQ((*a)->encap, (*b)->encap))
            {
                if (((*a)->type != DUPLICATE) &&
                    ((*b)->type != DUPLICATE))
                    (*b)->type = DUPLICATE;
                // CRITICAL: m is still 0 here, so truly identical
                // entries RETURN 0 = EQUAL
                // This causes unbalanced partitions!
            }
        }
    }
    return (m);
}

When index entries are truly identical (same term, same page numbers, same formatting), the comparison returns 0. This is semantically correct for marking duplicates, but algorithmically catastrophic for 2-way quicksort.

The Cascading Failure: From O(log n) to O(n) Recursion Depth

Let me walk you through what happens when makeindex processes my 1,990-entry file:

Expected behavior with balanced partitions:

Input: 1,990 entries
Expected recursion depth: log₂(1990) ≈ 11 levels
Stack usage: 11 × ~100 bytes ≈ 1,100 bytes

Actual behavior with pathological duplicate pattern:

Input: 1,990 entries (35 duplicates clustered, specific order)
Actual recursion depth: ~1,990 levels
Stack usage: 1,990 × ~100 bytes ≈ 199,000 bytes

Here's the step-by-step cascade:

  1. Median-of-three selects pivot from beginning, middle, end
  2. Pivot happens to be a duplicated entry (62.9% are clustered within 10 lines)
  3. Comparison function returns 0 for all duplicates of this pivot
  4. Partitioning loop treats all duplicates as <= pivot
  5. Left partition: 1,989 entries (all duplicates went left)
  6. Right partition: 0 entries
  7. Recursion depth increases by 1, repeat with 1,989 entries
  8. This pattern continues ~1,990 times until stack exhaustion
  9. Stack guard page is accessed → SIGSEGV

The order dependency now makes perfect sense. My specific data pattern (alphabetically sorted Excel notes with clustered duplicates) consistently triggers poor pivot selection. When I reverse or shuffle the file, the median-of-three chooses different pivots that don't have as many duplicates nearby, creating better balance.

Stack Overflow Mechanics: Why ~2,000 Levels of Recursion Crash an 8 MB Stack

Each recursive call to qst() consumes stack space:

static void qst(char *base, char *max)
{
    register char *i;      // ~8 bytes
    register char *j;      // ~8 bytes
    register char *jj;     // ~8 bytes
    register char *mid;    // ~8 bytes
    register int ii;       // ~4 bytes
    register char c;       // ~1 byte
    void *tmp;             // ~8 bytes
    int lo;                // ~4 bytes
    int hi;                // ~4 bytes
    // Total local variables: ~53 bytes
    // Function call overhead: ~40-60 bytes
    // Estimated total: ~100 bytes per call
}

  • macOS default stack size: 8 MB = 8,388,608 bytes
  • Theoretical maximum depth at ~100 bytes per call: ~83,886 calls
  • Actual failure depth: ~1,990 calls

Wait — why does it crash at roughly 1,990 levels when the math says an 8 MB stack should absorb 83,000 calls? Because the ~100-byte figure only counts qst()'s declared locals, and every unbalanced partition adds a recursion level whose entire frame stays live while the calls beneath it run. The actual per-call stack consumption involves:

  • Local variables for each level
  • Return addresses for each call
  • Register saves across function boundaries
  • Compiler-inserted stack protection (stack canaries)

In practice, each frame costs considerably more than the simple local-variable arithmetic suggests.
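
You can see how tight that budget is on your own machine with Python's resource module (Linux and macOS); the naive division below is the same back-of-the-envelope arithmetic as above, and it overestimates the real limit for the same reasons.

import resource

soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print(f"stack limit: soft={soft} bytes, hard={hard}")

# Back-of-the-envelope depth estimate at ~100 bytes per frame; the real
# ceiling is lower once register saves, alignment padding, canaries and
# the comparison functions' own frames are accounted for.
if soft != resource.RLIM_INFINITY:
    print("naive maximum recursion depth:", soft // 100)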

The Proof: Why This Is Algorithmic, Not Exploitable

I tested every exploitation vector from my GXPN studies:

Test 1: Buffer Overflow Attempt

Created index files with entries up to 16,000 characters:

for length in [1000, 2000, 4000, 8000, 16000]:
    create_long_entry_idx(f"test_{length}.idx", length, 100)
    result = test_crash(f"test_{length}.idx")

Result: All tests passed (no crash). Conclusion: Not a simple buffer overflow.

Test 2: Massive Duplicates Without Order Dependency

Created files with up to 2,000 duplicate entries in random order:

for dup_count in [100, 500, 1000, 2000]:
    create_duplicate_pattern(f"test_{dup_count}.idx",
                            "TestEntry", dup_count, spacing=10)
    result = test_crash(f"test_{dup_count}.idx")

Result: All tests passed (no crash). Conclusion: Not triggered by duplicate count alone — requires specific ordering.

Test 3: Pattern Injection for Memory Corruption

Attempted to inject patterns (cyclic patterns, format strings, shellcode-like sequences):

patterns = ["AAAAAAAA", "BBBBBBBB", "0x41414141", "%n%n%n"]
for pattern in patterns:
    create_pattern_idx("test_pattern.idx", pattern, 2000)
    result = test_crash("test_pattern.idx")

Result: All tests passed (no crash). Conclusion: Cannot inject controllable data into the crash.

Binary Security Analysis

$ otool -hv /Library/TeX/texbin/makeindex
MH_MAGIC_64  ARM64  EXECUTE  PIE

Security features active:

  • PIE (Position Independent Executable) → ASLR enabled
  • Stack canaries (macOS default)
  • DEP/NX (Data Execution Prevention)

Crash Characteristics

  • Signal: SIGSEGV (Segmentation Fault)
  • Exit code: 139 (128 + 11)
  • Cause: Stack guard page access (pure exhaustion)
  • Address: Fixed, not controllable
  • Type: Pure stack exhaustion, no memory corruption

Exploitability Verdict: DoS Confirmed, Code Execution Ruled Out

After exhaustive testing:

✅ Denial of Service: CONFIRMED

  • 100% reproducible crash with PoC file
  • Affects automated build systems
  • Easy delivery vector (malicious .idx file)
  • Reliable impact on availability

❌ Code Execution: RULED OUT

  • Pure stack exhaustion, not buffer overflow
  • No memory corruption beyond crash point
  • Cannot inject controllable patterns
  • Crash address is fixed (stack guard page)
  • Modern security mitigations prevent exploitation
  • Only real-world data triggers crash (not synthetic patterns)

CWE Classification: CWE-674 (Uncontrolled Recursion)
CVSS Score: 5.5 (MEDIUM) — CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H

This is a design flaw from 1970s algorithmic assumptions, not a modern exploitation target. The vulnerability assumes trust in input data and relies on statistical probability that random data will create balanced partitions. My GXPN course notes — with their alphabetically organized, legitimately duplicated entries — happened to be the exact pathological case that breaks those assumptions.

CVE Disclosure Details — Navigating Responsible Disclosure

The Moment of Decision: Publish or Disclose?

I had everything a bug hunter dreams of: a perfect proof-of-concept, complete root cause analysis, 100% reproducibility, and a vulnerability in software used by millions worldwide. The temptation to tweet "BREAKING: 0-day in TeX Live" or rush a Medium post was real. But this isn't how professional security research works — and it's not how you protect the ecosystem.

The irony wasn't lost on me: Book 3 of the GXPN course, entitled "Product Security Testing, Fuzzing and Code Coverage," had just taught me about responsibly disclosing vulnerabilities. On the same day I discovered this bug, I had not only learned how to fuzz but also how to disclose my findings in a responsible way.

Responsible disclosure means prioritizing user safety over personal recognition. It means giving vendors time to patch before attackers weaponize your findings. It means following established disclosure frameworks even when you're excited to share your work. I'd been studying exploitation in GXPN; now I was learning the equally important skill of responsible vulnerability coordination.

The Disclosure Timeline: What Actually Happened

December 26, 2025 (Night of Discovery):

  • Sent initial vulnerability report to TeX Live Security Team (tlsecurity@tug.org)
  • Included: crash analysis, root cause, CVSS scoring, reproduction steps
  • Prepared complete disclosure package

December 26, 2025 (Vendor Response — Same Day):

  • TeX Live Security Team responded promptly
  • Requested: Reproducible code and supporting files as tarball
  • Vendor assessment: "We don't believe this is exploitable, so we're planning on treating this as a regular bug rather than a security vulnerability. We'll aim for fixing it in the TL26 release. Feel free to publicly disclose this bug whenever you want, and do anything you want with CVEs. We don't interact with the CVE system."

December 26, 2025 (Disclosure Package Delivered):

  • Sent complete tarball to vendor including:
  • makeindex_vulnerability_VOneGiri.pdf (technical report)
  • Technical_Root_Cause_Analysis.pdf (source code analysis)
  • main_categorized.idx (proof-of-concept file)
  • Fuzzing tools (Python scripts for reproduction)
  • Suggested patches and mitigations

December 26, 2025 (CVE Submission):

  • Submitted CVE request to MITRE via https://cveform.mitre.org
  • Provided complete vulnerability details:
  • Software: makeindex 2.17 [TeX Live 2025]
  • Type: Stack Overflow via Uncontrolled Recursion
  • CWE: CWE-674
  • CVSS: 5.5 MEDIUM — CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H
  • Impact: Denial of Service (DoS only, code execution ruled out)
  • Discoverer: V-One Giri
  • Discovery Date: December 26, 2025
  • Reference: makeindex-2.17-segfault-duplicate-entries

Current Status (as of February 11, 2026, 46 days post-disclosure):

  • ✅ Vendor notified and acknowledged (same day)
  • ✅ Vendor authorized immediate public disclosure
  • ✅ Complete disclosure package delivered
  • ⚠️ CVE submitted to MITRE on December 26, 2025 — no response received as of February 11, 2026 (46 days elapsed)
  • ⏳ Vendor patch targeted for the TeX Live 2026 release — no patch timeline or follow-up on the submitted findings received yet

Why Vendor Authorization Matters

The TeX Live Security Team's response — authorizing immediate public disclosure — reflects their professional assessment that this is a denial-of-service bug affecting availability, not a critical security compromise. Their willingness to engage constructively and provide clear guidance demonstrates the collaborative spirit that makes responsible disclosure work.

Given the vendor's explicit authorization, this research transitioned directly from initial notification to public disclosure rather than following a traditional 90-day embargo period. Publication proceeds with vendor authorization, 46 days after disclosure and independent of CVE assignment, which remains pending. This approach balances:

  • User awareness: Security teams can assess risk and implement mitigations immediately
  • Vendor autonomy: TeX Live team controls their own patch timeline for TL26
  • Research transparency: The security community can learn from this discovery

Recommended Fixes Provided to Vendor

Short-term mitigation (can be deployed immediately):

static int recursion_depth = 0;
#define MAX_RECURSION_DEPTH 1000
static void qst(char *base, char *max)
{
    if (++recursion_depth > MAX_RECURSION_DEPTH) {
        fprintf(stderr, "Error: Recursion depth exceeded.\n");
        fprintf(stderr, "Input may contain pathological patterns.\n");
        exit(1);
    }
    // ... existing qst() code ...
    --recursion_depth;
}

Long-term solution (requires more testing):

Replace 2-way partitioning with 3-way partitioning (Dutch National Flag algorithm):

// Partitions into: [< pivot] [== pivot] [> pivot]
// Equal elements don't recurse, preventing duplicate-induced imbalance
// (swap() stands for an exchange of qsz bytes between two elements)
static void qst_3way(char *base, char *max)
{
    char *lt = base;      // Everything before lt is < pivot
    char *gt = max - qsz; // Everything after gt is > pivot
    char *i = base;       // Current element
    char *pivot = base;   // Pivot element (pointer tracked across swaps)
    while (i <= gt) {
        int cmp = (*qcmp)(i, pivot);
        if (cmp < 0) {
            if (pivot == lt)  // Keep the pivot pointer valid when the
                pivot = i;    // element at lt is about to be swapped away
            swap(lt, i);
            lt += qsz;
            i += qsz;
        } else if (cmp > 0) {
            swap(i, gt);
            gt -= qsz;
        } else {
            i += qsz;  // Equal to pivot - already in final position
        }
    }
    // Recurse ONLY on the < and > partitions;
    // equal elements are already in their correct positions
    if (lt - base >= thresh)
        qst_3way(base, lt);
    if (max - gt - qsz >= thresh)
        qst_3way(gt + qsz, max);
}

Alternative (lowest risk): Replace custom qqsort() with standard library qsort():

// In sort_idx() in sortid.c, replace:
qqsort(idx_key, (size_t)idx_gt, sizeof(FIELD_PTR), compare);
// With:
qsort(idx_key, (size_t)idx_gt, sizeof(FIELD_PTR), compare);

Standard library implementations already handle duplicates efficiently and include recursion protection.

CVE Credit and Attribution

This vulnerability was submitted to MITRE's CVE program on December 26, 2025. As of February 11, 2026 (46 days later), MITRE has not responded and no CVE number has been assigned. MITRE's processing times can vary significantly; submission does not guarantee timely assignment. The intended CVE entry, when (and if) assigned, would read:

CVE-2025-XXXXX

Description: makeindex version 2.17 in TeX Live 2025 contains a stack overflow vulnerability due to uncontrolled recursion in the custom quicksort implementation (qsort.c). When processing index files with specific patterns of duplicate entries, the 2-way partitioning algorithm creates severely unbalanced partitions, leading to recursion depth proportional to input size rather than logarithmic depth. This exhausts the stack and triggers a segmentation fault (SIGSEGV), resulting in denial of service. The vulnerability requires local file processing with user interaction but no special privileges. CVSS v3.1 Base Score: 5.5 (MEDIUM) — CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H

Discovered by: V-One Giri
Discovery Date: December 26, 2025
Reference: makeindex-2.17-segfault-duplicate-entries

CWE-674: Uncontrolled Recursion

If assigned, this CVE will be searchable in the National Vulnerability Database (NVD), referenced in security advisories, cited in academic security research, and listed on professional security researcher profiles as part of TeX Live's security history. Note: Publication of this article is proceeding regardless of CVE assignment, per vendor authorization and the 46-day elapsed disclosure period.

Impact & Remediation — Who's Affected and What to Do

The Invisible Dependency: Who's Running makeindex Without Knowing It?

Here's a question for your organization's CISO: Do you know if your infrastructure relies on TeX Live?

Most security teams would answer "no" with confidence. After all, TeX is an academic typesetting system, not enterprise software. But that assumption misses the invisible supply chain. Let me paint a picture of who's actually affected:

Affected Ecosystems: Broader Than You Think

1. Academic Institutions (Millions of Users)

  • Thesis/dissertation compilation systems: Universities run automated pipelines that convert LaTeX submissions to PDF. A malicious .idx file in a student's thesis crashes the compilation server.
  • Journal publishing platforms: Academic publishers use TeX Live for formula rendering and scientific document processing. Overleaf, Authorea, and institutional LaTeX servers all run makeindex.
  • Research collaboration platforms: Shared LaTeX projects (Git repositories, Overleaf teams) where one malicious contributor can poison the index file.

Real-world scenario: A PhD student submits their thesis with a crafted index file. The university's automated compilation service crashes. Hundreds of pending submissions are blocked until IT investigates. Deadlines are missed. Graduation ceremonies disrupted.

2. Technical Documentation Pipelines (Enterprise Impact)

  • Open-source project documentation: Projects using Sphinx with LaTeX output (make latexpdf) invoke makeindex. A malicious pull request adding a crafted .idx file kills the CI/CD pipeline.
  • API documentation generators: Tools that produce PDF manuals from code comments often use TeX Live under the hood.
  • Corporate documentation systems: Large enterprises generating product manuals, compliance documents, or training materials in LaTeX.

Real-world scenario: An attacker submits a "documentation improvement" pull request to a popular open-source project. The PR includes a .tex file that generates a pathological .idx during build. Every CI/CD run times out or crashes. Maintainers struggle to identify why builds suddenly fail.

3. Publishing Houses (Commercial Systems)

  • Book production pipelines: Publishers processing author-submitted LaTeX manuscripts for mathematics, physics, and computer science textbooks.
  • Mathematical formula rendering: Systems that convert LaTeX math to images or PDFs for web display.
  • Preprint servers: arXiv.org and similar platforms process thousands of LaTeX submissions daily.

Real-world scenario: An author submits a mathematics textbook with deliberately crafted index entries. The publisher's automated typesetting system crashes. Production deadlines slip. The vulnerability isn't obvious — it looks like "the author's file is weird" rather than a security issue.

4. Automated Build Systems (DevOps Targets)

  • CI/CD pipelines: GitHub Actions, GitLab CI, Jenkins jobs that build documentation from LaTeX sources.
  • Container-based documentation builders: Docker images with TeX Live installed for automated PDF generation.
  • Scheduled report generation: Systems that pull data, generate LaTeX reports, and compile to PDF on cron schedules.

Real-world scenario: A company's quarterly financial report generation uses LaTeX for professional formatting. An attacker with access to the data source injects duplicate index entries. The scheduled job crashes. The CFO's board presentation PDFs aren't ready. Embarrassment ensues.

Attack Scenarios: How This Gets Weaponized

Attack Vector 1: Malicious CTAN Package

CTAN (Comprehensive TeX Archive Network) is the npm/PyPI of the LaTeX world — a repository of thousands of user-contributed packages.

Attack flow:

  1. Attacker creates a legitimate-looking LaTeX package (e.g., "advanced-indexing" or "course-notes-template")
  2. Package includes example files with crafted .idx files
  3. Package documentation says: "Compile the examples to see features"
  4. Users download, run makeindex example.idx, crash
  5. Widespread disruption before CTAN moderators notice

Likelihood: Low (CTAN has review processes)
Impact: Medium-High (wide distribution, trusted source)

Attack Vector 2: Supply Chain via Git Repositories

LaTeX projects often live in Git repositories (GitHub, GitLab, Bitbucket).

Attack flow:

  1. Attacker forks a popular LaTeX template repository
  2. Adds "improved index organization" with pathological duplicate pattern
  3. Submits pull request: "Fixed index formatting for better readability"
  4. Maintainer merges (doesn't test makeindex step, only reviews .tex changes)
  5. CI/CD breaks for all users who pull the update
  6. Issue manifests as "random build failures" that are hard to debug

Likelihood: Medium (social engineering + trusted contribution model)
Impact: Medium (disrupts CI/CD, hard to diagnose)

Attack Vector 3: Targeted Denial of Service

Social engineering attack against specific organizations or individuals.

Attack flow:

  1. Attacker researches target's LaTeX usage (academic department, publishing house)
  2. Crafts professional-looking LaTeX document (conference paper, thesis template)
  3. Emails target: "Could you review this paper?" or "Here's the template you requested"
  4. Target opens, compiles, crashes
  5. For automated systems: crash persists until manual intervention

Likelihood: Medium (feasible via email, low technical barrier)
Impact: Low-Medium (temporary inconvenience, easy to fix once identified)

Attack Vector 4: Insider Threat / Disgruntled Contributor

Academic collaborations involve multiple authors with shared repository access.

Attack flow:

  1. Disgruntled co-author or student modifies .idx file in shared LaTeX project
  2. Commit message: "Updated index for chapter 3"
  3. Next person to compile encounters crash
  4. Blame is diffuse ("LaTeX is broken" vs. "Bob sabotaged us")
  5. Disrupts publication timelines, conference submissions

Likelihood: Low (requires insider access + knowledge)
Impact: Medium (targets specific high-value documents)

Impact Assessment: Why CVSS 5.5 (MEDIUM) Is Appropriate

Let's break down the CVSS v3.1 scoring:

Attack Vector (AV): Local (L)
Requires the victim to process a malicious file locally. Can't be exploited remotely without social engineering.

Attack Complexity (AC): Low (L)
Once the attacker crafts the malicious .idx file, exploitation is trivial — just run makeindex on it. No race conditions, no timing dependencies.

Privileges Required (PR): None (N)
Any user who can run makeindex can trigger the crash. No admin rights needed.

User Interaction (UI): Required (R)
Victim must actively process the malicious file (run makeindex, compile LaTeX document). Doesn't crash automatically.

Scope (S): Unchanged (U)
Crash affects only the makeindex process, not other system components. No sandbox escape.

Confidentiality (C): None (N)
No information disclosure. Pure DoS.

Integrity (I): None (N)
No data modification or corruption beyond the crash.

Availability (A): High (H)
Complete denial of service — makeindex crashes 100% of the time with PoC input.

CVSS Vector: CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H
Base Score: 5.5 (MEDIUM)

Why not higher? No code execution, no data breach, requires local file processing and user interaction. This is a reliability/availability issue, not a critical security compromise.

Why not lower? 100% reproducible, easy to deliver, affects critical infrastructure (academic publishing, CI/CD), and has real business impact.
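
For anyone who wants to verify the arithmetic, the CVSS v3.1 base-score formula reproduces the 5.5 for this vector. The weights below are the published metric values from the FIRST specification; the Roundup() is the usual one-decimal ceiling.

import math

# Metric weights for CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H
AV, AC, PR, UI = 0.55, 0.77, 0.85, 0.62   # Local / Low / None / Required
C, I, A = 0.0, 0.0, 0.56                  # None / None / High

iss = 1 - (1 - C) * (1 - I) * (1 - A)      # Impact Sub-Score = 0.56
impact = 6.42 * iss                        # Scope Unchanged
exploitability = 8.22 * AV * AC * PR * UI  # ~1.83

def roundup(x):
    """CVSS 3.1 Roundup(): smallest one-decimal value >= x."""
    return math.ceil(x * 10) / 10

base = 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))
print(base)  # 5.5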

Are You Affected? How to Check

Audit Your Systems for makeindex

The first step is knowing whether makeindex is present in your environment:

# Check if makeindex is installed and find its version
which makeindex
makeindex --version
# Search for makeindex binaries (Linux/macOS)
find /usr /opt /Library -name "makeindex" 2>/dev/null
# macOS: MacTeX installed via .pkg from tug.org lives here
ls /Library/TeX/texbin/makeindex 2>/dev/null
# Debian/Ubuntu: check installed TeX Live packages
dpkg -l | grep texlive
# RedHat/CentOS:
rpm -qa | grep texlive
# macOS (Homebrew Cask only):
brew list --cask | grep mactex

Audit Your Pipelines and Build Configs

If makeindex is invoked anywhere in your automated workflows, those systems are potential targets:

# Find CI/CD workflow files referencing makeindex or TeX Live
find .github/workflows -name "*.yml" -exec grep -l "pdflatex\|makeindex\|texlive" {} \;
# Find Dockerfiles that install TeX Live
find . -name "Dockerfile*" -exec grep -l "texlive\|mactex" {} \;
# Find Makefiles or build scripts calling makeindex
find . -name "Makefile" -exec grep -l "makeindex" {} \;
find . -name "*.sh" -exec grep -l "makeindex" {} \;

Remediation

Immediate: Wrap makeindex Calls with a Timeout

Since this crash is a fast stack exhaustion during sorting, wrapping makeindex with a timeout will catch and surface the failure rather than letting it hang or fail silently in build pipelines:

# Linux (GNU coreutils)
timeout 60s makeindex document.idx || echo "ERROR: makeindex timed out or crashed"

This is a stopgap, not a fix — but it prevents automated systems from stalling indefinitely and makes the failure visible.
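
macOS does not ship GNU timeout by default (Homebrew's coreutils provides it as gtimeout), so if your build tooling is Python-driven, a portable equivalent is a subprocess call with a timeout. A sketch, with the function name and limits chosen for illustration:

import subprocess
import sys

def run_makeindex(idx_path, timeout_s=60):
    """Run makeindex with a hard timeout; return True on success and
    print a visible error on crash or hang instead of failing silently."""
    try:
        proc = subprocess.run(["makeindex", idx_path],
                              capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        print(f"ERROR: makeindex timed out on {idx_path}", file=sys.stderr)
        return False
    if proc.returncode != 0:
        # Negative return codes mean death by signal (-11 is SIGSEGV)
        print(f"ERROR: makeindex failed on {idx_path} "
              f"(return code {proc.returncode})", file=sys.stderr)
        return False
    return True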

Long-Term: Wait for the Vendor Patch

The TeX Live Security Team has acknowledged the issue and targeted a fix for the TeX Live 2026 release. When available, update via the TeX Live Manager:

sudo tlmgr update --self --all

If you installed TeX Live via a Linux distribution's package manager (apt, dnf), monitor your distro's package updates separately—distribution packages can lag behind upstream TeX Live releases significantly.

For developers interested in the underlying fix: the root cause is in texk/makeindexk/qsort.c. The custom qqsort() implementation has no recursion depth limit and uses 2-way partitioning that degrades to O(n) recursion depth when duplicate entries cluster together. The cleanest fix is replacing qqsort() with the standard library qsort(), which handles pathological inputs safely on modern platforms.

The Big Picture: Legacy Software in Modern Infrastructure

This vulnerability is a case study in a broader security challenge: legacy software in critical infrastructure. makeindex was written in the 1970s-1980s, before:

  • Stack canaries and ASLR
  • Fuzzing and security testing were standard practice
  • Academic software faced adversarial threat models
  • CI/CD pipelines existed

Yet today, this 40-year-old code runs in:

  • Cloud CI/CD systems processing millions of builds
  • Automated academic publishing platforms
  • Enterprise documentation generators
  • Container orchestration workflows

The lesson for security teams: Audit your dependencies. The tools you don't know you're using are often the ones that bite you. SBOM (Software Bill of Materials) isn't just for npm packages — it's for every tool in your pipeline, including the LaTeX processor you inherited from a grad student in 2005.

Conclusion — What We Learned and Where We Go From Here

The Full Circle: From Student to Researcher

When I discovered this vulnerability on December 26, 2025, I was deep in GXPN coursework, focused on learning exploitation techniques — buffer overflows, ROP chains, kernel vulnerabilities, and the art of turning bugs into shells. I never expected my course notes would become a case study in vulnerability discovery. But that's how security research often works: the best findings come from organic use, not just targeted fuzzing campaigns.

This journey taught me that vulnerability research isn't always about fuzzing frameworks or million-dollar bug bounties. Sometimes it's:

  • A segfault during legitimate work
  • Curiosity about why something failed
  • Systematic debugging turned into root cause analysis
  • Following the evidence wherever it leads
  • Responsible disclosure instead of Twitter glory

I learned more from this accidental discovery than from weeks of GXPN coursework — not about exploitation techniques, but about the mindset of a security researcher: persistence, methodology, and ethics.

The Technical Lessons: What This Vulnerability Teaches Us

Lesson 1: Old Code Doesn't Mean Battle-Tested Code

makeindex has been in production for 40 years. Millions of users have processed billions of index files. Yet this vulnerability went unnoticed until pathological input — 1,990 entries with 35 clustered duplicates in a specific order — triggered it organically.

The myth: "If it's been around this long, it must be secure." The reality: Legacy code often survives because typical inputs don't trigger edge cases. The bugs are there, waiting for the right data pattern.

Takeaway for developers: Age is not a security audit. Tools written before modern security practices need fuzzing, static analysis, and adversarial testing — even if they've "worked fine" for decades.

Lesson 2: Algorithmic Vulnerabilities Are Subtle and Dangerous

This wasn't a buffer overflow you could catch with AddressSanitizer. It wasn't a use-after-free that Valgrind would flag. It was a design assumption baked into a 1970s quicksort implementation:

"Duplicate entries are rare, and input is random enough that partitions will be balanced."

That assumption held for 40 years — until it didn't. My GXPN course notes, with their alphabetically organized, legitimately duplicated entries, were the perfect storm.

Takeaway for security researchers: Don't just fuzz for memory corruption. Test algorithmic assumptions:

  • What happens with all identical inputs?
  • What if data is already sorted?
  • What if duplicates cluster together?
  • What's the worst-case complexity?
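
These checks are cheap to automate. Below is a sketch of a tiny harness that generates a few adversarial index files and feeds them to the hypothetical crashes_with_entries() helper from the fuzzing section; the generators are illustrative, and as the earlier tests showed, synthetic patterns alone did not reproduce this particular crash.

def adversarial_inputs(n=2000):
    """Yield (label, entries) pairs exercising the assumptions above."""
    entry = "\\indexentry{Duplicate Topic}{1}"
    yield "all identical", [entry] * n
    yield "already sorted", [f"\\indexentry{{Topic {i:05d}}}{{{i}}}" for i in range(n)]
    clustered = []
    for i in range(n // 10):
        clustered.append(f"\\indexentry{{Topic {i:05d}}}{{{i}}}")
        clustered.extend([f"\\indexentry{{Topic {i:05d}}}{{{i + 1}}}"] * 9)
    yield "clustered duplicates", clustered

for label, entries in adversarial_inputs():
    print(label, "->", "CRASH" if crashes_with_entries(entries) else "ok")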

Lesson 3: "DoS Only" Is Still a Real Impact

Some researchers dismiss denial-of-service vulnerabilities as "not sexy" compared to code execution. But tell that to:

  • The PhD student whose thesis won't compile 24 hours before the deadline
  • The CI/CD pipeline that's been failing for 6 hours and blocking production deployments
  • The publisher whose automated typesetting system crashes on every submitted mathematics textbook

DoS vulnerabilities have real business impact:

  • Lost productivity
  • Missed deadlines
  • Operational disruption
  • Reputational damage (unreliable systems)

Takeaway for CISOs: CVSS 5.5 (MEDIUM) doesn't mean "ignore it." Assess impact in your context. If your business relies on automated LaTeX compilation, this is HIGH severity for you.

Lesson 4: Responsible Disclosure Protects Everyone

I had a perfect proof-of-concept, complete root cause analysis, and a great story. The temptation to tweet "BREAKING: 0-day in TeX Live used by millions" was real. But I chose coordinated disclosure instead.

Why this matters:

  • Users get protection: Vendors have time to assess and patch before attackers weaponize the bug
  • Vendors cooperate: TeX Live Security Team was responsive and professional, providing clear guidance
  • Ecosystem improves: Constructive collaboration leads to better patches than rushed emergency fixes
  • Researcher reputation: Future vendors will trust me because I handled this responsibly

Takeaway for aspiring security researchers: Your first CVE is a career milestone. Do it right. Follow CERT guidelines, coordinate with vendors, and prioritize user safety over social media clout.

The Broader Implications: Legacy Software in Modern Infrastructure

This vulnerability is a microcosm of a larger problem: invisible dependencies in critical infrastructure.

How many organizations know they rely on:

  • 40-year-old makeindex for documentation pipelines?
  • Volunteer-maintained academic software in production systems?
  • Tools that predate modern security practices?

The Software Supply Chain Problem:

npm install
↓
installs 847 packages
↓
one of them runs LaTeX for README rendering
↓
which uses TeX Live 2025
↓
which includes vulnerable makeindex 2.17
↓
Your CI/CD now has a DoS vector you didn't know about

What security teams need to do:

  1. SBOM everything: Software Bill of Materials isn't optional. Know every tool in your pipeline, including the LaTeX processor in your documentation builder.
  2. Audit legacy dependencies: Just because it's been stable for years doesn't mean it's secure. Fuzz it. Test it. Review it.
  3. Defense in depth: Wrap tool invocations with timeouts. Don't assume a tool won't crash just because it never has.
  4. Vendor diversity: If one ancient tool breaks, have fallbacks. Standard library qsort() vs. custom quicksort. Alternative documentation generators.

Acknowledgments: Standing on the Shoulders of Giants

This vulnerability discovery wouldn't have happened without:

  • SANS/GIAC GXPN Course (SEC660): The coursework that generated the data pattern which triggered the bug. Special thanks to the SEC660 authors Barnett Darnell and James Shewmaker for doing an awesome job explaining vulnerability research concepts. Sometimes security education teaches you in unexpected ways. Thank you for an awesome course.
  • Casian Ciobanu, the malware wizard: If it wasn't for him pointing me to this software for compiling my GXPN study notes, I would have never stumbled on this vulnerability.
  • Claude AI (Anthropic): For assisting with rapid prototyping of the fuzzing and analysis tools through AI-assisted pair programming. The binary search fuzzer, delta debugger, and pattern analyzers were developed using modern "vibe coding" techniques — where I described the logic and Claude generated the implementation, allowing me to focus on vulnerability analysis rather than debugging Python scripts.
  • TeX Live Community: For creating and maintaining incredible open-source academic software for decades. This vulnerability doesn't diminish their work — it's a reminder that even great software needs ongoing security review.
  • Security Research Community: The methodologies I used (binary search fuzzing, delta debugging, exploitability assessment) come from decades of published research. I stood on the shoulders of giants.
  • MITRE and CERT: For providing structured disclosure frameworks that make coordinated vulnerability disclosure possible.

Special thanks to the TeX Live Security Team (tlsecurity@tug.org) for their prompt response and professional handling of this disclosure. Their acknowledgment and support for public disclosure demonstrates the collaborative spirit that makes coordinated vulnerability disclosure work.

Final Thoughts: The "Good Bug" Philosophy

Not all bugs are created equal. Some vulnerabilities are found through malicious intent — attackers probing systems to steal data or cause harm. Others are found by researchers hunting for glory or bug bounties. But some bugs are what I call "good bugs" — discovered during legitimate use, reported responsibly, and fixed collaboratively to make software better for everyone.

This is a "good bug." I didn't set out to break TeX Live. I was building a study tool. When it crashed, I investigated. When I found the root cause, I disclosed it responsibly. When the patch is ready, the ecosystem will be stronger.

My hope is that this writeup inspires:

  • Students: Your coursework might lead to unexpected discoveries. Stay curious when things break.
  • Developers: Your 40-year-old sorting algorithm might have edge cases you never tested. Fuzz it.
  • Security teams: Your infrastructure has invisible dependencies. Audit them.
  • Researchers: Responsible disclosure is a badge of honor, not a compromise. Do it right.

The Last Lesson: Sometimes the Best Vulnerabilities Find You

I'll leave you with this: I didn't go looking for a vulnerability. I was organizing my notes, studying for an exam, trying to make a prettier PDF. The vulnerability found me. It was a Christmas gift from Santa!

And that's the beautiful, frustrating, humbling reality of security research — the bugs are everywhere, hiding in plain sight, waiting for the right combination of circumstances to reveal themselves.

Maybe it's in the password reset flow you use every day. Maybe it's in the documentation builder that's been running for years without issues. Maybe it's in the sorting algorithm that "everyone knows works."

You won't find it by assuming everything is secure. You'll find it by being curious when things break. You'll find it by asking "why?" instead of "how do I work around this?" You'll find it by doing the right thing when you discover something dangerous.

So here's my challenge to you: The next time a tool crashes, don't just Google the error and move on. Investigate. Reproduce it. Isolate it. Understand it.

Maybe it's just a bug. Or maybe — just maybe — you've found the next CVE.

And when you do, remember: responsible disclosure, thorough analysis, and protecting users comes first. The glory comes later.

V-One Giri, Security Researcher
Discovered: December 26, 2025 | Publicly disclosed: February 11, 2026
CVE Pending (MITRE non-responsive as of February 11, 2026) | CWE-674 | CVSS 5.5 MEDIUM

"Sometimes the best vulnerabilities find you while you're just trying to study for an exam."

Appendix: Resources and References

Vulnerability Details:

  • CVE: CVE-2025-XXXXX (submitted December 26, 2025 — MITRE has not responded as of February 11, 2026)
  • CWE: CWE-674 — Uncontrolled Recursion
  • CVSS: 5.5 (MEDIUM) — CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H
  • Reference: makeindex-2.17-segfault-duplicate-entries

Technical Documentation:

  • Full technical analysis: [GitHub repository — to be published]
  • Proof-of-concept file: main_categorized.idx (196 KB)
  • Fuzzing tools: Binary search, delta debugger, pattern analyzer
  • Source code analysis: TeX Live makeindex qsort.c

Disclosure Contacts:

  • TeX Live Security: tlsecurity@tug.org
  • MITRE CVE: cve@mitre.org
  • Discoverer: V-One Giri

Further Reading:

  1. MITRE CWE-674: https://cwe.mitre.org/data/definitions/674.html
  2. CVSS v3.1 Specification: https://www.first.org/cvss/v3.1/specification-document
  3. CERT Guide to Coordinated Vulnerability Disclosure: https://vuls.cert.org/confluence/display/CVD
  4. Sedgewick, R. (1977). "Quicksort," Stanford University
  5. Bentley & McIlroy (1993). "Engineering a Sort Function," Software: Practice and Experience

This article documents the discovery and responsible disclosure of a stack overflow vulnerability in makeindex 2.17 (TeX Live 2025). The vulnerability was discovered on December 26, 2025, disclosed to the vendor the same night, and the vendor authorized immediate public disclosure. This article is being published on February 11, 2026, 46 days after discovery. A CVE was submitted to MITRE on December 26, 2025; as of publication, MITRE has not responded or assigned a CVE number. No vendor patch has been released yet; a fix is targeted for the TeX Live 2026 release.

All technical details are accurate as of the publication date. For the latest patch status and mitigation guidance, please check the official TeX Live security advisories at https://tug.org/texlive/