CVE-2026-23398 is a remote denial-of-service in the Linux kernel IPv4 ICMP path. A missing NULL check in icmp_tag_validation() allows a kernel panic when the host is configured with net.ipv4.ip_no_pmtu_disc = 3 ("hardened" Path MTU Discovery mode) and processes an ICMP "Fragmentation Needed" (Type 3, Code 4) message whose quoted inner IPv4 header references a protocol number with no registered inet_protos[] handler.
There is no privilege escalation and no memory corruption exploit chain in the public description: the failure mode is a NULL pointer dereference in softirq context, which typically becomes an immediate system halt.
This article explains the mechanism at the code and packet level, clarifies which preconditions are non-obvious, and documents why minimal lab PoCs often fail unless the receive path matches production ingress behavior.
Full PoC:
https://github.com/zpol/cve-2026-23398-poc

Path MTU Discovery (PMTUD) relies on ICMP "Destination Unreachable / Fragmentation Needed and DF set" (IPv4 ICMP type 3, code 4 per RFC 1191). When a router cannot forward a packet because it exceeds the next hop MTU and the Don't Fragment bit is set, it returns this ICMP message. The payload embeds (at least) the IPv4 header of the original datagram plus eight octets of payload, so the receiver can associate the error with a flow and adjust its notion of the path MTU.
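The layout described above can be sketched as a small parser. This is a minimal illustration using only the Python standard library; the names parse_frag_needed and FragNeeded are hypothetical, and the sketch assumes the buffer starts at the ICMP header (outer IPv4 header already stripped).

```python
import socket
import struct
from typing import NamedTuple, Optional

ICMP_DEST_UNREACH = 3   # ICMP type: Destination Unreachable
ICMP_FRAG_NEEDED = 4    # ICMP code: Fragmentation Needed and DF set


class FragNeeded(NamedTuple):
    """Fields of interest from a frag-needed message (illustrative type)."""
    next_hop_mtu: int
    inner_protocol: int
    inner_src: str
    inner_dst: str


def parse_frag_needed(icmp: bytes) -> Optional[FragNeeded]:
    """Return the quoted-header fields, or None if this is not a
    well-formed type 3 / code 4 message."""
    # 8-byte ICMP header plus at least a minimal 20-byte quoted IPv4 header.
    if len(icmp) < 8 + 20:
        return None
    # RFC 1191 layout: type, code, checksum, unused, next-hop MTU.
    icmp_type, icmp_code, _csum, _unused, mtu = struct.unpack("!BBHHH", icmp[:8])
    if (icmp_type, icmp_code) != (ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED):
        return None
    quoted = icmp[8:]
    if quoted[0] >> 4 != 4:                 # quoted header must be IPv4
        return None
    header_len = (quoted[0] & 0x0F) * 4     # IHL is counted in 32-bit words
    if header_len < 20 or len(quoted) < header_len:
        return None
    return FragNeeded(
        next_hop_mtu=mtu,
        inner_protocol=quoted[9],           # the IPv4 "protocol" field
        inner_src=socket.inet_ntoa(quoted[12:16]),
        inner_dst=socket.inet_ntoa(quoted[16:20]),
    )
```

The inner_protocol value is the field that matters for the rest of this article: it is the number the receiver uses to decide which protocol the error belongs to.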
The Linux sysctl net.ipv4.ip_no_pmtu_disc controls how aggressively the kernel trusts such ICMP for various protocols. Mode 3 was introduced to harden environments that want PMTU updates for protocols that cryptographically or statefully validate the quoted packet (TCP, SCTP, DCCP) but want to ignore less trustworthy cases. It gates handling of ICMP frag-needed on a per-protocol flag: icmp_strict_tag_validation on struct net_protocol.
That design is sound in intent: only protocols that implement stringent validation should act on ICMP errors that could have been spoofed. The implementation bug was a classic oversight: the code assumed every possible IPv4 "protocol" field in the *inner* quoted header would map to a non-NULL entry in inet_protos[].
The sparse inet_protos[] table
The kernel does not register 256 IPv4 protocol handlers. Only a small subset of the IANA protocol numbers have net_protocol structures installed. For unassigned or unsupported numbers, inet_protos[proto] is NULL.
icmp_tag_validation(), in the vulnerable form, did:
    rcu_read_lock();
    ok = rcu_dereference(inet_protos[proto])->icmp_strict_tag_validation;
    rcu_read_unlock();
If proto indexes a NULL slot, the dereference is a NULL+offset load (the field access adds a small offset to the pointer), which triggers a fault in interrupt/softirq context. The observed crash signatures (e.g. CR2 near 0x10 on x86_64) align with accessing a member of a struct through a NULL net_protocol pointer.
The fix is minimal and correct: capture the pointer, test for NULL, and treat "no handler" as "cannot perform strict tag validation," i.e. return false so the ICMP is discarded under mode 3 instead of crashing.
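The difference between the two forms can be modeled in user space. The following is a Python sketch of the logic only, not kernel code: the dictionary stands in for the sparse inet_protos[] array, its contents are an illustrative subset, and the flag values shown for ICMP and UDP are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class NetProtocol:
    """Stand-in for struct net_protocol; only the flag relevant here."""
    name: str
    icmp_strict_tag_validation: bool


# Sparse table: a handful of the 256 possible slots are populated.
# Illustrative subset; ICMP/UDP flag values are assumptions for this model.
INET_PROTOS: Dict[int, NetProtocol] = {
    1: NetProtocol("icmp", False),
    6: NetProtocol("tcp", True),
    17: NetProtocol("udp", False),
    132: NetProtocol("sctp", True),
}


def lookup(proto: int) -> Optional[NetProtocol]:
    """Model of reading inet_protos[proto]; None plays the role of NULL."""
    return INET_PROTOS.get(proto & 0xFF)


def tag_validation_vulnerable(proto: int) -> bool:
    # Mirrors the vulnerable form: the field is read without checking the
    # pointer. In Python this raises AttributeError on None; in the kernel
    # the equivalent is a NULL pointer dereference.
    return lookup(proto).icmp_strict_tag_validation


def tag_validation_fixed(proto: int) -> bool:
    # Mirrors the fix: capture the pointer, test it, and treat
    # "no handler" as "cannot perform strict tag validation".
    handler = lookup(proto)
    if handler is None:
        return False
    return handler.icmp_strict_tag_validation
```

In this model, tag_validation_fixed(253) returns False, while tag_validation_vulnerable(253) raises, because slot 253 is empty. Both functions agree for any populated slot.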
Exact trigger conditions (all must align)
For the vulnerable code path to execute:
A) sysctl net.ipv4.ip_no_pmtu_disc == 3 on the receiving network namespace.
B) An ICMPv4 message arrives that the stack processes as ICMP_DEST_UNREACH with ICMP_FRAG_NEEDED (type 3, code 4), with valid outer IPv4 + ICMP headers.
C) The quoted inner IPv4 header (in the ICMP payload) has ip.protocol set to a value N such that inet_protos[N] is NULL after rcu_dereference.
D) Execution reaches icmp_tag_validation(N) in that context (see icmp_unreach in net/ipv4/icmp.c in affected trees).
If (A) is false, the switch in icmp_unreach does not take the mode-3 branch that calls icmp_tag_validation for frag-needed. If (C) is false because N is TCP/UDP/ICMP/etc., a real handler exists and there is no NULL deref.
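Conditions (A) through (C) can be written as a single predicate, which is handy when triaging a packet capture against a known host configuration. This is a sketch under stated assumptions: the function name is hypothetical, and the set of registered protocol numbers must be supplied by the caller because it depends on the kernel build and loaded modules.

```python
from typing import AbstractSet

ICMP_DEST_UNREACH = 3
ICMP_FRAG_NEEDED = 4
HARDENED_MODE = 3       # net.ipv4.ip_no_pmtu_disc = 3


def would_hit_empty_slot(
    pmtu_mode: int,
    icmp_type: int,
    icmp_code: int,
    inner_proto: int,
    registered: AbstractSet[int],
) -> bool:
    """True when all three packet/config preconditions align.

    `registered` is the caller's view of which protocol numbers have a
    handler installed (hypothetical input, not read from the kernel).
    """
    in_mode_3 = pmtu_mode == HARDENED_MODE                                       # (A)
    is_frag_needed = (icmp_type, icmp_code) == (ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED)  # (B)
    slot_is_empty = (inner_proto & 0xFF) not in registered                       # (C)
    return in_mode_3 and is_frag_needed and slot_is_empty
```

For example, with registered = {1, 6, 17}, a frag-needed message quoting protocol 253 satisfies the predicate in mode 3 but not in mode 0, and a message quoting protocol 6 never does.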
This is why a naïve Scapy one-liner that only pads the ICMP payload may not reproduce the bug: the inner quoted header must be shaped so that the kernel parses a protocol field that maps to an empty slot.
Packet crafting
A structurally valid ICMP frag-needed includes the RFC 1191 next-hop MTU field and the quoted original IP datagram prefix. In Scapy terms, layering is typically:
outer_ip / ICMP(type=3, code=4, nexthopmtu=<mtu>) / inner_ip / inner_payload_8bytes
Here inner_ip uses an uncommon ip.proto (e.g. experimental/unused numbers) and addresses that are consistent enough for the stack to parse the quoted section as an IPv4 header.
The outer destination must be a live IPv4 address on the victim; routing and firewall rules must allow the packet to reach the host's IP stack.
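"Structurally valid" includes correct checksums: both the IPv4 header and the ICMP message carry the 16-bit one's-complement Internet checksum (RFC 1071). When inspecting a capture, a quick way to rule out a malformed message is to verify that checksum. The helper names below are illustrative, and the sketch uses only the standard library.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement of the one's-complement sum
    of the data taken as big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                     # pad odd-length input
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def checksum_ok(message: bytes) -> bool:
    """True if `message` (checksum field included as received) verifies.

    A correct message sums to 0xFFFF, so its complement is 0.
    """
    return internet_checksum(message) == 0
```

Apply checksum_ok to the whole ICMP message (header plus quoted data) or to the IPv4 header bytes alone; a message that fails here is not a useful test of the code path discussed in this article.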
Why "send to self" often shows nothing
Lab authors frequently run the sender on the same machine as the "victim." That can fail to reproduce for reasons unrelated to the CVE:
• Locally generated packets may short-circuit parts of the receive path or differ in routing decisions.
• QEMU user networking does not faithfully support host-to-guest ICMP the way a tap/bridge or physical NIC does.
• Even on real hardware, the precise path from raw socket transmission to local delivery is not guaranteed to mirror "remote packet arrives on eth0."
A reliable lab therefore isolates the *ingress* path:
• Two network namespaces on the same kernel, connected by a veth pair: send from the init netns to the peer address in the "victim" netns where ip_no_pmtu_disc=3 is set, so icmp_rcv runs as on a received frame; or
• A second machine or VM on a real L2/L3 path; or
• QEMU with tap + host route to the guest IP.
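The first option (two namespaces joined by a veth pair) can be sketched as a list of iproute2 and sysctl commands. The namespace name, interface names, and addresses below are hypothetical placeholders, and the helper defaults to a dry run that only prints the commands. Both namespaces share one kernel, so a panic takes down the whole machine; run this only inside a disposable VM.

```python
import shlex
import subprocess
from typing import List


def veth_lab_commands(
    ns: str = "victim",                 # hypothetical namespace name
    host_if: str = "veth-host",         # hypothetical interface names
    ns_if: str = "veth-victim",
    host_addr: str = "10.99.0.1/24",    # hypothetical lab addresses
    ns_addr: str = "10.99.0.2/24",
) -> List[List[str]]:
    """Commands that build the namespace, the veth pair, and set mode 3
    inside the victim namespace only."""
    in_ns = ["ip", "netns", "exec", ns]
    return [
        ["ip", "netns", "add", ns],
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", ns_if],
        ["ip", "link", "set", ns_if, "netns", ns],
        ["ip", "addr", "add", host_addr, "dev", host_if],
        ["ip", "link", "set", host_if, "up"],
        in_ns + ["ip", "addr", "add", ns_addr, "dev", ns_if],
        in_ns + ["ip", "link", "set", ns_if, "up"],
        in_ns + ["ip", "link", "set", "lo", "up"],
        in_ns + ["sysctl", "-w", "net.ipv4.ip_no_pmtu_disc=3"],
    ]


def apply(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it (root required) only if dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply(veth_lab_commands())          # dry run: prints, changes nothing
```

Traffic sent from the init namespace to the victim address then crosses the veth pair and is received in the victim namespace as an ingress frame, which is the property the bullets above call for.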
This distinction is important for engineers interpreting "one packet" headlines: the vulnerability is still "one malicious packet on the wire," but local tooling must actually place that packet on the wire (or equivalent) from the receiver's perspective.
Timeline nuance: bug age vs CVE date
Public writeups tie the issue to CVE-2026-23398 and a 2026 fix. The underlying pattern (NULL deref when mode 3 consults inet_protos[proto]) traces back to commit 8ed1dc44d3e9 ("ipv4: introduce hardened ip_no_pmtu_disc mode"), which introduced mode 3 and icmp_tag_validation; the NULL check fix landed later as 614aefe56af8e ("icmp: fix NULL pointer dereference in icmp_tag_validation()").
Distributions backport fixes under their own kernel build numbers. A "patched" system may show no panic even with sysctl=3, while an older mainline build predating the fix remains vulnerable. For reproducible research, pin an explicit kernel image (e.g. a known-vulnerable package) rather than assuming "latest Ubuntu" is exploitable.
Impact model
• Confidentiality: none inherent to this bug.
• Integrity: none inherent.
• Availability: complete for the machine (kernel panic); in clusters, cascading effects depend on orchestration and health checks.
• Attack prerequisites: ability to deliver IPv4 ICMP to the target (same broadcast domain, routed path, or tunnel). No authentication. EPSS and CVSS (when published) quantify exposure but do not change the technical mechanism.
Mitigations (defense in depth)
1) Patch the kernel (preferred).
2) If you must run unpatched temporarily, avoid global sysctl ip_no_pmtu_disc=3 unless you fully understand the workload; lowering to 0/1/2 per documentation removes this specific branch (with different PMTU/security tradeoffs).
3) Network filtering: dropping inbound ICMP frag-needed has operational cost (PMTUD black holes). Prefer fixing the kernel.
4) Detection: post-mortem logs may show "Kernel panic" / NULL deref in icmp_unreach; inline prevention is the patch, not IDS signatures alone.
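For mitigation 2, a host audit only needs to read the sysctl. The sketch below reads the procfs entry for the current network namespace and reports whether the mode-3 branch is enabled. The function names are illustrative, and the check says nothing about whether the running kernel is patched.

```python
from pathlib import Path
from typing import Optional

# procfs view of net.ipv4.ip_no_pmtu_disc for the current network namespace.
SYSCTL_PATH = Path("/proc/sys/net/ipv4/ip_no_pmtu_disc")
HARDENED_MODE = 3


def read_pmtu_mode(path: Path = SYSCTL_PATH) -> Optional[int]:
    """Return the sysctl value, or None if it cannot be read or parsed."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def mode3_branch_enabled(mode: Optional[int]) -> bool:
    """True only when the hardened mode is explicitly configured."""
    return mode == HARDENED_MODE


if __name__ == "__main__":
    mode = read_pmtu_mode()
    if mode is None:
        print("ip_no_pmtu_disc: could not be read")
    elif mode3_branch_enabled(mode):
        print("ip_no_pmtu_disc=3: mode-3 branch enabled; confirm the kernel is patched")
    else:
        print(f"ip_no_pmtu_disc={mode}: mode-3 branch not in use")
```

Because the setting is per network namespace, run the check inside each namespace of interest (for example via ip netns exec) rather than only in the init namespace.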
References
• NVD, CVE-2026-23398: https://nvd.nist.gov/vuln/detail/CVE-2026-23398
• Linux fix commit: 614aefe56af8e13331e50220c936fc0689cf5675
• Introduced mode 3 / icmp_tag_validation: 8ed1dc44d3e9
• RFC 1191: Path MTU Discovery
• RFC 792: ICMP (baseline formats)
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
CVE-2026-23398 is a sharp lesson in invariant reasoning across sparse tables: security features (mode 3) added real policy, but the implementation forgot that "protocol number" is not synonymous with "registered handler." For software engineers, the takeaway is to treat kernel-facing inputs (especially under hardening branches) as adversarial unless every lookup is validated, including the "impossible" indices that are actually trivially constructible on the network.