Monday, November 29, 2021

Exploiting TotalMeltdown: the fine way

CVE-2018-1038, aka TotalMeltdown, is quite an old bug (2018) but still an awesome one, so I decided to write a decent exploit for it.

The vulnerability was discovered by @ulffrisk. The first functioning LPE PoC was released by @xpn, who also wrote a blog post about it, which you can find here. I used @xpn's blog post to write my first exploit. It worked, but it kept crashing the system because it overwrote some critical memory regions, and the raw physical memory search took ages. So I wrote another, cleaner exploit, which I will walk you through in this post.

TotalMeltdown


CVE-2018-1038 is a logic bug: when Microsoft patched the Meltdown vulnerability, a bit (U/S) that was not supposed to be set ended up set in one of the PML4 (Page Map Level 4) entries.

By setting the U/S (User/Supervisor) bit in the PML4 entry at index 0x1ED, Microsoft allowed any usermode (ring3) program to access any physical memory mapped through this particular entry. The catch is that PML4 entry 0x1ED is actually a self ref PML4 entry, which makes this worse than just reading and writing the physical memory mapped through it!

Paging 101


As you may already know, the addresses programs use to reference memory are not the real addresses of the physical memory installed in your computer. They are called Virtual Addresses and must first be translated to Physical Addresses, which is usually done by the MMU in your CPU. Virtual addressing is provided by enabling paging in the CPU (the OS does this in the early stages of booting).

So how is a Virtual Address translated anyway? Well, let's first see what a Virtual Address is:

As you can see the Virtual Address is just a bunch of offsets (or indexes) in some tables, and by understanding these tables we can eventually understand paging.

There are 4 tables (in long mode paging, i.e. PAE enabled and 4KiB page size) that the paging mechanism requires to be set up before paging is enabled (otherwise the system will crash as soon as paging is turned on). They must be set up first because once the OS enables paging (by setting the PG bit in the CR0 register), every address afterwards is deemed to be a virtual address and will be translated by the MMU. Each table is 4KiB in size and holds 512 entries (4KiB / 8 bytes per entry).

The first table, at the top of the hierarchy, is the Page Map Level 4 table, PML4 for short; there is only 1 PML4 per process. The second is the PDP, or Page Directory Pointer table: every entry in the PML4 holds the base address of a PDP table, so there can be up to 512 PDP tables if all the PML4 entries are used. Next is the PD, or Page Directory table, with up to 512 PD tables per PDP table, and finally the PT, or Page Table, with up to 512 PT tables per PD table.

How those tables are linked and work together is quite simple: every entry of these tables (8 bytes on x86-64) holds the physical address of the next table, starting from the PML4 and ending at the PT, and every index in the Virtual Address being translated is used within its corresponding table.

For example, if the PML4 physical address is 0x1000 (table addresses are always page aligned and their size is the same as the page size, i.e. 4KiB) and the PML4 index of the virtual address is 0xe4, then the MMU extracts the address of the next table (PDP) by reading the entry at base of the table + index * size of entry (8 on x86-64), which is 0x1000 + (0xe4 * 8) = 0x1720.

Here is how the full translation is done:
At the end of the day, a physical address is obtained by reading the entry in the PT. This physical address points to a physical page/frame, and the 12-bit offset is used within this page to access any position in it.

Let's try to translate this virtual address: 0xfffffa80018ab040, which is the address of the System's EPROCESS in my testing VM. First let's get the indexes:
  • the first 12 bits are the offset in the page: 0x40
  • the next 9 bits are the PT index: 0xab
  • the next 9 bits are the PD index: 0xc
  • the next 9 bits are the PDP index: 0
  • the next 9 bits are the PML4 index: 0x1f5
  • the most significant 16 bits: 0xffff.
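
The split above is just shifts and masks. Here is a minimal C sketch of it; the `va_indexes` struct and the `split_va()` name are made up for illustration and are not taken from the exploit:

```c
#include <stdint.h>

/* Hypothetical helper type: the five fields of a 48-bit virtual address. */
typedef struct {
    unsigned pml4;   /* bits 39..47 */
    unsigned pdp;    /* bits 30..38 */
    unsigned pd;     /* bits 21..29 */
    unsigned pt;     /* bits 12..20 */
    unsigned offset; /* bits 0..11  */
} va_indexes;

/* Split a virtual address into its table indexes and page offset
 * (4-level paging, 4KiB pages). The upper 16 bits are ignored here. */
static va_indexes split_va(uint64_t va)
{
    va_indexes ix;
    ix.offset = (unsigned)(va & 0xFFF);          /* 12-bit page offset  */
    ix.pt     = (unsigned)((va >> 12) & 0x1FF);  /* 9 bits per index    */
    ix.pd     = (unsigned)((va >> 21) & 0x1FF);
    ix.pdp    = (unsigned)((va >> 30) & 0x1FF);
    ix.pml4   = (unsigned)((va >> 39) & 0x1FF);
    return ix;
}
```

Calling `split_va(0xfffffa80018ab040)` yields the same five values as the list above.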

For a Virtual Address to be valid it must be a canonical address, meaning the most significant 16 bits must be the sign extension of the most significant bit of the PML4 index (bit 47): if that bit is 1, these 16 bits must all be 1 (0xFFFF), and if it is 0, they must all be 0 (0x0000).

Any other value for these 16 bits means the address is not canonical, thus invalid, and using it results in a #GP fault. Why is this there in the first place? Currently only 48-bit virtual addresses are being used (9 bits PML4 index + 9 bits PDP index + 9 bits PD index + 9 bits PT index + 12 bits offset), which allows addressing 256 terabytes of virtual address space using only 4 paging tables (instead of 6 tables in case of full use of 64-bit virtual addresses).

The designers could have just made the CPU ignore whatever is in the upper 16 bits, but then programmers would take advantage of these "free" 16 bits to store extra information, and when the CPU designers eventually decide to widen virtual addresses, code built around the "free" upper 16 bits would break. So they prevented this from happening with the canonical address concept.

This makes the valid virtual address ranges:
  • 0x0000000000000000-0x00007fffffffffff (generally used for userspace, differs per process)
  • 0xffff800000000000-0xffffffffffffffff (generally used for kernel space and mapped in every process, or at least it was before the Meltdown vulnerability, see: KVAS/KPTI)
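
The canonical rule boils down to "bits 47 to 63 are all equal". A small C sketch of that check, with `is_canonical()` being a name I picked for illustration (this only covers 48-bit addressing as described in this post):

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if va is canonical under 48-bit virtual addressing:
 * bit 47 and the 16 bits above it must be all zeros or all ones. */
static bool is_canonical(uint64_t va)
{
    uint64_t upper = va >> 47;   /* 17 bits: bit 47 plus bits 48..63 */
    return upper == 0 || upper == 0x1FFFF;
}
```

Both ends of the two ranges listed above pass this check, while something like 0x0000800000000000 (bit 47 set, upper bits clear) does not.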

The first thing to do is to get the physical address of the base of the PML4 table, so we can add our PML4 index 0x1f5 (scaled by the entry size) to it. This address, as shown in the picture above, is in the CR3 register:


That's 0x187000. We add to it the PML4 index 0x1f5 times the size of an entry, which is 8 bytes, which gives 0x187fa8:



We read what's in this PML4 entry (!dq reads QWORDs from a given physical address, and the L parameter limits the output to 1 QWORD):


Well, that value is not aligned on a page boundary? I haven't talked about this earlier, but each entry of the tables holds the address of the next table (always aligned on a page boundary) and also contains some flags.

src: Intel developer system programming manual

The flags are in the lower 12 bits and in the upper 16 bits, so we just clear them to get the physical address. Mostly only the lower 12 bits are used for flags: there is only one flag in the upper 16 bits, XD (Execute Disable), and even that one is reserved (must be set to 0) if IA32_EFER.NXE = 0 and the page is present, which makes most of the upper 16 bits reserved/ignored:

src: Intel developer system programming manual

You might notice from the picture above that CR3 also contains flags. I didn't mention that before because in our case CR3 has no flags set (as the .formats command above shows).

So we have 0x4000863; we clear the flags:


This is the PDP table base address. We add the PDP index * 8; the index is 0, so the address stays the same. We read the entry and clear the flags:



The PD table base is 0x4001000. We add the PD index 0xc * 8, read the entry and clear the flags:



The PT base address is 0x580d000. We add the PT index 0xab * 8, read the entry and clear the flags:



Now 0x7ff6a000 is the physical address of the page, to which we add the 12-bit offset 0x40:



And that's it! we got the physical address of 0xfffffa80018ab040 which is 0x7ff6a040.
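
The manual walk above can be written as a short loop. The following is only a sketch of the technique for 4KiB pages (it does not handle large pages), not the exploit's actual code. `walk_4k()`, `read_qword_fn` and the tiny `demo_cells` table are names I made up; in `demo_cells` only the table bases and the PML4 entry value 0x4000863 come from this post, while the flag bits of the other three entries are placeholders.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRY_PRESENT   0x1ULL
#define ENTRY_ADDR_MASK 0x000FFFFFFFFFF000ULL  /* clears lower 12 and upper 12 bits */

/* Caller-supplied primitive that reads one QWORD of physical memory. */
typedef uint64_t (*read_qword_fn)(uint64_t phys_addr, void *ctx);

/* Walk PML4 -> PDP -> PD -> PT for a 4KiB page. Returns 0 on success. */
static int walk_4k(uint64_t cr3, uint64_t va, read_qword_fn read_qword,
                   void *ctx, uint64_t *phys_out)
{
    uint64_t table = cr3 & ENTRY_ADDR_MASK;
    int shift;

    for (shift = 39; shift >= 12; shift -= 9) {
        uint64_t index = (va >> shift) & 0x1FF;
        uint64_t entry = read_qword(table + index * 8, ctx);
        if (!(entry & ENTRY_PRESENT))
            return -1;                      /* not mapped */
        table = entry & ENTRY_ADDR_MASK;    /* next table, or the final page */
    }
    *phys_out = table | (va & 0xFFF);
    return 0;
}

/* Demo "physical memory": just the four entries used in the walk above.
 * Flag bits other than those of 0x4000863 are made up. */
typedef struct { uint64_t addr, value; } phys_cell;

static const phys_cell demo_cells[] = {
    { 0x187fa8,  0x4000863  },  /* PML4[0x1f5] -> PDP at 0x4000000 */
    { 0x4000000, 0x4001863  },  /* PDP[0]      -> PD  at 0x4001000 */
    { 0x4001060, 0x580d863  },  /* PD[0xc]     -> PT  at 0x580d000 */
    { 0x580d558, 0x7ff6a963 },  /* PT[0xab]    -> page 0x7ff6a000  */
};

static uint64_t demo_read(uint64_t phys_addr, void *ctx)
{
    size_t i;
    (void)ctx;
    for (i = 0; i < sizeof demo_cells / sizeof demo_cells[0]; i++)
        if (demo_cells[i].addr == phys_addr)
            return demo_cells[i].value;
    return 0;   /* everything else reads as "not present" */
}
```

With `demo_read` as the reader and 0x187000 as CR3, `walk_4k()` resolves 0xfffffa80018ab040 to 0x7ff6a040, matching the result above.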

We can verify the translation with the !vtop windbg command, which translates virtual addresses to physical addresses. The argument 0 means use the current process (System) context. The context matters because every process has its own PML4 table: if you translate an address from one process using another process's PML4 table (and subsequently its other tables), the translation will be wrong. This is irrelevant for kernel addresses, though, because they are mapped in the upper half of the PML4 table of every process.



Seems our math is OK! Now, about the flags: we can use the !pte command to translate the virtual address 0xfffffa80018ab040 and also show the flags:



Those are the flags under each table base. V stands for valid and K for a Supervisor/Kernel mode page (code running in ring3 can't access these pages; the opposite of this flag is U (usermode), which grants access to ring3 code). W stands for readable and writable (the opposite is R, which means read only). You can find all the flags in the !pte docs or in the Intel manual. You can also see that the table bases are the same as the ones we read from the entries during our translation.

Self Ref Entry


The self ref entry trick, or self reference entry, is a trick osdevs use to access any entry in any of the 4 tables without any extra structures or custom code. All that is needed is a PML4 entry which points to the PML4 table itself, i.e. the entry contains the physical address of the PML4 table, just like the CR3 register.

how will this allow us to access any table? well here is an example:

consider the entry at index 266 in the PML4 of our imaginary os is a self ref entry, the PML4 table will be similar to this one:

relevant flags are in green: K stands for Supervisor/Kernel mode, U for usermode

If we want to edit the PML4 entry at index 188, we build a virtual address from the indexes listed below (you can use @xpn's go script to do that):



The script is pretty straightforward: it puts each index at its position in the virtual address by left shifting it by the right amount, then sign extends the result. The indexes are:
  • PML4 index: self ref entry 266
  • PDP index: 266
  • PD index: 266
  • PT index: 266
  • offset: 188*8
The virtual address of this would be: 0xFFFF8542A150A5E0.
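
For readers who prefer C over Go, the same construction can be sketched like this. It is my own re-implementation of the idea (shift each index into place, then sign extend), not a port of @xpn's script, and `build_va()` is a hypothetical name:

```c
#include <stdint.h>

/* Assemble a canonical 48-bit virtual address from four 9-bit table
 * indexes and a 12-bit page offset. */
static uint64_t build_va(uint64_t pml4, uint64_t pdp, uint64_t pd,
                         uint64_t pt, uint64_t offset)
{
    uint64_t va = ((pml4 & 0x1FF) << 39) |
                  ((pdp  & 0x1FF) << 30) |
                  ((pd   & 0x1FF) << 21) |
                  ((pt   & 0x1FF) << 12) |
                  (offset & 0xFFF);

    /* Sign extend bit 47 into the upper 16 bits to keep it canonical. */
    if (va & (1ULL << 47))
        va |= 0xFFFF000000000000ULL;
    return va;
}
```

`build_va(266, 266, 266, 266, 188 * 8)` gives 0xFFFF8542A150A5E0, the address from the list above.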

If we access this address in our imaginary OS, after adding the self ref entry at index 266, it returns the content of the entry at index 188 of the PML4 table. How? Let's see:
  • first the MMU reads the PML4 entry at index 266. This is the self ref entry, so it points to the PML4 base address, which the MMU now treats as the PDP base address.
  • then the MMU reads entry 266 (the PDP index) of the table at that base address, which again points to the PML4 base. This is treated as the PD base address.
  • again the MMU reads entry 266 (the PD index), which points back to the PML4 base. This is treated as the PT base address.
  • once again entry 266 (the PT index) is read from the PML4 table, and it points to the very same PML4 base. This is treated as the base address of the physical page we are accessing.
  • 188*8 is the offset of the PML4 entry at index 188. We have to multiply by 8 ourselves because the MMU scales by 8 only when dealing with table indexes, not with the last 12 bits (the page offset), which makes sense, otherwise we wouldn't be able to address every byte of the page. This offset is added to the page base address from the previous step, which is the base of the PML4 table.
Eventually the final address the MMU constructs is the physical address of entry 188 in the PML4 table.

Since we can access any entry in any table, besides being able to edit the PML4 table we can also translate any usermode address to its physical address. Let's try to translate 0x482e50, which is the usermode address of a chunk allocated with LocalAlloc().

first the indexes of the Virtual Address:




then we need to build a Virtual Address like this:
  • PML4 index: self ref entry: 0x1ED
  • PDP index: PML4 index of the address we want to translate: 0
  • PD index: PDP index of the address we want to translate: 0
  • PT index: PD index of the address we want to translate: 2
  • offset: PT index of the address we want to translate * 8: 0x82*8 = 0x410



which gives: 0xfffff68000002410. We read what's at this address:


In this case there are some upper flags, 0xA8 (the XD Execute Disable flag and the protection key, to be exact), meaning No Execute is enabled for this page. That is expected: LocalAlloc() is a wrapper around HeapAlloc(), which allocates chunks on the heap, and the heap is not executable due to DEP :]

PD entry (PDE) format in case of 4KiB page, src: Intel developer system programming manual

Anyway, clearing the upper and the lower flags gives 0x101ad000, which is the physical address of the physical page this address translates to.
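
Shifting every index down one level, as done by hand above, collapses into a single shift and mask. Below is a hedged C sketch of that shortcut; `self_ref_pte_va()` and `entry_to_phys()` are illustrative names, and the self ref index is passed in rather than hardcoded:

```c
#include <stdint.h>

#define ENTRY_ADDR_MASK 0x000FFFFFFFFFF000ULL  /* physical address bits of an entry */

/* Virtual address through which the PT entry (PTE) of `va` can be read,
 * given the index of the self ref PML4 entry. */
static uint64_t self_ref_pte_va(unsigned self_ref_index, uint64_t va)
{
    /* The self ref index takes the PML4 slot... */
    uint64_t result = (uint64_t)(self_ref_index & 0x1FF) << 39;

    /* ...and the original PML4/PDP/PD/PT indexes all move down one level.
     * Shifting by 9 and clearing the low 3 bits also turns the PT index
     * into a byte offset (PT index * 8). */
    result |= (va >> 9) & 0x7FFFFFFFF8ULL;

    if (result & (1ULL << 47))                 /* keep it canonical */
        result |= 0xFFFF000000000000ULL;
    return result;
}

/* Clear the upper and lower flags of a PTE and add the page offset of va. */
static uint64_t entry_to_phys(uint64_t pte, uint64_t va)
{
    return (pte & ENTRY_ADDR_MASK) | (va & 0xFFF);
}
```

`self_ref_pte_va(0x1ED, 0x482e50)` returns 0xfffff68000002410, the same address built by hand above.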

now we add the 12 bits offset 0xe50 to it: 0x101ad000 + 0xe50 = 0x101ade50, confirm using !vtop in windbg:


The bug


Now that we understand both the paging mechanism and the self reference entry, we can explain TotalMeltdown (CVE-2018-1038). The self reference entry on Windows 7 is the entry at index 0x1ED. This entry is supposed to be (and was) accessible only to ring0 code (the kernel), but the patch for the Meltdown vulnerability introduced this bug by setting the U (usermode) bit in this entry's flags.


This means that any usermode program can use this entry and access the memory it maps. As we know, it is a self ref entry and points to the base of the PML4 table, so code running at ring3 has access to the PML4 table/page (every page holds one table) and can add, edit or remove entries.

As we are looking for an LPE, we will not DoS the system by corrupting the existing in-use entries. Instead we will add a new entry to the PML4 table with the U (usermode) bit set and map 31GiB of physical memory (you can map as much as you like, but 31GiB is more than enough for my testing VM, which has only 2GiB of RAM). Once we do that, we have arbitrary read/write over 31GiB of physical memory.

The Exploit


The initial exploit I wrote was quite simple: we map 31GiB of memory with the U (usermode) flag set, then we search the mapped memory for the System's EPROCESS and the exploit process's EPROCESS. We do that by looking for identifiers like the process name ("System" for the System's EPROCESS) and the process id (4 for the System's EPROCESS); the exploit's process name and pid can be found at runtime. Once we find both EPROCESS structures, we copy the System's EPROCESS.Token to the exploit process's EPROCESS.Token.

Pretty straightforward, but the memory search takes ages. Also, the 31GiB mapping needs 32 physical pages (1 PDP table and 31 PD tables; no need for PTs because we use 2MiB large pages), and the original exploit just overwrites whatever is in the range 0x10000-0x1F000, which crashes the system most of the time.

physical memory region 0x10000 probably being used

The Low Stub to the rescue!


The Low Stub is an undocumented structure named PROCESSOR_START_BLOCK. It can be found at the start of a page somewhere between 0x1000 and 0x100000 in physical memory (the lowest 1MiB). The Low Stub was covered by Alex Ionescu in his talk "Getting Physical with USB Type-C" at the RECON BRUSSELS 2017 conference; you can find the slides here.

From Alex Ionescu's talk slides:
This structure is used when resuming from ACPI Sleep Vector, as well as when initializing the Application Processors (APs).
this is what the structure looks like:

typedef struct _PROCESSOR_START_BLOCK {
    // The block starts with a jmp instruction to the end of the block
    FAR_JMP_16 Jmp;

    // Completion flag is set to non-zero when the target processor has
    // started
    ULONG CompletionFlag;

    // Pseudo descriptors for GDT and IDT.
    PSEUDO_DESCRIPTOR_32 Gdt32;
    PSEUDO_DESCRIPTOR_32 Idt32;
    
    // The temporary 32-bit GDT itself resides here.
    KGDTENTRY64 Gdt[PSB_GDT32_MAX + 1];
    
    // Physical address of the 64-bit top-level identity-mapped page table.
    ULONG64 TiledCr3;

    // Far jump target from Rm to Pm code
    FAR_TARGET_32 PmTarget;

    // Far jump target from Pm to Lm code
    FAR_TARGET_32 LmIdentityTarget;

    // Address of LmTarget
    PVOID LmTarget;

    // Linear address of this structure
    PPROCESSOR_START_BLOCK SelfMap;

    // Contents of the PAT msr
    ULONG64 MsrPat;

    // Contents of the EFER msr
    ULONG64 MsrEFER;

    // Initial processor state for the processor to be started
    KPROCESSOR_STATE ProcessorState;
} PROCESSOR_START_BLOCK;
the last member is of type _KPROCESSOR_STATE:


this is _KSPECIAL_REGISTERS:

and lastly _CONTEXT, which is documented in the winapi docs and also in windbg, but it's so long that it doesn't fit in a screenshot:


typedef struct _CONTEXT {
  DWORD64 P1Home;
  DWORD64 P2Home;
  DWORD64 P3Home;
  DWORD64 P4Home;
  DWORD64 P5Home;
  DWORD64 P6Home;
  DWORD   ContextFlags;
  DWORD   MxCsr;
  WORD    SegCs;
  WORD    SegDs;
  WORD    SegEs;
  WORD    SegFs;
  WORD    SegGs;
  WORD    SegSs;
  DWORD   EFlags;
  DWORD64 Dr0;
  DWORD64 Dr1;
  DWORD64 Dr2;
  DWORD64 Dr3;
  DWORD64 Dr6;
  DWORD64 Dr7;
  DWORD64 Rax;
  DWORD64 Rcx;
  DWORD64 Rdx;
  DWORD64 Rbx;
  DWORD64 Rsp;
  DWORD64 Rbp;
  DWORD64 Rsi;
  DWORD64 Rdi;
  DWORD64 R8;
  DWORD64 R9;
  DWORD64 R10;
  DWORD64 R11;
  DWORD64 R12;
  DWORD64 R13;
  DWORD64 R14;
  DWORD64 R15;
  DWORD64 Rip;
  union {
    XMM_SAVE_AREA32 FltSave;
    NEON128         Q[16];
    ULONGLONG       D[32];
    struct {
      M128A Header[2];
      M128A Legacy[8];
      M128A Xmm0;
      M128A Xmm1;
      M128A Xmm2;
      M128A Xmm3;
      M128A Xmm4;
      M128A Xmm5;
      M128A Xmm6;
      M128A Xmm7;
      M128A Xmm8;
      M128A Xmm9;
      M128A Xmm10;
      M128A Xmm11;
      M128A Xmm12;
      M128A Xmm13;
      M128A Xmm14;
      M128A Xmm15;
    } DUMMYSTRUCTNAME;
    DWORD           S[32];
  } DUMMYUNIONNAME;
  M128A   VectorRegister[26];
  DWORD64 VectorControl;
  DWORD64 DebugControl;
  DWORD64 LastBranchToRip;
  DWORD64 LastBranchFromRip;
  DWORD64 LastExceptionToRip;
  DWORD64 LastExceptionFromRip;
} CONTEXT, *PCONTEXT;

As outlined in the slides of the talk, the Rip member of the CONTEXT struct points to the kernel entry (nt!KiSystemStartup), and CR3 in _KSPECIAL_REGISTERS, as you guessed, holds the base address of the kernel's PML4 table.

We need these two to write the exploit, because the self ref trick can't translate kernel mode addresses, and even if it could, our PML4 table contains only a couple of kernel mode mappings. Ironically, this is due to the Meltdown vulnerability patch, aka KVAS, which separates the Kernel and Usermode PML4 tables (before KVAS every process had the kernel VAS mapped in the upper half of its PML4 table). So yeah, we need the Kernel's CR3 anyway.
We will use the Kernel's CR3 to translate kernel addresses; the kernel entry is used to get the System's EPROCESS, as explained later on.

to find the low stub structure in windbg we use this hacky script:


.for (r $t0 = 0x1000; $t0 < 0x100000; r $t0 = $t0 + 0x1000) {r $t1=($pqwo(@$t0) & 0xffffffffffff00ff);.if (@$t1 == 0x00000001000600E9) {.printf "possibly found @ %p = %p\n", @$t0, @$t1;r $t1 = ($pqwo(@$t0 + 0x268) & 0xfffff80000000000);.if (@$t1 == 0xfffff80000000000) {.echo "1st check succeed";r $t3 = ($pqwo(@$t0 + 0x268));r $t2 = ($pqwo(@$t0+0xa0));.if(@$t2 == @cr3) {.echo "2nd check succeed";.printf "Kernel Entry: %p\n", @$t3;.printf "PML4: %p\n", @$t2;.break;}}}}

here is the output:


the extracted PML4 base 0x187000 is the same as CR3 register:


and for the extracted kernel entry address 0xfffff800028f7360:


well that's nt!KiSystemStartup which is the entry point of ntoskrnl.exe:


If you are wondering how the windbg script finds the Low Stub: it looks at the first 8 bytes of every page for an encoded relative jmp instruction (opcode: 0xe9), which is the first member of the PROCESSOR_START_BLOCK structure:

typedef struct _PROCESSOR_START_BLOCK {
    // The block starts with a jmp instruction to the end of the block
    FAR_JMP_16 Jmp;
    ...
} PROCESSOR_START_BLOCK;
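
The same search can be sketched in C over a buffer holding (or mapped onto) the lowest 1MiB of physical memory. This is not the exploit's code, just the windbg script's checks rewritten under a few assumptions: the signature, the mask and the offsets 0x268 (kernel entry) and 0xa0 (CR3) are taken from the script above and may differ on other builds; a little-endian host is assumed; and since usermode cannot compare against the real CR3 like the script does, that check is replaced with a plain sanity check (non-zero and page aligned). `find_low_stub()` is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SCAN_START        0x1000u
#define SCAN_END          0x100000u
#define SCAN_STEP         0x1000u
#define STUB_SIG_MASK     0xffffffffffff00ffULL  /* mask from the windbg script */
#define STUB_SIGNATURE    0x00000001000600E9ULL  /* jmp + following bytes       */
#define STUB_ENTRY_OFFSET 0x268u                 /* kernel entry, per script    */
#define STUB_CR3_OFFSET   0xa0u                  /* kernel CR3, per script      */
#define KERNEL_VA_MASK    0xfffff80000000000ULL

/* Scan page starts in [0x1000, 0x100000) for something that looks like the
 * Low Stub. Returns 0 and fills the outputs on success, -1 otherwise. */
static int find_low_stub(const uint8_t *low_mem, size_t size,
                         uint64_t *cr3_out, uint64_t *entry_out)
{
    size_t page;

    for (page = SCAN_START; page < SCAN_END && page + SCAN_STEP <= size;
         page += SCAN_STEP) {
        uint64_t first, entry, cr3;

        memcpy(&first, low_mem + page, sizeof first);
        if ((first & STUB_SIG_MASK) != STUB_SIGNATURE)
            continue;

        memcpy(&entry, low_mem + page + STUB_ENTRY_OFFSET, sizeof entry);
        if ((entry & KERNEL_VA_MASK) != KERNEL_VA_MASK)
            continue;   /* does not look like a kernel address */

        memcpy(&cr3, low_mem + page + STUB_CR3_OFFSET, sizeof cr3);
        if (cr3 == 0 || (cr3 & 0xFFF) != 0)
            continue;   /* assumption: a usable CR3 is page aligned */

        *cr3_out = cr3;
        *entry_out = entry;
        return 0;
    }
    return -1;
}
```

If nothing matches, the caller can fall back to a raw memory search.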

The low stub structure is at 0x7000; we can check the disassembly of its first QWORD in windbg (the up command disassembles a physical address):


Now that we have both the Kernel's PML4 base address (CR3) and the exploit process's PML4 base address (using the self ref entry), we can translate any Virtual Address to its physical address. It requires a good understanding of how paging works, but we can do it.

the fine exploit


Enough with the introductions. I believe we can now discuss what the exploit looks like and how it takes advantage of the Low Stub structure. This is how it is done:
  • add a new entry to the PML4 table of the exploit process.
  • LocalAlloc() the required 32 pages for mapping 31GiB of memory (1 PDP + 31 PD).
  • create a MMU() function which can translate any virtual address to physical address.
  • use the MMU() function to translate the virtual address of every page we got from LocalAlloc() to a physical address.
  • map 31GiB of physical memory using the pages allocated with LocalAlloc() with flag U (usermode) accessible.
  • find the Low Stub, save the CR3 of the Kernel and the Kernel Entry
  • fallback to raw memory search if Low Stub is not found
  • the MMU() function should now be able to translate privileged (ring0) Virtual Addresses, since we have the Kernel's CR3 (the PML4 base) and 31GiB of physical memory mapped.
  • find the Kernel Entry (nt!KiSystemStartup) in physical memory.
  • load ntoskrnl.exe in user mode, PE parse it, get the offset of the entry and the offset of nt!PsInitialSystemProcess.
  • calculate the address of ntoskrnl.exe base (nt) in physical memory.
  • read nt!PsInitialSystemProcess.
  • traverse the EPROCESS doubly linked list using EPROCESS.ActiveProcessLinks.
  • find the EPROCESS of the parent process.
  • duplicate the System's token.

setup_paging()


This is the function which maps 31GiB of physical memory. How it is done is easy: once we have translated the virtual addresses of the 32 pages allocated in usermode, we add a new entry to the PML4 table which points to the first of those pages. Next we identity map the physical memory starting from address 0 using 1 PML4 entry (pointing to the first page from LocalAlloc()), 1 PDP table (the first page from LocalAlloc()), and 31 PD tables (the remaining 31 pages from LocalAlloc()), which covers 31 * 512 * 0x200000 (2MiB) = 31GiB. We use large pages (2MiB per page instead of 4KiB) because it is easier and requires fewer physical pages to set up the tables, i.e. we don't need PT tables/pages.

relevant flags are in green: K stands for Supervisor/Kernel mode, U for usermode, L for large page

This diagram shows what the tables look like after running the exploit. As you can see, the self ref entry at 0x1ED points to the base of the PML4 table, but it has the U (usermode) flag, where it should have the K flag like all the other entries in the upper half (the kernel half). In this example we injected an entry at index 427, i.e. at PML4 base + 427*8, which also has the U (usermode) flag set (otherwise we wouldn't be able to access the memory we mapped from ring3).

If you are wondering how we find a free PML4 entry, it's easy: we just read the content of the entry, and if it is NULL then it is not used; otherwise we skip it and check the next entry.
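
As a sketch, that scan is a few lines of C. `find_free_pml4_slot()` is a hypothetical name, and which index range to scan is left to the caller since the post does not specify it:

```c
#include <stdint.h>

/* Return the first index in [first, last] whose PML4 entry is zero
 * (i.e. unused), or -1 if every entry in the range is taken. */
static int find_free_pml4_slot(const volatile uint64_t *pml4,
                               int first, int last)
{
    int i;

    for (i = first; i <= last && i < 512; i++)
        if (pml4[i] == 0)
            return i;
    return -1;
}
```

In the exploit, `pml4` would be the virtual address at which the self ref entry exposes the PML4 table itself; with self ref index 0x1ED in all four index slots that works out to 0xFFFFF6FB7DBED000.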

The PDP table is one of the usermode pages we got from LocalAlloc() and whose address we translated using MMU(). In this table we add 31 entries, each pointing to a single PD table, meaning 31 PD tables. These 31 PD tables also come from the usermode pages we got from LocalAlloc() and translated using MMU() (hence the 32 pages allocated via LocalAlloc()). Each entry of these PD tables points to a 2MiB page in physical memory, starting from physical address 0.

We use 2MiB large pages (by setting the large page flag in the PD entry), so there is no need for PT tables. Normally every single PT table maps 2MiB of physical memory (512 entries in a PT table * 4KiB (0x1000 bytes) per entry = 0x200000 bytes (2MiB)); now that we use large pages, each PD entry maps those 2MiB directly, so we don't need the PT tables and thus require fewer physical pages.
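
To make the table layout concrete, here is a minimal C sketch that fills one PDP table and a set of PD tables the way this section describes. It is an illustration, not the exploit's setup_paging(): the function names are mine, the flag values are the standard x86-64 bit positions (Present, Read/Write, User, Page Size), and the physical addresses of the PD pages are assumed to have been obtained already (in the exploit, via MMU()).

```c
#include <stdint.h>

#define PG_PRESENT  0x001ULL
#define PG_WRITABLE 0x002ULL
#define PG_USER     0x004ULL   /* the U (usermode) flag         */
#define PG_LARGE    0x080ULL   /* PS bit: 2MiB page in a PDE    */

#define ENTRIES_PER_TABLE 512
#define LARGE_PAGE_SIZE   0x200000ULL

/* Fill `pd_count` PD tables with 2MiB large-page entries covering physical
 * memory from address 0, and point the first `pd_count` PDP entries at them.
 * `pdp` and `pds` are the (zeroed) usermode pages; `pd_phys[t]` is the
 * physical address of pds[t]. */
static void build_large_page_map(uint64_t *pdp,
                                 uint64_t pds[][ENTRIES_PER_TABLE],
                                 const uint64_t *pd_phys,
                                 unsigned pd_count)
{
    unsigned t, e;

    for (t = 0; t < pd_count; t++) {
        pdp[t] = pd_phys[t] | PG_PRESENT | PG_WRITABLE | PG_USER;

        for (e = 0; e < ENTRIES_PER_TABLE; e++) {
            uint64_t phys = ((uint64_t)t * ENTRIES_PER_TABLE + e)
                            * LARGE_PAGE_SIZE;
            pds[t][e] = phys | PG_PRESENT | PG_WRITABLE | PG_USER | PG_LARGE;
        }
    }
}

/* Value to write into the free PML4 slot so that it points at the PDP page. */
static uint64_t make_pml4_entry(uint64_t pdp_phys)
{
    return pdp_phys | PG_PRESENT | PG_WRITABLE | PG_USER;
}
```

With `pd_count` set to 31, the last PD entry maps the 2MiB page starting at 0x7BFE00000, i.e. the mapping ends right at the 31GiB mark.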

If the CPU supports 1GiB pages and we were to use them (by setting the large page flag in the PDP entry), we would need neither the PD tables nor the PT tables, again because normally every PD table (and the 512 PT tables under it) maps 1GiB of physical memory (512 entries in a PD table * 1 PT table per entry = 512 * 2MiB per PT table = 512 * 0x200000 bytes = 0x40000000 bytes (1GiB)).

Who knows, at some point in the future there may be 512GiB large pages, and then there would be no need for PDP, PD and PT tables, because normally 1 PDP table (and the 512 PD tables and 512*512 PT tables under it) maps 512GiB of physical memory.

MMU()


This is probably the most important function in the exploit: it translates both usermode and kernel mode Virtual Addresses to physical addresses. The usermode translation is done via the self ref entry; the kernel mode translation is only possible once we have both the Kernel's CR3 and arbitrary read/write of physical memory, which we do. It also handles large pages (in which case the PT is not used).

It works just like explained in the previous sections, we have access to the physical memory and we know the base address of the PML4 table, so we can just read and clear the flags, use the next index, read then clear the flags, eventually we will get the physical address.

In case of usermode addresses the things are easier we can just use the self ref entry to access any entry in the paging tables as long as the address is valid and usermode accessible we can translate it using only the self ref entry.

There is an exception, however, which kept crashing the exploit until I found out that I was trying to translate an address backed by a large page.
When large paging (usually 2MiB pages) is used, the large page flag is set in the flags of the PD entry and the PT is not used. In this case the translation goes only through PML4 -> PDP -> PD, then both the PT index and the offset are applied directly to the page base from the PD entry (the PT index is left shifted by 12).

0xfffffa80044a9060 is an example of an address mapped to a large page:


The PD entry (PDE) flags show L, which means a 2MiB large page is being used. To get the physical address we left shift the PT index 0xa9 by 12 and OR it with the page base from the PD entry, 0x7D200000, which gives:


then we add the 12 bits offset 0x60:



our math says the physical address is: 0x7d2a9060, we can verify it with !vtop command:


the function does the same steps we did using windbg but in C.
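
The large page case in particular is easy to get wrong, so here is a small sketch of just that branch. It is not the exploit's MMU() function; `pde_is_large()` and `large_page_phys()` are illustrative names. Taking the low 21 bits of the virtual address is the same as shifting the PT index left by 12 and adding the 12-bit offset.

```c
#include <stdint.h>

#define PDE_LARGE_FLAG    0x80ULL                 /* PS bit in a PD entry        */
#define PDE_2M_ADDR_MASK  0x000FFFFFFFE00000ULL   /* 2MiB-aligned page base      */
#define OFFSET_2M_MASK    0x1FFFFFULL             /* PT index + 12-bit offset    */

/* Does this PD entry map a 2MiB page directly (no PT underneath)? */
static int pde_is_large(uint64_t pde)
{
    return (pde & PDE_LARGE_FLAG) != 0;
}

/* Physical address for `va` when its PD entry maps a 2MiB large page. */
static uint64_t large_page_phys(uint64_t pde, uint64_t va)
{
    return (pde & PDE_2M_ADDR_MASK) | (va & OFFSET_2M_MASK);
}
```

For the example above, a PD entry with page base 0x7D200000 and the address 0xfffffa80044a9060 give 0x7d2a9060.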

nt!PsInitialSystemProcess


nt!PsInitialSystemProcess is a global variable in the kernel (ntoskrnl.exe) which is exported, and points to the system's EPROCESS.



We can get its offset by first loading ntoskrnl.exe with LoadLibraryExW(), then calling GetProcAddress(), and then calculating the offset within ntoskrnl.exe (GetProcAddress((HMODULE)nt, "PsInitialSystemProcess") - LoadLibraryExW(L"ntoskrnl.exe", NULL, DONT_RESOLVE_DLL_REFERENCES)).

nt!KiSystemStartup, on the other hand (we have both its kernel space address and, using MMU(), its physical address), is not exported. Luckily for us, it is the entry point of the ntoskrnl.exe image, which means its offset is in the optional header of the PE file, in AddressOfEntryPoint.

We can verify this using windbg and PE-bear:


windbg says the offset of nt!KiSystemStartup from the base of the nt image is 0x2a9360. Now let's look at PE-bear:



PE-bear shows the same offset: 0x2a9360.

So we can just PE parse the loaded ntoskrnl.exe binary to get the offset and subtract it from the kernel entry to get the nt base address, then add the nt!PsInitialSystemProcess offset and read the pointer stored there to get the System's EPROCESS.
What's left is easy: find the parent's EPROCESS using the ActiveProcessLinks list, then patch its token.
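
The address arithmetic from this section, as a tiny C sketch. The function names are hypothetical; the two offsets are expected to come from the usermode copy of ntoskrnl.exe as described above (AddressOfEntryPoint for the entry, and the GetProcAddress() difference for the export):

```c
#include <stdint.h>

/* Kernel base = address of the entry point minus the entry point's
 * offset inside the image (AddressOfEntryPoint). */
static uint64_t nt_base_from_entry(uint64_t kernel_entry_va,
                                   uint32_t entry_offset)
{
    return kernel_entry_va - entry_offset;
}

/* Kernel virtual address of an exported symbol, given its offset inside
 * the image. For PsInitialSystemProcess this is the address of a pointer,
 * which still has to be read to get the System's EPROCESS. */
static uint64_t nt_symbol_va(uint64_t nt_base, uint32_t symbol_offset)
{
    return nt_base + symbol_offset;
}
```

With the values from my VM, `nt_base_from_entry(0xfffff800028f7360, 0x2a9360)` gives 0xfffff8000264e000.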

Notes:

  • in this post I interpreted the PTEs as described by Intel in the SDM, but in the Windows world there is actually a special struct representing this entry, named nt!MMPTE_HARDWARE, and of course these two representations do not conflict.

PoC


the exploit is available here.
