On February 9, zero-day exploit code [1] was posted on the milw0rm site. It exploits
a vulnerability in Linux kernel versions 2.6.17 through 2.6.24.1 that allows
an unprivileged local user to gain root privileges. The vulnerability was
assigned CVE-2008-0600.
There are reports that this exploit is reliable and actively used in the wild.
Its inner workings are quite interesting from a
technical point of view; let's have a look.
The vulnerability lies in the get_iovec_page_array function
(in fs/splice.c; line numbers are from the 2.6.23.1-42.fc8 kernel),
which is reachable from the vmsplice() system call:
1286:   if (unlikely(!len))   // "len" variable is under user's control
1287:           break;
...
1296:   off = (unsigned long) base & ~PAGE_MASK;
...
1306:   npages = (off + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
1307:   if (npages > PIPE_BUFFERS - buffers)
1308:           npages = PIPE_BUFFERS - buffers;
1309:
1310:   error = get_user_pages(current, current->mm,
1311:                          (unsigned long) base, npages, 0, 0,
1312:                          &pages[buffers], NULL);
The get_user_pages function expects its fourth argument (the
number of page descriptors to fill, which limits the return value) to be at
least 1. The preceding code assumes that npages is at least 1: len must be nonzero, so the expression off + len + PAGE_SIZE - 1 should be greater than or equal to PAGE_SIZE. However, if len is close to UINT32_MAX, the computation off + len + PAGE_SIZE - 1 wraps around, and npages can be zero.
As a result, get_user_pages may return more than
PIPE_BUFFERS entries, and the pages array will
overflow. However, the attacker does not control the overflow payload,
so it would be difficult to turn this overflow alone into reliable code execution.
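The wrap can be checked with ordinary arithmetic. The following is a minimal user-space sketch, not kernel code: it assumes 4 KB pages and models the 32-bit unsigned long and size_t types with uint32_t, and the helper name npages_for is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1u))

/* Hypothetical helper: repeats the npages computation with explicit
 * 32-bit unsigned arithmetic, which wraps modulo 2^32. */
static uint32_t npages_for(uint32_t base, uint32_t len)
{
    uint32_t off;

    assert(len != 0);          /* the kernel code bails out when len is zero */
    off = base & ~PAGE_MASK;   /* offset of base within its page */
    return (off + len + PAGE_SIZE - 1u) >> PAGE_SHIFT;
}
```

For example, npages_for(0, 1) evaluates to 1, while npages_for(0, 0xFFFFF001) evaluates to 0, because 0xFFFFF001 + 0xFFF wraps around to 0.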
The reliable exploitation happens thanks to the subsequent loop:
1320:   for (i = 0; i < error; i++) {
1321:           const int plen = min_t(size_t, len, PAGE_SIZE - off);
1322:
1323:           partial[buffers].offset = off;
1324:           partial[buffers].len = plen;
1325:
1326:           off = 0;
1327:           len -= plen;
1328:           buffers++;
1329:   }
Here, the partial array, which is also PIPE_BUFFERS
elements long, is overflowed with (off=0, plen=0x1000) pairs. Now, depending on the variable
layout chosen by the compiler, various data structures that follow the partial array can be overwritten with zero. In the most common case, the pages array is located after the partial array. The pages array contains pointers,
so after the preceding loop it will contain NULL pointers.
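The effect of the loop on adjacent memory can be simulated in user space. The sketch below is only a model under stated assumptions: a 32-bit layout in which partial is immediately followed by pages, partial entries that consist of just the two 32-bit fields offset and len, and 32-bit pointers. The names frame_model, init_frame, fill_partial and pages_entry are invented for illustration, and a flat word array stands in for the stack so that the model itself never writes out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIPE_BUFFERS   16
#define PAGE_SIZE      4096u
#define PAGES_START    (PIPE_BUFFERS * 2)             /* first word of pages[] */
#define FRAME_WORDS    (PAGES_START + PIPE_BUFFERS)
#define PAGES_SENTINEL 0xC1000000u                    /* arbitrary "valid pointer" marker */

/* Assumed layout: 16 (offset, len) pairs, then 16 pointer-sized words. */
struct frame_model {
    uint32_t words[FRAME_WORDS];
};

static void init_frame(struct frame_model *f)
{
    for (size_t i = 0; i < FRAME_WORDS; i++)
        f->words[i] = (i < PAGES_START) ? 0u : PAGES_SENTINEL;
}

/* Same loop body as the kernel excerpt; "error" plays the role of the
 * value returned by get_user_pages. */
static void fill_partial(struct frame_model *f, uint32_t off, uint32_t len,
                         int error)
{
    int buffers = 0;

    /* The model refuses to run off its own array; the kernel had no such check. */
    assert(error >= 0 && (size_t)error * 2 <= FRAME_WORDS);

    for (int i = 0; i < error; i++) {
        uint32_t room = PAGE_SIZE - off;
        uint32_t plen = (len < room) ? len : room;

        f->words[buffers * 2]     = off;    /* partial[buffers].offset */
        f->words[buffers * 2 + 1] = plen;   /* partial[buffers].len */

        off = 0;
        len -= plen;
        buffers++;
    }
}

static uint32_t pages_entry(const struct frame_model *f, int i)
{
    return f->words[PAGES_START + i];
}
```

With error equal to PIPE_BUFFERS the pages words keep their sentinel value. With a larger value such as 24, the extra pairs land in the pages words, which under these assumptions then alternate between 0 and 0x1000, so the first entry becomes a NULL pointer.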
Normally, when the kernel tries to access a NULL pointer, it will result in an
exception and the process will be terminated. However, the attacker can map
memory pages at address zero, and store arbitrary data there. In such a scenario,
when the kernel dereferences pointers from the pages array,
attacker-controlled data will be processed, which may result in arbitrary
code execution in the kernel context. In our case, a convenient technique is
to make an entry in the pages array look like a compound page
descriptor, which results in a function call to an attacker-controlled
address in user space:
static void put_compound_page(struct page *page)    /* attacker controls arg */
{
        page = (struct page *)page_private(page);
        if (put_page_testzero(page)) {
                void (*dtor)(struct page *page);

                dtor = (void (*)(struct page *))page[1].lru.next;
                (*dtor)(page);    /* so attacker controls the target of the call */
        }
}
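The control flow above can be modeled with ordinary user-space structures. This is a simplified sketch, not the kernel's definitions: page_model keeps only the fields the excerpt touches, a plain counter decrement stands in for put_page_testzero, and the names page_model, craft_descriptor, recording_dtor and put_compound_page_model are hypothetical. It shows only that whoever fills in the descriptor also chooses the function that gets called.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct list_head_model { void *next, *prev; };

/* Simplified stand-in for struct page (assumption: only these fields matter). */
struct page_model {
    unsigned long flags;
    int count;
    uintptr_t private_data;          /* plays the role of page_private(page) */
    struct list_head_model lru;
};

typedef void (*dtor_fn)(struct page_model *);

/* The model stores a function pointer in a void * slot, as the kernel does. */
_Static_assert(sizeof(dtor_fn) == sizeof(void *), "pointer sizes must match");

static int dtor_calls;

/* Harmless target that only records that it was reached. */
static void recording_dtor(struct page_model *page)
{
    (void)page;
    dtor_calls++;
}

/* Same shape as the kernel excerpt: follow private, drop a reference,
 * and call whatever lru.next of the second descriptor holds. */
static void put_compound_page_model(struct page_model *page)
{
    dtor_fn dtor;

    assert(page != NULL);
    page = (struct page_model *)page->private_data;
    if (--page->count == 0) {
        memcpy(&dtor, &page[1].lru.next, sizeof dtor);
        dtor(page);
    }
}

/* Fills two adjacent descriptors the way the text describes: the first
 * points to itself with a count of 1, the second carries the call target. */
static void craft_descriptor(struct page_model pair[2], dtor_fn target)
{
    memset(pair, 0, 2 * sizeof pair[0]);
    pair[0].private_data = (uintptr_t)&pair[0];
    pair[0].count = 1;
    memcpy(&pair[1].lru.next, &target, sizeof target);
}
```

Calling craft_descriptor(pair, recording_dtor) on a two-element array and then put_compound_page_model(&pair[0]) ends up in recording_dtor; in the real attack the descriptor lives in memory the attacker mapped at address zero.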
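To sum up, the exploitation involves mapping attacker-controlled data at address zero, calling vmsplice() with a len that makes npages wrap to zero so that the partial and pages arrays overflow, and letting the kernel treat the overwritten pages entries as compound page descriptors whose destructor pointer the attacker supplies.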
Upgrading the kernel is the preferred solution, but if that is not feasible, there
are workarounds.
A simple kernel module, which disables the sys_vmsplice system
call, has been posted [2].
The exploit we've discussed relies heavily on the ability to map memory at
address zero. Starting with kernel 2.6.23, there is a mechanism to forbid such
mappings via procfs. The echo 65536 > /proc/sys/vm/mmap_min_addr
command sets the lowest address that can be mapped to 64K. Note that:
[1] Linux vmsplice Local Root Exploit, by qaaz
[2] Runtime disable of sys_vmsplice
[3] it does work on 64 bit kernels
plen is PAGE_SIZE not 0, that’s why the exploit as it is doesn’t work on 64 bit kernels nor would mmap_min_addr be able to prevent exploitation there (it’s not a NULL deref there).
The protection should also work without SELinux compiled in, as the dummy hook checks mmap_min_addr, right ?
excellent ! would like to see more linux exploit analysis.