A serious RTF zero-day attack has struck recently. McAfee detection solutions were provided a couple of days ago that allowed us to spot in-the-wild attacks. We detected this exploit on Wednesday. McAfee Labs researchers have been actively working on this threat. In this post, we share our perspective on how the exploit works at a deep technical level, specifically how the attacker gains control of the extended instruction pointer (EIP).
From our analysis, we believe the root cause of the vulnerability is related to the RTF “overridetable” control word (also called a structure) or the structures nested inside it. An “overridetable” structure may include the “listoverride,” “listoverridecount,” and “lfolevel” fields. The “listoverridecount” field specifies how many instances of “lfolevel” the structure may contain. According to Microsoft’s official specification, the legal value should be 0, 1, or 9. However, in this exploit, the value is 25.
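The out-of-range count described above suggests a simple triage check. The following is a minimal sketch, not McAfee's detection logic: it scans RTF text for “\listoverridecount” and reports any value outside 0, 1, or 9. The function name and the sample string are hypothetical, and the sample is not the exploit's proof of concept. A plain pattern scan like this assumes the control word appears unobfuscated in the file.

```python
import re

# Values the post says the specification allows for \listoverridecount.
LEGAL_OVERRIDE_COUNTS = {0, 1, 9}

# An RTF control word is a backslash, letters, then an optional numeric parameter.
_COUNT_RE = re.compile(r"\\listoverridecount(-?\d+)")


def suspicious_override_counts(rtf_text):
    """Return every \\listoverridecount value that falls outside the legal set."""
    values = [int(m.group(1)) for m in _COUNT_RE.finditer(rtf_text)]
    return [v for v in values if v not in LEGAL_OVERRIDE_COUNTS]


# Hypothetical fragment for illustration only (not taken from the exploit).
sample = r"{\listoverride\listid1\listoverridecount25\ls1}"
print(suspicious_override_counts(sample))
```

A non-empty result only flags a file for closer inspection; it does not by itself prove the file is malicious.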
The in-the-wild exploit is a bit complex. However, we can simplify it into this one-line proof of concept:
During our tests, when the value of “listoverridecount” is set to 25, starting from the 29th value the “lfolevel” structure is handled incorrectly by Microsoft Word. Specifically, an object-confusion fault occurs: an object of one class is wrongly handled as an object of another class. Because the attacker can control every byte of the confused object via various control words, the attacker can control the program flow (EIP) accurately.
The attacker directs EIP to an address in MSCOMCTL.DLL. Because the DLL doesn’t have address space layout randomization (ASLR) enabled (for Office 2010 or earlier versions), the attacker can make the exploit work for Office on newer operating systems such as Windows 7. The first controlled EIP is a fixed address, 0x275A48E8. Let’s see what it looks like:
Control reaches this first address from a location in wwlib.dll (shown in the following image) via a “call [ecx+4]” instruction.
As we can see, at this point the object (pointed to by ecx, at 0x07941060 in this test) is being used incorrectly. The memory bytes of the object are always the following (0x18 bytes shown):
07941060 7B 7B 00 00 E8 48 5A 27 89 64 59 27 EF B8 58 27
07941070 59 59 00 00 5A 5A 00 00
Note the second DWORD, 0x275A48E8; this is the EIP value that is controlled. The other bytes are also important for making sure that all the steps after the first EIP control, such as ROP and shellcode execution, work correctly. So the question is: where do the memory bytes for this incorrectly used object come from? Are they filled in by some kind of heap spraying or something else? Deeper research showed that all of the bytes actually come directly from the RTF file; in other words, all the bytes can be controlled.
The answer lies in the fields (and their values) in the “listoverrideformat” structure inside the incorrectly handled “lfolevel” structure. The following image shows exactly how the fields’ values are transferred into the 0x18 bytes of memory:
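To make the reading of the dump concrete, here is a small sketch that decodes the 0x18 bytes listed above as little-endian DWORDs and picks out the value that “call [ecx+4]” would fetch. The variable names are ours; the bytes are copied from the dump.

```python
import struct

# The 0x18 bytes of the object at 0x07941060, copied from the dump above.
dump = bytes.fromhex(
    "7B 7B 00 00 E8 48 5A 27 89 64 59 27 EF B8 58 27"
    "59 59 00 00 5A 5A 00 00"
)
assert len(dump) == 0x18

# x86 is little-endian, so read the object as six 32-bit little-endian values.
dwords = struct.unpack("<6I", dump)

# "call [ecx+4]" loads its target from offset 4, which is the second DWORD.
first_eip = dwords[1]
print(hex(first_eip))  # 0x275a48e8
```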
Here are the highlights:
- Bytes 0-3 (first DWORD) are controlled via the “\levelstartat” control word
- Byte 4 is controlled via the “\levelnfcn” control word
- Controlling byte 5 is a little tricky, but it’s important because this byte is part of the first controlled EIP value. This step is not easy due to the nature of the confused object. The attacker apparently realized that the fourth and seventh bits of the byte (counting from the low bit) can be controlled via the “\levelnorestart” and “\levelold” control words, respectively. When these two bits are set, the byte becomes 0x48 (binary 0100 1000), which is part of the DWORD 0x275A48E8. This is enough to transfer the program flow to a useful starting address in MSCOMCTL.DLL.
- Bytes 6-14 can be controlled via the “\levelnumbers” control word. In this example, the attacker uses a “\’” control word to input a hex byte 0x5A (\’5A). The program reads the other bytes (7-14) directly from the following bytes in the RTF file.
- Byte 15 is controlled via the “\levelfollow” control word
- The DWORDs at bytes 16-19 and at bytes 20-23 are controlled via the “\levelspace” and the “\levelindent” control words, respectively. (The linking relationship isn’t shown in the preceding figure.)
Because the object memory can be controlled accurately by the attacker via specific RTF control words, the attacker can make a highly reliable and accurate exploitation, even without a heap spray.
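The field-to-byte mapping in the list above can be sketched as a small packing routine. This is an illustration under stated assumptions, not Word's parsing code: the function and parameter names are hypothetical, and the example values are derived backward from the memory dump rather than read from the exploit file. The fourth and seventh bits (counting from the low bit) correspond to zero-based bit positions 3 and 6.

```python
import struct


def build_confused_object(levelstartat, levelnfcn, levelnorestart, levelold,
                          levelnumbers, levelfollow, levelspace, levelindent):
    """Lay out the 0x18-byte object following the byte mapping in the list above."""
    if len(levelnumbers) != 9:
        raise ValueError("bytes 6-14 need exactly 9 bytes")
    # Byte 5: \levelnorestart sets bit 3 and \levelold sets bit 6 (zero-based).
    byte5 = (int(bool(levelnorestart)) << 3) | (int(bool(levelold)) << 6)
    return (
        struct.pack("<I", levelstartat)           # bytes 0-3
        + bytes([levelnfcn, byte5])               # bytes 4 and 5
        + levelnumbers                            # bytes 6-14
        + bytes([levelfollow])                    # byte 15
        + struct.pack("<II", levelspace, levelindent)  # bytes 16-19 and 20-23
    )


# Example values inferred from the dump shown earlier (an assumption).
obj = build_confused_object(
    levelstartat=0x7B7B,
    levelnfcn=0xE8,
    levelnorestart=1,
    levelold=1,
    levelnumbers=bytes.fromhex("5A 27 89 64 59 27 EF B8 58"),
    levelfollow=0x27,
    levelspace=0x5959,
    levelindent=0x5A5A,
)
print(obj.hex(" "))
print(hex(struct.unpack_from("<I", obj, 4)[0]))  # DWORD at offset 4
```

With both flag bits set, byte 5 comes out as 0x48, and the DWORD at offset 4 matches the first controlled EIP, 0x275A48E8.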
Overview of follow-up executions
(provided by McAfee Labs researcher Jun Xie)
The ROP chain (in MSCOMCTL.DLL) allocates a memory block marked as READ/WRITE/EXECUTE at address 0x40000000 and copies the first-stage shellcode to this address. After that, a specific ROP gadget, usually known as a stack pivot, runs and the program flow goes to 0x40000040.
In the first-stage shellcode, the exploit performs a brute-force search for the file handle of the RTF file so that it can map the file into memory. It then searches for the second-stage shellcode and copies it to address 0x40002000. The second-stage shellcode reads the Microsoft patch-log file on the system. If it finds that the last patch time is after April 8, execution terminates. Otherwise, it decrypts and drops malware named svchost.exe (to confuse the victim). The malware makes some other confusing moves; for example, at the end it decrypts and shows the victim a harmless Word document (which includes some porn images).
From what we’ve learned, we can see how sophisticated this exploit is and how deeply the attackers understand RTF. Apparently the bad guys understand the related control words and their memory representations at a really deep level.
Considering these elements, we see this zero day as a serious threat and suggest that everyone take the following action(s) as soon as possible:
- For McAfee customers, apply our detection solutions, found here
- Apply the “Fix It” tool or install EMET, as suggested by Microsoft
- Apply the patch as soon as it is available; according to the Microsoft Security Response Center blog post, it will be released next Tuesday
Thanks to Bing Sun, Xiaoning Li (Intel Labs), and Chong Xu for their help with this analysis.