My name is Bow Sineath and I have recently joined the SecureWorks Counter Threat Unit (CTU) as a security researcher. During my previous employment, I managed an IDS/IPS signature set and was responsible for acting on vulnerability intelligence that was, more often than not, very limited in public details. My experience in reverse engineering, source code analysis and countermeasure development is assisting SecureWorks in developing countermeasures that accurately protect our clients.
For my initial blog post as a CTU researcher, I am going to detail a vulnerability from Microsoft's June 2008 patch cycle and correct an error I have seen in a number of the publicly available countermeasures. The vulnerability exists in the way that Microsoft DirectX 7 and 8 parse Synchronized Accessible Media Interchange (SAMI) files. Because DirectX 7 and 8 are only distributed with Windows 2000, all other modern variants of Windows are not vulnerable to this specific issue, which limits the scope of this vulnerability significantly. The Microsoft advisory identifier for this vulnerability is MS08-033.
The SAMI file format is a captioning technology developed by Microsoft and detailed here (http://msdn.microsoft.com/en-us/library/ms971327.aspx). The format uses tags and markup similar to HTML and CSS to describe and display captions. SAMI files can be identified both by their contents, which are plain text, and, in most cases, by the .smi file extension. When SAMI is used, the captions exist as a separate entity from the media, which means there will be two files: the media itself and the SAMI file providing the captions. Further details on the file format are available from Microsoft.
The vulnerability we will be analyzing exists in the way that DirectX 7 and 8 prior to MS08-033 handle class declarations in SAMI files. The SAMI file format requires that two headers exist, the SAMIParam block and the Style block, both of which provide metadata for the SAMI file and the captions. Within the Style block, classes can be declared and defined so that each caption can be associated with a specific language. A class can have a name, a language and formatting information associated with it. A class is declared with a period (.) followed by a sequence of alphanumeric characters, which is the name used to associate a caption with the class. Following the class name, an open brace ({) designates the start of the class definition. Within the definition lies the meat of the class, including a class name to be presented to the user and a language selection, which by definition should follow the ISO639-ISO3166 naming convention. In addition, the author can specify formatting options within the class definition. A close brace (}) designates the end of the definition. An example declaration and definition could look like this:
.F00 { Name:"Foo Class"; Lang: en-US; color: white; }
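To make the structure of a class declaration concrete, here is a minimal Python sketch that splits a declaration like the one above into its name and its properties. The regular expression and the parse_class_declarations helper are my own illustration and are much stricter than the parser in quartz.dll, so treat this as a reading aid rather than a description of how DirectX parses the Style block.

```python
import re

# Hypothetical pattern for illustration: ".Name { key: value; ... }"
CLASS_DECL = re.compile(r"\.(\w+)\s*\{([^}]*)\}")


def parse_class_declarations(style_text):
    """Return {class_name: {property: value}} for each declaration found."""
    classes = {}
    for name, body in CLASS_DECL.findall(style_text):
        props = {}
        for item in body.split(";"):
            if ":" not in item:
                continue  # skip empty fragments, e.g. after the last ';'
            key, value = item.split(":", 1)
            # Property names are lower-cased; surrounding quotes are dropped.
            props[key.strip().lower()] = value.strip().strip("\"'")
        classes[name] = props
    return classes


example = '.F00 { Name:"Foo Class"; Lang: en-US; color: white; }'
print(parse_class_declarations(example))
```

For the example declaration this yields a single class, F00, with name, lang and color properties.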
Once the two header blocks have been defined, the captions can be defined. A basic SAMI caption consists of two tags, the first being SYNC, which defines the time in milliseconds to display the caption and the second being P, which allows the author to specify a class and ID for formatting. A typical caption definition could look like this:
<SYNC Start=10> <P Class=F00> Hello there
In addition to class definitions, a caption can be formatted within the Style block by using either a Paragraph block or a Source block. Both of these formatting types are documented in the Microsoft SAMI specification and work in much the same way as classes.
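Putting the pieces described so far together, the sketch below assembles a small, benign SAMI document in Python. The overall layout (SAMIParam and STYLE inside HEAD, a P rule for the Paragraph block and a #Source rule for the Source block) follows my reading of the Microsoft SAMI documentation. The build_sami helper and every property value in it are placeholders chosen for illustration.

```python
def build_sami(class_name, display_name, lang, captions):
    """Assemble a small SAMI document as a string.

    captions is a list of (start_ms, text) pairs.
    All style values below are illustrative placeholders.
    """
    lines = [
        "<SAMI>",
        "<HEAD>",
        "<SAMIParam>",
        "  Metrics {time:ms;}",
        "  Spec {MSFT:1.0;}",
        "</SAMIParam>",
        '<STYLE TYPE="text/css">',
        "<!--",
        "  P { font-family: Arial; }",    # Paragraph block
        "  #Source { color: yellow; }",   # Source block
        # Class declaration and definition
        f'  .{class_name} {{ Name:"{display_name}"; Lang: {lang}; color: white; }}',
        "-->",
        "</STYLE>",
        "</HEAD>",
        "<BODY>",
    ]
    for start_ms, text in captions:
        # One SYNC/P pair per caption; Start is in milliseconds.
        lines.append(f"<SYNC Start={start_ms}><P Class={class_name}>{text}")
    lines += ["</BODY>", "</SAMI>"]
    return "\n".join(lines)


print(build_sami("F00", "Foo Class", "en-US", [(10, "Hello there")]))
```

All three kinds of style information (Paragraph, Source and class) live in the same Style block, which matters later when we look at what the vulnerable code does with them.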
The vulnerability lies in the way that quartz.dll applies style information to captions, specifically in the CSAMIRead::FillBuffer() function. This function makes multiple calls to lstrcpyn and wsprintf, both of which are easily misused. The use of these functions, combined with a failure to track the size of the data copied into a buffer, results in a trivially exploitable stack-based buffer overflow.
At the start of CSAMIRead::FillBuffer(), two stubs are called which essentially obtain values from a structure and return them. Both of the function calls are virtual, so the target of the calls is not immediately clear.
There are several ways to determine the target of a virtual function call. In this case the quickest is to run the code in a debugger: attach the debugger to Windows Media Player, configure it to break on library load, and, once quartz.dll has been loaded, set a breakpoint on CSAMIRead::FillBuffer(). Stepping into the two calls shows that they are CMediaSample::GetSize() and CMediaSample::GetPointer(), both of which implement virtual methods of the IMediaSample COM interface. The block of code which makes these calls is below.
.text:35584819 mov edi, [ebp+arg_0]
.text:3558481C mov esi, ecx
.text:3558481E push edi
.text:3558481F mov eax, [edi]
.text:35584821 call dword ptr [eax+10h] ; CMediaSample::GetSize()
.text:35584824 mov eax, [edi]
.text:35584826 lea ecx, [ebp+var_8]
.text:35584829 push ecx
.text:3558482A push edi
.text:3558482B call dword ptr [eax+0Ch] ; CMediaSample::GetPointer()
.text:3558482E test eax, eax
.text:35584830 jge short loc_35584842 ; continue if GetPointer() succeeded (HRESULT >= 0), otherwise return error
It is important to note that in this block of code the return value of CMediaSample::GetSize() is discarded. Since the only purpose of that function is to return the size of the buffer, the call is useless and the code never learns how much room it has to write into. Moving a few blocks down into the function, checks are performed to see whether certain fields were initialized in the SAMI file. This block of code is below.
.text:3558486A mov ecx, [esi+494h] ; Pointer to SOURCE= object
.text:35584870 test ecx, ecx
.text:35584872 jz short loc_3558487D
.text:35584874 mov edi, ecx
.text:35584876 mov ecx, offset Default ; ""
.text:3558487B jmp short loc_35584884
.text:3558487D ; ---------------------------------------------------------------------------
.text:3558487D
.text:3558487D loc_3558487D: ; CODE XREF: CSAMIRead::FillBuffer(IMediaSample *,ulong,ulong *)+61j
.text:3558487D mov ecx, offset Default ; ""
.text:35584882 mov edi, ecx
.text:35584884
.text:35584884 loc_35584884: ; CODE XREF: CSAMIRead::FillBuffer(IMediaSample *,ulong,ulong *)+6Aj
.text:35584884 mov eax, [eax+4] ; Pointer to class definition
.text:35584887 test eax, eax
.text:35584889 mov edx, eax
.text:3558488B jnz short loc_3558488F
.text:3558488D mov edx, ecx
.text:3558488F
.text:3558488F loc_3558488F: ; CODE XREF: CSAMIRead::FillBuffer(IMediaSample *,ulong,ulong *)+7Aj
.text:3558488F mov eax, [esi+498h] ; Pointer to Paragraph style
.text:35584895 test eax, eax
.text:35584897 jz short loc_3558489B
.text:35584899 mov ecx, eax
The three pointers shown here are going to be pushed as arguments to a wsprintf() call, which means each must hold a valid string pointer before the call is made. The block of code above checks whether the pointer to each object is NULL. If it is, a pointer to an empty string is placed in the register used for the wsprintf() argument; otherwise the pointer to the object itself is used. This seems a bit complicated, but seeing the wsprintf() call will make it clear.
.text:3558489B loc_3558489B: ; CODE XREF: CSAMIRead::FillBuffer(IMediaSample *,ulong,ulong *)+86j
.text:3558489B push edi
.text:3558489C push edx
.text:3558489D push ecx
.text:3558489E push offset aPStyleHsHsHs ; "<P STYLE="%hs %hs %hs">"
.text:355848A3 push [ebp+var_8] ; LPSTR
.text:355848A6 call ebx ; __imp__wsprintfA
.text:355848A8 mov edi, eax
.text:355848AA mov eax, [esi+49Ch]
.text:355848B0 add esp, 14h
.text:355848B3 xor ecx, ecx
.text:355848B5 mov eax, [eax+20h]
.text:355848B8 mov [ebp+var_C], ecx
.text:355848BB test eax, eax
.text:355848BD mov [ebp+var_4], eax
.text:355848C0 jz short loc_355848E7
Arguments are pushed onto the stack in reverse order (wsprintf() is variadic and therefore uses the cdecl calling convention, which is why the caller cleans up the stack with add esp, 14h), so in the call above edi points to the Source object of the caption, edx points to the class definition, and ecx points to the Paragraph style block. In effect, this block of code expands the Paragraph, class and Source styles that apply to a caption into a single <P STYLE="..."> tag. Because wsprintf() takes no argument describing the size of the destination buffer, any of the three strings passed into this call can be used to trigger the vulnerability. The remainder of the function contains a series of lstrcpyn() and wsprintf() calls, all of which can be used to trigger the vulnerability as well.
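The arithmetic behind that first wsprintf() call can be modelled in a few lines of Python. This is only a model of the logic described above, not a reconstruction of the quartz.dll source: the function names and the buffer size are hypothetical, Python's %s stands in for %hs, and the model ignores any limit wsprintf() applies internally.

```python
FORMAT = '<P STYLE="%s %s %s">'  # Python's %s stands in for wsprintf's %hs


def formatted_length(paragraph_style, class_definition, source_style):
    """Length of the string the call would write, excluding the NUL."""
    # Mirror the NULL checks in the disassembly: a missing (None) string
    # is replaced by the empty string before formatting.
    args = tuple(
        s if s is not None else ""
        for s in (paragraph_style, class_definition, source_style)
    )
    return len(FORMAT % args)


def overflows(buffer_size, paragraph_style, class_definition, source_style):
    """True if the output plus its terminating NUL exceeds buffer_size."""
    length = formatted_length(paragraph_style, class_definition, source_style)
    return length + 1 > buffer_size


# buffer_size=64 is an arbitrary value chosen for the example.
print(overflows(64, None, "color: white;", None))
print(overflows(64, "A" * 100, None, None))
print(overflows(64, None, None, "A" * 100))
```

The point of the model is that the three arguments are interchangeable as far as the length is concerned: an oversized Paragraph or Source style is just as effective as an oversized class definition.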
Most countermeasures for this vulnerability do not provide complete protection. They account only for certain portions of class definitions, completely ignoring the Source and Paragraph blocks, and they miss the fact that a class definition does not need to contain valid identifiers for the vulnerability to be triggered. This vulnerability is one of many that underscore the importance of reverse engineering patches and creating internal proofs of concept when developing countermeasures.
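As a rough illustration of what broader coverage could look like, the Python sketch below inspects every rule in the Style block (class, Source and Paragraph alike) and flags any whose selector or body is unusually long, without requiring Name: or Lang: to be present. It is not a SecureWorks countermeasure. The regular expressions, the suspicious_rules helper and the 256-character threshold are all assumptions made for the example, and the threshold is not derived from the buffer sizes in quartz.dll.

```python
import re

STYLE_BLOCK = re.compile(r"<STYLE[^>]*>(.*?)</STYLE>", re.IGNORECASE | re.DOTALL)
# Any selector followed by a brace-delimited body: ".Class", "#Source", "P".
# The closing brace is optional so that unterminated rules are still seen.
RULE = re.compile(r"([^{}\s][^{}]*?)\s*\{([^}]*)\}?")

MAX_RULE_LENGTH = 256  # hypothetical threshold, not derived from the binary


def suspicious_rules(sami_text, limit=MAX_RULE_LENGTH):
    """Return selectors whose name or body is longer than limit characters."""
    hits = []
    for block in STYLE_BLOCK.findall(sami_text):
        # Drop the HTML comment markers that usually wrap the style rules.
        block = block.replace("<!--", "").replace("-->", "")
        for selector, body in RULE.findall(block):
            if len(selector) > limit or len(body) > limit:
                hits.append(selector.strip())
    return hits


sample = (
    '<STYLE TYPE="text/css"><!--\n'
    "P { color: white; }\n"
    "#Source {" + "A" * 1000 + "}\n"
    "--></STYLE>"
)
print(suspicious_rules(sample))
```

A length check like this trades precision for coverage, so a production signature would need tuning against legitimate SAMI files.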