WebSpellChecker Stack Buffer Overflow

Discovering a vulnerability in the WebSpellChecker

Published in Salesforce Engineering, Jan 8, 2018

In today’s blog post, we discuss how the ReLabs team discovered a stack buffer overflow vulnerability in WebSpellChecker. WebSpellChecker is a third-party service used by the Salesforce platform that provides spell-checking functionality through a web API.

This software comes in two versions: hosted and licensed. The hosted version runs on the WebSpellChecker servers, while the licensed version runs on your own organization’s server. The licensed version is available for several platforms: Windows (32-bit or 64-bit) and Linux (32-bit or 64-bit). This post discusses a stack buffer overflow found in the licensed version on all platforms.

The software has several components as can be seen in the following image:

The SSRV component, in the center, is a binary CGI program running on the server. The AppServer is another binary that usually runs on the same server as the CGI. The AppServer is the application that actually performs the spell-checking, while the CGI validates, parses, and decodes the arguments sent by the Web Components. The CGI is reachable from the Internet and accepts whatever arguments the user chooses to send. That, together with the fact that it is a binary program running on the server, makes it the best place to look for vulnerabilities.

Fuzzing

Our first approach was to fuzz the CGI to try to find vulnerabilities. We chose to test the CGI in the 64-bit Linux version, as that is what we run at Salesforce. First, we needed to understand how a user can send input to the CGI and what kind of input the program expects. To accomplish this we performed a static analysis using IDA Pro. The commands spell, sc and check_spelling are underlined in red in this image:

The full list of commands and arguments was used to construct a custom fuzzing framework to find vulnerabilities in the software.
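The post does not show the fuzzing framework itself. As a rough illustration of the kind of test case it would need to produce, the sketch below builds a query string for one command and one parameter, filled with a chosen number of "A" characters. The function name build_fuzz_query and its signature are hypothetical and not taken from the original tooling.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds "cmd=<cmd>&<param>=AAAA...A" with `fill_len`
 * filler bytes. Returns a heap-allocated string that the caller must free(),
 * or NULL on failure. */
char *build_fuzz_query(const char *cmd, const char *param, size_t fill_len)
{
    /* "cmd=" + cmd + "&" + param + "=" */
    size_t prefix_len = strlen("cmd=") + strlen(cmd) + 1 + strlen(param) + 1;
    char *query = malloc(prefix_len + fill_len + 1);
    if (query == NULL)
        return NULL;

    /* Write the fixed prefix; snprintf also writes a NUL we overwrite below. */
    int written = snprintf(query, prefix_len + 1, "cmd=%s&%s=", cmd, param);
    if (written < 0 || (size_t)written != prefix_len) {
        free(query);
        return NULL;
    }

    /* Append the filler and terminate the string. */
    memset(query + prefix_len, 'A', fill_len);
    query[prefix_len + fill_len] = '\0';
    return query;
}
```

A CGI program conventionally receives GET arguments through the QUERY_STRING environment variable, so one way to use such a string is to export it before launching the binary locally. Whether the original framework worked that way is not stated in the post.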

Fuzzing results

Using our fuzzing framework we found several access violation exceptions on 64-bit Linux. Below we analyze each exception and its source to learn whether a user can exploit it to control the execution flow. Each result shows the RIP address at the time of the exception and the assembly instruction at that address, if available:

0000000000000000 | None: This exception happens when the command “user_dictionary” is used with no action or with an invalid one. The program uses the “action” parameter to look up a function and call it, but when the action is not valid the function pointer is empty and the program executes a CALL 0x0000000000000000. There is no way to take advantage of this to achieve code execution. The following argument generates this exception: cmd=user_dictionary&name=test
00000000004E1695 | jmp r11: This exception is raised when the function void atexit_event() is executed. That function jumps through a function pointer (jmp r11) that is not valid. The user does not control the value of the pointer. The following argument generates this exception: cmd=ospsave&dname=a
000003999AB6672 | lock xadd DWORD PTR [rdi],eax: This exception is raised when a local variable on the stack is overwritten by a call to sprintf. After the call to sprintf, and before the function reaches the RET, the function __gnu_cxx::__exchange_and_add is executed. It receives a pointer and adds a given value to the variable it points to. Because the overflow overwrites the local variables along with the rest of the stack, the pointer no longer holds a valid address when this function is called, and the attempt to write to that invalid location generates the exception. The following argument generates this exception:

cmd=user_dictionary&name=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=&action=create

This last exception is very interesting. The stack was overflowed with user-controlled input (the name parameter), but an exception occurred before the program reached the RET instruction, and the application terminated.
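Returning to the first exception in the list: the described behavior, where a missing or invalid action leaves an empty function pointer that is then called, can be illustrated with a small lookup table. This is only a sketch of the general pattern. The table contents, handler functions and names below are hypothetical, since the post does not show the CGI's real dispatch code.

```c
#include <stddef.h>
#include <string.h>

typedef int (*action_handler)(const char *name);

/* Placeholder handlers; the real handlers are not shown in the post. */
static int handle_create(const char *name) { (void)name; return 1; }
static int handle_rename(const char *name) { (void)name; return 2; }

struct action_entry {
    const char *action;
    action_handler handler;
};

static const struct action_entry ACTIONS[] = {
    { "create", handle_create },
    { "rename", handle_rename },
};

/* Returns the handler for `action`, or NULL when it is missing or unknown. */
action_handler lookup_action(const char *action)
{
    if (action == NULL)
        return NULL;
    for (size_t i = 0; i < sizeof ACTIONS / sizeof ACTIONS[0]; i++) {
        if (strcmp(ACTIONS[i].action, action) == 0)
            return ACTIONS[i].handler;
    }
    return NULL;
}

/* Calls the handler only if one was found and returns -1 otherwise.
 * Calling the result of lookup_action() without this check would be a
 * call through a null pointer, i.e. a CALL to address 0. */
int dispatch_action(const char *action, const char *name)
{
    action_handler handler = lookup_action(action);
    if (handler == NULL)
        return -1;
    return handler(name);
}
```

With a guard like the one in dispatch_action, a request such as cmd=user_dictionary&name=test (no action at all) would be rejected instead of crashing.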

Vulnerabilities found

As mentioned before, the fuzzer led us to a vulnerability in the way WebSpellChecker uses sprintf. sprintf does not limit the number of bytes it writes when it composes the output from the format string, which allows a user to overflow the buffer that holds the final string. Through manual analysis we identified twelve more places where the CGI uses sprintf in the same insecure way (the addresses shown are from the Windows 32-bit version):

0x0043FED0 sprintf(&v193, "{dname: \"%s\", action: \"%s\"}", v50, v49);
0x0044015A sprintf(&v193, "{dname: \"%s\", action: \"%s\"}", v71, v70);
0x00440596 sprintf(&v193, "{dname: \"%s\", action: \"%s\"}", v101, v100);
0x00440859 sprintf(&v193, "{dname: \"%s\", action: \"%s\"}", v119, v118);
0x004409A0 sprintf(&v193, "{dname: \"%s\", action: \"%s\"}", v129, v130);
0x00440CC5 sprintf(&v193, "{word: \"%s\", action: \"%s\"}", v138, v137);
0x00440F62 sprintf((char *)(a1 - 276), "{text: \"%s\", type: \"%d\", error: true, dname: \"%s\"}", v10, v9, v8);
0x00443D0C sprintf(&v29, (const char *)v6, v7, v5);
0x0044394F sprintf(&v41, (const char *)v5, v6, v20);
0x004432DC sprintf(&v24, (const char *)v4, v5, v6);
0x0044287C sprintf(&v29, (const char *)v6, v7, v5);
0x00442FA4 sprintf(&v42, (const char *)v14, v15, v17);

These sprintf calls are reached with the following commands, each of which corresponds to a different action:

cmd=dictionary&action=create&ud=1&udn=1&c=a&callback=a&format=xml&view=true&wordlist=aaa&dname=AAAAAA…[“A”*1024]
cmd=dictionary&action=delete&ud=1&udn=1&c=a&callback=a&format=xml&view=true&wordlist=aaa&dname=AAAAAA…[“A”*1024]
cmd=dictionary&action=rename&ud=1&udn=1&c=a&callback=a&format=xml&view=true&wordlist=aaa&dname=AAAAAA…[“A”*1024]
cmd=dictionary&action=restore&ud=1&udn=1&c=a&callback=a&format=xml&view=true&wordlist=aaa&dname=AAAAAA…[“A”*1024]
cmd=dictionary&action=getname&ud=1&udn=1&c=a&callback=a&format=xml&view=true&wordlist=aaa&dname=AAAAAA…[“A”*1024]
cmd=dictionary&action=addword&ud=1&udn=1&c=a&callback=a&format=xml&view=true&wordlist=aaa&dname=AAAAAA…[“A”*1024]
cmd=ospsave&text=create&&udn=bbbbbb&dname=AAAAAA…[“A”*1024]
cmd=user_dictionary&action=setdict&name=AAAAAA…[“A”*1024]
cmd=user_dictionary&action=getdic&name=AAAAAA…[“A”*1024]
cmd=user_dictionary&action=check&name=AAAAAA…[“A”*1024]
cmd=user_dictionary&action=rename&name=AAAAAA…[“A”*1024]

NOTE: The string AAAAAA…[“A”*1024] represents 1024 “A” characters.
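To make the root cause concrete, here is a minimal sketch of the same formatting step done with snprintf, which takes the destination size and reports how many bytes the full output would need. The buffer size of 256 bytes, the macro name and the function name are assumptions for illustration. The post does not state the real buffer sizes, and this is not the vendor's fix.

```c
#include <stdio.h>

/* Assumed size for illustration; the real stack buffer sizes are not given. */
#define RESPONSE_BUF_SIZE 256

/* Formats the response into `out` (capacity `out_size`).
 * Returns 0 on success, or -1 if formatting fails or the result does not
 * fit. In the second case the output is truncated and still NUL-terminated. */
int format_dict_response(char *out, size_t out_size,
                         const char *dname, const char *action)
{
    int needed = snprintf(out, out_size,
                          "{dname: \"%s\", action: \"%s\"}", dname, action);
    if (needed < 0 || (size_t)needed >= out_size)
        return -1;   /* an unbounded sprintf would overflow `out` here */
    return 0;
}
```

With a 1024-character dname and a 256-byte buffer this function returns -1, whereas an unbounded sprintf would keep writing past the end of the buffer and over the neighboring stack data.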

Exploiting the vulnerabilities

We tried to exploit the stack overflow to execute code, but we ran into some difficulties. For almost all of the sprintf calls listed above, an exception is thrown once the overflow takes place. That exception happens before the RET instruction is executed, which prevents us from taking control of the execution flow.

The following image shows the sprintf call and how an exception is thrown before reaching the return.

An interesting command is “cmd=dictionary&action=delete”. This command does not throw an exception from the code itself; instead, it raises an access violation when a local pointer variable is overwritten and then dereferenced.

One way around that problem is to overwrite the exception handler (SEH), so that when the exception is raised we control where execution goes. This option only applies to Windows, where the SEH is stored on the stack.

One problem is that the NUL character, 0x00, cannot be used as part of the input. This character terminates the string, so anything after it is not copied and it cannot be part of the payload.

Another problem is that all of the modules are compiled with SafeSEH, so it is not possible to overwrite the SEH with an address belonging to one of these modules. We could still overwrite the SEH with an address on the heap, or on the stack if DEP is disabled. But stack and heap addresses are usually in the low part of memory, so they contain at least one 0x00 byte:

0x00xxxxxx
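The constraint that rules out such addresses can be written as a small check. This is an illustrative sketch only; the function name and the sample address in the note below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if any of the four bytes of a 32-bit address is 0x00,
 * which means the address cannot be carried inside a C string. */
bool address_has_nul_byte(uint32_t address)
{
    for (int shift = 0; shift < 32; shift += 8) {
        if (((address >> shift) & 0xFFu) == 0)
            return true;
    }
    return false;
}
```

For example, address_has_nul_byte(0x0012FF80) is true because the most significant byte is 0x00, so an address of that shape could not be delivered intact through a string parameter.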

In some scenarios it would be possible to partially overwrite the SEH, replacing only the first three bytes and letting the string terminator, 0x00, serve as the fourth byte. That is not possible here, because the format string contains additional characters ( "} ) after the user-controlled string, %s, so the terminator does not directly follow the user's data:

"{text: \"%s\", type: \"%d\", error: true, dname: \"%s\"}"

The next section looks at the decoding function that WebSpellChecker applies to its arguments and at how it handles an escaped 0x00 character.
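Before turning to the decoder, the partial-overwrite obstacle described above can be checked with a short sketch. It formats a response with the format string quoted above and returns the offset of the first byte that follows the dname value. The function name is hypothetical and bounded snprintf is used here only so the example is safe to run.

```c
#include <stdio.h>

/* Formats the response into `out` and returns the offset of the first byte
 * written after `dname`, or -1 on error or if the output does not fit.
 * The format string ends with the two characters "} after the last %s,
 * so dname always ends two bytes before the terminating NUL. */
long offset_after_dname(char *out, size_t out_size,
                        const char *text, int type, const char *dname)
{
    int n = snprintf(out, out_size,
                     "{text: \"%s\", type: \"%d\", error: true, dname: \"%s\"}",
                     text, type, dname);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return (long)n - 2;
}
```

The byte at the returned offset is a double quote (0x22), followed by a closing brace (0x7D) and only then the 0x00 terminator, so the terminator never lands immediately after the user-controlled bytes.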

Encoding

The arguments sent to the WebSpellChecker CGI are URL-encoded. We therefore tried to send the character 0x00 encoded as %00, but it did not work.

We located the function that performs the decoding to understand why this was happening and whether there was another way to get a 0x00 character into the argument.

The decoding function is the following:

__int64 __fastcall TextUtils::escaped2plain(char *original_string, const std::string *transformed_string, std::string *a3)

The function parses the input looking for the % character:

if ( (_BYTE)v10 != '%' )

Once this character is found, the function takes the next two characters and scans them with sscanf, using a hexadecimal number as the format:

std::string::string((std::string *)&two_chars_hex, (const std::string *)original_string, v5, 2uLL);
    sscanf(two_chars_hex, "%x", &number);

After that, the function checks whether the resulting number is non-zero:

if ( number )

When this condition is not satisfied (number == 0), the function emits the first character after the % as it is. It is therefore not possible to inject a NUL character through escaping, because %00 is converted to the character “0”, 0x30.
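The behavior described above can be reproduced with a short, simplified decoder. This is a sketch written from the description and is not the vendor's code: the function name escaped_to_plain and the output-buffer handling are hypothetical, and the assumption that all three characters of the escape are consumed in the zero case is ours, since the post only says that %00 becomes “0”.

```c
#include <stdio.h>
#include <string.h>

/* Decodes `in` into `out` (capacity `out_size`), following the behavior
 * described above. Returns the decoded length, or -1 if `out` is too small. */
int escaped_to_plain(const char *in, char *out, size_t out_size)
{
    size_t in_len = strlen(in);
    size_t o = 0;

    if (out_size == 0)
        return -1;

    for (size_t i = 0; i < in_len; ) {
        char c = in[i];
        size_t consumed = 1;

        /* Only treat '%' as an escape when two more characters follow. */
        if (c == '%' && i + 2 < in_len) {
            char two_chars_hex[3] = { in[i + 1], in[i + 2], '\0' };
            unsigned int number = 0;
            if (sscanf(two_chars_hex, "%x", &number) != 1)
                number = 0;
            /* Non-zero: emit the decoded byte. Zero: emit the first
             * character after '%', so "%00" yields '0' (0x30), never 0x00. */
            c = number ? (char)(number & 0xFFu) : in[i + 1];
            consumed = 3;   /* assumption: the whole escape is consumed */
        }

        if (o + 1 >= out_size)
            return -1;      /* no room for this byte plus the terminator */
        out[o++] = c;
        i += consumed;
    }
    out[o] = '\0';
    return (int)o;
}
```

Under these assumptions, decoding "a%41b" gives "aAb", while decoding "x%00y" gives "x0y", which contains no 0x00 byte.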

Conclusion

We found several vulnerabilities, including stack buffer overflows, but for the reasons described above it was not possible to exploit them to execute code. The life of a vulnerability researcher is not always fun and profit.

Reporting timeline:

First contact with vendor: 12/9/2015
Vendor first response: 12/9/2015
Vendor acknowledged the vulnerabilities: 1/13/2016
Vendor provided fix: 4/5/2016

(This post was originally published on 11-22-2016)