Finding and exploiting CVE-2018–7445 (unauthenticated RCE in MikroTik’s RouterOS SMB)

Summary for the anxious reader

  • CVE-2018–7445 is a stack buffer overflow in the SMB service binary present in all RouterOS versions and architectures prior to 6.41.3/6.42rc27.
  • It was found using dumb-fuzzing assisted with the Mutiny Fuzzer tool from Cisco Talos and reported/fixed about a year ago.
  • The vulnerable binary was not compiled with stack canaries.
  • The exploit does ROP to mark the heap as executable and jumps to a fixed location in the heap. The heap base was not randomized.
  • Dumb fuzzing still found bugs in interesting targets in 2018 (although I’m sure there must be none left for 2019!)
  • The post describes the full process from target selection to identifying a vulnerability and then producing a working exploit.

Introduction

The last few years have seen a surge in the number of public vulnerabilities found and reported in MikroTik RouterOS devices. From a remote buffer overflow affecting the built-in web server included in the CIA Vault 7 leak to a plethora of other vulnerabilities reported by Kirils Solovjovs from Possible Security and Jacob Baines from Tenable that result in full remote compromise.

MikroTik was recently added to the list of eligible router brands in the exploit acquisition program maintained by Zerodium, including a one-month offer to buy pre-auth RCEs for $100,000. This might reflect an increasing interest in MikroTik products and their security posture.

This blog post is an attempt to make a small contribution to the ongoing MikroTik RouterOS vulnerability research. I will outline the steps we took with my colleague Juan (thanks Juan!) during our time together at Core Security to find and exploit CVE-2018–7445, a remote buffer overflow in MikroTik’s RouterOS SMB service that could be triggered from the perspective of an unauthenticated attacker.

The vulnerability is easy to find and exploitation is straight-forward, so the idea is to provide a detailed walk-through that will (hopefully!) be useful for other beginners interested in memory corruption. I will try to cover the full process from “hey! let’s look at this MikroTik thing” to actually finding a vulnerability in a network service and writing an exploit for it.

The original advisory can be found here.

Mandatory disclaimer: I am no longer affiliated with Core Security, so the content of this post does not reflect its views or represents the company in any way.

Setup

The vulnerability is present in all architectures and devices running RouterOS prior to 6.41.3/6.42rc27, so the first step is getting a vulnerable system running.

MikroTik makes this very easy by maintaining an archive of all previously released versions. It is also possible to download the Cloud Hosted Router version of RouterOS, which is available as a virtual machine that boasts full RouterOS features. This allows running RouterOS in x86–64 architectures using popular hypervisors without needing an actual hardware device.

Let’s get the 6.40.5 version of the Cloud Hosted Router from here and create the virtual machine on VirtualBox.

Default administrator credentials consist of admin as the username and an empty password.

RouterOS administrative console

The RouterOS console is a restricted environment and does not allow the user to execute any command outside of a pre-defined set of configuration options.

In order to replicate the vulnerability discovery, the SMB service needs to be enabled. This can be achieved with the ip smb set enabled=yes command.

Enabling SMB and checking the IP address of the device

Note that the fact that the service is not enabled by default makes the likelihood of active exploitation much smaller. In addition, you should probably not be exposing your SMB service to public networks, but well, there’s always those pesky users in the internal network that might have access to this service.

The restricted console is not suitable to perform proper debugging, so before looking for vulnerabilities it is useful to have full shell access. Kirils Solovjovs has published extensive research on jailbreaking RouterOS, including the release of a tool that can be used to jailbreak 6.40.5. It would not make sense to repeat the underlying details here, so head to Kirils’ research hub or the more recent Jacob Baines’ post for newer versions where the entry point used for 6.40.5 has been patched.

Jailbreaking RouterOS 6.40.5 is as easy as cloning the https://github.com/0ki/mikrotik-tools repository and running the interactive exploit-backup/exploit_full.sh exploit pointing to our VM.

Jailbreak tool by 0ki targeting 6.40.5

Finally, download a pre-compiled version of GDB from https://github.com/rapid7/embedded-tools/raw/master/binaries/gdbserver/gdbserver.i686 and upload it to the system using FTP.

Connecting to the device via Telnet would allow us to attach to running processes and debug them properly.

We are now ready to start looking for vulnerabilities in the network services.

Target selection

There are lots of services running in RouterOS. A quick review shows common services such as HTTP, FTP, SSH, and Telnet, and some other RouterOS-specific services such as the bandwidh test server running on port 2000.

Jacob Baines pointed out that over 90 different binaries that implement network services can be reached by speaking the Winbox protocol (see The Real Attack Surface in his excellent blog post).

We were not aware of all that reachable functionality when we started poking around with RouterOS and did not invest the time to reverse engineer the binaries that spoke Winbox, so we just went ahead and looked at the few binaries that were explicitly listening on the network.

Most (all?) services in RouterOS seem to have been implemented from scratch, so there are thousands of lines of custom low level code waiting to be audited.

Our objective was to achieve unauthenticated remote code execution, and on a first look the binaries for common services such as FTP or Telnet did not provide much reachable functionality without providing credentials. This made us turn to other services that might not be enabled by default but require the implementation of a rich feature set. The fact that these services are not enabled by default means they might have been neglected by other attackers wanting to maximize their ROI on vulnerabilities that affect default installations of RouterOS and are therefore much more valuable.

By following this rationale and inspecting the available services we decided to take a look at the SMB implementation.

Finding the vulnerability

We know we want to find vulnerabilities in the SMB service. We have the virtual machine setup, the service running, we have full shell access to the device and we can debug any processes. How do we find a vulnerability?

One option would be to disassemble the binary and look for insecure coding patterns. We would identify interesting operations such as strcpy, memcpy, etc. and see if the correct size checks are in place. We would then see if those code paths can be reached with user controlled input. We could combine this with dynamic analysis and use our ability to attach to a running process with GDB to inspect registers at runtime, memory locations, etc. However, this can be time consuming and it is easy to feel frustrated if you do not have experience doing reverse engineering, especially if it is a large binary.

Another option is to fuzz the network service. This approach consists of sending data to the remote service and checking if it causes an unexpected behavior or a crash. This data will contain malformed messages, invalid sizes, very long strings, etc.

There are different ways to conduct the fuzzing process. The two most popular strategies are generation and mutation-based fuzzing. Generation-based fuzzing requires knowledge of the protocol to build test cases that comply with the format specified by the protocol and will (most likely) result in a more thorough coverage. More coverage means more chances of hitting vulnerable code paths and therefore more bugs. On the other hand, mutation-based fuzzing assumes no prior knowledge of the protocol being fuzzed and takes much less effort at the cost of potentially poor code coverage and additional difficulties in protocols that need to compute checksums to ensure data integrity.

We decided to try our luck with a dumb fuzzer and chose the Mutiny Fuzzer tool that had been released a few months earlier by the Cisco Talos team. Mutiny takes a sample of legitimate network traffic and replays it through a mutational fuzzer. In particular, Mutiny uses Radamsa to mutate the traffic.

Radamsa mutation example

Performing this kind of fuzzing has the benefit of being very quick to get up running and, as we will see, might provide great results if we have a good selection of test cases that stress various features.

Putting this together, the steps to fuzz a network service are:

  • Capture legitimate traffic
  • Create a Mutiny fuzzer template from the resulting PCAP file
  • Run Mutiny to mutate the traffic and replay it to the service
  • Observe what happens with the running service

Mutiny does provide a monitoring script that can be used to (d’oh!) monitor the service and identify weird behavior. This can be accomplished by implementing the monitorTarget function as described in https://github.com/Cisco-Talos/mutiny-fuzzer/blob/master/mutiny_classes/monitor.py. Sample checks could be pinging the remote service or connecting to it to assess its availability, monitoring the process, logs, or whatever else might signal weird behavior.

In this case, the SMB service will take a while to restart after a crash and log a stack trace message, so we decided it was not worth scripting any monitoring actions. Instead, we just captured the traffic throughout the fuzz process with Wireshark and relied on the default behavior of Mutiny, which is to exit when the request fails due to a connection refused error, meaning that the service is down. This is rather rudimentary and leaves a lot of room for improvement, but it was enough for our tests.

It is important to enable full logging before we initiate the fuzzing process. This could prove useful to track any crashes that might occur, as the full stack trace will be included in the logs that are located in /rw/logs/backtrace.log. This can be configured from RouterOS’ web interface.

Enable all logs to be written to the disk

Another thing that proved useful was running the binary in an interactive console to get the debug output in real time. This can be achieved by killing the running process and relaunching it from the full-fledged terminal. Errors and general status of processed requests will be printed.

Get debug output as requests are processed

Now that we have a high level overview of the steps involved, let’s recap and actually fuzz the SMB service.

First we clone https://github.com/Cisco-Talos/mutiny-fuzzer.git and follow the setup instructions.

The next step in our plan consists of generating some network traffic. In order to do this, open Wireshark and attempt to access a resource on the router with smbclient.

Smbclient will send a Negotiate Protocol request to port 445/TCP and receive a response which we do not care about. This can be observed in the Wireshark capture.

We want to use this request as the starting point to produce (hopefully!) meaningful mutations. Stop the Wireshark capture and save the request packet by going to File -> Export Specified Packets with the request packet selected. Output format should be PCAP.

Once we have the PCAP containing the request to fuzz, we prepare Mutiny’s .fuzzer file with the mutiny_prep.py interactive script.

It is a good idea to review the resulting file to identify any weirdness that could come up during conversion.

Here we could configure Mutiny to fuzz only parts of the message. This would be useful if we wanted for example to focus our efforts on individual fields. In this case we will fuzz the entire message. It is worth mentioning that Mutiny can also handle multi-message exchanges.

If the test cases we use as initial templates contain parts that do not cause the program to take different paths, then all the modifications we make to this data will never increase code coverage, which results in wasted time and inefficient fuzzing.

Without going into much detail of the SMB protocol, we can observe that the request contains a list of about a dozen Requested Dialects. Each dialect corresponds to a specific set of supported commands. This could be interesting if we were fuzzing a particular set of commands, but right now we do not care about this.

Providing a shorter list of one or two dialects would result in Radamsa creating more meaningful mutations and a larger variety of SMB request types being sent. Our reasoning is that maybe mutating one dialect or the other will not make the application take very different paths on a single message conversation, so we do this and edit the template to look as follows:

outbound fuzz ‘\x00\x00\x00\xbe\xffSMBr\x00\x00\x00\x00\x18C\xc8\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfe\xff\x00\x00\x00\x00\x00\x9b\x00\x02NT LANMAN 1.0\x00\x02NT LM 0.12\x00’

With the template in place, we can begin fuzzing. Remember to capture the full session with Wireshark. Mutiny can also record each packet sent, but we found it easier to look at Wireshark for when the server stopped responding after a crash.

Open a telnet connection to the router using the devel account and run pkill smb; /nova/bin/smb to start a new SMB process and observe its output.

The following command will instruct Mutiny to sleep half a second between packets and log all requests: ./mutiny.py -s 0.5 — logAll negotiate_protocol_request-0.fuzzer HOST

The verbose output will show packets of different sizes being sent and the numeric identifier of each of them. This value is useful to repeat the exact same sequence of mutations and provide a way to reproduce crashes. This is important because, even if we find a crash, previous requests could have corrupted something or altered the application state in a way that is required for the crash to happen. If we cannot recreate the state before the crash, we might be left empty-handed even if we identify which particular request ended up causing the crash.

If the fuzzing session is interrupted and we do not want to replay the previous mutations, it is possible to use the -r parameter to instruct Mutiny to start sending mutations from that iteration onwards (e.g: -r1500- will send mutations 1500, 1501, 1502, and so on).

If we observe Wireshark while the fuzzer runs, we will see that not all packets conform to the expected format, which is a good thing for us. Vulnerabilities usually arise when the application cannot handle unexpected data in a proper manner.

The terminal where we are running the SMB binary will also contain useful data to confirm that we are in fact feeding malformed requests to the service.

Now we let the fuzzer run. We can play with different delay values and see if the server can process requests that fast, but two requests per second is OK for this proof of concept.

A few minutes later Mutiny finishes running after trying to connect to the service and not being able to do so.

If we take a look at the terminal running the binary, we will be greeted with a Segmentation fault message.

As mentioned before, the backtrace.log file contains the register dump and a bit more information about what caused the crash.

Finally, by inspecting Wireshark we can see that the last packet sent to the server is described as “Session request to Illegal NetBIOS name”.

Understanding the crash

First we will make sure that we can reproduce the crash at will. Copying the last packet sent by Mutiny or extracting the message from Wireshark is equivalent here. We are not interested in the layers below NetBIOS, as we will create a small script to send the packet over TCP.

Extract the raw bytes from the exported file.

>>> open(“req”).read()
‘\x81\x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00’

Create a simple python script that sends the payload over to the remote service. Note that I have replaced the whitespace with its equivalent hex representation for clarity.

Run the script a few times after spawning a new SMB process (pkill smb && /nova/bin/smb) and see what happens. We now have a reliable way to reproduce the crash with a single request.

In this case we are dealing with a protocol for which Wireshark has a dissector, so we can use that information to understand the cause of the crash at a protocol level. Apparently, sending a NetBIOS session request (message type 0x81) exercises a vulnerable code path in the SMB binary.

Let’s extract the binary from the router so we can open it in a disassembler. Copy /nova/bin/smb to /flash/rw/pckg so it can be accessed via FTP and download it.

This is also a good time to be able to debug the process with GDB. I like to use PEDA to enhance GDB’s display and add some useful commands.

We open two connections to the target router. In one we do pkill smb && /nova/bin/smb to get real time output, and on the other we start gdbserver attached to the newly spawned process.

Finally, we open GDB in our testing machine and connect to the debugging server with target remote IP:PORT.

It is also useful to instruct GDB which binary we are attached to by doing file smb. The next time it connects to the debugging server it will attempt to resolve symbols for the loaded libraries.

Press c on the debugging session so execution continues as usual.

Running the proof of concept will cause the service to stop due to a SIGSEGV as expected. Here we see that a NULL pointer was dereferenced when performing a copy operation.

Now, I must admit that I am very bad doing static analysis, especially when C++ programs are involved. In a lame attempt to overcome this limitation, I will rely as much as I can on dynamic analysis and, in this particular case, on the information provided by the Wireshark dissector that gives more insight about the protocol fields.

As can be observed, the first byte of the NetBIOS Session Service packet we are sending sets the message type to Session request (0x81).

The second byte contains the flags, which in our proof of concept has all bits set to zero.

Next two bytes represent the message length, which is set to 32.

Finally, the remaining 32 bytes are referred to as Illegal NetBIOS names.

We can assume that this size is being read at some point, and since it is the only message length information we are sending, it might be related to the vulnerability. To test this assumption, we are going to place breakpoints on common functions such as read and recv to identify where the application is reading the packet from the socket.

After running the script, the program breaks in read().

We navigate to the next instructions using ni and stop right after the read system call is executed.

The definition for read looks as follows:

ssize_t read(int fd, void *buf, size_t count);

EAX holds the number of bytes read, which seems to be 0x24 (36). This corresponds to the header we analyzed before: 1 byte for the message type — 1 byte for the flags — 2 bytes for the message length — 32 bytes for the NetBIOS names.

ECX contains the address of the buffer where the data read is stored. We can use vmmap $ecx or vmmap 0x8075068 to verify that this corresponds to the heap area.

Finally, EDX states that the read operation was called to read up to 0x10000 bytes from the socket.

From here on we can continue stepping through the execution and add watchpoints to see what happens with our data.

Since Wireshark did not identify anything relevant to the analyzed protocol in the NetBIOS names, let’s change our payload to contain more distinguishable characters such as “A”s so it is easier to identify that payload in our debugging session. This is also a good idea to see if additional copy operations might be triggered that would otherwise stop at the first NULL byte.

We have two bytes to play with different sizes, so before moving forward it is interesting to try sending different message lengths and see if the crash still happens.

We already tried 32 bytes, so let’s get crazy and do 64, 250, 1000, 4000, 16000, 65500.

  • 64 (payload = “\x81\x00\x00\x40”+”A”*0x40)

Same as original proof of concept. Registers look the same.

  • 250 (payload = “\x81\x00\x00\xfa”+”A”*0xfa)

This is a very interesting variation. We see most registers set to 0x41414141, which is our input, we see the stack filled with lots of “A”s as well and even EIP seems to have been corrupted.

  • 1000 (payload = “\x81\x00\x03\xe8”+”A”*0x3e8)

Same as previous payload.

  • 4000 (payload = “\x81\x00\x0f\xa0”+”A”*0xfa0)

Same as previous payload.

  • 16000 (payload = “\x81\x00\x3e\x80”+”A”*0x3e80)

This is crashing at a different instruction, although we see that the stack is corrupted as well.

  • 65500 (payload = “\x81\x00\xff\xdc”+”A”*0xffdc)

Same as previous payload.

So… we see the program crashes when executing different instructions. However, the common thing we can observe is that the stack has been corrupted at some point when parsing NetBIOS names from a single NetBIOS session request message, and that most registers included parts of our payload when we sent a 250 bytes message. This makes it particularly interesting for analysis, since we have a direct EIP overwrite and control over the stack.

Note that we cannot really ensure that all crashes are due to the exact same bug at this point. Maybe sending a larger buffer took us down a different path that ended up being more easily exploitable, so you will have to answer that question yourself.

There are also some seemingly random number of “.” (0x2e) characters in between. We will see what they are later on.

Right before the crash the program prints a message that reads “New connection:”. This can be useful to get some situational awareness without having to add watchpoints to our buffer and track dozens of read operations (you can add a read watchpoint in GDB with rwatch *addr and execution will be stopped whenever the program accesses that memory address).

We open the /nova/bin/smb binary in Binary Ninja and search for the string.

There is only one occurrence at 0x80709fb. Inspecting the cross references shows a single usage, which is probably what we want.

If we go to the beginning of sub_806b11c, we will notice that a couple of conditions need to be met in order for us to get to the block that prints the string.

The first condition is a byte comparison with 0x81, which is the message type we are sending.

Putting a breakpoint at 0x806b12e and following the execution allows us to inspect the register values and get a better picture of what is happening. We can observe for example that the size we send in the request needs to be above 0x43 to enter the interesting block.

Based on our previous tests, we know that one of the functions called from this block needs to be the one corrupting the stack. We continue going through each instruction in GDB using n instead of s to avoid stepping into the functions. After each function run, we take a look at the stack.

The first function we encounter is 0x805015e.

After it runs we see that the stack seems to be OK, so this is probably not the function responsible for the overflow.

A few instructions later we have the next candidate, function at 0x8054607. Once again, we let it run and observe the stack and register context afterwards.

Aaaaand we found our culprit. Take a look at EBP and observe that the stack frame has been corrupted. Continue debugging until the function is about to return. Here various registers are popped from the stack that contains our data.

Unpopular thought here: you do not really need to understand what this function is doing at all to exploit the vulnerability. We already have EIP control and most of the stack looks more or less as uncorrupted input data.

Taking some time to review the function at 0x8054607 with GDB’s help results in the following pseudo-code:

int parse_names(char *dst, char *src) {
int len;
int i;
int offset;

// take the length of the first string
len = *src;
offset = 0;

while (len) {
// copy the bytes of the string into the destination buffer
for (i = offset; (i - offset) < len; ++i) {
dst[i] = src[i+1];
}

// take the length of the next string
len = src[i+1];

// if it exists, then add a separator
if (len) {
dst[i] = ".";
}

// start over with the next string
offset = i + 1;
}

// nul-terminate the string
dst[offset] = 0;

return offset;
}

In essence, the function receives two stack-allocated buffers, where the source buffer is expected to be in the format SIZE1 — BUF1, SIZE2 — BUF2, SIZE3 — BUF3, etc. “.” is used as the entry separator.

The first byte of the source buffer is read and used as the size for the copy operation. The function then copies that amount of bytes into the destination buffer. Once that is done, the next byte of the source buffer is read and used as the new size. This loop finishes when the size to copy is equal to zero. No validation is done to ensure that the data fits on the destination buffer, resulting in a stack overflow.

Writing an exploit

How to approach the exploitation depends on the specifics of the targeted device and architecture. Here we are only interested in the Cloud Hosted Router x86 binary.

It is worth mentioning that there might be several different ways to achieve reliable exploitation of this vulnerability, so we are going to review the one *we* used, which might not be the most elegant or efficient way to do it.

Tobias Klein’s checksec script is a great resource to check which mitigations we will need to fight against. This script can be invoked from PEDA.

The lack of stack canaries is probably the most relevant mitigation that is missing, enabling stack based buffer overflows to be easily exploited. If the program had been compiled with stack canaries, then our previous tests would have had a very different result. Stack canaries place random values before important data in each function frame that allocates buffers, and these values are checked before the function returns. In case an overflow occurs, execution will be terminated and no further exploitation possible.

PIE disabled means we can rely on fixed locations for the program code, and a disabled RELRO means we can overwrite entries in the Global Offset Table.

To sum up, we will only be dealing with NX, which restricts execution from writable areas such as the stack or the heap.

Another mitigation that is implemented at a system level is ASLR. This is a 32 bits system, so a partial overwrite or even brute forcing may be considered a feasible bypass of ASLR. In this case, that will not be necessary.

Inspecting the memory mappings for any program in RouterOS shows that the stack base is indeed randomized but the heap is not. This can be verified running cat /proc/self/maps a few times and comparing the results.

The first step to build our exploit is getting the exact offset to get control of EIP. In order to do this we can generate a unique pattern with PEDA with the command pattern create 256 and plug it into our exploit skeleton. Note that the first byte after the header will be the size parsed by the vulnerable function, so we specify 0xFF to read 256 bytes unaltered and avoid the “.” character being placed in the middle of our payload.

When the crash takes place, it is possible to use the accompanying pattern offset VALUE command to determine the exact location to overwrite EIP.

Alter the payload and verify that EIP can be set to an arbitrary value.

We do not observe annoying “.” characters, which is good.

Now that we control EIP and the rest of the stack, we can use the borrowed code chunks technique, which is better known as return oriented programming or ROP (some people seem to be very annoyed with those using the latter term, so it is probably better to mention all the alternatives).

The main idea is that we will chain various code snippets that end in a RET instruction to execute more or less arbitrary code. Given enough of these gadgets we should be able to run anything we want. In this particular case though, we only want to mark the heap area as executable. The end goal is to store something in the heap (which already contains messages read from the client) and jump there taking advantage of the static base address.

The relevant function here is mprotect, which looks as follows:

int mprotect(void *addr, size_t len, int prot);

The address will be 0x8072000, which is the base of the heap. This needs to be page-aligned for it to work.

Len can be anything we want, but let’s change the protection of the whole 0x14000 bytes.

Finally, prot refers to a bitwise-or of the desired protections to enforce. 7 refers to PROT_READ | PROT_WRITE | PROT_EXEC, which is essentially the RWX we are aiming for.

There are various tools that can attempt to create the chain automatically, such as ROPGadget and Ropper. We will use ropper but build the ROP chain manually to show how it can be done.

As per the Linux system call convention, EAX will contain the syscall number, which is 0x7d for mprotect. EBX will contain the address parameter, ECX the size and EDX the desired protection.

Let’s start setting EBX to 0x8072000. We look for a gadget that contains the POP EBX instruction and has the least side-effects possible.

We choose the smaller gadget and start constructing our chain. This looks as follows. Execution will be redirected to 0x804c39d, which will first execute a POP EBX instruction, setting EBX to the desired value of 0x8072000. Next, POP EBP will be executed, so we need to provide some dummy value so there is something to pop from the stack. Finally, the RET instruction is executed, which pops whatever is next in the stack and jumps there. This needs to be our next step in the chain.

All values are packed as little-endian unsigned integers.

We do the same process to set the desired size in ECX. It is important to understand that the order matters, as we could be unknowingly overwriting the registers we have already set if we are not careful. Moreover, sometimes the gadgets will not look as nice as POP DESIRED_REG; RET and we will have to deal with potential side effects that need additional adjustments.

Here we will choose the more benign 0x080664f5. This gadget alters the value of EAX, but we do not rely on anything specific set in EAX at this time, so it is useful. We append this to our ROP chain.

We repeat the process, this time to set EDX to 7, which is the RWX protection level.

This time we select the gadget at 0x08066f24, which does not mess with our previously set registers.

Finally, we need to set EAX to the syscall number 0x7d. We search gadgets containing POP EAX and do not find anything that will not alter our current setup.

We could try to reorder our gadgets in a different way, but we will just search for another gadget that does XCHG EAX, EBP and chain it with the ubiquitous POP EBP; RET.

From here we take 0x804f94a and 0x804c39e and append them.

The registers are now configured as desired to execute the mprotect system call. In order to do this, we need to call INT 0x80, which notifies the kernel that we want to execute the system call.

However, when we look for gadgets containing this instruction, we find none. This can make things a bit more difficult.

Luckily, there is another place where we can find this kind of gadget. All user space applications have a small shared library mapped into their address space by the kernel that is called vDSO (virtual dynamic shared object). This exists for performance reasons and is a way for the kernel to export certain functions to user space and avoid the context switch for functions that are called very often. If we take a look at the man page, we will see something interesting:

This means there is a function in the vDSO that might know how to perform system calls. We can inspect what this function does in GDB.

As can be observed in the screenshot above, __kernel_vsyscall contains a useful gadget. We execute the process a few times and realize that this mapping is not affected by ASLR, which allows us to use this gadget. The values of EBX, ECX and EBP do not really matter right now, as they will be set after the system call is executed anyway.

We update the exploit code to send the chain we built and attach GDB to the running SMB binary.

EIP will redirect execution to our first gadget, so it is a good idea to put a breakpoint at 0x804c39d, which is the start of the chain.

Use stepi to observe how the registers are set to the desired values. Right after INT 0x80, we can list the mapped areas and if everything worked OK, the heap will be marked as RWX.

The remaining piece consists of storing arbitrary code in the heap at a known location so we can jump there and get a shell, but how can we do this?

When we put a breakpoint in read(), we observed that the request data was being stored somewhere in the heap. In addition, we had various samples of Negotiate Protocol Request requests, so it is possible to determine that if the message type byte is set to 0x00 then we are going to reach some path in the program where the payload will be processed and stored in the heap.

To test this assumption, let’s put a breakpoint in read() again and change the PoC payload to send a benign Negotiate Protocol Request message with 512 “A”s as content. As a reminder, the format is:

message type (1 byte) — flags (1 byte) — message length (2 bytes) —message

This time message type will be set to NETBIOS_SESSION_MESSAGE (0x00). We are not using another Session Request message (0x81) to avoid accidentally triggering the vulnerability and having to deal with the “.” characters that the vulnerable function places in between.

Step through the read function until the 0x204 bytes (512 “A”s + the 4 byte header) are read from the network. As stated before, ECX contains the address of the buffer.

Inspecting the memory contents at the specified address shows our payload.

Press c to allow execution to continue normally and send a new request to check if the previous one gets overwritten or if it is just left there in the heap for now.

When the breakpoint is reached again, we try to print the contents of the read buffer and unfortunately we realize that it has been zeroed out.

However, the previous request could still be lingering somewhere else if the application made a copy that was not zeroed out. It is possible to search the current address space with PEDA by using the find or searchmem commands. Our message consists of 512 “A”s, so we attempt to find a contiguous block of “A”s. The commands take an optional parameter separated by a white-space to confine the search to a specific area. We are only interested in results that might be present in the heap.

This means that the contents of the request are being copied and left in some buffer that is not cleared out. We need to make a few more tests to be able to trust this location to store our payload. In particular, if we change the script to send 512 “B”s instead of “A”s, we will see that 0x8085074 will end up containing the “B”s after the request is processed. We need data to persist throughout additional requests, so this is not good.

However, if we first send 512 “A”s and then let’s say 256 “B”s, it will become evident that the first half is overwritten but the second half will still contain the bytes from the previous request. The weird looking 0x00000e89 is chunk metadata from the heap control structures and is not relevant to our scenario.

Knowing that the data will be persistent across at least two requests, we can craft the following plan:

  1. Send a Negotiate Protocol Request with the code we want to execute. The first part will be a few hundred NOP instructions because these bytes will be overwritten when we issue the second request with the corresponding Session Request message that triggers the vulnerability.
  2. Send a Session Request message that corrupts the stack, ROPs to mprotect, marks the heap as executable and jumps to the hard-coded location where the payload from #1 is stored, abusing the fact that the heap base is not randomized.

We make an arbitrary decision to leave 512 bytes for the second request, so we will jump at the hard-coded location of 0x8085074 + 512 = 0x8085270. This address needs to be appended to our ROP chain. The previous gadget will execute its final RET instruction, 0x8085270 will be popped from the stack and the program execution will follow.

The first version of the shellcode will consist only of INT3 instructions so the debugger breaks upon execution. The opcode for INT3 is CC.

The script is also modified to open two connections, one for each request.

Attach to a new SMB process and run the exploit.

We are now executing arbitrary code. Let’s generate a reverse shell payload with msfvenom.

We modify the first stage to store this payload and run the exploit again.

This time we open a netcat listener at the specified port so we can receive the connection.

…and we have a shell. Hooray!

Conclusion

Fuzzing a network service using a mutation-based approach can be done with very little effort and may produce great results.

If you are looking for vulnerabilities in applications that talk arcane proprietary protocols or are just too lazy to build a comprehensive template, give dumb fuzzing a go. You can even leave the fuzzer running with minimum effort while you apply your ninja reverse engineering skills to understand the protocol and build something better.

RouterOS powered devices are now everywhere and the lack of modern (for arbitrary definitions of modern) exploit mitigations is a bit worrisome. Having full ASLR enabled would make the life of exploit writers a bit more difficult, and most stack overflows would be rendered unexploitable in the absence of info-leak vulnerabilities if the binaries were compiled with stack canaries support.

It is worth mentioning that MikroTik’s response and patch times were great. At first the changelog did not hint a security vulnerability existed:

What's new in 6.41.3 (2018-Mar-08 11:55): 
*) smb - improved NetBIOS name handling and stability;

However, they seem to be more serious now. They include more detailed comments in their changelogs regarding security vulnerabilities and seem to have a blog where they post official announcements regarding these types of issues as well.

The astute reader might have noticed that if you reproduced the steps outlined in this post, you might even have found a few additional 0days in RouterOS SMB.

Have fun!

Additional resources