Introduction to Unicode attacks

Charithra Kariyawasam
Published in White Hats
3 min read · Oct 3, 2017

Introduction

This attack exploits flaws in the way applications decode Unicode-encoded data. An attacker can encode certain characters in a URL so that they slip past application filters, and thereby access restricted resources on the web server or force browsing to protected pages. Before looking at the attack itself, let’s briefly review Unicode.

Brief introduction to Unicode

Early character encodings conflicted with one another. That is, two encodings could use the same number for two different characters, or use different numbers for the same character. Any given computer (especially a server) needed to support many different encodings, and whenever data passed between different computers or different encodings, it ran the risk of corruption.
The Unicode Standard provides a unique number for every character, no matter what platform, device, application or language. It has been adopted by all modern software providers and now allows data to be transported through many different platforms, devices and applications without corruption.
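
To make the conflict concrete, here is a small sketch in Python. The choice of the byte 0xA4 and of the two legacy encodings is purely illustrative and does not come from the original article; any pair of encodings that disagree on a byte would do.

```python
# The same byte means different characters under two legacy encodings.
raw = b"\xa4"
as_latin1 = raw.decode("latin-1")      # ISO 8859-1: currency sign
as_latin9 = raw.decode("iso8859-15")   # ISO 8859-15: euro sign

# Unicode assigns each character a single number (code point),
# independent of how it is later serialized to bytes.
euro_code_point = ord("€")             # 0x20AC
euro_utf8 = "€".encode("utf-8")        # three bytes in UTF-8

print(repr(as_latin1), repr(as_latin9), hex(euro_code_point), euro_utf8)
```

The byte is identical in both cases; only the decoding rule differs, which is exactly the kind of ambiguity Unicode was designed to remove.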

How this attack works

In file paths, ../ (or ..\ on Windows) refers to the parent directory. Because these sequences can be used for path traversal, a server must check that a requested URL does not reach outside the permitted directories. Without such a check, a user could use these sequences to reach any file on the server. Servers therefore inspect what each URL requests before serving it.
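
As a rough illustration of why ../ matters, the sketch below resolves a requested path against a document root. The root `/var/www/html` and the helper names `resolve` and `stays_inside_root` are hypothetical, chosen only for this example; `posixpath` is used so the result does not depend on the operating system running the code.

```python
import posixpath

WEB_ROOT = "/var/www/html"  # hypothetical document root

def resolve(requested: str) -> str:
    """Join the request onto the root and collapse any ../ segments."""
    return posixpath.normpath(posixpath.join(WEB_ROOT, requested))

def stays_inside_root(requested: str) -> bool:
    """True if the resolved path is the root itself or lies below it."""
    resolved = resolve(requested)
    return resolved == WEB_ROOT or resolved.startswith(WEB_ROOT + "/")

print(resolve("images/logo.png"))       # stays under the root
print(resolve("../../../etc/passwd"))   # climbs out of the root
```

Each ../ removes one directory level, so three of them are enough to climb from this root to the top of the file system. That is the situation the server's check is meant to prevent.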

But when the URL contains Unicode-encoded characters, this security check may not work. The problem arises when the server checks the URL for traversal sequences first and decodes it afterwards. Suppose the “/” character is encoded as “%c0%af”, an overlong (and strictly invalid) two-byte UTF-8 encoding of “/” that some lenient decoders still accept. The URL passes the security check because it does not contain any “../” pattern; the check sees only “..%c0%af”, which it does not recognize as malicious. Once decoding is complete, however, “..%c0%af” is treated as “../”, and a malicious traversal can be performed.

Some web applications scan the query string for dangerous character sequences such as

  • ..
  • ../
  • ..\

to prevent directory traversal.

However, the query string is usually URI-decoded before use. Applications that scan only the raw string are therefore vulnerable to percent-encoded directory traversal sequences such as:

  • %2e%2e%2f which translates to ../
  • %2e%2e/ which translates to ../
  • ..%2f which translates to ../
  • %2e%2e%5c which translates to ..\
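
The check-then-decode mistake can be sketched in a few lines of Python. Everything here is illustrative: `naive_filter_allows` stands in for a filter that inspects the raw string, and `lenient_utf8_decode` is a deliberately sloppy, hypothetical decoder written for this example to mimic legacy software that accepts overlong two-byte sequences. Python's own UTF-8 decoder is strict and rejects them.

```python
from urllib.parse import unquote, unquote_to_bytes

BLOCKED = ("../", "..\\")

def naive_filter_allows(raw: str) -> bool:
    """Flawed order: inspect the raw string before any decoding."""
    return not any(pattern in raw for pattern in BLOCKED)

def lenient_utf8_decode(data: bytes) -> str:
    """Hypothetical legacy decoder: accepts any two-byte sequence,
    including overlong forms such as C0 AF. Other bytes are passed
    through one at a time (a simplification for this sketch)."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        has_trail = i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF
        if 0xC0 <= lead <= 0xDF and has_trail:
            out.append(chr(((lead & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            out.append(chr(lead))
            i += 1
    return "".join(out)

def legacy_decode(raw: str) -> str:
    """Percent-decode to bytes, then apply the lenient decoder."""
    return lenient_utf8_decode(unquote_to_bytes(raw))

for payload in ("%2e%2e%2f", "%2e%2e/", "..%2f", "%2e%2e%5c"):
    print(payload, naive_filter_allows(payload), repr(unquote(payload)))

print("..%c0%af", naive_filter_allows("..%c0%af"), repr(legacy_decode("..%c0%af")))
```

Every payload passes the raw-string filter, yet each one turns into a traversal sequence once it is decoded. The bytes C0 AF carry the bit pattern of “/” (0x2F) padded into two bytes, which is why a decoder that does not reject overlong forms produces a slash.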

Threats of Unicode attacks

Because the attacker can traverse into unauthorized directories, they can read sensitive data and delete or modify it at will. They may also be able to execute arbitrary commands on the server through crafted URLs. In effect, the traversal gives a low-privileged user escalated privileges.

Countermeasures against Unicode attacks

  1. Apply the patches provided by the server vendor. For example, Microsoft has released a patch for IIS that addresses this vulnerability.
  2. When client input is required from web-based forms, avoid using the GET method to submit data, as the method causes the form data to be appended to the URL and is easily manipulated. Instead, use the POST method whenever possible.
  3. Any security checks should be completed after the data has been decoded and validated as acceptable content (e.g. maximum and minimum lengths, correct data type, does not contain any encoded data, textual data only contains the characters a-z and A-Z etc.).
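
The third countermeasure (decode first, then check) might look roughly like the following. This is a minimal sketch under stated assumptions, not a complete defense: the document root `/var/www/html` and the function name `resolve_request_path` are hypothetical, and rejecting every leftover “%” is a blunt rule that would also refuse legitimate file names containing that character.

```python
import posixpath
from urllib.parse import unquote_to_bytes

WEB_ROOT = "/var/www/html"  # hypothetical document root

def resolve_request_path(raw_path: str) -> str:
    """Decode, validate, canonicalize, and only then check the path.
    Raises ValueError (or its subclass UnicodeDecodeError) on rejection."""
    # 1. Decode once, strictly: overlong forms such as C0 AF fail here.
    decoded = unquote_to_bytes(raw_path).decode("utf-8", errors="strict")
    # 2. Refuse data that is still encoded (e.g. double encoding).
    if "%" in decoded:
        raise ValueError("encoded data remains after decoding")
    # 3. Refuse characters with no place in this application's paths.
    if "\\" in decoded or "\x00" in decoded:
        raise ValueError("forbidden character in path")
    # 4. Canonicalize, then verify the result is still under the root.
    candidate = posixpath.normpath(
        posixpath.join(WEB_ROOT, decoded.lstrip("/"))
    )
    if candidate != WEB_ROOT and not candidate.startswith(WEB_ROOT + "/"):
        raise ValueError("path escapes the document root")
    return candidate

print(resolve_request_path("/index.html"))
```

The point of the ordering is that the containment check runs on the final, canonical path, so it sees exactly what the file system will see. A real server would also need to account for symbolic links and platform-specific path rules, which this sketch does not attempt.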

Summary

The power and flexibility of the Unicode vulnerability make it one of the most popular, and therefore most dangerous, vulnerabilities used by attackers today. Attackers can easily create new and varied attacks with it, and it is easy enough to exploit that a skilled attacker can attack your server “on the fly”, adapting the commands sent to the server to your particular environment. However, the vulnerability can be defeated if a careful system administrator takes a few simple steps, such as moving the web root folder off the logical drive that holds the system executables. Unfortunately, too many systems on the Internet are run by administrators who take a less proactive approach: they only apply security patches instead of designing security into the system from the start.
