Trendyol CDN — 1

Levent CENGIZ
Published in Trendyol Tech
6 min read · Jun 22, 2020

The purpose of a content delivery network (CDN) is to deliver content to end users quickly and without interruption. The servers in a CDN are located at different geographical points and connected through several ISPs. In this way, users receive the content from the point closest to them and access it with minimal delay.

First, I would like to touch upon Trendyol’s CDN infrastructure. Trendyol actively uses a multi-CDN architecture: part of the incoming traffic is served by a third-party CDN provider, while the rest is served by TrendyolCDN (Girdap), which we developed in-house. Under normal conditions, the traffic exceeds 100 Gbps.

Redundancy, sustainability, and scalability were the primary factors behind our migration to a multi-CDN architecture. We can summarize the improvements we have made on Trendyol CDN (Girdap) as follows.

WebP Migration

WebP offers better compression than JPEG, so we achieve an average 40% reduction in file size on supported platforms. This indirectly reduces our bandwidth usage, and users can access the content much faster.
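One common way to decide whether a platform is "supported" is to look at the Accept request header, which browsers with WebP support typically send with an image/webp entry. The sketch below only illustrates that idea; the helper names accepts_webp and pick_image_format are hypothetical and are not part of the Girdap setup.

```shell
# Hypothetical helper: succeeds when the Accept header value mentions image/webp.
accepts_webp() {
  case "$1" in
    *image/webp*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: choose which variant of an image to serve.
pick_image_format() {
  if accepts_webp "$1"; then
    echo "webp"
  else
    echo "jpeg"
  fi
}

# A client that advertises WebP support, and one that does not.
pick_image_format "image/avif,image/webp,image/apng,*/*;q=0.8"   # prints: webp
pick_image_format "image/png,image/*;q=0.8,*/*;q=0.5"            # prints: jpeg
```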

Object Storage Migration

With OpenStack entering our lives, we also migrated to an object storage (OSS) structure, which offers high scalability and redundancy and can be scaled in a more agile way. Previously, the entire CDN content of Trendyol was hosted on legacy file servers, which did not scale sufficiently at Trendyol’s current scale. We therefore had to split the file server side and serve the content from several file servers. Writing the extra rules this required on the load balancer constrained us both operationally and in terms of system flexibility. By migrating the Trendyol CDN to OSS, we almost removed the limits on scalability and efficiency. On the OSS side, we have 500 TB of storage with the current physical resources. Besides, the object storage used in Trendyol’s CDN infrastructure is the first major project we have taken to production on OpenStack at a high scale.

We receive a lot of questions such as "How can I develop my own CDN infrastructure?" In this post, we will try to explain how to build a DIY CDN at its simplest.

To create our CDN infrastructure, we will use Tengine as the web server and CentOS as the operating system.

What is Tengine?

Tengine is a web server developed by Taobao, one of the largest e-commerce sites in Asia. It was forked from Nginx and has many advanced features. Tengine also powers many large-scale websites, including taobao.com and tmall.com, stably and efficiently.

Go to the following URL to access the Tengine RPM that we have custom-built for this article.

https://trendyol.dsmcdn.com/RPM/tengine-2.3.2-1.el7.ngx.x86_64.rpm

This Tengine package has been built specially for this article and covers everything discussed in it: VTS (Virtual Host Traffic Status) support, Async SSL support, Brotli support, Headers More support, Cache Purge support, Concat and vnswrr support, Fancyindex support, and the Dynamic SSL Buffer Size feature.

Let’s start by activating BBR on CentOS. We touched upon BBR in our previous post, where you can find the details and perform the BBR installation.
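Before following the BBR steps, it can help to confirm that the running kernel is new enough, since BBR has been part of the mainline Linux kernel only since version 4.9 and the stock CentOS 7 kernel is from the 3.10 series. The following is a minimal sketch; kernel_supports_bbr is a hypothetical helper that only compares version numbers and changes nothing on the system.

```shell
# Hypothetical helper: succeeds when a kernel release string is 4.9 or newer.
kernel_supports_bbr() {
  local release="$1"
  local major="${release%%.*}"      # "3.10.0-1127.el7.x86_64" -> "3"
  local rest="${release#*.}"        # -> "10.0-1127.el7.x86_64"
  local minor="${rest%%[!0-9]*}"    # -> "10"
  [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 9 ]; }
}

if kernel_supports_bbr "$(uname -r)"; then
  echo "kernel is new enough for BBR"
else
  echo "kernel is older than 4.9; BBR is not available"
fi

# Read-only checks of the current settings once BBR is enabled:
#   sysctl net.ipv4.tcp_congestion_control
#   sysctl net.core.default_qdisc
```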

Assuming the BBR installation is complete, we can proceed to the Tengine installation. You can install Tengine as follows.

wget https://trendyol.dsmcdn.com/RPM/tengine-2.3.2-1.el7.ngx.x86_64.rpm

yum localinstall tengine-2.3.2-1.el7.ngx.x86_64.rpm

Now that we have completed the installation, we can proceed to the configuration side.

/etc/nginx/nginx.conf content

Ports 3380 and 3381 are defined for VTS (Virtual Host Traffic Status) metrics. As these ports are not among the default HTTP ports known to SELinux, you can either add rules for them or turn off SELinux completely.

You can display the ports known for HTTP services with the help of the command below.

semanage port -l | grep -w http_port_t

Using the commands below, you can enable ports 3380 and 3381 on SELinux.

semanage port -a -t http_port_t -p tcp 3380
semanage port -a -t http_port_t -p tcp 3381
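Adding a port that is already defined makes semanage report an error, so a small check beforehand can make the step repeatable. The sketch below is only an illustration: port_is_listed and ensure_http_port are hypothetical helpers, the sample line imitates typical semanage output, and port ranges such as 5900-5983 are not handled.

```shell
# Hypothetical helper: succeeds when PORT appears in one line of
# `semanage port -l` output (columns: type, protocol, comma-separated ports).
port_is_listed() {
  local port="$1" line="$2"
  printf '%s\n' "$line" | awk -v p="$port" '{
    for (i = 3; i <= NF; i++) { gsub(",", "", $i); if ($i == p) found = 1 }
    exit(found ? 0 : 1)
  }'
}

# Hypothetical helper: add the port to http_port_t only when it is missing.
# Requires root and is therefore not called in this sketch.
ensure_http_port() {
  local port="$1" line
  line="$(semanage port -l | grep -w http_port_t | head -n 1)"
  if port_is_listed "$port" "$line"; then
    echo "port $port is already allowed"
  else
    semanage port -a -t http_port_t -p tcp "$port"
  fi
}

# Illustrative sample line, not captured from the servers in this article.
sample='http_port_t                    tcp      80, 81, 443, 488, 8008, 8009, 8443, 9000'
port_is_listed 443 "$sample" && echo "443 is listed"
port_is_listed 3380 "$sample" || echo "3380 is not listed yet"
```

On the CDN host, the helper would be used as ensure_http_port 3380 followed by ensure_http_port 3381.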

Or you can turn SELinux off completely. The change in /etc/selinux/config takes effect after a reboot.

sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

/etc/nginx/conf.d/test.conf content

Let’s start examining the Config content we have created:

proxy_cache_path /girdap/cache_path keys_zone=girdap_test:1024m levels=1:2 inactive=1d max_size=1g use_temp_path=off;

The first configuration we perform on the Tengine side defines where and how the cache is stored: the path, the zone, and the size. With the directive above, we keep the cached content in /girdap/cache_path with a maximum size of 1 GB, use a 1:2 directory hierarchy, and create a zone named girdap_test with 1024 MB allocated for the cache metadata. The inactive parameter means that cached content is removed if it is not accessed for 1 day.
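To get a feel for what the keys_zone size means, the Nginx documentation states that a one-megabyte zone can store about 8 thousand keys. The arithmetic below applies that rule of thumb to the 1024 MB zone above; approx_zone_keys is a hypothetical helper and the result is a rough estimate, not a measured value.

```shell
# Hypothetical helper: rough number of cache keys a keys_zone of the given
# size (in megabytes) can hold, using the "about 8 thousand keys per
# megabyte" figure from the Nginx documentation.
approx_zone_keys() {
  local zone_mb="$1"
  local keys_per_mb=8000
  echo $((zone_mb * keys_per_mb))
}

echo "approximate key capacity: $(approx_zone_keys 1024)"
# prints: approximate key capacity: 8192000
```

With max_size=1g, the disk limit is likely to be reached long before the zone fills up, so in this test setup the practical bound is the cache size rather than the metadata zone.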

Tengine derives the file name and directory hierarchy of each cached object from the proxy_cache_key definition that we have specified in the configuration.

The variables that we use in proxy_cache_key are as follows.

  • $host=domain
  • $request_uri=URL

Let’s create a cache entry based on the content at http://trendyol-temp/web/logo/ty-elite.svg

The first request returns X-Cache: MISS because the content is not yet available on our cache servers. When we request the same URL again, we can see that the content has been cached.
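The MISS and HIT behavior can be checked from the command line by reading the X-Cache response header. The sketch below assumes that the server configuration exposes the cache status in a header named X-Cache, as in the responses described above; cache_status is a hypothetical helper, and the sample headers are illustrative rather than captured from a real server.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and print
# the value of the X-Cache header, if present.
cache_status() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "x-cache" { print $2; exit }'
}

# Illustrative responses for a first and a second request.
printf 'HTTP/1.1 200 OK\r\nContent-Type: image/svg+xml\r\nX-Cache: MISS\r\n\r\n' | cache_status
printf 'HTTP/1.1 200 OK\r\nContent-Type: image/svg+xml\r\nX-Cache: HIT\r\n\r\n' | cache_status

# Against the test host from this article, the same check would be:
#   curl -sI http://trendyol-temp/web/logo/ty-elite.svg | cache_status
```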

The cache entry was created based on the variables we used in proxy_cache_key. To find the cache file on the file system, we need to do a little reverse engineering, because Nginx places the content on the file system according to the levels=1:2 definition mentioned above.

$host: trendyol-temp

$request_uri: /web/logo/ty-elite.svg

To access the file, let’s first take the MD5 hash of the trendyol-temp/web/logo/ty-elite.svg combination.

The resulting hash key is 9543e65e8327ace5357b120c276fe7ca. In the levels=1:2 definition, 1 refers to the last character of the hash, which is a, and 2 refers to the two characters preceding it, which are 7c.

Accordingly, we can access the cache file we have created via the file path below.

/girdap/cache_path/a/7c/9543e65e8327ace5357b120c276fe7ca
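The lookup above can be sketched as a small script. The helper names cache_path_for_hash and cache_path_for_key are hypothetical; the sketch assumes a levels=1:2 layout and that the cache key is the plain concatenation of $host and $request_uri, as described in this article.

```shell
# Hypothetical helper: build the on-disk path for an MD5 hash under a
# levels=1:2 cache root (last character, then the two characters before it).
cache_path_for_hash() {
  local root="$1" hash="$2"
  local last1 head prev2
  last1="${hash#"${hash%?}"}"     # last character of the hash
  head="${hash%?}"                # hash without its last character
  prev2="${head#"${head%??}"}"    # the two characters before the last one
  printf '%s/%s/%s/%s\n' "$root" "$last1" "$prev2" "$hash"
}

# Hypothetical helper: hash a cache key with MD5 (no trailing newline)
# and return the path where the cached file would be stored.
cache_path_for_key() {
  local root="$1" key="$2" hash
  hash="$(printf '%s' "$key" | md5sum | cut -d' ' -f1)"
  cache_path_for_hash "$root" "$hash"
}

cache_path_for_hash /girdap/cache_path 9543e65e8327ace5357b120c276fe7ca
# prints: /girdap/cache_path/a/7c/9543e65e8327ace5357b120c276fe7ca
```

For a live lookup, the key would be passed directly, for example cache_path_for_key /girdap/cache_path "trendyol-temp/web/logo/ty-elite.svg". Using printf '%s' instead of echo matters here, because a trailing newline would change the hash.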

You can access detailed statistics using VTS, which is an Nginx module.

In our next post, we will touch upon Brotli, Async SSL (With QAT), OS Base Tuning. See you :)

Fist bump, keep in touch!
