Redesign of the Czech TV video platform and SEO perspective

Sarka Jakubcova
Česká televize / Czech TV
18 min read · Mar 29, 2022

Read our case study on the preparation, planning and implementation of the complete rebuild of the Czech TV online video platform with regard to search performance. What needed to be addressed in the SEO redesign of a site with tens of millions of pages and a long history?

The Czech TV video platform is called iVysílání

1 Why does Czech TV need SEO?

SEO is mostly understood as a part of online performance marketing. The first question that may arise is: why would a public service broadcaster need performance marketing? After all, it inherently works differently from a regular company, which sees growth and profit generation as its main goals. As a public service, Czech Television doesn’t use classic performance marketing.

2 Mapping the original solution

In order to successfully plan and implement all the important aspects that are involved in search visibility, we needed to identify the pitfalls of the original site design.

The original iVysílání, and the entire Czech TV website in general, was one big jumble of information. Over its almost 20-year history the site had grown organically: parts piled up on top of each other, features were added gradually, huge amounts of content were published in a chaotic way, and bugs were either patched in various ways or simply left on the site.

From the SEO perspective, not even the basics were being addressed here, which mostly affected the technical state of the site and the basic on-page optimization. On the other hand, lots of unique content and high authority were strong foundations to build on.

When we began planning the SEO side of the redesign in the spring of 2020, there were over 4 million valid pages in Google’s index and about 14 million pages that Google had decided to exclude from the index for various reasons.

This, at first glance, was already a strong disproportion.

In total, we estimated the size of the site at 20–40 million URLs, far too many for us to download data on all of them. Instead, we used the following data samples for the basic analysis:

  • Screaming Frog crawl of all URL types, including subdomains, of approx. 1 million URLs
  • Screaming Frog crawl of HTML pages on ceskatelevize.cz of approx. 1 million URLs
  • data from Google Analytics and BigQuery of the main domain, subdomains and sub-sections of the website
  • Search Console data from the main domain and all subdomains
  • data from Ahrefs and Majestic tools

2.1 The Czech TV domain system

The Czech Television websites consist of a total of approximately 300 separate domains and 200 subdomains. Many of the separate domains are registered to the name of the programme, but not all of them contain content, and many domains are used for internal applications. For our purposes, the following sites are the most important:

Ceskatelevize.cz
The domain contains the sections iVysílání, Programmes, Live Broadcasting, TV Guide, information All about Czech TV, e-shop, teletext and other types of partial content.

In the 1-million-URL crawl we collected data on 918,000 pages, or 91.8% of all data.

CT24.ceskatelevize.cz
From the subdomain of the Czech TV24 news channel, we crawled data from 34,000 pages. The subdomain accounted for 3.4% of all data.

Decko.ceskatelevize.cz
The website of the children’s channel Czech TV:D contained 12,000 pages in our dataset, i.e. 1.2% of all data.

Sport.ceskatelevize.cz
The sports channel contained 7,500 pages and 0.8% of the collected data.

Edu.ceskatelevize.cz
The educational channel for children and young people contained 3,800 pages and 0.4% of our data.

Art.ceskatelevize.cz
Czech TV Art, a channel with cultural and artistic programmes, contained 2,700 pages and 0.3% of our dataset.

Other domains and subdomains contained 22,000 pages, or 2.2% of our dataset.

Representation of Czech TV domains in the dataset

In practice, the page counts are the numbers of pages linked from the main domain ceskatelevize.cz that we were able to crawl. They told us how closely each domain was linked to the main one and how extensive each domain was. As part of the video platform rebuild, we dealt primarily with the main domain and its iVysílání, Programmes, Live Broadcast, and TV Guide parts.

2.2 Duplications… just… everywhere

If the previous iVysílání solution had to be described in a few words, it would be “you can find everything everywhere”. We found hundreds of types of duplications and content similarities on the site itself and between subdomains of the site. Here is the list of the most important ones.

2.2.1 Different versions of URLs

Most of the important URLs of the main site existed in slash and non-slash versions, with some links leading to one version and others to the other. The main URL for iVysílání serves as an example:

https://www.ceskatelevize.cz/ivysilani
315,587 unique internal links

https://www.ceskatelevize.cz/ivysilani/
704,099 unique internal links

Ratio of internal links to the slash and non-slash version of the URL

Another related issue was the unfinished migration of some parts of the site to https protocol and internal links that led to insecure http.

This condition not only complicated search visibility, as multiple versions of the URL were indexed, but was also confusing for analysts, for example, when evaluating the performance of individual pages.

Therefore, one of the top-priority tasks was to merge all URL variants across the entire web system into one and, ideally, to redirect the other variants to it in a single step. All URLs were given the form

https://www.ceskatelevize.cz/ivysilani/
and all the other variants were permanently redirected to it. For example:
https://www.ceskatelevize.cz/ivysilani
https://ceskatelevize.cz/ivysilani/
http://www.ceskatelevize.cz/ivysilani
http://ceskatelevize.cz/ivysilani/
and others.
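
To make the rule concrete, here is a minimal Python sketch of such a normalization. It is an illustration only, not the code used on the Czech TV infrastructure; the function name and the "a dot in the last segment means a file" heuristic are assumptions made for this example.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.ceskatelevize.cz"
BARE_HOST = "ceskatelevize.cz"


def canonical_url(url: str) -> str:
    """Map any scheme/host/slash variant of a main-domain URL to one form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Only the bare and www hosts are unified; real subdomains keep their host.
    if host in (BARE_HOST, CANONICAL_HOST):
        host = CANONICAL_HOST
    path = parts.path or "/"
    # Assumption: a last segment containing a dot is a file and gets no slash.
    last_segment = path.rsplit("/", 1)[-1]
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    # The scheme is always forced to https.
    return urlunsplit((CANONICAL_SCHEME, host, path, parts.query, parts.fragment))
```

With this sketch, all four variants listed above map to https://www.ceskatelevize.cz/ivysilani/, which is the property a single-step redirect needs.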

2.2.2 iVysílání vs. programmes

On the original website, the video platform consisted of two basic sections:

  • iVysílání: the section served as a simple player. Here we found a selection of videos sorted alphabetically, by date and by category. All content, including the videos themselves, was placed on URLs in the /ivysilani/ directory.
  • Programmes: the section served as a catalogue of programmes. The programmes could be played here as well, and there was also an alphabetical selection and a slightly wider selection of categories and genres. In addition, most of the programmes had quite rich secondary content, i.e. pages with various related bonus information. Everything, including videos and listing pages, was located in the /programmes/ directory.

So, if we typed a simple query for the title of a programme into the search, e.g. a cooking programme called “Boys in Action”, there were two possible destinations, with nothing to clearly identify the user’s intent.

Either to the URL https://www.ceskatelevize.cz/ivysilani/10084897100-kluci-v-akci/ in the iVysílání section, where the latest episode immediately started playing. There we also found all the other episodes, a discussion panel or the option to write a note to the editor.

Boys in Action — player in iVysílání

Or we could go to the programme’s page in the Programmes section at a different URL, https://www.ceskatelevize.cz/porady/10084897100-kluci-v-akci/, where, in addition to all the episodes, we found the secondary content for the programme, such as recipes.

Boys in Action — page in section Programmes

The new solution removed this duality and merged all videos in one place in the Programmes section. From the SEO perspective this was the right choice, because each programme now has a single URL instead of two competing ones.

2.2.3 Listing pages and duplicate navigation

The same duality was also found in the selection of programmes within their categories and for other listing pages containing videos. For example:
https://www.ceskatelevize.cz/ivysilani/podle-abecedy/
https://www.ceskatelevize.cz/porady/a-z/

However, different pages pursuing the same goal with similar content existed not only across the two sections of the site but, to a large extent, also within a single section. To give an example, the following URLs existed for the movie selection category:

https://www.ceskatelevize.cz/ivysilani/filmy
https://www.ceskatelevize.cz/porady/filmy
https://www.ceskatelevize.cz/porady/tema/filmy/
https://www.ceskatelevize.cz/porady/tema/vyber/filmy-serialy/

In addition to the examples above, there were hundreds of other individual instances of duplicate and similar content on the web, which we decided to crack down on within the iVysílání project.

2.3 Every programme with its own website

Programme pages often had very rich secondary content built on top of them. This content didn’t have a predetermined structure, so each programme was effectively a separate small website within the Czech TV website, often with a completely unique content and URL structure.

We found 165 unique parameters in the structure of the entire site, most of which were used specifically in the programme section. For example, there were as many as 7 unique parameters for the pagination of content on the site:

?page=
?pg=
?strana=
?stranka=
?pageDiv=
?from=
?chapter=
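
A parameter inventory of this kind can be produced from any list of crawled URLs. The following Python sketch shows one way to do it; it is not the tooling we used, and the function names are hypothetical. Only the seven pagination parameter names come from the list above.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit

# The seven pagination parameters listed above.
PAGINATION_PARAMS = {"page", "pg", "strana", "stranka", "pageDiv", "from", "chapter"}


def parameter_inventory(urls):
    """Count how many crawled URLs use each query parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so that an empty "?page=" is still counted.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


def pagination_variants(counts):
    """Return the pagination parameter names that actually occur in the crawl."""
    return sorted(name for name in counts if name in PAGINATION_PARAMS)
```

Running `parameter_inventory` over a crawl export gives the number of distinct parameters as `len(counts)`, and `pagination_variants` shows how many synonyms for "page number" are in use.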

Secondary programme content was the largest page type on the site, totaling around 5 million separate URLs. The vast majority of this content received traffic only while the programme was being broadcast. Only about 10% of the secondary content had permanent traffic. The most popular types of this content are games, recipes and encyclopaedic content.

Examples of popular secondary content for programmes:

Game — AZ Quiz
Recipes — Herbarium
Encyclopedic content — Michael’s Experiments

Regarding current or permanently popular programmes, information about the creators and cast, filming locations, etc. is also important.

The rest, i.e. 90% of all secondary content, were so-called zombie pages, without a single user visit in years. We found hundreds of thousands of pages on the web that discussed, for example, what a programme’s theme song was in 2007 or who won an SMS contest in 2012. We decided to get rid of these relics and leave only the content currently interesting to users and generating stable traffic on the site after the redesign.

2.4 Index clutter and crawling barriers

Over the years, the accumulating and unresolved errors have certainly not made the search engines’ job a piece of cake. Not only was the website giving them a hard time with duplicates and large amounts of worthless content, we found many other technical errors. Here are a few examples of the most important ones from the SEO point of view:

  • Indexing temporary URLs for embedded videos in the iframe player e.g. https://www.ceskatelevize.cz/ivysilani/embed/iFramePlayer.php?IDEC=291%20383%2060485&index=743663&origin=artzona&hash=6f6585c9387fe08c27e306ad51f4a7361cbebccd
  • Indexing of, and bulk internal linking to, internal search results pages, which cannibalized mainly the programme pages.
  • Automatic creation of parameters when using the site, e.g. when clicking on help. This created millions of additional unnecessary URLs. E.g. https://www.ceskatelevize.cz/ivysilani/napoveda/kontaktni-formular/?program-url=/ivysilani/10220197828-co-dite-to-muzikant/30929434018/
  • Indexing of user forms for the programme from the Discussion and Contact Us sections, and of parametric links for sharing the programme or setting a reminder. A large number of these URLs were in the index while being both blocked from crawling in robots.txt and marked noindex in the robots meta tag. Because crawlers could not fetch the pages, they never saw the noindex, so in practice the URLs could have remained in the index forever.
  • Indexed and unoptimized filtering parameters in different parts of the site.
  • A huge number of redirects and 404 errors in internal linking.
  • Poorly configured canonical tags.
  • No sitemap.
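
The conflict between robots.txt and noindex from the list above is easy to detect automatically. Below is a small Python sketch using the standard library robots.txt parser. The function name is hypothetical, and the caller is assumed to know already (e.g. from a crawl export) whether a page carries noindex.

```python
from urllib.robotparser import RobotFileParser


def noindex_is_unreachable(robots_txt: str, url: str, has_noindex: bool,
                           user_agent: str = "Googlebot") -> bool:
    """True when a page carries noindex but robots.txt hides it from crawlers."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines.
    parser.parse(robots_txt.splitlines())
    blocked = not parser.can_fetch(user_agent, url)
    # A blocked crawler never fetches the page, so it never sees the noindex.
    return has_noindex and blocked
```

Any URL for which this returns True needs one of the two signals removed. For pages that should leave the index, that means lifting the crawl block so the noindex can be read.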

2.5 Complete absence of basic on-page optimization

Even basic on-page optimization was completely missing from the site. The biggest problems fixed in the new iVysílání were:

  • Repetitive and often inappropriately constructed headlines, where the person who was searching often didn’t know exactly which page they were clicking on.
  • The same H1 heading for a large number of pages, very often just iVysílání.
  • Missing page meta descriptions, which often caused funny snippets describing menu items or listing letters of the alphabet.
Original snippet of the main show listing
  • Cluttered images, unoptimized titles and missing alternate descriptions, which, among other things, caused the site to perform very poorly in image searches.
  • Unoptimized video content, which was, among other things, causing poor performance in video search.

2.6 Huge amount of redirects

The redirects within Czech TV websites are a huge system that has gradually developed and expanded over time. At the same time, it is stretched across all Czech TV domains and subdomains and across several levels of implementation:

loadbalancer > web server > application server > administration

As mentioned above, Google indexed over 4 million valid pages. However, it also had 6.3 million redirects in the index. So Google was getting more redirecting URLs from us than URLs with valid content.

According to the Ahrefs tool, there were over 30.8 million unique links to ceskatelevize.cz, but 11.5 million of them were to URLs that redirected somewhere further and led through more than one redirect (redirect chain).

An analysis of access logs from a 4-day period in December 2020 showed us that the server had to handle 5.3 million redirects, i.e. roughly 1.3 million a day or about 15 every second. And millions more redirects were likely returned by other parts of the infrastructure (loadbalancer, web server).
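
A count like this can be obtained with a few lines of Python. The sketch below assumes the widely used common/combined access log format, where the status code follows the quoted request line; the actual log format on the Czech TV servers is not described here, so treat the regular expression as an assumption.

```python
import re
from collections import Counter

# Assumed format: ... "GET /path HTTP/1.1" 301 178 ...
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def count_redirects(log_lines):
    """Tally 3xx responses per status code from access-log lines."""
    counts = Counter()
    for line in log_lines:
        match = STATUS_RE.search(line)
        # Only status codes in the 3xx range are redirects.
        if match and match.group(1).startswith("3"):
            counts[match.group(1)] += 1
    return counts
```

Summing the returned counter gives the total number of redirects, and the split between 301 and 302 shows how consistently the two codes are used.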

The large amount of these redirects was mainly caused by:

  • bad URL versioning,
  • incomplete http to https migration, and the enforcement of http by some applications in redirects,
  • unclear use of 301 and 302 redirects and their alternation,
  • using friendly URLs to present programmes in different versions of the site and linking to them in an inconsistent way. These friendly URLs were then redirected multiple times. The number of redirects was often 6 or more, whereas Google follows a maximum of 5 steps.
Multiple redirects of an internal link to a friendly programme URL
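
To show how such chains can be measured, here is a short Python sketch that walks a redirect map and reports the number of hops. The map is assumed to come from a crawl export (source URL to target URL); the function names are hypothetical, and the default limit of 5 is simply the figure quoted above.

```python
def redirect_chain(start: str, redirects: dict[str, str], max_hops: int = 20) -> list[str]:
    """Follow a redirect map from start and return every URL visited."""
    chain = [start]
    seen = {start}
    while chain[-1] in redirects and len(chain) <= max_hops:
        nxt = redirects[chain[-1]]
        chain.append(nxt)
        if nxt in seen:  # redirect loop detected, stop here
            break
        seen.add(nxt)
    return chain


def too_long(chain: list[str], limit: int = 5) -> bool:
    """True when the chain needs more hops than the given limit."""
    # A chain of n URLs contains n - 1 redirects.
    return len(chain) - 1 > limit
```

A friendly URL that passes through six redirects yields a chain of seven URLs, and `too_long` flags it.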

First of all, it was important to fix several systemic problems and to strip away the layers of bugs and broken or unnecessary redirect rules that had built up at the different levels of redirection. Last but not least, it was necessary to point the internal links at the correct version of each URL, thus removing millions of unnecessary URLs directly from the site’s linking. This set the stage for the millions of new redirects that were needed when we launched the new iVysílání project.

2.7 High authority and referral traffic

Clearly the strongest area regarding SEO is the link profile and authority of the Czech TV website. In general, Czech Television has a lot of content that naturally generates thousands of links. It’s a high authority domain with millions of backlinks, tens of thousands of referring domains and hundreds of thousands of organic keywords.

Link Profile Overview — Spring 2020

The goal for the new platform was to migrate the entire link profile by consistently redirecting old URLs to new ones, and also to fix about 32,000 broken links that led to Czech TV sites. Other than that, we decided not to take any further action on authority.

3 SEO strategy for the new platform and its implementation

Based on the state of the website before the redesign and the new iVysílání project, we built an SEO strategy and started to implement it gradually. It’s important to mention that the whole new iVysílání project ran in MVP mode, i.e. with the goal to start with a basic viable version, release it and then develop it agilely. Within the MVP, we addressed the following areas in relation to SEO:

  • database clean-up and maintenance
  • reworked category and programme selection logic
  • consolidation of programme sections into one place and their new presentation
  • maintaining the most visited secondary content
  • new internal search technology
  • improved live streaming
  • global headers and footers for websites

Below we describe how we planned and implemented the parts of the project that were essential for SEO.

3.1 Cleaning up the index and troubleshooting global errors

First and foremost, and regardless of the state of the new project, we did our best to clean up the index and make the site, in its entirety, much easier to traverse. This meant:

3.1.1 Unifying domain and URL versioning

Here, we managed to unify all the iVysílání URLs to https://www.ceskatelevize.cz/ivysilani/. All other versions are redirected to the default URL in one or two steps max.

3.1.2 Cleaning up the original redirect mess

The main tasks that we implemented included removing the http protocol enforcement in the historical applications providing redirection, and unifying the redirect status code to 301. We also removed the accumulated ancient and no longer necessary redirect rules, and we fixed the most important internal links on the site so that they lead directly to the correct version of the URL. These four steps have made the redirect chains significantly shorter. After clicking on the friendly URL in the example below, the chain shortened from six steps to just three.

New path from the link to the programme’s friendly URL to the programme page
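
Where all rules live in a single layer, a chain can in principle be collapsed completely by pointing every source straight at its final target. The Python sketch below illustrates that idea on a plain redirect map. It is a simplified model with a hypothetical function name: our redirects are spread over several infrastructure layers, which is why the real chain above shrank to three steps rather than one.

```python
def flatten_redirects(redirects: dict[str, str]) -> dict[str, str]:
    """Point every source URL straight at its final destination."""
    flat = {}
    for source in redirects:
        target = redirects[source]
        seen = {source}
        # Walk the chain until we reach a URL that no longer redirects.
        # The seen set stops the walk if the rules contain a loop.
        while target in redirects and target not in seen:
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat
```

After flattening, each old URL needs exactly one redirect. Loops are not resolved by this sketch; an entry that ends up pointing at itself indicates one and needs a manual fix.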

3.1.3 Avoiding the creation or indexing of unnecessary URLs

The goal of this action was to literally pull out as many unnecessary URLs as possible from the current index. Specifically, we managed to incorporate the following points before the launch of the new iVysílání:

We removed the robots.txt crawl ban on pages that carry noindex, allowing search engines to read and apply the noindex.

We removed and redirected redundant help URLs. Instead of linking to a URL with a show parameter, links led to a clean URL with no parameter.

So instead of:
https://www.ceskatelevize.cz/ivysilani/napoveda/kontaktni-formular/?program-url=/ivysilani/10220197828-co-dite-to-muzikant/30929434018/

they all led to a single page:
https://www.ceskatelevize.cz/ivysilani/napoveda/kontaktni-formular/
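
The same clean-up can be expressed as a small function that drops unwanted parameters and returns the redirect target. This Python sketch is illustrative only; the function name is hypothetical and `program-url` is the one parameter taken from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters that only multiply URLs without changing the page content.
DROPPED_PARAMS = frozenset({"program-url"})


def strip_params(url: str, dropped=DROPPED_PARAMS) -> str:
    """Return the URL without the listed query parameters."""
    parts = urlsplit(url)
    kept = [(name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
            if name not in dropped]
    # urlencode of an empty list is "", so no dangling "?" is left behind.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))
```

If the returned URL differs from the requested one, the request can be answered with a permanent redirect to the clean version.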

We put noindex on the new URL of the temporary video iframes.

We pulled all the known parameters from the URL and set in Search Console how Google should treat them.

From the original 4.7 million pages indexed by Google, the number has dropped to 2.8 million. In contrast, the number of excluded pages has risen to 17.7 million. This huge number includes newly deployed redirects.

Desirable decline in indexed pages

As for Seznam.cz (a Czech search engine), its webmaster tools showed a decline over the course of the launch from 700,000 indexed pages and 500,000 pages with errors to 360,000 indexed pages and 50,000 pages with errors.

3.1.4 Global internal linking fixes

The biggest contribution in this respect was clearly the reconstruction and unification of the headers and footers of all Czech TV sites. In particular, there was a lot of clutter in the various footers of different and historical sites. Redirect chains from the footer could account for up to 80% of all redirected links in internal linking.

CT24 footer: Orange — redirect chain, red — 404 error

Within our 1-million-URL dataset, the percentage of redirects in internal linking dropped from 42% to just 3% after the launch of the new site. Not only did this tremendously relieve the infrastructure that had to handle these unnecessary 301 and 302 redirects, but it also greatly improved the user experience and made search engine robots’ crawling of the site much more efficient.

As part of the new design, we made sure that all links ideally lead straight to the landing page, without unnecessary redirects or even links that lead to 404s. We also ensured that iVysílání doesn’t bulk link to internal search results pages, but always to the classic site page.

3.2 Clarifying the new URL structure

The goal was to preserve as many URLs as possible the way they existed on the original site. The big SEO achievement was to keep the URLs of all programmes and episodes. On the other hand, the thing that was completely reconstructed was the system of listing pages for programmes, categories and video bonuses. The live streaming section, too, underwent several changes.

The experience with the inconsistency of the content on the original website was helpful to us when creating the type pages within the new application. Therefore, the new site can only contain pages that have a predetermined URL structure and similarly structured content of a predetermined quality.

So we created type pages like:

  • video listings on new URLs
  • programmes and their episodes on the original URLs
  • video bonuses on new URLs
  • secondary content was partially preserved on the original URLs with the original design.

3.3 SEO checklist

For each sub-section of the site that was even slightly related to SEO, we prepared a custom specification that corresponded with the goals for the entire platform. The checklist had 40 items with detailed SEO specifications for the content and administration of each page, on-page, technical setup and linking of pages, or perhaps structured data for each content type. Each sub-section of the site had its own specification, and so did each separate technical or structural issue.

We discussed all the documentation with the rest of the team, mainly backend and frontend developers, UX designers, editors, graphic designers and analysts. From the documentation, we created dozens and dozens of specific tasks that were prioritized with a simple must-have / nice-to-have system.

This part of the work was so extensive and so specific to each sub-task that I’ll only highlight the three most interesting areas within the overall solution.

3.3.1 Server side rendering

Since the new iVysílání project is built entirely with the React JS library, it was necessary to provide server-side rendering of all content to HTML for search engines. This allowed the new content to be indexed quickly and seamlessly.

For example, all the new video selection categories were indexed within the second day of launch.
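
A simple way to verify that server-side rendering works is to look at the raw HTML response, before any JavaScript runs, and check that the key content is already there. The Python sketch below does this for the title and the H1 heading. It is a rough heuristic with hypothetical names, not the test we ran; in practice, tools such as the URL inspection in Search Console show what the search engine actually rendered.

```python
from html.parser import HTMLParser


class ContentProbe(HTMLParser):
    """Collect the text of <title> and <h1> from raw HTML, without running JavaScript."""

    def __init__(self):
        super().__init__()
        self._current = None
        self.found = {"title": "", "h1": ""}

    def handle_starttag(self, tag, attrs):
        if tag in self.found:
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.found[self._current] += data.strip()


def is_server_rendered(html: str) -> bool:
    """Heuristic: the initial HTML already carries a non-empty title and H1."""
    probe = ContentProbe()
    probe.feed(html)
    probe.close()
    return all(probe.found.values())
```

A page that ships only an empty root element and a script bundle fails this check, because its heading would appear only after the JavaScript has run in the browser.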

Snippet of the new iVysílání — movies category

3.3.2 Video optimization

The core content of a video platform is, of course, videos. Our goal was not only to make the video pages visible in the classic full-text search, but also to increase the number of hits directly from video search.

In this regard, we relied on the best possible metadata, including Open Graph and Twitter Card tags, and, of course, on structured data. This, I’ll admit, is still a bit of a struggle. For the mandatory structured data fields, our database is often missing important data such as the description or upload date.

Another problem is the lack of a so-called contentUrl or embedUrl that links directly to the video file. iVysílání is a streaming service that does not have a URL for the physical video file, as it inherently works on a different technological principle.
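
To illustrate the kind of check this requires, the Python sketch below builds schema.org VideoObject markup from a database record and reports which required fields are empty. The record layout and the function name are hypothetical, and the list of required fields is an assumption based on the description above; the current search engine documentation should be checked for the authoritative list.

```python
import json

# Assumed required fields; description and uploadDate are the ones named above.
REQUIRED_FIELDS = ("name", "description", "thumbnailUrl", "uploadDate")


def video_json_ld(video: dict) -> tuple[str, list[str]]:
    """Build schema.org VideoObject JSON-LD and list required fields that are empty."""
    data = {"@context": "https://schema.org", "@type": "VideoObject"}
    # Empty values (None, "") are left out rather than published as blanks.
    data.update({key: value for key, value in video.items() if value})
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    return json.dumps(data, ensure_ascii=False, indent=2), missing
```

Optional properties such as contentUrl or embedUrl are simply omitted when the record has no value for them, which matches the situation of a streaming service without a physical file URL.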

Last but not least, currently we have to deal with the fact that we have videos in different contexts on other Czech Television websites. At this point, it’s not uncommon for Google to only show videos from one domain, which may not be iVysílání.

Google prefers to display videos on the Edu subdomain

In this respect, we are still struggling and gradually improving the existing system. Video search performance is growing rather slowly at present.

Year-on-year comparison of video search performance

3.3.3 Redirects

The cleaning up of the original redirects was closely followed by the preparation of new redirects. It was necessary to prepare a functional redirect system based on as simple rules as possible to physically handle millions of redirects.

The new redirects were built either directly in the new iVysílání application or at the loadbalancer level. During the test phase, most of the redirect rules used the 302 status code, which we switched to 308 after rigorous testing and debugging. A 308 is a permanent redirect that applies to the first and all subsequent requests for the URL.

Compared to a 301 redirect, a 308 also requires the client to keep the original request method; search engines treat both as a permanent move.

We prepared the redirect rules for each part of the site or type page separately. It was necessary to determine exactly which URLs remain the same (e.g. programmes, episodes, selected secondary content or even URLs from the old player), and which URLs were being moved.

The most extensive part was planning to merge the dual URLs in the /ivysilani/ section into the /porady/ section, both at the programme level, but also at the category or bonus video level.
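
A rule of this type can be sketched in a few lines of Python. The pattern below is a hypothetical example modelled on the two "Boys in Action" URLs shown earlier, where the programme keeps its identifier and slug and only the directory changes; the real rule set is more extensive. The two status codes reflect the test and production phases described above.

```python
import re

# 302 during testing, 308 once the rules were verified.
TEST_STATUS, LIVE_STATUS = 302, 308

# Hypothetical rule: a programme under /ivysilani/ moves to the same slug under /porady/.
PROGRAMME_RE = re.compile(r"^/ivysilani/(\d+-[a-z0-9-]+)/$")


def resolve(path: str, testing: bool = False):
    """Return (status, new_path) for a moved path, or None when the path stays as-is."""
    match = PROGRAMME_RE.match(path)
    if match is None:
        return None
    status = TEST_STATUS if testing else LIVE_STATUS
    return status, f"/porady/{match.group(1)}/"
```

Keeping each rule as a single pattern with one target template makes it possible to cover millions of URLs with a handful of rules, and switching from the test to the production status code is a one-line change.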

Most of the redirects were set up correctly at launch. Partial errors or forgotten redirects were recognized and debugged about a week after the launch.

4 Summary of the results

The entire iVysílání project went live in its first version on December 6, 2021, and its development has been continuing since. Below is a brief overview of what we were able to implement and achieve in terms of SEO, as well as what tasks are still underway.

4.1 Objectives achieved

✅ Maintaining traffic from organic search without drops
✅ Bringing new and strong traffic to the new video category system
✅ Properly executed redirects and quick reindexing of important content
✅ Unifying structure and minimizing duplication
✅ Cleaning up the index and reducing the number of indexed pages on Google and Seznam
✅ No new crawl errors within the iVysílání section

4.2 What hasn’t worked so far

❌ Structured data and improved performance in video search
❌ Image SEO and improved performance in image search
❌ Fixing broken links and working with link profile
❌ Complete sitemap implementation

5 What’s next

Plans for the development of iVysílání are big. Even within SEO, we will continue to work on other and often large parts of the project. A lot lies ahead, but let me just highlight what’s somehow related to organic search.

Already in the spring of 2022, the new video player technology should be launched, together with the indexable URLs of specific parts of programmes, such as individual reports. The launch of a new mobile app is also planned for later this year.

We can also look forward to a revised and redesigned TV Guide that will be connected to the iVysílání app in a better way. Better linking of iVysílání to other parts of the website and subdomain content is also being addressed more intensively.

A new system of secondary content for programmes should also be available in 2023, along with a complete tag system that will add video categories and a full published database of performers and programme creators, by which it will also be possible to select video content.

If anyone would like to join the work at Czech Television, we are currently looking for a colleague to help us fine-tune content SEO, in particular.


Šárka has been working in SEO for 12 years and focuses mainly on technical and content SEO.