Music To Your Ears

One of the largest parts (by filesize) of Dota 2 replay files is the audio component of all the various casters broadcasting their commentary of the game to tens or hundreds of thousands of listeners. Being able to review, search and analyze this data is something that very few people have looked into — so we’re pleased to announce that we’re rolling out some features related to this.

There are two components to this feature: a single ‘landing page’ showing all the recent games, and information on each match page when you view matches on the normal datdota website. The landing page is live now, and the per-match view will follow in the next few days.

This (mobile-friendlyish) landing page is at

To prevent insane traffic and scrapers, we’re rate limiting users while we see how we handle the load. Please don’t try to be nasty about this.
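
For the curious: per-user rate limiting of this kind is commonly done with a token bucket. The sketch below is only an illustration of that general technique, not the code running behind datdota. The class names, the `capacity` and `rate` parameters, and the idea of keying users by a string (such as an IP address) are all assumptions made for the example.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = capacity      # start with a full bucket
        self.last = clock()         # time of the last refill

    def allow(self) -> bool:
        """Return True if one request may proceed now, consuming a token."""
        now = self.clock()
        # Refill in proportion to the elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerUserLimiter:
    """Keeps one bucket per user key (hypothetical: e.g. an IP address)."""

    def __init__(self, capacity: float, rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, user_key: str) -> bool:
        bucket = self.buckets.get(user_key)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.rate, self.clock)
            self.buckets[user_key] = bucket
        return bucket.allow()
```

A server would call `limiter.allow(user_key)` once per request and reject the request (for example with an HTTP 429 response) whenever it returns `False`. A real deployment would also need to expire idle buckets, which this sketch leaves out.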

We only plan to parse professional games (i.e. “premium” and “professional” under our tiering system), not “semi-professional” or “amateur” games. We’ve already parsed all the premium games, and the remaining Source 2 (~post-TI5) professional games will be done in the next few hours. We’ll consider Source 1 (which might actually have more demand, since people can’t watch those replays any more) if the feature gets some good traction.

Audio parsing of new games happens automatically as soon as they are fully processed into our database for all the other stats we use. This is normally within 120–150 seconds of the game ending (right around the time you’d finish watching the match with the regular 2-minute spectator delay), but can be delayed if the replay gets stuck on Valve’s CDN.

All of this makes it easy to load up a few games in your browser and listen, which is perfect for work situations. The limited bandwidth required is also really helpful for users who are on the go: the entire TI6 Grand Finals is less than 50MB of data!

This is only the first step towards a variety of features related to this type of data — we have a lot of cool ideas planned over the next while, and like most cool features, this was a really big team effort:

  • Invokr wrote all the very complicated audio parsing code and parsing pipeline which nobody understands but trusts to work :D
  • Noxville did all the non-parsing backend and devops work.
  • Cyborgmatt did all the frontend work (which Noxville butchered into millions of little pieces and then put back together poorly).

In our next roll-out, we’re also spiffing up some of the caster pages, so I’d like to pre-thank the following guys for their work on this upcoming feature:

  • Leo helped us fix up a lot of caster information for RUHUB, and sourced all their photos.
  • Borno handled the editing and quality control of the talent photos.
  • Blaze sorted out permissions for the talent photos for BeyondTheSummit and some independent casters.
  • Bafikk sourced the talent photos for Starladder and sorted out their permissions.
  • Zyori sorted out permissions for the talent photos for Moonduck.

Please note that some of the casters are very loud, some have horrible mic setups via Dota2TV, some echo, and some clip. :D

Hope you enjoy it!

— Noxville