MySQL diary #4: Defer your work

One of the rules that should be applied when developing a feature or fixing a bug (however, try to avoid it in your real life ;)) is “if this can be done later, do it later”.

Maciej Brencz
Fandom Engineering
3 min readMar 18, 2019

--

Wally’s philosophy applied wisely does help / http://dilbert.com/strip/2015-03-09

Defer writes to the database

Using database query logs we identified that we make a fair amount of UPDATE queries when handling GET requests to MediaWiki app (680k a day). Almost 60% of these were coming from WatchedItem::updateWatch method (it unmarks the page you’re viewing on Special:Watchlist). A fix moved this queries to an offline task.

Graph of UPDATE queries on GET requests that reached SJC Apaches with a nice drop caused by a fix.

We used the same approach to fix an issue that was hitting master database nodes quite heavily with long queries and deadlocks. Backports from MediaWiki upstream helped us again. Heavy UPDATE query that was issued when visiting category page was moved to an offline task. As a result we noticed a nice drop of 95th percentile of response time from category view transaction. The overall response time percentile dropped as well — 95th percentile dropped by 25%.

NewRelic request time percentiles for category view transaction …
… and overall.

Or even better, avoid it

User’s cache invalidation was impacting users making “heavy” contributions, for instance uploading multiple files at once or making edits in multiple tabs. Instead of seeing their edits, they were faced with the following error:

User::saveSettings method was bumping the value stored in user’s table user_touched field on shared database. A great candidate for a deadlock. This method was responsible for ~640k of UPDATE queries a day on a shared database cluster (out of ~825k of ALL UPDATE queries there). Fortunately, since MediaWiki 1.25 that’s no longer the case. A much lighter approach is applied now - key-value cache is used instead. The lightweight “touched” handling now takes place for uploads and edits. Database is updated only when user preferences is changed.

The above graph shows the number of all UPDATE queries made on shared database before and after a fix. And what’s important — our users can now make multiple edits and uploads without risking loss of their contribution.

Do not perform redundant requests

Sometimes a task does not need to be deferred, it can simply be ignored. When working on Chat feature recently we noticed that we were making ~1,4mm of AJAX requests daily to fetch a list of users that were available on chat on a given wiki. However, this was not needed, as the right rail HTML with Chat module already has the fresh data. As a result we removed two AJAX requests (we were also fetching Mustache template to render the list of users) made on every page view for logged-in users on wikis with Chat enabled.

Drop of number of Chat AJAX requests that fetch the up-to-date list of chat members. This is still needed to update in-article Chat widget as its content is kept in parser cache for 24h.

TL;DR

Deferring heavy tasks to an offline process is one of the easiest solutions when solving performance issues. But before moving them there, consider if they’re really needed or if they can be replaced with a much lighter solution.

--

--

Maciej Brencz
Fandom Engineering

Poznaniak z dziada-pradziada, pasjonat swojego rodzinnego miasta i Dalekiej Północy / Enjoys investigating how software works under the hood