How to visually spot and analyze slow MongoDB operations

We improved the performance of our MongoDB databases by using an operation profiler and visualizer we developed ourselves

Kay Agahd
idealo Tech Blog
14 min read · Sep 5, 2018


TL;DR If you are just interested in the result of our work or want a quick visual overview of your slow queries, you might be interested in our open-source project, which is hosted in our Github repo. I also recently wrote a practical guide to analyzing slow MongoDB operations using this tool.
Keep reading if you want more insight into this project.

Did you ever wonder why your MongoDB database seemed to be much slower than expected? Did you experience periodic hot spots and want to know the culprit(s)? Did you ever want to know which database operations were the most expensive and therefore probably the most worthwhile to improve?

Well, in these cases you certainly got in touch with MongoDB’s built-in database profiler, because it allows you to collect slow operations per database on a running “mongod” process. Although MongoDB’s database profiler is a valuable source for spotting slow operations, it has some inconveniences which may make hunting down the time- and resource-eating culprit operations hard:

  1. each mongod (and database) needs to be handled separately
    This means that you have to start the database profiler on every mongod that constitutes your database(s). Since one of MongoDB’s most appreciated features is horizontal scaling, which means that you simply add servers running mongod to the database cluster in order to scale out, you may have tens, hundreds, or more running mongod’s. Multiply this number by the number of databases that are running on this cluster, because each database has its own database profiler. Lucky you if you can exclude some unsuspicious databases on that cluster from being profiled; otherwise you may have to start hundreds or thousands of database profilers and, worse, analyze the results of each of them afterwards.
  2. it gives only a view over a limited time span due to the capped collection
    MongoDB’s database profiler writes the collected slow queries into a capped collection, which is limited by default to 1 MB. When the collection is full, new incoming data overwrites the oldest (FIFO). This makes it impossible to have a historical view of the slow operations, especially if the collected queries are many bytes big. Of course, you could delete the default capped collections and create bigger ones for each mongod and database (see the first sketch after this list), but that would add still more work to do the profiling.
  3. querying the slow-operations-collection may add more stress to the profiled database
    Of course, if the capped collection is very small (e.g. the default 1 MB) this wouldn’t be an issue, but if you had increased its size in order to cover a bigger time span, then the database performance could decrease even further with every query run against that collection, especially if no or only suboptimal indexes are used.
  4. different formats of query field make querying more difficult
    The collected operations may have different formats/schemata. Here are three of them to give you an idea (see the second sketch after this list):
  • query as document:
  • query as an embedded document:
  • query embedded as $query:

You would need to execute multiple queries or one quite complex query in order to cover all these different formats/schemata.

  5. operations sent through a mongos router omit the user (Jira ticket: SERVER-7538)
    Although this bug was created back in November 2012 and its priority is Major - P3, it’s still unresolved, which makes it difficult to know the user behind a collected slow query.
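
To illustrate point 2: enlarging the profiler’s capped collection has to be done per database on each mongod. A minimal mongo-shell sketch, following MongoDB’s documented procedure (the 64 MB size is just an example):

db.setProfilingLevel(0)                      // stop profiling first
db.system.profile.drop()                     // drop the default 1 MB capped collection
db.createCollection("system.profile",        // recreate it bigger, e.g. 64 MB
                    { capped: true, size: 64 * 1024 * 1024 })
db.setProfilingLevel(1)                      // resume profiling of slow operations

To illustrate point 4, here is a rough sketch of the three query shapes as they may appear in system.profile entries (field names and values are only illustrative):

// 1) query as document
{ "op": "query", "ns": "offerStore.offer", "query": { "shopId": 123 } }

// 2) query as an embedded document
{ "op": "query", "ns": "offerStore.offer", "query": { "query": { "shopId": 123 }, "orderby": { "_id": 1 } } }

// 3) query embedded as $query
{ "op": "query", "ns": "offerStore.offer", "query": { "$query": { "shopId": 123 }, "$orderby": { "_id": 1 } } }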

Our requirements

Why we use MongoDB

We have been using MongoDB in production since version 1.6 (around 2011), mainly because of heavy data growth. MongoDB’s ability to scale out horizontally, its automatic failover, and its powerful query language made it an optimal fit. We use it mainly for backend processing where we don’t need to rely on transactions. As a price comparison company, we need to store all offers of products that we constantly receive from our commercial clients (aka shops) in order to show the best prices of products to our end customers. In simple words, if our databases are too slow, then prices won’t be updated as fast as we, our commercial clients, and our end customers expect.

Right now, our biggest MongoDB database consists of 66 data nodes (mongod’s). Each data node is a bare metal server with generally 48 CPUs, 128 GB of RAM, and 750 GB of SSDs in RAID 1. This database stores nearly 2 billion documents and executes up to 1 billion write operations each day. There are slightly more read than write operations. Besides that, we have other sharded clusters and standalone replica sets, the latter both as bare metal servers and VMs.

What we wanted more

In order to optimize data processing, we wanted to have a quick overview of slow operation types and their quantity within a time period. “Types” means whether the slow operation was an insert, update, delete, or read, the server where the query was executed, and the fingerprint of the query, i.e. the fields that were queried, projected, and sorted on.

We also wanted a historical view to see how slow operations evolved. This allows us to correlate events like software deployments, database or hardware updates, unusually fast data growth, etc. with performance degradations. Furthermore, recurrent peaks become clearly visible and can eventually be investigated.

Even though we use MongoDB primarily for huge datasets, we are seeing an increasing number of smaller MongoDB databases, often only replica sets and sometimes running only on KVM hosts. Having an overview of all database systems in one central place and being able to issue administrative commands to some or all of them would be a big time saver.

Last but not least, related to our main goal of optimizing database operations, we wanted to filter by slow operation types and time range to drill down to a minimal point in time and spot the culprit queries. These tasks should not be reserved for database administrators or MongoDB experts. All software developers who use MongoDB should be empowered to inspect the databases their software talks to, in order to spot performance bottlenecks as quickly, easily, and efficiently as possible. The desired results are:

  • faster queries
  • better adapted indexes
  • better adapted data schemata
  • less required resources (hardware)
  • smarter workflow by avoiding unnecessary data flows
  • higher throughput of relevant data

Achieving our objectives

Two global steps are required:

  • collect and aggregate slow operations from all concerned mongod’s and databases into one global collection
  • provide a company-wide accessible graphical user interface to query this global collection and show filterable results in a concise manner

Both requirements can be achieved by a web application.

Step 1 of 2

Global collection

The usage of one global collection allows easy and fast querying of all slow queries of a whole sharded MongoDB system and can even correlate multiple MongoDB systems with one another. Since it’s located on a separate replica set, it avoids interfering with the profiled mongod’s while the collected slow operations are being analyzed. As long as no capped collection is used to store the slow operations, it keeps historical data for as long as needed for investigation.

Collector

We will use tailable cursors to constantly collect slow operations from the profiled mongod’s database(s). We have to guarantee that only one collector per mongod and database is running at a time in order to avoid duplicate entries. In case of failure, the collector must reconnect to the capped collection quickly enough, before data gets overwritten, but it has to avoid DoS-ing the server.
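
A minimal mongo-shell sketch of this idea (the actual collector is part of the web application, but the mechanics are similar in spirit): open a tailable cursor on system.profile and resume after the last entry that was already copied to the central collection. The timestamp and namespace used here are hypothetical.

var lastTs = ISODate("2018-01-01T00:00:00Z");        // ts of the last entry already stored centrally
var cursor = db.getSiblingDB("offerStore").system.profile
               .find({ ts: { $gt: lastTs } })
               .addOption(DBQuery.Option.tailable)   // keep the cursor open on the capped collection
               .addOption(DBQuery.Option.awaitData); // let the server wait briefly for new data instead of busy-polling
while (cursor.hasNext()) {
  var slowOp = cursor.next();
  // reduce slowOp to the compact schema shown further below and insert it into the global collection
}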

Collected entries

The slow operation documents collected by MongoDB’s database profiler are quite verbose. Here is an example:

{
  "op": "getmore",
  "ns": "offerStore.offer",
  "query": {
    "query": {
      "shopId": 123,
      "onlineProductIds": { "$ne": null },
      "smallPicture": { "$ne": null },
      "_id": { "$gt": 1555008076 },
      "lastChange": { "$gt": ISODate("2013-04-02T22:00:00Z") }
    },
    "orderby": { "_id": 1 }
  },
  "cursorid": NumberLong("5773493375904448215"),
  "ntoreturn": 500,
  "keyUpdates": 0,
  "numYield": 2350,
  "lockStats": {
    "timeLockedMicros": {
      "r": NumberLong(8724165),
      "w": NumberLong(0)
    },
    "timeAcquiringMicros": {
      "r": NumberLong(5321722),
      "w": NumberLong(7)
    }
  },
  "nreturned": 500,
  "responseLength": 94656,
  "millis": 5322,
  "ts": ISODate("2013-04-05T01:41:31.710Z"),
  "client": "172.16.65.202",
  "user": "pl_parser"
}

We reduce its size by keeping only the interesting fields. Such a reduced document would look like this:

{
  "_id": ObjectId("512e43099bbcf52b9aff3602"),
  "ts": ISODate("2013-04-05T01:41:31.710Z"),
  "adr": "s233.ipx",
  "op": "getmore",
  "fields": [
    "shopId",
    "onlineProductIds.$ne",
    "smallPicture.$ne",
    "_id.$gt",
    "lastChange.$gt"
  ],
  "sort": [
    "_id"
  ],
  "nret": 500,
  "reslen": 94656,
  "millis": 5322,
  "user": "pl_parser",
  "lbl": "myCluster"
}

We chose short field names to reduce the size even more. Key-value pairs inside query and orderby are transformed into arrays of only key names (aka field names), appended with their operator (or nothing for equality comparisons), because knowing the exact parameter values or the sort order of a slow query doesn't matter for checking whether such a query may be optimized. Appending the operator such as $in, $exists, $gt, $lt etc. to the field, e.g. _id.$gt as shown above, is helpful to get the right picture because it makes a big difference in terms of performance to query, for example, {_id : 123} or {_id : {$in : [1,2,3...thousands of values...n]}}. The latter query will take a multiple of the time of the first, so it's essential to also see the operator to get the analysis right.
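
To make this more concrete, here is a rough JavaScript sketch of that transformation (the real implementation is more elaborate, e.g. it also handles nested operators and the different query formats mentioned earlier):

function fingerprint(queryDoc) {
  var fields = [];
  Object.keys(queryDoc).forEach(function (key) {
    var value = queryDoc[key];
    // collect operator keys like $gt, $in, $ne if the value is an operator document
    var operators = (value != null && value.constructor === Object)
      ? Object.keys(value).filter(function (k) { return k.charAt(0) === "$"; })
      : [];
    if (operators.length > 0) {
      operators.forEach(function (op) { fields.push(key + "." + op); });  // e.g. "_id.$gt"
    } else {
      fields.push(key);  // plain equality comparison, e.g. "shopId"
    }
  });
  return fields;
}

// fingerprint({ shopId: 123, _id: { $gt: 1555008076 } })  returns  [ "shopId", "_id.$gt" ]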

We added one field, lbl, to associate the entry with a user-friendly label of the cluster or database system to which the server belongs. With such documents, we are well equipped to query our global slow-operations collection efficiently.

Today, our collector database, or more precisely its collection where slow operations are stored, already contains more than 250 million documents! The average document size is only 238 bytes, the data size is ca. 55 GB, the index size less than 9 GB, and the storage size, thanks to the WiredTiger storage engine’s compression, less than 12 GB. That’s great considering the more than 250 million stored documents!

Step 2 of 2

Graphical user interface, aka analysis page

The graphical user interface is used to filter and analyze slow operations. It consists of three parts:

  • a search form
  • a diagram
  • a data table

These three components need to be implemented in an analysis page.

Search form

The search form contains several input fields in order to narrow the search. The most important are the from- and to-date to define a time span, as well as an input field to enter the name of the database system. The following input fields may contain one or more values. If the user enters multiple values, they will be logically combined by OR:

  • Server address
  • Replica set name
  • Database name
  • Collection name
  • User name
  • Operation type

The same applies to the following input fields except multiple values will be logically combined by AND:

  • Queried fields
  • Sorted fields

Last but not least there are two input fields to define the min and max time in milliseconds because it may be helpful to analyze only such queries which took more than x but less than y milliseconds.

Besides these input fields, there are the following checkboxes in order to group by one or multiple criteria:

  • Label of the database system
  • Server address
  • Replica set name
  • Database name
  • Collection name
  • User name
  • Operation type
  • Queried fields
  • Sorted fields

As you can see, the checkboxes reflect almost all of the above-mentioned input fields. The more boxes are checked, the finer-grained the result. For example, if you check Queried fields but leave Sorted fields unchecked, all slow queries that queried the same fields will be in the same group, no matter whether they were sorted by the same fields or not. "Same group" means that you will see one entry in the diagram and data table with its timestamp and operation duration. The more checkboxes you set, the more entries you will obtain, allowing you to drill down deeper, for example, to know exactly which of the N database servers constituting the whole analyzed database system were involved, or which collection, or which user, etc.

Finally, the search form also offers the following radio buttons in order to define a suitable resolution for the diagram (it’s not applicable to the data table because it does not have a time dimension like the diagram):

  • Year
  • Month
  • Week
  • Day
  • Hour
  • Minute

For example, if you choose Hour, all slow operations of the same type (remember the Group-by checkboxes described above) that occurred within the same hour, say between 08:00:00 and 08:59:59, are grouped together. You won't be able to know the exact minute in which they occurred; they will be shown as if they had all occurred at 08:00:00. However, you will still know their exact quantity.

Diagram

If we want a chronological overview of the collected slow queries, a diagram in which the x-axis is the timestamp of when the slow operation happened is practical. We also wanted to instantly get an adequate picture of both the duration and the quantity of the filtered slow operation types. We came up with the idea that the y-axis corresponds to the duration. If multiple slow operations of the same type happened at the same time, a circle whose surface corresponds to the number of these slow operations is drawn onto the diagram in place of a single point. This idea allows us to see immediately when many or just a few slow operations occurred, while seeing their duration at the same time.

The higher a point or circle is on the y-axis, the slower the execution of this operation was. The greater the diameter of the circle, the more slow operations of this type were executed at that time. The diagram has to be zoomable in and out. While hovering with the mouse over the shown points or circles, their details have to be displayed in a legend next to the diagram. To implement such a diagram on the client side, we used dygraphs, which is a fast, flexible, open-source JavaScript charting library.

Data table

The presentation of slow operations in a data table is advantageous when data needs to be filtered further (drilled down) or sorted by a certain criterion. For example, you could sort by a column which sums up the durations of all operations of the corresponding type in order to see instantly which of these queries were altogether the most expensive ones.

Regarding the data we collect with each slow operation, we may build the following table columns:

  • Slow operation type
  • Count
  • Durations:
- min
- max
- average
- sum
  • Number of returned documents:
- min
- max
- average
- sum
  • Number of returned documents per millisecond
  • Milliseconds per returned document

The table has to be filterable. Columns should be sortable, and it should be possible to hide or show them. Where it makes sense, the sum of a column’s values is shown at its bottom.

To implement such a flexible data table on the client side, we used DataTables, which is a plug-in for the jQuery JavaScript library.

Implementation insights

How to query collected slow operations?

Regarding our schema of the collected slow operations, we can group by time components to apply a resolution of year, month, week, etc. Grouping by server address, user, operation, and queried or sorted fields allows us to define different slow operation types. Fields like ts, adr, op, fields, sort, millis, user, and lbl may be used to filter specific slow operations or to focus on a specific time period.

Here is an example of such an aggregation query to illustrate the idea:
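
(This is only a simplified sketch; the query generated by the web application is more involved, but the shape is the same. Database and collection names follow the collector configuration shown further below.) It filters by time range and label, groups by operation type, queried and sorted fields plus hour, and computes counts and duration statistics:

db.getSiblingDB("profiling").slowops.aggregate([
  { $match: {
      ts:  { $gte: ISODate("2018-08-01T00:00:00Z"), $lt: ISODate("2018-08-02T00:00:00Z") },
      lbl: "myCluster"
  } },
  { $group: {
      _id: {
        op:     "$op",
        fields: "$fields",
        sort:   "$sort",
        year:   { $year: "$ts" },
        month:  { $month: "$ts" },
        day:    { $dayOfMonth: "$ts" },
        hour:   { $hour: "$ts" }           // resolution "Hour"
      },
      count:     { $sum: 1 },
      millisSum: { $sum: "$millis" },
      millisAvg: { $avg: "$millis" },
      millisMax: { $max: "$millis" },
      nretSum:   { $sum: "$nret" }
  } },
  { $sort: { millisSum: -1 } }             // most expensive slow operation types first
])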

Since we always query the timestamp field to focus on a specific time period and often indicate the label of the profiled database system, we built a compound index {ts:-1, label:1}. The order of ts is descending because we generally query recently collected slow operations.

Furthermore, we added an index {adr:1, db:1, ts:-1} which is essential for quickly starting up all profiling readers, because they need to know the last stored slow operation for their mongod host address and database; otherwise they could re-insert already written entries which may still be present in the system.profile capped collection.
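
For reference, a mongo-shell sketch of these two index definitions, using the field names exactly as given above:

var coll = db.getSiblingDB("profiling").getCollection("slowops");
coll.createIndex({ ts: -1, label: 1 });       // time range first (descending), then the label of the database system
coll.createIndex({ adr: 1, db: 1, ts: -1 });  // lets each profiling reader quickly find its last stored entry at startup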

It turned out that both indexes perform very well. As I wrote above, our collector database today already stores more than 250 million documents. It’s running on a single replica set (no sharding), shared with other databases. Nevertheless, queries to get the latest profiling entry for a database label are executed within 10 ms, even when they are executed concurrently for all our registered databases, which is the case when the web application starts up.
Aggregation queries executed to analyze slow operations are slower because they are much more complex, but they generally finish within a few seconds, depending on the search and group criteria.

Configuration

We need to configure which databases have to be profiled and also which database has to be used to store the collected slow operation entries, called the collector. The latter could be expressed as a JSON-formatted object like this:

{
  "collector": {
    "hosts": [
      "myCollectorHost_member1:27017",
      "myCollectorHost_member2:27017",
      "myCollectorHost_member3:27017"
    ],
    "db": "profiling",
    "collection": "slowops",
    "adminUser": "",
    "adminPw": ""
  }
}

The collector has some access points, defined by the array variable hosts. The database and collection where the slow operation entries have to be written are defined by the fields db and collection respectively. If authentication is used, we need to enter the credentials of a user having an administrative role. An administrative role is required to be able to create the database and collection in case they don’t exist yet.

The database systems to be profiled could be configured this way:

{
  "profiled": [
    {
      "enabled": false,
      "label": "dbs foo",
      "hosts": [
        "someHost1:27017",
        "someHost2:27017",
        "someHost3:27017"
      ],
      "ns": [
        "someDatabase.someCollection",
        "anotherDatabase.anotherCollection"
      ],
      "adminUser": "",
      "adminPw": "",
      "slowMS": 250,
      "responseTimeoutInMs": 1500
    },
    {
      "enabled": false,
      "label": "dbs bar",
      "hosts": ["someMongoRouter:27017"],
      "ns": [
        "someDatabase.someCollection",
        "anotherDatabase.*",
        "*.myCollection"
      ],
      "adminUser": "",
      "adminPw": "",
      "slowMS": 250,
      "responseTimeoutInMs": 3000
    },
    {
      "enabled": false,
      "label": "dbs with many databases to be profiled",
      "hosts": ["someOtherMongoRouter:27017"],
      "ns": ["*.*"],
      "adminUser": "",
      "adminPw": "",
      "slowMS": 250,
      "responseTimeoutInMs": 3000
    }
  ]
}

Fields of profiled entries explained:

  • enabled = whether collecting has to be started automatically upon (re)start of the application
  • label = a label of the database system in order to be able to filter, sort, and group on it
  • hosts = an array of members of the same replica set, or just a single host, or one or more mongo routers of the same cluster. The application will resolve all members of a replica set (even if only one member has been defined), respectively all shards and their replica set members of a whole MongoDB cluster.
  • ns = an array of the namespaces to be collected in the format databaseName.collectionName. The placeholder * may be used instead of databaseName and/or collectionName to collect from all databases and/or all collections. Examples:
  • databaseName.* collects from all collections of database databaseName
  • *.collectionName collects the collection collectionName from all databases
  • *.* collects from all collections of all databases
  • adminUser = if authentication is enabled, name of the user for database "admin" having role "root"
  • adminPw = if authentication is enabled, the password of this user
  • slowMS = threshold for slow operations in milliseconds. This threshold can't be set per database; it's set per mongod instance, which is a MongoDB limitation (see the sketch after this list).
  • responseTimeoutInMs = response timeout in milliseconds
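
For reference, setting this threshold manually on a single mongod looks like the following in the mongo shell; although setProfilingLevel is issued per database, the slowms value applies to the whole mongod process:

db.setProfilingLevel(1, 250)   // level 1 = profile operations slower than the threshold, here 250 ms
db.getProfilingStatus()        // shows the currently effective level and slowms threshold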

To make the configuration less verbose, recurrent fields like enabled, adminUser, adminPw, slowMS, and responseTimeoutInMs could be set to default values that apply wherever these fields have not been specified.

The result

The result is a web application that all software developers within the company can access to visually spot and analyze slow MongoDB operations. They may share a view of their identified slow operations by means of the URL of the web application, so other colleagues get an identical view and may investigate further by themselves. That was a basic step toward enabling software development teams to be more independent and less dependent on the database administrator team.

Also, this project was the first project that my company open-sourced.
Please have a look and play with it if you’re interested. It’s hosted in our Github repository. You will see there that the web application has evolved from a pure slow-operation analysis tool into a potential administration tool. I will explain the latter in the following post. Stay tuned!

I’m curious to know your feedback, suggestions, or contributions! If you found this article useful, give me some high fives 👏🏻 and share it with your friends so others can find it too. Follow me here on Medium (Kay Agahd) to stay up-to-date with my work. Thanks for reading!
