ASP.NET Core Performance Tuning
Today we have a very interesting topic to discuss: we will test how different decisions, solutions, and approaches affect the performance of ASP.NET Core web applications. We will talk about small and simple things that are sometimes forgotten (and measure their impact). A huge amount of computing resources is wasted again and again on unnecessary operations and suboptimal algorithms, and our web applications work much slower than they could. Let’s see what we can do.
I’ve created a sample web application (using an SQLite database and Entity Framework Core; you can find it on GitHub). It is something like a simple e-commerce website’s homepage: it displays a category list (as a menu) and 20 articles, each with its category name and photo. So, our database contains 3 tables, including Photos (each article may have several photos, but we display the one with a special flag set).
To measure performance, I have also created an extremely simple console tester in the same solution. It makes N requests and shows the total and average request processing time. I will also keep an eye on the Task Manager window to check CPU and memory usage.
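The tester’s core loop could look roughly like this (a minimal sketch; the request count and the local URL are assumptions, not the actual tester code):

```csharp
// A minimal sketch of the console tester: N sequential requests,
// then total and average processing time are printed.
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

class Tester
{
    static async Task Main()
    {
        const int n = 1000;
        const string url = "http://localhost:5000/"; // assumed local address
        using var client = new HttpClient();

        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < n; i++)
        {
            var response = await client.GetAsync(url);
            response.EnsureSuccessStatusCode();
        }
        stopwatch.Stop();

        var total = stopwatch.Elapsed.TotalSeconds;
        Console.WriteLine($"Total: {total}, average: {total / n}");
    }
}
```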
Please note that our test web application (and its parts) can’t be considered a great architectural design example. It is kept as simple as possible to serve as an illustration for this article, not as a base for your projects. And this article is not a set of ready-to-use recipes; it only shows how you could think when tuning your ASP.NET Core web application’s performance. We also don’t consider cases when the performance of ASP.NET Core MVC itself is not enough and the developer needs, for example, to replace the default controller or view resolver implementations.
What are the most common and obvious ways to improve the performance of a web application (keeping in mind that any optimization should be done when the current development iteration is over)?
- Content that changes rarely (or is used very often) might be cached. Memory, file system, or external cache should be considered (depending on the situation).
- There should be no duplication of operations where it can be avoided.
- Session-specific data should be stored on the client side to save server memory (when possible).
- There should be as few database (or file system, or any other third-party system, like an API) queries as possible (ideally 1 or 0 per request).
- Database queries should be built in a way that selects only the really required data (tables and columns).
- When possible, nested queries, join operations, and other resource-intensive operations should be avoided (but usually a nested query or a join operation is much better than 2 separate queries).
- All the columns involved in the queries (used in WHERE or ORDER BY clauses) should have corresponding indexes.
- Sometimes it is better to implement some logic on the database side (for example, using stored procedures). Sometimes raw SQL queries should be used instead of LINQ. Using database views (and corresponding Entity Framework Core entities) for data retrieval often gives good results too (for example, you can have a PopularArticle entity that is mapped to a database view built with a complicated SQL query).
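To illustrate the last point, here is how a PopularArticle entity could be mapped to a database view in Entity Framework Core (a sketch; the view name, column set, and context name are assumptions):

```csharp
using Microsoft.EntityFrameworkCore;

// A keyless entity backed by a database view.
public class PopularArticle
{
    public int Id { get; set; }
    public string Title { get; set; }
    public int ViewCount { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<PopularArticle> PopularArticles { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // The view itself (with its complicated SQL) is created in a
        // migration or manually; EF Core only reads from it.
        modelBuilder.Entity<PopularArticle>()
            .HasNoKey()
            .ToView("PopularArticles");
    }
}
```

The complicated SQL lives in the view, while the application code stays a plain LINQ query such as `context.PopularArticles.Take(10).ToList()`.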
The Database Side
We are going to start from the worst-case scenario: the database doesn’t contain any indexes (except automatically added ones), the controller action loads all the categories for the menu and 20 articles for the list (including columns that aren’t in use), and then makes 40 more database requests to load a category and a photo for each of the articles. So, we should have 42 database queries per request. Not so good. (In fact, we have “only” 22 database requests; I’ll explain why below.)
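Such a worst-case action might look roughly like this (a hedged sketch; property names and the view model are assumptions, not the actual sample code):

```csharp
public IActionResult Index()
{
    var categories = _context.Categories.ToList();       // 1 query
    var articles = _context.Articles
        .OrderBy(a => a.Id)
        .Take(20)
        .ToList();                                       // 1 query

    foreach (var article in articles)
    {
        // Up to 2 more queries per article (40 in total):
        // the classic N+1 problem.
        article.Category = _context.Categories
            .Single(c => c.Id == article.CategoryId);
        article.Photos = _context.Photos
            .Where(p => p.ArticleId == article.Id && p.IsDefault)
            .ToList();
    }

    return View((categories, articles));
}
```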
Let’s measure the performance of our web application at this point. Our testing console application makes 1000 requests, and the best result is:
Total: 18,7991217, average: 0,0187991217
The problem is that ASP.NET Core is too fast. Even this absolutely inefficient code can process more than 50 requests per second with no problems! It might be deployed to production and work for years. Sad (especially when you pay for the consumed resources).
Just for fun, I change the article sorting direction from ascending to descending and re-run the test:
Total: 86,4605377, average: 0,0864605377
Wow, 4+ times slower? (I always check a few times.) Of course, I think the reason for this difference lies in the automatically created database indexes (specifically in the index direction, which should be ascending by default). I add a descending index on the Id column, and… nothing changes. Hm, interesting. I change the sorting column from Id to Price (this column doesn’t have any indexes configured).
Total: 71,8268408, average: 0,0718268408
Total: 76,5685945, average: 0,0765685945
The results are almost the same. Now I add both ascending and descending indexes to the Price column and run the test again.
Total: 58,170244, average: 0,058170244
Total: 58,8626863, average: 0,0588626863
So, we saved quite a bit of time (about 3 more requests per second compared to the previous results). The database size increased by less than 2% (and that is for both the ASC and DESC indexes).
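For reference, such indexes could be declared in the EF Core model like this (a sketch; note that IsDescending() requires EF Core 7 or later, and with older versions the descending index can be created in a raw SQL migration instead):

```csharp
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // Ascending index on Price (the default direction).
    modelBuilder.Entity<Article>()
        .HasIndex(a => a.Price);

    // Descending index on Price (EF Core 7+).
    modelBuilder.Entity<Article>()
        .HasIndex(a => a.Price, "IX_Articles_Price_Desc")
        .IsDescending();
}
```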
Ok, let’s get back to our sample with sorting by the Id column (ascending). It works pretty fast. But what if we add indexes for all the columns involved in filtering and sorting (in our case, we only need to add indexes for the 2 columns of the Photos table)?
We select one photo for each of the articles; the IsDefault column, among others, is involved in the query. I add the corresponding indexes. (The database size increases by about 5%.) I run the test. No changes. Strange. I go to DB Browser for SQLite and test the query there. Without any indexes, all photos with the IsDefault flag set are returned in 10-11 ms. With the corresponding index created, it takes 8 ms. Not such a big difference, but I think it will become more and more noticeable as the database grows, so we should really have those indexes. So, let’s leave them.
You may notice that our Article entity has more properties (and its table has more columns) than are used in the view; the Created one, among others, is never displayed. I also intentionally added a relatively long text (about 1 paragraph) to the description of each article. These values are selected from the database on every request and loaded into the web application, but never used. If we just comment them out in the entity class (or write a LINQ query that selects only the columns we need), we will save about 0.2-0.3 seconds per 1000 requests!
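The “select only what you need” idea could be sketched as a LINQ projection (the view-model shape and property names are illustrative assumptions):

```csharp
// Only the listed columns are read from the database;
// Description, Created, etc. never leave it.
public class ArticleListItem
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string CategoryName { get; set; }
}

var articles = _context.Articles
    .OrderBy(a => a.Id)
    .Take(20)
    .Select(a => new ArticleListItem
    {
        Id = a.Id,
        Title = a.Title,
        CategoryName = a.Category.Name
    })
    .ToList();
```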
The Web Application Side
As I said before, our web application (and Entity Framework Core) doesn’t in fact execute 42 database requests to render our page. It makes “only” 22. Why? Because Entity Framework Core is smart enough to understand that if it has already loaded all the categories (for the menu), it doesn’t need to load them again for each article within the database context lifetime (usually the database context exists for the duration of the request). It might be a bit confusing if you forget about this behavior.
By the way, one more tip. Have you heard about the AsNoTracking Entity Framework Core method? There is a common idea that using this method gives “significant performance gains”. Let’s check: I add the AsNoTracking method call only to the categories retrieval (in 1 place) and run the test:
Total: 19,8171053, average: 0,0198171053
About 1 second longer. Obviously, this is not the “significant performance gains” we were waiting for. But why? The thing is that calling this method prevents the categories from being cached inside Entity Framework Core, so when the menu is built, our web application has to execute 1–5 more database requests to get the unique categories of the articles (despite the fact that all of them were previously loaded). (I’m not saying it is a bad idea to use the AsNoTracking method; it really makes data retrieval a bit less resource-intensive, just use it responsibly.)
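For completeness, that single change looks like this:

```csharp
// No change tracking and no identity-map caching for this query,
// which makes it fit read-only scenarios best.
var categories = _context.Categories
    .AsNoTracking()
    .ToList();
```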
And one more interesting thing: if we use methods like SingleOrDefault instead of the Find one to get the category, the web application’s performance decreases dramatically:
Total: 31,3762644, average: 0,0313762644
This is because these methods don’t use the cache, so they make the database requests again and again. The same as using the AsNoTracking method, but from the opposite side.
Ok. Anyway, 22 database requests to render a simple page is too much. What can we do to reduce that number? I think all of you know about the Include method. It allows loading related objects within a single database request (in fact, it makes 2 requests, as I understand from the log) using a join operation:
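Such an eager-loading query could look like this (the navigation property name is an assumption):

```csharp
// The related photos arrive together with the articles,
// instead of being fetched one by one.
var articles = _context.Articles
    .Include(a => a.Photos)
    .OrderBy(a => a.Id)
    .Take(20)
    .ToList();
```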
If I include the article photos and run the test, we get the following result (including the category won’t have any effect, because categories are loaded internally from the Entity Framework Core cache):
Total: 7,6851146, average: 0,0076851146
That is much, much better. Now our web application can process more than 130 requests per second!
Let’s think about what else we could do. If we look at an ordinary e-commerce website, we will see that its pages contain static and dynamic fragments. For example, the categories menu is a static fragment (it changes very rarely), while the popular or recently viewed articles can change on every request (the exact details are not so important, only the idea matters now).
So, why should our web application load the categories from the database on every request if they probably won’t change this year? And we use them not only in the menu; we also need categories in various other places… Yes, I’m going to talk about caching.
Usually it is a good idea to store such small pieces of data in a very fast memory cache. In our case (storing categories), we could even set the cache entry to never expire and invalidate it only when the categories set is changed by a web application administrator (less important things will be removed from the cache when there is not enough memory, but not the categories, because we need them all the time for sure). We are going to implement a simple categories memory cache now (or we could make it generic). Using that cache, we can retrieve the categories in this way:
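A minimal sketch of such a categories cache, built on top of ASP.NET Core’s IMemoryCache (the class, key, and method names are assumptions, not the original code):

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Caching.Memory;

public class CategoriesCache
{
    private const string CacheKey = "categories";
    private readonly IMemoryCache _cache;
    private readonly ShopContext _context;

    public CategoriesCache(IMemoryCache cache, ShopContext context)
    {
        _cache = cache;
        _context = context;
    }

    public List<Category> GetAll() =>
        _cache.GetOrCreate(CacheKey, entry =>
        {
            // Never evicted under memory pressure; removed only
            // by an explicit Invalidate() call.
            entry.Priority = CacheItemPriority.NeverRemove;
            return _context.Categories.AsNoTracking().ToList();
        });

    // Called when an administrator changes the categories.
    public void Invalidate() => _cache.Remove(CacheKey);
}
```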
Now I re-run our tester:
Total: 6,924735, average: 0,006924735
The difference is not so big on our sample web application, but even here it is 14 more requests per second! And what is even more important, it reduces the database load.
But while it is a good idea to cache the categories in memory, storing the articles there (if you have a lot of them, of course) can cause the server to run out of memory, so this approach must be carefully considered in each case.
At the same time, if you have large pages that take much time to render (for example, a home page, a category page, or an article page of an e-commerce website), you can use output caching (the most top-level kind of caching). The web application saves the generated HTML to a cache and then just serves that saved version. It works extremely fast. You will probably have to invalidate the cache quite often (for example, a category page changes every time you add a new article or modify an existing one), but it is still better than rendering it again and again.
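In ASP.NET Core, the built-in server-side response caching is wired up with a middleware and an attribute (a sketch; the duration value here is an arbitrary assumption):

```csharp
// In ConfigureServices:
services.AddResponseCaching();

// In Configure, before MVC:
app.UseResponseCaching();

// On the cached action:
[ResponseCache(Duration = 60, Location = ResponseCacheLocation.Any)]
public IActionResult Index()
{
    return View();
}
```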
Some pages (a home page is the best candidate) are requested very often, so they might be stored in server memory. Let’s see how many requests we can process using the server-side built-in response caching:
Total: 3,1642856, average: 0,0031642856
The result is more than 315 requests per second! Fantastic. But as we discussed above, we can’t store all the pages in server memory. For example, we could have 100 categories and 100 000 articles in our e-commerce web application. We can keep the top-level category pages in memory but write the other category pages to disk. The same with the articles: there are too many of them to store in server memory, but we can write them to disk (using our custom caching middleware) and serve a request with a single file system read operation:
Total: 4,858305, average: 0,004858305
As you can see, it is much slower than using the memory cache, but it can still process more than 205 requests per second. And it doesn’t use server memory, only storage (it takes 6,77 KB to store our page as HTML).
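Such a custom middleware could be sketched like this (the file naming is simplified, and populating/invalidating the cache is left out; all names are illustrative assumptions):

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

public class FileCacheMiddleware
{
    private readonly RequestDelegate _next;
    private readonly string _cacheDir;

    public FileCacheMiddleware(RequestDelegate next, string cacheDir)
    {
        _next = next;
        _cacheDir = cacheDir;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        // Derive a flat file name from the request path.
        var key = context.Request.Path.Value.Trim('/').Replace('/', '_');
        var file = Path.Combine(_cacheDir, key + ".html");

        if (File.Exists(file))
        {
            // A single file system read instead of a full render.
            context.Response.ContentType = "text/html";
            await context.Response.SendFileAsync(file);
            return;
        }

        await _next(context);
        // Writing the rendered output to the file (and invalidating it
        // when the data changes) is omitted for brevity.
    }
}
```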
Real Life Scenarios
Usually, websites have dynamic fragments on all pages. In our case, we might need to show the number of items in the cart or the list of visited articles. We also usually need to display the current user’s name and other details. How can we handle all this and still benefit from response caching?
I would say that this is the most interesting part of web application optimization. There can’t be any ready-to-use solutions here. So, I will share just a few ideas and cases from practice.
I hope that was interesting. Of course, this topic is incredibly extensive, and I was able to cover only a small part of the ideas. Please feel free to ask any questions; I will try to answer. Also, if you feel that I missed something important, please let me know.