C++ tips to get around TLE

Published in

The ACM Manipal Blog

9 min readAug 21, 2020

Time Limit Exceeded or TLE can be a significant cause of frustration for competitive coders. If you are getting one, you probably need to change your approach, but if you are stuck in a coding challenge and can’t think of any other method to the problem, you can try using these techniques to get around those sneaky test cases.

Make your I/O fast(est)

C++ developers usually get accustomed to using std::cin and std::cout for standard I/O operations, but they aren’t the fastest ones out there. If you are feeling lazy to change your code, you can just add the statements given below and enjoy the speed boost.

The first statement disables the sync between C and C++ style I/O, so this may cause unknown errors if you combine them. For competitive coding, it’s usually not required to use both so you can use it without any worries.

The second statement unties std::cin with std::cout. Whenever one of the streams is used, the other is flushed, but when you untie them, they no longer flush each other, which results in better performance. Of course, you can always use your old C friends, scanf, and printf. They are faster than std::cin and std::cout by about three times, but when compared to unsynchronized std::cin and std::cout they don’t give much of a performance boost.

Want it to be even more fast? Are you taking integer input? Oh, then you are in luck! Just copy this function, and you can take integer inputs in the fastest, thread-safe way.

The function given above reads each character and converts them into an integer. This method is free from a lot of system calls that scanf has to make, like deducing the type of input.

You must be wondering why I wrote “the fastest, thread-safe way.” Well, because there is a faster way, but that makes it thread-unsafe. Want to use it? Replace getchar() with getchar_unlocked() and experience the beast. In competitive programming environments, you don’t need to worry about being thread-safe unless you are doing concurrency related tasks, so go ahead with it!

How do you believe what I say without any proof? Well, here it is. The different methods were benchmarked, and the results are presented below. If you want to look at the code used in the benchmark, I have uploaded it to a GitHub repository, which you can find here.

cin vs. cin_wo_sync vs. scanf vs. fastscan vs. fastscan_unlocked — Comparison of std::cin vs. other input methods

Faster Vectors

While vectors are fantastic because of their expandability, but the same attribute makes them slow too. std::vector is implemented as a dynamic array, which means that each time it needs to increase the size of the array, it creates another array double the size and then copy the elements from the previous array. If you know the size of the vector before you add items, reserving the space eliminates the overhead required to copy the elements because it creates an array at least as big as the reserved size.

Where max is the number of elements you are going to store in the vector.

Adding a reserve makes your code about 1.2x faster.

Time taken with reserve vs. without — Comparison between the time taken with and without reserve

Again C style arrays or std::array is faster than using a vector any day, but its size has to be known at compile-time and thus isn’t always feasible to use. So vectors are your best bet, and it’s better to reserve the space beforehand.

Another surprising thing about std::vector is the at() function. It seems to do the same thing as subscript operator [] but when the performance is measured it is actually 3.1 times slower.

at() vs. [] — Comparison between using at() function and subscript operator []

That can do a lot of damage to the time taken by your program. The reason being is that the at operator has builtin checks for out of range values that the subscript operator doesn’t, so if you give a value that is out of range it might give you a segmentation fault or cause unknown errors without telling you what has gone wrong while at will throw an exception.

One final thing that you should know about std::vector is that you should never copy an std::vector to another manually using a loop and push_back(), rather you should use assignment operator =. The difference in speed is huge as apparent by the chart below.

Loop with push_back() vs. = — Comparison between using a loop and push_back() vs. assignment operator =

This difference in performance is caused because when you are doing assignment it knows the size of the vector it is copying, and needs to call the memory manager only once to create the assigned vector’s space and copy all the elements unlike the push_back() operation which makes a call to the manager for every element.

Don’t just add, append

The fastest way to work with strings in C++ would be to use C style strings, i.e. character arrays, equivalent performance to character arrays, but one mistake programmers frequently make is to use the + operator to append to a string. The reason it is not suitable to use is that it creates another character array of the total size of both the operand strings and copies the first one to the array while concatenating the other to the newly created array. What you should do instead is use the append() function, or it’s equivalent += operator:

Using the += operator instead of + operator is better for any class that supports it. This helps in reducing the time taken and memory to create a new object and copy the first and then add the second object to it.

= + vs. += — Comparison between using + operator and += operator

Also, when using std::string or any other class, it’s best to initialize it using the constructor than using the assignment operator.

You don’t need std::list

If you don’t know about std::list, we are done here, but if you do and you are thinking about using it, then think again. std::list is a doubly-linked list, so the advantage it gives over std::vector is that it can remove and insert at any position in constant time. That can seem great when you need it, but vectors can be cached by the processor so sequential access is faster than when using std::list.

If you have to remove several elements from the middle of a vector, you can use erase_if(). erase_if() was introduced in C++ 20, but you can combine remove_if() and erase() together to create your custom erase_if() for previous versions of C++.

You might see there is a variable called predicate. The predicate is a function that takes in an element from the vector and returns a bool. Typically, each time you erase an item from a vector, the ones after it are shifted forward, but if you use an erase_if() construct, the shifting is done after all the elements have been deleted, so it is great for removing multiple elements. An example predicate is given below.

If you use the above function in the erase_if() expression:

It will remove all the zeros from the vector v.

If you want to see a comprehensive benchmark between std::vector and std::list you can take a look at this beautiful post by Baptiste Wicht here. After reading the post, you might have a question in mind, “If std::list is worse than using std::vector, then why is it even there in the standard library?” It’s because there are a few places where lists are better, i.e., when the elements being stored in the list are huge and not a lot of them can be fit in the cache, lists can dominate over vectors.

Befriend unordered_map

std::map is maintained as a binary search tree, and the main advantage of using map over unordered_map is that it sorts the elements according to the key. One other benefit of using std::map over std::unordered_map is that it supports pairs out of the box. However, the difference between O(log n) and O(1) can be the difference between a TLE and an AC solution. So let’s look at how you can create your hash function for a pair of integers.

This structure tries to return a unique integer for every pair it gets. You just need to plug this struct into your unordered_map and voila! Your hash table now supports std::pair<int, int>.

You can Google many more hash functions suited for different needs and choose your flavor.

std::map should be only used when sorting of elements is required other than that std::unordered_map is your friend. The same goes for std::set and std::unordered_set.

Also, you can use reserve with std::unordered_map too. In my benchmarks, it sped up the process of adding elements to the hash table 1.3x times.

With reserve vs. without — Comparison between creating a hash table with and without reserve

If you want a faster hash table than unordered_map and you or your CP platform uses g++, there is excellent news. You can use __gnu_pbds::gp_hash_table! It isn’t part of the standard library but has almost the same interface as std::unordered_map but is much faster.

unordered_map vs. gp_hash_table — Comparison between unordered_map and gp_hash_table

__gnu_pbds::gp_hash_table is faster than std::unordered_map because it drops a lot of many unnecessary features that STL has to retain. If you want a better explanation, you can look at the comment here, and if you want more insight, you can see this post here.

There is one more optimization that you can do for std::unordered_map that is setting its max_load_factor to 0.25. The load factor is defined as the ratio of the number of elements in the hash table to the number of buckets. Keeping max_load_factor at 0.25 means at least ¾ of the buckets will be empty. This reduces collisions in the hash table; therefore, the lookup time but also increases the space taken by your hash table. I couldn’t find much of a performance gain when using it, but you can try it if std::unordered_map seems to perform worse than you expected, and you can’t switch to gp_hash_table.

Conclusion

To summarize:

Use fastscan() for the fastest integer input and std::cin without sync for the rest.
Use reserve with std::vector and std::unordered_map as much as possible.
Use assignment operator to copy vectors and operator [] instead of at() function
Use addition assignment operator instead of addition and then assignment for strings as well as other classes that support it.
Don’t use std::list use std::vector instead.
Use std::map and std::set when the keys need to be sorted otherwise use std::unordered_map or std::unordered_set, respectively.

There are a lot more optimizations that you can achieve in C++. An example of a small one would be to use bit shift operators instead of multiplying or dividing by powers of 2, as shown here.

I would encourage you to explore more on your own, benchmark any such tips you may find, and maybe comment them down below. I had fun making this post and hope you had fun reading it too and learned something new.

P.S.: I have assumed that the code is compiled without any compiler optimization because the primary target audience of this article is competitive coders.

References

Apart from the ones linked in the article here are some sites and books that I referred from:

Fast IO Optimization in C++ by Arun Prasad
6 Tips to supercharge C++11 vector performance by Deb Haldar
CPP Optimizations Diary by Davide Faconti
Optimized C++: Proven Techniques for Heightened Performance by Kurt Guntheroth
C++ Reference
And of course loads and loads of Stack Overflow searches