App performance in the era of the shrinking attention span

Published in Tripadvisor Tech · 6 min read · May 24, 2024

How to increase page engagement when high latency becomes an issue.

By Andrews Cortez and Jacob Bradbeer, Tripadvisor

With the ever-increasing role that social media plays in our daily lives, research studies¹ suggest that the average attention span of an online user has decreased. Our constant exposure to easily digestible, low-latency content on today’s social media platforms has conditioned our brains towards immediate gratification that longer-form content or high-latency applications may not be able to compete with.

With this in mind, application response time has become more crucial than ever before. According to a blog post by Neil Patel², 47% of consumers expect a web page to load in 2 seconds or less, with 40% of people abandoning the website if it takes more than 3 seconds to load.

Recent advancements in the application of generative AI have massively impacted how tech companies are able to generate personalized content at scale. Applications are now limited only by the speed at which these models can parse data sets and generate outputs.

Trips, the Tripadvisor trip planning tool, uses AI to help plan your vacation

This is the case with Trips, our AI-based trip planning tool. Trips builds an entire itinerary from a user’s preferences, including their destination, travel dates, demographic details, and any supplied interests. Our AI model leverages more than a billion trusted reviews of locations, restaurants, hotels, and things to do, and as such, it may need a few extra seconds to generate unique and usable recommendations for each user. In some cases, this takes longer than the 3-second limit mentioned above.

So how can we keep engagement when our users expect content to be delivered instantly?

Know when to work asynchronously

We use generative AI to produce final recommendations and itineraries, but we also use it to help guide users through the trip generation process. For example, when inputting preferences, users can include interests on which to base their trip. Although we serve some generic interests that apply to any location, such as “Must-see Attractions” and “Great food”, we also list location-specific and season-specific items, all leveraged from listed tours, attractions availability, and user reviews.

Points of interest are automatically suggested to the user

At the moment, you can’t expect a generative AI model to parse such a large data set and produce individualized recommendations while still delivering low latency and high performance. To work around this, we start processes as early as possible, we prefetch data, and we cache as much as we can.

For the case above, we strategically added extra steps between the selection of a trip’s destination and the selection of anything destination-based. When a user selects a destination, a first request is sent to our services to start generating destination-based recommendations. This generation reuses any available cache from previous requests for that location, then returns a confirmation that the data will soon be available to fetch. By the time the user reaches the destination-based selections, any related AI-generated content is already available to be served.
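
The kick-off step described above can be sketched roughly as follows. This is a hedged TypeScript sketch, not our actual service code: `kickOffGeneration`, `startRequest`, and the response shape are hypothetical names used for illustration.

```typescript
// Illustrative sketch of "start processes as soon as possible":
// fire the generation request the moment a destination is chosen,
// long before the destination-based screens are shown.

type KickoffResponse = { requestId: string; status: "accepted" };

// In-memory cache of in-flight generations, keyed by destination, so
// repeated selections reuse the same request instead of starting over.
const pendingGenerations = new Map<string, Promise<KickoffResponse>>();

function kickOffGeneration(
  destination: string,
  startRequest: (dest: string) => Promise<KickoffResponse>
): Promise<KickoffResponse> {
  const cached = pendingGenerations.get(destination);
  if (cached) return cached; // reuse an in-flight or completed request
  const request = startRequest(destination);
  pendingGenerations.set(destination, request);
  return request;
}
```

By the time the user reaches the interests screen, the promise has usually resolved and the content can be served from cache instead of generated on demand.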

Progressive loading

When the performance of the underlying system cannot be improved enough to reliably satisfy user experience requirements, we use “tricks” to improve perceived waiting time and give the user a better overall experience. Progressive loading is a strategy that defers the loading of as many resources as possible, loading only critical elements up front while keeping the page reactive and engaging as the rest of the content arrives. It must stay synchronized with any background services so it can clearly indicate progression. It must also show a start, a loading progress indicator, and an end, all while enabling the user to interact with the page. Furthermore, it absolutely must not lose the request while doing any of this. This technique leverages the psychological principle of perceived performance: users see an application as faster and more efficient if they can quickly access the most essential information while non-essential resources load progressively in the background.

For our trip generator, we decided to implement progressive loading in four stages:

  • During stage 1, the initial generation is requested. If this succeeds, the API will tell the interface what to expect next: either a full load screen or the progressive loading stage.
  • During stage 2, we show the full load screen. This is a quick stage that happens when the AI model is at capacity.
  • The progressive loading happens during stage 3. This is a polling stage where, bit by bit, the interface polls the API and renders partial responses until the request status is complete.
  • The page is fully loaded at stage 4.
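
The last two stages above can be sketched as a small polling loop. This is a hedged TypeScript sketch under assumed names: the `poll` function, the status values, and the interval are illustrative, not our production API.

```typescript
// Sketch of stages 3 and 4: poll the backend for partial results and
// render each batch as it arrives, until the request is complete.
// Statuses and the poll function are illustrative assumptions.

type PollStatus = "queued" | "partial" | "complete";
type PollResult = { status: PollStatus; items: string[] };

async function pollProgressively(
  poll: () => Promise<PollResult>,
  onPartial: (items: string[]) => void,
  intervalMs = 500
): Promise<string[]> {
  for (;;) {
    const res = await poll();
    onPartial(res.items); // stage 3: render whatever is ready so far
    if (res.status === "complete") return res.items; // stage 4: fully loaded
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Because `onPartial` fires on every poll, the page fills in incrementally instead of jumping from empty to complete, which is what keeps the wait engaging.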

Stage 3 requires the most attention since this is where the resources are delivered to the user. This stage is expected to take the longest, so it’s important to apply a few critical techniques that will improve the user’s overall experience. We decide which resources we want to prioritize for readiness and which are non-essential and can be lazy-loaded. We then build a loading sequence that produces an intuitive and engaging experience for the user.
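
One way to express that prioritization is sketched below, under assumed names: critical resources gate the first render, while non-essential ones start immediately but hydrate in the background. The fetchers and callback here are hypothetical, not our actual resource API.

```typescript
// Sketch of prioritized loading: critical fetchers block the initial
// render; non-essential fetchers are started at the same time (so no
// request waits on another) but are not awaited.

async function loadPrioritized<C, N>(
  critical: Array<() => Promise<C>>,
  nonEssential: Array<() => Promise<N>>,
  onLazyReady: (value: N) => void
): Promise<C[]> {
  // Start every request up front so none waits on another.
  for (const fetchLazy of nonEssential) {
    fetchLazy().then(onLazyReady).catch(() => {
      // Non-essential failures are swallowed; the page stays usable.
    });
  }
  // Only the critical resources gate the first render.
  return Promise.all(critical.map((fetchCritical) => fetchCritical()));
}
```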

A variety of well-known techniques can be combined to construct an interactive loading experience, such as:

  • Skeleton screens
    These are simple representations of the components that are still loading. They replace the final component in a ready-to-use interface and give the user the complete look of the page before it has loaded. Giving an interactive interface to the user, even if some components are just skeletons of loading resources, can buy some time for your request.
  • Asynchronous loading
    It’s essential that the application manages the progressive loading of resources wisely. Heavy content must be requested as soon as possible, while lighter resources can come after. It is also important that the page avoids cascading requests: requests that depend on other resources that are not yet available. Cascading requests can be a killer for any application.
  • Smart content generation
    The generation of the content can be split into different parts, where the overall structure of the trip is handled first and details are loaded later. It’s easier to generate a list of hotels based on interests, for example, than to generate hotel descriptions based on reviews. We can serve hotel recommendations to the user in that order.
  • Feedback
    Feedback helps the user know what is going on with the page. A static image or a long loading spinner can be seen as a frozen page. Small animations or transitions can also distract users from loading times.
  • Error handling
    Managing edge cases and error states can significantly improve the user’s experience when things go wrong, especially when your resource management starts to get more complex. Friendly messages, retries, and replacing content can be very helpful when handling unexpected situations.
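
The error-handling point can be illustrated with a small retry-with-fallback helper. This is a sketch only; the retry count, backoff, and fallback content are illustrative choices, not our production values.

```typescript
// Sketch of graceful error handling: retry a failed load a few times
// with a short backoff, then replace the component with friendly
// fallback content instead of surfacing a raw error to the user.

async function loadWithRetry<T>(
  load: () => Promise<T>,
  fallback: T,
  retries = 2,
  delayMs = 100
): Promise<T> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await load();
    } catch {
      // Brief backoff between attempts; grows linearly with each retry.
      await new Promise((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
    }
  }
  // All attempts failed: serve fallback content instead of an error state.
  return fallback;
}
```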

Conclusion

The performance and interactiveness of a progressively loaded page will vary depending on the underlying implementation and the scale of the data, but the main idea will always be the same. By prioritizing the loading of critical resources and allowing non-essential elements to load progressively, developers can create a perception of speed and responsiveness that aligns with user expectations.

Ultimately, the user has the final say in how they perceive your feature or product, and you’ve got 3 seconds or less to convince them to stay. You should always strive to make the experience as performant as possible and take advantage of progressive loading techniques when dealing with large quantities of data and less than ideal processing requirements.
