Optimising Power Apps for large data sources

Jesse Kim
Published in Cruising Altitude · 11 min read · Apr 10, 2024

Contents

  • Defining ‘large’ in Power Apps
  • User interface/experience (UI/UX) tips and techniques
  • Data modelling to facilitate large data sources
  • Conclusion: It’s all about going Full Stack

Defining ‘large’ in Power Apps

The hallmark of my Power Apps projects in early 2024? Large data sources. As sexy, pragmatic, and versatile as the service is, Power Apps has surprising limitations when faced with large volumes of data.

First, the definition of ‘large’ in Power Apps needs to be put forward. There are two numbers of interest: 500 and 2000. As the official Power Apps documentation explains:

When a query is non-delegable, Power Apps only gets the first 500 records from the data source…

…for the sake of performance. Now, this limit can be increased, app by app, to 2000 at the developer’s discretion. Going by these facts, then, a table with 500 rows is considered sufficiently large in the universe of Power Apps; one that breaks the 2000-row barrier is, well, ‘unusually’ large.

“Wait, two thousand rows of data?!? How is that even fit for enterprise applications?” Sceptics of low-code and Power Apps, of course, immediately jump up and down in shock and awe upon hearing this. Let me get a couple of things straight. Having more than 2000 rows stored in a data source and being willing to scroll through 2000+ rows within the UI of an app are two different things. The Power Apps row limit is about the latter, and the default limit of 500 was empirically chosen by Microsoft to guarantee acceptable performance even when the user has the time and determination to comb through that many results in front of them.

Equally importantly, this row limit has nothing to do with the total number of records held in each data source, no matter the choice of data-layer technology; each Microsoft list (SharePoint list), for instance, can hold up to 30 million items. What Power Apps does is delegate sorting, filtering, and lookup operations to the respective data service, thereby ensuring its reach over the full set of given data. It is only when Power Apps returns the results of a single query to the user that the output is impacted by the row limit, which may or may not have important implications in individual apps.

Admittedly, the limitations and considerations mentioned above don’t necessarily matter in many real-world use cases. Motivated citizen developers are producing Power Apps that are perfectly fit for purpose without ever worrying about a row limit or delegation. Picture an app that renders daily news updates curated under specific themes, or one that facilitates approvals of travel requests initiated by team members. Here, the average end-user is simply not interested in scrolling beyond the latest few content items relevant to them, let alone 500 or 2000.

That said, my recent mission was to build Power Apps that each handled multiple large data sources, some of which hosted tens of thousands of financial records where every single one had to find its way to whoever was authorised to discover and audit it. In other words, guaranteeing full discoverability of records was the most critical challenge from a developer’s perspective. Some of the high-volume users were known to have been allocated hundreds, even thousands of subject matters to comb through. “Here are the first 500 items that may be of interest to you…” didn’t cut it. Each query had to bring back and render the precise number of applicable results, which in turn meant the 2000-row limit had to be overcome. In addition, a big chunk of the development effort went into optimisation to squeeze the last drop out of achievable performance, both actual and perceived, while also ensuring safeguards against rage clicks and potential concurrency issues across commits originating from a pool of more than a thousand authorised users.

Yes, this describes enterprise-scale applications done in Power Apps. Now, to the real point of this article — below is a round-up of tips and techniques that I found relevant and noteworthy, divided into two themes: UI/UX and data modelling.

User interface/experience tips and techniques

Mask all jaggedness. When rendering visual elements on a data-rich screen, prescribed data operations and calculations take time, often running into the hundreds or thousands of milliseconds, during which the user may be exposed to incomplete, inappropriate, or transitory results. Worse, there are occasions where on-the-fly business logic designed to hide irrelevant or unauthorised pieces of information from the current user takes effect a few blinks too late. It is the developer’s responsibility to ensure a hood over under-the-hood operations. Mask all such jaggedness with a text label featuring a helpful message (such as “Preparing to load data…”) and an opaque background whose visibility is controlled by a variable. The OnVisible logic of each screen should begin with this variable getting set to true/active/on and end with the reverse once all routines have completed. It’s a simple principle: keep screen content curtained off until it is fully ready.
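As a minimal sketch of this pattern, assuming a variable named varScreenLoading and a full-screen masking label whose Visible property is bound to it (the names, data source, and filter below are illustrative only):

    // Screen.OnVisible: curtain off the screen while data loads
    Set(varScreenLoading, true);
    // ...data preparation; the source and filter are hypothetical
    ClearCollect(colMyRecords, Filter(Records, AssignedTo = User().Email));
    Set(varScreenLoading, false)

    // Visible property of the masking label (opaque fill, "Preparing to load data…")
    varScreenLoading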

Communicate what to expect. Clever use of the above-mentioned masking label can turn unavoidable lead time into something more psychologically positive and factually informative. For example, in predictable edge cases where a query would take longer because there are far more rows of data to read in than usual, utilise a masking label to indicate the progress in increments instead of, or in addition to, a generic “Please wait…”. If the total number of rows to process is known in advance, a fancy progress bar can also be implemented. Every forced pause is an opportunity to communicate.
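To sketch the idea, assuming the load is tracked by two hypothetical variables, varLoadedRows and varTotalRows, the masking label’s Text property could read:

    // Text property of the masking label
    If(
        varTotalRows > 0,
        "Loading " & varLoadedRows & " of " & varTotalRows & " records (" &
            Round(100 * varLoadedRows / varTotalRows, 0) & "%)…",
        "Preparing to load data…"
    )

The same two variables can drive a progress bar, for instance a rectangle whose Width is a fraction of its parent container’s width.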

Eliminate rage clicks. Even after the completion of screen load, the same technique of using a variable to force a ‘hands-off-while-processing’ insulation can be applied to buttons, dropdowns, and other controls that trigger data operations, through the DisplayMode or Visible property of each. A common example is two buttons placed side by side where one of the buttons takes a visibly significant amount of time to complete its OnSelect operations. The second button, or any other clickable controls on the screen for that matter, should remain disabled for the duration of the execution in order to bulletproof the app against conflicting, interrupting, or otherwise unwanted user input that may end up compromising the integrity of underlying data.
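A minimal sketch, assuming a variable named varBusy and a long-running Patch as the illustrative operation (Requests and varSelectedRequest are hypothetical names):

    // OnSelect of the slow button
    Set(varBusy, true);
    Patch(Requests, varSelectedRequest, {Status: "Approved"});   // hypothetical long-running commit
    Set(varBusy, false)

    // DisplayMode of the neighbouring button (and any other clickable control on the screen)
    If(varBusy, DisplayMode.Disabled, DisplayMode.Edit)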

Choose pagination over scrolling up and down. One thing I was able to compare first-hand, side by side, was the performance of a paginated scroll-less gallery against that of a traditional scrollable one. Conclusion: Pagination that caps the number of results rendered on the screen to some low number, say 10, performs significantly better than a gallery holding an arbitrary number of results capped only by the imposed row limit, even when both sides hold the same number of data rows in memory. That extra development effort to implement pagination via three simple custom parameters working in the Items property of a gallery — number of items to accommodate per page, current page, and total number of pages — handsomely pays off. Pagination also opens the door to bulk operations across the items on any given page with predictable, consistent processing overheads. This goes to show the significance of last-mile overheads going into rendering visual elements, irrespective of how repetitive things appear to be.
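A sketch of the three-parameter approach, assuming an in-memory collection colResults and hypothetical variables varPageSize, varCurrentPage, and varTotalPages:

    // Items property of the gallery: render only the current page
    FirstN(
        LastN(colResults, CountRows(colResults) - (varCurrentPage - 1) * varPageSize),
        varPageSize
    )

    // Derived once after the collection is built
    Set(varTotalPages, RoundUp(CountRows(colResults) / varPageSize, 0));

    // OnSelect of the 'Next' and 'Previous' buttons respectively
    Set(varCurrentPage, Min(varCurrentPage + 1, varTotalPages))
    Set(varCurrentPage, Max(varCurrentPage - 1, 1))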

Choose pop-up over screen navigation. The most typical construct in Power Apps development consists of two screens: Call them ‘Listing’ and ‘Item Details’. Selecting an item from the Listing screen or choosing to create a new item transports the user to the Item Details screen, and then back to Listing when the task at hand is done. Such physical back-and-forth and repeated reloads have no real benefits from a performance standpoint, especially when the Listing screen is already aware of all attributes of each item in the data source. A sensible alternative is to implement a pop-up to render the specifics of the currently selected item without any physical screen navigation, conceptually similar to a single-page application.
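For instance, assuming a pop-up container on the Listing screen and two hypothetical variables, varSelectedItem and varShowDetails:

    // OnSelect of the gallery item: no Navigate() call
    Set(varSelectedItem, ThisItem);
    Set(varShowDetails, true)

    // Visible property of the pop-up container; its inner controls bind to varSelectedItem
    varShowDetails

    // OnSelect of the pop-up's 'Close' button
    Set(varShowDetails, false)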

Minimise interaction with live data sources. One way to speed up read/write operations on data is to take a snapshot of some or all rows in a data source by creating a collection in memory. Operate on this in-memory replica to search, filter, look up, consume, analyse, update, create, and delete data while enjoying the best possible speed and efficiency Power Apps can give. Compared to this, having to connect to and query live data through the network for every single read/write operation quickly (or slowly) becomes costly. The obvious catch, however, is that the results of any update/create/delete operations (‘deltas’) must also be promptly written back to the live data source to keep things in sync. Another caveat is that the benefits of an in-memory collection are not universally present; if the data source is highly transactional in nature and if there is little or no sense of compartmentalisation in the way people work with it, any snapshot will quickly become outdated and risk losing its value as a source of truth, ultimately amounting to a concurrency nightmare. For this reason, in-memory collections generally work best with reference or dimension tables where only a low volume or frequency of change is expected under normal operation. [Edit: Appropriately filtered in-memory collections work well with transaction data where there are clear allocations of responsibilities.]
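A sketch under the assumption of a low-churn reference table named Projects (hypothetical, as are the column names), snapshotted into a collection and kept in sync on every write:

    // Take the snapshot once, e.g. in App.OnStart or a screen's OnVisible
    ClearCollect(colProjects, Projects);

    // Read operations hit the in-memory copy
    LookUp(colProjects, ProjectID = varProjectID, ProjectName)

    // Write operations update both the copy and the live source to keep the delta in sync
    Patch(colProjects, LookUp(colProjects, ProjectID = varProjectID), {Status: "Closed"});
    Patch(Projects, LookUp(Projects, ProjectID = varProjectID), {Status: "Closed"})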

Overcome the row limit. Time to address the elephant in the room: Not being able to access the complete set of sought-after records due to a hard row limit imposed by Power Apps can be a major user experience failure. Any such edge cases must be caught by the developer and testers early on. Perform a test to see if the entirety of the data sought after by the current user exceeds what a single query can deliver. If so, employ a divide-and-conquer tactic. Split the data source in question into logical compartments based on the characteristics of a column that can act as a reference, where none of those resulting compartments violates the row limit. Examples of compartmentalisation include non-blank text starting with each letter of the alphabet, financial year-quarter date ranges, ownership or assignment of each record, or some other obvious, readily available classification; every data source presents ways to slice and dice itself into reasonably even chunks. Bring those compartments in as separate queries and assemble them into a single consolidated in-memory collection, which knows no such thing as a row limit. Performing the individual queries in parallel inside a Concurrent() block most definitely helps.
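As an illustration, assuming a hypothetical Transactions source compartmentalised by a FiscalQuarter column where no single quarter exceeds the row limit:

    // Pull the compartments in parallel; each Filter must stay delegable and under the row limit
    Concurrent(
        ClearCollect(colQ1, Filter(Transactions, FiscalQuarter = "FY24-Q1")),
        ClearCollect(colQ2, Filter(Transactions, FiscalQuarter = "FY24-Q2")),
        ClearCollect(colQ3, Filter(Transactions, FiscalQuarter = "FY24-Q3")),
        ClearCollect(colQ4, Filter(Transactions, FiscalQuarter = "FY24-Q4"))
    );
    // Consolidate into one in-memory collection, which has no row limit of its own
    ClearCollect(colAllTransactions, colQ1, colQ2, colQ3, colQ4)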

Avoid concurrency pitfalls. In a data source where any number of people could be working on any single row to update any combination of its fields anytime, concurrency checks are a must. Dealing with concurrency need not be rocket science, though. When viewing a record, show on the screen who last modified it and when. As a user begins to edit a record, capture the last-modified date/time of that record. When the user submits their change, query the last-modified date/time of the same record in the live data source. If the two date/times don’t match, stop and alert the user. It may be a good idea to give them options: 1) start over with the now-up-to-date version of the record; or 2) acknowledge and override.
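A sketch of that optimistic check, assuming a hypothetical Requests source with a Modified column and a selected record held in varSelectedRequest:

    // When the user opens the record for editing
    Set(varOpenedModified, varSelectedRequest.Modified);

    // When the user submits their change
    If(
        LookUp(Requests, ID = varSelectedRequest.ID, Modified) <> varOpenedModified,
        Notify("This record has been changed by someone else. Please review before saving.",
            NotificationType.Warning),
        Patch(Requests, varSelectedRequest, {Status: drpStatus.Selected.Value})
    )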

Avoid loops and repeated lookups. In Power Apps, creating a loop with ForAll() for actions executed one row at a time is generally a bad idea performance-wise, unless the number of iterations remains low (as in single-digit low) and predictable. Actually, ForAll() executed on a large data source or collection may be useful if the intent is to create some animated visual effect demonstrating growth or depletion one unit at a time, much like how live election results are televised. A true alternative to ForAll() that triggers unique data operations in each iteration may require the developer to re-think the organisation of their data sources, possibly leading to treatment or surgery beyond UI and code (see the next section). On a related note, the practice of repeatedly calling LookUp() within a single routine or event to get the value of a single field inside a single record each time is a major resource-waster and should be avoided, especially if the target record resides in a live data source. Instead, do a LookUp() once on the entire target record and save the result in a variable. Subsequent references will point to attributes of this in-memory variable as opposed to issuing fresh queries to fetch data.
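For example, rather than issuing a fresh LookUp() per field (the Customers source and its columns here are hypothetical):

    // Wasteful: a separate query against the live source for each field
    //   LookUp(Customers, CustomerID = varID, FullName)
    //   LookUp(Customers, CustomerID = varID, Email)
    //   LookUp(Customers, CustomerID = varID, Region)

    // Better: fetch the whole record once, then reference it from memory
    Set(varCustomer, LookUp(Customers, CustomerID = varID));
    // ...subsequent references use varCustomer.FullName, varCustomer.Email, varCustomer.Region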

Data modelling to facilitate large data sources

Unlike a greenfield project, app development with ‘large’ data sources naturally entails a large pile of pre-existing data organised in an established, complex schema to which the app developer (probably) has had no prior contribution or oversight. Access to this data is typically granted via a tightly scoped window through which the developer can exercise very little control, if any, over what comes in through the pipe. But this is precisely how enterprise resources are managed: separation of duties, governance protocols, black boxes, multiple Area 51s, and people minding their own business while respecting each other’s specialisations.

Nevertheless, the objective is to improve and modernise business processes using masses of existing data through a brand-new app experience. What matters to the app developer, then, is that their app operates on a data model that not only satisfies the prescribed functional requirements but also facilitates optimal performance and the highest achievable efficiency. Closing the gap between available raw data and the desired data model crafted for the inner workings of the app is hard work.

The app developer might not have knowledge of what goes on in all the Area 51s out there. But ensuring that the data at hand goes through the necessary treatment, such that it works with the nuances of their app and therefore for its users, is squarely under their control. Find a place or platform to construct supporting tables, columns, relationships, and indexes as required. Feed this model applicable processed data using tools of automation and analysis, one-off or scheduled. Keep optimising the setup with reporting, archiving, and exports in mind. This is a full-stack app developer’s own Area 51 — hopefully a sufficiently documented one.

Data modelling in this app-focused context means making sure:

  • Reliable, functioning primary keys exist, and individual records can always be pinpointed and looked up with complete accuracy.
  • Any duplicate, redundant, or missing pieces of data are identified, and how they are to be treated in the app is agreed upon.
  • The app has a normalised view of the data. Meaningful dimension tables are added to the data model as a result.
  • Every record is accounted for and made accessible to authorised users in a security-trimmed manner.
  • Necessary search and filtering criteria are understood and facilitated.
  • Columns used for in-app sorting and filtering are appropriately indexed.
  • Referential integrity is established and asserted across commits.
  • Calculated tables and columns to speed up filtering and lookup operations, or to ensure visibility, are in place.
  • Results of write-back operations are headed to the most appropriate destination, which can be any of the following: 1) replacing original values in the data source; 2) new columns added to the data source; or 3) a new table, with the original data source left intact and read-only. The choice needs to be driven by a mix of performance implications, licensing considerations, governance priorities, and ease of reporting.
  • Things are built with futureproofing and interoperability in mind. The same data model could serve other similar apps or systems in the future.
  • No matter what treatment or surgery has been applied, no part of the original truth is lost, irreversibly manipulated, or otherwise has its integrity compromised.

I don’t think the above fits into any of the traditional Data Analyst-Scientist-Engineer pigeonholes. I see this as an augmentation of app development, particularly for Power Apps. (Perhaps learning Power Apps is a natural augmentation of every data profession?)

It’s all about going Full Stack.

Budding Power Apps developers and consultants will face large data sources sooner or later and be expected to plug them into their apps. It takes in-depth optimisation and fine-tuning at both the UI/UX and data levels, and this work will propel them to become full-stack engineers. Front end vs. back end is never a debatable subject in the world of Power Apps; they are one and the same.
