Software Action Models Will Commoditize Today’s Platforms
I am incredibly excited about a recent AI innovation called “software action models.” These are ML models trained to use software GUIs as a human user would. I’ve previously written about these models in my AI Market Theses post; I’d also recommend checking out Adept Lab’s demo of what these models can do.
This technology has an obvious, big enterprise market to go after: Robotic Process Automation. And while that market is super interesting, I think there’s an even bigger technological shift that it will bring about: software action models will de-platform major consumer applications by shifting behavioral power to the end user.
This blog post presents my argument for this shift, as well as who stands to benefit (or lose) from such an evolution.
Shift in Power
Let’s start by acknowledging a fundamental truth of today’s market: every company decides exactly how its users access its service/product. There are two types of users, a human or a machine/third-party. Each has a product form factor: build a GUI for humans to access your service; build an API for code to access your service. There’s generally no overlap between these two paradigms — machines can’t use GUIs, and humans can’t use APIs. Companies determine their business strategy and then cater to the necessary user type(s) by building the appropriate product form factor. We tend to take for granted that companies hold this power, and how strongly it can influence their business model.
Software action models change the playing field. They make it much more difficult for a company to expressly disallow machine or third-party access to its service/product (although I’m sure they will still try). Mint and the early days of Plaid are helpful examples of similar end-runs around how companies (in this case, banks) allow access to their services. Software action models will usher in a supercharged version of this behavior, shifting the balance of power towards the end user.
Let’s look at an example. Every time I order a rideshare I open and use two mobile apps, Lyft and Uber. I’m forced to open both apps because neither Uber nor Lyft wants to become a commodity; the only way to order an Uber is to use the Uber app. Each company aspires to be its own ecosystem, its own app experience. As the end user I don’t care about any of that. I’m simply comparing price and ETA to get into a car.
If my phone’s operating system (iOS, in my case) had a built-in software action model, it could grok available rideshare options from Uber and Lyft and present those aggregated options (or even just make the decision for me) without my ever having to interact with the rideshare apps themselves.
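To make the aggregation step concrete, here is a minimal sketch of what the model might do once it has read a quote out of each app. Everything in it is an assumption for illustration: the RideOption type, the hard-coded quotes, and the idea of pricing each minute of waiting at a fixed dollar amount are all hypothetical, and the hard part (actually driving each app’s GUI to obtain the quotes) is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RideOption:
    """One quote read from a rideshare app (hypothetical structure)."""
    provider: str       # e.g. "Uber" or "Lyft"
    price_usd: float    # quoted fare
    eta_minutes: float  # pickup ETA


def rank_options(options, minute_value_usd=0.50):
    """Sort quotes by fare plus an assumed dollar cost per minute of waiting."""
    return sorted(
        options,
        key=lambda o: o.price_usd + minute_value_usd * o.eta_minutes,
    )


def best_option(options, minute_value_usd=0.50):
    """Return the top-ranked quote, or None if no quotes were gathered."""
    ranked = rank_options(options, minute_value_usd)
    return ranked[0] if ranked else None


# Hypothetical quotes the model might have gathered from the two apps.
quotes = [
    RideOption("Uber", 18.40, 6),
    RideOption("Lyft", 17.10, 4),
]
choice = best_option(quotes)
print(choice.provider if choice else "no rides available")
```

The user-facing knob here is minute_value_usd: someone in a hurry would raise it, someone price-sensitive would lower it. The point is that the comparison logic lives with the user rather than inside either app.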
Are these first-world problems? Absolutely. But in a tech ecosystem obsessed with owning consumer behavior, this shift could have massive implications for how we build and invest in businesses moving forward.
Winners and Losers
The optimal place in the tech stack for software action models is within the operating system. Browsers are the next best choice, especially on desktop (as a higher share of activity occurs there compared to mobile browsers). Integrating this technology at either of these layers gives users the ability to fully leverage the model’s power with whatever service they wish.
I can see startups in this space becoming attractive acquisitions for adventurous OS/browser owners looking to differentiate their product through some innovative, but risky, changes in form factor — Android and Microsoft come to mind.
I concede that this reality is likely to be delayed by economic factors rather than technological limitations. There is an incredible amount of economic pressure to maintain the status quo. Operating systems, especially mobile ones, are economically linked to the revenue of their application ecosystem. So my hope is that we’ll see this paradigm shift first on desktop/browser and then, once the behavior is commonplace enough that consumers require it, mobile operating systems will follow suit.
One of the dangers to the end user is a software action model that is economically incentivized towards certain behaviors. To use my previous example, what if your model had a lead generation deal with Uber on the backend, and was more likely to choose Uber than Lyft in “close” scenarios? Especially if these abilities are integrated at the operating system layer, and therefore not made available as a “pay for your own” extra, it will be important for companies to be transparent about how those models behave. The alternative is that we’re all walking around with a personal assistant that makes decisions not entirely on our behalf.
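Here is a small sketch of how subtle that kind of steering could be. It is purely hypothetical: the function, the partner parameter, and the 5% “close” margin are invented for illustration and do not describe any real product. The same selection routine is honest when partner is None and quietly biased when it is set.

```python
def choose_provider(costs, partner=None, close_margin=0.05):
    """Pick a provider from a dict of provider name -> total cost.

    With no partner, this returns the cheapest provider. With a partner
    (a hypothetical lead generation deal), the partner wins whenever its
    cost is within close_margin of the cheapest option.
    """
    cheapest = min(costs, key=costs.get)
    if partner in costs and costs[partner] <= costs[cheapest] * (1 + close_margin):
        return partner
    return cheapest


close_call = {"Uber": 19.50, "Lyft": 19.10}
print(choose_provider(close_call))                  # neutral model
print(choose_provider(close_call, partner="Uber"))  # model with a partner deal
```

From the outside the two models look identical most of the time; they only diverge in close calls, which is exactly why disclosure of settings like partner and close_margin would matter.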
The losers in this equation are application-layer services that:
a) Eschew API access
b) Sell a commodity (ex. Uber and Lyft), or have the same inventory as other apps (ex. food delivery)
c) Rely on ad revenue, but users want to minimize time spent in the app (ex. Fandango)
These services will inevitably fight against AI usage of their products. A good analogy here is the evolution of screen scraping companies, like Mint or the early days of Plaid. Banks initially fought “spoofed” logins, but eventually acquiesced and began working to enable this behavior in a safe (and in some cases mutually economically beneficial) way. That was data; this is action.
It would be interesting to pass regulation that upholds a user’s right to access services by proxy (i.e., through a software action model). We could make a human using an application legally indistinguishable from a software action model using that same application on the human’s behalf. That would help short-circuit some of the inevitable resistance. And, fascinatingly, it would move us further towards a world that recognizes an individual’s ability to delegate technological agents to operate autonomously in the digital world on their behalf.