Need help getting the most out of your computer? Help is at hand
Ramping up the competition against OpenAI, Anthropic has just announced the launch of an improved version of Claude 3.5 Sonnet that can interact with any application on a computer.
Through a new “computer use” API, now available in public beta, the model can perform keystrokes, button clicks, and mouse or trackpad movements, essentially emulating a user sitting in front of a computer.
The model is trained to interpret what is happening on a screen and then use the available software tools to carry out tasks. When a developer asks Claude to use a certain program and grants it the necessary access, it looks at screenshots of what is visible to the user, counts how many pixels the cursor needs to move vertically or horizontally, and clicks in the appropriate place.
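To make the screenshot, measure, move, click cycle a little more concrete, here is a minimal sketch in Python. It does not call Anthropic's API: `FakeDesktop`, `Action`, `plan_click` and `run_step` are hypothetical names invented for illustration, the "screenshot" is just a small dictionary rather than an image, and the planning function is a stand-in for whatever the model actually decides.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """One input event the agent asks the computer to perform (hypothetical)."""
    kind: str          # "mouse_move", "left_click" or "type"
    x: int = 0
    y: int = 0
    text: str = ""


def pixel_offset(cursor, target):
    """Pixels the cursor must travel horizontally (dx) and vertically (dy)."""
    return target[0] - cursor[0], target[1] - cursor[1]


class FakeDesktop:
    """Stand-in for a real screen: it only records what was done to it."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.cursor = (0, 0)
        self.log = []

    def screenshot(self):
        # A real system would return image data; this sketch returns only
        # the facts the planner needs.
        return {"size": (self.width, self.height), "cursor": self.cursor}

    def execute(self, action):
        if action.kind == "mouse_move":
            # Keep the cursor inside the screen bounds.
            x = min(max(action.x, 0), self.width - 1)
            y = min(max(action.y, 0), self.height - 1)
            self.cursor = (x, y)
            self.log.append(f"move to {self.cursor}")
        elif action.kind == "left_click":
            self.log.append(f"click at {self.cursor}")
        elif action.kind == "type":
            self.log.append(f"type {action.text!r}")
        else:
            raise ValueError(f"unknown action: {action.kind}")


def plan_click(shot, target):
    """Assumed stand-in for the model: move if needed, then click."""
    dx, dy = pixel_offset(shot["cursor"], target)
    actions = []
    if (dx, dy) != (0, 0):
        actions.append(Action("mouse_move", x=target[0], y=target[1]))
    actions.append(Action("left_click"))
    return actions


def run_step(desktop, target):
    """One pass of the loop: look at the screen, plan, act."""
    shot = desktop.screenshot()
    for action in plan_click(shot, target):
        desktop.execute(action)
    return desktop.log
```

In a real deployment the planning step would be the model's response to an actual screenshot, and the loop would repeat, taking a fresh screenshot after each action, until the task is done.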
The model can take on practically any task using a range of apps. This is an idea that some academics have been advancing and discussing for some time now, and which Cory Doctorow describes as “user agents”: software that is loyal to the user and able, for example, to manage a browser’s preferences so that pages load the way the user wants (blocking ads, using color schemes that are easier to see, deleting cookies after the visit, logging in a certain way, etc.), always giving preference to the user over what…