GenAI: Why Apple Will Strike Back

Paul Hackenberger
Published in Axel Springer Tech · May 27, 2024
Apple’s ML Logo

Apple has recently recognized how far it lags behind in the LLM/GenAI space, where rivals are shipping systems like ChatGPT, Gemini, and LLaMA. As these AI technologies advance rapidly, Apple is strategizing to catch up and potentially surpass its competitors. Despite being late to the game, Apple can leverage its existing technologies and resources to build the most personalized on-device AI assistant.

Disclaimer

The content of this article is not connected in any way with Apple Inc. Any references to Apple, its products, or services are for illustrative purposes only and should not be construed as endorsements or affiliations. All views and opinions expressed are solely those of the author.

Acknowledging the Gap

In early 2023, Apple’s top software executives realized the urgency to revamp Siri after testing OpenAI’s ChatGPT. ChatGPT’s capabilities in generating poetry, coding, and handling complex queries underscored Siri’s limitations. Siri struggled with conversation continuity and often misunderstood user requests, making it clear that a significant overhaul was needed.

This realization triggered Apple's most significant reorganization in over a decade, focused on integrating generative AI throughout its ecosystem. Determined to close the gap in the tech industry's AI race, Apple has made generative AI a tent pole project, the internal label the company uses to organize employees around once-in-a-decade initiatives. Apple aims to unveil an improved Siri with enhanced conversational abilities, supported by new generative AI technology, at its developers' conference on June 10.

Existing iOS AI Capabilities

Apple has been quietly incorporating AI features into its products for years, showcasing its commitment to enhancing user experience through advanced technology.

Apple Photos App

The Apple Photos app offers users innovative AI-driven features to enhance their photo management experience. One notable feature is the ability to create and share cutouts, allowing users to isolate and share the subject of a photo effortlessly. This functionality is available on iPhone XS, iPhone XR, and later models.

Another impressive AI capability is recognizing people in photos through private on-device machine learning. The Photos app uses deep neural networks to detect faces and upper bodies, extract feature vectors (embeddings), and cluster these embeddings to form galleries of known individuals. This supports features like viewing images of a specific person, accessing the People Album, and creating personalized Memories. Apple’s emphasis on privacy ensures that this process happens entirely on the device, providing real-time processing with minimal latency.
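For a concrete sense of the building blocks involved, the sketch below uses Apple's public Vision framework to detect face bounding boxes in a photo entirely on-device. It is only an illustration of the kind of on-device face detection available to developers, not the Photos app's internal pipeline.

```swift
import UIKit
import Vision

// Detect face bounding boxes in a photo on-device with the Vision framework.
// This mirrors only the first stage (face detection) of a people-recognition pipeline.
func detectFaces(in image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNDetectFaceRectanglesRequest { request, error in
        guard error == nil, let faces = request.results as? [VNFaceObservation] else { return }
        for face in faces {
            // boundingBox is normalized (0...1) with the origin at the lower-left corner.
            print("Face at:", face.boundingBox)
        }
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```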

On-device AI APIs

On-device AI API Examples

Apple provides a range of on-device AI APIs designed for immediate app integration, requiring no prior machine learning experience. These APIs fall into several categories:

  • Vision: Image Classification, Object Detection, Text Recognition, Face Detection, and more.
  • Natural Language: Tokenization, Named Entity Recognition, Sentiment Analysis, and more.
  • Speech: Speech Recognition.
  • Sound Analysis: Sound Classification.

These APIs enable developers to create apps that can perform complex AI-driven tasks on the device, ensuring user privacy and enhancing performance.
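As a small, self-contained illustration, the sketch below runs Named Entity Recognition and sentiment analysis with the Natural Language framework, entirely on-device. The sample sentence is made up for the example; the APIs are the public ones listed above.

```swift
import NaturalLanguage

// Sample text invented for this example.
let text = "Apple is expected to preview new Siri features at WWDC in Cupertino."

let tagger = NLTagger(tagSchemes: [.nameType, .sentimentScore])
tagger.string = text

// Named Entity Recognition: people, places, and organizations.
let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                     unit: .word,
                     scheme: .nameType,
                     options: options) { tag, range in
    if let tag = tag, [.personalName, .placeName, .organizationName].contains(tag) {
        print("\(text[range]) -> \(tag.rawValue)")
    }
    return true
}

// Sentiment analysis: a score between -1.0 (negative) and 1.0 (positive) per paragraph.
let (sentiment, _) = tagger.tag(at: text.startIndex, unit: .paragraph, scheme: .sentimentScore)
print("Sentiment score:", sentiment?.rawValue ?? "n/a")
```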

Leveraging iOS Technologies for a Personalized AI Assistant

Apple’s existing iOS technologies and apps position it to build the most personalized on-device AI assistant.

iOS 18 and the Deal with OpenAI

Bloomberg reported that Apple is nearing a deal with OpenAI to integrate ChatGPT into its upcoming iOS 18 for iPhones. This collaboration would make OpenAI Apple’s first AI partner, though Apple is also in talks with Google and Anthropic to enhance Siri and search functions. The company is simultaneously developing its own AI chip for data centers, indicating a dual strategy of using both in-house and external AI technologies. Apple’s preference for on-device AI solutions, due to privacy concerns, has slowed its progress in adopting cloud-based AI.

Siri with Generative AI

Apple might enhance Siri with generative AI to compete with advanced chatbots from OpenAI, Microsoft, or Google. After testing ChatGPT, Apple executives recognized Siri’s limitations in understanding and conversing. Insiders indicate that Apple will announce significant AI advancements at its WWDC event on June 10, 2024, including AI-powered features in the new iPhone.

Siri, introduced in 2011, has long been criticized for its lack of conversational ability. Apple's restructuring focuses on integrating generative AI into Siri and its operating systems, aiming to make the assistant more responsive and capable of complex tasks such as text summarization. Unlike most competing models, which rely on cloud servers, AI processing will primarily occur on-device. The upcoming iOS 18 is expected to include these generative AI features.

MLLM-Guided Image Editing in Photos App (MGIE)

Apple has developed the MLLM-Guided Image Editing (MGIE) model, which enables users to modify images through simple text descriptions. Collaborating with the University of California, Santa Barbara, Apple’s MGIE can perform tasks like cropping, resizing, flipping, and applying filters without traditional photo editing tools. By interpreting user prompts and generating corresponding visual changes, the model can handle complex edits, such as altering object shapes or brightness levels. For instance, typing “make it more healthy” on a pizza photo adds vegetables, while “add more contrast” brightens a dark image. Available on GitHub and Hugging Face Spaces, MGIE exemplifies Apple’s expanding AI capabilities, aiming to integrate more AI features into its products.

Apple’s Future AI Integration Options

Apple’s AI strategy includes various future integration options, leveraging its existing and new AI capabilities to enhance user experience and stay ahead in the AI race.

Apple’s Option: On-device (Edge) AI on iOS

On-device AI offers significant advantages, including enhanced privacy and performance. Major tech players like Google and Meta are already working on edge AI solutions like Google Gemini Nano and Meta’s Llama 3, which provide powerful AI functionalities directly on the device. These solutions ensure user data remains private and enable real-time processing with minimal latency. Apple’s focus on on-device AI aligns with its commitment to user privacy and seamless performance.

Journal App: King of User Context

Journal App

The Apple iOS Journal app provides users with a digital diary that leverages the capabilities of their iOS devices to automatically capture and organize their personal experiences and activities. This app integrates seamlessly with other Apple services and applications, offering users a comprehensive journaling experience with features like automatic entry suggestions, personalized moments, integration with health and fitness data, and more. The Journal app’s emphasis on privacy and security ensures that all journal entries and data are encrypted and stored securely.

Journal App Data Sources

  • Photos and Videos: Entries can be created based on photos and videos taken with the device’s camera or saved in the Photos app.
  • Location Data: The app can suggest entries based on significant locations visited, using data from the Maps app.
  • Health and Fitness Data: Integration with the Health app to include data from workouts, mindfulness sessions, sleep patterns, and other health metrics.
  • Calendar Events: The app can pull in events and reminders from the Calendar app, helping users to journal about planned activities.
  • Music: Information about music played on the device, such as songs and playlists from Apple Music, can be used to enhance journal entries.
  • Weather: Current weather conditions at the user’s location can be automatically added to entries.
  • Messages and Communication: Insights from interactions via Messages, calls, and other communication apps may be used to suggest journal entries.
  • Safari: Browsing history and Safari highlights to help recall web activities and interests.

Apple’s Option: iOS-integrated AI with full access to user context

By leveraging on-device AI, Apple can provide a highly personalized user experience. AI models can draw on the user-specific context available on the device to generate the most personalized and specific answers. This context-driven approach ensures that the AI understands user intentions more accurately and provides more relevant responses.

Core Spotlight: Indexing and Summarizing

Spotlight Icon

Core Spotlight allows developers to make their app’s content searchable on users’ devices, providing a seamless way for users to access specific activities and items directly from Spotlight and Safari search results. By utilizing the Core Spotlight framework, developers can index user data such as photos, contacts, and purchased items on the device, ensuring that this data remains private and is not shared with Apple or synced between devices. This local indexing enhances user experience by making app content easily accessible without needing to navigate through the app manually.
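As a minimal sketch of that workflow, the snippet below indexes a single, made-up journal entry with Core Spotlight so it becomes searchable on the device. The identifiers and attribute values are purely illustrative.

```swift
import CoreSpotlight
import UniformTypeIdentifiers

// Describe the content so Spotlight can display and rank it.
let attributes = CSSearchableItemAttributeSet(contentType: .text)
attributes.title = "Trip to Lisbon"
attributes.contentDescription = "Journal entry from May 2024 with photos and a running workout."
attributes.keywords = ["journal", "travel", "Lisbon"]

// Wrap it in a searchable item; the identifiers here are hypothetical.
let item = CSSearchableItem(uniqueIdentifier: "journal-entry-2024-05-12",
                            domainIdentifier: "com.example.journal",
                            attributeSet: attributes)

// Add the item to the on-device index; nothing leaves the device.
CSSearchableIndex.default().indexSearchableItems([item]) { error in
    if let error = error {
        print("Indexing failed:", error.localizedDescription)
    } else {
        print("Entry is now searchable from Spotlight.")
    }
}
```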

Apple’s Option: AI RAG SDK

Building on the Spotlight SDK, Apple could develop an AI Spotlight Retrieval-Augmented Generation (RAG) SDK. This would allow app developers to create AI agents that index or summarize their content, deeply integrating it into iOS. Such a framework would enhance the discoverability and usability of app content, leveraging AI to provide richer and more relevant search results.
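No such SDK exists today, so any code can only be speculative. Below is a purely hypothetical sketch of what a retrieval interface for such a framework might look like; every type and method name is invented for illustration.

```swift
import Foundation

// Hypothetical sketch: none of these types exist in any Apple SDK today.
// An app would return content chunks relevant to a user query, for the
// system AI to ground ("augment") its generated answer on.
protocol AIRetrievalProvider {
    func retrieve(query: String, limit: Int) async throws -> [RetrievedChunk]
}

struct RetrievedChunk {
    let text: String             // indexed or summarized app content
    let sourceIdentifier: String // deep link back into the originating app
    let relevance: Double        // ranking hint for the system AI
}
```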

Shortcuts App

Shortcuts App

The iOS Shortcuts app is a powerful tool designed to streamline tasks by allowing users to create custom shortcuts and automate actions. Users can combine multiple actions to create complex workflows and integrate third-party apps through action extensions and SiriKit. This flexibility makes Shortcuts a versatile platform for automating daily routines and enhancing productivity.
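Third-party actions already reach Shortcuts (and Siri) through the App Intents framework. The sketch below shows a minimal, illustrative intent; the intent name, parameter, and summarization logic are placeholders rather than a real app's implementation.

```swift
import AppIntents

// A minimal App Intents action that Shortcuts can discover and invoke.
// Name, parameter, and logic are placeholders for illustration.
struct SummarizeNotesIntent: AppIntent {
    static var title: LocalizedStringResource = "Summarize Notes"
    static var description = IntentDescription("Summarizes the notes on a given topic.")

    @Parameter(title: "Topic")
    var topic: String

    func perform() async throws -> some IntentResult & ReturnsValue<String> {
        // Real app-specific logic would go here; this placeholder just echoes the topic.
        let summary = "Summary of notes about \(topic)."
        return .result(value: summary)
    }
}
```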

OpenInterpreter: Shortcuts meets AI

Similar to the Apple Shortcuts app, the open-source project OpenInterpreter is a versatile tool that lets an AI run tasks and execute code directly on a user's local computer using natural language commands, controlling the OS not only via the terminal but also via mouse and keyboard input. This integration allows users to automate tasks via AI and perform complex operations without having to write extensive automation code by hand.

The team behind it even produced a hardware device called 01 to showcase its capabilities.

Apple’s Option: AI Shortcut Actions SDK

Apple could further enhance the Shortcuts app by developing an AI Shortcut Actions SDK. This would allow app developers to provide custom actions, code executions, or task automations integrated into the iOS ecosystem. By enabling AI to execute these actions and extend its functionality through app-specific plugins, Apple could create a more powerful and user-friendly automation platform.

WidgetKit and App Clips

WidgetKit enables developers to create widgets for their apps, offering quick, glanceable information on the Home Screen or Today View. Widgets can display static content or dynamically update based on data changes, providing users with essential information without needing to open the app.
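For context, the sketch below shows roughly what a minimal WidgetKit widget looks like today: a timeline provider feeding entries into a SwiftUI view. The widget name and sample text are invented for illustration.

```swift
import WidgetKit
import SwiftUI

// One timeline entry: a timestamp plus the text to display.
struct AnswerEntry: TimelineEntry {
    let date: Date
    let text: String
}

// Supplies placeholder, snapshot, and timeline data to the system.
struct AnswerProvider: TimelineProvider {
    func placeholder(in context: Context) -> AnswerEntry {
        AnswerEntry(date: Date(), text: "…")
    }
    func getSnapshot(in context: Context, completion: @escaping (AnswerEntry) -> Void) {
        completion(AnswerEntry(date: Date(), text: "Next workout: 6 km run"))
    }
    func getTimeline(in context: Context, completion: @escaping (Timeline<AnswerEntry>) -> Void) {
        let entry = AnswerEntry(date: Date(), text: "Next workout: 6 km run")
        completion(Timeline(entries: [entry], policy: .atEnd))
    }
}

// The widget itself: a static configuration rendering each entry with SwiftUI.
struct AnswerWidget: Widget {
    var body: some WidgetConfiguration {
        StaticConfiguration(kind: "AnswerWidget", provider: AnswerProvider()) { entry in
            Text(entry.text)
        }
        .configurationDisplayName("Quick Answer")
        .description("Shows the latest app-provided answer at a glance.")
    }
}
```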

App Clips, introduced in iOS 14, enable users to access a small, essential part of an app quickly without downloading the full version, streamlining tasks like paying for parking, ordering food, or renting scooters. These lightweight clips, limited to 10 MB, can be launched via QR codes, NFC tags, Safari links, Apple Maps, and Messages. They integrate with Apple Pay for secure transactions and Sign in with Apple for easy authentication, ensuring privacy and security. App Clips provide an efficient, user-friendly experience while offering developers a way to increase engagement and reach a wider audience, potentially converting users to the full app.

Apple’s Option: iOS-integrated AI Answer Widget SDK

Apple could introduce an AI Answer Widget SDK to create a more visually appealing, LLM-focused user interface for future iOS versions. This would enrich generated responses with interactive app-specific widgets or slices, allowing users to interact with app features directly within their AI-driven experiences.

AI “Dreaming” & AI Assistant: On-device AI customization

AI “Dreaming” involves continuously improving and personalizing the local AI through additional training and fine-tuning based on the user's previous AI interactions, whether via prompts or speech.

This process might help the AI become more attuned to user preferences and context, providing more accurate and relevant responses over time. Whether this training occurs on-device, through distributed processing among household devices (a beefed-up HomePod, for example), or via anonymized server-side fine-tuning, the goal is to create a highly personalized and responsive AI assistant that keeps improving as it interacts with the user.

Bringing the iOS AI Bricks Together

App Clips, Journal App, Spotlight, Shortcuts App, ML

Apple is poised to create a unique, AI-first iOS version by integrating various AI technologies and frameworks. With an AI-enabled Siri, on-device (edge) AI, and the option to build on top of existing apps, services, and APIs, Apple can offer a comprehensive and highly personalized user experience that outshines competitors.

  • AI User Context: Enabling the AI to understand not just what the user prompts, but also what they want to achieve by reading between the (user-context) lines
  • AI Retrieval-Augmented Generation (RAG) SDK: Providing specialized, app-specific content, both indexed and summarized, to generate the most precise answers
  • AI Shortcut Actions SDK: Beyond simply answering questions, Siri could be enabled to execute tasks, not limited to iOS but extended by any app that provides app-specific actions
  • AI Answer Widget SDK: While current LLMs return text and, at best, image responses, Apple could increase not only the joy of use but also the usefulness of generated answers by including app-specific widgets that allow direct interaction with specialized apps

Combining these advancements, Apple is well-positioned to offer the most personal, effective, and enjoyable AI interactions possible. The company's focus on on-device AI ensures that user data remains private while delivering real-time performance. Integrating AI on top of its various existing apps and services, such as the Journal app and Shortcuts, enhances user convenience and productivity.

Conclusion

With the combination of these AI technologies and frameworks, Apple is set to revolutionize the user experience in its future iOS versions. By integrating advanced AI capabilities, prioritizing user privacy, and providing developers with powerful tools, Apple could, and likely will, deliver a unique, AI-first ecosystem that stands out in the competitive tech landscape. This strategic approach might ensure that Apple not only catches up with its competitors but also sets new standards for personalized and effective AI interactions.
