AI Top-of-Mind for 6.11.24 — Apple Intelligence (Finally)

dave ginsburg
Published in AI.society
4 min read · Jun 11, 2024

Late today due to travels…

Today: Apple Intelligence unveiled, McKinsey’s Lilli, Kling for video, Meta’s Chameleon, and LLM guardrails and jailbreaks

Top-of-mind is the hype surrounding the Apple Intelligence announcement, where innovations spanned every OS. Taken as a whole, the new features, including the OpenAI bot integration, are useful, but I’m not sure any will fundamentally change how we interact with our devices. That won’t happen until we see something like an Apple AI agent that you can send off to do your bidding.

‘Wired’ offers a good summary of the different announcements, many of which center on personalization. Most add AI capabilities to existing applications, and the article compares each to existing Android features and notes Apple’s hardware limitations. One positive development on the privacy front is the Siri handoff to ChatGPT, where you as the user are notified of the transfer and must approve it. In any case, this is just the beginning of Apple’s AI announcements, which include its in-house work combining an on-device SLM and a cloud-based LLM. From the launch:

Source: Apple

Not super recent, but also on the agent front, is McKinsey’s ‘Lilli.’ From the blog:

· “Over the years, our experts have used it to develop playbooks, case studies, and white papers that eventually became data, digital solutions, and analytics tools.”

· “With Lilli, we can use technology to access and leverage our entire body of knowledge and assets to drive new levels of productivity.”

· “It saves up to 20 percent of my time preparing for meetings, but more importantly, it improves the quality of my expertise and my contributions.”

A while back there was major buzz around OpenAI’s ‘Sora’ video generator announcement. Not to be outdone, from China we now have ‘Kling.’ Jim Clyde Monge, writing in ‘Generative AI,’ compares the two. Kling is said to be better with physics and what is termed ‘temporal coherence,’ and can generate a two-minute clip vs. Sora’s one-minute limit. But there is a cost:

Sora requires eight NVIDIA A100 graphics processing units (GPUs) running for over three hours to produce a one-minute clip. One NVIDIA A100 costs over 10,000 USD. So Kling would probably require double of that compute power to produce a 2-minute video result.
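
To make the arithmetic in that quote concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above, which are themselves lower bounds ("over three hours," "over 10,000 USD"). The linear scaling with clip length is an assumption that mirrors Jim’s "double" guess, not a published figure for Kling, and the variable and function names are just illustrative.

```python
# Back-of-the-envelope compute estimate using only the figures quoted above.
SORA_GPUS = 8             # quoted: eight NVIDIA A100 GPUs
SORA_HOURS = 3.0          # quoted: "over three hours", so treat as a lower bound
A100_PRICE_USD = 10_000   # quoted: "over 10,000 USD", so treat as a lower bound


def gpu_hours(gpus: int, hours: float) -> float:
    """Total GPU-hours consumed by one generation run."""
    return gpus * hours


def scale_by_length(base_gpu_hours: float, base_minutes: float, target_minutes: float) -> float:
    """ASSUMPTION: compute grows linearly with clip length (the 'double' guess)."""
    return base_gpu_hours * (target_minutes / base_minutes)


# GPU-hours for a one-minute Sora clip, per the quoted figures.
sora_gpu_hours = gpu_hours(SORA_GPUS, SORA_HOURS)

# Hypothetical estimate for a two-minute Kling clip under the linear assumption.
kling_gpu_hours_estimate = scale_by_length(sora_gpu_hours, 1.0, 2.0)

# Purchase price of the eight GPUs alone (a floor, excluding power and hosting).
sora_hardware_floor_usd = SORA_GPUS * A100_PRICE_USD

print(f"Sora, 1-minute clip: at least {sora_gpu_hours:.0f} GPU-hours")
print(f"Kling, 2-minute clip (assumed linear): about {kling_gpu_hours_estimate:.0f} GPU-hours")
print(f"Eight A100s: at least {sora_hardware_floor_usd:,} USD in hardware")
```

In other words, the quoted numbers work out to at least 24 GPU-hours per minute of video on hardware costing at least 80,000 USD, and a two-minute clip would land around 48 GPU-hours if the linear guess holds.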

His conclusion is that Kling is not yet up there with Sora, and in any case the tool won’t be available for wider testing until later in the year. Jim’s prompt: ‘A giant panda playing guitar by the lake.’

And on the model front, Aziz Belaweid in ‘Generative AI’ looks under the covers of ‘Chameleon,’ an MLLM from Meta. From the article:

· Representing different modalities in the same space can be challenging due to their different natures. Text is discrete, as it can be divided into a number of words or tokens, whereas images are continuous.

· To address this, Meta’s team proposes using an image tokenizer. The job of this tokenizer is to transform an image into a discrete set of tokens.

· These tokens, combined with text tokens, are fed into the same transformer architecture. This fusion allows the model to reason and generate across modalities easily.
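
The idea in those three points can be sketched with a toy example. This is not Meta’s implementation: a real image tokenizer is learned, whereas the sketch below assumes a tiny hand-written codebook and a nearest-match lookup, and the vocabulary, codebook values, and function names are all hypothetical. It only shows how continuous image patches can become discrete IDs that share one sequence with text tokens.

```python
from typing import List, Sequence, Tuple

# Hypothetical toy text vocabulary (a real model has tens of thousands of entries).
TEXT_VOCAB = {"<bos>": 0, "a": 1, "panda": 2, "playing": 3, "guitar": 4}

# Hypothetical codebook: each entry is a prototype two-value "patch".
# In a real system these prototypes would be learned, not hand-written.
CODEBOOK: List[Tuple[float, float]] = [
    (0.0, 0.0),
    (1.0, 1.0),
    (0.0, 1.0),
    (1.0, 0.0),
]

# Image token IDs start after the text IDs so both share one ID space.
IMAGE_TOKEN_OFFSET = len(TEXT_VOCAB)


def tokenize_text(text: str) -> List[int]:
    """Map whitespace-separated words to their discrete text token IDs."""
    return [TEXT_VOCAB[word] for word in text.split()]


def nearest_code(patch: Sequence[float]) -> int:
    """Return the index of the codebook entry closest to a continuous patch."""
    def squared_distance(code: Sequence[float]) -> float:
        return sum((p - c) ** 2 for p, c in zip(patch, code))
    return min(range(len(CODEBOOK)), key=lambda i: squared_distance(CODEBOOK[i]))


def tokenize_image(patches: Sequence[Sequence[float]]) -> List[int]:
    """Turn continuous patches into discrete token IDs in the shared ID space."""
    return [IMAGE_TOKEN_OFFSET + nearest_code(patch) for patch in patches]


def build_sequence(text: str, patches: Sequence[Sequence[float]]) -> List[int]:
    """Concatenate text and image tokens into one sequence for a single model."""
    return [TEXT_VOCAB["<bos>"]] + tokenize_text(text) + tokenize_image(patches)


patches = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.9)]
print(build_sequence("a panda", patches))  # -> [0, 1, 2, 5, 6, 7]
```

Once both modalities are plain integer IDs in one sequence, the same transformer can consume them without a separate image pathway, which is the fusion the article describes.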

Aziz points out that there are limitations, but the approach is in its early days and will improve.

Turning to marketing, there are some innovations from LinkedIn with their ‘Accelerate AI’ suite. As reported by ‘Search Engine Land,’ this includes Microsoft Designer integration, an AI campaign assistant, better targeting, and a new in-stream video ad generator. Some published statistics:

· Advertisers using Accelerate are creating campaigns 15% more efficiently

· They’re also seeing a 52% lower cost per action compared to classic campaigns

· Video uploads on LinkedIn up 45% YoY

Lastly, are we pushing our luck with the newest ‘frontier’ models? Ignacio de Gregorio offers his view. He looks at the ‘alignment’ phase of models that creates guardrails, and how this can be reversed. This analysis partially draws on Anthropic’s ‘feature’ mapping.

Without the capacity to activate the refusal feature, the model became totally servile, responding to every request, no matter how harmful.

dave ginsburg
AI.society

Lifelong technophile and author with background in networking, security, the cloud, IIoT, and AI. Father. Winemaker. Husband of @mariehattar.