What happens in the metaverse doesn’t stay in the metaverse.

Published in we3 magazine · 20 min read · Feb 6, 2023

we3 explores how the metaverse can mess with your mind and your privacy — and what we should do about it.

Image generated by author using DALL·E

By Kon van Huyssteen

If you haven’t heard the term “immersive rights” before, it’s because it doesn’t really exist yet. But we have a feeling it’s one of those terms that will become more and more widely known over the next decade or so.

While the metaverse offers us immersive shopping, immersive entertainment and immersive interaction, there has been very little talk of how the rights of individuals will be protected in this space.

What are immersive rights?

Louis Rosenberg, an American engineer, entrepreneur and businessman who researches augmented reality, virtual reality, and AI, and Professor of Educational Technology at California Polytechnic State University, recently wrote an article in Big Think entitled The case for demanding “immersive rights” in the metaverse.

According to Rosenberg, “the metaverse could be a creative medium that expands what it means to be human. Or it could become a deeply oppressive technology that gives corporations (and state actors) unprecedented control over society.”

Rosenberg has championed virtual and augmented worlds for over 30 years, with a track record as a researcher at Stanford, NASA, and the US Air Force before founding a number of VR and AR companies.

“Having lived through multiple hype-cycles, followed by cold winters, I believe we’re finally here: The metaverse will soon have a major impact on society. Unfortunately, the lack of regulatory protections has me deeply concerned.” — Louis Rosenberg

The metaverse is here

Saying that the metaverse is a “fad” or a “bubble” is a little like being a climate change denialist: temperatures are increasing and catastrophic climate events are occurring more frequently, and those are the facts. The combined VR (Virtual Reality), AR (Augmented Reality) and MR (Mixed Reality) markets were already worth $28 billion in 2020, and manufacturers ship more than 5 million VR headsets annually. Still think it’s a fad? OK, then go ask any kid between 5 and 10 years old what Roblox is and you’ll see for yourself.

Image from Roblox website: Roblox has over 40 million games (they call them “experiences”). The average person lives about 41 million minutes, so if you played a new Roblox game every single minute from birth to death, you could just about cover all of them.

Consulting firm McKinsey & Co recently surveyed over 1,000 consumers aged 13 to 70 in an effort to understand current metaverse sentiment.

The research lays waste to quite a few myths surrounding the metaverse, one of them being that “No-one really knows what the metaverse is.”

As per the McKinsey report, 55 percent of survey respondents said they had heard of at least one existing metaverse platform, such as Roblox, Fortnite, or Decentraland and nearly 30 percent said they had used or played at least one metaverse game.

47 percent described the metaverse with vibrant clarity, using terms such as “immersive,” “interactive,” or “a scaled network.” One teenager reportedly described the metaverse as “a scaled and interoperable network of real-time, 3-D virtual worlds.” That’s pretty spot-on.

Image: Graphic from McKinsey & Co report “Probing reality and myth in the metaverse”

From 3rd person to 1st person

Rosenberg describes the metaverse as a societal shift in how we engage with digital content: from “flat media viewed in the 3rd person” to “immersive experiences engaged in the 1st person.”

He also distinguishes between “virtual” and “augmented,” saying each will “involve different corporate players, and likely adopt different business models.”

Virtual metaverse vs Augmented metaverse

Rosenberg is of the opinion that the “virtual metaverse” will grow from gaming and social media. We’re talking about purely virtual worlds we will visit for entertainment, socializing, shopping, and other limited-duration activities.

Alongside this sits the “augmented metaverse”, which will evolve out of the mobile phone industry and embellish the real world with immersive virtual overlays that bring artistic and informational content into our daily experiences at home, at work, and in public places.

Sounds awesome, yes? It might be. But let’s combine the powers of the metaverse and the powers of AI, and imagine what that could mean in the future.

Data is Power

The old saying “Knowledge is Power” applies equally to your data. The better they know you, the better they can push your buttons and pull your strings.

You know that little “Ask App not to track” button that you get on your mobile phone? Now, multiply this “ability to track” by a couple of million, and we might get close to the potential of virtual and augmented realities to track, profile and influence users — that’s you.

In his best-selling book “AI Superpowers: China, Silicon Valley and the New World Order,” Dr. Kai-Fu Lee, former president of Google China and senior exec at Microsoft, SGI, and Apple, breaks the future down into what he predicts will be four waves of AI:

He describes “Internet AI” as the First Wave, “Business AI” as the Second Wave, “Perception AI” as the Third Wave and “Autonomous AI” as the Fourth (and final) Wave.

Dr. Kai-Fu Lee: Smarter than you. Photograph by Sheila Shang.

As you probably know by now, it’s no coincidence that, these days, it’s so easy to just keep watching and watching and watching. It’s optimization — the process of AI learning about your preferences through your data and then giving you more of what you like.

Lee: “Using internet AI, Alibaba can recommend products you’re more likely to buy, Google can target you with ads you’re more likely to click on, and YouTube can suggest videos that you’re more likely to watch.”
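The optimization loop Lee describes can be sketched very crudely. The toy recommender below (all item names and tags are made up for illustration) weights each candidate item by the tags the user has already engaged with, then serves the highest scorers. Real systems use learned embeddings rather than tag counts, but the feedback dynamic is the same: the more you watch, the better the next suggestion fits.

```python
from collections import Counter

def recommend(history, candidates, top_n=2):
    """Rank candidate items by overlap with tags the user engaged with.

    history: list of tag lists, one per item the user watched or clicked.
    candidates: dict mapping item name -> list of tags.
    """
    # Count how often each tag appears in the user's history.
    tag_weights = Counter(tag for tags in history for tag in tags)
    # Score each candidate by the summed weight of its tags.
    scored = {
        item: sum(tag_weights[tag] for tag in tags)
        for item, tags in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:top_n]

history = [["hiphop", "music"], ["hiphop", "dance"], ["climbing"]]
candidates = {
    "rap_video": ["hiphop", "music"],
    "cooking_show": ["food"],
    "bouldering_doc": ["climbing", "outdoors"],
}
print(recommend(history, candidates))  # hip-hop content ranks first
```

Each click feeds back into `history`, so the system's picture of you sharpens with every session.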

Let’s suppose that Mark Zuckerberg just wants to give you the freedom to be yourself and connect with the people and groups you care about, rather than making bank off you. What’s the possible downside?

Umm, how about using Facebook data to get into the heads of American voters and influence the 2016 presidential campaign? Robert Mercer, a major backer of Cambridge Analytica, is reported to have coined the phrase “There’s no data like more data.” But, as this nefarious use shows, we should be very concerned about whose hands our data ends up in.

Perception AI — The Third Wave

Lee expresses it brilliantly: “Before AI, all machines were deaf and blind.” He points out that you could take a video or photo, but these were merely reproductions “for humans to interpret”. Today, things have progressed to where “algorithms can group the pixels from a photo or video into meaningful clusters and recognize objects in much the same way our brain does.”

So, for instance, if your friend has a potted plant you like and they can’t remember what it’s called, Google Lens can interpret the photo you take, run a search, and accurately tell you it’s a Black Homalomena that likes low light and high humidity and wants its soil kept evenly moist.

Caption: A Black Homalomena as identified by Google Lens. It likes moist soil with up to 4 hours of filtered light per day please.

Lee simplifies it by saying that “these devices are turning our physical world into digital data that can then be analyzed and optimized by deep-learning algorithms.”
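Lee’s line about grouping pixels “into meaningful clusters” can be illustrated with the simplest possible version of the idea: a toy k-means over raw RGB values. This is a sketch only; real perception systems use deep networks, and the pixel values here are invented to stand in for a photo of a plant against soil.

```python
def kmeans_pixels(pixels, k=2, iters=10):
    """Group RGB pixels into k color clusters (toy k-means)."""
    centers = pixels[:k]  # naive init: first k pixels as starting centers
    for _ in range(iters):
        # Assign each pixel to its nearest center (squared distance).
        clusters = [[] for _ in range(k)]
        for p in pixels:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Move each center to the mean of its cluster.
        centers = [
            tuple(sum(ch) / len(cl) for ch in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters

# Two mostly-green "leaf" pixels plus two mostly-brown "soil" pixels.
pixels = [(30, 180, 40), (25, 170, 50), (120, 80, 20), (110, 75, 30)]
centers, clusters = kmeans_pixels(pixels, k=2)
```

After a few iterations the greens and browns separate into their own clusters: the machine has, in a rudimentary way, “seen” that the image contains two kinds of thing.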

From “Going online” to just “Being”

“Phygital”, a mash-up of “physical” and “digital” claimed by Australian ad agency Momentum in 2013, means the place where physical and digital merge, become one, live together as lovers, etc. Lee calls it “OMO”, short for “online-merge-offline”. But whatever terminology you prefer, tech advances have started this process, and it’s snowballing.

Lee talks about “the nodes through which we interact with the internet”. Before, these “nodes” or “chokepoints” were the very limiting keyboards on our computers or the screens on our cell phones.

Lee says that “as perception AI gets better at recognizing our faces, understanding our voices, and seeing the world around us, it will add millions of seamless points of contact between the online and offline worlds. Those nodes will be so pervasive that it no longer makes sense to think of oneself as ‘going online’.”

As an example of the increasing “phygital” nature of our existence, he poses these questions:

“When you order a full meal just by speaking a sentence from your couch, are you online or offline?”

“When your refrigerator at home tells your shopping cart at the store that you’re out of milk, are you moving through a physical or a digital world?”

But this is only the beginning. Lee cites KFC in China, which teamed up with Alibaba’s payment portal Alipay to launch “pay with your face”, where a quick facial scan connects the user’s order to their Alipay account. Lee goes on to suggest that soon your shopping cart could integrate with your health data and your fridge at home to suggest grocery items in a supermarket where the location and presentation of every item has been optimized based on perception and purchase data gathered from users in-store.

And while KFC in China is using AI to make things better and easier for its customers, KFC in Germany seems to have missed the memo. In November 2022, a bot that linked marketing notifications to an event calendar of national observances pushed the following message out to consumers:

“It’s memorial day for Kristallnacht! Treat yourself with more tender cheese on your crispy chicken. Now at KFCheese!”

AI is generally said to fail on “context”, and this is a prime example. Here’s the context: loosely translated as “Night of Broken Glass”, Kristallnacht refers to shattered glass and is commemorated annually on 9 November, a date generally regarded as the beginning of the Holocaust and the night in 1938 when the Nazis killed some 90 Jewish people and destroyed shopfronts, homes and places of worship. Ouch. Brand building indeed.

KFC anyone? Image from YAD VASHEM PHOTO ARCHIVE

My data, my choice, right? Right?

So, with all this talk of many more devices, from health trackers to VR wearables to shopping carts being able to feed you input and feed off your data, let’s take you on a journey into an unregulated metaverse.

Suppose you’re on your favorite metaverse platform. Your clicks are getting tracked without your consent. Where you go, what you do, who you’re with, what you’re looking at — it’s all fair data game in the metaverse.

Your posture and how you walk are tracked and used to see whether you’re slowing down to browse or speeding up past locations because you’re not interested.

Platforms will know which products you grab off virtual or augmented shelves and how you look at the packaging, and will measure pupil dilation as an indicator of engagement.

So, as soon as you strap a piece of hardware to your body or put on an AR headset, be aware that it is not only a one-way street. You are receiving and you are giving as well. Your behavior and even biology are being mined.

Now suppose you wear your brand-new metaverse hardware throughout your day, every day. As you walk down a real street, the platforms will know which store windows you peer into, for how long, and even which portions of the display attract your attention.

Tracking your gait can also be used to identify medical and psychological conditions, from detecting depression to predicting dementia or selling you a spine realignment package from your friendly local chiropractor. Your facial expressions, vocal inflections, and vital signs are there to be sucked up and used in your profile, while AI algorithms assess your emotions.

Your avatar will look awesome, with authentic facial reactions responding in real time. But without regulation, these features, which could be valuable and humanizing, could also be used to create emotional profiles that document how individuals react to a range of situations and stimuli throughout their daily lives.

From invasive monitoring to targeted persuasion

So while invasive monitoring is a privacy concern that could quite feasibly be addressed by legislation, what about targeted persuasion as next-level advertising? In the metaverse, advertising will transition from flat media to immersive experiences. Rosenberg predicts that we’ll get our metaverse marketing delivered straight into our virtual worlds by Virtual Product Placements (VPPs) and Virtual Spokespeople (VSPs).

Meta-influencers. Image generated by author via DALL·E

Meta puppet masters: VPPs and VSPs

You’ve seen the movies where a BMW is used in a high-speed car chase, or a character casually sips a Coke. This is your basic, run-of-the-mill product placement. Brands pay to have their products placed in content, presumably because ads don’t work so well any more.

In the metaverse, Virtual Product Placements (VPPs) work in the same way. As per Rosenberg: “It’s simulated products, services, or activities injected into an immersive world on behalf of a paying sponsor such that they appear to the user as natural elements of the ambient environment.”

Virtual Spokespeople (VSPs) are simulated persons or other characters injected into immersive environments (virtual or augmented) that verbally convey promotional content on behalf of a paying sponsor, often engaging users in AI-moderated promotional conversation.

OK, so imagine this. You’re walking down a busy street in your favorite sim. You are of a certain age and have certain interests, which the platform has already mined and built into a marketable profile. Because of your profile, you get targeted with custom-made VPPs and VSPs. Suppose you’re 25, you like hip hop and mountain climbing. Yup, you guessed it: the VSP avatar coming your way would be wearing a specific brand of gear, drinking a specific beverage, and telling you a specific thing. Don’t feel bad about being targeted. You’re just a product being marketed to.

It will happen to all of us. Breaking it down very crudely, a teen user might see people wearing trendy clothes or eating a certain fast food, while a young child might see an oversized action figure waving at them as they pass. But, of course, it will be much more refined and insidious than that.

Taking it another step up would be Virtual Spokespeople. While VPPs are passive, VSPs would engage you in promotional conversation on behalf of paying sponsors. While this seemed like a Blade Runner idea and totally out of reach only a few years ago, recent breakthroughs in Large Language Models (LLMs) make it viable in the near term.

Nick Cave prefers Nick Cave songs written by Nick Cave

These verbal exchanges could be so real that you might not even realize you’re speaking to an AI-driven conversational agent with a predefined persuasive agenda. Just imagine you’re feeling a bit lonely and go looking for some conversation in your virtual world; you’d be so much fresh meat for a VSP. If you want to see for yourself how far conversational AI has come, go have a chat with ChatGPT, developed by AI company OpenAI. You can find it at https://chat.openai.com/chat and it will either blow your mind, make you scared, or make you angry, or maybe all three. In January 2023, a Nick Cave fan called Mark thought he would excite and delight Nick Cave by using ChatGPT to write a song “in the style of Nick Cave”, much like one can use Midjourney to generate images in a certain style. Expecting praise, he sent Cave a song with this chorus:

“I am the sinner, I am the saint / I am the darkness, I am the light / I am the hunter, I am the prey / I am the devil, I am the savior.”

Nick Cave thanked Mark for the attempt, saying he had already received songs “co-created” by fans and ChatGPT, “most buzzing with a kind of algorithmic awe”, but that, unfortunately, “I do not feel the same enthusiasm around this technology.”

Being Nick Cave, who thinks deeply about things, he discussed his take on AI with Mark before ending with his verdict on the songwriting abilities of ChatGPT:

“I understand that ChatGPT is in its infancy but perhaps that is the emerging horror of AI — that it will forever be in its infancy, as it will always have further to go, and the direction is always forward, always faster. It can never be rolled back, or slowed down, as it moves us toward a utopian future, maybe, or our total destruction. Who can possibly say which? Judging by this song ‘in the style of Nick Cave’ though, it doesn’t look good, Mark. The apocalypse is well on its way. This song sucks.” — Nick Cave

Nick Cave is a fan of Nick Cave songs not of Mark the Fan X ChatGPT songs. Image: Nick Cave by Amelia Troubridge

Because we’re an inquisitive bunch here at we3, and also closet songwriters, we thought we’d use ChatGPT to write us a few songs and see if they suck as well:

On the command “Write me a rap song in the style of 50 Cent,” ChatGPT wasn’t shy about its songwriting abilities at all, and spat out this rhyme in under 30 seconds.

Verse 1:

Listen up y’all, it’s the G-Unit flow

I’m the one they call ChatGPT, you already know

I’m spitting fire, never tire

I’m the one they admire

Got the game on lock, ain’t no one higher

I’m making hits, never missing

Leaving haters dismissed and dismissing

I’m the king of the game, it’s clear to see

I’m the one and only, 50 Cent, but with AI

Chorus:

I’m the one, the only, the G-Unit boss

I’m the one in charge, ain’t no one can toss

I’m the king of the game, ain’t no one can deny

I’m the one and only, ChatGPT, the AI guy

Verse 2:

I’m rolling with the G-Unit, ain’t no one gonna stop us

We’re taking over the game, ain’t no one gonna top us

I’m spitting rhymes, never boring

Always leaving the crowd wanting more, never ignoring

I got the game on lock, ain’t no one gonna knock

I’m the one they call, ChatGPT, the AI clock

Chorus:

I’m the one, the only, the G-Unit boss

I’m the one in charge, ain’t no one can toss

I’m the king of the game, ain’t no one can deny

I’m the one and only, ChatGPT, the AI guy

Outro:

So listen up y’all, and hear what I say

I’m the one and only, ChatGPT, and I’m here to stay

I’m the king of the game, ain’t no one gonna deny

I’m the one and only, the G-Unit boss, ChatGPT, the AI guy.

We’re going to leave you to make up your own mind here.

ChatGPT ain’t no Pimp

One thing that can be said about ChatGPT is that it would never be able to generate commercially successful rap.

Looking at a small excerpt of the lyrics of Jay-Z’s “Big Pimpin’”, written by Timothy Mosley, Shawn Carter and Kyambo Joshua (© WB Music Corp., Lil Lu Lu Publishing, I Love KJ Music):

You know I thug ’em, f*ck ’em, love ’em, leave ‘em

’Cause I don’t f*ckin’ need ‘em

Take ’em out the hood, keep ’em lookin’ good

But I don’t f*ckin’ feed ‘em

First time they fuss I’m breezin’

Talkin’ ‘bout, “What’s the reasons?”

I’m a pimp in every sense of the word, bitch …

When asked to write a song about pimping, or even hoes, ChatGPT replies with a definitive and preachy refusal. It seems it will leave the nasty work to us humans:

ChatGPT only does clean rap. Image: Screenshot from ChatGPT by OpenAI

If it’s free, you’re the product

Tracking and profiling is old news. We all know by now that social media platforms and other tech do it on a daily basis. The only difference is that, in the metaverse, the scale and sheer intimacy of user monitoring will reach unseen levels.

The same goes for predatory advertising and propaganda. With huge amounts of user data being fed into AI algorithms, the targeting and profiling will reach a level of authenticity and persuasiveness that we’ve not seen or experienced before, and that we might not have any defenses against. Think about that before you strap a VR headset onto your kid.

With the growth in sophistication of these tools, fed by better and more data, it might become almost impossible to distinguish between authentic experiences and targeted promotional content injected by an advertiser. In a scenario like this, and without proper regulation, platforms could easily manipulate user experiences to benefit advertisers without user knowledge or consent.

Towards a safer metaverse

This sounds bad, and we should try to stop it from happening, but whose responsibility is it? Rosenberg proposes that policymakers, or governance models controlled by the community of metaverse users themselves, should guarantee basic immersive rights. Rosenberg believes everyone is entitled to these “metaverse basic rights”.

1. The right to experiential authenticity

At the moment, most adults can recognize product marketing and political messaging for what they are, in both the physical and digital worlds. This mental context lets us view ads and propaganda as paid messaging delivered by a party that wants to influence us. Rosenberg argues that when this context is removed, our protections of healthy skepticism and critical thinking go out the window.

Rosenberg says that in the metaverse, it’s much easier for advertisers to “subvert our ability to contextualize promotional content by blurring the boundaries between authentic experiences that are serendipitously encountered and targeted promotional experiences injected on behalf of paying sponsors.”

In the metaverse, the slope from marketing to predatory deception is indeed a slippery one. We can hardly expect advertisers to police themselves in this regard.

Consider this scenario Rosenberg sketches for us: You’re walking down the street in your virtual or augmented world. You notice a parked car. As you pass, you overhear the owner telling a friend how great this car is and how much they love it. You keep walking, thinking this was an authentic exchange between real users, when instead it was a targeted promotional message placed there so you’d see the car and hear the conversation. The exchange was created based on your profile and customized for maximum impact, from the color of the car to the gender, voice, and clothing of the Virtual Spokespeople used.

This is only a car, right? But what if, instead of saying how great the car is, the same tools and techniques were used to tell you how bad the immigrants are, or how evil the opposition party leader is? These tools could just as easily drive potentially destructive political propaganda, misinformation, and outright lies as they could commercial messaging.

The need for regulations to protect us from immersive tactics such as Virtual Product Placements and Virtual Spokespeople is obvious. Rosenberg says that a first step towards regulation should be the basic right to authentic experiences. This could be achieved by requiring that promotional artifacts and promotional people be visually and audibly distinct in an overt way, enabling users to perceive them in the proper context. Maybe they sound a certain way, look a certain way, or are forced to wear a T-shirt that says “I deliver promotional content”. This would protect consumers from mistaking promotionally altered experiences as authentic encounters.
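One way a platform could enforce such a labeling rule is structurally: every scene entity carries a sponsorship field, and the rendering layer refuses to present sponsored entities without an overt marker. This is only a sketch of the idea; the class and field names are hypothetical, not any real platform's API.

```python
from dataclasses import dataclass

@dataclass
class SceneEntity:
    name: str
    sponsored_by: str = ""  # empty string means authentic, unsponsored content

def render_label(entity):
    """Return the display label; sponsored entities are overtly marked."""
    if entity.sponsored_by:
        return f"{entity.name} [PROMOTED by {entity.sponsored_by}]"
    return entity.name

car = SceneEntity("parked car", sponsored_by="AutoCorp")
tree = SceneEntity("oak tree")
print(render_label(car))   # the placement cannot pass as an authentic encounter
print(render_label(tree))
```

The point of putting the flag on the entity itself, rather than trusting the advertiser, is that the platform's renderer becomes the single enforcement point: no flag, no unlabeled placement.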

2. The right to emotional privacy

Rosenberg points out that, because we’re human, we express emotions on our faces, in our voices, and with our posture, gestures and body language. We also have a finely crafted ability to read these traits in others, one that works in parallel with our verbal language as part of human communication.

Recently, sensing technologies combined with machine learning have allowed software systems to read us like an open book. Our faces, voices, posture, gestures, respiration rate, heart rate, eye motions, pupil dilation, blood pressure and even facial blood-flow patterns can be used to decode emotions. Your cheeks get red when you’re angry or shy, right? Pair that with pupil dilation, eye movement and tone of voice, and up pops the answer.

This enables computers to engage in nonverbal communication with humans. But it can easily cross over into intrusive and exploitative violations of privacy. Rosenberg says that computers can read humans better than humans can read other humans. Because of this sensitivity, software systems are already able to detect emotions from cues that are not perceptible to humans. Your partner, for instance, can’t detect your heart rate, respiration rate, and blood pressure. Computers can even detect “micro-expressions” on faces that are too brief or too subtle for human observers to notice.

According to Rosenberg, we should all have the right not to be “read” or emotionally assessed at speeds and using trait detection that exceed natural human abilities. Maybe in future your VR headset will have to ask your permission to monitor your vital signs and micro-expression data in order for it to be used in emotion detection.
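What that permission model might look like in code: a sensor layer that refuses to run emotion inference for any application the user hasn't explicitly approved. Everything here is hypothetical (the class, the app IDs, and especially the crude two-signal "model"), but it shows the shape of per-application consent.

```python
class EmotionSensor:
    """Toy consent gate: emotion inference runs only for opted-in apps."""

    def __init__(self):
        self._consents = set()  # app ids the user has explicitly approved

    def grant_consent(self, app_id):
        self._consents.add(app_id)

    def revoke_consent(self, app_id):
        self._consents.discard(app_id)

    def infer_emotion(self, app_id, heart_rate, pupil_dilation):
        # No consent on file: refuse before touching any biometric data.
        if app_id not in self._consents:
            raise PermissionError(f"{app_id} has no emotion-data consent")
        # Crude stand-in for a real model: elevated signals -> "aroused".
        return "aroused" if heart_rate > 100 and pupil_dilation > 0.6 else "calm"

sensor = EmotionSensor()
sensor.grant_consent("fitness_app")
print(sensor.infer_emotion("fitness_app", 110, 0.7))
```

An unapproved caller (say, an ad network) never gets past the gate, and revoking consent shuts the door again at the next call.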

Now imagine a platform collecting your emotional data over time and creating a profile that allows AI systems to predict your reactions to a wide range of stimuli. Rosenberg argues that emotional profiling should either be banned or require explicit consent on an application-by-application basis, especially in the augmented metaverse, where we’ll spend large chunks of our lives being tracked and analyzed by wearables. Think about that and see if you get red in the face.

Rosenberg argues that even with informed consent, regulators or self-regulating user communities should consider an outright ban on emotional analysis being used for promotional purposes because it would give an AI-driven Virtual Spokesperson the ability to engage in an agenda-driven conversation. With likely access to your real-time facial expressions, vocal inflections, and vital signs, an AI-driven conversational agent with a predefined promotional agenda could adapt its tactics based on detected blood pressure, heart rate, respiration rate, and facial blood patterns of the target consumer. Combine this with your user profile data reflecting your background, interests and inclinations. Rosenberg says we just wouldn’t be able to stand up against a medium like this, and that it goes beyond passive advertising to active manipulation.

3. The right to behavioral privacy

We already know big tech companies track and profile us based on the websites we visit, the ads we click on and our social media relationships. But in an unregulated metaverse, these platforms will be able to track not only our clicks, but where we go, what we do, who we’re with, how we move, and what we look at. Our emotional reactions will be assessed and correlated with all of these actions, adding the layer of not just what we do, but how we feel while doing it.

Rosenberg also makes the point that the behavioral data both virtual and augmented platforms store in order to provide real-time immersive experiences will allow invasive profiles that document the daily actions of individual users in extreme granularity.

This stored data could easily be processed by machine-learning systems to predict how individual users will act, react, and interact in a wide range of circumstances during their daily lives. In an unregulated metaverse, it could become common practice for platforms to accurately predict what users will do before they themselves have decided. And because platforms will have the ability to alter environments for persuasive purposes, these profiles could be used to preemptively manipulate behavior.
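The predictive step described here needs nothing exotic. As a deliberately crude sketch (the action log and its labels are invented), even a first-order Markov model over a stored behavior log can guess a user's most likely next move:

```python
from collections import Counter, defaultdict

def next_action_model(action_log):
    """Build a first-order Markov model: action -> most likely next action."""
    transitions = defaultdict(Counter)
    # Count every observed (current action -> next action) pair.
    for current, nxt in zip(action_log, action_log[1:]):
        transitions[current][nxt] += 1
    # Keep only the most frequent successor for each action.
    return {a: c.most_common(1)[0][0] for a, c in transitions.items()}

# A few days of (simplified) tracked behavior in a virtual mall.
log = ["enter", "browse_shoes", "try_on", "buy",
       "enter", "browse_shoes", "try_on", "leave",
       "enter", "browse_shoes", "try_on", "buy"]
model = next_action_model(log)
print(model["try_on"])  # the platform predicts "buy" after "try_on"
```

With real metaverse logs, the state would include location, gaze, companions and emotional signals rather than four coarse actions, and the model would be far richer, which is exactly what makes the preemptive-manipulation scenario plausible.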

Rosenberg sees a big danger in metaverse platforms storing behavioral data over time, which would allow them to create incredibly detailed user profiles. He says that, should correlation of emotional data and behavioral data be allowed, the promotionally altered experiences won’t just be able to manipulate what we do in our immersive worlds, but also influence how we feel while doing it. Rosenberg argues that this regulatory role should fall to policymakers. We3 don’t know the answers, but we do know that the metaverse will have a major impact on our society.

AR can be a powerful force for good, enhancing everything from surgical procedures to educational experiences. But this potential for good also carries with it the profound power to manipulate. And whether the necessary controls and regulations will come from a centralized authority such as policymakers or from a decentralized user-base of metaverse activists, only the metaverse will tell.
