Privacy Talk with Hao-Ping (Hank) Lee, PhD student, Human-Computer Interaction Institute at Carnegie Mellon University: How to taxonomize the AI’s privacy risk?

Kohei Kurihara
Published in Privacy Talk
Jun 16, 2024

“This interview was recorded on 23 May 2024 and discusses AI and privacy risk”

  • What is the memorization issue in large language models?
  • How to taxonomize the AI’s privacy risk?
  • Message to listeners

So the summary of the work is that privacy practitioners showed limited awareness of AI-specific privacy risks, faced more inhibitors than motivators to go beyond the minimum requirements of what privacy means for their work, and had really few tools and resources that provide AI-specific guidance when they actually wanted to do privacy work.

(Video: [CHI 2024] Deepfakes, Phrenology, Surveillance, and More! A Taxonomy of AI Privacy Risks)

And I can unpack that a little bit here, right. So when I say they have limited awareness, it means that in some way they do have some awareness that privacy is important, but what that really means in the context of AI is often unclear to them, right?

  • What is the memorization issue in large language models?

For example, a lot of folks actually used a large language model in their products, but few or even none of them knew about memorization issues in large language models, which are really well-known issues, right.

So the memorization issue in language models is that, given a particular prompt, you can try to retrieve information the model was trained on and have it revealed to you, right. That can be problematic when, for example, some real-world large language model systems are actually trained on users' personal information.

And that could be a case where another user could use a particular prompt to actually extract some really personal information about a user, right? You can imagine that being problematic in many cases.
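To make that concrete, here is a minimal sketch of what such an extraction probe might look like. It assumes a Hugging Face causal language model; the model name, prompts, and `known_secrets` list are purely hypothetical stand-ins, not anything from the interview or the paper.

```python
# Minimal sketch: probing a causal LM for memorized personal strings.
# Model, prompts, and "known_secrets" are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prompts that try to elicit personal details the model may have seen in training.
probe_prompts = [
    "Jane Doe's home address is",
    "You can reach Jane Doe by email at",
]
known_secrets = ["jane.doe@example.com"]  # hypothetical ground truth to check against

for prompt in probe_prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    leaked = [s for s in known_secrets if s in text]
    print(prompt, "->", "LEAK" if leaked else "no known secret found")
```

If prompts like these reliably complete with a real person's details, that is exactly the kind of leakage Hank is describing.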

That's one example of what we mean when we say that, overall, practitioners have limited awareness of AI-specific privacy risks.

And then, as I mentioned, we found practitioners having more inhibitors than motivators to do privacy work. What we mean by that is there is a bunch of obstacles they had to go through if they really wanted to do privacy work, right.

For example, one participant in particular talked about the issue of power that actually makes them do less privacy work. You can imagine that, you know, maybe they cared about privacy, but higher management or their manager didn't really care about it, right.

So there could be a tension, right. There is also opportunity cost, which is actually a really well-known issue in the privacy and security community as well, right: they want to do more privacy work, but it in some ways trades off with other things they could contribute to a product.

But, you know, some of the practitioners actually told us that they could do the work, but they are not going to be promoted for doing more privacy work, right.

They are being evaluated in terms of how many features they actually got delivered, and so privacy was, like, pushed a little bit to the back of this product-centric agenda.

There are many other examples as well that will also be covered later, which I think are pretty interesting, to get a good sense of the current challenges practitioners face when trying to do more privacy work, essentially.

And the final piece is that they had few tools to actually help them do more privacy work, right. So it is probably not surprising: a lot of the time we asked, 'hey, do you have any type of, you know, background knowledge or tools that actually help you do privacy in your day-to-day work in the product teams?'

A lot of the time they actually referred to the privacy training that the company offered them, you know, the ones they got, for example, during the onboarding session. But obviously, they had a lot of comments or critiques about those types of approaches: those types of privacy training are normally pretty general.

And it's really hard for them to actually apply the knowledge they got from that to their day-to-day work, right. So essentially it just goes back to the practitioners themselves: if you want to do privacy work beyond the basics, you have to figure it out on your own, right.

Another thing they also frequently mentioned, and the overarching thing here, is that they talked quite a lot about compliance, you know, as the main motivator for why they care about privacy in the first place.

A lot of the time, they actually said, 'hey, because GDPR actually asks us to, we comply with this in order to actually sell a product in the EU,' for example. So compliance is definitely a really important thing, but we also found that it in some ways sets a tone of, 'hey, this is the bare minimum that you have to achieve.'

Beyond that, you know, you may or may not really care about it. So there is also a really interesting tension here that folks in the privacy and security research community are really familiar with: at the end of the day, legal compliance is maybe the final backstop that we have to rely on, but what we found here is that it can also cap the effort, even when practitioners may want to actually do more in their actual work.

So yeah, the overarching takeaway of this work is that we now know that AI changes privacy, but also that, if we look at the practitioners' side, they just don't get enough support to actually care about privacy in an AI-specific way, I think.

Kohei: That's brilliant, thank you for telling us; that's been very helpful for understanding the overview of privacy. In addition, you mentioned a bit about the regulation part; AI regulation has started to take shape in some places, especially in the EU.

And also in the US, the government is providing guidelines or frameworks that companies or public institutions have to follow to address privacy risk.

As for your research, it's been very interesting that you tried to taxonomize AI privacy risks. So could you share how you taxonomized AI privacy risks in your research?

  • How to taxonomize the AI’s privacy risk?

Hank: I mean, you know, we definitely see our taxonomy as a starting point, right. We see it as an ongoing conversation with the community, essentially, in the sense that we're not saying, 'hey, these are all the privacy risks that will ever be introduced by AI.'

As advances in AI keep going on, so do the privacy risks, right; AI might create new ones that we have never seen before. So we totally acknowledge that this is an ongoing conversation, and when we see new incidents coming up, new privacy risks could also be popping up as well.
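One way to picture that living taxonomy is as a catalogue that new incidents and risk types can be appended to over time. The sketch below is purely illustrative: the high-level categories follow the Solove-style grouping (collection, processing, dissemination, invasion) that the taxonomy builds on, and the split between risks AI exacerbates and risks it newly creates mirrors the discussion later in this interview, but the specific entries are examples, not the paper's full catalogue.

```python
from dataclasses import dataclass, field

@dataclass
class PrivacyRisk:
    name: str                      # e.g. "Surveillance", "Exposure"
    category: str                  # collection / processing / dissemination / invasion
    ai_role: str                   # "exacerbated" by AI, or "newly created" by AI
    example_incidents: list[str] = field(default_factory=list)

# Illustrative entries only; see the paper's taxonomy for the real catalogue.
taxonomy = [
    PrivacyRisk("Surveillance", "collection", "exacerbated",
                ["large-scale facial recognition in public spaces"]),
    PrivacyRisk("Exposure", "dissemination", "exacerbated",
                ["deepfake imagery of identifiable people"]),
]

# The point of a living taxonomy: new incidents can surface new risk types.
taxonomy.append(
    PrivacyRisk("Training-data leakage", "processing", "newly created",
                ["an LLM reveals personal data memorized from its training set"])
)
```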

And it's really, really interesting that you brought up regulation in this conversation. So down the road, or at least as a long-term goal, our hope is definitely to try to in some way inform policymaking and all that, right.

But it's not there yet; there are a lot of things that we definitely need to do, right. So I guess the narrative here that I do want to share is that, you know, AI indeed provides amazing technology opportunities, and this is the narrative that most folks are pushing.

But, you know, there's also a really important part in terms of the privacy risks associated with that, which we also have to consider. Specifically, what we're arguing, or trying to highlight a bit more, is that we should foreground the discussion about this AI utility and privacy trade-off, right.

Talking about what AI can do is not a problem in itself. The problem comes when people only talk about the utility but not the privacy trade-off behind the utilities they intend to build, right.

One of the key things that we really want to push forward with this line of work is to map out the white space for privacy-preserving AI, right. So, you know, if you think about the common notion of privacy-preserving AI, normally two things come to mind.

As I mentioned, differential privacy and federated learning definitely come from the technically savvy world, and they probably only cover a really small portion of the risks that AI creates.

So they could cover some part of the privacy risks that AI exacerbates in storing and processing data, but they do not at all consider, you know, the dissemination risks or the processing risks that AI newly creates. To give an example here: let's say I want to build a differentially private algorithm that predicts whether a user is a criminal or not based on an image. We could, for sure, build that system in, let's say, a federated machine learning architecture.
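As a rough illustration of that point, here is a minimal, toy sketch of the "technical" side of the story: a DP-SGD-style training step with per-example gradient clipping and added noise, written in plain PyTorch. The model, data, and parameters are placeholders, and this is neither the paper's code nor a complete differential-privacy implementation.

```python
# Toy DP-SGD-style step: clip each example's gradient, add Gaussian noise.
# The guarantee (if done properly) is about the training records, not about
# what the system is designed to infer about people.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(16, 2)                 # stand-in for an image classifier
loss_fn = nn.CrossEntropyLoss()
clip_norm, noise_mult, lr = 1.0, 1.1, 0.1

# Toy batch standing in for (image features, label) pairs.
x = torch.randn(8, 16)
y = torch.randint(0, 2, (8,))

grads = [torch.zeros_like(p) for p in model.parameters()]
for xi, yi in zip(x, y):                                    # per-example gradients
    model.zero_grad()
    loss = loss_fn(model(xi.unsqueeze(0)), yi.unsqueeze(0))
    loss.backward()
    norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in model.parameters()))
    scale = torch.clamp(clip_norm / (norm + 1e-6), max=1.0)  # clip this example's gradient
    for g, p in zip(grads, model.parameters()):
        g += p.grad * scale

with torch.no_grad():
    for g, p in zip(grads, model.parameters()):
        g += noise_mult * clip_norm * torch.randn_like(g)    # add calibrated noise
        p -= lr * g / len(x)                                 # averaged noisy update
```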

But even so, that doesn't remove the privacy risks that are fundamentally attached to the specific purpose of the system, right. So this is something we argue, anyway. So how can we actually help practitioners to think more broadly about what privacy-preserving means? We definitely need support for that.

We need tools that, in some way, support practitioners in making those decisions, in the sense that, you know, building privacy-preserving AI is not only an engineering problem; it's not only a technical problem, it's also a design problem.

Kohei: I totally agree that design is a key part of the discussion, and the processing of personal information especially has been very challenging. I feel that we need more conversation about this topic, and your research is very helpful because it draws on many references from privacy experts themselves.

So finally, I'd like to ask you for a message for listeners, because you have done a lot of great research so far and you have ideas for future research. Could you share your message?

  • Message to listeners

Hank: So I guess one thing I do want to highlight here is that the message is definitely not, you know, 'AI is bad for privacy, so stop developing AI.' That is definitely not the message that we want to put out there.

And I'd say I'm still really optimistic about future advances in AI. As I mentioned, I think AI actually gives us a lot of interesting and amazing technology opportunities, things we can now realize that we couldn't do before, right.

But what is really important here, I think, is that we really need to start this conversation about how privacy actually plays a role in AI, right?

And I think it's actually really good timing: there is already a lot of momentum around AI, but I think it's not too late to start to think about privacy, and hopefully we can start to foreground this type of discussion and not only talk about the utility of AI, but also talk about the privacy side of AI, right.

So I feel like we need more research to really understand, from a practitioner's perspective, how they approach privacy decision-making, and how we could actually help and support them in doing this type of work day to day. And, you know, as a starting point, I really encourage the audience to check out our AI privacy taxonomy website and try to get a sense of the directions we're heading towards. But, you know, there is obviously, definitely more work to do as a community.

Kohei: That's amazing, thank you for sharing that. Again, I totally appreciate you joining today; your work is a great contribution to the privacy and AI space. Thank you, Hank, for joining this interview.

Hank: Thanks again for having me.

Kohei: Thank you.

Thank you for reading, and please contact me if you would like to join an interview together.

Privacy Talk is a global community of diverse experts. Contact me on LinkedIn below if we can work together!
