In the News: A Look Under the Hood of Google and Twitter Algos & Supreme Court Hearings
We’re back with a more technical newsletter this week, but we hope you’ll still find all of these topics relevant to your daily lives (because you’re probably using the technology!). In between, we also have an update on the ongoing debate over AI regulation.
Testing… Testing… Google’s AudioLM Generates Audio with Language Modeling
by Annika Lin
Google’s AudioLM generates audio using a language modeling approach. While this approach doesn’t require annotated data, it faces two main challenges: the high volume of audio data (relative to written text) and the one-to-many relationship between text and audio (i.e. the same sentence can be spoken in different styles and recording conditions). To ensure both high audio quality and long-term consistency, the model leverages i) semantic tokens (from w2v-BERT) to capture long-term dependencies and structure, like a piece of music’s melody, harmony, and rhythm, and ii) acoustic tokens (from the SoundStream neural codec) to capture fine details, like recording conditions. The audio-only language model chains several Transformer models, training each stage to predict the next token from the tokens that came before it.
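To make the staged chaining concrete, here is a toy sketch of the idea in Python. The "models" below are stand-in functions, not real Transformers, and the token counts and vocabulary sizes are made up for illustration; in AudioLM the semantic tokens actually come from w2v-BERT and the acoustic tokens from SoundStream.

```python
def next_token(past, vocab_size):
    # Stand-in for a Transformer's next-token prediction:
    # a deterministic function of the past tokens, purely illustrative.
    return (sum(past) * 31 + len(past)) % vocab_size

def generate(prompt, n_new, vocab_size):
    # Autoregressive generation: each new token conditions on all past tokens.
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(next_token(tokens, vocab_size))
    return tokens

# Stage 1: semantic tokens capture coarse, long-range structure.
semantic = generate(prompt=[1, 2, 3], n_new=10, vocab_size=500)

# Stage 2: coarse acoustic tokens, conditioned on the semantic tokens.
coarse = generate(prompt=semantic, n_new=20, vocab_size=1024)

# Stage 3: fine acoustic tokens, conditioned on everything before them.
fine = generate(prompt=coarse, n_new=40, vocab_size=1024)
```

Each stage treats the previous stage’s output as its prompt, which is how the coarse-to-fine hierarchy keeps the melody and rhythm consistent while still filling in acoustic detail.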
Experiments show that humans find AI-generated continuations nearly indistinguishable from real speech (see video here). Yet a simple audio classifier can still detect AudioLM-generated continuations, providing a safeguard against potential misuse of AudioLM.
The Internet on Trial: How the Supreme Court Could Change Social Media
by Spencer Karp
The Supreme Court is about to hear a few cases that could change the way tech companies moderate the internet. In its next session, the Court will hear at least two cases that raise questions about content moderation (or the lack thereof) online. The first case concerns personalized recommendations. Social media companies are not liable for the inappropriate or threatening content that their users post, but could they be held liable if their personalization algorithms recommend that content to other people? That is what the Court will decide. The next case asks how responsible tech companies are when terrorists use their services. Finally, a case that may or may not be heard concerns tech companies’ power to take down large amounts of misinformation. A common theme runs through these cases: each questions tech companies’ responsibilities toward their users. The decisions should come out in May or June 2023. If the Court rules that companies are responsible for their recommendations, these algorithms may have to be rewritten altogether, which would surely change the way we surf the internet. Think about how you would rule on these cases, and about how these rulings may affect your own life.
Where you lookin’? ML Paper explores gender bias in Twitter image cropping algorithms
If you’ve ever noticed that only part of an image is shown in a Tweet’s preview, it’s because many major media and social media firms like Twitter, Google, and Facebook use saliency-based image cropping (SIC) algorithms. As you can gather from the big-name companies, this technology is everywhere, yet ironically it isn’t very salient in our minds. Twitter switched in 2018 from cropping images around faces to SIC, explaining that the technology is “able to focus on the most interesting part of the image” and “able to detect puppies, faces, text, and other objects of interest”, though most of us probably never noticed the difference. Apple similarly follows two saliency frameworks, one based on where viewers’ eyes are immediately drawn in an image and the other on foreground objects.
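A minimal sketch of the core idea behind saliency-based cropping: given a per-pixel saliency map, center a fixed-size window on the most salient point, clamped to the image bounds. This is a simplified illustration, not Twitter’s or Apple’s actual implementation, and the function name and sizes are ours.

```python
import numpy as np

def saliency_crop(image, saliency, crop_h, crop_w):
    """Crop `image` around its most salient point.

    `saliency` is a per-pixel importance map with the same height and
    width as the image. The crop window is centered on the argmax of
    the saliency map and clamped so it stays inside the image.
    """
    h, w = saliency.shape
    y, x = np.unravel_index(np.argmax(saliency), saliency.shape)
    top = min(max(y - crop_h // 2, 0), h - crop_h)
    left = min(max(x - crop_w // 2, 0), w - crop_w)
    return image[top:top + crop_h, left:left + crop_w]

# Usage: a tall image with a single salient point near the bottom.
image = np.arange(100 * 60).reshape(100, 60)
saliency = np.zeros((100, 60))
saliency[80, 10] = 1.0  # the "most interesting" pixel
crop = saliency_crop(image, saliency, 40, 40)
```

Since the crop follows wherever the predicted saliency peaks, any bias in the saliency model (whose gaze it learned to imitate) translates directly into which parts of people end up in the preview.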
A recent 2022 paper explored whether bias in these algorithms led to cropping that reflected the male gaze (“directed at the chest and the waist-hip areas of the women being gazed upon”). The researchers formulated this question from the Tweets above, in which long, thin images of women at shows or events, with corporate logos in the background, were cropped in a male-gaze-like manner. Interestingly, the paper found from Twitter’s saliency cell maps that the focal point of these images was not the women themselves, but their clothes or the background logos! The paper ends by qualifying that its observations held only under narrowly defined conditions (a minimum height-width ratio, full-body images only, images with background logos only) and that it aimed to encourage further study of these algorithms across platforms, a sensible reminder that we shouldn’t make sweeping judgments from a single study.