
Decoding the US Senate Hearing on Oversight of AI: NLP Analysis in Python

Word frequency analysis, visualization and sentiment scores using the NLTK toolkit

Raul Vizcarra Chirinos · Towards Data Science · Jun 2, 2023

Last Sunday morning, as I was switching TV channels trying to find something to watch while having breakfast, I stumbled upon a replay of the Senate Hearing on Oversight of AI. It had only been 40 minutes since it started, so I decided to watch the rest of it (Talk about an interesting way to spend a Sunday morning!).

When events like the Senate Judiciary Subcommittee Hearing on Oversight of AI take place and you want to catch up on the key takeaways, you have four options: witness it live; look for a recording afterwards (both options would require three hours of your life); read the written version (a transcript of about 79 pages and over 29,000 words); or read reviews on websites or social media to get different opinions and form your own (if it isn’t just borrowed from others).

Nowadays, with everything moving so quickly and our days feeling too short, it’s tempting to go for the shortcut and rely on reviews instead of going to the original source (I’ve been there too). If you choose the shortcut for this hearing, it’s highly probable that most reviews you’ll find on the web or social media focus on OpenAI CEO Sam Altman’s call for regulating AI. However, after watching the hearing, I felt there was more to explore beyond the headlines.

So, after my Sunday funday morning activity, I decided to download the Senate Hearing transcript and use the NLTK package (a Python package for natural language processing — NLP) to analyze it, compare the most used words, and apply some sentiment scores across different groups of interest (OpenAI, IBM, Academia, Congress) to see what might be hiding between the lines. Spoiler alert! Out of the roughly 29,000 words analyzed, only 70 (0.24%) were related to words like regulation, regulate, regulatory, or legislation.

It’s important to note that this article is not about my takeaways from this AI hearing or Mr. ChatGPT Sam Altman. Instead, it focuses on what lies beneath the words of each part of society (Private, Academia, Government) represented in this session under the roof of Capitol Hill, and what we can learn from how those words mix with each other.

Considering that the next few months are interesting times for the future of regulation on Artificial Intelligence, as the final draft of the EU AI Act awaits debate in the European Parliament (expected to take place in June), it’s worth exploring what’s behind the discussions surrounding AI on this side of the Atlantic.

STEP-01: GET THE DATA

I used the transcript published by Justin Hendrix in Tech Policy Press (accessible here).


While Hendrix mentions it’s a quick transcript and suggests confirming quotes by watching the Senate Hearing video, I still found it to be quite accurate and interesting for this analysis. If you want to watch the Senate Hearing or read the testimonies of Sam Altman (OpenAI), Christina Montgomery (IBM), and Gary Marcus (Professor at New York University), you can find them here.

Initially, I planned to copy the transcript to a Word document and manually create a table in Excel with the participants’ names, their representing organizations, and their comments. However, this approach was time-consuming and inefficient. So, I turned to Python and uploaded the full transcript from a Microsoft Word file into a data frame. Here is the code I used:

# STEP 01 - Read the Word document
# remember to install: pip install python-docx

import docx
import pandas as pd

doc = docx.Document('D:\....your word file on microsoft word')

items = []
names = []
comments = []
name = None  # holds the current speaker while we iterate

# Iterate over paragraphs
for paragraph in doc.paragraphs:
    text = paragraph.text.strip()

    if text.endswith(':'):
        name = text[:-1]
    else:
        items.append(len(items))
        names.append(name)
        comments.append(text)

dfsenate = pd.DataFrame({'item': items, 'name': names, 'comment': comments})

# Remove rows with empty comments
dfsenate = dfsenate[dfsenate['comment'].str.strip().astype(bool)]

# Reset the index
dfsenate.reset_index(drop=True, inplace=True)
dfsenate['item'] = dfsenate.index + 1
print(dfsenate)

The output should look like this:

 item name comment
0 1 Sen. Richard Blumenthal (D-CT) Now for some introductory remarks.
1 2 Sen. Richard Blumenthal (D-CT) “Too often we have seen what happens when technology outpaces regulation, the unbridled exploitation of personal data, the proliferation of disinformation, and the deepening of societal inequalities. We have seen how algorithmic biases can perpetuate discrimination and prejudice, and how the lack of transparency can undermine public trust. This is not the future we want.”
2 3 Sen. Richard Blumenthal (D-CT) If you were listening from home, you might have thought that voice was mine and the words from me, but in fact, that voice was not mine. The words were not mine. And the audio was an AI voice cloning software trained on my floor speeches. The remarks were written by ChatGPT when it was asked how I would open this hearing. And you heard just now the result I asked ChatGPT, why did you pick those themes and that content? And it answered. And I’m quoting, Blumenthal has a strong record in advocating for consumer protection and civil rights. He has been vocal about issues such as data privacy and the potential for discrimination in algorithmic decision making. Therefore, the statement emphasizes these aspects.
3 4 Sen. Richard Blumenthal (D-CT) Mr. Altman, I appreciate ChatGPT’s endorsement. In all seriousness, this apparent reasoning is pretty impressive. I am sure that we’ll look back in a decade and view ChatGPT and GPT-4 like we do the first cell phone, those big clunky things that we used to carry around. But we recognize that we are on the verge, really, of a new era. The audio and my playing, it may strike you as curious or humorous, but what reverberated in my mind was what if I had asked it? And what if it had provided an endorsement of Ukraine, surrendering or Vladimir Putin’s leadership? That would’ve been really frightening. And the prospect is more than a little scary to use the word, Mr. Altman, you have used yourself, and I think you have been very constructive in calling attention to the pitfalls as well as the promise.
4 5 Sen. Richard Blumenthal (D-CT) And that’s the reason why we wanted you to be here today. And we thank you and our other witnesses for joining us for several months. Now, the public has been fascinated with GPT, dally and other AI tools. These examples like the homework done by ChatGPT or the articles and op-eds, that it can write feel like novelties. But the underlying advancement of this era are more than just research experiments. They are no longer fantasies of science fiction. They are real and present the promises of curing cancer or developing new understandings of physics and biology or modeling climate and weather. All very encouraging and hopeful. But we also know the potential harms and we’ve seen them already weaponized disinformation, housing discrimination, harassment of women and impersonation, fraud, voice cloning deep fakes. These are the potential risks despite the other rewards. And for me, perhaps the biggest nightmare is the looming new industrial revolution. The displacement of millions of workers, the loss of huge numbers of jobs, the need to prepare for this new industrial revolution in skill training and relocation that may be required. And already industry leaders are calling attention to those challenges.
5 6 Sen. Richard Blumenthal (D-CT) To quote ChatGPT, this is not necessarily the future that we want. We need to maximize the good over the bad. Congress has a choice. Now. We had the same choice when we face social media. We failed to seize that moment. The result is predators on the internet, toxic content exploiting children, creating dangers for them. And Senator Blackburn and I and others like Senator Durbin on the Judiciary Committee are trying to deal with it in the Kids Online Safety Act. But Congress failed to meet the moment on social media. Now we have the obligation to do it on AI before the threats and the risks become real. Sensible safeguards are not in opposition to innovation. Accountability is not a burden far from it. They are the foundation of how we can move ahead while protecting public trust. They are how we can lead the world in technology and science, but also in promoting our democratic values.
6 7 Sen. Richard Blumenthal (D-CT) Otherwise, in the absence of that trust, I think we may well lose both. These are sophisticated technologies, but there are basic expectations common in our law. We can start with transparency. AI companies ought to be required to test their systems, disclose known risks, and allow independent researcher access. We can establish scorecards and nutrition labels to encourage competition based on safety and trustworthiness, limitations on use. There are places where the risk of AI is so extreme that we ought to restrict or even ban their use, especially when it comes to commercial invasions of privacy for profit and decisions that affect people’s livelihoods. And of course, accountability, reliability. When AI companies and their clients cause harm, they should be held liable. We should not repeat our past mistakes, for example, Section 230, forcing companies to think ahead and be responsible for the ramifications of their business decisions can be the most powerful tool of all. Garbage in, garbage out. The principle still applies. We ought to beware of the garbage, whether it’s going into these platforms or coming out of them.

Next, I considered adding some labels for future analysis, identifying each individual by the segment of society they represented:


def assign_sector(name):
    if name in ['Sam Altman', 'Christina Montgomery']:
        return 'Private'
    elif name == 'Gary Marcus':
        return 'Academia'
    else:
        return 'Congress'

# Apply function
dfsenate['sector'] = dfsenate['name'].apply(assign_sector)


# Assign organizations based on names
def assign_organization(name):
    if name == 'Sam Altman':
        return 'OpenAI'
    elif name == 'Christina Montgomery':
        return 'IBM'
    elif name == 'Gary Marcus':
        return 'Academia'
    else:
        return 'Congress'

# Apply function
dfsenate['Organization'] = dfsenate['name'].apply(assign_organization)

print(dfsenate)

Finally, I decided to add a column that counts the words in each statement, which will also help us with further analysis.

dfsenate['WordCount'] = dfsenate['comment'].apply(lambda x: len(x.split()))

At this point, your dataframe should look like this:

   item                            name  ... Organization WordCount
0 1 Sen. Richard Blumenthal (D-CT) ... Congress 5
1 2 Sen. Richard Blumenthal (D-CT) ... Congress 55
2 3 Sen. Richard Blumenthal (D-CT) ... Congress 125
3 4 Sen. Richard Blumenthal (D-CT) ... Congress 145
4 5 Sen. Richard Blumenthal (D-CT) ... Congress 197
.. ... ... ... ... ...
399 400 Sen. Cory Booker (D-NJ) ... Congress 156
400 401 Sam Altman ... OpenAI 180
401 402 Sen. Cory Booker (D-NJ) ... Congress 72
402 403 Sen. Richard Blumenthal (D-CT) ... Congress 154
403 404 Sen. Richard Blumenthal (D-CT) ... Congress 98

STEP-02: VISUALIZE THE DATA

Let’s take a look at the numbers we have so far: 404 questions or testimonies and almost 29,000 words. These numbers give us the material we need to get started. It’s important to know that some statements were split into smaller parts: when there were long statements with different paragraphs, the code divided them into separate rows, even though they were actually part of one contribution. To get a better understanding of each participant’s involvement, I also considered the number of words they used, which gave another perspective on their engagement.
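If you prefer to look at whole contributions instead of fragments, one option is to merge consecutive rows from the same speaker. This is just a sketch under that assumption; the merged frame (dfmerged) is my own naming and isn’t used in the rest of the analysis:

# Optional: merge consecutive fragments by the same speaker into a single contribution
merge_key = (dfsenate['name'] != dfsenate['name'].shift()).cumsum()
dfmerged = (dfsenate.groupby(merge_key)
            .agg({'name': 'first', 'sector': 'first', 'Organization': 'first',
                  'comment': ' '.join, 'WordCount': 'sum'})
            .reset_index(drop=True))
print(len(dfsenate), 'fragments ->', len(dfmerged), 'merged contributions')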

Hearing on Oversight of AI: Figure 01

As you can see in Figure 01, interventions by members of Congress accounted for more than half of the hearing, followed by Sam Altman’s testimony. However, an alternative view, obtained by counting the words from each side, shows a more balanced representation between Congress (11 members) and the panel composed of Altman (OpenAI), Montgomery (IBM), and Marcus (Academia).

It’s interesting to note the different levels of engagement among the members of Congress who participated in the Senate hearing (see the table below). As expected, Sen. Blumenthal, as the Subcommittee Chair, was highly engaged. But what about the other members? The table shows significant variations in engagement among all eleven participants. Remember, the quantity of contributions doesn’t necessarily indicate their quality. I’ll let you make your own judgement while you review the numbers.

Lastly, even though Sam Altman received a lot of attention, it’s worth noting that Gary Marcus, although it may appear he had few opportunities to participate, had a lot to say, as indicated by his word count, which is similar to Altman’s. Or is it maybe because academia often provides detailed explanations, while the business world prefers practicality and straightforwardness?

Alright, professor Marcus, if you could be specific. This is your shot, man. Talk in plain English and tell me what, if any, rules we ought to implement. And please don’t just use concepts. I’m looking for specificity.

Sen. John Kennedy (R-LA). US Senate Hearing on Oversight of AI (2023)

#*****************************PIE CHARTS************************************
import pandas as pd
import matplotlib.pyplot as plt

# Pie chart - Grouping by 'Organization' Questions&Testimonies
org_colors = {'Congress': '#6BB6FF', 'OpenAI': 'green', 'IBM': 'lightblue', 'Academia': 'lightyellow'}
org_counts = dfsenate['Organization'].value_counts()

plt.figure(figsize=(8, 6))
patches, text, autotext = plt.pie(org_counts.values, labels=org_counts.index,
                                  autopct=lambda p: f'{p:.1f}%\n({int(p * sum(org_counts.values) / 100)})',
                                  startangle=90, colors=[org_colors.get(org, 'gray') for org in org_counts.index])
plt.title('Hearing on Oversight of AI: Questions or Testimonies')
plt.axis('equal')
plt.setp(text, fontsize=12)
plt.setp(autotext, fontsize=12)
plt.show()

# Pie chart - Grouping by 'Organization' (WordCount)
org_colors = {'Congress': '#6BB6FF', 'OpenAI': 'green', 'IBM': 'lightblue', 'Academia': 'lightyellow'}
org_wordcount = dfsenate.groupby('Organization')['WordCount'].sum()

plt.figure(figsize=(8, 6))
patches, text, autotext = plt.pie(org_wordcount.values, labels=org_wordcount.index,
                                  autopct=lambda p: f'{p:.1f}%\n({int(p * sum(org_wordcount.values) / 100)})',
                                  startangle=90, colors=[org_colors.get(org, 'gray') for org in org_wordcount.index])


plt.title('Hearing on Oversight of AI: WordCount ')
plt.axis('equal')
plt.setp(text, fontsize=12)
plt.setp(autotext, fontsize=12)
plt.show()

#************Engagement among the members of Congress**********************

# Group by name and count the rows
Summary_Name = dfsenate.groupby('name').agg(comment_count=('comment', 'size')).reset_index()

# WordCount column for each name
Summary_Name ['Total_Words'] = dfsenate.groupby('name')['WordCount'].sum().values

# Percentage distribution for comment_count
Summary_Name ['comment_count_%'] = Summary_Name['comment_count'] / Summary_Name['comment_count'].sum() * 100

# Percentage distribution for total_word_count
Summary_Name ['Word_count_%'] = Summary_Name['Total_Words'] / Summary_Name['Total_Words'].sum() * 100

Summary_Name = Summary_Name.sort_values('Total_Words', ascending=False)

print (Summary_Name)
+-------+--------------------------------+---------------+-------------+----------+--------------+
| index | name                           | Interventions | Total_Words | Interv_% | Word_count_% |
+-------+--------------------------------+---------------+-------------+----------+--------------+
|     2 | Sam Altman                     |            92 |        6355 |    22.77 |        22.32 |
|     1 | Gary Marcus                    |            47 |        5105 |    11.63 |        17.93 |
|    15 | Sen. Richard Blumenthal (D-CT) |            58 |        3283 |    14.36 |        11.53 |
|    10 | Sen. Josh Hawley (R-MO)        |            25 |        2283 |     6.19 |         8.02 |
|     0 | Christina Montgomery           |            36 |        2162 |     8.91 |         7.59 |
|     6 | Sen. Cory Booker (D-NJ)        |            20 |        1688 |     4.95 |         5.93 |
|     7 | Sen. Dick Durbin (D-IL)        |             8 |        1143 |     1.98 |         4.01 |
|    11 | Sen. Lindsey Graham (R-SC)     |            32 |         880 |     7.92 |         3.09 |
|     5 | Sen. Christopher Coons (D-CT)  |             6 |         869 |     1.49 |         3.05 |
|    12 | Sen. Marsha Blackburn (R-TN)   |            14 |         869 |     3.47 |         3.05 |
|     4 | Sen. Amy Klobuchar (D-MN)      |            11 |         769 |     2.72 |         2.70 |
|    13 | Sen. Mazie Hirono (D-HI)       |             7 |         755 |     1.73 |         2.65 |
|    14 | Sen. Peter Welch (D-VT)        |            11 |         704 |     2.72 |         2.47 |
|     3 | Sen. Alex Padilla (D-CA)       |             7 |         656 |     1.73 |         2.30 |
+-------+--------------------------------+---------------+-------------+----------+--------------+

STEP-03: TOKENIZATION

Here is where the natural language processing (NLP) fun begins. To analyze the text, we’ll use the NLTK package in Python, which provides useful tools for word frequency analysis and visualization. The following libraries and modules supply the tools we need:


#pip install nltk
#pip install spacy
#pip install wordcloud
#python -m spacy download en_core_web_sm
# (no install needed for subprocess; it is part of the Python standard library)

First, we’ll start with tokenization, which means breaking the text into individual words, also known as “tokens.” For this, we’ll use spaCy, an open-source NLP library that can handle contractions, punctuation, and special characters. Next, we’ll remove common words that don’t add much meaning, like “a,” “an,” “the,” “is,” and “and,” using the stop word list from the NLTK library. Finally, we’ll apply lemmatization, which reduces words to their base form, known as the lemma. For example, “running” becomes “run” and “happier” becomes “happy.” This technique helps us work with the text more effectively and understand its meaning.

To summarize:

o Tokenize the text.

o Remove common words.

o Apply Lemmatization.

#***************************WORD FREQUENCY*******************************

import subprocess
import nltk
import spacy
from nltk.probability import FreqDist
from nltk.corpus import stopwords

# Download resources
subprocess.run('python -m spacy download en_core_web_sm', shell=True)
nltk.download('punkt')
nltk.download('stopwords')

# Load spaCy model and set stopwords
nlp = spacy.load('en_core_web_sm')
stop_words = set(stopwords.words('english'))

def preprocess_text(text):
    words = nltk.word_tokenize(text)
    words = [word.lower() for word in words if word.isalpha()]
    words = [word for word in words if word not in stop_words]
    lemmas = [token.lemma_ for token in nlp(" ".join(words))]
    return lemmas

# Aggregate words and create Frequency Distribution
all_comments = ' '.join(dfsenate['comment'])
processed_comments = preprocess_text(all_comments)
fdist = FreqDist(processed_comments)
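# Quick sanity check (optional): see what the pipeline does to a sample sentence.
# The sentence below is mine, just for illustration; the exact lemmas depend on the spaCy model.
print(preprocess_text("Senators were thinking about how regulators should be regulating AI systems"))
# Should print something roughly like: ['senator', 'think', 'regulator', 'regulate', 'ai', 'system']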

#**********************HEARING TOP 30 COMMON WORDS*********************
import matplotlib.pyplot as plt
import numpy as np

# Most common words and their frequencies
top_words = fdist.most_common(30)
words = [word for word, freq in top_words]
frequencies = [freq for word, freq in top_words]

# Bar plot-Hearing on Oversight of AI:Top 30 Most Common Words
fig, ax = plt.subplots(figsize=(8, 10))
ax.barh(range(len(words)), frequencies, align='center', color='skyblue')

ax.invert_yaxis()
ax.set_xlabel('Frequency', fontsize=12)
ax.set_ylabel('Words', fontsize=12)
ax.set_title('Hearing on Oversight of AI:Top 30 Most Common Words', fontsize=14)
ax.set_yticks(range(len(words)))
ax.set_yticklabels(words, fontsize=10)

ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['left'].set_linewidth(0.5)
ax.spines['bottom'].set_linewidth(0.5)
ax.tick_params(axis='x', labelsize=10)
plt.subplots_adjust(left=0.3)

for i, freq in enumerate(frequencies):
    ax.text(freq + 5, i, str(freq), va='center', fontsize=8)

plt.show()
Hearing on Oversight of AI: Figure 02

As you can see in the bar plot (Figure 02), there was a lot of “thinking”. Maybe the first five words give us an interesting hint of what we should do today and for our future in terms of AI:

“We need to think and know where AI should go”.

As I mentioned at the beginning of this article, at first sight, “regulation” doesn’t stand out as a frequently used word in the Senate AI Hearing. However, concluding that it wasn’t a main concern could be inaccurate. The interest in whether AI should or should not be regulated was expressed in different words such as “regulation”, “regulate”, “agency”, or “regulatory”. Therefore, let’s make some adjustments to the code, aggregate these words, and re-run the bar plot to see how it impacts the analysis.

nlp = spacy.load('en_core_web_sm')
stop_words = set(stopwords.words('english'))

def preprocess_text(text):
    words = nltk.word_tokenize(text)
    words = [word.lower() for word in words if word.isalpha()]
    words = [word for word in words if word not in stop_words]
    lemmas = [token.lemma_ for token in nlp(" ".join(words))]
    return lemmas

# Aggregate words and create Frequency Distribution
all_comments = ' '.join(dfsenate['comment'])
processed_comments = preprocess_text(all_comments)
fdist = FreqDist(processed_comments)
original_fdist = fdist.copy()  # Save the original object

aggregate_words = ['regulation', 'regulate', 'agency', 'regulatory', 'legislation']
aggregate_freq = sum(fdist[word] for word in aggregate_words)
df_aggregatereg = pd.DataFrame({'Word': aggregate_words, 'Frequency': [fdist[word] for word in aggregate_words]})

# Remove individual words and add aggregation
for word in aggregate_words:
    del fdist[word]
fdist['regulation+agency'] = aggregate_freq

# Pie chart for Regulation+agency distribution
import matplotlib.pyplot as plt

labels = df_aggregatereg['Word']
values = df_aggregatereg['Frequency']

plt.figure(figsize=(8, 6))
plt.subplots_adjust(top=0.8, bottom=0.25)

patches, text, autotext = plt.pie(values, labels=labels,
                                  autopct=lambda p: f'{p:.1f}%\n({int(p * sum(values) / 100)})',
                                  startangle=90, colors=['#6BB6FF', 'green', 'lightblue', 'lightyellow', 'gray'])

plt.title('Regulation+agency: Distribution', fontsize=14)
plt.axis('equal')
plt.setp(text, fontsize=8)
plt.setp(autotext, fontsize=8)
plt.show()
Hearing on Oversight of AI: Figure 03

As you can see in Figure 03, the topic of regulation was, after all, raised many times during the Senate AI Hearing.

STEP-04: WHAT HIDES BEHIND THE WORDS

Words alone may provide us with some clues, but it is the interconnection of words that truly offers perspective. So, let’s use word clouds to explore whether we can discover insights that simple bar and pie charts cannot show.

# Word cloud-Senate Hearing on Oversight of AI
from wordcloud import WordCloud
wordcloud = WordCloud(width=800, height=400, background_color='white').generate_from_frequencies(fdist)
plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.title('Word Cloud - Senate Hearing on Oversight of AI')
plt.show()
Hearing on Oversight of AI: Figure 04

Let’s explore further and compare the word clouds for the different groups of interest represented in the AI Hearing (Private, Congress, Academia) and see if their words reveal different perspectives on the future of AI.

# Word clouds for each group of interest
organizations = dfsenate['Organization'].unique()
for organization in organizations:
    comments = dfsenate[dfsenate['Organization'] == organization]['comment']
    all_comments = ' '.join(comments)
    processed_comments = preprocess_text(all_comments)
    fdist_organization = FreqDist(processed_comments)

    # Word clouds
    wordcloud = WordCloud(width=800, height=400, background_color='white').generate_from_frequencies(fdist_organization)
    plt.figure(figsize=(10, 5))
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis('off')
    if organization == 'IBM':
        plt.title(f'Word Cloud: {organization} - Christina Montgomery')
    elif organization == 'OpenAI':
        plt.title(f'Word Cloud: {organization} - Sam Altman')
    elif organization == 'Academia':
        plt.title(f'Word Cloud: {organization} - Gary Marcus')
    else:
        plt.title(f'Word Cloud: {organization}')
    plt.show()
Hearing on Oversight of AI: Figure 05

It’s interesting how some words appear (or disappear) for each group of interest represented in the Senate AI Hearing while they talk about artificial intelligence.

As for the big headline, “Sam Altman’s call for regulating AI”: whether he is in favor of regulation or not, I really can’t tell, but there doesn’t seem to be much “regulation” in his words, at least to me. Instead, Sam Altman seems to take a people-centric approach when he talks about AI, repeating words like “think,” “people,” “know,” “important,” and “use,” and relying more on words like “technology,” “system,” or “model” instead of the word “AI.”

Someone who did have something to say about “risk” and “issues” was Christina Montgomery (IBM), who repeated these words constantly when talking about “technology,” “companies,” and “AI.” An interesting fact in her testimony is finding the words most of us expect to hear from companies involved in developing technology: “trust,” “governance,” and “think”ing about what is “right” in terms of AI.

We need to hold companies responsible today and accountable for AI that they’re deploying…..

Christina Montgomery. US Senate Hearing on Oversight of AI (2023)

Gary Marcus, in his initial statement, said, “I come as a scientist, someone who’s founded AI companies, and is someone who genuinely loves AI…” So, for the sake of this NLP analysis, we are considering him as a representation of the voice of Academia. Words like “need”, “think”, “know”, “go”, and “people” stand out among others. An interesting fact is that the word “system” seems to be repeated more than “AI” in his testimony. Maybe AI is not a single, lone technology that will change the future; the impact will come from multiple technologies or systems interacting with each other (IoT, robotics, BioTech, etc.) rather than from one of them alone.

In the end, the first hypothesis mentioned by Senator John Kennedy doesn’t seem entirely false after all (not just for Congress but for society as a whole). We are still at the stage where we are trying to understand the direction AI is heading.

Permit me to share with you three hypotheses that I would like you to assume for the moment to be true. Hypothesis number one, many members of Congress do not understand artificial intelligence. Hypothesis. Number two, that absence of understanding may not prevent Congress from plunging in with enthusiasm and trying to regulate this technology in a way that could hurt this technology. Hypothesis number three, that I would like you to assume there is likely a berserk wing of the artificial intelligence community that intentionally or unintentionally could use artificial intelligence to kill all of us and hurt us the entire time that we are dying…..

Sen. John Kennedy (R-LA). US Senate Hearing on Oversight of AI (2023)

STEP-05: THE EMOTION BEHIND YOUR WORDS

We’ll use the SentimentIntensityAnalyzer class from the NLTK library for sentiment analysis. This pre-trained model uses a lexicon-based approach, where each word in the lexicon (VADER) has a predefined sentiment polarity value. The sentiment scores of the words in a piece of text are aggregated to calculate an overall sentiment score. The numerical value ranges from -1 (negative sentiment) to +1 (positive sentiment), with 0 indicating a neutral sentiment. Positive sentiment reflects a favorable emotion, attitude, or enthusiasm, while negative sentiment conveys an unfavorable emotion or attitude.

#************SENTIMENT ANALYSIS************
from nltk.sentiment import SentimentIntensityAnalyzer
nltk.download('vader_lexicon')

sid = SentimentIntensityAnalyzer()
dfsenate['Sentiment'] = dfsenate['comment'].apply(lambda x: sid.polarity_scores(x)['compound'])
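# Optional check: inspect the raw scores for one sentence before aggregating.
# The example below quotes Sen. Blumenthal's opening remarks; polarity_scores returns
# a dict with 'neg', 'neu', 'pos' and 'compound' keys, and 'compound' is the value used above.
print(sid.polarity_scores("Sensible safeguards are not in opposition to innovation."))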

#************BOXPLOT-GROUP OF INTEREST************
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_style('white')
plt.figure(figsize=(12, 7))
sns.boxplot(x='Sentiment', y='Organization', data=dfsenate, color='yellow',
            width=0.6, showmeans=True, showfliers=True)

# Customize the axis
def add_cosmetics(title='Sentiment Analysis Distribution by Group of Interest',
                  xlabel='Sentiment'):
    plt.title(title, fontsize=28)
    plt.xlabel(xlabel, fontsize=20)
    plt.xticks(fontsize=15)
    plt.yticks(fontsize=15)
    sns.despine()

def customize_labels(label):
    if "OpenAI" in label:
        return label + "-Sam Altman"
    elif "IBM" in label:
        return label + "-Christina Montgomery"
    elif "Academia" in label:
        return label + "-Gary Marcus"
    else:
        return label

# Apply customized labels to y-axis
yticks = plt.yticks()[1]
plt.yticks(ticks=plt.yticks()[0],
           labels=[customize_labels(label.get_text()) for label in yticks])

add_cosmetics()
plt.show()
Hearing on Oversight of AI: Figure 06

A boxplot is always interesting as it shows the minimum and maximum values, the median, the first (Q1) and third (Q3) quartiles. In addition, a line of code was added to display the mean value. (Acknowledgment to Elena Kosourova for designing the boxplot code template; I only made adjustments for my dataset).
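If you want the exact numbers behind the boxplot, a quick group-by summary does the trick (a minimal sketch; skew() here is pandas’ built-in sample skewness):

# Quartiles, mean and skewness of the sentiment scores per group of interest
sentiment_summary = dfsenate.groupby('Organization')['Sentiment'].describe()
sentiment_summary['skew'] = dfsenate.groupby('Organization')['Sentiment'].skew()
print(sentiment_summary[['min', '25%', '50%', '75%', 'max', 'mean', 'skew']])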

Overall, everyone seemed to be in a good mood during the Senate Hearing, especially Sam Altman, who stood out with the highest sentiment score, followed by Christina Montgomery. On the other hand, Gary Marcus seemed to have a more neutral experience (median around 0.25) and he may have felt somewhat uncomfortable at times, with values close to 0 or even negative. In addition, Congress as a whole displayed a left-skewed distribution in its sentiment scores, indicating a tendency towards neutrality or positivity. Interestingly, if we take a closer look, certain interventions stood out with extremely high or low sentiment scores.

Hearing on Oversight of AI: Figure 07

Maybe we shouldn’t interpret the results as whether people in the Senate AI Hearing were happy or uncomfortable. Maybe they suggest that those who participated in the hearing don’t hold an overly optimistic view of where AI is headed, but they are not pessimistic either. The scores may indicate that there are some concerns, and that participants are being cautious about the direction AI should take.

And what about a timeline? Did the mood during the hearing stay the same throughout? How did the mood of each group of interest evolve? To analyze the timeline, I organized the statements in the order they were captured and conducted a sentiment analysis. Since there are over 400 questions or testimonies, I defined a moving average of the sentiment scores for each group of interest (Congress, Academia, Private), using a window size of 10. This means the moving average is calculated by averaging the sentiment scores over every 10 consecutive statements:

#**************************TIMELINE US SENATE AI HEARING**************************************

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import make_interp_spline

# Moving average for each organization
window_size = 10
organizations = dfsenate['Organization'].unique()

# Create the line plot
color_palette = sns.color_palette('Set2', len(organizations))

plt.figure(figsize=(12, 6))
for i, org in enumerate(organizations):
    df_org = dfsenate[dfsenate['Organization'] == org].copy()  # copy to avoid SettingWithCopy warnings

    # Moving average (missing values filled with 0)
    df_org['Sentiment'] = df_org['Sentiment'].fillna(0)
    df_org['Moving_Average'] = df_org['Sentiment'].rolling(window=window_size, min_periods=1).mean()

    # Smooth the moving average with a cubic spline
    x = np.linspace(df_org.index.min(), df_org.index.max(), 500)
    spl = make_interp_spline(df_org.index, df_org['Moving_Average'], k=3)
    y = spl(x)
    plt.plot(x, y, linewidth=2, label=f'{org} {window_size}-Point Moving Average', color=color_palette[i])

    # Annotate the last moving-average value for each group
    plt.text(df_org.index[-1], df_org['Moving_Average'].iloc[-1],
             f'{df_org["Moving_Average"].iloc[-1]:.2f}',
             ha='right', va='top', fontsize=12, color='black')

plt.xlabel('Statement Number', fontsize=12)
plt.ylabel('Sentiment Score', fontsize=12)
plt.title('Sentiment Score Evolution during the Hearing on Oversight of AI', fontsize=16)
plt.legend(fontsize=12)
plt.grid(color='lightgray', linestyle='--', linewidth=0.5)
plt.axhline(0, color='black', linewidth=0.5, alpha=0.5)

plt.tight_layout()
plt.show()
Hearing on Oversight of AI: Figure 08

At the beginning, it seemed like the session was friendly and optimistic, with everyone discussing the future of AI. But as the session went on, the mood started to change. The members of Congress became less optimistic, and their questions became more challenging. This affected the panelists’ scores, with some even getting low scores (you can see this towards the end of the session). Interestingly, Altman was seen by the model as neutral or slightly positive, even during the tense moments with the members of Congress.

It’s important to remember that the model has its limitations and could border on subjectivity. While sentiment analysis isn’t flawless, it offers us an interesting glimpse into the intensity of emotions that prevailed that day on Capitol Hill.

Final thought

In my opinion, the lessons behind this US Senate AI Hearing lie in the five most repeated words: “We need to think and know where AI should go.” It is noteworthy that words like “people” and “importance” were unexpectedly present in Sam Altman’s word cloud, going beyond the headline of a “call for regulation”. While I hoped to find more words like “transparency”, “accountability”, “trust”, “governance”, and “fairness” in Altman’s NLP analysis, it was a relief to find some of them frequently repeated in Christina Montgomery’s testimony. This is what we all expect to hear more often when AI is on the table.

Gary Marcus emphasized “system” as much as “AI”, perhaps inviting us to see Artificial Intelligence in a broader context. Multiple technologies are emerging right now, and their combined impact on society, work, and employment in the future will come from the clash of these multiple technologies, not just from one of them. Academia plays a vital role in guiding this path, and in deciding whether some kind of regulation is needed. I say this “literally”, not “spiritually” (inside joke from the six-month moratorium letter).

Finally, the word “Agency” was repeated as much as “Regulation” in its different forms. This suggests that the concept of an “Agency for AI” and its role will likely be a topic of debate in the near future. An interesting reflection on this challenge was mentioned in the Senate AI Hearing by Sen. Richard Blumenthal:

…Most of my career has been an enforcement. And I will tell you something, you can create 10 new agencies, but if you don’t give them the resources, and I’m talking not just about dollars, I’m talking about scientific expertise, you guys will run circles around ’em. And it isn’t just the, the models or the generative AI that will run models around run circles around them, but it is the scientists in your companies. For every success story in government regulation, you can think of five failures…. And I hope our experience here will be different…

Sen. Richard Blumenthal (D-CT). US Senate Hearing on Oversight of AI (2023)

Although reconciling innovation, awareness, and regulation is challenging for me, I am all for raising awareness about AI’s role in our present and future, while also understanding that “research” and “development” are different things. The first should be encouraged and promoted, not contained; the second is where the extra effort in “thinking” and “knowing” is needed.

I hope you found this NLP analysis interesting and I want to thank Justin Hendrix and Tech Policy Press for allowing me to use their transcript in this article. You can access the complete code in this GitHub repository. (Acknowledgement also to ChatGPT for helping me fine-tune some of my code for a better presentation).

Did I miss anything? Your suggestions are always welcome and keep the conversation going.
