Generative AI: Time for Scrutiny

At IDE’s annual conference, researchers called for a closer look at artificial intelligence tools that generate text, videos, images and more — whether for good or ill.

MIT Initiative on the Digital Economy
May 24, 2023

By Peter Krass

The latest iterations of generative AI have made great strides in human-machine communication. They also raise economic and ethical concerns that demand attention as the technology advances.

That was the upshot of several sessions at the annual conference of the MIT Initiative on the Digital Economy (IDE), a virtual event for sponsors and other stakeholders held May 18.

Generative AI refers to artificial intelligence tools that can create new content. They “learn” patterns and structures based on training from existing digital content such as books, videos and music. Then these tools use what they’ve learned to generate entirely new, human-like content, including text, videos, images, audio and code.

Software based on generative AI includes the extremely popular ChatGPT chatbot developed by OpenAI. In just the first week after ChatGPT was widely released last year, the software reportedly gained over 1 million users. Since then, ChatGPT has attracted more than 100 million users worldwide who are estimated to generate roughly 10 million queries a day, according to the Increditools site.

During the annual conference, IDE researchers pointed out that generative AI tools have been under development for some time and carry immense implications.

ChatGPT could help in daily tasks such as planning your next vacation, but it could also put you out of a job or dispense misinformation. The outcomes vary widely.

AI and Human Labor

The full impact of ChatGPT on the labor market is a “trillion-dollar question,” said conference speaker John Horton, lead of IDE’s AI, Marketplaces, and Labor Economics group and an associate professor of information technology at MIT Sloan. Horton discussed GPT’s likely effects on labor while cautioning that we are still in the early stages of implementation.

Goldman Sachs recently predicted that AI could affect as many as 300 million jobs worldwide. That could lead to “significant disruptions,” the investment firm said. [More IDE perspectives on GPT and the economy can be found here.]

Horton described jobs as essentially a series of tasks that need to be done. Many of those tasks could be handled by a generative AI tool. But amid the rush to adopt GPT, Horton said, a better question is whether a given task should be done by AI at all. That is, would using AI be more efficient than simply having a human do it? He offered some metrics to help answer the question.

And even when AI may boost performance, another challenge quickly arises: Can a human evaluate whether the job done by the AI is good enough? “If the evaluation cost is too high,” Horton said, “it may not make sense to let AI do it.”
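Horton’s caveat can be framed as a simple back-of-the-envelope comparison. The sketch below is our own minimal illustration, not a model from the talk; all function names and dollar figures are hypothetical. The idea: delegating a task to AI pays off only when generating the output plus having a human verify it costs less than having a human do the task outright.

```python
# A minimal, hypothetical cost model illustrating Horton's point (not from the
# talk): delegating a task to AI makes sense only if producing the output AND
# having a human verify it is cheaper than a human doing the task directly.

def ai_makes_sense(ai_cost: float, evaluation_cost: float, human_cost: float) -> bool:
    """Return True if delegating the task to AI is cheaper overall."""
    return ai_cost + evaluation_cost < human_cost

# Example: drafting a routine report. All dollar figures are made up.
print(ai_makes_sense(ai_cost=0.50, evaluation_cost=5.00, human_cost=40.00))   # True

# If checking the AI's work costs nearly as much as doing the work yourself,
# delegation stops making sense (Horton's "evaluation cost" caveat).
print(ai_makes_sense(ai_cost=0.50, evaluation_cost=38.00, human_cost=40.00))  # False
```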

Generative AI may produce some indirect productivity gains, too, Horton explained. For example, it may perform a common but high-level task previously done by a highly paid person, while a lower-skilled (and lower-paid) person prompts the tool and judges its output. (Prompting refers to how a human instructs or asks a chatbot to create new content.)

Not only could the generative AI tool replace one highly paid worker in this case, but by acting as an enabling technology, it could also indirectly boost the performance of the lower-paid worker, Horton said.
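To make prompting concrete, here is a minimal sketch of how a worker might prompt a chatbot programmatically, using the openai Python package’s ChatCompletion interface as it existed in mid-2023. The model choice, prompt text, and workflow are illustrative assumptions, not details from the conference.

```python
# Minimal illustration of "prompting" a chatbot from code, using the openai
# Python package's ChatCompletion interface (as available in mid-2023).
# The prompt text and model choice are illustrative, not from the conference.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; set your real key securely

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "Draft a three-sentence summary of our Q2 sales results "
                       "for a non-technical audience.",
        },
    ],
)

# The generated draft that a lower-paid worker would then review and judge.
print(response.choices[0].message.content)
```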

John Horton addresses the IDE Annual Conference

It’s overly simplistic to conclude that we’re all doomed to unemployment and will be replaced by machines. In fact, some economists point to the “lump of labor” fallacy, the mistaken assumption that the amount of work to be done is fixed; in reality, it is constantly expanding.

Others cite the theory of “nonsatiation”: whenever technology meets current demands, new human wants arise, generating new tasks and new demand for human labor.

Who wins and who loses will depend on how labor is redistributed, Horton said. What’s most certain is “quite a bit of disruption.”

Humans First

Renée Richardson Gosline, lead of IDE’s Human-First AI research group and a senior lecturer in management science at MIT Sloan, takes another view of AI proliferation: We can’t put the GPT genie back in the bottle, but we can keep human needs and feedback at the center of its development.

Renée Richardson Gosline reminds IDE attendees about keeping humans at the center of AI efforts.

Gosline sees the potential for AI efficiencies as well as dangers. As an example of an effective use of the technology, she related how she recently used ChatGPT to plan a vacation. In the past, she would have created an itinerary based on an extensive search of the web, social media, and other online sources. Now she simply asked ChatGPT to do it for her.

But Gosline was also quick to remind attendees of risks and potential harms. When designing and training these systems, “humans must be central,” she asserted. “We should consider the potential harms and biases before we release an AI.”

As Gosline pointed out, the AI Index Report from the Stanford Institute for Human-Centered AI finds that the number of reported AI incidents and controversies multiplied from fewer than 50 in 2012 to 260 in 2021, as shown in Figure 1:

Figure 1: Reported AI incidents and controversies, 2012–2021 (source: Stanford AI Index Report)

Gosline also gave an example of the potential for AI bias. A recent Reddit thread showed how Midjourney, a generative AI tool that creates images from text prompts, can be racially biased: asked to create images of college professors, it generated only white faces, no matter the academic subject. “We can’t just frictionlessly adopt AI,” Gosline said. “We need to think about the bias.”

AI Aversion?

A related question, namely where friction should be placed, was explored by one of Gosline’s collaborators, Yunhao “Jerry” Zhang, a member of IDE’s Human-First AI research group and a recent Ph.D. graduate of MIT Sloan.

Zhang presented findings from an IDE research project conducted with Accenture. [Read the research paper here.] Human subjects were asked to rate the quality of texts generated by human writers, by ChatGPT, or by a combination of the two. The experiment began by creating text under four conditions:

  • Text written by human Accenture copywriters
  • Text written by ChatGPT
  • Text written by ChatGPT, then revised and finalized by a human
  • Text written by a human, then revised and finalized by ChatGPT

Next, the 1,200 human subjects were randomly assigned to three groups:

  • No knowledge of who created the content
  • Partial knowledge of the four conditions, but not of how each individual piece of content was created
  • Full knowledge of how each piece of content was created

Each group was then shown 10 texts created under the four conditions and asked to rate their quality.
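For readers who want the study’s structure at a glance, the sketch below encodes the four content conditions and three disclosure groups described above. The variable names and assignment mechanics are our own illustrative assumptions; this is not code from the paper.

```python
import random

# Illustrative sketch of the study's 4-condition x 3-group design as described
# above; names and assignment mechanics are assumptions, not the paper's code.
CONTENT_CONDITIONS = [
    "human_only",         # written by Accenture copywriters
    "ai_only",            # written by ChatGPT
    "ai_then_human",      # ChatGPT draft, revised and finalized by a human
    "human_then_ai",      # human draft, revised and finalized by ChatGPT
]

DISCLOSURE_GROUPS = [
    "no_knowledge",       # told nothing about who created the content
    "partial_knowledge",  # told the four conditions exist, no per-item labels
    "full_knowledge",     # told exactly how each piece of content was created
]

subjects = list(range(1200))
random.shuffle(subjects)

# Split the 1,200 subjects evenly across the three disclosure groups;
# each subject then rates 10 texts drawn from the four content conditions.
group_size = len(subjects) // len(DISCLOSURE_GROUPS)  # 400 per group
assignment = {
    group: subjects[i * group_size:(i + 1) * group_size]
    for i, group in enumerate(DISCLOSURE_GROUPS)
}

for group, members in assignment.items():
    print(group, len(members))  # each group has 400 subjects
```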

Overall, human subjects preferred content created by ChatGPT.

Those with no knowledge of how the content was created (the “baseline” group) favored AI content by an especially large margin. But, as the chart below (Figure 2) shows, so did those who were fully informed about how each piece of content was created:

Figure 2: Quality ratings of AI- and human-generated content across the three disclosure groups

That said, Zhang and his colleagues did uncover two notable effects. The first was “human favoritism”: when people know content was created by humans, they rate it more highly. The second was the absence of “AI aversion”: when people don’t know who created the content, they perceive AI-generated content as being as good as, or even better than, content generated by humans.

According to the research, “users tolerate AI’s errors to a degree greater than previously thought.” Overall, consumers seem to be “rational” in their preferences as they tend to choose the option they perceive to have a better performance, the paper concludes.

“Given these results, in order to achieve greater adoption of AI, it is imperative that firms strive to improve the performance of their AI and demonstrate that their AI products are more capable than the alternative options provided by humans.”

The research can be helpful to businesses that want to anticipate or watch for reactions to their AI development efforts — positive as well as negative.

Peter Krass is a contributing writer and editor for the MIT IDE.
