Understanding AI Technology: An Introduction for Educators

Glenn Kleiman
Published in The Generator · Feb 12, 2023

Daniela Ganelin and Glenn M. Kleiman¹
February 2023

An abstract image depicting artificial intelligence, created by Barbara Treacy using DALL-E

Artificial intelligence (AI) is playing an increasingly influential role in our everyday lives. Already, AI guides us from one location to another, translates conversations, recommends music and movies, drives cars, recognizes faces, and much more. In recent months, we’ve seen sophisticated language AI like ChatGPT become widespread in schools, businesses, and media. In the coming years, we can expect AI to show up increasingly in homes, doctors’ offices, city squares, and just about everywhere else, with broad-reaching positive and negative societal effects.

Educators need to understand AI to make informed decisions about its use in schools. To prepare their students for a world and economy shaped by artificial intelligence, educators must consider how AI can influence what students need to learn, how teaching and learning take place, and how schools operate in general. As background for education decision-makers, this article provides an overview of the state of AI today, future directions, and the ethical concerns raised by both the power and the limitations of AI tools. The final section summarizes decisions educators will need to make about the impact of AI.

AI Basics

Broadly speaking, artificial intelligence encompasses any technology that is able to solve problems that would seem to require human-like intelligence. In science fiction, we often see AI represented as robots that can talk, reason, emote, and move just like humans — artificial beings very like ourselves. In the real world, today’s technologies have not reached this level of general AI, although they are improving rapidly. Instead, they’re usually pieces of software that perform a particular function, like making predictions, very well.

AI has undergone a renaissance since about 2010, and particularly since 2018, thanks to a new paradigm that has come to dominate current work. Previously, AI developers typically tried to understand how humans approach problems and then incorporated that knowledge and reasoning as step-by-step rules in a program. For example, to make a chatbot that offers medical diagnoses, AI developers could compile a list of all the questions a doctor would ask and the decisions she would make, such as “if a patient reports ankle pain and swelling but no numbness, the likely diagnosis is a sprain.” Try interacting with ELIZA, a very early “chatbot therapist” that fooled many in the 1960s. Can you guess some of the rules behind the chatbot’s responses?

This rule-based approach to AI continues to be influential: for example, a driver following Waze directions is relying on algorithms for finding efficient routes that were developed more than 50 years ago. Unfortunately, this paradigm comes with limitations. In many cases, the world and human reasoning are too complex and fluid to be encoded in a set of simple rules. How could our medical chatbot above deal with the arrival of COVID? How could we write down rules for calming a distressed preschooler or recognizing beauty in a work of art?

Instead, the incredible developments in AI in recent years have been powered by a different paradigm: machine learning.

Machine Learning

In machine learning, AI developers don’t program explicit rules for solving a problem. Instead, a program receives a dataset of examples, analyzes patterns in the dataset, and figures out its own rules from that data.

A nice visual comes from the “Quick, Draw!” game, which tries to recognize a user’s drawing. There are no rules describing objects built in, like “stars have five points” or “coffee cups have handles.” Instead, the program compares your doodle with millions of previous users’ drawings to find the most similar example. If you play through a two-minute round, you can explore the large dataset that helped the AI make its decisions. You might also notice that, although the program makes its best guess based on its available data, its predictions are not perfect.
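For readers who like to see the idea in code, here is a toy sketch of "find the most similar example." The features (a stroke count and a made-up "roundness" score) and the tiny example dataset are entirely hypothetical; a real system like Quick, Draw! learns from millions of drawings and far richer representations.

```python
# Toy nearest-neighbor classifier: drawings are reduced to small
# numeric feature vectors (hypothetical features: stroke count,
# roundness), and the program predicts the label of the closest
# known example.

def distance(a, b):
    # Euclidean distance between two feature vectors
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Tiny made-up "training" dataset: (features, label)
examples = [
    ((5, 0.2), "star"),    # 5 strokes, not very round
    ((1, 0.9), "circle"),  # 1 stroke, very round
    ((2, 0.5), "cup"),
]

def classify(features):
    # Predict the label of the nearest known drawing
    return min(examples, key=lambda ex: distance(ex[0], features))[1]

print(classify((4, 0.3)))  # closest to the star example: 'star'
```

Just as with the real game, a new doodle is labeled by whichever stored example it most resembles, so the quality of the prediction depends entirely on the dataset.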

Machine learning can be used with many kinds of data, including images, text, audio, video, and numbers. Different algorithms — procedures for a computer to learn rules from data — exist for different kinds of data. If you’ve ever made predictions using a line of best fit, you’ve emulated a simple version of machine learning with numerical data.

Figure 1. Students’ pre-test scores.
Figure 2. Students’ scores with the line of best fit.

As a small example, suppose you were interested in deciding which students should be placed in an advanced math class. Say that last year you recorded a dataset of students’ pre-test and end-of-year scores, as shown in Figure 1. You might use a graphing calculator or spreadsheet software that finds a line of best fit (Figure 2) and its associated prediction rule: on average, End of Year Score = Pre-Test Score + 20. Now, you can make approximate predictions for this year’s students based on their pre-test scores and use those predictions to help decide which students are well prepared for an advanced class.
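This prediction rule can be computed with a few lines of code. The scores below are hypothetical, constructed to follow the rule above; the fitting method is ordinary least squares, the same calculation a graphing calculator or spreadsheet performs.

```python
# Fit a line of best fit by least squares to hypothetical scores
# that follow the article's rule: End of Year = Pre-Test + 20.

def fit_line(xs, ys):
    # Ordinary least-squares fit: y = slope * x + intercept
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

pre_test = [40, 55, 60, 70, 75]
end_year = [60, 75, 80, 90, 95]   # each exactly 20 points higher

slope, intercept = fit_line(pre_test, end_year)
print(f"End of Year ≈ {slope:.1f} × Pre-Test + {intercept:.1f}")

# Predict for a new student with a pre-test score of 65
print(slope * 65 + intercept)  # 85.0
```

Real data would not fall exactly on the line, so the fitted rule would give approximate rather than exact predictions.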

On a larger scale, machine learning systems make predictions using more data and devise more complex prediction rules. For instance, imagine a district superintendent trying to predict future school enrollment as she plans her budget. She might have a lot of data about her district today — population demographics, test scores, the state of the economy, the number of students in charter and private schools — but no clear way to use that information to predict the future. With the help of a national dataset, she could use machine learning to predict changes over time: a program could determine that based on past data, combining certain variables in a particular formula provides a good prediction of future enrollment.

In practice, machine learning datasets are often enormous, allowing machine learning systems to detect subtle relationships among variables that aren’t easily visible to humans. For example, banks’ fraud detection programs decide whether each credit card sale is suspicious, learning from a dataset of billions of previous transactions. To suggest movies to you, Netflix identifies users among its 200 million subscribers who are similar to you (meaning they’ve watched many of the same shows and movies as you) and recommends their other selections. Astronomers use measurements from over a billion stars to locate potentially life-supporting exoplanets, and biotech researchers screen hundreds of millions of molecules for potential new antibiotics or anti-COVID drugs.

Computer Vision

One prominent area of machine learning is computer vision for processing images and videos. Computer vision is used to analyze and create artwork, help restore historical artifacts, build surveillance systems, recognize emotions, and more. For example, your iPhone might recognize your face to unlock with Face ID; a medical system can scan X-rays and MRIs to detect signs of illness; a self-driving car’s cameras constantly scan the road to identify parts of the world like lane markers, stoplights, and pedestrians so the car can safely respond.

Teachable Machine, like Quick Draw, is a computer vision demo. As you follow the tutorial, you can train a machine learning system to distinguish between different classes of images in your webcam: for example, recognizing your smile vs. your frown, or predicting whether a piece of trash should go in the recycling bin or the landfill. As you experiment, try investigating when the system does well and when it fails. What happens if you move closer to the camera, turn off the light, ask a friend to step in, or train with only a few webcam images?

Teachable Machine is learning rules for translating from an input — an image from your webcam — to an output — a predicted kind of image. You might notice that it represents its output numerically: for example, the system might show an 87% confidence that you’re currently smiling. These outputs are a probability, not a certainty; like any machine learning system, Teachable Machine can’t predict an outcome with perfect accuracy.

How does Teachable Machine learn its rules? First, it represents its input images numerically so it can apply mathematical operations to the data: each pixel gets assigned a number based on its brightness and color. Then it applies a machine learning algorithm that powers most of today’s computer vision systems: an artificial neural network.
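A small sketch of that first step, turning pixels into numbers: each pixel’s red, green, and blue values can be collapsed into a single brightness number. The 2×2 "image" below is made up, and the weights come from a standard luminance formula; real computer vision systems keep richer per-pixel color information.

```python
# Representing a tiny image numerically: each pixel's red, green,
# blue values (0-255) become one brightness number, using a
# standard luminance formula.

def brightness(r, g, b):
    # Weighted average reflecting how bright each color appears to the eye
    return 0.299 * r + 0.587 * g + 0.114 * b

# A hypothetical 2x2 image: each pixel is an (R, G, B) triple
image = [
    [(255, 255, 255), (0, 0, 0)],  # white, black
    [(255, 0, 0), (0, 0, 255)],    # red, blue
]

numeric = [[round(brightness(*px)) for px in row] for row in image]
print(numeric)  # [[255, 0], [76, 29]]
```

Once an image is a grid of numbers like this, the neural network described next can apply its mathematical operations to it.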

Neural Networks

While the above examples of drawing lines to fit data and recognizing simple drawings demonstrate some core principles of machine learning, most datasets have far more complicated patterns. Accordingly, most machine learning algorithms are more involved. Artificial neural networks, a particularly powerful and influential family of algorithms, are loosely inspired by the structure of the human brain.

Our brains are composed of about 85 billion special cells called neurons, arranged in an enormous, intricate network in which neurons are linked and can send signals to each other through synapses. Thoughts, decisions, and actions occur as neurons communicate with each other using chemical and electrical signals.

For example, when light reaches the eye, a neuron in the optic nerve sends an electrical signal to its neighbors. If the neighboring neurons get enough total input, they’ll also become activated and pass the signal on to their own neighbors, amplifying the message. Eventually, the message will get passed on to neurons that process visual information — first to neurons responsible for recognizing lines and edges, then to neurons for recognizing simple shapes, and then to neurons for identifying objects — and we’ll understand what we see. But if not enough neurons activate, the message will fade away. (This is essential because otherwise all our neurons would be active all the time, overwhelming the brain: an analogous overdrive happens during epileptic seizures.)

Figure 3. Connections between two neurons in the brain. (Image Source)

Through use over time, some connections between neurons grow stronger, letting messages travel more readily. Meanwhile, unused connections grow weaker. Shaping these connections is how learning happens: for instance, practicing an instrument might strengthen connections in the auditory cortex.

In an artificial neural network, programmers model this basic setup using mathematics instead of biological signals. It’s not an actual model of the human brain — neuroscientists don’t understand the brain well enough to model it precisely, even if we had the computational power — but it relies on some of the same basic principles. The system takes in data, analyzes it through a connected network representing levels of information in the data, and outputs a prediction: its best guess about the correct answer.

Here is a miniature representation of a neural network’s structure; a complete drawing would be much larger and more complex.

Figure 4. A diagram of a small artificial neural network. (Image Source)

Consider Teachable Machine as it learns to recognize a smile vs. a frown. An image of a face would get passed in on the left-hand side. Each pixel, represented as a number, would be a different input. Information would travel rightward through the network, and two outputs would come out on the right-hand side: the probability that the image shows a smile and the probability that it shows a frown.

Each circle in the diagram represents a neuron whose job is to perform a simple mathematical calculation. Each neuron gets information — numbers — from its neighbors on the left. As in the human brain, each neuron has a decision to make: based on its total input, it might pass on information to its neighbors on the right, or it might let the message fade away (and pass along a zero). Also, like in the human brain, the connections between different neurons have different strengths: some connections have higher weights, meaning neurons place more importance on information carried along those connections.
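The calculation inside one of those circles is simple enough to write out. This sketch uses made-up weights and the common ReLU activation (output the total if it is positive, otherwise zero), which is one standard way to model "pass the message on or let it fade away."

```python
# A single artificial neuron: multiply each input by its connection
# weight, add everything up, and either pass the signal onward or
# let the message fade away by outputting zero (ReLU activation).

def neuron(inputs, weights, bias):
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return max(0.0, total)  # positive totals pass on; others fade to zero

# Hypothetical weights: this neuron cares most about its second input
print(neuron([0.5, 0.9, 0.1], weights=[0.2, 1.5, -0.3], bias=-1.0))  # positive: signal passes
print(neuron([0.5, 0.9, 0.1], weights=[0.2, 1.5, -0.3], bias=-2.0))  # fades away: 0.0
```

A full network is just many of these neurons wired together in layers, each passing its output rightward as the next layer’s input.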

How does learning happen in an artificial neural network like Teachable Machine? Like in the brain, connections’ strengths change with practice. Initially, the weights start as meaningless random numbers. Then during training, the neural network learns by adjusting its weights to match a dataset. It takes a data example with a known outcome: one of the recorded webcam images. It makes a prediction using the current weights and then compares that prediction to the correct answer: did the network predict the right class for the image? Then the system tweaks the weights to make the prediction closer to the desired output. As the network repeatedly works through a large dataset of examples — often in the millions — it’s often able to learn to predict the outcome with high accuracy.
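The loop just described, predict, compare, tweak, can be shown in miniature. Here a single neuron with a sigmoid output learns an entirely made-up two-feature "smile vs. frown" dataset (feature 1 might be mouth-corner height, feature 2 brow furrow); real systems like Teachable Machine train far larger networks on real images, but the training loop has the same shape.

```python
# A toy training loop: start with random weights, predict, compare
# with the correct answer, and nudge the weights to reduce the error.

import math, random

random.seed(0)
weights = [random.uniform(-1, 1), random.uniform(-1, 1)]
bias = 0.0

# (features, label): 1 = smile, 0 = frown -- entirely made-up data
data = [([0.9, 0.1], 1), ([0.8, 0.2], 1), ([0.1, 0.9], 0), ([0.2, 0.8], 0)]

def predict(x):
    z = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1 / (1 + math.exp(-z))  # confidence (0 to 1) that this is a smile

for epoch in range(1000):            # repeatedly work through the dataset
    for x, label in data:
        p = predict(x)
        error = p - label            # compare prediction with correct answer
        for i in range(2):           # tweak each weight a little
            weights[i] -= 0.5 * error * x[i]
        bias -= 0.5 * error

print(round(predict([0.95, 0.05]), 2))  # high confidence: smile
print(round(predict([0.05, 0.95]), 2))  # low confidence: frown
```

After many passes through the data, the weights settle into values that map smiles to outputs near 1 and frowns to outputs near 0, which is exactly the kind of percentage confidence Teachable Machine displays.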

With a large enough network and dataset, lots of training time, and some clever algorithms, artificial neural networks are capable of solving extraordinarily complicated problems. Systems powered by deep learning — many-layered neural networks with specialized structures — can diagnose cancers better than human radiologists, interpret the world for self-driving cars, locate deforestation in satellite imagery, design efficient computer chips, and predict the effectiveness of new vaccines.

Different parts of the network learn to specialize in particular sub-tasks, much as different parts of the human brain have different responsibilities: one set of neurons in Teachable Machine might learn to recognize an upward curve, and another set to identify the mouth. A downside of these enormous networks is that it’s often difficult to understand why a network makes its decisions. We can inspect the weights and calculations of any individual neuron, but that doesn’t reveal how or why these particular weights work well.

Types of Machine Learning

The machine learning examples we’ve considered so far fall in the category of supervised learning: for training, a system is provided with inputs (e.g., cough recordings) and outputs (doctors’ diagnoses) from existing examples. Essentially, the algorithm’s job is to learn a rule that successfully maps the inputs to the outputs in existing examples, so it can then apply that rule to new examples.

A different category of machine learning is more open-ended. In unsupervised learning, a machine learning algorithm processes an unlabeled dataset and tries to find patterns rather than replicate existing “right answers.” One common example is clustering, which involves finding groups of similar data points. For instance, a social media service might find clusters of users who browse similar content and serve them the same news articles. Unsupervised learning can also be used for AI-powered creativity; an AI system can process a dataset of human faces, houses, or animals and then invent entirely new ones.
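Clustering can also be sketched briefly. The following is a minimal version of the common k-means algorithm; the points are made up (each could be a user described by two browsing-habit numbers), and the starting centers are chosen deterministically here, where real systems usually pick them at random.

```python
# Minimal k-means clustering: group unlabeled points into k clusters
# of similar items, with no "right answers" provided.

def kmeans(points, k, steps=10):
    # Start from k spread-out points as initial guesses for the centers
    centers = points[::max(1, len(points) // k)][:k]
    for _ in range(steps):
        # Assign each point to its nearest center...
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            clusters[i].append(p)
        # ...then move each center to the average of its cluster
        centers = [tuple(sum(q[d] for q in c) / len(c) for d in range(len(c[0])))
                   if c else centers[i] for i, c in enumerate(clusters)]
    return clusters

points = [(1, 1), (1.2, 0.8), (0.9, 1.1), (8, 8), (8.2, 7.9), (7.8, 8.1)]
for cluster in kmeans(points, k=2):
    print(cluster)  # the two natural groups emerge
```

Note that the algorithm is never told what the groups mean; it only discovers that some points resemble each other, which is the essence of unsupervised learning.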

Yet another category is reinforcement learning, which is used for training AI to navigate complex situations, like teaching a bot to play video games or a robotic arm to do simulated surgery. Here, the AI doesn’t start with an existing dataset. Instead, it experiments with taking different actions and seeing what rewards (such as winning the game) come from each. Over time, it forms its own dataset of what actions lead to the desired outcome in different scenarios. It’s similar to training a puppy: over time, the dog might learn that sitting down when it hears “Sit!” will lead to a treat.
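The puppy analogy can be made concrete with a bare-bones sketch. The reward probabilities below are invented and hidden from the agent, which simply tries actions, records the rewards, and gradually prefers the action with the best average reward (a simple "explore vs. exploit" strategy; real reinforcement learning systems handle far more complex situations).

```python
# Bare-bones reinforcement learning: try actions, record rewards,
# and learn to prefer the action with the best average payoff.

import random

random.seed(42)
actions = ["sit", "bark", "roll over"]
reward_chance = {"sit": 0.9, "bark": 0.1, "roll over": 0.3}  # unknown to the agent

totals = {a: 0.0 for a in actions}
counts = {a: 0 for a in actions}

def average(a):
    return totals[a] / max(counts[a], 1)

for step in range(1000):
    if step < len(actions) or random.random() < 0.1:
        action = random.choice(actions)        # explore: try something random
    else:
        action = max(actions, key=average)     # exploit: use the best action so far
    reward = 1 if random.random() < reward_chance[action] else 0
    totals[action] += reward
    counts[action] += 1

print(max(actions, key=average))  # the agent settles on 'sit'
```

Over time the agent builds its own dataset of action-reward experience, which is exactly what distinguishes reinforcement learning from the supervised and unsupervised settings above.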

AI for Language Processing

One particularly exciting sphere of AI is Natural Language Processing (NLP), which is concerned with using and processing written and spoken language — and even sign language. NLP includes everyday tools used in homes and schools like translation apps, search engines, auto-captioning on videos, and conversations with virtual assistants like Alexa and Siri. Language AI is improving rapidly, and AI-written articles and customer service bots are quickly becoming commonplace.

Conversations with AI have captured the popular imagination and filled science fiction for decades. In 1950, computing pioneer Alan Turing introduced the famous “Turing Test,” proposing that a way to measure whether a machine can “think” is to see whether it can hold a conversation so well that a human observer can’t tell whether their interlocutor is human or machine. Until the last few years, this has seemed like a pipe dream. Even newer technologies like Siri were impressive but clearly limited in their abilities: Siri could respond to “tell me a joke” using a pre-programmed list but wouldn’t be able to write its own humorous stories. Translation apps could help you find the restroom in a foreign country or get the gist of a news article, but they would hardly be appropriate for compiling a global poetry collection.

In recent years, a confluence of factors has led to revolutionary leaps in the capabilities of language processing technologies. One aspect is the increasing availability of language data from which AI systems can learn. The Internet provides an extraordinarily large repository of text in hundreds of languages: millions of Wikipedia articles that have been translated into many languages; uploads of books, movies, magazines, academic journals, and newspapers; forum posts, webpages, and social media that capture language from its most casual to its most formal. At the same time, powerful computer hardware has become exponentially more available and affordable, so machine learning systems can process these mountains of data. Large AI systems today are trained using billions of times more computing power than those in 2010, when deep learning started to become increasingly used in AI labs.

Since 2018, a new technology, called Transformer, has built on these developments in data and computing power, as well as advancements in algorithms. You might have encountered it with the recently released ChatGPT (Chat Generative Pre-trained Transformer). Powered by a modified neural network with a gigantic, specialized structure, Transformer learns from masses of unlabeled language data collected from the Internet. Although Transformer’s structure doesn’t resemble a biological brain very much, its size is beginning to: the current Transformer has the equivalent of 175 billion weights, about one-thousandth as many connections as the human brain (comparable to a small mammal). Given how quickly computing power is increasing, it’s possible that we’ll soon have AI models that are similar in complexity to the human brain.

During training, Transformer’s job isn’t to learn to predict something from the text, like generating a Hmong translation or deciding whether a sentence is true. Instead, Transformer needs to learn to predict the text itself: given this input text, what’s the next word? This task seems simple, but to do it successfully on a large scale, the AI needs to master many aspects of language and real-world knowledge. For example, to complete “Before you add the sugar, make sure the butter is completely ____,” the system needs knowledge of both grammar and baking.
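A miniature version of this training task can be built from word counts alone: record which word follows which in a toy corpus, then predict the most frequent follower. Transformer learns vastly richer patterns over much longer contexts, but the objective, predict the next word, is the same. The corpus here is made up.

```python
# Toy next-word prediction: count which word follows which,
# then predict the most common follower.

from collections import Counter, defaultdict

corpus = ("make sure the butter is soft . "
          "make sure the oven is hot . "
          "make sure the butter is soft .").split()

followers = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    followers[word][nxt] += 1

def predict_next(word):
    # Most common word seen after `word` in the training text
    return followers[word].most_common(1)[0][0]

print(predict_next("the"))     # 'butter' (seen twice, vs. 'oven' once)
print(predict_next("butter"))  # 'is'
```

Even this crude model has absorbed a sliver of knowledge from its data; scaled up billions of times, the same objective forces a system to absorb grammar, facts, and, as in the baking example, everyday common sense.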

What happens inside Transformer’s specialized structure? Transformer relies on clever approaches for processing language data, which (like visual data) need to be represented numerically. In the last decade, algorithms have emerged for generating numerical representations that capture the meanings of words, allowing machine learning systems to find richer and more complex relationships. For example, the representations of “hot” and “cold” will show that they are opposites, but both are related to “temperature.” These algorithms rely on context in existing datasets: if “broccoli” and “okra” tend to occur in similar sentences, like “Ingredients: Four cups of chopped ___,” algorithms can conclude that these words are similar in some way. The game Semantris lets you explore the words that these algorithms find to be related. Transformer builds on these innovations for representing language as numbers. While processing an input sentence, Transformer repeatedly modifies each word’s numerical representation to capture the role of a word inside that particular sentence, so “table” gets processed differently in “there was a vase of flowers on the table” vs. “I’d like to table that motion for tomorrow.”
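The broccoli-and-okra idea can be sketched with simple counts: represent each word by the words it co-occurs with, then compare those count vectors with cosine similarity. The three-sentence corpus is made up, and real systems learn dense vectors over billions of sentences, but the principle, similar contexts yield similar representations, is the same.

```python
# Toy context-based word representations: each word's vector is a
# count of the other words appearing in the same sentences.

from collections import Counter
import math

sentences = [
    "chop the broccoli and steam the broccoli",
    "chop the okra and steam the okra",
    "park the car and wash the car",
]

vectors = {}
for s in sentences:
    words = s.split()
    for w in set(words):
        vectors.setdefault(w, Counter())
        for other in words:
            if other != w:
                vectors[w][other] += 1

def cosine(a, b):
    # Similarity of two count vectors (1.0 = identical direction)
    dot = sum(a[k] * b[k] for k in a)
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b))

print(cosine(vectors["broccoli"], vectors["okra"]))  # high: similar contexts
print(cosine(vectors["broccoli"], vectors["car"]))   # lower: different contexts
```

Here "broccoli" and "okra" appear in identical contexts, so their vectors match; "car" shares only the function words, so it scores lower.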

Training a Transformer system is an enormous undertaking. If you tried to train the current state-of-the-art system on a single powerful computer, it would take hundreds of years and millions of dollars’ worth of energy. But once the system is trained — typically by a large tech company — it comes packed with linguistic knowledge and can easily be adapted for different uses with a little bit of extra training on examples of the task in question. Transformer can generate original and thoughtful-sounding journalism, functioning computer code, or poetry in the style of a particular poet. To its developers’ surprise, it can also successfully interpret and complete a prompt without ever being trained on the specific task. For example, Transformer models can complete requests like “write a song about ducks in the style of Leonard Cohen” or “give me a list of ideas for my edtech startup” — and sometimes the outputs will be of shockingly high quality.

ChatGPT is one of the latest and most publicized applications of Transformer technology. By default, it can be difficult to get the Transformer model to do what you want. Since the model is trained to continue rather than respond to its prompt, it takes some experimenting to find the phrasing that will produce the right output. ChatGPT adds a nicer interface where users can make requests directly in a conversational format, and the model will remember and respond to previous points in the discussion. Interestingly, ChatGPT’s developers built the system using a variant of reinforcement learning that relies on human feedback. Humans modeled conversational responses and rated the Transformer’s attempts to output the right kind of response, and the system adapted as it learned from the feedback.

Transformer-based systems — such as ChatGPT; DALL-E, which works with text and images; and Gato, which can perform a variety of physical tasks like playing video games and stacking blocks — have earned the name of Foundation Models to reflect their broad-ranging training data and resulting capacities. They’re far from perfect, as ChatGPT’s frequent fabricated and erroneous responses demonstrate, but they’re the closest thing yet to a human-like general artificial intelligence that can handle all sorts of tasks. New developments in AI happen every week, and we can expect these systems to keep improving and gaining impressive new capabilities.

Ethics and Social Impact

AI’s dazzling capacity has the potential to improve our world, but it also comes with alarming risks that can cause harm to individuals, vulnerable groups, and society overall.

One area of concern is privacy. Today’s machine learning systems are so powerful because they have access to tremendous amounts of data, and generally that information comes from humans. Language processing systems train on our emails; recommender systems track our tastes in YouTube videos and clothing sales; medical and financial AI learn from our personal records; advertisers customize offerings based on our locations, click patterns, and perceived demographic categories; autocratic governments use face recognition to follow and persecute political rivals; some schools use cameras to track whether children appear engaged during online classes. As users, it’s virtually impossible to know or control how our data are collected and used by tech companies and governments.

Another concern about AI is the prevalence of bias in machine learning systems. It’s easy to assume that computerized systems are neutral and objective, but when systems are designed by humans and trained on human-generated data, they absorb and replicate human biases. There are many examples: facial recognition systems (including those used by police to identify suspects) are much less accurate for people with darker skin, in part because most training data show people with light skin. AI systems intended to help judges set bail tend to falsely predict Black defendants will recidivate, reflecting existing racial disparities in arrests and sentencing. Resume screening systems and job advertisement portals have learned to perpetuate discrimination against women in hiring for tech jobs. Language processing systems can spew out hate speech that rivals the worst of the Internet, as well as systematically reproducing less dramatic but pervasive biases like assuming “the engineer” must be a “he.”

AI can also fabricate information, and do so convincingly. An authoritative-sounding Transformer text can include made-up facts, incorrect mathematical reasoning, and even imaginary citations; it’s dangerous to believe the model’s outputs or use it as a search engine without externally verifying its claims. Researchers are working on improving these models’ reliability and safety, but you might notice that the current safeguards are less than robust. For example, ChatGPT might decline to provide instructions for an illegal activity, but then describe the same process in detail when asked to write a fictional story on the topic.

More generally, AI systems can and will make mistakes: a medical diagnosis system or a self-driving car might be better than a human doctor or a human driver on average, but it will still sometimes make wrong decisions that lead to injuries and deaths. It’s hard to decide when a machine learning system is safe, robust, and trustworthy enough to deploy in practice, especially since neural networks can’t clearly explain the rationale for their decisions. Meanwhile, laws and regulations worldwide struggle to keep up: technology advances far more quickly than legislation and policies can be updated.

The centralization of AI systems to a few privately owned Transformer systems may only amplify these problems. Because of the size and expense of these systems, a small handful of companies or governments will likely control them. Since these powerful Foundation Models can be re-used in so many contexts, a baked-in bias, inaccuracy, or privacy violation can spread to technologies used in many domains.

There are other concerns about the ethics and social impact of AI. Automating jobs like truck driver, cashier, and customer service representative could push millions out of the workforce, even as many new jobs are created. The energy demands of today’s enormous AI systems can exacerbate climate change. Realistic AI-powered fake news, deep fakes, and Internet trolls contribute to misinformation and political polarization. AI-powered drones and weaponry facilitate long-distance warfare.

Hopefully, AI will end up transforming the world for the better: improving healthcare, science, art, and education; fighting poverty, inequality, and environmental destruction; making our daily lives easier and more convenient. But it’s just as possible that its effects will be largely negative, and everyone needs to be aware and thoughtful about its increasing role in our lives — especially the young people who will inherit these systems.

Decisions about AI in Education

Since the advent of personal computers, technological advances have required educators and policymakers to become informed, carefully consider options, and make decisions about the potential applications of digital technology in education. Advances in AI require that they continue this process to update prior decisions and make new ones about how these powerful technologies can serve students, teachers, schools, and communities. There are many questions educators need to address, which we divide into the three following categories.

Using AI to enhance current practices and better meet current goals:

  • How can the new advanced AI tools, such as those that write and translate across languages, support children’s learning? What AI-enhanced tools and resources, such as simulations, virtual reality environments, tutorials, education games, chatbots, and others, should be used in the curriculum, and in what ways?
  • How can teachers be prepared and supported to use AI effectively to enhance teaching and learning?
  • What devices and capacities should be provided in the classroom, labs, shops and other school spaces to enable productive uses of AI? What technology should every student have available at each grade level?
  • How can AI enable educators to obtain more useful and timely assessments of students’ learning through adaptive, formative and embedded assessments, combined with the use of data analytics to inform day-to-day instruction as well as longer-term curriculum planning?
  • How can we use AI to help address goals such as personalizing education, reducing inequalities, supporting students with special needs and learning differences, productively connecting with students’ families and communities, fostering cross-cultural understanding, and others?
  • How can AI be used to make the organization and management of schools more effective and efficient in areas such as scheduling, record keeping, budgeting, buses, communications, food services, and others?

Mitigating potential negative impacts of AI:

  • How do we address concerns that AI tools could be detrimental to learning, such as by enabling students to have the AI do their work for them?
  • What uses of technology to monitor students are appropriate and productive? Is there a place for facial recognition, location monitoring, internet use tracking, and analysis of students’ attention and feelings in our schools?
  • How can schools address the potential negatives of networked AI technologies? For example, how can they protect students’ privacy; prevent students from accessing inappropriate materials; help students evaluate online information — some of which is produced by AI and may be intended to deceive; and avoid technology serving more as a distraction than as an aid for learning?

Changing the goals and processes of teaching and learning:

  • How does the widespread use of technology impact what students need to learn? What prior curriculum content is now obsolete and what new content needs to be introduced? How should curriculum goals be updated to prepare students for the global, digital, AI-enabled world in which they will live?
  • How does AI impact the processes of teaching and learning, including the roles of teachers, expectations for students, and how students interact and collaborate with their teachers, classmates, and others outside their classrooms?
  • How can teachers and AI systems work together, each bringing unique capabilities, to support students’ learning and development, in ways that enhance teacher–student relationships so that teachers can focus on guiding and inspiring each student?

It is a challenging and exciting time in education, in which teachers, administrators, parents, researchers, policymakers, other community members, and the students themselves will need to work together to provide the education our students need and deserve to prepare for their futures in the AI-rich world in which they will — and already do — live.

Daniela Ganelin is a doctoral student and Glenn Kleiman is a Senior Advisor at the Stanford Graduate School of Education. Ganelin is the primary author of the AI technology sections of this paper and Kleiman is the primary author of the final AI in education section. The authors are solely responsible for the content of this paper, and it does not contain any AI-written text. The authors thank Barbara Treacy and Chris Mah for their helpful comments on prior drafts.


Glenn Kleiman is a Senior Advisor at the Stanford Graduate School of Education, where his work focuses on the potential of AI to enhance teaching and learning.