Needless to say, machine learning is powerful.
At the most basic level, machine learning algorithms can be used to classify things. Given a collection of cute animal pictures, a classifier can separate the pictures into buckets of ‘dog’ and ‘not a dog’. Given data about customers’ restaurant preferences, a classifier can predict which restaurant a customer will visit next.
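As a toy illustration of what “separating into buckets” means (a minimal sketch in plain Python; the feature vectors and labels are entirely made up, and real classifiers learn features rather than using hand-picked ones), a nearest-neighbor rule assigns each new example the label of its closest training example:

```python
# Toy 1-nearest-neighbor "dog vs. not a dog" classifier.
# Features and data points are hypothetical, for illustration only.

def euclidean(a, b):
    """Distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def classify(example, training_data):
    """Return the label of the training example closest to `example`."""
    nearest = min(training_data, key=lambda item: euclidean(item[0], example))
    return nearest[1]

# (ear_floppiness, snout_length) -- invented features on a 0..1 scale
training_data = [
    ((0.9, 0.8), "dog"),
    ((0.8, 0.7), "dog"),
    ((0.1, 0.2), "not a dog"),
    ((0.2, 0.1), "not a dog"),
]

print(classify((0.85, 0.75), training_data))  # -> dog
print(classify((0.15, 0.18), training_data))  # -> not a dog
```

The same one-nearest-neighbor rule would apply unchanged to the restaurant example: feature vectors describing a customer, labels naming restaurants.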
However, the role of humans in this technology is often overlooked. No matter how powerful a machine learning model is, it is useless if no one actually uses it. …
Bart. Elmo. Bert. Kermit. Marge. What do they have in common?
They’re all beloved fictional characters from TV shows many of us watched when we were young. But that’s not all — they’re also all AI models.
In 2018, researchers at the Allen Institute published the language model ELMo. The lead author, Matt Peters, said the team brainstormed many acronyms for their model, and ELMo instantly stuck as a “whimsical but memorable” choice.
What started out as an inside joke has become a full-blown trend.
Google AI followed with BERT, an incredibly powerful and now widely used Transformer-based language model…
In the last month, many US universities have announced their fall reopening plans.
Some universities have proposed a hybrid model, with some students returning to campus for in-person classes while others stay home, or with only a select number of small classes held in person. Others have proposed fully in-person or entirely virtual classes.
This week, Harvard announced that it plans to charge its regular $50K tuition even though all students will receive 100% virtual classes.
While I truly believe that virtual classes still have merit, there are many aspects that are difficult to replicate via remote or virtual education.
Even though it was created in 2009, ImageNet remains the most impactful dataset in computer vision and AI today. Consisting of more than 14 million human-annotated images, ImageNet has become the standard for all large-scale datasets in AI. ImageNet even hosts an annual competition (ILSVRC) to benchmark progress in the field.
There’s no denying ImageNet’s influence and importance in computer vision. However, with growing evidence of the biases that lie in AI models and datasets, we must approach the curation process with an awareness of ethics and social context in order to improve future datasets.
Recent discussion in the machine learning community has brought to light the necessity of understanding not just machine learning itself, but also the considerations of bias and fairness behind every algorithm’s use.
“This isn’t a call for ‘diversity’ in datasets or ‘improved accuracy’ in performance — it’s a call for a fundamental reconsideration of the institutions and individuals that design, develop, deploy this tech in the first place.” — Vidushi Marda
For newcomers to this field of fairness in AI, here is a compilation of helpful papers, books, and resources for learning more about the field and specific applications…
Crowdsourcing is widely used in machine learning as an efficient way to annotate datasets. Platforms like Amazon Mechanical Turk allow researchers to collect data or outsource the task of labelling training data to individuals all over the world.
However, crowdsourced datasets often contain significant social biases, such as gender or racial preferences and prejudices. Algorithms trained on these datasets then produce biased decisions as well.
In this short paper, researchers from Stony Brook University and IBM Research proposed a novel method to quantify bias in crowd workers:
Integrating counterfactuals into the crowdsourcing process is a new method…
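The paper’s exact method is not reproduced here, but one simple way to operationalize the counterfactual idea (an illustrative sketch, not the authors’ implementation; the task names and labels below are hypothetical) is to show a worker paired tasks that differ only in a sensitive attribute, such as the same sentence with gendered words swapped, and measure how often their label flips:

```python
# Illustrative sketch: estimate a crowd worker's bias as the rate at which
# their label flips between an original task and its counterfactual twin
# (e.g., the same text with gendered words swapped).
# All labels below are made up for illustration.

def flip_rate(labels_original, labels_counterfactual):
    """Fraction of paired tasks where the worker's label changed."""
    assert len(labels_original) == len(labels_counterfactual)
    flips = sum(a != b for a, b in zip(labels_original, labels_counterfactual))
    return flips / len(labels_original)

# One worker's labels on 5 original tasks and their counterfactual versions.
original       = ["hire", "reject", "hire", "hire",   "reject"]
counterfactual = ["hire", "hire",   "hire", "reject", "reject"]

print(flip_rate(original, counterfactual))  # -> 0.4
```

A flip rate near zero suggests the worker’s judgments are insensitive to the swapped attribute; a high rate flags the worker (or the task design) for closer review before their labels enter a training set.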
In recent years, jobs at all levels have come to require an understanding and use of technology. As a result, computer and digital literacy is the #1 entry-level skill needed in the job market.
Computer literacy allows us to engage with society — finding a job, ordering takeout, searching for an answer to a question — in ways previously unimaginable. Similarly, AI literacy is becoming increasingly necessary as artificial intelligence systems become more integrated into our daily lives.
Last week, OpenAI researchers announced the arrival of GPT-3, a language model that blew away its predecessor, GPT-2. GPT-2 was already widely regarded as the state of the art among language models; GPT-3 uses 175 billion parameters, more than 100x the 1.5 billion used by GPT-2.
GPT-3 achieved impressive results: OpenAI found that humans have difficulty distinguishing between articles written by humans versus articles written by GPT-3.
Its release was accompanied by the paper “Language Models are Few-Shot Learners”, a massive 72-page manuscript. …
Artificial intelligence is a flourishing field, and its presence in the K-12 classroom is growing too. This article compiles resources for introducing AI and, in particular, AI ethics in the K-12 setting:
When looking at fair and ethical algorithms, transparency is a key concept often brought up in discussion. But what exactly does it mean for a machine learning algorithm to be “transparent”?
Like “fairness” and “privacy”, it sounds important and useful, but the concept of transparency is quite ambiguous and is worth exploring in more detail.
When we seek transparent algorithms, we are asking for an understandable explanation of how they work. For example:
Have you ever read an article or watched…