Solving Bias AI through Synthetic Data Facial Generation

Published in

Understanding the impact of AI in our Daily lives

4 min read · Dec 10, 2018

Facial recognition technology is becoming commonplace in our day-to-day lives.

From Facebook identifying your friends in images to banks using basic facial recognition to make their branches safer, advances in facial recognition are seamlessly integrating into every aspect of our physical and digital experiences.

Facial recognition, if used effectively, can enhance the retail experience, improve security protocols, and bring people closer together in the digital space. “AI holds significant power to improve the way we live and work, but only if AI systems are developed and trained responsibly, and produce outcomes we trust,” write IBM fellows Aleksandra Mojsilovic and John R. Smith. “Making sure that the system is trained on balanced data, and rid of biases is critical to achieving such trust.”

With the rise of facial recognition technology come the inherent shortcomings of current systems.

The best known of these is termed Bias AI: the process by which an algorithm learns and reinforces preferences or preferred responses based on the limitations of a given dataset.

When a dataset under-represents certain ages, races, or genders, the outcomes the AI predicts will be skewed in one direction or another, depending on how robust the data fed into the system is.

Whether the AI algorithms are themselves biased is also an open question.

“[Machine-learning algorithms] haven’t been optimized for any definition of fairness,” says Deirdre Mulligan, associate professor, UC Berkeley School of Information. “They have been optimized to do a task.”

The question then becomes, what steps can be taken to mitigate the impact of Bias AI?

With facial recognition technology evolving at such a rapid pace, there are many areas of concern that must be addressed quickly.

To close the data gaps that degrade the quality of applied AI, new approaches are needed to mitigate Bias AI and build more representative algorithms. In turn, these algorithms must be trained on more diverse and representative datasets.

“If the training sets aren’t really that diverse, any face that deviates too much from the established norm will be harder to detect, which is what was happening to me,” explains Joy Buolamwini, a graduate researcher at MIT Media Lab and founder of Algorithmic Justice League and Code4Rights. “Training sets don’t just materialize out of nowhere. We actually can create them. So there’s an opportunity to create full-spectrum training sets that reflect a richer portrait of humanity.”

Synthetic Data and Facial Generation

Through the use of Synthetic Data (SD), many of the structural biases of AI can be addressed. The chief cause of Bias AI is a non-representative or limited dataset that contains too few viable images of some groups. With SD, images that fill these gaps can be generated by a generative adversarial network (GAN) and incorporated into a larger, authentic dataset. The resulting dataset, part authentic and part synthetic, fills holes in the original data, such as a lack of women or African Americans, with SD-generated images.
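To make the gap-filling idea concrete, here is a minimal Python sketch of how one might work out how many synthetic images each group needs before any generation happens. The function name `synthetic_quota`, the group labels, and the counts are all hypothetical and chosen only for illustration; the sketch says nothing about how a GAN would actually produce the images.

```python
from collections import Counter


def synthetic_quota(labels, target=None):
    """Return how many synthetic samples each group needs to reach `target`.

    `labels` holds one (hypothetical) demographic label per authentic image.
    If `target` is omitted, every group is topped up to the largest group.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    if target is None:
        target = max(counts.values())
    # Groups already at or above the target need no synthetic images.
    return {group: max(target - n, 0) for group, n in counts.items()}


# Illustrative authentic dataset: heavily skewed toward one group.
authentic = ["group_a"] * 700 + ["group_b"] * 200 + ["group_c"] * 100
quota = synthetic_quota(authentic)
print(quota)  # {'group_a': 0, 'group_b': 500, 'group_c': 600}
```

Topping every group up to the largest one is only one possible balancing rule. A target that matches a real-world population distribution could be passed in through `target` or handled with a per-group variant.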

What advantages does Synthetic Data offer over manual methods?

● Effective facial recognition system for all geographies and facial expressions

● Adaptation for ongoing cosmetic changes (shaved beards, alternate makeup, alternate haircuts, etc.)

● Cost-effectiveness

● Expedited development time

● Scalability

● Avoidance of privacy issues

● Exclusive, one-of-a-kind dataset

● Creation of a searchable database of 3D images

What does Synthetic Data bring to the party?

Guided by industry leaders such as Neuromation, Synthetic Data providers are changing what is possible in the analysis, optimization, and transformation of datasets. With the use of GANs and the Synthetic Data Calibration Process, developers are able to generate multiple angles of an image from a single input, manipulate attributes and facial emotions, downgrade the resolution so that an image appears to come from a CCTV stream, and generate 3D models from a single input of pictures.
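Of the capabilities listed above, resolution downgrading is the simplest to illustrate. The sketch below is an assumption-laden toy, not any provider's actual pipeline: it treats an image as a plain list of rows of grayscale values and lowers the resolution by averaging square blocks of pixels. The function name `downsample` and the sample frame are hypothetical. A realistic CCTV simulation would likely also add noise, blur, and compression artifacts using an image library.

```python
def downsample(image, factor):
    """Lower the resolution of a grayscale image by block averaging.

    `image` is a list of equal-length rows of integer pixel values.
    Each `factor` x `factor` block becomes one pixel holding the block's
    integer mean; leftover rows or columns at the edges are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    if not image:
        return []
    height = len(image) // factor * factor
    width = len(image[0]) // factor * factor
    result = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) // len(block))
        result.append(row)
    return result


# Illustrative 4x4 frame reduced to 2x2.
frame = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [50, 60, 70, 80],
]
low_res = downsample(frame, 2)
print(low_res)  # [[15, 35], [55, 75]]
```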

How does this help resolve Bias AI?

“Identifying and mitigating bias in AI systems is essential to building trust between humans and machines that learn. As AI systems find, understand, and point out human inconsistencies in decision making, they could also reveal ways in which we are partial, parochial, and cognitively biased, leading us to adopt more impartial or egalitarian views. Through the use of SD, countless facial permutations can be generated and analyzed to vastly expand the size of datasets. In this process, the SD is able to mimic minority representation to create a more accurate model, capable of identifying a broad range of ethnic and geo-cultural segmentation points.”

Synthetic Data acts as the essential filler. SD optimizes authentic datasets by combining them with accurately produced, photorealistic synthetic data points to create a new, more expansive and unbiased dataset to train your AI on.

Whereas previous models were limited to the often small and homogeneous datasets accessible to practitioners, Synthetic Data makes it possible to morph and hybridize existing images to generate new content, filling in gaps to counterbalance systematic prejudices.


Written by Yehudah Sunshine