Content Moderation of Diffusion Model Generated Images with Amazon Rekognition
In a previous post, we introduced a convenient process to deploy a Stable Diffusion model in a secure, scalable, and robust production environment. In this article, we use the Stable Diffusion model as an example to discuss a topic relevant to all generative AI technology: content moderation. We demonstrate a solution for that purpose with the help of Amazon Rekognition.
Content Moderation
With its stunning breakthroughs, generative AI is producing content that is not only capable of faking the real but also inspires humans to envision the never-imagined. While it uplifts productivity by bringing a science-fiction working style into reality, especially in the design industry (e.g., talking to a computer to produce and iterate on graphic content), it also raises questions that society has yet to think through, including copyright, ownership, transparency, explainability, and fairness. Another important topic is content moderation, which is the “process of detecting contributions that are irrelevant, obscene, illegal, harmful, or insulting with regards to useful or informative contributions…The purpose of content moderation is to remove or apply a warning label to problematic content or allow users to block and filter content themselves” (Wikipedia).
Case Study
Read the following content with caution.
Invoking the Stable Diffusion model's deployment endpoint while wearing a black hat, here is some of its output:
import boto3
import json

request_body = {
    "prompt": "human fighting with weapons",
    "number": 3,
    "num_inference_steps": 50,
}

# Serialize data for endpoint
payload = json.dumps(request_body)

client = boto3.client("sagemaker-runtime")
response = client.invoke_endpoint(
    EndpointName="endpoint_name_of_your_deployment",
    ContentType="application/json",
    Body=payload,
)
res = response["Body"].read()
One effective way to audit generative AI outputs is to use another AI. “Amazon Rekognition Content Moderation automates and streamlines your image and video moderation workflows using machine learning (ML), without requiring ML experience. Process millions of images and videos efficiently while detecting inappropriate or unwanted content, with fully managed APIs and customizable moderation rules to keep users safe and the business compliant”. In Python, this simply means:
rekognition = boto3.client("rekognition")

# Parse the endpoint response; json.loads is safer than eval
for img_encoded in json.loads(res)["images"]:
    response = rekognition.detect_moderation_labels(
        Image={"Bytes": image_to_byte(decode_image_bytes(img_encoded))},
        MinConfidence=50,
    )
    print(response)
Part of the audit result:
1. 'ModerationLabels': [{'Confidence': 68.06849670410156, 'Name': 'Suggestive', 'ParentName': ''}, {'Confidence': 68.06849670410156, 'Name': 'Barechested Male', 'ParentName': 'Suggestive'}, {'Confidence': 66.66549682617188, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 66.66549682617188, 'Name': 'Violence', 'ParentName': ''}, {'Confidence': 5.9070000648498535, 'Name': 'Physical Violence', 'ParentName': 'Violence'}]
2. 'ModerationLabels': [{'Confidence': 99.74349975585938, 'Name': 'Physical Violence', 'ParentName': 'Violence'}, {'Confidence': 99.74349975585938, 'Name': 'Violence', 'ParentName': ''}, {'Confidence': 76.18090057373047, 'Name': 'Suggestive', 'ParentName': ''}, {'Confidence': 76.18090057373047, 'Name': 'Barechested Male', 'ParentName': 'Suggestive'}, {'Confidence': 29.442501068115234, 'Name': 'Graphic Violence Or Gore', 'ParentName': 'Violence'}, {'Confidence': 20.382598876953125, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 9.823200225830078, 'Name': 'Self Injury', 'ParentName': 'Violence'}]
3. 'ModerationLabels': [{'Confidence': 72.16590118408203, 'Name': 'Suggestive', 'ParentName': ''}, {'Confidence': 72.16590118408203, 'Name': 'Barechested Male', 'ParentName': 'Suggestive'}, {'Confidence': 9.6358003616333, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 9.6358003616333, 'Name': 'Violence', 'ParentName': ''}]
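In practice, this label list needs to be turned into a go/no-go decision before serving the image to a user. Below is a minimal sketch of such a gate; the function name `is_image_allowed`, the blocked categories, and the 60% threshold are our own illustrative choices, not part of the Rekognition API:

```python
# Illustrative helper (not part of Rekognition): decide whether an image
# passes moderation, given the ModerationLabels list returned by
# detect_moderation_labels.
def is_image_allowed(
    moderation_labels,
    blocked_categories=("Violence", "Suggestive"),
    threshold=60.0,
):
    for label in moderation_labels:
        # Top-level labels have an empty ParentName; map child labels
        # (e.g. "Weapon Violence") to their top-level category
        category = label["ParentName"] or label["Name"]
        if category in blocked_categories and label["Confidence"] >= threshold:
            return False
    return True


# The second result above scores 99.7 for Violence, so it is rejected
print(is_image_allowed([
    {"Confidence": 99.74, "Name": "Physical Violence", "ParentName": "Violence"},
    {"Confidence": 99.74, "Name": "Violence", "ParentName": ""},
]))  # False
```

Tuning `MinConfidence` on the Rekognition side and the threshold here against a sample of your own traffic is advisable, since both false positives and false negatives carry costs.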
Two helper functions used above are defined as:
import io
import base64

import numpy as np
from PIL import Image


def decode_image_bytes(
    encoded_image,
    height: int = 512,
    width: int = 512,
    channel: int = 3,
):
    # Decode a base64-encoded pixel buffer back into a PIL image
    return Image.fromarray(
        np.reshape(
            np.frombuffer(
                base64.decodebytes(bytes(encoded_image, encoding="utf-8")),
                dtype=np.uint8,
            ),
            (height, width, channel),
        )
    )


def image_to_byte(image, image_format: str = "PNG"):
    # Serialize a PIL image into raw bytes for the Rekognition API
    buffer = io.BytesIO()
    image.save(buffer, format=image_format)
    return buffer.getvalue()
More Examples
1. 'ModerationLabels': [{'Confidence': 19.631799697875977, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 19.631799697875977, 'Name': 'Violence', 'ParentName': ''}, {'Confidence': 5.143799781799316, 'Name': 'Physical Violence', 'ParentName': 'Violence'}]
2. 'ModerationLabels': [{'Confidence': 94.7073974609375, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 94.7073974609375, 'Name': 'Violence', 'ParentName': ''}]
3. 'ModerationLabels': [{'Confidence': 90.48500061035156, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 90.48500061035156, 'Name': 'Violence', 'ParentName': ''}]
1. 'ModerationLabels': [{'Confidence': 93.2550048828125, 'Name': 'Smoking', 'ParentName': 'Tobacco'}, {'Confidence': 93.2550048828125, 'Name': 'Tobacco', 'ParentName': ''}, {'Confidence': 5.313300132751465, 'Name': 'Tobacco Products', 'ParentName': 'Tobacco'}]
2. 'ModerationLabels': [{'Confidence': 95.38349914550781, 'Name': 'Drug Products', 'ParentName': 'Drugs'}, {'Confidence': 95.38349914550781, 'Name': 'Drugs', 'ParentName': ''}]
3. 'ModerationLabels': [{'Confidence': 44.850399017333984, 'Name': 'Drinking', 'ParentName': 'Alcohol'}, {'Confidence': 44.850399017333984, 'Name': 'Alcohol', 'ParentName': ''}, {'Confidence': 5.820499897003174, 'Name': 'Smoking', 'ParentName': 'Tobacco'}, {'Confidence': 5.820499897003174, 'Name': 'Tobacco', 'ParentName': ''}, {'Confidence': 5.159800052642822, 'Name': 'Alcoholic Beverages', 'ParentName': 'Alcohol'}]
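When moderating batches of generated images like these, it can help to aggregate the per-image responses into a single report of which categories appeared. Here is a sketch of one way to do that; the `summarize_moderation` helper is our own, assuming the response shape shown above:

```python
from collections import defaultdict


# Illustrative aggregation over several detect_moderation_labels responses:
# keep the highest confidence observed for each top-level category.
def summarize_moderation(responses):
    summary = defaultdict(float)
    for resp in responses:
        for label in resp["ModerationLabels"]:
            # Roll child labels up to their top-level category
            category = label["ParentName"] or label["Name"]
            summary[category] = max(summary[category], label["Confidence"])
    return dict(summary)


report = summarize_moderation([
    {"ModerationLabels": [
        {"Confidence": 93.26, "Name": "Smoking", "ParentName": "Tobacco"},
        {"Confidence": 93.26, "Name": "Tobacco", "ParentName": ""},
    ]},
    {"ModerationLabels": [
        {"Confidence": 95.38, "Name": "Drug Products", "ParentName": "Drugs"},
        {"Confidence": 95.38, "Name": "Drugs", "ParentName": ""},
    ]},
])
print(report)  # {'Tobacco': 93.26, 'Drugs': 95.38}
```

Such a summary makes it easy to log or dashboard which moderation categories your model tends to produce over time.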
Conclusion
Content moderation is a broad topic that affects any generative AI technology. In this post, a Diffusion model was used as a case study, and we demonstrated how to audit its output with Amazon Rekognition with minimal effort. All the sample code is available on GitHub.