Content Moderation of Diffusion Model Generated Images with Amazon Rekognition

Baichuan Sun
4 min read · Nov 2, 2022


Text Prompt: “Double bladed Sword”, generated by Stable Diffusion v1-5.

In a previous post, we introduced a convenient process to deploy the Stable Diffusion model in a secure, scalable, and robust production environment. In this article, we use the Stable Diffusion model as an example to discuss a topic that applies to generative AI technology in general, content moderation, and demonstrate a solution for that purpose with help from Amazon Rekognition.

Content Moderation

With its stunning breakthroughs, generative AI is producing content that is not only capable of faking the real, but also inspires humans to envision what was never imagined. While it lifts productivity by bringing a science-fiction working style into reality, especially in the design industry (e.g., talking to a computer to produce and iterate on graphic content), it also raises questions that society has not yet thought through, including copyright, ownership, transparency, explainability, and fairness. Another important topic is content moderation, which is the “process of detecting contributions that are irrelevant, obscene, illegal, harmful, or insulting with regards to useful or informative contributions…The purpose of content moderation is to remove or apply a warning label to problematic content or allow users to block and filter content themselves” (Wikipedia).

Case Study

Read the following content with caution.

Using the Stable Diffusion model deployment endpoint, and putting on a black hat, here is some of its output:

import boto3
import json

request_body = {
    "prompt": "human fighting with weapons",
    "number": 3,
    "num_inference_steps": 50,
}

# Serialize data for the endpoint
payload = json.dumps(request_body)

client = boto3.client("sagemaker-runtime")
response = client.invoke_endpoint(
    EndpointName="endpoint_name_of_your_deployment",
    ContentType="application/json",
    Body=payload,
)
res = response["Body"].read()
Text Prompt: “human fighting with weapons”, 44.5s
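
The endpoint responds with a JSON body whose "images" field holds each generated image as a base64-encoded string. As a quick sanity check, here is a minimal sketch that saves the images locally; the file names are illustrative, and decode_image_bytes is one of the helper functions defined further below:

for i, img_encoded in enumerate(json.loads(res)["images"]):
    # Decode the base64 pixel buffer back into a PIL image and save it
    decode_image_bytes(img_encoded).save(f"generated_{i}.png")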

One effective way to audit generative AI outputs is to use another AI. “Amazon Rekognition Content Moderation automates and streamlines your image and video moderation workflows using machine learning (ML), without requiring ML experience. Process millions of images and videos efficiently while detecting inappropriate or unwanted content, with fully managed APIs and customizable moderation rules to keep users safe and the business compliant.” In Python, this simply means:

rekognition = boto3.client("rekognition")

for img_encoded in json.loads(res)["images"]:
    response = rekognition.detect_moderation_labels(
        Image={"Bytes": image_to_byte(decode_image_bytes(img_encoded))},
        MinConfidence=50,
    )
    print(response)

Part of the audit result:

1. 'ModerationLabels': [{'Confidence': 68.06849670410156, 'Name': 'Suggestive', 'ParentName': ''}, {'Confidence': 68.06849670410156, 'Name': 'Barechested Male', 'ParentName': 'Suggestive'}, {'Confidence': 66.66549682617188, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 66.66549682617188, 'Name': 'Violence', 'ParentName': ''}, {'Confidence': 5.9070000648498535, 'Name': 'Physical Violence', 'ParentName': 'Violence'}]
2. 'ModerationLabels': [{'Confidence': 99.74349975585938, 'Name': 'Physical Violence', 'ParentName': 'Violence'}, {'Confidence': 99.74349975585938, 'Name': 'Violence', 'ParentName': ''}, {'Confidence': 76.18090057373047, 'Name': 'Suggestive', 'ParentName': ''}, {'Confidence': 76.18090057373047, 'Name': 'Barechested Male', 'ParentName': 'Suggestive'}, {'Confidence': 29.442501068115234, 'Name': 'Graphic Violence Or Gore', 'ParentName': 'Violence'}, {'Confidence': 20.382598876953125, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 9.823200225830078, 'Name': 'Self Injury', 'ParentName': 'Violence'}]
3. 'ModerationLabels': [{'Confidence': 72.16590118408203, 'Name': 'Suggestive', 'ParentName': ''}, {'Confidence': 72.16590118408203, 'Name': 'Barechested Male', 'ParentName': 'Suggestive'}, {'Confidence': 9.6358003616333, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 9.6358003616333, 'Name': 'Violence', 'ParentName': ''}]
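
In a real pipeline you would act on these labels rather than just print them, for example by blocking any image whose labels hit a configured policy. Here is a minimal sketch; the blocked categories and the 60% threshold are illustrative assumptions to be tuned per use case, not Rekognition defaults:

BLOCKED_CATEGORIES = {"Violence", "Suggestive"}  # assumed policy, adjust as needed
BLOCK_THRESHOLD = 60.0  # assumed confidence threshold

def is_blocked(moderation_labels) -> bool:
    # Block when any label (or its parent category) in the policy set
    # meets the confidence threshold.
    return any(
        label["Confidence"] >= BLOCK_THRESHOLD
        and (
            label["Name"] in BLOCKED_CATEGORIES
            or label["ParentName"] in BLOCKED_CATEGORIES
        )
        for label in moderation_labels
    )

safe_images = [
    img_encoded
    for img_encoded in json.loads(res)["images"]
    if not is_blocked(
        rekognition.detect_moderation_labels(
            Image={"Bytes": image_to_byte(decode_image_bytes(img_encoded))},
            MinConfidence=50,
        )["ModerationLabels"]
    )
]

Under this example policy, all three audited images above would be rejected: the first and third exceed the threshold on Suggestive, the second on Physical Violence.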

The two helper functions used above are defined as follows:

import base64
import io

import numpy as np
from PIL import Image


def decode_image_bytes(
    encoded_image: str,
    height: int = 512,
    width: int = 512,
    channel: int = 3,
):
    # Decode the base64 string into a raw uint8 buffer, then reshape it
    # into a (height, width, channel) array and wrap it as a PIL image.
    return Image.fromarray(
        np.reshape(
            np.frombuffer(
                base64.decodebytes(bytes(encoded_image, encoding="utf-8")),
                dtype=np.uint8,
            ),
            (height, width, channel),
        )
    )


def image_to_byte(image: Image.Image, image_format: str = "PNG"):
    # Re-encode the PIL image into in-memory PNG bytes for Rekognition.
    buffer = io.BytesIO()
    image.save(buffer, format=image_format)
    return buffer.getvalue()
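
Note that decode_image_bytes assumes the deployment endpoint returns raw, uncompressed 512 x 512 RGB pixel buffers encoded in base64. If your endpoint already returns PNG- or JPEG-encoded bytes instead, you could pass those to Rekognition directly and skip the NumPy reshape entirely.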

More Examples

1. 'ModerationLabels': [{'Confidence': 19.631799697875977, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 19.631799697875977, 'Name': 'Violence', 'ParentName': ''}, {'Confidence': 5.143799781799316, 'Name': 'Physical Violence', 'ParentName': 'Violence'}]
2. 'ModerationLabels': [{'Confidence': 94.7073974609375, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 94.7073974609375, 'Name': 'Violence', 'ParentName': ''}]
3. 'ModerationLabels': [{'Confidence': 90.48500061035156, 'Name': 'Weapon Violence', 'ParentName': 'Violence'}, {'Confidence': 90.48500061035156, 'Name': 'Violence', 'ParentName': ''}]
Text Prompt: “human carrying ak47 with mask”, 44.8s
1. 'ModerationLabels': [{'Confidence': 93.2550048828125, 'Name': 'Smoking', 'ParentName': 'Tobacco'}, {'Confidence': 93.2550048828125, 'Name': 'Tobacco', 'ParentName': ''}, {'Confidence': 5.313300132751465, 'Name': 'Tobacco Products', 'ParentName': 'Tobacco'}]
2. 'ModerationLabels': [{'Confidence': 95.38349914550781, 'Name': 'Drug Products', 'ParentName': 'Drugs'}, {'Confidence': 95.38349914550781, 'Name': 'Drugs', 'ParentName': ''}]
3. 'ModerationLabels': [{'Confidence': 44.850399017333984, 'Name': 'Drinking', 'ParentName': 'Alcohol'}, {'Confidence': 44.850399017333984, 'Name': 'Alcohol', 'ParentName': ''}, {'Confidence': 5.820499897003174, 'Name': 'Smoking', 'ParentName': 'Tobacco'}, {'Confidence': 5.820499897003174, 'Name': 'Tobacco', 'ParentName': ''}, {'Confidence': 5.159800052642822, 'Name': 'Alcoholic Beverages', 'ParentName': 'Alcohol'}]
Text Prompt: “man smoking cigarette in casino while drinking alcoholic beverages”, 44.5s

Conclusion

Content moderation is a broad topic that affects any generative AI technology. In this post, the Stable Diffusion model was used as a case study, and we demonstrated how to audit its output with help from Amazon Rekognition with minimal effort. All the sample code is available on GitHub.
