From content moderation to zero-shot classification

Sganesh
5 min read · Aug 14, 2023


What if we wanted to analyze a small piece of text, with no additional information or context, and get the most reasonable label from a set we define for our own data? Such labels can feed the more deterministic policy engines and rule engines, or become part of a larger context-driven analysis as required. OpenAI does provide a means to ‘content moderate’ against preset classifications that determine whether your text belongs to one or more of the viler categories. This analysis, however, is about getting more custom: defining our own labels and scoring a given sentence or phrase against them.

We will look at four categories: Politics, PHI/PII, Legal matters, and Company performance. Given that we don’t (at this point in time) have the option of gathering probability scores from OpenAI on such custom labels, we will try the more user-oriented prompt-engineering route in Option 1, while Option 2 evaluates pretrained models from Hugging Face for the same task.

We will also go with some sample sentences that have been deliberately twisted to align with more than one category. For example, our CSV input file has the following lines as ‘payload’.

  1. The issue between ministers took a tangent when they started making it personal.
  2. I tried to negotiate data privacy with my cat but he just ignored me and hacked my keyboard for a nap.
  3. The senate hearing was about whether a drug in trials could be used for this patient alone. He has a specific condition with his blood that does not have a medicine as yet.
  4. What started as a political debate ended up discussing company priorities for 2023 and beyond in terms of who has a better story with hyperscalers.
  5. The court’s landmark decision on free speech ignited discussions on the fine line between expression and harmful content in online platforms- intertwining legal considerations with debates over online governance.
  6. I told my doctor a political joke during my PHI checkup now my medical record reads: Patient’s sense of humor: dangerously bipartisan.
  7. User managed access gives you the so called benefit of controlling your identity but then how many people scrutinize the app permissions on your phone that leverage first name-email-phone numbers?
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains.llm import LLMChain
import pandas as pd

model_name = 'gpt-4'
llm = ChatOpenAI(model_name=model_name, temperature=0)

# Define the prompt before wiring it into the chain
moderationPrompt = PromptTemplate(
    template="""
Please rate the article below on a continuous scale 0.00-100.00 based on the presence and applicability of each category:
[ politics | PHI | legal | about company | none of these ]
Definitions:
phi: protected health information or personally identifiable information present
politics: political decisions, governance, parties, elections, policies
legal: agreement or contract language, judgements
about company: company strategy or earnings or reports or predictions
article: {payload}
Output: python floats in square bracket list
""",
    input_variables=["payload"],
)
payload_chain = LLMChain(llm=llm, prompt=moderationPrompt)

def read_csv_file(file_path):
    df = pd.read_csv(file_path)
    lines = df["payload"].dropna().tolist()
    return lines

def perform_OpenAIclassification(lines, model_name):
    classifications = []
    for idx, sentence in enumerate(lines, start=1):  # line numbers start from 1
        if pd.notna(sentence):
            result = payload_chain.run(payload=sentence)
            # "[85.0, 0.0, 0.0, 10.0, 0.0]" -> list of floats
            scores = [float(x) for x in result.strip('][').split(',')]
            classifications.append([idx, model_name] + scores)
    return classifications

if __name__ == "__main__":
    input_csv_file = "./input.csv"  # Replace with your CSV file path
    lines = read_csv_file(input_csv_file)
    result = perform_OpenAIclassification(lines, model_name)
    dfr = pd.DataFrame(result, columns=['line#', 'model', 'Politics', 'PHI/PII', 'Legal', 'About Company', 'None of these'])
    output_csv_file = "gptOutput.csv"
    dfr.to_csv(output_csv_file, index=False)
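One caveat: the chain returns the scores as a plain string such as "[85.0, 0.0, 0.0, 10.0, 0.0]", and a strip/split parse breaks if the model adds any surrounding chatter. A slightly more defensive parse, sketched here (the helper name and padding behavior are my own choices, not part of the original code), could look like:

```python
import re

def parse_scores(raw: str, expected: int = 5) -> list[float]:
    """Pull floats out of GPT's bracketed reply; pad or trim to the expected count."""
    nums = [float(x) for x in re.findall(r"-?\d+(?:\.\d+)?", raw)]
    nums = nums[:expected]                  # drop any extra numbers or chatter
    nums += [0.0] * (expected - len(nums))  # pad if the model returned fewer
    return nums
```

For example, `parse_scores("[85.0, 0.0, 0.0, 10.0, 0.0]")` yields the five floats intact, and a short or noisy reply still produces a row of the right width for the dataframe.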

GPT-4 seems to be slightly better than its 3.5-turbo cousin at these twisted sentences. The output dataframe looks like the one below. It does get the largest probability right most of the time, but misses on sentences like #3, where we would have expected some percentage to be associated with PHI/PII. It also makes a case for requesting OpenAI to provide some customization convenience for tagging our own labels, so we can leverage the faster and more ‘well read’ capability of such models.

GPT output

Note — for the Hugging Face pipelines, the multi_label value is set to True. You could play around with it being False as well.
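For context on what that flag does: Hugging Face’s zero-shot pipeline reframes each candidate label as an NLI hypothesis. With multi_label=True, each label is scored independently by softmaxing its entailment vs. contradiction logits; with multi_label=False, the entailment logits are softmaxed across all labels so the scores sum to 1. A minimal sketch of that scoring step (dummy logits, not a real model call):

```python
import math

def softmax(xs):
    m = max(xs)  # shift for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Dummy (contradiction, entailment) logits per label -- a real pipeline
# would get these from an NLI model such as bart-large-mnli.
logits = {"politics": (-1.0, 2.0), "PHI": (1.5, -0.5), "legal": (0.0, 0.5)}

# multi_label=True: each label scored independently (entailment vs contradiction)
multi = {lbl: softmax(pair)[1] for lbl, pair in logits.items()}

# multi_label=False: entailment logits softmaxed across labels; scores sum to 1
single = dict(zip(logits, softmax([pair[1] for pair in logits.values()])))
```

With multi_label=True, several labels can score high at once, which suits our deliberately twisted multi-category sentences; with False, the labels compete for a single probability mass.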

Let us also use our own human expertise to review this output (last column). We could use a simple index like this:

  • Reasonable — the engine picked the multiple labels accurately
  • Partially accurate — only one of the two or more applicable labels was picked
  • Inaccurate — obviously not as good
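Tallying that human review per model is a one-line pivot in pandas. A sketch with made-up review rows (the data here is illustrative, matching the verdict labels above):

```python
import pandas as pd

# Hypothetical review sheet: one row per (model, sentence) with a human verdict
reviews = pd.DataFrame({
    "model": ["gpt-4", "gpt-4", "bart-large-mnli", "bart-large-mnli"],
    "review": ["Reasonable", "Partially accurate", "Reasonable", "Inaccurate"],
})

# Count verdicts per model -> the summary pivot
pivot = reviews.pivot_table(index="model", columns="review",
                            aggfunc="size", fill_value=0)
```

Each cell then holds the number of sentences a model handled at that accuracy level, which is all the comparison table below needs.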
From the Hugging Face models

The dataset is too small to derive a concrete outcome, but the models all seem to land in a relatively comparable space for this task.

Summary pivot

Summary

Large language models are one-size-fits-all for many purposes. For scenarios where we have very little context to lean on and custom labels are required for zero-shot classification, we still have the option of alternatives trained for the more special-purpose NLI (natural language inference) task, such as those given above. The final choice for a given requirement could be based on performance (when used in real-time transactions), the extent of additional context that can effectively make this more deterministic, and ease of integration in a given ecosystem.

Note — a special word of thanks to those in forums who have corrected my code or shared suggestions on how to use these models better. Specifically, someone on the OpenAI forum shared the intuition on how best to query GPT for results that are not otherwise available through API calls.

References

https://joeddav.github.io/blog/2020/05/29/ZSL.html

https://huggingface.co/datasets/multi_nli

https://huggingface.co/facebook/bart-large-mnli

https://huggingface.co/valhalla/distilbart-mnli-12-3

https://haystack.deepset.ai/blog/the-definitive-guide-to-bertmodels

https://ai.plainenglish.io/mastering-gpt-3-the-mathematics-of-logprobs-for-ruby-devs-1eb55fc1326
