Deploying Your Own Embedder Model on watsonx Platform Using IBM Watson Machine Learning — featuring Thai Language

Mew Leenutaphong
Published in ibm-watsonx-th · May 31, 2024

In the world of Retrieval-Augmented Generation (RAG) applications, embedder models are crucial. These models convert text into dense vectors, enabling effective retrieval and comparison of information. While numerous embedder models are available, not all fit every use case. For instance, many cloud services offer only a limited selection of embedder models, which may not be localized or may not generalize well across languages. Hence, the ability to deploy your own embedder model becomes essential.
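For intuition, a retriever compares these dense vectors with a similarity measure, most commonly cosine similarity. Here is a minimal sketch with toy four-dimensional vectors; a real embedder such as simcse-model-phayathaibert produces much higher-dimensional vectors (768 for BERT-based models):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product of the two vectors
    # divided by the product of their L2 norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" for a query and a document; in practice these
# would come from a call like SentenceTransformer.encode(sentences).
query_vec = np.array([0.2, 0.8, 0.1, 0.4])
doc_vec = np.array([0.3, 0.7, 0.0, 0.5])
print(cosine_similarity(query_vec, doc_vec))  # close to 1.0 => similar
```

A RAG retriever ranks candidate passages by this score against the query embedding and returns the top matches.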

In this article, we will explore how to deploy a custom embedder model, specifically the simcse-model-phayathaibert created by kornwtp, on the watsonx platform using Watson Machine Learning (WML) deployments.

Steps to Deploy Your Embedder Model

Let’s dive into the steps involved in deploying simcse-model-phayathaibert on the watsonx platform using Watson Machine Learning.

Prerequisites

  • IBM Cloud account
  • Watson Machine Learning service instance
  • Hugging Face model (simcse-model-phayathaibert)

Step 1: Install Required Libraries

First, ensure you have the necessary libraries installed in your environment:

!pip install sentence-transformers==3.0.0
!pip install -U ibm-watson-machine-learning

Step 2: Download and Prepare the Model

Define the deployable function that wraps the embedder model. WML stores an outer function that loads the model once; the inner `score` function it returns is what gets called for each scoring request:

import numpy as np
import requests
from ibm_watson_machine_learning import APIClient

def my_embedding_function():
    from sentence_transformers import SentenceTransformer
    model_name = 'kornwtp/simcse-model-phayathaibert'  # Replace with the desired model name
    try:
        model = SentenceTransformer(model_name)
    except Exception as e:
        return {"error": str(e)}

    def score(payload):
        # We assume only one batch is sent
        sentences = payload['input_data'][0]['values'][0]
        try:
            embeddings = model.encode(sentences)
            return {
                'predictions': [
                    {
                        'fields': ['sentence', 'embedding'],
                        'values': [[sentence, embedding.tolist()] for sentence, embedding in zip(sentences, embeddings)]
                    }
                ]
            }
        except Exception as e:
            return {"error": str(e)}

    return score

# Example usage:
embedding_function = my_embedding_function()

# Example payload: two Thai sentences
# ("A group of men playing football on the beach" /
#  "A group of boys are playing football on the beach")
payload = {
    'input_data': [
        {
            'values': [
                ["กลุ่มผู้ชายเล่นฟุตบอลบนชายหาด", "กลุ่มเด็กชายกำลังเล่นฟุตบอลบนชายหาด"]
            ]
        }
    ]
}

result = embedding_function(payload)
print(result)

Step 3: Configure Watson Machine Learning

Authenticate and set up your Watson Machine Learning (WML) service:

url = 'https://us-south.ml.cloud.ibm.com'
api_key = 'your-api-key'
wml_credentials = {
    "url": url,
    "apikey": api_key
}

client = APIClient(wml_credentials)

space_id = 'your_space_id'

client.spaces.list(limit=10)
client.set.default_space(space_id)

Create a package extension so the deployment environment includes sentence-transformers:

%%writefile environment.yml
channels:
  - empty
  - nodefaults
dependencies:
  - pip:
    - sentence-transformers==3.0.0

meta_props = {
    client.package_extensions.ConfigurationMetaNames.NAME: "transformers",
    client.package_extensions.ConfigurationMetaNames.TYPE: "conda_yml"
}
pkg_extn_details = client.package_extensions.store(meta_props, "./environment.yml")
pkg_extn_id = client.package_extensions.get_id(pkg_extn_details)

Step 4: Deploy the Model

Create a software specification:

base_id = client.software_specifications.get_id_by_name("runtime-23.1-py3.10")
meta_props = {
    client.software_specifications.ConfigurationMetaNames.NAME: "default with sentence-transformer",
    client.software_specifications.ConfigurationMetaNames.PACKAGE_EXTENSIONS: [{'guid': pkg_extn_id}],
    client.software_specifications.ConfigurationMetaNames.BASE_SOFTWARE_SPECIFICATION: {'guid': base_id}
}
sw_spec_details = client.software_specifications.store(meta_props)
sw_spec_id = client.software_specifications.get_id(sw_spec_details)

Store the function in the repository:

function_props = {
    client.repository.FunctionMetaNames.NAME: 'simcse embedder-v2',
    client.repository.FunctionMetaNames.SOFTWARE_SPEC_UID: sw_spec_id
}
function_details = client.repository.store_function(my_embedding_function, function_props)
function_id = client.repository.get_function_id(function_details)
print(function_id)

Create a deployment for the function; once deployed, it will show up in your space:

hardware_spec_id = client.hardware_specifications.get_id_by_name('M')

deployment_props = {
    client.deployments.ConfigurationMetaNames.NAME: 'simcse embedder deployment v3',
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"id": hardware_spec_id, 'num_nodes': 1}
}
deployment_details = client.deployments.create(function_id, deployment_props)
deployment_id = client.deployments.get_id(deployment_details)

payload = {
    'input_data': [
        {
            'values': [
                ["กลุ่มผู้ชายเล่นฟุตบอลบนชายหาด", "กลุ่มเด็กชายกำลังเล่นฟุตบอลบนชายหาด"]
            ]
        }
    ]
}

result = client.deployments.score(deployment_id, payload)
print(result)
{'predictions': [{'fields': ['sentence', 'embedding'],
                  'values': [['กลุ่มผู้ชายเล่นฟุตบอลบนชายหาด',
                              [1.1157499551773071,
                               -0.1111806258559227,
                               -0.5575138330459595,
                               0.4732488691806793,
                               ...
                               -0.8454257845878601,
                               0.9247894883155823]]]}]}

This response includes the embedding vector for each input sentence, demonstrating that your custom embedder model is now successfully deployed and operational on the watsonx platform.
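The deployment can also be scored over plain REST, without the WML client. The sketch below is an assumption based on the standard WML v4 scoring API (the `/ml/v4/deployments/{id}/predictions` endpoint, the `version` query parameter, and the IAM token exchange); verify the exact scoring URL in your deployment's details page in the space UI:

```python
import requests

def build_scoring_payload(sentences: list) -> dict:
    # Same payload shape the WML client sends in client.deployments.score.
    return {"input_data": [{"values": [sentences]}]}

def score_deployment(base_url: str, api_key: str, deployment_id: str, sentences: list) -> dict:
    # Exchange the IBM Cloud API key for a short-lived IAM bearer token.
    token_resp = requests.post(
        "https://iam.cloud.ibm.com/identity/token",
        data={"grant_type": "urn:ibm:params:oauth:grant-type:apikey",
              "apikey": api_key},
    )
    token = token_resp.json()["access_token"]

    # POST the payload to the deployment's scoring endpoint.
    resp = requests.post(
        f"{base_url}/ml/v4/deployments/{deployment_id}/predictions",
        params={"version": "2021-05-01"},
        headers={"Authorization": f"Bearer {token}"},
        json=build_scoring_payload(sentences),
    )
    return resp.json()

# Example (requires a live deployment and a valid API key):
# result = score_deployment('https://us-south.ml.cloud.ibm.com', 'your-api-key',
#                           deployment_id, ["กลุ่มผู้ชายเล่นฟุตบอลบนชายหาด"])
```

This is convenient when the consuming application is not written in Python, since any HTTP client can call the same endpoint.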

Extra

You can read more at https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/ml-deploy-functions_local.html?context=cpdaas&locale=it#store-wmls to see parameters such as `client.hardware_specifications.get_id_by_name`. Here I used size M, which is 4 vCPU and 16 GB RAM. You can also adjust `num_nodes` for more instances.

Conclusion

Deploying a custom embedder model on the watsonx platform using Watson Machine Learning allows you to leverage localized and specialized models for your RAG applications. By following these steps, you can ensure that your applications have access to high-quality, tailored embeddings, improving their overall performance and reliability.

Reference article

https://medium.com/ibm-data-ai/deploy-and-scale-pre-trained-nlp-models-in-minutes-with-watson-machine-learning-and-huggingface-bf55147997ad
