A Preview of the Box AI API

Published in Box Developer Japan Blog · Nov 14, 2023

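If you caught the keynote at the recent BoxWorks, you may have heard that Box is looking to further improve the Box Platform by introducing the new Box AI API. By using the Box AI API in custom apps, you will be able to meet your company's specific needs and enrich its content while maintaining the security and compliance requirements for that content.

With the Box AI API coming soon, we at Box are working hard to deliver great features. In fact, we held an internal hackathon to explore how developers would use the AI API and what compelling use cases they could come up with.

Here is a small preview of the Box AI API, including great ideas from the internal hackathon winners, sample apps, and how to extend the existing Python SDK.

Disclaimer: The Box AI API documentation is subject to change, and more Box AI developer resources will be added as the official release approaches.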

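Endpoints

Essentially, we worked with two endpoints: /2.0/ai/ask and /2.0/ai/text_gen.

These represent the modes for interacting with the AI: ask is used for question answering, and text_gen is used for conversational text generation.

From an API perspective, the difference is that no conversation history is sent with the ask endpoint, which is intended for asking questions about a piece of content and getting answers.

With the text_gen endpoint, on the other hand, the conversation history is sent. The AI is therefore aware of the whole conversation and generates text based on the previous questions and answers.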

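How it works

To work with the AI API, you typically send some content along with a prompt about that content.

The content can be a specific file, a set of files, a text representation, or just a piece of content such as a simple phrase.

The prompt is usually a question, but you can send anything and see how the AI responds.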
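As a rough illustration of the content-plus-prompt idea, the sketch below builds the `items` entries used by the request examples in this post. `make_item` is a hypothetical helper, not part of any Box SDK. The field names (`type`, `id`, `content`) are taken from those examples, and since the API is in preview they may change.

```python
import json


def make_item(file_id: str, content: str = "") -> dict:
    """Build one entry for the request's "items" list (hypothetical helper).

    In the examples in this post, an empty content string is used when the
    whole file is the context, and a non-empty string when only an excerpt
    should be sent.
    """
    return {"type": "file", "id": file_id, "content": content}


# The whole file as context.
whole_file = make_item("1282567002306")

# Only an excerpt of the file as context.
excerpt = make_item(
    "1282567002306",
    "YOU MUST BE ABLE TO SWIM TO PARTICIPATE IN ANY IN WATER ACTIVITIES.",
)

# A prompt plus content is the core of every request body.
print(json.dumps({"prompt": "summarize document", "items": [whole_file]}, indent=2))
```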

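Direct answers and streamed answers

Because the AI can take some time to produce an answer, you can choose either to wait for the complete answer or to have the words delivered one after another until the answer is complete. With the latter, the user gets some feedback right away instead of waiting a long time for any response.

Examples

Here are a few examples so you can see how requests are made. For context, the content we are sending relates to a liability waiver for a diving trip.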

Single-item QA

curl --location 'https://api.box.com/2.0/ai/ask' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer Gj...PE' \
--data '{
    "prompt": "summarize document",
    "items": [
        {
            "type": "file",
            "id": "1282567002306",
            "content": ""
        }
    ],
    "mode": "single_item_qa",
    "config": {
        "is_streamed": false
    }
}'

The result is as follows.

{
    "answer": "The document provided is a liability waiver for participants 
               engaging in water activities, specifically scuba diving. 
              It states that individuals must be able to swim and be in good 
              physical condition to participate. 
              The purpose of signing the document is to exempt and release 
              the dive center, its employees, agents, and dive boats from 
              any liabilities arising from their acts or omissions.
              ...
              - Full name needs to be provided as well as signature.
              Please let me know how I can further assist you based on this 
              information!",
    "created_at": "2023-10-04T07:35:06.154290294-07:00",
    "completion_reason": "done"
}
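For readers who prefer Python to curl, the same single-item request could be prepared with the standard library roughly as follows. This is only a sketch: `build_ask_request` and the `YOUR_ACCESS_TOKEN` placeholder are illustrative names, and the body simply mirrors the preview request shown above, which may change before release. The request is built but not sent.

```python
import json
import urllib.request

ASK_URL = "https://api.box.com/2.0/ai/ask"


def build_ask_request(token: str, prompt: str, file_id: str) -> urllib.request.Request:
    """Prepare (but do not send) a single-item QA request."""
    body = {
        "prompt": prompt,
        "items": [{"type": "file", "id": file_id, "content": ""}],
        "mode": "single_item_qa",
        "config": {"is_streamed": False},
    }
    return urllib.request.Request(
        ASK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_ask_request("YOUR_ACCESS_TOKEN", "summarize document", "1282567002306")

# To actually call the API, supply a valid token and then run:
#   with urllib.request.urlopen(request) as response:
#       answer = json.load(response)["answer"]
```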

Now, instead of sending the entire document, we send an excerpt.

curl --location 'https://api.box.com/2.0/ai/ask' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer Gj...PE' \
--data '{
    "prompt": "do I need to know how to swim to go diving?",
    "items": [
        {
            "type": "file",
            "id": "1282567002306",
            "content": "YOU MUST BE ABLE TO SWIM TO PARTICIPATE IN ANY IN WATER ACTIVITIES."
        }
    ],
    "mode": "single_item_qa",
    "config": {
        "is_streamed": false
    }
}'

The result is as follows.

{
    "answer": "According to the document you provided, it states that 
    \"YOU MUST BE ABLE TO SWIM TO PARTICIPATE IN ANY IN WATER ACTIVITIES.\" 
    Therefore, based on this information, it can be inferred that knowing 
    how to swim is a requirement for participating in diving or any other 
    water activities.",
    "created_at": "2023-10-04T07:45:23.701917249-07:00",
    "completion_reason": "done"
}

Next is the same example, but using the streamed format.

curl --location 'https://api.box.com/2.0/ai/ask' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer Gj..PE' \
--data '{
    "prompt": "do I need to know how to swim to go diving?",
    "items": [
        {
            "type": "file",
            "id": "1282567002306",
            "content": "YOU MUST BE ABLE TO SWIM TO PARTICIPATE IN ANY IN WATER ACTIVITIES."
        }
    ],
    "mode": "single_item_qa",
    "config": {
        "is_streamed": true
    }
}'

The result is as follows.

{"answer":"According","created_at":"2023-10-04T07:47:32.567377685-07:00"}
{"answer":" to","created_at":"2023-10-04T07:47:32.584226294-07:00"}
{"answer":" the","created_at":"2023-10-04T07:47:32.634503092-07:00"}
{"answer":" document","created_at":"2023-10-04T07:47:32.671469331-07:00"}
{"answer":" you","created_at":"2023-10-04T07:47:32.68524942-07:00"}
...
{"answer":" any","created_at":"2023-10-04T07:47:34.467767413-07:00"}
{"answer":" other","created_at":"2023-10-04T07:47:34.509386552-07:00"}
{"answer":" water","created_at":"2023-10-04T07:47:34.529145131-07:00"}
{"answer":" activities","created_at":"2023-10-04T07:47:34.565221139-07:00"}
{"answer":".","created_at":"2023-10-04T07:47:34.590461678-07:00"}
{"answer":"","created_at":"2023-10-04T07:47:34.653314326-07:00","completion_reason":"done"}
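A client consuming this output has to stitch the fragments back together. The following is a minimal sketch of one way to do that, assuming each line is a standalone JSON object as in the sample above. `assemble_stream` is a hypothetical helper name, and the sample lines are copied from the output shown here.

```python
import json
from typing import Iterable, Optional, Tuple


def assemble_stream(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Join streamed answer fragments; return (full_text, completion_reason)."""
    parts = []
    completion_reason = None
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("answer", ""))
        # In the sample output only the final chunk carries completion_reason.
        completion_reason = chunk.get("completion_reason", completion_reason)
    return "".join(parts), completion_reason


sample = [
    '{"answer":"According","created_at":"2023-10-04T07:47:32.567377685-07:00"}',
    '{"answer":" to","created_at":"2023-10-04T07:47:32.584226294-07:00"}',
    '{"answer":" the","created_at":"2023-10-04T07:47:32.634503092-07:00"}',
    '{"answer":"","created_at":"2023-10-04T07:47:34.653314326-07:00","completion_reason":"done"}',
]
text, reason = assemble_stream(sample)
print(text, reason)
```

In a real UI you would more likely display each fragment as it arrives rather than wait for the joined text; the joining step is shown only to make the chunk format concrete.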

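Multiple-item QA

Next, imagine you have several pieces of content containing information that helps answer a question. With this mode, you can send multiple pieces of content at once.

You just specify multiple items, or items together with content where needed.

For example, suppose the diving waiver comes with an additional document related to the boat trip.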

curl --location 'https://api.box.com/2.0/ai/ask' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer Gj...PE' \
--data '{
    "prompt": "do I need to know how to swim?",
    "items": [
        {
            "type": "file",
            "id": "1282567002306",
            "content": ""
        },
        {
            "type": "file",
            "id": "1282565377545",
            "content": ""
        }
    ],
    "mode": "multiple_item_qa",
    "config": {
        "is_streamed": false
    }
}'
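Since the single-item and multiple-item requests differ only in the `items` list and the `mode` value, a client could derive the mode from the number of files. The sketch below assumes exactly that; `build_qa_body` is a hypothetical helper, and the two mode strings are the ones used in the requests in this post.

```python
from typing import Dict, List


def build_qa_body(prompt: str, file_ids: List[str], is_streamed: bool = False) -> Dict:
    """Build an /ai/ask request body, picking the mode from the item count."""
    if not file_ids:
        raise ValueError("at least one file ID is required")
    mode = "single_item_qa" if len(file_ids) == 1 else "multiple_item_qa"
    return {
        "prompt": prompt,
        "items": [{"type": "file", "id": fid, "content": ""} for fid in file_ids],
        "mode": mode,
        "config": {"is_streamed": is_streamed},
    }


body = build_qa_body(
    "do I need to know how to swim?",
    ["1282567002306", "1282565377545"],
)
print(body["mode"], len(body["items"]))
```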

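Text generation

Text generation works in a similar way, but you send the conversation history (including prompts and answers). This lets the AI generate text based on the previous answers and prompts.

An example of a first request to text_gen: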

curl --location 'https://api.box.com/2.0/ai/text_gen' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer Gj...PE' \
--data '{
    "prompt": "do I need to know how to swim?",
    "items": [
        {
            "type": "file",
            "id": "1282567002306",
            "content": ""
        }
    ],
    "config": {
        "is_streamed": false
    }
}'

Subsequent requests include the conversation history.

curl --location 'https://api.box.com/2.0/ai/text_gen' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer Gj...PE' \
--data '{
    "prompt": "do I need to know how to swim?",
    "items": [
        {
            "type": "file",
            "id": "1282567002306",
            "content": ""
        }
    ],
    "dialogue_history": [
        {
         "prompt": "do I need to know how to swim?",
         "answer": "According to the document you provided, it states that 
         \"YOU MUST BE ABLE TO SWIM TO PARTICIPATE IN ANY IN WATER ACTIVITIES.\" 
         Therefore, based on this information, it can be inferred that knowing 
         how to swim is a requirement for participating in diving or any other 
         water activities.",
         "created_at": "2023-10-04T07:45:23.701917249-07:00",
         "completion_reason": "done"
        }
    ],
    "config": {
        "is_streamed": false
    }
}'
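Because the caller is responsible for resending the history, a client needs to keep track of each prompt and answer. One possible way to do that is sketched below. `TextGenConversation` is a hypothetical class, not part of the Box SDK, and the history fields simply mirror the request above.

```python
from typing import Dict, List


class TextGenConversation:
    """Accumulates dialogue history for successive text_gen requests."""

    def __init__(self, file_id: str) -> None:
        self.file_id = file_id
        self.history: List[Dict] = []

    def next_body(self, prompt: str) -> Dict:
        """Build the request body for the next prompt."""
        body = {
            "prompt": prompt,
            "items": [{"type": "file", "id": self.file_id, "content": ""}],
            "config": {"is_streamed": False},
        }
        # As in the examples above, the first request carries no history.
        if self.history:
            body["dialogue_history"] = list(self.history)
        return body

    def record(self, prompt: str, response: Dict) -> None:
        """Store a prompt and the API's response for use in later requests."""
        self.history.append(
            {
                "prompt": prompt,
                "answer": response["answer"],
                "created_at": response["created_at"],
                "completion_reason": response.get("completion_reason"),
            }
        )


conversation = TextGenConversation("1282567002306")
first_body = conversation.next_body("do I need to know how to swim?")

# Stand-in for a real API response, shaped like the results shown in this post.
fake_response = {
    "answer": "Knowing how to swim is a requirement.",
    "created_at": "2023-10-04T07:45:23.701917249-07:00",
    "completion_reason": "done",
}
conversation.record("do I need to know how to swim?", fake_response)
second_body = conversation.next_body("what else do I need?")
```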

Python SDK

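At the time of the hackathon, the Python SDK did not include these endpoints, but this is one of the reasons we keep the SDK open source.

Extending the Python SDK turned out to be quite easy, so we thought this might be interesting for the community.

For example, a regular call to the /ai/ask endpoint looks like this. Note how easily we can get hold of the generic session.post() method, which does all the heavy lifting.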

import json  # needed for json.dumps below

# A method added to a class that extends the Python SDK client.
# AIQuestion and AIAnswer are assumed to be defined elsewhere in the extension.
def _get_ai_api_response(self, prompt: str, ai_question: AIQuestion) -> AIAnswer:
    # Serialize the question and ask for a single, complete answer.
    ai_question_json = ai_question.to_json()
    ai_question_json["config"] = {"is_streamed": False}
    data = json.dumps(ai_question_json)

    # The SDK session takes care of authentication for the POST.
    url = self.get_url("ai/ask")
    box_response = self._session.post(url, data=data, expect_json_response=True)

    response = box_response.json()
    response_object = self.translator.translate(
        session=self._session,
        response_object=response,
    )

    return AIAnswer(
        answer=response_object["answer"],
        created_at=response_object["created_at"],
        completion_reason=response_object["completion_reason"],
        prompt=prompt,
    )

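The streaming version returns an iterator. It still uses session.post, but this time it does not expect a complete JSON answer. From that point on, it yields chunks of data, each representing a line of the response.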

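import json
from typing import Iterator  # needed for the return annotation

# Streaming variant: a generator that yields one partial AIAnswer per line
# of the response instead of returning a single complete answer.
def _get_ai_api_response_streamed(self, prompt: str, ai_question: AIQuestion) -> Iterator[AIAnswer]:
    ai_question_json = ai_question.to_json()
    ai_question_json["config"] = {"is_streamed": True}
    data = json.dumps(ai_question_json)

    url = self.get_url("ai/ask")
    # The body is a sequence of JSON lines, not one JSON document.
    box_response = self._session.post(url, data=data, expect_json_response=False)

    for chunk in box_response.network_response.request_response.iter_lines():
        if chunk:  # skip empty lines
            response_object = self.translator.translate(
                session=self._session,
                response_object=json.loads(chunk),
            )

            yield AIAnswer(
                answer=response_object["answer"],
                created_at=response_object["created_at"],
                # Only the final chunk carries completion_reason, hence .get().
                completion_reason=response_object.get("completion_reason"),
                prompt=prompt,
            )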

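If you would like to look at this code in more detail, see the GitHub repository.

Sample apps

The GitHub repository also includes several sample apps; some other apps are shown below.

Please also take a look at the following video.

Hackathon winners

The winners of this internal hackathon were Box employees Jake Dolgenos and Brad Rosenfield. The two submitted not one but two very interesting and creative projects showing what the Box AI API can do.

RFP automation

Intelligent editing

For more details, see the Box forum post Box AI API Documentation.

If you have ideas, comments, or feedback, please post them on the community forum (English only).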

Written by Yuko Taniguchi