Unlock the Power of AWS Bedrock and Anthropic’s Claude II model with a User-Friendly API in Python

Alexandre t'Kint
5 min readOct 31, 2023


Introduction

Although arriving slightly later on the scene, AWS is rapidly accelerating its efforts to secure a significant presence in the GenAI market. Within AWS, users have two primary options to access GenAI features. The first is Bedrock, a dedicated service for foundation models, including AWS’s own Titan, Anthropic’s Claude, AI21 Labs’ Jurassic models, Cohere’s Command models, and Stability AI’s Stable Diffusion models. The second option, SageMaker JumpStart, offers an extensive range of open-source models and has a strong partnership with Hugging Face.

Of course, most of us want to embed the power of these LLMs in our software applications. To do so, you’ll need to choose a model and a cloud partner. Now that AWS is on the GenAI rocket 🚀 you’ll want a guide to using these models with a simple API call from your application. Here is your guide to using a Claude II model hosted on AWS Bedrock.

Anthropic’s Claude II model

Claude II is an AI model developed by Anthropic. It is an improvement over their previous model, Claude 1.3, with better performance and longer responses. Claude II has improved coding, math, and reasoning skills. For example, it scored 76.5% on the multiple-choice section of the Bar exam. It can also write longer documents — from memos to letters to stories up to a few thousand tokens.

Prerequisites

Before we begin, make sure you have the following:

  1. An AWS account with billing enabled
  2. Python installed on your machine

Required Steps:

  1. Create a Service Account
  2. Activate the Claude II model in Bedrock
  3. Call the Claude II model in your Python environment

Let’s go!

1. Create a Service Account

Search for the IAM service in the search bar in your AWS console, then go to the Users tab and click on Create User. In step 2, choose “Attach policies directly” and let’s make things easy by adding “AdministratorAccess” as a policy. Finally, create the user.

Next, click on the newly created user. Navigate to “Security credentials” and, under “Access keys”, choose to create an access key. Select the use case “other”, as we want a simple key/password authentication option. You can skip adding a tag; tags are typically used to keep track of the resources you create. Finally, you get your access key and secret access key. These are the credentials that we’ll use to authenticate to the AWS API.

We created a service account with administrative privileges for convenience; in a real project you would grant access only to the specific services and actions it needs. With this service account in place, the next step is activating Bedrock’s models.
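Rather than pasting the keys into your source code, a safer pattern is to read them from environment variables. Below is a minimal sketch; the variable names follow the standard AWS convention, but the helper function itself is hypothetical, not part of any SDK:

```python
import os

def load_aws_credentials():
    """Read AWS credentials from environment variables instead of hardcoding them."""
    access_key = os.environ.get("AWS_ACCESS_KEY_ID")
    secret_key = os.environ.get("AWS_SECRET_ACCESS_KEY")
    if not access_key or not secret_key:
        raise RuntimeError("Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY first")
    return access_key, secret_key
```

As a bonus, boto3 picks up these environment variables automatically, so with this setup you can omit the explicit `aws_access_key_id`/`aws_secret_access_key` arguments later on.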

2. Activate the Claude II model in Bedrock

Now we’ll go to the Bedrock service by searching for it in the AWS Console search bar. From there, navigate to “Model Access” and confirm that access has been granted for your preferred model. If not, you’ll get the option to activate it here. In the screenshot below, we see that access has been given to the Claude model.
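You can also check which models are available programmatically. The control-plane client is named `bedrock` (as opposed to `bedrock-runtime`, which invokes models), and its `list_foundation_models` call returns the catalogue for the region. The small filtering helper below is illustrative, not part of the SDK:

```python
def anthropic_model_ids(response):
    """Pick out the Anthropic model IDs from a list_foundation_models response."""
    return [m["modelId"] for m in response.get("modelSummaries", [])
            if m.get("providerName") == "Anthropic"]

# Usage (requires valid AWS credentials):
# import boto3
# bedrock = boto3.client("bedrock", region_name="us-east-1")
# print(anthropic_model_ids(bedrock.list_foundation_models()))
```

If `anthropic.claude-v2` appears in the output, the model is available to your account in that region.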

3. Call the Claude II model in your Python environment

In the example code below, a function named call_bedrock is defined to facilitate communication with the AWS Bedrock API. Using the Boto3 library, the function sets up a connection to the Bedrock service and configures desired model parameters, prompt, and assistant information.

It then selects the ‘anthropic.claude-v2’ model and configures the content types for the API request. Upon invoking the Bedrock API, the function processes the response, extracting the completion result. To showcase the call_bedrock function in action, the example includes a test call that requests the model to explain black holes to 8th graders in a concise and captivating story format.

# Make sure that you have a recent version of boto3 (Bedrock support requires >= 1.28)
!pip install --upgrade boto3

# Import the boto3 and json packages
import boto3
import json

# Verify the installed version
print(boto3.__version__)  # should be at least 1.28

# Define the access key and secret - insert your generated credentials here
access_key = "your_access_key"
access_secret = "your_secret"
# Define a function to call the Bedrock API
def call_bedrock(prompt, assistant):
    # Create the Bedrock runtime client
    bedrock = boto3.client(service_name='bedrock-runtime',
                           region_name='us-east-1',
                           aws_access_key_id=access_key,
                           aws_secret_access_key=access_secret)

    # Tweak your preferred model parameters, prompt and assistant information
    body = json.dumps({
        "prompt": f"\n\nHuman:{prompt}\n\nAssistant:{assistant}",
        "max_tokens_to_sample": 500,
        "temperature": 0.2,
        "top_p": 0.9,
    })

    # Define the model that will be used
    modelId = 'anthropic.claude-v2'
    accept = 'application/json'
    contentType = 'application/json'

    # Call the Bedrock API and parse the JSON response
    response = bedrock.invoke_model(body=body, modelId=modelId,
                                    accept=accept, contentType=contentType)
    response_body = json.loads(response.get('body').read())
    print(response_body.get('completion'))

# Test out the call_bedrock function
call_bedrock("explain black holes to 8th graders", "Write this in a nice quick story")

Voila! Here is our nice story explaining black holes, written with Anthropic’s Claude II model. 🤩

Possible model parameters:

  • Temperature — Tunes the degree of randomness in a generation. Lower temperatures mean fewer random generations.
  • Top P — If set to a float less than 1, only the smallest set of the most probable tokens whose probabilities sum to top_p or higher is kept for generation.
  • Top K — Limits sampling to the k most probable tokens at each step. Lower values make the output more focused and deterministic; higher values allow more diverse generations.
  • Maximum Length — Maximum number of tokens to generate. Responses are not guaranteed to fill up to the maximum desired length.
  • Stop sequences — Up to four sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
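All of these parameters travel in the JSON request body. As an illustration, here is a small hypothetical helper that assembles a body for the Claude text-completions format with every parameter above; the default values are just examples, not recommendations:

```python
import json

def build_claude_body(prompt, max_tokens=500, temperature=0.2,
                      top_p=0.9, top_k=250, stop_sequences=None):
    """Assemble a request body for the Claude text-completions format."""
    return json.dumps({
        "prompt": f"\n\nHuman:{prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,  # maximum length
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "stop_sequences": stop_sequences or ["\n\nHuman:"],
    })
```

The returned string can be passed directly as the `body` argument of `invoke_model` in the function we wrote above.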

Let’s use this code to call the Anthropic Claude II model and empower your applications. ❤️ If you found this article helpful, I’d be grateful if you could follow me on Medium and give it a clap or two. Your support means a lot to me. Thank you!

Check out how you can use Anthropic’s Claude II model in GCP. 🤩

Enjoy!

Alexandre t’Kint
