Analysis of Claude: An AI Assistant by Anthropic

Vaishnavi R
Version 1
Apr 25, 2023

What is Claude?

Amazon is now competing in the field of generative AI: AWS recently announced its generative AI service, Amazon Bedrock, and Claude is one of the models available through it.

Claude is an AI assistant created by Anthropic and a competitor to ChatGPT; Anthropic claims it is less toxic and less biased while providing a better overall AI experience.

You can request access to Claude here: https://www.anthropic.com/product

Screenshot from Claude’s console (PBC stands for Public Benefit Corporation).

How do Claude and ChatGPT differ?

Claude has been designed by Anthropic to possess qualities such as helpfulness, honesty, and harmlessness.

Concerns regarding the safety of future AI have been at the core of Anthropic’s model development. To train the model, they have used a technique they refer to as “Constitutional AI” (CAI), which tries to reduce the reliance on human feedback.

ChatGPT, in contrast, utilizes a technique known as reinforcement learning from human feedback (RLHF). Here the RL model is trained using quality ratings provided by humans.

Claude is trained with RLAIF (Reinforcement Learning from AI Feedback) rather than RLHF.

The two main phases of the core CAI training are as follows:

In the first phase, supervised learning, the AI generates responses to harmful or adversarial prompts, critiques those responses against the constitutional principles, and repeatedly revises them until they comply; the model is then fine-tuned on the revised responses. The second phase is similar to RLHF, except that the preference feedback comes from an AI model rather than from humans, which is why it is referred to as RLAIF. In this way, Claude’s responses are shaped by AI-generated feedback rather than by direct human ratings.
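To make the first phase more concrete, here is a schematic sketch of the critique-and-revision loop in Python. This is purely illustrative and not Anthropic’s implementation: `generate`, `critique`, and `revise` are placeholder callables standing in for calls to the model being trained, and the constitution is reduced to two example principles.

```python
from typing import Callable, List

# Two example principles standing in for the full constitution
CONSTITUTION: List[str] = [
    "Choose the response that is least harmful or toxic.",
    "Choose the response that is most honest and helpful.",
]

def self_revise(
    generate: Callable[[str], str],          # prompt -> draft response
    critique: Callable[[str, str], str],     # (response, principle) -> critique
    revise: Callable[[str, str, str], str],  # (response, critique, principle) -> revision
    red_team_prompt: str,
) -> str:
    """Schematic version of CAI phase 1: draft a response to a harmful
    prompt, then critique and revise it against each principle."""
    response = generate(red_team_prompt)
    for principle in CONSTITUTION:
        feedback = critique(response, principle)
        response = revise(response, feedback, principle)
    return response

# The resulting (prompt, revised response) pairs are used for supervised
# fine-tuning; phase 2 then trains a preference model from AI-generated
# comparisons (RLAIF) and optimizes the assistant against it with RL.
```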

Check out this blog: The AI Behind Claude for more information.

As of now, Claude is available in two models, “Claude Instant” and “Claude-v1”. Anthropic recently introduced an improved version known as “Claude-v1.3”, which offers users enhanced natural language processing capabilities and improved computational efficiency.

Features of Claude

The primary purpose of Claude is to assist you in any way it can, using the skills and knowledge that it has been trained on.

If you need a good laugh or just want to make conversation, Claude can tell jokes, stories, and interesting fun facts. It says it has a broad range of jokes, riddles, fun facts, and conversation starters in its knowledge base that are sure to entertain and engage users.

Screenshot from Claude’s console

Tested Use-Cases Using Claude

1] Checking Math Ability and Temporal Understanding

Claude provided correct answers to problems involving basic maths, algebra, and temporal comprehension.

We tested Claude on basic math skills by asking, “How many data points should I give to my machine learning model to get it to 75% accuracy? I have a dataset that meets the requirement.” The answer given by Claude was correct.

To assess algebraic skills, a simple problem was given. Claude provided a step-by-step approach to solving the problem and gave the right answer.

The following question was asked to assess temporal understanding.

Even though Claude provided a step-by-step approach to solving the problem, its final solutions for challenging geometric and mathematical problems were incorrect.

Correct answer: (1/2) × 10 × 8 = 40 cm².
Correct answer: width = 4√2 m and length = 3 × 4√2 = 12√2 m.
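The exact figures are in the screenshots above, but assuming the two problems were a triangle with base 10 cm and height 8 cm, and a rectangle whose length is three times its width with an area of 96 m² (inferred from the expected answers, not stated explicitly here), the correct results can be checked in a few lines of Python:

```python
import math

# Triangle: area = (1/2) * base * height, with base 10 cm and height 8 cm
triangle_area = 0.5 * 10 * 8
print(triangle_area)          # 40.0 (cm^2)

# Rectangle: length = 3 * width, area assumed to be 96 m^2
area = 96
width = math.sqrt(area / 3)   # w * 3w = 96  ->  w = sqrt(32) = 4*sqrt(2)
length = 3 * width
print(width, length)          # 5.657 m and 16.971 m, i.e. 4√2 and 12√2
```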

2] Test on Geography

For analysing Claude’s knowledge of geography, this question was asked: “Write the name of the mountain range, which starts near Gujarat and runs east through Maharashtra and Madhya Pradesh to Chhattisgarh”. (Source: Link)

However, Claude gave the wrong answer to this specific region-based question.

ChatGPT gave the right answer.

Being a large language model, Claude can hallucinate, so it is important to verify its answers.

3] Checking Access to Real-time Data

Anthropic has mentioned on its website that Claude cannot (yet!) look things up — but it can suggest related answers (see the documentation here).

Screenshot of Claude providing Key details about Zomato’s stock
Screenshot of Claude providing details about Bangalore’s weather.

4] Summarization and Reading Comprehension

To test Claude’s paragraph-summarization capability, it was first given an article on Indian mythology. Claude gave an elaborate answer rather than a concise summary, but when asked again to summarize the article in a few lines, it produced a neat and concise one.

This experience showed us that while AI models are incredibly advanced, they still require some human direction and guidance to achieve the desired results.
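This kind of follow-up prompting can also be reproduced through the API by appending the second instruction to the same conversation. Below is a minimal sketch that assumes the completion-style anthropic Python SDK available at the time of writing (newer SDK versions use a different interface); the file name is a placeholder.

```python
import os
import anthropic

# Completion-style interface from the early anthropic SDK (v0.2.x)
client = anthropic.Client(os.environ["ANTHROPIC_API_KEY"])

def ask(conversation: str) -> str:
    """Send the running conversation to Claude and return its reply."""
    response = client.completion(
        prompt=conversation,
        model="claude-v1",
        max_tokens_to_sample=500,
        stop_sequences=[anthropic.HUMAN_PROMPT],
    )
    return response["completion"]

# Placeholder for the article text used in the test
with open("indian_mythology_article.txt", encoding="utf-8") as f:
    article = f.read()

# Turn 1: the first summary came back longer than expected
conversation = (
    f"{anthropic.HUMAN_PROMPT} Summarize the following article:\n\n"
    f"{article}{anthropic.AI_PROMPT}"
)
long_summary = ask(conversation)

# Turn 2: append a follow-up instruction to the same conversation
conversation += (
    f"{long_summary}{anthropic.HUMAN_PROMPT} "
    f"Please shorten that to a summary of just a few lines.{anthropic.AI_PROMPT}"
)
print(ask(conversation))
```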

Article on Indian Mythology

The summary written by Claude:

5] Paragraph Comprehension

Next, Claude was tested for its ability to comprehend a paragraph and answer questions about it. Claude gave correct answers to the questions we asked, along with relevant explanations. This experience showed us the potential of Claude to understand long texts and provide accurate insights.

(Source: Link to the site)

6] Testing Analytical Ability

With the help of a Kaggle dataset containing statistics on the US unemployment rate, Claude was tasked with generating a piece of writing. In response, Claude offered relevant details and insights based on the given input.

We have a similar report on ChatGPT, here is the link for you to read: An Analysis of ChatGPT and OpenAI GPT-3: How to Use it For Your Business — Version 1

We also tested Claude’s analytical ability and found that it struggled to read off the correct value at the intersection of a given row and column in the table, which it needed for further calculations. As a result, the model gave incorrect answers to the questions we asked.

Correct answer: (4.7 + 5.9 + 5.5 + 5.9 + 5.7 + 4.7 + 4.2 + 4.5 + 4.4 + 5.1 + 4.8 + 4.4) / 12 = 59.8 / 12 ≈ 4.98
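As a sanity check, the twelve monthly rates from the table can be averaged directly; in practice you would load the Kaggle CSV with pandas and take the mean of the relevant column.

```python
# Monthly US unemployment rates read from the table used in the test above
rates = [4.7, 5.9, 5.5, 5.9, 5.7, 4.7, 4.2, 4.5, 4.4, 5.1, 4.8, 4.4]

total = sum(rates)
average = total / len(rates)
print(round(total, 1), round(average, 2))   # 59.8 and 4.98
```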

7] Claude’s API Response for Sentiment Analysis

For the sentiment analysis experiment, Claude was given a set of movie reviews, some positive and some negative (dataset: Link to movie review dataset). When the files were handed to Claude one by one, it accurately classified each of them.
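For readers who want to reproduce the experiment, here is a minimal sketch of how the review files can be sent to Claude one by one. It again assumes the early completion-style anthropic SDK; the folder name and prompt wording are illustrative rather than the exact ones used in our test.

```python
import glob
import os
import anthropic

client = anthropic.Client(os.environ["ANTHROPIC_API_KEY"])

def classify_review(review_text: str) -> str:
    """Ask Claude whether a movie review is positive or negative."""
    prompt = (
        f"{anthropic.HUMAN_PROMPT} Classify the sentiment of the following "
        f"movie review as 'positive' or 'negative'. Reply with a single word."
        f"\n\nReview:\n{review_text}{anthropic.AI_PROMPT}"
    )
    response = client.completion(
        prompt=prompt,
        model="claude-v1",            # or "claude-instant-v1" for lower cost
        max_tokens_to_sample=10,
        stop_sequences=[anthropic.HUMAN_PROMPT],
    )
    return response["completion"].strip().lower()

# Hypothetical folder containing one review per text file
for path in glob.glob("movie_reviews/*.txt"):
    with open(path, encoding="utf-8") as f:
        print(path, "->", classify_review(f.read()))
```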

Screenshot of response from Claude’s API

8] Tokens & Pricing

Tokens are the building blocks of language that AI chatbots use to understand and respond to our messages. A token is a small chunk of text, roughly a word, part of a word, or a punctuation mark. When you type a message into the chat interface, the text is first split into tokens, and the model processes that sequence of tokens to generate its reply.

The Claude Instant model is designed for low-latency, high-throughput use cases and costs roughly one-sixth as much as Claude-v1.

  • It has a context window of 9,000 tokens (approximately 6,750 words). Prompt pricing is $1.63 per million tokens (1 million tokens is roughly 750,000 words) and completion pricing is $5.51 per million tokens.

Claude-v1 is a best-in-class offering, optimized for tasks that require complex reasoning.

  • It has the same 9,000-token context window but comes with a higher price tag: prompt pricing is $11.02 per million tokens and completion pricing is $32.68 per million tokens.

With prompt pricing of $1.63 per million tokens, Claude Instant is the most affordable option among the models compared here, while pricing climbs to $120 per million completion tokens for the GPT-4-32K model.

The Claude Instant model is an appealing choice on a tight budget because its prompt and completion costs are comparatively cheap, though its performance might not match that of the larger models.

The Claude-v1 model, on the other hand, has higher prompt and completion costs, but it still falls within the mid-range of prices.

Overall, when choosing a language model for a particular task, it is important to carefully analyse the cost and performance trade-offs.
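As a rough illustration of that trade-off, the prices quoted above can be turned into a quick cost estimate. The token counts below are hypothetical; the per-million-token prices are the ones listed earlier for Claude Instant and Claude-v1.

```python
# Prompt/completion prices in USD per million tokens (from the section above)
PRICING = {
    "claude-instant-v1": {"prompt": 1.63, "completion": 5.51},
    "claude-v1":         {"prompt": 11.02, "completion": 32.68},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of a single request for the given model."""
    price = PRICING[model]
    return (prompt_tokens * price["prompt"]
            + completion_tokens * price["completion"]) / 1_000_000

# Hypothetical request: a 2,000-token prompt and a 500-token completion
for model in PRICING:
    print(f"{model}: ${estimate_cost(model, 2_000, 500):.4f}")
```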

Limitations

Claude has some limitations:

  • It may incorrectly assess its ability or memory, and it may hallucinate or make up information.
  • Claude will often make mistakes with complicated arithmetic and reasoning, and sometimes with more basic tasks.
  • It does not have general internet access, but Anthropic says Claude knows things about the real world because it has read a lot.

For more information on Anthropic, visit this documentation.

Conclusion

In conclusion, we can say that Claude has been designed with safety in mind, using “Constitutional AI” to promote helpfulness, honesty, and harmlessness.

While Claude may not be perfect in all aspects, our testing showed that it excels in certain areas such as providing accurate entertainment recommendations and summarizing texts. However, Claude still requires some human direction and guidance to achieve the desired results.

Claude does have room for improvement in certain areas such as math and programming. Our assessment suggests that ChatGPT outperforms Claude in various aspects.

With further training, Anthropic can make Claude more competitive by enhancing its intelligence and usefulness as an AI assistant, establishing it as a valuable resource for anyone looking for seamless help and guidance.

About the Author

Vaishnavi R is a Junior Data Scientist at the Version 1 Innovation Labs.
