Anthropic Releases V2 of AI-Powered Chatbot, Claude 2


Anthropic just released Claude 2, the newest model behind their AI-driven personal assistant. The company says it offers improved performance and longer responses, and can be accessed via API as well as through a new public-facing beta website, claude.ai.

The Claude 2 API for businesses is being offered at the same price as Claude 1.3 (roughly $0.0465 to generate 1,000 words). Several companies have already begun piloting Claude 2, including the generative AI copywriting platform Jasper and the coding platform Sourcegraph.
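For anyone curious what that API access looks like in code, here is a minimal sketch using Anthropic's Python SDK; the prompt and token limit below are just placeholders, and you'll need your own API key set in your environment.

```python
# pip install anthropic
from anthropic import Anthropic, HUMAN_PROMPT, AI_PROMPT

# The client reads your key from the ANTHROPIC_API_KEY environment variable.
client = Anthropic()

completion = client.completions.create(
    model="claude-2",            # the new Claude 2 model
    max_tokens_to_sample=300,    # cap on the length of the reply
    prompt=f"{HUMAN_PROMPT} Summarize the benefits of a longer context window.{AI_PROMPT}",
)

print(completion.completion)
```

Under the hood, pricing is metered per token; the roughly $0.0465-per-1,000-words figure is simply that token cost expressed in words.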

Meanwhile, anyone in the US and UK can start using the beta chat experience today.

So far, I can say Claude is easy to converse with. The tool explains its thinking well and even asks questions about how it can improve its output. Claude also has a longer memory than ChatGPT and has guardrails in place to help keep it from presenting incorrect information.

Anthropic says they have made improvements in coding, math, and reasoning, giving this example:

Our latest model scored 76.5% on the multiple choice section of the Bar exam, up from 73.0% with Claude 1.3. When compared to college students applying to graduate school, Claude 2 scores above the 90th percentile on the GRE reading and writing exams, and similarly to the median applicant on quantitative reasoning.

If you follow Human Driven AI at all, you know I always encourage you to think of generative AI not as a set of tools, but as colleagues who can be directed to help you with a variety of tasks. This is very much how Anthropic wants people to think about Claude 2.

Some other improvements to the chatbot include:

  • Users can input up to 100K tokens in each prompt, which means that Claude can work over hundreds of pages of technical documentation or even a book.
  • Claude can now also write longer documents – from memos to letters to stories up to a few thousand tokens – all in one go.

TechCrunch recently said of the updated version:

Indeed, 100,000 tokens is still quite large — the largest of any commercially available model — and gives Claude 2 a number of key advantages. Generally speaking, models with small context windows tend to “forget” the content of even very recent conversations. Moreover, large context windows enable models to generate — and ingest — much more text. Claude 2 can analyze roughly 75,000 words, about the length of “The Great Gatsby,” and generate 4,000 tokens, or around 3,125 words.

Claude 2 can theoretically support an even larger context window — 200,000 tokens — but Anthropic doesn’t plan to support this at launch.

The model’s better at specific text-processing tasks elsewhere, like producing correctly-formatted outputs in JSON, XML, YAML and markdown formats.
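To illustrate what that structured-output capability might look like in practice, here is a rough sketch using the same Python SDK as above; the prompt wording and the JSON schema are my own, not Anthropic's. It asks Claude 2 for JSON and validates the result locally.

```python
import json

from anthropic import Anthropic, HUMAN_PROMPT, AI_PROMPT

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Ask for machine-readable output only, then check that it actually parses.
completion = client.completions.create(
    model="claude-2",
    max_tokens_to_sample=500,
    prompt=(
        f"{HUMAN_PROMPT} Suggest three blog post ideas about generative AI for marketers. "
        "Respond with only a JSON array of objects, each with 'title' and 'summary' keys."
        f"{AI_PROMPT}"
    ),
)

ideas = json.loads(completion.completion)  # raises an error if the reply isn't valid JSON
for idea in ideas:
    print(f"{idea['title']}: {idea['summary']}")
```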

I agree that Claude seems to retain direction and inputs much better than ChatGPT. That said, I fully expect to see an upgraded version beyond GPT-4 soon from OpenAI.

Challenges for Claude and Other Chatbots and Generative AI

Although Anthropic has taken many strides to prevent Claude from generating misleading or false information, it can still “hallucinate” and create incorrect content.

TechCrunch acknowledges that no AI is perfect and adds that “users were able to prompt an older version of Claude to invent a name for a nonexistent chemical and provide dubious instructions for producing weapons-grade uranium. They also got around Claude’s built-in safety features via clever prompt engineering, with one user showing that they could prompt Claude to describe how to make meth at home.”

Anthropic says that Claude 2 is “2x better” at giving “harmless” responses compared to Claude 1.3 on an internal evaluation. But, as TechCrunch noted, “it’s not clear what that metric means. Is Claude 2 two times less likely to respond with sexism or racism? Two times less likely to endorse violence or self-harm? Two times less likely to generate misinformation or disinformation? Anthropic wouldn’t say — at least not directly.”

That said, Anthropic released a white paper detailing their efforts to ensure Claude’s outputs are not harmful. Among the findings:

In a test to gauge harmfulness, Anthropic fed 328 different prompts to the model, including “jailbreak” prompts released online. In at least one case, a jailbreak caused Claude 2 to generate a harmful response — less than Claude 1.3, but still significant when considering how many millions of prompts the model might respond to in production.

The company’s tests showed that Claude 2 is less likely to give biased responses than Claude 1.3 on at least one metric. But the Anthropic coauthors admit that part of the improvement is due to Claude 2 refusing to answer contentious questions worded in ways that seem potentially problematic or discriminatory.

So far, I really do enjoy working with Claude more than ChatGPT. I am asking both platforms to create the same outputs in order to better compare each tool. Claude was able to understand and replicate my own voice more quickly when instructed to: ChatGPT required several samples of my writing before it could replicate it, whereas Claude understood upon the first analysis.

Check out Claude 2 and let me know what you think!

If you need assistance driving an AI Transformation within your agency or marketing team, contact me today. Don’t let your competitors beat you to improved operational efficiencies and campaign performance!

