Anthropic introduces Claude 3.5 Sonnet, matching GPT-4o on benchmarks

On Thursday, Anthropic announced Claude 3.5 Sonnet, its latest AI language model and the first in a new series of “3.5” models that build upon Claude 3, launched in March. Claude 3.5 Sonnet can compose text, analyze data, and write code. It features a 200,000-token context window and is available now on the Claude website and through an API. Anthropic also introduced Artifacts, a new feature in the Claude interface that shows related work documents in a dedicated window.

So far, people outside of Anthropic seem impressed. “This model is really, really good,” wrote independent AI researcher Simon Willison on X. “I think this is the new best overall model (and both faster and half the price of Opus, similar to the GPT-4 Turbo to GPT-4o jump).”

As we’ve written before, benchmarks for large language models (LLMs) are troublesome because they can be cherry-picked and often do not capture the feel and nuance of using a machine to generate outputs on almost any conceivable topic. But according to Anthropic, Claude 3.5 Sonnet matches or outperforms competitor models such as GPT-4o and Gemini 1.5 Pro on certain benchmarks, including MMLU (undergraduate-level knowledge), GSM8K (grade-school math), and HumanEval (coding).
