Even some of the best AI can’t beat this new benchmark

The nonprofit Center for AI Safety (CAIS) and Scale AI, a company that provides a number of data labeling and AI development services, have released a challenging new benchmark for frontier AI systems. The benchmark, called Humanity’s Last Exam, includes thousands of crowdsourced questions touching on subjects like mathematics, humanities, and the natural sciences. To make […]

Tomas Kauer - Moderator

Jan 24, 2025 - 09:00

© 2024 TechCrunch. All rights reserved. For personal use only.
