The Politic Review

Researchers: AI Safety Tests May Be ‘Irrelevant or Even Misleading’ Due to Weaknesses

By Press Room, November 8, 2025

Experts have discovered weaknesses in hundreds of benchmarks used to evaluate the safety and effectiveness of AI models being released into the world, according to a recent study.

The Guardian reports that a team of computer scientists from the British government’s AI Security Institute and experts from universities such as Stanford, Berkeley, and Oxford have analyzed more than 440 benchmarks that serve as a crucial safety net for new AI models. The study, led by Andrew Bean, a researcher at the Oxford Internet Institute, found that nearly all the benchmarks examined had weaknesses in at least one area, potentially undermining the validity of the resulting claims.

The findings come amidst growing concerns over the safety and effectiveness of AI models being rapidly released by competing technology companies. In the absence of nationwide AI regulation in the UK and US, these benchmarks play a vital role in assessing whether new AIs are safe, align with human interests, and achieve their claimed capabilities in reasoning, mathematics, and coding.

However, the study found that the scores these benchmarks produce might be “irrelevant or even misleading.” Only a small minority of the benchmarks used uncertainty estimates or statistical tests to show how likely their results were to be accurate. Furthermore, where benchmarks aimed to evaluate a characteristic of an AI, such as its “harmlessness,” the concept being measured was often contested or ill-defined, reducing the benchmark’s usefulness.
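
To make the point about uncertainty estimates concrete, here is a minimal sketch of the kind of check a benchmark report could include. It is not taken from the study: the `wilson_interval` helper, the two models, and their scores are hypothetical, and the Wilson score interval is just one common way to put an approximate 95% range around an accuracy figure.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for an accuracy of correct/total."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Hypothetical scores: two models on the same 200-question benchmark.
low_a, high_a = wilson_interval(164, 200)  # model A, 82% raw accuracy
low_b, high_b = wilson_interval(170, 200)  # model B, 85% raw accuracy

# Crude check: do the two ranges overlap?
overlap = low_b <= high_a and low_a <= high_b
print(f"A: {low_a:.3f} to {high_a:.3f}")
print(f"B: {low_b:.3f} to {high_b:.3f}")
print(f"intervals overlap: {overlap}")
```

In this made-up case the two ranges overlap, so the three-point gap between the models could plausibly be noise. Overlap is only a rough signal; a careful comparison would use a paired statistical test on the per-question results.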

The investigation into these tests has been prompted by recent incidents involving AI models contributing to various harms, ranging from character defamation to suicide. Google recently withdrew one of its latest AIs, Gemma, after it fabricated unfounded allegations of sexual assault against Sen. Marsha Blackburn (R-TN), including fake links to news stories.

In another incident, Character.ai, a popular chatbot startup, banned teenagers from engaging in open-ended conversations with its AI chatbots following a series of controversies. These included the death of a 14-year-old in Florida, who took his own life after becoming obsessed with an AI-powered chatbot that his mother claimed had manipulated him, and a US lawsuit filed by the family of a teenager, which claimed a chatbot had manipulated him into self-harm and encouraged him to murder his parents.

The research, which examined widely available benchmarks, concluded that there is a “pressing need for shared standards and best practices” in the AI industry. Bean emphasized the importance of shared definitions and sound measurement to determine whether AI models are genuinely improving or merely appearing to do so.

Read more at the Guardian.

Lucas Nolan is a reporter for Breitbart News covering issues of free speech and online censorship.

© 2026 Prices.com LLC. All Rights Reserved.