    NEWS ON CLICK
    Anthropic’s Claude Fable 5 plays it too safe on safety, developers say

    By News Desk, June 11, 2026
    Anthropic on Tuesday launched Claude Fable 5, its most capable public model. But within two days, users began reporting that its safety system was blocking benign or legitimate prompts.

    Fable 5 is the first public model derived from Anthropic’s Mythos family, whose original iteration showed unusual skill during training at finding software bugs and exploiting them to disrupt or take control of systems. That raised enough concern inside Anthropic that the company grouped cybersecurity with other high-risk domains, including biology and chemistry, when setting limits on Mythos-derived public models.

    For Fable 5, that means prompts flagged as sensitive in those areas are routed to Claude Opus 4.8, a less capable model with its own guardrails. Anthropic says the fallback affects about 0.05% of queries and notifies users when it happens.

    But reports of false positives quickly mounted. That’s because Anthropic erred on the side of caution when it designed the classifiers used to detect and downgrade potentially dangerous uses of its model. The company also struggled to balance accuracy with transparency.

    Try telling that to developers. Across social media, people have complained about Claude Fable 5 rejecting queries about everything from RNA sequencing data for sheep to résumé editing to shopping lists.

    “The word ‘cancer’ is flagged as a biosecurity risk by Claude Fable 5!” said scientist Derya Unutmaz on X. “Our Anthropic overlords deciding which prompts the peasants are allowed to use,” added founder and developer Bojan Tunguz on X.

    Anthropic now says it’s working on the problem. “A hidden safeguard is harder to probe and work around,” Anthropic says in a statement emailed to Fast Company. “This means the safeguards can be targeted much more narrowly. A visible safeguard needs to cast a wider net to be more robust, resulting in more requests being incorrectly flagged.”

    “We made the wrong tradeoff and we apologize for not getting the balance right,” the company adds. 

    Anthropic says it is refining the classifiers so that fewer queries trigger false positives. For Claude subscribers, query downgrades (to Opus 4.8) will be more obvious. Developers accessing Fable 5 via the Claude API will see a reason for the model’s refusal of a prompt, the company says.

    Meanwhile, at least one AI researcher appears to have coerced Fable 5 into responding to a banned prompt. Pliny the Liberator claimed on X to have bypassed Fable 5’s filters roughly 24 to 48 hours after launch. Pliny described using a multi-agent approach involving a previously jailbroken Claude Opus 4.8, along with techniques including query decomposition, long-context framing, fiction and narrative structures, and academic taxonomies.

    Before launch, Anthropic said more than 1,000 hours of internal and external red-teaming, including bug bounty efforts, had identified no universal jailbreaks. The company has acknowledged that preventing all sophisticated, multi-turn, or agentic attacks is likely not possible and says it continues to refine its classifiers.
