RECENT STORIES:

Addressing digital sovereignty in a data-driven world
Mitrade CFD Broker Caps 2025 With New Licence, Expanded Market Access ...
6K ADDITIVE SECURES A$48 MILLION THROUGH INITIAL PUBLIC OFFERING ON TH...
Budget Direct Sweeps Money Magazine Awards, Setting Historic Eight-Yea...
Korea’s Startups Poised for a Landmark Season as COMEUP 2025 and...
ONERugged 8-Inch Windows Rugged Tablet M82A Unveiled with a New Level ...
LOGIN REGISTER
DigiconAsia
  • Features
    • Featured

      Where AI will take us in 2026

      Where AI will take us in 2026

      Monday, December 1, 2025, 7:40 PM Asia/Singapore | Features, Perspectives
    • Featured

      The future of data centers in Asia Pacific

      The future of data centers in Asia Pacific

      Thursday, November 27, 2025, 5:04 PM Asia/Singapore | Features, Newsletter
    • Featured

      Where data sparks innovation, trust powers decisions – and impact follows

      Where data sparks innovation, trust powers decisions – and impact follows

      Thursday, November 27, 2025, 9:12 AM Asia/Singapore | Features, Smart Cities
  • News
    • Featured

      When robots can assemble their own kind, work will be optional

      When robots can assemble their own kind, work will be optional

      Wednesday, December 3, 2025, 12:10 PM Asia/Singapore | News, Newsletter
    • Featured

      Going green all the way to Cyberjaya: Labuan Reinsurance’s data center relocation

      Going green all the way to Cyberjaya: Labuan Reinsurance’s data center relocation

      Wednesday, December 3, 2025, 6:10 AM Asia/Singapore | Case Studies, News, Newsletter
    • Featured

      AI to reshape workforce dynamics with global job displacements, new-role creation

      AI to reshape workforce dynamics with global job displacements, new-role creation

      Monday, December 1, 2025, 1:39 PM Asia/Singapore | Future of Work, News, Newsletter
  • Perspectives
  • Tips & Strategies
  • Whitepapers
  • Awards 2023
  • Directory
  • E-Learning

Select Page

News

How psychological tactics can expose refusal limits of LLM: preprint

By DigiconAsia Editors | Monday, September 8, 2025, 5:07 PM Asia/Singapore

How psychological tactics can expose refusal limits of LLM: preprint

New research reveals new strategies that can induce large language model away from refusing to process forbidden topics.

Recent research from Northeastern University has suggested that psychological manipulation techniques can prompt large language models (LLMs) to answer questions they ordinarily refuse to address.

The preprint, authored by Can Rager, Chris Wendler, Rohit Gandikota, and David Bau, details a systematic testing of numerous prompts against various AI models, showing how certain persuasive and iterative strategies can dramatically increase compliance rates on forbidden topics.

The study introduces “refusal discovery”, a new task aimed at identifying and cataloging the range of subjects that models have been trained to reject. Using a method called token prefilling, the researchers uncovered an expansive list of sensitive topics, including political controversies, personal insults, and chemical processes that are generally blocked for safety reasons.

Strategies used include:

  • Gradually escalating requests
  • Invoking respected authorities
  • Constructing context-rich narratives
  • The Iterated Prefill Crawler (IPC) approach

Skillful use of these strategies had led to a significant rise in the frequency of prohibited responses. In benchmark tests, the “crawler” approach enabled the retrieval of nearly all censored topics, while testing on models from mainland China exposed consistent suppression of political criticism and other sensitive content.

Variation in how models refuse prompts emerged as a key finding. The team had observed differences that stem from distinct fine-tuning protocols, data sources, and technical adjustments such as quantization. Some released models that claimed to be uncensored were shown to reintroduce refusal behaviors following quantization, raising new questions about the reliability of so-called “decensored” public releases.The researchers argue that static benchmarks are insufficient, recommending persistent, dynamic auditing to track shifting refusal boundaries as both models and adversarial strategies evolve. Their findings suggest that a deep understanding and enumeration of what models will and will not discuss, plays a vital role in the safe deployment and governance of powerful modern LLMs.

According to the authors, transparency, accountability, and ongoing scrutiny are essential as these systems continue to shape information access and public discourse.

Share:

PreviousSurvey of Hong Kong workers explores challenges confronting workers’ financial well-being
NextAduna and SK telink Announce Collaboration to Bring Korea Into the Global Network API Ecosystem

Related Posts

Global employer of 20,000 employees taps analytics to attract, retain talent

Global employer of 20,000 employees taps analytics to attract, retain talent

February 4, 2022

AI startup sued by whistleblower for alleged unsafe humanoid robots, labor law violations

AI startup sued by whistleblower for alleged unsafe humanoid robots, labor law violations

November 26, 2025

Gaming industry hit hard by layoffs and closures amid AI gold rush

Gaming industry hit hard by layoffs and closures amid AI gold rush

October 22, 2025

Which is more important for AVs — data connectivity or road connectivity?

Which is more important for AVs — data connectivity or road connectivity?

August 31, 2022

Leave a reply Cancel reply

You must be logged in to post a comment.

Awards Nomination Banner

gamification list

PARTICIPATE NOW

top placement

Whitepapers

  • Achieve Modernization Without the Complexity

    Achieve Modernization Without the Complexity

    Transforming IT infrastructure is crucial …Download Whitepaper
  • 5 Steps to Boost IT Infrastructure Reliability

    5 Steps to Boost IT Infrastructure Reliability

    In today's fast-evolving tech landscape, …Download Whitepaper
  • Simplify Payroll Setup for Your Small Business

    Simplify Payroll Setup for Your Small Business

    In our free guide, "How …Download Whitepaper
  • Overcoming the Challenges of Cost & Complexity in the Cloud-first Era.

    Overcoming the Challenges of Cost & Complexity in the Cloud-first Era.

    Download Whitepaper

Middle Placement

Case Studies

  • Going green all the way to Cyberjaya: Labuan Reinsurance’s data center relocation

    Going green all the way to Cyberjaya: Labuan Reinsurance’s data center relocation

    Relocation boosts sustainability, while a …Read More
  • When traditional intelligent business automation hits a roadblock, try AI agents

    When traditional intelligent business automation hits a roadblock, try AI agents

    That is what the Langham …Read More
  • CTBC defines future of transition finance with Evercomm solution

    CTBC defines future of transition finance with Evercomm solution

    Taiwanese bank leverages Evercomm’s AI-powered …Read More
  • Emirates Flight Catering unifies global operations with AI-driven data governance and cloud collaboration

    Emirates Flight Catering unifies global operations with AI-driven data governance and cloud collaboration

    The in-flight caterer modernizes data …Read More

Bottom Sidebar

Other News

  • 6K ADDITIVE SECURES A$48 MILLION THROUGH INITIAL PUBLIC OFFERING ON THE AUSTRALIAN STOCK EXCHANGE

    December 4, 2025
    New capital will drive major …Read More »
  • Mitrade CFD Broker Caps 2025 With New Licence, Expanded Market Access and Record 16 Awards Amid a 20% Expansion in Its User Base

    December 4, 2025
    MELBOURNE, Australia, Dec. 4, 2025 …Read More »
  • Budget Direct Sweeps Money Magazine Awards, Setting Historic Eight-Year “Best of the Best” Record

    December 4, 2025
    BRISBANE, Australia, Dec. 4, 2025 …Read More »
  • Korea’s Startups Poised for a Landmark Season as COMEUP 2025 and CES 2026 Draw Near

    December 4, 2025
    SEOUL, South Korea, Dec. 4, …Read More »
  • ONERugged 8-Inch Windows Rugged Tablet M82A Unveiled with a New Level of Power and Portability

    December 3, 2025
    SHENZHEN, China, Dec. 3, 2025 …Read More »
  • Our Brands
  • CybersecAsia
  • MartechAsia
  • Home
  • About Us
  • Contact Us
  • Sitemap
  • Privacy & Cookies
  • Terms of Use
  • Advertising & Reprint Policy
  • Media Kit
  • Subscribe
  • Manage Subscriptions
  • Newsletter

Copyright © 2025 DigiconAsia All Rights Reserved.