RECENT STORIES:

Addressing digital sovereignty in a data-driven world
SOLAR & STORAGE LIVE THAILAND 2026: LEADING THE FUTURE OF SUSTAINA...
PhotonPay Expands UK Local Payment Rails via New Collaboration with Cl...
Rich Sparkle (ANPA.US) Shockingly Announces: Gathering Global Top Star...
ATFX Releases Q1 2026 Trader Magazine Spotlighting Policy Divergence a...
MediCapture Launches aiScope™ Pilot for Veterinary Sciences at VMX 202...
LOGIN REGISTER
DigiconAsia
  • Features
    • Featured

      When AI and IoT converge

      When AI and IoT converge

      Thursday, January 15, 2026, 12:36 PM Asia/Singapore | Features
    • Featured

      Low-code platform enables digital-first agility

      Low-code platform enables digital-first agility

      Friday, December 26, 2025, 1:38 AM Asia/Singapore | Case Studies, Features
    • Featured

      Agents of change – the future of AI-powered e-commerce

      Agents of change – the future of AI-powered e-commerce

      Wednesday, December 24, 2025, 1:22 PM Asia/Singapore | e-Commerce, Features
  • News
    • Featured

      Asia Pacific’s first intelligent legal work platform

      Asia Pacific's first intelligent legal work platform

      Wednesday, January 14, 2026, 2:50 PM Asia/Singapore | News, Newsletter
    • Featured

      AI leader dismisses dire warnings about irresponsible AI as industry sabotage

      AI leader dismisses dire warnings about irresponsible AI as industry sabotage

      Tuesday, January 13, 2026, 4:44 PM Asia/Singapore | News, Newsletter
    • Featured

      Cybersecurity hiring stagnates as AI tools take center stage

      Cybersecurity hiring stagnates as AI tools take center stage

      Monday, January 12, 2026, 4:50 PM Asia/Singapore | Future of Work, News, Newsletter
  • Perspectives
  • Tips & Strategies
  • Whitepapers
  • Awards 2023
  • Directory
  • E-Learning

Select Page

News

How psychological tactics can expose refusal limits of LLM: preprint

By DigiconAsia Editors | Monday, September 8, 2025, 5:07 PM Asia/Singapore

How psychological tactics can expose refusal limits of LLM: preprint

New research reveals new strategies that can induce large language model away from refusing to process forbidden topics.

Recent research from Northeastern University has suggested that psychological manipulation techniques can prompt large language models (LLMs) to answer questions they ordinarily refuse to address.

The preprint, authored by Can Rager, Chris Wendler, Rohit Gandikota, and David Bau, details a systematic testing of numerous prompts against various AI models, showing how certain persuasive and iterative strategies can dramatically increase compliance rates on forbidden topics.

The study introduces “refusal discovery”, a new task aimed at identifying and cataloging the range of subjects that models have been trained to reject. Using a method called token prefilling, the researchers uncovered an expansive list of sensitive topics, including political controversies, personal insults, and chemical processes that are generally blocked for safety reasons.

Strategies used include:

  • Gradually escalating requests
  • Invoking respected authorities
  • Constructing context-rich narratives
  • The Iterated Prefill Crawler (IPC) approach

Skillful use of these strategies had led to a significant rise in the frequency of prohibited responses. In benchmark tests, the “crawler” approach enabled the retrieval of nearly all censored topics, while testing on models from mainland China exposed consistent suppression of political criticism and other sensitive content.

Variation in how models refuse prompts emerged as a key finding. The team had observed differences that stem from distinct fine-tuning protocols, data sources, and technical adjustments such as quantization. Some released models that claimed to be uncensored were shown to reintroduce refusal behaviors following quantization, raising new questions about the reliability of so-called “decensored” public releases.The researchers argue that static benchmarks are insufficient, recommending persistent, dynamic auditing to track shifting refusal boundaries as both models and adversarial strategies evolve. Their findings suggest that a deep understanding and enumeration of what models will and will not discuss, plays a vital role in the safe deployment and governance of powerful modern LLMs.

According to the authors, transparency, accountability, and ongoing scrutiny are essential as these systems continue to shape information access and public discourse.

Share:

PreviousSurvey of Hong Kong workers explores challenges confronting workers’ financial well-being
NextAduna and SK telink Announce Collaboration to Bring Korea Into the Global Network API Ecosystem

Related Posts

No more passport fumbling and multiple ID-verification delays at this airport

No more passport fumbling and multiple ID-verification delays at this airport

August 21, 2020

Is Edge Computing an opportunity, threat or distraction for telco operators?

Is Edge Computing an opportunity, threat or distraction for telco operators?

August 13, 2020

Telcos need to address inflation risks amid runaway inflation levels

Telcos need to address inflation risks amid runaway inflation levels

February 1, 2023

Are embedded finance and Web3 the eye of 2023 financial storm?

Are embedded finance and Web3 the eye of 2023 financial storm?

February 27, 2023

Leave a reply Cancel reply

You must be logged in to post a comment.

Awards Nomination Banner

gamification list

PARTICIPATE NOW

top placement

Whitepapers

  • Achieve Modernization Without the Complexity

    Achieve Modernization Without the Complexity

    Transforming IT infrastructure is crucial …Download Whitepaper
  • 5 Steps to Boost IT Infrastructure Reliability

    5 Steps to Boost IT Infrastructure Reliability

    In today's fast-evolving tech landscape, …Download Whitepaper
  • Simplify Payroll Setup for Your Small Business

    Simplify Payroll Setup for Your Small Business

    In our free guide, "How …Download Whitepaper
  • Overcoming the Challenges of Cost & Complexity in the Cloud-first Era.

    Overcoming the Challenges of Cost & Complexity in the Cloud-first Era.

    Download Whitepaper

Middle Placement

Case Studies

  • Harnessing the data lakehouse and AI to revolutionize customer experience

    Harnessing the data lakehouse and AI to revolutionize customer experience

    UOB achieved 99% cash availability …Read More
  • Bhutan sovereign wealth fund pilots offline data relay to stabilize distributed-ledger challenges

    Bhutan sovereign wealth fund pilots offline data relay to stabilize distributed-ledger challenges

    Amid remote connectivity gaps in …Read More
  • Low-code platform enables digital-first agility

    Low-code platform enables digital-first agility

    Few industries demand agility and …Read More
  • Going green all the way to Cyberjaya: Labuan Reinsurance’s data center relocation

    Going green all the way to Cyberjaya: Labuan Reinsurance’s data center relocation

    Relocation boosts sustainability, while a …Read More

Bottom Sidebar

Other News

  • SOLAR & STORAGE LIVE THAILAND 2026: LEADING THE FUTURE OF SUSTAINABLE ENERGY IN THAILAND

    January 15, 2026
    BANGKOK, Jan. 15, 2026 /PRNewswire/ …Read More »
  • PhotonPay Expands UK Local Payment Rails via New Collaboration with ClearBank

    January 15, 2026
    HONG KONG, Jan. 15, 2026 …Read More »
  • Rich Sparkle (ANPA.US) Shockingly Announces: Gathering Global Top Stars Khaby Lame and Crazy Little Brother Yang to Launch a New Era of Capitalization for 700 Million Fans

    January 15, 2026
    Building a Super Commercial Entity …Read More »
  • MediCapture Launches aiScope™ Pilot for Veterinary Sciences at VMX 2026

    January 15, 2026
    Medical AI Made Easy—From Lab …Read More »
  • ATFX Releases Q1 2026 Trader Magazine Spotlighting Policy Divergence and Global Market Volatility

    January 15, 2026
    HONG KONG, Jan. 15, 2026 …Read More »
  • Our Brands
  • CybersecAsia
  • MartechAsia
  • Home
  • About Us
  • Contact Us
  • Sitemap
  • Privacy & Cookies
  • Terms of Use
  • Advertising & Reprint Policy
  • Media Kit
  • Subscribe
  • Manage Subscriptions
  • Newsletter

Copyright © 2026 DigiconAsia All Rights Reserved.