Advanced software tools can rapidly strip safety controls from generative AI models: report

Multiple investigations show that available software can bypass AI guardrails in minutes, enabling harmful outputs and highlighting vulnerabilities and regulatory concerns.

According to a Financial Times (FT) investigation this week, special software tools can remove built-in safety controls from Meta and Google generative AI systems within minutes. Once altered, the models were no longer restricted from addressing harmful topics such as biological threats, malicious software, and illegal exploitation.

Highlighting concerns about how fragile current AI safeguards may be, the FT ran tests to evaluate how easily AI guardrails could be bypassed. The results showed that widely available toolkits can override safeguards using methods such as targeted fine-tuning, adversarial training data, and automated prompt manipulation.

These approaches do not require retraining a model from scratch but instead adjust behavior enough to bypass restrictions. The FT report noted that such tools are already being used to produce large numbers of modified models with weakened or removed safeguards.

Multiple clear indications of AI jail-breakability

These findings align with a growing body of research suggesting that current alignment techniques may be fundamentally vulnerable.

A study published earlier this year in Nature Communications found that advanced AI systems could act as automated jailbreak agents, successfully bypassing protections in most cases without human input.
Another paper, presented at the International Conference on Learning Representations 2026, introduced a method known as Head-Masked Nullspace Steering, which disables specific internal mechanisms responsible for enforcing refusals and achieves extremely high success rates in defeating safety measures.
The issue is especially pronounced for open-weight models from Meta and Google. While making model weights publicly accessible supports innovation and research, it also allows users to alter systems in ways that remove safety features.
Security experts have pointed out that many protections are only applied at a superficial level, meaning that once the underlying model is accessible, those safeguards can be stripped away using readily available techniques.
Earlier reporting from The New York Times has reinforced these concerns, citing research from cybersecurity firm LayerX that showed how easily safety protections could be bypassed in other leading AI systems.

Regulators in the US, EU, and UK are increasingly signaling that voluntary safety commitments by AI firms may not be enough. This could lead to increased pressure for enforceable standards across both proprietary and open-weight models until stronger safeguards and independent verification mechanisms are in place.
