An audit report that cost taxpayers nearly US$290,000 included fabricated references attributed to AI-generated content from GPT-4o, exposing a failure of basic verification.
The Australian government is set to receive a partial refund from Deloitte Australia after a commissioned welfare technology audit was revealed to contain numerous errors, many linked to undisclosed use of generative AI tools in the report’s drafting.
The Department of Employment and Workplace Relations had contracted the firm for nearly A$440,000 to review automated penalty systems within Australia's welfare system, but the report, first published in July 2025, was subsequently flagged for fabricated citations and fake legal references, including a nonexistent Federal Court judgment.
The errors came to light through academic scrutiny, most notably by Sydney University's Chris Rudge, who identified up to 20 fictitious references, including a non-existent academic book attributed to a real law professor at the institution. Rudge raised wider concerns about the reliability of AI-generated content in legal audits, noting that errors such as misquoted court cases could undermine the audit's core function of assessing departmental compliance.
In response, the Department released a revised version on October 3, removing inaccurate citations and, for the first time, disclosing that Azure OpenAI’s GPT-4o model had been involved in compiling sections of the report.
While department officials state that the report's recommendations remain unchanged (a point Rudge disputes, according to an Ars Technica report), Deloitte has agreed to repay the contract's final instalment, with the exact amount to be disclosed once the transaction concludes.
Deloitte said the matter had been resolved directly with its client and declined to comment on how much of the problem stemmed from AI use.
Senator Barbara Pocock of the Australian Greens has called for a full refund, criticizing the firm’s “misuse” of AI and likening the errors to academic misconduct. The controversy has heightened calls for rigorous controls over AI-generated material in sensitive public sector consulting.
Why basic fact-checking protocols were not followed in a report issued by a Big Four consultancy, one that trades on its reputation for excellence, remains unanswered. Ultimately, the core issue in this controversy is not the firm's use of AI tools but its failure to apply elementary diligence, an omission that undermines fundamental professional standards.