AI firms must address hallucinations before GOV.UK chatbot can roll out, digital chief claims

Trials of generative AI tool for government website have found that automated system can become confrontational – or even ‘seductive’ – meaning that the technology is not yet suitable for widespread deployment

Trials of a generative AI-powered chatbot for GOV.UK users have found ongoing issues with so-called hallucinations that must be addressed before the technology can be widely deployed, according to one of the government’s digital leaders.

GOV.UK Chat – which is designed to provide “human-like responses” to users’ questions – was developed by the Government Digital Service, and is based on technology from OpenAI, the creator of ChatGPT. The government chatbot recently went through a pilot exercise in which 1,000 citizens were invited to try out the platform.

An update recently posted by GDS claimed that the testing process had shown promise but shed light on “issues of accuracy and reliability”.

These included several reported hallucinations – the term for cases in which an AI system wrongly identifies or misunderstands patterns and provides responses that are inaccurate, nonsensical – or worse, according to Paul Willmott, chair of the government’s Central Digital and Data Office.

Speaking at an event this morning, he said: “We have experimented with a generative advice [tool] on GOV.UK. You will just say ‘I’m trying to do this’, or ‘I’m annoyed about this’… The challenge we are having – which is exactly the same as in the commercial sector – is what to do with the 1% of hallucinations where the agent starts to get challenging, or abusive – or even seductive.”

Even if only present in a tiny minority of instances, these issues mean that GOV.UK Chat is not yet ready for widespread deployment, according to Willmott. Addressing hallucinations will require the support of the likes of OpenAI and other creators of large language models.

“Until we have managed to iron that out – which will require the support of the foundational model creators – we won’t be able to put this live,” he said.

Elsewhere, the CDDO chair – whose part-time role is split with his day job as chief digital adviser for the LEGO Brand Group – said that his mantra for ministers interested in artificial intelligence is that “you cannot have the cherry unless you’re prepared to pay for the cake”.

“That’s infrastructure, moving things to the cloud, getting data standardised – you need to invest in the right way to get the benefits of AI,” he added.

Willmott’s comments come in the same week that one of government’s largest organisations, the Department for Work and Pensions, effectively banned its 90,000 employees from using ChatGPT for any official business or on any government-owned device. The DWP’s decision was revealed less than three weeks after the CDDO published the Generative AI Framework for HMG – a set of government-wide guidelines setting out 10 principles to inform civil servants’ use of LLMs. The last of these tenets is that any official wishing to access generative AI tools “must make sure you’re acting in line with the policies of your organisation”.

Sam Trendall

PublicTechnology
