GDS pledges to learn from coding gaffe that caused GOV.UK search issues
Organisation promises ‘multiple steps’ to ensure mistakes are not repeated
The Government Digital Service has indicated that it is taking “multiple steps” to make sure it avoids repeating the coding mistakes that led to the GOV.UK search engine delivering duplicate and missing results.
The GOV.UK search function runs on open-source search technology Elasticsearch, which GDS has customised with an “API wrapper” called Rummager. On 18 May 2017, GDS made “what should have been a simple code change to Rummager, to make it compatible” with the upgraded version of the search engine, Elasticsearch 2.4.
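To illustrate what an “API wrapper” in this position does, the sketch below shows a minimal, hypothetical wrapper that translates an application-level search into an Elasticsearch query body. The class, field names and index name are all invented for illustration; this is not GDS’s actual Rummager code.

```python
import json

class SearchWrapper:
    """Illustrative stand-in for an API wrapper like Rummager: it turns a
    simple application-level search into an Elasticsearch request body.
    All names and fields here are hypothetical, not taken from GDS code."""

    def __init__(self, index="govuk"):
        self.index = index

    def build_query(self, keywords, doc_format=None):
        # Translate the application's search terms into Elasticsearch's query DSL.
        query = {"query": {"bool": {"must": [{"match": {"title": keywords}}]}}}
        if doc_format:
            # Restrict results to one document format, e.g. pages shown on a "finder".
            query["query"]["bool"]["filter"] = [{"term": {"format": doc_format}}]
        return query

wrapper = SearchWrapper()
body = wrapper.build_query("passport renewal", doc_format="guide")
print(json.dumps(body, indent=2))
```

The value of such a wrapper is that publishing applications never talk to Elasticsearch directly, which is also why a change to the wrapper alone could break search for the whole site.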
But the code change “unexpectedly caused problems for a small percentage of users on GOV.UK”, GDS said. These problems involved some specialist pages failing to appear in the dedicated “finder” pages for that topic or area, and some results appearing in duplicate.
These issues were spotted within 24 hours, GDS said, and the offending code change – which amounted to no more than three lines – was rolled back. However, about 3,000 GOV.UK pages were updated by content teams while the faulty code was live.
“We decided against using our nightly backup to restore GOV.UK to a previous time, because otherwise these content updates from publishers would have been wiped out from search,” GDS said. “It would have been harder to repopulate the missing data within search than to correct the corrupt data, because this would have involved resending data from all of the GOV.UK publishing apps to Rummager.”
It added: “After rolling back the faulty code, our developers fully investigated and fixed the search documents that had been updated while that code was live. During this period of investigation, the quality of search results was affected.”
This was caused, GDS explained, by the disabling of a function that gives higher rankings in search results to pages that generate a greater volume of traffic.
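Popularity-weighted ranking of this kind can be sketched as follows. The formula and names are invented for illustration; GDS has not published the weighting its function uses.

```python
import math

def rank_results(results, traffic):
    """Hypothetical sketch of popularity-weighted ranking: pages attracting
    more traffic are boosted above equally relevant but less-visited pages.
    `results` maps page path -> base relevance score; `traffic` maps
    page path -> recent page views. Neither the formula nor the names
    come from GDS's code."""
    def score(path):
        base = results[path]
        views = traffic.get(path, 0)
        # Log-scale the traffic boost so very popular pages don't swamp relevance.
        return base * (1 + math.log1p(views) / 10)
    return sorted(results, key=score, reverse=True)

pages = {"/renew-passport": 1.0, "/passport-fees": 1.0}
views = {"/renew-passport": 50000, "/passport-fees": 300}
print(rank_results(pages, views))  # the busier page ranks first
```

With the boost disabled, the two equally relevant pages above would tie, which is the kind of quality degradation GDS describes during the investigation period.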
GDS ultimately discovered that the problems were caused by the “_id” and “_type” values of certain pages. These values were assigned to pages by Rummager, but Elasticsearch 2.4 was unable to read them as identifiers.
“We hadn’t updated some parts of the Rummager code that depended on those redundant fields, and this is what caused the duplication and missing search errors,” GDS said. “We didn’t catch this during code review, and our automated tests didn’t catch it either. Part of the problem was that documents needed to be indexed twice by Rummager before the problem was visible to users.”
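A toy model shows how a bug like this can hide until a document passes through the indexer twice. In this invented example (not GDS’s code), the indexer keys documents on a “_id” field but strips underscore-prefixed fields before storing them, so the second indexing pass no longer sees the identifier, falls back to a generated key, and creates a duplicate.

```python
def index_document(store, doc):
    """Toy illustration (not GDS's code) of a bug that only surfaces on the
    second indexing pass. Documents are keyed on a "_id" field, but
    underscore-prefixed fields are stripped before storage -- so when the
    stored copy is re-indexed, "_id" is missing, a fallback key is
    generated, and a duplicate entry appears."""
    key = doc.get("_id", "generated-%d" % len(store))
    stored = {k: v for k, v in doc.items() if not k.startswith("_")}
    store[key] = stored
    return store

store = {}
doc = {"_id": "page-1", "title": "Renew your passport"}
index_document(store, doc)              # first pass: stored under "page-1"
index_document(store, store["page-1"])  # re-index the stored copy: "_id" is gone
print(len(store))  # 2 -- the duplicate only appears on the second pass
```

A single-pass automated test would see one correctly keyed document and pass; only a test that re-indexes already-stored documents would catch the duplication, which matches GDS’s account of why its code review and tests missed the bug.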
Having replaced the errant code and, where necessary, written new code to ensure unnecessary duplicate documents were deleted, GDS said it has learned from the experience.
“This incident taught us that you need to understand the core concepts and constraints of your database and always check your assumptions,” it said. “There are multiple steps we are taking to make sure this doesn’t happen again.”