EXCL: HMRC looks to improve intel with ‘tuning’ of 55 billion item compliance database


Connect has been in operation for 15 years and matches data from over 130 sources to help detect tax avoiders – although ‘human insight always makes the final judgement’, department claims

HM Revenue and Customs is to spend up to £3m conducting a “tuning” exercise to help address issues of “over and under linking” of 55 billion items of data used to help detect individuals and businesses evading tax.

The Connect system was launched about 15 years ago and draws data from more than 130 sources. These are understood to include: domestic and overseas bank, credit card, and other financial information; web browsing, social media and online shopping records; Land Registry files and details of rented properties; travel and border-entry data; HMRC’s own personal and business tax returns; and scores of other sources.

The controversial database – which has been decried by industry critics for “spying” and conducting “Big Brother surveillance” activities – cross-references data and uses artificial intelligence and automation to match entries from across this array of sources to individual people or entities. The aim is to help detect those who have provided HMRC with inaccurate information on their income, assets, or spending.

A total of about 4,500 departmental analysts focused on compliance use the intelligence gained from the Connect system to help support investigations which, since the platform’s creation, are reported to have resulted in the state clawing back billions of pounds of unpaid revenue.

But, according to freshly released commercial documents, there is a growing need to conduct work to clean up the database, and ensure that information on people and organisations is being matched correctly.

“Over the [last] 15-plus years new data sources have been added and redundant data sources have been removed,” HMRC said in a newly published contract. “Some data linking has been lost due to metadata key data no longer available. This work will look to improve the quality of the data to ensure the correct tax is collected and the tax gap is reduced.”

It added: “[The work] is required to: investigate the source of over- and under-linking within the monthly compliance build and outputs to [compliance analytics tools]; validate identifiers across sources and introduce master keys to improve matching; address entity tuning; test and apply natural networks; develop the code to rectify poor data; [and] test code changed and quantify benefits.”
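To illustrate the two failure modes the contract names, the following is a minimal sketch and not a description of how Connect actually works. The records, field layout and the `master_key` rule are all hypothetical, invented purely for illustration. "Over-linking" here means records for different people being merged into one entity, and "under-linking" means records for the same person being left apart.

```python
from collections import defaultdict

# Hypothetical records: (source, name, date_of_birth, true_person).
# true_person is ground truth, used only to score the linking.
RECORDS = [
    ("tax_return", "J Smith", "1980-01-01", "A"),
    ("land_registry", "John Smith", "1980-01-01", "A"),
    ("bank", "J Smith", "1975-06-30", "B"),
]


def link(records, key_fn):
    """Group record indices by the matching key produced by key_fn."""
    groups = defaultdict(list)
    for i, rec in enumerate(records):
        groups[key_fn(rec)].append(i)
    return list(groups.values())


def linking_errors(records, groups):
    """Return (over_linked_pairs, under_linked_pairs) for a grouping."""
    group_of = {i: g for g, members in enumerate(groups) for i in members}
    over = under = 0
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            same_person = records[i][3] == records[j][3]
            same_group = group_of[i] == group_of[j]
            if same_group and not same_person:
                over += 1   # different people merged into one entity
            elif same_person and not same_group:
                under += 1  # one person split across entities
    return over, under


def master_key(rec):
    """Hypothetical master key: lower-cased surname plus date of birth."""
    return (rec[1].split()[-1].lower(), rec[2])


# Matching on the raw name merges two different people and splits one person.
by_name = link(RECORDS, lambda r: r[1])
# Adding date of birth removes the false merge but still splits person A.
by_name_dob = link(RECORDS, lambda r: (r[1], r[2]))
# A normalised key links both of A's records and keeps B separate.
by_master_key = link(RECORDS, master_key)

for label, groups in [("name", by_name),
                      ("name + dob", by_name_dob),
                      ("master key", by_master_key)]:
    print(label, linking_errors(RECORDS, groups))
```

In this toy data, a looser key over-links and a stricter key under-links, which is why a validated, normalised identifier is the usual route to reducing both at once. Real-world matching at this scale would involve far messier data and more sophisticated techniques than an exact key.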


To deliver these objectives, HMRC has signed an initial £1.5m one-year deal for a “network tuning” exercise to be led by BAE Systems Applied Intelligence.

The agreement, which can be extended for a further 12 months, will require the defence firm to “design, build, run and improve analytics and cognitive products”. The text of the contract explains that an external engagement was necessary as “the HMRC team that runs Connect… needs contractor support to provide a highly specialised skill set that it is very difficult to recruit for as permanent civil servants”.

The BAE contractors will join a project that is already under way, with the contract stating that “work has already begun specifically to look at tuning [data on] businesses, individuals, self-assessment, VAT… and this work has seen far more accurate cases being built by analysts and less workarounds in place cutting down the time to deliver cases and increasing productivity”.

PublicTechnology contacted HMRC and asked for any further available information on the tuning exercise, and whether any current data-quality issues have resulted in investigations against people or businesses being launched on false pretences. We also asked for an update on the scope and remit of Connect more generally as, despite its decade and a half in operation, government has published very limited information on the tool over the years.

In response, a spokesperson said: “Connect is a powerful analytical tool that we have used since 2010, which has helped make us a world leader in using data and insight to inform and manage risk. The Connect system is not the sole deciding factor in beginning or deciding the direction of a tax investigation. Other factors are also considered and human insight always makes the final judgement.”

Connect is currently undergoing a wider upgrade process, via the Protect Connect Programme that is part of government’s major projects portfolio. According to the most recent set of project data released by HMRC, the programme “aims to safeguard the operation of HMRCs most critical repayment risking services, future-proofing them by hosting them in the cloud and laying the essential foundation for development of future strategic risking capabilities, [which] aligns both the HMRC compliance and IT strategies, enhancing the understanding of customers and developing increased insight using a single data and analytics platform”.

As of the end of the 2022/23 fiscal year, the ultimate delivery cost of Protect Connect stood at £235m – a figure which was more than double the £115m estimate of the prior year.

“The whole life cost increase is primarily due to the programme business case lifecycle being extended from five to nine years in order to comply with HM Treasury Green Book standards to include a 5 year run period, post go-live,” according to HMRC transparency data.

Sam Trendall
