Round table - The Big Data debate

Written by Colin Marrs on 29 October 2014 in Features

From norovirus to illegal gangmasters, Big Data is helping government get to grips with serious issues. Public servants gathered to share their experiences.

According to technical blog thedvnc.org, over 40% of the computing power ever manufactured was built in 2013. That demonstrates the staggering rise in information processing capacity – and this growth has been matched by rapid increases in the amount of information available for processing. The result is a world of ‘big data’, in which information can be combed through and analysed to identify previously hidden trends, dynamics and connections.

For public servants, big data provides a way to gather much stronger evidence to underpin policy-making. Challenges remain, however – from identifying the most useful data sets, to understanding the results and feeding back into the policy-making process. In September a round table, held in association with data analytics provider SAS, discussed how officials can use big data to help shape policy decisions and provide feedback on impacts.

Using big data in government

As the conversation got underway, Simon Dennis from SAS outlined the changes that have transformed the data analysis landscape in recent years: “For a long time HMRC could only analyse 10% of their data because the processing power wasn’t there,” he said. “It is now relatively cheap to analyse 100% of the data.”

James Miller from HMRC stressed that the speed at which data can now be analysed is as important as the ability to handle large volumes of information – and the ‘real time’ updating system established to support Universal Credit means that recent data can now be examined and analysed. “Tax data has had a huge time lag on it of a year or more, and that makes it quite hard to understand what policy changes should be implemented,” he said. “Getting to analysing data in real time allows us to recognise the effect of operational changes more quickly and to have policy options we never thought of before.”

His HMRC colleague, Gill Standen, added: “We can see trends when they start, and react accordingly. We have already seen that with the policies we have changed with regard to employment income and subsistence allowances.”

John Sheridan from The National Archives has identified another way to use analytics to inform policy: he’s carrying out a project to provide legal researchers with the technology to understand how the statute book works and use those insights to deliver better law. “There was a raft of things that were too hard to do, or where it wasn’t obvious how to go about counting things or measuring them, that have now become easy,” he said.

Analysing data across organisations

Departments may also be able to derive valuable insights from accessing datasets belonging to other parts of government or non-governmental organisations.

For example, DWP is using information gleaned directly from HMRC’s computers to determine what an absent parent should pay towards the upkeep of their children: “We used to assess child maintenance liabilities by asking the various parties to produce evidence. The new system makes the service quite a lot quicker, and it’s cheaper,” explained Adam Taylor from DWP.

Standen said HMRC’s newfound ability to analyse lots of information quickly has helped other departments, too: “It has allowed us to do some amazing cross-government work, particularly with the National Crime Agency, Cabinet Office and DWP with regards to tackling modern-day slavery. The sharing of that data and its availability in real time has real-life outcomes, like the arrest of illegal gangmasters.”

She added that HMRC is also benefiting from data gathered outside the organisation: “Using information from representative bodies in the finance and accounting industries has enabled us to map out the peaks and troughs in tax return filing behaviours,” she explained, giving HMRC the information to plan its staffing levels.

Meanwhile, Penny Bramwell from the Food Standards Agency had another example of the benefits of mining information from other sources: “We are interested in the application of big data to get earlier insights about food fraud and intervene more quickly where there are food safety incidents,” she said. “We co-sponsor a project that generates genome sequences of food pathogens. Last year the US was able to use it to work out that there was about to be an incident, and intervened to stop contaminated food entering the supply chain.”

Exploiting data from social media

One of the topics that generated the liveliest debate was the potential for using information gleaned from informal, non-governmental sources such as social media.

Sue Bateman is leading work in the Cabinet Office to establish pilot projects that illustrate the benefits of data science. One venture examined the Twitter accounts of several ambassadors to find out who they were reaching: “They wanted to understand the impact of their messaging and how different their virtual network was from their traditional face-to-face network,” she explained. Bramwell added that outbreaks of the norovirus, or winter vomiting bug, have been successfully identified at an early stage by analysing the number of people within a location tweeting that they are feeling unwell.

Duncan Gilchrist from DWP questioned whether Twitter and Facebook users are sufficiently representative of the population for the information derived to be reliable: “Have we got enough digital penetration yet that big data from social media is usable in government, or is it going to give us a pretty distorted picture?” he asked.

“This has some pretty risky implications,” responded Matt Walker, a statistician at DECC. “We try in government statistics not to make inferences. That is the difference between formal government statistics, which are important for government reporting, and data from social media. It is an interesting add-on but I don’t think we should move totally away from our routine statistics.”

“It is important to think about the biases that could be built into certain data sets,” Sue Bateman concurred. Meanwhile, Bramwell argued that the information’s usefulness depends on the question being asked. In the case of the norovirus data, she said: “It is a signal – and a bigger signal than the notifications to GPs and hospitals.”

Ethical issues and public trust

Some of the participants questioned the ethics of using social media and other personal data. “We are not Facebook. If you are government you cannot be in the business of grotesquely exploiting your citizens’ information,” warned Sheridan. “Government needs to do more to retain public trust. It must be much less gung-ho than a large US company operating with a very enabling set of terms and conditions.”

Helen Fleming of the Competition and Markets Authority believes that people are increasingly sensitive to how data is used: “Big data has given any organisation the ability to use data in a way that gets much deeper and richer information about people,” she said. “Government needs to keep in line with the way consumers and citizens feel about these sorts of things. We need to get some fair and visible terms and conditions in place for citizens to sign up to.”

Making sure that individuals are not identifiable from the information in a data set, a process known as anonymisation, is a vital technique for addressing such worries, said Sheridan: “But it turns out it’s quite hard. No-one knows how to do it really well in all of the different contexts.”

Misuse is not citizens’ only concern, suggested Ulele Andrews of the central policy profession team: “The other risk is that if government actively collects additional data, people are concerned about it getting lost or hacked. The loss of eBay passwords – that sort of thing really scares people,” she said.

Big data for policy making

Adam Taylor argued that data analysis might help support evidence-based policy-making: “There is always a danger in a democracy that what people think on a given day will drive policy, rather than data which actually shows whether there is a correlation or causal link between a phenomenon and a solution. The good thing about big data is that it is a pre-existing set of findings which is sitting there to be tapped into,” he said.

Miller agreed, arguing that politicians are already keen to base policy decisions on sound information: “I think ministers are concerned about what the underlying trends are. There’s nothing worse than a U-turn when it all goes wrong because you didn’t understand it,” he said.

The information revolution could also provide swifter evidence about policies’ effectiveness, added Gilchrist: “Generally, the time before data on a major policy change can be measured is sufficiently long that very few ministers are held to account for it. This might change that.”

Big data could also lead to more challenges from the public, said Fleming: “People could pull out interesting bits of information that are credible because they are based on decent data, and stand them up in opposition to government policy.”

Sheridan responded that this could drive change and improvement: “You will have to be on your mettle because you are going to be held to account,” he said. “You will start seeing some of the rigour that you see in other realms. When it works well, peer review in academia is fantastic – so what does that look like with policy-making?”

Designing more intelligent services

Private companies are using data analysis to single out individual customers and offer personalised services – and government can do the same, said Fleming: “The citizens’ expectations of us will be different. We can do things much more intelligently in future if we have information that allows us to target people at the right moment. We will get much better value and they will get a much better service.”

John Tibble from SAS argued that big data offers operational benefits for areas such as border control: “There is an awful lot of work done on trying to discover the people that they are most interested in, but the flip side of that is they can work out the people they are least interested in, which may inform their policy about how they treat those people,” he said. “The speed at which information can be analysed is important in a complex process like this.”

Angela Measures, who works in customer design strategy at HMRC, outlined the benefits to customer service already accruing from having more information about service users: “At HMRC the people who are designing business plans now stop to think about how well the service is being designed around the data that we have about customers,” she said.

Capability and skills

One of the obstacles to making better use of big data is a shortage of people who can step outside their civil service professions and add an understanding of data analysis and statistics to their existing skills, argued Sheridan. “You need to combine enough knowledge about the data with enough knowledge of the kinds of questions that you want to ask of that data in order to be able to frame really useful queries,” he said. “That will rely on people in each of the areas converging towards some of the same territory.” He added that designing computer systems which allow non-data scientists to pose and answer questions will also be part of the solution.

The policy profession’s Andrews suggested another approach: “I agree about the need to broaden out skills,” she commented, “but what about working more closely in multi-disciplinary teams so that each individual doesn’t have to have the full picture?”

Bateman suggested a middle path: “We need some kind of mixed team. In Cabinet Office we are talking very actively to the analytical professions, and embracing some of these new techniques is a big step change for them. We also need our policy-makers and people who are designing services to know that the world is changing, and while we don’t want them to become data scientists we want them to know enough about the change to be able to intelligently ask questions and to be able to use some of the insights coming out of the analysis in policy-making.”

The enormous step change over the past few years in the quantity of data available and the power to process it offers huge opportunities for creating and delivering more intelligent services. However, the big data revolution is so novel and its implications so wide-ranging that it will be years before it can be fully harnessed.

This article was first published on Civil Service World, sister title to PublicTechnology.net
