Unstructured data from health forums is an ‘untapped’ resource for policymakers

Written by Rebecca Hill on 19 April 2017 in News

Pilot study suggests qualitative data from online forums could improve healthcare professionals’ understanding of patient needs

Data from online forums could be used by providers to reassess their services - Photo credit: Flickr, Till Westermeyer, CC BY-SA 2.0

The wealth of unstructured, qualitative data available in online forums is an untapped resource that healthcare professionals and policymakers should use to their advantage, a report has said.

The think tank Demos and health charity The King’s Fund have published a joint report on how algorithms could be used to help analyse posts about mental health made on publicly available websites or forums.

Data from these forums offers a different perspective on health, the report said, as it gives an insight into the lived experience of those with mental health problems, as well as the advice and support given by their peers and assessments of interactions with health services.

The report said that the data could be analysed at scale using natural language processing, which involves training computers to better process and manipulate human language.

The pilot study used a modified web scraper to collect more than 1 million posts made between June 2004 and May 2016 on six online forums. The data was pseudonymised and then used to train natural language processing algorithms to understand how people discuss mental health online.
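To illustrate what pseudonymisation can involve, the following is a minimal Python sketch, not the method used in the report. It assumes each post is a record with an author name and free text, and replaces the author with a keyed hash so that posts by the same person can still be linked without exposing the username. The field names, the `pseudonymise` function and the secret key are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key (an assumption for this sketch); in practice it
# would be generated randomly and stored separately from the data.
SECRET_KEY = b"replace-with-a-randomly-generated-secret"


def pseudonymise(username: str, key: bytes = SECRET_KEY) -> str:
    """Map a username to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, username.encode("utf-8"), hashlib.sha256).hexdigest()
    # Keep only the first 12 hex characters to get a short, readable label.
    return "user_" + digest[:12]


# Made-up example posts; the "author" and "text" fields are assumptions.
posts = [
    {"author": "alice83", "text": "Has anyone tried CBT?"},
    {"author": "bob_k", "text": "I could really use some advice."},
    {"author": "alice83", "text": "It helped me a lot."},
]

# Replace each author with a pseudonym; the post text is left unchanged.
pseudonymised_posts = [
    {"author": pseudonymise(p["author"]), "text": p["text"]} for p in posts
]
```

The same username always maps to the same pseudonym, so a person's posts remain linked. The sketch leaves the post text untouched, so any names or details written in the text itself would remain.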

The software was tested on three questions, asking whether it could accurately identify: cries for help, where people wanted guidance from other users; discussions about cognitive behavioural therapy; and cases of co-morbidity, where a mental health problem coincided with long-term physical conditions.

The report said that the software had accuracy rates of around 65% both for identifying cries for help and for identifying posts about CBT, rising to 72% for identifying posts where the person had had CBT. The team also claimed 98% accuracy for the 50 posts they assessed for co-morbidity.
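As a rough guide to what these percentages mean in practice, the short Python sketch below converts between an accuracy rate and a count of correctly classified posts. It assumes that accuracy means the share of assessed posts that were classified correctly; the function names are illustrative and do not come from the report.

```python
def accuracy(correct: int, total: int) -> float:
    """Share of posts classified correctly (assumed definition of accuracy)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def correct_from_accuracy(rate: float, total: int) -> int:
    """Number of correct classifications implied by an accuracy rate."""
    return round(rate * total)


# Under this assumption, 98% accuracy on 50 posts implies 49 correct posts.
implied_correct = correct_from_accuracy(0.98, 50)

# With a sample of 50, one additional error moves the rate by two points.
one_more_error = accuracy(implied_correct - 1, 50)
```

On a sample this small, each post accounts for two percentage points, so the co-morbidity figure is more sensitive to individual errors than figures drawn from larger samples.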

According to the authors, there is huge potential for analysis of publicly available data to be used to inform policymaking, for instance by offering health regulators more insight into the performance of providers and giving service providers themselves a better understanding of their users.

Josh Smith, a co-author on the report and researcher at Demos, said the study “highlights the potential for new technology and methodologies to provide a whole new perspective on mental health”.

However, the report also acknowledged the “significant technical, methodological and ethical challenges still to overcome”, including concerns that free text entered in online forums might include identifiable data, which would make it difficult to fully anonymise data.

The report stressed that the approach “is not and never will be a silver bullet”, saying that the data should only be seen as a complementary source of information.

The work received ethical approval from the University of Sussex Ethics Review Panel, but was not considered a clinical study as it did not recruit patients from the NHS and didn’t gather clinical data or make interventions that would affect anyone’s care.

The Department of Health recently revealed that it was looking into the use of algorithms to make sense of unstructured data, with digital strategy manager Laurence Erikson saying that the department was using machine learning to help analyse responses to public consultations.

“The findings so far are intriguing,” Erikson said in a blog post about the work, published in January. “We found that the machine learning approach reinforced some of the findings of the manual approach, but also identified new insights from the consultation responses.”

In February, MPs announced plans to investigate the use of algorithms in decision-making, to look at whether, and how, they can be used in a transparent or accountable way.
