New UK centre for text mining may improve information management

This is the potential of text mining.

JISC, BBSRC and EPSRC have announced funding of £1m to establish a National Centre for Text Mining. The remit of the Centre, the first publicly funded text mining centre in the world, is to contribute to the associated national and international research agenda, to establish a service for the wider academic community, and to make connections with industry.

Text mining attempts to discover new, previously unknown information by applying techniques from natural language processing, data mining, and information retrieval:

> To identify and gather relevant textual sources

> To analyse these to extract facts involving key entities and their properties

> To combine the extracted facts to form new facts or to gain valuable insights.
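The three steps above can be illustrated with a minimal sketch. It is a toy example under stated assumptions, not the Centre's method: the corpus, the `PATTERN` regular expression, and the function names `gather`, `extract_facts`, and `combine` are all hypothetical, and a real system would use natural language processing rather than a single pattern.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus; the sentences are invented for illustration.
DOCUMENTS = [
    "Protein A activates protein B.",
    "Protein B activates protein C.",
    "The weather in Manchester was mild.",
]

# Assumed pattern standing in for real entity and relation extraction.
PATTERN = re.compile(r"protein (\w+) activates protein (\w+)", re.IGNORECASE)


def gather(documents, keyword):
    """Step 1: keep only the documents that mention the keyword."""
    return [doc for doc in documents if keyword.lower() in doc.lower()]


def extract_facts(documents):
    """Step 2: extract (source, target) 'activates' facts from each document."""
    facts = set()
    for doc in documents:
        for source, target in PATTERN.findall(doc):
            facts.add((source.upper(), target.upper()))
    return facts


def combine(facts):
    """Step 3: chain facts (X -> Y, Y -> Z) into candidate links (X -> Z)."""
    targets = defaultdict(set)
    for source, target in facts:
        targets[source].add(target)

    inferred = set()
    for source, middles in targets.items():
        for middle in middles:
            # .get avoids adding keys to the defaultdict while iterating over it.
            for target in targets.get(middle, ()):
                if target != source and (source, target) not in facts:
                    inferred.add((source, target))
    return inferred


relevant = gather(DOCUMENTS, "protein")
facts = extract_facts(relevant)
candidates = combine(facts)
print(sorted(facts))
print(sorted(candidates))
```

The chained pairs are only candidate links: neither source sentence states them directly, so they would still need to be checked by a domain expert or by experiment.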

Text mining finds applications in many diverse areas of wide interest such as drug discovery and predictive toxicology, protein interaction, competitive intelligence, protection of the citizen, identification of new product possibilities, detection of links between lifestyle and states of health, and many more.

Led by UMIST, the National Centre for Text Mining will be run by an internationally leading consortium. The consortium has four UK partner institutions: UMIST, the Victoria University of Manchester, the University of Liverpool, and the University of Salford. These core partners are joined by international partners: the University of California, Berkeley, the University of Geneva, the San Diego Supercomputer Center, and the University of Tokyo, with the European Bioinformatics Institute having a presence on the Technical Directorate. It is anticipated that the Centre will take part in the related emerging networks of excellence.

The Centre will initially focus on biological and biomedical science. This area of science has the largest user community and the fastest growing literature, and is the area where most applications research in text mining is being undertaken. At the same time, the tools developed by the Centre will be of interest and relevance to the wider academic community. A major challenge for the Centre will be to handle very large volumes of text, and the intermediate data produced during processing, efficiently and robustly.

The Centre will be housed in the under-construction £34M Manchester Interdisciplinary Biocentre to facilitate interaction between text mining researchers and bio-domain users. As a measure of its commitment to the Centre, the consortium is itself investing some £800K, including the establishment of a new Chair in Text Mining and the full-time secondment of staff. Further, the North-West Development Agency, the National Centre for e-Social Science, the Consortium for Post-Genome Science, and e-Science Northwest have been most supportive of the initiative.

Professor John Garside, Principal and Vice-Chancellor of UMIST, said: 'I'm delighted that UMIST and the new University of Manchester have the opportunity with this new centre to make a leading contribution to the critical task of deriving meaning from text. The consortium represents expertise in all the component areas of text mining, with an impressive array of international partners.'

Professor Ross King, of the University of Wales, and member of the JISC Committee for the Support of Research (JCSR), said: 'The setting up of the UK Centre in Text Mining is a very exciting development. The amount of scientific literature is growing so fast that there is an urgent need for novel computer based tools to help scientists keep up. The success of Google has shown how useful text retrieval programs can be. The tools developed will be applicable to academics in all subject areas, including social science and arts and humanities, for example in analysing ancient texts found by archaeologists.'

Professor John Keane (Computation, UMIST; Proposal Coordinator, and Interim Co-Director): 'The Centre will play a leading role both nationally and internationally in developing the research agenda in text mining, promulgating associated best practice, and developing service provision. All those involved look forward to the challenges and opportunities that lie ahead.'

Dr Sophia Ananiadou (Computing, Science and Engineering, Salford; Interim Co-Director): 'The Centre will address the increasing needs of the bioscience community to gather and structure scientific knowledge from texts. The synergy of text mining and bioscience will be beneficial to scientists from both communities.'
