Mining the language of science

Scientists are developing a computer that can read vast amounts of scientific literature, make connections between facts and develop hypotheses.

Ask any biomedical scientist whether they manage to keep on top of reading all of the publications in their field, let alone an adjacent field, and few will say yes. New publications are appearing at a double-exponential rate, as measured by MEDLINE - the US National Library of Medicine's biomedical bibliographic database - which now lists over 19 million records and adds up to 4,000 new records daily. For a prolific field such as cancer research, the number of publications could quickly become unmanageable and important hypothesis-generating evidence may be missed.

But what if scientists could instruct a computer to help them? To be useful, a computer would need to trawl through the literature in the same way that a scientist would: reading the literature to uncover new knowledge, evaluating the quality of the information, looking for patterns and connections between facts, and then generating hypotheses to test. Not only might such a program speed up the progress of scientific discovery but, with the capacity to consider vast numbers of factors, it might even discover information that could be missed by the human brain.

"Although still under development, the system can be used to make connections that would be difficult to find, even if it had been possible to read all the documents." - Dr Anna Korhonen

The aim of Anna Korhonen and researchers in the Natural Language and Information Processing Group in the Computer Laboratory is to develop computers that can understand written language in the same way that humans do.

myScience