Big Data Mining for Pharma by IBM
US technology firm IBM recently launched a cloud analytics platform, the IBM Strategic IP Insight Platform (SIPP), that lets researchers extract data from patents and scientific journals. The new application is reported to speed up searches across large volumes of patents and scientific journals for information on pharmaceutical chemicals.
SIPP's search algorithms are said to speed up the extraction of drawings, figures and articles from scientific publications. The firm has donated a large body of data that it curated using SIPP to the National Institutes of Health (NIH). The data covers 12 million patents and 20 million Medline scientific abstracts. NIH researchers are expected to use the information to discover new medications and to research cancer treatments.
Scientists traditionally had to search for chemical names in paper journals. IBM’s new cloud-based platform is intended to help them curate data on molecules and chemicals within 24 hours of publication. In the database, each chemical name is mapped to the synonyms for that chemical.
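The name-to-synonym mapping described above can be illustrated with a minimal Python sketch. It is not IBM's implementation: the sample pairs, the function names `build_synonym_index` and `lookup`, and the choice of a canonical name per chemical are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (canonical name, synonym) pairs.
# These entries are illustrative and are not taken from the SIPP database.
SYNONYM_PAIRS = [
    ("acetylsalicylic acid", "aspirin"),
    ("acetylsalicylic acid", "2-acetoxybenzoic acid"),
    ("paracetamol", "acetaminophen"),
]


def build_synonym_index(pairs):
    """Return (groups, index).

    groups maps a canonical name to the set of all names for that chemical.
    index maps every known name (lowercased) to its canonical name.
    """
    groups = defaultdict(set)
    index = {}
    for canonical, synonym in pairs:
        canonical_key = canonical.lower()
        groups[canonical_key].update({canonical_key, synonym.lower()})
        index[canonical_key] = canonical_key
        index[synonym.lower()] = canonical_key
    return groups, index


def lookup(name, groups, index):
    """Return all known names for the chemical called `name`, sorted.

    Unknown names yield an empty list.
    """
    canonical = index.get(name.strip().lower())
    if canonical is None:
        return []
    return sorted(groups[canonical])


GROUPS, INDEX = build_synonym_index(SYNONYM_PAIRS)

if __name__ == "__main__":
    print(lookup("Aspirin", GROUPS, INDEX))
```

With an index like this, a search for any one name of a chemical can be expanded to all of its recorded names before querying patents or abstracts.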
For the NIH project, IBM took pharmaceutical data from AstraZeneca, Bristol-Myers Squibb, DuPont and Pfizer. It extracted data on 2.4 million chemical compounds from 4.7 million patents and 11 million biomedical journal abstracts.