ECU database remains important for cancer researchers

What started as a master’s thesis about 10 years ago continues today to be an important database for cancer researchers throughout the world.

Dr. Qin Ding, associate professor in the East Carolina University Department of Computer Science, and former graduate student Boya “Tina” Xie developed an online database called mirCancer. The database, which uses a text mining approach the two developed, has proven valuable to thousands of researchers seeking information about microRNAs, also known as miRNAs. MicroRNAs can be used as indicators of cancer during diagnosis or as suppressors in treatment, Ding said.

“Many researchers have done experimental studies to discover the relationships between cancer and miRNA, which miRNAs are associated with which type of cancers,” Ding said. “However, the research results are scattered in huge volumes of publications and the number of publications is growing in exponential order, making it difficult for other researchers to get the complete and up-to-date research results.”

Xie, now a senior data and applied scientist at Microsoft, noticed this problem when her then-roommates in ECU’s Brody School of Medicine had to do research related to microRNAs.

“I heard them talk about their experiments every day at dinner, learned that there are vast numbers of microRNAs and how they regulate human cancers,” said Xie, who received her master’s degree in 2010. “I also learned that finding candidate microRNA and doing literature review are time-consuming steps. At the same time, I was attending Dr. Ding’s database class, and there was a course requirement to implement a database. Inspired by my roommates’ dinner conversation and motivated by Dr. Ding’s course, I started to create a database for microRNAs that later became this project.”

MirCancer provides a comprehensive online searchable database for microRNA profiles in different human cancers, based on documented experiments and approved results.

“Before this project, researchers needed to manually read paper by paper to get the same information,” Xie said.

The database went live in 2012 and has been updated 27 times. Today, it includes 9,080 relationships between 196 types of cancers and 57,984 types of microRNAs extracted from 7,288 research articles, 10 times more than the original version of the database.

Researchers can access and download the database for free. It also provides links to the original publications.

Ding said that within just the last two months, more than 600 users from countries including China, Japan, Germany, Italy, India and the United States have accessed the database. It’s been downloaded 70 times. Ding also said their 2013 publication on the mirCancer database in the Bioinformatics journal has been cited in more than 300 journal articles.

“We are very honored that our work has been highly valued by other researchers in the field,” Ding said. “The research impact of our work turns out to be much more far reaching than what we expected when we first developed the project.”

Ding said the project would not be possible without support from the College of Engineering and Technology’s IT team and technical support from John Jones, instructional technology consultant and adjunct instructor in the Department of Computer Science, and director of information technology for the college.

Xie said the project directly relates to her work at Microsoft.

“Working on this project gives me experience in text mining, which directly gets me to my current position,” she said. “Through this project, I’ve become familiar with biomedical literature and related systems, which gives me an advantage dealing with interdisciplinary tasks at work, especially biomedical literature related ones.”

But beyond that, Xie sees the importance of the project.

“I get excited and motivated every time knowing people are using the website and data,” Xie said. “I feel my work is recognized by and contributing to the research community.”

— By Ken Buday