Yaya Cui, an investigator in plant sciences at the Bond Life Sciences Center, examines data on fast neutron soybean mutants that are represented in the SoyKB database.

The most puzzling scientific mysteries may be solved on the same kind of machine you’re likely using to read this sentence.

In the era of “Big Data,” many significant scientific discoveries (new drugs to fight diseases, agricultural breeding strategies to address world hunger, and insights into why the world exists) are being made by researchers who never set foot in a lab.

Developed by researchers at the Bond Life Sciences Center, SoyKB.org allows international researchers, scientists and farmers to chart the unknown territory of soybean genomics together — sometimes continents away from one another — through that data.

 

Digital solutions to real-world questions

As part of the Obama Administration’s $200 million “Big Data” Initiative, SoyKB (Soy Knowledge Base) was born.

The digital infrastructure dramatically changes the way researchers conduct their experiments, according to plant scientists like Gary Stacey, a Bond LSC researcher, endowed professor of soybean biotechnology and professor of plant sciences and biochemistry.

“It’s very powerful,” Stacey said. “Humans can only look at so many lines in an Excel spreadsheet, then it just kind of blurs. So we need these kinds of tools to be able to deal with this high-throughput data.”

The website, managed by Trupti Joshi, an assistant research professor in computer science at MU’s College of Engineering, enables researchers to develop important scientific questions and theories.

“There are people that, during their entire career, don’t do any bench work or wet science; they just look at the data,” Stacey said.

The Gene Pathway Viewer, available on SoyKB, shows different signaling pathways and points to the function of specific genes so that researchers can develop improvements for poorly performing soybean lines.

“It’s much easier to grasp this whole data and narrow it down to basically what you want to focus on,” Joshi said.

A 3D protein modeling tool lends itself especially to drug design. A pharmaceutical company could test a hypothesis formulated solely through data analysis and, in some situations, the proposed drug turns out to yield the expected results.

The Big Data initiative drives a blending of “wet science” — conducting experiments in the lab and gathering original data — and “dry science” — using computational methods.

A testament to the times?

“Oh, absolutely,” Joshi said.

 

Collaboration between the “wet” and “dry” sciences

Before SoyKB, researchers would gather data from numerous experiments, analyze only the results they were looking for and disregard the rest. The website makes it easy to deposit all of the gathered data so that other researchers can repurpose it.

“With these kinds of databases now, all the data is put there, so something that’s not valuable to me may be valuable to somebody else,” Stacey said.

Joshi said infrastructure like SoyKB is becoming more necessary in all realms of scientific discovery.

“(SoyKB) has turned out to be a very good public resource for the soybean community to cross reference that and check the details of their findings,” she said.

Computer science spares researchers from having to reinvent the wheel with their own digital platforms. SoyKB has a translational infrastructure, with computational methods and tools that can be used in many disciplines, such as health sciences, animal sciences, physics and genetic research.

“I think there’s more and more need for these types of collaborations,” Joshi said. “It can be really difficult for biologists to handle the large scope of data by themselves, and you really don’t want to spend time just dealing with files. You want to focus more on the biology, so these types of collaborations work really well.

“It’s a win-win situation for everyone,” she said.

Joshi may well have catalyzed SoyKB’s success. She took on the website and its compilation of data, both then in their infancy, as her Ph.D. dissertation project.

Joshi is unique because she has both a biology degree and a computer science background. Stacey said Joshi, who has “had a foot in each camp,” serves as an irreplaceable translator.

Most recently, the progress of SoyKB as part of the Big Data Initiative was presented at the International Conference on Bioinformatics and Biomedicine in December 2013 in Shanghai. The ongoing project is funded by NSF grants.