Big (Data) Science Matters
Big Data Science is transforming the biomedical research landscape
In 1967, when nuclear physicist Alvin Weinberg¹ coined the term Big Science, he was most interested in launching a concentrated effort to develop nuclear technologies. But he anticipated that large-scale approaches to biomedicine would also be productive. In fact, they have been, from the War on Cancer in the 1970s to the Human Genome Project of the 1980s and 1990s and The Cancer Genome Atlas and ENCODE projects of the 2000s.
Since Big Science projects require significant funding, there has long been a perceived tension between big research efforts and smaller ones. But I would argue that this is a false dichotomy. Big Science efforts like those listed above have boosted our fundamental understanding of biomedicine (to the benefit of the entire biomedical research community) and produced scientific tools and methods that have had a multiplier effect when distributed and used by the research community in projects big and small.
Biomedical computation has also had some “Big” initiatives, including the National Centers for Biomedical Computing (NCBCs), which flourished from 2004 to 2014, and the current Big Data to Knowledge (BD2K) Centers of Excellence, which were funded for four years starting in the fall of 2015. Compared with the NCBCs, the BD2K Centers are more focused on a single mission: extracting knowledge from big data. At the same time, because there are 13 BD2K Centers rather than 7 NCBCs, their coverage of biomedical data science, and indeed of the entire spectrum of biomedicine, is more thorough (see pages 4-5 for a graphic showing the Centers’ pan-NIH impact).
Just two and a half years into their four-year grants, the BD2K Centers are already proving their value. This Special Issue of Biomedical Computation Review offers a glimpse at how Big Data Science can transform the biomedical research landscape in ways that benefit the research community and increase our knowledge and understanding of biomedicine.
In The FAIR Data-Sharing Movement: BD2K Centers Make Headway, you will read about the ways various BD2K Centers are establishing state-of-the-art methodologies for making data findable, accessible, interoperable and reusable. Fulfilling these goals is essential if biomedical researchers are going to make use of big data sources to advance biomedical knowledge. And the BD2K Centers are at the forefront of making that happen.
In this issue’s other feature story, Text Mining: How the BD2K Centers are Making Knowledge Accessible, you will see how top-notch computer scientists are bringing their text-mining tools to bear in biomedicine. From Chris Ré and his team at the Mobilize Center to Jiawei Han and his colleagues at the KnowEnG Center, the level of excellence is nothing short of remarkable.
And then there are the four UnderCurrents in this issue, each describing how the BD2K Centers are making a difference in targeted areas of biomedicine. You’ll read about how BD2K Centers are harnessing the vast stream of data coming from wearable sensors to improve health; how large-scale collaborative BD2K projects are deepening our understanding of brain diseases; how the Centers are striving to map the universe of drugs, predict drug responses and adverse reactions, and develop tools for drug repurposing; and how BD2K researchers are using data to detect and predict disease onset and progression.
This Special Issue shows that Big (Data) Science matters in biomedicine. It matters not only to the researchers doing it, but to the entire research community and, more importantly, to the advancement of science and the improvement of health. It’s a perfect fit for the NIH mission: to uncover new knowledge that will lead to better health for everyone.
1 Weinberg, A.M. 1967. Reflections on Big Science. The M.I.T. Press, Cambridge, MA. 182 pp.