9 January, 2017
Clemson Researchers Optimizing End-To-End Movement Of ‘Big Data’
CLEMSON, South Carolina — Today’s scientists are riding an unprecedented wave of discovery, but the immensity of the data needed to facilitate many of these breakthroughs is creating internet roadblocks that are becoming increasingly detrimental to research.
Finding ways to deal with “Big Data,” defined as data sets too large and complex for traditional computers and typical network connections to handle, has become a science in itself.
But with an eye to the future, Clemson University researchers are playing a leading role in developing state-of-the-art methods to transfer these enormous data sets from place to place using the 100-gigabit Ethernet Internet2 Network. Owned by the nation’s leading higher-education institutions, the advanced Internet2 Network is the nation’s largest and fastest coast-to-coast research and education infrastructure, designed for next-generation scientific collaboration and Big Data transfer.
“We’ve leveraged advanced research networks from Internet2 and parallel file system technologies to choose the optimal ways to send and receive massive data sets around the country and world,” said Alex Feltus, associate professor in genetics and biochemistry in Clemson University’s College of Science. “What used to take days now takes hours – or even less. And these same methods apply to any project that uses large, contemporary data sets.”
Genomics research is rapidly becoming one of the leading generators of Big Data for science, with the potential to equal if not surpass the data output of the high-energy physics community. Like physicists, university-based life-science researchers must collaborate with counterparts and access data repositories across the nation and around the globe.