BIG RED News: UNL Team Complex Biosisters Bring Home Grand Prize in Hackathon

Left to right: PhD students Bridget Tripp, Ashley Stengel, Carrie Brown, Kimberly Stanke

A team of four PhD students in the Complex BioSystems Program won the grand prize for application development and placed third for their educational video at the Early Career Big Data Summit (ECBDS). The event, hosted by the University of North Dakota and the Midwest Big Data Hub, included multi-industry panel discussions, researcher lightning talks and a hands-on application hackathon.

As the only all-female team and the only one representing Life Sciences, the four UNL women set out to highlight the role of big data in biological research. Together they created On Demand Taxonomic Reference Database Compilation for genome sequencing. The application, whose development was led by Carrie Brown and Bridget Tripp, grew out of Carrie’s research on the presence of an obligate symbiont in the absence of its host.

“Upon further investigation, I realized that the host was probably present but was not being identified due to the outdated nature of the reference database being used. With the rate at which reference sequences change within public repositories such as NCBI and JGI, a researcher who wants the most up-to-date information for their analysis would be better off compiling their own instead of waiting for updates from the datasets available,” she explained.

Using a bash script, the team obtained bacterial 16S sequences from the National Center for Biotechnology Information (NCBI), clustered them by sequence similarity, and placed the data into the format QIIME requires. According to the group, future directions include incorporating JGI (Joint Genome Institute) and EMBL (European Molecular Biology Laboratory) datasets, as well as expanding the selection to include Archaea and Fungi sequences.
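
As a rough illustration of the clustering and formatting steps described above, here is a minimal Python sketch. It is not the team's code: their tool was a bash script that is not reproduced in this article. The record fields, function names, sample sequences, the 0.97 threshold, and the use of Python's difflib as a crude stand-in for sequence identity are all assumptions made for illustration. The taxonomy line follows the tab-delimited, Greengenes-style layout commonly used for QIIME reference sets.

```python
from difflib import SequenceMatcher

# Rank prefixes used in Greengenes-style taxonomy strings.
RANK_PREFIXES = ["k", "p", "c", "o", "f", "g", "s"]


def similarity(a, b):
    """Crude stand-in for sequence identity (not a real alignment)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(records, threshold=0.97):
    """Keep a record as a new representative unless it is at least
    `threshold` similar to a representative already kept."""
    reps = []
    for rec in records:
        if not any(similarity(rec["seq"], r["seq"]) >= threshold for r in reps):
            reps.append(rec)
    return reps


def taxonomy_line(rec):
    """Format one line as: sequence ID, tab, semicolon-separated lineage.
    Ranks missing from the lineage are left empty."""
    padding = [""] * (len(RANK_PREFIXES) - len(rec["lineage"]))
    ranks = list(rec["lineage"]) + padding
    lineage = "; ".join(f"{p}__{name}" for p, name in zip(RANK_PREFIXES, ranks))
    return f"{rec['id']}\t{lineage}"


def to_fasta(rec):
    """Format one representative sequence as a FASTA entry."""
    return f">{rec['id']}\n{rec['seq']}"


# Hypothetical records; real input would come from downloaded NCBI data.
records = [
    {"id": "seqA", "seq": "ACGTACGTACGTACGTACGT", "lineage": ["Bacteria", "Proteobacteria"]},
    {"id": "seqB", "seq": "ACGTACGTACGTACGTACGT", "lineage": ["Bacteria", "Proteobacteria"]},
    {"id": "seqC", "seq": "TTTTGGGGCCCCAAAATTTT", "lineage": ["Bacteria", "Firmicutes"]},
]

for rep in greedy_cluster(records):
    print(to_fasta(rep))
    print(taxonomy_line(rep))
```

In this sketch, seqB is identical to seqA and is folded into its cluster, so only seqA and seqC are written out. A production pipeline would use a dedicated clustering tool rather than pairwise string comparison, which scales poorly beyond a few thousand sequences.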

“The idea is that since the underlying taxonomic reference databases for the major programs are out of date and the data changes rapidly (300 changes from Apr 4-8, alone), there is a need for an on demand compilation for bioinformatic scientists to run their data set against,” added Kimberly.

The video, The Omics of Big Data, headed by Ashley Stengel and Kimberly Stanke, was designed to highlight the various “-omics” and their applications to educate K-12 students. Through scripted animations and narration, the educational video seeks to answer the question ‘What is big data?’

“Because big data is such a comprehensive and vast area of analysis, we chose to focus on one aspect: bioinformatics. The purpose of the video is to educate K-12 students, so focusing on that one area allowed us to provide greater depth in our explanations,” explained Ashley.

Ashley Stengel was also chosen to present in the lightning talks, in which she spoke on her research …
The hackathon was a new experience for all four. Each says she grew professionally as a researcher, and winning the highest award in a data hackathon was validation of their skills as data scientists.

“The four of us, with our diverse backgrounds, make a dynamic team capable of taking on even the most insurmountable challenges. This hackathon was no different. It was eight hours of intellectual challenge, excitement, creativity, and continuous laughter,” recounted Bridget.

The Quantitative Life Sciences Initiative is a participating member of the National Science Foundation-supported Midwest Big Data Hub (MBDH) and represents the university and the state directly through its work with the Digital Agriculture spoke.