Master's Defenses

There are three upcoming defenses!

Rodrigo Cotta's master's defense will be Tuesday, April 19 at 3:30 p.m. in Avery 256C. Corey Svehla's master's defense will be Thursday, April 21 at 2:00 p.m. in Avery 347. Mouna Hammoudi's master's defense will be Monday, May 2 at 3:30 p.m. in Avery 347.

-------------------------------------------------

Rodrigo Cotta's Abstract:

Load-Aware Grouping in Apache Storm

Big data applications are characterized by large data volume and high data velocity. Several systems dedicated to processing data streams in real time have been developed recently, e.g., Apache Storm, Apache Flink, and Heron. One challenge in executing big data applications is distributing data processing evenly among different machines. Apache Storm uses groupings to assign tuples; by default, it shuffles tuples evenly among the available downstream machines. While this load-oblivious approach can work well in a homogeneous cluster, it often leads to unbalanced loads and congestion in heterogeneous clusters.

In this project, we take advantage of Storm’s custom grouping extension to develop several load-aware grouping techniques that balance tuples among Storm executors. We implemented our techniques in Apache Storm and carried out experiments on a cluster. The experimental data show that, in comparison to the default Apache Storm mechanism, our techniques can achieve significantly better data processing throughput and latency.
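To illustrate the general idea of load-aware grouping, here is a minimal Python sketch that compares a load-oblivious round-robin assignment with a greedy assignment that accounts for task speed. It is a toy simulation, not the techniques developed in the project, and it does not use Storm's API; the function names, the `speeds` parameter, and the greedy rule are all assumptions made for illustration.

```python
from itertools import cycle


def shuffle_grouping(num_tuples, speeds):
    """Load-oblivious baseline: hand out tuples round-robin, ignoring speed."""
    counts = [0] * len(speeds)
    targets = cycle(range(len(speeds)))
    for _ in range(num_tuples):
        counts[next(targets)] += 1
    return counts


def load_aware_grouping(num_tuples, speeds):
    """Greedy sketch: send each tuple to the task that would finish it soonest.

    speeds[i] is a hypothetical processing rate (tuples per time unit).
    """
    counts = [0] * len(speeds)
    for _ in range(num_tuples):
        # Estimated time for task i to drain its queue if it gets this tuple.
        target = min(range(len(speeds)),
                     key=lambda i: (counts[i] + 1) / speeds[i])
        counts[target] += 1
    return counts


def makespan(counts, speeds):
    """Time until the slowest-finishing task has processed all its tuples."""
    return max(c / s for c, s in zip(counts, speeds))


speeds = [1.0, 2.0, 4.0]  # a small heterogeneous "cluster"
even = shuffle_grouping(70, speeds)
aware = load_aware_grouping(70, speeds)
print(even, makespan(even, speeds))
print(aware, makespan(aware, speeds))
```

In this toy setting, the even split leaves the slowest task as the bottleneck, while the greedy assignment gives each task a share proportional to its speed.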

-------------------------------------------------

Corey Svehla's Abstract:

Implementation and Comparison of Phylogenetic Algorithms

Phylogenetic trees are used to show possible relationships between biological species. Phylogenetic trees are also hypothetical evolutionary trees because modern biological classification usually follows evolutionary history. Hence, phylogenetic tree algorithms can be evaluated based on how closely the trees they generate follow evolutionary history. In this project, we have used MATLAB to implement two phylogenetic tree algorithms. The first is the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) algorithm, and the second is the Common Mutation Similarity Matrix (CMSM) algorithm. The implementation has several advanced features, including detailed interactive figures that allow users to see the list of mutations between sequences at each step of the phylogenetic tree. The two algorithms are compared based on the total number of mutation events that each implies occurred during evolution from a common ancestor of the set of input genome sequences.
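For readers unfamiliar with UPGMA, the following is a minimal Python sketch of the standard algorithm: repeatedly join the two closest clusters and set the distance to the new cluster to the size-weighted mean of the old distances. It is not the MATLAB implementation described in the abstract; the function name, the input format, and the example distances are assumptions chosen for illustration.

```python
def upgma(names, matrix):
    """Cluster labels with UPGMA.

    names: list of labels; matrix: symmetric list of lists of distances.
    Returns (tree, merges): tree is a nested tuple, and merges lists
    (left, right, height) for each join, in order.
    """
    # Each active cluster id maps to (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {(i, j): matrix[i][j]
            for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    merges = []
    while len(clusters) > 1:
        # Join the closest pair of active clusters.
        (a, b), d_ab = min(dist.items(), key=lambda kv: kv[1])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        merges.append((tree_a, tree_b, d_ab / 2))
        # Distance to the new cluster: size-weighted mean of the old ones.
        new_dist = {}
        for c in clusters:
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        # Drop distances that involve the two merged clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges


# Hypothetical pairwise distances between four sequences.
names = ["A", "B", "C", "D"]
matrix = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
tree, merges = upgma(names, matrix)
print(tree)
print([height for _, _, height in merges])
```

In this example, A and B are joined first, then C, then D, with each join placed at half the distance between the clusters being merged.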

-------------------------------------------------

Mouna Hammoudi's Abstract:

Why do Record/Replay Tests of Web Applications Break?

Software engineers often use record/replay tools to enable the automated testing of web applications. Tests created in this manner can then be used to regression test new versions of the web applications as they evolve. Web application tests recorded by record/replay tools, however, can be quite brittle; they can easily break as applications change. For this reason, researchers have begun to seek approaches for automatically repairing record/replay tests. To date, however, there have been no comprehensive attempts to characterize the causes of breakages in record/replay tests for web applications. In this work, we present a taxonomy classifying the ways in which record/replay tests for web applications break, based on an analysis of 453 versions of popular web applications for which 1065 individual test breakages were recognized. The resulting taxonomy can help direct researchers in their attempts to repair such tests. It can also help practitioners by suggesting best practices when creating tests or modifying programs, and can help researchers with other tasks such as test robustness analysis and IDE design.