Graduate students defending theses and dissertations

CSE Graduate Defenses

Dissertation Defense: Jieting Wu

“Deploying, Improving and Evaluating Edge Bundling Methods For Visualizing Large Graphs”

Committee Members: Dr. Hongfeng Yu (Advisor)
Dr. Ying Lu, Dr. David Swanson, and Dr. Zhenghong Tang

Monday, November 26, 2018, 10:00 a.m.

211 Schorr Center

Abstract: A tremendous increase in the scale of graphs has been witnessed across a wide range of fields, which demands efficient and effective visualization techniques to assist users in better understanding large graphs. Conventional node-link diagrams are often used to visualize graphs, but excessive edge crossings can easily cause severe visual clutter in the node-link diagram of a large graph. Edge bundling can effectively remedy visual clutter and reveal high-level graph structures. Although significant efforts have been devoted to developing edge bundling, three challenging problems remain. First, edge bundling techniques are not easy to deploy for web-based applications. The state-of-the-art edge bundling methods often require special system support, such as high-end GPU acceleration, for large graphs, which makes these methods less portable, especially for ubiquitous mobile devices. Second, the quantitative quality of edge bundling results is barely assessed in the literature; current comparisons of edge bundling methods focus mainly on computational performance. Third, although the family of edge bundling methods offers a rich variety of bundling layouts, it lacks a generic method to generate different styles of edge bundling.

In this research, I aim to address these problems and have made the following contributions. First, I provide an efficient framework to deploy edge bundling on web-based platforms. My framework can generate high-quality edge bundling results on web-based platforms and achieves a speedup of 50X over the previous state-of-the-art edge bundling method on a graph with half a million edges. Second, I propose a moving least squares based approach to improve layout quality. Based on a quality metric, my approach generates better bundling results than other methods. Third, I provide an information-theoretic metric to evaluate edge bundling methods. With this metric, domain users can choose appropriate edge bundling methods with proper parameters for their applications. Last but not least, I present a deep learning framework for edge bundling visualizations. Through a training process that learns the results of a specific edge bundling method, my deep learning framework can infer the final layout of that method. It is a generic framework that can generate the corresponding results of different edge bundling methods.
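To give a feel for the kind of information-theoretic quality measure mentioned above (the thesis's actual metric may differ), the following Python sketch rasterizes a set of polyline edges into a density image and computes the Shannon entropy of the resulting ink distribution; a tightly bundled layout concentrates ink into fewer pixels and typically yields lower entropy. The function name, image resolution, and per-segment sampling density are illustrative assumptions.

    import numpy as np

    def layout_entropy(edges, resolution=256):
        # edges: list of (n, 2) arrays of polyline points in the unit square [0, 1]^2
        img = np.zeros((resolution, resolution))
        for polyline in edges:
            for (x0, y0), (x1, y1) in zip(polyline[:-1], polyline[1:]):
                # densely resample each segment and accumulate "ink" per pixel
                for t in np.linspace(0.0, 1.0, 50):
                    x = x0 + t * (x1 - x0)
                    y = y0 + t * (y1 - y0)
                    i = min(int(y * resolution), resolution - 1)
                    j = min(int(x * resolution), resolution - 1)
                    img[i, j] += 1.0
        p = img[img > 0] / img.sum()            # normalized ink distribution
        return float(-(p * np.log2(p)).sum())   # Shannon entropy in bits

Comparing this value for a raw node-link layout and for a bundled version of the same graph gives a rough, resolution-dependent sense of how strongly the bundling compresses the drawing.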





Master’s Thesis Defense: Jun Wu

“Reducing the Tail Latency of a Distributed NoSQL Database”

Committee Members: Dr. Lisong Xu, Advisor
Dr. Qiben Yan and Dr. Hongfeng Yu

Thursday, November 15, 2018, 2:00 p.m.

211 Schorr Center

Abstract: The request latency is an important performance metric of a distributed database, such as the popular Apache Cassandra, because of its direct impact on the user experience. Specifically, the latency of a read or write request is defined as the total time interval from the instant when a user makes the request to the instant when the user receives the response, and it involves not only the actual read or write time at a specific database node, but also various types of latency introduced by the distributed mechanism of the database. Most of the current work focuses only on reducing the average request latency, not on reducing the tail request latency, which has a significant and severe impact on some database users. In this thesis, we investigate the important factors affecting the tail request latency of Apache Cassandra and then propose two novel methods to greatly reduce it. First, we find that background activities may considerably increase the local latency of a replica and thus the overall request latency of the whole database, so we propose a novel method to select the optimal replica by considering the impact of background activities. Second, we find that the asynchronous read and write architecture handles local and remote requests in the same way, which is simple to implement but comes at the cost of possibly longer latency, so we propose a synchronous method that handles local and remote requests differently to greatly reduce the latency. Finally, our experiments on the Amazon EC2 public cloud platform demonstrate that our proposed methods can greatly reduce the tail latency of read and write requests in Apache Cassandra.
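As a rough illustration of the first idea (the exact scoring used in the thesis is not specified in the abstract), the sketch below picks a replica by penalizing its observed read latency with its current background load. The field names and the penalty weight are assumptions made for the example, not Cassandra internals.

    def choose_replica(replicas):
        # replicas: list of dicts such as
        #   {"host": "10.0.0.5", "latency_ms": 2.1, "background_load": 0.4}
        # where latency_ms is a recent (e.g., exponentially weighted) local read
        # latency and background_load is the fraction of I/O consumed by
        # background tasks such as compaction, between 0.0 and 1.0.
        def score(r):
            penalty = 1.0 + 2.0 * r["background_load"]   # illustrative weight
            return r["latency_ms"] * penalty
        return min(replicas, key=score)

A replica that is momentarily fast but busy compacting would thus lose out to a slightly slower, idle replica, which is the intuition behind accounting for background activities.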




Master’s Thesis Defense: Naureen Hoque

“Supporting Diverse Customers and Prioritized Traffic in Next-Generation Passive Optical Networks”

Committee Members: Dr. Byrav Ramamurthy, Advisor
Dr. Massimiliano Pierobon and Dr. Lisong Xu

Monday, November 19, 2018, 9:30 a.m.

211 Schorr Center

Abstract: The already high demand for bandwidth continues to grow rapidly. Access network traffic is usually bursty in nature, and the present traffic trend is mostly video-dominant. This motivates the need for higher transmission rates in the system. At the same time, deployment costs and maintenance expenditures have to be reasonable. Therefore, Passive Optical Networks (PONs) are considered promising next-generation access technologies. As the existing PON standards are not suitable for supporting future PON services and applications, the FSAN (Full Service Access Network) group and the ITU-T (Telecommunication Standardization Sector of the International Telecommunication Union) have worked on developing the NG-PON2 (Next Generation PON 2) standard.

Resource allocation is a fundamental task in any PON, and it is necessary to have an efficient scheme that reduces delay, maximizes bandwidth usage, and minimizes resource wastage. A variety of DBA (Dynamic Bandwidth Allocation) and DWBA (Dynamic Wavelength and Bandwidth Allocation) algorithms have been proposed for different PONs (e.g., EPON, GPON, XG-PON, and 10G-EPON). But to our knowledge, no DWBA scheme for the NG-PON2 system with diverse customers and prioritized traffic has been proposed yet. In this work, this problem is addressed and five different dynamic wavelength and bandwidth allocation (DWBA) schemes are proposed. First, mixed integer linear programming (MILP) models are developed to minimize the total delay of the high-priority data. Due to the MILP's high computational complexity, heuristic algorithms are developed based on insights from the MILP models. The five heuristic algorithms are: No Block-Split Heuristic (NBH), Equal Block-Split Heuristic (EBH), Priority Based No Block-Split Heuristic (P-NBH), Priority Based Equal Block-Split Heuristic (P-EBH), and Priority Based Decider Block-Split Heuristic (P-DBH). Six priority classes of requests are introduced with the goal of minimizing the total delay for the high-priority data and lessening the bandwidth wastage of the system. Finally, experiments for the performance evaluation of the five DWBA schemes are conducted. The results show that the priority-based schemes (P-NBH, P-EBH, and P-DBH) achieve 47.63% less delay and 30% less bandwidth wastage on average for the highest-priority data transmission than the schemes without priority support (NBH and EBH). Among the five schemes, the NBH method has the highest delay, whereas EBH and P-EBH waste more bandwidth than the other schemes. P-DBH is the most efficient of the five because it offers the lowest delay for high-priority data and the minimum bandwidth wastage for lower-priority data.
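For readers unfamiliar with priority-aware bandwidth allocation, the toy sketch below grants upstream bandwidth for a single polling cycle in strict priority order. It illustrates the general idea only, not any of the five schemes above; the request format and the cycle capacity are assumed for the example.

    def allocate_cycle(requests, cycle_capacity_bytes):
        # requests: list of (onu_id, priority_class, bytes_requested) tuples,
        # where a lower priority_class value means higher priority.
        grants = {}
        remaining = cycle_capacity_bytes
        for onu_id, priority_class, requested in sorted(requests, key=lambda r: r[1]):
            grant = min(requested, remaining)       # never over-commit the cycle
            grants[onu_id] = grants.get(onu_id, 0) + grant
            remaining -= grant
            if remaining == 0:
                break                               # lower-priority requests wait
        return grants

    # Example: allocate_cycle([(1, 0, 4000), (2, 2, 8000), (3, 1, 3000)], 10000)
    # grants all of ONU 1's and ONU 3's requests and 3000 bytes of ONU 2's request.

Real DWBA schemes additionally decide which wavelength carries each grant and guard against starving low-priority traffic, which is where the block-split variants above come in.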




Master’s Thesis Defense: Bruno Vieira Resende e Silva

“GAINDroid: General Automated Incompatibility Notifier for Android Application”

Committee Members: Dr. Hamid Bagheri, Advisor
Dr. ThanhVu Nguyen and Dr. Witawas Srisa-an

Tuesday, November 20, 2018, 9:30 a.m.

103 Avery Hall

Abstract: With the ever-increasing popularity of mobile devices over the last decade, mobile applications and the frameworks upon which they are built change frequently, leading to a confusing jumble of devices and applications utilizing differing features even within the same framework. For Android apps and devices, the largest such framework and marketplace, mismatches between the API version installed on a device and the version targeted by the developers of an app running on that device can lead to run-time crashes, providing a poor user experience.


In this thesis, we present GAINDroid, a new analysis approach that automatically detects four types of mismatches to which an app may be vulnerable across the versions of the Android API it declaratively supports. We applied GAINDroid to 3,571 apps and compared the results of our analysis against state-of-the-art tools. The experimental results corroborate its ability to substantially outperform existing analysis techniques in terms of both the number and types of mismatches correctly identified and the run-time performance of the analysis.
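As a simplified illustration of one kind of check such a tool might perform (the abstract does not spell out GAINDroid's actual analysis), the sketch below flags API methods invoked by an app that were introduced after the app's declared minSdkVersion. The lookup table and function names are assumptions made for this example.

    # Map of fully qualified Android API methods to the API level that introduced
    # them. The two entries shown use real introduction levels, but the table
    # itself is illustrative, not exhaustive.
    API_INTRODUCED_IN = {
        "android.app.NotificationChannel.<init>": 26,
        "android.view.View.setOnClickListener": 1,
    }

    def find_forward_mismatches(invoked_apis, min_sdk):
        # invoked_apis: fully qualified methods found in the app's bytecode.
        # Returns calls that can crash on devices running an API level below the
        # method's introduction level, unless guarded by a runtime version check.
        return [(api, API_INTRODUCED_IN[api])
                for api in invoked_apis
                if api in API_INTRODUCED_IN and API_INTRODUCED_IN[api] > min_sdk]

    # Example: an app with minSdkVersion 21 that constructs a NotificationChannel
    # would be reported, since that class only exists from API level 26 onward.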




Master’s Thesis Defense: Ravi Kiran Puttaswamy

“Scale-Out Algorithm For Apache Storm In SaaS Environment”

Committee Members: Dr. Ying Lu, Advisor
Dr. David Swanson and Dr. Hongfeng Yu

Tuesday, November 20, 2018, 2:00 p.m.

211 Schorr Center

Abstract: The main appeal of the Cloud is its cost-effective and flexible access to computing power. Apache Storm is a data processing framework used to process streaming data. In our work, we explore the possibility of offering Apache Storm as a software service. Further, we take advantage of the cgroups feature in Storm to divide the computing power of a worker machine into smaller units to be offered to users. We predict that the compute bounds placed on the cgroups can be used to approximate the state of the workflow. We discuss the limitations of the current schedulers in facilitating this type of approximation, as they distribute resources in arbitrary ways. We implement a new custom scheduler that gives the user more explicit control over the way resources are distributed to components in the workflow. We further build a simple model to approximate the current state of the workflow and to predict its future state under changes in resource allocation. We propose a scale-out algorithm to increase the throughput of the workflow, using the predictive model to measure the effects of many candidate allocations before choosing one. Our approach analyzes the strengths and drawbacks of the Stela algorithm and designs a complementary algorithm. We show that the combined algorithms complement each other's strengths and drawbacks and provide allocations that maximize throughput for a much larger set of scenarios. We implement the algorithm as a stand-alone scheduler and evaluate the strategy through physical simulation on the testbed and through software simulations for a set of workflows.
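To illustrate the shape of a model-guided scale-out loop like the one described above (a sketch under stated assumptions, not the thesis's or Stela's algorithm), the following Python fragment greedily hands out spare cgroup CPU-share units to whichever topology component the predictive model says will improve throughput the most. The allocation format, the unit of a "share", and the predict_throughput callback are all assumptions for the example.

    def scale_out(allocation, free_shares, candidates, predict_throughput):
        # allocation: dict mapping component name -> cgroup CPU shares
        # candidates: components eligible to receive additional shares
        # predict_throughput: model estimating workflow throughput for an allocation
        while free_shares > 0:
            base = predict_throughput(allocation)
            best, best_gain = None, 0.0
            for comp in candidates:
                trial = dict(allocation)
                trial[comp] += 1                      # try adding one share unit
                gain = predict_throughput(trial) - base
                if gain > best_gain:
                    best, best_gain = comp, gain
            if best is None:                          # no candidate improves throughput
                break
            allocation[best] += 1
            free_shares -= 1
        return allocation

Evaluating each candidate against the model before committing it is what lets the scheduler compare many possible allocations cheaply, instead of redeploying the topology to test each one.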