Du Li will present his PhD dissertation defense "Dynamic Data Race Detection and Healing" on Thursday, August 30 at 12:15pm in 103C Avery Hall.
Abstract:
Perpetual availability is an important operational goal in today's computer systems. However, achieving this goal is hard because modern software systems contain faults that can cause them to fail. In addition, the prevalence of multi-threading makes the goal even more challenging since it can introduce concurrency faults.
Data races, which involve two concurrent accesses to the same data, at least one of which is a write, are the most common concurrency faults. Numerous approaches have been proposed to capture data races dynamically. However, the overhead of existing approaches is still too high for them to be used efficiently in deployed systems. In this dissertation, our main goal is to explore different techniques that can satisfy this need for low-overhead race detection.
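The definition above can be made concrete with a small check. The following is a minimal sketch, not the detector studied in the dissertation: it assumes each access is tagged with a vector clock, and the names `Access`, `happens_before`, and `is_race` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Access:
    """One memory access (hypothetical record used only for illustration)."""
    thread: int             # id of the thread performing the access
    address: str            # name of the memory location touched
    is_write: bool          # True for a write, False for a read
    clock: Tuple[int, ...]  # vector clock at the time of the access


def happens_before(a: Access, b: Access) -> bool:
    """True if a's vector clock is component-wise <= b's and the two differ."""
    return all(x <= y for x, y in zip(a.clock, b.clock)) and a.clock != b.clock


def is_race(a: Access, b: Access) -> bool:
    """Two accesses race if they come from different threads, touch the same
    data, include at least one write, and are not ordered by happens-before."""
    if a.thread == b.thread or a.address != b.address:
        return False
    if not (a.is_write or b.is_write):
        return False
    return not happens_before(a, b) and not happens_before(b, a)


# Thread 0 writes x; thread 1 reads x without any synchronization in between.
write_x = Access(thread=0, address="x", is_write=True, clock=(1, 0))
unsynced_read = Access(thread=1, address="x", is_write=False, clock=(0, 1))
# Same read, but performed after thread 1 has synchronized with thread 0.
synced_read = Access(thread=1, address="x", is_write=False, clock=(1, 2))
```

Here `is_race(write_x, unsynced_read)` holds because neither access is ordered before the other, while `is_race(write_x, synced_read)` does not, since the write happens before the read.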
As our first step, we investigate the main sources of race detection overhead and find that much of the effort is spent monitoring operations that either cannot be involved in races or can be involved in races but are monitored again and again. Based on these observations, we propose two orthogonal optimizations for race detection: Stationary Object Suppression (SOS) and Loop Iteration Sampling (LIS).
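The abstract does not describe how LIS works, so the following is only a generic illustration of the idea of not monitoring every loop iteration, under the assumption of a fixed sampling period. The function names and the `period` parameter are hypothetical and should not be read as the dissertation's design.

```python
from typing import Iterator, List, Tuple


def sampled_iterations(n_iterations: int, period: int) -> Iterator[Tuple[int, bool]]:
    """Yield (index, monitored) pairs; every `period`-th iteration is monitored."""
    for i in range(n_iterations):
        yield i, (i % period == 0)


def run_loop(data: List[int], period: int) -> Tuple[int, List[int]]:
    """Sum `data`, recording which iterations would be handed to a detector."""
    monitored = []
    total = 0
    for i, watch in sampled_iterations(len(data), period):
        if watch:
            monitored.append(i)  # stand-in for invoking the race detector
        total += data[i]         # the loop body itself runs on every iteration
    return total, monitored
```

With `period=3`, a six-iteration loop still computes its full result, but only iterations 0 and 3 would pay the monitoring cost.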
To achieve perpetual availability, the next step is to address these software faults as they are detected during deployment. We propose a race healing system, RaceDr, which can automatically generate and apply repairs during program execution. The system fixes faulty code immediately after a race is detected to prevent the race from occurring again. Our investigation shows that the overhead of our repair system is about 15% over a system that operates without the repair capability.