![Miryung Kim](https://newsroom.unl.edu/announce/files/file113828.jpg)
Colloquium: Miryung Kim
Tuesday, April 9
4 p.m.
Avery 115
The reception will begin at 3:30 p.m. in Avery 348.
Title: "Software Engineering for Data Science and Big Data Analytics"
Abstract: The demand for analyzing large-scale telemetry, machine, and quality data is rapidly increasing in the software industry. Data scientists are becoming increasingly common within software teams. We conducted a large-scale survey of 793 professional data scientists at Microsoft to understand their educational backgrounds, the problem topics they work on, their tool usage, and their activities.
To process massive quantities of data, data scientists leverage data-intensive scalable computing (DISC) systems in the cloud, such as Google's MapReduce, Hadoop, and Apache Spark. While DISC systems help address the scalability challenges of big data analytics, they also introduce new challenges in debugging. In this talk, I will first describe interactive, real-time debugging primitives that we designed for Apache Spark, a next-generation data-intensive scalable cloud computing platform, and briefly describe the data provenance and optimized incremental computation capabilities that we built within Apache Spark to support debugging effectively and efficiently. Then, I will describe automated debugging that combines insights from automated fault isolation in software engineering and data provenance in database systems to find a minimum set of failure-inducing inputs.
Bio: Miryung Kim is an Associate Professor in the Department of Computer Science at the University of California, Los Angeles, and is a Director of the Software Engineering and Analysis Laboratory. She is known for her research on code clones--code duplication detection, management, and removal solutions. Recently, she has taken a leadership role in creating and defining the emerging area at the intersection of software engineering and data science.
She received her B.S. in Computer Science from the Korea Advanced Institute of Science and Technology (KAIST) in 2001 and her M.S. and Ph.D. in Computer Science and Engineering from the University of Washington in 2003 and 2008, respectively. She ranked No. 1 among all engineering and science students at KAIST in 2001 and received the Korean Ministry of Education, Science, and Technology Award, the highest honor given to an undergraduate student in Korea. She has received various awards, including an NSF CAREER Award, a Google Faculty Research Award, and an Okawa Foundation Research Award. Between January 2009 and August 2014, she was an assistant professor at the University of Texas at Austin. Her research is funded by the National Science Foundation, the Air Force Research Laboratory, Google, Microsoft, IBM, Intel, the Okawa Foundation, and Samsung. She is currently leading a $5M Office of Naval Research project on synergistic software customization. She is a Program Co-Chair of the 35th IEEE International Conference on Software Maintenance and Evolution and an Associate Editor of IEEE Transactions on Software Engineering.
Webpage: http://web.cs.ucla.edu/~miryung/