Math colloquium with Vinod Variyam this Friday

Vinod Variyam
Vinod Variyam

Math Colloquium: Vinod Variyam
Friday, March 8
4–5 PM
115 Avery Hall

"Distinct Elements in Streams: An Algorithm For The (Text) Book"

The basic computational task of estimating the number of unique elements in datasets, known as the Distinct Elements Problem, is critical in numerous domains including network traffic analysis, database management, marketing analysis, bioinformatics, textual data analysis, and fraud detection. This problem becomes particularly challenging when the dataset is too large to be stored or processed in its entirety.

This talk will highlight recent algorithmic progress on the distinct elements problem within the data streaming model. The data streaming model is popular for handling large amounts of streaming data. In this framework, data items arrive sequentially from a finite set, with the model’s parameters imposing strict limitations on storage and processing capabilities. Designing algorithms with limited resources for the distinct elements problem has been a significant focus for researchers, given its broad appeal. While earlier approaches relied on hashing techniques, a pure sampling-based algorithm remained elusive for many years. We have recently discovered a simple sampling-based algorithm (the CVM algorithm) for the distinct elements problem. This talk will present the CVM algorithm and its generalization in a broader data streaming scenario.

The talk is based on joint research with Sourav Chakraborty of the Indian Statistical Institute and Kuldeep S. Meel of the University of Toronto.

Professor Vinodchandran Variyam's research interests are centered around computer science subjects that involve fundamental computational efficiency concerns. His work encompasses various core computer science areas, such as computational complexity theory, machine learning, and large data management. He has made significant contributions to complexity theory, advancing the state-of-the-art in several important topics, including de-randomization, circuit complexity, space-limited computations, and Kolmogorov complexity. Beyond computational complexity, Vairyam is currently engaged in research themes that include reproducible computations, sample efficiency in distribution learning and testing, and algorithms development for streaming data and their applications. His work in these areas has the potential to make substantial advancements in the field and contribute to solving complex challenges related to efficiency and computation.