Hyo-Sang Lim - Purdue University
Provenance-based Data Trustworthiness Assessment in Data Streams
Feb 17, 2010
Abstract
This talk presents a systematic approach for estimating the trustworthiness of data items in data stream environments such as sensor networks. The approach uses both the provenance of data items and their values. To obtain trust scores, it uses a cyclic framework that reflects their interdependency: the trust scores of data items affect the trust scores of network nodes, and vice versa. The trust scores of data items are computed from their value similarity and provenance similarity. Value similarity follows the principle that "the more similar values for the same event, the higher the trust scores," and we compute it under the assumption that values are normally distributed. Provenance similarity follows the principle that "the more different provenances with similar values, the higher the trust scores," and we compute it using tree similarity. Since new data items continuously arrive in data stream management systems (DSMSs), trust scores must evolve (i.e., be recomputed) to reflect those new items. As an evolution scheme, we propose a batch mode, which computes scores periodically or non-periodically, along with an immediate mode. Experimental results show that the approach is efficient and effective in data stream environments.
About the Speaker
Hyo-Sang Lim is a postdoctoral researcher in the Department of Computer Science and CERIAS at Purdue University. He received his B.S. degree in computer science from Yonsei University, South Korea, and his M.S. and Ph.D. degrees in computer science from KAIST (Korea Advanced Institute of Science and Technology). His research interests include query processing and security issues in data streams, sensor networks, and spatial databases.