Abstract
Confidence policy is a novel notion that exploits the trustworthiness of data items in data management and query processing. In this paper, we address the problem of enforcing confidence policies in data stream management
systems (DSMSs), which is crucial in supporting users with different access rights, processing confidence-aware continuous queries, and protecting streaming
data. To this end, we first propose a DSMS-based framework for confidence policy management and then present a systematic approach for estimating the trustworthiness
of data items. Our approach uses the provenance of data items as well as their values. We introduce two types of data provenance: the physical provenance, which represents the delivery history of each data item, and the logical provenance, which describes the semantic meaning of each data item. The logical provenance is used for grouping data items into semantic events with the same meaning
or purpose. By contrast, the tree-shaped physical provenance is used in computing trust scores, that is, quantitative measures of trustworthiness. To obtain trust
scores, we propose a cyclic framework that reflects the inter-dependency property: the trust scores of data items affect the trust scores of network nodes,
and vice versa. The trust scores of data items are computed from their value similarity and provenance similarity. The value similarity comes from the principle
that “the more similar values for the same event, the higher the trust scores,” and we compute it under the assumption of a normal distribution. The provenance similarity
is based on the principle that “the more different physical provenances with similar values, the higher the trust scores,” and we compute it using tree similarity.
Since new data items continuously arrive in DSMSs, we need to evolve (i.e., recompute) trust scores to reflect those new items. As an evolution scheme, we propose a batch mode, which recomputes scores periodically or non-periodically, along with an immediate mode. To the best of our knowledge, our approach is the first to support the enforcement of confidence policies in DSMSs. Experimental results show that our approach is very efficient.
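
As an illustration only, the cyclic dependency between item scores and node scores can be sketched in a few lines of Python. This is not the paper's algorithm. The function names (`value_similarity`, `evolve_scores`), the two-sided tail probability used as the value-similarity score, the equal-weight blend, and the fixed number of rounds are all assumptions made for this sketch. Provenance (tree) similarity is omitted for brevity.

```python
from statistics import NormalDist, mean, pstdev


def value_similarity(values):
    """Score each value of one event by its two-sided tail probability
    under a normal distribution fitted to the values (an assumption of
    this sketch). Values near the mean score close to 1, outliers near 0."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values agree, so every item gets the maximum score.
        return [1.0 for _ in values]
    dist = NormalDist(mu, sigma)
    return [2.0 * (1.0 - dist.cdf(mu + abs(v - mu))) for v in values]


def evolve_scores(items, node_scores, weight=0.5, rounds=3):
    """Hypothetical cyclic update for the items of a single event.

    items: list of (value, path) pairs, where path lists the ids of the
           nodes that delivered the item (a flattened physical provenance).
    node_scores: dict mapping node id to a current score in [0, 1].
    """
    node_scores = dict(node_scores)  # do not mutate the caller's dict
    sims = value_similarity([value for value, _ in items])
    item_scores = [0.0] * len(items)
    for _ in range(rounds):
        # Node scores influence item scores ...
        for i, (_, path) in enumerate(items):
            from_nodes = mean(node_scores[n] for n in path)
            item_scores[i] = weight * from_nodes + (1 - weight) * sims[i]
        # ... and item scores in turn influence node scores.
        for n in node_scores:
            through = [item_scores[i]
                       for i, (_, path) in enumerate(items) if n in path]
            if through:
                node_scores[n] = mean(through)
    return item_scores, node_scores


# Three readings of the same event; the third is an outlier.
readings = [(20.0, ["a", "s"]), (20.5, ["b", "s"]), (35.0, ["c", "s"])]
initial = {"a": 0.5, "b": 0.5, "c": 0.5, "s": 0.5}
item_scores, node_scores = evolve_scores(readings, initial)
```

In this toy run the outlying reading, and the node `c` that delivered only that reading, end up with the lowest scores, which is the qualitative behavior the inter-dependency property describes.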