Assessing the Trustworthiness of Streaming Data

Download

PDF

Author

Hyo-Sang Lim, Yang-Sae Moon, Elisa Bertino

Tech report number

CERIAS TR 2010-09

Entry type

techreport

Abstract

The confidence policy is a novel notion that exploits the trustworthiness of data items in data management and query processing. In this paper we address the problem of enforcing confidence policies in data stream management systems (DSMSs), which is crucial for supporting users with different access rights, processing confidence-aware continuous queries, and protecting secure streaming data. We first propose a DSMS-based framework for confidence policy management and then present a systematic approach for estimating the trustworthiness of data items. Our approach uses the provenance of data items as well as their values. We introduce two types of data provenance: the physical provenance, which represents the delivery history of each data item, and the logical provenance, which describes the semantic meaning of each data item. The logical provenance is used for grouping data items into semantic events with the same meaning or purpose. By contrast, the tree-shaped physical provenance is used in computing trust scores, that is, quantitative measures of trustworthiness. To obtain trust scores, we propose a cyclic framework that reflects the interdependency property: the trust scores of data items affect the trust scores of network nodes, and vice versa. The trust scores of data items are computed from their value similarity and provenance similarity. The value similarity comes from the principle that “the more similar values for the same event, the higher the trust scores,” and we compute it under the assumption of a normal distribution. The provenance similarity is based on the principle that “the more different physical provenances with similar values, the higher the trust scores,” and we compute it using tree similarity. Since new data items continuously arrive in DSMSs, we need to evolve (i.e., recompute) trust scores to reflect those new items. As evolution schemes, we propose a batch mode, which computes scores (non)periodically, along with an immediate mode. To the best of our knowledge, our approach is the first to support the enforcement of confidence policies in DSMSs. Experimental results show that our approach is very efficient.

Date

2010-07-09

Key alpha

Lim

Affiliation

Purdue University, Kangwon National University, South Korea

Publication Date

2010-07-09

BibTex-formatted data

To cite this entry, select and copy the text below and paste it into your BibTeX file. Note that the text may not contain all macros that BibTeX supports.

@Techreport{ Lim,
	title = "Assessing the Trustworthiness of Streaming Data",
	author = "Hyo-Sang Lim and Yang-Sae Moon and Elisa Bertino",
	year = "2010",
	month = "7",
	day = "9",
	abstract = "The confidence policy is a novel notion that exploits the trustworthiness of data items in data management and query processing. In this paper we address the problem of enforcing confidence policies in data stream management systems (DSMSs), which is crucial for supporting users with different access rights, processing confidence-aware continuous queries, and protecting secure streaming data. We first propose a DSMS-based framework for confidence policy management and then present a systematic approach for estimating the trustworthiness of data items. Our approach uses the provenance of data items as well as their values. We introduce two types of data provenance: the physical provenance, which represents the delivery history of each data item, and the logical provenance, which describes the semantic meaning of each data item. The logical provenance is used for grouping data items into semantic events with the same meaning or purpose. By contrast, the tree-shaped physical provenance is used in computing trust scores, that is, quantitative measures of trustworthiness. To obtain trust scores, we propose a cyclic framework that reflects the interdependency property: the trust scores of data items affect the trust scores of network nodes, and vice versa. The trust scores of data items are computed from their value similarity and provenance similarity. The value similarity comes from the principle that “the more similar values for the same event, the higher the trust scores,” and we compute it under the assumption of a normal distribution. The provenance similarity is based on the principle that “the more different physical provenances with similar values, the higher the trust scores,” and we compute it using tree similarity. Since new data items continuously arrive in DSMSs, we need to evolve (i.e., recompute) trust scores to reflect those new items. As evolution schemes, we propose a batch mode, which computes scores (non)periodically, along with an immediate mode. To the best of our knowledge, our approach is the first to support the enforcement of confidence policies in DSMSs. Experimental results show that our approach is very efficient.",
	affiliation = "Purdue University, Kangwon National University, South Korea",
}