The Center for Education and Research in Information Assurance and Security (CERIAS)

On Watermarking Semistructures

Author

Radu Sion and Mikhail Atallah and Sunil Prabhakar

Tech report number

CERIAS TR 2001-54

Entry type

techreport

Abstract

Watermarking, in the traditional sense, is the technique of embedding undetectable (unperceivable) hidden information into multimedia objects (i.e., images, audio, video, text), mainly to protect the data from unauthorized duplication and distribution by enabling provable ownership over the content. Whereas considerable work has been invested in this topic, little has been done (with the notable exception of attempts in software watermarking and recent progress in the area of natural language processing) to enable the same concept in the area of semi-structured non-media data such as XML, databases and non-multimedia repositories. We believe that there is much to be gained from the ability to embed non-destructive hidden information in this kind of content, in particular considering the current mainstream migration of business interactions towards distributed computing technologies using markup languages such as XML and underlying database storage. Watermarking in the area of semi-structured data presents a whole new set of challenges and associated trade-offs. One main characterizing difference can be expressed simply as "lack of bandwidth", deriving from the inherent lack of a major noise component in that domain. We present some of the issues encountered in the course of our ongoing work in watermarking XML and numeric database content. We define a preliminary model-level analysis of the new domain and corresponding transforms. We design a method for watermarking semistructures based on a novel canonical labeling algorithm that self-adjusts to the specifics of the content. Labeling is tolerant to a significant number of graph attacks ("surgeries") and relies on a complex "training" phase at watermarking time in which it reaches an optimal stability point with respect to the expected attacks. Watermark detection works without requiring the original unmarked object. We analyse how to perform efficient and useful generic node content summarisation and hashing.
We treat the issue of graph partitioning in the framework of hierarchical watermarking and show how hierarchical watermarking effectively amplifies the power of weak marking algorithms, leading to an ultimately more powerful and robust watermark. We perform experiments enforcing some of the introduced algorithms (e.g., labeling) under different attack conditions and present some of the conclusions. Envisioned medium- and long-term research issues are outlined.

Booktitle

(submission)

Institution

Purdue University

Key alpha

sion2002wmsemistructures

Affiliation

CERIAS and Computer Sciences

Publication Date

1900-01-01

Language

English

BibTeX-formatted data

To refer to this entry, you may select and copy the text below and paste it into your BibTeX document. Note that the text may not contain all macros that BibTeX supports.