Abstract
Two main results in the area of information hiding in natural lan-
guage text are presented. A semantically-based scheme dramatically im-
proves the information-hiding capacity of any text through two tech-
niques: (i) modifying the granularity of meaning of individual sentences,
whereas our own previous scheme kept the granularity fixed, and (ii) halv-
ing the number of sentences affected by the watermark. No longer a “long
text, short watermark” approach, it now makes it possible to watermark
short texts like wire agency reports. Using both the above-mentioned se-
mantic marking scheme and our previous syntactically-based method hides
information in a way that reveals any non-trivial tampering with the text
(while re-formatting is not considered to be tampering—the problem would
be solved trivially otherwise by hiding a hash of the text) with a probability
$1 - 2^{-b(n+1)}$, n being its number of sentences and b a small positive integer
based on the extent of co-referencing.
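
As a rough illustration of how quickly this bound approaches 1, the following minimal sketch evaluates the stated probability for given n and b. The function name and the use of exact fractions are choices made here for illustration only; they are not part of the scheme itself.

```python
from fractions import Fraction


def tamper_detection_probability(n: int, b: int) -> Fraction:
    """Evaluate 1 - 2^(-b(n+1)) exactly.

    n: number of sentences in the text.
    b: small positive integer (per the abstract, based on the extent
       of co-referencing); how b is derived is not modelled here.
    """
    if n < 1 or b < 1:
        raise ValueError("n and b must be positive integers")
    return 1 - Fraction(1, 2 ** (b * (n + 1)))


if __name__ == "__main__":
    # Illustrative inputs only: a few short texts with b = 1 and b = 2.
    for n in (3, 10, 20):
        for b in (1, 2):
            p = tamper_detection_probability(n, b)
            print(f"n={n:2d} b={b} -> {float(p):.10f}")
```

Even for a text of only a few sentences, the chance that tampering goes unnoticed falls off exponentially in b(n+1), which is consistent with the claim that short texts can be protected.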