The Center for Education and Research in Information Assurance and Security (CERIAS)

Accessing Textual Documents Using Compressed Indexes of Arrays of Small Bloom Filters

Author

J. K. Mullin

Entry type

article

Abstract

A highly compressed index for a collection of variable-sized documents is described. Arrays of small Bloom filters are used to efficiently locate documents where the search probe contains 'anded' and 'ored' combinations of words. Theoretical and experimental results are reported. The method is applicable to unplanned searching of large text files. We further describe a method to provide an index to the filters. Thus only a small proportion of the compressed filter need be examined. The method is highly amenable to parallel processing.
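
The general idea in the abstract, one small Bloom filter per document probed with 'anded' and 'ored' word combinations, can be illustrated with a minimal sketch. This is not the paper's method: it omits the compression and the index over the filters, and the names `SmallBloom`, `build_index`, and `search`, the 256-bit filter size, the three hash functions, and the SHA-256 hashing are all assumptions chosen for illustration.

```python
import hashlib


class SmallBloom:
    """Fixed-size Bloom filter stored as an integer bitmask (illustrative only)."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, word):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits |= 1 << pos

    def might_contain(self, word):
        # True for every added word; may also be True for an absent word
        # (a false positive), but never False for a word that was added.
        return all((self.bits >> pos) & 1 for pos in self._positions(word))


def build_index(documents, num_bits=256, num_hashes=3):
    """Build one small Bloom filter per document (hypothetical helper)."""
    index = []
    for text in documents:
        bloom = SmallBloom(num_bits, num_hashes)
        for word in text.lower().split():
            bloom.add(word)
        index.append(bloom)
    return index


def search(index, query):
    """Return candidate document ids for a query.

    `query` is a list of OR-groups that are ANDed together, e.g.
    [["bloom", "hash"], ["filters"]] means (bloom OR hash) AND filters.
    """
    return [
        doc_id
        for doc_id, bloom in enumerate(index)
        if all(any(bloom.might_contain(w.lower()) for w in group) for group in query)
    ]


docs = [
    "arrays of small bloom filters",
    "compressed index of documents",
    "parallel processing of text files",
]
index = build_index(docs)
candidates = search(index, [["bloom", "hash"], ["filters"]])
```

Because a Bloom filter can report false positives, `candidates` is a superset of the true matches. A real system would confirm each candidate against the document text.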

Date

1987

Journal

The Computer Journal

Key alpha

Mullin

Number

4

Volume

30

Location

A hard-copy of this is in the Papers Cabinet

BibTeX-formatted data

To refer to this entry, you may select and copy the text below and paste it into your BibTeX document. Note that the text may not contain all macros that BibTeX supports.