Abstract
A highly compressed index for a collection of variable-sized documents is described.
Arrays of small Bloom filters are used to efficiently locate documents that match a search
probe containing ANDed and ORed combinations of words. Theoretical and experimental
results are reported. The method is applicable to unplanned searching of large text
files. We further describe a method to provide an index to the filters, so that only a
small proportion of the compressed filter need be examined. The method is highly
amenable to parallel processing.
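
As an illustration only, the following Python sketch shows one way that one small Bloom filter per document could be used to answer a probe made of ANDed and ORed words. The filter size, the number of hash functions, the SHA-256-based hashing, and all function names are assumptions made for this sketch, not details taken from the paper. The compression of the filters and the index to the filters are not modelled.

```python
import hashlib

FILTER_BITS = 256  # assumed size of each per-document filter, in bits
NUM_HASHES = 3     # assumed number of hash functions per word


def word_mask(word):
    """Return an integer bit mask with the NUM_HASHES bits for this word set."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    mask = 0
    for i in range(NUM_HASHES):
        # Take 4 bytes of the digest per hash function (hypothetical scheme).
        pos = int.from_bytes(digest[4 * i:4 * i + 4], "big") % FILTER_BITS
        mask |= 1 << pos
    return mask


def make_filter(text):
    """Build one small Bloom filter (as an integer) from a document's words."""
    bits = 0
    for word in text.split():
        bits |= word_mask(word)
    return bits


def candidates(filters, probe):
    """Return indices of documents that may satisfy the probe.

    `probe` is a list of word groups: words inside a group are ANDed,
    and the groups themselves are ORed.
    """
    group_masks = []
    for group in probe:
        mask = 0
        for word in group:
            mask |= word_mask(word)  # all bits of all ANDed words must be set
        group_masks.append(mask)
    return [
        i for i, f in enumerate(filters)
        if any((f & m) == m for m in group_masks)
    ]


docs = [
    "compressed index for documents",
    "parallel processing of text files",
    "bloom filters locate documents",
]
filters = [make_filter(d) for d in docs]
# Probe: (bloom AND filters) OR parallel
hits = candidates(filters, [["bloom", "filters"], ["parallel"]])
```

A Bloom filter can report a document that does not contain the words (a false positive) but never misses one that does, so each candidate would still need to be checked against the document text. Because each filter is tested independently, the loop over documents could be split across processors.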