(Almost) Optimal Parallel Block Access for Range Queries
Author
Mikhail Atallah, Sunil K. Prabhakar
Tech report number
CERIAS TR 2003-08
Abstract
Range queries are an important class of queries for several applications. can be achieved by tiling the multi-dimensional data and distributing it among multiple disks or nodes. It has been established that schemes that achieve optimal parallel block access exist only for a few special cases. Though several schemes for the allocation of tiles to disks have been developed, no scheme with non-trivial worst-case bound is known. We establish that any range query on a 2^q x 2^q-block grid of blocks can be performed using k = 2^t disks (t≤q), in at most [m/k] + O(log k) parallel block accesses. We achieve this result by judiciously distributing the blocks among the k nodes or disks. Experimental data show that the algorithm achieves very close to [m/k] performance (on average less than 0.5 away from [m/k], with a worst-case of 3). Although several declustering schemes for range queries have been developed, prior to our work no additive non-trivial performance bounds were known. Our scheme guarantees performance within a (small) additive deviation from [m/k]. Subsequent to this work, Bhatia et al. [4] have proved that such a performance bound is essentially optimal for this kind of scheme, and have also extended our results to the case where the number of disks is a product of the form k1*k2...*kt where kis need to all be 2.
Booktitle
19th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS)
Affiliation
Purdue University
Publication Date
2000-05-01
Keywords
Multi-Dimensional Data, Parallel I/O, Declustering, Range Queries