The Center for Education and Research in Information Assurance and Security (CERIAS)

The Center for Education and Research in
Information Assurance and Security (CERIAS)

Hash-Merge Join: A Non-blocking Join Algorithm for Producing Fast and Early Join Results

Author

MF Mokbel, M Lu, WG Aref

Entry type

inproceedings

Abstract

This paper introduces the hash-merge join algorithm(HMJ, for short); a new non-blocking join algorithm thatdeals with data items from remote sources via unpredictable,slow, or bursty network traffic. The HMJ algorithmis designed with two goals in mind: (1) Minimize thetime to produce the first few results, and (2) Produce joinresults even if the two sources of the join operator occasionallyget blocked. The HMJ algorithm has two phases: Thehashing phase and the merging phase. The hashing phaseemploys an in-memory hash-based join algorithm that producesjoin results as quickly as data arrives. The mergingphase is responsible for producing join results if the twosources are blocked. Both phases of the HMJ algorithmare connected via a flushing policy that flushes in-memoryparts into disk storage once the memory is exhausted. Experimentalresults show that HMJ combines the advantagesof two state-of-the-art non-blocking join algorithms (XJoinand Progressive Merge Join) while avoiding their short-comings.

Date

2004

Booktitle

Proceedings of the 20th International Conference on Data Engineering

Journal

ICDE

Key alpha

Aref

Pages

251

Publication Date

2004-01-01

BibTex-formatted data

To refer to this entry, you may select and copy the text below and paste it into your BibTex document. Note that the text may not contain all macros that BibTex supports.