The Center for Education and Research in Information Assurance and Security (CERIAS)

Reports and Papers Archive


Analysis of Multi-Dimensional Space-Filling Curves

MF Mokbel, WG Aref, I Kamel

A space-filling curve is a way of mapping the multi-dimensional space into the 1-D space. It acts like a thread that passes through every cell element (or pixel) in the D-dimensional space so that every cell is visited exactly once. There are numerous kinds of space-filling curves. The difference between such curves is in their way of mapping to the 1-D space. Selecting the appropriate curve for any application requires knowledge of the mapping scheme provided by each space-filling curve. A space-filling curve consists of a set of segments. Each segment connects two consecutive multi-dimensional points. Five different types of segments are distinguished, namely, Jump, Contiguity, Reverse, Forward, and Still. A description vector V=(J, C, R, F, S), where J, C, R, F, and S are the percentages of Jump, Contiguity, Reverse, Forward, and Still segments in the space-filling curve, encapsulates all the properties of a space-filling curve. The knowledge of V facilitates the process of selecting the appropriate space-filling curve for different applications. Closed formulas are developed to compute the description vector V for any D-dimensional space and grid size N for different space-filling curves. A comparative study of different space-filling curves with respect to the description vector is conducted and results are presented and discussed.
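The abstract names the five segment types but does not define them, so the sketch below rests on an assumed reading: along a chosen dimension, a segment is Still if the coordinate does not change, Forward or Reverse according to the sign of the change, and Contiguity or Jump according to whether the change has magnitude 1 or more than 1. Under that assumption, the percentages for a 2-D Z-order (Morton) curve can be counted directly rather than from the paper's closed formulas. The function names `z_order_points` and `description_vector` are hypothetical.

```python
from collections import Counter


def z_order_points(bits):
    """Return the cells (x, y) of a 2^bits x 2^bits grid in Z-order (Morton order)."""
    points = []
    for code in range(1 << (2 * bits)):
        x = y = 0
        for b in range(bits):
            # Even bits of the code go to x, odd bits go to y.
            x |= ((code >> (2 * b)) & 1) << b
            y |= ((code >> (2 * b + 1)) & 1) << b
        points.append((x, y))
    return points


def description_vector(points, dim):
    """Percentages of J, C, R, F, S segments along dimension `dim`.

    Segment definitions are an assumption of this sketch (see the lead-in).
    """
    counts = Counter()
    for p, q in zip(points, points[1:]):
        d = q[dim] - p[dim]
        if d == 0:
            counts["S"] += 1
        else:
            counts["F" if d > 0 else "R"] += 1
            counts["J" if abs(d) > 1 else "C"] += 1
    total = len(points) - 1
    return {k: 100.0 * counts[k] / total for k in "JCRFS"}


if __name__ == "__main__":
    curve = z_order_points(3)  # 8 x 8 grid
    for dim in (0, 1):
        print(dim, description_vector(curve, dim))
```

With these assumed definitions, every segment is either Still or has both a direction and a step size, so J + C + S and R + F + S each sum to 100 along any dimension.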

Added 2008-04-22

ClassView: hierarchical video shot classification, indexing, and accessing

J Fan, AK Elmagarmid, X Zhu, WG Aref, L Wu

Recent advances in digital video compression and networks have made video more accessible than ever. However, existing content-based video retrieval systems still suffer from two problems: 1) a semantics-sensitive video classification problem, caused by the semantic gap between low-level visual features and high-level semantic visual concepts; and 2) an integrated video access problem, caused by the lack of efficient video database indexing, automatic video annotation, and concept-oriented summary organization techniques. In this paper, we propose a novel framework, called ClassView, to make some advances toward more efficient video database indexing and access. 1) A hierarchical semantics-sensitive video classifier is proposed to shorten the semantic gap. The hierarchical tree structure of the semantics-sensitive video classifier is derived from the domain-dependent concept hierarchy of video contents in a database. Relevance analysis is used for selecting the discriminating visual features with suitable importance. The Expectation-Maximization (EM) algorithm is also used to determine the classification rule for each visual concept node in the classifier. 2) A hierarchical video database indexing and summary presentation technique is proposed to support more effective video access over a large-scale database. The hierarchical tree structure of our video database indexing scheme is determined by the domain-dependent concept hierarchy, which is also used for video classification. The presentation of the visual summary is also integrated with the inherent hierarchical video database indexing tree structure. Integrating video access with an efficient database indexing tree structure provides a great opportunity for supporting more powerful video search engines.

Added 2008-04-22

Realizing Privacy-Preserving Features in Hippocratic Databases

Y Laura-Silva, WG Aref

Preserving privacy has become a crucial requirement for operating a business that manages personal data. Hippocratic databases have been proposed to answer this requirement through a database design that includes responsibility for the privacy of data as a founding tenet. We identify, study, and implement several privacy-preserving features that extend the previous work on Limiting Disclosure in Hippocratic databases. These features include the support of multiple policy versions, retention time, generalization hierarchies, and multiple SQL operations. The proposed features bring Hippocratic databases one step closer to fitting real-world scenarios. We present the design and implementation guidelines of each of the proposed features. The evaluation of the effect on performance shows that the cost of these extensions is small and scales well to large databases.

Added 2008-04-22

Exploring video content structure for hierarchical summarization

X Zhu, X Wu, J Fan, AK Elmagarmid, WG Aref

In this paper, we propose a hierarchical video summarization strategy that explores video content structure to provide the users with a scalable, multilevel video summary. First, video-shot-segmentation and keyframe-extraction algorithms are applied to parse video sequences into physical shots and discrete keyframes. Next, an affinity (self-correlation) matrix is constructed to merge visually similar shots into clusters (supergroups). Since video shots with high similarities do not necessarily imply that they belong to the same story unit, temporal information is adopted by merging temporally adjacent shots (within a specified distance) from the supergroup into each video group. A video-scene-detection algorithm is thus proposed to merge temporally or spatially correlated video groups into scenario units. This is followed by a scene-clustering algorithm that eliminates visual redundancy among the units. A hierarchical video content structure with increasing granularity is constructed from the clustered scenes, video scenes, and video groups to keyframes. Finally, we introduce a hierarchical video summarization scheme by executing various approaches at different levels of the video content hierarchy to statically or dynamically construct the video summary. Extensive experiments based on real-world videos have been performed to validate the effectiveness of the proposed approach.

Added 2008-04-22

Efficient query execution on broadcasted index tree structures

S Hambrusch, Chuan-Ming Liu, WG Aref, S Prabhakar

The continuous broadcast of data together with an index structure is an effective way of disseminating data in a wireless mobile environment. The index allows a mobile client to tune in only when relevant data is available on the channel and leads to reduced power consumption for the clients. This paper investigates the execution of queries on broadcasted index trees when query execution corresponds to a partial traversal of the tree. Queries exhibiting this behavior include range queries and nearest neighbor queries. We present two broadcast schedules for index trees and two query algorithms executed by mobile clients. Our solutions simultaneously minimize tuning time and latency and adapt to the client’s available memory. Experimental results using real and synthetic data compare results for a broadcast with node repetition to one without node repetition and they show how a priority-based data management can help reduce tuning time and latency.

Added 2008-04-22

Stream window join: tracking moving objects in sensor-network databases

MA Hammad, WG Aref, AK Elmagarmid

The widespread use of sensor networks presents revolutionary opportunities for life and environmental science applications. Many of these applications involve continuous queries that require the tracking, monitoring, and correlation of multi-sensor data that represent moving objects. We propose to answer these queries using a multi-way stream window join operator. This form of join over multi-sensor data must cope with the infinite nature of sensor data streams and the delays in network transmission. The paper introduces a class of join algorithms, termed W-join, for joining multiple infinite data streams. W-join addresses the infinite nature of the data streams by joining stream data items that lie within a sliding window and that match a certain join condition. W-join can be used to track the motion of a moving object or detect the propagation of clouds of hazardous material or pollution spills over time in a sensor network environment. We describe two new algorithms for W-join, and address variations and local/global optimizations related to specifying the nature of the window constraints to fulfill the posed queries. The performance of the proposed algorithms is studied experimentally in a prototype stream database system, using synthetic data streams and real time-series data. Tradeoffs of the proposed algorithms and their advantages and disadvantages are highlighted, given variations in the aggregate arrival rates of the input data streams and the desired response times per query.
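To make the sliding-window join idea concrete, here is a minimal two-stream sketch. It is not either of the paper's W-join algorithms, which are multi-way and are not described in the abstract. It assumes tuples arrive in timestamp order, uses key equality as the join condition, and treats two tuples as joinable when their timestamps differ by at most `window`. The function name `window_join` and the tuple layout are hypothetical.

```python
from collections import deque


def window_join(events, window):
    """Join two streams over a time-based sliding window.

    `events` is an iterable of (stream_id, timestamp, key) with stream_id in
    {0, 1}, assumed to be sorted by timestamp. Yields (key, ts0, ts1) for each
    pair of equal-key tuples, one per stream, at most `window` apart.
    """
    buffers = (deque(), deque())  # per-stream buffers of (timestamp, key)
    for stream_id, ts, key in events:
        # Expire tuples that can no longer match this or any later arrival.
        for buf in buffers:
            while buf and ts - buf[0][0] > window:
                buf.popleft()
        # Probe the opposite stream's buffer for matching keys.
        for other_ts, other_key in buffers[1 - stream_id]:
            if other_key == key:
                yield (key, ts, other_ts) if stream_id == 0 else (key, other_ts, ts)
        buffers[stream_id].append((ts, key))


if __name__ == "__main__":
    readings = [(0, 1, "a"), (1, 2, "a"), (1, 3, "b"), (0, 10, "a"), (1, 11, "a")]
    for match in window_join(readings, window=5):
        print(match)
```

Because expired tuples are dropped on every arrival, memory stays bounded by the window contents even though the input streams are unbounded.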

Added 2008-04-22

Adaptive motion-compensated video coding scheme towards content-based bit rate allocation

J Fan, DKY Yau, WG Aref, A Rezgui

An adaptive motion-compensated video coding scheme based on structural video component segmentation and coding complexity analysis is proposed in this paper. The bits are allocated more efficiently among different frame types and variant video components. A novel scene cut detection algorithm is proposed for partitioning the input video sequences into a set of shots, and each shot may be encoded as one or multiple GOPs according to its length. Moreover, the positions of the reference frames (I and P frames) in a video shot are adapted to improve the temporal predictability among frames and provide high coding efficiency, and thus high picture quality at the same bit rate. More bits are allocated to these reference frames to provide high quality in the reconstructed pictures. The residue frames in a video shot are encoded as bidirectional interpolation frames (B frames) and can also be quantized more coarsely because they have high temporal predictability and are not used as references. The bits that have been allocated for the three different frame types (I, P, B frames) can be further distributed more efficiently among variant video components to avoid coding artifacts. Experimental results show that the proposed adaptive video coding scheme is more efficient than traditional fixed-GOP coding algorithms and may be an efficient development of present adaptive coding techniques. ©2000 SPIE and IS&T.

Added 2008-04-22

Distributed framework for real-time multimedia object communication

H Fahm, WG Aref, M Latif, A Ghafoor, P Liu, L Hsu

We highlight major technical requirements for designing and developing future distributed multimedia information systems using Internet technology. The key requirements of such a system are to allow users to access, search, and communicate multimedia documents consisting of text, audio, video, and images. We emphasize the role of object-oriented technology for information management and of real-time communication protocols to guarantee QoS. We present a reference architecture for a Web-based real-time distributed multimedia system that integrates enabling technologies including real-time streaming, multimedia indexing and searching, and distributed object management.

Added 2008-04-22

Place: A Distributed Spatio-Temporal Data Stream Management System for Moving Objects

X Xiaopeng, HG Elmongui, X Chai, WG Aref

Added 2008-04-22

Detection and tracking of discrete phenomena in sensor-network databases

MH Ali, MF Mokbel, WG Aref, I Kamel

This paper introduces a framework for Phenomena Detection and Tracking (PDT, for short) in sensor network databases. Examples of detectable phenomena include the propagation over time of a pollution cloud or an oil spill region. We provide a crisp definition of a phenomenon that takes into consideration both the strength and the time span of the phenomenon. We focus on discrete phenomena where sensor readings are drawn from a discrete set of values, e.g., item numbers or pollutant IDs, and we point out how our work can be extended to handle continuous phenomena. The challenge for the proposed PDT framework is to detect as many phenomena as possible, given the large number of sensors, the overall high arrival rates of sensor data, and the limited system resources. Our proposed PDT framework uses continuous SQL queries to detect and track phenomena. Execution of these continuous queries is performed in three phases: the joining phase, the candidate selection phase, and the grouping/output phase. The joining phase employs an in-memory multi-way join algorithm that produces a set of sensor pairs with similar readings. The candidate selection phase filters the output of the joining phase to select candidate join pairs with enough strength and time span, as specified by the phenomenon definition. The grouping/output phase constructs the overall phenomenon from the candidate join pairs. We introduce two optimizations to increase the likelihood of phenomena detection while using fewer system resources. Experimental studies illustrate the performance gains of both the proposed PDT framework and the proposed optimizations.

Added 2008-04-22

An extensible index for spatial databases

WG Aref, IF Ilyas

Emerging database applications require the use of new indexing structures beyond B-trees and R-trees. Examples are the k-D tree, the trie, the quadtree, and their variants. They are often proposed as supporting structures in data mining, GIS, and CAD/CAM applications. A common feature of all these indexes is that they recursively divide the space into partitions. A new extensible index structure, termed SP-GiST, is presented that supports this class of data structures, mainly the class of space-partitioning unbalanced trees. Simple method implementations are provided that demonstrate how SP-GiST can behave as a k-D tree, a trie, a quadtree, or any of their variants. Issues related to clustering tree nodes into pages as well as concurrency control for SP-GiST are addressed. A dynamic minimum-height clustering technique is applied to minimize disk accesses and to make using such trees in database systems possible and efficient. A prototype implementation of SP-GiST is presented, as well as performance studies of SP-GiST’s various tuning parameters.

Added 2008-04-22

Incremental Evaluation of Sliding-Window Queries over Data Streams

TM Ghanem, MA Hammad, MF Mokbel, WG Aref, AK Elmagarmid

Two research efforts have been conducted to realize sliding-window queries in data stream management systems, namely, query reevaluation and incremental evaluation. In the query reevaluation method, two consecutive windows are processed independently of each other. On the other hand, in the incremental evaluation method, the query answer for a window is obtained incrementally from the answer of the preceding window. In this paper, we focus on the incremental evaluation method. Two approaches have been adopted for the incremental evaluation of sliding-window queries, namely, the input-triggered approach and the negative tuples approach. In the input-triggered approach, only the newly inserted tuples flow in the query pipeline and tuple expiration is based on the timestamps of the newly inserted tuples. On the other hand, in the negative tuples approach, tuple expiration is separated from tuple insertion where a tuple flows in the pipeline for every inserted or expired tuple. The negative tuples approach avoids the unpredictable output delays that result from the input-triggered approach. However, negative tuples double the number of tuples through the query pipeline, thus reducing the pipeline bandwidth. Based on a detailed study of the incremental evaluation pipeline, we classify the incremental query operators into two classes according to whether an operator can avoid the processing of negative tuples or not. Based on this classification, we present several optimization techniques over the negative tuples approach that aim to reduce the overhead of processing negative tuples while avoiding the output delay of the query answer. A detailed experimental study, based on a prototype system implementation, shows the performance gains over the input-triggered approach of the negative tuples approach when accompanied with the proposed optimizations.
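The negative tuples approach can be illustrated with a small sketch, assuming a time-based window of fixed length and a downstream per-value COUNT operator. The window operator emits a positive tuple on arrival and a negative tuple on expiration, and expiration can be triggered by the clock alone, without waiting for new input. The class names `WindowOperator` and `IncrementalCount` are hypothetical, and none of the paper's optimizations are modeled.

```python
from collections import Counter, deque


class WindowOperator:
    """Emits ('+', value) on arrival and ('-', value) when a tuple expires."""

    def __init__(self, window):
        self.window = window
        self.buffer = deque()  # (timestamp, value) in arrival order

    def advance(self, now):
        """Expire tuples at time `now`, independently of new arrivals."""
        out = []
        while self.buffer and now - self.buffer[0][0] >= self.window:
            _, value = self.buffer.popleft()
            out.append(("-", value))
        return out

    def insert(self, ts, value):
        """Expire old tuples, then admit the new one as a positive tuple."""
        out = self.advance(ts)
        self.buffer.append((ts, value))
        out.append(("+", value))
        return out


class IncrementalCount:
    """Maintains COUNT per value from the preceding answer, tuple by tuple."""

    def __init__(self):
        self.counts = Counter()

    def process(self, signed_tuples):
        for sign, value in signed_tuples:
            self.counts[value] += 1 if sign == "+" else -1
            if self.counts[value] == 0:
                del self.counts[value]
        return dict(self.counts)


if __name__ == "__main__":
    win, agg = WindowOperator(window=10), IncrementalCount()
    print(agg.process(win.insert(0, "x")))
    print(agg.process(win.insert(5, "x")))
    print(agg.process(win.advance(10)))  # first tuple expires without new input
```

Every input tuple passes through the pipeline twice in this sketch, once as a positive and once as a negative tuple, which is the overhead the abstract refers to.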

Added 2008-04-22

R-trees with Update Memos

X Xiong, WG Aref

The problem of frequently updating multi-dimensional indexes arises in many location-dependent applications. While the R-tree and its variants are among the dominant choices for indexing multi-dimensional objects, the R-tree exhibits inferior performance in the presence of frequent updates. In this paper, we present an R-tree variant, termed the RUM-tree (which stands for R-tree with Update Memo), that minimizes the cost of object updates. The RUM-tree processes updates in a memo-based approach that avoids disk accesses for purging old entries during an update process. Therefore, the cost of an update operation in the RUM-tree reduces to the cost of only an insert operation. The removal of old object entries is carried out by a garbage cleaner inside the RUM-tree. In this paper, we present the details of the RUM-tree and study its properties. Theoretical analysis and experimental evaluation demonstrate that the RUM-tree outperforms other R-tree variants by up to a factor of eight in scenarios with frequent updates.
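The memo-based update idea can be sketched as follows. This is a simplified illustration, not the RUM-tree itself: a plain list stands in for the R-tree's leaf entries, and the memo layout (latest stamp plus a count of obsolete entries per object) is an assumption of this sketch. The class name `MemoIndex` and its methods are hypothetical.

```python
import itertools


class MemoIndex:
    """Update-memo sketch: updates only insert; old entries are purged lazily."""

    def __init__(self):
        self.entries = []   # stand-in for R-tree leaf entries: (oid, stamp, location)
        self.memo = {}      # oid -> [latest_stamp, number_of_obsolete_entries]
        self._stamps = itertools.count(1)

    def insert(self, oid, location):
        """Add a brand-new object; no memo entry is needed."""
        self.entries.append((oid, next(self._stamps), location))

    def update(self, oid, location):
        """Insert the new entry and record in the memo that one more old entry exists."""
        stamp = next(self._stamps)
        self.entries.append((oid, stamp, location))
        record = self.memo.setdefault(oid, [stamp, 0])
        record[0] = stamp
        record[1] += 1

    def _is_current(self, oid, stamp):
        return oid not in self.memo or stamp == self.memo[oid][0]

    def search(self, predicate):
        """Return (oid, location) for current entries whose location satisfies predicate."""
        return [(oid, loc) for oid, stamp, loc in self.entries
                if self._is_current(oid, stamp) and predicate(loc)]

    def clean(self):
        """Garbage cleaner: drop obsolete entries and shrink the memo."""
        kept, removed = [], 0
        for oid, stamp, loc in self.entries:
            if self._is_current(oid, stamp):
                kept.append((oid, stamp, loc))
                continue
            removed += 1
            self.memo[oid][1] -= 1
            if self.memo[oid][1] == 0:
                del self.memo[oid]
        self.entries = kept
        return removed


if __name__ == "__main__":
    idx = MemoIndex()
    idx.insert("a", (0, 0))
    idx.update("a", (1, 1))
    print(idx.search(lambda loc: True), idx.clean())
```

The point of the design is that `update` never searches for the old entry. Queries filter stale entries against the memo, and the cleaner removes them later.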

Added 2008-04-22

Continuous Query Processing of Spatio-Temporal Data Streams in PLACE

MF Mokbel, X Xiong, MA Hammad, WG Aref

The tremendous increase in the use of cellular phones, GPS-like devices, and RFIDs results in highly dynamic environments where objects as well as queries are continuously moving. In this paper, we present a continuous query processor designed specifically for highly dynamic environments (e.g., location-aware environments). We implemented the proposed continuous query processor inside the PLACE server (Pervasive Location-Aware Computing Environments), a scalable location-aware database server developed at Purdue University. The PLACE server extends data streaming management systems to support location-aware environments. These environments are characterized by the wide variety of continuous spatio-temporal queries and the unbounded spatio-temporal streams. The proposed continuous query processor includes: (1) new incremental spatio-temporal operators to support a wide variety of continuous spatio-temporal queries, (2) extended semantics of sliding window queries to deal with spatial sliding windows as well as temporal sliding windows, and (3) a shared-execution framework for scalable execution of a set of concurrent continuous spatio-temporal queries. Experimental evaluation shows promising performance of the continuous query processor of the PLACE server.

Added 2008-04-22

Disk Scheduling in Video Editing Systems

WG Aref, I Kamel, S Ghandeharizadeh

Modern video servers support both video-on-demand and nonlinear editing applications. Video-on-demand servers enable the user to view video clips or movies from a video database, while nonlinear editing systems enable the user to manipulate the content of the video database. Applications such as video and news editing systems require that the underlying storage server be able to concurrently record live broadcast information, modify prerecorded data, and broadcast an authored presentation. A multimedia storage server that efficiently supports such a diverse group of activities constitutes the focus of this study. A novel real-time disk scheduling algorithm is presented that treats both read and write requests in a homogeneous manner in order to ensure that their deadlines are met. Due to the real-time demands of movie viewing, read requests have to be fulfilled within certain deadlines; otherwise, they are considered lost. Since the data to be written to disk is stored in main memory buffers, write requests can be postponed until critical read requests are processed. However, write requests still have to be processed within reasonable delays and without the possibility of indefinite postponement, because the main memory write buffers are limited in size. The new algorithm schedules both read and write requests appropriately, to minimize the number of disk reads that do not meet their presentation deadlines, and to avoid indefinite postponement and large buffer sizes in the case of disk writes. Simulation results demonstrate that the proposed algorithm yields few violations of read deadlines, reduces waiting time for lower-priority disk requests, and improves the throughput of the storage server by enhancing the utilization of available disk bandwidth.

Added 2008-04-22