Jun Xu - Georgia Tech
Students: Fall 2024, unless noted otherwise, sessions will be virtual on Zoom.
Network Data Streaming - A Computer Scientist's Journey in Signal Processing
Feb 16, 2005
Abstract
Accurate traffic measurement and monitoring iscritical for network management and operation. With
the rapid growth of the Internet, network link speeds
have become faster every year to accommodate more
Internet users. To accurately measure and monitor
these high-speed links becomes a very challenging
problem. Data streaming has been proposed as a viable
solution for measuring and monitoring high-speed links
in large networks. Data streaming is concerned with
processing a long stream of data items in one pass
using a small working memory in order to answer a
class of queries regarding the stream. While data
streaming has been studied by the database
researchers, most of their results can not be directly
applied to network data streaming. A key design
challenge, which makes most database streaming
algorithms inapplicable, is that the online processing
of each data item (packet) has to finish within tens
of nanoseconds, which is orders of magnitude more
stringent than in database applications (on the order
of milliseconds).
To address this challenge, we developed a new
methodology called "Lossy data structure + Bayesian
statistics = Accurate streaming", and applied it to
produce three results that are among the earliest in
this area. The first result, based on our new
streaming data structure called Space-Code Bloom
Filter, solves the open problem of measuring per-flow
traffic without keeping per-flow state. The second
result is the design of the most resource-efficient
and scalable IP traceback scheme to date, based on our
new coincident streaming/sampling} technique. This
work also pioneers the application of information
theory to optimizing streaming data structures. The
third result solves the open problem of accurately
estimating flow size distribution without keeping
per-flow state. Our algorithm is orders of magnitude
more accurate than the traditional sampling-based
approach.
About the Speaker
Jun (Jim) Xu is an Assistant Professor in the College
of Computing at Georgia Institute of Technology. He
received his Ph.D. in Computer and Information Science
from The Ohio State University in 2000. His current
research interests include data streaming algorithms
for the measurement and monitoring of computer
networks, algorithms and data structures for computer
networks, network security, and performance modeling
and simulation. He received the NSF CAREER award in
2003 for his ongoing efforts in establishing
fundamental lower bound and tradeoff results in
networking. He is a co-author of a paper that won the
Best Student Paper Award from 2004 ACM Sigmetrics/IFIP
Performance joint conference, and the thesis advisor
of the student winners.
of Computing at Georgia Institute of Technology. He
received his Ph.D. in Computer and Information Science
from The Ohio State University in 2000. His current
research interests include data streaming algorithms
for the measurement and monitoring of computer
networks, algorithms and data structures for computer
networks, network security, and performance modeling
and simulation. He received the NSF CAREER award in
2003 for his ongoing efforts in establishing
fundamental lower bound and tradeoff results in
networking. He is a co-author of a paper that won the
Best Student Paper Award from 2004 ACM Sigmetrics/IFIP
Performance joint conference, and the thesis advisor
of the student winners.