The Center for Education and Research in Information Assurance and Security (CERIAS)

Reports and Papers Archive



You Can Run, But You Can't Hide: An Effective Statistical Methodology to Trace Back DDoS Attackers

Terence K.T. Law, John C.S. Lui, David K.Y. Yau

There is currently an urgent need for effective solutions against distributed denial-of-service (DDoS) attacks directed at many well-known Web sites. Because of the increased sophistication and severity of these attacks, the system administrator of a victim site needs to quickly and accurately identify the probable attackers and eliminate the attack traffic. Our work is based on a probabilistic marking algorithm in which an attack graph can be constructed by a victim site. We extend the basic concept such that one can quickly and efficiently deduce the intensity of the “local traffic” generated at each router in the attack graph based on the volume of marked packets received at the victim site. Given these local traffic rates, we can rank the routers and identify the network domains generating most of the attack traffic. We present our traceback and attacker identification algorithms. We also provide a theoretical framework to determine the minimum stable time t_{min}, which is the minimum time needed to accurately determine the locations of attackers and the local traffic rates of participating routers in the attack graph. Extensive experiments illustrate that one can accurately determine the minimum stable time t_{min} and, at the same time, the location of attackers under various threshold parameters, network diameters, attack traffic distributions, on/off patterns, and network traffic conditions.
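
As a rough illustration of the idea, the sketch below (with illustrative marking probability, topology, and packet counts; it is not the authors' exact algorithm) estimates each router's forwarded traffic from its marked-packet count and derives its local traffic as the excess over what its upstream children forward:

    # Estimating per-router "local traffic" from marked-packet counts in a PPM
    # attack graph. Assumes a router at distance d from the victim marks a
    # packet with probability p * (1 - p) ** (d - 1), so total forwarded
    # traffic can be estimated from the mark count; local traffic is then the
    # excess over what upstream children forward. All names are illustrative.

    P_MARK = 0.04  # marking probability p (assumed)

    def estimate_forwarded(marks, distance, p=P_MARK):
        """Estimate total packets forwarded by a router from its mark count."""
        return marks / (p * (1 - p) ** (distance - 1))

    def local_traffic(graph, marks, dist):
        """graph: router -> list of upstream children in the attack graph."""
        est = {r: estimate_forwarded(m, dist[r]) for r, m in marks.items()}
        local = {}
        for r, children in graph.items():
            upstream = sum(est.get(c, 0.0) for c in children)
            local[r] = max(est[r] - upstream, 0.0)
        return local

    # Toy attack graph: victim <- r1 <- {r2, r3}; r2 generates most traffic.
    graph = {"r1": ["r2", "r3"], "r2": [], "r3": []}
    dist = {"r1": 1, "r2": 2, "r3": 2}
    marks = {"r1": 4000, "r2": 3500, "r3": 300}
    ranked = sorted(local_traffic(graph, marks, dist).items(),
                    key=lambda kv: -kv[1])
    print(ranked)  # routers ranked by estimated local (attack) traffic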

Added 2008-06-12

Protection of Application Service Hosting Platforms: an Operating System Perspective

Xuxian Jiang, Dongyan Xu

The Application Service Hosting Platform (ASHoP), as a realization of the utility computing vision, has recently received tremendous attention from both industry and academia. An ASHoP provides a shared, high-performance platform to host multiple Application Services (ASes). The ASes are outsourced by Application Service Providers (ASPs) to save their own IT resources. Furthermore, ASHoP resources are allocated to the ASes in an on-demand fashion, so that resource supply always follows the…

Added 2008-06-11

Principals, policies and keys in a secure distributed programming language

T. Chothia, D. Duggan, J. Vitek
Added 2008-06-04

A Bayesian approach toward active learning for collaborative filtering

Rong Jin, Luo Si

Collaborative filtering is a useful technique for exploiting the preference patterns of a group of users to predict the utility of items for the active user. In general, the performance of collaborative filtering depends on the number of rated examples given by the active user: the more rated examples the active user provides, the more accurate the predicted ratings will be. Active learning provides an effective way to acquire the most informative rated examples from active users. Previous work on active learning for collaborative filtering considers only the expected loss function based on the estimated model, which can be misleading when the estimated model is inaccurate. This paper goes one step further by taking into account the posterior distribution of the estimated model, which results in a more robust active learning algorithm. Empirical studies with datasets of movie ratings show that when the number of ratings from the active user is restricted to be small, active learning methods based only on the estimated model do not perform well, while the active learning method using the model distribution achieves substantially better performance.
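
The sketch below illustrates the contrast the abstract draws, under assumed toy quantities (known item factors, a Gaussian posterior over user factors); it is a proxy for, not a reproduction of, the paper's method:

    # Instead of scoring candidate items under a single point-estimated model,
    # average the score over samples from the posterior of the model
    # parameters. Here the "model" is a toy user-factor vector with a Gaussian
    # posterior; we pick the item whose predicted rating is most uncertain
    # under the posterior (a common active-learning proxy).

    import numpy as np

    rng = np.random.default_rng(0)
    item_factors = rng.normal(size=(20, 5))   # assumed known item factors
    post_mean = rng.normal(size=5)            # posterior mean of user factors
    post_cov = 0.5 * np.eye(5)                # posterior covariance

    # Draw posterior samples of the user factor vector.
    samples = rng.multivariate_normal(post_mean, post_cov, size=200)

    # Predictive ratings under each sample: (200 samples) x (20 items).
    preds = samples @ item_factors.T

    # Point criterion: magnitude of the point-estimate prediction.
    point_scores = np.abs(post_mean @ item_factors.T)
    # Posterior-aware criterion: predictive variance across samples.
    bayes_scores = preds.var(axis=0)

    print("item chosen by point estimate  :", int(point_scores.argmax()))
    print("item chosen using the posterior:", int(bayes_scores.argmax()))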

Added 2008-06-03

On detecting service violations and bandwidth theft in QoS network domains

Ahsan Habib, Sonia Fahmy, Srinivas R. Avasarala, Venkatesh Prabhakar, Bharat Bhargava

We design and evaluate a simple and scalable system to verify quality of service (QoS) in a differentiated services domain. The system uses a distributed edge-to-edge monitoring approach with measurement agents collecting information about delays, losses and throughput, and reporting to a service level agreement monitor (SLAM). The SLAM detects potential service violations, bandwidth theft, denial of service attacks, and flags the need to re-dimension the network domain or limit its users. Measurements may be performed entirely edge-to-edge, or the core routers may participate in logging packet drop information. We compare the core-assisted and edge-to-edge schemes, and we extend network tomography-based loss inference mechanisms to cope with different drop precedences in a QoS network. We also develop a load-based service monitoring scheme which probes the appropriate edge routers for loss and throughput on demand. Simulation results indicate that the system detects attacks with reasonable accuracy, and is useful for damage control in both QoS-enabled and best effort network domains.
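
A minimal sketch of the SLAM's role, with assumed thresholds and report fields (not the authors' implementation):

    # Measurement agents report edge-to-edge delay, loss, and throughput; the
    # SLA monitor flags edge pairs whose measurements violate the SLA bounds.
    # Threshold values and field names are assumptions.

    from dataclasses import dataclass

    @dataclass
    class Report:
        src_edge: str
        dst_edge: str
        delay_ms: float
        loss_rate: float
        throughput_mbps: float

    SLA = {"max_delay_ms": 50.0, "max_loss_rate": 0.01,
           "min_throughput_mbps": 2.0}

    def check_sla(reports, sla=SLA):
        """Return (edge pair, reason) tuples for paths violating the SLA."""
        violations = []
        for r in reports:
            pair = (r.src_edge, r.dst_edge)
            if r.delay_ms > sla["max_delay_ms"]:
                violations.append((pair, "delay"))
            if r.loss_rate > sla["max_loss_rate"]:
                violations.append((pair, "loss"))  # possible theft or attack
            if r.throughput_mbps < sla["min_throughput_mbps"]:
                violations.append((pair, "throughput"))
        return violations

    reports = [Report("E1", "E2", 12.0, 0.002, 8.0),
               Report("E1", "E3", 80.0, 0.05, 1.0)]
    print(check_sla(reports))  # E1->E3 violates delay, loss, and throughput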

Added 2008-06-03


Efficient join processing over uncertain data

R. Cheng, S. Singh, S. Prabhakar, R. Shah, J.S. Vitter, Y. Xia

In many applications data values are inherently uncertain; examples include moving-object, sensor, and biological databases. There has been recent interest in the development of database management systems that can handle uncertain data. Some proposals for such systems include attribute values that are uncertain: in particular, an attribute value can be modeled as a range of possible values associated with a probability density function. Previous efforts for this type of data have addressed only simple queries, such as range and nearest-neighbor queries. Queries that join multiple relations have not been addressed in earlier work despite the significance of joins in databases. In this paper we address join queries over uncertain data. We propose a semantics for the join operation, define probabilistic operators over uncertain data, and propose join algorithms that provide efficient execution of probabilistic joins. The paper focuses on an important class of joins termed probabilistic threshold joins that avoid some of the semantic complexities of dealing with uncertain data. For this class of joins we develop three sets of optimization techniques: item-level, page-level, and index-level pruning. These techniques facilitate pruning with little space and time overhead, and are easily adapted to most join algorithms. We verify the performance of these techniques experimentally.
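
A minimal sketch of a probabilistic threshold join over uniform-interval uncertainty, with illustrative parameters and Monte Carlo integration standing in for the paper's exact computation:

    # Uncertain attributes are modeled as uniform intervals. Item-level
    # pruning: if two intervals are farther apart than eps, the pair can never
    # join and is skipped; otherwise P(|X - Y| <= eps) is estimated and the
    # pair is kept only if it clears the threshold tau. Names are illustrative.

    import random

    def threshold_join(R, S, eps=1.0, tau=0.4, n_samples=20_000):
        random.seed(1)
        results = []
        for (rid, rlo, rhi) in R:
            for (sid, slo, shi) in S:
                # Item-level pruning: intervals too far apart to ever join.
                if rlo - shi > eps or slo - rhi > eps:
                    continue
                hits = sum(
                    abs(random.uniform(rlo, rhi) - random.uniform(slo, shi)) <= eps
                    for _ in range(n_samples))
                p = hits / n_samples
                if p >= tau:
                    results.append((rid, sid, round(p, 3)))
        return results

    R = [("r1", 0.0, 2.0), ("r2", 10.0, 12.0)]
    S = [("s1", 1.0, 3.0), ("s2", 20.0, 21.0)]
    print(threshold_join(R, S))  # only (r1, s1) survives pruning + threshold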

Added 2008-06-03

Can Identical Twins be Discriminated Based on Fingerprints?

A.K. Jain, S. Prabhakar
Added 2008-06-03

Ensuring correctness over untrusted private database

Sarvjeet Singh, Sunil Prabhakar

In this paper we address the problem of ensuring the correctness of query results returned by an untrusted private database. The database owns the data and may modify it at any time. The querier is allowed to execute queries over this database; however, it may not learn anything more than the result of these legal queries. The querier does not necessarily trust the database and would like the owner to furnish proof that the data has not been modified in response to recent events, such as the submission of the query. We develop two metrics that capture the correctness of query answers and propose a range of solutions that trade off the degree of exposure of private data against the overhead of generating and verifying the proof. Our proposed solutions are tested on real data through an implementation using PostgreSQL.
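
The abstract does not commit to a single mechanism; one standard building block for such proofs is a Merkle hash tree, sketched below (the tuple encoding and tree shape are assumptions, not necessarily the authors' construction):

    # The owner commits to a root hash before queries arrive; each returned
    # tuple carries an authentication path the querier verifies against it.

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def build_tree(leaves):
        """Return list of levels, leaf hashes first, root level last."""
        level = [h(x) for x in leaves]
        levels = [level]
        while len(level) > 1:
            if len(level) % 2:
                level = level + [level[-1]]      # duplicate odd last node
            level = [h(level[i] + level[i + 1])
                     for i in range(0, len(level), 2)]
            levels.append(level)
        return levels

    def auth_path(levels, index):
        path = []
        for level in levels[:-1]:
            sib = min(index ^ 1, len(level) - 1)  # odd level: self-sibling
            path.append((level[sib], index % 2))  # (sib hash, node is right)
            index //= 2
        return path

    def verify(leaf, path, root):
        node = h(leaf)
        for sib, node_is_right in path:
            node = h(sib + node) if node_is_right else h(node + sib)
        return node == root

    rows = [b"row1", b"row2", b"row3", b"row4"]
    levels = build_tree(rows)
    root = levels[-1][0]                     # published before the query
    proof = auth_path(levels, 2)
    print(verify(b"row3", proof, root))      # True: row3 was committed
    print(verify(b"tampered", proof, root))  # False: modification detected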

Added 2008-06-03

Similar Document Detection with Limited Information Disclosure

Wei Jiang, Mummoorthy Murugesan, Chris Clifton, Luo Si

Similar document detection plays an important role in many applications, such as file management, copyright protection, and plagiarism prevention. Existing protocols assume that the contents of files stored on a server (or multiple servers) are directly accessible. This assumption limits more practical applications, e.g., detecting plagiarized documents between two conferences whose submissions are confidential. We propose novel protocols to detect similar documents between two entities when documents cannot be openly shared with each other. We also conduct experiments to show the practical value of the proposed protocols.
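
A deliberately simplified sketch of the underlying similarity computation (hashed word shingles compared by Jaccard overlap); the paper's protocols disclose far less than this, since exchanging even salted hashes leaks information:

    # Each party reduces its document to hashed k-word shingles; similarity is
    # then estimated as Jaccard overlap of the fingerprint sets, without
    # exchanging raw text. Documents and k are illustrative.

    import hashlib

    def shingles(text, k=3):
        words = text.lower().split()
        grams = {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}
        return {hashlib.sha256(g.encode()).hexdigest() for g in grams}

    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0

    doc_a = "the quick brown fox jumps over the lazy dog near the river bank"
    doc_b = "a quick brown fox jumps over the lazy dog near a river bank"
    print(round(jaccard(shingles(doc_a), shingles(doc_b)), 3))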

Added 2008-06-03

Protecting source privacy in federated search

Wei Jiang, Luo Si, Jing Li

Many information sources contain information that can only be accessed through source-specific search engines. Federated search provides search solutions for this type of hidden information, which cannot be searched by conventional search engines. In many scenarios of federated search, such as search among health care providers or among intelligence agencies, an individual information source does not want to disclose the source of the search results to users or other sources. This paper therefore proposes a two-step federated search protocol that protects the privacy of information sources. To the best of our knowledge, this is the first attempt to address the research problem of protecting source privacy in federated text search.
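
An illustrative sketch of the privacy goal, not the paper's two-step protocol: a broker merges ranked results and strips source labels before returning them (data and scoring are toy):

    def merge_without_sources(per_source_results):
        """per_source_results: {source_name: [(doc_id, score), ...]}.
        Returns documents ranked purely by score, with no source labels."""
        merged = [(score, doc) for results in per_source_results.values()
                  for doc, score in results]
        merged.sort(reverse=True)
        return [doc for _, doc in merged]  # source labels never leave here

    results = {
        "hospital_A": [("doc17", 0.92), ("doc3", 0.40)],
        "hospital_B": [("doc88", 0.75)],
    }
    print(merge_without_sources(results))  # ['doc17', 'doc88', 'doc3']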

Added 2008-06-03

On the effectiveness of probabilistic packet marking for IP traceback under denial of service attack

Kihong Park, Heejo Lee

Effective mitigation of denial of service (DoS) attacks is a pressing problem on the Internet. In many instances, DoS attacks can be prevented if the spoofed source IP address is traced back to its origin, which allows assigning penalties to the offending party or isolating the compromised hosts and domains from the rest of the network. IP traceback mechanisms based on probabilistic packet marking (PPM) have been proposed for achieving traceback of DoS attacks. We show that probabilistic packet marking (of interest due to its efficiency and implementability vis-a-vis deterministic packet marking and logging- or messaging-based schemes) suffers under spoofing of the marking field in the IP header by the attacker, which can impede traceback by the victim. We show that there is a trade-off between the ability of the victim to localize the attacker and the severity of the DoS attack, which is represented as a function of the marking probability, path length, and traffic volume. The optimal decision problem (the victim can choose the marking probability, whereas the attacker can choose the spoofed marking value, source address, and attack volume) can be expressed as a constrained minimax optimization problem, where the victim chooses the marking probability such that the number of forgeable attack paths is minimized. We show that the attacker’s ability to hide his location is curtailed by increasing the marking probability; however, the latter is upper-bounded due to sampling constraints. In typical IP internets, the attacker’s address can be localized to within 2-5 equally likely sites, which renders PPM effective against single-source attacks. Under distributed DoS attacks, the uncertainty achievable by the attacker can be amplified, which diminishes the effectiveness of PPM.
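
The standard PPM quantities behind this trade-off can be computed directly; the values below are illustrative, with p * (1 - p)^(d - 1) the probability that the router at distance d contributes a genuine mark and (1 - p)^d the probability that a forged mark survives unmodified:

    # Raising the marking probability p hurts the attacker (fewer surviving
    # forged marks) but genuine marks from distant routers also become rare
    # if p is too large, which is the sampling constraint noted above.

    def genuine_mark_prob(p, d):
        return p * (1 - p) ** (d - 1)

    def forged_mark_survival(p, d):
        return (1 - p) ** d

    d = 15  # assumed attack path length
    for p in (0.01, 0.04, 0.10, 0.20):
        print(f"p={p:.2f}  genuine mark from d={d}: "
              f"{genuine_mark_prob(p, d):.4f}  "
              f"forged mark survives: {forged_mark_survival(p, d):.4f}")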

Added 2008-06-02


Managing uncertainty in sensor databases

Reynold Cheng, Sunil Prabhakar

Sensors are often employed to monitor continuously changing entities, such as the locations of moving objects or temperature. The sensor readings are reported to a centralized database system and are subsequently used to answer queries. Due to continuous changes in these values and limited resources (e.g., network bandwidth and battery power), the database may not be able to keep track of the actual values of the entities, and may use old values instead. Queries that use these old values may produce incorrect answers. However, if the degree of uncertainty between the actual data value and the database value is limited, one can place more confidence in the answers to the queries. In this paper, we present a framework that represents the uncertainty of sensor data. Depending on the amount of uncertainty information given to the application, different levels of imprecision are presented in a query answer. We examine the situations when answer imprecision can be represented qualitatively and quantitatively. We propose a new kind of probabilistic query, called the Probabilistic Threshold Query, which requires answers to have probabilities larger than a certain threshold value. We also study techniques for evaluating queries under different amounts of uncertainty, and investigate the trade-off between data uncertainty, answer accuracy, and computation cost.
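
A minimal sketch of a probabilistic threshold range query under a uniform uncertainty model (the model and parameters here are assumptions for illustration):

    # Each reading is known only to lie within +/- u of its last reported
    # value; an object qualifies only if the probability that its true value
    # falls inside the query range exceeds the threshold tau.

    def prob_in_range(last, u, lo, hi):
        """P(value in [lo, hi]) when value ~ Uniform(last - u, last + u)."""
        overlap = max(0.0, min(hi, last + u) - max(lo, last - u))
        return overlap / (2 * u)

    def threshold_query(readings, lo, hi, tau):
        return [(oid, round(p, 3)) for oid, last, u in readings
                if (p := prob_in_range(last, u, lo, hi)) > tau]

    # (object id, last reported value, uncertainty radius)
    readings = [("sensor1", 24.0, 2.0), ("sensor2", 27.5, 2.0),
                ("sensor3", 35.0, 2.0)]
    print(threshold_query(readings, 25.0, 30.0, tau=0.5))  # only sensor2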

Added 2008-06-02

Distributed Credential Chain Discovery in Trust Management with Parameterized Roles and Constraints (Short Paper)

Ziqing Mao, Ninghui Li, William H. Winsborough

Trust management (TM) is an approach to access control in decentralized distributed systems, in which access control decisions are based on statements made by multiple principals. Li et al. developed the RT family of Role-Based Trust-management languages, which combine the strengths of Role-Based Access Control and TM systems. We present a distributed credential chain discovery algorithm for RT1C, a language in the RT family that has parameterized roles and constraints. Our algorithm combines logic-programming-style top-down query evaluation with tabling and a goal-directed version of deductive-database-style bottom-up evaluation. It uses hints provided through the storage types to determine whether to use a top-down or bottom-up strategy for a particular part of the proof; this enables the algorithm to touch only those credentials that are related to the query, which are likely to be a small fraction of all the credentials in the system.
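
A minimal sketch of backward (top-down) chain discovery for a simplified RT0-style fragment; the parameterized roles, constraints, tabling, and storage-type hints of the actual algorithm are omitted:

    # Credentials of the form "A.r <- B" (direct membership) and
    # "A.r <- B.s" (role inclusion) are tuples; is_member searches backward
    # from the queried role, with cycle detection on the current path.

    def is_member(principal, role, credentials, seen=frozenset()):
        """Does some credential chain prove `principal` is in `role`?"""
        if role in seen:                      # cycle on the current path
            return False
        for head, body in credentials:
            if head != role:
                continue
            if isinstance(body, str):         # "A.r <- B": direct membership
                if body == principal:
                    return True
            elif is_member(principal, body, credentials, seen | {role}):
                return True                   # "A.r <- B.s": follow role B.s
        return False

    # EPub.discount <- EOrg.preferred ; EOrg.preferred <- StateU.student ;
    # StateU.student <- Alice
    credentials = [
        (("EPub", "discount"), ("EOrg", "preferred")),
        (("EOrg", "preferred"), ("StateU", "student")),
        (("StateU", "student"), "Alice"),
    ]
    print(is_member("Alice", ("EPub", "discount"), credentials))  # True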

Added 2008-05-30