The Center for Education and Research in Information Assurance and Security (CERIAS)

Reports and Papers Archive


Approximation Algorithms for Determining Placement of Intrusion Detectors

CERIAS TR 2011-01
Gaspar Modelo-Howard, Saurabh Bagchi, Guy Lebanon
Download: PDF

To secure today’s computer systems, it is critical to have different intrusion detection sensors embedded in them. The complexity of distributed computer systems makes it difficult to determine the appropriate choice and placement of these detectors, because there are many possible sensors to choose from and each can be placed in several possible locations in the distributed system. In this paper, we describe a method to evaluate the effect a detector configuration has on the accuracy and precision of determining the system’s security goals. The method is based on a Bayesian network model, obtained from an attack graph representation of multi-stage attacks. We use Bayesian inference to solve the problem of determining the likelihood that an attack goal has been achieved, given a certain set of detector alerts. We quantify the overall performance as a function of the placement, quality, and uncertainty in knowledge of the detectors. Based on these observations, we implement a dynamic programming algorithm for determining the optimal detector settings in a large-scale distributed system and compare it against a previous Greedy algorithm. Finally, we report the results of five experiments, measuring the Bayesian network’s behavior in the context of two real-world distributed systems undergoing attacks. Results show the dynamic programming algorithm outperforms the Greedy algorithm in terms of the benefit provided by the set of detectors picked. The dynamic programming solution also has the desirable property that we can trade off the running time with how close the solution is to the optimum.
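
To make the detector-selection problem concrete, here is a minimal sketch, assuming each candidate detector has a known deployment cost and a benefit score (in the paper, benefit comes from Bayesian inference over the attack-graph-derived Bayesian network); the code is a plain 0/1 knapsack dynamic program with invented detector names and numbers, not the authors' algorithm.

    # Illustrative 0/1 knapsack-style selection of detectors under a cost budget.
    # The (cost, benefit) values are hypothetical; in the paper, benefit would be
    # derived from Bayesian inference over the attack-graph-based Bayesian network.

    def select_detectors(detectors, budget):
        """detectors: list of (name, cost, benefit); budget: integer cost limit."""
        # best[c] = (total benefit, chosen detectors) achievable with cost <= c
        best = [(0.0, frozenset()) for _ in range(budget + 1)]
        for name, cost, benefit in detectors:
            for c in range(budget, cost - 1, -1):
                cand = (best[c - cost][0] + benefit, best[c - cost][1] | {name})
                if cand[0] > best[c][0]:
                    best[c] = cand
        return best[budget]

    if __name__ == "__main__":
        candidates = [("web_ids", 3, 0.42), ("db_ids", 4, 0.55),
                      ("host_av", 2, 0.21), ("netflow_mon", 5, 0.60)]
        benefit, chosen = select_detectors(candidates, budget=8)
        print(sorted(chosen), round(benefit, 2))

The paper's dynamic program additionally lets running time be traded against solution quality; the sketch above only captures the budgeted-selection flavor of the problem.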

Added 2011-02-12

Data in the Cloud: Authentication without Leaking

CERIAS TR 2010-25
Ashish Kundu
Download: PDF

Third-party data distribution frameworks such as the cloud are increasingly being employed to store, process, and publish sensitive information, such as healthcare and finance data, belonging to individuals and enterprises. Such data objects are often organized as trees, graphs, or even forests (e.g., XML). In third-party frameworks, not only is authentication of data important, but so are protection of privacy and assurance of confidentiality. Moreover, data authenticity must be assured even when the data object that a user has access to consists of subset(s) of the signed data.

Existing solutions such as the Merkle hash technique and redactable signature schemes leak structural information, which can be used to infer sensitive information and in turn leads to privacy and confidentiality breaches. So the question is: can we authenticate subset(s) of signed data objects without leaking, and if so, how efficiently can such authentication be carried out? We report a positive result by presenting efficient and provably secure solutions not only for trees, but also for graphs and forests. We present a scheme that computes only one signature per tree, graph, or forest.
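
As an illustration of why the plain Merkle hash technique leaks structure (this is the textbook construction, not the authors' scheme): verifying one leaf requires the verifier to receive the hashes of sibling nodes along the path to the root, and the number and positions of those siblings reveal the shape of the tree around the leaf.

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def merkle_root(leaves):
        """Merkle root over a list of byte-string leaves (illustrative only)."""
        level = [h(x) for x in leaves]
        while len(level) > 1:
            if len(level) % 2:                    # duplicate last node if odd
                level.append(level[-1])
            level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        return level[0]

    def proof_for(leaves, index):
        """Sibling hashes needed to verify leaves[index]; their count and order
        disclose structural information, which is the leakage avoided here."""
        level, proof, i = [h(x) for x in leaves], [], index
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])
            proof.append(level[i ^ 1])            # sibling of the current node
            level = [h(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
            i //= 2
        return proof

Even the length of such a proof reveals the depth of the leaf, so a user entitled to only a subtree still learns something about the redacted parts.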

Our schemes allow encrypted data to be stored at third-party services. They can also be used to automatically recover from structural errors in tree-structured data, and for leakage-free authentication of paths (e.g., XPaths). Further, as applications of our schemes, we have developed a publish/subscribe model for XML (structure-based routing) and a scheme for authentication of objects.

Added 2011-01-06

Provably Private Data Anonymization: Or, k-Anonymity Meets Differential Privacy

CERIAS TR 2010-24
Ninghui Li, Wahbeh Qardaji, Dong Su
Download: PDF

Privacy-preserving microdata publishing currently lacks a solid theoretical foundation.  Most existing techniques are developed to satisfy syntactic privacy notions such as $k$-anonymity, which fails to provide strong privacy guarantees.  The recently proposed notion of differential privacy has been widely accepted as a sound privacy foundation for statistical query answering.  However, no general practical microdata publishing techniques are known to satisfy differential privacy.  In this paper, we start to bridge this gap.  We first analyze $k$-anonymization methods and show how they fail to provide sufficient protection against re-identification, the very threat they were designed to prevent.  We then prove that $k$-anonymization methods, when done “safely” and when preceded by a random sampling step, can satisfy $(\epsilon,\delta)$-differential privacy with reasonable parameters.  This result is, to our knowledge, the first to link $k$-anonymity with differential privacy, and it illustrates that “hiding in a crowd of $k$” indeed offers privacy guarantees.  This naturally leads to future research in designing “safe” and practical $k$-anonymization methods. We observe that our result gives an alternative to output perturbation for satisfying differential privacy: namely, adding a random sampling step at the beginning and pruning results that are too sensitive to the change of a single tuple.  This approach may be applicable to settings other than microdata publishing.  We also show that adding a random-sampling step can greatly amplify the level of privacy guarantee provided by a differentially private algorithm.  This result makes it much easier to provide strong privacy guarantees when one wishes to publish a portion of the raw data.  Finally, we show that current definitions of $(\epsilon, \delta)$-differential privacy require $\delta$ to be very small to provide sufficient privacy protection when publishing microdata, making the notion impractical in some scenarios.  To address this problem, we introduce a notion called $f$-smooth $(\epsilon, \delta)$-differential privacy.
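
For reference, the standard definition the result above builds on: a randomized algorithm $\mathcal{A}$ satisfies $(\epsilon,\delta)$-differential privacy if, for every pair of datasets $D$ and $D'$ differing in a single tuple and every set $S$ of outputs,

    \Pr[\mathcal{A}(D) \in S] \le e^{\epsilon} \Pr[\mathcal{A}(D') \in S] + \delta .

The sampling-amplification result then says, informally, that applying a differentially private algorithm to a random sample of the data yields smaller effective $\epsilon$ and $\delta$ than applying it to the full dataset.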

Added 2010-12-15

A Practical and Flexible Key Management Mechanism For Trusted Collaborative Computing

CERIAS TR 2008-33
Xukai Zou

Trusted collaborative computing (TCC) is a new research and application paradigm. Two important challenges in such a context are secure information transmission among the collaborating parties and selective, differentiated access to data among members of collaborating groups. Addressing such challenges requires, among other things, developing techniques for secure group communication (SGC), secure dynamic conferencing (SDC), differential access control (DIF-AC), and hierarchical access control (HAC). Cryptography and key management have been intensively investigated and widely applied in order to secure information. However, there is a lack of key management mechanisms which are general and flexible enough to address all requirements arising from information transmission and data access. This paper proposes the first holistic group key management scheme which can directly support all these functions yet retain efficiency. The proposed scheme is based on the innovative concept of the access control polynomial (ACP), which can efficiently and effectively support full dynamics, flexible access control with fine-tuned granularity, and anonymity. The new scheme is immune to various attacks from both external and internal malicious parties.
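
A minimal sketch of the access control polynomial idea as suggested by the abstract (parameters and hashing simplified; the report's actual construction may differ): the controller publishes the coefficients of $P(x) = \prod_i (x - H(s_i, z)) + K \bmod p$, where $s_i$ is member $i$'s secret, $z$ a public nonce, and $K$ the group key; a member evaluates $P$ at its own hidden point, where the product term vanishes and $K$ is revealed.

    # Toy access-control-polynomial (ACP) sketch; not the report's exact scheme.
    import hashlib

    PRIME = 2**127 - 1                    # toy modulus

    def point(secret: bytes, nonce: bytes) -> int:
        return int.from_bytes(hashlib.sha256(secret + nonce).digest(), "big") % PRIME

    def acp_publish(member_secrets, nonce, group_key):
        """Coefficients (lowest degree first) of prod_i (x - point_i) + group_key."""
        coeffs = [1]
        for s in member_secrets:
            r = point(s, nonce)
            new = [0] * (len(coeffs) + 1)
            for i, c in enumerate(coeffs):        # multiply by (x - r)
                new[i + 1] = (new[i + 1] + c) % PRIME
                new[i] = (new[i] - c * r) % PRIME
            coeffs = new
        coeffs[0] = (coeffs[0] + group_key) % PRIME
        return coeffs

    def acp_recover(coeffs, secret, nonce):
        """Evaluate the published polynomial at the member's point to get the key."""
        x = point(secret, nonce)
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME

In this sketch, rekeying after a membership change amounts to publishing a new polynomial under a fresh nonce.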

Added 2010-11-16

Dependability and Security in Medical Information System

CERIAS TR 2007-105
Xukai Zou

Medical Information Systems (MIS) help medical practice and health care significantly. Security and dependability are two increasingly important factors for MIS nowadays. On the one hand, people will be willing to step into the MIS age only when their privacy and integrity can be protected and guaranteed by such systems. On the other hand, only secure and reliable MIS can provide safe and solid medical and health care service to people. In this paper, we discuss some new security and reliability technologies which are necessary for, and can be integrated with, existing MIS to make these systems highly secure and dependable. We also present an implemented middleware architecture which has been integrated seamlessly and transparently with the existing VISTA/CPRS system in the U.S. Department of Veterans Affairs.

Added 2010-11-16

uKeying: A Key Management Framework for Wireless Sensor Networks Utilizing a Unique Session Key

CERIAS TR 2007-106
Xukai Zou

Key management is a core mechanism for ensuring the security of applications and network services in wireless sensor networks. Key management includes two aspects: key distribution and key revocation. The goal of key distribution is to establish the required keys between sensor nodes which must exchange data. Key revocation is used to remove compromised sensor nodes from the network. Although many key distribution and key revocation schemes have been proposed in the literature, there is a lack of a framework that can integrate them. In this paper, we propose a key management framework, uKeying, for wireless sensor networks using a globally distributed session key. uKeying includes three parts: a security mechanism to provide secrecy for the communication in the sensor network, an efficient session key distribution scheme, and a centralized key revocation scheme. The proposed framework does not depend on a specific key distribution scheme and can support many such schemes. We further demonstrate how to use the framework to support secure group communication protocols in wireless sensor networks. Our analysis shows that the framework is secure, efficient, and extensible. The simulation results reveal for the first time that a centralized key revocation scheme can also attain high efficiency.

Added 2010-11-16

Characterizing and Aggregating Attack Graph-based Security Metrics

CERIAS TR 2010-23
Nwokedi C. Idika
Download: PDF

An attack graph is an abstraction that represents the ways an attacker can violate a security policy by leveraging interdependencies among discovered vulnerabilities. Attack graph analyses that extract security-relevant information from the attack graph are referred to as attack graph-based security metrics. Although a number of attack graph-based security metrics have been proposed in the literature, there has been no analysis of how these security metrics behave in response to security incidents. In this dissertation, we examine how attack graph-based security metrics behave in response to increased network vulnerabilities under heterogeneous network models. From this analysis, we identify opportunities for using equations that characterize particular attack graph-based security metrics, thereby avoiding the costly processing of attack graphs.

Security is recognized to be a multidimensional entity. However, all proposed attack graph-based security metrics have been unidimensional. In this dissertation, we provide an approach for aggregating the capabilities of existing attack graph-based security metrics with our proposed suite of attack graph-based security metrics.

Lastly, we specify an algorithm for network hardening given a limited budget. Given a set of network vulnerabilities and a set of candidate countermeasures to implement, a network administrator must choose the set of countermeasures that optimizes security within a limited budget. Our algorithm produces sets of countermeasures that optimize security with respect to a set of attack graph-based security metrics while staying within budget.
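
As an illustration of budgeted hardening (not the dissertation's algorithm), the sketch below greedily picks countermeasures by metric improvement per unit cost until the budget is exhausted; the names, costs, and improvement scores are invented.

    # Greedy budget allocation for network hardening (illustrative only; the
    # dissertation optimizes attack graph-based security metrics directly).

    def harden(countermeasures, budget):
        """countermeasures: list of (name, cost, improvement); returns chosen set."""
        chosen, spent = [], 0
        ranked = sorted(countermeasures, key=lambda c: c[2] / c[1], reverse=True)
        for name, cost, gain in ranked:
            if spent + cost <= budget:
                chosen.append(name)
                spent += cost
        return chosen, spent

    if __name__ == "__main__":
        candidates = [("patch_web_server", 10, 0.30), ("segment_vlan", 25, 0.50),
                      ("disable_telnet", 5, 0.20), ("deploy_waf", 40, 0.55)]
        print(harden(candidates, budget=50))

Greedy selection is not optimal in general for this kind of problem, which is one reason a metric-aware algorithm such as the dissertation's is of interest.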

Added 2010-10-09

Efficient and Robust Solutions for Sensor Network Detection and Localization

CERIAS TR 2010-22
Jren-Chit Chin
Download: PDF

The ability to quickly detect and locate stealthy radiological dispersal devices (RDDs) allows authorities to disarm and remove the RDDs before they can be detonated. Traditionally, the detection of RDDs was accomplished by using expensive and cumbersome radiation sensors strategically located in the surveillance area. However, with recent advancements in wireless technologies and sensing hardware, deploying a large scale sensor network with small sensors is now becoming a reality. In this dissertation, we study methods to detect and locate radiation sources quickly and accurately using a network of sensors.

Localization of a single radiation source can be achieved by using three sensors in a noise- and error-free environment. When both noise and errors are considered, we present a closed-form solution that outperforms existing algorithms. When more than three sensors are available, we present an efficient algorithm to exploit the additional sensor data, in order to further improve the robustness and accuracy of the localization.
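
A sketch of the underlying estimation problem (not the closed-form estimator from the dissertation): under an idealized inverse-square model, sensor $i$ at position $p_i$ reads roughly $a/\lVert p_i - s\rVert^2$ above background for a source at $s$ with strength $a$, and both can be fit by nonlinear least squares; all positions and readings below are made up.

    # Least-squares fit of one radiation source from sensor intensity readings
    # under an inverse-square model (illustrative; the dissertation additionally
    # gives a closed-form solution and models Poisson counting noise).
    import numpy as np
    from scipy.optimize import least_squares

    sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
    readings = np.array([2.31, 0.41, 0.57, 0.27])   # hypothetical counts/s above background

    def residuals(theta):
        x, y, a = theta                              # source position and strength
        d2 = np.sum((sensors - [x, y]) ** 2, axis=1)
        return a / d2 - readings

    fit = least_squares(residuals, x0=[5.0, 5.0, 10.0])
    print("estimated source (x, y, strength):", fit.x)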

To localize multiple sources in a sensor network, we propose a hybrid formulation of a particle filter with a mean-shift technique, in order to achieve several important features which address major challenges faced by existing multiple source localization algorithms. First, our algorithm is able to maintain a constant number of estimation (source) parameters even as the number of radiation sources K increases. Second, our algorithm “learns” the number of sources from the estimated source parameters instead of relying on expensive statistical estimations. Third, the presence of obstacles may improve the localization accuracy of our algorithm. Unfortunately, the presence of obstacles significantly degrades the accuracy of existing algorithms.

When no radiation source is present, the localization algorithms produce false positives because they assume that a radiation source is present. We propose the Localization Enhanced Detection (LED) method, which decides whether a source with the estimated parameters is present or absent, using a close-to-minimal number of measurements, while maintaining the false positive and false negative rates below a specified level. We evaluate the LED method using simulation and testbed experiments, and compare its effectiveness with existing detection methods.

We build a cross-platform, cross-language, and versatile software framework that provides an abstraction for interfacing with sensors and supports building applications on radiation source localization. The software framework implements various localization algorithms that are ready to be deployed in an actual system. The components in the software framework are loosely coupled and are general enough to support application domains beyond radiation source localization. We demonstrate the versatility of the software framework in building the Rapid Structural Assessment Network.

Added 2010-10-08

Challenge benchmarks for verification of real-time programs

CERIAS TR 2010-21
Jan Vitek

Real-time systems, and in particular safety-critical systems, are a rich source of challenges for the program verification community, as software errors can have catastrophic consequences. Unfortunately, it is nearly impossible to find representative safety-critical programs in the public domain. This has been a significant impediment to research in the field, as it is very difficult to validate new ideas or techniques experimentally. This paper presents open challenges for verification of real-time systems in the context of the Real-time Specification for Java. Our main contribution, however, is a family of programs, called CDx, which we present as an open-source benchmark for the verification community.

Added 2010-10-07

An analysis of the dynamic behavior of JavaScript programs

CERIAS TR 2010-20
Jan Vitek

The JavaScript programming language is widely used for web programming and, increasingly, for general-purpose computing. As such, improving the correctness, security, and performance of JavaScript applications has been the driving force for research in type systems, static analysis, and compiler techniques for this language. Many of these techniques aim to rein in some of the most dynamic features of the language, yet little seems to be known about how programmers actually utilize the language or these features. In this paper we perform an empirical study of the dynamic behavior of a corpus of widely-used JavaScript programs, and analyze how and why the dynamic features are used. We report on the degree of dynamism that is exhibited by these JavaScript programs and compare that with assumptions commonly made in the literature and accepted industry benchmark suites.

Added 2010-10-07

Implicit ownership types for memory management

CERIAS TR 2008-32
Jan Vitek
Download: PDF

The Real-time Specification for Java (RTSJ) introduced a range of language features for explicit memory management. While the RTSJ gives programmers fine control over memory use and allows linear allocation and constant-time deallocation, the RTSJ relies upon dynamic runtime checks for safety, making it unsuitable for safety critical applications. We introduce ScopeJ, a statically-typed, multi-threaded, object calculus in which scopes are first class constructs. Scopes reify allocation contexts and provide a safe alternative to automatic memory management. Safety follows from the use of an ownership type system that enforces a topology on run-time patterns of references. ScopeJ’s type system is novel in that ownership annotations are implicit. This substantially reduces the burden for developers and increases the likelihood of adoption. The notion of implicit ownership is particularly appealing when combined with pluggable type systems, as one can apply different type constraints to different components of an application depending on the requirements without changing the source language. In related work we have demonstrated the usefulness of our approach in the context of highly-responsive systems and stream processing.

Added 2010-10-07

Timetraveler: exploiting acyclic races for optimizing memory race recording

CERIAS TR 2010-19
T. N. Vijaykumar

As chip multiprocessors emerge as the prevalent microprocessor architecture, support for debugging shared-memory parallel programs becomes important. A key difficulty is the programs’ nondeterministic semantics, due to which replay runs of a buggy program may not reproduce the bug. The nondeterminism stems from memory races, where accesses from two threads, at least one of which is a write, go to the same memory location. Previous hardware schemes for memory race recording log the predecessor-successor thread ordering at memory races and enforce the same orderings in the replay run to achieve deterministic replay. To reduce the log size, the schemes exploit transitivity in the orderings to avoid recording redundant orderings. To reduce the log size further while requiring minimal hardware, we propose Timetraveler, which for the first time exploits acyclicity of races, based on the key observation that an acyclic race need not be recorded even if the race is not already covered by transitivity. Timetraveler employs a novel and elegant mechanism called post-dating, which both ensures that acyclic races, including those through the L2, are eventually ordered correctly, and identifies cyclic races. To address false cycles through the L2, Timetraveler employs another novel mechanism called a time-delay buffer, which delays the advancement of the L2 banks’ timestamps and thereby reduces the false cycles. Using simulations, we show that Timetraveler reduces the log size for commercial workloads by 88% over the best previous approach while using only a 696-byte time-delay buffer.

Added 2010-10-07

Remote reprogramming of wireless sensor networks

CERIAS TR 2010-18
Rajesh Krishna Panta
Download: PDF

In recent years, advances in hardware and software tools have led to many real-world sensor network deployments. Management of already deployed sensor networks is a very important issue. One of the crucial management tasks is software reconfiguration. During the lifetime of a sensor network, software running on the sensor nodes may need to be changed for various reasons, such as correcting software bugs, modifying the application to meet changing environmental conditions in which the network is deployed, or adapting to evolving user requirements. The frequency of software updates is very high in sensor networks for several reasons: harsh and unpredictable environments in which the sensor nodes are deployed, the time-varying nature of the wireless channel, lack of robust tools for developing software, interference in the unlicensed ISM band, topology changes caused by node mobility, battery outages, etc. Since a sensor network may consist of hundreds or even thousands of nodes which may be situated at places that are difficult or, sometimes, impossible to access physically, remote reprogramming of sensor networks is essential. This thesis presents energy-efficient and fast reprogramming services for wireless sensor networks. Sensor networks are often battery-powered and need to operate unattended for long periods of time. Radio transmission is often the most energy-expensive operation in sensor networks. Since energy is a very scarce resource, this thesis focuses on conserving energy during reprogramming. Also, since the performance of the network may be degraded, or even reduced to zero, during the software update process, the reprogramming techniques proposed here minimize reprogramming time significantly compared to existing reprogramming protocols.

Added 2010-10-07

Analytical characterization of internet security attacks

CERIAS TR 2010-17
Sarah Sellke

Internet security attacks have drawn significant attention due to their enormously adverse impact. These attacks include malware (viruses, worms, Trojan horses), denial of service, packet sniffing, and password attacks. There is an increasing need to provide adequate defense mechanisms against these attacks. My thesis proposal deals with analytical aspects of Internet security attacks, as well as practical solutions based on our analysis.

First, we focus on modeling and containment of Internet worms. We present a branching process model for the propagation of worms. Our model leads to the development of automatic worm containment strategies that effectively contain both uniform scanning worms and local preference scanning worms. Incremental deployment of our scheme provides worm containment for local networks when combined with traditional firewalls.
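
A toy Monte Carlo sketch of the branching-process view (parameters invented, not the paper's analytical model): each infected host may scan at most a fixed number of addresses before containment cuts it off, so its expected number of "offspring" infections is the scan limit times the density of vulnerable hosts, and keeping that product below one makes the outbreak die out.

    # Toy branching-process simulation of worm spread with a per-host scan limit.
    import random

    VULN_DENSITY = 1e-4     # prob. a scanned address is vulnerable and uninfected
    SCAN_LIMIT = 5000       # scans allowed per host before containment stops it

    def outbreak_size(initial=1, max_generations=20, seed=1):
        random.seed(seed)
        infected = frontier = initial
        for _ in range(max_generations):
            new = 0
            for _ in range(frontier):
                # offspring of one host ~ Binomial(SCAN_LIMIT, VULN_DENSITY)
                new += sum(random.random() < VULN_DENSITY for _ in range(SCAN_LIMIT))
            infected += new
            frontier = new
            if frontier == 0:
                break
        return infected

    print(outbreak_size())

With these numbers the expected offspring count per host is 0.5, so the process is subcritical and the outbreak dies out; raising SCAN_LIMIT above 1/VULN_DENSITY would make it supercritical.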

Next, we study the capacity of Bounded Service Timing Channels. We derive an upper bound and two lower bounds on the capacity of such timing channels. We show that when the length of the support interval is small, the uniform BSTC has the smallest capacity among all BSTCs. Based on our analysis, we design and implement a covert timing channel over TCP/IP networks. We are able to quantify the achievable data rate (or leak rate) of such a covert channel. Moreover, by sacrificing data rate, we are able to mimic normal traffic patterns, which makes detecting such communication virtually impossible.
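
A minimal sketch of a covert timing channel of this kind (not the authors' implementation): the sender encodes each bit as one of two inter-packet delays, and the receiver recovers bits from observed arrival gaps; a stealthier channel would draw the delays from a distribution resembling normal traffic, at the cost of data rate, which is the trade-off analyzed above. The address, port, and delay values are arbitrary.

    # Bits encoded as inter-packet delays over UDP (illustrative sketch only).
    import socket, time

    DELAY_ZERO, DELAY_ONE = 0.05, 0.15            # seconds; hypothetical symbols

    def send_bits(bits, addr=("127.0.0.1", 9999)):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.sendto(b"x", addr)                   # reference packet marks the start
        for b in bits:
            time.sleep(DELAY_ONE if b else DELAY_ZERO)
            sock.sendto(b"x", addr)               # packet contents are irrelevant
        sock.close()

    def decode(arrival_times, threshold=0.10):
        """Recover bits from packet arrival timestamps observed at the receiver."""
        gaps = [t2 - t1 for t1, t2 in zip(arrival_times, arrival_times[1:])]
        return [1 if g > threshold else 0 for g in gaps]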

Added 2010-10-06

Memory balancing for large-scale network simulation in power-law networks

CERIAS TR 2008-31
Hyojeong Kim

Large-scale network simulation has grown in importance due to a rapid increase in Internet size and the availability of Internet measurement topologies with applications to computer networks and network security. A key obstacle to large-scale network simulation over PC clusters is the memory balancing problem, where a memory-overloaded machine can slow down a distributed simulation due to disk I/O overhead. Network partitioning methods for parallel and distributed simulation are insufficiently equipped to handle new challenges brought on by memory balancing due to their focus on CPU and communication balancing.

This dissertation studies memory balancing for large-scale network simulation in power-law networks over PC clusters. First, we design and implement a measurement subsystem for dynamically tracking memory consumption in DaSSFNet, a distributed network simulator. Accurate monitoring of memory consumption is difficult due to complex protocol interaction through which message related events are created and destroyed inside and outside a simulation kernel. Second, we achieve efficient memory cost monitoring by tackling the problem of estimating peak memory consumption of a group of simulated network nodes in power-law topologies during network partitioning. In contrast to CPU balancing, where the processing cost of a group of nodes is proportional to their sum, in memory balancing this closure property need not hold. Power-law connectivity injects additional complications due to skews in resource consumption across network nodes. Third, we show that the maximum memory cost metric outperforms the total cost metric for memory balancing under multilevel recursive partitioning, but the opposite holds for CPU balancing. We show that the trade-off can be overcome through joint memory-CPU balancing (in general not feasible due to constraint conflicts), which is enabled by network simulation having a tendency to induce correlation between memory and CPU costs. Fourth, we evaluate memory balancing in the presence of virtual memory (VM) management, which allows larger problem instances to be run over limited physical memory. VM introduces complex memory management dependencies that make understanding and evaluating simulation performance difficult. We provide a performance evaluation framework that incorporates the impact of memory thrashing in distributed network simulation and admits quantitative performance comparison and diagnosis. Fifth, we show that improved memory balancing under the maximum cost metric in the presence of VM manifests as faster completion time compared to the total cost metric despite the CPU balancing advantage of the latter. In the cases where the CPU balancing advantage of the total cost metric is strong, we show that joint memory-CPU balancing can achieve the best of both worlds.
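
To make the two partitioning cost metrics concrete, here is an illustrative sketch (not from the dissertation): since the peak memory of a group of simulated nodes need not equal the sum of the members' peaks, a partitioner can score a group either by the sum of per-node costs (the total metric) or by the largest single cost (the maximum metric). The node names and memory figures are invented.

    # Scoring partitions of simulated network nodes under the "total" versus
    # "maximum" memory cost metrics (hypothetical per-node peak memory in MB).

    def partition_costs(partitions, node_cost, metric):
        """partitions: list of lists of node ids; metric: 'total' or 'max'."""
        agg = sum if metric == "total" else max
        return [agg(node_cost[n] for n in part) for part in partitions]

    node_cost = {"a": 120, "b": 15, "c": 300, "d": 40, "e": 35}
    partitions = [["a", "b"], ["c"], ["d", "e"]]

    print("total metric:", partition_costs(partitions, node_cost, "total"))
    print("max metric:  ", partition_costs(partitions, node_cost, "max"))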

We carry out performance evaluation using benchmark applications with varying traffic characteristics: BGP routing, worm propagation under local and global scanning, and a distributed client/server system. We use a testbed of 32 Intel x86 machines running a measurement-enhanced DaSSFNet over Linux.

Added 2010-10-06