Welcome to

Complex Systems Failure (CSf)

A Multidisciplinary University Research Initiative (MURI)

Program Manager: Dr. Mou-Hsiung (Harry) Chang, U.S. Army Research Office

Principal Investigator: Dr. Asok Ray, Pennsylvania State University

Abstract:

The search for fundamental principles of fault tolerance in human-engineered complex dynamic systems is very new. The physics of failure in an individual component cannot sufficiently explain the pathological behaviors observed in the aggregated system. Fault propagation in incidents such as the failure of the electric power infrastructure in the Western United States in the summer of 1996 [CNN 96, PBS 96] remains unpredictable, apart from after-the-fact searches for the individual errors that triggered it. Complex macroscopic behaviors emerge as a consequence of the nonlinear dynamics of interactions between linked components. System behavior may range from strict order to chaos, with great sensitivity to initial conditions embedded in the physics of individual failures. This sensitivity is known as the butterfly effect [Mainzer 93].
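As a minimal illustration of sensitivity to initial conditions, the sketch below iterates the logistic map, a standard textbook example of a nonlinear system. It is not a model of any engineered system studied here; the function names, the perturbation size, and the parameter values are assumptions chosen only for illustration.

```python
def logistic_orbit(x0, r=4.0, steps=60):
    """Iterate the logistic map x -> r*x*(1-x) and return the full trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def max_separation(x0, eps=1e-10, r=4.0, steps=60):
    """Largest gap between two orbits whose starting points differ by eps."""
    a = logistic_orbit(x0, r, steps)
    b = logistic_orbit(x0 + eps, r, steps)
    return max(abs(p - q) for p, q in zip(a, b))


if __name__ == "__main__":
    # Ordered regime (r = 2.5): the tiny initial difference stays tiny.
    print("r=2.5:", max_separation(0.2, r=2.5))
    # Chaotic regime (r = 4.0): the same difference grows to macroscopic size.
    print("r=4.0:", max_separation(0.2, r=4.0))
```

In the ordered regime both orbits settle onto the same fixed point, so the perturbation decays. In the chaotic regime the gap roughly doubles at each step until the two orbits are unrelated.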

Many biological and chemical systems exist in which microscopic flux and chaos are offset by macroscopic order. Based on this concept, we formulate analytical notions of pervasive fault tolerance in human-engineered complex dynamic systems. These systems, although architecturally similar to physical systems, may be structurally quite different. For example, component subsystems may not be microscopic particles following the laws of classical mechanics, where causality is deterministic. At the lowest level of decomposition, however, the macroscopic effect is triggered by single fault manifestations of emerging physical defects in hardware, an erroneous state of software, or a human operator error. We propose to develop methods for determining regions of stability by deriving critical values of physical parameters at which the subsequent behavior of the macroscopic system changes abruptly. Both theoretical and experimental analyses are essential.
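The idea of a critical parameter value can be sketched with a toy equal-load-sharing model. This is an assumed illustrative model, not one taken from this project: n components have capacities spread evenly over (0, 1], a fixed total load is shared equally among the survivors, and any component whose share exceeds its capacity fails. The name `surviving_fraction` and all numeric values are hypothetical.

```python
def surviving_fraction(load_per_component, n=1000):
    """Fraction of components still alive once the failure cascade stops.

    Toy assumptions: capacities are 1/n, 2/n, ..., 1, and the total load
    (load_per_component * n) is always shared equally among survivors.
    """
    capacities = [(i + 1) / n for i in range(n)]
    total_load = load_per_component * n
    alive = capacities
    while alive:
        share = total_load / len(alive)
        survivors = [c for c in alive if c >= share]
        if len(survivors) == len(alive):
            break  # no new failures: the system has reached a stable state
        alive = survivors
    return len(alive) / n


if __name__ == "__main__":
    for load in (0.10, 0.20, 0.24, 0.26, 0.30):
        print(load, surviving_fraction(load))
```

For this toy model the continuum approximation gives a surviving fraction s that satisfies s = 1 - f/s, where f is the load per component. A real solution exists only for f <= 0.25, so the system degrades gradually below that value and collapses entirely above it.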

For theoretical analyses, we model complex dynamic systems as hybrid interacting automata whose continuously varying dynamics capture the physical process at the lowest level of abstraction. Discrete event models at the higher levels capture the cognitive response of the system to observed emerging physical phenomena. We have applied this concept, exploiting the dynamic structural behavior of materials, to formulate damage-mitigating control algorithms at the system level that extend the life of critical mechanical components [Ray 94]. Our broader aim is to formulate analytical models of the higher-level dynamics of component interactions triggered by all types of individual failures to (i) predict emerging pathological system behavior from time-series observations of events and their dynamic interactions, and (ii) formulate adaptive mechanisms to circumvent or mitigate the effects of pathological behavior.
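The two-level structure described above can be sketched as a single, very small hybrid automaton. This is a hedged illustration only: the mode names, the linear damage law, and every constant are assumptions, and the sketch is not the damage-mitigating control algorithm of [Ray 94]. A continuous state (accumulated damage) evolves under the load set by a discrete mode, and a discrete supervisor switches the mode when the damage crosses a threshold.

```python
from dataclasses import dataclass

# Assumed load level applied to the component in each discrete mode.
LOAD = {"NORMAL": 1.0, "DERATED": 0.4}


@dataclass
class HybridState:
    mode: str      # discrete state: "NORMAL" or "DERATED"
    damage: float  # continuous state: accumulated damage


def step(state, dt=0.1, rate=0.05, threshold=0.5):
    """Advance one time step: continuous flow first, then discrete transition."""
    # Continuous dynamics (forward Euler): d(damage)/dt = rate * load(mode).
    damage = state.damage + rate * LOAD[state.mode] * dt
    # Discrete event: the supervisor derates the component past the threshold.
    mode = state.mode
    if mode == "NORMAL" and damage >= threshold:
        mode = "DERATED"
    return HybridState(mode, damage)


def simulate(steps, **kwargs):
    """Return the trajectory of hybrid states, starting undamaged in NORMAL."""
    state = HybridState("NORMAL", 0.0)
    trace = [state]
    for _ in range(steps):
        state = step(state, **kwargs)
        trace.append(state)
    return trace


if __name__ == "__main__":
    trace = simulate(200)
    switch = next(i for i, s in enumerate(trace) if s.mode == "DERATED")
    print("derated at step", switch, "final damage", round(trace[-1].damage, 3))
```

Interacting automata would couple several such units, for example by letting one component's mode change alter the load seen by its neighbors. That coupling is omitted here to keep the sketch short.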

With the present state of knowledge of macroscopic behaviors in engineered complex dynamical systems, experimental analysis is essential for understanding and characterizing pathologies. Guiding principles from physical systems [Haken 93] have not been verified in human-engineered systems like the Internet [Grossglauser 99]. We propose to undertake a comprehensive characterization of pathological behaviors, both syntactic and operational, through extensive experimentation. This will be achieved by analyzing spatio-temporal patterns in databases of event/action dynamics. Starting with Kott's catalog of general complex system pathologies [Kott 99], we will use information-theoretic and modeling approaches to iteratively induce a classification and refine the characteristics of pathologies as new data from our laboratory experiments are obtained.

Validation of observed pathological patterns through scientifically designed, realistic, medium-complexity simulation experiments is essential. We therefore propose a Failure Simulation Network for collaborative experiments among all participants. High-fidelity physics-based models of components, hardware-in-the-loop, and experimental data will be generated and maintained here. An open invitation to industry and academia to join will extend the network into a collaboratory for the scientific community's interactions with real operational issues and for the development of objective criteria for evaluating failures in complex systems.

The proposed research has the broader potential of providing a scientific basis for engineering dependability in military operations [Trivedi 98]. It envisions a fundamentally new approach to the engineering and operation of complex informational systems for pervasive fault tolerance. Instead of specifying parameters for worst-case design of components, we postulate designing these systems by specifying a scalable set of resources (components) that interact to support the evolving operational needs of multiple defense applications in a dynamic and uncertain environment. Dependability of operations will be achieved by identifying and mitigating the origins of disorder through dynamic coordination and control of available system resources.


 
Copyright © 2005 CSF-MURI, The Pennsylvania State University
For problems or questions regarding this website, contact [CSF-MURI].
Last updated: 11/02/05.