Open Access Dissertation
Service-Oriented Architectures (SOAs) enable the automatic creation of business applications from independently developed and deployed Web services. Because Web services are inherently unreliable, delivering reliable Web service compositions over unreliable Web services is a significant and challenging problem. The process requires monitoring the system's behavior, determining when and why faults occur, and then applying fault prevention and recovery mechanisms to minimize the impact of these faults or to recover from them. However, it is hard to apply a non-distributed management approach to SOA, because a manager must authenticate with each of the components it communicates with. In SOA, a business process can terminate successfully if all services finish their work correctly, with alternative plans provided in case of a fault. However, the business process itself may encounter different faults, because a fault may occur anywhere at any time due to SOA
In this work, we propose a new fault management technique (FLEX) and identify several improvements over existing techniques. First, existing techniques rely mainly on static information, while FLEX is based on dynamic information. Second, existing frameworks use a limited number of attributes, while we use all possible attributes by identifying each of them as either required or optional. Third, FLEX reduces the comparison cost (time and space) by filtering out services at each level of the evaluation. FLEX is divided into two phases: Phase I computes service reliability and utility, while Phase II performs runtime planning and evaluation. In Phase I, we assess the fault likelihood of each service using a combination of techniques (e.g., Hidden Markov Model, reputation, and clustering). In Phase II, we build a recovery plan to execute in case of fault(s), and we calculate the overall system reliability based on the fault occurrence likelihoods assessed for all the services that are part of the current composition. FLEX is novel because it relies on key activities of the autonomic control loop (i.e., collect, analyze, decide, plan, and execute) to support dynamic management based on changes in user requirements and QoS levels. Our technique dynamically evaluates the performance of Web services based on their previous history and user requirements, assesses the likelihood of fault occurrence, and uses the result to create (multiple) recovery plans. Moreover, we define a method to assess the overall system reliability by evaluating the performance of the individual recovery plans when they are invoked together.
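The two phases described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the `Service` record, the attribute names, the thresholds, the ranking score (reliability multiplied by utility), and the assumption that services in a composition fail independently are all hypothetical choices made only for illustration. The per-service fault likelihood is taken as a given Phase I output rather than derived from a Hidden Markov Model, reputation, or clustering.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, List


@dataclass
class Service:
    name: str
    attributes: Dict[str, float]  # hypothetical QoS values, normalized so higher is better
    fault_likelihood: float       # assumed Phase I output, in [0, 1]


def filter_candidates(services: List[Service], required: Dict[str, float]) -> List[Service]:
    """Drop every service that misses any required attribute threshold."""
    return [
        s for s in services
        if all(s.attributes.get(attr, 0.0) >= minimum for attr, minimum in required.items())
    ]


def utility(service: Service, optional_weights: Dict[str, float]) -> float:
    """Weighted sum over optional attributes; a missing attribute contributes 0."""
    return sum(w * service.attributes.get(attr, 0.0) for attr, w in optional_weights.items())


def build_plan(services: List[Service], required: Dict[str, float],
               optional_weights: Dict[str, float]) -> List[Service]:
    """Rank the surviving candidates; index 0 is the primary, the rest are fallbacks.

    The score (1 - fault likelihood) * utility is an assumption of this sketch.
    """
    candidates = filter_candidates(services, required)
    return sorted(
        candidates,
        key=lambda s: (1.0 - s.fault_likelihood) * utility(s, optional_weights),
        reverse=True,
    )


def composition_reliability(selected: List[Service]) -> float:
    """Reliability of a sequential composition, assuming independent faults."""
    return prod(1.0 - s.fault_likelihood for s in selected)
```

Filtering on required attributes before scoring mirrors the idea of reducing comparison cost: only services that survive the filter are ranked on the optional attributes, and the ranked list doubles as an ordered set of fallbacks for recovery.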
The experimental results show that our technique improves service selection quality by selecting the services with the highest scores, and that it improves overall system performance in comparison with existing work. In the future, we plan to investigate techniques for monitoring service-oriented systems and to assess online negotiation possibilities for combining different services to create high-performance systems.
Alhosban, Amal, "Fault Management For Service-Oriented Systems" (2013). Wayne State University Dissertations. 745.