Access Type

Open Access Dissertation

Date of Award

January 2014

Degree Type

Dissertation

Degree Name

Ph.D.

Department

Computer Science

First Advisor

Dr. Hasan Jamil

Abstract

Many emerging database applications entail sophisticated graph based query manipulation, predominantly evident in large-scale

scientific applications. To access the information embedded in

graphs, efficient graph matching tools and algorithms have become of prime importance. Although the prohibitively expensive time

complexity associated with exact sub-graph isomorphism techniques has limited its efficacy in the application domain, approximate yet efficient graph matching techniques have received much attention due to their pragmatic applicability. Since public domain databases are noisy and incomplete in nature, inexact graph matching techniques have proven to be more promising in terms of inferring knowledge from numerous structural data repositories.

Contemporary algorithms for approximate graph matching incur

substantial cost to generate candidates, and then test and rank them for possible match. Leading algorithms balance processing time and overall resource consumption cost by leveraging sophisticated data structures and graph properties to improve overall performance.

In this dissertation, we propose novel techniques for approximate graph matching based on two different techniques called TraM or

Top-k Graph Matching and Approximate Network Matching or AtoM respectively. While TraM off-loads a significant amount of its processing on to the database making the approach viable for large graphs, AtoM provides improved turn around time by means of graph

summarization prior to matching. The summarization process is aided by domain sensitive similarity matrices, which in turn helps improve the matching performance. The vector space embedding of the graphs and efficient filtration of the search space enables computation of approximate graph similarity at a throw-away cost. We combine domain similarity and topological similarity to obtain overall graph similarity and compare them with neighborhood biased segments of the data-graph for proper matches. We show that our approach can naturally support the emerging trend in graph pattern queries and discuss its suitability for large networks as it can be seamlessly transformed to adhere to map-reduce framework.

We have conducted thorough experiments on several synthetic and real data sets, and have demonstrated the effectiveness and efficiency of the proposed method.

Share

COinS