Approximation algorithms for a graph-cut problem with applications to a clustering problem in bioinformatics
Choudhury, Salimur Rashid
University of Lethbridge. Faculty of Arts and Science
Lethbridge, Alta. : University of Lethbridge, Deptartment of Mathematics and Computer Science, 2008
Clusters in protein interaction networks can potentially help identify functional relationships among proteins. We study the clustering problem by modeling it as graph cut problems. Given an edge weighted graph, the goal is to partition the graph into a prescribed number of subsets obeying some capacity constraints, so as to maximize the total weight of the edges that are within a subset. Identification of a dense subset might shed some light on the biological function of all the proteins in the subset. We study integer programming formulations and exhibit large integrality gaps for various formulations. This is indicative of the difficulty in obtaining constant factor approximation algorithms using the primal-dual schema. We propose three approximation algorithms for the problem. We evaluate the algorithms on the database of interacting proteins and on randomly generated graphs. Our experiments show that the algorithms are fast and have good performance ratio in practice.
xiii, 71 leaves : ill. ; 29 cm.
Cluster analysis -- Computer programs , Protein-protein interactions -- Data processing , Proteins -- Research -- Data processing , Graph algorithms , Bioinformatics , Dissertations, Academic