He J, et al. (2012)

Reference: He J, et al. (2012) Efficient and accurate Greedy Search Methods for mining functional modules in protein interaction networks. BMC Bioinformatics 13(Suppl 10):S19


Abstract

Background: Most computational algorithms focus on detecting highly connected subgraphs in protein-protein interaction (PPI) networks as protein complexes but ignore their inherent organization. Furthermore, many of these algorithms are computationally expensive. However, recent analysis indicates that experimentally detected protein complexes generally contain core-attachment structures.

Methods: In this paper, a Greedy Search Method based on Core-Attachment structure (GSM-CA) is proposed. The GSM-CA method detects densely connected regions in large protein-protein interaction networks based on the edge weight and two criteria for determining core nodes and attachment nodes. The GSM-CA method improves prediction accuracy compared to other similar module detection approaches; however, it is computationally expensive. Many module detection approaches are based on traditional hierarchical methods, which are also computationally inefficient because the hierarchical tree structure they produce cannot provide adequate information to identify whether a network belongs to a module structure or not. To speed up the computation, the Greedy Search Method based on Fast Clustering (GSM-FC) is proposed in this work. The edge-weight-based GSM-FC method uses a greedy procedure that traverses all edges just once to separate the network into a suitable set of modules.
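The single-pass greedy idea behind GSM-FC can be illustrated with a small sketch. The abstract does not give the paper's edge weight definition or its merge criterion, so everything specific here is an assumption made for illustration: the weight is taken to be the Jaccard overlap of the two endpoints' closed neighborhoods, edges are visited once in descending weight order, and two groups are merged whenever the edge weight reaches a hypothetical `threshold`. The function names `edge_weight` and `greedy_single_pass_modules` are likewise hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def edge_weight(adj, u, v):
    """Assumed weight (not the paper's definition): Jaccard overlap of the
    closed neighborhoods of u and v."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


def greedy_single_pass_modules(edges, threshold=0.5):
    """Visit every edge once, heaviest first, and merge the endpoints'
    groups when the edge weight is at least `threshold` (an assumed rule)."""
    # Build an undirected adjacency map.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Union-find structure: every node starts in its own group.
    parent = {node: node for node in adj}

    def find(x):
        # Follow parent links to the group root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Weight each edge once, then process edges in descending weight order.
    weighted = sorted(
        ((edge_weight(adj, u, v), u, v) for u, v in edges),
        key=lambda item: -item[0],
    )
    for weight, u, v in weighted:
        if weight < threshold:
            break  # all remaining edges are lighter still
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    # Collect nodes by group root; largest modules first.
    modules = defaultdict(set)
    for node in adj:
        modules[find(node)].add(node)
    return sorted(
        (sorted(members) for members in modules.values()),
        key=lambda members: (-len(members), members),
    )


if __name__ == "__main__":
    # Toy network: two triangles joined by a single bridge edge C-D.
    toy_edges = [
        ("A", "B"), ("B", "C"), ("A", "C"),
        ("C", "D"),
        ("D", "E"), ("E", "F"), ("D", "F"),
    ]
    print(greedy_single_pass_modules(toy_edges, threshold=0.5))
    # → [['A', 'B', 'C'], ['D', 'E', 'F']]
```

In this toy network the bridge edge C-D has weight 2/6 under the assumed definition, so it falls below the threshold and the two triangles remain separate modules. A real analysis would substitute the paper's own edge weight and stopping rule.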

Results: The proposed methods are applied to the protein interaction network of S. cerevisiae. Experimental results indicate that many significant functional modules are detected, most of which match the known complexes. Results also demonstrate that the GSM-FC algorithm is faster and more accurate than other competing algorithms.

Conclusions: Based on the new edge weight definition, the proposed algorithm takes advantage of the greedy search procedure to separate the network into a suitable set of modules. Experimental analysis shows that the identified modules are statistically significant. The algorithm can reduce computational time significantly while maintaining high prediction accuracy.


Reference Type: Journal Article | Research Support, Non-U.S. Gov't
Authors: He J, Li C, Ye B, Zhong W
