The past decades witnessed extensive efforts to study the relationship among proteins. Particularly, sequence-based protein-protein interactions (PPIs) prediction is fundamentally important in speeding up the process of mapping interactomes of organisms. High-throughput experimental methodologies make many model organism's PPIs known, which allows us to apply machine learning methods to learn understandable rules from the available PPIs. Under the machine learning framework, the composition vectors are usually applied to encode proteins as real-value vectors. However, the composition vector value might be highly correlated to the distribution of amino acids, i.e., amino acids which are frequently observed in nature tend to have a large value of composition vectors. Thus formulation to estimate the noise induced by the background distribution of amino acids may be needed during representations. Here, we introduce two kinds of denoising composition vectors, which were successfully used in construction of phylogenetic trees, to eliminate the noise. When validating these two denoising composition vectors on Escherichia coli (E. coli), Saccharomyces cerevisiae (S. cerevisiae) and human PPIs datasets, surprisingly, the predictive performance is not improved, and even worse than non-denoised prediction. These results suggest that the noise in phylogenetic tree construction may be valuable information in PPIs prediction.
|Evidence ID||Analyze ID||Interactor||Interactor Systematic Name||Interactor||Interactor Systematic Name||Type||Assay||Annotation||Action||Modification||Phenotype||Source||Reference||Note|
|Evidence ID||Analyze ID||Gene||Gene Systematic Name||Gene Ontology Term||Gene Ontology Term ID||Qualifier||Aspect||Method||Evidence||Source||Assigned On||Annotation Extension||Reference|
|Evidence ID||Analyze ID||Gene||Gene Systematic Name||Phenotype||Experiment Type||Experiment Type Category||Mutant Information||Strain Background||Chemical||Details||Reference|
|Evidence ID||Analyze ID||Regulator||Regulator Systematic Name||Target||Target Systematic Name||Experiment||Assay||Construct||Conditions||Strain Background||Reference|