Reference: Song G, et al. (2026) Protein Solubility Prediction Using Fused Graph Convolutional Networks and Improved Attention Networks with AlphaFold3-Derived Features. J Chem Inf Model


Abstract


Research on protein solubility holds critical significance in industrial production, biopharmaceuticals, and the food industry. While numerous studies have focused on predicting protein solubility in recent years, existing models often rely solely on one-dimensional sequence information and three-dimensional spatial contact data, failing to fully leverage other 3D structural features and the global physicochemical properties of proteins. Concurrently, advancements in protein language models (e.g., ESM-C) and spatial structure prediction models (e.g., AlphaFold3) have provided more accurate and comprehensive information for extracting sequence and spatial features, creating new opportunities for solubility prediction. We propose a novel model, called FGNNSol. The model first employs AlphaFold3 to predict the three-dimensional structure of proteins, leveraging this structural information to construct edges and generate edge features. It simultaneously utilizes ESM-C embeddings and other residue-level properties as node features while incorporating protein-level global features to collectively build a comprehensive protein graph representation. It then trains a model integrating GPSol (an improved graph attention network) and Graph Convolutional Networks. Finally, the integrated representation from these networks is concatenated with global protein features and input into a multilayer perceptron to output the prediction. Experimental validation was performed using the Escherichia coli eSOL dataset as the training, validation, and test sets for model development and evaluation, while the Saccharomyces cerevisiae dataset was used as an external test set to assess the model's generalization capability. Results show that FGNNSol achieved R² values of 0.577 and 0.469 on the two test sets, respectively, clearly outperforming existing models. When a threshold of 0.5 was used to classify proteins as soluble or insoluble, the binary classification metrics of FGNNSol were also mostly superior to those of existing models, fully demonstrating its effectiveness and generalization ability. The model code and data are available via https://github.com/SCrownJ/FGNNSol.
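
The two evaluation views mentioned in the abstract, R² on continuous solubility values and binary metrics after thresholding at 0.5, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, the R² shown is the standard coefficient of determination, and treating a value exactly equal to 0.5 as soluble is an assumption, since the abstract does not say how ties are handled.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


def binarize(values: Sequence[float], threshold: float = 0.5) -> list:
    """Label each value as soluble (True) or insoluble (False).

    Assumption: values equal to the threshold count as soluble.
    """
    return [v >= threshold for v in values]


def accuracy(y_true: Sequence[float], y_pred: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of proteins whose thresholded prediction matches the label."""
    true_labels = binarize(y_true, threshold)
    pred_labels = binarize(y_pred, threshold)
    matches = sum(t == p for t, p in zip(true_labels, pred_labels))
    return matches / len(true_labels)


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the paper.
    measured = [0.1, 0.6, 0.9, 0.4]
    predicted = [0.2, 0.4, 0.8, 0.3]
    print("R^2:", r_squared(measured, predicted))
    print("accuracy at 0.5:", accuracy(measured, predicted))
```

Reporting both views is useful because a model can rank proteins well (high R²) yet still misclassify those whose solubility lies close to the 0.5 cutoff.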

Reference Type
Journal Article
Authors
Song G, Luo Z, Geng A, Xu J, Meng Y, Cui F, Wei L, Zou Q, Zhang Z
