Iakoucheva LM, Brown CJ, Lawson JD, Obradovic Z, Dunker AK. Intrinsic disorder in cell-signaling and cancer-associated proteins. J Mol Biol. 2002;323(3):573–84.
Wright PE, Dyson HJ. Intrinsically disordered proteins in cellular signalling and regulation. Nat Rev Mol Cell Biol. 2015;16(1):18–29.
Zhou J, Zhao S, Dunker AK. Intrinsically disordered proteins link alternative splicing and post-translational modifications to complex cell signaling and regulation. J Mol Biol. 2018;430(16):2342–59.
Uversky VN, Oldfield CJ, Dunker AK. Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu Rev Biophys. 2008;37:215–46.
Eftekharzadeh B, Daigle JG, Kapinos LE, Coyne A, Schiantarelli J, Carlomagno Y, Cook C, Miller SJ, Dujardin S, Amaral AS, et al. Tau protein disrupts nucleocytoplasmic transport in Alzheimer’s disease. Neuron. 2018;99(5):925-940 e927.
Haass C, Selkoe DJ. Soluble protein oligomers in neurodegeneration: lessons from the Alzheimer’s amyloid beta-peptide. Nat Rev Mol Cell Biol. 2007;8(2):101–12.
Jaikaran ET, Higham CE, Serpell LC, Zurdo J, Gross M, Clark A, Fraser PE. Identification of a novel human islet amyloid polypeptide beta-sheet domain and factors influencing fibrillogenesis. J Mol Biol. 2001;308(3):515–25.
Tang W, Wan S, Yang Z, Teschendorff AE, Zou Q. Tumor origin detection with tissue-specific miRNA and DNA methylation markers. Bioinformatics. 2018;34(3):398–406.
Cheng Y, LeGall T, Oldfield CJ, Dunker AK, Uversky VN. Abundance of intrinsic disorder in protein associated with cardiovascular disease. Biochemistry. 2006;45(35):10448–60.
Cao C, Wang J, Kwok D, Cui F, Zhang Z, Zhao D, Li MJ, Zou Q. webTWAS: a resource for disease candidate susceptibility genes identified by transcriptome-wide association study. Nucleic Acids Res. 2022;50(D1):D1123–30.
Zeng X, Xiang H, Yu L, Wang J, Li K, Nussinov R, Cheng F. Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework. Nat Mach Intell. 2022;4(11):1004–16.
Cheng Y, LeGall T, Oldfield CJ, Mueller JP, Van YY, Romero P, Cortese MS, Uversky VN, Dunker AK. Rational drug design via intrinsically disordered protein. Trends Biotechnol. 2006;24(10):435–42.
Zeng X, Wang F, Luo Y, Kang S-g, Tang J, Lightstone FC, Fang EF, Cornell W, Nussinov R, Cheng F. Deep generative molecular design reshapes drug discovery. Cell Rep Med. 2022;4:100794.
UniProt C. UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Res. 2023;51(D1):D523–31.
Hanson J, Yang Y, Paliwal K, Zhou Y. Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks. Bioinformatics. 2017;33(5):685–92.
Jones DT, Cozzetto D. DISOPRED3: precise disordered region predictions with annotated protein-binding activity. Bioinformatics. 2015;31(6):857–63.
Zhang T, Faraggi E, Xue B, Dunker AK, Uversky VN, Zhou Y. SPINE-D: accurate prediction of short and long disordered regions by a single neural-network based method. J Biomol Struct Dyn. 2012;29(4):799–813.
Wang S, Ma J, Xu J. AUCpreD: proteome-level protein disorder prediction by AUC-maximized deep convolutional neural fields. Bioinformatics. 2016;32(17):i672–9.
Tang YJ, Pang YH, Liu B. IDP-Seq2Seq: identification of intrinsically disordered regions based on sequence to sequence learning. Bioinformatics. 2021;36(21):5177–86.
Hanson J, Paliwal KK, Litfin T, Zhou Y. SPOT-Disorder2: improved protein intrinsic disorder prediction by Ensembled deep learning. Genom Proteom Bioinf. 2019;17(6):645–56.
Necci M, Piovesan D, Predictors C, DisProt C, Tosatto SCE. Critical assessment of protein intrinsic disorder prediction. Nat Methods. 2021;18(5):472–81.
Del Conte A, Mehdiabadi M, Bouhraoua A, Monzon AM, Tosatto SCE, Piovesan D. Critical assessment of protein intrinsic disorder prediction (CAID) - results of round 2. Proteins. 2023;91(12):1925–34.
Del Conte A, Bouhraoua A, Mehdiabadi M, Clementel D, Monzon AM, Predictors C, Tosatto SCE, Piovesan D. CAID prediction portal: a comprehensive service for predicting intrinsic disorder and binding regions in proteins. Nucleic Acids Res. 2023;51(W1):W62–9.
Tompa P. Intrinsically unstructured proteins. Trends Biochem Sci. 2002;27(10):527–33.
van der Lee R, Buljan M, Lang B, Weatheritt RJ, Daughdrill GW, Dunker AK, Fuxreiter M, Gough J, Gsponer J, Jones DT, et al. Classification of intrinsically disordered regions and proteins. Chem Rev. 2014;114(13):6589–631.
Hu G, Katuwawala A, Wang K, Wu Z, Ghadermarzi S, Gao J, Kurgan L. flDPnn: accurate intrinsic disorder prediction with putative propensities of disorder functions. Nat Commun. 2021;12(1):4438.
Dosztanyi Z, Meszaros B, Simon I. ANCHOR: web server for predicting protein binding regions in disordered proteins. Bioinformatics. 2009;25(20):2745–6.
Meszaros B, Erdos G, Dosztanyi Z. IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding. Nucleic Acids Res. 2018;46(W1):W329–37.
Peng Z, Kurgan L. High-throughput prediction of RNA, DNA and protein binding regions mediated by intrinsic disorder. Nucleic Acids Res. 2015;43(18):e121.
Zhang F, Zhao B, Shi W, Li M, Kurgan L. DeepDISOBind: accurate prediction of RNA-, DNA- and protein-binding intrinsically disordered residues with deep multi-task learning. Brief Bioinform. 2022;23(1):bbab521.
Meszaros B, Simon I, Dosztanyi Z. Prediction of protein binding regions in disordered proteins. PLoS Comput Biol. 2009;5(5):e1000376.
Katuwawala A, Zhao B, Kurgan L. DisoLipPred: accurate prediction of disordered lipid-binding residues in protein sequences with deep recurrent networks and transfer learning. Bioinformatics. 2021;38(1):115–24.
Hanson J, Litfin T, Paliwal K, Zhou Y. Identifying molecular recognition features in intrinsically disordered regions of proteins by transfer learning. Bioinformatics. 2020;36(4):1107–13.
Malhis N, Jacobson M, Gsponer J. MoRFchibi SYSTEM: software tools for the identification of MoRFs in protein sequences. Nucleic Acids Res. 2016;44(W1):W488–93.
Disfani FM, Hsu WL, Mizianty MJ, Oldfield CJ, Xue B, Dunker AK, Uversky VN, Kurgan L. MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins. Bioinformatics. 2012;28(12):i75–83.
Sorensen CS, Kjaergaard M. Effective concentrations enforced by intrinsically disordered linkers are governed by polymer physics. Proc Natl Acad Sci U S A. 2019;116(46):23124–31.
Anand S, Mohanty D. Inter-domain movements in polyketide synthases: a molecular dynamics study. Mol Biosyst. 2012;8(4):1157–71.
Meng F, Kurgan L. DFLpred: high-throughput prediction of disordered flexible linker regions in protein sequences. Bioinformatics. 2016;32(12):i341–50.
Pang Y, Liu B. TransDFL: identification of disordered flexible linkers in proteins by transfer learning. Genom Proteom Bioinf. 2023;21(2):359–69.
Peng Z, Xing Q, Kurgan L. APOD: accurate sequence-based predictor of disordered flexible linkers. Bioinformatics. 2020;36(Suppl_2):i754–61.
Enard W, Przeworski M, Fisher SE, Lai CS, Wiebe V, Kitano T, Monaco AP, Paabo S. Molecular evolution of FOXP2, a gene involved in speech and language. Nature. 2002;418(6900):869–72.
Darwin C: The descent of man, and selection in relation to sex, vol. 1: Murray; 1888.
Searls DB. The language of genes. Nature. 2002;420(6912):211–7.
Strait BJ, Dewey TG. The Shannon information entropy of protein sequences. Biophys J. 1996;71(1):148–55.
Wang R, Jiang Y, Jin J, Yin C, Yu H, Wang F, Feng J, Su R, Nakai K, Zou Q. DeepBIO: an automated and interpretable deep-learning platform for high-throughput biological sequence prediction, functional annotation and visualization analysis. Nucleic Acids Res. 2023;51(7):3017–29.
Zhang W, Meng Q, Wang J, Guo F. HDIContact: a novel predictor of residue-residue contacts on hetero-dimer interfaces via sequential information and transfer learning strategy. Brief Bioinform. 2022;23(4):bbac169.
Meng Q, Guo F, Wang E, Tang J. ComDock: a novel approach for protein-protein docking with an efficient fusing strategy. Comput Biol Med. 2023;167:107660.
Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, Guo D, Ott M, Zitnick CL, Ma J, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci U S A. 2021;118(15):e2016239118.
Li H, Pang Y, Liu B. BioSeq-BLM: a platform for analyzing DNA, RNA, and protein sequences based on biological language models. Nucleic Acids Res. 2021;49(22):e129.
Jin J, Yu Y, Wang R, Zeng X, Pang C, Jiang Y, Li Z, Dai Y, Su R, Zou Q. iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations. Genome Biol. 2022;23(1):1–23.
Bepler T, Berger B. Learning the protein language: evolution, structure, and function. Cell Syst. 2021;12(6):654–69.
Ferruz N, Schmidt S, Hocker B. ProtGPT2 is a deep unsupervised language model for protein design. Nat Commun. 2022;13(1):4348.
Madani A, Krause B, Greene ER, Subramanian S, Mohr BP, Holton JM, Olmos JL Jr, Xiong C, Sun ZZ, Socher R, et al. Large language models generate functional protein sequences across diverse families. Nat Biotechnol. 2023;41(8):1099–106.
Chen L, Yu L, Gao L. Potent antibiotic design via guided search from antibacterial activity evaluations. Bioinformatics. 2023;39(2):btad059.
Unsal S, Atas H, Albayrak M, Turhan K, Acar AC, Doğan T. Learning functional properties of proteins with language models. Nat Mach Intell. 2022;4(3):227–45.
Hatos A, Hajdu-Soltesz B, Monzon AM, Palopoli N, Alvarez L, Aykac-Fas B, Bassot C, Benitez GI, Bevilacqua M, Chasapi A, et al. DisProt: intrinsic protein disorder annotation in 2020. Nucleic Acids Res. 2020;48(D1):D269–76.
Piovesan D, Tabaro F, Micetic I, Necci M, Quaglia F, Oldfield CJ, Aspromonte MC, Davey NE, Davidovic R, Dosztanyi Z, et al. DisProt 7.0: a major update of the database of disordered proteins. Nucleic Acids Res. 2017;45(D1):D219–27.
Quaglia F, Meszaros B, Salladini E, Hatos A, Pancsa R, Chemes LB, Pajkos M, Lazar T, Pena-Diaz S, Santos J, et al. DisProt in 2022: improved quality and accessibility of protein intrinsic disorder annotation. Nucleic Acids Res. 2022;50(D1):D480–7.
Pang Y, Liu B. DMFpred: predicting protein disorder molecular functions based on protein cubic language model. PLoS Comput Biol. 2022;18(10):e1010668.
Huang Y, Niu B, Gao Y, Fu L, Li W. CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics. 2010;26(5):680–2.
Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I. Language models are unsupervised multitask learners. OpenAI blog. 2019;1(8):9.
Devlin J, Chang M-W, Lee K, Toutanova K: BERT: pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics. 2019: 4171–4186.
Vu MH, Akbar R, Robert PA, Swiatczak B, Sandve GK, Greiff V, Haug DTT. Linguistically inspired roadmap for building biologically reliable protein language models. Nat Mach Intell. 2023;5(5):485–96.
Elnaggar A, Heinzinger M, Dallago C, Rihawi G, Wang Y, Jones L, Gibbs T, Feher T, Angerer C, Steinegger M. ProtTrans: towards cracking the language of life’s code through self-supervised deep learning and high performance computing. IEEE Trans Pattern Anal Mach Intell. 2020;44(10):7112–27.
Li H, Liu B. BioSeq-Diabolo: biological sequence similarity analysis using Diabolo. PLoS Comput Biol. 2023;19(6):e1011214.
Chung J, Gulcehre C, Cho K, Bengio Y: Empirical evaluation of gated recurrent neural networks on sequence modeling. Twenty-eighth Conference on Neural Information Processing Systems (Workshops). 2014: 1–9.
Sutskever I, Vinyals O, Le QV: Sequence to sequence learning with neural networks. Twenty-eighth Conference on Neural Information Processing Systems. 2014: 1–9.
Shannon CE. A mathematical theory of communication. Bell Syst Tech J. 1948;27(3):379–423.
Quinlan JR. Induction of decision trees. Mach Learn. 1986;1:81–106.
Velickovic P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y: Graph attention networks. arXiv preprint arXiv:1710.10903 2017.
Defferrard M, Bresson X, Vandergheynst P: Convolutional neural networks on graphs with fast localized spectral filtering. Advances in Neural Information Processing Systems. 2016: 3844–3852.
He T, Hu J, Song Y, Guo J, Yi Z. Multi-task learning for the segmentation of organs at risk with label dependence. Med Image Anal. 2020;61:101666.
Wang Y, Zhai Y, Ding Y, Zou Q: SBSM-Pro: support bio-sequence machine for proteins. arXiv preprint arXiv:2308.10275 2023.
Dao FY, Liu ML, Su W, Lv H, Zhang ZY, Lin H, Liu L. AcrPred: a hybrid optimization with enumerated machine learning algorithm to predict Anti-CRISPR proteins. Int J Biol Macromol. 2023;228:706–14.
Zou X, Ren L, Cai P, Zhang Y, Ding H, Deng K, Yu X, Lin H, Huang C. Accurately identifying hemagglutinin using sequence information and machine learning methods. Front Med. 2023;10:1281880.
Zhu W, Yuan SS, Li J, Huang CB, Lin H, Liao B. A first computational frame for recognizing heparin-binding protein. Diagnostics. 2023;13(14):2465.
Ao C, Ye X, Sakurai T, Zou Q, Yu L. m5U-SVM: identification of RNA 5-methyluridine modification sites based on multi-view features of physicochemical features and distributed representation. BMC Biol. 2023;21(1):93.
Tang FR, Chao JN, Wei YM, Yang FL, Zhai YX, Xu L, Zou Q. HAlign 3: fast multiple alignment of ultra-large numbers of similar DNA/RNA sequences. Mol Biol Evol. 2022;39(8):msac166.
Zou Q, Hu Q, Guo M, Wang G. HAlign: fast multiple similar DNA/RNA sequence alignment based on the centre star strategy. Bioinformatics. 2015;31(15):2475–81.
Steinegger M, Meier M, Mirdita M, Vohringer H, Haunsberger SJ, Soding J. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics. 2019;20(1):473.
Shrikumar A, Greenside P, Kundaje A: Learning important features through propagating activation differences. Proceedings of the 34th International Conference on Machine Learning. 2017: 3145–3153.
Schwarzenberg R, Hübner M, Harbecke D, Alt C, Hennig L: Layerwise relevance visualization in convolutional text graph classifiers. Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing. 2019: 58–62.
Sheehy AM, Gaddis NC, Choi JD, Malim MH. Isolation of a human gene that inhibits HIV-1 infection and is suppressed by the viral Vif protein. Nature. 2002;418(6898):646–50.
Mercenne G, Bernacchi S, Richer D, Bec G, Henriet S, Paillart JC, Marquet R. HIV-1 Vif binds to APOBEC3G mRNA and inhibits its translation. Nucleic Acids Res. 2010;38(2):633–46.
Bennett RP, Salter JD, Smith HC. A new class of antiretroviral enabling innate immunity by protecting APOBEC3 from HIV Vif-dependent degradation. Trends Mol Med. 2018;24(5):507–20.
Rose KM, Marin M, Kozak SL, Kabat D. The viral infectivity factor (Vif) of HIV-1 unveiled. Trends Mol Med. 2004;10(6):291–7.
Yu L, Yang K, He X, Li M, Gao L, Zha Y. Repositioning linifanib as a potent anti-necroptosis agent for sepsis. Cell Death Discov. 2023;9(1):57.
Ito F, Alvarez-Cabrera AL, Liu S, Yang H, Shiriaeva A, Zhou ZH, Chen XS. Structural basis for HIV-1 antagonism of host APOBEC3G via Cullin E3 ligase. Sci Adv. 2023;9(1):eade3168.
Burley SK, Bhikadiya C, Bi C, Bittrich S, Chao H, Chen L, Craig PA, Crichlow GV, Dalenberg K, Duarte JM, et al. RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Res. 2023;51(D1):D488–508.
Reingewertz TH, Benyamini H, Lebendiker M, Shalev DE, Friedler A. The C-terminal domain of the HIV-1 Vif protein is natively unfolded in its unbound state. Protein Eng Des Sel. 2009;22(5):281–7.