A Hybrid Protocol for Finding Novel Gene Targets for Various Diseases Using Microarray Expression Data Analysis and Text Mining

Manoharan, Sharanya; Iyyappan, Oviya Ramalakshmi

doi:10.1007/978-1-0716-2305-3_3

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2496))

Abstract

Advances in experimental technology have produced an enormous amount of raw data, giving rise to groups of biologists who specialize in genome, proteome, transcriptome, expression, and pathway data, among others. The scientific literature has grown exponentially as a result, and extracting important information from it by manual curation and annotation is becoming impractical. Microarray data are expression data; their analysis yields lists of upregulated and downregulated genes, which are functionally annotated to ascertain their biological meaning. These genes are represented as vocabularies and/or Gene Ontology terms and, when associated with pathway enrichment analysis, require a relational and conceptual understanding with respect to a disease. This chapter describes a hybrid approach we designed for identifying novel drug–disease targets. Microarray data for muscular dystrophy are explored as an example, and text mining approaches are used with the aim of identifying promising novel drug targets. Our main objective is to give a basic overview for biologists, for whom the text mining approaches of data mining and information retrieval are a fairly new concept. The chapter aims to bridge the gap between biologists and computational text miners so that research can become more informative and time-efficient.

Author information

Authors and Affiliations

Department of Bioinformatics, Stella Maris College (Autonomous), Chennai, Tamilnadu, India
Sharanya Manoharan
Department of Sciences, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Chennai, Tamilnadu, India
Oviya Ramalakshmi Iyyappan

Editor information

Editors and Affiliations

Morgridge Institute for Research, University of Wisconsin, Madison, WI, USA
Kalpana Raja

About this protocol

Cite this protocol

Manoharan, S., Iyyappan, O.R. (2022). A Hybrid Protocol for Finding Novel Gene Targets for Various Diseases Using Microarray Expression Data Analysis and Text Mining. In: Raja, K. (eds) Biomedical Text Mining. Methods in Molecular Biology, vol 2496. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2305-3_3

DOI: https://doi.org/10.1007/978-1-0716-2305-3_3
Published: 18 June 2022
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2304-6
Online ISBN: 978-1-0716-2305-3
eBook Packages: Springer Protocols
