
Grid-based framework for high-performance processing of scientific knowledge

Multimedia Tools and Applications

Abstract

An essential challenge in the knowledge-based information society is how to extract useful information quickly from a large volume of literature. Because most existing data mining frameworks handle structured input data, they face many limitations in analyzing unstructured scientific literature and extracting new information from it. This study proposes a scientific-knowledge processing framework that uses grid computing technology to achieve high performance in extracting important entities and their relations from the scientific literature. Because grid computing provides large-volume data storage and high-speed computing, the proposed framework can efficiently analyze a massive body of scientific literature and process knowledge. The workflow tool developed for the framework enables users to easily design and execute complex applications composed of scientific-knowledge processes. Experimental results showed that the framework reduced working time by approximately 83% when the number of running nodes was assigned in accordance with the workload ratio of each step in the scientific-knowledge processes. Consequently, a large volume of scientific literature can be processed effectively by flexibly adjusting the number of computing nodes in the grid environment as the number of documents to process increases.
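The idea of assigning running nodes according to each step's workload ratio can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `allocate_nodes`, the step names, and the workload weights below are hypothetical, and the largest-remainder rounding rule is simply one reasonable way to turn ratios into whole node counts.

```python
from math import floor


def allocate_nodes(workload, total_nodes):
    """Split total_nodes across steps in proportion to each step's workload.

    Every step is guaranteed one node; the remaining nodes are shared out
    proportionally, with rounding leftovers given to the steps that have the
    largest fractional remainders (largest-remainder method).
    """
    if total_nodes < len(workload):
        raise ValueError("need at least one node per step")
    if any(w <= 0 for w in workload.values()):
        raise ValueError("workload weights must be positive")

    total = sum(workload.values())
    spare = total_nodes - len(workload)  # nodes left after one per step
    shares = {step: spare * w / total for step, w in workload.items()}

    alloc = {step: 1 + floor(share) for step, share in shares.items()}
    leftover = total_nodes - sum(alloc.values())

    # Hand out nodes lost to rounding, largest fractional part first.
    by_remainder = sorted(
        shares, key=lambda s: shares[s] - floor(shares[s]), reverse=True
    )
    for step in by_remainder[:leftover]:
        alloc[step] += 1
    return alloc


# Hypothetical two-step pipeline; the weights are illustrative, not measured.
example = allocate_nodes(
    {"entity_extraction": 1.0, "relation_extraction": 3.0}, total_nodes=6
)
print(example)  # {'entity_extraction': 2, 'relation_extraction': 4}
```

Under this scheme the heavier step receives more nodes, so no single step becomes a bottleneck while nodes assigned to lighter steps sit idle. In practice the weights would come from measured per-step processing times.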

Author information

Corresponding author

Correspondence to Sung-Pil Choi.

About this article

Cite this article

Jeong, CH., Choi, YS., Chun, HW. et al. Grid-based framework for high-performance processing of scientific knowledge. Multimed Tools Appl 71, 783–798 (2014). https://doi.org/10.1007/s11042-013-1411-2

  • DOI: https://doi.org/10.1007/s11042-013-1411-2