Abstract
Automation is the key concept in designing a service platform, because automation can reduce human work. Focusing on unstructured information such as text, images and audio, we implemented our service platform “Kachako” as a hybrid cloud in which the services themselves are transferred on demand. We propose that each service be specified by its input and output types, and that the service's executable be portable, compatible and interoperable. Given such services, Kachako thoroughly automates everything that users need. Kachako provides graphical user interfaces that allow end users to complete their tasks within Kachako without programming. Kachako is designed in a modular way, complying with well-known frameworks such as UIMA, Hadoop and Maven, which allows partial reuse or customization. We show that Kachako is practically useful by integrating our natural language processing (NLP) services. Kachako is the world's first freely available full-automation system for NLP.
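To make the idea of services specified by their input and output types more concrete, the following is a minimal sketch of how such declarations could drive automatic composition. It is not Kachako's actual algorithm and does not use the UIMA API; the `Service` class, the `compose` function, and the type names (`Text`, `Token`, `POS`, `NamedEntity`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """A hypothetical service described only by the types it consumes and produces."""
    name: str
    inputs: frozenset
    outputs: frozenset


def compose(services, available, goal):
    """Greedily order services so every input type exists before it is needed.

    Returns the list of service names to run, or None when the goal types
    cannot be reached from the initially available types.
    """
    have = set(available)
    plan = []
    remaining = list(services)
    while not goal <= have:
        # A service is runnable when all its inputs exist and it adds something new.
        runnable = [s for s in remaining if s.inputs <= have and not s.outputs <= have]
        if not runnable:
            return None
        chosen = runnable[0]
        plan.append(chosen.name)
        have |= chosen.outputs
        remaining.remove(chosen)
    return plan


# Illustrative NLP services; names and types are assumptions, not Kachako components.
SERVICES = [
    Service("tokenizer", frozenset({"Text"}), frozenset({"Token"})),
    Service("pos_tagger", frozenset({"Token"}), frozenset({"POS"})),
    Service("entity_tagger", frozenset({"Token", "POS"}), frozenset({"NamedEntity"})),
]

plan = compose(SERVICES, {"Text"}, {"NamedEntity"})
print(plan)
```

Because each service declares only its types, the ordering can be derived without the user wiring components by hand. This greedy version may schedule services that the goal does not strictly require; a real planner would likely prune them.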
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kano, Y. (2013). Kachako: A Hybrid-Cloud Unstructured Information Platform for Full Automation of Service Composition, Scalable Deployment and Evaluation. In: Ghose, A., et al. Service-Oriented Computing - ICSOC 2012 Workshops. ICSOC 2012. Lecture Notes in Computer Science, vol 7759. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37804-1_8
DOI: https://doi.org/10.1007/978-3-642-37804-1_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37803-4
Online ISBN: 978-3-642-37804-1
eBook Packages: Computer Science, Computer Science (R0)