Abstract
In this paper we present a framework for unified, personalized access to heterogeneous multimedia content in distributed repositories. By focusing on the semantic analysis of multimedia documents, metadata, user queries and user profiles, the framework helps bridge the gap between the semantic nature of user queries and raw multimedia documents. The proposed approach takes visual content analysis results as input and also analyzes and exploits the associated textual annotation, in order to extract the underlying semantics, construct a semantic index and classify documents into topics, based on a unified knowledge and semantics representation model. It can then accept user queries and, after semantic interpretation and expansion, retrieve documents from the index and rank them according to user preferences, much as in text retrieval. All processes are based on a novel semantic processing methodology that employs fuzzy algebra and principles of taxonomic knowledge representation. This paper presents the first part of the work, which deals with data and knowledge models, the manipulation of multimedia content annotations and semantic indexing; the second part will continue with the use of the extracted semantic information for personalized retrieval.
Cite this article
Mylonas, P., Athanasiadis, T., Wallace, M. et al. Semantic representation of multimedia content: Knowledge representation and semantic indexing. Multimed Tools Appl 39, 293–327 (2008). https://doi.org/10.1007/s11042-007-0161-4
DOI: https://doi.org/10.1007/s11042-007-0161-4