Abstract
World Wide Web applications have grown rapidly and have had a significant impact on computer systems. Among them, browsing the web for useful information is perhaps the most common. Because of this heavy use, efficient and effective web retrieval has become an important research topic. We previously proposed a web-mining algorithm for extracting interesting browsing patterns from log data in web servers. It integrated fuzzy-set concepts with a data-mining approach. In that algorithm, each web page used only the linguistic term with the maximum cardinality in the mining process. The number of items was thus the same as the number of original web pages, which reduced the processing time. The fuzzy browsing patterns derived in this way are, however, not complete: some possible patterns may be missed. This paper therefore modifies that algorithm and proposes a new fuzzy web-mining algorithm for extracting all possible interesting fuzzy knowledge from log data in web servers. The proposed algorithm can derive a more complete set of browsing patterns, but requires more computation time than the previous method. A trade-off thus exists between computation time and the completeness of the browsing patterns, and the choice of an appropriate mining method depends on the requirements of the application domain.
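The difference between the two methods can be illustrated with a minimal Python sketch of the first mining pass only (finding large single items). It is not the chapter's algorithm: the membership functions over browsing duration, the linguistic terms Short, Middle and Long, the sample sessions, and the minimum count of 0.7 are all assumptions made for illustration, as are the helper names `memberships`, `fuzzy_counts`, `max_term_items` and `large_items`.

```python
from collections import defaultdict


def clip(x):
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def memberships(duration):
    """Hypothetical membership functions over browsing duration (seconds)."""
    return {
        "Short": clip((70 - duration) / 50),
        "Middle": clip(min((duration - 20) / 55, (130 - duration) / 55)),
        "Long": clip((duration - 80) / 50),
    }


def fuzzy_counts(sessions):
    """Sum membership degrees per (page, term) over all sessions.

    The sum is the scalar cardinality of each fuzzy item.
    """
    counts = defaultdict(float)
    for session in sessions:
        for page, duration in session.items():
            for term, degree in memberships(duration).items():
                counts[(page, term)] += degree
    return dict(counts)


def max_term_items(counts):
    """Earlier method: keep only the maximum-cardinality term of each page."""
    best = {}
    for (page, term), count in counts.items():
        if page not in best or count > counts[(page, best[page])]:
            best[page] = term
    return {(page, term): counts[(page, term)] for page, term in best.items()}


def large_items(counts, min_count):
    """Keep the fuzzy items whose count reaches the minimum count."""
    return {item for item, count in counts.items() if count >= min_count}


# Made-up log data: each session maps a page to its browsing duration.
SESSIONS = [
    {"A": 30, "B": 100},
    {"A": 45, "C": 120},
    {"B": 110, "C": 20},
]

counts = fuzzy_counts(SESSIONS)
reduced = large_items(max_term_items(counts), 0.7)   # one term per page
complete = large_items(counts, 0.7)                  # all terms considered

print(sorted(reduced))
print(sorted(complete))
```

With this toy data the one-term-per-page variant keeps three items, while the all-terms variant also keeps `("B", "Middle")` and `("C", "Long")`. The larger item set is what makes the resulting patterns more complete, and it is also why later passes have more candidates to process.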
© 2004 Springer-Verlag Berlin Heidelberg
Cite this chapter
Hong, TP., Lin, KY., Wang, SL. (2004). A Time-Completeness Tradeoff on Fuzzy Web-Browsing Mining. In: Loia, V., Nikravesh, M., Zadeh, L.A. (eds) Fuzzy Logic and the Internet. Studies in Fuzziness and Soft Computing, vol 137. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39988-9_6
DOI: https://doi.org/10.1007/978-3-540-39988-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05770-0
Online ISBN: 978-3-540-39988-9
eBook Packages: Springer Book Archive