Mining indirect association rules for web recommendation. (English) Zbl 1176.68208

Summary: Classical association rules, here called “direct”, reflect relationships between items that co-occur relatively often in common transactions. In the web domain, items correspond to pages and transactions to user sessions. The main idea of the presented approach is to discover indirect associations between pages that rarely occur together but each appear relatively frequently with other, “third” pages, called transitive. Two types of indirect association rules are described in the paper: partial and complete. The former respect single transitive pages, while the latter cover all existing transitive pages. The presented IDARM* algorithm extracts complete indirect association rules together with their key measure, confidence, using pre-calculated direct rules. Both direct and indirect rules are joined into one set of complex association rules, which may be used for the recommendation of web pages. The experiments performed showed that indirect rules are useful for extending a typical recommendation list and that they deliver new knowledge not available from direct rules. The relation between ranking lists created on the basis of direct association rules and hyperlinks existing on web pages is also examined.
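
The pipeline outlined in the summary (direct rules, then partial and complete indirect rules, then a joined set of complex rules) can be illustrated with a small sketch. This is not the IDARM* algorithm itself, and the summary does not give the formulas. The sketch assumes that a partial indirect confidence is the product of the two direct confidences through one transitive page, that the complete indirect confidence is the mean of the partial values over all transitive pages found, and that complex rules are a weighted sum with a hypothetical weight `alpha`. All function names are illustrative.

```python
from collections import defaultdict
from itertools import permutations


def direct_confidence(sessions):
    """Direct rules: con(a -> b) = sessions containing a and b / sessions containing a."""
    page_count = defaultdict(int)
    pair_count = defaultdict(int)
    for session in sessions:
        pages = set(session)  # a page counts once per session
        for page in pages:
            page_count[page] += 1
        for a, b in permutations(pages, 2):
            pair_count[(a, b)] += 1
    return {(a, b): n / page_count[a] for (a, b), n in pair_count.items()}


def partial_indirect(direct):
    """Partial indirect rules keyed by (i, j, k), where k is a single transitive page.

    Assumption: confidence is con(i -> k) * con(k -> j).
    """
    by_source = defaultdict(dict)
    for (a, b), conf in direct.items():
        by_source[a][b] = conf
    partial = {}
    for (i, k), conf_ik in direct.items():
        for j, conf_kj in by_source[k].items():
            if j != i:
                partial[(i, j, k)] = conf_ik * conf_kj
    return partial


def complete_indirect(partial):
    """Complete indirect rules: aggregate over all transitive pages of a pair.

    Assumption: the aggregate is the arithmetic mean of the partial confidences.
    """
    grouped = defaultdict(list)
    for (i, j, _k), conf in partial.items():
        grouped[(i, j)].append(conf)
    return {pair: sum(values) / len(values) for pair, values in grouped.items()}


def complex_rules(direct, complete, alpha=0.5):
    """Join direct and indirect rules; alpha is a hypothetical weighting parameter."""
    pairs = set(direct) | set(complete)
    return {
        pair: alpha * direct.get(pair, 0.0) + (1 - alpha) * complete.get(pair, 0.0)
        for pair in pairs
    }


# Toy sessions: pages "a" and "b" never co-occur, but both co-occur with "t".
sessions = [["a", "t"], ["a", "t"], ["t", "b"], ["t", "b"], ["a", "c"]]
direct = direct_confidence(sessions)
complete = complete_indirect(partial_indirect(direct))
combined = complex_rules(direct, complete)

# Recommendation list for page "a", ranked by combined confidence.
ranking = sorted(
    ((target, score) for (source, target), score in combined.items() if source == "a"),
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)
```

In this toy data the pair ("a", "b") has no direct rule, yet it receives an indirect one through the transitive page "t", which is how indirect rules can extend a recommendation list. The sketch computes indirect rules for every pair and does not filter out pairs that already co-occur often.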

MSC:

68T35 Theory of languages and software systems (knowledge-based systems, expert systems, etc.) for artificial intelligence
68M10 Network design and communication in computer systems

Software:

WebACE
