
Comparison of relevance learning vector quantization with other metric adaptive classification methods. (English) Zbl 1100.68099

Summary: The paper deals with the concept of relevance learning in learning vector quantization and classification. Recent machine learning approaches that are capable of metric adaptation, but are based on different concepts, are compared with variants of relevance learning vector quantization. We compare these methods with respect to their theoretical motivation and demonstrate the differences in their behavior on several real-world data sets.
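To make the relevance learning idea concrete: generalized relevance LVQ equips the squared Euclidean distance with adaptive relevance weights, one per input dimension, and adapts prototypes and relevance weights jointly from a margin-like cost. The following Python sketch illustrates this scheme under explicit assumptions: a weighted squared Euclidean metric with the identity transfer function, one prototype per class initialised at the class mean, and purely illustrative learning rates; the names grlvq_fit and grlvq_predict and all hyperparameter values are hypothetical and not taken from the paper.

```python
# Minimal sketch of relevance learning in LVQ (GRLVQ-style), assuming a
# weighted squared Euclidean metric and identity transfer function.
# Hyperparameters and initialisation are illustrative, not from the paper.
import numpy as np

def grlvq_fit(X, y, alpha=0.05, beta=0.005, epochs=30, seed=0):
    """Train a toy relevance LVQ model with one prototype per class."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    W = np.array([X[y == c].mean(axis=0) for c in classes])   # prototypes
    c_W = classes.copy()                                      # prototype labels
    lam = np.full(X.shape[1], 1.0 / X.shape[1])               # relevance weights, sum to 1

    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            x, label = X[i], y[i]
            d = ((W - x) ** 2 * lam).sum(axis=1)              # relevance-weighted distances
            correct = c_W == label
            jp = np.where(correct)[0][d[correct].argmin()]    # closest correct prototype
            jm = np.where(~correct)[0][d[~correct].argmin()]  # closest wrong prototype
            dp, dm = d[jp], d[jm]
            denom = (dp + dm) ** 2 + 1e-12
            xp, xm = 2.0 * dm / denom, 2.0 * dp / denom       # gradient scaling factors
            diff_p, diff_m = x - W[jp], x - W[jm]
            # attract the closest correct prototype, repel the closest wrong one
            W[jp] += alpha * xp * lam * diff_p
            W[jm] -= alpha * xm * lam * diff_m
            # adapt relevances and renormalise to a non-negative profile
            lam -= beta * (xp * diff_p ** 2 - xm * diff_m ** 2)
            lam = np.clip(lam, 0.0, None)
            lam /= lam.sum()
    return W, c_W, lam

def grlvq_predict(X, W, c_W, lam):
    """Assign each sample the label of its closest prototype under the learned metric."""
    X = np.asarray(X, dtype=float)
    d = ((X[:, None, :] - W[None, :, :]) ** 2 * lam).sum(axis=2)
    return c_W[d.argmin(axis=1)]
```

In such a formulation the learned relevance profile lam can be read directly as a feature ranking (cf. reference [1]), which is one point on which relevance learning differs from metric adaptive methods that learn a full parametric or kernelised transformation.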

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Software:

UCI-ml

References:

[1] Andonie, R.; Cataron, A., An information energy LVQ approach for feature ranking, (Verleysen, M., European symposium on artificial neural networks (2004), d-side publications: d-side publications Evere, Belgium), 471-476
[2] Blake, C. L.; Merz, C. J., UCI repository of machine learning databases (1998), Irvine, CA: University of California, Department of Information and Computer Science
[3] Bojer, T., Hammer, B., Schunk, D., & Tluk von Toschanowitz, K. (2001). Relevance determination in learning vector quantization. In Proceedings of the ninth European symposium on artificial neural networks, ESANN’2001
[4] Crammer, K., Gilad-Bachrach, R., Navot, A., & Tishby, N. (2002). Margin analysis of the LVQ algorithm. In Proceedings of the NIPS 2002. http://www-2.cs.cmu.edu/Groups/NIPS/NIPS2002/NIPS2002preproceedings/index.html
[5] Cristianini, N.; Lodhi, H.; Shawe-Taylor, J.; Watkins, C., Text classification using string kernels, Journal of Machine Learning Research, 2, 419-444 (2002) · Zbl 1013.68176
[6] Diekhans, M.; Jaakkola, T.; Haussler, D., A discriminative framework for detecting remote protein homologies, Journal of Computational Biology, 7, 1-2 (2000)
[7] Fano, R. M., Transmission and information: A statistical theory of communication (1961), Wiley: Wiley New York · Zbl 0151.24402
[8] Flach, P. A., Gärtner, T., & Wrobel, S. (2003). On graph kernels: Hardness results and efficient alternatives. In Proceedings 16th annual conference on computational learning theory and seventh kernel workshop (COLT-2003) · Zbl 1274.68312
[9] Gärtner, T., A survey of kernels for structured data, SIGKDD explorations (2003)
[10] Gori, M.; Bianchini, M.; Scarselli, F., Processing directed acyclic graphs with recursive neural networks, IEEE Transactions on Neural Networks, 12, 6, 1464-1470 (2001)
[11] Gori, M.; Frasconi, P.; Sperduti, A., A general framework of adaptive processing of data structures, IEEE Transactions on Neural Networks, 9, 5, 768-786 (1998)
[12] Hammer, B.; Jain, B. J., Neural methods for non-standard data, (Verleysen, M., European symposium on artificial neural networks (2004), d-side publications: d-side publications Evere, Belgium), 281-292
[13] Hammer, B.; Strickert, M.; Villmann, Th., Supervised neural gas with general similarity measure, Neural Processing Letters, 21, 1, 21-44 (2005)
[14] Hammer, B.; Strickert, M.; Villmann, Th., On the generalization ability of GRLVQ networks, Neural Processing Letters, 21, 2, 109-120 (2005)
[15] Hammer, Barbara; Strickert, Marc; Villmann, Thomas, Prototype based recognition of splice sites, (Seiffert, U.; Jain, L. A.; Schweitzer, P., Bioinformatics using computational intelligence paradigms (2005), Springer: Springer New York), 25-56
[16] Hammer, B.; Villmann, Th., Generalized relevance learning vector quantization, Neural Networks, 15, 8-9, 1059-1068 (2002)
[17] Hammer, B.; Villmann, Th., Mathematical aspects of neural networks, (Verleysen, M., Proceedings of the European symposium on artificial neural networks (ESANN’2003) (2003), d-side: d-side Brussels, Belgium), 59-72
[18] Hellmann, M. E.; Raviv, J., Probability of error, equivocation and the Chernoff bound, IEEE Transactions on Information Theory, 16, 368-372 (1970) · Zbl 0218.62005
[19] Kapur, J. N., Measures of information and their application (1994), Wiley: Wiley New Delhi · Zbl 0925.94073
[20] Kapur, J. N.; Kesavan, H. K., Entropy optimization principles with applications (1992), Academic Press: Academic Press San Diego, London · Zbl 0718.62007
[21] Kearns, M., Mansour, Y., Ng, A., & Ron, D. (1995). An experimental and theoretical comparison of model selection methods. In Proceedings of the eighth annual ACM workshop on computational learning theory
[22] Kohonen, Teuvo, Self-organizing maps. Springer series in information sciences (Vol. 30) (1995), Springer: Springer Heidelberg, Berlin, (Second Extended Edition 1997) · Zbl 0866.68085
[23] Kohonen, T.; Somervuo, P., How to make large self-organizing maps for nonvectorial data, Neural Networks, 15, 8-9, 945-952 (2002)
[24] Martinetz, Thomas M.; Berkovich, Stanislav G.; Schulten, Klaus J., ‘Neural-gas’ network for vector quantization and its application to time-series prediction, IEEE Transactions on Neural Networks, 4, 4, 558-569 (1993)
[25] Hammer, B., Micheli, A., & Sperduti, A. (2005). Universal approximation capability of cascade correlation for structures. Neural Computation 17, 1109-1159 · Zbl 1096.68132
[26] National Cancer Institute. (2004). Prostate cancer data set. http://www.cancer.gov
[27] Onicescu, O., Théorie de l’information. Énergie informationnelle, Comptes Rendus de l’Académie des Sciences, Série A-B, 263, 841-842 (1966) · Zbl 0143.41206
[28] Passerini, A.; Ceroni, A.; Frasconi, P.; Vullo, A., Predicting the disulfide bonding state of cysteines with combinations of kernel machines, Journal of VLSI Signal Processing, 35, 3, 287-295 (2003) · Zbl 1042.68647
[29] Press, W. H.; Teukolsky, S. A.; Vetterling, W. T.; Flannery, B. P., Numerical recipes in C (1999), Cambridge University Press: Cambridge University Press Cambridge, NY
[30] Principe, J. C.; Fischer, J. W.; Xu, D., Information theoretic learning, (Haykin, S., Unsupervised adaptive filtering (2000), Wiley: Wiley New York)
[31] Rényi, A. (1961). On measures of entropy and information. In Proceedings of the fourth Berkeley symposium on mathematical statistics and probability · Zbl 0106.33001
[32] Sato, A.; Yamada, K., A formulation of learning vector quantization using a new misclassification measure, (Jain, A. K.; Venkatesh, S.; Lovell, B. C. (Eds.), Proceedings of the 14th international conference on pattern recognition, Vol. 1 (1998), IEEE Computer Society: IEEE Computer Society Los Alamitos, CA), 322-325
[33] Sato, A. S.; Yamada, K., Generalized learning vector quantization, (Tesauro, G.; Touretzky, D.; Leen, T. (Eds.), Advances in neural information processing systems, Vol. 7 (1995), MIT Press: MIT Press Cambridge, MA), 423-429
[34] Sato, Atsushi; Yamada, Kenji, An analysis of convergence in generalized LVQ, (Niklasson, L.; Bodén, M.; Ziemke, T. (Eds.), Proceedings of the ICANN’98, the eighth international conference on artificial neural networks, Vol. 1 (1998), Springer: Springer London), 170-176
[35] Schleif, F.-M., Clauss, U., Villmann, T., & Hammer, B. (2004). Supervised relevance neural gas and unified maximum separability analysis for classification of mass spectrometric data. In Proceedings of the international conference of machine learning applications (ICMLA’2004)
[36] Schölkopf, B.; Smola, A., Learning with kernels (2002), MIT Press: MIT Press Cambridge, MA
[37] Sona, D.; Micheli, A.; Sperduti, A., Contextual processing of structured data by recursive cascade correlation, IEEE Transactions on Neural Networks, 15, 6, 1396-1410 (2004)
[38] Sperduti, A.; Starita, A., Supervised neural networks for the classification of structures, IEEE Transactions on Neural Networks, 8, 3, 714-735 (1997)
[39] Torkkola, K., Feature extraction by non-parametric mutual information maximization, Journal of Machine Learning Research, 3, 1415-1438 (2003) · Zbl 1102.68638
[40] Torkkola, K., & Campbell, W. M. (2000). Mutual information in learning feature transformations. In Proceedings of the international conference on machine learning ICML’2000, Stanford, CA
[41] Tsang, I. W.; Kwok, J. T., Distance metric learning with kernels, (Kaynak, O., Proceedings of the international conference on artificial neural networks (ICANN’2003), Istanbul (2003)), 126-129
[42] Villmann, Th.; Hammer, B., Supervised neural gas for learning vector quantization, (Polani, D.; Kim, J.; Martinetz, T., Proceedings of the fifth German workshop on artificial life (GWAL-5) (2002), Akademische Verlagsgesellschaft/Infix/IOS Press: Akademische Verlagsgesellschaft/Infix/IOS Press Berlin, Germany), 9-16
[43] Zhang, Z.; Kwok, J. T.; Yeung, D.-Y., Parametric distance metric learning with label information, (Kaynak, O., Proceedings of the 18th international joint conference on artificial intelligence (IJCAI’03) (2003), Acapulco, Mexico), 1450-1452
[44] Zhang, Z., Kwok, J. T., & Yeung, D.-Y. (2003b). Parametric distance metric learning with label information
[45] Zhang, Zhen; Page, Grier; Zhang, Hong, Applying classification separability analysis to microarray data, (Lin, S. M.; Johnson, K. F., Methods of microarray data analysis (2002), Kluwer: Kluwer Dordrecht, Netherlands), 125-136