
Non-linear metric learning using pairwise similarity and dissimilarity constraints and the geometrical structure of data. (English) Zbl 1207.68261

Summary: The problem of clustering with side information has received much recent attention, and metric learning has been considered a powerful approach to it. Various metric learning methods have been proposed for semi-supervised clustering. Although some of the existing methods can use both positive (must-link) and negative (cannot-link) constraints, they are usually limited to learning a linear transformation (i.e., finding a global Mahalanobis metric). In this paper, we propose a framework for efficiently learning linear and non-linear transformations. We use both positive and negative constraints as well as the intrinsic topological structure of the data. We formulate our metric learning method as an appropriate optimization problem and find its global optimum. The proposed non-linear method can be considered an efficient kernel learning method that yields an explicit non-linear transformation and thus generalizes to out-of-sample data. Experimental results on synthetic and real-world data sets show the effectiveness of our metric learning method for semi-supervised clustering tasks.
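The summary describes learning a global Mahalanobis metric, d_M(x, y)^2 = (x - y)^T M (x - y), from must-link and cannot-link pairs. The following minimal NumPy sketch is an illustration of that general idea only, not the paper's actual optimization problem or notation: it learns a transform L with M = L^T L by gradient descent on a simple hinge-style loss over the constraint pairs. The function name, hyperparameters, and toy data are assumptions made for the example.

```python
import numpy as np

def learn_mahalanobis(X, must_link, cannot_link, dim=None, lr=0.01, n_iter=500, margin=1.0):
    """Toy sketch (hypothetical, not the paper's method): learn a linear transform L
    so that must-link pairs are pulled together and cannot-link pairs are pushed
    beyond a margin. The induced metric is d(x, y)^2 = (x - y)^T L^T L (x - y),
    i.e. a Mahalanobis metric with M = L^T L."""
    n, d = X.shape
    dim = dim or d
    rng = np.random.default_rng(0)
    L = rng.normal(scale=0.1, size=(dim, d)) + np.eye(dim, d)  # start near the identity
    for _ in range(n_iter):
        grad = np.zeros_like(L)
        for i, j in must_link:                  # pull similar points together
            diff = X[i] - X[j]
            grad += 2.0 * np.outer(L @ diff, diff)
        for i, j in cannot_link:                # push dissimilar points apart (hinge)
            diff = X[i] - X[j]
            z = L @ diff
            if z @ z < margin:                  # only penalize pairs inside the margin
                grad -= 2.0 * np.outer(z, diff)
        L -= lr * grad / max(len(must_link) + len(cannot_link), 1)
    return L

# Tiny usage example on synthetic 2-D data with two must-link and two cannot-link pairs.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
L = learn_mahalanobis(X, must_link=[(0, 1), (2, 3)], cannot_link=[(0, 2), (1, 3)])
d01 = np.linalg.norm(L @ (X[0] - X[1]))  # must-link distance: should shrink
d02 = np.linalg.norm(L @ (X[0] - X[2]))  # cannot-link distance: should stay large
print(d01, d02)
```

A non-linear variant in the spirit of the paper would apply the same kind of constrained objective after an (explicit or kernel-induced) non-linear feature map; the sketch above only covers the linear (Mahalanobis) case.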

MSC:

68T10 Pattern recognition, speech recognition
68P05 Data structures

References:

[1] Xiang, S.; Nie, F.; Zhang, C., Learning a Mahalanobis distance metric for data clustering and classification, Pattern Recognition, 41, 12, 3600-3612 (2008) · Zbl 1162.68642
[2] L. Yang, R. Jin, Distance metric learning: a comprehensive survey, Technical Report, Michigan State University. 〈http://www.cse.msu.edu/∼yangliu1/frame_survey_v2.pdf〉
[3] J. Goldberger, S. Roweis, G. Hinton, R. Salakhutdinov, Neighborhood components analysis, in: Advances in NIPS, MIT Press, Cambridge, MA, USA, 2004, pp. 513-520.
[4] K. Weinberger, J. Blitzer, L. Saul, Distance metric learning for large margin nearest neighbor classification, in: Advances in NIPS, MIT Press, Cambridge, MA, USA, 2006, pp. 1473-1480.
[5] Hastie, T.; Tibshirani, R., Discriminant adaptive nearest neighbor classification, IEEE Transactions on Pattern Analysis and Machine Intelligence, 18, 6, 607-616 (1996)
[6] A. Globerson, S. Roweis, Metric learning by collapsing classes, in: Advances in NIPS, MIT Press, Cambridge, MA, USA, 2006, pp. 451-458.
[7] J.H. Friedman, Flexible metric nearest neighbor classification, Technical Report, Statistics Department, Stanford University, 1994.
[8] Z.H. Zhang, J.T. Kwok, D.Y. Yeung, Parametric distance metric learning with label information, in: IJCAI, Acapulco, Mexico, 2003, pp. 1450-1452.
[9] M.H.C. Law, Clustering, dimensionality reduction, and side information, Ph.D. Dissertation, Michigan State University, 2006.
[10] S. Basu, Semi-supervised clustering: probabilistic models, algorithms and experiments, Ph.D. Dissertation, University of Texas at Austin, 2005.
[11] M.H.C. Law, A. Topchy, A.K. Jain, Model-based clustering with probabilistic constraints, in: SIAM Conference on Data Mining, 2005, pp. 641-645.
[12] S. Basu, A. Banerjee, R.J. Mooney, Semi-supervised clustering by seeding, in: 19th International Conference on Machine Learning (ICML-02), Sydney, Australia, 2002, pp. 19-26.
[13] K. Wagstaff, C. Cardie, S. Rogers, S. Schroedl, Constrained K-means clustering with background knowledge, in: 18th International Conference on Machine Learning (ICML-01), 2001, pp. 577-584.
[14] Z. Lu, T. Leen, Semi-supervised learning with penalized probabilistic clustering, in: Advances in NIPS 17, MIT Press, Cambridge, MA, USA, 2005, pp. 849-856.
[15] Bansal, N.; Blum, A.; Chawla, S., Correlation clustering, Machine Learning, 56, 1-3, 89-113 (2004) · Zbl 1089.68085
[16] T. Lange, M.H. Law, A.K. Jain, J. Buhmann, Learning with constrained and unlabelled data, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 2005, pp. 730-737.
[17] Zhao, Q.; Miller, D. J., Mixture modeling with pairwise instance-level class constraints, Neural Computation, 17, 11, 2482-2507 (2005) · Zbl 1075.68636
[18] Yu, S. X.; Shi, J., Segmentation given partial grouping constraints, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26, 2, 173-183 (2004)
[19] B. Kulis, S. Basu, I. Dhillon, R.J. Mooney, Semi-supervised graph clustering: a kernel approach, in: 22nd International Conference on Machine Learning, 2005, pp. 457-464.
[20] S. Basu, M. Bilenko, R.J. Mooney, A probabilistic framework for semi-supervised clustering, in: 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, pp. 59-68.
[21] N. Shental, A. Bar-Hillel, T. Hertz, D. Weinshall, Computing Gaussian mixture models with EM using equivalence constraints, in: Advances in NIPS 16, MIT Press, Cambridge, MA, USA, 2004, pp. 465-472.
[22] D. Klein, S.D. Kamvar, C. Manning, From instance-level constraints to space-level constraints: making the most of prior knowledge in data clustering, in: 19th International Conference on Machine Learning (ICML-02), Sydney, Australia, 2002, pp. 307-314.
[23] E.P. Xing, A.Y. Ng, M.I. Jordan, S. Russell, Distance metric learning with application to clustering with side information, in: Advances in NIPS 15, MIT Press, Cambridge, MA, USA, 2003, pp. 505-512.
[24] M. Bilenko, S. Basu, R.J. Mooney, Integrating constraints and metric learning in semi-supervised clustering, in: 21st International Conference on Machine Learning, 2004, pp. 81-88.
[25] Bar-Hillel, A.; Hertz, T.; Shental, N.; Weinshall, D., Learning a Mahalanobis metric from equivalence constraints, Journal of Machine Learning Research, 6, 937-965 (2005) · Zbl 1222.68140
[26] T. Hertz, A. Bar-Hillel, D. Weinshall, Boosting margin-based distance functions for clustering, in: Proceedings of 21st International Conference on Machine Learning, 2004, pp. 393-400.
[27] M. Schultz, T. Joachims, Learning a distance metric from relative comparisons, in: Advances in NIPS 16, MIT Press, Cambridge, MA, USA, 2004, pp. 41-48.
[28] Yeung, D. Y.; Chang, H., Extending the relevant component analysis algorithm for metric learning using both positive and negative equivalence constraints, Pattern Recognition, 39, 1007-1010 (2006) · Zbl 1158.68483
[29] Chang, H.; Yeung, D. Y., Locally linear metric adaptation with application to semi-supervised clustering and image retrieval, Pattern Recognition, 39, 1253-1264 (2006) · Zbl 1095.68093
[30] Yeung, D. Y.; Chang, H., A kernel approach for semi-supervised metric learning, IEEE Transactions on Neural Networks, 18, 1, 141-149 (2007)
[31] Chang, H.; Yeung, D. Y.; Cheung, W. K., Relaxational metric adaptation and its application to semi-supervised clustering and content-based image retrieval, Pattern Recognition, 39, 1905-1917 (2006) · Zbl 1096.68714
[32] De Bie, T.; Momma, M.; Cristianini, N., Efficiently learning the metric with side-information, Lecture Notes in Artificial Intelligence, 2842, 175-189 (2003) · Zbl 1263.68128
[33] Kumar, N.; Kummamuru, K., Semi-supervised clustering with metric learning using relative comparisons, IEEE Transactions on Knowledge and Data Engineering, 20, 4, 496-503 (2007)
[34] S.C.H. Hoi, W. Liu, M.R. Lyu, W.-Y. Ma, Learning distance metrics with contextual constraints for image retrieval, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), New York, USA, 2006, pp. 2072-2078.
[35] Guo, Y. F.; Li, S. J.; Yang, J. Y.; Shu, T. T.; Wu, L. D., A generalized Foley-Sammon transform based on generalized Fisher discriminant criterion and its application to face recognition, Pattern Recognition Letters, 24, 1, 147-158 (2003) · Zbl 1055.68092
[36] H. Wang, S. Yan, D. Xu, X. Tang, T. Huang, Trace ratio vs. ratio trace for dimensionality reduction, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2007, pp. 1-8.
[37] Roweis, S. T.; Saul, L. K., Non-linear dimensionality reduction by locally linear embedding, Science, 290, 5500, 2323-2326 (2000)
[38] D. Cai, X. He, J. Han, Semi-supervised discriminant analysis, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV’07), Brazil, 2007, pp. 1-7.
[39] S.C.H. Hoi, R. Jin, M.R. Lyu, Learning nonparametric kernel matrices from pairwise constraints, in: 24th International Conference on Machine Learning (ICML’07), New York, USA, 2007, pp. 361-368.
[40] J. Zhuang, I.W. Tsang, S.C.H. Hoi, SimpleNPKL: simple non-parametric kernel learning, in: 26th International Conference on Machine Learning (ICML’09), Montreal, Canada, 2009, pp. 1273-1280.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.