×

Binet-Cauchy kernels on dynamical systems and its application to the analysis of dynamic scenes. (English) Zbl 1477.68433

Summary: We propose a family of kernels based on the Binet-Cauchy theorem, and its extension to Fredholm operators. Our derivation provides a unifying framework for all kernels on dynamical systems currently used in machine learning, including kernels derived from the behavioral framework, diffusion processes, marginalized kernels, kernels on graphs, and the kernels on sets arising from the subspace angle approach. In the case of linear time-invariant systems, we derive explicit formulae for computing the proposed Binet-Cauchy kernels by solving Sylvester equations, and relate the proposed kernels to existing kernels based on cepstrum coefficients and subspace angles. We show efficient methods for computing our kernels which make them viable for the practitioner. Besides their theoretical appeal, these kernels can be used efficiently in the comparison of video sequences of dynamic scenes that can be modeled as the output of a linear time-invariant dynamical system. One advantage of our kernels is that they take the initial conditions of the dynamical systems into account. As a first example, we use our kernels to compare video sequences of dynamic textures. As a second example, we apply our kernels to the problem of clustering short clips of a movie. Experimental evidence shows superior performance of our kernels.

MSC:

68T45 Machine vision and scene understanding
47B32 Linear operators in reproducing-kernel Hilbert spaces (including de Branges, de Branges-Rovnyak, and other structured spaces)
68T05 Learning and adaptive systems in artificial intelligence
PDFBibTeX XMLCite
Full Text: DOI Link

References:

[1] Aggarwal, G., Roy-Chowdhury, A., and Chellappa, R. 2004. A system identification approach for video-based face recognition. In Proc. Intl. Conf. Pattern Recognition, Cambridge, UK.
[2] Aitken, A.C. 1946. Determinants and Matrices, 4th edition. Interscience Publishers. · Zbl 0063.00033
[3] Bach, F.R. and Jordan, M.I. 2002. Kernel independent component analysis. Journal of Machine Learning Research, 3:1–48. · Zbl 1088.68689 · doi:10.1162/153244303768966085
[4] Bakir, G., Weston, J., and Schölkopf, B. 2003. Learning to find pre-images. In Advances in Neural Information Processing Systems 16. MIT Press.
[5] Barla, A., Odone, F., and Verri, A. 2002. Hausdorff kernel for 3D object acquisition and detection. In European Conference on Computer Vision ’02, number 2353 in LNCS, pp. 20. Springer. · Zbl 1039.68595
[6] Baxter, J. and Bartlett, P.L. 1999. Direct gradient-based reinforcement learning: Gradient estimation algorithms. Technical report, Research School of Information, ANU Canberra.
[7] Bernstein D.S. 2005. Matrix Mathematics. Princeton University Press. · Zbl 1075.15001
[8] Blanz, V., Schölkopf, B., Bülthoff H., Burges, C., Vapnik, V., and Vetter, T. 1996. Comparison of view-based object recognition algorithms using realistic 3D models. In C. von der Malsburg, W. von Seelen, J.C. Vorbrüggen, and B. Sendhoff (eds.), Artificial Neural Networks ICANN’96, vol. 1112 of Lecture Notes in Computer Science, pp. 251–256. Berlin: Springer-Verlag.
[9] Burges, C.J.C. and Vapnik, V. 1995. A new method for constructing artificial neural networks. Interim technical report, ONR contract N00014-94-c-0186, AT&T Bell Laboratories.
[10] Burkhardt, H. 2004. Invariants on skeletons and polyhedrals. Technical report, Universität Freiburg, in preparation.
[11] Chang, C.C. and Lin, C.J. 2001. LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/\(\sim\)cjlin/libsvm .
[12] Chapelle, O., Haffner, P., and Vapnik, V. 1999. SVMs for histogram-based image classification. IEEE Transactions on Neural Networks, 10(5).
[13] De Cock, K. and De Moor, B. 2002. Subspace angles between ARMA models. Systems and Control Letter, 46:265–270. · Zbl 0994.93057 · doi:10.1016/S0167-6911(02)00135-4
[14] Cortes, C., Haffner, P., and Mohri, M. 2002. Rational kernels. In Advances in Neural Information Processing Systems 15, vols. 14. Cambridge, MA: MIT Press. · Zbl 1274.68302
[15] Cristianini, N. and Shawe-Taylor, J. 2000. An Introduction to Support Vector Machines. Cambridge University Press, Cambridge, UK. · Zbl 0994.68074
[16] de Finetti, B. 1990. Theory of probability, vol. 1–2. John Wiley and Sons, 1990. reprint of the 1975 translation. · Zbl 0694.60001
[17] Doretto, G., Chiuso, A., Wu, Y.N., and Soatto, S. 2003. Dynamic textures. International Journal of Computer Vision, 51(2):91–109. · Zbl 1030.68646 · doi:10.1023/A:1021669406132
[18] Gardiner, J.D., Laub, A.L., Amato, J.J., and Moler, C.B. 1992. Solution of the Sylvester matrix equation AXB + CXD = E. ACM Transactions on Mathematical Software, 18(2):223– 231. · Zbl 0893.65026 · doi:10.1145/146847.146929
[19] Gärtner, T., Flach, P.A., and Wrobel, S. 2003. On graph kernels: Hardness results and efficient alternatives. In B. Schölkopf and M.K. Warmuth, (eds.) Sixteenth Annual Conference on Computational Learning Theory and Seventh Kernel Workshop, COLT. Springer. · Zbl 1274.68312
[20] Golub, G.H. and Van Loan, C.F. 1996. Matrix Computations, 3rd edition. Baltimore, MD: John Hopkins University Press. · Zbl 0865.65009
[21] Gretton, A., Herbrich, R., and Smola, A.J. 2003. The kernel mutual information. In Proceedings of ICASSP.
[22] Gröbner, W. 1965. Matrizenrechnung. BI Hochschultaschenbücher.
[23] Heisele, B., Ho, P., and Poggio, T. 2001. Face recognition with support vector machines: Global versus component-based approach. In Proceedings of the Eighth International Conference On Computer Vision (ICCV-01), pp. 688–694. Los Alamitos, CA: IEEE Computer Society.
[24] Herbrich, R. 2002. Learning Kernel Classifiers: Theory and Algorithms. MIT Press. · Zbl 1063.62092
[25] Isidori, A. 1989. Nonlinear Control Systems. 2nd edition. Springer. · Zbl 0693.93046
[26] Joachims, T. 1998. Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the European Conference on Machine Learning, pp. 137–142. Berlin: Springer.
[27] Kashima, H., Tsuda, K., and Inokuchi, A. 2003. Marginalized kernels between labeled graphs. In Proceedings of the 20th International Conference on Machine Learning (ICML), Washington, DC, United States.
[28] Kashima, H., Tsuda, K., and Inokuchi, A. 2004. Kernels on graphs. In K. Tsuda, B. Schölkopf, and J.P. Vert (Eds.), Kernels and Bioinformatics, Cambridge, MA: MIT Press.
[29] Kondor, I.R. and Lafferty, J.D. 2002. Diffusion kernels on graphs and other discrete structures. In Proceedings of the ICML.
[30] Luenberger, D.G. 1979. Introduction to Dynamic Systems: Theory, Models, and Applications. New York, USA:John Wiley and Sons, Inc., ISBN 0–471 - 02594 - 1. · Zbl 0458.93001
[31] Martin, R.J. 2000. A metric for ARMA processes. IEEE Transactions on Signal Processing, 48(4):1164–1170. · Zbl 0993.94540 · doi:10.1109/78.827549
[32] Nocedal, J. and Wright, S.J. 1999. Numerical Optimization. Springer Series in Operations Research. Springer. · Zbl 0930.65067
[33] Pinkus, A. 1996. Spectral properties of totally positive kernels and matrices. In M. Gasca and C.A. Miccheli (eds.), Total Positivity and its Applications, vol. 359 of Mathematics and its Applications, pp. 1–35. Kluwer. · Zbl 0891.45001
[34] Platt, J. 1999. Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C.J.C. Burges, and A.J. Smola (eds.), Advances in Kernel Methods–Support Vector Learning, pp. 185–208, Cambridge, MA: MIT Press.
[35] Ralaivola, L. and d’Alché Buc, F. 2003. Dynamical modeling with kernels for nonlinear time series prediction. In Se. Thrun, La. Saul, and Be. Schölkopf, (eds.) Advances in Neural Information Processing Systems 16. MIT Press.
[36] Roweis, S. and Saul, L.K. 2000. Nonlinear dimensionality reduction by locally linear embedding. Science, 290:2323–2326. · doi:10.1126/science.290.5500.2323
[37] Saisan, P., Doretto, G., Wu, Y.N., and Soatto, S. 2001. Dynamic texture recognition. In Proceedings of CVPR, vol. 2, pp. 58–63.
[38] Schölkopf, B. 1997. Support Vector Learning. R. Oldenbourg Verlag, Munich, Download: http://www.kernel-machines.org . · Zbl 0915.68137
[39] Schölkopf, B. and Smola, A. 2002. Learning with Kernels. Cambridge, MA: MIT Press. · Zbl 1019.68094
[40] Shashua, A., and Hazan, T. 2005. Algebraic set kernels with applications to inference over local image representations. In L.K. Saul, Y. Weiss, and L. Bottou (eds.), Advances in Neural Information Processing Systems 17, pp. 1257–1264, Cambridge, MA: MIT Press.
[41] Shi, J. and Malik, J. 1997. Normalized cuts and image segmentation. IEEE Conf. Computer Vision and Pattern Recognition.
[42] Smola, A.J. and Kondor, I.R. 2003. Kernels and regularization on graphs. In B. Schölkopf and M.K. Warmuth (eds.), Proceedings of the Annual Conference on Computational Learning Theory, Lecture Notes in Computer Science, pp. 144–158. Springer. · Zbl 1274.68351
[43] Smola, A.J. and Vishwanathan, S.V.N. 2003. Hilbert space embeddings in dynamical systems. In Proceedings of the 13th IFAC symposium on system identification. Rotterdam, Netherlands.
[44] Soatto, S. Doretto, G., and Wu: Y.N. 2001. Dynamic textures. In Proceedings of the Eighth International Conference On Computer Vision (ICCV-01), pp. 439–446. Los Alamitos, CA: IEEE Computer Society. · Zbl 1030.68646
[45] Sutton, R.S. and Barto, A.G. 1998. Reinforcement Learning: An Introduction. MIT Press.
[46] Vapnik, V. 1995. The Nature of Statistical Learning Theory. New York: Springer . · Zbl 0833.62008
[47] Vapnik, V. 1998. Statistical Learning Theory. New York: John Wiley and Sons. · Zbl 0935.62007
[48] Vidal, R., Ma, Y., and Sastry, S. 2005. Generalized Principal Component Analysis (GPCA). IEEE Transactions on Pattern Analysis and Machine Intelligence, In press.
[49] Vishwanathan, S.V.N. 2002. Kernel Methods: Fast Algorithms and Real Life Applications. PhD thesis, Indian Institute of Science, Bangalore, India.
[50] Vishwanathan, S.V.N., Borgwardt, K., and Schraudolph, Nicol N. 2006. Faster graph kernels. In International Conference on Machine Learning, submitted. · Zbl 1242.05112
[51] Vishwanathan, S.V.N., Smola, A.J., and Murty, M.N. 2003. SimpleSVM. In T. Fawcett and N. Mishra (eds.), Proceedings of the 20th International Conference on Machine Learning (ICML), Washington DC, AAAI press.
[52] Willems, J.C. 1986b. From time series to linear system. I. Finite-dimensional linear time invariant systems. Automatica J. IFAC, 22(5):561–580. · Zbl 0604.62090 · doi:10.1016/0005-1098(86)90066-X
[53] Willems, J.C. 1986a. From time series to linear system. II. Exact modelling. Automatica J. IFAC, 22(6):675–694. · Zbl 0628.62088 · doi:10.1016/0005-1098(86)90005-1
[54] Willems, J.C. 1987. From time series to linear system. III. Approximate modelling. Automatica J. IFAC, 23(1):87– 115. · Zbl 0628.62089 · doi:10.1016/0005-1098(87)90120-8
[55] Wolf, L. and Shashua, A. 2003. Learning over sets using kernel principal angles. Jounal of Machine Learning Research, 4:913–931. · Zbl 1098.68679 · doi:10.1162/jmlr.2003.4.6.913
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.