Abstract
The tensor algebraic approach, also known in the literature as structural equation modeling with multimode latent variables, has been successfully employed to represent the causal factor structure of data formation for over fifty years in econometrics, psychometrics, and chemometrics. More recently, the tensor factorization approach has been successfully employed in computer vision and computer graphics to represent cause-and-effect relationships, as well as in various machine learning tasks with predictive models.
Computer graphics and computer vision problems can be cast as causal inference problems, with tensor algebra serving as a suitable and transparent framework for modeling cause-and-effect relationships. Computer graphics may be viewed as addressing questions analogous to forward causal inference, which estimates the change in effects resulting from a unit change in the causal factors. Computer vision may be viewed as addressing questions analogous to inverse causal inference, which estimates the causes from observed effects, given an estimated forward causal model.
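As a toy illustration of this duality, the following sketch (the bilinear model, the factor names, and the dimensions are all hypothetical, chosen only for illustration) synthesizes an image from person and illumination coefficients (forward inference) and then recovers the person coefficients from an observed image via a pseudo-inverse (inverse inference):

```python
import numpy as np

# Hypothetical toy setup: images vary with two causal factors, "person"
# (3 values, 2-dim coefficients) and "illumination" (4 values, 2-dim
# coefficients); each image is flattened to 5 pixels.
rng = np.random.default_rng(0)
P = rng.normal(size=(3, 2))     # person coefficients
L = rng.normal(size=(4, 2))     # illumination coefficients
B = rng.normal(size=(2, 2, 5))  # core tensor mapping causes to pixels

# Forward causal inference (graphics/synthesis): given the causal factor
# coefficients, predict the effect, i.e. the image.
img = np.einsum('a,b,abp->p', P[1], L[2], B)

# Inverse causal inference (vision/recognition): given the observed image
# and a known illumination, recover the person coefficients.
A = np.einsum('b,abp->ap', L[2], B)  # 2 x 5 map from person space to pixels
p_hat = img @ np.linalg.pinv(A)      # estimated person coefficients
print(np.allclose(p_hat, P[1]))      # True in this noiseless case
```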
There are two main classes of tensor factorizations, stemming from two types of higher-order tensor decompositions that generalize different concepts from the matrix SVD: the rank-R decomposition (whose computation remains an open problem) and the rank-(R1, R2, ..., RM) decomposition, along with various tensor factorizations under different constraints.
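A minimal sketch of the two decomposition classes, assuming the open-source TensorLy library (the function names shown match recent TensorLy releases; older versions use different names):

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac, tucker

# A random 3rd-order tensor standing in for real data.
X = tl.tensor(np.random.default_rng(0).normal(size=(10, 12, 14)))

# Rank-R (CP/CANDECOMP-PARAFAC) decomposition: a sum of R rank-1 terms.
# Determining the exact rank R is NP-hard, hence the open problem above.
cp = parafac(X, rank=5)
X_cp = tl.cp_to_tensor(cp)

# Rank-(R1, R2, R3) (Tucker) decomposition: a core tensor multiplied by one
# factor matrix per mode, generalizing the SVD's row/column subspaces.
core, factors = tucker(X, rank=[4, 5, 6])
X_tk = tl.tucker_to_tensor((core, factors))

# Relative reconstruction errors of the two approximations.
print(tl.norm(X - X_cp) / tl.norm(X), tl.norm(X - X_tk) / tl.norm(X))
```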
In the first part of the tutorial, we will define the meaning of causality, the linear tensor rank, rank-R, and the multilinear tensor rank, rank-(R1, R2, ..., RM). The linear tensor rank, rank-R, generalizes the matrix concept of rank, while the multilinear rank, rank-(R1, R2, ..., RM), generalizes the matrix concepts of orthonormal row/column subspaces. We will address causal inference by employing several multilinear representations, such as Multilinear PCA, Multilinear ICA, and the Compositional Hierarchical Block Tucker decomposition, and introduce the multilinear projection operator, the tensor pseudo-inverse, and the identity tensor, which are important in inverse causal inference and in performing recognition in a tensor framework. Furthermore, we will discuss why images have traditionally been vectorized in statistical learning, and the advantages and disadvantages of treating images as vectors, matrices, and higher-order objects in the context of a tensor framework.
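The following sketch, using only NumPy, illustrates two of these ingredients: mode-n unfolding and a one-pass, HOSVD-style Multilinear PCA that keeps images as matrices rather than vectors (the function names, dimensions, and ranks are ours, for illustration only, and we omit the alternating refinement iterations of full MPCA):

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mpca(T, ranks):
    """A minimal HOSVD-style Multilinear PCA sketch (no iterations).

    Returns one orthonormal factor matrix U_m per mode and the core tensor
    Z = T x_1 U_1^T x_2 U_2^T ..., a compact causal-factor representation.
    """
    U = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
         for m, r in enumerate(ranks)]
    Z = T
    for m, u in enumerate(U):
        # Mode-m product with u.T: contract mode m, then move it back.
        Z = np.moveaxis(np.tensordot(u.T, np.moveaxis(Z, m, 0), axes=1), 0, m)
    return U, Z

# Treating images as matrices, not vectors: a stack of 100 images of size
# 32 x 32 stays a 100 x 32 x 32 tensor instead of a 100 x 1024 matrix.
D = np.random.default_rng(0).normal(size=(100, 32, 32))
U, Z = mpca(D, ranks=(20, 8, 8))
print(Z.shape)  # (20, 8, 8)
```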
In the second part, we will discuss how neural network architectures can be tensorized, and how tensor factorizations such as the Tensor Train and Hierarchical Tucker decompositions can achieve state-of-the-art performance with large parameter savings and computational speed-ups across a wide range of applications.
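As a rough illustration of where the parameter savings come from, the sketch below implements a basic TT-SVD sweep and counts parameters when a 1024 x 1024 dense weight matrix is tensorized and compressed (the reshaping into ten modes of size 4 and the rank cap of 8 are arbitrary choices for illustration; practical TT-layers use the TT-matrix format and learn the cores directly, and truncation is lossy for generic weights):

```python
import numpy as np

def tt_svd(T, max_rank):
    """Minimal TT-SVD sketch: a left-to-right sweep of truncated SVDs
    producing TT cores G_k of shape (r_{k-1}, n_k, r_k)."""
    dims = T.shape
    cores, r_prev = [], 1
    M = T.reshape(r_prev * dims[0], -1)
    for k, n in enumerate(dims[:-1]):
        U, S, Vt = np.linalg.svd(M, full_matrices=False)
        r = min(max_rank, len(S))
        cores.append(U[:, :r].reshape(r_prev, n, r))
        M = (S[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(M.reshape(r_prev, dims[-1], 1))
    return cores

# Tensorize a dense layer's 1024 x 1024 weight matrix as a 10-mode tensor
# (4^10 = 1024^2 entries), then compress it in the TT format.
W = np.random.default_rng(0).normal(size=(1024, 1024))
cores = tt_svd(W.reshape([4] * 10), max_rank=8)

dense_params = W.size
tt_params = sum(c.size for c in cores)
print(dense_params, tt_params, dense_params / tt_params)  # ~575x fewer
```

The savings are only useful when the weights are well approximated at low TT-rank; for structured layers this often holds, which is what yields the reported compression with little accuracy loss.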
Short Bios
M. Alex O. Vasilescu (https://web.cs.ucla.edu/~maov) received her education at the Massachusetts Institute of Technology and the University of Toronto. Vasilescu introduced the tensor paradigm to computer vision, computer graphics, and machine learning, and extended the tensor algebraic framework by generalizing concepts from linear algebra. Starting in the early 2000s, she reframed the analysis, recognition, synthesis, and interpretability of sensory data as multilinear tensor factorization problems suitable for mathematically representing cause-and-effect and demonstrably disentangling the causal factors of observable data. The tensor framework is a powerful paradigm whose utility and value have been further underscored by Amnon Shashua's team, which has recently provided theoretical evidence that deep learning is a neural network approximation of multilinear tensor factorization. Vasilescu's face recognition research, known as TensorFaces, has been funded by the TSWG, the Department of Defense's Combating Terrorism Support Program, and by IARPA, the Intelligence Advanced Research Projects Activity. Her work was featured on the cover of Computerworld and in articles in the New York Times, the Washington Times, and elsewhere. MIT's Technology Review magazine named her to its TR100 list of Top 100 Young Innovators, and the National Academy of Sciences co-awarded her a Keck Futures Initiative grant.
Ivan Oseledets (https://faculty.skoltech.ru/people/ivanoseledets) graduated from the Moscow Institute of Physics and Technology in 2006, and received his Candidate of Sciences degree in 2007 and his Doctor of Sciences degree in 2012, both from the Marchuk Institute of Numerical Mathematics of the Russian Academy of Sciences. He joined Skoltech CDISE in 2013. Ivan's research covers a broad range of topics. He proposed a new decomposition of high-dimensional arrays (tensors), the tensor-train decomposition, and developed many efficient algorithms for solving high-dimensional problems. These algorithms are used in different areas of chemistry, biology, data analysis, and machine learning. His current research focuses on the development of new algorithms in machine learning and artificial intelligence, such as the construction of adversarial examples, the theory of generative adversarial networks, and the compression of neural networks. This work has resulted in publications at top computer science conferences such as ICML, NIPS, ICLR, CVPR, RecSys, ACL, and ICDM. Professor Oseledets is an Associate Editor of the SIAM Journal on Mathematics of Data Science, the SIAM Journal on Scientific Computing, and Advances in Computational Mathematics (Springer). He is also an area chair of the ICLR 2020 conference. Ivan Oseledets has received several awards for his research and industrial cooperation, including two gold medals of the Russian Academy of Sciences (for students in 2005 and for young researchers in 2009), the Dynasty Foundation award (2012), the SIAM Outstanding Paper Prize (2018), the Russian President Award for young researchers in science and innovation (2018), and the Ilya Segalovich award for Best PhD Thesis Supervisor (2019).