Table of Links
Abstract and 1. Introduction
-
Some recent trends in theoretical ML
2.1 Deep Learning via continuous-time controlled dynamical system
2.2 Probabilistic modeling and inference in DL
2.3 Deep Learning in non-Euclidean spaces
2.4 Physics Informed ML
-
Kuramoto model
3.1 Kuramoto models from the geometric point of view
3.2 Hyperbolic geometry of Kuramoto ensembles
3.3 Kuramoto models with several globally coupled sub-ensembles
-
Kuramoto models on higher-dimensional manifolds
4.1 Non-Abelian Kuramoto models on Lie groups
4.2 Kuramoto models on spheres
4.3 Kuramoto models on spheres with several globally coupled sub-ensembles
4.4 Kuramoto models as gradient flows
4.5 Consensus algorithms on other manifolds
-
Directional statistics and swarms on manifolds for probabilistic modeling and inference on Riemannian manifolds
5.1 Statistical models over circles and tori
5.2 Statistical models over spheres
5.3 Statistical models over hyperbolic spaces
5.4 Statistical models over orthogonal groups, Grassmannians, homogeneous spaces
-
Swarms on manifolds for DL
6.1 Training swarms on manifolds for supervised ML
6.2 Swarms on manifolds and directional statistics in RL
6.3 Swarms on manifolds and directional statistics for unsupervised ML
6.4 Statistical models for the latent space
6.5 Kuramoto models for learning (coupled) actions of Lie groups
6.6 Grassmannian shallow and deep learning
6.7 Ensembles of coupled oscillators in ML: Beyond Kuramoto models
-
Examples
7.1 Wahba’s problem
7.2 Linked robot’s arm (planar rotations)
7.3 Linked robot’s arm (spatial rotations)
7.4 Embedding multilayer complex networks (Learning coupled actions of Lorentz groups)
-
Conclusion and References
5.2 Statistical models over spheres
Quite a few papers have presented ML algorithms based on statistical models over spheres. In the present subsection we point out some recent applications in all three major fields of ML: supervised learning, unsupervised learning, and RL. The majority of applications have been reported for unsupervised learning problems, such as clustering of angular data based on the cosine metric. The paper [109] contains a brief review of statistical clustering methods on spheres.
The von Mises-Fisher family vMF(κ, µ) is the most popular statistical model for ML algorithms over spheres [44, 102, 103, 104]. The main alternative used so far is the family of Bingham distributions, although a few papers have experimented with options different from both the von Mises-Fisher and Bingham families.
5.2.1 von Mises-Fisher distribution
The von Mises-Fisher distributions are defined by density functions of the form [46]

$$p(x; \kappa, \mu) = C_d(\kappa)\, e^{\kappa \mu^T x}, \quad x \in S^{d-1} \subset \mathbb{R}^d,$$

where µ ∈ S^{d−1} is the mean direction and κ ≥ 0 is the concentration parameter. Here C_d(κ) is the normalization constant, given by

$$C_d(\kappa) = \frac{\kappa^{d/2 - 1}}{(2\pi)^{d/2}\, I_{d/2 - 1}(\kappa)},$$

with I_ν denoting the modified Bessel function of the first kind of order ν. For the important particular case d = 3 (corresponding to the two-dimensional sphere), the above expression for C(κ) simplifies to

$$C_3(\kappa) = \frac{\kappa}{4\pi \sinh\kappa}.$$
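As a minimal illustration (ours, not from the paper), recent SciPy versions (≥ 1.11) provide the von Mises-Fisher model directly; the sketch below samples from vMF(κ, µ) on the two-dimensional sphere and recovers the parameters by maximum likelihood:

```python
# Minimal sketch: sampling from and evaluating a von Mises-Fisher density
# on the 2-sphere (d = 3). Assumes SciPy >= 1.11, which ships
# scipy.stats.vonmises_fisher; illustration only, not the paper's code.
import numpy as np
from scipy.stats import vonmises_fisher

mu = np.array([0.0, 0.0, 1.0])   # mean direction on S^2
kappa = 20.0                     # concentration parameter

vmf = vonmises_fisher(mu, kappa)
samples = vmf.rvs(1000, random_state=0)   # 1000 unit vectors concentrated near mu

# Density at the mean direction; for d = 3 it equals kappa/(4*pi*sinh(kappa)) * exp(kappa)
print(vmf.pdf(mu))

# Maximum-likelihood fit recovers (mu, kappa) from the samples
mu_hat, kappa_hat = vonmises_fisher.fit(samples)
print(mu_hat, kappa_hat)
```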
5.2.2 Spherical Cauchy distribution
The spherical Cauchy family sphC(ρ, µ), with mean direction µ ∈ S^{d−1} and concentration parameter ρ ∈ [0, 1), is defined by density functions (with respect to the uniform probability measure on the sphere) of the form

$$p(x; \rho, \mu) = \left(\frac{1 - \rho^2}{\|x - \rho\mu\|^2}\right)^{d-1}, \quad x \in S^{d-1}. \tag{32}$$
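As a quick numerical check (our illustration, with made-up parameter values), the Poisson-kernel form (32) is already normalized with respect to the uniform probability measure on the sphere, which a Monte Carlo average over uniform samples confirms:

```python
# Sketch: evaluate the spherical Cauchy density (32) on S^2 (d = 3) and
# verify by Monte Carlo that it integrates to 1 against the uniform
# probability measure. Illustration only; parameter names are ours.
import numpy as np

rng = np.random.default_rng(0)

def sph_cauchy_pdf(x, mu, rho, d=3):
    """Density w.r.t. the uniform probability measure on S^{d-1}."""
    return ((1.0 - rho**2) / np.sum((x - rho * mu)**2, axis=-1))**(d - 1)

mu, rho = np.array([0.0, 0.0, 1.0]), 0.7

# Uniform samples on S^2: normalize standard Gaussian vectors
z = rng.standard_normal((200_000, 3))
u = z / np.linalg.norm(z, axis=1, keepdims=True)

print(sph_cauchy_pdf(u, mu, rho).mean())   # ~ 1.0 (normalization check)
```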
5.2.3 Bergman-spherical Cauchy distribution
The Bergman-spherical Cauchy family BsphC(w) is parametrized by a point w in the unit ball of C^m and is defined by density functions (with respect to the uniform probability measure on the sphere) of the form

$$p(x; w) = \left(\frac{1 - \|w\|^2}{|1 - \langle x, w\rangle|^2}\right)^{m}, \quad x \in S^{2m-1} \subset \mathbb{C}^m, \tag{33}$$

where ⟨x, w⟩ denotes the Hermitian inner product in C^m.
To the best of our knowledge, the family (33) has never been studied in directional statistics, nor applied in any field. This family arises as an m-dimensional complex statistical manifold which is invariant with respect to actions of the group K of Bergman isometries of the unit ball. In other words, while sphC(ρ, µ) is invariant with respect to the group Q of isometries of the unit ball in the standard hyperbolic metric, BsphC(w) is an invariant submanifold for actions of the group K of isometries of the unit ball in the Bergman metric. We refer to [85] for geometric and harmonic-analytic details on this mathematically sophisticated topic.
In conclusion, the family BsphC(w) may seem an exotic choice of statistical model for ML problems, but it has significant advantages: it is low-dimensional and satisfies property (P4). We are not able to discuss the advantages and drawbacks of this family compared to sphC(ρ, µ) in specific setups, as this question involves geometric subtleties and requires a deep understanding of isometries of unit balls.
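For concreteness, here is a minimal sketch (ours, under the Poisson-kernel reading of (33) above; the parameter value w is made up) of evaluating the BsphC density on S³ ⊂ C² and checking its normalization by Monte Carlo:

```python
# Sketch: Bergman-spherical Cauchy density (33) on S^3, viewed as the unit
# sphere in C^2 (m = 2), with a Monte Carlo normalization check against the
# uniform probability measure. Illustration under our reading of (33).
import numpy as np

rng = np.random.default_rng(1)

def bsph_cauchy_pdf(x, w, m=2):
    """Density w.r.t. the uniform probability measure on S^{2m-1} in C^m."""
    inner = np.sum(x * np.conj(w), axis=-1)   # Hermitian inner product <x, w>
    return ((1.0 - np.linalg.norm(w)**2) / np.abs(1.0 - inner)**2)**m

w = np.array([0.3 + 0.2j, 0.1 - 0.4j])        # parameter inside the unit ball of C^2

# Uniform samples on S^3 (unit sphere of C^2): normalize complex Gaussian vectors
z = rng.standard_normal((200_000, 2)) + 1j * rng.standard_normal((200_000, 2))
u = z / np.linalg.norm(z, axis=1, keepdims=True)

print(bsph_cauchy_pdf(u, w).mean())           # ~ 1.0 (normalization check)
```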
Remark 5. In mathematics, especially in harmonic analysis and potential theory, the notion of the Poisson kernel plays a central role. Poisson kernels are integral kernels that appear in solutions of Dirichlet boundary value problems.
Functions of the form (32) are Poisson kernels for the hyperbolic Dirichlet problem [85]. On the other hand, functions of the form (33) are Poisson kernels in the Bergman metric. To the best of our knowledge, neither of these families of densities has been applied in ML so far.
Two recent studies [108, 109] applied Poisson kernels in the Euclidean metric to unsupervised learning (clustering on spheres). We emphasize that (32) and (33) are more suitable models when dealing with hyperbolic data.
5.2.4 Bingham distribution
Some recent papers [59, 110] have experimented with the Bingham family Bing(M, Z) for RL problems on spheres and rotation groups. This family is defined by densities of the form

$$p(x; M, Z) = \frac{1}{C(Z)} \exp\left(x^T M Z M^T x\right), \quad x \in S^{d-1}. \tag{34}$$

The parameters of (34) are an orthogonal d × d matrix M and a diagonal d × d matrix Z.
The normalization constant in (34) is given by

$$C(Z) = {}_1F_1\!\left(\tfrac{1}{2};\, \tfrac{d}{2};\, Z\right),$$

where ${}_1F_1(\cdot\,; \cdot\,; \cdot)$ stands for the confluent hypergeometric function of matrix argument.
Notice that (34) contains many parameters. Unlike the two previous families, this family does not possess nice group-theoretic properties. Hence, the choice of this statistical model entails a much more involved parameter estimation.
A characteristic property of Bingham distributions is that they are antipodally symmetric. This property is advantageous when dealing with rotations in three-dimensional space (statistical learning over the group SO(3)), because it naturally resolves the antipodal ambiguity: two antipodal points on S³ (unit quaternions q and −q) correspond to the same 3D rotation, so an antipodally symmetric density is well defined on SO(3).
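To make this concrete, here is a hedged sketch (ours; M and Z below are illustrative, not taken from [59, 110]) that evaluates the Bingham density (34) on S³, estimates C(Z) by Monte Carlo rather than the matrix-argument ₁F₁ series, and checks antipodal symmetry:

```python
# Sketch: Bingham density (34) on S^3 with the normalization constant
# estimated by Monte Carlo over uniform samples (density taken w.r.t. the
# uniform probability measure). Parameters below are illustrative.
import numpy as np

rng = np.random.default_rng(2)
d = 4                                              # S^3: unit quaternions

Z = np.diag([-10.0, -5.0, -1.0, 0.0])              # diagonal concentration matrix
M = np.linalg.qr(rng.standard_normal((d, d)))[0]   # random orthogonal matrix
A = M @ Z @ M.T

def unnormalized(x):
    return np.exp(np.einsum('...i,ij,...j->...', x, A, x))

# Uniform samples on S^3 for the Monte Carlo estimate of C(Z)
z = rng.standard_normal((200_000, d))
u = z / np.linalg.norm(z, axis=1, keepdims=True)
C = unnormalized(u).mean()                         # estimate of C(Z)

def bingham_pdf(x):
    return unnormalized(x) / C

q = np.array([1.0, 0.0, 0.0, 0.0])
print(bingham_pdf(q), bingham_pdf(-q))             # equal values: antipodal symmetry
```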
:::info
Author:
(1) Vladimir Jacimovic, Faculty of Natural Sciences and Mathematics, University of Montenegro, Cetinjski put bb., 81000 Podgorica, Montenegro (vladimirj@ucg.ac.me).
:::
:::info
This paper is available on arXiv under a CC BY 4.0 Deed (Attribution 4.0 International) license.
:::
