Table of Links
Abstract and 1 Introduction
2 Preliminaries
3 Revisiting Normalization
3.1 Revisiting Euclidean Normalization
3.2 Revisiting Existing RBN
4 Riemannian Normalization on Lie Groups
5 LieBN on the Lie Groups of SPD Manifolds and 5.1 Deformed Lie Groups of SPD Manifolds
5.2 LieBN on SPD Manifolds
6 Experiments
6.1 Experimental Results
7 Conclusions, Acknowledgments, and References
APPENDIX CONTENTS
A Notations
B Basic layers in SPDNet and TSMNet
C Statistical Results of Scaling in the LieBN
D LieBN as a Natural Generalization of Euclidean BN
E Domain-specific Momentum LieBN for EEG Classification
F Backpropagation of Matrix Functions
G Additional Details and Experiments of LieBN on SPD Manifolds
H Preliminary Experiments on Rotation Matrices
I Proofs of the Lemmas and Theorems in the Main Paper
6.1 EXPERIMENTAL RESULTS
For each family of LieBN or DSMLieBN, we report two representatives: the standard one, induced by the standard metric (θ = 1), and one induced by a deformed metric with a selected θ. When the standard variant is already saturated, we report only its results.
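As a rough illustration of what the deformation factor does, the sketch below applies the matrix power map P ↦ P^θ to an SPD matrix, one common way power-deformed metrics are realized; θ = 1 leaves the matrix unchanged, while other values reshape its eigenvalue spectrum. The helper name power_deform and the eigendecomposition route are our assumptions, not the paper's exact construction.

```python
import torch

def power_deform(P: torch.Tensor, theta: float) -> torch.Tensor:
    """Illustrative power deformation of an SPD matrix.

    Computes P**theta via an eigendecomposition; theta = 1 returns P
    (the standard metric), while other values deform the geometry.
    `power_deform` is a hypothetical helper, not the authors' code.
    """
    # Symmetric eigendecomposition: P = U diag(w) U^T with w > 0.
    w, U = torch.linalg.eigh(P)
    # Raise the (clamped) eigenvalues to the power theta and rebuild.
    return U @ torch.diag_embed(w.clamp_min(1e-10) ** theta) @ U.transpose(-1, -2)
```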
Application to SPDNet: As SPDNet is the most classical SPD network, we apply our LieBN to SPDNet on the Radar, HDM05, and FPHA datasets. Additionally, we compare our method with SPDNetBN, which applies the SPDBN in Eqs. (7) and (8) to SPDNet. Following Brooks et al. (2019b) and Chen et al. (2023d), we use the architectures {20, 16, 8}, {93, 30}, and {63, 33} for the Radar, HDM05, and FPHA datasets, respectively; a minimal sketch of the first configuration follows below. The 10-fold average results, including the average training time (s/epoch), are summarized in Tab. 4.
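To make the {20, 16, 8} notation concrete, here is a minimal, self-contained sketch of an SPDNet-style stack for the Radar configuration. BiMap, ReEig, and LogEig are simplified stand-ins for the standard SPDNet layers (for instance, BiMap omits the Stiefel constraint on its weight), num_classes is a placeholder, and the comments mark where a LieBN layer would sit; none of this is the authors' released code.

```python
import torch
import torch.nn as nn

def sym_fn(X: torch.Tensor, fn) -> torch.Tensor:
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, U = torch.linalg.eigh(X)
    return U @ torch.diag_embed(fn(w)) @ U.transpose(-1, -2)

class BiMap(nn.Module):
    """Bilinear map X -> W^T X W that reduces SPD dimension. SPDNet
    constrains W to a Stiefel manifold; this sketch only initializes
    W orthonormally and leaves it unconstrained."""
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.W = nn.Parameter(torch.linalg.qr(torch.randn(d_in, d_out))[0])

    def forward(self, X):
        return self.W.T @ X @ self.W

class ReEig(nn.Module):
    """Eigenvalue rectification, the SPD analogue of ReLU."""
    def forward(self, X):
        return sym_fn(X, lambda w: w.clamp_min(1e-4))

class LogEig(nn.Module):
    """Matrix logarithm, flattening SPD features into Euclidean space."""
    def forward(self, X):
        return sym_fn(X, lambda w: w.clamp_min(1e-10).log())

num_classes = 3  # hypothetical; set to the dataset's number of classes

# Radar configuration {20, 16, 8}: 20x20 SPD inputs -> 16x16 -> 8x8.
# A LieBN layer would be inserted after each BiMap (see the LEM-based
# sketch further below for what such a layer computes).
model = nn.Sequential(
    BiMap(20, 16), ReEig(),   # + LieBN(16) in the full model
    BiMap(16, 8), ReEig(),    # + LieBN(8) in the full model
    LogEig(), nn.Flatten(),
    nn.Linear(8 * 8, num_classes),
)
```

The {93, 30} and {63, 33} architectures follow the same pattern with a single BiMap stage.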
We make three key observations, concerning the choice of metric, the effect of deformation, and training efficiency.

The choice of metrics: The metric that yields the most effective LieBN layer differs across datasets. Specifically, the optimal LieBN layers on the three datasets are the ones induced by AIM-(1), LCM-(0.5), and AIM-(1.5), which improve the performance of SPDNet by 2.22%, 11.71%, and 4.8%, respectively. Additionally, although the LCM-based LieBN performs worse than the other LieBN variants on the Radar and FPHA datasets, it performs best on HDM05. These observations highlight the advantage of the generality of our LieBN approach.

The effect of deformation: Deformation patterns also vary across datasets. Firstly, the standard AIM is already saturated on the Radar dataset. Secondly, as indicated in Tab. 4, an appropriate deformation factor θ can further enhance the performance of LieBN. Notably, even though the LieBN induced by LCM-(1) impedes the learning of SPDNet on the FPHA dataset, it improves performance once an appropriate deformation factor θ is applied. These findings highlight the utility of deforming the geometry of the SPD manifold.

Efficiency: Our LieBN achieves comparable or even better efficiency than SPDNetBN, even though, unlike SPDNetBN, it additionally normalizes the variance. In particular, the LieBN induced by the standard LEM or LCM is more efficient than SPDNetBN, and even with deformation, the LCM-based LieBN remains comparable to SPDNetBN in efficiency. This can be attributed to the fast and simple computations required by LCM and LEM.
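To see why the LEM-based variant is cheap, recall that under LEM the SPD matrices form an abelian Lie group via P ⊙ Q = exp(log P + log Q), so batch normalization essentially reduces to ordinary BN statistics in the matrix-log domain. The sketch below illustrates that reduction under this assumption; liebn_lem_sketch is a hypothetical helper and omits the running statistics, biasing, and learnable dispersion of the actual layer.

```python
import torch

def liebn_lem_sketch(batch: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Minimal sketch of an LEM-style BN step on a batch of SPD matrices
    of shape (N, n, n): log-map, Euclidean BN statistics, exp-map."""
    # Map each SPD matrix to its matrix logarithm (a symmetric matrix).
    w, U = torch.linalg.eigh(batch)
    logs = U @ torch.diag_embed(w.clamp_min(1e-10).log()) @ U.transpose(-1, -2)

    # Euclidean centering and scaling in the log domain; the dispersion
    # is summarized by a single scalar, mirroring a Frechet variance.
    mean = logs.mean(dim=0, keepdim=True)
    var = logs.var(dim=0, unbiased=False).mean()
    normed = (logs - mean) / torch.sqrt(var + eps)

    # Map back to the SPD manifold with the matrix exponential.
    w2, U2 = torch.linalg.eigh(normed)
    return U2 @ torch.diag_embed(w2.exp()) @ U2.transpose(-1, -2)
```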
Application to EEG classification: We evaluate our method on the TSMNet architecture for two tasks: inter-session and inter-subject EEG classification. Following Kobler et al. (2022a), we adopt the architecture {40, 20}. TSMNet+DSMLieBN-AIM obtains the highest average scores of 55.10% and 53.97% on inter-session and inter-subject transfer learning, improving over SPDDSMBN by 0.98% and 3.87%, respectively. In the inter-subject scenario, the efficiency advantage of our LieBN over SPDDSMBN is more pronounced. Specifically, both the LEM- and LCM-based DSMLieBN achieve similar or better performance than SPDDSMBN while requiring considerably less training time.
For example, DSMLieBN-LCM-(1) achieves better results with only half the training time of SPDDSMBN on inter-subject tasks. Interestingly, under the standard AIM, the sole difference between SPDDSMBN and our DSMLieBN lies in how centering and biasing are performed: SPDDSMBN uses the inverse square root and square root, whereas the AIM-induced LieBN uses the more efficient Cholesky decomposition. As a result, the DSMLieBN induced by the standard AIM is more efficient than SPDDSMBN, particularly on the inter-subject task.
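The cost difference comes from how the batch mean M is inverted away. The hypothetical helpers below contrast the two congruence maps: an inverse-square-root map M^{-1/2} P M^{-1/2}, which requires an eigendecomposition of M, versus a Cholesky-style map L^{-1} P L^{-T} with M = L L^T, which requires only a triangular factorization. Both function names are ours, and the sketch omits the biasing step.

```python
import torch

def center_inv_sqrt(P: torch.Tensor, M: torch.Tensor) -> torch.Tensor:
    """Centering in the style of SPDDSMBN: P -> M^{-1/2} P M^{-1/2}.
    Computing the inverse square root needs an eigendecomposition."""
    w, U = torch.linalg.eigh(M)
    M_inv_sqrt = U @ torch.diag_embed(w.clamp_min(1e-10).rsqrt()) @ U.transpose(-1, -2)
    return M_inv_sqrt @ P @ M_inv_sqrt

def center_cholesky(P: torch.Tensor, M: torch.Tensor) -> torch.Tensor:
    """Centering in the spirit of the AIM-induced LieBN:
    P -> L^{-1} P L^{-T}, where M = L L^T is the Cholesky factorization.
    A Cholesky factorization plus triangular solves is typically much
    cheaper than an eigendecomposition, matching the efficiency gap
    reported above."""
    L = torch.linalg.cholesky(M)
    I = torch.eye(M.shape[-1], dtype=M.dtype)
    L_inv = torch.linalg.solve_triangular(L, I, upper=False)
    return L_inv @ P @ L_inv.transpose(-1, -2)
```

Both maps send the batch mean itself to the identity matrix; they differ only in which factorization of M they rely on.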
Authors:
(1) Ziheng Chen, University of Trento;
(2) Yue Song, University of Trento (corresponding author);
(3) Yunmei Liu, University of Louisville;
(4) Nicu Sebe, University of Trento.