Can AI Concepts Truly Generalize Across Different Domains? Experiments Reveal Answers

News Room | Published 8 April 2025

Table of Links

Abstract and 1 Introduction

2 Related Work

3 Methodology and 3.1 Representative Concept Extraction

3.2 Self-supervised Contrastive Concept Learning

3.3 Prototype-based Concept Grounding

3.4 End-to-end Composite Training

4 Experiments and 4.1 Datasets and Networks

4.2 Hyperparameter Settings

4.3 Evaluation Metrics and 4.4 Generalization Results

4.5 Concept Fidelity and 4.6 Qualitative Visualization

5 Conclusion and References

Appendix

4 Experiments

Table 1: A summary of the salient features of our method as compared to the baselines considered. The column ‘Explainable’ shows whether the method is inherently explainable without any post-hoc methodologies, ‘Prototypes’ indicates whether a method can explain predictions by selecting prototypes from the training set, ‘Interoperability’ shows whether learned concepts maintain consistency across domains, and ‘Fidelity’ indicates whether the method maintains intra-class consistency among learned concepts.

4.1 Datasets and Networks

We consider four task settings widely used for domain adaptation. The task in each setting is classification.

• Digits: This setting combines MNIST and USPS [LeCun et al., 1998; Hull, 1994], which contain handwritten digit images, with the Street View House Numbers dataset (SVHN) [Netzer et al., 2011], which contains cropped photos of house numbers (see the loading sketch after this list).

• VisDA-2017 [Peng et al., 2017]: contains 12 classes of objects sampled from the Real (R) and synthetic 3D domains.

• DomainNet [Peng et al., 2019]: contains 126 classes of objects (clocks, bags, etc.) sampled from 4 domains – Real (R), Clipart (C), Painting (P), and Sketch (S).

• Office-Home [Venkateswara et al., 2017]: contains 65 classes of office objects (calculators, staplers, etc.) sampled from 4 different domains – Art (A), Clipart (C), Product (P), and Real (R).
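For the Digits setting, all three datasets ship with torchvision; a minimal loading sketch is shown below (the root path, resolution, and transforms are illustrative assumptions, not the authors' exact pipeline):

```python
import torchvision.datasets as datasets
import torchvision.transforms as T

# Resize every digit domain to a common 32x32, 3-channel resolution so a
# single backbone can consume all three; the paper's exact preprocessing
# may differ.
digit_transform = T.Compose([
    T.Resize((32, 32)),
    T.Grayscale(num_output_channels=3),  # MNIST/USPS are single-channel
    T.ToTensor(),
])

mnist = datasets.MNIST(root="data", train=True, download=True, transform=digit_transform)
usps = datasets.USPS(root="data", train=True, download=True, transform=digit_transform)
svhn = datasets.SVHN(root="data", split="train", download=True,
                     transform=T.Compose([T.Resize((32, 32)), T.ToTensor()]))
```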

Network Choice: For Digits, we utilize a modified version of LeNet [LeCun et al., 1998], which consists of 3 convolutional layers with ReLU activation functions and a dropout probability of 0.1 during training. For all other datasets, we utilize a ResNet34 architecture similar to [Yu and Lin, 2023] and initialize it with weights pre-trained on ImageNet-1k. For details, refer to the Appendix.
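A rough PyTorch sketch of these backbones follows; the paper specifies only the depth (3 convolutional layers), the ReLU activations, and the dropout probability of 0.1, so the channel widths, kernel sizes, and classifier head below are assumptions:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet34, ResNet34_Weights

class LeNetVariant(nn.Module):
    """Modified LeNet for the Digits tasks: 3 conv layers, ReLU
    activations, dropout 0.1. Layer widths are illustrative."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3), nn.ReLU(),
            nn.Dropout(p=0.1),
        )
        self.classifier = nn.Sequential(nn.Flatten(), nn.LazyLinear(num_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# For the object datasets: ResNet34 initialized with ImageNet-1k weights.
object_backbone = resnet34(weights=ResNet34_Weights.IMAGENET1K_V1)
```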

Figure 4: Schematic overview of the proposed SimCLR transformations for the Office-Home dataset from the Product (P) domain. Green arrows depict maximizing similarity while red arrows depict minimizing similarity in concept space. Transformation sets T1+ and T2+ comprise images transformed from the chair class, while T1− and T2− consist of images transformed from non-chair classes.

Baselines. We start by comparing against standard non-explainable NN architectures – the S+T setting as described in [Yu and Lin, 2023]. Next, we compare our proposed method against 5 different self-explaining approaches. As none of these approaches specifically evaluate concept generalization in the form of domain adaptation, we replicate all of them. SENN and DiSENN utilize a robustness loss calculated on the Jacobians of the relevance networks, with DiSENN utilizing a VAE as the concept extractor. BotCL [Wang, 2023] also utilizes a contrastive loss but uses it for position grounding. Similar to BotCL, Ante-hoc concept learning [Sarkar et al., 2022] uses a contrastive loss on datasets with known concepts, hence we do not explicitly compare against it. Lastly, UnsupervisedCBM [Sawada, 2022b] uses a mixture of known and unknown concepts and requires a small set of known concepts; for our purposes, we provide the one-hot class labels as known concepts in addition to the unknown ones. A visual summary of the salient features of each baseline is given in Table 1.

4.2 Hyperparameter Settings

RCE Framework: We utilize the Mean Squared Error (MSE) as the reconstruction loss and set the sparsity regularizer λ to 1e-5 for all datasets. The weights are set to ω1 = ω2 = 0.5 for the digit tasks and to ω1 = 0.8, ω2 = 0.2 for the object tasks.
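As a hedged sketch of how these settings might enter the composite objective (the exact decomposition is defined in Section 3; the assumption below is that ω1 weights the task loss, ω2 the MSE reconstruction loss, and λ an L1 sparsity penalty on concept activations):

```python
import torch
import torch.nn.functional as F

def rce_objective(logits: torch.Tensor, labels: torch.Tensor,
                  x: torch.Tensor, x_recon: torch.Tensor,
                  concepts: torch.Tensor,
                  omega1: float = 0.8, omega2: float = 0.2,
                  lam: float = 1e-5) -> torch.Tensor:
    """Hypothetical composite RCE objective. The split of omega1/omega2
    across the task and reconstruction terms is an assumption; the paper
    sets omega1 = omega2 = 0.5 for digits and 0.8/0.2 for objects."""
    task_loss = F.cross_entropy(logits, labels)   # classification term
    recon_loss = F.mse_loss(x_recon, x)           # MSE reconstruction
    sparsity = concepts.abs().mean()              # L1 sparsity on concepts
    return omega1 * task_loss + omega2 * recon_loss + lam * sparsity
```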

Learning: We utilize the lightly[1] library for implementing the SimCLR transformations [Chen, 2020]. We set the temperature parameter (τ) to 0.5 by default [Xu et al., 2019] for all datasets, and the hyperparameters for each transformation are the SimCLR defaults. The training objective is the contrastive cross-entropy (NT-Xent) loss [Chen, 2020]. Figure 4 depicts an example of the various transformations along with the adjudged positive and negative transformations. For the training procedure, we utilize the SGD optimizer with momentum set to 0.9 and a cosine decay scheduler with an initial learning rate of 0.01. We train each dataset for 10,000 iterations with early stopping. The regularization parameters λ1 and λ2 are both set to 0.1. For Digits, β is set to 1, while it is set to 0.5 for the object tasks. For further details, refer to the Appendix.
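A minimal sketch of this contrastive training setup using lightly's NT-Xent implementation (the encoder and the random tensors standing in for SimCLR-transformed views are placeholders; the full method additionally includes the RCE and grounding terms):

```python
import torch
import torch.nn as nn
from lightly.loss import NTXentLoss

# Placeholder encoder; the paper uses the LeNet/ResNet34 backbones with a
# concept head (see Section 3).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))

criterion = NTXentLoss(temperature=0.5)  # NT-Xent with tau = 0.5
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10_000)

# One illustrative step on a pair of augmented views of the same batch.
x1, x2 = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
z1, z2 = model(x1), model(x2)   # embeddings of the two views
loss = criterion(z1, z2)        # contrastive cross-entropy (NT-Xent)
optimizer.zero_grad()
loss.backward()
optimizer.step()
scheduler.step()
```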

Table 2: Domain generalization performance for the Office-Home dataset with domains Art (A), Clipart (C), Product (P), and Real (R).

Table 3: Domain generalization performance for [Left] the DomainNet dataset with domains Real (R), Clipart (C), Painting (P), and Sketch (S), and [Right] the VisDA dataset with domains Real (R) and 3-Dimensional visualizations (3D).

Table 4: Domain generalization performance for the Digit datasets with domains MNIST (M), USPS (U), and SVHN (S). In addition, we report the results of multiple-source domain adaptation to the target domains in the Appendix.


[1] https://github.com/lightly-ai/lightly
