Domain-Specific Batch Norm
| Paper Title | Domain-Specific Batch Normalization for Unsupervised Domain Adaptation |
| --- | --- |
| Authors | Chang et al. |
| Date | 2019-05 |
| Link | https://cvlab.postech.ac.kr/lab/papers/CVPR19_domain_adap.pdf |
Paper Summary
This paper introduces a novel unsupervised domain adaptation framework called Domain-Specific Batch Normalization (DSBN) for deep neural networks. The authors propose to adapt to both source and target domains by specializing batch normalization layers for each domain while sharing all other model parameters. The framework consists of two stages: in the first stage, pseudo-labels are estimated for the target domain using an external unsupervised domain adaptation algorithm; in the second stage, the final model is trained with a multi-task classification loss. The authors demonstrate that DSBN can be easily incorporated into existing domain adaptation techniques and evaluate its performance on multiple benchmark datasets.
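To make the core idea concrete, here is a minimal PyTorch sketch of a domain-specific BN layer: one BatchNorm2d branch per domain, with every other layer of the network shared. The module design (a `domain_idx` attribute plus a `set_domain` helper, assuming each mini-batch comes from a single domain) is my own illustration, not the authors' reference implementation.

```python
import torch.nn as nn

class DomainSpecificBatchNorm2d(nn.Module):
    """One BatchNorm2d branch per domain; all other network parameters
    are shared elsewhere. Sketch only, not the authors' reference code."""

    def __init__(self, num_features, num_domains=2):
        super().__init__()
        self.bns = nn.ModuleList(
            [nn.BatchNorm2d(num_features) for _ in range(num_domains)]
        )
        self.domain_idx = 0  # which branch to use; set per mini-batch

    def forward(self, x):
        # Route the batch through the BN branch of the active domain.
        return self.bns[self.domain_idx](x)

def set_domain(model, domain_idx):
    """Select the active domain (0 = source, 1 = target) in every DSBN layer."""
    for m in model.modules():
        if isinstance(m, DomainSpecificBatchNorm2d):
            m.domain_idx = domain_idx
```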
Paper Review
Short Summary
The paper presents an unsupervised domain adaptation approach based on domain-specific batch normalization (DSBN). DSBN consists of two batch norm modules that learn to normalize the feature distributions of the source and target domains separately. The authors propose a two-stage training process: pseudo-labels for the target domain are estimated by a base model in the first stage, and the model is then trained with real labels (source) plus pseudo-labels (target) in the second stage. They demonstrate the effectiveness of this approach against contemporary techniques on multiple benchmark datasets.
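As a rough sketch of the second stage under the module design above (function and argument names are mine), a single training step could look like this:

```python
import torch.nn.functional as F

def second_stage_step(model, optimizer, src_x, src_y, tgt_x, tgt_pseudo_y):
    """Sum of the source classification loss (real labels) and the target
    classification loss (pseudo-labels estimated in the first stage)."""
    optimizer.zero_grad()
    set_domain(model, 0)              # source branch of every DSBN layer
    src_loss = F.cross_entropy(model(src_x), src_y)
    set_domain(model, 1)              # target branch
    tgt_loss = F.cross_entropy(model(tgt_x), tgt_pseudo_y)
    loss = src_loss + tgt_loss
    loss.backward()
    optimizer.step()
    return loss.item()
```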
Strengths
- A simple approach that turns out to work very well. The underlying assumption is intuitive: batch norm standardizes feature distributions, so different domains should have different normalization parameters.
- DSBN can be applied to any network architecture that uses batch norm (see the conversion sketch after this list).
- Comprehensive evaluation against state-of-the-art results, together with ablation studies that validate DSBN's effectiveness.
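To illustrate the drop-in nature of the technique, here is a hypothetical helper that recursively swaps every BatchNorm2d in a network for the DomainSpecificBatchNorm2d sketched earlier; warm-starting both branches from the pretrained BN weights is my assumption, not necessarily the authors' procedure.

```python
import torch.nn as nn

def convert_bn_to_dsbn(module, num_domains=2):
    """Recursively replace each BatchNorm2d with a DSBN layer (defined above)."""
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            dsbn = DomainSpecificBatchNorm2d(child.num_features, num_domains)
            for bn in dsbn.bns:
                bn.load_state_dict(child.state_dict())  # warm-start every branch
            setattr(module, name, dsbn)
        else:
            convert_bn_to_dsbn(child, num_domains)
    return module
```

Applied to a pretrained backbone (e.g. a torchvision ResNet) before the two-stage training, this would yield a DSBN-equipped model without touching any other layer.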
Weaknesses
- Insufficient theoretical justification for developing DSBN. The domain-specific statistics and affine parameters amount to just a few numbers per channel (see the formula after this list); they arguably cannot represent all of the differences between the two domains' distributions.
- Given how simple the idea is, the experiments could have explored more variants of it rather than stopping at domain-specific mean and variance.
- Lacks a discussion of the implementation challenges (efficient implementation, computational and memory overhead) of DSBN.
- The contribution of the semi-supervised learning component (training on pseudo-labels) is unclear and is not assessed separately from DSBN.
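For reference, per-domain normalization in DSBN is ordinary batch norm with domain-indexed statistics and affine parameters (notation mine): for a sample $x$ from domain $d$, each channel is transformed as

$$\mathrm{DSBN}_d(x) = \gamma_d \cdot \frac{x - \mu_d}{\sqrt{\sigma_d^2 + \epsilon}} + \beta_d,$$

where $\mu_d$ and $\sigma_d^2$ are batch statistics computed from domain-$d$ samples only, and $\gamma_d$, $\beta_d$ are learned affine parameters for domain $d$.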
Reflection
This paper describes a technique that is largely orthogonal to many other approaches. Much the same could be said about LoRA (Low-Rank Adaptation); I wonder whether anyone has tried combining the two approaches into one.
Most interesting thought/idea from reading this paper
Can we estimate the target batch norm parameters when we have a small number of labeled target data points? What about expanding DSBN to more than two domain-specific modules? There are many possibilities.