[ Archived Post ] Generalizing to Unseen Domains via Adversarial Data Augmentation

Please note that this post is for my own educational purpose.

Create a model → that can do well on data from other domains as well → via an iterative procedure. (provides good results).

Data acquisition is costly → how can we create more data? → since we still want a model that can perform well on many different distributions. (self-driving cars → need this kind of system → closely related to domain adaptation, though here the target domains are never seen during training → domain generalization). (having some prior can be good → but also can lead to limited results). (this paper → only considers the case where the training data comes from a single domain).

The high-level objective function can be seen above. (want a model that can perform well under shifts in semantic space). This method → is an iterative data augmentation method. (can also be understood as a regularization term).
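In symbols (notation reconstructed from the paper's setup, so treat this as a sketch: P_0 is the source distribution, D a distance between distributions measured in semantic space, ρ a budget on the allowed shift), the worst-case objective referred to above is:

```latex
\min_{\theta} \; \sup_{P \,:\, D(P, P_0) \le \rho} \; \mathbb{E}_{(X,Y) \sim P}\big[\ell(\theta; (X, Y))\big]
```

i.e., train θ to do well not just on P_0 but on every distribution within distance ρ of it.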

Adversarial training → is very closely related to this kind of research. (defense against adversarial attacks is related to this area of research, but here the perturbations live in semantic space). (some methods → are limited since they need to be exposed to the adversarial distribution during training).

The whole method is built on the Wasserstein distance → measured in semantic space → on top of the softmax loss → with a log added. (earth mover's distance in the semantic space → very interesting way of measuring shift). (first defining the distance in the semantic space is important → then working with the Lagrangian dual form).
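A hedged sketch of that Lagrangian (penalty) form, with notation assumed from the paper's setup: for a penalty weight γ > 0, the constrained sup over distributions relaxes into a per-example maximization, where c_θ is the transport cost measured in the learned feature (semantic) space:

```latex
\phi_{\gamma}\big(\theta; (x_0, y_0)\big) \;=\; \sup_{x} \Big\{ \ell\big(\theta; (x, y_0)\big) \;-\; \gamma \, c_{\theta}\big((x, y_0), (x_0, y_0)\big) \Big\}
```

The adversarially augmented data points are (approximate) maximizers of this inner problem.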

The full algorithm can be seen above. (wow, there is a solid mathematical theory behind this method → very impressive).
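The algorithm alternates minimization over θ with an inner maximization phase that generates adversarial examples and appends them to the training set. A minimal NumPy sketch under simplifying assumptions (a linear softmax classifier, squared Euclidean distance on the input standing in for the paper's semantic-space cost; `gamma`, `lr`, and `steps` are illustrative values, not the paper's):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def loss_and_input_grad(W, x, y):
    """Cross-entropy loss of a linear classifier and its gradient w.r.t. the input x."""
    p = softmax(x @ W)                       # class probabilities, shape (K,)
    onehot = np.eye(W.shape[1])[y]
    loss = -np.log(p[y] + 1e-12)
    grad_x = W @ (p - onehot)                # d loss / d x, shape (d,)
    return loss, grad_x

def adversarial_example(W, x0, y, gamma=0.1, lr=0.05, steps=30):
    """Inner maximization: gradient ascent on  loss(W; x, y) - gamma * ||x - x0||^2."""
    x = x0.copy()
    for _ in range(steps):
        _, g = loss_and_input_grad(W, x, y)
        x = x + lr * (g - 2.0 * gamma * (x - x0))   # ascent step on penalized objective
    return x

def augment_dataset(W, X, Y, **kw):
    """One augmentation phase: append an adversarial copy of each training point."""
    X_adv = np.stack([adversarial_example(W, x, y, **kw) for x, y in zip(X, Y)])
    return np.vstack([X, X_adv]), np.concatenate([Y, Y])
```

In the full method these augmentation phases alternate with several minimization steps of θ on the enlarged dataset; only the maximization phase is sketched here.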

Theoretical understanding of the augmentation steps → the data augmentation can be thought of as a regularization method. (very interesting). (a data-dependent regularization method). (a Tikhonov-regularized Newton step).
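One way to see the Tikhonov connection (a sketch, not the paper's exact statement, assuming the transport cost is the squared Euclidean distance with γ absorbing constant factors): the first-order condition of the inner maximization is ∇ℓ(x*) = γ(x* − x_0), and a second-order expansion of ∇ℓ around x_0 gives

```latex
x^{\star} \;\approx\; x_0 \;+\; \big(\gamma I - \nabla_x^2 \,\ell(\theta; (x_0, y_0))\big)^{-1} \, \nabla_x \,\ell(\theta; (x_0, y_0))
```

which is a Newton-type step whose curvature is damped by the γI term — hence a Tikhonov-regularized Newton step, and hence the data-dependent regularization view.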

So fucking sexy.

The target domains are unknown during training time → the authors apply this method on digit datasets → as well as semantic scene segmentation. (Adam was used → and comparisons were made against other regularization methods).

Trained on MNIST → however, tested on SVHN → the authors' method gives better generalization power.

When models are tested on unseen domains → the model still gives good results. (even on different datasets as well).

The authors were able to create a framework → that performs data augmentation to generalize to unseen domains → this method does not need to see the target domain.

https://jaedukseo.me I love to make my own notes my guy, let's get LIT with KNOWLEDGE in my GARAGE
