[Paper Summary] ArcFace: Additive Angular Margin Loss for Deep Face Recognition

Cheng-Han Lee (Steven)
3 min read · Jun 5, 2018

Paper Information

CVPR (Oral) 2019, Jiankang Deng et al.

Contributions

  1. Data : The authors refine the largest publicly available training dataset, MS-Celeb-1M, both automatically and manually.
  2. Network : Using VGG2 as the training data, the authors run extensive comparative experiments on the convolutional network settings and report the verification accuracy on LFW, CFP and AgeDB.
  3. Loss : The authors propose a new loss function, additive angular margin (ArcFace), to learn highly discriminative features for robust face recognition.
  4. Performance : The proposed ArcFace achieves state-of-the-art results on the MegaFace Challenge, which is the largest public face benchmark with one million faces for recognition.

From Softmax to ArcFace

The most widely used classification loss function is the Softmax loss.
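For reference, the Softmax loss can be written as below. The notation is an assumption of this sketch rather than something fixed by the text above: N is the batch size, n the number of classes, x_i the deep feature of sample i with label y_i, W_j the j-th column of the last-layer weight matrix, and b_j the bias.

```latex
L_{\text{softmax}} = -\frac{1}{N}\sum_{i=1}^{N}
  \log\frac{e^{W_{y_i}^{T}x_i + b_{y_i}}}{\sum_{j=1}^{n} e^{W_j^{T}x_i + b_j}}
```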

For simplicity, the authors fix the bias b_j = 0. The target logit is then rewritten as W_j^T x_i = ‖W_j‖ ‖x_i‖ cos θ_j, where θ_j is the angle between the weight W_j and the feature x_i.

The authors fix the norm of W_j to 1 by L2 normalization, which makes the predictions depend only on the angle between the feature vector and the weight.
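The effect of normalizing the weights can be checked numerically. The sketch below is a minimal NumPy illustration; the function name `cosine_logits` and the toy shapes are made up for this example. With zero bias and unit-norm class weights, each logit equals ‖x‖ cos θ_j, so rescaling the weights changes nothing and rescaling the feature leaves the predicted class unchanged.

```python
import numpy as np

def cosine_logits(x, W):
    """Logits with zero bias and L2-normalized class weights.

    x: feature vector of shape (d,)
    W: weight matrix of shape (d, n), one column per class
    Returns (logits, cos_theta), both of shape (n,).
    """
    # Normalize every class weight (column) to unit L2 norm.
    W_unit = W / np.linalg.norm(W, axis=0, keepdims=True)
    # With ||W_j|| = 1 the logit is ||x|| * cos(theta_j).
    logits = W_unit.T @ x
    cos_theta = logits / np.linalg.norm(x)
    return logits, cos_theta

# Toy data (hypothetical sizes: 8-dimensional feature, 5 classes).
rng = np.random.default_rng(0)
x = rng.normal(size=8)
W = rng.normal(size=(8, 5))
logits, cos_theta = cosine_logits(x, W)
```

Since ‖x‖ is shared by all classes, the arg-max over the logits is decided by the angles alone.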

In SphereFace, the angular margin m is introduced by multiplying the angle, so the target logit uses cos(mθ) instead of cos θ.
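A sketch of the resulting loss, assuming unit-norm weights and zero bias. The symbols are assumptions of this summary: N is the batch size, y_i the label of sample i, and θ_j the angle between x_i and W_j. The multiplicative margin only makes sense while cos(mθ) is still decreasing, hence the range restriction on the target angle.

```latex
L_{\text{SphereFace}} = -\frac{1}{N}\sum_{i=1}^{N}
  \log\frac{e^{\|x_i\|\cos(m\theta_{y_i})}}
           {e^{\|x_i\|\cos(m\theta_{y_i})} + \sum_{j\neq y_i} e^{\|x_i\|\cos\theta_j}},
  \qquad \theta_{y_i}\in\left[0,\tfrac{\pi}{m}\right]
```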

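To remove the restriction that θ must lie in [0, π/m], cos(mθ) is replaced by a piecewise monotonic function ψ(θ), defined as ψ(θ) = (−1)^k cos(mθ) − 2k for θ ∈ [kπ/m, (k+1)π/m], k ∈ [0, m − 1]. SphereFace then uses ψ(θ) in place of cos(mθ) for the target logit.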
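The behaviour of ψ(θ) is easy to verify numerically. The sketch below assumes the usual piecewise definition from the SphereFace formulation (stated in the docstring). The function name `psi` and the choice m = 4 are illustrative only.

```python
import numpy as np

def psi(theta, m=4):
    """Piecewise extension of cos(m * theta) to the whole range [0, pi].

    psi(theta) = (-1)^k * cos(m * theta) - 2k
    for theta in [k*pi/m, (k+1)*pi/m], k = 0, ..., m-1.
    """
    theta = np.asarray(theta, dtype=float)
    # Index of the interval containing theta (clamped so that theta = pi maps to k = m-1).
    k = np.minimum(np.floor(theta * m / np.pi), m - 1)
    return (-1.0) ** k * np.cos(m * theta) - 2.0 * k

# Evaluate on a dense grid over [0, pi].
thetas = np.linspace(0.0, np.pi, 1001)
values = psi(thetas, m=4)
```

On the first interval [0, π/m] the function coincides with cos(mθ); beyond it, each piece is shifted down so that the curve keeps decreasing all the way to θ = π.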

If feature normalization is also applied to SphereFace, we get the feature-normalized SphereFace, denoted SphereFace-FNorm, in which the feature norm ‖x_i‖ is replaced by a fixed scale s.
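A sketch of the SphereFace-FNorm loss under these assumptions: the features are L2-normalized and rescaled to a fixed norm s, ψ(θ) is the piecewise monotonic replacement for cos(mθ), N is the batch size and y_i the label of sample i. The notation is chosen for this summary.

```latex
L_{\text{SphereFace-FNorm}} = -\frac{1}{N}\sum_{i=1}^{N}
  \log\frac{e^{s\,\psi(\theta_{y_i})}}
           {e^{s\,\psi(\theta_{y_i})} + \sum_{j\neq y_i} e^{s\cos\theta_j}}
```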

When the angular margin m is instead moved outside cos θ, the target logit becomes s(cos θ − m), which gives the cosine margin loss function.
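Written out in full, with the notation assumed for this summary (N the batch size, s the feature scale, y_i the label of sample i, θ_j the angle between the normalized feature and the normalized weight W_j), the cosine margin loss takes the form:

```latex
L_{\text{cos}} = -\frac{1}{N}\sum_{i=1}^{N}
  \log\frac{e^{s(\cos\theta_{y_i} - m)}}
           {e^{s(\cos\theta_{y_i} - m)} + \sum_{j\neq y_i} e^{s\cos\theta_j}}
```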

The authors add an angular margin m inside cos θ. Since cos(θ + m) is lower than cos θ when θ ∈ [0, π − m], the constraint is more stringent for classification. The target logit of the proposed ArcFace is s·cos(θ + m).
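The ArcFace loss can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' training code: the function name `arcface_loss`, the toy shapes, and the default values s = 64 and m = 0.5 are choices made for this example, and the sketch does not treat the case θ + m > π specially, as practical implementations usually do.

```python
import numpy as np

def arcface_loss(features, weights, labels, s=64.0, m=0.5):
    """Mean additive angular margin loss for one batch.

    features: (N, d) array of deep features
    weights:  (d, n) array, one column per class
    labels:   (N,) integer class indices
    """
    labels = np.asarray(labels)
    # L2-normalize features (rows) and class weights (columns).
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    W = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos_theta = np.clip(x @ W, -1.0, 1.0)            # shape (N, n)

    rows = np.arange(len(labels))
    theta_y = np.arccos(cos_theta[rows, labels])     # target angles

    # Scale all logits by s; add the margin m to the target angle only.
    logits = s * cos_theta
    logits[rows, labels] = s * np.cos(theta_y + m)

    # Numerically stable log-softmax followed by cross-entropy.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[rows, labels].mean()

# Toy batch (hypothetical sizes: 6 samples, 16-dimensional features, 4 classes).
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 16))
weights = rng.normal(size=(16, 4))
labels = rng.integers(0, 4, size=6)
loss = arcface_loss(features, weights, labels)
```

Setting m = 0 recovers a plain softmax over the scaled cosines, which makes it easy to see that the margin only ever lowers the target logit and therefore raises the loss for the same features.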

Experiments

1. For network setting

Verification accuracy (%) under different input settings (Softmax@VGG2).
Verification accuracy (%) under different output settings (Softmax@VGG2).
Verification accuracy (%) comparison between the original residual unit and the improved residual unit (Softmax@VGG2).

2. For loss setting

Verification performance (%) of different weight decay (WD) values (SE-LResNet50E-IR, Softmax@VGG2).
Verification performance (%) of ArcFace with different angular margins m (LMobileNetE, ArcFace@MS1M).
Verification performance (%) for different loss functions (LResNet100E-IR@MS1M).
