Generalization
Improving the model's generalization capacity is something we have long pursued. There are two main ways of improvement: …
Aug 9, 2020
Do we really have to decrease the training-set loss to zero?
Obviously not. Generally speaking, we use the training set to train the model, but performance is evaluated on the validation set. When…
Aug 8, 2020
The multi-head attention-based transformer models are quite dominant nowadays.
Bottleneck of attention
Aug 2, 2020
It’s never easy to explain a deep learning algorithm.
The time and space complexity of the transformer is O(n^2), which is obviously not optimal. Many methods have aimed to improve this point…
Jul 30, 2020
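Where that O(n^2) comes from: the attention score matrix Q·Kᵀ has one entry per pair of tokens, so a sequence of length n produces an n × n matrix. A minimal NumPy sketch (the function name and dimensions are illustrative, not from the post):

```python
import numpy as np

def attention_score_size(n, d=64):
    """Return the number of entries in the attention score matrix
    for a sequence of n tokens with d-dimensional keys/queries."""
    q = np.random.randn(n, d)
    k = np.random.randn(n, d)
    scores = q @ k.T          # shape (n, n): one score per token pair
    return scores.size        # n * n entries -> O(n^2) time and memory

for n in (128, 256, 512):
    print(n, attention_score_size(n))
```

Doubling the sequence length quadruples the score matrix, which is exactly the bottleneck the efficient-attention methods mentioned above try to remove.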
Self attention (BERT)
Self-attention means K = V = Q. If the input is a sentence, then every word needs its attention calculated with all the words in…
Jul 28, 2020
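The K = V = Q idea can be sketched in a few lines of NumPy: the same input matrix plays all three roles, and each word's output is a weighted average over every word in the sentence. This is a simplified illustration without the learned projection matrices a real transformer layer uses:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Self-attention with Q = K = V = x (no learned projections)."""
    q, k, v = x, x, x                      # the same input in all three roles
    d = x.shape[-1]
    scores = q @ k.T / np.sqrt(d)          # (n, n): every word vs. every word
    weights = softmax(scores, axis=-1)     # each row sums to 1
    return weights @ v                     # (n, d): weighted mix of all words

x = np.random.randn(5, 8)                  # a "sentence" of 5 words, 8-dim each
out = self_attention(x)
print(out.shape)                           # (5, 8)
```

Note that `scores` is n × n — the pairwise computation described above, and the source of the quadratic cost discussed in the previous post.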
BERT attention main ideas in 2 mins
It should be good enough to understand the three main ideas of BERT at the beginning.
Jul 25, 2020