Tyron Jung
Oct 17, 2021

--

Hey, just wanted to clarify something: aren't the scores the softmax results? It seems like what you're describing here is the result of aggregating the V matrix using the attention scores (i.e. the context matrix), not the scores themselves.
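
To make the distinction concrete, here's a minimal sketch in plain Python, assuming standard scaled dot-product attention. The `Q`, `K`, `V` values and the helper names are made up for illustration and are not taken from the article. `scores` is the softmax output (one row per query, each row summing to 1), while `context` is what you get after using those scores to take a weighted sum of the rows of `V`.

```python
import math

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain list-of-lists matrix product: (n x m) @ (m x p) -> (n x p).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def attention(q, k, v):
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    # Scaled dot products between every query and every key.
    logits = [[x / math.sqrt(d_k) for x in row] for row in matmul(q, k_t)]
    # Attention scores: softmax applied row by row.
    scores = [softmax(row) for row in logits]
    # Context matrix: each output row is a score-weighted sum of the rows of V.
    context = matmul(scores, v)
    return scores, context

# Hypothetical toy inputs: 2 queries, 2 keys of dimension 2, values of dimension 3.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

scores, context = attention(Q, K, V)
```

Here `scores` has shape (queries x keys) and `context` has shape (queries x value dimension), so they aren't even the same size in general.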
