The paper you reference has:
Illia Polosukhin

Thanks for replying quickly! I probably did not make myself clear. By ‘inputs’ I meant the word vectors, not the training data. Sorry for the confusion.

You are right about CNN-non-static: I was trying to update the embeddings during training with gradient descent.

I wanted to implement this in TensorFlow. Based on the code I referenced, I changed ‘embedded_chars_expanded’ into a ‘Variable’ so that it would be updated when I invoked tf.train.GradientDescentOptimizer(...).minimize(loss).

But the limited GPU memory forced me to slice ‘embedded_chars_expanded’. Unfortunately, the slice of ‘embedded_chars_expanded’ is a ‘Tensor’ rather than a ‘Variable’, so it cannot be updated directly.
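
To make the question concrete, here is a minimal sketch of the update I have in mind, written in plain Python rather than TensorFlow. It assumes the trainable object is the whole embedding table and that each batch only reads, and then updates, the rows it looks up. The names `lookup`, `sgd_step`, `embeddings`, and `batch_ids`, the toy loss, and the learning rate are all hypothetical and chosen only for illustration; this is not the code I referenced.

```python
# Plain-Python illustration (no TensorFlow): the trainable object is the whole
# embedding table; a batch only reads, and later updates, the rows it looks up.

def lookup(table, word_ids):
    """Return the embedding rows for word_ids (analogous to an embedding lookup)."""
    return [table[i] for i in word_ids]


def sgd_step(table, word_ids, row_grads, lr):
    """Apply one gradient-descent step to the looked-up rows only, in place."""
    for i, grad in zip(word_ids, row_grads):
        table[i] = [w - lr * g for w, g in zip(table[i], grad)]


# Hypothetical 4-word vocabulary with 2-dimensional embeddings.
embeddings = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
batch_ids = [1, 3]

# Toy loss (an assumption): 0.5 * sum of squares of the looked-up vectors,
# so the gradient with respect to each looked-up row is the row itself.
rows = lookup(embeddings, batch_ids)
grads = [list(row) for row in rows]

sgd_step(embeddings, batch_ids, grads, lr=0.5)
print(embeddings)
```

In this sketch only rows 1 and 3 change, while the table itself stays the single object being trained. That is the behaviour I am hoping to get from the sliced version as well.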

I did not find the answer in Part 5 of your blog. Is that the post named ’TensorFlow — Text Classification’? I hope to get your guidance.
