Using a retrieval technique to select the top-k tokens’ encoder hidden states to attend to during cross-attention in…
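
To make the idea concrete, here is a minimal NumPy sketch of one way top-k selection could work for a single decoder query. It is an illustration under stated assumptions, not the method of any particular model: the function name `topk_cross_attention` is hypothetical, the encoder hidden states serve directly as both keys and values (no learned projections), there is a single attention head, and an exact dot-product search stands in for whatever retrieval index would be used in practice.

```python
import numpy as np


def topk_cross_attention(query, enc_hidden, k):
    """Cross-attention restricted to the k best-matching encoder states.

    query:      (d,)   decoder query vector
    enc_hidden: (n, d) encoder hidden states, one row per source token
    k:          number of encoder states to retrieve (1 <= k)

    Assumption: keys and values are the raw encoder hidden states.
    """
    n, d = enc_hidden.shape
    if k < 1:
        raise ValueError("k must be at least 1")
    k = min(k, n)

    # Scaled dot-product score of the query against every encoder state.
    scores = enc_hidden @ query / np.sqrt(d)

    # Retrieval step: indices of the k highest scores (in no particular order).
    # Exact search is used here; an approximate index is an assumed substitute.
    top_idx = np.argpartition(scores, n - k)[n - k:]
    top_scores = scores[top_idx]

    # Softmax over the retrieved subset only, not over all n tokens.
    weights = np.exp(top_scores - top_scores.max())
    weights /= weights.sum()

    # Weighted sum of the retrieved encoder states.
    context = weights @ enc_hidden[top_idx]
    return context, top_idx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    enc = rng.normal(size=(6, 4))   # 6 source tokens, hidden size 4
    q = rng.normal(size=4)
    context, kept = topk_cross_attention(q, enc, k=3)
    print("kept token indices:", sorted(kept.tolist()))
    print("context shape:", context.shape)
```

With `k` equal to the source length, this reduces to ordinary softmax attention over all encoder states; smaller `k` trades some attention mass for fewer states touched per decoding step.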