Seeking knowledge is never stupid, at least I hope :D

The sparse coding algorithm is an unsupervised algorithm working only on inputs. My examples are somehow misleading because I first want to see the impact of a sparsity constraint on a prediction task which is NOT a sparse coding algorithm with dictionary learning.

In your example, you never try to decode your encoded inputs back to their original inputs. It looks like you try to predict a class, so you don’t have any decoder matrix. The only thing you can do is interpret the matrix W1 and W2 as encoders, which makes sense since they will try to encode data to improve performance on the classification task.

To implement a Dictionary learning inside a classification task, you need to add a decoder layer after each encoding layer, so in your case you could add 2 decoder

If you look closely at my code on the convolutional auto-encoder, you’ll see that I balance both tasks at the same time: there is a loss for the prediction task and a loss for the sparse coding task. The decoder matrix is the matrix of weights “W_decoder_ps” which is the decoder matrix.