Andros Wong
Aug 9, 2017

Hi, great tutorial, and the mathematics was very clear. However, on the code side I am confused about what you define as the inputs and outputs of a layer. Is the "input" the scalar pre-activation value per neuron (the combined weighted sum), or is it each individual weight multiplied by the previous layer's neurons? And is the "output" just the scalar value per neuron after the activation? Specifically, in the API design you mentioned:

# out_grad is the derivative of the cost function w.r.t. the
# inputs to all of the neurons for the following layer.

but then it seems to me in

output_wrt_inputs = self.W
output_wrt_inputs[:, self.out_act < 0] = 0
cost_wrt_inputs = cost_wrt_output * output_wrt_inputs
return cost_wrt_inputs

either the definition of out_grad is contradictory, or the meaning of 'inputs' is different here. Would you be able to clarify what is going on in this code?
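For what it's worth, here is how I currently read the backward step, as a minimal NumPy sketch. The shapes, the forward pass `pre_act = W @ x`, and all variable names below are my own assumptions for illustration, not necessarily what the tutorial intends:

```python
import numpy as np

# Hypothetical dense layer with ReLU activation:
#   pre_act = W @ x        (W has shape (n_out, n_in), x shape (n_in,))
#   out     = relu(pre_act)
rng = np.random.default_rng(0)
n_in, n_out = 3, 2
W = rng.normal(size=(n_out, n_in))
x = rng.normal(size=n_in)

pre_act = W @ x                      # pre-activation "inputs" to this layer's neurons
out = np.maximum(pre_act, 0.0)       # post-activation "outputs"

# dC/d(out), handed back from the following layer.
cost_wrt_output = rng.normal(size=n_out)

# dC/d(pre_act): the ReLU derivative zeroes the gradient where pre_act < 0,
# which is what the `self.out_act < 0` masking appears to do.
cost_wrt_pre_act = cost_wrt_output * (pre_act > 0)

# dC/dx: propagate through the weights. Under this reading, the value passed
# to the previous layer is the gradient w.r.t. that layer's outputs.
cost_wrt_inputs = W.T @ cost_wrt_pre_act
```

If this matches your intent, then "inputs" in the out_grad comment would mean the pre-activation sums, which is where my confusion comes from.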
