m0nadsinBecoming Human: Artificial Intelligence MagazineMulti-Head AttentionExamining a module consisting of several attention layers running in parallel.Jan 27, 2022Jan 27, 2022
m0nadsDiffusion modelsDiffusion probabilistic models are parameterized Markov chains models trained to gradually denoise data.Jan 25, 20221Jan 25, 20221
m0nadsSharpness-Aware MinimizationA training procedure based on minimizing a worse -case perturbation of the weightsJan 24, 20221Jan 24, 20221