m0nadsinBecoming Human: Artificial Intelligence MagazineMulti-Head AttentionExamining a module consisting of several attention layers running in parallel.6 min read·Jan 27, 2022----
m0nadsDiffusion modelsDiffusion probabilistic models are parameterized Markov chains models trained to gradually denoise data.6 min read·Jan 25, 2022--1--1
m0nadsSharpness-Aware MinimizationA training procedure based on minimizing a worse -case perturbation of the weights6 min read·Jan 24, 2022--1--1