Unknown Deep Learning Architectures and Algorithms, Part 1

Ashish Patel · Published in ML Research Lab · 4 min read · Sep 28, 2019

New algorithms and architectures in deep learning

Our world is full of amazing stuff, and machine learning and deep learning are not limited to a handful of well-known architectures; the research literature holds an amazing collection of them, and the only way to find them is to explore it. So, in this article I have collected some of these architectures, which may be useful for your next deep learning project.

1. Implicit Autoencoders by Alireza Makhzani

  • The “implicit autoencoder” (IAE) is a generative autoencoder in which both the generative path and the recognition path are parametrized by implicit distributions.
  • Implicit distributions allow us to learn more expressive posterior and conditional likelihood distributions for the autoencoder (see the sketch after this list).
  • Reference : https://arxiv.org/abs/1805.09804
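To make “implicit on both paths” concrete, here is a minimal PyTorch sketch of my own (not the authors’ code): each path is a deterministic network that takes an extra noise input, so its output distribution is defined only through sampling. All layer sizes and the noise dimension are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ImplicitEncoder(nn.Module):
    """Implicit q(z|x): a deterministic net fed with noise, so samples have no closed-form density."""
    def __init__(self, x_dim=784, z_dim=8, noise_dim=16, hidden=128):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(x_dim + noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, z_dim))

    def forward(self, x):
        eps = torch.randn(x.size(0), self.noise_dim)   # the noise input makes the posterior implicit
        return self.net(torch.cat([x, eps], dim=1))

class ImplicitDecoder(nn.Module):
    """Implicit p(x|z): the same trick on the generative path."""
    def __init__(self, x_dim=784, z_dim=8, noise_dim=16, hidden=128):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(z_dim + noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, x_dim))

    def forward(self, z):
        eps = torch.randn(z.size(0), self.noise_dim)
        return self.net(torch.cat([z, eps], dim=1))

x = torch.randn(32, 784)                 # dummy batch of flattened inputs
z = ImplicitEncoder()(x)                 # stochastic code
x_hat = ImplicitDecoder()(z)             # stochastic reconstruction
print(z.shape, x_hat.shape)
# In the paper both paths are trained adversarially (GAN-style) against
# discriminators rather than with an explicit likelihood; those discriminators
# and the training loop are omitted in this sketch.
```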

2. Neural reparameterization improves structural optimization

  • Structural optimization is a popular method for designing objects such as bridge trusses, airplane wings, and optical devices.
  • The authors exploit the implicit bias over functions induced by neural networks to improve the parameterization of structural optimization (a toy sketch of the reparameterization follows this list).
  • Reference : https://arxiv.org/abs/1909.04240
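Here is a toy sketch of the reparameterization idea, not the authors’ code: the design grid is the output of a small network, and gradient descent runs over the network weights instead of over the raw grid cells. A dummy objective stands in for the paper’s differentiable compliance/physics solver, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

H, W = 16, 32
net = nn.Sequential(                      # the network weights are the new design variables
    nn.Linear(8, 128), nn.ReLU(),
    nn.Linear(128, H * W))
latent = torch.randn(1, 8)                # fixed input; all structure lives in the weights

def dummy_objective(density):
    # placeholder for structural compliance computed by a finite-element solve
    return ((density - 0.4) ** 2).mean() + 0.1 * density.abs().mean()

opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for step in range(200):
    density = torch.sigmoid(net(latent)).view(H, W)   # design densities in [0, 1]
    loss = dummy_objective(density)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(float(loss))
```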

3. Extreme Language Model Compression with Optimal Subwords and Shared Projections

  • Pre-trained deep neural network language models such as ELMo, GPT, BERT, and XLNet have recently achieved state-of-the-art performance on a variety of language understanding tasks, but their large size makes them impractical for mobile and edge devices.
  • These models all have large vocabularies and input embedding dimensions. Standard knowledge distillation keeps the teacher’s vocabulary and embedding size in the student, which is inefficient when the goal is a much smaller student model.
  • In this paper the authors design a novel knowledge distillation technique for training a student model with a significantly smaller vocabulary as well as lower embedding and hidden state dimensions (a rough sketch of these ingredients follows this list).
  • This method is able to compress the BERT_BASE model by more than 60x, with only a minor drop in downstream task metrics, resulting in a language model with a footprint of under 7MB. Experimental results also demonstrate higher compression efficiency and accuracy when compared with other state-of-the-art compression techniques.
  • Reference : https://arxiv.org/abs/1909.11687
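A hedged sketch of the two ingredients described above; the vocabulary sizes, hidden dimensions, and the single linear projection are my illustrative assumptions, not the paper’s exact configuration: a student with a much smaller subword vocabulary and hidden size, plus a trainable projection that maps student hidden states into the teacher’s space so the two can be matched with a distillation loss.

```python
import torch
import torch.nn as nn

teacher_hidden, student_hidden = 768, 192
teacher_vocab,  student_vocab  = 30522, 5000    # student uses a reduced subword vocabulary

student_emb = nn.Embedding(student_vocab, student_hidden)            # much smaller embedding table
projection  = nn.Linear(student_hidden, teacher_hidden, bias=False)  # shared projection into teacher space

def layer_distill_loss(student_states, teacher_states):
    # project the small student representation up and match the teacher's
    return nn.functional.mse_loss(projection(student_states), teacher_states)

student_states = torch.randn(4, 128, student_hidden)   # dummy student layer output (batch, seq, dim)
teacher_states = torch.randn(4, 128, teacher_hidden)   # dummy teacher layer output
print(float(layer_distill_loss(student_states, teacher_states)))
```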

4. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

  • Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations, longer training times, and unexpected model degradation.
  • To address this problem, the authors present two parameter-reduction techniques that lower memory consumption and increase the training speed of BERT (sketched after this list). Comprehensive empirical evidence shows that the proposed methods lead to models that scale much better compared to the original BERT.
  • The best model establishes new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters compared to BERT-large.
  • Reference : https://openreview.net/pdf?id=H1eA7AEtvS
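A minimal sketch of ALBERT’s two parameter-reduction techniques as I understand them, factorized embedding parameterization and cross-layer parameter sharing; the sizes below are illustrative, not ALBERT’s published configuration.

```python
import torch
import torch.nn as nn

V, E, H, num_layers = 30000, 128, 768, 12

word_emb   = nn.Embedding(V, E)               # V*E parameters instead of V*H (factorized embedding)
emb_to_hid = nn.Linear(E, H)                  # plus a small E*H projection up to the hidden size
shared_layer = nn.TransformerEncoderLayer(d_model=H, nhead=12, batch_first=True)

def encode(token_ids):
    h = emb_to_hid(word_emb(token_ids))
    for _ in range(num_layers):               # cross-layer sharing: the same weights at every depth
        h = shared_layer(h)
    return h

tokens = torch.randint(0, V, (2, 16))         # dummy batch of token ids
print(encode(tokens).shape)                   # torch.Size([2, 16, 768])
```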

5. BagNet: Berkeley Analog Generator with Layout Optimizer Boosted with Deep Neural Networks

  • This work presents a learning framework that reduces the number of simulations needed by evolutionary-based combinatorial optimizers, using a DNN that discriminates against generated samples before simulations are run (see the schematic sketch after this list).
  • Using this approach, the authors achieve at least two orders of magnitude improvement in sample efficiency on several large circuit examples, including an optical link receiver layout.
  • Reference : https://arxiv.org/abs/1907.10515
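A schematic sketch only, not the BagNet codebase: an evolutionary loop proposes candidate parameter vectors, and a small discriminator pre-screens them so that only candidates predicted to beat the current best are sent to the (stand-in) expensive simulator. In the real framework the discriminator is trained on simulation outcomes; that training step is omitted here, and all dimensions and functions are illustrative assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

dim = 10

def simulate(x):
    # stand-in for a slow circuit/layout simulation returning a figure of merit
    return -np.sum((x - 0.3) ** 2)

disc = nn.Sequential(nn.Linear(2 * dim, 64), nn.ReLU(), nn.Linear(64, 1))

def likely_better(cand, ref):
    # DNN predicts whether the candidate is likely to beat the reference design
    with torch.no_grad():
        pair = torch.tensor(np.concatenate([cand, ref]), dtype=torch.float32)
        return torch.sigmoid(disc(pair)).item() > 0.5

best = np.random.rand(dim)
best_score = simulate(best)
simulations = 0
for _ in range(200):
    cand = best + 0.1 * np.random.randn(dim)        # mutation step of the evolutionary search
    if not likely_better(cand, best):               # cheap pre-screen by the discriminator
        continue                                    # skip the expensive simulation entirely
    simulations += 1
    score = simulate(cand)
    if score > best_score:
        best, best_score = cand, score
    # (in the real framework the discriminator would also be retrained on these outcomes)
print(simulations, best_score)
```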

6. A Heuristic for Efficient Reduction in Hidden Layer Combinations For Feedforward Neural Networks

  • The authors consider the hyper-parameter search problem in machine learning and present a heuristic approach in an attempt to tackle it.
  • In most learning algorithms, a set of hyper-parameters must be determined before training commences. The choice of hyper-parameters can affect the final model’s performance significantly, yet determining a good choice is in most cases complex and consumes a large amount of computing resources.
  • The paper compares an exhaustive search over hyper-parameters with the proposed heuristic search, and shows a significant reduction in the time taken to obtain the resulting model with marginal differences in evaluation metrics compared to the benchmark case (an illustrative sketch follows this list).
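The bullets above do not spell out the paper’s exact heuristic, so the sketch below only illustrates the general contrast: an exhaustive grid over hidden-layer size combinations versus a simple greedy stand-in, with a dummy scoring function in place of actually training a network for each configuration.

```python
import itertools

sizes = [16, 32, 64, 128]

def score(layers):
    # placeholder for "train a feedforward net with these hidden layers, return validation score"
    return -abs(sum(layers) - 100) - 0.01 * len(layers)

# Exhaustive: evaluate every combination of up to 3 hidden layers.
exhaustive = max(
    (combo for n in range(1, 4) for combo in itertools.product(sizes, repeat=n)),
    key=score)

# Heuristic (illustrative greedy stand-in): grow one layer at a time, keep it only if it helps.
layers, best = [], float("-inf")
for _ in range(3):
    cand_size = max(sizes, key=lambda s: score(layers + [s]))
    if score(layers + [cand_size]) <= best:
        break
    layers.append(cand_size)
    best = score(layers)

print("exhaustive:", exhaustive, "heuristic:", tuple(layers))
```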

Thanks for reading…!!! Happy learning…!!! Stay tuned for more reading summaries of this kind of little-known but amazing research-paper architectures and methods…!!!
