Lesser-Known Deep Learning Architectures and Algorithms, Part 2
New Neural Network Architectures Designed by Google
Google AI is one of the leading research communities in artificial intelligence, and Google has invested a massive amount of resources in AI research. So today I am sharing some of the new neural network architectures designed and invented by Google teams after years of research. Thanks, Google, for the amazing work!
1. EfficientNet: Improving Accuracy and Efficiency through AutoML and Model Scaling
- Convolutional neural networks (CNNs) are commonly developed at a fixed resource cost, and then scaled up in order to achieve better accuracy when more resources are made available.
- The conventional practice for model scaling is to arbitrarily increase the CNN depth or width, or to use larger input image resolution for training and evaluation.
- While these methods do improve accuracy, they usually require tedious manual tuning and still often yield sub-optimal performance. What if there were a principled way to scale up a CNN to obtain better accuracy and efficiency?
- EfficientNet is a novel approach that uses a simple yet highly effective compound coefficient to scale up CNNs in a more structured manner. Unlike conventional approaches that arbitrarily scale network dimensions such as width, depth and resolution, this method uniformly scales each dimension with a fixed set of scaling coefficients.
- References : http://ai.googleblog.com/2019/05/efficientnet-improving-accuracy-and.html
- Code : https://github.com/DableUTeeF/keras-efficientnet | https://github.com/Tony607/efficientnet_keras_transfer_learning/blob/master/Keras_efficientnet_transfer_learning.ipynb
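The compound-scaling idea above can be sketched in a few lines. The per-dimension coefficients below (alpha = 1.2, beta = 1.1, gamma = 1.15) are the values reported in the EfficientNet paper, chosen so that alpha * beta^2 * gamma^2 is about 2; the baseline network shape is made up for illustration:

```python
# EfficientNet-style compound scaling: one coefficient phi scales
# depth, width and input resolution together.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(base_depth, base_width, base_resolution, phi):
    """Scale depth/width/resolution uniformly with a single coefficient phi."""
    depth = round(base_depth * ALPHA ** phi)            # more layers
    width = round(base_width * BETA ** phi)             # more channels
    resolution = round(base_resolution * GAMMA ** phi)  # larger input images
    return depth, width, resolution

# Example: scale a hypothetical baseline (18 layers, 32 channels, 224px input).
print(compound_scale(18, 32, 224, phi=3))
```

Because FLOPs grow roughly as depth * width^2 * resolution^2, each increment of phi approximately doubles the compute budget.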
2. Introducing AdaNet: Fast and Flexible AutoML with Learning Guarantees
- Ensemble learning, the art of combining different machine learning (ML) model predictions, is widely used with neural networks to achieve state-of-the-art performance. It benefits from a rich history and theoretical guarantees, which enabled success at challenges such as the Netflix Prize and various Kaggle competitions.
- However, ensembles aren’t used much in practice due to long training times, and selecting the ML model candidates requires its own domain expertise. But as computational power and specialized deep learning hardware such as TPUs become more readily available, machine learning models will grow larger and ensembles will become more prominent.
- Now, imagine a tool that automatically searches over neural architectures, and learns to combine the best ones into a high-quality model.
- AdaNet is a lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention. It builds on Google's recent reinforcement learning and evolutionary AutoML efforts to be fast and flexible while providing learning guarantees.
- AdaNet provides a general framework for not only learning a neural network architecture, but also for learning to ensemble to obtain even better models.
- References : http://ai.googleblog.com/2018/10/introducing-adanet-fast-and-flexible.html
- Code : https://github.com/smyrbdr/AdaNet-on-MNIST
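A toy sketch of AdaNet's objective, grossly simplified: grow an ensemble one candidate at a time, accepting the candidate that best lowers loss plus a complexity penalty. The candidates, predictions and penalty here are invented; the real framework trains TensorFlow subnetworks rather than scoring fixed numbers:

```python
# Toy "grow the ensemble" step: accept the subnetwork that minimizes
# objective = ensemble loss + penalty * subnetwork complexity.

def ensemble_loss(predictions, target):
    """Mean squared error of the averaged ensemble prediction."""
    avg = sum(predictions) / len(predictions)
    return (avg - target) ** 2

def adanet_step(ensemble_preds, candidates, target, penalty=0.01):
    """Pick the candidate that most improves loss + a complexity cost."""
    best = None
    best_obj = ensemble_loss(ensemble_preds, target) if ensemble_preds else float("inf")
    for name, pred, complexity in candidates:
        obj = ensemble_loss(ensemble_preds + [pred], target) + penalty * complexity
        if obj < best_obj:
            best, best_obj = name, obj
    return best

# One regression target, three candidate subnetworks (name, prediction, size).
candidates = [("shallow", 0.8, 1), ("medium", 1.1, 2), ("deep", 1.02, 4)]
print(adanet_step([], candidates, target=1.0))
```

The complexity penalty is what gives AdaNet its "learning guarantees" flavor: a bigger subnetwork must earn its keep by lowering the loss more than it costs.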
3. SyntaxNet: The World’s Most Accurate Parser Goes Open Source
- At Google, research scientists spend a lot of time thinking about how computer systems can read and understand human language in order to process it in intelligent ways.
- SyntaxNet is an open-source neural network framework implemented in TensorFlow that provides a foundation for Natural Language Understanding (NLU) systems.
- Working : SyntaxNet is a framework for what’s known in academic circles as a syntactic parser, a key first component in many NLU systems. Given a sentence as input, it tags each word with a part-of-speech (POS) tag that describes the word’s syntactic function, and it determines the syntactic relationships between words in the sentence, represented in a dependency parse tree. These syntactic relationships are directly related to the underlying meaning of the sentence in question.
- References : https://ai.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html
- Code : https://github.com/short-edition/syntaxnet-wrapper
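To make the output concrete, a dependency parse of the kind SyntaxNet produces can be represented as one (word, POS tag, head, relation) record per token. The example parse below is hand-written for illustration, not actual SyntaxNet output:

```python
# A dependency parse is a POS tag plus a head pointer and relation label
# per word. Hand-written example for the sentence "Alice saw Bob":
sentence = [
    # (index, word, POS tag, head index, dependency relation)
    (1, "Alice", "NOUN", 2, "nsubj"),   # subject of "saw"
    (2, "saw",   "VERB", 0, "ROOT"),    # root of the sentence
    (3, "Bob",   "NOUN", 2, "dobj"),    # direct object of "saw"
]

def children(parse, head_index):
    """Return the words whose head is the given index."""
    return [word for _, word, _, head, _ in parse if head == head_index]

print(children(sentence, 2))  # -> ['Alice', 'Bob']
```

Downstream NLU systems walk this tree structure (who did what to whom) rather than the raw word sequence.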
4. MorphNet: Towards Faster and Smaller Neural Networks
- Deep neural networks (DNNs) have demonstrated remarkable effectiveness in solving hard problems of practical relevance such as image classification, text recognition and speech transcription. However, designing a suitable DNN architecture for a given problem continues to be a challenging task.
- Given the large search space of possible architectures, designing a network from scratch for your specific application can be prohibitively expensive in terms of computational resources and time.
- MorphNet is a sophisticated technique for neural network model refinement: instead of searching over architectures from scratch, it refines an existing one.
- MorphNet takes an existing neural network as input and produces a new neural network that is smaller, faster, and yields better performance tailored to a new problem.
- References : https://ai.googleblog.com/2019/04/morphnet-towards-faster-and-smaller.html
- Code : https://github.com/google-research/morph-net
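As a rough illustration of the refine-an-existing-network idea, here is a toy shrink-and-expand loop in the spirit of MorphNet. This is not the google-research/morph-net API; the per-channel scale factors (think batch-norm gammas driven toward zero by a resource-aware regularizer) and the threshold are invented for the example:

```python
# Toy MorphNet-style refinement: shrink by dropping channels whose learned
# scale is near zero, then uniformly re-grow the surviving layer widths.

def shrink(layer_gammas, threshold=0.1):
    """Keep only channels whose |gamma| exceeds the threshold."""
    return [sum(abs(g) > threshold for g in gammas) for gammas in layer_gammas]

def expand(widths, multiplier):
    """Uniformly re-grow the shrunken network's layer widths."""
    return [max(1, round(w * multiplier)) for w in widths]

# Hypothetical per-layer scale factors after regularized training.
gammas = [[0.9, 0.02, 0.5, 0.01], [0.7, 0.6, 0.03]]
widths = shrink(gammas)     # -> [2, 2]
print(expand(widths, 1.5))  # -> [3, 3]
```

The shrink step decides *where* capacity is wasted; the expand step spends the recovered budget where the network kept its channels, which is how the refined model can be both smaller and better.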
5. Introducing PlaNet: A Deep Planning Network for Reinforcement Learning
- Research into how artificial agents can improve their decisions over time is progressing rapidly via reinforcement learning (RL). For this technique, an agent observes a stream of sensory inputs (e.g. camera images) while choosing actions (e.g. motor commands), and sometimes receives a reward for achieving a specified goal.
- Model-free approaches to RL aim to directly predict good actions from sensory observations, enabling DeepMind’s DQN to play Atari and other agents to control robots. However, this black-box approach often requires several weeks of simulated interaction to learn through trial and error, limiting its usefulness in practice.
- The Deep Planning Network (PlaNet) agent learns a world model from image inputs only and successfully leverages it for planning.
- PlaNet solves a variety of image-based control tasks, competing with advanced model-free agents in terms of final performance while being 5000% more data efficient on average.
- References : https://ai.googleblog.com/2019/02/introducing-planet-deep-planning.html
- Code : https://github.com/google-research/planet
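PlaNet acts by searching over action sequences inside its learned latent model. A minimal sketch of that planning step, using the cross-entropy method (the planner PlaNet uses) over a made-up one-dimensional "world model" rather than a learned network:

```python
import random

# Cross-entropy method (CEM) planning: sample action sequences, keep the
# elites, refit the sampling distribution, and execute only the first action.

def world_model(state, action):
    """Hypothetical learned dynamics: next state and reward."""
    next_state = state + action
    reward = -abs(next_state - 5.0)  # goal: reach state 5
    return next_state, reward

def cem_plan(state, horizon=4, samples=64, elites=8, iters=5):
    mean, std = [0.0] * horizon, [1.0] * horizon
    for _ in range(iters):
        plans = []
        for _ in range(samples):
            actions = [random.gauss(m, s) for m, s in zip(mean, std)]
            s, total = state, 0.0
            for a in actions:
                s, r = world_model(s, a)
                total += r
            plans.append((total, actions))
        best = [a for _, a in sorted(plans, reverse=True)[:elites]]
        mean = [sum(col) / elites for col in zip(*best)]
        std = [max(0.1, (sum((x - m) ** 2 for x in col) / elites) ** 0.5)
               for col, m in zip(zip(*best), mean)]
    return mean[0]  # execute only the first planned action

random.seed(0)
first_action = cem_plan(state=0.0)
```

Because the rollouts happen in the (cheap) learned model rather than the real environment, the agent needs far fewer real interactions, which is where the data-efficiency gain comes from.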
6. MobileNetV2: The Next Generation of On-Device Computer Vision Networks
- Google introduced MobileNetV1, a family of general-purpose computer vision neural networks designed with mobile devices in mind to support classification, detection and more. The ability to run deep networks on personal mobile devices improves user experience, offering anytime, anywhere access, with additional benefits for security, privacy, and energy consumption. As new applications emerge allowing users to interact with the real world in real time, so does the need for ever more efficient neural networks.
- Google then released MobileNetV2 to power the next generation of mobile vision applications. MobileNetV2 is a significant improvement over MobileNetV1 and pushes the state of the art for mobile visual recognition, including classification, object detection and semantic segmentation.
- MobileNetV2 is released as part of the TensorFlow-Slim Image Classification Library, or you can start exploring MobileNetV2 right away in Colaboratory. Alternatively, you can download the notebook and explore it locally using Jupyter. MobileNetV2 is also available as modules on TF-Hub, and pretrained checkpoints can be found on GitHub.
- Reference : https://ai.googleblog.com/2018/04/mobilenetv2-next-generation-of-on.html
- Code : https://github.com/xiaochus/MobileNetV2
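A back-of-the-envelope way to see why this family of networks is mobile-friendly: the depthwise-separable convolution that MobileNets are built on needs far fewer parameters than a standard convolution of the same shape. The layer sizes below are just an example:

```python
# Parameter counts (ignoring biases) for one k x k convolution layer:
# a standard conv vs. a depthwise conv followed by a 1x1 pointwise conv.

def standard_conv_params(k, c_in, c_out):
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    return k * k * c_in + c_in * c_out  # depthwise + pointwise

k, c_in, c_out = 3, 64, 128
print(standard_conv_params(k, c_in, c_out))   # 73728
print(separable_conv_params(k, c_in, c_out))  # 8768
```

MobileNetV2 adds inverted residual blocks with linear bottlenecks on top of this, but the roughly order-of-magnitude parameter saving per layer is the core of the efficiency story.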
7. Learning Cross-Modal Temporal Representations from Unlabeled Videos (VideoBERT)
- While people can easily recognize what activities are taking place in videos and anticipate what events may happen next, it is much more difficult for machines. Yet, increasingly, it is important for machines to understand the contents and dynamics of videos for applications, such as temporal localization, action detection and navigation for self-driving cars.
- In order to train neural networks to perform such tasks, it is common to use supervised training, in which the training data consists of videos that have been meticulously labeled by people on a frame-by-frame basis. Such annotations are hard to acquire at scale. Consequently, there is much interest in self-supervised learning, in which models are trained on various proxy tasks, and the supervision of those tasks naturally resides in the data itself.
- The goal is to discover high-level semantic features that correspond to actions and events that unfold over longer time scales. To accomplish this, the work exploits the key insight that human language has evolved words to describe high-level objects and events. In videos, speech tends to be temporally aligned with the visual signals and can be extracted using off-the-shelf automatic speech recognition (ASR) systems, thus providing a natural source of self-supervision. The model is an example of cross-modal learning, as it jointly utilizes signals from the visual and audio (speech) modalities during training.
- References : https://ai.googleblog.com/2019/09/learning-cross-modal-temporal.html
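The alignment between speech and frames that provides the self-supervision can be sketched as a simple timestamp join. The ASR segments and frame times below are invented for illustration:

```python
# ASR text is time-stamped, so each spoken segment can be paired with the
# video frames it overlaps, yielding (text, visual) pairs with no human labels.

asr_segments = [(0.0, 2.0, "crack the eggs"), (2.0, 5.0, "whisk until smooth")]
frame_times = [0.5, 1.5, 2.5, 3.5, 4.5]  # one sampled frame per second

def align(segments, frames):
    """Pair each ASR segment with the frame timestamps it overlaps."""
    return {text: [t for t in frames if start <= t < end]
            for start, end, text in segments}

pairs = align(asr_segments, frame_times)
print(pairs["crack the eggs"])  # -> [0.5, 1.5]
```

In the actual model these pairs feed a BERT-style transformer over mixed text and visual tokens; the join above just shows where the free supervision comes from.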
8. Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing
- One of the biggest challenges in natural language processing (NLP) is the shortage of training data. Because NLP is a diversified field with many distinct tasks, most task-specific datasets contain only a few thousand or a few hundred thousand human-labeled training examples.
- However, modern deep learning-based NLP models see benefits from much larger amounts of data, improving when trained on millions, or billions, of annotated training examples.
- To help close this gap in data, researchers have developed a variety of techniques for training general purpose language representation models using the enormous amount of unannotated text on the web (known as pre-training).
- The pre-trained model can then be fine-tuned on small-data NLP tasks like question answering and sentiment analysis, resulting in substantial accuracy improvements compared to training on these datasets from scratch.
- BERT is a new technique for NLP pre-training; the name stands for Bidirectional Encoder Representations from Transformers.
- Anyone in the world can train their own state-of-the-art question answering system (or a variety of other models) in about 30 minutes with a single TPU.
- References : https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
- Code : https://github.com/Separius/BERT-keras
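The core pre-training trick behind BERT is the masked language model: hide some tokens and train the model to predict them from context on both sides, which is what makes the representations bidirectional. A simplified sketch (the real recipe masks about 15% of tokens and also substitutes random tokens some of the time):

```python
import random

# Masked-LM data preparation: replace a fraction of tokens with "[MASK]"
# and remember the originals as prediction targets.

def mask_tokens(tokens, rate=0.15, rng=random):
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < rate:
            targets[i] = tok       # model must predict the original token
            masked.append("[MASK]")
        else:
            masked.append(tok)
    return masked, targets

rng = random.Random(42)
tokens = "the quick brown fox jumps over the lazy dog".split()
masked, targets = mask_tokens(tokens, rng=rng)
print(masked)
```

Because the supervision comes from the text itself, this objective can be run over billions of unannotated web sentences before any task-specific fine-tuning.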
9. AmoebaNet: Using Evolutionary AutoML to Discover Neural Network Architectures
- The brain has evolved over a long time, from very simple worm brains 500 million years ago to a diversity of modern structures today. The human brain, for example, can accomplish a wide variety of activities, many of them effortlessly — telling whether a visual scene contains animals or buildings feels trivial to us.
- To perform activities like these, artificial neural networks require careful design by experts over years of difficult research, and they typically address one specific task, such as finding what’s in a photograph, calling a genetic variant, or helping to diagnose a disease. Ideally, one would want an automated method to generate the right architecture for any given task.
- By using computational resources to programmatically evolve image classifiers at unprecedented scale, can we achieve solutions with minimal expert participation? How good can today’s artificially evolved neural networks be?
- References : https://ai.googleblog.com/2018/03/using-evolutionary-automl-to-discover.html
- Code : https://github.com/tensorflow/tpu/tree/master/models/official/amoeba_net
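The evolutionary search behind AmoebaNet can be sketched as "regularized evolution": repeatedly pick the best of a random tournament, mutate it, and retire the *oldest* member of the population rather than the worst. Here architectures are just bit-strings with a toy fitness, not real CNN cells:

```python
import random
from collections import deque

# Aging evolution over toy "architectures" (bit-strings).

def fitness(arch):
    return sum(arch)  # stand-in for validation accuracy

def mutate(arch, rng):
    child = list(arch)
    child[rng.randrange(len(child))] ^= 1  # flip one architecture choice
    return child

def evolve(pop_size=20, cycles=100, sample=5, bits=8, seed=0):
    rng = random.Random(seed)
    population = deque([[rng.randint(0, 1) for _ in range(bits)]
                        for _ in range(pop_size)])
    for _ in range(cycles):
        tournament = rng.sample(list(population), sample)
        parent = max(tournament, key=fitness)
        population.append(mutate(parent, rng))
        population.popleft()  # aging: discard the oldest, not the worst
    return max(population, key=fitness)

best = evolve()
```

Removing the oldest model (instead of the worst) is the "regularization": no architecture survives on an early lucky evaluation, so the population keeps re-discovering genuinely good designs.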
10. MnasNet: Towards Automating the Design of Mobile Machine Learning Models
- Convolutional neural networks (CNNs) have been widely used in image classification, face recognition, object detection and many other domains. Unfortunately, designing CNNs for mobile devices is challenging because mobile models need to be small and fast, yet still accurate.
- In “MnasNet: Platform-Aware Neural Architecture Search for Mobile”, Google explores an automated neural architecture search approach for designing mobile models using reinforcement learning. To deal with mobile speed constraints, the speed information is explicitly incorporated into the main reward function of the search algorithm, so that the search can identify a model that achieves a good trade-off between accuracy and speed.
- References : https://ai.googleblog.com/2018/08/mnasnet-towards-automating-design-of.html
- Code : https://github.com/abhoi/Keras-MnasNet
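The platform-aware reward described above can be written down directly: accuracy multiplied by a soft latency penalty. The reward form acc * (latency / target)^w with w = -0.07 follows the MnasNet paper; the accuracy and latency numbers below are invented:

```python
# MnasNet-style multi-objective search reward: models that blow the latency
# budget are penalized smoothly rather than rejected outright.

def mnas_reward(accuracy, latency_ms, target_ms=80.0, w=-0.07):
    return accuracy * (latency_ms / target_ms) ** w

fast = mnas_reward(0.74, 60.0)   # slightly less accurate but fast
slow = mnas_reward(0.76, 160.0)  # more accurate but misses the target
print(fast > slow)               # True
```

With w = -0.07, doubling the latency costs roughly 5% of reward, so the controller learns to prefer the fast model above even though its raw accuracy is lower.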
Thanks for reading! Happy learning! Stay tuned for more reading summaries of lesser-known but amazing research papers, architectures and methods!