
Adaptive optimization methods, such as Adam and Adagrad, maintain statistics about the variables and gradients over time (e.g., moments) that affect the learning rate. These statistics are not very accurate when working with sparse tensors, where most of the elements are zero or near zero. We investigated the effects of using Adam with varying learning rates on sparse tensors, and explored solutions that allow Adam to properly handle the sparsity.

In our experiments, we observed a lack of responsiveness to learning rate changes. We speculate this is because Adam updates its moment estimates from a much…
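As a refresher on the moment estimates mentioned above: Adam keeps exponential moving averages of the gradient and the squared gradient, and these decay on every step even for coordinates whose gradient is zero. Here is a minimal NumPy sketch of the standard Adam update on a toy parameter vector with a sparse gradient — an illustration of the mechanism, not the code from our experiments; the hyperparameters are the common defaults.

```python
import numpy as np

# Common default Adam hyperparameters (an assumption, not our experiment's settings).
lr, beta1, beta2, eps = 0.001, 0.9, 0.999, 1e-8

theta = np.zeros(4)  # parameters
m = np.zeros(4)      # first-moment estimate (EMA of gradients)
v = np.zeros(4)      # second-moment estimate (EMA of squared gradients)

for t in range(1, 101):
    # Sparse gradient: only index 0 is ever non-zero.
    g = np.array([1.0, 0.0, 0.0, 0.0])

    # The moment estimates are exponential moving averages, so every
    # update decays them -- including entries whose gradient is zero.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2

    # Bias correction for the zero initialisation.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    theta -= lr * m_hat / (np.sqrt(v_hat) + eps)

# Only the dense coordinate moves; its effective step is
# lr * m_hat / sqrt(v_hat), which is roughly lr per step here.
print(theta)
```

Note that the effective per-coordinate step size is `m_hat / sqrt(v_hat)` scaled by the learning rate, which is why changing the learning rate alone can appear to have a muted effect when the moment estimates dominate the update.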


Overview of the experiment infrastructure

It’s a joy to be able to test an idea directly, without first wrestling with the tools. We recently developed an experimental setup which, so far, looks like it will let us do just that. I’m excited about it and hope it can help you too, so here it is. We’ll go through why we created another framework and how each module in the experiment setup works. But if you’re in a hurry, here’s a summary!

Summary

What: Experimental infrastructure for machine learning using TensorFlow
Why: It lets you test new algorithms efficiently
Tell me more:

  • You can…

If you’re using TensorFlow and recently heard about Eager Execution but don’t know if or when to switch, or you’re getting started with TensorFlow and don’t know whether to use Eager or Graph Execution… then this article is for you.

We are in the latter scenario, as we have recently started using TensorFlow, and were intrigued by the official introduction of Eager Execution at the TensorFlow Dev Summit a few months ago. I went through the process of evaluating Eager Execution and wrote this review to help guide our decision for our next TensorFlow projects.

What is Eager Execution?

Eager Execution is…


There are plenty of established machine learning frameworks out there, and new ones pop up frequently to address specific niches. We were interested in whether one of these frameworks fits our workflow, but the sheer number makes it very difficult to examine them all simultaneously, and many of the comparisons available online focus on only one or two. I surveyed the most popular frameworks and aim to provide a helpful comparative analysis, focused on distributed execution, optimisation on relevant architectures, community support and portability.

It’s important to note that most comparisons have…


SVHN is a relatively new and popular dataset, a natural next step from MNIST and a complement to other popular computer vision datasets. This is an overview of the common preprocessing techniques and the best performance benchmarks, as well as a look at the state-of-the-art neural network architectures used on it. It will be useful for anyone considering testing their algorithms on SVHN.

We have previously discussed that we are conducting experiments using the MNIST dataset. For the next phase of our experiments, we have begun experimenting with the Street View House Numbers (SVHN) dataset to test the robustness of our algorithms. This…


We have previously discussed that we are conducting experiments using the MNIST dataset, and released our MNIST and NIST preprocessing code. For the next phase of our experiments, we have begun experimenting with the Street View House Numbers (SVHN) dataset to test the robustness of our algorithms.

The SVHN dataset contains real-world images of house numbers taken from Google Street View. It comes in a similar style to MNIST, with images of small cropped digits, while being significantly harder and containing an order of magnitude more labelled data.
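One preprocessing detail worth knowing about the cropped-digits format: the `.mat` files (loadable with `scipy.io.loadmat`) store labels from 1 to 10, where 10 represents the digit 0, so labels are usually remapped to the 0–9 range before training. A minimal sketch of that remapping, shown on a toy label array rather than the real files:

```python
import numpy as np

# Toy stand-in for the 'y' array from an SVHN .mat file,
# where labels run 1..10 and 10 means the digit 0.
y = np.array([1, 9, 10, 5, 10])

# Common preprocessing step: remap label 10 to 0 so labels match digits 0-9.
y[y == 10] = 0

print(y.tolist())  # → [1, 9, 0, 5, 0]
```

With the real files, the same line is applied to the label array returned by `scipy.io.loadmat('train_32x32.mat')['y']` after flattening it.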

Some examples of the images in the SVHN dataset

The commonly…


This article assesses the research paper ‘A Distributional Perspective on Reinforcement Learning’ by Marc G. Bellemare, Will Dabney and Rémi Munos, published in the proceedings of the 34th International Conference on Machine Learning (ICML) in 2017. Bellemare et al.’s paper will be assessed on several criteria. Firstly, the content is assessed by providing background information and describing the paper’s methods and findings. Secondly, the novelty and innovation of the paper are described. Thirdly, the technical quality is examined. Finally, possible applications for the research and suggested improvements are identified.

Content

Bellemare et al.’s paper is well-researched…


Hello everyone! I am Abdelrahman Ahmed (or Abdel for short) and I joined the Project AGI team as a Research Assistant a few months ago. I thought it would be a good time to tell you more about myself and what I am working on at Project AGI.

Who am I?
Originally from Egypt, I studied Computer Science in the UK for about two years before transferring to the University of Technology Sydney. I am currently in the final semester of my degree with a major in Enterprise Systems Development and sub-major in Data Analytics. …

