WeBase — Making deep learning simple

Harshit Singhai
Nov 1 · 14 min read

This article is written as part of my final capstone project at Bennett University. In it, I’ll tell you about WeBase, its purpose, what it is useful for, and how to make the most of it.

Content

  1. Problem Statement
  2. Introducing WeBase
  3. Deep Learning as a Service
  4. Is WeBase really needed?
  5. The story so far…
  6. Other big fishes in the market
  7. Core user scenarios
  8. Tech Stack
  9. A brief overview of machine learning algorithms
  10. Screenshots of Web UI
  11. Why have we chosen this project?
  12. What went wrong?
  13. What’s next?
  14. Video
  15. Conclusion

Problem Statement

Every day we constantly hear about AI and machine learning on the news and how it’s changing the world we live in.

Artificial intelligence will be the most disruptive class of technologies over the next decade, with a huge amount of data generated, and unprecedented advances in deep learning.

“AI promises to be the most disruptive class of technologies during the next 10 years due to advances in computational power, volume, velocity and variety of data, as well as advances in deep neural networks (DNNs)”

-John-David Lovelock, research vice president at Gartner.

However, most of us don’t see this impact directly. People with little or no machine learning experience cannot put it to use for their own purposes.

Non-tech and tech startups invest a lot of time and effort in learning about machine learning to solve simple problems like object detection or image classification. Typically, they invest months trying to solve a problem that could be easily solved in a week by an experienced machine learning engineer.

By making machine learning algorithms accessible via a simple API call or a web application, we can significantly reduce that time and effort, enabling better use of manpower and working hours.

Simply put, WeBase provides common machine learning models as a service.

Introducing WeBase

WeBase is a platform that lets users access machine learning algorithms at the click of a button. It aims to make advanced AI accessible through a simple, easy-to-use interface.

Developers can either use an existing model via a simple API or create their own custom model from scratch.

To create a custom model, the user only needs to upload their data; WeBase trains the model automatically. No parameter tuning. No domain knowledge. No hunting for the right infrastructure to host your models. Just upload your data, sit back, relax and grab a beer. The trained model can then be used through our web application or via an API.
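Once trained, a model exposed via the API could be called roughly like this. This is a sketch only: the endpoint URL, field names, and auth scheme are illustrative, not WeBase’s actual API.

```python
import base64
import json

# Hypothetical endpoint; the real URL and auth scheme may differ.
API_URL = "https://api.example.com/v1/models/{model_id}/predict"

def build_predict_request(model_id, image_path, api_key):
    """Build the URL, headers, and JSON body for a prediction call."""
    # Base64-encode the image so it can travel inside a JSON body.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    url = API_URL.format(model_id=model_id)
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    body = json.dumps({"image": image_b64})
    return url, headers, body
```

The returned pieces can then be sent with any HTTP client, e.g. `urllib.request` or `requests`.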

Deep Learning as a Service

Introducing Deep Learning as a Service with WeBase. WeBase enables organizations to overcome the common barriers to deep learning deployment: skills and complexity. The innovation lies in developing full end-to-end deep learning model architectures and deploying them as a service. Each user gets a private profile where they can manage their custom and previously used models and save data to the cloud.

Is WeBase really needed?

Data science is a very dynamic field. It has not been long since deep learning became the de facto solution for problems like self-driving cars, object detection, face recognition, and violence detection, yet it is already hard to keep up with its pace of advancement. Statistics point to a huge gap between the demand for and supply of machine learning engineers. In such a scenario, it becomes all the more important to build solutions that help startups and organizations solve common problems with machine learning.

Right now many organizations, such as Microsoft Azure AI and Google ML, are competing to gain an edge in this market. Azure AI is currently used by Wipro, Schneider Electric, the NBA, and others, and they expect demand to grow as more industries move from traditional statistics roles to machine-learning-centric data science roles.

Similarly, other leading cloud providers such as Google Cloud, AWS Machine Learning, and IBM Watson are investing heavily in artificial intelligence to win the market. Going through their case studies, we found that leading companies like Adobe, HP, and Dell depend on such services for prediction and analysis.

The biggest tech companies in the world, including Google, Microsoft, IBM, and Amazon, are investing heavily to make machine learning accessible to all

Our approach is to make such services available for free or with minimum cost so that everyone from shopkeepers, a small-business owner or a student can leverage the algorithms freely or cheaply.

Going through the reports, we strongly believe this idea can succeed, given that the biggest tech companies in the world, such as Google, Microsoft, IBM, and Amazon, are investing heavily to make machine learning accessible to all.

The story so far…

Currently, WeBase provides solutions for common machine learning problems such as NSFW (not safe for work) / SFW (safe for work) classification, image background removal, colorizing black-and-white images, gender classification, object detection, face recognition, neural style transfer, text summarization, sentiment analysis, and much more.

WeBase takes in the dataset from the user and uses its custom modeling service to train and deploy the model. The trained model can either be used through our web application or via an API.

Other big fishes in the market

Some similar projects would be Google Cloud ML, IBM Cloud DLaaS or Azure AI. Cloud AutoML is a suite of machine learning products that enables developers with limited machine learning expertise to train high-quality models specific to their business needs. It relies on Google’s state-of-the-art transfer learning and neural architecture search technology.

However, these services are mostly used by high-end businesses because of their pricing. Moreover, many of their custom models lack robustness and might not fit the training data well.

We don’t aim to compete with these behemoths. Rather, we aim to provide pre-trained models that users can easily apply to their own purposes. Our target audience also includes people who have no experience with deep learning and simply want a taste of what it brings to the table.

Beyond that, if a user wants to create a custom model using our services, we aim to provide it at a much lower cost, and with far less documentation to wade through, than other services operating in the same domain.

Documentation, configuration, and high pricing make these machine learning cloud services difficult to use. Often, if the application is not optimally configured, the user is charged heavily for the service. WeBase eliminates these problems by making machine learning as a service simpler and easier to use, with minimal documentation and cost.

Core user scenarios

Scenarios where WeBase can be used:

Non-tech and tech startups invest a lot of time and effort in learning machine learning algorithms and developing models from scratch. Our platform makes this easier for them, saving both time and effort.

Most machine learning projects in industry and startups focus on common problems such as regression, classification, clustering, facial recognition, object detection, and semantic segmentation. With our platform, they can use the most common machine learning models with a simple API call.

With our application, companies can cut down time spent on research. Instead, they can spend that time tweaking the custom models our platform produces to fit their requirements. Companies can cut down cost, effort, and time, and focus instead on other departments like sales, operations, expansion, HR, and customer relationships.

Potential users: machine learning engineers, students, freelancers, designers, etc

Our potential clients/customers/users are non-tech and tech startups, developers, machine learning engineers, students, freelancers, designers and anyone who wishes to use a part of machine learning in their application without getting into the roots of the literature. Users can also use our application for personal use like removing the background of their image, neural style transfer to add filters to their image or coloring black and white images.

Tech Stack

The frontend of the application is built with React, a JavaScript library for building user interfaces maintained by Facebook. React can serve as the base for single-page or mobile applications and is well suited to fetching and rendering rapidly changing data. Since our platform has a dashboard interface, React’s component-based approach fits it well.

For mobile applications, we rely on Ionic React. Being a dashboard-based platform, we recommend accessing it via the web for better performance and accessibility. Ionic React is relatively new, having been released only recently. The Ionic React framework makes it easy to build apps for iOS, Android, desktop, and the web as a progressive web app, all from one code base, using standard React development patterns, the standard react-dom library, and the huge ecosystem around the web platform.

For our backend, we have used Node.js, a JavaScript runtime built on Chrome’s V8 engine. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, and it is faster and more scalable than backend frameworks like Rails. Its ability to serve requests with low response times makes it a great fit for our web application, where much of the processing happens on the client side. We will be processing high volumes of I/O-bound requests, so Node.js fits our case perfectly.

MongoDB is a document database built for scalability and flexibility. We use it because it stores data in flexible, JSON-like documents. It is a distributed database at its core, so high availability, horizontal scaling, and geographic distribution are built in and easy to use. We chose MongoDB mainly for its performance, its scalability, and its easy integration with big data tools like Hadoop.

Cloud Storage for Firebase is a powerful, simple, and cost-effective object storage service built for Google scale. It will be used in our application for storing images uploaded by the user.

Flask is a micro web framework written in Python. We use Flask to build the endpoints for our deep learning models. The main reason we decided on a separate backend for the deep learning models is to avoid congestion from I/O- and CPU-intensive requests, resulting in a faster and more scalable application.

Node.js is known for I/O operations but struggles with CPU-intensive tasks, since its event loop runs on a single thread. With two separate backends we can allocate resources much better: Node.js handles I/O operations and Flask handles CPU-intensive work, reducing latency while keeping performance and scalability high.
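A minimal sketch of what such a Flask prediction endpoint might look like. The route, the field name, and the dummy model call are illustrative assumptions, not the production code:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def run_model(image_bytes):
    # Stand-in for the actual Keras model inference call.
    return {"label": "sfw", "confidence": 0.97}

@app.route("/predict/nsfw", methods=["POST"])
def predict_nsfw():
    # Reject requests that don't include an image file.
    if "image" not in request.files:
        return jsonify({"error": "missing 'image' file field"}), 400
    result = run_model(request.files["image"].read())
    return jsonify(result)
```

In this split, Node.js would proxy the upload to this service and stream the JSON result back to the client.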

As deep learning is the core of our platform, we decided to use Keras, which allows easy, scalable, and fast prototyping. Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow. Architectures such as convolutional neural networks, transfer learning, generative adversarial networks, recurrent neural networks, and LSTMs were developed and prototyped in Keras.

Redis, an open-source, in-memory data structure server, is frequently used as a distributed shared cache. In our application, Redis is used mainly for caching.
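The caching pattern looks roughly like this. It is written against any object exposing `get`/`setex` (which `redis.Redis` does), so the sketch does not require a running Redis server; the key scheme is an assumption:

```python
import hashlib
import json

def cached_predict(cache, model_name, payload, compute, ttl=3600):
    """Return a cached prediction if present, otherwise compute and cache it."""
    # Key the cache on the model name plus a hash of the raw input bytes.
    key = f"{model_name}:{hashlib.sha256(payload).hexdigest()}"
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = compute(payload)
    # setex stores the value with a time-to-live, so stale entries expire.
    cache.setex(key, ttl, json.dumps(result))
    return result
```

With redis-py this would be called as `cached_predict(redis.Redis(), "objdet", image_bytes, model_fn)`.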

A brief overview of machine learning algorithms

Below is a brief overview of the machine learning models used on our platform.

NSFW classification

  1. Data: We used a GitHub repository containing lists of URLs for downloading NSFW images, which we used to build the dataset for training our NSFW classification model.
  2. Cleaning the data: The data from the NSFW scraper came already labeled, but with some errors, especially since Reddit isn’t perfectly curated.
  3. Keras and transfer learning: After a little research, we decided to go with Inception v3 pretrained on ImageNet. We removed the top layer so we could use the model for our purpose.
  4. With the base model in place, we added three more layers: a 256-neuron hidden layer, followed by a 128-neuron hidden layer, followed by a final 5-neuron layer, the latter performing the ultimate classification into the five final classes via softmax.
  5. Dropout was used to randomly remove neural pathways so that no single feature dominates the model.
  6. The model was then fine-tuned accordingly.
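The transfer-learning setup above can be sketched in Keras as follows. The layer sizes (256, 128, 5) and softmax come from the steps above; the dropout rate, pooling, and input size are assumptions:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

def build_nsfw_classifier(weights="imagenet"):
    # Inception v3 without its top layer acts as a frozen feature extractor.
    base = InceptionV3(weights=weights, include_top=False,
                       pooling="avg", input_shape=(299, 299, 3))
    base.trainable = False
    # Custom head: 256 -> 128 -> 5-way softmax, with dropout in between
    # so no single feature dominates.
    x = layers.Dense(256, activation="relu")(base.output)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    out = layers.Dense(5, activation="softmax")(x)
    model = models.Model(base.input, out)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```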

Image colorization

  1. Dataset: We used the CIFAR-10 and Places365 datasets to train the model.
  2. Network architecture: A generative adversarial network, or GAN, is a neural network consisting of two submodels. These submodels work together to generate something, whether an image or even music, that seems “real” to humans.
  3. Generator: The generator’s architecture is inspired by U-Net: it is symmetric, with n encoding units and n decoding units.
  4. Discriminator: For the discriminator, we use a PatchGAN architecture with a contractive path similar to the baselines: a series of 4x4 convolutional layers with stride 2, with the number of channels doubled after each downsampling.
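A Keras sketch of such a patch discriminator, following the description above (4x4 convolutions, stride 2, channels doubling after each downsampling); the starting channel count and input size are assumptions:

```python
from tensorflow.keras import layers, models

def build_patch_discriminator(input_shape=(256, 256, 3),
                              base_channels=64, n_layers=4):
    inp = layers.Input(shape=input_shape)
    x = inp
    ch = base_channels
    for _ in range(n_layers):
        # 4x4 convolution with stride 2 halves the spatial size;
        # the channel count doubles after each downsampling.
        x = layers.Conv2D(ch, kernel_size=4, strides=2, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
        ch *= 2
    # One real/fake score per patch rather than per image.
    out = layers.Conv2D(1, kernel_size=4, padding="same")(x)
    return models.Model(inp, out)
```

Scoring patches rather than whole images pushes the generator to get local texture right, which is what matters for colorization.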
Background removal

  1. We used semantic segmentation for background removal.
  2. Dataset: The most common segmentation datasets are COCO (around 80K images, 90 categories), Pascal VOC (11K images, 20 classes), and the newer ADE20K.
  3. After some research, we decided to move forward with three models available to us: FCN, U-Net, and Tiramisu.
  4. We trained the model with U-Net and Tiramisu, following the schedule described in the original paper: standard cross-entropy loss and the RMSProp optimizer with a 1e-3 learning rate and small decay. We split our 11K images into 70% training, 20% validation, and 10% test.

The Keras model was created by training SmallerVGGNet from scratch on around 2,200 images (~1,100 per class) gathered from Google Images. It achieved around 95% training accuracy and ~85% validation accuracy, with 20% of the dataset held out for validation.

Face aging

  1. Conditional GANs (cGANs) were used for face aging.
  2. The face-aging cGAN consists of four networks:
  3. An encoder, which learns the inverse mapping from an input face image and its age condition to the latent vector z.
  4. A FaceNet, a facial recognition network that learns the difference between an input image x and a reconstructed image x’.
  5. A generator network, which takes a hidden representation of a face image and a condition vector as input and generates an image.
  6. A discriminator network, which tries to discriminate between real images and generated (fake) images.
  1. Used the Python google_images_download library to collect the dataset from Google Images.
  2. Used ImageDataGenerator for data augmentation.
  3. Trained a small convolutional neural network.
  4. Used Xception for transfer learning.
  5. Tuned hyperparameters.

Screenshots of Web UI

Basic Layout

Login Page
Register Page
Email Verification
Account Verified

Dashboard

Dashboard

Model Workflow

Image Colorization

Face Aging

Gender Classification

NSFW

Real Estate Classification

Sentiment Analysis

User Profile

Why have we chosen this project?

Being deep learning and machine learning enthusiasts, we wanted to create an application where all the common models could be found, and thus WeBase was born. The vision is to make machine learning models accessible to every developer with an internet connection.

With this project, we have used our machine learning and software development skills to develop an application that can help the community.

We believe our platform has a huge potential as we’re not restricting ourselves to a particular domain rather serving and deploying scalable solutions across various domains like Healthcare, Education, Entertainment, E-commerce, Journalism, etc.

What went wrong?

  1. Some solutions took more time than expected.
  2. Deployment issues — There were a lot of deployment issues when connecting various pieces of the puzzle together.
  3. Some research papers could not be implemented properly.
  4. There were cases where there was little or no reference material available.
  5. Neural networks took much longer to train than expected.
  6. Data not available — In some situations, we wrote scripts to scrape the data from the internet and filtered it manually. In a few cases, we manually collected some data.
  7. A lot of time was consumed in solving errors while integrating different frameworks and libraries.

What’s next?

Going forward, we will be providing solutions to more complex and real-life problems. Hosting the application and deploying it in the cloud remains our top priority.

We will also be working on a payment gateway to enable monetization. In the future, we will limit free API calls; on reaching the free limit, the user will pay an amount that depends on usage.

Video

Conclusion

There’s a long way to go. Let’s see how far we can go.

“If everything seems under control, you’re not going fast enough.”

Find more details here at https://webaselandingpage.netlify.com

That’s it for today. See you soon.

Did you like what you read? Recommend this post by clicking the heart button so others can see it!

I would love to hear from you! — LinkedIn, Github, Website

Written by Harshit Singhai

Deep learning enthusiast. In my spare time, I like to play football and watch rom-com/thriller movies till the late hours. Chelsea forever.
