Approach, Learnings, and Code

I recently participated in the Jigsaw Multilingual Toxic Comment Classification challenge on Kaggle, and our team (ACE) secured 3rd place on the final leaderboard. In this blog, I describe the problem statement, our approach, and what we learned from the competition. I have also provided a link to our full code towards the end of the blog.

Problem Description

The goal of the competition was to detect toxicity (for example, rudeness, disrespect, or threats) in user comments during online interactions. …

How the two popular frameworks have converged

PyTorch and TensorFlow are by far the two most popular frameworks for Deep Learning. It’s always a lot of work to learn and get comfortable with a new framework, so many people face the dilemma of which of the two to choose.

Until recently, the two frameworks had major differences in design, paradigm, syntax, etc., but they have since evolved considerably: each has picked up good features from the other, and they are no longer that different.

A lot of online articles comparing the two are a little…

Pushing Deep Learning to the Limit with 175B Parameters


OpenAI recently released a pre-print describing its mighty new language model GPT-3, a much bigger and better version of its predecessor GPT-2. In fact, with close to 175B trainable parameters, GPT-3 is far larger than anything else out there. In a comparison of parameter counts of recent popular pre-trained NLP models, GPT-3 clearly stands out.

What’s New?

After the success of BERT, the field of NLP is increasingly moving in the direction of pre-trained language models, trained on huge text corpora (in an unsupervised way) and later fine-tuned on specific tasks…

In Data Science and Beyond

Working in Data Science / Artificial Intelligence can be overwhelming: you need a good handle on basic Math and Statistics (e.g., probability, matrix algebra, calculus), familiarity with various off-the-shelf algorithms (e.g., regression, neural networks, clustering), facility with tools and programming languages (e.g., Python, R, Excel), and good communication and presentation skills.

Now that’s a lot to fit in one person. Not everyone is familiar with all that it takes to be a top data scientist. On top of all this, the field itself is evolving at a very rapid pace, and you need to…

A more human-like and versatile chatbot

Google recently published a paper on its new chatbot, Meena, and has hit all the right chords with its design and approach. While the underlying techniques are not entirely new, Meena seems to be a step in the right direction for building chatbots that are truly versatile and more human-like in their interactions.

The Rise of Chatbots

Chatbots are AI systems that interact with users via text messages or speech. There has been tremendous growth in chatbot applications of late; in fact, the chatbot market is expected to grow to $9.4B by some estimates.

There are a lot…

Defining, detecting and avoiding bias

AI algorithms are increasingly being used in a wide range of areas to make decisions that impact our day-to-day lives: recruitment, healthcare, criminal justice, credit risk scoring, and so on. They are being used not just by private businesses but also by governments.

One of the supposed benefits of using AI, or machines in general, for decision-making is that they may be impartial and objective, free of the biases humans carry, and hence more “fair”. Recent studies, however, have shown that AI systems can be biased as well.

Imagenet, the public…

All of us have heard phrases like “Data is the new oil”. It’s also well known that most Deep Learning models are pretty data-hungry, and getting appropriate (labelled) data is tough and expensive. Given all this, a natural thing to do is to squeeze as much as possible out of the data you already have. I’ll go over some techniques that help us do that.

Following is a list of items that I’ll cover in this blog.

  1. Augmentation
  2. Transfer Learning
  3. Semi-Supervised Learning / Pseudo-Labeling
  4. Simulation

Please note that NOT all…
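As a rough taste of the first item, here is a minimal augmentation sketch in NumPy. The image array and the flip-based transform are hypothetical toy choices; the point is that label-preserving transforms manufacture extra training examples from data you already have:

```python
import numpy as np

def augment(image, rng):
    """Return a randomly flipped copy of a 2-D image array.

    Flips are a common label-preserving augmentation: they create a
    'new' training example without requiring any new labels.
    """
    out = image.copy()
    if rng.random() < 0.5:   # flip left-right half the time
        out = out[:, ::-1]
    if rng.random() < 0.5:   # flip up-down half the time
        out = out[::-1, :]
    return out

rng = np.random.default_rng(0)
img = np.arange(6).reshape(2, 3)                  # toy 2x3 "image"
batch = [augment(img, rng) for _ in range(4)]     # four augmented variants
```

Each variant has the same shape and pixel values as the original, just rearranged, so the original label still applies.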

Attention, Transformer, BERT and more.

In a previous post, I wrote about two important recent concepts in NLP: Word Embeddings and RNNs. In this post, I’ll cover Attention and the Transformer, which have become the building blocks of most state-of-the-art models in NLP at present. I’ll also review BERT, which made the powerful concept of transfer learning easier in NLP.

As in my earlier post, I’ll skip most of the mathematics and focus on intuitive understanding; excessive notation and equations are a turn-off for many, myself included. …
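For readers who do want a peek under the hood, scaled dot-product attention (the core operation of the Transformer) can be sketched in a few lines of NumPy. This is a toy single-head version with made-up inputs, no masking, and no learned projections:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to every key; output is a weighted sum of values.

    weights = softmax(Q K^T / sqrt(d_k)), then output = weights @ V.
    """
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # each row sums to 1
    return weights @ V, weights

# Three 4-dimensional token vectors attending to each other (self-attention)
x = np.random.default_rng(0).normal(size=(3, 4))
out, w = scaled_dot_product_attention(x, x, x)
```

Each row of `w` is a probability distribution saying how much that token “looks at” every other token, which is exactly the intuition the post builds on.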

2018 has been widely touted as the “ImageNet moment” of NLP due to the release of pre-trained models and open-source algorithms by big players like Google, OpenAI, and Facebook. In this blog, I have tried to summarize some of the important advancements in the field.

I have spent a good fifteen years in the field of Data Science and AI and have worked a fair bit on NLP in that period, but the pace at which things have moved in the last 2–5 years is unprecedented. …

Moiz Saifee

Senior Principal at Correlation Venture. Passionate about Artificial Intelligence. Kaggle Master; IIT Kharagpur alum
