What happens when Machines are curious to learn?

Questioning to Quench the Quest

The year 2017 was an exciting one for AI/ML. Many advancements (technical papers, projects) were proposed that help us evolve, in baby steps, from today's narrow AI toward more general AI (Artificial General Intelligence).

Starting from Hinton's Capsule Networks as an alternative to Convolutional Neural Networks, OpenAI's Evolution Strategies as a scalable alternative to reinforcement learning, Uber's neuroevolution papers, AlphaZero learning from scratch without human intervention and defeating top Go play as well as Stockfish (a grandmaster-level chess engine), and distributed AutoML systems such as MIT's ATM (Auto-Tuned Models), these are some of the state-of-the-art benchmarks of 2017. Apart from this, many exciting AI/ML frameworks were open sourced by tech giants in 2017.

Deep learning frameworks of 2017

Some interesting projects from Microsoft are also eye-opening for the future AI space.

Machine Learning in another Dimension:

Let's look at machine learning from another perspective:

(i) Our ancestors used to predict the future by devising their own kind of time series, trend, and seasonality analysis using horoscopes (birth charts). They used patterns in past data to predict the future with some sort of calculation (math).

Astrologers as modern-day data scientists?!

(ii) Time travel has always been a mystery, with many theories revolving around the topic. Will time travel become a reality with advancements in quantum mechanics and computing?

I believe that at some point we will be able to time travel (not time travel in the literal sense, but the ability to see the future). You don't really need to travel at the speed of light; just travel deep into the world of data science and machine learning. For example, Generative Adversarial Networks can predict the future frames of a video; if you could feed in the life history of a person, a model might predict near-future events. Exciting!

Travelling through time with the help of Neural Networks!?

So, what do we mean by making machines curious to learn? Enter Active Learning.

First of all, how do machines learn? For that matter, how do humans learn?

“People learn from experience; machines learn by example.”

So, how we give those examples to the machine directly determines the accuracy of its predictions.

Active learning is a semi-supervised learning strategy in which the model improves by asking questions. The key hypothesis is that if the learner is allowed to choose the data from which it learns (to be active, curious, or exploratory, if you will) it can perform better with less training; i.e., active learning chooses which examples to learn from instead of passively consuming human-annotated labels. The key intuition behind active learning is:

“Better examples make learning easier for humans as well as machines.”

Supervised Learning Strategy: In supervised learning (passive learning), all the examples (labelled training data) are given to the machine at once; the machine identifies the underlying patterns in the data and predicts future events, as shown in the figure, in three steps:

Step 1: From the unlabeled raw data, a few examples are chosen (randomly) by a human annotator, who labels them to produce training instances (examples).

Step 2: This set of human-chosen examples is given to the machine (a supervised learning algorithm).

Step 3: The machine (passively) learns the underlying pattern from the given annotated data.

The supervised way of learning, where the human annotator gives examples to the machine
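To make this passive workflow concrete, here is a minimal, hypothetical sketch using scikit-learn: a random subset of the pool is "annotated" up front and the model is trained once on it. The synthetic dataset and sample sizes are illustrative assumptions of my own, not part of any particular pipeline.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# a stand-in for the raw pool (labels exist here only to simulate an annotator)
X_pool, y_pool = make_classification(n_samples=1000, n_features=5, random_state=0)

# step 1: the human annotator labels a random subset of the pool
rng = np.random.default_rng(0)
chosen = rng.choice(len(X_pool), size=100, replace=False)

# steps 2 and 3: the labelled examples are handed to the machine, which learns passively
model = LogisticRegression().fit(X_pool[chosen], y_pool[chosen])
print("training accuracy:", model.score(X_pool[chosen], y_pool[chosen]))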

The drawback of the supervised technique is the cost of labeling raw data, which is time consuming and expensive. For example, in speech recognition, annotating audio at the word level can take roughly ten times longer than the duration of the audio itself; other expensive annotation tasks include document classification, object recognition, etc.

Active learning, on the other hand, takes the raw (unlabeled) data and tries to find patterns in it with the minimum available examples; when the learner doesn't know how to proceed, it asks a human annotator for help on the most difficult instances, and so trains a model with far less labeled data.

Active learning, where the machine chooses the apt examples to learn the pattern

How to make machines curious to Learn?

There are several different situations in which the learner may be able to ask queries. The three main settings that have been considered are

(i) membership query synthesis,

(ii) stream-based selective sampling, and

(iii) pool-based sampling

Membership Query Synthesis: As the name suggests, the learner synthesizes queries. It may request labels for any unlabeled instance in the input space, including queries that the learner generates de novo (newly generated samples) rather than ones sampled from some underlying natural distribution.

A recent innovation in this type of query synthesis uses GANs (Generative Adversarial Networks): a generator is trained to synthesize queries, an approach the authors call Generative Adversarial Active Learning (GAAL).

Stream-Based Selective Sampling: This setting doesn't generate any query samples; it only samples from the actual distribution, and the learner decides whether or not to request a label from the annotator. That is, each unlabeled instance is typically drawn one at a time from the data source, and the learner must decide whether to query or discard it, as sketched below.
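Here is a minimal, hypothetical sketch of stream-based selective sampling with a scikit-learn classifier; the confidence threshold and the least-confidence criterion are illustrative choices of my own, not a prescription.

from sklearn.linear_model import LogisticRegression

def stream_selective_sampling(X_seed, y_seed, stream, oracle, threshold=0.6):
    """Start from a small labelled seed set (containing both classes), then
    decide per incoming instance whether to query the annotator or discard it."""
    X_labeled, y_labeled = list(X_seed), list(y_seed)
    model = LogisticRegression().fit(X_labeled, y_labeled)
    for x in stream:                                  # instances arrive one at a time
        confidence = model.predict_proba([x]).max()   # how sure is the current model?
        if confidence < threshold:                    # too uncertain: query the oracle
            X_labeled.append(x)
            y_labeled.append(oracle(x))               # oracle = the human annotator
            model = LogisticRegression().fit(X_labeled, y_labeled)
        # otherwise the instance is simply discarded without labeling
    return model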

Pool-Based Sampling: In this setting, a large pool of unlabeled data is collected from the data source; the learner evaluates the whole pool and selects the best instance(s) to send for annotation, as in the sketch below.
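A minimal sketch of pool-based sampling under the same assumptions (a scikit-learn-style classifier with predict_proba, least-confidence scoring; the helper name is my own):

import numpy as np

def pool_based_query(model, X_pool):
    """Score every instance in the unlabeled pool and return the index
    of the one the model is least confident about."""
    probs = model.predict_proba(X_pool)        # shape: (n_pool, n_classes)
    confidence = probs.max(axis=1)             # confidence of the top class
    return int(np.argmin(confidence))          # least confident = most informative

# usage sketch: query_idx = pool_based_query(model, X_pool)
# a human then labels X_pool[query_idx] and the model is retrained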

Query Strategies (the technique behind curiosity): Making machines curious is not a trivial task. There are several strategies that make a machine curious (or at least act or mimic as if it were), such as:

Uncertainty Sampling: This is the most widely used technique; it queries the instances from the data source about whose labels the model is least certain (see the sketch after this list of strategies).

Query-by-Committee (QBC): The QBC approach maintains a committee of learners C = {θ(1), …, θ(C)}, all trained on the current labelled set. Each learner in the committee is asked to predict the label of each unlabeled instance, and the most informative query is considered to be the instance about which they disagree the most.

Expected Model Change: In this strategy, the learner chooses the instances that would cause the greatest change to the underlying model if their labels were known.

Expected Error Reduction: Unlike expected model change, this strategy queries the instances that would most reduce the model's generalization error.
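To make uncertainty sampling concrete, here is a small, hypothetical sketch of the three measures most commonly used to score how "uncertain" a prediction is (least confidence, margin, and entropy). The function names are my own, and the probabilities are assumed to come from any scikit-learn-style predict_proba output.

import numpy as np

def least_confidence(probs):
    # 1 - probability of the most likely class; higher = more uncertain
    return 1.0 - probs.max(axis=1)

def margin(probs):
    # gap between the top two class probabilities; smaller = more uncertain
    part = np.sort(probs, axis=1)
    return part[:, -1] - part[:, -2]

def entropy(probs):
    # Shannon entropy of the predicted distribution; higher = more uncertain
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

# usage sketch: probs = model.predict_proba(X_pool)
# query_idx = np.argmax(least_confidence(probs))   # or np.argmin(margin(probs))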

Active learning has immense potential for real-world use cases, and it has the power to transform feature engineering, one of the most crucial and time-consuming parts of the data science pipeline. I strongly believe active learning should be a standard step in every data science pipeline for making accurate predictions on real-time data. (For more on active learning, refer to this survey literature.)

Let's look at active learning in a real-world setting:

Say you want to predict absentees in your office (your features might be health status, family background, traffic, etc.). You build a classifier (say, logistic regression) and predict whether each employee will be absent or present. At the end of the day you have the ground truth about who was actually absent; you validate your predictions against it and add those newly labelled examples back into your model. What does this mean? Is this active learning? Of course it is: we start from a small set of examples (the training instances for day 1) and add more examples on a daily basis, which improves the accuracy of the model, as in the sketch shown below.
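A minimal, runnable sketch of that daily loop with scikit-learn; the simulate_day() helper and its random features are stand-ins of my own for a real HR data feed, not an actual data source.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def simulate_day(n=20):
    """Stand-in for real data access: random features plus a noisy rule
    for absent (1) / present (0). Replace with your own pipeline."""
    X = rng.normal(size=(n, 3))                  # e.g. health, family, traffic
    y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y

# day-1 seed data
X_train, y_train = simulate_day()
model = LogisticRegression().fit(X_train, y_train)

for day in range(2, 31):                         # each new working day
    X_today, y_today = simulate_day()            # ground truth arrives at day end
    preds = model.predict(X_today)               # morning predictions
    print(f"day {day}: accuracy {(preds == y_today).mean():.2f}")
    X_train = np.vstack([X_train, X_today])      # add the newly labelled examples
    y_train = np.concatenate([y_train, y_today])
    model = LogisticRegression().fit(X_train, y_train)   # retrain with more data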

So active learning is awesome. How do we implement it? Are there any open source libraries available?

Yes, there are some open source libraries that give us this power:

For Python:

libact: pool-based active learning, including the Active Learning By Learning meta strategy

scikit-learn: semi-supervised algorithms

Curious Snake

For R:

An active learning library for R

There is also a Java-based text annotation tool called DUALIST.

Let's look briefly at the active learning library libact in this article; for more details, follow their paper available here. The libact library provides interfaces for designing an active learning strategy. These interfaces are:

Dataset: This object stores the labeled and unlabeled sets (unlabeled instances carry None as their label). Each example within a Dataset object is assigned a unique identifier; after a label (ground truth) is obtained from the annotator, it is placed in the right position using this identifier.

Model: This object represents a supervised classification algorithm; it can also wrap scikit-learn's classification algorithms.

QueryStrategy: This object implements an active learning algorithm. The make_query method of a QueryStrategy object returns the identifier of the unlabeled example that the object (the active learning algorithm) wants to query.

Labeler: This object plays the role of the oracle in the given active learning problem. Its label method takes an unlabeled example and returns the retrieved label.

So let's look at the sample usage of libact (Python code given in their documentation):

# X - feature matrix of the training pool
# y - ground truth; unlabeled instances are stored as None
dataset = Dataset(X, y)                    # step 1: wrap the pool in a Dataset
query_strategy = QueryStrategy(dataset)    # declare a QueryStrategy instance
labeler = Labeler()                        # declare a Labeler (oracle) instance
model = Model()                            # declare a Model instance

for _ in range(quota):                     # loop over the query budget
    query_id = query_strategy.make_query()             # pick the most informative example
    lbl = labeler.label(dataset.data[query_id][0])     # ask the oracle for its label
    dataset.update(query_id, lbl)                      # store the newly obtained label
    model.train(dataset)                               # retrain on the updated Dataset

The libact library also provides an "active learning by learning" interface that selects the best query strategy on the fly from a set of candidates (for example UncertaintySampling, HintSVM, etc.), which the authors call ALBL (Active Learning By Learning).

Here is example code for ALBL:

from libact.query_strategies import ActiveLearningByLearning
from libact.query_strategies import HintSVM
from libact.query_strategies import UncertaintySampling
from libact.models import LogisticRegression

qs = ActiveLearningByLearning(
    dataset,                # Dataset object
    T=100,                  # qs.make_query() can be called at most 100 times
    query_strategies=[
        UncertaintySampling(dataset, model=LogisticRegression(C=1.)),
        UncertaintySampling(dataset, model=LogisticRegression(C=.01)),
        HintSVM(dataset)
    ],
    model=LogisticRegression()
)
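Roughly speaking, ALBL treats strategy selection as a multi-armed bandit problem: on each call to qs.make_query(), the meta-learner decides which of the candidate strategies to trust based on how well each has performed so far, so the qs object can be dropped straight into the same query loop shown earlier.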

Many current research publications focus on combining state-of-the-art techniques such as reinforcement learning, GANs, and deep learning with active learning. Looking forward to many more such interesting combinations with active learning in 2018.

Spread Intelligence with Love!

Love all;