Incremental ML
How do you build an automated Machine Learning model?
As a Data Scientist, your basic role involves building models and then validating, refreshing and updating them over time. Why not automate that too? Why can’t the machine do that work for me and keep the model up to date? Reinforcement Learning learns by itself, but those methods are more challenging to deploy.
Incremental Learning, sometimes also called Self Learning, is a method that lets us model problems where a continuous stream of data keeps coming in.
Why do we need Incremental Learning?
The general practice with any model in use is either to validate it on the latest data or to refresh it with the current data. With validation, the model might still work well on the current data, but we can miss crucial information in the new data because we are relying on a model built on old data. With refreshing, we lose information if we rely only on the new data, while combining all the data might be computationally too expensive. Incremental Learning can help us use the information in the complete data at a lower computation cost. Its main aim is to keep what the existing model has learned and update the model over time as new information comes in. Instead of rebuilding the model on the complete data (the prior training set plus the new data feed), Incremental Learning updates the parameters of the existing model using only the new data.
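To make the idea concrete, here is a minimal sketch in Python. The "model" is deliberately trivial, just the mean and variance of a stream, and the class name RunningGaussian and the sample values are made up for illustration. It shows that folding in only the new data gives the same parameters as rebuilding on the complete data, without storing the old data.

```python
class RunningGaussian:
    """Keeps the mean and variance of a stream without storing past samples."""

    def __init__(self):
        self.n = 0        # number of samples seen so far
        self.mean = 0.0   # current estimate of the mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the existing parameters (Welford's method)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance; 0.0 until at least one sample has been seen.
        return self.m2 / self.n if self.n else 0.0


# Illustrative values only (an assumption, not real data).
old_data = [2.0, 4.0, 6.0]
new_data = [8.0, 10.0]

model = RunningGaussian()
for x in old_data:   # initial training on the prior data
    model.update(x)
for x in new_data:   # later, only the new data is processed
    model.update(x)

# Reference: rebuild from scratch on the complete data.
full = old_data + new_data
batch_mean = sum(full) / len(full)
batch_var = sum((x - batch_mean) ** 2 for x in full) / len(full)

print(model.mean, batch_mean)
print(model.variance, batch_var)
```

Real models rarely match the batch result exactly the way this toy does, but the pattern is the same: keep the parameters, discard the old data, and apply a small update for each new observation.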
Applications of Incremental Learning
Nowadays, many websites use some version of Incremental Learning to learn from the flood of users who keep visiting and returning. Incremental Learning can be used to learn users’ preferences from the stream of data and to optimize some of the decisions made on the website.
Shipping Service Website:
Consider a shipping service website that offers a shipping price based on details entered by the user (origin, destination, size of the parcel). Sometimes users choose our service and sometimes they do not. Here, the properties of the user (demographics, entered details, prior yes/no responses to the offered price) can be captured, and an optimized price can be offered so that the user has a high probability of choosing our service. Any Machine Learning algorithm (Logistic Regression, Neural Network, Tree Based model) can be used to predict the likelihood that a user chooses our service at a given price. What the Incremental Learning model does here is this: as we receive the continuous stream of user data, it updates the current model parameters with each new instance. Any change that influences users’ preferences (for example an economic one, where users become reluctant to pay a high or even the normal price) will also be picked up by the model itself over time.
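A rough sketch of this per-instance update, using Logistic Regression trained with stochastic gradient descent in plain Python. Everything specific here is an assumption for illustration: the class name OnlineLogisticRegression, the single scaled price feature, the learning rate, and the simulated acceptance behaviour. A real model would also use the demographics and other entered details.

```python
import math
import random


def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class OnlineLogisticRegression:
    """Logistic regression trained one example at a time with SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return sigmoid(z)

    def update(self, x, y):
        """One gradient step on the log loss for a single (x, y) pair."""
        error = self.predict_proba(x) - y
        for i, xi in enumerate(x):
            self.weights[i] -= self.learning_rate * error * xi
        self.bias -= self.learning_rate * error


# Simulated user stream (made up for illustration, not real data):
# users are assumed to accept less often as the offered price rises.
random.seed(0)
model = OnlineLogisticRegression(n_features=1)
for _ in range(5000):
    price = random.random()  # offered price scaled to [0, 1]
    accepted = 1 if random.random() < sigmoid(3.0 - 6.0 * price) else 0
    model.update([price], accepted)  # one update per new user response

p_low = model.predict_proba([0.1])
p_high = model.predict_proba([0.9])
print(f"P(accept | low price)  = {p_low:.2f}")
print(f"P(accept | high price) = {p_high:.2f}")
```

Because every response nudges the weights a little, a shift in user behaviour shows up in the parameters gradually, with no separate retraining job. The learning rate controls how quickly old behaviour is forgotten.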
Search Engine:
Another example is a product search website where we want to apply Incremental Learning to give users good search results. When a user types in a search query, e.g. the specifications of the mobile phone they are looking for, the website shows 10 phones out of all the phones that satisfy the user’s requirements. We would like to increase the likelihood that the user clicks one of the featured phones. The inputs to the model are the phone specifications from the search query and the correlation between the words in the search query and the phone’s name/description. The Incremental Learning model updates its parameters with each positive or negative response recorded for a given link and search query.
In my next article, I will go into more detail, with code and methodology, on how you can use this in a live project.
I can be reached on LinkedIn or through my website.
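Here is a small sketch of that feedback loop in Python. It is only an illustration under stated assumptions: the single feature (word overlap between query and description), the class name ClickModel, the learning rate, the phone descriptions and the click responses are all hypothetical.

```python
import math


def overlap(query, description):
    """Fraction of query words that also appear in the description."""
    q = set(query.lower().split())
    d = set(description.lower().split())
    return len(q & d) / len(q) if q else 0.0


class ClickModel:
    """Tiny online logistic model: P(click) from query/description overlap."""

    def __init__(self, learning_rate=0.5):
        self.w = 0.0  # weight on the overlap feature
        self.b = 0.0  # bias term
        self.lr = learning_rate

    def score(self, query, description):
        z = self.b + self.w * overlap(query, description)
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, query, description, clicked):
        """One gradient step for a single click / no-click response."""
        error = self.score(query, description) - (1.0 if clicked else 0.0)
        self.w -= self.lr * error * overlap(query, description)
        self.b -= self.lr * error

    def rank(self, query, descriptions, k=10):
        """Return the k descriptions with the highest predicted click probability."""
        ordered = sorted(descriptions, key=lambda d: self.score(query, d), reverse=True)
        return ordered[:k]


# Hypothetical catalogue and responses, for illustration only.
phones = [
    "Feature phone 2G single SIM",
    "Android phone 64GB dual SIM",
    "Android tablet 32GB",
]
query = "android 64gb dual sim"

model = ClickModel()
for _ in range(20):
    model.update(query, phones[1], clicked=True)   # user clicked this link
    model.update(query, phones[0], clicked=False)  # user ignored this link

ranked = model.rank(query, phones)
print(ranked)
```

A production ranker would use many more features, but the update step keeps the same shape: one small parameter change per recorded response.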
