Data, data everywhere, which ML models to run?

Anshita Solanki
Softweb Solutions Inc.
5 min read · May 29, 2018

Data experts often struggle to choose the most appropriate algorithm for a given challenge, and it’s easy to get overwhelmed by the options. To pick the right algorithm, you first need to understand the problem thoroughly, decide which specific area(s) you want to address, and define exactly what output your algorithm should provide.

The algorithms can be divided into 3 main categories:

Supervised Algorithms: The training data set contains inputs as well as the desired outputs (labels). During training, the model learns to map each input to its correct output so that it can predict the output for inputs it has not seen before.

Suppose, for classification, a neural network is fed thousands of labeled images of various animals. When an unlabeled image is then given to the trained system, it analyzes the image layer by layer and, based on its training, presents an appropriate output.

Unsupervised Algorithms: In this category, there is no target outcome. The algorithms look for structure in the data on their own, for example by clustering the data set into different groups.

The key difference between supervised and unsupervised learning is that the data are not labeled in unsupervised learning. Even though the pictures of monkeys don’t come with the label “monkey”, deep learning networks can still learn to group them together based on the features the pictures share.
We have explained these types of algorithms in detail in our blog post, Deep learning 101.
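
As a rough illustration of clustering without labels, here is a minimal Python sketch of k-means on one-dimensional data. The spend figures, the starting centers, and the function name kmeans_1d are assumptions made up for this example; they do not come from a real data set or library.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group 1-D values around the given starting centers (plain k-means)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical monthly spend per customer. No labels are supplied.
spend = [12, 15, 14, 90, 95, 88, 13, 92]
centers, clusters = kmeans_1d(spend, centers=[0, 100])

print(centers)   # [13.5, 91.25]
print(clusters)  # [[12, 15, 14, 13], [90, 95, 88, 92]]
```

The algorithm is never told which customers are "low spenders" or "high spenders"; the two groups emerge from the data alone.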

Reinforcement Algorithms: These algorithms train themselves based on the success or failure of their output. With experience, they eventually learn to give near-accurate predictions.

We are going to cover the subsets of supervised machine learning algorithms in this blog.

Regression

Regression analysis is used to model the relationship between a dependent variable and one or more independent variables.

Forecasting

The most common use of regression in business is forecasting. Let’s say a company wants to estimate growth in sales based on the current economic scenario. Using past data together with recent company data that shows how sales have grown alongside the economy, a regression model can help the firm predict future sales.
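
To show what such a forecast might look like in practice, here is a minimal sketch of simple linear regression (ordinary least squares with one independent variable) in plain Python. The growth figures are invented for illustration, and fit_line is a hypothetical helper name, not part of any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: economic growth (%) and sales growth (%) per year.
economic_growth = [1.0, 2.0, 3.0, 4.0]
sales_growth = [2.5, 4.5, 6.5, 8.5]

slope, intercept = fit_line(economic_growth, sales_growth)

# Forecast sales growth if the economy were to grow by 5%.
forecast = slope * 5.0 + intercept
print(slope, intercept, forecast)  # 2.0 0.5 10.5
```

Real data would not fall on a perfect line like this; the toy numbers are chosen so the fitted slope and intercept are easy to check by hand.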

Analysis of data from point-of-sale systems and purchase accounts enables companies to highlight market patterns, such as an increase in demand on certain days of the week or at certain times of the year.

Optimization

A data-driven approach removes guesswork from decision making. It improves business performance by highlighting the areas that have the greatest impact on operational efficiency and revenue. It helps organizations optimize performance KPIs and capacity planning, while also enabling them to rapidly identify issues in operations and act on them immediately to avoid downtime.

Example
Amazon Prime Video uses regression models to answer questions like:

  • Which genre of movie / series is most interesting to a particular customer?
  • How many customers prefer watching the same movie / series?
  • What should be recommended next to a viewer who has watched a certain kind of movie / series?

Time series

Time series forecasting methods produce forecasts based solely on historical values. They are widely used in business situations where forecasts of a year or less are required.

Identifying trends

Trends are consecutive increases or decreases in a measurement over time. Time series algorithms are used to understand this rise and fall of demand. Trend reports drawn from time series charts can be useful to managers when measurements show an increase or decrease in sales for a particular product. The aim is to understand patterns that evolve over time and use them to predict future behavior such as monthly sales, weekly ER volumes, or stock prices.
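
One simple way to make a trend visible is to smooth the series with a moving average, which averages out month-to-month noise. The sketch below assumes made-up monthly sales figures and a window of three months; moving_average is a hypothetical helper written for this example.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values in the series."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


# Hypothetical monthly sales: noisy from month to month, but rising overall.
monthly_sales = [100, 120, 90, 130, 150, 125, 160, 180]

smoothed = moving_average(monthly_sales, window=3)

# The raw series dips twice, yet every smoothed value exceeds the previous one.
is_upward = all(later > earlier for earlier, later in zip(smoothed, smoothed[1:]))
print(is_upward)  # True
```

A larger window smooths more aggressively but reacts more slowly to genuine changes in direction.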

Seasonality or Seasonal Variation

Seasonal variations tend to repeat from year to year. Measuring how data points vary and comparing them from year to year can reveal seasonal fluctuation patterns that serve as the basis for future forecasts. This type of information is of particular importance to markets whose products fluctuate seasonally, such as commodities and clothing retail businesses. For retailers, for instance, time series data may reveal that consumer demand for winter clothes spikes at a distinct time period each year, information that would be important in forecasting production and delivery requirements.
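
A common first step in measuring seasonality is a seasonal index: the average for each season divided by the overall average. The sketch below assumes two years of invented quarterly sales for winter clothing; seasonal_indices is a hypothetical helper, and real analyses would usually remove the trend first.

```python
def seasonal_indices(series, period):
    """Ratio of each season's average to the overall average.

    An index above 1.0 means that season runs above the typical level.
    """
    overall = sum(series) / len(series)
    indices = []
    for position in range(period):
        # Every `period`-th value starting at `position` is the same season.
        same_season = series[position::period]
        indices.append((sum(same_season) / len(same_season)) / overall)
    return indices


# Hypothetical quarterly sales of winter clothing for two years (Q1..Q4, Q1..Q4).
quarterly_sales = [120, 60, 40, 180, 130, 70, 50, 150]

indices = seasonal_indices(quarterly_sales, period=4)
peak_quarter = indices.index(max(indices)) + 1
print(peak_quarter)  # 4
```

Here the fourth quarter has the highest index (about 1.65), which matches the winter-clothing spike described above.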

Example
Uber records the times at which customers open the app on their smartphones and applies time series analysis to predict the trend in ride demand over the following hours. It analyzes users’ demand to determine the price of a ride. Time series analysis helps Uber predict demand at scale.

Classification

Classification allows users to identify the category to which a new or incoming set of data belongs.
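
As a small illustration, here is a sketch of a nearest-centroid classifier in plain Python: it averages the labeled examples of each category, then assigns a new data point to the category whose average is closest. The customer features, the labels, and the function names are all assumptions invented for this example.

```python
import math


def train_centroids(samples):
    """Compute the mean feature vector for each label.

    `samples` is a list of (features, label) pairs.
    """
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Hypothetical labeled customers: (visits per month, average order value).
training_data = [
    ((2, 20), "occasional"),
    ((3, 25), "occasional"),
    ((12, 80), "frequent"),
    ((15, 90), "frequent"),
]

centroids = train_centroids(training_data)
print(classify(centroids, (10, 70)))  # frequent
print(classify(centroids, (1, 15)))   # occasional
```

Because the features here are on different scales, a real application would normally standardize them before measuring distance.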

Customer Segmentation

Classification algorithms enable companies to group customers into segments based on specific variations among them. These segments account for customer differences across multiple dimensions such as demographics, browsing behavior, and buying patterns. Connecting these traits allows data-savvy companies to roll out highly personalized and targeted marketing campaigns that are more effective at boosting sales than generalized campaigns.

Image classification

Image classification identifies the relevant features of an image even in the presence of potential complications such as variation in the point of view, illumination, scale, or the volume of clutter in the picture. It has a wide range of business applications including modeling 3D plans for construction sites based on 2D designs, social media photo tagging, informing medical diagnoses, and more.

Example
Netflix divides the user base of more than 99 million global members into segments of users with similar movie and TV series preferences and displays recommendations based on what’s popular in those communities.

Solve your data challenges with SIA — Softweb Intelligence and Analytics platform

SIA, a machine learning platform by Softweb Solutions, is empowering firms to make business savvy and data-driven decisions. With features like data cleansing, data ingestion, data preparation, and feature engineering, SIA allows you to import structured and unstructured data into the system from different data sources, helps you to remove data that might distort analysis, and enables you to quickly identify variances and standardize the format. It provides companies with the desired outcome by applying methods like classification, regression analysis, and predictive analysis.

To learn more about the business benefits of our machine intelligence platform, you can talk to our experts.

Originally published at www.softwebsolutions.com on May 29, 2018.
