Building and Deploying over 20 ML Apps with code for free.

Chituyi
5 min read · Mar 10, 2024

👋👋 Follow me in this series as I demonstrate how businesses can quickly onboard ML/AI to improve productivity by understanding the inner workings, strengths, and weaknesses of some of the chosen algorithms.

Build and Deploy

Machine Learning Algorithms and Their Use Cases

Machine Learning (ML) is a powerful tool that allows computers to learn from data and make predictions or decisions without being explicitly programmed. Here, we will discuss common ML algorithms in various categories and their use cases.

As part of my commitment to the community, I will be building applications using all the algorithms marked ☑️ and deploying them for community reference. On some occasions, I will serve two or more models in the same application. I plan to deploy two projects per use case per month as I show the implementation of the marked algorithms, with a write-up about the experience. I will be using Plotly Dash and Flask to deploy and serve the models. Stay tuned for updates!

Predictive Analysis

Predictive analysis involves using historical data to predict future outcomes. Here are some commonly used algorithms.

  1. Linear Regression. This workhorse predicts continuous values like housing prices. Imagine predicting future sales based on historical sales trends and economic factors.
  2. Logistic Regression. Ideal for binary classifications (yes/no), it can predict customer churn (will they unsubscribe?) based on past behavior.
  3. Decision Trees. Easy to interpret, these flowchart-like structures can predict loan defaults by analyzing factors like income and credit score.
  4. Random Forest ☑️. Combining multiple decision trees, it improves prediction accuracy. Imagine predicting flight delays by considering weather patterns, aircraft maintenance data, and historical delays from various airlines.
  5. Gradient Boosting Machines (GBMs). An ensemble technique that builds on weaker models to create a powerful predictor. GBMs can be used for complex fraud detection systems, analyzing past fraudulent transactions to identify patterns in new transactions.
  6. XGBoost ☑️. An optimized implementation of gradient boosting, XGBoost is known for its speed, scalability, and built-in regularization. It can be a great choice for complex prediction tasks.
  7. LightGBM. Another GBM variant, LightGBM is known for its efficiency and handling of large datasets.
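To make the first item on this list concrete, here is a minimal sketch of simple linear regression using only the Python standard library. It fits a single-feature line by ordinary least squares; the function name and the ad-spend and sales figures are hypothetical and chosen only for illustration, and a real project would more likely rely on a library such as scikit-learn.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: ad spend (x) and sales (y), generated from y = 2x + 1.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_simple_linear_regression(ad_spend, sales)
predicted_sales = slope * 6.0 + intercept  # forecast for an unseen spend level
print(slope, intercept, predicted_sales)
```

The tree-based methods on the list follow the same fit-then-predict pattern, but replace the straight line with learned decision rules.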

Generative A.I.

Generative AI models generate new data instances that resemble your training data. Here are some examples.

  1. Generative Adversarial Networks (GANs) ☑️. Used to generate realistic images, music, speech, and text.
  2. Variational Autoencoders (VAEs) ☑️. Used for generating realistic faces, handwriting, etc.
  3. Transformer Models ☑️. Used for generating text, such as creating articles, writing poetry, etc.

Recommendations

Recommendation systems suggest products to users based on their past behavior. Some commonly used algorithms are:

  1. Collaborative Filtering ☑️. Recommends items by finding users who are similar to the target user and suggesting what those users liked.
  2. Content-Based Filtering ☑️. Recommends items by comparing the content of the items with a user’s profile.
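  3. Matrix Factorization ☑️. Decomposes the user-item rating matrix into low-dimensional user and item factors. It was popularized by the Netflix Prize movie recommendation competition.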
  4. Deep Learning-Based Recommendations ☑️. Uses neural networks to learn user-item interactions.
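As a rough illustration of user-based collaborative filtering, the sketch below scores items a user has not rated yet by a similarity-weighted average of other users' ratings. The user names, items, and ratings are hypothetical, and computing cosine similarity over co-rated items is one common choice among several.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity computed over the items both users have rated."""
    shared = [item for item in ratings_a if item in ratings_b]
    if not shared:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in shared)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(target, all_ratings):
    """Rank items the target has not rated by similarity-weighted average rating."""
    weighted_sum, weight_total = {}, {}
    for user, ratings in all_ratings.items():
        if user == target:
            continue
        similarity = cosine_similarity(all_ratings[target], ratings)
        for item, rating in ratings.items():
            if item in all_ratings[target]:
                continue  # only score items the target has not seen
            weighted_sum[item] = weighted_sum.get(item, 0.0) + similarity * rating
            weight_total[item] = weight_total.get(item, 0.0) + similarity
    scores = {i: weighted_sum[i] / weight_total[i] for i in weighted_sum if weight_total[i] > 0}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ana": {"laptop": 5, "mouse": 4, "desk": 1},
    "ben": {"laptop": 5, "mouse": 5, "desk": 1, "monitor": 5, "lamp": 2},
    "cal": {"laptop": 1, "mouse": 2, "desk": 5, "monitor": 1, "lamp": 5},
}

recommendations = recommend("ana", ratings)
print(recommendations)
```

Because "ben" rates the shared items much like "ana" does, his high rating for the monitor carries more weight than "cal"'s low one.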

Computer Vision

Computer vision involves teaching computers to “see” and understand the content of digital images. Some commonly used algorithms are:

  1. Convolutional Neural Networks (CNNs) ☑️. Used for image classification, object detection, and more.
  2. YOLO (You Only Look Once) ☑️. A real-time object detection system.
  3. Vision Transformer (ViT) ☑️. A powerful image classification model that applies attention mechanisms, originally developed for NLP, to image data. Unlike traditional CNNs, ViTs break images into patches and process them like the words of a sentence, achieving high accuracy when pre-trained on large image datasets.
  4. Mask R-CNN ☑️. Used for instance segmentation, which involves detecting and delineating each distinct object of interest appearing in an image.
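The patch idea behind the Vision Transformer can be sketched in a few lines. The example below only shows the first step, cutting an image into flattened, non-overlapping patches; in an actual ViT each patch is then linearly projected and combined with a position embedding before attention is applied. The function name and the tiny 4x4 "image" are hypothetical.

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the patch row by row, like turning it into one "token".
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Hypothetical 4x4 grayscale image whose pixel values are 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, patch_size=2)
print(patches)
```

The resulting sequence of four patches plays the role that a sequence of word tokens plays in an NLP transformer.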

Comparing Data Sets

Comparing two datasets is a common task in data analysis. Some commonly used techniques are:

  1. t-Test. Used to compare the means of two groups.
  2. Chi-Square Test ☑️. Used to compare categorical variables.
  3. Mann-Whitney U Test ☑️. Used to compare two independent samples of ordinal or continuous data.
  4. Kolmogorov-Smirnov Test ☑️. Used to compare the empirical cumulative distribution functions of two samples.
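As an illustration of the last test on this list, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, which is the largest vertical gap between the two empirical cumulative distribution functions. It stops at the statistic and does not compute a p-value; the function name and sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b that is <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical samples: identical, partly overlapping, and fully separated.
same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
overlapping = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
separated = ks_statistic([1, 2, 3], [4, 5, 6])
print(same, overlapping, separated)
```

A statistic near 0 suggests similar distributions, while a value near 1 means the samples barely overlap.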

Named Entity Recognition (NER)

NER is a subtask of information extraction that seeks to locate and classify named entities in text. Some commonly used algorithms are:

  1. Conditional Random Fields (CRFs). A type of discriminative probabilistic model often used for NER.
  2. BiLSTM-CRF ☑️. Combines bidirectional LSTM and CRF for NER.
  3. BERT. A transformer-based model that has shown great success in NER.

Natural Language Processing (NLP)

NLP involves teaching computers to understand human language. Some commonly used algorithms are:

  1. Naive Bayes ☑️. Used for text classification, such as spam detection.
  2. LSTM (Long Short-Term Memory). Used for sequence prediction problems, such as language translation.
  3. BERT (Bidirectional Encoder Representations from Transformers). Used for a variety of NLP tasks, including question answering and sentiment analysis.
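To show how Naive Bayes handles the spam detection case mentioned above, here is a minimal multinomial Naive Bayes sketch with add-one (Laplace) smoothing. The four training messages and the function names are hypothetical, and whitespace tokenization is a deliberate simplification.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """documents: list of (text, label). Returns the counts needed for prediction."""
    class_doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        class_doc_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_doc_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-posterior for the given text."""
    class_doc_counts, word_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_doc_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][word]
            # Add-one smoothing keeps unseen word/class pairs from zeroing the score.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training messages.
training_data = [
    ("win free prize now", "spam"),
    ("free money offer", "spam"),
    ("meeting schedule for monday", "ham"),
    ("project status report", "ham"),
]
model = train_naive_bayes(training_data)
print(predict(model, "free prize"), predict(model, "status meeting"))
```

The "naive" part is the assumption that words are independent given the class, which is why the per-word log-probabilities can simply be added.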

Anomaly Detection

Anomaly detection involves identifying unusual patterns that do not conform to expected behavior. Some commonly used algorithms are:

  1. Isolation Forest ☑️. An algorithm that isolates anomalies instead of profiling normal data points.
  2. One-Class SVM. Used for novelty detection when the training set is not polluted by outliers.
  3. Autoencoders. Can be used for anomaly detection by reconstructing the input data and measuring the reconstruction error.
  4. Local Outlier Factor (LOF) ☑️. Measures the local deviation of the density of a given sample with respect to its neighbours.
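The isolation idea behind Isolation Forest can be sketched without any libraries. The simplified version below makes several assumptions that differ from production implementations: it uses the full dataset rather than random subsamples, it follows only the branch that contains the point being scored instead of building whole trees, and `n_trees`, `max_depth`, and the sample points are hypothetical choices.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def path_length(point, data, rng, depth=0, max_depth=10):
    """Number of random axis-aligned splits needed to isolate `point` within `data`."""
    if len(data) <= 1 or depth >= max_depth:
        return depth + c_factor(len(data))
    feature = rng.randrange(len(point))
    low = min(row[feature] for row in data)
    high = max(row[feature] for row in data)
    if low == high:
        return depth + c_factor(len(data))  # cannot split on this feature
    split = rng.uniform(low, high)
    if point[feature] < split:
        subset = [row for row in data if row[feature] < split]
    else:
        subset = [row for row in data if row[feature] >= split]
    return path_length(point, subset, rng, depth + 1, max_depth)


def anomaly_score(point, data, n_trees=200, seed=0):
    """Score in (0, 1]; values closer to 1 mean the point is isolated quickly."""
    rng = random.Random(seed)
    mean_depth = sum(path_length(point, data, rng) for _ in range(n_trees)) / n_trees
    return 2.0 ** (-mean_depth / c_factor(len(data)))


# Hypothetical 2D data: a tight cluster plus one far-away point.
data = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (0.1, 0.2),
        (0.0, 0.0), (0.2, 0.2), (0.1, 0.1), (5.0, 5.0)]
outlier_score = anomaly_score((5.0, 5.0), data)
inlier_score = anomaly_score((0.1, 0.1), data)
print(outlier_score, inlier_score)
```

The far-away point is usually separated by the very first split, so its average path is short and its score is high, which is the sense in which the algorithm "isolates anomalies instead of profiling normal data points".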

Unsupervised Learning

Unsupervised learning is a type of machine learning that looks for previously undetected patterns in a data set with no pre-existing labels and with a minimum of human supervision. Here are some commonly used algorithms.

  1. K-Means Clustering ☑️. Groups data points into ‘k’ clusters based on similarities. Imagine segmenting customers into different purchasing groups.
  2. Hierarchical Clustering. Creates a hierarchy of clusters, allowing for a more nuanced exploration of data relationships.
  3. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) ☑️. A density-based clustering algorithm.
  4. Principal Component Analysis (PCA). Reduces data dimensionality while preserving most of the information. This can be useful for image compression or preparing data for visualization.
  5. Apriori Algorithm ☑️. Discovers frequent itemsets and association rules in transactional data. This is commonly used for market basket analysis, recommending products that customers often buy together.
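The customer segmentation example for K-Means can be sketched as follows. This is a minimal version that alternates the assignment and update steps for a fixed number of iterations; the starting centroids are passed in explicitly to keep the result reproducible, whereas library implementations typically choose them randomly or with k-means++. The customer data and function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_clusters(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of its members (keep it if it has none)."""
    new_centroids = []
    for index, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == index]
        if not members:
            new_centroids.append(old)
            continue
        new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
    return new_centroids


def k_means(points, initial_centroids, n_iterations=10):
    """Run a fixed number of assign/update rounds and return (centroids, labels)."""
    centroids = list(initial_centroids)
    for _ in range(n_iterations):
        labels = assign_clusters(points, centroids)
        centroids = update_centroids(points, labels, centroids)
    return centroids, assign_clusters(points, centroids)


# Hypothetical customers as (monthly spend, monthly visits).
customers = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, labels = k_means(customers, initial_centroids=[customers[0], customers[3]])
print(centroids, labels)
```

Here the two groups correspond to low-spend and high-spend customers, which is the kind of purchasing segment the list item describes.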

Reinforcement Learning

Reinforcement learning is an area of machine learning concerned with how software agents should take actions in an environment to maximize cumulative reward. Here are some commonly used algorithms.

  1. Q-Learning ☑️. An off-policy reinforcement learning algorithm that seeks to find the best action to take given the current state.
  2. Deep Q Network (DQN) ☑️. Combines Q-Learning with a deep neural network that approximates the Q-function, which lets it scale to large state spaces.
  3. Policy Gradients. An algorithm that optimizes the parameters of an agent’s policy.
  4. Actor-Critic Methods. A hybrid method that combines value-based and policy-based methods.
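A small tabular example can make Q-Learning and its off-policy nature more tangible. In the sketch below the agent acts completely at random, yet the update rule still learns the values of the greedy policy. The five-cell corridor environment, the reward of 1 at the last cell, and the hyperparameters are all hypothetical choices made for illustration.

```python
import random

N_STATES = 5      # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic corridor dynamics: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train_q_table(episodes=500, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    """Learn Q-values while following a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)  # behavior policy: pure exploration
            next_state, reward, done = step(state, action)
            # Off-policy target: assume the best action is taken from next_state.
            best_next = 0.0 if done else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train_q_table()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

A DQN keeps the same update target but replaces the table with a neural network, which is what makes larger state spaces tractable.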

In conclusion, machine learning offers a wide range of algorithms for various tasks. The choice of algorithm depends on the task at hand, the nature of the data, and the specific requirements of the project. Always remember that no single algorithm works best for all tasks. It’s always a good idea to try out different algorithms and see what works best for your specific use case.

Check out free ML projects with code to get started here:

https://dallo7.github.io/

#MLDemocratizer!

Chituyi

Building data Pipelines for ML and AI to aid Supply Chain Agility and improve Customer Intimacy. https://dallo7.github.io/