Boosting Data Value with Neural Embeddings


By Bernard Abayowa, 84.51° Director of Data Science


Introduction

The data available to businesses has grown significantly over the years. Large amounts of data are generated daily from business operations, customer engagements, and external sources, and this data must be analyzed to support strategic and operational decisions. However, several barriers make turning data into insights challenging, including data cleaning, curation from various sources, validation, and ultimately feature generation.

In this blog post, we discuss the challenges associated with generating features from data and how neural embeddings can alleviate them, enabling businesses to get more out of their data for far less cost and effort.

Feature Generation Barriers

Data often contains hidden insights that are valuable to businesses but difficult and expensive to acquire. Machine learning algorithms can extract these insights and solve complex, data-rich business problems. However, they require data to be transformed into features suitable for making predictions. The common approach businesses use for generating features is the manipulation of data with domain knowledge to create new variables. This approach is often referred to as feature engineering or handcrafting. Typically, feature engineering relies on the skill of a domain expert to decide which features to create and how to create them. This step is manual and labor-intensive.

The sources and structure of the data required to build contextually relevant and efficient models have grown significantly over the years. Handcrafting features from this data is time-consuming and often produces very high-dimensional feature sets that quickly become difficult to maintain. The problem is more pronounced in real-time scenarios, where computational and memory constraints must be met to satisfy service-level agreements.

Moreover, there is quite a bit of domain knowledge that we cannot fully explain or put into hand-coded formulas or rules. This includes complex associations beyond our explicit understanding, acquired through our senses or experience over time, as well as associations beyond our awareness. Some of this knowledge can significantly boost the discriminative and generative power of machine learning models, yet it is often ignored in feature engineering.

Furthermore, handcrafted features usually contain sensitive information, and privacy and security are more important than ever in business. This suggests the need for ways of representing data that prevent the reconstruction of sensitive information without degrading the performance of machine learning models.

Many of the above challenges associated with feature engineering can be alleviated with Neural Embeddings.

What are Neural Embeddings?

Neural embedding is the transformation of high-dimensional data into a low-dimensional vector space that reflects the semantic or functional similarities of concepts in the data. It converts human-readable text and numbers into matrices that are meaningless to humans but represent the original data in a form readily usable by machine learning algorithms. Embeddings can also encode implicit information in data that is difficult to explain, and they are more privacy-preserving, with better security properties, than engineered features.

The machinery for generating neural embeddings is the neural network, deep or shallow: a general-purpose framework for learning representations directly from raw input with little to no feature engineering.

Neural embedding transforms data into a lower-dimensional vector space that reflects the semantic or functional similarities of concepts in the data
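
As a minimal sketch of the idea, the PyTorch snippet below maps hypothetical product IDs to learnable low-dimensional vectors with an embedding layer. The vocabulary size, embedding dimension, and IDs are illustrative assumptions, not values from any real catalog.

```python
import torch
import torch.nn as nn

num_products = 10_000  # assumed size of the product vocabulary
embedding_dim = 64     # assumed size of the low-dimensional vector space

# Each product ID is mapped to a learnable 64-dimensional vector.
product_embeddings = nn.Embedding(num_products, embedding_dim)

product_ids = torch.tensor([42, 873, 9001])  # hypothetical product IDs
vectors = product_embeddings(product_ids)    # shape: (3, 64)
print(vectors.shape)
```

After training, products that play similar roles in the data end up with nearby vectors.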

Types of Neural Embeddings

Neural embeddings learn features from data such that similar inputs map to similar vectors in the embedding space. However, the semantic or functional similarity patterns a model learns vary with the input data structure and the training process.

The input data structures include unstructured data, such as sequences of words in a sentence, groups of products in a transaction, clickstreams, images, and sensor data; interaction data, such as customers and the products they purchase; hierarchical data, such as product taxonomies; and graph data, such as social networks and customer or product knowledge graphs. The most common type of data in business, tabular data, can also be transformed into lower-dimensional embeddings.

Neural embeddings are usually extracted from intermediate or final activations of neural network models trained in a supervised or self-supervised fashion.

To generate supervised embeddings, we train a neural network model to map input data to target labels, much as we would train a traditional supervised learner such as gradient boosting. We start with a set of features relevant to our prediction task and then train a neural network end-to-end for classification, ranking, or regression.

A supervised model with combined customer and product input features and embeddings
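
To make this concrete, here is a minimal sketch in PyTorch: a small feed-forward classifier whose penultimate activations serve as the embedding. The feature count, embedding dimension, and class count are hypothetical.

```python
import torch
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self, n_features, embedding_dim, n_classes):
        super().__init__()
        # The encoder's output is the intermediate activation kept as the embedding.
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(),
            nn.Linear(128, embedding_dim), nn.ReLU(),
        )
        self.head = nn.Linear(embedding_dim, n_classes)

    def forward(self, x):
        emb = self.encoder(x)
        return self.head(emb), emb

model = Classifier(n_features=20, embedding_dim=32, n_classes=5)
x = torch.randn(8, 20)         # a batch of hypothetical input features
logits, embeddings = model(x)  # train on `logits`; reuse `embeddings` afterward
```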

Another way to generate supervised embeddings is the metric learning approach. Here, we start with a set of input entities we want to associate, such as customers and relevant products, and then train a neural network to generate separate embedding vectors for the entities such that their level of association can be computed with a similarity metric, such as the dot product of the two vectors.

The metric learning approach is especially useful in real-time services where there may be limitations on memory and computational resources, and in few-shot classification scenarios where we have many classes, but few labels are available for each class.

Metric learning approach with separate customer and product input features and output embeddings
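
A minimal sketch of the two-tower metric learning setup, with hypothetical customer and product feature dimensions: each tower produces an embedding, and the dot product of the two vectors scores their association.

```python
import torch
import torch.nn as nn

class TwoTower(nn.Module):
    def __init__(self, customer_dim, product_dim, embedding_dim=32):
        super().__init__()
        self.customer_tower = nn.Sequential(
            nn.Linear(customer_dim, 64), nn.ReLU(), nn.Linear(64, embedding_dim)
        )
        self.product_tower = nn.Sequential(
            nn.Linear(product_dim, 64), nn.ReLU(), nn.Linear(64, embedding_dim)
        )

    def forward(self, customer_features, product_features):
        c = self.customer_tower(customer_features)  # customer embedding
        p = self.product_tower(product_features)    # product embedding
        return (c * p).sum(dim=-1)                  # dot-product affinity score

model = TwoTower(customer_dim=10, product_dim=15)
scores = model(torch.randn(4, 10), torch.randn(4, 15))  # one score per pair
```

Because the towers are independent, product embeddings can be precomputed offline and only the customer tower run at request time.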

For self-supervised embeddings, we use the data to predict itself. This could involve predicting the present, past, or future context of the data: part of the data is used as input, while other parts serve as the supervisory signal for the prediction. Self-supervised training can also involve encoding and decoding the input in its entirety, as in autoencoders.

In self-supervised embeddings, we use the data to predict itself. For example, products or transaction embeddings can be generated by training a neural network to predict randomly masked products in a transaction
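
The sketch below illustrates the masked-product idea on a single hypothetical basket: the remaining products' embeddings are averaged to predict the masked item, and the prediction loss is what trains the embeddings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

num_products, embedding_dim = 10_000, 64
embed = nn.Embedding(num_products, embedding_dim)
predict = nn.Linear(embedding_dim, num_products)

basket = torch.tensor([12, 405, 77, 981])  # hypothetical transaction
mask_position = 2                          # hide the third product

context = torch.cat([basket[:mask_position], basket[mask_position + 1:]])
context_vector = embed(context).mean(dim=0)  # average the context embeddings
logits = predict(context_vector)             # score every product as the masked one

loss = F.cross_entropy(logits.unsqueeze(0), basket[mask_position].unsqueeze(0))
loss.backward()  # repeated over many baskets, this trains the product embeddings
```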

Business Use Cases

There are many applications of neural embeddings in business. We can group these applications into four categories: implicit insights discovery, segmentation and grouping, search and retrieval, and transfer learning.

Let’s look at each one of these use case categories.

Implicit Insights Discovery

Earlier we discussed the ability of neural embeddings to extract complex associations in data. These include implicit insights that are difficult or impossible to obtain through queries of structured or unstructured data. With neural embeddings, we can discover and visualize entity relationships and use that insight to solve a variety of business problems.

Some of the business solutions that can be built from these insights include similarity and complement data products, which businesses can use to help customers discover offerings, such as new or existing products, brands, or services, that are relevant to those they already like. The insights can also be used to generate personalized rankings of entities relevant to a customer in various business contexts.

Insights from product-to-product and customer-to-customer embedding similarities can be used to generate personalized product recommendations to customers
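
As an illustrative sketch, the snippet below scores product-to-product similarity with cosine similarity over a matrix of stand-in product embeddings (random here, learned in practice) and returns the most similar products as candidate recommendations.

```python
import numpy as np

# Rows are stand-in product embeddings; real ones would come from a trained model.
product_embeddings = np.random.rand(1000, 64).astype(np.float32)

def top_k_similar(query_idx, k=5):
    # Normalize rows so the dot product equals cosine similarity.
    vecs = product_embeddings / np.linalg.norm(product_embeddings, axis=1, keepdims=True)
    scores = vecs @ vecs[query_idx]
    ranked = np.argsort(-scores)
    return [i for i in ranked if i != query_idx][:k]

print(top_k_similar(42))  # indices of the 5 most similar products
```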

Furthermore, neural embeddings can be used to discover associations in complex networks. Businesses usually have many data points about customers and products, but this data often contains missing links. Techniques such as knowledge graph embeddings can be used to find these links and generate 360-degree views of customers or products.

Business Use Cases: Product and Service Recommendations for Customers, Complementary Products, Customer Marketing, Product and Customer Knowledge Graphs, Similar Entities (e.g., Products, Customers, Content, Brands, or Services)

Segmentation and Grouping

Segmentation and grouping techniques are used in business to gain insight into the market landscape and to improve customer experiences. They associate business entities, such as customers or products, based on shared qualities or characteristics.

As with clustering and traditional supervised techniques built on engineered features, neural embeddings can be used to build segmentations of business entities. However, the unique properties of embeddings make it easier to develop segmentations when few labels are available for supervised learning, or when the data is multi-modal or unstructured.

As in the segmentation use case, neural embeddings can also be used to solve problems involving the analysis of groups. The density of entities in the embedding space can be analyzed over time to identify trending topics such as product types and customer preferences. Moreover, we can use techniques such as hierarchical embeddings to refine taxonomies or discover a better organization of taxonomies that reflect customer needs. Furthermore, embeddings can also be used to identify outliers in a group of entities based on their locations in the embedding space.

Business Use Cases: Product Segmentation, Customer Preferences and Profiles, Product Taxonomy Building and Refinement, Business and Customer Trends Analysis

Embeddings form multi-dimensional clusters that can be manipulated to generate various segmentations such as customer and product segmentations from data
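
A minimal sketch of embedding-based segmentation: clustering stand-in customer embeddings with k-means from scikit-learn. The number of segments is chosen arbitrarily for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in customer embeddings; real ones would come from a trained model.
customer_embeddings = np.random.rand(5000, 32)

kmeans = KMeans(n_clusters=8, random_state=0, n_init=10)
segments = kmeans.fit_predict(customer_embeddings)  # one segment label per customer
print(np.bincount(segments))                        # customers per segment
```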

Search and Retrieval

Search interfaces are one of the most popular digital touchpoints customers use to find products and other business offerings. Businesses also use search and retrieval algorithms internally to drive information extraction systems for use in operations. However, most search and retrieval tools used in business rely on keyword-based linking of queries and data, or on hand-crafted features for non-text search scenarios. Neural embeddings enable the retrieval of entities that share semantic meaning despite different keywords or features, by incorporating contextual information that is difficult to handcraft with traditional techniques.

Furthermore, neural embeddings enable multimodal retrieval tasks like image-based product search or text-based image retrieval. Suppose you have an image of a product you would like to buy, but not the name. Multi-modal embedding search systems can be used to find products relevant to the query image. They integrate multiple data sources such as images and text or speech signals into a common embedding space where related items can be easily matched regardless of their input data modality.

Business Use Cases: Semantic Business Data Query, Text-Based Search, Image-Based Search, Voice-Based Search

Product images and text can be mapped into a common multi-modal embedding space to enable image-based retrieval of products. This is useful when a customer has an image of a product but does not remember its description, or simply for search convenience
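
A sketch of embedding-based retrieval using scikit-learn's nearest-neighbor index. The catalog vectors and the query vector are random stand-ins for the outputs of image or text encoders mapped into a shared embedding space.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

catalog_embeddings = np.random.rand(10_000, 128)  # stand-in catalog item vectors
query_embedding = np.random.rand(1, 128)          # e.g., from an image encoder

# Index the catalog and retrieve the closest items to the query.
index = NearestNeighbors(n_neighbors=10, metric="cosine").fit(catalog_embeddings)
distances, item_ids = index.kneighbors(query_embedding)
print(item_ids[0])  # the 10 catalog items nearest the query
```

At production scale, an approximate nearest-neighbor library would typically replace this exact index.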

Transfer Learning

The training or interpretation of machine learning models often requires labeled examples, which are laborious and difficult to obtain in many business scenarios. Transfer learning is one of the major breakthroughs of neural networks that alleviates this burden. It involves using a model or its neural embeddings as a starting point for training another model on a related task.

With transfer learning, businesses can pretrain models from structured or unstructured data. The resulting models or neural embeddings can then be used for downstream tasks such as classification, regression or ranking in the following ways.

Fixed feature extraction: In this scenario, embeddings are extracted from a pretrained model and then used independently, or in combination with engineered features, to train another model, which could be a neural network or a traditional model like gradient boosting. This is the most popular approach to transfer learning in business. It is helpful when the training data for the new task is similar to the data used to train the source model.
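
A minimal sketch of fixed feature extraction, assuming hypothetical embeddings already extracted from a frozen pretrained model: the embeddings are concatenated with engineered features and fed to a gradient boosting classifier.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

embeddings = np.random.rand(2000, 64)        # stand-in for frozen pretrained embeddings
engineered = np.random.rand(2000, 10)        # optional handcrafted features
labels = np.random.randint(0, 2, size=2000)  # hypothetical downstream labels

features = np.hstack([embeddings, engineered])
model = GradientBoostingClassifier().fit(features, labels)
print(model.score(features, labels))
```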

Finetuning: This approach involves retraining the source model to extract new embeddings or to solve a new task directly. Finetuning is useful when a large amount of training data is available but it differs substantially from the data used to train the source model. For example, a model trained for natural language generation can be finetuned for a classification task involving product descriptions. The benefit of this approach is faster training and convergence on the new task, which improves operational efficiency.
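
And a minimal sketch of finetuning in PyTorch, with a stand-in encoder where real pretrained weights would be loaded: a fresh task head is attached, and the pretrained layers receive a smaller learning rate than the new head, a common finetuning heuristic.

```python
import torch
import torch.nn as nn

# Stand-in encoder; in practice its weights come from the source model.
pretrained_encoder = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32)
)
new_head = nn.Linear(32, 3)  # e.g., a 3-class product-description classifier
model = nn.Sequential(pretrained_encoder, new_head)

# Smaller learning rate for pretrained layers, larger for the new head.
optimizer = torch.optim.Adam([
    {"params": pretrained_encoder.parameters(), "lr": 1e-4},
    {"params": new_head.parameters(), "lr": 1e-3},
])
```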

Business Use Cases: Numerous Marketing and Operational Tasks Involving Supervised Learning, Including Tabular Data Analysis, Image Analysis, and Text Analysis

An illustration of transfer learning: knowledge from a source model can be transferred to many target models to solve many tasks

Conclusion

Neural embeddings can help businesses extract hidden insights from their data that would otherwise require significant manual effort or expensive acquisition. They efficiently generate the informative features required for machine learning with minimal feature engineering and drive solutions for many business analysis tasks. In addition, neural embeddings can be used directly to solve business tasks involving semantic or functional similarity. There is therefore significant value in using neural embeddings in data-driven businesses.
