

Learning to link images with their descriptions

This blog post introduces recent advances in multimodal information retrieval and conditional language models. In other words, it is about machine learning methods that deal with both image and text data.

Everything written here is the absolute ground truth, limited only by the author's understanding.

Why do we want to do this?

First of all, we need to explain why we are interested in extracting knowledge from both image and text data.




Heuritech is a cutting-edge artificial intelligence company that provides fashion brands with predictive analytics on trends. Read our Tech and Fashion blog.
