Shopin Visual AI product update 10/23/2018

Published in Shopin · 7 min read · Oct 23, 2018

Happy Tuesday, one and all.

The Shopin team has been hard at work pushing forward different aspects of our solution for retail on the blockchain.

Our current core focus has been advancements in our Visual AI engine. This involves intensive mapping and tagging of new categories in fashion retail, and training our artificial intelligence models.

These models power our ShopperIQ onboarding games for retailers as well as recognition of a retailer’s inventory, the shopper’s uploaded images and purchase history.

Below is a string of updates from our team:

Things to know:
Similar recommendations: recommendations within the same category.
E.g., recommending more bags similar to the bags you like.
Cross-category recommendations: recommendations across categories.
We recommend items from categories other than the one you choose. E.g., you like a scarf and we recommend a shirt, jacket or other “non-scarf” items that you may like.
Complete the look: suggest items that look good with your choices.
This is one of our most exciting products. We recommend other items that complete the look you’re purchasing. E.g., that scarf would look great with those shoes, briefcase, gloves, and suit. It’s a programmatic solution that delivers the experience of having a personal stylist with you wherever you go.
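
To make the difference between similar and cross-category recommendations concrete, here is a minimal sketch. It assumes each item has an embedding vector and ranks candidates by cosine similarity; the catalog, the vectors, and the names `CATALOG`, `cosine` and `recommend` are hypothetical and are not taken from the Shopin engine.

```python
import math

# Hypothetical catalog: item id -> (category, embedding vector).
# The vectors are made-up illustration values, not real model output.
CATALOG = {
    "bag_a": ("bags", [0.9, 0.1, 0.0]),
    "bag_b": ("bags", [0.8, 0.2, 0.1]),
    "scarf_a": ("scarves", [0.7, 0.3, 0.0]),
    "shirt_a": ("shirts", [0.1, 0.9, 0.2]),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(item_id, same_category, k=2):
    """Return up to k item ids ranked by similarity to item_id.

    same_category=True keeps only items from the same category (similar
    recommendations); False keeps only items from other categories
    (cross-category recommendations).
    """
    category, vec = CATALOG[item_id]
    candidates = [
        (cosine(vec, other_vec), other_id)
        for other_id, (other_cat, other_vec) in CATALOG.items()
        if other_id != item_id and (other_cat == category) == same_category
    ]
    candidates.sort(reverse=True)
    return [other_id for _, other_id in candidates[:k]]
```

Calling `recommend("bag_a", same_category=True)` stays within bags, while `same_category=False` only returns items from other categories.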

Product Update Commits Update 10/23/2018:

Complete the look (CTL)

Development for experiments on multi-layer BiLSTM
Research on hyper-parameters tuning for BiLSTM
Evaluation of co-occurrence model on test data
Submitting multiple training jobs to test hyper-parameters
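
For readers unfamiliar with co-occurrence models, the sketch below shows the general idea under simple assumptions: count how often two items appear in the same outfit, then suggest the items that most often accompany a chosen one. The outfit data and the function names are hypothetical, and the sparse `Counter` is an illustrative choice rather than a description of the team's actual matrix.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(outfits):
    """Count how often each ordered pair of items appears in the same outfit."""
    counts = Counter()
    for outfit in outfits:
        # set() ignores repeated items within one outfit.
        for a, b in combinations(sorted(set(outfit)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def complete_the_look(item, counts, k=3):
    """Return up to k items that co-occur most often with the given item."""
    scored = [(n, other) for (first, other), n in counts.items() if first == item]
    # Highest count first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]
```

Only pairs that were actually observed are stored, which keeps memory proportional to the data rather than to the square of the catalog size.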

Product Normalization — Text:

Finalizing dataset for NER — for all categories
Deploying NER for intimate wear in the AI pipeline

Product Normalization — Image:

Training of Jumpsuits length model
Benchmarking of the multi-task model against separate image models

Similar Recommendations:

Training for Skirts & Shorts model in progress
Integration of multi-layer in Bags model

Object Detection:

Deployment of “human” classifier
Testing of REST API
Deployment of Internal OD

Product Update Commits Update 10/10/2018:

Object Detection:

Training for Internal OD — Iteration 2 has been completed. Currently performing error analysis
Finished error analysis for men's onboarding — Iteration 2
Working on Lambda to SageMaker conversion

Complete the look (CTL)

Completed inference code for the multidimensional co-occurrence model. While running the final model, the team observed that it takes a considerable amount of time to produce outputs because of the very large size of the matrix. This task is currently on hold while the team researches a possible solution
Previously the model showed output for “complete the look” in random order, e.g. first tops, then shoes and then jeans. The team has modified the code so that the output is shown in a fixed logical order: first tops, then jeans and then shoes
Calculated a Top-K precision of 96% for the multi-query, multi-recommendation system
Modifying the AI pipeline lambda based on the results of the testing. This is expected to be completed today
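
Two of the items above lend themselves to a short illustration: showing results in a fixed category order, and measuring Top-K precision. The sketch below is a minimal version of both. The order list only contains the three categories named in this update, and the function names and the choice to divide by the number of returned items are assumptions.

```python
# Assumed display order; this update only names tops, jeans, then shoes.
DISPLAY_ORDER = ["tops", "jeans", "shoes"]


def order_look(items):
    """Sort (category, item_id) pairs by DISPLAY_ORDER; unknown categories go last."""
    rank = {category: i for i, category in enumerate(DISPLAY_ORDER)}
    return sorted(items, key=lambda item: rank.get(item[0], len(DISPLAY_ORDER)))


def precision_at_k(recommended, relevant, k):
    """Fraction of the first k recommendations that are in the relevant set.

    If fewer than k items were recommended, the denominator is the number
    actually returned (one common convention; others divide by k).
    """
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    return sum(1 for item in top_k if item in relevant) / len(top_k)
```

Because `sorted` is stable, items within the same category keep the order the model gave them.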

Product Normalization — Text:

Started working on the classifier model for category and gender. Currently, we are using output from the tree search logic. The data has been sent for tagging and will be used as input later on
Creating a combined PN model which will be deployed on lambda

Product Normalization — Image:

Multitask learning experiment on the pattern
Experimenting with the Jumpsuits neck model. Currently, the best Top-1 accuracy we have been able to reach is 75%. Most of the inaccuracy comes from a single jumpsuit having multiple neckline styles, which causes confusion when tagging the “Neck” style. The base data has been sent for tagging and will be used to improve the model
Training the model: Shorts

Similar Recommendations:

Trying to implement multitask learning so that a single model can be used for multiple attributes

DevOps/Engineering:

Performing a comparison between AWS & GCP for CICD

Product Update Commits Update 10/8/2018:

Object Detection:

Training for shoes finished on complete model images. The team is currently performing the error analysis for the same
Training completed for suit model for men's Onboarding — Iteration 2
Started Lambda to SageMaker conversion of the OD model

Complete the look (CTL)

Working on inference code for the multi-dimension co-occurrence model, expected to be completed today
Development of the autoencoder for training and inference is in its final stage and is also expected to be completed today
Completed first round of testing on AI pipeline. The team found an issue in which if we feed multiple files to the AI pipeline at the same time, there is no output for most of the files. For example — when we tried inputting 60 files to the pipeline, output was only available for 8 of them. The team is currently looking into it

Product Normalization — Text:

Finalizing the dataset which will be used as input for classifier model. Once finished, the team will send the data to taggers for tagging and train the model on that
Evaluation of NER Iteration-2 for intimate wear

Product Normalization — Image:

Performing error analysis for the neck model of jumpsuits. The team noticed that the same neck can have multiple styles, e.g. it can be both V-neck and collared, while the model gives only one output (V-Neck or Collared), which lowers the measured accuracy of the model
The data for the shorts length model has been sent for tagging. Once tagging is finished, we will train the model on it
For the Coat & Jacket neck model, we noticed that the model confuses the collar of the coat with the collar of tops. It picks up the collar of tops and gives a result based on that. Also, there can be multiple neck styles for the same coat. We are currently looking into possible ways of dealing with these challenges
Multitask learning experiment on patterns
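
The neckline issue described above can be illustrated with a small sketch: when a garment legitimately has two neck styles but only one is recorded as the label, a correct prediction of the other style counts as an error. The code compares strict single-label accuracy with a relaxed version that accepts any valid style. The function names and the relaxed metric are assumptions for illustration, not the team's evaluation method.

```python
def strict_top1_accuracy(predictions, single_labels):
    """Share of predictions that match the one recorded label exactly."""
    hits = sum(1 for p, y in zip(predictions, single_labels) if p == y)
    return hits / len(predictions)


def relaxed_top1_accuracy(predictions, accepted_labels):
    """Share of predictions that match any of the accepted labels for the item."""
    hits = sum(1 for p, ok in zip(predictions, accepted_labels) if p in ok)
    return hits / len(predictions)
```

For a jumpsuit tagged only as "collared" but predicted as "v-neck", the strict metric records a miss, while the relaxed metric records a hit if both styles are listed as accepted.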

Similar Recommendations:

Finalized the shoes dataset for tagging and sent it to taggers
Trying to implement multitask learning so that a single model can be used for multiple attributes

DevOps/Engineering

Planned a sprint to compare AWS and GCP for CI/CD and started working on it

Product Update Commits Update 10/5/2018:

Object Detection (Onboarding):

Training for shoes images from complete model images done. Currently working on evaluation for the same
Retraining Suiting model for men’s On-boarding — Iteration 2
The Complete the look team was facing an issue with the OD model: a drastic drop in the image count after the OD run. Helped the team rectify it

Complete the look (CTL)

DevOps for multi-dimension co-occurrence model completed.
Working on Inference coding for multi-dimension co-occurrence model
Resolved issues with the storage of High dimension matrix in multi-dimension co-occurrence model
Testing the lambda functions for both inputting training images into the AI pipeline & data shaping (JSON creation).
Explored alternatives to Annoy files. FALCONN and NMSLIB seem to be good alternatives and will require testing to compare the performance of the three
DevOps for Auto-encoder loss completed
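
When comparing approximate nearest-neighbor libraries such as the ones mentioned above, one common quality measure is recall against an exact brute-force search. The sketch below shows that measurement in plain Python. It does not call Annoy, FALCONN or NMSLIB; the approximate result is assumed to be a list of ids returned by whichever index is under test, and the function names are hypothetical.

```python
import math


def exact_neighbors(query, vectors, k):
    """Brute-force k nearest neighbors by Euclidean distance (ground truth)."""
    ranked = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Share of the true neighbors that the approximate index also returned."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)
```

Running the same queries through each candidate index and comparing recall alongside query time gives a like-for-like basis for the comparison.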

Product Normalization — Text:

Changing tree search code for changes in logic & taxonomy
Trained logistic and neural-net classifier models. The neural-net classifier reaches an accuracy of 92%
Completed Sprint planning for the next 2 weeks
Evaluation of NER Iteration-2 for intimate wear
Working on custom code for Protege so that the annotation task can be distributed to multiple taggers without any duplicates
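
The idea of distributing annotation work without duplicates can be sketched in a few lines: remove repeated item ids, then deal the remaining items out round-robin. This is only an illustration; it does not use any Protege API, and the function name and round-robin policy are assumptions.

```python
def assign_tasks(item_ids, taggers):
    """Drop repeated ids, then deal the remaining items round-robin to taggers."""
    if not taggers:
        raise ValueError("at least one tagger is required")
    unique_ids = list(dict.fromkeys(item_ids))  # keeps first-seen order
    assignments = {tagger: [] for tagger in taggers}
    for index, item_id in enumerate(unique_ids):
        assignments[taggers[index % len(taggers)]].append(item_id)
    return assignments
```

Each item ends up with exactly one tagger, and workloads differ by at most one item.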

Product Normalization — Image

Testing models for all the jumpsuit attributes
Working on Jumpsuit neck model to improve accuracy
Performing error analysis on OD output for Sweater, Coats & Jackets

Similar Recommendations:

Finalized classes to be trained for every style

DevOps/Engineering:

Built a POC for DynamoDB
Working on benchmarking of FAISS with Annoy

Product Update Commits Update 10/4/2018:

Object Detection (Onboarding):

Sent additional Shoes data for tagging. Will be used to improve model’s performance for detecting shoes from complete model images
Men’s model for onboarding: Iteration 1 is complete.
Retraining Suiting model for men's Onboarding — Iteration 2
Improving pre-processing scripts and documenting their code

Complete the look (CTL)

Completed the 2nd Sprint, training the model on ASOS data (matching Tops to Bottoms). Tested the model against Tommy Hilfiger etc
DevOps for multi-dimension co-occurrence model is expected to be completed today. Will start working on coding tomorrow.
Created lambda function for inputting training images into the AI pipeline. It also takes care of creating batches of 100 input files.
Working on lambda for data shaping (JSON creation) for training the CTL model.
Kickstarted next Sprint for CTL.
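
The batching step mentioned above (groups of 100 input files) can be sketched as a simple chunking function. This is a plain-Python illustration, not the actual lambda; the function name and the file keys used in the usage example are hypothetical.

```python
def make_batches(file_keys, batch_size=100):
    """Split a list of file keys into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [file_keys[i:i + batch_size] for i in range(0, len(file_keys), batch_size)]
```

For 250 input files this yields three batches, the last one holding the remaining 50 files.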

Product Normalization — Text:

2nd iteration of the Named Entity Recognition (NER) model trained to extract 8 important attributes such as Support, Fabric, Neckline and Pattern
NER model evaluation showed a precision score of 91% across all vital categories
Performed error analysis of Style & Sub-Style extraction code output from tree search logic
Pre-processing of the complete dataset
Training count-vectorizer model. Will train logistic & neural net classifier model after this and perform error analysis
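
For context, a count vectorizer turns each product description into a vector of word counts that a logistic or neural-net classifier can consume. The sketch below is a minimal pure-Python version with whitespace tokenization; it is an assumption-laden illustration and not the library or configuration the team used.

```python
from collections import Counter


def fit_vocabulary(documents):
    """Map every lowercase token seen in the documents to a column index."""
    tokens = sorted({token for doc in documents for token in doc.lower().split()})
    return {token: index for index, token in enumerate(tokens)}


def transform(documents, vocabulary):
    """Convert each document into a list of token counts, one column per vocabulary entry."""
    rows = []
    for doc in documents:
        counts = Counter(doc.lower().split())
        row = [0] * len(vocabulary)
        for token, n in counts.items():
            if token in vocabulary:  # tokens unseen during fitting are ignored
                row[vocabulary[token]] = n
        rows.append(row)
    return rows
```

The resulting rows have a fixed length, so they can be fed directly to a classifier as feature vectors.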

Product Normalization — Image:

Committed Color & Pattern demos to Github
Testing models for all the jumpsuit attributes
2nd iteration of single multi-task learning model is complete. One single model is able to detect Neckline, Color, Length, Sleeve and Category with an average 92% Top-1 and 95% Top-2 accuracy without any test-time augmentation.
In production, this allows a reduction in the number of models being maintained (from 4 to 1) for attribute prediction.
Validated sleeves output from OD for Jumpsuits (using the dress sleeve model). Accuracy — 97%
Tried the Jumpsuit neck model on two architectures, DenseNet (77%) and SqueezeNet (69.5%). The low accuracy may be attributable to duplicates in the input dataset. Will be sending the data for tagging
Performing error analysis on OD output for Sweater, Coats & Jackets
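
The Top-1 and Top-2 figures quoted above follow the usual definition: a prediction counts as correct if the true label is among the k highest-scoring classes. A minimal sketch of that calculation is shown here; the score dictionaries in the usage note and the function name are hypothetical.

```python
def top_k_accuracy(score_rows, labels, k):
    """Share of examples whose true label is among the k highest-scoring classes.

    score_rows is a list of dicts mapping class name -> score for one example.
    """
    hits = 0
    for scores, label in zip(score_rows, labels):
        ranked = sorted(scores, key=scores.get, reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)
```

With k=1 only the single best guess counts; with k=2 the model also gets credit when the true class is its second choice, which is why Top-2 accuracy is never lower than Top-1.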

Similar Recommendations:

Prioritized the future pipeline based on the inventory stats obtained at the Category-Style level.
Discussion for model standardization by building one single classification model has been kick-started.

DevOps/Engineering

Deployed combined PN model for intimate wear on Fargate
Resolved pending issues in CICD of OD model
Completed the dockerization of tagging tool
Created lambda function for batching of retailer CSV

I hope you’ve enjoyed the update. We will continue to bring you more of these regularly to give you a window into our technical development.

Thank you for staying tuned in.


Written by Shopin