Motivation and a warm welcome from Run Advisor

Marcel Caraciolo
Mar 22, 2022

Hello and welcome to my new series of posts about two passions that I decided to combine to see what comes out: running and data science! My name is Marcel Caraciolo, and this is Run Advisor, the blog where artificial intelligence and data science meet running and endurance sports.

But first, let me introduce myself, the idea behind this project, its goals and, finally, the motivation.

Who is Marcel Caraciolo?

My name is Marcel Caraciolo, a Brazilian guy living in Recife, Pernambuco. I am currently the bioinformatics systems coordinator at Hospital Israelita Albert Einstein, a large hospital complex and a reference for healthcare services, including clinical genomic sequencing tests for cancer and rare diseases.

When I am not working on genomics data science or managing, I am running (training hard to become a marathoner), studying product management, data science and productivity practices, or building and rebuilding one of my Lego Architecture sets.

Brief Introduction

The number of elite and amateur runners is increasing year over year. Road running is among the sports gaining the most participants in Brazil and worldwide. Easy onboarding, low entry costs and health concerns are the main drivers of people's growing interest in running.

This huge running community drives an industry that moves around R$ 3 billion per year in Brazil alone, according to the trends report produced by SEBRAE [1].

Running market opportunities for several players

As you can see in Figure 1, road running goes beyond medals and races. More athletes are interested in purchasing sports goods and specialised equipment to improve performance, to boost self-esteem (looking good) or simply to keep up with new running gear releases.

In terms of volume, we might estimate that more than 4 million people run casually or professionally in Brazil, and that more than 1,000 running races are organised around the country per year.

Number of official running races in Brazil alone, according to the source

This growing market is attracting attention from sports brands, sports department stores and independent e-commerce sites that want to position themselves as runners' first choice when purchasing products (shoes, clothes, watches, etc.).

A huge set of running gear products, without knowledge of runners' real preferences

There is an overwhelming range of products from many brands and suppliers, sold by many online and physical stores, all competing to win the preference (in terms of purchases and loyalty) of running consumers.


What is the problem?

Sports goods recommendation

It is a fact that an amateur runner doesn't really need much of the equipment that is marketed to runners. But as the number of options grows, especially with the promise of enhancing their running by providing more training feedback, more comfort or even protection from possible dangers, runners are getting more confused about the best affordable option to choose according to their needs and goals: performance, style or comfort.

Sports brands also face a longstanding challenge in understanding their consumers, and even in getting their attention. The growing number of product offerings and the increased competition in the market require new marketing strategies from sports companies and department stores. One line of marketing investment is running races, which are a popular opportunity to engage and interact with participants.

Rio Marathon: more than 10,000 runners participate, and the number is increasing year by year!

Running races are becoming quite popular in many countries, and more amateur and elite athletes are joining them, looking to finish their first 5 km, 10 km, 21 km or 42 km, to improve their finish time, or to be one of the top three winners (sometimes there are interesting prizes). On top of that, runners are spending more to get the best experience at their races.

OK, but how can we connect running races to sports brands and department stores? The brands want to connect more closely with their final consumers and win their loyalty, so that they keep buying their products for a long time. The department stores sell the equipment from these sports brands and want consumers to purchase from their points of sale. And, to add to this equation, the race organisers want more athletes to join their events, offering more personalised and richer experiences for runners in return. How can we help all these stakeholders?

The next step is to figure out how we can obtain the available data about a runner in a race, or even on a simple jog, so that brands and stores can go a step further in understanding their final customers.

Keep the image above in mind; we will discuss it further later.


Running is a trend today. The popularity of the sport has risen dramatically over the last decade, with an estimated 40% growth in the number of people participating in marathons worldwide from 2008 to 2018.

There are many ways technology can help runners improve their running technique, from tracking exercise activity, recommending running gear and analysing running form to adjusting nutrition.

Artificial Intelligence (AI, a broad name for a group of advanced methods, tools and algorithms for the automatic execution of various tasks) has entered practically all areas of the fitness business over the years.

There are popular tools for many use cases that support runners as they train, race and recover. But for gear and running equipment recommendation, there aren't many tools available. Matching the right equipment to the right runner is extremely important, especially when it comes to footwear. Of course, there are many recommender systems that suggest clothing to users and incorporate the runner's physical traits (sex, weight, age, gait) and training (weekly volume, pace, terrain, etc.), but understanding what runners really like to use on an everyday training jog or at an official race is quite challenging.

Images and videos are becoming an important source of information about consumer preferences. The metadata hidden inside an image is powerful because we can extract many traits, environmental conditions and even preferences from it. As an example, let's go back to the previous image:

Can you count the number of running gear items shown in this running race photo? And how many brands are captured in this image?

Visual content is considered the most powerful and informative source that sports brands all over the world are looking for. Rather than running an online survey, why can't they extract these preferences for all runners from a set of photos taken at a running race? That requires artificial intelligence and powerful hardware to handle live object recognition and brand detection quickly, accurately and in real time.

Proposed Methodology

My goal is to build a gear recommender and ranking system based on a full set of real images of runners taken at running races. To tackle this problem, we will divide it into steps. First come the localization and classification stages, where we need to locate each object and predict its brand. After that, we will build a prototype solution for recommending and ranking gear items based on images from several real marathon races.

The steps are described in more detail below.

01- Systematic scientific survey and comparison

The first stage is to compile a survey of machine learning techniques applied to multiple logo detection in real live images [2][3]. The goal is to understand the possibilities and drawbacks of these tools for this task, to perform a technical comparison of the methods and, if possible, to select the most suitable algorithm, combined with adjustments based on feature selection or parameter optimisation.

02- Training and Testing a ML model with a real running race photo dataset

The second step is to apply the selected solution to a real running race photo dataset in order to evaluate the algorithm's performance and accuracy in two respects: object detection, that is, its ability to correctly identify shoes, clothing or watches; and brand logo detection, that is, its ability to correctly identify brands such as Adidas, Nike, Olympikus, etc. The focus here is to evaluate and fine-tune the algorithm so that we arrive at a solution capable of detecting and identifying multiple brand logos in a dataset of live running images [4].
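To make the two evaluation targets above more concrete, here is a minimal Python sketch of how detections could be scored against annotated photos. Everything in it is an assumption for illustration: the `Box` structure, the `evaluate` helper, the greedy matching and the 0.5 overlap threshold are not taken from the project or from any specific library; they simply follow the common intersection-over-union (IoU) approach to scoring object detectors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical annotation: a pixel box with a gear type and a brand label."""
    x1: float
    y1: float
    x2: float
    y2: float
    gear: str
    brand: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (0.0 when they do not overlap)."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(predictions, ground_truth, iou_threshold=0.5):
    """Greedily match each prediction to at most one annotation.

    Returns (precision, recall, brand_accuracy). A prediction counts as a
    correct object detection when the gear type agrees and the IoU reaches
    the threshold; brand_accuracy is the share of those matched objects
    whose brand was also predicted correctly.
    """
    unmatched = list(ground_truth)
    matched = 0
    brand_hits = 0
    for pred in predictions:
        candidates = [
            gt for gt in unmatched
            if gt.gear == pred.gear and iou(pred, gt) >= iou_threshold
        ]
        if not candidates:
            continue  # false positive: nothing overlaps enough
        best = max(candidates, key=lambda gt: iou(pred, gt))
        unmatched.remove(best)  # each annotation can be matched only once
        matched += 1
        brand_hits += int(best.brand == pred.brand)
    precision = matched / len(predictions) if predictions else 0.0
    recall = matched / len(ground_truth) if ground_truth else 0.0
    brand_accuracy = brand_hits / matched if matched else 0.0
    return precision, recall, brand_accuracy
```

Keeping object detection (precision and recall) separate from brand accuracy makes it easier to see whether errors come from locating the gear or from reading the logo.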

03- Prototype a running gear recommender system based on images and brand performance analytics

The final step is to build a prototype recommender application that can analyse runners' race photos and extract the brands and objects used, so that we can create a user preference profile and recommend new potential running gear [5]. Future studies will also include branding analytics, so that brands can analyse different indicators that aggregate demographics and athlete preference data, such as brand appearance count, brand coverage by gear, total brand coverage over all races, etc.
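As a rough illustration of the indicators mentioned above, the Python sketch below aggregates a list of detections into brand appearance count, brand coverage by gear and total brand coverage over all races. The record layout `(race, runner_id, gear, brand)`, the sample data and the exact definition of each indicator are assumptions made for this example; the project may define them differently.

```python
from collections import Counter, defaultdict

# Hypothetical detections extracted from race photos: (race, runner_id, gear, brand)
sample_detections = [
    ("race_a", 1, "shoes", "Nike"),
    ("race_a", 1, "watch", "Adidas"),
    ("race_a", 2, "shoes", "Adidas"),
    ("race_b", 1, "shoes", "Olympikus"),
    ("race_b", 2, "shoes", "Nike"),
]


def brand_appearance_count(detections):
    """Number of detected items per brand."""
    return Counter(brand for _, _, _, brand in detections)


def brand_coverage_by_gear(detections):
    """Share of each brand within each gear type (assumed definition)."""
    per_gear = defaultdict(Counter)
    for _, _, gear, brand in detections:
        per_gear[gear][brand] += 1
    return {
        gear: {brand: n / sum(counts.values()) for brand, n in counts.items()}
        for gear, counts in per_gear.items()
    }


def total_brand_coverage(detections):
    """Share of (race, runner) pairs wearing at least one item of each brand
    (assumed definition)."""
    runners = {(race, runner) for race, runner, _, _ in detections}
    by_brand = defaultdict(set)
    for race, runner, _, brand in detections:
        by_brand[brand].add((race, runner))
    return {brand: len(seen) / len(runners) for brand, seen in by_brand.items()}


if __name__ == "__main__":
    print(brand_appearance_count(sample_detections))
    print(brand_coverage_by_gear(sample_detections))
    print(total_brand_coverage(sample_detections))
```

The same per-runner records could also feed the preference profile used by the recommender, since they already link each runner to the gear and brands seen in the photos.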

Conclusions and Next Steps

In this post I presented my current research project in the running domain. I chose it for several reasons, as presented earlier: it is a popular and challenging sport; it is very well represented by current fitness apps; it appeals to technologically savvy participants; and it attracts a massive number of runners, from novices to elite, who invest a lot in running gear and tools to improve their running skills or simply to challenge themselves in races from 5 km to 42 km (marathons). Many people are increasingly interested in purchasing fashionable and technological equipment to improve their race results or just for comfort.

My goal is to build a running gear recommender system that collects athletes' preferences from their live photos at races and training sessions. The aim is to evaluate how well recommender systems perform when suggesting equipment and gear (shoes, watches, clothing and other running accessories) based on what runners really use in their daily training or official races. For this task we will apply machine learning techniques to detect these objects in images and correctly identify their brand logos, and we will build a ranking system that aggregates all the brands and objects extracted from the images into brand analytics, in order to help department stores, brands and race organisers better understand and engage with their consumers.

The goal is to build a brand ranking analytics solution for the brands and objects used by runners in marathon races. The images taken at races provide rich information about sports consumers and their preferences, which can help companies interact and better engage with them by providing specialised gear and equipment.



[2] Indapwar, Amarja, Jaytrilok Choudhary, and Dhirendra Pratap Singh. “Survey of Real-Time Object Detection for Logo Detection System.” Intelligent Systems. Springer, Singapore, 2021. 61–72.

[3] Sahel, Salma, et al. “Logo detection using deep learning with pretrained CNN models.” Engineering, Technology & Applied Science Research 11.1 (2021): 6724–6729.

[4] Wong, Yan Chiew. “Deep Learning Based Racing Bib Number Detection and Recognition.” Jordanian Journal of Computers and Information Technology 05.1 (2019). doi:10.5455/jjcit.71-1562747728.

[5] Manandhar, Dipu, et al. “Brand-Aware Fashion Clothing Search using CNN Feature Encoding and Re-ranking.” 2018 IEEE International Symposium on Circuits and Systems (ISCAS) (2018): 1–5.



Marcel Caraciolo

Entrepreneur, Product Manager and Bioinformatics Specialist at Genomika Diagnósticos. Pianist by hobby, runner by passion and Lego Architecture lover.