How Our AI Platform Empowers Analysts to Make Data-Driven Decisions in Market Research

Here is our experience of building an AI-based platform that gives business analysts a powerful tool for informed decision-making and greatly improves their performance.

Alexander Romanenko
8 min read · Dec 20, 2017


My name is Alexander Romanenko and I’m both the founder and CEO of IndexBox, one of the leading market research publishers in the world. For over 10 years now, IndexBox has been conducting market research, completing consulting work and selling reports. Our clients include thousands of companies worldwide — from small farming businesses to renowned international organizations and brand names such as the IMF, McKinsey, EY, Mitsui, Nike, Henkel, Kraft Foods, Bulgari and PepsiCo.

Our database includes approx. 35,000 marketing reports, which cover over 200 countries and thousands of product categories. Today, IndexBox has fine-tuned the process of generating and updating these reports on the fly, in literally only 10 minutes. Yura, our in-house AI-based big data analysis platform, is our assistant. This tool helps our analysts to make data-driven decisions, find actionable insights, forecast sales and project future demand while doing market research. With big data to hand, the AI is now able to answer questions such as ‘which products have market prospects’, ‘which country would be most profitable in terms of exports’, ‘which business would generate the highest return on investment’ and ‘how to load production capacity’.

Unlike many other AI-driven robots, Yura was designed not to replace our analysts, but with a completely different concept in mind — to ensure that they are best involved in the entire process with their expertise.

The Market Research Industry and Its Pain

In the ten years that IndexBox has been involved in research and consulting projects, we have consistently encountered the same problem: a lack of high-quality and accurate data. This explains why an analyst can spend most of the project time sourcing data and verifying its reliability; as a result, the actual analysis itself can get pushed to one side. The final submission would often be completed the night before the project deadline, which then meant meeting the client red-eyed. We wanted to change this fundamentally.

Since IndexBox is a small company, we could not recruit many assistants to complete this routine analysis work. There was only one solution: to design an AI that could analyze big data and generate the core of a marketing report instead of the analyst; the analyst would then only have to produce a conclusion based on the wealth of their experience. We successfully achieved this objective.

Creation of the Artificial Intelligence Analyst

The actual process of developing the AI platform was preceded by an extensive period of preparatory work that involved systematizing our expertise in gathering, cleaning and normalizing data, generating forecasts and analyzing the data itself. Physically, this resembled a library of directories, methodical instruction manuals, catalogues listing external resources and various analysis algorithms. All of this was translated into code, and Yura AI was born.

Yura AI can handle the following tasks:

1. Data collection

2. Data cleaning, categorization and normalization

3. Building predictive models to generate market analysis and forecasts

4. Data visualization and trend description

5. Data-driven decision making

Data Collection

The first thing the AI does is gather data from hundreds of sources; these include the official websites of both national and supranational statistics agencies such as the UN and the World Bank, industry association sites, as well as commercial databases. Even at this initial stage, the AI has to work hard to standardize all the data from the various sources. Coding systems, language and units of measurement all vary from country to country; even the essence of one and the same indicator may be different.

At this point it is necessary to point out that official data, even from the most respected sources, should not be considered reliable by default. In our experience, almost one third of all official data contains significant anomalies. This can be explained by purely technical issues, as well as by poor practice on the part of the authorities concerned; this is frequently observed in the reporting on African countries.

To solve this problem, Yura AI rechecks the data by cross-referencing it with other sources. When it comes to international trade, for example, the AI ‘looks at’ the mirror data: it compares the volume of supplies reported by both exporters and importers, flagging any discrepancies and restoring any missing data.

Let’s suppose that an analyst is interested in data concerning the volume of banana imports to Namibia in 2007. According to the official data provided by Namibia as a reporter, the volume of imports amounted to 364 kg. Should this amount be calculated using the mirror data, however, a figure almost 7,000 times greater is obtained: 2.5 thousand tonnes. Yura AI compares these values and selects the more accurate one.

To assess the accuracy of such figures, the AI constructs a complete picture; in this case, the global trade in bananas over a 10-year period. If this process were to be completed manually, it would take several hours. The AI downloads and calculates the mirror data for all the various omissions in 3–4 minutes.
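To make the mirror-data check concrete, here is a minimal Python sketch using the Namibia figures above. It is not Yura's actual code: the function name `mirror_check` and the tenfold disagreement threshold are assumptions made purely for illustration.

```python
# Illustrative sketch only: the function name and the 10x threshold are
# assumptions, not part of the Yura platform.

def mirror_check(reported_kg, mirror_kg, max_ratio=10.0):
    """Compare a reporter's declared imports with partners' declared exports.

    Returns (ratio, flagged). ratio is the larger figure divided by the
    smaller one; flagged is True when the two figures disagree by more
    than max_ratio or when one of them is missing.
    """
    if reported_kg is None or mirror_kg is None:
        return None, True  # one side is missing and must be restored
    low, high = sorted((reported_kg, mirror_kg))
    if high == 0:
        return 1.0, False  # both sides report zero trade
    if low == 0:
        return float("inf"), True
    ratio = high / low
    return ratio, ratio > max_ratio


# Banana imports to Namibia in 2007, as quoted in the text.
reported_kg = 364          # declared by Namibia as reporter
mirror_kg = 2_500_000      # 2.5 thousand tonnes, from partners' export data

ratio, flagged = mirror_check(reported_kg, mirror_kg)
print(f"ratio = {ratio:.0f}, flagged = {flagged}")  # ratio = 6868, flagged = True
```

A flagged pair would then be passed on to whatever logic decides which of the two figures to trust; that decision is not shown here.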

Data Cleaning, Categorization and Normalization

Even after the mirror data has been collected, the statistics can remain incomplete, incorrect, inaccurate or irrelevant. Neither the human analyst nor the AI can continue to work with them in this state. It is at this stage that the processes of data cleaning, categorization and normalization are completed.

In some cases, this task can be solved by using statistical functions: for example, the average value is calculated for the missing data, or extrapolation is used to fill in the gaps. This approach, however, is successful in only 1 case out of 5, as it does not always take into account the economic nature of the phenomenon.
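The purely statistical approach can be sketched in a few lines of Python. This is an illustration rather than our production code: the function names and the sample series are made up, and simple linear interpolation between known points stands in for the extrapolation mentioned above.

```python
from statistics import mean

# Illustrative sketch only: function names and sample values are assumptions.

def fill_with_mean(series):
    """Replace every missing value (None) with the mean of the known values."""
    known = [v for v in series if v is not None]
    avg = mean(known)
    return [avg if v is None else v for v in series]


def fill_linear(series):
    """Fill interior gaps by drawing a straight line between known neighbours.

    Leading and trailing gaps are left as None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


# A made-up yearly series with three missing observations.
import_volumes = [100.0, None, 120.0, None, None, 150.0]

print(fill_with_mean(import_volumes))
print(fill_linear(import_volumes))  # [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]
```

Both functions treat the gap as a purely numerical problem, which is exactly why they fail when the missing years coincide with a real economic shock.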

Should a vanilla harvest in Madagascar be lost following a flood, which in turn results in a sharp increase in prices, purely statistical tools will treat these peaks and troughs as an anomaly. Yura AI works differently: it uses algorithms developed by our in-house analysts. It automatically finds the best predictive model for the data. In other words, Yura knows where and which statistical data recovery methods can be applied, and at which intervals these methods are not suitable. For example, a wide range of values and a small number of observations make it impossible for us to produce a high-quality forecast without using other indicators as ‘reference points’. The AI’s knowledge is based on the principles of machine learning, so it produces more accurate results each time.
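One common way to pick "the best predictive model for the data" is to hold back the most recent observations and see which candidate would have predicted them best. The Python sketch below illustrates that idea only; the three candidate models, the holdout size and the error measure are all assumptions, not a description of Yura's algorithms.

```python
from statistics import mean

# Illustrative sketch only: the candidate models and the holdout scheme are
# assumptions chosen for brevity.

def naive_last(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def historical_mean(history, horizon):
    """Repeat the mean of all observed values."""
    return [mean(history)] * horizon


def drift(history, horizon):
    """Continue the average change between the first and last observation."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


CANDIDATES = {
    "naive_last": naive_last,
    "historical_mean": historical_mean,
    "drift": drift,
}


def pick_best_model(series, holdout=3):
    """Score each candidate on the last `holdout` points (mean absolute error)."""
    train, test = series[:-holdout], series[-holdout:]
    errors = {}
    for name, model in CANDIDATES.items():
        forecast = model(train, holdout)
        errors[name] = mean(abs(f - a) for f, a in zip(forecast, test))
    best = min(errors, key=errors.get)
    return best, errors


# A made-up, steadily growing series.
series = [10, 12, 14, 16, 18, 20, 22, 24]
best, errors = pick_best_model(series)
print(best, errors)  # drift wins with zero error on this series
```

With few observations or a wide range of values, every candidate will score poorly on the holdout, which is the situation where additional reference indicators are needed.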

Our analysts can always observe the entire data manipulation process, from raw data inputs and resulting functions to verified data outputs. Yura’s data mining is clearly depicted in the screenshot below. Data regarding the global import and export of bananas in both physical and monetary terms has been taken as an example.

How Yura AI Works. Data Cleaning, Categorization and Normalization

At the final data recovery stage, the AI carries out additional calculations to check the data against indirect indicators. For example, trade data is used to calculate a product’s apparent consumption in a single country. Should the average per capita consumption calculated in this way differ significantly from that of countries with similar economic performance and consumption patterns, then the raw datasets contain errors. The data is corrected or, should recovery be impossible, replaced by the values from similar countries; the dependent indicators are then recalculated and the verification cycle starts once again. As a result, even a single economic indicator for a single country must be consistent with the global statistical picture. Yura AI monitors this.
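The indirect check described above can be sketched as follows. Apparent consumption is taken in its standard form (production plus imports minus exports); everything else in this Python example is an assumption for illustration, including the peer group, the made-up figures and the threefold tolerance.

```python
from statistics import median

# Illustrative sketch only: country names, figures and the 3x tolerance are
# made up; only the apparent consumption formula is standard.

def apparent_consumption(production, imports, exports):
    """Apparent consumption = production + imports - exports."""
    return production + imports - exports


def flag_outliers(per_capita, max_factor=3.0):
    """Return countries whose per capita value is more than max_factor
    times above or below the median of their peer group."""
    typical = median(per_capita.values())
    flagged = []
    for country, value in per_capita.items():
        if value <= 0 or typical <= 0:
            flagged.append(country)
        elif max(value / typical, typical / value) > max_factor:
            flagged.append(country)
    return flagged


# Hypothetical peer group; volumes in tonnes.
records = {
    "A": dict(production=0, imports=12_000, exports=0, population=1_000_000),
    "B": dict(production=5_000, imports=8_000, exports=2_000, population=1_000_000),
    "C": dict(production=0, imports=10_000, exports=0, population=1_000_000),
    "D": dict(production=0, imports=100, exports=0, population=1_000_000),
}

# Convert tonnes to kilograms per person.
per_capita = {
    country: apparent_consumption(r["production"], r["imports"], r["exports"])
    * 1000 / r["population"]
    for country, r in records.items()
}

print(per_capita)
print(flag_outliers(per_capita))  # ['D']
```

Country D consumes roughly a hundred times less per person than its peers, so its raw trade data would be sent back for correction before any dependent indicator is recalculated.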

Building Predictive Models to Generate Market Analysis and Forecasts

Yura AI uses a range of machine learning methods to generate market projections and forecasts, depending on the task to be solved. For example, should it be necessary to supplement a number of indicators with new values, Yura uses regression analysis.
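As a minimal illustration of regression analysis used to extend an indicator, the Python sketch below fits a straight line by ordinary least squares and projects it one year ahead. The single-variable linear form, the function name `fit_line` and the sample values are assumptions; the models used in practice may be considerably richer.

```python
# Illustrative sketch only: a one-variable least-squares fit on made-up data.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# A made-up indicator observed over five years.
years = [2010, 2011, 2012, 2013, 2014]
values = [100.0, 104.0, 108.0, 112.0, 116.0]

slope, intercept = fit_line(years, values)
forecast_2015 = slope * 2015 + intercept
print(slope, forecast_2015)  # 4.0 120.0
```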

Should it be necessary to isolate groups of factors and assess their impact on a given indicator, the AI resorts to factor analysis. For example, to forecast the key market parameters (apparent consumption, production, exports, imports and prices), the platform uses an economic model designed by our analysts, which combines over 50 different indicators that affect the variables in which we are interested. The projected indicator values are calculated in chain order, so as to assess their impact on each other. Because the analysts review the results this model produces, it is continually refined and its accuracy increases with every market forecast that we produce.

Data Visualization and Trend Description

Once the AI has completed its most ‘dirty’ work, it begins visualizing the data in the form of graphs and tables and writing the actual text. The AI is now able to pinpoint trends and to describe them well; on occasion, the platform performs this task better than the analysts themselves. Using an algorithm developed by our analysts, which is based on the calculation of moving averages and rates of growth, Yura AI is able to determine not only positive and negative trends but also lateral (flat) ones. When describing a graph, the AI adopts a flexible algorithm for generating the text description, choosing the wording that best suits the given combination of trend patterns.
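The idea of classifying a trend from moving averages and growth rates, and then choosing a matching sentence, can be sketched like this. The window size, the 1% band for a lateral trend and the text templates are assumptions for illustration, not the algorithm our analysts developed; the sketch also assumes strictly positive values.

```python
# Illustrative sketch only: window, threshold and templates are assumptions.

def moving_average(values, window=3):
    """Simple trailing moving average."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def classify_trend(values, window=3, flat_band=0.01):
    """Label a series as upward, downward or lateral.

    Uses the average compound growth rate per period of the smoothed series;
    rates within +/- flat_band count as lateral. Needs more points than
    `window` and strictly positive values.
    """
    smoothed = moving_average(values, window)
    periods = len(smoothed) - 1
    growth = (smoothed[-1] / smoothed[0]) ** (1 / periods) - 1
    if growth > flat_band:
        return "upward", growth
    if growth < -flat_band:
        return "downward", growth
    return "lateral", growth


def describe_trend(indicator, values):
    """Pick a sentence template that matches the detected trend."""
    label, growth = classify_trend(values)
    templates = {
        "upward": "{name} grew by an average of {rate:.1%} per period.",
        "downward": "{name} fell by an average of {rate:.1%} per period.",
        "lateral": "{name} remained broadly flat over the period.",
    }
    return templates[label].format(name=indicator, rate=abs(growth))


print(describe_trend("Imports", [100, 110, 120, 130, 140, 150]))
print(describe_trend("Exports", [150, 140, 130, 120, 110, 100]))
print(describe_trend("Prices", [100, 101, 99, 100, 101, 100]))
```

Smoothing first keeps a single noisy year from flipping the label, which matters when the sentence is generated without a human reading the chart.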

Finally, Yura AI converts all the calculation results into easy-to-understand visuals and descriptions that are comprehensible even to those beyond the realms of data science.

Data-driven Decision Making

Now for the main question: what do our clients receive? In short, data-driven decisions for growing their business. The AI can generate an answer to a particular question, for example, ‘which country (out of 200) would be the most profitable in terms of exports’. To do so, it goes through 4 computation stages, from data collection to constructing a complex economic model; this equates to more than 500,000 mathematical operations and only 3 minutes of script time in total.

How Yura AI Works. Data-driven Decision Making

Following the introduction of artificial intelligence tools at IndexBox, the time needed to generate market research has decreased significantly and quality has surged. The platform can generate a ready-to-go 100-page report for analysts in about 10 minutes, removing the need for them to go through the processes of data recovery, visualization and description.

We remain certain that Yura AI will not usurp the analyst; on the contrary, using this platform will ensure that the analyst can focus on advanced analysis work, where the true value of human expertise takes centre stage.



Alexander Romanenko

Founder & CEO at IndexBox, a leading AI-based market research publisher (