Understanding the User Behavior of Digital Marketplaces

Tim Richmond
Published in SingularityNET
Jun 21, 2019

Our study of the behavior of digital marketplace participants will help us develop a reputation algorithm that can make more effective product recommendations.

Introduction

Recommendations often dictate our digital activities. What should we buy? Which video should we watch next? Tech corporations are tackling all of these questions, albeit imperfectly, for now.

As the age of artificial intelligence draws near, the impact of recommendation engines on our daily lives is only set to increase. The SingularityNET team is creating a reputation algorithm which will help us to perform more effective and contextual product recommendations for prospective buyers.

This reputation algorithm will be useful in a wide variety of online marketplaces. Currently, we are testing the algorithm by performing a large-scale simulation that tries to mimic real-world marketplaces as closely as possible. In this article, we will explain our study of the behavior of digital marketplace participants.

How many users leave reviews?

Multiple studies have already concluded that the percentage of users who leave a review while visiting online marketplaces depends on whether they were asked to do so.

According to the BrightLocal consumer review survey, if asked, approximately 71% of consumers will leave a review for a business. Similarly, our analysis of Fiverr, a freelance services marketplace, showed that the platform’s persistent ask for a review results in feedback from roughly 60–80% of its users.

Thus, we can conclude that if online marketplaces ask their participants for reviews, then up to roughly 70% of them will oblige, depending on the system in place.

Our study of user behavior revealed that the persistence of a website or business in asking for reviews also makes a difference in the actions of their customers. Some websites are particularly insistent in asking for a review, to the point that it obstructs user behavior — while other websites only send an email to request customer feedback.

An important factor that influenced the difficulty of getting customer reviews was the size (in terms of the user base, revenue, and monthly active visitors) of the digital platform. Our research shows that huge online retailers, due to the significant number of other products and sellers, “dissolve” the uniqueness of any particular seller or product.

This dissolution of identity on large platforms elevates the importance of asking users for their feedback. For example, when the users of Aliexpress, a popular global e-commerce platform, confirm that they have received a product, the website automatically sends its users to a page where they can leave a review of the product.

In this regard, Aliexpress is very persistent in asking for customer feedback, and about 35% of its users leave a review. Amazon, a platform of similar size, in turn has a low percentage of users who leave reviews (2–5%), as it does not persist in pursuing its users for feedback.

Another important conclusion reached by various studies is that online users are more likely to leave a review after a bad experience than they are after a good one. For example, the 2018 ReviewTrackers Online Reviews Survey found that 34% of users who had a negative experience left a review, while the figure stood at 28% for those who had a positive experience.

An interesting question is whether the price of a product influences the number of reviews. The study “How Online Reviews Influence Sales” concluded that reviews have a greater impact on the conversion of expensive goods than of cheap ones. The study found that for cheap products, displaying reviews increased the conversion rate by 180%, whereas for expensive products the increase was 380%. The authors concluded that a higher price involves a higher risk, and having more information via reviews helped mitigate that risk.

To summarize, the probability of an online user leaving a review is affected by:

  1. Size of a marketplace
  2. Degree of request for feedback (this, in turn, affects the number of honest reviews; a high number of honest reviews decreases the value of fake reviews).
  3. Nature of experience (positive/negative)
  4. Cost of leaving a review (i.e., whether users need to make a purchase, register an account, solve captchas, or overcome other technological obstacles).

Percentage of scam reviews/agents

The task of figuring out the percentage of scam reviews on digital marketplaces is not an easy one.

Fake reviews try to game the system and would be of little use if they could be detected easily. Researchers have devised two main methods to detect and estimate scam reviews. The first involves looking at the distribution of ratings and detecting anomalies, while the second deploys machine learning tools that detect fraudulent activity based on several factors.

In some cases, researchers also have access to insider data.

For example, Yelp reported that 16% of the users who give feedback on its platform submit fake reviews. On the other hand, Amazon claims that less than 1% of the reviews (and therefore ratings) on its platform are fake, thanks to its excellent reputation policing. Estimates from more neutral sources put this number at around 9%.

In general, the percentage of fake reviews, to a large extent, depends on:

  • How expensive it is to post a fake review
  • The effectiveness of reputation policing
  • The percentage of other (real) users that post reviews (the higher it is, the smaller the relative percentage of fake reviews)
  • How powerful the incentive is to post a fake review
  • The difficulty of getting customers

When customers are hard to acquire, as in the case of Airbnb, and especially when those customers make large transactions, a higher number of scam reviews was observed. On the other hand, in niches where customers are easy to acquire and transactions are small and frequent, as in the case of Uber, there was little incentive to post fake reviews, since the rate of customer feedback is naturally high.

Researchers have also compared Expedia and TripAdvisor reviews. While Expedia only allows users to post a review after they have made a purchase, TripAdvisor allows anyone to post a review; a 5–15% difference between the two platforms was observed. A closer review of TripAdvisor by Schuckert et al. revealed that approximately 20% of the reviews on the site were fake.

When it comes to Amazon, there are estimations that in some categories as much as 61% of the reviews are scams; however, in most categories, the ratio is thought to be much smaller.

All of these numbers only highlight the fake transactions and do not shed much light on how they are distributed between sellers and buyers. That is hard to determine. One study by Hu et al. concluded that sellers of lower-quality products are more likely to manipulate ratings (in a positive way) than sellers of higher-quality products. Similarly, another study looked at how ratings are affected by competition by analyzing restaurants within a 0.5 km radius. The table below shows the results:

The results show that if a restaurant has independent competitors within a 0.5 km range, it will on average have 0.001 unfavorable ratings. One can conclude, therefore, that competition alone has a measurable effect on ratings received.

However, the conclusion does not hold for chain competitors (which generally do not care about ratings as much as independent competitors do). Big chain restaurants, like McDonald's, do not gain much from ratings. People know what to expect from these places and so do not need online ratings to inform their decisions. Also, for these big chains, the risks outweigh the rewards: if they get caught in review fraud, the potential loss of reputation can be much more damaging than any gains, so there is an incentive to avoid any misrepresentation.

Relationship between product price and amount of scam

As we mentioned earlier, there are stronger incentives for fake reviews on higher-priced items than on cheaper ones.

For instance, if we look at a large Amazon dataset, we can inspect the distribution of ratings across different price categories. The data shows that as products get more expensive, the number of 1-star reviews rises. It is not too far-fetched to assume that many of those poor reviews are fake.

Also, since most fake reviewers tend to post either 1-star or 5-star reviews, one interesting data point is the percentage of 2- to 4-star reviews across different price groups. Our research shows that as the price increases, the total share of reviews in that range decreases. Products that cost more than $500 showed a 9% decrease in such ratings compared to products that cost less than $10.

Figure 1: Share of different ratings of reviews for different price categories.

Table 1: Share of different ratings for different categories.

Table 2: Share of 2–4 star reviews among all reviews. On the right-hand side, the percentage change for each category relative to the lowest price category is also shown.
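The bucketing behind this comparison can be sketched as a simple pass over (price, rating) pairs. The records and bucket edges below are toy assumptions, not the study's data:

```python
from collections import defaultdict

# Hypothetical (price, rating) records; values are illustrative,
# not drawn from the actual Amazon dataset.
reviews = [
    (8.99, 5), (8.99, 3), (9.50, 4),
    (45.0, 4), (120.0, 1),
    (650.0, 5), (650.0, 1), (650.0, 2),
]

# Price buckets loosely mirroring the "<$10" ... ">$500" groups above.
def bucket(price):
    if price < 10:
        return "<$10"
    if price < 100:
        return "$10-$100"
    if price < 500:
        return "$100-$500"
    return ">$500"

totals = defaultdict(int)
moderate = defaultdict(int)  # ratings between 2 and 4 stars
for price, rating in reviews:
    b = bucket(price)
    totals[b] += 1
    if 2 <= rating <= 4:
        moderate[b] += 1

for b in ("<$10", "$10-$100", "$100-$500", ">$500"):
    share = 100 * moderate[b] / totals[b]
    print(f"{b}: {share:.1f}% moderate (2-4 star) ratings")
```

With a real dataset, the same loop runs over millions of records; only the bucket edges and the data source change.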

Profit margins

Private companies generally do not reveal their profit margins on AI services. However, public companies have to report their financials, and one can roughly estimate their profitability.

The publicly traded companies come in various sizes: some are relatively small (revenue below $100 million a year), while others are quite a bit bigger. In general, one should not look at the net profit of these companies, as most are aggressively expanding and heavily investing in marketing. On top of that, many such companies report annual losses because of similar expansion-related expenses.

One way to estimate the profit margin is to take the gross profit of such companies and subtract from it the costs related to research and development (R&D). In our research, we also excluded marketing spend and, for companies that were making a loss, administrative costs.

This means we take into account profits before marketing, interest, debt repayment, and amortization. Under those conditions, we calculated that profit margins ranged from 10% to 50%. It seems reasonable to expect that the average real agent in the AI industry might have a profit margin of around 30%.
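As a minimal sketch of that estimation, with made-up income-statement figures (none of these numbers come from a real filing):

```python
# Toy income-statement figures (in $ millions) for one hypothetical
# public AI-services company; real filings would supply these numbers.
revenue = 80.0
cost_of_revenue = 40.0
rd_expense = 16.0

# Gross profit minus R&D, before marketing, interest, debt repayment,
# and amortization, per the estimation method described above.
gross_profit = revenue - cost_of_revenue
adjusted_profit = gross_profit - rd_expense
margin = adjusted_profit / revenue

print(f"Estimated profit margin: {margin:.0%}")
```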

The margins were much lower once interest, amortization, and marketing were taken into account; for public companies, they were usually below 10%. However, that reflects the difference between small and big companies: big companies have economies of scale but probably face more competition and need bigger administrative and marketing budgets. Smaller companies might not carry such costs, especially if their main marketing platform is an online marketplace.

While it is hard to get an accurate assessment for real (legitimate) agents, it is even harder to do so for scamming agents. Such agents probably have higher-than-average profit margins compared to real agents; however, it is hard to find any estimates on the matter.

User growth

When a marketplace reaches network effects, growth is usually much higher than before; this was noted in the cases of Airbnb, Uber, and others.

According to various studies, online marketplaces usually reach network effects 2–4 years after the time they are “conceived.”

The average growth of an online marketplace can be expected to be around 50% a year, though this varies based on different factors. For instance, during boom times, when network effects kick in, growth can exceed 100% year over year.
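These growth figures can be illustrated with a toy compounding projection; the starting user count, horizon, and boom-year timing are assumptions, not data from the studies:

```python
# Toy user-count projection: 50% baseline annual growth, with 110%
# growth in two assumed "network effect" boom years (years 3 and 4).
users = 10_000
for year in range(1, 7):
    rate = 1.10 if year in (3, 4) else 0.50
    users = round(users * (1 + rate))
    print(f"year {year}: {users:,} users")
```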

Price distribution

For our research, we reviewed several hundred thousand Amazon products that were in the top 100,000 in sales rank in their respective categories up to the year 2014, meaning that in the recent period they were among the 100,000 most-sold products in their category.

Next, we made a histogram of prices for each category. The data shows that the vast majority of the most-sold products are lower-priced, and only a select few fall in the higher price range. It is not that expensive products are not being sold on Amazon; they simply get significantly fewer sales than cheaper products.

In Figure 2 below, we can see how this distribution looks in the software category:

Figure 2: Histogram of price ranges for the most popular products in sales rank in the category of software.
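The histogram step itself is a simple binning pass. The prices and bucket edges below are toy assumptions, not the data behind Figure 2:

```python
from collections import Counter

# Hypothetical prices for top-selling products in one category.
prices = [4.99, 7.50, 12.00, 15.00, 9.99, 25.00, 199.00, 8.00, 11.50, 649.00]

# Illustrative bucket edges, not the ones used in Figure 2.
def bucket(price):
    if price < 10:
        return "$0-10"
    if price < 50:
        return "$10-50"
    if price < 200:
        return "$50-200"
    return "$200+"

hist = Counter(bucket(p) for p in prices)
for b in ("$0-10", "$10-50", "$50-200", "$200+"):
    print(f"{b}: {'#' * hist[b]} ({hist[b]})")
```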

Other notes

So far, we have been characterizing agents as “good” and “bad.” In real life, most agents are not necessarily only good or bad.

For example, fraudulent buyers are often legit accounts.

In such cases, the seller often pays or manipulates them into purchasing a product. The schemes can be as convoluted as asking raters to review a competitor's products and then not paying for those fake reviews, in order to cause the competitor trouble.

It is also recommended that users be encouraged to write reviews, which makes it much easier to catch fakers and gives them fewer incentives to cheat.

Wang et al. note that there are six common characteristics of fake reviewers:

  1. Review gap: Unlike genuine reviewers who use their accounts from time to time to post reviews, spammers are usually not longtime members of a site. Thus, if the reviews are posted over a relatively long timeframe, it suggests normal activity. However, if all reviews are posted within a short burst, it indicates suspicious behavior.
  2. Review count: Paid users generally write more reviews than unpaid users. In other cases, to avoid being detected or blacklisted, a spammer may post very few reviews from one account and then create a new one.
  3. Rating entropy: Spammers mostly post extreme reviews since their goal is either to improve a particular company’s rating artificially or to bring a bad reputation to its competitors. This results in high entropy — or drastic randomness — in fake users’ ratings.
  4. Rating deviation: Spammers are likely to deviate from the general rating consensus. If genuine users fairly outnumber spammers, it is easy to detect instances where a user’s rating varies significantly from the average ratings from other users.
  5. Timing of review: One of the strategies used by spammers involves posting extremely early after the opening of a business to maximize the impact of their review. Early reviews can significantly impact a consumer's sentiment on a product and, in turn, impact sales.
  6. User tenure: Fake reviewers tend to have short-lived accounts characterized by a relatively large number of reviews and handles, usernames, or aliases designed to avoid detection.
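As a sketch, two of these signals, rating entropy (3) and rating deviation (4), might be computed as follows; the user ratings and product averages are invented for illustration:

```python
import math
from statistics import mean

def rating_entropy(ratings):
    """Shannon entropy (bits) of a user's rating histogram."""
    n = len(ratings)
    counts = {r: ratings.count(r) for r in set(ratings)}
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def rating_deviation(user_ratings, product_averages):
    """Mean absolute gap between a user's ratings and product averages."""
    return mean(abs(r - avg) for r, avg in zip(user_ratings, product_averages))

# Hypothetical users: one with varied ratings, one posting only extremes
# that sit far from the product consensus.
varied = [3, 4, 2, 5, 3, 4]
extremes = [1, 1, 5, 1, 5]
consensus = [4.2, 4.5, 1.8, 4.0, 2.1]  # assumed average rating per product

print(rating_entropy(varied), rating_entropy(extremes))
print(rating_deviation(extremes, consensus))
```

A real detector would combine signals like these with the other four characteristics rather than thresholding any one of them.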

Frequency of reviews

Set-up

While conducting our research, we reviewed a large Amazon dataset to look at how often scammers post ratings and how often real agents post ratings.

We first reviewed a dataset with over 9 million products. We began by searching for products that sell well and were plastered with fake reviews. This was not easy to do, as Amazon tries to catch fake reviewers itself. To flag fraudulent reviews, we made a shortlist; the following conditions had to be satisfied:

  • The product must have at least 100 reviews.
  • At least 95% of those ratings must be either one- or five-star ratings (the average share of such ratings is around 66%).
  • The price must be higher than $50. With this, we try to avoid promotional products and anticipated new books, which might be influenced by fake reviews in ways we cannot verify.
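The shortlist above can be expressed as a small filter function; the record layout (`price` and `ratings` fields) is an assumed schema for illustration, not the dataset's actual format:

```python
def is_suspect(product):
    """Apply the three shortlist conditions above to one product record."""
    ratings = product["ratings"]  # list of 1-5 star ratings
    if len(ratings) < 100:
        return False
    extreme_share = sum(1 for r in ratings if r in (1, 5)) / len(ratings)
    return extreme_share >= 0.95 and product["price"] > 50

# Hypothetical product records.
suspect = {"price": 79.0, "ratings": [5] * 120 + [1] * 15 + [3] * 2}
normal = {"price": 79.0, "ratings": [5] * 60 + [4] * 40 + [3] * 30}

print(is_suspect(suspect), is_suspect(normal))
```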

After this filter, we were still left with over one hundred products. Despite the filter, we were still not certain whether the products had been influenced by fake reviews (although the chances were high). The uncertainty prompted us to inspect products personally.

We chose the products that had either been removed or had many of their reviews not taken into account by Amazon when it calculated their average rating, which makes it likely that Amazon had flagged many of those reviews as well.

The following four products passed our manual test: B008YOQKXU, B00CY9RGD4, B00FKGXX4E, B00GB31I8I.

Four products were enough because each had more than 100 reviews, and we are trying to get a sample of fake reviewers, not fake products. We worked with a few hundred (over 500) users, many of whom wrote fake reviews at some point.

On the other hand, we also had to find a control group of presumably real reviews. We combined all products priced over $83 whose share of 1- or 5-star reviews was between 45% and 70%, each with over 150 reviews. We considered all reviews on these products, and the reviewers who wrote them, to be non-fraudulent.

In the end, we got 51,152 “real” reviewers who wrote 1,382,761 reviews (with ratings) and 778 scam reviewers who wrote 6,488 reviews.

Findings

Our findings reveal that fraudulent reviewers write fewer reviews. In fact, 42.4% of the scamming agents write only one review, while the same is true for only 17.7% of real reviewers. Below are the distributions of the number of reviews for both groups.

Figure 3: Distribution of the number of reviews per user for the control group — called “real users.”

Figure 4: Distribution of the number of reviews per user for the test group — called “fake users.”

The average number of reviews for the real group was 27, and for the test group it was 8. The medians are 6 and 2, respectively.
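That per-group aggregation can be sketched with toy per-user review counts, chosen here so the summary statistics happen to match the figures above:

```python
from statistics import mean, median

# Toy per-user review counts; picked so the real group's mean/median come
# out to 27/6 and the fake group's to 8/2, as in the actual findings.
real_counts = [1, 2, 6, 6, 10, 40, 124]
fake_counts = [1, 1, 1, 2, 2, 5, 44]

def one_review_share(counts):
    """Share of users in a group who wrote exactly one review."""
    return counts.count(1) / len(counts)

print(mean(real_counts), median(real_counts))  # real group
print(mean(fake_counts), median(fake_counts))  # fake group
print(one_review_share(fake_counts))
```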

What’s Next?

This market research is a part of a greater project by the SingularityNET team, in which we seek to create a reputation algorithm that can do a better job at recommending products. We are testing the algorithm by performing a large-scale simulation which tries to mimic real-world marketplaces as much as possible — and this research will help advance the accuracy of that simulation. Moving forward, we will continue to update our community members on our progress.

If you would like to learn more about SingularityNET, we have a passionate and talented community which you can connect with by visiting our Community Forum. Feel free to say hello and introduce yourself here. We are proud of our developers and researchers that are actively publishing their research for the benefit of the community; you can read the research here.

If you are looking to monetize your AI services or create new ones, we invite you to learn more about the nature of our platform and what its Beta version has to offer by visiting the SingularityNET developer portal.

For any additional information, please refer to our roadmaps and subscribe to our newsletter to stay informed about all of our developments.
