How Ubcoin Market Uses Machine Learning to Filter Content

Ubcoin. Cryptocurrency reimagined
Published in Ubcoin Blog
4 min read · Jun 18, 2018

Ubcoin is a unique crypto-to-goods exchange: a blockchain-based platform where everyone can buy or sell real goods for cryptocurrency. You can buy Ubcoin tokens at a discount now on https://ubcoin.io/en.

This article describes how Ubcoin implements content moderation using artificial intelligence. Content moderation on the Ubcoin platform is necessary to minimize ethical and legal risks, as well as to improve the user experience. The artificial intelligence used by Ubcoin checks ads for legal infringements, moral infringements, suspicious activity, and duplicates.

Ubcoin Market has developed, and continues to develop, its own content moderation solution. In the article How Ubcoin Market Modulates Content Using Artificial Intelligence, we wrote that services with frequently changing content and millions of ads, such as Ubcoin Market, need content moderation, and that the problem is solved by applying machine learning to text and image processing.

The principle of the Ubcoin Content Moderation Solution

Automatic ad moderation relies on machine learning algorithms trained on examples of real ads. The training sample is created by labeling ads manually. Once the algorithm has been trained, each newly submitted ad is classified, and the result shows which type the ad belongs to.
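The train-then-classify workflow described above can be sketched in a few lines. This is only an illustration: the article does not say which algorithm is used at this step, so a small multinomial naive Bayes classifier stands in, and the ads, labels, and class names below are invented for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the ad text and split it on whitespace."""
    return text.lower().split()


class NaiveBayesAdClassifier:
    """Multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of ads per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical manually labeled training sample.
TRAIN = [
    ("selling used mountain bike good condition", "allowed"),
    ("brand new phone sealed box with warranty", "allowed"),
    ("wooden dining table and four chairs", "allowed"),
    ("handgun for sale with ammunition", "prohibited"),
    ("rifle and ammunition cheap no license", "prohibited"),
    ("pills for sale no prescription needed", "prohibited"),
]

texts, labels = zip(*TRAIN)
classifier = NaiveBayesAdClassifier().fit(texts, labels)

print(classifier.predict("cheap handgun and ammunition"))
print(classifier.predict("used bike with warranty"))
```

A real training sample would contain far more ads per class; six examples are enough only to show the mechanics of labeling, training, and classifying a new ad.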

Verification Process Regarding Compliance with the Law

Some categories of goods are prohibited from sale by law, for example weapons, drugs, and pornography. Ubcoin therefore does not allow such ads to be placed, as stated in the project's white paper.

To detect illicit content automatically, a training sample consisting of ad texts and images is created. Frequently occurring types of text are classified with an algorithm based on a recurrent neural network. For rare cases, where the training sample contains only 500 to 1000 examples, logistic regression and random forests are used, since they work well with the sparse matrices that text features produce. With this approach, the artificial intelligence that moderates Ubcoin can detect violations that are not obvious.
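As a rough sketch of the small-sample branch only, the following trains a logistic regression on sparse bag-of-words features using plain stochastic gradient descent. The sample ads, the learning rate, and the epoch count are assumptions made for this example, and the recurrent network and random forest mentioned above are not shown.

```python
import math
from collections import Counter


def sparse_features(text):
    """Bag-of-words as a sparse dict: only words that occur are stored."""
    return Counter(text.lower().split())


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp in a safe range
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability that an ad with these features is illicit."""
    z = bias + sum(weights.get(w, 0.0) * c for w, c in features.items())
    return sigmoid(z)


def train_logistic_regression(samples, epochs=200, lr=0.5):
    """Stochastic gradient descent on the log loss, touching only non-zero features."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for features, label in samples:
            error = predict_proba(weights, bias, features) - label
            bias -= lr * error
            for word, count in features.items():
                weights[word] = weights.get(word, 0.0) - lr * error * count
    return weights, bias


# Hypothetical labeled ads: 1 = illicit, 0 = allowed.
LABELED_ADS = [
    ("handgun with ammunition for sale", 1),
    ("mountain bike for sale", 0),
    ("rifle and ammunition no license", 1),
    ("wooden table and chairs", 0),
    ("pills no prescription needed", 1),
    ("new phone with warranty", 0),
]

samples = [(sparse_features(text), label) for text, label in LABELED_ADS]
weights, bias = train_logistic_regression(samples)

for ad in ("rifle ammunition", "wooden table chairs"):
    print(ad, round(predict_proba(weights, bias, sparse_features(ad)), 3))
```

Storing each ad as a dictionary of the words it contains mirrors what a sparse matrix does: the many words that do not appear in an ad cost nothing in memory or in the update step.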

Image moderation uses a neural network based on the ResNet architecture.

Verification of Ad and Comment Compliance with Morality

As with the legal compliance check, this check analyzes both text and images. It determines whether the content is morally acceptable, based on whether the ad and its comments contain threats, insults, or racism.

Identifying Suspicious Ads

Some ads may be classified as “suspicious”. For example, if goods of a specific type or category cost $100 on average on the market but are listed in the ad for $10, the ad is flagged. Such cases are detected with anomaly detection and pattern identification techniques. If an abnormal combination of parameters is found, the ad is sent for additional verification.
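The price example above can be illustrated with a very simple rule. This sketch assumes the only parameter checked is price and compares it with the category median instead of the mean, because the median is less affected by a few extreme listings. The function name and the ratio thresholds are hypothetical, not values used by Ubcoin Market.

```python
from statistics import median


def is_suspicious_price(price, category_prices, low_ratio=0.3, high_ratio=3.0):
    """Flag a price that is far below or far above the typical category price.

    low_ratio and high_ratio are illustrative thresholds, not tuned values.
    """
    typical = median(category_prices)
    if typical <= 0:
        return True  # no meaningful reference price, so ask for a manual check
    ratio = price / typical
    return ratio < low_ratio or ratio > high_ratio


# Hypothetical prices of comparable goods in one category, in dollars.
category_prices = [90, 95, 100, 105, 110]

print(is_suspicious_price(10, category_prices))  # price far below the typical level
print(is_suspicious_price(98, category_prices))  # price close to the typical level
```

A production system would look at combinations of parameters, as the text says, but the principle is the same: measure how far a new ad is from what is typical and route the outliers to additional verification.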

Fraudulent Ads

Duplicate ads are often the result of fraudulent activity. Even when duplicates are not created by scammers, they reduce the quality of the user experience and ultimately mislead the buyer.

Duplicate images are identified using the average grayscale level of each image. Comparing these averages makes it possible to search for identical images at minimal computational cost.
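A minimal sketch of this comparison follows, assuming an image is given as rows of (R, G, B) tuples and that grayscale is computed with the common ITU-R BT.601 luma weights. The tolerance value and the function names are assumptions for the example.

```python
def average_grayscale(pixels):
    """Mean gray level of an image given as rows of (r, g, b) tuples."""
    total = 0.0
    count = 0
    for row in pixels:
        for r, g, b in row:
            # ITU-R BT.601 luma weights for converting RGB to gray.
            total += 0.299 * r + 0.587 * g + 0.114 * b
            count += 1
    return total / count


def may_be_duplicates(image_a, image_b, tolerance=1.0):
    """True when the two average gray levels differ by at most `tolerance`."""
    diff = abs(average_grayscale(image_a) - average_grayscale(image_b))
    return diff <= tolerance


# Tiny hypothetical 2x2 images.
original = [[(200, 30, 30), (10, 10, 10)], [(255, 255, 255), (0, 0, 0)]]
near_copy = [[(201, 30, 30), (10, 10, 10)], [(255, 255, 255), (0, 0, 0)]]
unrelated = [[(120, 200, 50), (120, 200, 50)], [(120, 200, 50), (120, 200, 50)]]

print(may_be_duplicates(original, near_copy))
print(may_be_duplicates(original, unrelated))
```

Because very different pictures can share the same average gray level, a match on this single number is best treated as a cheap first filter that selects candidates for a closer comparison.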

Finding duplicate text is not a trivial task, because a text can be rewritten using other words, which makes direct comparison ineffective. To search for duplicates, Ubcoin Market therefore converts sentences into vectors with Doc2Vec and individual words into vectors with Word2Vec. This approach measures the semantic closeness of pieces of text, which, together with the other ad data, is used to identify duplicate texts. For example, the words credit, mortgage and installments are spelled differently but are close in meaning. The algorithm used by Ubcoin Market takes this closeness of meaning into account when searching for duplicate ads. It is similar to modern solutions for detecting plagiarism in written papers.
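The idea of semantic closeness can be shown with cosine similarity between vectors. The sketch below does not train Doc2Vec or Word2Vec: it uses a handful of made-up three-dimensional word vectors and averages them to get a text vector, which is a simplification of what those models produce. The vectors, the function names, and the 0.9 threshold are all assumptions for illustration.

```python
import math

# Hypothetical toy embeddings; real Word2Vec vectors have hundreds of dimensions.
TOY_VECTORS = {
    "credit": (0.9, 0.1, 0.0),
    "mortgage": (0.8, 0.2, 0.1),
    "installments": (0.85, 0.15, 0.05),
    "bicycle": (0.0, 0.1, 0.9),
    "bike": (0.05, 0.1, 0.95),
    "available": (0.3, 0.9, 0.2),
}


def text_vector(text):
    """Average the vectors of known words (a simple stand-in for Doc2Vec)."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return None
    dims = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dims))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_similarity(text_a, text_b):
    """Cosine similarity of the two text vectors, or 0.0 if either has no known words."""
    vec_a, vec_b = text_vector(text_a), text_vector(text_b)
    if vec_a is None or vec_b is None:
        return 0.0
    return cosine_similarity(vec_a, vec_b)


def looks_like_duplicate(text_a, text_b, threshold=0.9):
    """Illustrative rule: texts above the similarity threshold are duplicate candidates."""
    return semantic_similarity(text_a, text_b) >= threshold


print(looks_like_duplicate("credit available", "installments available"))
print(looks_like_duplicate("credit available", "bicycle available"))
```

Two ads that use different words with similar meaning end up with nearly parallel vectors, so they score high even though a word-for-word comparison would find almost nothing in common.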

Those who wish to violate site regulations can come up with a variety of ways to bypass moderation. To keep the algorithms relevant and effective, Ubcoin Market will regularly retrain its models on new data, including data obtained from user reviews.

Artificial intelligence technology at its current level cannot completely replace a human moderator. It is, however, an effective tool for speeding up content moderation and improving its accuracy while reducing the amount of manual work. Ubcoin Market is developing an advanced AI-based content moderation solution for its goods exchange platform, designed to minimize legal and ethical risks and to improve the overall user experience.

Ubcoin is conducting a public token sale now and has already raised more than 4 million dollars from private investors, plus undisclosed amounts from institutional investors: the Inventure fund and the Singapore-based Amereus fund. Ubcoin will be part of the Ubank mobile app, which is pre-installed by Samsung in 10 countries and has over 16 million installations worldwide. You can buy Ubcoin tokens at a discount now on https://ubcoin.io/en
