An introduction to ShareChat: Personalising content for Next Billion Users — ShareChat
This article is written by Ayush Mittal, Lead Data Scientist at ShareChat
ShareChat is an Indian social media startup that presents its interface and recommended content in 14 Indian languages. Here is a look at what makes us work.
- ShareChat is a social medium made available in 14 Indian languages.
- It is a platform for everyone to share content as text, images or videos.
- The ShareChat data science team uses some of the most advanced machine learning technologies to learn about usage trends, in order to sharpen recommendations and build a more wholesome platform.
- This article provides an introduction to ShareChat, and to how it uses key technologies to shape its platform.
ShareChat, as many Indian users have noticed already, is a social network built to enable the next generation of India’s internet users. The application presents a platform for active discourse in vernacular languages. With the advent of highly affordable internet connectivity and wider network coverage, the Next Billion of the world are slowly coming online, and this creates the need to provide them with a platform to talk and share, crucially in their native language. Naturally, with such a task at hand, our data science team faces innumerable challenges in streamlining recommendations and continuously making the platform better.
The importance of data
India has a staggering amount of diversity unified within its borders. Our nation is home to 22 officially recognised languages, further subdivided into over 700 regional dialects. At this point, ShareChat is available in 14 vernacular languages: Hindi, Punjabi, Bangla, Gujarati, Marathi, Telugu, Tamil, Malayalam, Bhojpuri, Odia, Kannada, Assamese, Rajasthani and Haryanvi. This diversity has massive implications for the work our data science team does.
At ShareChat, our data is fresh. With most users coming online for the first time, and with no precedent for their usage or their language, the data and usage patterns being formed are varied, diverse and immense. This leaves us with a huge amount of data to process in order to identify how internet behaviour varies across different ethnic groups, and how first-time users differ from long-time users. Our data science team also has the task of identifying finer elements, such as the tone of a statement in any vernacular language, and of processing data from video consumption.
In essence, we at ShareChat are witnessing many firsts, in terms of technology, usage and users as a whole. Every day, we learn along with our algorithms, and it is this that excites us to build stronger technologies to better develop our platform.
So what do we do?
To put it simply, we are a social medium catering strictly to Indian internet users. ShareChat has been built to empower the newest internet users of India, and allow them to share their own content, in their own languages. This has multiple aspects to it. For one, it provides a social platform for everyone to share content ranging from humour and devotion to politics and regional updates.
The idea is for individuals to follow each other, along with personalities on our platform. This not only gives these personalities a closer connection with their local followers, but also gives them a ground-level check on local sentiment. Second, it allows individuals who have moved away from their motherland for work or other opportunities to stay connected to updates from their own social circle(s).
At present, over 70 million people access the ShareChat app every month, which is no mean feat considering how short our span of operations has been so far. We believe that communication does not necessarily come in a specific form, and as a result, our platform today supports text updates, still photographs, GIFs and videos. Every day, we see over 1 million new posts across all languages and regions. When you add all this up, you sit back and realise one thing: this is a staggering amount of data that we have in our hands. This is where our data science team comes in.
Data Science at ShareChat
Such vast amounts of data present both a lucrative opportunity and massive scope for us to learn and grow. The data science team at ShareChat is thus pivotal to computing, assessing and decoding the data that is generated on our platform every day. Key to this are two of the most pioneering technologies of our present generation: Artificial Intelligence (AI) and Machine Learning (ML).
As an organisation building a social medium for a new generation of users, our data science team goes through a massive amount of data, which is forever changing. Users on our platform have diverse natures and opinions, and often there is no linearity or rhythm to the content that is shared or newly generated. The key for us is to train our AI and ML algorithms efficiently, so that they can power several parts of the platform. The first of these is our Trending Feed, where the algorithms constantly process data to decide, dynamically, which content to recommend.
This is closely linked to the content processing pipeline, which in turn draws on various factors, such as judging the tonality of content, filtering sensitive content, assessing usage patterns and summarising the region-based nature of content consumption, to improve recommendations and build a more wholesome platform. The pipeline comprises several algorithms in the domains of Natural Language Processing, Computer Vision and Recommendation, using Deep Neural Networks working in tandem to serve the right content to the right users at the right time. These are broad topics in themselves, and we shall discuss them in more detail going forward.
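To make the idea of stages working in tandem a little more concrete, here is a minimal toy sketch in Python. It is not ShareChat's actual pipeline: the `Post` fields, the `BLOCKLIST`, the rule-based `filter_sensitive` and `tag_tone` stages and the engagement-based `trending_feed` ranking are all hypothetical stand-ins for what would, in practice, be trained NLP, vision and recommendation models.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class Post:
    post_id: str
    language: str
    text: str
    engagement: int = 0  # hypothetical signal, e.g. likes plus shares
    tags: dict = field(default_factory=dict)


# A stage receives a post and returns it (possibly annotated), or None to drop it.
Stage = Callable[[Post], Optional[Post]]

# Placeholder word list; a real system would rely on trained classifiers.
BLOCKLIST = {"spamword"}


def filter_sensitive(post: Post) -> Optional[Post]:
    """Drop posts containing any blocklisted word (toy stand-in for a model)."""
    words = set(post.text.lower().split())
    return None if words & BLOCKLIST else post


def tag_tone(post: Post) -> Optional[Post]:
    """Attach a crude tone label (toy stand-in for a tonality model)."""
    post.tags["tone"] = "excited" if "!" in post.text else "neutral"
    return post


def run_pipeline(posts: Iterable[Post], stages: List[Stage]) -> List[Post]:
    """Pass every post through each stage in order, keeping the survivors."""
    kept = []
    for post in posts:
        current: Optional[Post] = post
        for stage in stages:
            current = stage(current)
            if current is None:
                break  # a stage rejected the post
        if current is not None:
            kept.append(current)
    return kept


def trending_feed(posts: Iterable[Post], language: str, limit: int = 10) -> List[Post]:
    """Rank posts in one language by engagement, highest first."""
    candidates = [p for p in posts if p.language == language]
    return sorted(candidates, key=lambda p: p.engagement, reverse=True)[:limit]


if __name__ == "__main__":
    sample = [
        Post("a", "hi", "hello world", engagement=5),
        Post("b", "hi", "great news!", engagement=9),
        Post("c", "hi", "spamword here", engagement=100),
    ]
    cleaned = run_pipeline(sample, [filter_sensitive, tag_tone])
    for p in trending_feed(cleaned, "hi"):
        print(p.post_id, p.tags["tone"], p.engagement)
```

Keeping each stage as a small function with the same signature means stages can be added, removed or reordered without touching the rest of the pipeline, which is one plausible way to let independently developed models feed a single feed.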
Tying the knots
The task at hand, hence, is understandably massive. Not only is there a wide reserve of newly generated data, but the very nature of this data is new in every respect. At ShareChat, we continuously strive to use this information to build a better platform for us, our present patrons and the generations to come. Here, we gave you a glimpse of the work that we undertake on an everyday basis. See you soon, with a deeper dive into our core technologies, and how they work!
Originally published at https://blog.sharechat.com on January 4, 2019.