BirdieBlue is a sentiment analysis application developed by SynergyCrowds. It is accessible within the SynergyCrowds platform at https://crypto.synergycrowds.com. In its initial setup (July 2018), its feed is the incoming data from Twitter regarding #Bitcoin.
Sentiment analysis is a branch of data science that covers the extraction of opinions, feelings or attitudes on a certain subject, employing machine learning algorithms to classify them into categories such as positive, negative, anger or joy.
This analysis has become more and more important as tremendous amounts of data have become freely available in recent years. Whether it’s about feedback on some electronic device, the experience of a hotel stay or the crowd sentiment in the cryptocurrency markets, sentiment analysis can provide valuable knowledge. It can also effectively replace market studies built specifically to capture information on these matters, which are expensive in time, money and human resources.
BirdieBlue was born from the need to filter out all the noise created around the crypto markets, and Bitcoin in particular. Being a popular subject, it of course attracts a lot of gossip. For those analyzing the cryptocurrency market, the vox populi should also be an indicator.
In our approach, we considered the sentiment in the cryptocurrency markets as either positive or negative, and built a mechanism to quantify the intensity of these sentiments.
The application collects the data from Twitter and stores it in a data warehouse, where it becomes available for further pre-processing. The extraction process consists of applying some predefined filters to make sure only relevant data is gathered.
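To make the idea of predefined filters concrete, here is a minimal sketch in Python. It is an illustration only, not the code BirdieBlue runs: the `Tweet` fields, the `is_relevant` function and the three rules (English only, no retweets, must carry the #Bitcoin hashtag) are all assumptions chosen for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Tweet:
    """Hypothetical record; the real warehouse schema is not described."""
    text: str
    lang: str
    is_retweet: bool


# Match the #Bitcoin hashtag as a whole tag, so "#BitcoinCash" is not accepted.
HASHTAG = re.compile(r"#bitcoin\b", re.IGNORECASE)


def is_relevant(tweet: Tweet) -> bool:
    """Assumed filter rules: English, original (not a retweet), tagged #Bitcoin."""
    if tweet.lang != "en":
        return False
    if tweet.is_retweet:
        return False
    return HASHTAG.search(tweet.text) is not None


tweets = [
    Tweet("#Bitcoin just broke resistance!", "en", False),
    Tweet("RT #bitcoin to the moon", "en", True),
    Tweet("Nice weather today", "en", False),
    Tweet("#BitcoinCash is not the same thing", "en", False),
]

# Only the first tweet passes all three rules.
relevant = [t for t in tweets if is_relevant(t)]
```

Keeping each rule as a separate early return makes it easy to add or drop a filter without touching the others.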
The flow (see the figure below) continues with a numerical transformation of the text. The algorithms work with numbers, not words. This means every word in a tweet [even emojis :)] will be coded into numbers.
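One common way to turn text into numbers is to give every distinct token an integer id. The sketch below shows that idea under our own assumptions: the tokenizer pattern, the `<pad>` and `<unk>` ids and the fixed sequence length are illustrative choices, and the encoding BirdieBlue actually uses may differ.

```python
import re

# A token is a word (optionally prefixed by # or @) or any single symbol,
# which also covers single-code-point emojis such as the rocket.
TOKEN = re.compile(r"[#@]?\w+|[^\w\s]")


def tokenize(text):
    """Lowercase the text and split it into tokens."""
    return TOKEN.findall(text.lower())


def build_vocab(texts):
    """Assign the next free integer id to every new token, in order of appearance."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(text, vocab, max_len=8):
    """Map tokens to ids, truncate to max_len, then pad with the <pad> id."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(text)][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


corpus = ["#Bitcoin is up 🚀", "bitcoin is down"]
vocab = build_vocab(corpus)

# "hodl" was never seen, so it maps to <unk> (1); the last slot is padding (0).
encoded = encode("#Bitcoin is down 🚀 hodl", vocab, max_len=6)
print(encoded)  # [2, 3, 7, 5, 1, 0]
```

Padding every tweet to the same length is what lets a batch of tweets be fed to a network as one rectangular array.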
The above steps are part of the preprocessing phase which, even if it doesn't seem spectacular, is critical for the success of the entire process. Did you know that data scientists spend about 80% of their time preparing and managing data for analysis?
Once the data preprocessing is finished, the actual analysis takes place. Each tweet is processed through the Recurrent Neural Network, where the sentiments are extracted. The network decides which words in a tweet are relevant based on the previous examples from which it learned.
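For readers who have not met recurrent networks before, the sketch below shows the core recurrence of the simplest variant (an Elman-style RNN): the network reads one token at a time and carries a hidden state forward, so earlier words influence how later ones are interpreted. This is not BirdieBlue's model. The layer sizes are arbitrary and the weights are random and untrained, so the score it prints carries no meaning; only the mechanics are the point.

```python
import math
import random


def rnn_sentiment(token_ids, embed, w_x, w_h, b, w_out):
    """Forward pass: h_t = tanh(W_x x_t + W_h h_{t-1} + b), then a sigmoid score."""
    hidden = [0.0] * len(b)
    for tok in token_ids:
        x = embed[tok]  # look up the vector for this token id
        # The comprehension reads the previous hidden state and builds the new one.
        hidden = [
            math.tanh(
                sum(w_x[i][j] * x[j] for j in range(len(x)))
                + sum(w_h[i][k] * hidden[k] for k in range(len(hidden)))
                + b[i]
            )
            for i in range(len(b))
        ]
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))  # squash to the interval (0, 1)


# Toy dimensions and random weights, all assumed for illustration.
rng = random.Random(0)
VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM = 8, 4, 3


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


embed = rand_matrix(VOCAB_SIZE, EMBED_DIM)
w_x = rand_matrix(HIDDEN_DIM, EMBED_DIM)
w_h = rand_matrix(HIDDEN_DIM, HIDDEN_DIM)
b = [0.0] * HIDDEN_DIM
w_out = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM)]

score = rnn_sentiment([2, 3, 7, 5], embed, w_x, w_h, b, w_out)
print(f"untrained score: {score:.3f}")
```

In a trained network the weights would be fitted on labeled tweets, and a score near 1 could then be read as positive and a score near 0 as negative.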
Old or new, the discovered rules are kept in a Lexicon, a dictionary that keeps in memory all the knowledge regarding sentiment classification.
Based on several factors (e.g. the count and volume of tweets), the app computes the intensity of the positive and negative sentiments over a certain time frame and provides it to the user.
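The exact formula is not given here, so the following is only a hypothetical sketch of how an intensity could be derived from tweet counts within a time frame. The function name `intensities`, the 0.5 cut-off between positive and negative, and the use of simple shares of the tweet count are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def intensities(scored, end, window):
    """Share of positive and negative tweets in the interval (end - window, end].

    `scored` is a list of (timestamp, score) pairs with scores in [0, 1];
    a score of 0.5 or more is counted as positive (an assumed threshold).
    """
    start = end - window
    in_window = [s for t, s in scored if start < t <= end]
    if not in_window:
        return {"positive": 0.0, "negative": 0.0, "count": 0}
    total = len(in_window)
    positive = sum(1 for s in in_window if s >= 0.5)
    return {
        "positive": positive / total,
        "negative": (total - positive) / total,
        "count": total,
    }


now = datetime(2018, 7, 1, 12, 0)
scored = [
    (now - timedelta(minutes=50), 0.9),
    (now - timedelta(minutes=30), 0.2),
    (now - timedelta(minutes=10), 0.8),
    (now - timedelta(hours=3), 0.1),  # outside a one-hour window
]

# Three tweets fall inside the last hour: two positive, one negative.
last_hour = intensities(scored, now, timedelta(hours=1))
```

Returning the count alongside the two shares matters in practice: a 100% positive reading built on three tweets deserves far less weight than one built on three thousand.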
The time frame is an important feature when analyzing the sentiment (as it is when analyzing financial data). A sudden change in sentiment intensity can be the result of a disruptive news release, but one should also take a look at the big picture to check the persistence of this change and compare it to previous values.
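A simple way to put a sudden change in context is to compare a short recent window with a longer one, much as moving averages are compared on price charts. The sketch below assumes a list of hourly positive-intensity values; the function names and the 3-hour and 12-hour windows are arbitrary choices for illustration, not settings taken from the app.

```python
def moving_average(values, n):
    """Mean of the last n values."""
    window = values[-n:]
    return sum(window) / len(window)


def sentiment_shift(hourly, short=3, long=12):
    """Short-window average minus long-window average.

    A large positive value means recent sentiment is well above the
    longer-run level; a value near zero means the change has either
    faded or become the new normal.
    """
    if len(hourly) < long:
        raise ValueError("not enough history for the long window")
    return moving_average(hourly, short) - moving_average(hourly, long)


# Nine quiet hours followed by a three-hour jump in positive intensity.
hourly = [0.5] * 9 + [0.8, 0.8, 0.8]
shift = sentiment_shift(hourly)
print(f"shift: {shift:+.3f}")
```

If the jump persists, the long-window average catches up and the shift shrinks toward zero, which is one way to tell a lasting change from a short-lived reaction to news.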
The crowd’s sentiment can be a valuable indicator for trading. Several strategies [1,2,3] have been built and tested with the aim of exploiting this information. Usually, the best results are obtained in conjunction with other price-based indicators, and of course strategy tuning is required.
An introduction to the SynergyCrowds platform can be found at https://medium.com/synergycrowds/synergycrowds-platform-introduction-82a3bd45588f
1. Sul, H.K., Dennis, A.R. and Yuan, L.I., 2017. Trading on Twitter: Using social media sentiment to predict stock returns. Decision Sciences, 48(3), pp.454–488.
2. Oliveira, N., Cortez, P. and Areal, N., 2017. The impact of microblogging data for stock market prediction: using Twitter to predict returns, volatility, trading volume and survey sentiment indices. Expert Systems with Applications, 73, pp.125–144.
3. Sun, Y., Fang, M. and Wang, X., 2018. A novel stock recommendation system using Guba sentiment analysis. Personal and Ubiquitous Computing, 22(3), pp.575–587.