Sentiment Analysis for Algorithmic Trading + Free Data Science Books

Brianna Taylor
Sep 6, 2018 · 4 min read

What’s New

Cannon Beach — Haystack Rock

I got back from a trip to Oregon and the views are stunning. I meandered up from Florence (best chowder is at the Depot) to Astoria (where some of The Goonies was filmed). Cuteness, breweries and antiques abound!

Things I Enjoyed:

  • Cascades Brewery, Portland — even if you are not a sour beer fan they will convert you. Barrel-aging in wine barrels works wonders. You should also go if you love tie-dye shirts.
  • Pok Pok, Portland — small plates, entrees, house brews and cocktails make up this super cute place to eat fantastic Thai. Yum.
  • Portland Markets — lots of gifted people have put down roots in Portland. You will not leave empty handed.
  • Florence and Yachats — If you like browsing cute shops, cruising dunes and eating good food then Florence is where it’s at. Go to the Green Salmon Coffee Co. in Yachats for a plethora of drinks. Try the Clockwork Orange Mocha.
  • Cannon Beach — see the Haystack Rock above!

Sentiment Analysis for Algorithmic Trading — Webinar with Datacamp and Quantopian

I dialed into a webinar this morning that gave a quick walk-through on using sentiment analysis to decide whether to go long or short on a stock. Some of it was a review of things I’ve learned previously in projects, and there were some things that I now have to check out!

Notes/Further Education:

  • Lemmatization — an improvement on stemming (just keeping the root word) in text processing. It takes parts of speech (adverb, noun, etc.) into account and thus improves context and accuracy. There are some really cool things you can look up in the dataset too, such as word similarity with the accompanying strength of that relationship.
  • Sense2Vec — provides more context for word embedding (where words or phrases from the vocabulary are mapped to vectors of real numbers). I had only heard of Word2Vec before now so I’m interested in trying this out for sure.
  • Using Recurrent Neural Networks (RNNs) over simple Neural Networks for text processing is recommended. An RNN views a sentence as a sequence of words, whereas a simple NN does not. Order matters! Especially when you are using n-grams to properly classify a sentence as positive or negative. Ex: Seeing “Not bad” versus just seeing “bad”.
  • Long Short Term Memory Network (LSTM) — LSTMs are a kind of RNN. An LSTM processes data sequentially and uses distance and weight as part of the training process. It is capable of learning long-term dependencies (aka it can remember information for long periods of time and forget it when necessary).
  • Factor Models — Models used in asset pricing and portfolio management (CAPM, Fama-French factors, etc.). Multi-factor models can be used to explain the returns of either an individual security or a portfolio of securities. Quantopian actually looks like it has a really nice environment for creating and testing algorithms. You work within a Jupyter Notebook with some pre-loaded finance packages and data. Check them out if interested.
  • During the webinar the speaker went through an example of using sentiment analysis with sentiment140.com data. He showed, inside Quantopian, how you could create and analyze a strategy of going long on the most positively classified companies and shorting the most negatively classified ones, and have a profitable spread.
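To make the long/short idea in the last note a bit more concrete, here is a minimal sketch in plain Python. It is not the speaker's code and it does not use Quantopian's API. The tickers, sentiment scores and returns are all made up, and the helper names `long_short_legs` and `spread_return` are my own. It simply ranks companies by sentiment score, goes long on the top few, shorts the bottom few, and computes the equal-weighted spread.

```python
from statistics import mean

# Hypothetical sentiment scores in [-1, 1] and next-period returns per ticker.
# Every ticker and number here is invented for illustration only.
sentiment = {"AAA": 0.8, "BBB": 0.5, "CCC": 0.1, "DDD": -0.2, "EEE": -0.6, "FFF": -0.9}
next_return = {"AAA": 0.03, "BBB": 0.01, "CCC": 0.00, "DDD": -0.01, "EEE": 0.01, "FFF": -0.04}


def long_short_legs(scores, n):
    """Return (longs, shorts): the n highest- and n lowest-scored tickers."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n], ranked[-n:]


def spread_return(longs, shorts, returns):
    """Equal-weighted return of the long leg minus that of the short leg."""
    long_leg = mean(returns[t] for t in longs)
    short_leg = mean(returns[t] for t in shorts)
    return long_leg - short_leg


longs, shorts = long_short_legs(sentiment, n=2)
spread = spread_return(longs, shorts, next_return)
print("long:", longs, "short:", shorts, "spread:", round(spread, 4))
```

A positive spread means the most positively classified names outperformed the most negatively classified ones over the period. A real backtest would also need transaction costs, position sizing and many periods of data, none of which this sketch attempts.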

One thing I thought of during the webinar was the possibility of bot accounts distorting actual sentiment, and thus the results of the algorithm, seeing as sentiment analysis is a popular go-to for trading algorithms now. Groups that wanted to take advantage of companies or individual users who rely on sentiment analysis in their algorithms could do so by creating thousands of tweets that advocate buying or selling a stock using positive or negative keywords and phrases. These groups could be trying to push up the value of companies they have a vested interest in and then sell high or hold, punish competitors, or sneak in and buy stock when a negatively classified company’s stock value goes down temporarily.
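Here is a toy illustration of that worry, again in plain Python with invented account names and scores. It compares a naive average over all tweets with an aggregate that first averages per account and then takes the median across accounts. This is only a sketch of one possible dampening step that I am assuming for illustration. It would not help against thousands of distinct bot accounts, which is where actual bot detection comes in.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical (account, sentiment score) pairs for one ticker.
tweets = [("alice", -0.4), ("bob", -0.2), ("carol", -0.3), ("dave", 0.1)]
# One spammy account floods the feed with 50 glowing tweets.
tweets += [("bot_1", 0.9)] * 50


def naive_score(tweets):
    """Average sentiment over every tweet, so heavy posters dominate."""
    return mean(score for _, score in tweets)


def per_account_score(tweets):
    """Average each account's tweets first, then take the median account."""
    by_account = defaultdict(list)
    for account, score in tweets:
        by_account[account].append(score)
    return median(mean(scores) for scores in by_account.values())


print("naive:", round(naive_score(tweets), 3))
print("per account:", round(per_account_score(tweets), 3))
```

With these made-up numbers the naive average comes out strongly positive even though four of the five accounts are neutral to negative, while the per-account median stays negative.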

In short, if I ever end up doing sentiment analysis using social media I will be reading posts like Identifying “Dirty” Twitter Bots with R and Python by Paul van der Laken before I move onto text processing and deep learning.

Learn Hands-On - Sentiment Analysis:


FREE Books!

There is currently a deal on at Humble to get up to 15 O’Reilly Data Science books. Hurry if you want to take advantage; there are only 4 days left to do so!

Humble Book Bundle: Machine Learning by O’Reilly

  • Machine Learning Is Changing the Rules
  • Introduction to Machine Learning with R
  • Introduction to Machine Learning with Python
  • Thoughtful Machine Learning with Python
  • Machine Learning for Hackers
  • Practical Machine Learning with H2O
  • Natural Language Annotation for Machine Learning
  • An Introduction to Machine Learning Interpretability
  • Learning TensorFlow
  • Machine Learning and Security
  • Feature Engineering for Machine Learning
  • Learning OpenCV
  • Fundamentals of Deep Learning
  • Deep Learning
  • Deep Learning Cookbook

Something Completely Different

Written by Brianna Taylor

Data Analyst Learning ML/Data Engineering | Moving to Australia in July
