Michelle Bonat
Jul 25, 2018 · 5 min read
Track your portfolio in the Data Simply dashboard
# Queue a background job to populate the filing text when both stored versions are blank
if ft.complete_html.blank? && ft.plain_text.blank?
  PopulateFilingTextJob.perform_async(ft.sec_filing_id)
end
# 1) Setup for Outlier Detection
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt
import matplotlib.font_manager

from sklearn import svm
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

print(__doc__)

rng = np.random.RandomState(42)
# 2) Load the data from the remote url via an AWS S3 bucket

dataset_url = 'https://s3.amazonaws.com/your-url-goes-here.csv'
data = pd.read_csv(dataset_url)
print(data.head())
Data head with the first 5 rows
print(data.shape)
print(data.describe())
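
The setup above imports four scikit-learn outlier detectors. Here is a minimal sketch of how they might be applied, not necessarily how this dataset was analyzed. Because the CSV URL above is a placeholder, the sketch uses synthetic two-dimensional data as a stand-in. The variable names, the 200/20 inlier/outlier split, and the hyperparameters (`gamma=0.1`, `n_neighbors=35`) are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn import svm
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(42)

# Hypothetical stand-in data: 200 inliers clustered near the origin
# plus 20 points scattered uniformly over a wider square.
n_inliers, n_outliers = 200, 20
X_inliers = 0.3 * rng.randn(n_inliers, 2)
X_outliers = rng.uniform(low=-4, high=4, size=(n_outliers, 2))
X = np.vstack([X_inliers, X_outliers])

# Assumed share of outliers; with real data this is a guess you have to tune.
outliers_fraction = n_outliers / float(len(X))

detectors = {
    "One-Class SVM": svm.OneClassSVM(nu=outliers_fraction, kernel="rbf", gamma=0.1),
    "Robust covariance": EllipticEnvelope(contamination=outliers_fraction, random_state=42),
    "Isolation Forest": IsolationForest(contamination=outliers_fraction, random_state=rng),
    "Local Outlier Factor": LocalOutlierFactor(n_neighbors=35, contamination=outliers_fraction),
}

results = {}
for name, detector in detectors.items():
    # fit_predict labels each row: +1 for inliers, -1 for outliers
    labels = detector.fit_predict(X)
    results[name] = labels
    print("%s flagged %d of %d points as outliers" % (name, (labels == -1).sum(), len(X)))
```

With a real dataset, `X` would be the numeric columns of `data` (for example `data.select_dtypes("number").values`), and comparing which rows each detector flags is a reasonable first check before trusting any single method.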

Written by Michelle Bonat

Leader, entrepreneur, software engineer, data scientist. See my personal site at http://michellebonat.com and find me on Twitter at @mbonat.
