Start researching today!

Source: https://passportstatus.co/2016/12/08/us-map-50-states-and-capitals-maxresdefault/

Every state in America has numerous organizations and offices that handle large amounts of data on everything from traffic issues to weather patterns to education details. Luckily for us, almost all the states have made it mandatory to make this information public for the sake of transparency. Listed below are five such datasets for you to use in your own data science project!

If you’re interested in similar datasets with a much smaller scope, check out our article on 5 city datasets worth looking into.

Washington

Try looking at the most viewed mountain pass cameras and comparing them with the road and weather conditions, or finding a relationship between Puget Sound travel alerts and overall travel times. Be sure to check out the “New and Notable” section for deep and descriptive data visualizations, like an interactive plot of Governor Inslee’s proposed budget plan! …


Start a project today!


DubsTech, the largest tech community at the University of Washington, recently hosted UW’s first Datathon, a data science hackathon for both beginner and advanced data science students. My team and I were responsible for writing the prompts. We searched the web (mostly Kaggle) for interesting data science problems that would challenge both beginners and pros and could be scaled into full-blown projects after the Datathon.

Here are the 5 Data Science Project Prompts we came up with. …


Start Following, Networking and Sharing today


Twitter Data Scientists to Follow

Facebook Groups to join

LinkedIn Individuals to Follow

Some more:

Thank you for reading. Do give us a clap if you liked our article.


Amazing ML libraries to use in R


The no-nonsense guide to Machine Learning libraries to use in R

sourced from: https://github.com/qinwf/awesome-R

  • AnomalyDetection - AnomalyDetection R package from Twitter.
  • ahaz - Regularization for semiparametric additive hazards regression.
  • arules - Mining Association Rules and Frequent Itemsets
  • bigrf - Big Random Forests: Classification and Regression Forests for Large Data Sets
  • bigRR - Generalized Ridge Regression (with special advantage for p >> n cases)
  • bmrm - Bundle Methods for Regularized Risk Minimization Package
  • Boruta - A wrapper algorithm for all-relevant feature selection
  • BreakoutDetection - Breakout Detection via Robust E-Statistics from Twitter.
  • bst - Gradient Boosting
  • CausalImpact - Causal inference using Bayesian structural time-series models.
  • C50 - C5.0 Decision Trees and Rule-Based Models
  • caret - Classification and Regression…

Start Playing with Data Today!

Source: https://journeyofanalytics.files.wordpress.com/2016/01/cloud-1.png?w=676&h=434

Kaggle

Kaggle offers a platform where people can contribute datasets and other community members can vote on them and run kernels or scripts on them.

World Bank

Open data from the World Bank. The platform provides several tools, such as the Open Data Catalog, world development indices, and education indices.

Five Thirty Eight Datasets

Here is a link to the datasets used by Five Thirty Eight in their stories. Each dataset includes the data, a dictionary explaining the data, and a link to the story published by Five Thirty Eight.

Amazon Web Services (AWS) datasets

Amazon provides a few big datasets, which can be used on their platform or on your local computer. …


Start analyzing a city’s data today!

Source: https://images.readwrite.com/wp-content/uploads/2018/03/smartcity-01-min-e1475721806228-825x500.jpg

All cities have to keep track of the information that flows through them from various departments and organizations. Fortunately, almost all of this data is available to the public for the sake of transparency. We have collected 6 of these datasets for you to use in your own projects!

New York City, New York

Look through the extensive list of NYC departments and offices to see if something catches your eye, like maybe the Mayor’s Office to Combat Domestic Violence or the Office of Film, Theatre, and Broadcasting. …


Start preparing now

Source: https://img.huffingtonpost.com/asset/58e32f9a2c00003c00ff2150.jpg?ops=scalefit_820_noupscale

Applying for a data science job can be extremely stressful. The industry is flooded with new candidates, and employers are looking for the best of them. To ace the interview, you have to PREPARE, PREPARE, PREPARE.

Below are the most helpful data science interview resources we have found on the internet for you.

Data Science

Statistics:

SQL:

Python + R:

Tableau:

If you liked this article, give us a clap and we will make more such articles. Thank you for reading.


Start Your Data Science Career Here!


Data science is a booming field today, and data scientist is often called “the sexiest job of the 21st century”. However, it can be a pain to research and find the right learning resources on your own. We here at Data Science Library have put together an extensive index of online courses, videos, Medium posts, and books to help you get your journey started. Please enjoy!

Learn how to use Git:

Git is a very complicated piece of software to explain, but there are a few good Medium posts that explain it in depth, such as this one and this one. …


Get your dose of daily data science from these sources!

Source: https://cdn.technologynetworks.com/tn/images/thumbnails/rectangle/data-visualization-innovations-in-life-sciences-and-drug-discovery-296360.png

Here is a no-nonsense guide to the best Data Science publications you can find on Medium, brought to you by the Data Science Library!

Follow us on Twitter at @DataSciLibrary and check out our parent club DubsTech on Facebook here!

General Data Science Topics

There’s a lot to cover in data science, so it might be easier to check out these blogs rather than dive headfirst into more specialized ones! They help put data science concepts into perspective by keeping them grounded in more understandable, real-world issues.

Machine Learning

Machine learning uses statistical models and algorithms to teach a computer to perform a specific task. It sounds a lot easier than it actually is, as you can see from a few of these publications! …

About

Zoshua Colah

Information Specialist and Educator aiming to make the world a better place one step at a time
