Bite-Sized Neo4j for Data Scientists

Since signing on as a Data Science Advocate at Neo4j, I will admit to being overwhelmed by the amount of content that I could be generating. I could write a blog post a week and still retire with content left to create! However, the data science cycle is not like other software development. Simply put, it takes longer. Before you can write really solid code, you first have to do a bunch of exploratory data analysis (EDA), and that assumes you are starting with a clean data set that can answer the question to begin with! I frequently say that the role of a data scientist is at least 75% “data janitor,” constantly, constantly cleaning data. After the EDA, you begin trying to create some sort of analysis or model that answers the question being asked. You have to experiment with that model and optimize the hyperparameters. And that says nothing of MLOps and putting it into production! (You have a solid data engineer…or several…right???) And did I even mention scale?

The problems encountered by data scientists are many and broad. There are plenty of resources out there, but I don’t have time to read through a whole textbook on a subject to find the one function I need, much less watch a 4-hour video tutorial. I need to solve a problem quickly because, like I said, the data science cycle can take a while.

To that end, I have started a series of “micro-tutorials” in the form of both videos and Jupyter notebooks in a repo. Google Colab notebooks are also used and linked in the repo. The series consists of many 5-ish minute videos, each teaching just the one thing you need to know. It is called “Bite-Sized Neo4j for Data Scientists.” To keep things easy to find, below is a compiled list of the episodes, which I will update regularly as I create more content.

Check back here regularly since I hope to update this list weekly with a new video!

Videos

Part 1: Connect from Jupyter to a Neo4j Sandbox

Part 2: Using the py2neo Python Driver

Part 3: Using the Neo4j Python Driver

Part 4: Basic Cypher Queries (and with Google Colab)

Part 5: Populating the Database from Pandas

Part 6: Populating the Database with LOAD CSV

Part 7: Populating the Database with the neo4j-admin Tool

Director of Data Science at Vail Resorts. Formerly Data Science Advocate at Neo4j, Machine Learning Engineer at GitHub.

CJ Sullivan
