Using a Virtual Environment to Avoid Seeming like a Sadist

TL;DR: $ pip freeze > requirements.txt

Why not just write pretty code and push it to GitHub like a happy little clam, and not worry about making a requirements.txt? If my code runs on my computer, why should I give a care about my Python environment? What even is a Python environment? Perhaps a reticulated python’s terrarium?

There’s a happy little clam, in her environment

Nope. In short, we generate and share requirements.txt files to make it easier for other developers to install the correct versions of the required Python libraries (or “packages”) to run the Python code we’ve written.
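To make the idea concrete, here is a minimal sketch of roughly what `pip freeze > requirements.txt` does, written with the standard library's `importlib.metadata` (Python 3.8+). The function names `freeze_lines` and `write_requirements` are made up for illustration, and the output may differ a little from pip's own (pip leaves out a few packaging tools by default), so treat this as a peek under the hood and not a replacement for the real command.

```python
from importlib import metadata


def freeze_lines():
    """Return sorted 'name==version' pins for every installed distribution."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


def write_requirements(path="requirements.txt"):
    """Write the pins to `path`, one per line, and return them."""
    lines = freeze_lines()
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines


if __name__ == "__main__":
    # Print the pins for whichever environment is running this script.
    for line in freeze_lines():
        print(line)
```

Run it from inside an activated virtual environment and the list stays short, covering only what the project actually installed. Another developer can then recreate the same set with `pip install -r requirements.txt`.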

Open-source Python packages (like beautifulsoup, or jupyter, or any of the other 158,872+ projects on the PyPI index) offer tremendous functionality, way beyond that of the Python standard library. It’s like you can push a button and download any one of a bazillion effects pedals for your neat but sort of vanilla Fender Stratocaster, for…


with brew Python 3.8.x, in a regular virtual environment

Image via PixaBay.com under Creative Commons

Intro

I try to avoid (ana)conda when I can. I prefer pyenv.

Lately, I’ve found myself repeatedly fumbling while trying to get Jupyter up and running and packages like pandas importing properly. Maybe this had to do with recent upgrades to my OS (to Catalina) or switching from bash to zsh as my primary shell. In any case, I’d like to lay out some steps for repeatably getting Jupyter notebooks running on macOS Catalina, in a normal (brew/pyenv) Python virtual environment, without conda.

Steps Overview

Here’s the 9-step process that seems to be working for me right now:

  1. Via zsh: Install Homebrew
  2. Via Homebrew: Install…


Serverless Infrastructure as Code via AWS CloudFormation

2020–09–10: Is the Bryant Park Lawn Open?

When I used to work at a Data Analytics firm in Times Square, I’d frequently enjoy lunch on the beautiful Bryant Park Lawn. I could check https://bryantpark.org/ to see if the lawn was currently open, but the org page loaded somewhat slowly; it was full of images I didn’t care about.

Because of this (and a desire to learn AWS things), I spun up a dumb single-serving website from an EC2 box, which just said whether or not the lawn was open. You’d go to “is the bryant park lawn open dot com” and see this:

Inspired by http://hasthelargehadroncolliderdestroyedtheworldyet.com/

In the malaise of the COVID pandemic, my humble site has fallen into disrepair, offering the opportunity to make some retrofits. The last time I worked on this, I relied heavily on clicking around the AWS console in a web browser, a habit I’d like to get away from. …


Home Automation // My First Arduino Project

Tl;dr: I covered the perimeter of my apartment hallway in fancy LED strips. I installed motion sensors above all the doorways. I wrote a bunch of Arduino code. Now when you trip one of the motion sensors, the hallway lights up from that point outward, creating a “runway” effect. This is how.

2019–04–10: BEGIN PROJECT LOG

A few weeks ago I took an intro to Arduino class at my local hackerspace, NYC Resistor, and got inspired!

I’d like to install motion sensors and smart LED strips in our apartment’s entrance hallway. I’ll make them do a runway/chase effect, where the lights spark up sequentially (outward) in both directions, from the location at which the motion sensor is tripped. There could also be a button that forces the lights to stay on indefinitely, like…a light switch! …
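Before touching any hardware, the runway/chase idea can be sketched as a quick simulation. This one is in Python, not Arduino C++, and everything in it is an assumption made for illustration: a single strip of `num_leds` lights, one sensor at `trigger_index`, and the made-up helper names `runway_frames` and `render`.

```python
def runway_frames(num_leds, trigger_index):
    """Yield lists of lit LED indices, one list per animation step.

    The lit region grows by one LED in each direction per step,
    starting at trigger_index, until the whole strip is lit.
    """
    if not 0 <= trigger_index < num_leds:
        raise ValueError("trigger_index must be a position on the strip")
    # The farther end of the strip decides how many steps are needed.
    max_radius = max(trigger_index, num_leds - 1 - trigger_index)
    for radius in range(max_radius + 1):
        low = max(0, trigger_index - radius)
        high = min(num_leds - 1, trigger_index + radius)
        yield list(range(low, high + 1))


def render(num_leds, lit):
    """Draw the strip as text: '#' for a lit LED, '.' for a dark one."""
    lit = set(lit)
    return "".join("#" if i in lit else "." for i in range(num_leds))


if __name__ == "__main__":
    # A 9-LED strip with the motion sensor above LED 2.
    for frame in runway_frames(num_leds=9, trigger_index=2):
        print(render(9, frame))
```

On the real strip, each frame would become one pass of setting pixel colors followed by a short delay, and the "stay on" button would simply skip the animation and light every index.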


But I’ve been Looking for a While and Haven’t Found Much Yet.

The most detailed film production data I’ve been able to find is on WikiLeaks, like this budget for Unforgettable Season 3, and this budget for The Interview:

That there is some high-resolution budget data

I’m not even supposed to be able to see these two line-item film budgets, but I want thousands. Where do I find them? How do I open the floodgates?

Data science bootcamp students often do toy projects relating to movies, because everybody loves movies, duh, and because there’s always a nice Kaggle-style tabular movie dataset readily available within the first few pages of web search results. There’s OMDb, where you can query a nice RESTful API for a bunch of box-office data (and demonstrate that you can make API calls as an added bonus). You can even buy an upgraded OMDb key that lets you make many API calls for one whole dollar. …
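For a sense of what one of those API calls looks like, here is a minimal sketch that only builds the request URL and does not send it. The endpoint and the `apikey` and `t` (title) query parameters reflect my understanding of OMDb's public documentation, so check them against the current docs before relying on them. `build_omdb_url` and the `"YOUR_KEY"` placeholder are hypothetical names.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against OMDb's own documentation.
OMDB_ENDPOINT = "https://www.omdbapi.com/"


def build_omdb_url(title, api_key):
    """Return a lookup-by-title URL for the given movie title and key."""
    query = urlencode({"apikey": api_key, "t": title})
    return f"{OMDB_ENDPOINT}?{query}"


if __name__ == "__main__":
    # "YOUR_KEY" is a placeholder, not a working key.
    print(build_omdb_url("The Interview", "YOUR_KEY"))
```

Fetching that URL with any HTTP client should return a JSON record for the film, which is a long way from a line-item budget.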


Also Featuring: What and Why

When you write code, primarily, you want for it to work, and secondarily, you want for it to work efficiently. Efficient code minimizes those situations where you’re sitting there waiting for your script to run, wondering if you could have written more efficient code and spent the extra time sleeping, dreaming of rainbow unicorns. Efficient code is also just less likely to fail. We like that.

Sometimes, it’s better to use generator expressions than list comprehensions in Python, especially when you don’t need the full list that a list comprehension builds in memory.

For example, if you were to run a list comprehension squaring numbers within a sum function like this…
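A minimal sketch of that comparison follows. It is an illustration and not necessarily the snippet from the full post, and the variable names and the choice of `n = 100_000` are assumptions.

```python
import sys

n = 100_000

# List comprehension: builds all n squares in memory before summing.
squares_list = [x * x for x in range(n)]

# Generator expression: produces one square at a time, on demand.
squares_gen = (x * x for x in range(n))

# getsizeof measures only the container itself (not the ints inside it),
# which is already enough to show the gap.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)  # consumes the generator

if __name__ == "__main__":
    print("list container bytes:     ", list_size)
    print("generator container bytes:", gen_size)
    print("same total:", total_from_list == total_from_gen)
```

Both calls to `sum` return the same number. The difference is that the generator version never holds all the squares at once, which matters more and more as `n` grows.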


Why Even Bother with Small N?

Big data is fine and good: As the sample size of our dataset (n) approaches infinity, we can make increasingly confident and general assertions, based on increasingly nuanced aberrations and trends in that data.
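One way to see why more data buys more confidence is the standard error of a sample mean, which shrinks in proportion to one over the square root of n. The sketch below assumes independent samples with a known standard deviation `sigma` and uses the usual normal-approximation multiplier of 1.96 for a 95% interval. The function names are made up for illustration.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean for n independent samples."""
    return sigma / math.sqrt(n)


def ci_half_width(sigma, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for the mean."""
    return z * standard_error(sigma, n)


if __name__ == "__main__":
    # Each 100x increase in n narrows the interval by a factor of 10.
    for n in (10, 100, 10_000, 1_000_000):
        print(n, round(ci_half_width(sigma=1.0, n=n), 5))
```

The square root is the catch: halving the interval takes four times the data, so big n pays off, but at a steep price.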

Handy visualization of statistical power from rpsychologist.com

Large sample sizes are major key in many contexts, e.g. rocket science: Precision is king, academic reputations are at stake, and crossing some statistical confidence threshold might validate a lifetime of investigation into something as pivotal to our existence as, like, the big bang theory:

I’ve never seen anybody more excited by standard deviations

As great as Big Data is, there’s also a case to be made for Small Data (with purposely ironic capital letters), and here I’ll add to the choir making it. …


A Cursory Exploration

The very likable computer scientist, entrepreneur, and venture capitalist Paul Graham once mentioned, in his essay “Write Like You Talk”:

Informal language is the athletic clothing of ideas.

With that in mind, let’s discuss what computer vision “even is,” starting from not really knowing sh*t about it.

TechCrunch neatly opens the floor with this thought:

Someone across the room throws you a ball and you catch it. Simple, right? Actually, this is one of the most complex processes we’ve ever attempted to comprehend — let alone recreate.

And now, a quick glance at Wikipedia:

Computer vision is an interdisciplinary field that deals with how computers can be made to gain high-level understanding from digital images or videos. …


More on this later

You, a sometimes fun, sometimes creative individual. You, sometimes dabbling in the arts, perhaps even making a living in a more or less creative or artistic profession. You tell someone, perhaps an extended family member, perhaps a mildly estranged college friend, perhaps over coffee, that you have left your maybe cool-seeming, maybe stable-seeming job, to participate in a several-months-long, full-time, in-person coding boot camp.

Your companion then tends to reply, politely, somberly, in so many words:

Oh wow. Wonderful. So…you have utterly given up on all of your dreams? You are making a hard-pivot career change to the tech biz? You finally admit that your liberal arts bachelor’s was totally worthless? Rekt. Won’t you miss at least seeming cool? I was lightweight living vicariously through you fam. Have fun convincing people that boot camps work lol. …

About

Robert Boscacci

Data Scientist // @cinemarob1
