Why and How to make a Requirements.txt
Using a Virtual Environment to Avoid Seeming like a Sadist
Why not just write pretty code and push it to GitHub like a happy little clam, and not worry about making a requirements.txt? If my code runs on my computer, why should I give a care about my python environment? What even is a python environment? Perhaps a reticulated python’s terrarium?
Nope. In short, we make requirements.txt files from within virtual environments to avoid seeming like negligent sadists.
Python Packages, and Environment
Open-source python packages, like beautifulsoup, or jupyter, or any of the other 158,872 (sic) projects on the PyPI index, offer tremendous functionality, way beyond that of the standard Python library. It’s like you can push a button and download any one of a bazillion effects pedals for your neat but sort of vanilla Fender Stratocaster.
When I say python environment, I mean the ecosystem consisting of your particular installed version of python, plus all the third-party packages it can access (and their versions). Every time you $ pip install this_or_that, you are expanding your python environment with packages that are not part of the Python stdlib. If you $ pip install a bunch of stuff outside of a virtual environment (more on this later), then you are adding to your “base” or “root” or “system” python environment. That is fine and good and totally valid for many sandbox-y purposes.
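If you want to peek at the environment you are currently standing in, Python can tell you from the inside. This is a minimal sketch using only the standard library; the function name `describe_environment` is made up for this example.

```python
import platform
import sys


def describe_environment():
    """Return a few facts about the interpreter that is running this code."""
    return {
        "python_version": platform.python_version(),  # version of this interpreter
        "executable": sys.executable,  # path to the python binary in use
        "prefix": sys.prefix,  # root folder of the current environment
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it in two different environments and the executable and prefix should change, which is a handy sanity check that you are where you think you are.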
However, working exclusively in your base environment will inevitably cause headaches later, when you try to show the code you’ve built to other human beings. Believe me, other people are out there, and they want to see and use the work you’ve done—whether they’re trying to repurpose your code, or simply evaluate your competence as a programmer, perhaps in the process of trying to hire you. You can also start to run into compatibility issues (with your own code) as time goes on.
The problem we run into when we share our Python code is this: Not everyone has the same packages, or the same versions of those packages, installed and ready to rock when they grab your code from GitHub and try to run it. They strum the guitar you built, expecting to hear Metallica, but instead, they hear naught but errant farts.
Augustine Chang and I made a simple dash/flask app for class at Flatiron School. For the purpose of this demo I will pretend that I was not involved in the creation of this python code, so that I may highlight the potential pain involved in trying to download and run it on a totally random, different computer. Here we go:
# Just running the repo's main file, called "run.py" here.
$ python run.py
Traceback (most recent call last):
  File "run.py", line 1, in <module>
    from pkg import app
  File "/Users/rob/Documents/_flatiron/mod-1-proj/pkg/__init__.py", line 2, in <module>
    from flask import Flask
ImportError: No module named flask
# Gasp. What fresh hell is this?
This is an error message saying “No module named flask.” Okay, well, simple solution: We should be able to download and install this flask module using pip, which we’ll touch on very briefly:
# Let's download and install the missing thing. Ezpz.
$ pip install flask
Collecting flask
  Downloading https://files.pythonhosted.org/packages/7f/e7/08578774ed4536d3242b14dacb4696386634607af824ea997202cd0edb4b/Flask-1.0.2-py2.py3-none-any.whl (91kB)
    100% |████████████████████████████████| 92kB 3.2MB/s
...
Successfully installed Werkzeug-0.14.1 click-7.0 flask-1.0.2 itsdangerous-1.1.0
# Phew, good thing we have flask now so we can make our run.py go.
Great, we installed flask by using pip. Now that that’s out of the way, let’s try running the run.py again and jam out:
# Here we go again:
$ python run.py
Traceback (most recent call last):
  File "run.py", line 1, in <module>
    from pkg import app
  File "/Users/rob/Documents/_flatiron/mod-1-proj/pkg/__init__.py", line 4, in <module>
    from flask_sqlalchemy import SQLAlchemy
ModuleNotFoundError: No module named 'flask_sqlalchemy'
# Dear god
Cigar-less. Now it wants some other package called flask_sqlalchemy. Gross.
How many times will we have to undergo this tedious back-and-forth with what feels like a hungry-hippo software version of the DMV? This could go on for ages. Why can’t we just install all the packages this run.py script depends on, all at once? Are we even installing the right versions of the packages?
This, friends, is where a requirements.txt file comes into play. As long as the developers of this app provide a text file listing the necessary packages, we can simply $ pip install -r requirements.txt and voila! All of the program’s “dependencies” will be downloaded, installed, and ready to go in one fell swoopidy-woop.
But alas: The developers failed to include a requirements.txt with their code, so we are back to error-message ping-pong, installing packages one-by-one as our terminal bosses us around. What clown forgets to include a requirements.txt? Certainly not our future selves.
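When a repo ships without a requirements.txt, one way to shorten the ping-pong is to check a handful of imports in a single pass instead of waiting for each traceback. Here is a rough sketch using the standard library's `importlib.util.find_spec`; the helper name `find_missing_modules` is invented for this post, and it only handles top-level module names.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the names from module_names that cannot be imported here."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The two imports that blew up in the tracebacks above.
    print(find_missing_modules(["flask", "flask_sqlalchemy"]))
```

Note what this does not tell you: which versions the authors used, or what the package is called on PyPI (an import name and an install name are not always the same). That gap is exactly what a requirements.txt fills.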
Back into developer mode now. We want to make this requirements.txt file. We don’t want people to think of us as blatant sadists, or, like, negligent. But first, briefly: What is pip?
Pip literally stands for Pip Installs Packages. Enjoy that, momentarily. MIT computer scientists have a long and storied history full of recursive acronyms.
Pip, in addition to downloading and installing packages from the PyPI repository, can generate a requirements.txt file with this command:
$ pip freeze > requirements.txt
Here’s what comes out when I try this:
# write the file
$ pip freeze > requirements.txt
# show the file's contents
$ cat requirements.txt
appnope==0.1.0
Whoa dang. That’s a heck of a lot of packages. I know for a fact that we did not implement every single one of these in the app we just wrote. This is just a list of all the python packages I’ve ever downloaded and installed on my base system.
We can’t give this list to our users. There’s no need to make everybody who wants to use this app download and install every single package I have on my computer—that might take just as long as playing robo-error-ping-pong. We want to create a list of only the packages relevant to our app, and we want that list to detail the correct versions of every package.
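To get a feel for where that giant list comes from, here is a rough approximation of it in Python. It walks every distribution visible to the current interpreter using the standard library's `importlib.metadata` (Python 3.8 and up). This is a sketch, not a reimplementation of pip: the real `pip freeze` treats editable and version-control installs differently, and `freeze_lines` is a name I made up.

```python
from importlib.metadata import distributions


def freeze_lines():
    """Return sorted 'name==version' strings for every installed distribution."""
    lines = set()
    for dist in distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.metadata.get("Version")
        except Exception:
            continue  # skip distributions whose metadata cannot be read
        if name and version:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in freeze_lines():
        print(line)
```

The point to notice is that nothing here knows about your project. It lists whatever lives in the environment, which is why the environment itself has to be clean.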
Here’s what we should have done from the start:
We need to make a pristine little bubble world. A world where no python packages exist, except all the ones we need for this particular program to run. Then we can do a pip freeze, and archive the resulting requirements.txt for future users, including ourselves.
One nice thing about making a virtual environment (a “venv”) is that it enables us to get away from our base environment, and isolate our project in this little bubble world, where all these packages are currently working nicely with one another. We can take a snapshot of this fully functioning little bubble world, fold it up, and put it away. In the future, even after many new versions of each package (including python itself) have come and gone, we will be able to re-hydrate this little retro bubble world, and re-populate it with the correct versions of all the packages needed to make our code happy.
The Python 3 stdlib has built-in venv creation capabilities, but I don’t feel like talking about that. We’re gonna cover the very basics of creating and navigating virtual environments with Conda.
First, install Anaconda. That gets you the Conda package and environment manager, which just makes life more pleasant, in my experience, and allows us to do this:
$ conda create -n shiny_new_env python=3.4
We’ve just created a new virtual environment, and specified which version of Python will be installed to it! Now we can $ conda env list to see a list of venvs available for us to play with:
$ conda env list
# conda environments:
base                  *  /Users/rob/anaconda3
You’ll notice the star indicating which environment we’re in right now, the base environment. We can switch to our newly created virtual environment like so:
$ conda activate shiny_new_env
(shiny_new_env)$
Suddenly our venv’s name appears in parentheses to the left of our terminal prompt ($). This means we’re in our venv.
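If you would rather have a script check this than squint at your prompt, something like the following can work. It is a best-effort sketch: it relies on `conda activate` exporting the `CONDA_DEFAULT_ENV` variable, and falls back to the `sys.prefix` versus `sys.base_prefix` comparison that identifies stdlib venvs. The function name `active_environment` and its `environ` parameter are my own inventions for illustration.

```python
import os
import sys


def active_environment(environ=None):
    """Best-effort guess at the name of the active environment, or None."""
    if environ is None:
        environ = os.environ
    # `conda activate` exports CONDA_DEFAULT_ENV with the environment's name.
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return conda_env
    # Inside a stdlib venv, sys.prefix differs from sys.base_prefix.
    if sys.prefix != sys.base_prefix:
        return os.path.basename(sys.prefix)
    return None


if __name__ == "__main__":
    print(active_environment())
```

In the conda base environment this should report base, so a project script could refuse to run a pip freeze until you have activated something more specific.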
Now I can play the game of error-message ping-pong in this bubble world (just once, so that nobody else ever has to do it again to run our code). I’ll pip-install all the packages our app requires in this as-yet empty place, then try another $ pip freeze > requirements.txt:
# write the file
(shiny_new_env)$ pip freeze > requirements.txt
# show the contents
(shiny_new_env)$ cat requirements.txt
certifi==2018.10.15
Still a lot of dependencies, but better. This is just the stuff our run.py needs and nothing more. Now we can roll this requirements.txt file into our GitHub repo, and nobody else will have to go back-and-forth with their terminal installing all these dependencies manually. The correct versions of each package are safely tucked away for future reference. Hooray!
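The file itself is nothing fancy: one `name==version` pin per line. If you ever want to read those pins from a script, a small parser is enough for the simple case. This sketch deliberately ignores anything that is not an exact pin (version ranges, URLs, environment markers), and `parse_pins` is a hypothetical helper, not part of pip.

```python
def parse_pins(requirements_text):
    """Map package name -> pinned version for simple 'name==version' lines."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue  # skip blanks and anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


if __name__ == "__main__":
    # The first pin from the file above, plus a comment line.
    sample = "certifi==2018.10.15\n# a comment line\n"
    print(parse_pins(sample))  # {'certifi': '2018.10.15'}
```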
Let’s pretend to be a new user and try to get set up in another completely new, empty virtual environment:
# Exit the current venv
(shiny_new_env)$ conda deactivate
# Spin up a new one
$ conda create -n env_2 python=3.4
# Activate it
$ conda activate env_2
# Install from our fancy new file
(env_2)$ pip install --requirement requirements.txt
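Once the install finishes, you can double-check that the new environment really matches the pins. Below is a minimal sketch using `importlib.metadata.version` from the standard library (Python 3.8 and up). The helper `check_pins` and the example dictionary are assumptions for illustration, and the comparison is a plain string match, so it would treat 1.0 and 1.0.0 as different.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(pins):
    """Compare {name: pinned_version} against what is installed right now.

    Returns a dict mapping each mismatched package to (pinned, installed),
    where installed is None if the package is absent.
    """
    problems = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            problems[name] = (pinned, installed)
    return problems


if __name__ == "__main__":
    # Hypothetical pin, taken from the first line of the file shown earlier.
    print(check_pins({"certifi": "2018.10.15"}))
```

An empty result means every pinned package is present at exactly the pinned version.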
Don’t be a sadist. Always make a requirements.txt, and do it from inside a nice clean virtual environment. If you use a new virtual environment for every python project you undertake, you will thank yourself later, when new versions are released and the cracks in the sidewalk start to become fissures.
Lastly, remember to breathe!