I have some code I maintain as generic functions that I use throughout most of my Python files and notebooks. I have that code stored on a private Bitbucket repo, but I want to be able to access that code like I would any other Python package.
Luckily the process is relatively simple. For the most part, the steps are the exact same as creating any other Python package. We’ll go through each step by step.
Setup: File Structure
To make our lives easier, we need to have the right file structure. A Python package should look like this:
- package_name
  - functions_1.py
  - functions_2.py
  - __init__.py
- README.md
- setup.py
Step 1: Example Code
First, we’ll build a Python file containing an example function. This will be functions_1.py from the project structure above. We’re purposefully using pandas here so we can demonstrate how to add package dependencies to your custom package.
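As a minimal sketch of what functions_1.py could contain, here is one small pandas helper. The function name count_missing and its behavior are illustrative assumptions, not code from the original package; any function that imports pandas would serve the same purpose.

```python
import pandas as pd


def count_missing(df: pd.DataFrame) -> pd.Series:
    """Return the number of missing values in each column of df.

    Hypothetical example function; the name and purpose are placeholders.
    """
    # isna() marks missing cells as True; summing counts them per column.
    return df.isna().sum()
```

Because the module imports pandas at the top, pandas has to be installed wherever the package is used, which is what the dependency list in setup.py takes care of.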
Step 2: __init__.py
In the same folder as your source code, you’ll need to include a file called __init__.py. This file doesn’t need to contain anything; it just needs to exist so the Python packaging mechanism knows where to look for functions. If you’re using a Unix-based system like macOS, or something like Cygwin on a Windows machine, you can navigate to the proper folder (using the structure above) and simply use the command touch __init__.py.
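If touch isn't available (for example on plain Windows), the same empty file can be created from Python. This is a small sketch using the standard library's pathlib; it assumes you run it from the folder that contains package_name, and it creates that folder if it doesn't exist yet.

```python
from pathlib import Path

# Assumes the current working directory is the project root from the structure above.
package_dir = Path("package_name")
package_dir.mkdir(exist_ok=True)

# Create an empty __init__.py; an existing file is left untouched.
init_file = package_dir / "__init__.py"
init_file.touch(exist_ok=True)
```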
Step 3: setup.py
A package setup file can be as complex as you want it to be. There are A LOT of options here, but we’ll take the easy road. All we’ll do is add the required fields and some useful metadata:
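A minimal sketch of such a setup.py, assuming setuptools. Every metadata value here (name, version, description, URL, author) is a placeholder rather than the original author's file, and the layout is chosen so that name, packages, and install_requires fall on lines 4, 9, and 10, the line numbers the notes refer to.

```python
from setuptools import setup

setup(
    name="packagename",  # hypothetical distribution name, no spaces or special characters
    version="0.1.0",  # placeholder version
    description="Shared helper functions",  # placeholder metadata
    url="https://bitbucket.org/private_domain/package_name",  # placeholder repo URL
    author="Your Name",  # placeholder
    packages=["package_name"],  # folder that holds __init__.py
    install_requires=["pandas"],  # dependencies pip installs with the package
)
```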
A few notes:
- It’s important that the value of name in line 4 not have spaces or special characters. Including underscores, for example, will only make your life more difficult down the road. Trust me.
- The value of packages in line 9 must match the name of the package folder (package_name) in the project structure we laid out in our setup step.
- install_requires in line 10 can take an arbitrary list of packages, including specific version numbers if required. To specify a version number, use the following format: pandas==0.23.3
Step 4: Install Package
We’re done with the package setup, so now all we have to do is install it! It’s relatively easy: all we really need is the project’s clone command from Bitbucket, which contains the repository URL. It should look something like this:
git clone https://username@bitbucket.org/private_domain/package_name
Once we have that, all we need to do is open our CLI and navigate to where our Python packages are stored:
- If you’re using conda on a Windows machine, that probably looks something like C:\Users\username\.conda\pkgs.
- If you’re not using conda and you’re on a Windows machine, you can use the command pip show pandas to have pip show you the install location of a package. Here we’re using pandas as an example, but it could be any package you have installed. That should show you the full path of where pip installs packages. You’ll need to copy that path, going up one level (i.e., the parent directory where pandas is stored).
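Another way to find that directory is to ask the interpreter directly. This sketch assumes the directory you're after is the site-packages folder of whichever Python environment is currently active; it uses only the standard library's sysconfig module.

```python
import sysconfig

# "purelib" is the directory where pure-Python packages are installed
# for the interpreter running this script (typically site-packages).
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```

Run it with the same Python (or conda environment) you plan to install the package into, since each environment has its own location.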
Once we have that path, navigate to it and input the following command to install your package:
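pip install -e "git+https://username@bitbucket.org/private_domain/package_name#egg=package_name"
Important: Don’t forget to plug in the URL from the clone command you copied from Bitbucket within the double quotes! pip expects the URL itself with a git+ prefix, not the words git clone, and the #egg=package_name suffix stays at the end.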
You will likely be asked for credentials by Atlassian, so log in if required.
Package Usage
That’s really it! You should be able to access that package now, just like any other Python package:
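A quick way to confirm this is a short check like the sketch below. The helper is_installed is a hypothetical name introduced here, and the import assumes the package folder and module are called package_name and functions_1, as in the file structure at the top.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the import system can find the top-level module."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("package_name"):
    # Module name taken from the file structure above.
    from package_name import functions_1

    print("Loaded from:", functions_1.__file__)
else:
    print("package_name not found; check that the pip install step succeeded")
```

Once the import works, the functions in functions_1 can be called like those of any other installed module.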