This is how I built my first pip package 😃

Kabilesh Kumararatnam
Published in Tech-Sauce · 4 min read · Sep 2, 2019

Python developers, you have surely installed plenty of packages from the Python Package Index (PyPI) for all sorts of purposes. Have you ever wondered how to develop your own package or library and share it with the Python community so that everyone can use it? This is how I did it, and you can follow along too 😃.

Developing the package and project structure

We will develop a spam classifier that labels emails as spam or ham (not spam) using a bigram approach. Create a Python project with a virtual environment and set up the following file structure. The complete project is also available in this GitHub repository; cloning it to your local environment is the easiest way to get started.

File structure of the project

spam_collection.csv is the dataset; a few sample entries look like this.

Sample entries in the dataset

The following is the complete code for our spam classifier.

Here, bigram_spam_classifier will be our package name. Once you create this structure, run all of the commands in this tutorial from the top-level folder, so be sure to cd Spam_Classifier.

You should also edit bigram_spam_classifier/__init__.py and put the following code in it. It is there only so that you can verify the package installed correctly; PyPI does not use it.

name = "bigram_spam_classifier"
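Once the package is installed, one way to run that check is to import it and read the attribute back. The helper below is only a sketch: read_package_name is a hypothetical name chosen for illustration and is not part of the package.

```python
import importlib


def read_package_name(module_name="bigram_spam_classifier"):
    """Return the module's `name` attribute, or None if it is unavailable.

    `read_package_name` is a hypothetical helper for this article only.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed in the current environment.
        return None
    # A module without a `name` attribute also yields None.
    return getattr(module, "name", None)
```

Calling read_package_name() in an environment where the package is installed should return the string set in __init__.py; a result of None suggests the install did not succeed.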

setup.py is the build script for setuptools. It tells setuptools about your package (such as the name and version) as well as which code files to include.

Open setup.py and enter the following content.

Open README.md and enter guidance for users of your package.

It’s important for every package uploaded to the Python Package Index to include a license. This tells users who install your package the terms under which they can use it. For help picking a license, see https://choosealicense.com/. Once you have chosen a license, open LICENSE and enter the license text. For example, if you had chosen the MIT license:

The tricky part for me was shipping non-Python files alongside the Python files. After some research I found that the MANIFEST.in file can be used for this purpose. My MANIFEST.in file is as follows.

Generating distribution archives

The next step is to generate distribution packages for the package. These are archives that are uploaded to the Package Index and can be installed by pip. Make sure you have the latest versions of setuptools and wheel installed:

python3 -m pip install --user --upgrade setuptools wheel

Now run this command from the same directory where setup.py is located:

python3 setup.py sdist bdist_wheel

This command should output a lot of text and once completed should generate two files in the dist directory:

Folder structure created after build command
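Before uploading, it can be worth checking that both archives exist and that the dataset actually made it into the source archive, since that is what MANIFEST.in is for. The sketch below uses only the standard library; list_archives and sdist_contains are hypothetical helper names, and the check assumes the source archive is a .tar.gz file.

```python
import tarfile
from pathlib import Path


def list_archives(dist_dir="dist"):
    """Return the source archives (.tar.gz) followed by the wheels (.whl)."""
    dist = Path(dist_dir)
    return sorted(dist.glob("*.tar.gz")) + sorted(dist.glob("*.whl"))


def sdist_contains(sdist_path, filename="spam_collection.csv"):
    """Return True if any file inside the source archive is named `filename`."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        return any(Path(member).name == filename for member in archive.getnames())


if __name__ == "__main__":
    for archive_path in list_archives():
        print(archive_path.name)
        if archive_path.name.endswith(".tar.gz"):
            # False here suggests MANIFEST.in did not pick up the dataset.
            print("  contains dataset:", sdist_contains(archive_path))
```

Run it from the folder that contains dist. If the dataset is reported as missing, revisit MANIFEST.in and rebuild before uploading.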

Uploading the distribution archives

Once you have finished building your package, register an account on Test PyPI. Test PyPI is a separate instance of the package index intended for testing and experimentation. To register an account, go to https://test.pypi.org/account/register/ and complete the steps on that page. Now that you are registered, you can use twine to upload the distribution packages. You’ll need to install Twine:

python3 -m pip install --user --upgrade twine

Once installed, run Twine to upload all of the archives under dist:

python3 -m twine upload --repository-url https://test.pypi.org/legacy/ dist/*

Installing your newly uploaded package

The following command installs the package from TestPyPI:

pip install -i https://test.pypi.org/simple/ bigram-spam-classifier

The first time you run the package, you will have to download the wordnet corpus from NLTK.

If you get “1” as the result, the input message is classified as spam; if you get “0”, it is ham.

You can also list the unigrams and bigrams in the message as follows.

The following gives the probabilities of the message being spam or ham.
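To make the terms concrete: a unigram is a single word and a bigram is a pair of adjacent words. The sketch below is a standalone illustration and not the package's own API; the function names and the simple regex tokenizer are assumptions, and it skips the WordNet-based processing the package relies on.

```python
import re


def tokenize(message):
    """Lowercase the message and split it into word tokens (assumed rule)."""
    return re.findall(r"[a-z0-9']+", message.lower())


def unigrams(message):
    """Return the single-word tokens of the message."""
    return tokenize(message)


def bigrams(message):
    """Return each pair of adjacent tokens as a tuple."""
    tokens = tokenize(message)
    return list(zip(tokens, tokens[1:]))


print(unigrams("Win a free prize now!"))
# ['win', 'a', 'free', 'prize', 'now']
print(bigrams("Win a free prize now!"))
# [('win', 'a'), ('a', 'free'), ('free', 'prize'), ('prize', 'now')]
```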
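For readers curious how such probabilities can be derived from bigrams, here is a minimal sketch of one common approach: naive Bayes over bigram counts with add-one smoothing. It is not taken from the package, which may work differently. The function names are hypothetical and the four training messages are made-up toy examples, not rows from spam_collection.csv.

```python
import math
import re
from collections import Counter


def bigrams(message):
    """Return adjacent word pairs using a simple, assumed tokenizer."""
    tokens = re.findall(r"[a-z0-9']+", message.lower())
    return list(zip(tokens, tokens[1:]))


def train(labelled_messages):
    """Count bigrams and messages per class from (label, text) pairs."""
    bigram_counts = {"spam": Counter(), "ham": Counter()}
    message_counts = Counter()
    for label, text in labelled_messages:
        bigram_counts[label].update(bigrams(text))
        message_counts[label] += 1
    return bigram_counts, message_counts


def class_probabilities(message, bigram_counts, message_counts):
    """Return spam/ham probabilities; needs at least one message per class."""
    vocabulary = set(bigram_counts["spam"]) | set(bigram_counts["ham"])
    total_messages = sum(message_counts.values())
    log_scores = {}
    for label, counts in bigram_counts.items():
        total = sum(counts.values())
        # Class prior: share of training messages carrying this label.
        score = math.log(message_counts[label] / total_messages)
        for gram in bigrams(message):
            # Add-one smoothing; the extra 1 in the denominator reserves
            # a slot for bigrams never seen during training.
            score += math.log((counts[gram] + 1) / (total + len(vocabulary) + 1))
        log_scores[label] = score
    # Normalise via the largest log score so the two values sum to 1.
    peak = max(log_scores.values())
    weights = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(weights.values())
    return {label: w / norm for label, w in weights.items()}


# Made-up toy data, for illustration only.
training = [
    ("spam", "win a free prize now"),
    ("spam", "claim your free prize today"),
    ("ham", "are we still meeting for lunch today"),
    ("ham", "see you at lunch"),
]
print(class_probabilities("free prize inside", *train(training)))
```

With this toy data the spam probability comes out higher than the ham probability, because the bigram "free prize" appears only in the spam examples.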

Publishing your package in PyPI

Now we will upload our package to PyPI itself instead of the test environment. When you are ready to upload a real package to the Python Package Index, you can do much the same as you did in this tutorial, but with these important differences:

  • Choose a memorable and unique name for your package.
  • Register an account on https://pypi.org. Note that Test PyPI and PyPI are two separate servers, and the login details from the test server are not shared with the main server.
  • Use twine upload dist/* to upload your package and enter your credentials for the account you registered on the real PyPI.
  • Install your package from the real PyPI using pip install [your-package].

You can install my package using pip install bigram-spam-classifier. See https://github.com/Kabilesh93/bigram-spam-classifier for more information on the package.
