Package shorttext 1.0.0 released

Jul 14, 2018

The Python package shorttext 1.0.0 has been released. This package provides functions and classes that facilitate text preprocessing, topic modeling, machine learning, the use of various deep neural network architectures, and the computation of certain metrics. It smooths the building of text mining pipelines.

The package runs under Python 2.7, 3.5, and 3.6.

To install, type the following on the command line:

pip install -U shorttext

You might need to prefix the command with sudo to run it with administrator privileges. The package provides functions and classes to do the following:

example data provided (including subject keywords and NIH RePORT);
text preprocessing;
pre-trained word-embedding support;
gensim topic models (LDA, LSI, Random Projections) and autoencoder;
topic model representation supported for supervised learning using scikit-learn;
cosine distance classification;
neural network classification (including ConvNet and C-LSTM);
maximum entropy classification;
metrics of phrase differences, including soft Jaccard score (using Damerau-Levenshtein distance) and Word Mover’s distance (WMD);
character-level sequence-to-sequence (seq2seq) learning; and
spell correction.
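
To give a feel for one of the metrics listed above, here is a minimal, standard-library-only sketch of a soft Jaccard score between two token lists. It is an illustration under stated assumptions, not the package's own implementation: the function names (`osa_distance`, `similarity`, `soft_jaccard`) are hypothetical, the edit distance is the restricted Damerau-Levenshtein variant (optimal string alignment), and tokens are paired greedily one-to-one, best match first.

```python
from itertools import product


def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution
            )
            # transposition of two adjacent characters
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def similarity(a, b):
    """Map the edit distance to a similarity in [0, 1] (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - osa_distance(a, b) / float(longest)


def soft_jaccard(tokens1, tokens2):
    """Soft Jaccard: sum of matched similarities over the soft union size."""
    # Score every cross pair, then take the best pairs first (greedy matching).
    scored = sorted(
        ((similarity(s, t), i, j)
         for (i, s), (j, t) in product(enumerate(tokens1), enumerate(tokens2))),
        reverse=True,
    )
    used1, used2 = set(), set()
    overlap = 0.0
    for sim, i, j in scored:
        if i not in used1 and j not in used2:
            used1.add(i)
            used2.add(j)
            overlap += sim
    union = len(tokens1) + len(tokens2) - overlap
    # Two empty token lists are treated as identical (an assumption of this sketch).
    return overlap / union if union else 1.0
```

Calling `soft_jaccard(["book", "seller"], ["blok", "sellers"])` rewards the near-matches instead of scoring them as zero, which is the point of the "soft" variant when short texts contain typos or inflected forms.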

# Links

The PyPI page: https://pypi.org/project/shorttext/

GitHub: https://github.com/stephenhky/PyShortTextCategorization

Documentation: http://shorttext.rtfd.io


Written by Stephen Ho
