R in Python
Sometimes you need the best of both worlds, using R and Python together.
Here is one approach, with no elaborate libraries required, for writing and then running R scripts from within Python. (It even works in Colab!)
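As a rough illustration of the write-then-run idea, here is a minimal sketch that uses only the Python standard library. It assumes an R installation that puts the `Rscript` executable on the PATH. The helper names `write_r_script` and `run_r_script` and the file name `script.R` are made up for this example and may differ from what the post itself uses.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def write_r_script(code: str, directory: str) -> Path:
    """Write R source code to a .R file and return its path (hypothetical helper)."""
    path = Path(directory) / "script.R"
    path.write_text(code, encoding="utf-8")
    return path


def run_r_script(path: Path):
    """Run the script with Rscript and return its stdout.

    Returns None when Rscript is not found on the PATH (assumption: R is
    installed separately; this sketch does not install it).
    """
    rscript = shutil.which("Rscript")
    if rscript is None:
        return None
    result = subprocess.run(
        [rscript, str(path)],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output as str rather than bytes
        check=True,           # raise CalledProcessError if R exits with an error
    )
    return result.stdout


# A tiny R program: compute the mean of a vector and print it.
r_code = 'x <- c(1, 2, 3)\ncat(mean(x), "\\n")\n'

with tempfile.TemporaryDirectory() as tmp:
    script_path = write_r_script(r_code, tmp)
    written = script_path.read_text(encoding="utf-8")
    output = run_r_script(script_path)

print(output if output is not None else "Rscript not found on PATH")
```

Passing data between the two languages this way usually means going through files or standard output, which is the price of avoiding a bridging library.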
An interesting point about the relationship between stop words, word frequency, document frequency, and the high and low document-frequency cut-offs comes from the parameter descriptions of two sklearn classes, TfidfVectorizer and CountVectorizer (more details below).
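To make the cut-off idea concrete, here is a small pure-Python sketch of filtering a vocabulary by document frequency. It is not sklearn's implementation. The toy corpus and the function names are invented for illustration, and the parameters are simplified: `min_df` is treated only as an absolute document count and `max_df` only as a proportion of documents, whereas the sklearn parameters of the same names accept either form.

```python
from collections import Counter

# Toy corpus, invented for this example.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the bird flew over the house",
]


def document_frequency(documents):
    """Count how many documents each term appears in (not how often overall)."""
    df = Counter()
    for doc in documents:
        df.update(set(doc.split()))  # set() so a term counts once per document
    return df


def filter_vocabulary(documents, min_df=1, max_df=1.0):
    """Keep terms whose document frequency lies within the cut-offs.

    Simplifying assumption: min_df is an absolute document count and
    max_df is a proportion of the corpus.
    """
    n_docs = len(documents)
    df = document_frequency(documents)
    return sorted(
        term
        for term, count in df.items()
        if count >= min_df and count / n_docs <= max_df
    )


vocab = filter_vocabulary(docs, min_df=2, max_df=0.9)
print(vocab)
```

With these settings a term that appears in every document, such as "the", is dropped by the high cut-off without any stop-word list, which is why a high cut-off can act like a corpus-specific stop-word filter.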
In the past year I had the opportunity to work with a cross-functional team of coders and fellow data scientists on a project for C4ADS, the Center for Advanced Defense Studies. Because of privacy and NDA requirements I will be somewhat vague about the project, but here…