Patrick van Kessel in Pew Research Center: Decoded
Professional translators or Google Translate? Weighing the pros and cons of each
Pew Research Center recently sought to translate more than 11,000 open-ended survey responses into English.
Dec 17, 2021

Patrick van Kessel in Pew Research Center: Decoded
Are topic models reliable or useful?
The final post in our series examines how topic models can and can’t help when classifying large amounts of text.
Sep 24, 2021

Patrick van Kessel in Pew Research Center: Decoded
How keyword oversampling can help with text analysis
Keyword oversampling can be a powerful way to analyze uncommon subsets of text data.
Sep 24, 2021

Patrick van Kessel in Pew Research Center: Decoded
Introducing Pew Research Center’s Python libraries
We’re excited to release a collection of Python tools that we’ve found ourselves returning to again and again.
Jun 9, 2020

Patrick van Kessel in Pew Research Center: Decoded
Interpreting and validating topic models
Interpreting topics from a model can be more difficult than it may initially seem.
Aug 1, 2019

Patrick van Kessel in Pew Research Center: Decoded
Overcoming the limitations of topic models with a semi-supervised approach
Difficulties can arise when researchers attempt to use topic models to measure content. A “semi-supervised” approach can help.
Apr 10, 2019

Patrick van Kessel in Pew Research Center: Decoded
Making sense of topic models
Topic models can produce clusters of words that characterize written documents. But how do we figure out what those clusters mean, exactly?
Aug 13, 2018

Patrick van Kessel in Pew Research Center: Decoded
An intro to topic models for text analysis
Topic models can scan documents, examine words and phrases within them, and “learn” groups of words that characterize those documents.
Aug 13, 2018