Published in 華田士多

How can you sum up the plot of Harry Potter in a single picture? A small NLTK experiment (1)

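Lately Mrs. Tin and I have been looking into how data science is applied in the humanities, and only then did I learn that the humanities have been busy with "digital humanities" in recent years. It starts with basics like OCR-ing old texts, moves on to the intermediate work of turning texts into data, then to somewhat more advanced visualization, and further up to style analysis, authorship classifiers, network analysis and so on. It is quite a flourishing field.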

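Whether all this turns out to be just a gimmick, I don't know enough to comment. Still, some of the applications are quite interesting: they can turn text analysis, which used to rely mostly on feel, into quantified, visualized information.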

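This time I'll take Harry Potter Book 1 and NLTK (Natural Language Toolkit), the most commonly used Python package for natural language processing (NLP), and share 2 small applications. This post covers a way to turn the plot into a picture: if we could use only one image, how would we summarize the storyline of an entire book?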
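To make the idea concrete before getting into details, here is a minimal sketch of one possible way to turn a book into plottable numbers: cut the text into equal-sized segments and count how often each character's name appears in each one. This is an assumption for illustration, not necessarily the method this series ends up using. The function name `mentions_per_segment`, the toy `sample` text and the list of names are all hypothetical; in practice you would load the full text of the book yourself.

```python
import re
from collections import Counter

try:
    # Regex-based NLTK tokenizer; it needs no extra corpus download.
    from nltk.tokenize import wordpunct_tokenize
except ImportError:
    def wordpunct_tokenize(text):
        # Fallback using the same pattern as NLTK's WordPunctTokenizer.
        return re.findall(r"\w+|[^\w\s]+", text)


def mentions_per_segment(text, names, n_segments=10):
    """Split the token stream into roughly n_segments equal parts
    and count how often each name appears in each part.
    (Hypothetical helper, written for this illustration only.)"""
    tokens = wordpunct_tokenize(text)
    size = max(1, -(-len(tokens) // n_segments))  # ceiling division
    rows = []
    for start in range(0, len(tokens), size):
        counts = Counter(tokens[start:start + size])
        rows.append({name: counts[name] for name in names})
    return rows


# Toy stand-in text (an assumption); replace it with the real book text.
sample = ("Harry met Hagrid . Hagrid took Harry to London . "
          "Harry met Ron and Hermione . Ron and Hermione helped Harry .")
rows = mentions_per_segment(
    sample, ["Harry", "Hagrid", "Ron", "Hermione"], n_segments=2
)
for i, row in enumerate(rows):
    print(i, row)
```

Each row is one slice of the book, so plotting the counts against the segment index gives a rough picture of who dominates which part of the story. NLTK also ships a ready-made `Text.dispersion_plot` that draws where chosen words occur along a text, which is another route to a similar picture.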

華田 Watin

Wallace Tin || Sloppy Mercenary || Banks, Languages & Mo-liu Things || fb.me/WatinResearch || patreon.com/watin || watinmedium@gmail.com