A Beginner’s Guide to Japanese Morphological Analysis
This article covers the basics of morphological analysis for Japanese. All you need to do is follow the three questions below.
1. What is Morphology?
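You know that an article is built from paragraphs, and a paragraph is built from sentences. How about a sentence? Words, right?
Yes, a sentence is built from words. But what builds a word? In English, words are spelled with the letters of the alphabet. In Japanese, the building blocks of a word are called morphemes, the smallest units that carry meaning, and morphology is the study of how they combine into words. The same applies to Chinese.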
2. What is Morphological Analysis?
Morphological analysis, then, is the analysis of how words are composed: how likely it is that a run of adjacent morphemes together forms a word. As a result of the analysis, a sentence can be split into words.
3. Why do we need Morphological Analysis?
You may ask why we need to split a sentence into words.
Perhaps you have never seen Chinese or Japanese text before. Why not search Google for a sample? Then you may understand immediately.
If you still do not see it, let me explain the reason.
In English text, there are natural word boundaries inside a sentence, usually marked by spaces.
Imagine the sentence “Today is a nice day.” If we remove the spaces, it becomes “Todayisaniceday.” How easily could you understand this?
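To make this concrete, here is a minimal Python sketch (the variable names are just for illustration). It shows that splitting on spaces recovers the English words, and that the same approach finds nothing to split once the spaces are removed.

```python
# Splitting on spaces recovers the words of an English sentence.
sentence = "Today is a nice day."
words = sentence.rstrip(".").split(" ")
print(words)  # ['Today', 'is', 'a', 'nice', 'day']

# Once the spaces are gone, the same split finds no boundaries at all.
squashed = sentence.replace(" ", "")
print(squashed)  # Todayisaniceday.
print(squashed.rstrip(".").split(" "))  # ['Todayisaniceday']
```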
In Japanese, there are naturally no word boundaries inside a sentence.
Let’s look at an example of a Japanese sentence:
今日は天気がいいです。
This sentence means “Today’s weather is good.”
Now ask yourself: how many words are there in this Japanese sentence?
What morphological analysis does here is tell you:
There are 6 words: “今日”, “は”, “天気”, “が”, “いい”, “です”.
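To get a feel for how a program might arrive at this split, here is a toy Python sketch. It uses greedy longest-match lookup against a tiny hand-made dictionary. Both the dictionary and the `segment` function are assumptions made up for this illustration. Real analyzers such as MeCab rely on large dictionaries and statistical models rather than this simple rule.

```python
# Toy dictionary: only the six words from the example sentence (an assumption
# for illustration; a real analyzer ships a far larger dictionary).
TOY_DICTIONARY = {"今日", "は", "天気", "が", "いい", "です"}


def segment(text, dictionary):
    """Split text into words by greedy longest match against the dictionary."""
    longest = max(len(word) for word in dictionary)
    words = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(longest, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            # An unknown single character is emitted as its own word,
            # so the loop always moves forward.
            if candidate in dictionary or size == 1:
                words.append(candidate)
                pos += size
                break
    return words


sentence = "今日は天気がいいです"
words = segment(sentence, TOY_DICTIONARY)
print(len(words), words)
# 6 ['今日', 'は', '天気', 'が', 'いい', 'です']
```

Greedy matching works here only because the toy dictionary was built from the answer. On real text, several different splits are often possible, which is why a real analyzer has to judge which one is most likely.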
In my next article, I will write about how to perform Japanese morphological analysis using the MeCab library.
This will require some technical background in Python.