Do you need a Khmer name for your baby? How about using artificial intelligence (AI) to generate one? AI has advanced considerably in the last decade. The term “machine learning,” a subset of AI (sometimes used interchangeably with it), has taken off recently thanks to an approach called “deep learning.” Deep learning is most often associated with computer vision, but it also applies to other fields such as audio processing and natural language processing (NLP). This article discusses one NLP approach that can be used to generate names.
We can use machine learning to generate entirely new names based on the style of names we want the machine to learn from. We used Khmer names as the input; the algorithm learns from those names and generates names in a similar style. The algorithm is a deep learning approach called a Recurrent Neural Network (RNN).
Now we go through the process of generating these names. First, we need an existing list of Khmer names. The first set is about 600 Khmer names written in English (romanized). We will also do the same with Khmer names written in Khmer script.
A similar technique is used in machine translation, where a computer learns the probabilities of words from pairs of sentences in two different languages. For name generation, the algorithm uses characters instead of words. So we feed each name to the algorithm character by character, with a newline character marking the end of each name.
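The character-by-character encoding can be sketched as follows. This is a minimal illustration, not the article's actual code; the sample names, helper names, and vocabulary are assumptions.

```python
# Minimal sketch: turn names into sequences of character indices,
# with a newline character acting as the end-of-name token.

EOS = "\n"  # newline marks the end of each name

def build_vocab(names):
    """Map every character that appears in the names (plus EOS) to an index."""
    chars = sorted(set("".join(names)) | {EOS})
    return {ch: i for i, ch in enumerate(chars)}

def encode(name, vocab):
    """Turn a name into a list of character indices, terminated by EOS."""
    return [vocab[ch] for ch in name + EOS]

names = ["Sokha", "Dara", "Chanthou"]  # illustrative romanized Khmer names
vocab = build_vocab(names)
print(encode("Dara", vocab))
```

Each encoded sequence is what the RNN consumes one step at a time, and the trailing newline index is what teaches the model where names end.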
The next step is to define the architecture that the algorithm will use to learn. This architecture feeds multiple inputs through an RNN known as an encoder; the inputs are the individual characters of each name. From these inputs it builds a hidden state, which is then passed on to a decoder that produces multiple outputs.
The decoder first takes an empty character as its initial input and predicts one output. That output is fed to the next unit as input to predict another output, which in turn becomes the input to the unit after that, and so on, until the predicted output is the end-of-line character. That is when the generation of the name ends.
As input flows from one unit to the next, this flow of information is called forward propagation. Each time the network predicts a name, it adjusts its weights from the final layers back to the first, a process known as back-propagation. Here it takes the form of back-propagation through time, which passes gradients from one sequence element back to the previous one, as if moving backward in time.
There are cases where the weights are multiplied many times over several steps, so the resulting numbers become either too small or too large, exceeding the valid numeric range and causing errors. This issue is called the vanishing or exploding gradient problem. As an example, an exploding gradient can happen when we compute the gradient for a letter at position n: we also compute gradients for the earlier letters (n-1 down to 1) behind it. Multiplying by a large number many times makes the gradient very large, hence an exploding gradient. To resolve this issue we use gradient clipping, which limits the gradient to a fixed range of values.
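Gradient clipping by value can be sketched as below; the clipping threshold is an illustrative choice, not the article's actual setting.

```python
# Minimal sketch of gradient clipping by value: every gradient element
# outside [-limit, limit] is clamped back into that range, so repeated
# multiplication can no longer push updates toward infinity.
import numpy as np

def clip_gradients(grads, limit=5.0):
    """Clip every gradient element into [-limit, limit]."""
    return [np.clip(g, -limit, limit) for g in grads]

grads = [np.array([0.5, -12.0, 300.0])]   # one gradient has exploded
clipped = clip_gradients(grads)
print(clipped[0])                          # values outside [-5, 5] are clamped
```

An alternative is clipping by the overall gradient norm, which rescales the whole gradient vector instead of clamping individual elements; either way, the goal is to keep updates within a sane range.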
The next step in setting up this deep learning algorithm is to choose the loss function. The loss function tells the algorithm how to optimize the result as it loops through the data many times; the algorithm tries to minimize the loss, bringing it as close to zero as it can. For this approach we use the cross-entropy loss. It is defined as follows:
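For a name of length T, where y^{<t>} is the one-hot true character at position t and ŷ^{<t>} is the predicted probability distribution over characters, the standard cross-entropy loss can be written as:

```latex
\mathcal{L}(\hat{y}, y) = -\sum_{t=1}^{T} \sum_{i} y_i^{\langle t \rangle} \log \hat{y}_i^{\langle t \rangle}
```

Since y^{<t>} is one-hot, each inner sum reduces to the negative log probability the model assigned to the correct character, so the loss is small exactly when the model is confident about the right characters.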
With these three pieces set up, we can now run the algorithm. This RNN ran for over 80 thousand iterations to learn the structure of Khmer names and produced a model with a loss of around 10.
After the model is built, we pass in a<0> as the zero vector and predict y<1>. We then pass y<1> to the next sequence step to get y<2>, and keep repeating until some y<n> is the end-of-word character. The sequence y<1> through y<n> is our generated name.
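The sampling loop above can be sketched as a self-contained example. The weights here are random placeholders rather than a trained model, so the output is gibberish; the point is only to show the mechanics of feeding each prediction back in until the end-of-name character appears.

```python
# Self-contained sketch of the sampling loop: start from a zero hidden
# state and an empty input, draw a character from the model's output
# distribution, feed it back in, and stop at the newline character.
import numpy as np

rng = np.random.default_rng(42)
chars = list("abcdehknorst") + ["\n"]     # toy character set, "\n" = end of name
vocab_size, hidden_size = len(chars), 8

Wxh = rng.normal(0, 0.1, (hidden_size, vocab_size))
Whh = rng.normal(0, 0.1, (hidden_size, hidden_size))
Why = rng.normal(0, 0.1, (vocab_size, hidden_size))

def sample_name(max_len=20):
    h = np.zeros(hidden_size)             # a<0> = zero vector
    x = np.zeros(vocab_size)              # empty initial input
    out = []
    for _ in range(max_len):
        h = np.tanh(Wxh @ x + Whh @ h)
        logits = Why @ h
        p = np.exp(logits - logits.max())
        p /= p.sum()                      # softmax over characters
        i = rng.choice(vocab_size, p=p)   # draw y<t> from the distribution
        if chars[i] == "\n":              # end-of-name character: stop
            break
        out.append(chars[i])
        x = np.zeros(vocab_size)
        x[i] = 1.0                        # y<t> becomes the next input
    return "".join(out)

print(sample_name())
```

With a trained model in place of the random weights, repeated calls to `sample_name` would yield the kind of Khmer-style names the article describes.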
Even with this small number of names as input, the model can produce a good list of output names. Here are some samples that I like:
There are some odd ones too, like: