Tried “Solar”, the LLM of Korean AI Startup “UpStage”

Yicheng Xian
3 min read · Mar 27, 2024


Recently, the Korean AI startup UpStage announced that the latest version of its LLM, Solar, has been officially deployed on Amazon SageMaker.

https://www.aitimes.com/news/articleView.html?idxno=158165

I tried out Solar a bit myself, so today I’d like to introduce Solar, UpStage, and the current state of Korean LLMs.

1. About UpStage

UpStage is arguably the best-known AI startup in Korea right now.

Around the end of last year, UpStage’s LLM “SOLAR 10.7B” briefly ranked #1 in both English and Korean on the Hugging Face Open LLM Leaderboard. The model was trained with UpStage’s proprietary “DUS (Depth Up-Scaling)” technique, a different approach from the Mixture of Experts (MoE) used by Mistral, which was mainstream at the time. Put simply, DUS enlarges an existing model by duplicating and stacking its layers, then continues pretraining the deeper model on additional data.

It’s a bit different from typical continual learning: the model keeps pretraining while its depth is expanded. With this method, even a relatively small model can outperform much larger ones, and the training process is not especially complex either.
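As a rough illustration, the layer arrangement reported for SOLAR 10.7B can be sketched as follows: two copies of a 32-layer base model, with the last 8 layers dropped from one copy and the first 8 from the other, stacked into a 48-layer model. Treat this as a schematic of the idea, not UpStage’s actual code:

```python
# Schematic sketch of Depth Up-Scaling (DUS): duplicate a base model,
# drop the last n layers from one copy and the first n from the other,
# then stack the two copies. The deeper model is then further pretrained
# (the continued-pretraining step is not shown here).

def depth_up_scale(num_layers: int, n_drop: int) -> list[int]:
    """Return the base-model layer indices that make up the scaled model."""
    copy_a = list(range(0, num_layers - n_drop))  # bottom copy: keep lower layers
    copy_b = list(range(n_drop, num_layers))      # top copy: keep upper layers
    return copy_a + copy_b                        # stacked: 2 * (L - n) layers

layers = depth_up_scale(num_layers=32, n_drop=8)
print(len(layers))  # 48 layers, matching SOLAR 10.7B's reported depth
```

Because the overlapping middle layers appear in both copies, the seam between the two halves is less abrupt than naively stacking two disjoint halves would be, which is part of why continued pretraining recovers performance quickly.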

In the benchmark results at the time ⇩, Solar outperformed Mistral’s MoE models and China’s Qwen — models 4–7 times its size — which drew some attention. (At the time I didn’t realize Solar was a Korean company’s LLM and assumed it came from some American company…)

2. Current State of Korean LLMs

Most Korean LLMs are essentially fine-tuned versions of Solar. Last week, Korea’s National Information Society Agency (NIA) released a leaderboard called “Ko-LLM” that evaluates the performance of Korean LLMs. Looking at the results ⇩, most of the top-ranked models have names like “XXX-Solar”: they were developed by different companies, but all are based on fine-tuning UpStage’s Solar.

3. Impressions From Actually Trying Solar

I tried Solar myself yesterday and noticed a few characteristics. First, this model has very high performance in both English and Korean.

Especially in Korean, it performs nearly on par with ChatGPT. One subtle difference: Solar’s Korean sounds completely natural, as if a native speaker were talking, while ChatGPT’s Korean gives the impression of a foreigner who speaks Korean well.

Also, Solar specializes in Document Chat and has excellent recognition, extraction, and comprehension abilities for Korean text within documents.

For example, below are the recognition results for sample data provided by UpStage; it recognized everything perfectly. (Its English recognition accuracy is also very high, with almost no errors, but ChatGPT already achieves that level, so it’s not especially notable.)

I also fed Solar some random Korean images I found online (pages from the Korean edition of the Doraemon manga). Its OCR function recognized the text flawlessly; ChatGPT, given the same images, could not. (Korean characters look very similar to one another, so recognizing them without errors is actually not easy. For example, 일, 말, and 밑 are visually very close.)
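The similarity is structural: each Hangul syllable block is composed of two or three sub-letters (jamo), so visually close syllables often share most of their components and differ in just one small stroke group. This is easy to verify with Python’s standard `unicodedata` module (unrelated to Solar itself, just an illustration of why these shapes are so close):

```python
import unicodedata

def jamo(syllable: str) -> list[str]:
    """Decompose a precomposed Hangul syllable into its conjoining jamo (NFD)."""
    return list(unicodedata.normalize("NFD", syllable))

# The three look-alike syllables from the text:
for ch in "일말밑":
    parts = jamo(ch)
    names = [unicodedata.name(j) for j in parts]
    print(ch, "->", names)
```

Running this shows that 일 and 말 share the same final consonant (ㄹ), and 일 and 밑 share the same vowel (ㅣ) — each pair differs in only one or two jamo, which is exactly what makes error-free Korean OCR hard.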

By the way, Solar’s Japanese ability is still quite low; it can’t correctly answer even very simple questions like the one ⇩. (However, according to last week’s Amazon SageMaker announcement, Japanese support is planned for Solar within this year.)

If you’re interested, give it a try! ⇩:

https://www.upstage.ai/
