
Two Digit Dominates Open Ko-LLM Leaderboard with Dramatic Performance Improvement

(Photo = Two Digit)

Two Digit has won first place on the Open Ko-LLM Leaderboard. It is drawing particular attention for a dramatic jump in performance, raising its average score by more than 7 points in just one week.

Two Digit (CEO Park Seok-jun), a natural language processing (NLP) specialist, announced on the 19th that it had climbed back to the top with 67.77 points, one week after first taking first place on the Open Ko-LLM Leaderboard with an average score of 60.59.

There are currently around 1,200 models registered on the ‘Open Ko-LLM Leaderboard’, a Korean language model evaluation index jointly hosted by Upstage and the National Information Society Agency (NIA). When it launched in September last year, the top average score was around 50 points, and although performance improved over the following months, no model had exceeded an average of 60 points for some time.

Then, on the 11th, Two Digit broke the 60-point barrier for the first time, suggesting that the overall level of Korean language models could be raised. Indeed, with several companies competing, scores have climbed steadily since.

Two Digit briefly lost the top spot in the process, but as of the 19th it not only reclaimed first place (67.77 points) but also took second (65.38 points). That is a full 7.18 points higher than a week earlier, an improvement that would normally take several months compressed into a single week.

“It has been just over a month since we took on the challenge of the Korean language model rankings,” said Park Seok-jun, CEO of Two Digit. “This is thanks to the fine-tuning know-how we have steadily accumulated.”

Two Digit has previously stood out in research on core natural language processing technologies. In April 2022, it placed 7th on the global artificial intelligence (AI) machine reading comprehension benchmark ‘SQuAD 2.0’, ahead of major technology companies such as Amazon, Google, Facebook, and Microsoft.

In June of the same year, it ranked 14th in the world on ‘GLUE’, an AI language understanding benchmark co-developed by DeepMind. This was the highest ranking among Korean companies and institutions.

In particular, to overcome the infrastructure limitations of a domestic startup, the company has focused on refining training data and building technology specialized for fine-grained categories such as ‘reading comprehension’ and ‘literacy’.

The company explained that while pre-trained models from large technology companies such as Google solved the dataset problem, it still needed to accumulate its own know-how in areas such as source code.

“As a result, we gained knowledge of natural language-based machine learning, and the confidence that we could train Korean models beyond the English models that dominate the field,” CEO Park said.

“As expected, we achieved good results in the Korean language evaluation,” he said. “We were able to score high on detailed evaluation metrics such as reasoning, common sense, language comprehension, hallucination prevention, and Korean common-sense generation thanks to the capabilities we built up while participating in various NLP competitions.”

In particular, the company explained that it trains models efficiently by refining input data with dataset preprocessing techniques, reshaping the data to fit the model, and removing data that does not fit the task.
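(The article does not disclose Two Digit’s actual pipeline. As a rough illustration only, the kind of preprocessing described above might look like the following Python sketch using the Hugging Face ‘datasets’ library; the file name, column names, and thresholds are all hypothetical assumptions.)

    # Illustrative sketch of the preprocessing described above: reshape
    # records to fit the model and drop examples that do not fit the task.
    # File name, column names, and thresholds are hypothetical, not Two
    # Digit's settings.
    from datasets import load_dataset

    raw = load_dataset("json", data_files="korean_corpus.jsonl", split="train")

    def to_prompt_format(example):
        # Refine each record into the prompt/response layout the model expects.
        example["text"] = f"### 질문: {example['question']}\n### 답변: {example['answer']}"
        return example

    def fits_task(example):
        # Remove data that does not fit the task: empty answers, overlong
        # samples, or records containing no Korean (Hangul) characters.
        answer = example["answer"].strip()
        has_hangul = any("\uac00" <= ch <= "\ud7a3" for ch in answer)
        return bool(answer) and len(example["text"]) < 4096 and has_hangul

    cleaned = raw.map(to_prompt_format).filter(fits_task)
    print(f"kept {len(cleaned)} of {len(raw)} examples")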

It added that it tuned supervised fine-tuning (SFT) and direct preference optimization (DPO) hyperparameters so the model could reach high performance through domain-optimized training.
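(Again, the actual hyperparameters are not disclosed. A minimal sketch of such a two-stage SFT-then-DPO pipeline, written against the Hugging Face TRL library’s 0.7-era API, might look as follows; the model name, datasets, and every hyperparameter value are assumptions, and parameter placement differs in newer TRL versions.)

    # Minimal two-stage tuning sketch: supervised fine-tuning (SFT) followed
    # by direct preference optimization (DPO). All names and values are
    # illustrative assumptions, not Two Digit's configuration.
    from datasets import Dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
    from trl import SFTTrainer, DPOTrainer

    model_name = "some-org/korean-base-llm"  # hypothetical base model
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # Stage 1: SFT on the cleaned instruction data from the previous sketch.
    sft_trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=cleaned,        # output of the preprocessing sketch above
        dataset_text_field="text",
        max_seq_length=2048,
        args=TrainingArguments(output_dir="sft-out", learning_rate=2e-5,
                               num_train_epochs=3, per_device_train_batch_size=4),
    )
    sft_trainer.train()

    # Stage 2: DPO on (prompt, chosen, rejected) preference triples; a tiny
    # hypothetical dataset stands in for real preference data here.
    preference_pairs = Dataset.from_dict({
        "prompt":   ["한국의 수도는 어디인가요?"],
        "chosen":   ["한국의 수도는 서울입니다."],
        "rejected": ["한국의 수도는 부산입니다."],
    })
    dpo_trainer = DPOTrainer(
        model=sft_trainer.model,
        ref_model=None,               # TRL builds the frozen reference copy when None
        beta=0.1,                     # how far the policy may drift from the reference
        tokenizer=tokenizer,
        train_dataset=preference_pairs,
        args=TrainingArguments(output_dir="dpo-out", learning_rate=5e-7,
                               per_device_train_batch_size=2),
    )
    dpo_trainer.train()

The beta value in DPO trades off preference-reward maximization against staying close to the SFT reference model; small values such as 0.1 are a common default, though the article gives no hint of the settings Two Digit actually used.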

“We will continue working to revitalize the Korean language model ecosystem, along with its application capabilities,” CEO Park Seok-jun said. The company also announced that it will release all of the refined source code and datasets behind its No. 1 model on the leaderboard.

“This will play a role similar to an introductory textbook that lowers the barrier to entry for learning and developing language models,” he said, suggesting it could serve as a kind of ‘standard fine-tuning’ reference.

Upstage, which hosts the Open Ko-LLM Leaderboard, also commented on the significance of the result.

“It is difficult to make an accurate evaluation because we cannot obtain detailed information about individual companies or models,” an Upstage representative said. “However, in line with our intention of expanding the domestic LLM ecosystem, the top score has been rising steadily of late, and we find this ‘upward leveling’ phenomenon meaningful in itself.”

Meanwhile, AI models and data released by Two Digit can be found on GitHub.

Reporter Jang Se-min semim99@aitimes.com

