Armenian company develops its own LLM that will understand more than 100 languages, including spoken Armenian

May 21, 2024  15:09

An Armenian company is developing a large language model (LLM) that will understand more than 100 languages, including, of course, Armenian. The model will understand not only written Armenian but also the spoken language, Gevorg Balyan, the founder of UCRAFT and HOORY, told NEWS.am Tech.

HOORY's AI platform, also created by Armenian developers, perfectly understands written Armenian, even when it is written in Latin transliteration (but not in Cyrillic). The developers had to put in considerable effort to achieve this: the main problem was not transliterating from the Latin alphabet, but teaching the model to recognize as quickly as possible that a text is in fact written in Armenian, because many Armenian words written in Latin letters resemble words in other languages. At the moment, as Balyan notes, HOORY correctly identifies the language of text written in Latin letters in about 99% of cases.

Chatbots based on HOORY are now actively used as AI assistants on the websites of a number of games, technology companies, banks, and startups. The new model will be a separate service that different companies will be able to integrate into their products. According to Gevorg Balyan, the model can also be used on the HOORY platform.

To train this language model to recognize spoken Armenian, the Fastbank team collected and prepared a high-quality dataset that includes 10,000 hours of recordings of local Armenians speaking. With this data, Gevorg Balyan says, it will be possible to solve the speech-to-text problem, that is, to train the model to turn spoken words into written text.

You can watch the video (in Armenian) to learn more.
