Will it be possible to run large language models on platforms with limited resources?

September 8, 2023  18:13

Artificial intelligence has been developing rapidly in recent years, but this growth brings new challenges. One of the main problems is that modern neural networks keep getting larger and consume ever more resources. In a conversation with NEWS.am Tech, Cognaize founder Vahe Andonians talked about how to overcome this problem and run large language models on platforms with limited resources. He also gave a lecture on the same topic at the 4th "Datafest Yerevan" forum.

Huge models and limited resources

Artificial intelligence, as the specialist emphasized, requires enormous resources to work properly. The popular ChatGPT chatbot, for example, reportedly has 1.7 trillion parameters. Models like ChatGPT achieve impressive results in a variety of tasks, but their size demands computing resources that few companies can afford. According to the expert, there are probably only about 10 companies in the world today that can.
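To give a sense of scale, here is a rough back-of-the-envelope sketch in Python. It estimates only the memory needed to hold a model's weights, using the 1.7 trillion parameter figure quoted above. The function name weight_memory_gb and the choice of number formats are illustrative assumptions, not something from the interview, and the estimate ignores everything else a running model needs.

```python
# Rough estimate of the memory needed just to store a model's weights.
# Assumption: every weight uses the same number format; activations,
# caches and other runtime overhead are ignored.
BYTES_PER_WEIGHT = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return the gigabytes (10**9 bytes) needed to store num_params weights."""
    return num_params * BYTES_PER_WEIGHT[dtype] / 1e9


if __name__ == "__main__":
    params = 1_700_000_000_000  # the 1.7 trillion figure quoted in the article
    for dtype in BYTES_PER_WEIGHT:
        print(f"{dtype}: {weight_memory_gb(params, dtype):,.0f} GB")
```

Under these assumptions the weights alone would take about 6,800 GB as 32-bit floats, 3,400 GB as 16-bit floats and 1,700 GB as 8-bit integers, far more than a single ordinary machine can hold.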

Reducing resource consumption

One way to solve this problem is to optimize neural networks. Neural networks, which are loosely modeled on the human brain, contain so-called weights that determine the importance of the signals passed between neurons. These weights are usually stored as floating-point numbers.

However, as the specialist mentioned, resource consumption can be reduced by replacing the floating-point numbers with integers or other simpler, smaller numbers. This can significantly reduce the amount of memory and computing power required.

This slightly degrades the quality of the network's output, but it can allow large models to run on more affordable platforms.
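The idea of swapping floating-point weights for small integers can be illustrated with a minimal Python sketch. It assumes one simple scheme, in which all weights share a single scale factor and are rounded to integers between -127 and 127. The function names and the sample weights are made up for illustration, and this is not necessarily the method Andonians described.

```python
# Minimal sketch of weight quantization: floats are mapped to 8-bit-sized
# integers with one shared scale, then mapped back to approximate floats.
# Assumption: a single symmetric scale for all weights (one of several schemes).

def quantize_int8(weights):
    """Return (integers in [-127, 127], scale) approximating the float weights."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to guard against float round-off.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


weights = [0.12, -0.5, 0.33, 0.0, 0.27]  # hypothetical example weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error per weight is at most half of one integer step.
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))

if __name__ == "__main__":
    print("integers:", quantized)
    print("largest rounding error:", max_error)
```

Each weight now fits in one byte instead of four, and the restored values differ from the originals by at most half a step of the scale. That small rounding error is the source of the slight loss in quality mentioned above.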

The future of large language models

Large language models never cease to amaze us with their potential. A key factor in the development of these models is memory. As the amount of memory increases, the models are able to perform a wider range of tasks.

Today's models are limited in how much they can remember. ChatGPT, for example, can only remember up to 80,000 tokens, the small pieces of text a model reads and writes. Increasing this memory, Andonians notes, will be an important step in developing larger language models.

Running large language models on resource-constrained platforms is one of the most topical problems in artificial intelligence today. Optimization and finding a balance between model size and resource consumption are key factors that will determine the future of AI development. As many experts point out, increased memory and new optimization methods will help bring artificial intelligence to platforms of all sizes and capabilities.
