Group and Shuffle: Researchers at HSE University and AIRI Accelerate Neural Network Fine-Tuning
Researchers at HSE University and the AIRI Institute have proposed a method for quickly fine-tuning neural networks. Their approach involves processing data in groups and then optimally shuffling these groups to improve their interactions. The method outperforms alternatives in image generation and analysis, as well as in fine-tuning text models, all while requiring less memory and training time. The results have been presented at the NeurIPS 2024 Conference.
The larger the neural network, the more challenging it becomes to quickly adapt it to a new task. Retraining a model from scratch is a time-consuming and costly process. Therefore, developers seek cost-effective ways to adapt a model to a specific task while preserving the overall quality of the original.
One such approach is fine-tuning using orthogonal matrices, which, unlike other methods, preserve the essential features of the original model. Existing ways of structuring these matrices, such as block-diagonal or butterfly matrices, have drawbacks: they are either limited in what they can represent or require extensive computations.
Researchers at the HSE Faculty of Computer Science and the AIRI Institute have proposed a new method of constructing matrices, which they call Group-and-Shuffle (GS). Instead of working with all the parameters at once, the method divides them into small groups, processes each group separately, and then shuffles the results so that the groups interact. This structure is both flexible and efficient: it enables the model to adapt more precisely to the task while requiring fewer computations and less memory.
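The group-then-shuffle idea can be illustrated with a small sketch. It is not the authors' implementation: the block size, the number of groups, the particular shuffle, and all function names (`apply_blocks`, `shuffle`, `gs_apply`) are assumptions chosen for illustration. The sketch applies one set of small blocks to each group, redistributes the entries across groups, and applies a second set of blocks, then compares the number of stored values with a full matrix of the same size.

```python
import random


def apply_blocks(blocks, x):
    """Multiply x by a block-diagonal matrix: one small block per group."""
    k = len(blocks[0])
    out = []
    for g, block in enumerate(blocks):
        chunk = x[g * k:(g + 1) * k]
        out.extend(sum(row[j] * chunk[j] for j in range(k)) for row in block)
    return out


def shuffle(x, groups):
    """Deal the entries of every group across all groups (an assumed permutation)."""
    k = len(x) // groups
    return [x[g * k + i] for i in range(k) for g in range(groups)]


def gs_apply(left, right, x):
    """Group (right blocks), shuffle, then group again (left blocks)."""
    return apply_blocks(left, shuffle(apply_blocks(right, x), len(right)))


rng = random.Random(0)
k, groups = 3, 3          # hypothetical sizes: three groups of three parameters
n = k * groups


def random_blocks():
    # Entries are drawn away from zero so the mixing check below is meaningful.
    return [[[rng.uniform(0.5, 1.5) for _ in range(k)] for _ in range(k)]
            for _ in range(groups)]


left, right = random_blocks(), random_blocks()

# Recover the equivalent dense matrix column by column from basis vectors.
dense_columns = [
    gs_apply(left, right, [1.0 if i == j else 0.0 for i in range(n)])
    for j in range(n)
]

stored = 2 * groups * k * k   # values kept in the two block-diagonal factors
mixes_everything = all(v != 0.0 for col in dense_columns for v in col)
print(stored, n * n, mixes_everything)
```

In this toy setting the two block-diagonal factors hold 54 values, while a full 9 by 9 matrix would hold 81, and every output still depends on every input because the shuffle routes one entry from each group into each second-stage block.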
Building on GS matrices, the researchers developed GSOFT, a new method for orthogonal fine-tuning of neural networks. Unlike previous approaches, GSOFT uses fewer parameters while maintaining training stability and quality, even with limited data. The team also introduced a two-sided version of the method—Double GSOFT—which allows simultaneous adjustment of parameters from both sides, enhancing the model’s flexibility and accuracy.
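Why orthogonality matters for fine-tuning can be shown with a short sketch. It is an illustration under stated assumptions, not the GSOFT code: the blocks are plain 2 by 2 rotations, the shuffle is a fixed even/odd permutation, the weight matrix `W` is made up, and the two-sided form `Q_left W Q_right` is an assumed reading of "adjustment from both sides". The sketch checks that multiplying the weights by such a matrix leaves the dot products between weight columns unchanged, so the geometry of the original model is preserved.

```python
import math


def rotate_pairs(x, angles):
    """Block-diagonal orthogonal map: rotate each consecutive pair of entries."""
    out = []
    for g, theta in enumerate(angles):
        a, b = x[2 * g], x[2 * g + 1]
        c, s = math.cos(theta), math.sin(theta)
        out += [c * a - s * b, s * a + c * b]
    return out


def shuffle(x):
    """Fixed permutation (an assumption): even positions first, then odd ones."""
    return x[0::2] + x[1::2]


def gs_orthogonal(n, right_angles, left_angles):
    """Dense n x n matrix for rotate -> shuffle -> rotate, built column by column."""
    cols = []
    for j in range(n):
        e = [1.0 if i == j else 0.0 for i in range(n)]
        cols.append(rotate_pairs(shuffle(rotate_pairs(e, right_angles)), left_angles))
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gram(w):
    """Pairwise dot products between the columns of w."""
    wt = [list(col) for col in zip(*w)]
    return matmul(wt, w)


def frobenius(m):
    return math.sqrt(sum(v * v for row in m for v in row))


# Hypothetical pretrained weights and trainable rotation angles.
W = [[1.0, 2.0, 0.0, -1.0],
     [0.5, -1.0, 3.0, 2.0],
     [2.0, 0.0, 1.0, 1.0],
     [-1.0, 1.5, 0.5, 0.0]]
Q_left = gs_orthogonal(4, [0.3, -0.8], [1.1, 0.4])
Q_right = gs_orthogonal(4, [0.2, 0.9], [-0.5, 0.7])

W_one_sided = matmul(Q_left, W)                    # one-sided update
W_two_sided = matmul(matmul(Q_left, W), Q_right)   # assumed two-sided update

max_gram_drift = max(abs(p - q)
                     for row_p, row_q in zip(gram(W), gram(W_one_sided))
                     for p, q in zip(row_p, row_q))
print(max_gram_drift, frobenius(W), frobenius(W_two_sided))
```

The weights themselves change, but the column dot products after the one-sided update and the overall norm after the two-sided update match the originals up to floating-point error.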
'We discovered how to construct orthogonal matrices using only two special types of matrices, instead of five or six as required by previous methods. This saves computational resources and training time,' explains Nikolay Yudin, Research Assistant at the HSE Laboratory for Matrix and Tensor Methods in Machine Learning.
The researchers tested the approach on three types of tasks. When fine-tuning the RoBERTa language model, the method outperformed others while using a comparable number of parameters. In image generation, where the model needed to preserve the original features while adapting to the user’s request, GSOFT and Double GSOFT outperformed popular methods like LoRA and BOFT, all while using less memory and training time.

The authors also tested their approach on convolutional neural networks, which are commonly used for image and video analysis, such as in face recognition. The team adapted GS matrices to this setting as well, including cases where the model must remain robust to interference and distortion.
'We tested the method across various scenarios—from language and generative models to robust convolutional networks. In every case, it performed reliably while using fewer resources. This confirms that the method can be applied effectively to a variety of purposes,' comments Aibek Alanov, Senior Research Fellow at the Centre of Deep Learning and Bayesian Methods, AI and Digital Science Institute, HSE FCS, and leader of the Controllable Generative AI team at FusionBrain, AIRI.