Genetic Prediction of Cancer Recurrence: Scientists Verify Reliability of Computer Models

In biomedical research, machine learning algorithms are often used to analyse data, for instance to predict cancer recurrence. However, it is not always clear whether these algorithms are detecting meaningful patterns or merely fitting random noise in the data. Scientists from HSE University, IBCh RAS, and Moscow State University have developed a test that makes it possible to tell the two apart. It could become an important tool for verifying the reliability of algorithms in medicine and biology. The study has been published on arXiv.

Machine learning methods help analyse complex biological data, for example to predict the likelihood of cancer recurrence from gene expression, which reflects the activity levels of specific DNA regions within cells.

A team of scientists from HSE University, IBCh RAS, and Moscow State University has developed a test to assess how reliably a classifier distinguishes between different patient groups. In this study, the two groups were patients who experienced a recurrence of the disease and those who did not. A model is reliable if it captures biologically meaningful differences. If the algorithm separates the data by chance alone, its accuracy may still appear deceptively high. The researchers focused on linear classifiers, one of the most widely used ML tools in biomedicine.

'We aimed to test whether randomly generated (synthetic) data could be separated by a linear classifier as effectively as real biological samples. To do this, we calculated an upper bound on the p-value, which indicates the likelihood that the model is merely "guessing." The lower this p-value, the more reliable the classifier,' explains Anton Zhiyanov, Research Fellow at the HSE Laboratory of Molecular Physiology. 

The researchers conducted a series of experiments using synthetic data, allowing them to precisely control the degree of differences between classes. They then applied the new test to real-world medical models that predict the risk of breast cancer recurrence. 
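
The general idea of checking a linear classifier against chance on synthetic data can be illustrated with a short sketch. It is not the researchers' method: they derive an upper bound on the p-value, whereas the code below substitutes a plain Monte Carlo label-permutation test, and the nearest-centroid rule stands in for a generic linear classifier. The function names, the `delta` parameter controlling class separation, the Gaussian data, and the sample sizes are all assumptions made for illustration.

```python
import random


def make_synthetic(n_per_class, delta, rng):
    """Draw two 2D Gaussian classes whose means differ by `delta` on the first axis."""
    points, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            points.append((rng.gauss(delta * label, 1.0), rng.gauss(0.0, 1.0)))
            labels.append(label)
    return points, labels


def centroid_accuracy(points, labels):
    """Fit a nearest-centroid rule (a linear classifier) and return its training accuracy."""
    dims = len(points[0])
    sums = {0: [0.0] * dims, 1: [0.0] * dims}
    counts = {0: 0, 1: 0}
    for point, label in zip(points, labels):
        counts[label] += 1
        for d in range(dims):
            sums[label][d] += point[d]
    c0 = [s / counts[0] for s in sums[0]]
    c1 = [s / counts[1] for s in sums[1]]
    # The boundary is the hyperplane w.x + b = 0 halfway between the two centroids.
    w = [a - b for a, b in zip(c1, c0)]
    midpoint = [(a + b) / 2 for a, b in zip(c1, c0)]
    b = -sum(wi * mi for wi, mi in zip(w, midpoint))
    correct = 0
    for point, label in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, point)) + b
        predicted = 1 if score > 0 else 0
        if predicted == label:
            correct += 1
    return correct / len(points)


def permutation_p_value(points, labels, n_perm=500, seed=0):
    """Estimate how often shuffled labels are separated at least as well as the real ones."""
    rng = random.Random(seed)
    observed = centroid_accuracy(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any link between features and labels
        if centroid_accuracy(points, shuffled) >= observed:
            hits += 1
    # The +1 terms keep the estimate strictly positive (standard permutation-test correction).
    return (1 + hits) / (1 + n_perm)


rng = random.Random(42)
separated_points, separated_labels = make_synthetic(40, 3.0, rng)  # clearly different classes
noise_points, noise_labels = make_synthetic(40, 0.0, rng)          # no real difference

p_separated = permutation_p_value(separated_points, separated_labels)
p_noise = permutation_p_value(noise_points, noise_labels)
print("p-value, separated classes:", p_separated)
print("p-value, pure noise:", p_noise)
```

In this sketch, raising `delta` makes the two classes easier to separate and should push the p-value down, while `delta = 0` gives labels that carry no information about the features, so a low p-value would arise only by chance.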

The results showed that most classifiers failed to capture any meaningful differences between patients with and without recurrence. Further analysis revealed that 559 out of 570 models produced results consistent with random chance. This suggests that many algorithms may appear accurate, while in reality their predictions are driven by coincidences rather than genuine patterns.

However, the researchers also identified reliable models that reveal biologically meaningful patterns. One such model was a classifier that focused on the activity levels of the ELOVL5 and IGFBP6 genes. This algorithm was further tested on an independent data sample, confirming that differences in the expression of these genes are indeed linked to the risk of cancer recurrence.

Each point on the graph represents a patient, with the expression levels of two genes measured: IGFBP6 on the X-axis and ELOVL5 on the Y-axis. The orange dots represent patients with a recurrence, while the blue dots represent those without. In the first graph, these points (patients) are clearly separated by a straight line, representing a linear classifier. In the second graph, the points are randomly distributed, and the classifier fails to identify any patterns between gene expression and actual recurrence.

'Our test could become an important tool for verifying the reliability of algorithms in biology and medicine. It helps prevent false conclusions and emphasises models that truly identify important patterns, which is crucial for making decisions about patient treatment,' comments Alexander Tonevitsky, Professor at the HSE Faculty of Biology and Biotechnology.

The study was conducted with support from HSE University's Basic Research Programme within the framework of the Centres of Excellence project.
