St John’s College researchers present a machine learning language model to combat neurodegenerative diseases

Researchers at St John’s College, University of Cambridge, have been studying the molecular grammar of proteins. In the conclusion of a paper published last week in PNAS, they suggest that their work could be used to “correct grammatical errors inside cells causing cancer or Alzheimer’s disease”. The research drew on several AI-based language models, the final one being DeePhase.

The study grew out of a desire to build a machine learning language model that researchers could use to analyze the large volumes of data generated over past decades of research. Dr Kadi Liis Saar, an author of the paper and a researcher at St John’s College, says the design of the large-scale language model was inspired by machine learning algorithms such as those used by Netflix or Facebook.

The model, whose most successful version has been named DeePhase, looks at all the proteins in a cell and compares them with protein sequences from healthy and diseased cells. Of the use of AI and machine learning language models in biomolecular research, Dr Saar said:

“The human body is home to thousands and thousands of proteins and scientists still do not know the function of many of them. We asked a neural network-based language model to learn the language of proteins.”

Through this study, the researchers found that machine learning and language processing technologies could decipher the “biological language” of cancer and of neurodegenerative diseases such as Alzheimer’s, which they estimate number in the hundreds.

“We specifically asked the program to learn the language of shape-shifting biomolecular condensates – protein droplets found in cells – that scientists really need to understand to decipher the language of biological functions and dysfunctions that cause cancer and neurodegenerative diseases like Alzheimer’s. We found that it could learn, without being explicitly told, what scientists have already discovered about the language of proteins over several decades of research.”

According to the study, the scientists said, some disordered proteins with particular sequences form condensates (membraneless liquid droplets of protein) that fuse with other cells or condensates. Professor Tuomas Knowles, an author of the paper and a member of St John’s College, said:

“The integration of machine learning technology into neurodegenerative disease and cancer research is a complete game changer. Ultimately, the goal will be to use artificial intelligence to develop targeted drugs to dramatically reduce symptoms or prevent dementia from occurring. […] Protein condensates have recently attracted a lot of attention in the scientific world because they control key events in the cell such as gene expression (the conversion of DNA into proteins) and protein synthesis (how cells make proteins).”

Translated from Des chercheurs du St John’s College présentent un modèle de langage automatique pour lutter contre les maladies neurodégénératives