In many aspects of our lives, manual effort is being replaced by a more time- and cost-effective option: artificial intelligence. Now, researchers are using AI to help uncover new natural substances and their properties at the molecular level.
Secondary natural compounds found in many bacteria, plants, and fungi have anti-inflammatory properties that can help fight off diseases and can even hinder the growth of cancer cells. Tapping nature’s medicine cabinet by discovering new natural compounds, however, is time-consuming, labor-intensive, and costly. To cut these costs, a group of bioinformaticians at Friedrich Schiller University Jena in Germany has devised a faster and easier method for identifying small active-ingredient molecules.
In the conventional, manual workflow, a researcher uses mass spectrometry to determine which compounds are present in a biological sample, such as a plant extract. A mass spectrometer breaks molecules into fragments and measures their masses. Professor Sebastian Böcker of the University of Jena explains, “The CSI:FingerID molecule search engine we developed allows us to search specifically for molecular structures that match these fragments. Whether this search is successful, i.e., whether the search result represents the correct structure, is not something we can distinguish in this way.”
Data repositories currently hold billions of mass spectrometry data records from millions of biological samples, the great majority of which have not been structurally annotated. This is where COSMIC comes in: it automatically deciphers the structures of a substantial number of these still-unidentified molecules. “To this end, we use machine-learning methods,” explains lead author Martin Hoffmann. “First, the mass spectrum of the sample under examination is compared with the available structural data.”
As with a Google search, this comparison returns a more or less extensive list of hits. COSMIC then calculates a score for the suggested hit that estimates how likely it is to be correct. Hoffmann adds, “Our method now indicates how confident one can be that the hit found in the first place is actually the structure we are looking for.”
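For readers who think in code, the idea of a ranked hit list plus a confidence value for the top hit can be sketched in a few lines of Python. This is only an illustration: the `Hit` class, the function names, the score-margin formula, and the 0.5 threshold are all assumptions made for this example. It is not COSMIC's actual scoring method, which the researchers describe as based on machine learning.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One candidate structure returned by a database search (hypothetical type)."""
    structure: str  # name or identifier of the candidate structure
    score: float    # match score from the search; higher means a better match


def rank_hits(hits):
    """Return the hits sorted from best to worst match score."""
    return sorted(hits, key=lambda h: h.score, reverse=True)


def top_hit_confidence(hits):
    """Toy confidence for the top hit: its lead over the runner-up,
    relative to its own score. ASSUMPTION: this margin heuristic is a
    stand-in for illustration, not the score COSMIC computes."""
    ranked = rank_hits(hits)
    if not ranked:
        raise ValueError("no hits to score")
    if len(ranked) == 1:
        return 1.0  # nothing competes with the only candidate
    top, runner_up = ranked[0].score, ranked[1].score
    if top <= 0:
        return 0.0
    return (top - runner_up) / top


def confident_annotations(samples, threshold=0.5):
    """Keep only samples whose top hit clears the (assumed) confidence threshold.

    `samples` maps a sample name to its list of hits; samples with no hits
    are skipped.
    """
    accepted = {}
    for name, hits in samples.items():
        if hits and top_hit_confidence(hits) >= threshold:
            accepted[name] = rank_hits(hits)[0].structure
    return accepted


if __name__ == "__main__":
    samples = {
        "sample_1": [Hit("structure_A", 8.0), Hit("structure_B", 2.0)],
        "sample_2": [Hit("structure_C", 5.0), Hit("structure_D", 4.5)],
    }
    print(confident_annotations(samples))
```

In this toy data, the top hit for `sample_1` is far ahead of its runner-up and is kept, while the two candidates for `sample_2` are too close to call and the sample is left unannotated. That is the practical value of a confidence score: it lets an automated pipeline accept only the hits it can stand behind.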
The technology has already turned up new bile acids. In collaboration with colleagues from the University of California, San Diego, Böcker and his team were able to demonstrate how well their strategy works in practice. They analyzed mass spectrometry data from the digestive systems of mice in search of previously unknown bile acids. More than 28,000 theoretically conceivable bile acid structures were generated for this purpose and matched against the data from the mice’s microbiota. The COSMIC analysis then revealed a total of 11 previously unknown bile acid structures. Two of these were later validated using custom-synthesized reference samples. “This shows, firstly, that our method works reliably,” says Böcker. It also demonstrates that COSMIC significantly speeds up the search for undiscovered compounds, because screening can be done fully automatically, with no manual work, and in a very short amount of time. Böcker believes that thousands of new molecular structures will be uncovered this way in the coming years.
Reference: https://www.sciencedaily.com/releases/2021/10/211014131201.htm