A new machine learning model can predict enzyme-substrate pairs with an accuracy of over 90% and is ready to be used in pharmaceutical and industrial biotechnology.
Researchers from Germany, Sweden and India have reported an AI method that predicts whether an enzyme can work with a specific substrate. In Nature Communications, the team headed by Martin Lercher from the University of Düsseldorf published data that confirm the 90% accuracy of the machine learning tool, previously reported on the preprint server bioRxiv.
Even though genes which encode enzymes can easily be identified as such, the exact function of the resultant enzyme is unknown in over 99% of cases. This is because experimental characterisation of their function – i.e. which starting molecules a specific enzyme converts into which concrete end molecules – is extremely time-consuming.
According to Lercher, “the special feature of our Enzyme Substrate Prediction model is that we are not limited to individual, special enzymes and others closely related to them, as was the case with previous models. Our general model can work with any combination of an enzyme and more than 1,000 different substrates.”
Numerical vectors representing around 18,000 experimentally validated enzyme-substrate pairs – where the enzyme and substrate are known to work together – were used as input to train the deep learning model.
“After training the model in this way, we then applied it to an independent test dataset where we already knew the correct answers”, said first author Alexander Kroll. “In 91% of cases, the model correctly predicted which substrates match which enzymes.”
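The evaluation Kroll describes, scoring predictions against a held-out set with known answers, can be illustrated with a minimal Python sketch. This is not the team's code: the vectors, the `pair_features` helper and the `toy_predict` rule are all hypothetical stand-ins, and the real model and its input representations are far more elaborate.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
# (enzyme vector, substrate vector, known answer: do they work together?)
LabelledPair = Tuple[Vector, Vector, bool]


def pair_features(enzyme_vec: Vector, substrate_vec: Vector) -> List[float]:
    """Join the two numerical vectors into one input (an assumed encoding)."""
    return list(enzyme_vec) + list(substrate_vec)


def accuracy(predict: Callable[[List[float]], bool],
             test_pairs: Sequence[LabelledPair]) -> float:
    """Fraction of held-out pairs for which the prediction matches the known answer."""
    correct = sum(
        1 for enz, sub, label in test_pairs
        if predict(pair_features(enz, sub)) == label
    )
    return correct / len(test_pairs)


def toy_predict(features: List[float]) -> bool:
    """Hypothetical stand-in for a trained model: a simple threshold rule."""
    return sum(features) > 1.0


# Made-up test data, purely for illustration.
test_pairs: List[LabelledPair] = [
    ([0.9, 0.4], [0.8], True),
    ([0.1, 0.2], [0.1], False),
    ([0.7, 0.6], [0.2], True),
    ([0.5, 0.4], [0.3], False),  # the toy rule gets this one wrong
]

print(accuracy(toy_predict, test_pairs))  # 0.75
```

The important point is that the test pairs are kept separate from the training data, so the score reflects how well the model generalises rather than how well it memorised.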
The method offers a wide range of applications in both drug discovery and industrial biotechnology. According to Lercher, the AI tool will enable research and industry to narrow a large number of possible pairs down to the most promising, which they can then use for the enzymatic production of new drugs, chemicals or biofuels. “It will also enable the creation of improved models to simulate the metabolism of cells”, added Kroll. In addition, it can help to understand the physiology of various organisms – from bacteria to people.