Scientific Background



To recognize and learn chemical structures in image documents, our chemoCR™ system combines pattern recognition techniques with supervised machine learning. The method identifies the most significant semantic entities in a depiction (e.g. chiral bonds, super atoms, reaction arrows…). The workflow consists of three phases: image preprocessing, semantic entity recognition, and molecule reconstruction followed by validation of the result. Every step uses chemical knowledge to detect and fix errors. The system can be adapted to different sets of input images.
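The three phases can be pictured as a small pipeline. The following Python sketch is illustrative only and is not the chemoCR implementation: all names (`Entity`, `preprocess`, `recognize_entities`, `reconstruct_and_validate`, `KNOWN_LABELS`) are hypothetical, the image is a plain list of grayscale rows, and the classifier is passed in as a stand-in for a trained model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical, deliberately tiny label set standing in for "chemical knowledge".
KNOWN_LABELS = {"C", "N", "O", "S", "Ph", "Me", "Et"}


@dataclass
class Entity:
    kind: str          # e.g. "chiral_bond", "super_atom", "reaction_arrow"
    label: str         # recognized text or bond type
    confidence: float  # classifier confidence in [0, 1]


@dataclass
class Result:
    entities: List[Entity]
    issues: List[str] = field(default_factory=list)


def preprocess(image: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Phase 1: binarize a grayscale image (1 = ink, 0 = background)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def recognize_entities(
    binary: List[List[int]],
    classifier: Callable[[List[List[int]]], List[Entity]],
) -> List[Entity]:
    """Phase 2: delegate to a (here: injected) supervised classifier."""
    return classifier(binary)


def reconstruct_and_validate(entities: List[Entity], known_labels=KNOWN_LABELS) -> Result:
    """Phase 3: collect entities and flag labels that contradict the label set."""
    result = Result(entities=list(entities))
    for entity in entities:
        if entity.kind == "super_atom" and entity.label not in known_labels:
            result.issues.append(f"unknown super atom label: {entity.label}")
    return result


def run_pipeline(image, classifier, known_labels=KNOWN_LABELS) -> Result:
    binary = preprocess(image)
    entities = recognize_entities(binary, classifier)
    return reconstruct_and_validate(entities, known_labels)
```

Passing the classifier as a parameter mirrors the statement that the system can be adapted to different sets of input images: in this sketch, only the injected model and the label set would change.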

 

SCAI and its partners have developed

 

cf. references

 

The validation module computes several reconstruction scores and highlights parts of the molecule where errors could have occurred.
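As a rough illustration of this kind of validation, the sketch below computes a few scores for a reconstructed molecule and returns the indices of atoms that look suspicious. It is an assumption-laden toy, not the actual chemoCR validation module: the score names, the `min_conf` threshold, and the simplified `MAX_VALENCE` table are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Simplified maximum valences for neutral atoms (illustrative assumption).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}


@dataclass
class Atom:
    symbol: str
    confidence: float  # recognition confidence in [0, 1]


def validate(
    atoms: List[Atom],
    bonds: List[Tuple[int, int, int]],  # (atom index, atom index, bond order)
    min_conf: float = 0.6,
) -> Tuple[Dict[str, float], List[int]]:
    """Return (scores, indices of atoms where an error may have occurred)."""
    if not atoms:
        raise ValueError("molecule has no atoms")

    # Sum bond orders per atom to obtain its valence in the reconstruction.
    valence = [0] * len(atoms)
    for i, j, order in bonds:
        valence[i] += order
        valence[j] += order

    suspicious: List[int] = []
    valence_ok = 0
    for idx, atom in enumerate(atoms):
        limit = MAX_VALENCE.get(atom.symbol)  # None: element not in the table
        over = limit is not None and valence[idx] > limit
        if not over:
            valence_ok += 1
        if over or atom.confidence < min_conf:
            suspicious.append(idx)

    scores = {
        "mean_confidence": sum(a.confidence for a in atoms) / len(atoms),
        "min_confidence": min(a.confidence for a in atoms),
        "valence_ok_fraction": valence_ok / len(atoms),
    }
    return scores, suspicious
```

For example, an oxygen atom drawn with two double bonds would exceed the valence limit in this table and be returned in the suspicious list, which a user interface could then highlight.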
