Optical chemical structure recognition

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

Optical chemical structure recognition (OCSR) is the translation of images that depict chemical structures into machine-readable formats. It addresses the problem of converting graphical representations of chemical structures into their corresponding chemical formulas.

In scientific publications, documents, and textbooks, molecular structures are typically represented through images and annotated text. These structural formulas are depicted as chemical graphs, in which the vertices represent atoms and the edges represent the bonds between them. However, much of the data in older publications remains undigitised, whether it appears as images or as descriptive text, so extracting useful information from it is a time-consuming, manual process. OCSR can also be applied to digital images of molecules available online and to scanned pages of chemical documents.
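The chemical graph described above can be illustrated with a minimal Python sketch. It is not taken from any OCSR system: the `MolecularGraph` class, its method names, and the small `DEFAULT_VALENCE` table are hypothetical, and hydrogens are inferred from a simplified valence rule that holds only for neutral, common organic atoms. The sketch shows the kind of machine-readable structure a recogniser might aim to produce, and how a molecular formula can be derived from it.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field

# Assumed typical valences, used only to infer implicit hydrogens.
# This is an illustrative subset, not a complete chemical model.
DEFAULT_VALENCE = {"C": 4, "N": 3, "O": 2, "H": 1}


@dataclass
class MolecularGraph:
    """Hypothetical chemical graph: vertices are atoms, edges are bonds."""

    atoms: list[str] = field(default_factory=list)  # vertex index -> element symbol
    bonds: list[tuple[int, int, int]] = field(default_factory=list)  # (i, j, bond order)

    def add_atom(self, symbol: str) -> int:
        """Add a vertex and return its index."""
        self.atoms.append(symbol)
        return len(self.atoms) - 1

    def add_bond(self, i: int, j: int, order: int = 1) -> None:
        """Add an edge between vertices i and j with the given bond order."""
        self.bonds.append((i, j, order))

    def implicit_hydrogens(self, index: int) -> int:
        """Hydrogens needed to fill the assumed valence of one atom."""
        used = sum(order for i, j, order in self.bonds if index in (i, j))
        return max(DEFAULT_VALENCE[self.atoms[index]] - used, 0)

    def formula(self) -> str:
        """Molecular formula in Hill order (C, then H, then alphabetical)."""
        counts = Counter(self.atoms)
        counts["H"] += sum(self.implicit_hydrogens(k) for k in range(len(self.atoms)))
        if "C" in counts:
            order = ["C", "H"] + sorted(s for s in counts if s not in ("C", "H"))
        else:
            order = sorted(counts)
        return "".join(
            f"{s}{counts[s] if counts[s] > 1 else ''}" for s in order if counts[s] > 0
        )


# Ethanol drawn as a skeleton: C-C-O, with hydrogens left implicit.
ethanol = MolecularGraph()
c1 = ethanol.add_atom("C")
c2 = ethanol.add_atom("C")
o = ethanol.add_atom("O")
ethanol.add_bond(c1, c2)
ethanol.add_bond(c2, o)
print(ethanol.formula())  # C2H6O
```

Storing bonds as an edge list keeps the sketch short. A real toolkit would also track charges, aromaticity, and stereochemistry, none of which are modelled here.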

The first OCSR systems were limited by the computational resources available at the time and by the early state of computer vision and machine learning algorithms. These systems relied primarily on heuristic and rule-based approaches, supported by classic artificial intelligence (AI) and optical character recognition techniques.

However, advances in hardware, cloud computing, and deep neural networks have since revolutionised OCSR. Modern systems employ attention-based, context-aware image classification models, which removes the need for separate pre-processing steps such as noise removal or image restoration.
