Finding Largest Common Substructures of Molecules in Quadratic Time

Abstract

Finding the common structural features of two molecules is a fundamental task in cheminformatics. Most drugs are small molecules, which can naturally be interpreted as graphs. Hence, the task is formalized as the maximum common subgraph problem. Although the vast majority of molecules yield outerplanar graphs, this problem remains NP-hard. We consider a variation of the problem of high practical relevance, where the rings of molecules must not be broken, i.e., the block and bridge structure of the input graphs must be retained by the common subgraph. We present an algorithm for finding a maximum common connected induced subgraph of two given outerplanar graphs subject to this constraint. Our approach runs in time O(Δn²) on outerplanar graphs with n vertices and maximum degree Δ. This leads to quadratic time complexity on molecular graphs, which have bounded degree. An experimental comparison on synthetic and real-world datasets shows that our approach is highly efficient in practice and outperforms comparable state-of-the-art algorithms.
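
The block and bridge structure mentioned in the abstract can be illustrated on a toy graph. The following is a minimal sketch, not the algorithm of the paper: it uses a standard depth-first search with low-link values to separate bridge edges (edges on no cycle) from ring edges in a simple undirected graph. The function name `find_bridges` and the example graph `ring_with_tail` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def find_bridges(edges):
    """Return the bridge edges of a simple undirected graph as frozensets."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    disc = {}        # discovery time of each vertex
    low = {}         # smallest discovery time reachable from the DFS subtree
    bridges = set()
    timer = 0

    def dfs(u, parent):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        for w in adj[u]:
            if w == parent:
                # Skip the tree edge we arrived on (assumes no parallel edges).
                continue
            if w in disc:
                # Back edge: u lies on a cycle through w.
                low[u] = min(low[u], disc[w])
            else:
                dfs(w, u)
                low[u] = min(low[u], low[w])
                if low[w] > disc[u]:
                    # Nothing below w reaches u or above: (u, w) is a bridge.
                    bridges.add(frozenset((u, w)))

    for v in list(adj):
        if v not in disc:
            dfs(v, None)
    return bridges


# Toy skeleton (an assumption for illustration): a six-membered ring 0..5
# with a two-atom chain 5-6-7 attached to it.
ring_with_tail = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (5, 6), (6, 7)]
bridges = find_bridges(ring_with_tail)
```

In this example the six ring edges form one block, while the two chain edges are bridges. Under the constraint described in the abstract, a common subgraph would have to map ring edges to ring edges and bridges to bridges, so a ring is never matched only in part.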

Authors
  • Droschinsky, Andre
  • Kriege, Nils M.
  • Mutzel, Petra
Shortfacts
Category
Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title
43rd International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM)
Divisions
Data Mining and Machine Learning
Event Location
Lero - Limerick, Ireland
Event Type
Conference
Event Dates
16-20 January 2017
Series Name
LNCS
ISSN/ISBN
978-3-319-51962-3
Publisher
Springer
Page Range
pp. 309-321
Date
16 January 2017
Official URL
http://dx.doi.org/10.1007/978-3-319-51963-0_24