Protein Complex Similarity Based on Weisfeiler-Lehman Labeling

Abstract

Proteins in living cells rarely act alone, but instead perform their functions together with other proteins in so-called protein complexes. Being able to quantify the similarity between two protein complexes is essential for numerous applications, e.g., for database searches of complexes that are similar to a given input complex. While the similarity problem has been extensively studied for single proteins and protein families, there is very little existing work on modeling and computing the similarity between protein complexes. Because protein complexes can be naturally modeled as graphs, general graph similarity measures may in principle be used, but these are often computationally hard to obtain and do not take typical properties of protein complexes into account. Here we propose a parametric family of similarity measures based on Weisfeiler-Lehman labeling. We evaluate it on simulated complexes of the extended human integrin adhesome network. We show that the defined family of similarity measures is in good agreement with edit similarity, a similarity measure derived from graph edit distance, but can be computed more efficiently. It can therefore be used in large-scale studies and serve as a basis for further refinements of modeling protein complex similarity.
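
To make the idea of a similarity based on Weisfeiler-Lehman labeling concrete, the following is a minimal sketch of generic Weisfeiler-Lehman label refinement on small labeled graphs, followed by a Jaccard-type comparison of the resulting label multisets. It is an illustration under stated assumptions, not the parametric family defined in the paper: the function names, the `iterations` parameter, the multiset Jaccard coefficient, and the toy complexes are all hypothetical choices made for this example.

```python
from collections import Counter


def wl_label_multiset(adj, labels, iterations):
    """Collect all labels produced by `iterations` rounds of WL refinement.

    adj:    dict mapping each node to a list of its neighbors (undirected).
    labels: dict mapping each node to its initial label (e.g. a protein name).
    Assumption: labels from every round are pooled into a single multiset.
    """
    current = dict(labels)
    seen = Counter(current.values())
    for _ in range(iterations):
        refined = {}
        for node, neighbors in adj.items():
            # New label = own label plus the sorted multiset of neighbor labels.
            neighborhood = tuple(sorted((current[n] for n in neighbors), key=repr))
            refined[node] = (current[node], neighborhood)
        current = refined
        seen.update(current.values())
    return seen


def multiset_jaccard(a, b):
    """Jaccard coefficient on multisets: sum of minima over sum of maxima."""
    intersection = sum((a & b).values())
    union = sum((a | b).values())
    return intersection / union if union else 1.0


def wl_similarity(graph_a, graph_b, iterations=1):
    """Hypothetical WL-based similarity between two (adj, labels) pairs."""
    return multiset_jaccard(
        wl_label_multiset(*graph_a, iterations),
        wl_label_multiset(*graph_b, iterations),
    )


# Toy "complexes" over the same three proteins: a triangle and a path.
node_labels = {1: "A", 2: "B", 3: "C"}
triangle = ({1: [2, 3], 2: [1, 3], 3: [1, 2]}, node_labels)
path = ({1: [2], 2: [1, 3], 3: [2]}, node_labels)

print(wl_similarity(triangle, triangle))  # identical graphs
print(wl_similarity(triangle, path))      # same proteins, different wiring
```

With zero iterations only the protein labels are compared; each additional round makes the comparison sensitive to a larger neighborhood around every protein. Each round touches every edge once, which is why label-based comparisons of this kind are typically much cheaper than computing a graph edit distance.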

Authors
  • Stöcker, Bianca K.
  • Schäfer, Till
  • Mutzel, Petra
  • Köster, Johannes
  • Kriege, Nils M.
  • Rahmann, Sven
Shortfacts
Category
Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title
12th International Conference on Similarity Search and Applications (SISAP)
Divisions
Data Mining and Machine Learning
Event Location
Newark, NJ, USA
Event Type
Conference
Event Dates
2–4 October 2019
Series Name
Similarity Search and Applications 12th International Conference, SISAP 2019, Newark, NJ, USA, October 2–4, 2019, Proceedings
ISSN/ISBN
978-3-030-32046-1
Publisher
Springer
Page Range
pp. 308-322
Date
2 October 2019