Designing q-Unique DNA Sequences with Integer Linear Programs and Euler Tours in De Bruijn Graphs
DNA nanoarchitectures require carefully designed oligonucleotides with certain non-hybridization guarantees, which can be formalized as the q-uniqueness property on the sequence level. We study the optimization problem of finding a longest q-unique DNA sequence. We first present a convenient formulation as an integer linear program on the underlying De Bruijn graph that allows a variety of constraints to be incorporated flexibly; solution times for practically relevant values of q are short. We then provide additional insights into the problem structure using the quotient graph of the De Bruijn graph with respect to the equivalence relation induced by reverse complementarity. Specifically, for odd q the quotient graph is Eulerian, so finding a longest q-unique sequence is equivalent to finding an Euler tour and can be solved in linear time with respect to the output string length. For even q, self-complementary edges complicate the problem, and the graph has to be Eulerized by deleting a minimum number of edges. Two sub-cases arise; for one of them we present a complete solution, while the other remains open.
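As a small illustration of the q-uniqueness property and of the odd/even distinction in the abstract, the following Python sketch may help. The abstract does not spell out the definition, so the sketch assumes that a sequence is q-unique when no q-mer occurs more than once across the sequence and its reverse complement taken together. The function names are hypothetical and are not taken from the paper.

```python
from itertools import product

# Watson-Crick complement table for the DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(s):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return s.translate(COMPLEMENT)[::-1]


def is_q_unique(s, q):
    """Check q-uniqueness under the ASSUMED definition: no q-mer may occur
    twice among the q-mers of s and of its reverse complement combined."""
    seen = set()
    for strand in (s, reverse_complement(s)):
        for i in range(len(strand) - q + 1):
            qmer = strand[i:i + q]
            if qmer in seen:
                return False
            seen.add(qmer)
    return True


def self_complementary_qmers(q):
    """Enumerate q-mers that equal their own reverse complement (brute force)."""
    qmers = ("".join(p) for p in product("ACGT", repeat=q))
    return [x for x in qmers if x == reverse_complement(x)]
```

Under this assumed definition, a self-complementary q-mer can never appear in a q-unique sequence, because it would occur on both strands. Such q-mers exist only for even q (for odd q the middle base would have to be its own complement), which is consistent with the abstract's remark that even q is the harder case.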
- D'Addario, Marianna
- Kriege, Nils M.
- Rahmann, Sven
Category | Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title | German Conference on Bioinformatics (GCB)
Divisions | Data Mining and Machine Learning
Event Location | Jena, Germany
Event Type | Conference
Event Dates | 20.-22.09.2012
Series Name | OASIcs
ISSN/ISBN | 978-3-939897-44-6
Publisher | Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Page Range | pp. 82-92
Date | 20 September 2012
Official URL | https://doi.org/10.4230/OASIcs.GCB.2012.82