On the Computation of Longest Previous Non-overlapping Factors

On the Computation of Longest Previous Non-overlapping Factors

Abstract

The f-factorization of a string is similar to the well-known Lempel-Ziv (LZ) factorization, but differs from it in that the factors must be non-overlapping. There are two linear time algorithms that compute the f-factorization. Both of them compute the array of longest previous non-overlapping factors (LPnF-array), from which the f-factorization can easily be derived. In this paper, we present a simple algorithm that computes the LPnF-array from the LPF-array and an array prevOcc that stores positions of previous occurrences of LZ-factors. The algorithm has a linear worst-case time complexity if prevOcc contains leftmost positions. Moreover, we provide an algorithm that computes the f-factorization directly. Experiments show that our first method (combined with efficient LPF-algorithms) is the fastest and our second method is the most space efficient way to compute the f-factorization.

Grafik Top
Authors
  • Ohlebusch, Enno
  • Weber, Pascal
Grafik Top
Shortfacts
Category
Book Section/Chapter
Divisions
Data Mining and Machine Learning
Title of Book
String Processing and Information Retrieval
ISSN/ISBN
978-3-030-32685-2
Page Range
pp. 372-381
Date
October 2019
Export
Grafik Top