Abstract :
[en] The notion of string attractor has been introduced by Kempa and Prezza in 2018 in the context of data compression. It represents a set of positions of a finite word in which all of its factors can be “attracted”, i.e., each factor occurs crossing a position from the set. The smallest size γ∗ of a string attractor for a finite word is a lower bound for several repetitiveness measures associated with the most common compression schemes, including BWT-based and LZ-based compressors. The combinatorial properties of the measure γ∗ have been studied in 2021 by Mantaci et al. Very recently, a complexity function, called the string attractor profile function, has been introduced for infinite words, by evaluating γ∗ on each prefix. Such a complexity function has been studied for automatic sequences and linearly recurrent infinite words by Schaeffer and Shallit.
Our contribution to the topic is threefold. First, we explore the relationship between the string attractor profile function and other well-known combinatorial notions related to repetitiveness in the context of infinite words, such as the factor complexity and the property of recurrence. Moreover, we study its asymptotic growth in the case of purely morphic words and obtain a complete description in the binary case. Second, we study similar properties for two new string attractor-based complexity functions, in which the structure and the distribution of positions in a string attractor are taken into account. We also show that these measures provide a finer classification of some infinite families of words, namely the Sturmian and quasi-Sturmian words. Third, we explicitly give the three complexities for some specific morphic words called k-bonacci words.
Scopus citations®
without self-citations
0