2020 Sep - 2023 Mar: Assistant Professor at the M&D Data Science Center of Tokyo Medical and Dental University, Division of Data Science Algorithm Design and Analysis
2018 Sep - 2020 Sep: Post-Doctoral Researcher at the Department of Informatics of Kyushu University, String Data Processing Laboratory
Dominik Köppl, Jannik Olbrich: 未決定文字列における欠如単語の検索の困難さ. Local Proceedings of the 200th アルゴリズム研究会, 研究報告アルゴリズム 200 (7), pages 1–5. (2024) We prove that computing absent words in indeterminate strings is NP-complete.manuscript
Dominik Köppl: Overcoming boundaries of AI: future prospects. EU-Japan AI Bridge – Connecting International Researchers and the Japanese AI Start-up Scene. (2024) A talk about the limitations of machine learning with future prospects in combining techniques of different subjects to improve accuracy.
Hiroki Shibata, Dominik Köppl: LZ78 Substring Compression with CDAWGs. Proc. SPIRE, LNCS 14899, pages 289–305. (2024) We study the substring compression in the LZ78 scheme introduced in [j9] with a compressed text index. The presented solution has superlinear running time with respcet to the number of factors to compute, but can use space asymptotically smaller than the text size.doidblpcodemanuscriptslide
Golnaz Badkobeh, Hideo Bannai, Dominik Köppl: Bijective BWT based Compression Schemes. Proc. SPIRE, LNCS 14899, pages 16–25. (2024) We put the bijective Burrows-Wheeler transform into the landscape of compressivitiy measures by drawing connections to bidirectional macro schemes, study the number of runs with respect to the classic Burrows-Wheeler transform, give a linear-time algortihm for computing the Lyndon factorization of all cyclic rotations of a text, and conjecture that strings with the same Parikh-vector can be bijectively mapped by a sequence of cyclic rotatations and bijective Burrows-Wheeler transform applications.doidblpcodemanuscriptslide
Yuto Nakashima, Dominik Köppl, Mitsuru Funakoshi, Shunsuke Inenaga, Hideo Bannai: Edit and Alphabet-Ordering Sensitivity of Lex-Parse. Proc. MFCS, LIPIcs 306, pages 75:1–75:15. (2024) We give tight logarithmic bounds on the change of the number of factors produced by the lex-parse compression scheme when we change a single character of the input or change the order of the alphabet.doidblparxivmanuscriptmirrorslide
Kento Iseri, Tomohiro I, Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, Ayumi Shinohara: Breaking a Barrier in Constructing Compact Indexes for Parameterized Pattern Matching. Proc. ICALP, LIPIcs 297, pages 89:1–89:19. (2024) We adapt the online Burrows-Wheeler transform construction on the reversed text for constructing the parameterized Burrows-Wheeler transform like in [c35]. By binary searching the interval for the insertion position, we obtain the first construction algorithm of pamaterized indexes whose time only logarithmically depend on the parameterized alphabet size. Up to that point, only linear-dependence has been known.doidblparxivmirror
Yasuko Matsui, Hirotaka Ono, Dominik Koeppl: Enumerating full binary trees in polynomial delay. 25th International Symposium on Mathematical Programming (ISMP). (2024)
Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, Ayumi Shinohara: Algorithms for Galois Words: Detection, Factorization, and Rotation. Proc. CPM, LIPIcs 296, pages 18:1–18:16. (2024) Galois words are conceptually Lyndon words based on the alternating order, for which we give algorithms to determine whether a word is Galois, to factorize a non-Galoid word into Galois words in a unique manner like the Lyndon factorization, and to find the rotation of a word that is Galois. All algorithms work in linear time.doidblparxivcodemirrorslide
Aaron Hong, Marco Oliva, Dominik Köppl, Hideo Bannai, Christina Boucher, Travis Gagie: Pfp-fm: an accelerated FM-index. Algorithms for Molecular Biology 19 (15), pages 1–14. (2024) Journal version of [c41]. We improved the space bounds by run-length encoding both FM-indexes. We called this solution PFP-FM-CSA in the article. Compared to the conference version, this implementation has higher memory consumption during the construction, but significantly smaller index sizes.doidblpcodemirror
Jacqueline W. Daykin, Dominik Köppl, David Kübel, Florian Stober: On arithmetically progressed suffix arrays and related Burrows–Wheeler transforms. Discrete Applied Mathematics Volume 355, pages 180-199. (2024) Compared to the conference version [c21], we here additionally analyze the shapes of Burrows-Wheeler transforms of strings whose suffix arrays are arithmetically progressed. Additionally, we give applications for Christoffel words, balanced words, and meta strings. Finally, we extend our study on binary and ternary alphabets to general alphabets.doidblpmirror
Hideo Bannai, Tomohiro I, Tomasz Kociumaka, Dominik Köppl, Simon J. Puglisi: Computing Longest Lyndon Subsequences and Longest Common Lyndon Subsequences. Algorithmica 86 (3), pages 735–756. (2024) We improve on [c32], we could improve the time and space complexity of our online algorithm, where we shaved off a factor linear in the alphabet sizedoimirror
Eric M. Osterkamp, Dominik Köppl: Extending the Parameterized Burrows–Wheeler Transform. Proc. DCC, pages 143–152. (2024) We propose a generalized pattern matching combining parameterized with circular pattern matching. As an indexing data structure for such a matching, we enhance the parameterized Burrows–Wheeler transform (pBWT) of Kim and Cho with techniques derived from the extended Burrows–Wheeler transform of Mantaci et al., obtaining time and space complexities similar to the pBWT.doidblpmanuscriptphotoslidevideoyoutube
Dominik Köppl: Computing LZ78-Derivates with Suffix Trees. Proc. DCC, pages 133–142. (2024) We extend the LZ78 substring compression introduced in [j9] to the LZ78 derivates LZMW and LZ double. As a byproduct, we obtain the first linear-time algorithm for integer alphabets independent on the text compressibility.doidblparxivmanuscriptslidevideoyoutube
Akiyoshi Kawamoto, Tomohiro I, Dominik Köppl, Hideo Bannai: On the Hardness of Smallest RLSLPs and Collage Systems. Proc. DCC, pages 243–252. (2024) We show that finding a smallest run-length compressed straight line programs (RLSLPs) is NP-hard for unbounded alphabet size. The same proof can be adapted for finding a smallest collage system. Additionally, we give a MAX-SAT encoding for computing a smallest RLSLP.doidblpmanuscript
Dominik Köppl: Computing LZ78-Derivates with Suffix Trees. International Workshop on Discrete Mathematics and Algorithms. (2024) Workshop-Presentation of [c43].
クップル ドミニク, 番原 睦則: Answer Set Programming を用いた圧縮指標の計算. Local Proceedings of the LA Symposium Winter 2023. (2024) A translation of the MAX-SAT encodings presented in [c33] to the ASP language, in Japanese.codemirrorphotoslide
Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, Marcin Piątkowski: Constructing and Indexing the Bijective and Extended Burrows–Wheeler Transform. Inf. Comput. 297 (105153), pages 1–30. (2024) Journal version of [c14] and [c24] with more examples, higher detail in the explanations and an experimental evaluation of the presented construction algorithm with known algorithms constructing the BBWT.doidblpcodemanuscript
Eric M. Osterkamp, Dominik Köppl: パラメタ化 Burrows–Wheeler 変換の拡張. Local Proceedings of コンピュテーション研究会, 信学技報 123 (325), pages 53–55. (2023) Presentation of the results of the Bachelor thesis of Eric M. Osterkamplinkmanuscriptphotoslide
中島 祐人, クップル ドミニク, 舩越 満, 稲永 俊介: lex-parse の圧縮感度. Local Proceedings of the 195th アルゴリズム研究会, 研究報告アルゴリズム 195 (2), pages 1–3. (2023) We study the chance of the number of factors of lex-parse when changing a character in the input.linkmanuscript
クップル ドミニク: LZD と LZMW 分解の部分文字列圧縮について. Local Proceedings of the 195th アルゴリズム研究会, 研究報告アルゴリズム 195 (1), pages 1–3. (2023) We present a deterministic linear-time algorithm for computing LZD and LZMW, where linear either means linear in the input size, or in the model of substring compression linear in the number of factors of the output if we allow linear-time precomputation on the input text.linkmanuscriptphotoslide
Aaron Hong, Marco Oliva, Dominik Köppl, Hideo Bannai, Christina Boucher, Travis Gagie: Acceleration of FM-Index Queries Through Prefix-Free Parsing. Proc. WABI, LIPIcs 273, pages 13:1–13:16. (2023) By storing additionally to the FM-index the index of [c30], we can make use of both to accelerate counting queries for all pattern lengths and the expense of more space compared to the FM-index.doilinkdblparxivcodemirror
Nicola Cotumaccio, Travis Gagie, Dominik Köppl, Nicola Prezza: Space-time Trade-offs for the LCP Array of Wheeler DFAs. Proc. SPIRE, LNCS 14240, pages 143–156. (2023) We sample the longest-common prefix array generalized from strings to Wheeler DFAs, which takes space of words linear to the number of states, for a representation that requires linear number in bits with a logarithmic access time. Like our precursor, we give matching statistics computation as an application, which we now can do with a time-space trade-off.doidblparxivmanuscript
Paola Bonizzoni, Christina Boucher, Davide Cozzi, Travis Gagie, Dominik Köppl, Massimiliano Rossi: Data Structures for SMEM-Finding in the PBWT. Proc. SPIRE, LNCS 14240, pages 89–101. (2023) We propose space-efficient data structures that augment the positional Burrows–Wheeler Transform for efficiently finding set-maximal exact matches, and compare these with the baseline approach, which uses the plain divergence array.doidblparxivcodemanuscript
Dominik Köppl, Florian Kurpicz, Daniel Meyer: Faster Block Tree Construction. Proc. ESA, LIPIcs 274, pages 74:1–74:20. (2023) We leverage the LPF array to judge whether a block in the block tree has already a previous occurrence. By building the block tree levelwise and using information about the previous level, we can obtain the same time complexity as previous approaches while gaining practical performance, primarily because more sophisticated techniques are no longer needed.doilinkdblpcodemirror
Davide Cozzi, Massimiliano Rossi, Simone Rubinacci, Travis Gagie, Dominik Köppl, Christina Boucher, Paola Bonizzoni: μ-PBWT: a lightweight r-indexing of the PBWT for storing and querying UK Biobank data. Bioinformatics 39 (9). (2023) We augment the positional Burrows–Wheeler Transform (PBWT) with indexing data structures akin to the r-index to devise an indexing data structure for matching stastistics while keeping the working space bounded in the number of runs in the PBWT when read column-wise.doidblpcodemirror
Hideo Bannai, Tomohiro I, Dominik Köppl: Longest bordered and periodic subsequences. Inf. Process. Lett. 182 (106398), pages 1–6. (2023) We use state-of-the-art tools for computing the longest common subsequences between all prefixes and all suffixes of a text to compute longest bordered subsequences or longest periodic subsequences. For the longest bordered subsequences, we give a conditional lower bound that matches our quadratic running time.doidblparxivmirror
Davide Cozzi, Massimiliano Rossi, Simone Rubinacci, Travis Gagie, Dominik Köppl, Christina Boucher, Paola Bonizzoni: μ-PBWT: a lightweight r-indexing of the PBWT for storing and querying UK Biobank data. Poster at ISMB/ECCB. (2023) poster
Dominik Köppl: Encoding Hard String Problems with Answer Set Programming. Proc. CPM, LIPIcs 259, pages 17:1–17:21. (2023) We study common NP-hard problems that have strings as inputs under the light of solvers that work with problem encodings written in answer set programming (ASP). Our evaluation reveals that naive implementations are usually outpaced by ASP encodings if the solver manages to considerably reduce the number of choices to take. While we empirically experienced this situation for most of the presented problems, solving the traveling salesman problem was more challenging with ASP than with a solution working with dynamic programming.doilinkdblpcodemirrorphotoslidevideoyoutube
César Martínez-Guardiola, Nathaniel K. Brown, Fernando Silva-Coira, Dominik Köppl, Travis Gagie, Susana Ladra: Augmented Thresholds for MONI. Proc. DCC, pages 268–277. (2023) We practically accelerate matching statistics computation of PHONI (the r-index with a grammar supporting LCE queries) by augmenting thresholds introduced in MONI with LCE information between the thresholds and the neighboring run endings. This allows us to skip some LCE queries for the grammar, which was the computational bottleneck of PHONI.
doidblparxivcodemanuscript
Dominik Köppl: Dynamic Skyline Computation with LSD Trees. Analytics 2 (1), pages 146–162. (2023) We enhance the algorithm of [c3] for dynamic skyline computation that exhibits practically good performance for small dimensions. The dynamic skyline problem is to support successive queries, where the query parameters can slightly be changed. Our idea is to cache the query results and quickly decide on a new query which cached results can be reused. For this task, a geometric data structure like the LSD tree used in the original paper is useful. doilinkcodemirror
Christina Boucher, Dominik Köppl, Herman Perera, Massimiliano Rossi: r インデックスにおける接尾辞配列を模倣するデータ構造. Local Proceedings of the LA Symposium Winter 2022. (2023) Presentation of [c34] in Japanese.manuscriptslide
Szymon Grabowski, Dominik Köppl: Space-efficient Huffman codes revisited. Information Processing Letters 179 (106274), pages 1–8. (2023) We lower the space requirements of a Huffman tree representation with constant time for a single encoding and decoding operation by a partitioning of the codewords and indexing each part by a dedicated fusion tree to improve the space bounds. doilinkdblparxivmirror
中島 祐人, クップル ドミニク, 舩越 満, 稲永 俊介: アルファベット順による lex-parse サイズ比. Local Proceedings of the 191th アルゴリズム研究会, 研究報告アルゴリズム 191 (4), pages 1–4. (2023) We show that changing the alphabet order can lead to an logarithmic-multiplicative increase of the number of factors of lexicographic parsings.linkmanuscript
クップル ドミニク: 接尾辞木に基づくLZ77とLPF配列の変種の計算. Local Proceedings of コンピュテーション研究会, 信学技報 122 (294), pages 29–30. (2022) Presentation of [j11] and [j9] in Japanese.linkmanuscriptslide
Dominik Köppl, Gonzalo Navarro, Nicola Prezza: Lempel–Ziv 項の距離を高次情報量で表現する符号. Local Proceedings of the 190th アルゴリズム研究会, 研究報告アルゴリズム 190 (1), pages 1–3. (2022) Presentation of [c29] in Japanese.linkmanuscriptslide
Daiki Hashimoto, Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, Ayumi Shinohara: Computing the Parameterized Burrows–Wheeler Transform Online. Proc. SPIRE, LNCS 13617, pages 70–85. (2022) We propose an algorithm for the construction of the parameterized Burrows-Wheeler transform whose time complexity linearly depends on the number of parameterized characters and logarithmically on the text length. The latter is due to dynamic wavelet trees necessary for building the Burrows-Wheeler transform from right to left.doidblparxivmanuscript
Christina Boucher, Dominik Köppl, Herman Perera, Massimiliano Rossi: Accessing the Suffix Array via ϕ^-1-Forest. Proc. SPIRE, LNCS 13617, pages 86–98. (2022) We augment the r-index with a graph of sparse ϕ-array values for improving random access to the suffix array.
The r-index uses a predecessor data structure on these sparse ϕ-array values,
and computes suffix array values sequentially starting from a run boundary by issuing a predecessor query per entry.
We here propose a graph whose traversal simulates such a sequential computation of suffix array entries by avoiding predecessor queries if possible.
This graph can speed up random access to the suffix array when long traversals on it are possible. doidblpcodemanuscriptslidevideoyoutube
Hideo Bannai, Keisuke Goto, Masakazu Ishihata, Shunsuke Kanda, Dominik Köppl, Takaaki Nishimoto: Computing NP-hard Repetitiveness Measures via MAX-SAT. Proc. ESA, LIPIcs 244, pages 12:1–12:16. (2022) We provide MAX-SAT formulations for the smallest string attractor, the smallest straight-line program, and the smallest bidirectional scheme, and give implementations in pysat. This is the first study that tackles these NP-hard problems practically and on real data.doidblparxivcodemirror
Paolo Ferragina, Giovanni Manzini, Travis Gagie, Dominik Köppl, Gonzalo Navarro, Manuel Striani, Francesco Tosoni: Improving Matrix-vector Multiplication via Lossless Grammar-Compressed Matrices. Proc. VLDB 15 (10), pages 2175–2187. (2022) We apply the grammar compressor Re-Pair on matrices by linearizing a matrix row-wise and joining the rows by special delimiters. The resulting grammar is used to build an arithmetic circuit for performing matrix-vector multiplications in compressed space related to the number of rules produced by the grammar.
The reordering of the columns improves the compression performance by bringing columns with related content together.
doilinkdblparxivcodemirror
Hideo Bannai, Tomohiro I, Tomasz Kociumaka, Dominik Köppl, Simon J. Puglisi: Computing Longest (Common) Lyndon Subsequences. Proc. IWOCA, LNCS 13270, pages 128–142. (2022) We give algorithms computing the smallest lexicographic sequence for each possible length, the longest Lyndon subsequence, and the longest common Lyndon subsequence. For the longest Lyndon subsequence, we additionally present an online algorithm.doidblparxivmanuscriptphotoslidevideoyoutube
Tomohiro I, Dominik Köppl: Space-Efficient B Trees via Load-Balancing. Proc. IWOCA, LNCS 13270, pages 327–340. (2022) We could improve the space bounds for B trees by a combination of two B tree variants, the B+ tree and B* tree.
We extend the B* tree technique to share keys among a non-constant number of sibling nodes.
Like the B+ tree, we store the actual data in the leaves, whose capacity we enlarging to reduce the number of internal nodes.
We further augment our B tree with aggregate values such as storing the minimum in stored in the descendant leaves of a node, and show that our complexity bounds also hold for certain types of aggregate functions. doidblparxivmanuscriptslidevideoyoutube
Dominik Köppl, Simon J. Puglisi, Rajeev Raman: Fast and Simple Compact Hashing via Bucketing. Algorithmica 84 (9), pages 2735–2766. (2022) We extend [c20] with SIMD instruction support for accelerating the sequential scan in the buckets. We theoretically support our devised data structure with probabilistic time complexities and a lower bound on the time after which a rehashing must occur. We further showed that overflow tables can improve the distribution of elements among the buckets when the maximum bucket size is too small with respect to the number of elements to hash.The measured memory must be treated with some reservation since the actual operating system memory allocator is not measured in the experiments. By allocating many tiny fragments of space, the non-measured space might not be non-negligible.doilinkdblpcodemirror
Kazuya Tsuruta, Dominik Köppl, Shunsuke Kanda, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda: c-trie++: A dynamic trie tailored for fast prefix searches. Inf. Comput. 285 Part B (104794), pages 1–22. (2022) We extended our introduction of a novel trie in [c17] by a more detailed evaluation. Here, we tested random and sorted insertion of keywords, and the effect of querying the generated data structure with the same distribution, a different random distribution or in sorted order. Despite the fact that the performance of our presented trie shows a strong dependency on that order with the sorted setting being the best case, it still outperforms other compact tries on less favorable distributions.doilinkdblpcodemirror
Dominik Köppl: Linking Off-Road Points to Routing Networks. Algorithms 15(5) (163), pages 1–15. (2022) We analyze ways in how to add two-dimensional points to routing networks that can be represented locally on a two-dimensional plane with the Euclidean metric.
We give a practical use case for the spatial extension pgrouting of the Postgres database in how this linkage can practically be achieved.doilinkdblpmirror
Alexandre P. Francisco, Travis Gagie, Dominik Köppl, Susana Ladra, Gonzalo Navarro: Graph Compression for Adjacency-Matrix Multiplication. SN Computer Science 3 (193), pages 1–8. (2022) We study two compression schemes for binary matrices in the light of an index for matrix-vector multiplications, namely (1) the WebGraph format, which encodes a row by a reference to a former row plus difference information, and (2) an extraction of bicliques, i.e., two sets of vertices where each node of the first set has arcs to all nodes of the second set. We perform such a compression and use the compressed matrix for PageRank computation, where we directly compute multiplications on the compressed representation without decompression, leveraging the fact that the compression removed redundancies. doidblpcodemanuscriptmirror
Jin Jie Deng, Wing-Kai Hon, Dominik Köppl, Kunihiko Sadakane: FM-Indexing Grammars Induced by Suffix Sorting for Long Patterns. Proc. DCC, pages 63–72. (2022) We build the Burrows-Wheeler transform on a text that has been grammar-compressed with the grammar deduced by induced suffix sorting, and show that this grammar exhibits properties to accelerate the backward search of the FM-index, which we now perform not only single characters but on the non-terminals produced by the grammar that serve as meta-characters.
This speed-up can compensate the additional time for the grammar parse of the pattern if the pattern is long enough.doilinkdblparxivcodemanuscript
Dominik Köppl, Gonzalo Navarro, Nicola Prezza: HOLZ: High-Order Entropy Encoding of Lempel–Ziv Factor Distances. Proc. DCC, pages 83–92. (2022) We propose an alternative encoding of the offset values of the LZ77 parsing by using the distances of the already parsed prefixes of the text sorted in colexicographic order, and achieve better compression ratios than the rightmost parsing or the bit-optimal parsing when the text has low k-th order empirical entropy.This work is strongly related to old compression methods like the ACB and HYZ compression with the difference that these methods use only contexts of bounded length when sorting the prefixes of the processed text in co-lexicographic order.doilinkdblparxivcodemanuscriptslidevideoyoutube
Dominik Köppl: Computing Lexicographic Parsings. Proc. DCC, pages 232–241. (2022) We present a linear-time algorithm for computing the lex-parse via the ϕ-Array, for which we devise a sparse representation. Compared to previous implementations, we could significantly reduce the working space required for computing lex-parse. We obtain similar results for plcpcomp introduced in [c15].doilinkdblpmanuscriptslidevideoyoutube
Dominik Köppl: Inferring Spatial Distance Rankings with Partial Knowledge on Routing Networks. Information 13(4) (168), pages 1–28. (2022) We study a weaker problem of the shortest path problem in that we just want a ranking of vertices with respect to the distance to a query vertex, which can be helpful in recommender systems. There, we provide a cache that can speed up such queries when they are issued subsequently and in a sufficient vicinity.
This cache is an orthogonal enhancement to graph indices like contraction hierarchies.doilinkdblpcodemirror
赤木 亨, ドミニク クップル, 中島 祐人and 稲永 俊介, 坂内 英夫, 竹田 正幸: 文法圧縮索引構造 GCIS-index. Local Proceedings of the LA Symposium Summer 2021. (2022) Presentation of [c26] in Japanese.
Tomohiro I, Dominik Köppl, Robert Irving, Lorna Love: Extracting the Sparse Longest Common Prefix Array from the Suffix Binary Search Tree. Proc. SPIRE, LNCS 12944, pages 143–150. (2021) We review the suffix binary search tree for sparse suffix sorting. The suffix binary search tree is a binary search tree for storing dynamically suffixes, and can be made balanced by using AVL tree techniques. While it is easy to extract the sparse suffix array with an in-order tree traversal, we can do so also for the sparse longest common prefix table if we keep track of the longest common prefix of the lowest left and the lowest right ancestor node.doidblpmanuscriptslidevideoyoutube
Tooru Akagi, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda: Grammar Index by Induced Suffix Sorting. Proc. SPIRE, LNCS 12944, pages 85–99. (2021) We index the rules of the grammar deduced by induced suffix sorting with a suffix tree and use it for pattern matching on the pattern compressed with the same grammar.
For that, we make the observation that the grammar rules produced for the pattern equal the rules for the text, except at the borders of the pattern.
After finding these equal grammar rules in the grammar tree of the text, we need to do some extra work to extend these to cover the whole pattern.
For that, we build the generalized suffix tree on the right hand sides of all rules of the text grammar to support longest common extension queries that speed up the comparison.
Our index performs practically well on highly-repetitive texts.
doidblparxivcodemanuscript
Hideo Bannai, Mitsuru Funakoshi, Tomohiro I, Dominik Köppl, Takuya Mieno, Takaaki Nishimoto: A Separation of γ and b via Thue–Morse Words. Proc. SPIRE, LNCS 12944, pages 167–178. (2021) While it is know that Thue-Morse words have string attractors of size 4 (for large enough such words),
we here proof that the size of the smallest bidirectional macro scheme of such a word scales logarithmic with their lengths.
Up so far, this is the first observation that the size of the smallest string attractors is asymptotically smaller than the size of the smallest bidirectional macro scheme of more than a constant factor.
doidblparxivmanuscript
Diego Arroyuelo, Rodrigo Cánovas, Johannes Fischer, Dominik Köppl, Marvin Löbel, Gonzalo Navarro, Rajeev Raman: Engineering Practical Lempel-Ziv Tries. ACM JEA 26 (14), pages 1.14:1–1.14:47. (2021) We improve on the LZ trie implementations studied in [c13] together with an LZ78 variant that encodes the output in a compact hash table to keep the working space small. We engineer a new trie representation with the hash table proposed in [c20]. This trie keeps multiple of these hash tables, each for different bit ranges for the values, and optionally also for the keys to store. We also use the latter hash table in the LZ78 variant, where we use a Bonsai trie representing the LZ78 trie as well as part of the parsing, since we here only emit the positions of the stored nodes in the hash table instead of the standard LZ78 coding. This can save space if the hash table has a high load factor.doidblpmirror
Dominik Köppl: Computation of Variations of the LZ77 factorization and the LPF Array with Suffix Trees. WCTA. (2021) Presentation of [j11] and [j9].slidevideo
Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, Marcin Piątkowski: Constructing the Bijective and the Extended Burrows–Wheeler Transform in Linear Time. Proc. CPM, LIPIcs 191, pages 7:1–7:16. (2021) We show how to construct the bijective Burrow-Wheeler transform (BBWT) in linear time with a modification of the SAIS algorithm computing the suffix array in linear time. The idea is to simulate SAIS while paying attention at the Lyndon factor borders such that the inducing step of SAIS is based on the context within a Lyndon factor, and not on the characters preceding the factor. This gives us a so-called circular suffix array, with which it is easy to obtain the BBWT.doidblparxivmirrorslidevideoyoutube
Dominik Köppl: Reversed Lempel–Ziv Factorization with Suffix Trees. Algorithms 14(6) (161), pages 1–26. (2021) We apply the suffix tree compositions studied in [j1] for the reversed LZ variant and the respective LPF array variant. The space become twice as big since we index also the reversed text for finding reversed matches during the factorization. On the upside, the time complexities are the same as when computing the standard LZ parsing.doidblpmirror
Christina Boucher, Travis Gagie, Tomohiro I, Dominik Köppl, Ben Langmead, Giovanni Manzini, Gonzalo Navarro, Alejandro Pacheco, Massimiliano Rossi: PHONI: Streamed Matching Statistics with Multi-Genome References. Proc. DCC, pages 193–202. (2021) We practically implement an algorithm computing the matching statistics with the r-index and a grammar supporting longest common extension (LCE) queries by a single reverse scan over the pattern.
This scan is done by classic FM-index backward search steps.
Whenever we have a mismatch with the r-index, we issue LCE queries at the text positions of the closest positions in the BWT with characters equal to the pattern character we want to match.
The longest such match leads us to the next matching statistics length. Unlike previous algorithms that use two passes over the pattern but have better theoretical time complexities,
our proposed algorithm is practically faster since querying the grammar sometimes can be less expensive than a second pass over the pattern.
doilinkdblparxivcodeslidevideoyoutube
Christina Boucher, Travis Gagie, Tomohiro I, Dominik Köppl, Ben Langmead, Giovanni Manzini, Gonzalo Navarro, Alejandro Pacheco, Massimiliano Rossi: PHONI: Streamed Matching Statistics with Multi-Genome References. Data Structures in Bioinformatics. (2021) Workshop-Presentation of [c23].
Dominik Köppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto: Re-Pair in Small Space. Algorithms 14(1) (5), pages 1–20. (2021) We elaborate on the nearly in-place algorithm of [c22]. We derive from these techniques a bit-parallel algorithm, a parallel algorithm (working with multiple threads), and an algorithm working in external memory. We further show that these techniques can be adapted for the computation of the MR-Re-Pair compression variant, that takes not the most frequent bigram per se, but the maximal repeat containing that bigram.doidblpcodemirror
Dominik Köppl: Non-Overlapping LZ77 Factorization and LZ78 Substring Compression Queries with Suffix Trees. Algorithms 14(2) (44), pages 1–21. (2021) First, we propose an application of the suffix tree compositions studied in [j1] for the non-overlapping LZ variant and the respective LPF array variant. This algorithm works in the same space as the algorithm for the standard LZ parsing, and the time complexities are linear or near-linear. Second, we show how to perform substring compression queries for LZ78 in time linear in the number of phrases to compute. With space-efficient suffix tree representations, we get a logarithmic time penalty, which we can slightly improve by using the centroid path-decomposed suffix tree.doidblpmirror
Takuya Mieno, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda: Space-efficient algorithms for computing minimal/shortest unique substrings. Theor. Comput. Sci. Volume 845, pages 230–242. (2020) We enhance our results in [c16] with variations using compressed data structures with time and space trade-offs. On of these are the suffix trees studied in [j1]. We further give the maximum working space needed for each phase during the construction of our data structure. doidblparxivmanuscript
Shunsuke Kanda, Dominik Köppl, Yasuo Tabei, Kazuhiro Morita, Masao Fuketa: Dynamic Path-Decomposed Tries. ACM JEA 25 (1), pages 1.13:2–1.13:28. (2020) We derive a dynamic variant of the centroid-path decomposition of static trees for devising a dynamic trie,
where each node stores a keyword, while its children share a common prefix with this keyword.
While the keywords are stored prefix-compressed in an auxiliary data structure, the remaining tree is indexed with a compact hash table trie representation such as the one suggested in [c13]. doidblparxivcodemanuscript
Dominik Köppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto: Re-Pair in Small Space. Proc. PSC, pages 134–147. (2020) We provide an algorithm computing the grammar compression re-pair in small up to constant additional space while allowing up to quadratic time. The idea is that we successively use freed-up text space for storing frequency tables storing the most frequent bigrams. We study the conditions in which such a table needs to be updated, or when we can continue the computation with the old table.linkdblpmirrorphotoslidevideoyoutube
Kazuya Tsuruta, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda: Grammar-compressed Self-index with Lyndon Words. IPSJ TOM 13 (2), pages 84–92. (2020) We build a grammar index on the grammar induced by the Lyndon tree, which uses properties of Lyndon words to accelerate pattern matching as usually done by a grid for grammar indices that tries to divide a pattern at each possible position.linkarxivmanuscript
Jacqueline W. Daykin, Dominik Köppl, David Kübel, Florian Stober: On Arithmetically Progressed Suffix Arrays. Proc. PSC, pages 96–110. (2020) We exhaustively characterize all strings having an arithmetic progression as a suffix array, and determine the shape of the Burrows-Wheeler transform induced by such suffix arrays. In particular, given an arithmetic progression, we find a uniquely defined string of minimal alphabet size whose suffix array is this progression. This minimal alphabet size can be at most three. Further, this generalizes the results presented in [c6].linkdblparxivmirror
Johannes Fischer, Tomohiro I, Dominik Köppl: Deterministic Sparse Suffix Sorting in the Restore Model. ACM Trans. Algorithms 16 (4), pages 50:1–50:53. (2020) We provide a full version of [c9] with an additional proof that the upper bound on the number of changed nodes in the edit-sensitive parsing has to be higher by a logarithmic factor than claimed in literature. This observation strengthens the reason of our proposed modification of that grammar, which is not prone to this change.
doilinkdblpmirror
Dominik Köppl, Simon J. Puglisi, Rajeev Raman: Fast and Simple Compact Hashing via Bucketing. Proc. SEA, LIPIcs 160, pages 7:1–7:14. (2020) We extend the study in [i7] by providing a variant, where buckets are implicitly represented as parts of a meta-bucket, which stores a bit vector for the beginning of a bucket inside the meta-bucket. This improves the average load-factor per bucket, which can become skewed if the maximal allowed bucket size is too small with respect to the input size.doidblparxivmirrorslidevideoyoutube
Dominik Köppl, Daiki Hashimoto, Diptarama Hendrian, Ayumi Shinohara: In-Place Bijective Burrows–Wheeler Transforms. Proc. CPM, LIPIcs 161, pages 21:1–21:15. (2020) We provide algorithms computing and reverting the bidirectional Burrows-Wheeler transform (bijective BWT) in quadratic time with only constant additional space.
Up so far, it was only known how to do the same for the BWT in the same complexities.
We additionally give an algorithm for converting a BWT to the bijective BWT or back directly within the same complexities.
doilinkdblparxivcodemirrorslidevideoyoutube
Roland Glück, Dominik Köppl: Computational Aspects of Ordered Integer Partition with Bounds. Algorithmica 82 (10), pages 2955–2984. (2020) We parallelized our algorithm proposed in [c1]. For that, it is possible to generate a dependency graph of the polynomials we want to construct.
To maximize parallelization, we traverse this graph in topological order, starting with the polynomials having no dependencies.
We empirically could evaluate that this schedule leads to an algorithm that scales very well with increasing number of threads.doidblpcodeNature SharedItmanuscript
Dominik Köppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto: Re-Pair in Small Space (Poster). Proc. DCC, pages 377. (2020) Poster presentation of [c22].doilinkdblparxivmanuscriptpostervideoyoutube
Kazuya Tsuruta, Dominik Köppl, Shunsuke Kanda, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda: c-Trie++: A Dynamic Trie Tailored for Fast Prefix Searches. Proc. DCC, pages 243–252. (2020) We algorithmically engineer a combination of z-fast tries and compact tries, which allows us to perform fast prefix search queries via word-packing.
Its empirical space usage is much lower than other known compact trie implementations while also being among one of the fastest.
doilinkdblparxivslidevideo
Dominik Köppl: Constructing the Bijective BWT. The 28th London Stringology Days and London Algorithmic Workshop. (2020) Workshop-Presentation of [c24].mirror
Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, Marcin Piątkowski: Constructing the Bijective BWT. Local Proceedings of the 175th アルゴリズム研究会, 研究報告アルゴリズム 175 (19), pages 1–6. (2019) Presentation of [c24] in Japanese.linkmanuscriptslide
Takuya Mieno, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda: Compact Data Structures for Shortest Unique Substring Queries. Proc. SPIRE, LNCS 11811, pages 107–123. (2019) We give a lightweight data structure for answering point and interval shortest unique substring (SUS) queries, for which we detect and index minimal unique substrings. While previous data structures already can report a point SUS in constant time, they need words of space linear in text size. Here, we give solutions working in bits of space linear in the text length. doidblparxivmanuscript
Patrick Dinklage, Jonas Ellert, Johannes Fischer, Dominik Köppl, Manuel Penschuck: Bidirectional Text Compression in External Memory. Proc. ESA, pages 41:1–41:16. (2019) We give a space-efficient algorithm for the lcpcomp compressor defined in [c11].
We can also compute lcpcomp in external memory for massive datasets efficiently. For the decompression, we give the first algorithm for decompressing an arbitrary bidirectional macro scheme, and give I/O complexities for this algorithm.doidblparxivmirrorslide
クップル ドミニク , 井 智弘, 古谷 勇, 高畠 嘉将, 酒井 健輔, 後藤 啓介: Re-Pair In-Place. Local Proceedings of the LA Symposium Summer 2019. (2019) Presentation of [c22] in Japanese.manuscriptphotoslide
鶴田 和弥, Dominik Köppl, 神田 峻介, 中島 祐人, 稲永 俊介, 坂内 英夫, 竹田 正幸: Dynamic Trie Tailored for Fast Prefix Searches. Local Proceedings of the LA Symposium Summer 2019. (2019) Presentation of [c17] in Japanese.
三重野 琢也, Dominik Köppl, 中島 祐人, 稲永 俊介, 坂内 英夫, 竹田 正幸: 最短非反復部分文字列問題に対するコンパクトなデータ構造. Local Proceedings of the LA Symposium Summer 2019. (2019) Presentation of [c16] in Japanese.
Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, Marcin Piątkowski: Searching Patterns in the Bijective BWT. Dagstuhl Seminar 19241 "25 Years of the Burrows-Wheeler Transform", pages 65. (2019) Presentation of ideas published in [c14] with more details.mirrorslide
Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, Marcin Piątkowski: Indexing the Bijective BWT. Proc. CPM, LIPIcs 128, pages 17:1–17:14. (2019) We augment the bijective Burrows-Wheeler transform with FM-indexing techniques for pattern matching with a potential multiplicative-logarithmic slow down for tracking false and missing occurrences of a pattern. Since the bijective BWT is derived by the previous character of each cyclic rotation of all Lyndon conjugates sorted in the ω-order, a backward search step at the beginning of a Lyndon factor gives us not the previous character but the end of this Lyndon factor. For such cases, the index has to do extra counting for occurrences not in the suffix array interval, or false positives contained in it. doidblpmirrorslide
Dominik Köppl: Separate Chaining Meets Compact Hashing. Local Proceedings of the 173th アルゴリズム研究会, 研究報告アルゴリズム 173 (5), pages 1–11. (2019) We algorithmically engineer a compact hash table with separate chaining, suitable for storing and querying integers with small bit widths.linkmanuscriptphotoslide
Patrick Dinklage, Jonas Ellert, Johannes Fischer, Dominik Köppl, Manuel Penschuck: PLCP配列に基づくテキスト圧縮. Local Proceedings of the LA Symposium Winter 2018. (2019) Presentation of [c15] in Japanese.manuscriptphotoslide
赤木 亨, 中島 祐人, ドミニク クップル, 稲永 俊介, 坂内 英夫, 竹田 正幸: GCIS による文法圧縮テキスト上のパターン照合. Local Proceedings of the LA Symposium Winter 2018. (2019) Presentation of [c26] in Japanese.
Tomohiro I, Dominik Köppl: Improved upper bounds on all maximal α-gapped repeats and palindromes. Theor. Comput. Sci. Volume 753, pages 1–15. (2019) We refine the upper bounds stated in [j2] on the maximum number of all maximal α-gapped repeats and palindromes.
To this end, we leverage some properties of periodic substrings, and improve the analysis where we map gapped repeats or palindromes to two-dimensional points spawning rectangles that must not overlap. doidblpmanuscript
Paweł Gawrychowski and
Tomohiro I and
Shunsuke Inenaga and
Dominik Köppl and
Florin Manea: Tighter Bounds and Optimal Algorithms for All Maximal α-gapped
Repeats and Palindromes - Finding All Maximal α-gapped
Repeats and Palindromes in Optimal Worst Case Time on Integer Alphabets. Theory Comput. Syst. 62 (1), pages 162–191. (2018) We improve [c7] with tighter combinatorial bounds. We state that the upper bound, which is linear in the text length and α, is tight by a known lower bound. We explicitly give numbers for the upper bounds on the maximum number of all maximal α-gapped repeats and palindromes appearing in a text of length n. We also explicitly give proof details for the palindromic case, which we omitted in the conference version due to page limit restrictions. We also give an algorithm finding all maximal α-gapped palindromes in time linear in α and the text length, which is optimal in the worst case. doilinkdblpNature SharedItmirror
Dominik Köppl: Exploring regular structures in strings. Dissertation at TU Dortmund. (2018) We unify the analysis of α-gapped repeats and α-gapped palindromes of our previous papers and give some applications for our algorithm computing LZ77 with the suffix tree. Compared to [j2], we could improve the proofs with some cases neglected in that version. Explanations have been enhanced with elaborated examples.The analysis of fragile nodes for the lower-bound of the ESP parsing should emphasize that there is an example in which *all* of the analyzed fragile nodes change. The notion of fragile nodes just states that a fragile node can be parsed differently, but in the shown example all studied fragile nodes indeed change.doilinkdblpmanuscriptslide
Johannes Fischer, Tomohiro I, Dominik Köppl, Kunihiko Sadakane: Lempel–Ziv Factorization Powered by Space Efficient Suffix Trees. Algorithmica 80 (7), pages 2048–2081. (2018) We unify the results in [c5] and [c8]. We improve the time complexities of the algorithms working with compressed suffix trees to linear. We largely simplified the LZ78 computation by using bits for counting in unary to what extent a suffix tree edge has been explored by the LZ trie when superimposing the suffix tree with the LZ trie.doidblpNature SharedItmanuscript
Dominik Köppl: Plug&Play Kompression mit dem Framework tudocomp. Vortragsreihe der Regionalgruppe der Gesellschaft für Informatik aus Dortmund. (2018) User guide in how to work with the compression framework tudocomp.manuscriptslide
Dominik Köppl: Lempel–Ziv with Integer Coders. Newsletter of the SoBigData Research Infrastructure, Issue 1. (2018) A study of the distribution of distances and lengths of LZ phrases.mirror
Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, Marcin Piątkowski: Indexing the Bijective BWT. 32th Mini-Workshop "Theoretical Computer Science" at TuDo and RUB. (2018) Workshop-Presentation of [c14].
Johannes Fischer and
Dominik Köppl: Practical Evaluation of Lempel–Ziv-78 and Lempel–Ziv–Welch Tries. Proc. SPIRE, LNCS 9472, pages 191–207. (2017) We conduct an experimental analysis for various practical trie data structures for computing the LZ78 parsing, with focus on some trie representations that use hashing techniques. Here, we present a Monte-Carlo algorithm based on fingerprint-hashing substrings of the text for checking whether substringns are already part of the LZ78 dictionary. For large fingerprints, we could not observe a failure for even hundreds of megabytes of real-word texts, while outpacing other common solutions. We also present a representation of the LZ78 trie with a compact hash table that uses less space than other known solutions.doilinkdblparxivslide
Andre Brehme, Sven Brümmer, Jonas Charfreitag, Jonas Ellert and
Jannik Junghänel, Dominik Köppl, Christopher Osthues, Sven Rahmann, Dennis Rohde and
Julian Sauer, Jonas Schmidt, Lars Schäpers, Uriel Elias Wiebelitz, Jens Zentgraf: PanGeA: Pan-Genome Annotation Indexing Annotated Human Genome Collections. German Conference on Bioinformatics (GCB). (2017) We compare several approaches such as colored de-Brujin graphs, CHICO, journaled string trees and relative LZ for their suitability in maintaining and querying genomic data with high similarity. We empirically evaluated that colored de-Brujin graphs are good pre-filters since small yet efficient implementations exist. mirrorposter
Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, Marcin Piątkowski: Indexing the Bijective BWT. WCTA. (2017) Workshop-Presentation of [c14].manuscriptslide
Hideo Bannai, Shunsuke Inenaga, Dominik Köppl: Computing All Distinct Squares in Linear Time for Integer Alphabets. Proc. CPM, LIPIcs 78, pages 22:1–22:18. (2017) We present a practical linear-time algorithm computing all distinct squares for integer alphabets. We improve on an algorithm of Gusfield and Stoye, who used the Lempel-Ziv factorization for spotting possible locations of the leftmost occurrences of squares. For that, we used a range minimum query data structure on the longest common prefix factor array to quickly decide whether a substring of a certain length is the leftmost occurrence. With that, we could get the algorithm down to linear time, which was previously only possible with this algorithm for constant-size alphabets. We further studied similar techniques when computing all distinct squares online by using an online construction of the suffix tree augmented with dynamic tree data structures.The definition of the factors by the LCP array in Sect. 3 is wrong; we need make each factor shorter by one character to be correct. We also have to note that the linear-time computation is already possible with the work of Crochemore et al.: "Extracting powers and periods in a word from its runs structure", TCS'14.doilinkdblparxivcodemirrorphotoslide
Patrick Dinklage, Johannes Fischer, Dominik Köppl, Marvin Löbel, Kunihiko Sadakane: Compression with the tudocomp Framework. Proc. SEA, LIPIcs 75, pages 13:1–13:22. (2017) We present a C++ compression framework for easy implementation and comparison with well-known compression techniques like LZ77. To this end, we proposed a new bidirectional parsing, called lcpcomp, which is a lexicographic parsing in that it greedily takes the maximum longest longest common prefix values of neighboring suffixes in the suffix array, and replaces one with a reference to the other. Further, we provide an LZ78-variant where factors are enlarged until they are right-maximal repeats. We empirically show that the encoding of both factorizations are competitive with existing compression solutions.The linear time argument for the compression algorithm in Section 3.2.2. is not complete.
The linear time follows from the fact that we look-up m positions for decrease while deleting m positions when creating a factor of length m.
Therefore, the time for deleting and decreasing is bounded by the factor lengths, which sums up to O(n).
doilinkdblparxivcodemanuscriptmirrorslide
Hideo Bannai, Shunsuke Inenaga, Dominik Köppl: Computing All Distinct Squares in Linear Time for Integer Alphabets. 30th Mini-Workshop "Theoretical Computer Science" at TuDo and RUB. (2017) Presentation of the result of [c12].linkslide
Johannes Fischer, Dominik Köppl, Florian Kurpicz: On the Benefit of Merging Suffix Array Intervals for Parallel Pattern
Matching. Proc. CPM, LIPIcs 54, pages 26:1–26:11. (2016) We use the merging of suffix array intervals to speed up approximate pattern matching queries, i.e., matching a pattern with a text while allowing an upper bound on the distance (Levenshtein or Hamming distance).
Our main tool is that the merging of suffix array intervals corresponding to matched parts of the pattern can be parallelized -
a fact that we use when first computing the suffix array intervals of all prefixes and suffixes of parts of the pattern for the case of one error.
We finally generalize this to arbitrary error bounds with the result of Huynh et al.doilinkdblparxivmirror
Johannes Fischer and
Tomohiro I and
Dominik Köppl: Deterministic Sparse Suffix Sorting on Rewritable Texts. Proc. LATIN, LNCS 9644, pages 483–496. (2016) We devise a deterministic algorithm for sparse suffix sorting in compact space when allowing to reversibly edit the input text.
We use a grammar compression technique for fast longest common extension queries, which we write on redundant parts of the text detected while computing the sparse suffix sorting online.
Our grammar is based on the edit-sensitive parsing, where our modification makes it possible to bound the number of nodes of a grammar tree deriving a certain part of the text, while being robust to context changes, i.e., two grammar trees containing a large portion of the same substring also have most nodes in common.
doilinkdblparxivmirrorslide
Dominik Köppl, Kunihiko Sadakane: Lempel–Ziv Computation in Compressed Space (LZ-CICS). Proc. DCC, pages 3–12. (2016) We adapt the algorithm of [c5] with focus on small alphabets by using the compressed suffix tree. The main obstacle for using the compressed suffix tree for efficient computation of the LZ77 and LZ78 parsing is the inability to perform fast string-depth queries, i.e., the length of a string a node in the suffix tree presents. Here, we provide a solution that emulates suffix links by taking two leaves of a node and compute their ψ-values, which takes us to two leaves whose lowest common ancestor would be connected by a suffix link of that node.The number of giant nodes can be in the worst case O(n) - O(lg n), not O(n/lg n) like proposed in the paper. This is not a problem since it is sufficient to store the exploration counters of the lowest giant nodes, which are O(n/ lg n) many. A giant node that is not the lowest giant node can adopt one exploration counter of one lowest giant node in its subtree until it got fully explored.doilinkdblparxivmanuscriptslide
Paweł Gawrychowski and
Tomohiro I and
Shunsuke Inenaga and
Dominik Köppl and
Florin Manea: Efficiently Finding All Maximal α-gapped Repeats. Proc. STACS, LIPIcs 47, pages 39:1–39:14. (2016) We propose the first algorithm computing all maximal α-gapped repeats in time linear in α and the text length. We additionally give upper bounds on the maximum number of all maximal α-gapped repeats a text of a certain length can have, which matches the asymptotic time bounds of the proposed algorithm, supporting that its time complexity is optimal in the worst case. The previously best known upper bound was quadratic in α. doilinkdblparxivmirror
Paweł Gawrychowski, Tomohiro I, Shunsuke Inenaga, Dominik Köppl, Florin Manea: Efficiently Finding All Maximal α-gapped Repeats. 28th Mini-Workshop "Theoretical Computer Science" at TuDo and RUB. (2015) Presentation of the result of [c7].linkslide
Dominik Köppl and
Tomohiro I: Arithmetics on Suffix Arrays of Fibonacci Words. Proc. WORDS, LNCS 9304, pages 135–146. (2015) We study the shapes of the suffix array, its inverse, and the Burrows-Wheeler transform of Fibonacci words and some of its relatives. We discover that the suffix array is an arithmetic progression for even Fibonacci words, and that the inverse suffix array is a rotation of it.doilinkdblpmanuscriptslide
Marcel Bargull, Kada Benadjemia, Benjamin Kramer, David Losch, Jens
Quedenfeld, Sven Schrinner, Jan Stricker, Dominik Köppl, Dominik Kopczynski, Henning Timm, Johannes Fischer, Sven Rahmann: Variant Tolerant Read Mapping with Locality Sensitive Hashing. German Conference on Bioinformatics (GCB). (2015) We present a read-mapper that uses local-sensitive hashing as a pre-filter and a semi-global alignment against a reference augmented variations for variant-tolerant read-mapping.
Most read-mappers work with a single reference genome such that they cannot take common variants into considerations.
Using a set of reference genomes of the same species makes it possible to tolerate variants with the hope to increase the accuracy of the read-mapping.
While our proposed pre-filter is fast, it produces many false positives, which we eliminate by a careful alignment step using Ukkonen's semi-global alignment algorithm.
doimirrorposter
Johannes Fischer and
Tomohiro I and
Dominik Köppl: Lempel Ziv Computation in Small Space (LZ-CISS). Proc. CPM, LNCS 9133, pages 172–184. (2015) We present linear-time LZ77 and LZ78 algorithms using a lightweight suffix array implementation whose largest component is the suffix array.
Compared to other linear-time algorithms computing LZ77 or LZ78, these algorithms have the lowest space requirements up so far when the alphabet size of the input text is not too small.
For that, we heavily relied on overwriting parts of our working space such that in the end we transformed the suffix array into an array storing the output of one of the factorizations.
doilinkdblparxivslide
Don S. Batory and
Peter Höfner and
Dominik Köppl and
Bernhard Möller and
Andreas Zelend: Structured Document Algebra in Action. Proc. Software, Services, and Systems, LNCS 8950, pages 291–311. (2015) We put software product lines into algebraic context and give examples with extensions to common programming languages to highlight where such modularization helps. Some programming languages like C have macros that can be used as variation points in program code, which can be filled afterwards, for instance with the inclusion of external files. Here, we provide an abstract framework for such kind of substitutions, and present an algebra for working with them.doilinkdblpmanuscript
Dominik Köppl: Dynamic Skyline Computation with the Skyline Breaker Algorithm. Local Proceedings of the Workshop on Massive Data. (2014) We present theoretical thoughts on how to use [c3] for the dynamic skyline problem. We prove that caching helps whenever subsequent queries are within a geometric region of a cached query. doimanuscript
Dominik Köppl: Breaking Skyline Computation Down to the Metal: the Skyline Breaker Algorithm. Proc. IDEAS, pages 132–141. (2013) We study the effect of LSD-trees for computing the Pareto front, and enhance this tree-based approach with heuristics to improve the execution time empirically.
The LSD-tree is a kd-tree whose leaves are enlarged to store buckets of arbitrary size. Because a bucket contains multiple elements before it gets split, the distribution of these elements can be taken into account to decide how to partition a full bucket into two. Such splitting effects the overall running time. We further parallelized our algorithm such that each thread maintains an LSD-tree but uses a synchronized list of output candidates for pre-filtering. doilinkdblpcodemirror
Florian Wenzel and
Dominik Köppl and
Werner Kießling: Interactive Toolbox for Spatial-Textual Preference Queries. Proc. SSTD, LNCS 8098, pages 462–466. (2013) We give a show case for graph database systems that can find Pareto-optimal answers to queries that use spatial or/and textual attributes. We annotated a part of the San-Francisco bay location with points of interests storing textual attributes, and provided a database system that could filter points of interests by specifications in both textual and the spatial domains. doilinkdblpmanuscript
Roland Glück and
Dominik Köppl and
Günther Wirsching: Computational Aspects of Ordered Integer Partition with Upper Bounds. Proc. SEA, LNCS 7933, pages 79–90. (2013) We devise an algorithm with fixed-parameter tractability in the dimension size for finding the number of integer partitions for a queried integer if this integer has to be split into at most n parts, where each part has a fixed upper bound. We observed that this computation involves the iterative computation of polynomial summations, for which we used Faulhaber's formula. Experimental results show that a naive algorithm based on recursion is quickly outperformed on five dimensions with reasonably large query numbers. doilinkdblp
Dominik Köppl: Pseudodifferentialoperatoren mit nichtglatten Koeffizienten auf Mannigfaltigkeiten. Diplomarbeit (German) at Universität Regensburg. (2012) We extended the analysis of Hitoshi Kumano-Go's work on pseudo-differential operators to the case of non-smooth components. We also define such pseudo-differential operators on non-smooth manifolds, and show under which properties the transition map for non-smooth pseudo-differential operators is well-defined.arxivmanuscript
Lectures
2025 Oct - 2026 Mar: CAN009 Human and Computer (course), University of Yamanashi
2024: Yamanashi Prefecture Science Promotion Section, 一次元データの効率的な処理手法の開発, Research Grant for Young Scholars Funded by Yamanashi Prefecture (grant number 2291), 7 months, at University of Yamanashi
2023: Japan Society for the Promotion of Science, Constructing Compressed Indexes for Biological Sequences, Grant-in-Aid for Transformative Research Areas (A) (grant number 23H04378), 2 years, at Tokyo Medical and Dental University
2023: LegalOn Technologies, 高速な自然言語処理システムのためのアルゴリズム設計, Research-Collaboration, 20 months, at Tokyo Medical and Dental University
2022: Japan Society for the Promotion of Science, 広義文字列のアルゴリズムと組合せ論, Grant-in-Aid for Scientific Research (B) (grant number 22H03551), 4 years, at Kyushu University
2020: German Academic Exchange Service - DAAD, Synergies between Enumeration Algorithms and Stringology, Internationale Forschungsaufenthalte für Informatikerinnen und Informatiker - IFI (grant number 57515245) -- declined, 2 years, at National Institute of Informatics
2018: Japan Society for the Promotion of Science, 文字列圧縮と組合せ論による大規模データ管理・処理技法の開発, Grant-in-Aid for JSPS Fellows (grant number 18F18120), 3 years, at Kyushu University
2016: Japan Society for the Promotion of Science, Visiting Prof. Inenaga at Kyushu University, Summer Program (grant number SP16305), 2 months, at Kyushu University
2017: Bioinformatics and Information Retrieval Data Structures Analysis and Design - BIRDS, Travel Support for WCTA'17, Marie Skłodowska-Curie Travel Grant (grant number 690941), 3 days, at Technical University Dortmund
2016: German Academic Exchange Service - DAAD, Summer internship of Sean Tohidi on compression algorithms, RISE Internship Hosting, 2 months, at Technical University Dortmund
2016: German Academic Exchange Service - DAAD, Travel Support for LATIN'16, Kongressreisenprogramm, 4 days, at Technical University Dortmund
2015: Studienwerk für Deutsch-Japanischen Kulturaustausch in NRW e.V., Visiting Prof. Sadakane at the University of Tokyo, Travel Grant, 2 weeks, at Technical University Dortmund
2015 Nov 25, giving the lecture Sortieren mit Gnome-Sort at Technical University Dortmund, part of the Studieninformationstag der Fakultät für Informatik mirrorslide
2013 Mar 31, giving the workshop Arbeiten mit Datenbanken at University of Augsburg, part of the Girls' Day photo
I am a permanent editor of the IWOCA Open Problem
List for strings/words/1-dimensional data. If you have an
interesting problem to share, please consider dropping me a mail!