Test instances: Patterns
We compiled a collection of patterns for approximate pattern matching. For every test instance we prepared pattern sets for searching under Hamming distance or edit distance, with varying pattern lengths and varying degrees of perturbation. We extracted the patterns from the underlying text at random positions and then introduced perturbations, using our tool generate_patterns. Each file contains 1000 entries, separated by a dollar sign followed by a line break: $\n.
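As an illustration of the file layout described above, the following minimal Python sketch splits the contents of a pattern file into its entries. The function names split_patterns and read_patterns are hypothetical and not part of generate_patterns. The sketch assumes that no pattern itself contains the sequence $\n, and it accepts files with or without a separator after the last entry, since the description does not say which applies. Reading bytes avoids any assumption about the text encoding.

```python
from pathlib import Path

SEPARATOR = b"$\n"  # dollar sign followed by a line break, as described above


def split_patterns(data: bytes) -> list[bytes]:
    """Split raw pattern-file contents into individual entries."""
    entries = data.split(SEPARATOR)
    # If the file ends with a separator, split() yields a trailing
    # empty entry; drop it (assumption: no pattern is empty).
    if entries and entries[-1] == b"":
        entries.pop()
    return entries


def read_patterns(path: str) -> list[bytes]:
    """Read a pattern file from disk (hypothetical helper)."""
    return split_patterns(Path(path).read_bytes())
```

Because a pattern taken from a natural-language text may itself contain line breaks, splitting on the two-byte separator rather than on line breaks alone keeps such patterns intact. For a downloaded file, len(read_patterns(path)) should be 1000.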
It is also possible to download all pattern files at once. The filenames are composed as follows: patterns_<TEXT-NAME>_<PATTERN-LENGTH>_<DISTANCE-MEASURE>_<TOLERANCE>.txt.
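When working with the full download, the naming scheme can be decoded programmatically. The sketch below is one possible approach in Python; the names PatternFileName and parse_pattern_filename are hypothetical. It assumes that the pattern length is a decimal integer and that neither the distance measure nor the tolerance contains an underscore, so the name is split from the right and the text name may itself contain underscores. The distance measure and tolerance are kept as strings because their exact spelling is not specified here.

```python
from typing import NamedTuple


class PatternFileName(NamedTuple):
    text_name: str
    pattern_length: int
    distance_measure: str  # kept as written in the filename
    tolerance: str         # kept as a string; its format is not assumed


def parse_pattern_filename(filename: str) -> PatternFileName:
    """Decode patterns_<TEXT-NAME>_<PATTERN-LENGTH>_<DISTANCE-MEASURE>_<TOLERANCE>.txt."""
    prefix, suffix = "patterns_", ".txt"
    if not (filename.startswith(prefix) and filename.endswith(suffix)):
        raise ValueError(f"not a pattern file name: {filename!r}")
    stem = filename[len(prefix):-len(suffix)]
    # Split from the right so underscores inside the text name survive.
    parts = stem.rsplit("_", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected number of fields: {filename!r}")
    text_name, length, measure, tolerance = parts
    return PatternFileName(text_name, int(length), measure, tolerance)
```

For a made-up name such as patterns_my_text_20_hamming_2.txt, this yields the text name my_text, length 20, measure hamming, and tolerance 2; the field values in that example are illustrative only and may not match the labels used in the actual files.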