# Source code and tools

Our index structures and algorithms for approximate pattern matching were implemented in the software library . The source code is available on GitHub.

We also designed and implemented several tools for generating and analyzing test data for approximate pattern matching:

| Tool | Description | Links |
| --- | --- | --- |
| strip_headers | Preprocesses text files from Project Gutenberg, for example by stripping the headers and footers. | documentation, source, Win32, Mac |
| file_statistics | Efficiently computes text statistics for a given file, including the text length, alphabet size, number of distinct q-grams, and empirical entropy. | documentation, source, Win32, Mac |
| generate_patterns | Extracts substrings from a text and generates a set of search patterns for approximate pattern matching. | documentation, source, Win32, Mac |
| tt-analyze | Efficiently calculates statistical properties of texts and estimates parameters for the probability models implemented by tt-generate. | documentation, source, Win32, Mac |
| tt-generate | Generates random texts using different models (such as a Markov chain, a discrete autoregressive process, a uniform distribution, or a Fibonacci word). | documentation, Win32, Mac |
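
As a rough illustration of the quantities reported by file_statistics, the following Python sketch computes the text length, alphabet size, number of distinct q-grams, and empirical entropy of a string. It is not the tool's implementation: the function name `text_statistics`, the default `q = 2`, and the choice of the zeroth-order entropy in bits per symbol are assumptions made for this example.

```python
import math
from collections import Counter


def text_statistics(text: str, q: int = 2) -> dict:
    """Compute simple statistics of a text (illustrative sketch only)."""
    n = len(text)
    counts = Counter(text)  # frequency of every symbol in the text

    # Zeroth-order empirical entropy in bits per symbol (assumed definition):
    # H0 = -sum over symbols c of (n_c / n) * log2(n_c / n)
    entropy = 0.0
    for count in counts.values():
        p = count / n
        entropy -= p * math.log2(p)

    # Set of all distinct substrings of length q (empty if the text is shorter than q)
    qgrams = {text[i:i + q] for i in range(n - q + 1)}

    return {
        "length": n,
        "alphabet_size": len(counts),
        "distinct_qgrams": len(qgrams),
        "entropy_bits": entropy,
    }


if __name__ == "__main__":
    print(text_statistics("abracadabra", q=2))
```

This sketch stores every q-gram in a hash set, which is simple but memory-hungry for long texts; an efficient tool would likely use a more compact approach.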

(Deprecated: The source code of the dissertation project of Johannes Krugel is available for download as a zip file and has to be included as a sandbox project in the folder `sandbox/tum`.)
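
To make the interplay of tt-analyze and tt-generate more concrete, the sketch below estimates the parameters of a Markov chain from a sample text and then generates a random text from them. It is only an assumption-laden illustration, not the code of either tool: the chain order (1), the function names `estimate_transitions` and `generate_text`, and the restart rule for symbols without observed successors are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def estimate_transitions(text):
    """Count how often each symbol is followed by each other symbol (order 1)."""
    transitions = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        transitions[current][following] += 1
    return transitions


def generate_text(transitions, start, length, seed=None):
    """Generate `length` symbols by a random walk over the estimated chain."""
    if length <= 0:
        return ""
    if not transitions:
        raise ValueError("the model contains no transitions")
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    out = [start]
    while len(out) < length:
        followers = transitions.get(out[-1])
        if not followers:
            # Dead end (symbol never seen with a successor):
            # restart from a random symbol that has successors.
            out.append(rng.choice(sorted(transitions)))
            continue
        symbols = list(followers)
        weights = [followers[s] for s in symbols]
        # Draw the next symbol proportionally to its observed frequency
        out.append(rng.choices(symbols, weights=weights)[0])
    return "".join(out)


if __name__ == "__main__":
    model = estimate_transitions("abracadabra")
    print(generate_text(model, start="a", length=20, seed=42))
```

Every pair of adjacent symbols in the generated text also occurs in the sample text, unless a dead-end restart intervenes, so the output imitates the local structure of the sample.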