
Datasets and software used in the following papers:

  1. SISAP 2012: Leonid Boytsov, Super-Linear Indices for Approximate Dictionary Searching. SISAP 2012: 162-176
  2. JEA ACM 2011: Leonid Boytsov. 2011. Indexing methods for approximate dictionary searching: Comparative analysis. J. Exp. Algorithmics 16, 1, Article 1 (May 2011).

You can download the articles on this page.
There are virtually no restrictions on using the software and data that I created (see details here). I would appreciate it if you cite my work.
However, the archives contain several third-party packages, which I used for comparison. These packages may be subject to different licenses and may be covered by patents (I believe that agrep and the underlying shift-and algorithm are patented). They include at least the following:

  1. G. Navarro's implementation of the lazy Levenshtein automaton (see the folder NavarroDFA).
  2. NR-grep (developed by G. Navarro).
  3. agrep (developed by Sun Wu and Udi Manber)
  4. FastSS

    Download and build instructions.

    The data and sources used in the JEA ACM 2011 paper are also available here (check the tab "Source Materials").

    1. Download and unpack the source file;
    2. Download the datasets to the source file directory;
    3. Check the README file for building/testing instructions.

    Source files.

    Data sets.