Computational systems biology department, VIGG RAS, Moscow

The Sequence Similarities by Markov Chain Monte Carlo (SeSiMCMC) algorithm finds DNA motifs of unknown length and complicated structure, such as direct repeats or palindromes with a variable spacer in the middle, in a set of unaligned DNA sequences. It uses an improved motif length estimator and a careful Bayesian analysis that allows for the absence of a site in a sequence.
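To make the two motif structures named above concrete, here is a minimal Python sketch. It is illustrative only and is not part of SeSiMCMC: the function names, the `half_len` parameter, and the example sites are all hypothetical, and exact matching stands in for the probabilistic model the software actually uses. A spaced palindrome is taken to be a site whose right half is the reverse complement of its left half; a spaced direct repeat is a site whose right half repeats its left half. In both cases the middle of the site is an unconstrained spacer.

```python
# Illustrative sketch, not SeSiMCMC code. All names here are hypothetical.

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def _halves(site: str, half_len: int):
    """Return the (left, right) half-sites, or None if they would overlap."""
    if half_len <= 0 or 2 * half_len > len(site):
        return None
    site = site.upper()
    return site[:half_len], site[-half_len:]


def is_spaced_palindrome(site: str, half_len: int) -> bool:
    """True if the right half is the reverse complement of the left half."""
    halves = _halves(site, half_len)
    return halves is not None and halves[1] == reverse_complement(halves[0])


def is_spaced_direct_repeat(site: str, half_len: int) -> bool:
    """True if the right half is an exact copy of the left half."""
    halves = _halves(site, half_len)
    return halves is not None and halves[0] == halves[1]


if __name__ == "__main__":
    # Made-up example sites: 5-base half-sites around a spacer.
    print(is_spaced_palindrome("TGTGACCGTAGTCACA", 5))
    print(is_spaced_direct_repeat("ACGTTGGGACGTT", 5))
```

Real binding sites match such a pattern only approximately, which is why a motif finder scores candidate sites against position-specific base frequencies instead of testing for exact symmetry as this sketch does.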

The algorithm, the software, and a description of its application are published in Bioinformatics, 2005, 21(10):2240-2245. Here are supplementary tables 1, 2, 3 and 4 for the paper.

A comparison of a set of motif identifiers, including our SeSiMCMC algorithm, has been published in Nature Biotechnology, 2005, 23(1):137-144.

SeSiMCMC is used in the DMMPMM pipeline. The pipeline is described in Bioinformatics, 2009, epub Jul 15.

The latest release of the software (version 4.36+) was made on 2021-06-17.

The source code is available here under the MIT licence. The options are explained in the readme file. Use GNU make to compile.

See also ChIPMunk, our motif discovery tool for huge datasets and ChIP-Seq data processing.

Thanks to Ludmila Danilova and to Anna Gerasimova for useful discussion and assistance with the data. Thanks to Elana Fertig for help in text editing. This research is partially supported by grants from the Howard Hughes Medical Institute (55000309 to M. Gelfand), Ludwig Cancer Research Institute (CRDB RBO-1268 to M. Gelfand), the Russian Foundation of Basic Research (02-04-49111 and 04-04-49601 to V. Makeev; 07-04-01623 to A. Favorov), the Russian Academy of Sciences (Programs 'Molecular and Cellular Biology', project #10, and 'Origin and Evolution of The Biosphere') and by RF State Contract #

Alexander Favorov.