Prokaryotes acquire immunity against phages and other viruses through a widely conserved RNA-based gene silencing pathway. Fragments of the foreign DNA are first integrated into clusters of regularly interspaced short palindromic repeats (CRISPRs). During a new invasion, these fragments are excised from the CRISPR array as mature CRISPR RNAs (crRNAs), which target the viral DNA for degradation. A small hairpin structure in the repeat has been proposed to guide an endoribonuclease to the cleavage site, and there is recent evidence of a direct interaction between the stem-loop and Cas6, one of the many CRISPR-associated (Cas) proteins. Based on sequence similarity, Kunin et al. (2007) reported 12 major families of CRISPR repeats and claimed that only six of them exhibit the typical hairpin motif. We have revisited the structural properties of the CRISPR system on a genome-wide scale. Our results show that the hairpin motif is present in almost all families. Additionally, some sequences appear to be misplaced in that classification, and a few clusters ought to be subdivided into further families. Since repeats are able to form multiple alternative stem-loop structures, we have developed an approach to predict the functional hairpin of a single CRISPR array by folding the repeats within the context of all of the array's spacer sequences. We show that the most probable hairpin in such a context is not always the minimum free energy (MFE) structure.
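To illustrate the observation that a single repeat can form several alternative stem-loops, the following is a minimal sketch that enumerates candidate hairpins in a sequence by simple base-pair matching. It is not the method described above: it uses no thermodynamic energy model and does not fold repeats within their spacer context, for which an RNA folding package would be needed. The function names `find_hairpins` and `dot_bracket`, the thresholds `min_stem` and `min_loop`, and the example sequence are all hypothetical choices made for this illustration.

```python
# Watson-Crick pairs plus the G-U wobble pair (assumed pairing rules).
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a, b) in PAIRS


def find_hairpins(seq, min_stem=3, min_loop=3):
    """Enumerate perfect stem-loops (no bulges or internal loops).

    Returns a list of (start, end, stem_len, loop_len) tuples with
    0-based inclusive coordinates of the outermost base pair.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA input
    n = len(seq)
    hits = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if not can_pair(seq[i], seq[j]):
                continue
            # Skip (i, j) if a pair one step further in would still
            # close a loop of at least min_loop nucleotides; that
            # inner pair is then the start of the same stem.
            if (j - 1) - (i + 1) - 1 >= min_loop and can_pair(seq[i + 1], seq[j - 1]):
                continue
            # Extend the stem outward from the innermost pair.
            k = 1
            while i - k >= 0 and j + k < n and can_pair(seq[i - k], seq[j + k]):
                k += 1
            if k >= min_stem:
                hits.append((i - k + 1, j + k - 1, k, j - i - 1))
    return hits


def dot_bracket(length, hit):
    """Render one hairpin as a dot-bracket string of the given length."""
    start, end, stem_len, _ = hit
    chars = ["."] * length
    for k in range(stem_len):
        chars[start + k] = "("
        chars[end - k] = ")"
    return "".join(chars)


if __name__ == "__main__":
    toy = "GGGAAAACCC"  # toy sequence, not a real CRISPR repeat
    for hit in find_hairpins(toy):
        print(hit, dot_bracket(len(toy), hit))
```

Ranking such candidates by probability, as in the approach summarized above, would require base-pair probabilities from a folding engine computed on each repeat together with its flanking spacers; this sketch only lists the structural alternatives.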
Evolutionary conservation of sequence and secondary structures in CRISPR repeats
Kunin V, Sorek R, Hugenholtz P
Genome Biol. 2007; 8(4): R61