
The computational pipeline used by the PRFdb performs a series of filtration steps:

  1. Genomic sequences are imported into the database. Each must have a defined start and stop position. Most of these sequences were provided by either the Saccharomyces Genome Database or the Mammalian Gene Collection. Sequences may also be imported from GenBank as long as they contain a 'CDS' feature with the start and stop positions defined.
  2. A simple pattern match is performed to find sequence windows which have slippery heptamers in the correct reading frame.
  3. Each remaining sequence window is scanned using rnamotif [1] for base patterns which have the potential to form a pseudoknot.
  4. Sequence windows are folded in silico using one or more minimum free energy secondary structure algorithms [2,3,4]. The output from these programs is saved.
  5. Each sequence window is then randomized using one or more randomization algorithms (shuffling, shuffling with maintenance of dinucleotides or amino acids, etc.). The resulting random sequences are refolded, and their minimum free energies are stored and used to generate a z-score for comparison against the minimum free energy of the initial folding.
  6. A minimum free energy (MFE) landscape may be generated for the entire sequence. To do so, multiple MFE folding algorithms are passed over the entire sequence in overlapping windows. The resulting minimum free energies are stored and used to generate a graph, on which the position of a given slippery heptamer may be plotted.
  7. All data is saved in a relational database.
  8. Sequences of interest may then be studied in the laboratory.
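
Steps 2, 5, and 6 above can be illustrated with a minimal Python sketch. It is not the PRFdb code, and it rests on several assumptions: the slippery heptamer is taken to be the commonly cited X XXY YYZ motif (the page does not define it), "correct reading frame" is taken to mean the heptamer starts at an index with index % 3 == 2 relative to the start codon, only plain mononucleotide shuffling is shown, and toy_energy is a hypothetical stand-in for a real folding program. All function names, window sizes, and defaults are illustrative.

```python
import random
import re
import statistics
from typing import Callable, List, Optional, Tuple

# Assumed motif X XXY YYZ: XXX is any triplet of identical bases,
# YYY is AAA or TTT, and Z is anything but G. Not defined by the source.
# The lookahead lets overlapping candidates be reported.
SLIPPERY = re.compile(r"(?=(([ACGT])\2\2(?:AAA|TTT)[ACT]))")


def find_slippery_heptamers(orf: str) -> List[Tuple[int, str]]:
    """Step 2: return (0-based index, heptamer) pairs in the assumed frame.

    Assumes orf begins at the start codon, so a heptamer read as
    X XXY YYZ in the zero frame starts where index % 3 == 2.
    """
    orf = orf.upper().replace("U", "T")
    return [(m.start(), m.group(1))
            for m in SLIPPERY.finditer(orf)
            if m.start() % 3 == 2]


def toy_energy(seq: str) -> float:
    """Hypothetical stand-in for a folding program (NOT a real MFE).

    Scores one unit of 'stability' per Watson-Crick pair between
    position i and its mirror position, like a perfect hairpin stem.
    """
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    count = sum((seq[i], seq[n - 1 - i]) in pairs for i in range(n // 2))
    return 0.0 - count


def z_score(window: str, energy: Callable[[str], float],
            n_random: int = 100, seed: int = 0) -> Optional[float]:
    """Step 5: compare the native energy with shuffled versions.

    Uses simple mononucleotide shuffling only. Returns None when the
    randomized energies have zero spread (z-score undefined).
    """
    rng = random.Random(seed)
    native = energy(window)
    randomized = []
    for _ in range(n_random):
        bases = list(window)
        rng.shuffle(bases)
        randomized.append(energy("".join(bases)))
    spread = statistics.pstdev(randomized)
    if spread == 0:
        return None
    return (native - statistics.mean(randomized)) / spread


def mfe_landscape(seq: str, energy: Callable[[str], float],
                  window: int = 8, step: int = 4) -> List[Tuple[int, float]]:
    """Step 6: energy of each overlapping window, keyed by window start."""
    return [(i, energy(seq[i:i + window]))
            for i in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    orf = "ATGGCTTTAAACGGCTAA"
    print(find_slippery_heptamers(orf))
    print(z_score("GGGGCCCC", toy_energy, n_random=200))
    print(mfe_landscape("GGGGCCCCAAAA", toy_energy))
```

In a real pipeline the energy argument would wrap a call to one of the folding programs cited in step 4. A strongly negative z-score suggests the native window is more stable than expected from its base composition alone.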

    1. T. Macke, D. Ecker, R. Gutell, D. Gautheret, D.A. Case and R. Sampath. RNAMotif -- A new RNA secondary structure definition and discovery algorithm. Nucl. Acids Res. 29, 4724-4735 (2001).
    2. Rivas E., Eddy S.R. A dynamic programming algorithm for RNA structure prediction including pseudoknots. J. Mol. Biol. 285, 2053-2068 (1999).
    3. Dirks R.M., Pierce N.A. A partition function algorithm for nucleic acid secondary structure including pseudoknots. J. Comput. Chem. 24, 1664-1677 (2003).
    4. Ren J., Rastegari B., Condon A., Hoos H.H. HotKnots: heuristic prediction of RNA secondary structures including pseudoknots. RNA 11, 1494-1504 (2005).