seq_oligocount.pl


NAME

seq_oligocount.pl - Count oligos from an input sequence


VERSION

This documentation refers to program version $Rev: 606 $


SYNOPSIS

Usage

    seq_oligocount.pl -i InFile -o OutDir -d index.fasta
                      -n SeqName -k 20

Required Arguments

    -i,--infile   # Path to the input fasta file
    -d,--db       # Path to the mkvtree index file
    -o,--outdir   # Path to the base output directory
    -n,--name     # Name to assign to the sequence file
    -k,--kmer     # Oligomer query length


DESCRIPTION

The seq_oligocount program takes a query sequence, breaks it into subsequences (oligomers) of length k, and queries each of them against a persistent index created by the mkvtree program. It produces a GFF output file reporting, for every oligomer in the query sequence, the number of copies found in the subject index database.
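The counting step described above can be illustrated with a minimal Python sketch. It is not the program's implementation: the real program queries a mkvtree index through Vmatch, whereas this sketch uses an in-memory `Counter` as a stand-in for the index. The function names `kmers` and `count_oligos` are hypothetical, and the sketch assumes overlapping oligomers taken at every position on the forward strand only, which the description does not state.

```python
from collections import Counter


def kmers(seq, k):
    """Yield (start, kmer) for every overlapping window of length k (0-based start)."""
    for start in range(len(seq) - k + 1):
        yield start, seq[start:start + k]


def count_oligos(query, subject, k):
    """Return (start, kmer, copies_in_subject) for each k-mer of the query.

    Assumption: an in-memory tally replaces the persistent mkvtree index.
    """
    # Stand-in for the index: count every k-mer of the subject once.
    index = Counter(kmer for _, kmer in kmers(subject.upper(), k))
    # Look up each query k-mer; a Counter returns 0 for k-mers it has not seen.
    return [(start, kmer, index[kmer]) for start, kmer in kmers(query.upper(), k)]
```

Calling `count_oligos(query, subject, 20)` would mirror the default oligomer length of 20, returning one tuple per query position.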


REQUIRED ARGUMENTS

-i,--infile

Path of the input file.

-o,--outdir

Path to the base output directory.

-n,--name

Name to assign to the sequence file.

-d,--db

Path to the fasta file that was indexed with the mkvtree program.


OPTIONS

-k,--kmer

Length of the kmer to index. The default value of this variable is 20.

-l, --len

Length of the window used to summarize kmer index counts. Windows do not overlap, and the length can range from 1 to the length of the sequence.

-s,--seqname

The name of the sequence being annotated. This is the first column of data in the GFF output file.

--usage

Short overview of how to use program from command line.

--help

Show program usage with summary of options.

--version

Show program version.

--man

Show the full program manual. This uses the perldoc command to print the POD documentation for the program.

-q,--quiet

Run the program with minimal output.
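
The non-overlapping windows described for the -l,--len option can be illustrated with a short Python sketch. This is only a sketch under stated assumptions: the function name `window_summary` is hypothetical, and the mean is used as the summary statistic because the option description does not say which statistic the program reports.

```python
def window_summary(counts, win_len):
    """Summarize per-position counts in non-overlapping windows.

    Returns (start, end, mean_count) tuples with 1-based inclusive
    coordinates. Assumption: the mean is the summary statistic.
    """
    if win_len < 1 or win_len > len(counts):
        raise ValueError("win_len must be between 1 and the number of positions")
    rows = []
    # Step through the counts one full window at a time, so windows never overlap.
    for offset in range(0, len(counts), win_len):
        chunk = counts[offset:offset + win_len]  # the last window may be shorter
        rows.append((offset + 1, offset + len(chunk), sum(chunk) / len(chunk)))
    return rows
```

With a window length of 1 every position is reported on its own; with a window length equal to the number of positions the whole sequence collapses into a single row.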


DIAGNOSTICS

Error messages generated by this program and possible solutions are listed below.

ERROR: Could not create the output directory

The output directory could not be created at the path you specified. This could be because the parent directory in which you are trying to place the output directory does not exist, or because you do not have write permission for that parent directory.
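
The two causes listed above can be checked before running the program. The following Python sketch is a hypothetical helper, not part of seq_oligocount.pl; the name `ensure_outdir` and its messages are assumptions, and it assumes that only the final path component needs to be created.

```python
import os


def ensure_outdir(path):
    """Create the output directory if needed; return an error message or None."""
    if os.path.isdir(path):
        return None  # already exists, nothing to do
    # abspath drops any trailing slash, so dirname yields the true parent.
    parent = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(parent):
        return f"parent directory does not exist: {parent}"
    if not os.access(parent, os.W_OK):
        return f"no write permission for: {parent}"
    try:
        os.mkdir(path)  # create only the last component (assumption)
    except OSError as err:
        return f"could not create {path}: {err}"
    return None
```

A return value of `None` means the directory is ready to be passed to -o,--outdir; any other value names which of the two conditions failed.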


CONFIGURATION AND ENVIRONMENT

An external configuration file is not required for this program, and it does not make use of any variables set in the user's environment.


DEPENDENCIES

Required Software

Vmatch

This program requires the Vmatch package of programs ( http://www.vmatch.de ). This software is available at no cost for noncommercial academic use. See the program web page for details.

Required Perl Modules


BUGS AND LIMITATIONS

Bugs

Limitations


SEE ALSO

The seq_oligocount.pl program is part of the DAWG-PAWS package of genome annotation programs. See the DAWG-PAWS web page ( http://dawgpaws.sourceforge.net/ ) or the Sourceforge project page ( http://sourceforge.net/projects/dawgpaws ) for additional information about this package.


REFERENCE

A manuscript is being submitted describing the DAWGPAWS program. Until this manuscript is published, please refer to the DAWGPAWS SourceForge website when describing your use of this program:

JC Estill and JL Bennetzen. 2009. The DAWGPAWS Pipeline for the Annotation of Genes and Transposable Elements in Plant Genomes. http://dawgpaws.sourceforge.net/


LICENSE

GNU General Public License, Version 3

http://www.gnu.org/licenses/gpl.html

THIS SOFTWARE COMES AS IS, WITHOUT ANY EXPRESS OR IMPLIED WARRANTY. USE AT YOUR OWN RISK.


AUTHOR

James C. Estill <JamesEstill at gmail.com>


HISTORY

STARTED: 10/11/2007

UPDATED: 03/24/2009

VERSION: $Rev: 606 $
