VERSION

This documentation refers to batch_genemark.pl version $Rev: 555 $

SYNOPSIS

Usage

    batch_genemark.pl -i DirToProcess -o OutDir

Required Arguments

    -i, --indir    # Directory of fasta files to process
    -o, --outdir   # Path to the base output directory
    -c, --config   # Path to the config file

DESCRIPTION

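Run the GeneMarkHMM gene prediction program in batch mode. The program runs GeneMark on each input sequence and converts the output to GFF format. A config file option (-c) is listed for specifying the libraries to use, but the program does not currently read a configuration file (see CONFIGURATION AND ENVIRONMENT).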

REQUIRED ARGUMENTS

-i,--indir: Path of the directory containing the sequences to process.
-o,--outdir: Path of the directory to place the program output.
-c,--config: Path to the config file. Currently this program does NOT make use of a configuration file. This will be a fairly easy thing to add and is on my TODO list.

OPTIONS

--genemark-dir: Directory that contains the GeneMark.hmm binaries. This can also be set with the environment variable GM_BIN_DIR.
--lib-dir: The full path to the directory that contains the model libraries for GeneMarkHMM. This can also be set with the environment variable GM_LIB_DIR.
--usage: Short overview of how to use program from command line.
--help: Show program usage with summary of options.
--version: Show program version.
--man: Show the full program manual. This uses the perldoc command to print the POD documentation for the program.
-q,--quiet: Run the program with minimal output.
--test: Run the program without executing the system commands.
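
The options above can be combined on one command line. The following is a minimal sketch, assuming hypothetical paths (/data/fasta, /data/genemark_out and /opt/genemark are placeholders, not locations used by the program). Following the usage line, -c is left out because the configuration file is not currently used. The command is only printed, so it can be reviewed before it is run.

```shell
# Hypothetical locations; replace them with your own paths.
in_dir="/data/fasta"
out_dir="/data/genemark_out"
gm_dir="/opt/genemark"

# Build the argument list from the documented options.
# --test asks the program to skip the system commands (a dry run).
set -- -i "$in_dir" -o "$out_dir" \
    --genemark-dir "$gm_dir" --lib-dir "$gm_dir" --test

# Print the command for review; remove "echo" to actually run it.
echo batch_genemark.pl "$@"
```

Keeping the arguments in the positional parameters and expanding them as "$@" preserves paths that contain spaces.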

DIAGNOSTICS

Error messages generated by this program and possible solutions are listed below.

ERROR: No fasta files were found in the input directory: The input directory does not contain fasta files in the expected format. This could happen because you gave an incorrect path or because your sequence files do not have the expected *.fasta extension in the file name.
ERROR: Could not create the output directory: The output directory could not be created at the path you specified. This could be due to the fact that the directory in which you are trying to place your base directory does not exist, or because you do not have write permission to the directory in which you want to place your files.
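
The conditions behind these two errors can be checked before starting a long batch run. The following is a minimal shell sketch and is not part of batch_genemark.pl: the helper names count_fasta and can_create_outdir are hypothetical, and the *.fasta extension is taken from the message above.

```shell
# Count the files with a .fasta extension directly inside a directory.
# count_fasta is a hypothetical helper, not part of batch_genemark.pl.
count_fasta() {
    n=0
    for f in "$1"/*.fasta; do
        # When nothing matches, the pattern stays literal, so test for a real file.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Succeed if the output directory exists and is writable, or if its
# parent exists and is writable so that the directory can be created.
can_create_outdir() {
    if [ -d "$1" ]; then
        [ -w "$1" ]
    else
        parent=$(dirname "$1")
        [ -d "$parent" ] && [ -w "$parent" ]
    fi
}

# Example use with hypothetical paths:
#   [ "$(count_fasta /data/fasta)" -gt 0 ] || echo "no fasta files found"
#   can_create_outdir /data/genemark_out || echo "cannot create output dir"
```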

CONFIGURATION AND ENVIRONMENT

Configuration File

The batch_genemark.pl program does not currently make use of a configuration file.

User Environment

This program makes use of the following variables defined in the user's environment.

GM_BIN_DIR: Directory that contains the GeneMark.hmm binaries.
GM_LIB_DIR: The full path to the directory that contains the model libraries for GeneMarkHMM.

The following example illustrates the ENV options set in the bash shell.

    export GM_BIN_DIR="$HOME/apps/GenMark/genemark_hmm_euk.linux/"
    export GM_LIB_DIR="$HOME/apps/GenMark/genemark_hmm_euk.linux/"
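
Before a batch run it can help to confirm that both variables are set and point at existing directories. The sketch below is an illustration only; check_gm_env is a hypothetical helper name and is not part of batch_genemark.pl.

```shell
# Report whether GM_BIN_DIR and GM_LIB_DIR are set and name existing directories.
# check_gm_env is a hypothetical helper, not part of batch_genemark.pl.
check_gm_env() {
    status=0
    for var in GM_BIN_DIR GM_LIB_DIR; do
        # Look up the value of the variable whose name is stored in $var.
        eval "val=\${$var:-}"
        if [ -z "$val" ]; then
            echo "$var is not set"
            status=1
        elif [ ! -d "$val" ]; then
            echo "$var is not a directory: $val"
            status=1
        fi
    done
    return "$status"
}
```

Calling check_gm_env prints one line per problem and returns a non-zero status if either variable is missing or does not name a directory.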

DEPENDENCIES

Required Software

GeneMark.HMM
The GeneMark.HMM program is available for non-commercial academic use for a limited time by applying for a license at: http://opal.biology.gatech.edu/GeneMark

Required Perl Modules

File::Copy
This module is required to copy the output results.
Getopt::Long
This module is required to accept options at the command line.
Bio::Tools::Genemark
This module is required to parse the results from the Genemark program. The module is part of the BioPerl package: http://www.bioperl.org

BUGS AND LIMITATIONS

Bugs

No bugs currently known
If you find a bug with this software, file a bug report on the DAWG-PAWS Sourceforge website: http://sourceforge.net/tracker/?group_id=204962

Limitations

Limited gene models supported
This program is currently limited to using the gene models that are relevant to wheat annotation (Rice|Maize|Wheat|Barley). I will be adding a config file option that will allow multiple gene models to be used.

REFERENCE

A manuscript is being submitted describing the DAWGPAWS program. Until this manuscript is published, please refer to the DAWGPAWS SourceForge website when describing your use of this program:

JC Estill and JL Bennetzen. 2009. The DAWGPAWS Pipeline for the Annotation of Genes and Transposable Elements in Plant Genomes. http://dawgpaws.sourceforge.net/

LICENSE

GNU GENERAL PUBLIC LICENSE, VERSION 3

http://www.gnu.org/licenses/gpl.html

THIS SOFTWARE COMES AS IS, WITHOUT ANY EXPRESS OR IMPLIED WARRANTY. USE AT YOUR OWN RISK.

AUTHOR

James C. Estill <JamesEstill at gmail.com>

HISTORY

STARTED: 11/09/2007

UPDATED: 03/24/2009

VERSION: $Rev: 555 $
