Thanks to advances in next-generation sequencing methods, complete genomic sequences are becoming more and more abundant. Most frequently, after a first genome assembly, the consensus sequence is used for structural annotation. Among other things, this type of annotation provides information about gene locations. Gene finding is an essential step preceding the functional characterisation of genomic elements and fits perfectly into the Blast2GO workflow.
There are basically two types of gene finding: 'ab initio' and 'hint-based'. The 'ab initio' approach needs only the DNA sequence data; using hidden Markov models (HMMs), it builds probabilistic models that predict the positions of potential genes.
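To make the HMM idea more concrete, the following is a minimal sketch of Viterbi decoding with a toy two-state model ("noncoding" and "coding"). It is not the model used by Augustus or Glimmer, whose gene models are far richer; the state names, all probabilities, and the helper names viterbi and coding_regions are assumptions invented for illustration.

```python
import math

# Toy two-state HMM. Every number below is made up for illustration;
# real gene finders train far more detailed models on annotated genomes.
STATES = ("noncoding", "coding")
START = {"noncoding": 0.9, "coding": 0.1}
TRANS = {
    "noncoding": {"noncoding": 0.9, "coding": 0.1},
    "coding": {"noncoding": 0.1, "coding": 0.9},
}
EMIT = {
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
}


def viterbi(seq):
    """Return the most probable state path for seq (Viterbi in log space)."""
    if not seq:
        return []
    scores = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for base in seq[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for being in state s at this position.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][base])
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def coding_regions(path):
    """Collapse a state path into 1-based, inclusive (start, end) intervals."""
    regions, start = [], None
    for i, state in enumerate(path, start=1):
        if state == "coding" and start is None:
            start = i
        elif state != "coding" and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(path)))
    return regions


if __name__ == "__main__":
    seq = "AT" * 5 + "GC" * 8 + "AT" * 5
    print(coding_regions(viterbi(seq)))  # [(11, 26)]
```

With these toy parameters, the GC-rich stretch in the middle of the example sequence is labelled as coding, while the AT-rich flanks stay noncoding. Because the decision rests only on sequence statistics, any region that merely resembles coding sequence is also reported.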
In all 'ab initio' gene prediction approaches, the number of genes is overestimated, i.e. 'ab initio' methods accept a higher number of false positives in order to minimize the number of false negatives.
With the gene finding option, Blast2GO provides an easy and fast way to locate genes in your prokaryotic or eukaryotic query genome using the 'ab initio' methodology, without the need for RNA-seq data, and to directly obtain a fully exportable Blast2GO project and GFF files.
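As an illustration of how exported GFF files can be processed further, the sketch below reads the gene features from GFF text. It relies only on the standard nine tab-separated GFF columns (seqid, source, type, start, end, score, strand, phase, attributes); the sample records, the Gene type and the parse_gff_genes helper are hypothetical, and the exact feature types and attributes in a Blast2GO export may differ.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    seqid: str
    start: int  # 1-based, inclusive (GFF convention)
    end: int    # 1-based, inclusive
    strand: str

    @property
    def length(self):
        return self.end - self.start + 1


def parse_gff_genes(text):
    """Return a Gene for every 'gene' feature found in GFF-formatted text."""
    genes = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed feature line
        seqid, _source, ftype, start, end, _score, strand, _phase, _attrs = fields
        if ftype == "gene":
            genes.append(Gene(seqid, int(start), int(end), strand))
    return genes


# Invented sample records, for illustration only.
SAMPLE_GFF = "\n".join([
    "##gff-version 3",
    "\t".join(["contig_1", "AUGUSTUS", "gene", "1200", "2450", "0.92", "+", ".", "ID=g1"]),
    "\t".join(["contig_1", "AUGUSTUS", "CDS", "1200", "1500", "0.92", "+", "0", "ID=g1.cds1;Parent=g1"]),
    "\t".join(["contig_2", "AUGUSTUS", "gene", "300", "980", "0.71", "-", ".", "ID=g2"]),
])

if __name__ == "__main__":
    for gene in parse_gff_genes(SAMPLE_GFF):
        print(gene.seqid, gene.start, gene.end, gene.strand, gene.length)
```

Counting or filtering the parsed genes, for example by length, is a simple way to get a first impression of a prediction before functional annotation.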
This has been achieved by integrating the Augustus and Glimmer algorithms within Blast2GO. The algorithms are executed via the Blast2GO Service Cloud, which allows the different algorithms and tools to be run efficiently, platform-independently and without excessive memory consumption on the client side.
Figure 1: Gene Finding options