InterProScan Annotation

General

The InterPro annotation functionality in Blast2GO retrieves domain/motif information sequence by sequence. The corresponding GO terms are then transferred to the sequences and merged with any existing GO terms. InterProScan results can be viewed through the Single Sequence Menu (Figure 6) and saved in TXT and XML format (Figure 5). The sequences turn violet if no other analysis has been run on them before.

Figure 1: InterProScan options

  • Run InterProScan. Start sending sequences to the EBI.
  • Merge InterProScan GOs to Annotation. Add GO terms obtained through motifs/domains to the current annotations.
  • Remove InterProScan. Delete InterProScan results for the selected sequences.

Run InterProScan

There are two options to run InterProScan in Blast2GO, either with CloudIPS or via the public web service at EBI.

CloudIPS is a cloud-based Blast2GO PRO community resource for fast and reliable InterPro analysis for everything from small to big data-sets. It allows executing the original InterPro algorithms against up-to-date databases in our dedicated computing cloud. This is a high-performance, secure and cost-optimized solution for your analysis.
The public EMBL-EBI InterPro web service scans your sequences against InterPro's signatures. Its performance and results depend on the EBI web server.

InterProScan can only be run if the sequence table contains the actual sequence data (loaded from a FASTA file). Be careful if you created the project from a BLAST XML file or loaded a .annot file, since such a project may not contain the sequences themselves.
To add the sequences to the current Blast2GO project, see the section "Add sequences to existing Blast2GO project".

You can save the InterProScan results in different file formats: tab-separated values (TSV), XML (the default output), GFF3, and the input (query) sequence itself (Figure 5).
If you are working with nucleotide sequences, Blast2GO translates each of them to its longest open reading frame and sends the resulting protein to InterProScan. In this case, exporting the input sequence saves the protein sequence and not the nucleotide one.
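
The translation step can be illustrated with a small Python sketch. It is not Blast2GO's implementation: it assumes that "longest open reading frame" means the longest stop-free stretch over all six reading frames, with no start codon required and the standard genetic code. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf_protein(nucleotides):
    """Longest stop-free peptide over the six reading frames (assumed rule)."""
    seq = nucleotides.upper().replace("U", "T")
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                if len(segment) > len(best):
                    best = segment
    return best
```

Under a different ORF definition (for example, requiring an ATG start codon) the selected peptide can differ, so the sketch only shows the general idea of picking one protein per nucleotide sequence.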

Once InterProScan has finished, the results of each sequence can be viewed via the context menu (Figure 6).
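
Outside Blast2GO, a tab-separated export can also be inspected with a short script. The sketch below assumes the file follows the standard InterProScan 5 TSV column order (sequence name in column 1, analysis in column 4, signature accession in column 5, optional GO terms in column 14); whether the Blast2GO export matches this layout exactly is an assumption. The function name and the sample row are illustrative only.

```python
import csv
import io
import re
from collections import defaultdict

GO_PATTERN = re.compile(r"GO:\d{7}")


def read_ips_tsv(handle):
    """Collect signatures and GO terms per sequence from InterProScan-style TSV."""
    results = defaultdict(lambda: {"signatures": set(), "go": set()})
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or truncated lines
        entry = results[row[0]]
        entry["signatures"].add((row[3], row[4]))  # (database, signature ID)
        if len(row) > 13:
            # GO terms are "|"-separated; a regex also tolerates suffixes.
            entry["go"].update(GO_PATTERN.findall(row[13]))
    return dict(results)


# Illustrative row, not real output.
sample = "\t".join([
    "seq1", "d41d8cd98f00b204e9800998ecf8427e", "300", "Pfam", "PF00069",
    "Protein kinase domain", "10", "250", "1.0E-50", "T", "01-01-2020",
    "IPR000719", "Protein kinase domain", "GO:0004672|GO:0005524",
]) + "\n"

parsed = read_ips_tsv(io.StringIO(sample))
```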



Figure 2: InterProScan Configuration


Figure 3: Selection of Member Databases

Figure 4: Selection of Member Databases

Figure 5:  Save InterProScan Results



Figure 6: Show InterProScan Results 

Figure 7: InterProScan Results

Merge InterProScan GOs to Annotation

The GO terms obtained from InterProScan can now be added to the existing annotations that are based on the BLAST results. This option is available from the IPS submenu (little arrow).
Once the merge has finished, a distribution chart is displayed in the Results menu showing the number of GOs that have been added to (or confirmed in) the current annotation results.
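
Conceptually, the merge and its added/confirmed counts resemble a per-sequence set union. The following Python sketch shows that idea under simplifying assumptions: it treats GO terms as plain identifiers and does not consult the GO hierarchy, so it is not a description of Blast2GO's actual merge logic. The function name and example data are hypothetical.

```python
def merge_ips_gos(annotation, ips_gos):
    """Union InterProScan GO terms into existing annotations, per sequence.

    Both arguments map a sequence name to a collection of GO identifiers.
    Returns the merged mapping and counts of added and confirmed terms.
    """
    merged = {}
    added = 0
    confirmed = 0
    for seq in annotation.keys() | ips_gos.keys():
        existing = set(annotation.get(seq, ()))
        from_ips = set(ips_gos.get(seq, ()))
        confirmed += len(existing & from_ips)  # already annotated
        added += len(from_ips - existing)      # new through InterProScan
        merged[seq] = existing | from_ips
    return merged, {"added": added, "confirmed": confirmed}


# Illustrative data only.
annotation = {"seq1": {"GO:0005524"}, "seq2": set()}
ips_gos = {"seq1": {"GO:0005524", "GO:0004672"}, "seq3": {"GO:0006468"}}
merged, counts = merge_ips_gos(annotation, ips_gos)
```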




Figure 8: Merge InterProScan results


Figure 9: Statistics after merging InterProScan to GO Annotation

Statistics

On the submenu of the "Charts" icon, you can select InterProScan statistics to see how many sequences have or still lack IPS results and how many sequences have GOs resulting from InterProScan.

  • InterProScan Results: Chart reflecting the effect of adding the GO terms retrieved through the InterProScan results (Figure 11).
  • InterProScan Families Distribution: Bar chart representing the number of sequences that belong to each IPS family.
  • InterProScan Domains Distribution: Bar chart showing the number of sequences that belong to each IPS domain.
  • InterProScan Repeats Distribution: Bar chart reflecting the number of sequences that belong to each IPS repeat.
  • InterProScan Sites Distribution: Bar chart representing the number of sequences that belong to each IPS site.
  • InterProScan IDs Distribution: Bar chart showing the number of sequences that have been annotated with each InterProScan ID.
  • InterProScan IDs by Database: Pie chart reflecting the number of sequences per InterProScan ID for a selected InterProScan database. In Figure 10 the Pfam database is selected.
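
The distribution charts listed above are, in essence, counts of distinct sequences per identifier. A minimal Python sketch of that counting follows, assuming the hits are available as (sequence, database, signature ID) tuples. This input shape and the function names are hypothetical, and the sample hits are illustrative.

```python
from collections import Counter


def sequences_per_id(hits):
    """Count distinct sequences per signature ID (repeated hits count once)."""
    unique = {(seq, sig) for seq, _db, sig in hits}
    return Counter(sig for _seq, sig in unique)


def sequences_per_id_in_database(hits, database):
    """Same count, restricted to one member database such as Pfam."""
    return sequences_per_id(h for h in hits if h[1] == database)


# Illustrative hits only.
hits = [
    ("s1", "Pfam", "PF00069"),
    ("s1", "Pfam", "PF00069"),  # second match in the same sequence
    ("s2", "Pfam", "PF00069"),
    ("s2", "SMART", "SM00220"),
    ("s3", "Pfam", "PF07714"),
]
all_ids = sequences_per_id(hits)
pfam_ids = sequences_per_id_in_database(hits, "Pfam")
```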



Figure 10: InterProScan Statistics Configuration Window



Figure 11: InterProScan Statistics

Load InterProScan Results

The InterProScan results saved in XML format can be loaded in the current Blast2GO project (File > Load > Load InterProScan Results).

When loading the InterProScan results it is possible to select the input format.

  • Protein - If InterProScan was run inside Blast2GO (Blast2GO translates nucleotide sequences to their longest ORF peptides).
  • Nucleotides - If InterProScan was run on nucleotide sequences with the InterProScan binaries.




Figure 12: Load InterProScan Results