Introduction

The MycoLec database provides predicted lectin domains (reversible carbohydrate-binding sites) for all species proteomes (de novo translated genomes) available in the Mycocosm database from the JGI. To identify the lectin domains, MycoLec uses the UniLectin3D classes to generate conserved motifs of the lectin domains. A total of 107 lectin classes are available.

The following search options are available:
  1. Lectin fold and class
  2. Taxonomy (superkingdom, phylum and species)
  3. Field search with multiple criteria

Once selected, lectins (and their features) can be explored by accession number (NCBI AC and UniProt AC).

For each lectin, a detailed panel and a dedicated page are available, with the NCBI gene viewer and a representation of the lectin domain conservation compared to the reference.

1. Searching by lectin class
  The search can be performed by selecting a lectin class.
  
mycolec/tuto_pic/mycolec_tuto_2.PNG?v=564164

The bar chart represents the distribution of the predicted lectins by lectin class. Each class can be clicked to explore the corresponding predicted lectins.
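The distribution shown in the bar chart amounts to counting predicted lectins per class. A minimal Python sketch of that count, assuming a hypothetical list of records with a `lectin_class` field (the record layout and the class names are illustrative only, not the MycoLec schema):

```python
from collections import Counter

# Hypothetical records; field names and values are illustrative only.
predicted_lectins = [
    {"protein": "prot_A", "lectin_class": "class_1"},
    {"protein": "prot_B", "lectin_class": "class_2"},
    {"protein": "prot_C", "lectin_class": "class_1"},
]


def class_distribution(records):
    """Count predicted lectins per lectin class, most frequent first."""
    counts = Counter(record["lectin_class"] for record in records)
    return counts.most_common()


distribution = class_distribution(predicted_lectins)
for lectin_class, count in distribution:
    print(f"{lectin_class}: {count}")
```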

2. Searching by Taxonomy
  The search can be performed by selecting a superkingdom, kingdom, phylum, family or genus in the taxonomy sunburst.
  
mycolec/tuto_pic/mycolec_tuto_4a.PNG?v=564164

The species of the predicted lectins can be explored in the sunburst: the inner circle represents the superkingdom, the second circle the kingdom, the third circle the phylum, and the fourth circle the species group. Each section can be clicked to get further details.

mycolec/tuto_pic/mycolec_tuto_4b.PNG?v=564164

The tree viewer allows the taxonomy of the predicted lectins to be explored, with better insight into less represented branches. The nodes can be opened to access, in order, the superkingdom, kingdom, phylum, species group, and species. Each node label can be clicked to access further details.
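The sunburst and the tree viewer both present the same nested hierarchy. As a rough illustration of how such a hierarchy can be assembled, the Python sketch below nests lineages rank by rank and counts the predicted lectins under each node. The rank names follow the order given above; the lineage tuples, the placeholder group and species names, and the node layout are assumptions made for the example, not the MycoLec data model.

```python
# Rank order as described for the tree viewer.
RANKS = ["superkingdom", "kingdom", "phylum", "species_group", "species"]

# Hypothetical lineages, one per predicted lectin (placeholder group/species names).
lineages = [
    ("Eukaryota", "Fungi", "Ascomycota", "group_1", "species_a"),
    ("Eukaryota", "Fungi", "Ascomycota", "group_1", "species_b"),
    ("Eukaryota", "Fungi", "Basidiomycota", "group_2", "species_c"),
]


def new_node(rank):
    """Create an empty tree node for the given rank."""
    return {"rank": rank, "count": 0, "children": {}}


def build_tree(lineages):
    """Nest lineages into a tree; each node counts the lectins below it."""
    root = new_node("root")
    for lineage in lineages:
        node = root
        node["count"] += 1
        for rank, name in zip(RANKS, lineage):
            node = node["children"].setdefault(name, new_node(rank))
            node["count"] += 1
    return root


tree = build_tree(lineages)
fungi = tree["children"]["Eukaryota"]["children"]["Fungi"]
```

Opening a node in the viewer corresponds to reading its `children`; the per-node `count` is what makes small branches visible even when a sunburst sector would be too thin to click.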

3. Field search
  The search can be performed with multiple criteria on the advanced search page.
  
mycolec/tuto_pic/mycolec_tuto_5.PNG?v=564164

Predicted lectins can be explored based on selected criteria and are ordered by score. The available criteria are: the score threshold (0.25 by default, i.e. 25% similarity to the reference); the lectin class identified; keywords to exclude proteins based on their description (by default the keywords partial, synthetic and undefined); taxonomy; PFAM domains; RefSeq AC; protein name and description (to include); and the UniProt AC.

The pathogen species checkbox restricts the results to species contained in a predefined list of pathogenic species (based on the NIH pathogen species list).
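The combined effect of these criteria can be pictured as a single filter applied to each predicted lectin. The Python sketch below is only an illustration under stated assumptions: the record fields, the function name `matches`, and the placeholder pathogen set are hypothetical, and treating the threshold as inclusive (score at or above 0.25 is kept) is a guess rather than documented behaviour. Only the default threshold and the default excluded keywords are taken from the text above.

```python
DEFAULT_SCORE_THRESHOLD = 0.25  # default stated in the tutorial
DEFAULT_EXCLUDED_KEYWORDS = ("partial", "synthetic", "undefined")


def matches(record,
            min_score=DEFAULT_SCORE_THRESHOLD,
            excluded_keywords=DEFAULT_EXCLUDED_KEYWORDS,
            lectin_class=None,
            pathogens_only=False,
            pathogen_species=frozenset()):
    """Return True if a (hypothetical) record passes all selected criteria."""
    # Assumption: the threshold is inclusive.
    if record["score"] < min_score:
        return False
    description = record["description"].lower()
    if any(keyword in description for keyword in excluded_keywords):
        return False
    if lectin_class is not None and record["lectin_class"] != lectin_class:
        return False
    if pathogens_only and record["species"] not in pathogen_species:
        return False
    return True


# Hypothetical records; all values are placeholders.
records = [
    {"protein": "prot_A", "score": 0.80, "description": "lectin-like protein",
     "lectin_class": "class_1", "species": "species_a"},
    {"protein": "prot_B", "score": 0.10, "description": "lectin-like protein",
     "lectin_class": "class_1", "species": "species_a"},
    {"protein": "prot_C", "score": 0.60, "description": "hypothetical protein, partial",
     "lectin_class": "class_2", "species": "species_b"},
    {"protein": "prot_D", "score": 0.40, "description": "lectin domain protein",
     "lectin_class": "class_2", "species": "species_b"},
]

default_hits = [r["protein"] for r in records if matches(r)]
pathogen_hits = [
    r["protein"] for r in records
    if matches(r, pathogens_only=True, pathogen_species={"species_b"})
]
```

With the defaults, `prot_B` is dropped by the score threshold and `prot_C` by the keyword "partial"; enabling the pathogen option further narrows the result to species in the supplied set.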

Graphics give an overview of the distribution of the predicted lectins that match the criteria selected by the user. The MycoLec families distribution, taxonomy sunburst and tree are available, as on the homepage. The graphics can be clicked to access further details.

mycolec/tuto_pic/mycolec_tuto_7.PNG?v=564164

The predicted proteins matching the criteria are ordered by score, with 20 results displayed per page. For each predicted lectin, the following features are displayed: the protein name; the UniProt AC and RefSeq AC, which can be clicked to access the corresponding pages; the length of the protein; the species; the lectin class identified; the similarity score (against the reference); and the gene list with all chromosomes and positions encoding this protein. When multiple chromosomes are listed, they correspond to the different versions of the chromosome encoding this exact protein.
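The ordering and paging described above can be sketched in a few lines of Python. The page size of 20 comes from the text; the record layout, the function name `get_page`, 1-indexed page numbers, and the assumption that the highest score comes first are illustrative choices, not a description of the site's implementation.

```python
PAGE_SIZE = 20  # results per page, as stated in the tutorial


def get_page(records, page, page_size=PAGE_SIZE):
    """Return one 1-indexed page of records, highest score first (assumed order)."""
    ordered = sorted(records, key=lambda record: record["score"], reverse=True)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]


# 45 hypothetical records with scores 0.01 .. 0.45.
records = [{"protein": f"prot_{i}", "score": i / 100} for i in range(1, 46)]

first_page = get_page(records, 1)
last_page = get_page(records, 3)
```

Here the first page holds the 20 best-scoring records and the third page holds the remaining 5; requesting a page past the end simply returns an empty list.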

For each predicted lectin, a 2D sequence feature viewer shows the location of the predicted domains and of any PFAM domains, with a drag-and-drop control to zoom into the sequence. At the top, a button opens a new window with further details, including the gene viewer and the domain conservation viewer.
