Prokaryotic Genome Analysis Tool (PGAT) Tutorial

Select a PGAT

  1. From the PGAT home page (http://tools.uwgenomics.org/pgat), select the Demo tool for Acinetobacter baumannii to begin this tutorial. This is a public version of the tool, so you will not need to log in. Any changes you make in the tool will be reset daily. You may also see updates made by other users who have used the tool since the last reset.
  2. Note the banner at the top of the page indicating that you are in the Acinetobacter PGAT Demo tool.

Select Strains to Compare

  1. The front page lists a summary of the genomes loaded in the tool database. Note the status column on the right-hand side. Genomes marked "complete" have been fully sequenced, annotated, and submitted to GenBank. Other genomes may be designated "draft assembly", indicating that they have not yet been fully assembled and/or published to GenBank (partial assembly of shotgun sequence). You will notice that many of the draft genomes consist of contigs rather than full chromosomes.
  2. Uncheck all genomes by clicking the box in the header row at the top left of the genomes table. Select all complete baumannii genomes by clicking the complete checkbox above the genomes table. Note the genomes that are now selected. Click the Update button. The genomes that were not selected should now be grayed out. Any further analyses and comparisons performed within the tool will now only be performed on the selected genomes.

Gene search

  1. Click the Search Options link in the top menu bar.
  2. Click the Search by Keyword tab.
  3. In the Enter Search Text box, type "secretion".
  4. Click the Search button.
  5. A results list of 132 genes should appear.
  6. Click on the FASTA aa file to generate a FASTA file of amino acid sequences for the list of genes in a new browser window. To save the FASTA file, in the browser menu, click File | Save As, select to Save As a Text File, rename as desired (e.g. secretion.faa), and then click Save.
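
Once the file is saved, you may want to confirm that it holds the expected number of sequences. The following is a minimal Python sketch of a FASTA reader that needs no extra libraries. The file name secretion.faa comes from the step above; the function name read_fasta and the two sample records are illustrative assumptions, and the headers in a real PGAT export may look different.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading ">"
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequences may span several lines
    return records


# Hypothetical two-record example (not real PGAT output).
sample = """>gene_A example protein
MKKLLA
GTVV
>gene_B another protein
MSTN
"""
records = read_fasta(sample.splitlines())
print(len(records), "sequences")

# To check a saved export instead (file name from the step above):
# with open("secretion.faa") as handle:
#     print(len(read_fasta(handle)), "sequences")
```

For the keyword search above, the count printed for the saved file should match the number of genes in the results list.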

Save search

  1. Deselect all checkboxes in the gene list by selecting the checkbox in the grey header row at the top of the gene list.
  2. Re-select the first 5 checkboxes to select the first 5 genes.
  3. Scroll to the bottom of the page.
  4. In the Enter Search Name box, type "5 secretion genes - Your name".
  5. Click the Save button.
  6. After viewing the details of your saved search, click on the Saved Searches tab. You should see your search listed.
  7. Other users will not see your search listed, and you will not be able to see other users' searches.

Ortholog tables

  1. Click on the Orthologs tab.
  2. Click on the checkbox in the header row of the Genome table to select all genomes.
  3. Make sure that the checkbox Only include reference genes from the following selected saved queries is checked.
  4. Make sure that the search you just saved is checked.
  5. Click the Submit button.
  6. The ortholog table that is displayed shows, for each gene, the genomes in which it is present or absent.

Reference genes

A "reference gene" is one of the orthologs that was selected to represent the group of orthologs.

From each set of orthologs, one of the members (or the only member) is designated as the "reference" for the group. The collection of all reference genes represents the pan-genome for the species. The set of reference genes is processed through various bioinformatics prediction routines (e.g. COG, Pfam, Prosite, PsortB, BLAST, etc.) to further develop the annotation for these genes. Reference genes are generally chosen from a manually curated genome (if available), and the member of the group designated as the reference can be changed in this tool.

  1. In the ortholog table, click the first gene pCP001182_005683.
  2. Inspect the graphic at the top of the page displaying the 6-frame translation.
  3. View the Annotation Summary.

Update Annotation

Annotators can update a gene's start site, name, gene name aliases, description, or KO number. A note about a gene can also be added. If the start site for a reference gene is modified, all ortholog start sites can be automatically updated by the same offset, or ortholog start sites can be assigned individually. In addition, annotators can also update gene mappings, such as removing reference genes, unmapping orthologs, or reassigning a gene as a pseudogene. All annotation updates must be approved by a curator before the change is committed to the database. In this part of the tutorial, you will update the description for a gene and approve the modification as a curator.

  1. In the Annotation Summary section of the gene page, click the description "general secretion pathway protein G".
  2. This should jump to the Update Reference Gene Annotation section of the reference gene.
  3. Under the Description section, in the Description box, type "Test update description". In the Comment box, type "Test".
  4. Click the Update button.
  5. Note that the annotation update is added to the Description History table with the status "under review".
  6. To approve the update, scroll back to the top of the page, and click the Curators Only link in the top navigation bar.
  7. Under the heading Reference Gene Annotation Proposals, you should see the update you just made in yellow. Click the blue approve button in the Review column at the right side of the table.
  8. You should see that the annotation modification turns brown and its status now reads "most recently approved". Click the POSON number link at the left to return to the gene page.
  9. Note that the new description now appears in the Annotation Summary for the gene.
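
The introduction to this section mentions that a change to a reference gene's start site can be applied to all orthologs by the same offset. The arithmetic behind that idea can be sketched in Python as follows. This is an illustration only: the Gene class, the function names, the coordinates, and the choice to measure the offset along the direction of transcription (so that minus-strand genes move the opposite way in genome coordinates) are all assumptions, not a description of how PGAT is implemented.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # genome coordinate of the first base of the start codon
    strand: str  # "+" or "-"


def shift_start(gene: Gene, offset: int) -> Gene:
    """Move the start site `offset` bases along the direction of transcription."""
    step = offset if gene.strand == "+" else -offset
    return replace(gene, start=gene.start + step)


def propagate(reference: Gene, new_start: int,
              orthologs: List[Gene]) -> Tuple[Gene, List[Gene]]:
    """Apply the reference gene's start-site change to every ortholog."""
    if reference.strand == "+":
        offset = new_start - reference.start
    else:
        offset = reference.start - new_start
    return shift_start(reference, offset), [shift_start(o, offset) for o in orthologs]


# Hypothetical genes and coordinates.
reference = Gene("ref", 1000, "+")
orthologs = [Gene("ortho_1", 5000, "+"), Gene("ortho_2", 9000, "-")]
new_reference, new_orthologs = propagate(reference, 1030, orthologs)
for gene in [new_reference] + new_orthologs:
    print(gene.name, gene.start, gene.strand)
```

Assigning ortholog start sites individually, the other option described above, would simply skip the shared offset and set each start coordinate by hand.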

Mapped genes

  1. To view all orthologs for the current gene, find the Genomic Comparison section. Under View strain comparison, click the Ortholog table link.
  2. A new browser window should open listing all the Mapped POSONs (orthologs).
  3. Click the second POSON in the list, pCP002522_005253, to view the details for the mapped gene in Ab TCDC-AB0715.
  4. Note again the 6-frame translation graphic, but this time with the gspG bar colored gray, indicating a mapped gene (not reference).
  5. Note that the Annotation Summary box is split into two sections for mapped genes.
  6. Scroll down to the Bioinformatics Predictions section. The results shown are based on predictions made on the reference gene sequence (section is highlighted in blue to emphasize this).

Synteny Map

  1. Scroll back up the page and find the Genomic Comparison section in the Pan Genome Reference Gene box.
  2. Under View strain comparison, select the Synteny Map link.
  3. A new browser window/tab should open displaying a graphic for comparing the local genomic region around each ortholog. The graphic helps identify synteny among the Acinetobacter genomes.
  4. Use the embedded scrollbar to scroll through the genomes and facilitate direct comparison with the reference genome.
  5. Mouseover any gene to see additional information.
  6. Use the Zoom tool to modify genomic region size.
  7. Use the Image Width tool to control the image size to best suit the size/resolution of your monitor.
  8. Click any gene in a mapped genome to jump to its detail page. Click on a gene in the reference (top) genome to recenter the view on that gene.
  9. Close the browser window/tab displaying the synteny image.

SNPs

  1. Back in the Genomic Comparison section on the POSON detail page, find the View strain comparison section.
  2. Click on the SNPs link to view sequence polymorphisms. Each entry shows the base at the SNP position followed, in parentheses, by the translation of the codon in which the SNP occurs, for example C(D).
  3. The last column indicates whether a non-synonymous mutation occurs in any of the orthologs (seen by comparing the translations in parentheses).
  4. Close the browser window to return to the gene page.
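
The comparison of translations described in the steps above can be reproduced with a short Python sketch. It builds the standard codon table (the amino acid assignments are the same in the bacterial translation table 11) and labels a codon change as synonymous or non-synonymous. The function names and the example codons are illustrative assumptions; PGAT's own code is not shown here.

```python
from itertools import product

# Standard genetic code in NCBI order: bases T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon: str, alt_codon: str):
    """Return (reference amino acid, alternate amino acid, kind of change)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


def snp_label(codon: str, position: int) -> str:
    """Format a SNP as base(translation), e.g. C(D), for a 0-based codon position."""
    return f"{codon[position].upper()}({CODON_TABLE[codon.upper()]})"


# Hypothetical SNP at the third position of an aspartate codon.
print(snp_label("GAC", 2), snp_label("GAT", 2), classify_snp("GAC", "GAT")[2])
print(snp_label("GAC", 2), snp_label("GAA", 2), classify_snp("GAC", "GAA")[2])
```

In this example, GAC and GAT both encode D, so the first change is synonymous, while GAA encodes E, so the second is non-synonymous.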

Muscle Sequence alignment of orthologs

  1. Again in the Genomic Comparison section, find Run Muscle and click the Amino Acid link.
  2. A new browser window/tab should open displaying the Muscle tool with default input of all ortholog sequences.
  3. Click the Submit button.
  4. The multiple sequence alignment output from the Muscle analysis should be displayed.
  5. Close the browser windows/tabs with the alignment results to continue the tutorial.

BLAST queries

  1. In the expandable Sequence section, find the BLAST tool, which contains a box for selecting a BLAST database, options for output format, and a text area for specifying a query sequence.
  2. In the BLAST Database box, scroll down and select All above databases for testing.
  3. Leave the default values specifying protein BLAST, Graphical output format, and the POSON aa sequence for the Query Sequence.
  4. Click the BLAST button. The BLAST results should be displayed in graphical and text format.
  5. Mouse over the bars in the hit graphic to display alignment details. These details are also listed in the text format below, with links to hits within PGAT and to external NCBI annotations.
  6. Close the BLAST results page to continue the tutorial.

Metabolic pathways

  1. Scroll back down to the Bioinformatics Predictions section, and find the KEGG Pathways results.
  2. Click on the first pathway, map03070 (Bacterial secretion system).
  3. A new browser window/tab should open displaying the KEGG pathway description and diagram. Genes found in Ab TCDC-AB0715 are blue, and genes found in other Acinetobacter genomes, but not Ab TCDC-AB0715, are pink. Genes not found in any Acinetobacter genomes in the PGAT database are left white.
  4. A table summarizing the Acinetobacter genes in the pathway is displayed below the diagram.
  5. Close this browser window/tab to continue the tutorial.
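
The three-way coloring rule for the pathway diagram can be written out as a small Python function. This is only a restatement of the rule given in the steps above; the function name and the example gene sets are made up for illustration.

```python
def pathway_color(genomes_with_gene: set, current_genome: str) -> str:
    """Color a pathway gene according to where it is found."""
    if current_genome in genomes_with_gene:
        return "blue"   # found in the genome being viewed
    if genomes_with_gene:
        return "pink"   # found only in other genomes in the database
    return "white"      # not found in any genome in the database


current = "Ab TCDC-AB0715"
# Hypothetical examples of which genomes carry each pathway gene.
print(pathway_color({"Ab TCDC-AB0715", "Ab 1656-2"}, current))
print(pathway_color({"Ab 1656-2"}, current))
print(pathway_color(set(), current))
```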

Presence/Absence of genes

  1. Click on Search Options in the top navigation bar. Then, click on the Presence and Absence tab under the Search Options.
  2. In the list of genomes, leave the first genome (Ab 1656-2) as Present, and select Absent for the rest of the genomes. Leave all Options below unselected. Click Submit.
  3. The results page shows the genes found in Ab 1656-2, but absent (or pseudogene, as indicated by *) in the other genomes.
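
A Presence and Absence query of this kind is a set operation, which the following Python sketch illustrates. The gene and genome labels are invented, and the sketch assumes that each gene is stored with the set of genomes carrying an intact ortholog, so that a pseudogene counts as absent, in line with the results described above.

```python
def presence_absence(orthologs: dict, present: set, absent: set) -> list:
    """Genes found in every `present` genome and in none of the `absent` genomes."""
    return sorted(
        gene
        for gene, genomes in orthologs.items()
        if present <= genomes and not (absent & genomes)
    )


# Hypothetical data: gene -> genomes that carry an intact ortholog.
orthologs = {
    "gene_1": {"genome_A", "genome_B", "genome_C"},
    "gene_2": {"genome_A"},
    "gene_3": {"genome_B"},
    "gene_4": {"genome_A"},
}
unique_to_a = presence_absence(
    orthologs, present={"genome_A"}, absent={"genome_B", "genome_C"}
)
print(unique_to_a)
```

Here only gene_2 and gene_4 are returned, because gene_1 is also present in the genomes marked Absent and gene_3 is missing from the genome marked Present.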

Gene search using genome coordinates

This option searches a genomic element for genes based on a list of coordinates. Any coding gene whose POSON overlaps one of the coordinates is returned.

  1. Click on the Genome Coordinates tab under the Search Options.
  2. Select Acinetobacter baumannii AB0057 from the genome list and chromosome from the genomic element list.
  3. Enter the following coordinates in the Specify genome coordinates text box (one number per line):
    3729000
    3730000
    3731000
  4. Click Submit.
  5. You should recover three pilus genes. In the table, you can see that the coordinate in the left-hand column falls between the POSON begin and end coordinates.
  6. Now select the option to Specify coordinate bounds to find genes/pseudogenes within range. Type values 3728000 to 3732000 in the input fields.
  7. The tool should list the same three pilus genes, along with a few other genes in this same region.
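
Both search modes come down to interval checks, as the following Python sketch shows. The gene names and coordinates are invented, and the range search is assumed to return any gene that overlaps the bounds; PGAT may instead require genes to lie entirely within them.

```python
# Hypothetical genes as (name, begin, end) in genome coordinates.
genes = [
    ("gene_a", 100, 400),
    ("gene_b", 450, 900),
    ("gene_c", 950, 1500),
]


def genes_at(coordinates, genes):
    """Pair each coordinate with every gene whose begin..end span contains it."""
    return [
        (coord, name)
        for coord in coordinates
        for name, begin, end in genes
        if begin <= coord <= end
    ]


def genes_in_range(low, high, genes):
    """Genes whose span overlaps the bounds low..high (assumed overlap rule)."""
    return [name for name, begin, end in genes if begin <= high and end >= low]


print(genes_at([200, 920, 1000], genes))  # 920 falls between genes, so no hit
print(genes_in_range(300, 500, genes))
```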

Gene Category Search

This page allows searching by several different gene categories. Genes are categorized using the bioinformatics software predictions (e.g. COG, PsortB, Pfam, etc.) that have been run for their reference genes.

  1. Click the Search by Category tab under Search Options.
  2. Click the triangle next to COG Categories to expand the COG category list.
  3. Check the box next to Defense mechanisms.
  4. Expand the PSort Localization section and check the boxes for Cytoplasmic and Periplasmic.
  5. Scroll to the bottom and click Search.
  6. The search should return 14 genes. Each gene returned must belong to at least one of the COG categories you selected and to at least one of the PSort localizations you selected. The Category column on the right shows which criteria were met for each gene returned.
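
The way the criteria combine (any match within a category type, all category types required) can be sketched in Python as follows. The gene records are invented for illustration and the function name is an assumption; only the category names come from the steps above.

```python
def category_search(genes: dict, cog_selected: set, psort_selected: set) -> list:
    """Genes in at least one selected COG category AND one selected localization."""
    return sorted(
        name
        for name, (cog_categories, localization) in genes.items()
        if cog_categories & cog_selected and localization in psort_selected
    )


# Hypothetical genes: name -> (COG categories, PSort localization).
genes = {
    "gene_1": ({"Defense mechanisms"}, "Cytoplasmic"),
    "gene_2": ({"Defense mechanisms"}, "Outer Membrane"),
    "gene_3": ({"Transcription"}, "Periplasmic"),
    "gene_4": ({"Defense mechanisms", "Transcription"}, "Periplasmic"),
}
hits = category_search(
    genes,
    cog_selected={"Defense mechanisms"},
    psort_selected={"Cytoplasmic", "Periplasmic"},
)
print(hits)
```

In this example gene_2 is excluded because its localization was not selected, and gene_3 is excluded because it is not in the selected COG category.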


© University of Washington 2016