Prokaryotic Genome Analysis Tool (PGAT) Tutorial
- Select a PGAT
- Gene search
- Save search
- Ortholog tables
- Reference genes
- Update Annotation
- Mapped genes
- Synteny Map
- Muscle sequence alignment of orthologs
- BLAST queries
- Metabolic pathways
- Presence/absence of genes
- Gene search using genome coordinates
- Gene search by category
Select a PGAT
- From the PGAT home page (http://tools.uwgenomics.org/pgat), select the Demo tool for Acinetobacter baumannii to begin this tutorial. This is a public version of the tool, so you will not need to log in. Any changes you make in the tool will be reset daily. You may also see updates made by other users who have used the tool since the last reset.
- Note the banner at the top of the page indicating that you are in the Acinetobacter PGAT Demo tool.
Select Strains to Compare
- The front page lists a summary of the genomes loaded in the tool database. Note the status column on the right-hand side. Genomes that are "complete" have been fully sequenced, annotated and submitted to GenBank. Other genomes may be designated as "draft assembly", indicating they have not yet been fully assembled and/or have not yet been published to GenBank (partial assembly of shotgun sequence). You will notice that many of the draft genomes consist of contigs rather than full chromosomes.
- Uncheck all genomes by clicking the box in the header row at the top left of the genomes table. Select all complete baumannii genomes by clicking the complete checkbox above the genomes table. Note the genomes that are now selected. Click the Update button. The genomes that were not selected should now be grayed out. Any further analyses and comparisons performed within the tool will now only be performed on the selected genomes.
Gene search
- Click the Search Options link in the top menu bar.
- Click the Search by Keyword tab.
- In the Enter Search Text box, type "secretion".
- Click the Search button.
- A results list of 132 genes should appear.
- Click on the FASTA aa file to generate a FASTA file of amino acid sequences for the list of genes in a new browser window. To save the FASTA file, in the browser menu, click File | Save As, select to Save As a Text File, rename as desired (e.g. secretion.faa), and then click Save.
Save search
- Deselect all checkboxes in the gene list by selecting the checkbox in the grey header row at the top of the gene list.
- Re-select the first 5 checkboxes to select the first 5 genes.
- Scroll to the bottom of the page.
- In the Enter Search Name box, type "5 secretion genes - Your name".
- Click the Save button.
- After viewing the details of your saved search, click on the Saved Searches tab. You should see your search listed.
- Other users will not see your search listed, and you will not be able to see other users' searches.
Ortholog tables
- Click on the Orthologs tab.
- Click on the checkbox in the header row of the Genome table to select all genomes.
- Make sure that the checkbox Only include reference genes from the following selected saved queries is checked.
- Make sure that the search you just saved is checked.
- Click the Submit button.
- The ortholog table shows, for each gene, the genomes in which it is present or absent.
Reference genes
A "reference gene" is one of the orthologs that was selected to represent the group of orthologs.
From each set of orthologs, one of the members (or the only member) is designated as the "reference" for the group. The collection of all reference genes represents the pan-genome for the species. The set of reference genes is processed through various bioinformatics prediction routines (e.g. COG, Pfam, Prosite, PsortB, BLAST, etc.) to further develop the annotation for these genes. The reference gene for a group is generally chosen from a manually curated genome (if available), and the choice of which member of the group is the reference can be changed in this tool.
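The selection of one reference per ortholog group can be sketched as follows. This is only an illustration and not PGAT's actual pipeline: the data layout, the `pick_reference` helper, and the genome and gene names are hypothetical. The one rule taken from the text is the preference for a manually curated genome when one is available.

```python
def pick_reference(group, curated_genomes):
    """Return the gene that represents an ortholog group.

    group: list of (genome, gene_id) tuples (hypothetical layout).
    curated_genomes: set of genome names considered manually curated.
    Prefers a member from a curated genome, else falls back to the first member.
    """
    for genome, gene_id in group:
        if genome in curated_genomes:
            return gene_id
    return group[0][1]


# Hypothetical ortholog groups; the second group has a single member.
groups = [
    [("genomeA", "a_001"), ("genomeB", "b_001")],
    [("genomeA", "a_002")],
]

# The collection of all reference genes stands in for the pan-genome.
pan_genome = [pick_reference(g, {"genomeB"}) for g in groups]
print(pan_genome)
```

Falling back to the first member is an arbitrary choice made for the sketch; the tutorial only says the designation can be changed later in the tool.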
- In the ortholog table, click the first gene pCP001182_005683.
- Inspect the graphic at the top of the page displaying the 6-frame translation.
- The grey horizontal brackets in each reading frame designate "Posons", which are open reading frames (ORFs) that are generally 35 amino acids or longer. Calling these ORFs posons and assigning each its own accession number (e.g. pNC_006350_00509) is convenient for database and BLAST searches and for other features in the tool, as you will see.
- The top 3 frames are the forward strand and the bottom 3 frames are the reverse strand, one row per translation frame.
- Coding sequences (CDS) are indicated by bold bars - dark blue indicating reference genes
- Gene currently being viewed is positioned at the center with its name displayed above or below the bar in red (e.g. gspG)
- Positions of bioinformatics predictions (e.g. Prodigal, COG) are displayed below POSONs
- Click on the zoom tool to zoom in or out to specified kb region
- Click on any poson to jump to its detail page
- For reference genes, the gene name and description were taken from the published annotation in GenBank.
- Start position (value in parentheses is amino acid offset from poson begin position) and locus tag are also taken from GenBank.
- Click the plus sign next to the heading marked "Sequence" below. Nucleotide sequence and translation are displayed in this section, with start position highlighted in red.
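The 6-frame translation and the 35-amino-acid ORF threshold described above can be sketched in a few lines. This is a rough illustration under stated assumptions, not the PGAT implementation: it treats an ORF as any stop-free stretch between stop codons, uses the standard genetic code, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_orfs(dna, min_aa=35):
    """Yield (strand, frame, aa_start, peptide) for stop-free stretches >= min_aa.

    Assumption: an ORF is defined stop-to-stop, with no start codon required.
    """
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            start = 0
            for segment in protein.split("*"):
                if len(segment) >= min_aa:
                    yield strand, frame, start, segment
                start += len(segment) + 1  # skip past the stop codon


# Hypothetical test sequence: ATG, 40 alanine codons, then a stop codon.
dna = "ATG" + "GCT" * 40 + "TAA"
for strand, frame, aa_start, peptide in six_frame_orfs(dna):
    print(strand, frame, aa_start, len(peptide))
```

On a repetitive toy sequence like this one, several frames produce long stop-free stretches, which is why the graphic shows brackets in more than one reading frame around a gene.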
Update Annotation
Annotators can update a gene's start site, name, gene name aliases, description, or KO number. A note about a gene can also be added. If the start site for a reference gene is modified, all ortholog start sites can be automatically updated by the same offset, or ortholog start sites can be assigned individually. Annotators can also update gene mappings, such as removing reference genes, unmapping orthologs, or reassigning a gene as a pseudogene. All annotation updates must be approved by a curator before the change is committed to the database. In this part of the tutorial, you will update the description for a gene and approve the modification as a curator.
- In the Annotation Summary section of the gene page, click the description "general secretion pathway protein G".
- This should jump to the Update Reference Gene Annotation section of the reference gene.
- Under the Description section, in the Description box, type "Test update description". In the Comment box, type "Test".
- Click the Update button.
- Note that the annotation update is added to the Description History table with status "under review".
- To approve the update, scroll back to the top of the page, and click the Curators Only link in the top navigation bar.
- Under the heading Reference Gene Annotation Proposals, you should see the update you just made in yellow. Click the blue approve button in the Review column at the right side of the table.
- You should see that the annotation modification turns brown and status now says "most recently approved". Click the poson number link at the left to return to the gene page.
- Note that the new description now appears in the Annotation Summary for the gene.
Mapped genes
- To view all orthologs for the current gene, find the Genomic Comparison section. Under View strain comparison, click the Ortholog table link.
- A new browser window should open listing all the Mapped POSONs (orthologs).
- Click the second poson in the list pCP002522_005253, to view the details for the mapped gene in Ab TCDC-AB0715.
- Note again the 6-frame translation graphic, but this time with the gspG bar colored gray, indicating a mapped gene (not reference).
- Note that the Annotation Summary box is split into two sections for mapped genes.
- The gray section displays the gene name and description from the reference gene annotation. The start site was calculated by the PGAT pipeline software based on alignment. The locus tag is from the annotation published in GenBank.
- The brown section displays details about the reference gene, including the genome, sequence and ortholog summary.
Synteny Map
- Scroll back up the page and find the Genomic Comparison section in the Pan Genome Reference Gene box.
- Under View strain comparison, select the Synteny Map link.
- A new browser window/tab should open displaying a graphic for comparing the local genomic region around each ortholog. The graphic helps identify synteny among the Acinetobacter genomes.
- Use the embedded scrollbar to scroll through the genomes and facilitate direct comparison with the reference genome.
- Mouseover any gene to see additional information.
- Use the Zoom tool to modify genomic region size.
- Use the Image Width tool to control the image size to best suit the size/resolution of your monitor.
- Click any gene in a mapped genome to jump to its detail page. Click on a gene in the reference (top) genome to recenter the view on that gene.
- Close the browser window/tab displaying the synteny image.
- Back in the Genomic Comparison section on the poson detail page, find the View strain comparison section.
- Click on the SNPs link to view sequence polymorphisms and the translation of the codon in which the SNP occurs such as C(D).
- The last column indicates whether a non-synonymous mutation occurs in any of the orthologs (seen by comparing the translations in parentheses).
- Close the browser window to return to the gene page.
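The SNP notation and the non-synonymous check can be illustrated with a short sketch. It assumes the standard genetic code and that the codon containing the SNP has already been extracted from each ortholog; the helper names are hypothetical and this is not PGAT's own code.

```python
from itertools import product

# Standard genetic code, codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def snp_labels(codons, pos_in_codon):
    """Format each ortholog's allele as base(amino acid), e.g. 'C(D)'."""
    return [f"{codon[pos_in_codon]}({CODON_TABLE[codon]})" for codon in codons]


def is_nonsynonymous(codons):
    """True if the codons translate to more than one distinct amino acid."""
    return len({CODON_TABLE[codon] for codon in codons}) > 1


# Third-position SNP that leaves the amino acid unchanged (synonymous).
print(snp_labels(["GAC", "GAT"], 2), is_nonsynonymous(["GAC", "GAT"]))
# Third-position SNP that changes aspartate to glutamate (non-synonymous).
print(snp_labels(["GAC", "GAA"], 2), is_nonsynonymous(["GAC", "GAA"]))
```

Comparing the letters in parentheses across orthologs is the same comparison the last column of the SNP table summarizes.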
Muscle Sequence alignment of orthologs
- Again in the Genomic Comparison section, find Run Muscle and click the Amino Acid link.
- A new browser window/tab should open displaying the Muscle tool with default input of all ortholog sequences.
- Click the Submit button.
- The multiple sequence alignment output from the Muscle analysis should be displayed.
- Close the browser windows/tabs with the alignment results to continue the tutorial.
BLAST queries
- In the expandable Sequence section, find the BLAST tool, which contains a box for selecting a BLAST database, options for output format, and a text area for specifying a query sequence.
- In the BLAST Database box, scroll down and select All above databases for testing.
- Leave the default values specifying protein BLAST, Graphical output format, and the POSON aa sequence for the Query Sequence.
- Click the BLAST button. The BLAST results should be displayed in graphical and text format.
- Mouse over the bars in the hit graphic to display alignment details. These details are also listed in the text format below, with links to hits within PGAT and to external NCBI annotations.
- Close the BLAST results page to continue the tutorial.
Metabolic pathways
- Scroll back down to the Bioinformatics Predictions section, and find the KEGG Pathways results.
- Click on the first pathway map03070, for Bacterial secretion system.
- A new browser window/tab should open displaying the KEGG pathway description and diagram. Genes found in Ab TCDC-AB0715 are blue, and genes found in other Acinetobacter genomes, but not Ab TCDC-AB0715, are pink. Genes not found in any Acinetobacter genomes in the PGAT database are left white.
- A table summarizing the Acinetobacter genes in the pathway is displayed below the diagram.
- Close this browser window/tab to continue the tutorial.
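The coloring rule for the pathway diagram can be written as a small function. This is a sketch of the rule as stated above, with hypothetical gene identifiers other than gspG; it does not reflect how PGAT renders KEGG maps internally.

```python
def pathway_colors(pathway_genes, current_genome_genes, other_genome_genes):
    """Assign a display color to each gene in a pathway.

    blue:  found in the genome currently being viewed
    pink:  found only in other genomes in the database
    white: not found in any genome in the database
    """
    colors = {}
    for gene in pathway_genes:
        if gene in current_genome_genes:
            colors[gene] = "blue"
        elif gene in other_genome_genes:
            colors[gene] = "pink"
        else:
            colors[gene] = "white"
    return colors


# gspG is the gene used in this tutorial; geneX and geneY are placeholders.
colors = pathway_colors(
    ["gspG", "geneX", "geneY"],
    current_genome_genes={"gspG"},
    other_genome_genes={"gspG", "geneX"},
)
print(colors)
```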
Presence/Absence of genes
- Click on Search Options in the top navigation bar. Then, click on the Presence and Absence tab under the Search Options.
- In the list of genomes, leave the first genome (Ab 1656-2) as Present and select Absent for the rest of the genomes. Leave all Options below unselected. Click Submit.
- The results page shows the genes found in Ab 1656-2, but absent (or pseudogene, as indicated by *) in the other genomes.
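The logic of this search can be sketched with sets. The example assumes a hypothetical mapping from each ortholog group to the genomes that carry an intact copy, so that a pseudogene counts as absent, as in the results page. The group names are placeholders and the function is not part of PGAT.

```python
def present_absent(gene_presence, present, absent):
    """Return ortholog groups found in every 'present' genome and in no 'absent' genome.

    gene_presence: dict mapping group name -> set of genomes with an intact copy
                   (pseudogenes are left out, so they count as absent).
    """
    return [group for group, genomes in gene_presence.items()
            if present <= genomes and not (absent & genomes)]


# Hypothetical presence data using genome names from this tutorial.
presence = {
    "group1": {"Ab 1656-2"},
    "group2": {"Ab 1656-2", "AB0057"},
    "group3": {"AB0057"},
}

unique_to_1656 = present_absent(
    presence,
    present={"Ab 1656-2"},
    absent={"AB0057", "Ab TCDC-AB0715"},
)
print(unique_to_1656)
```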
Gene search using genome coordinates
This option enables searching a genomic element to find genes based on a list of coordinates. Any coding gene whose POSON overlaps the coordinate will be returned.
- Click on the Genome Coordinates tab under the Search Options.
- Select Acinetobacter baumannii AB0057 from the genome list and chromosome from the genomic element list.
- Enter the following coordinates in the Specify genome coordinates text box (one number per line):
- Click Submit.
- You should recover three pilus genes. In the table, you can see that the coordinate in the left-hand column is between the POSON begin and end coordinates.
- Now select the option to Specify coordinate bounds to find genes/pseudogenes within range. Type values 3728000 to 3732000 in the input fields.
- The tool should list the same three pilus genes, along with a few other genes in this same region.
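Both coordinate searches reduce to interval tests, sketched below. The gene names and positions are invented for illustration, and treating the bounds search as "any overlap with the range" is an assumption; the tool may instead require genes to lie entirely within the bounds.

```python
def genes_at(genes, coordinate):
    """Genes whose [begin, end] interval contains a single coordinate."""
    return [name for name, begin, end in genes if begin <= coordinate <= end]


def genes_in_range(genes, lower, upper):
    """Genes whose interval overlaps the range [lower, upper] (assumed semantics)."""
    return [name for name, begin, end in genes if begin <= upper and end >= lower]


# Hypothetical genes as (name, POSON begin, POSON end).
genes = [
    ("geneA", 3728500, 3729100),
    ("geneB", 3729200, 3730900),
    ("geneC", 3731500, 3733000),
    ("geneD", 3740000, 3741000),
]

print(genes_at(genes, 3729000))
print(genes_in_range(genes, 3728000, 3732000))
```

A range query naturally returns more genes than a list of single coordinates, which matches the extra genes seen in the second search.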
Gene Category Search
This page allows searching by several different gene categories. Genes are categorized using the bioinformatics software predictions (e.g. COG, PsortB, Pfam, etc.) that have been run for their reference genes.
- Click the Search by Category tab under Search Options.
- Click the triangle next to COG Categories to expand the COG category list.
- Check the box next to Defense mechanisms.
- Expand the PSort Localization section and check the boxes for Cytoplasmic and Periplasmic.
- Scroll to the bottom and click Search.
- The search should return 14 genes. Each returned gene belongs to at least one of the COG categories you selected and to at least one of the PSort localizations you selected. The Category column on the right shows which criteria were met for each gene returned.
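The way the criteria combine (any selected value within a category type, all category types together) can be sketched like this. The gene names and annotations are hypothetical, and the function only illustrates the filter logic described above.

```python
def category_search(genes, selections):
    """Return genes matching at least one selected value in every category type.

    genes: dict mapping gene name -> {category type: set of assigned values}
    selections: dict mapping category type -> set of selected values
    """
    hits = []
    for name, annotations in genes.items():
        # OR within a category type (set intersection), AND across types (all).
        if all(annotations.get(kind, set()) & wanted
               for kind, wanted in selections.items()):
            hits.append(name)
    return hits


# Hypothetical annotations for three genes.
genes = {
    "geneA": {"COG": {"Defense mechanisms"}, "PSort": {"Cytoplasmic"}},
    "geneB": {"COG": {"Defense mechanisms"}, "PSort": {"Outer membrane"}},
    "geneC": {"COG": {"Transcription"}, "PSort": {"Periplasmic"}},
}
selections = {
    "COG": {"Defense mechanisms"},
    "PSort": {"Cytoplasmic", "Periplasmic"},
}

matches = category_search(genes, selections)
print(matches)
```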