The Annotation Tutorial¶
This tutorial is a step-by-step guide to using SciApps to perform MAKER-based annotation.
|Assembled genome||A scaled-down genome comprising the first 300 kb of three rice chromosomes||test_genome.fasta.gz|
|Annotated gene models||MAKER output in GFF3 format||maker_out.gff.gz|
|App name||Version||Description||App link||Notes/other links|
|MAKER||2.32||A portable and easily configurable genome annotation pipeline||MAKER-2.32||MAKER documentation|
|SNAP||0.0.1||Semi-HMM-based Nucleic Acid Parser||SNAP-0.0.1||SNAP documentation|
Step 1: Requesting access to SciApps¶
This is a one-time operation. If you have completed this step before, log in to SciApps directly.
Log in to the CyVerse User Portal at https://user.cyverse.org.
By default, you will land on the ‘Services’ page. Click ‘AVAILABLE’, then ‘REQUEST ACCESS’ for SciApps.
Click ‘MY SERVICES’, then click ‘LAUNCH’ for the Discovery Environment.
Once in the Discovery Environment, click to open the ‘Data’ window. You should see the sci_data folder under your root folder: /iplant/home/YOUR_USER_NAME.
Step 2: Uploading data for SciApps¶
This step demonstrates how to upload data to the sci_data folder so that it can be accessed from SciApps.
Click the sci_data folder to open it.
Click ‘Upload’, then ‘Import from URL’ to import this URL: https://data.sciapps.org/example_data/maker/my.all.gff.gz
This may take a few minutes. You can check the status by clicking the ‘Bell’ icon in the top right corner of the DE. Once the import has completed, click ‘Refresh’ to see the file in the window. This is a GFF3-formatted file from MAKER.
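If you also keep a local copy of the file, a quick sanity check can confirm that it looks like GFF3 before you use it. The following is a minimal sketch under that assumption: the function names `summarize_gff3` and `summarize_gff3_file` are illustrative and are not part of SciApps or MAKER. It checks for the `##gff-version 3` header and counts feature types from the third column.

```python
import gzip
from collections import Counter


def summarize_gff3(lines):
    """Return (has_gff3_header, Counter of feature types) for GFF3 text lines."""
    has_header = False
    feature_counts = Counter()
    for index, raw in enumerate(lines):
        line = raw.rstrip("\n")
        if index == 0 and line.startswith("##gff-version 3"):
            has_header = True
        if line.startswith("##FASTA"):
            break  # optional embedded sequence section; no features follow it
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        columns = line.split("\t")
        if len(columns) == 9:  # GFF3 feature lines have nine tab-separated columns
            feature_counts[columns[2]] += 1
    return has_header, feature_counts


def summarize_gff3_file(path):
    """Summarize a gzip-compressed GFF3 file at a (hypothetical) local path."""
    with gzip.open(path, "rt") as handle:
        return summarize_gff3(handle)


# Small made-up records, used only to show the call pattern.
example = [
    "##gff-version 3\n",
    "chr1\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1\n",
    "chr1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n",
    "chr1\tmaker\texon\t100\t400\t.\t+\t.\tID=exon1;Parent=mrna1\n",
]
has_header, counts = summarize_gff3(example)
print(has_header, dict(counts))
```

For a real file, call `summarize_gff3_file` with the path of your local copy.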
Step 3: HMM parameters estimation with SNAP¶
Log in to SciApps at https://www.SciApps.org.
Click the Prediction category (left panel) or search for SNAP, then click SNAP to load SNAP-0.0.1.
Under “GFF file”, click Browse DataStore, then navigate to the maker folder (example > maker); select maker_out.gff.gz and click ‘Select and Close’.
Click ‘Refresh’ if you cannot see any newly uploaded files.
Leave other parameters as default, and click Submit Job. You will be asked to confirm; click “Submit”. You will be prompted to check the job status in the right panel.
Step 4: Running MAKER with SNAP output¶
This step demonstrates how to use the SNAP output with MAKER to perform a second round of annotation.
Click the Annotation category (left panel) or search for MAKER, then click MAKER to load MAKER-2.32.
Under “Genome sequence file” click Browse DataStore, then navigate to the maker folder (example > maker); select test_genome.fasta.gz and click ‘Select and Close’.
Click SNAP-0.0.1 in the History panel to expand its outputs, then drag and drop snap_out.hmm into the SNAP HMM file field.
Under “Maker annotations” click Browse DataStore, then navigate to the maker folder (example > maker); select maker_out.gff.gz and click ‘Select and Close’.
Leave others as defaults, then click the “Submit Job” button.
Once the job status is COMPLETED, click the Visualization icon for MAKER-2.32 in the History panel to bring up its outputs. Select jbrowse_out.view.tgz from the list of outputs, then click Visualize; you will be directed to a genome browser to view your annotation results.
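For readers who also run MAKER outside SciApps, the three inputs selected in this step correspond to entries in MAKER's `maker_opts.ctl` control file. The sketch below renders such a fragment. It is an assumption about how a comparable standalone run could be configured, not a description of what the SciApps app does internally. The `*_pass` choices, the function name `second_round_opts`, and the use of decompressed copies of the input files are all illustrative.

```python
def second_round_opts(genome, maker_gff, snap_hmm):
    """Render a maker_opts.ctl fragment for a second annotation round."""
    settings = {
        "genome": genome,        # genome sequence (FASTA)
        "maker_gff": maker_gff,  # annotations from the first MAKER run
        "est_pass": 1,           # reuse EST alignments stored in maker_gff (assumed choice)
        "protein_pass": 1,       # reuse protein alignments stored in maker_gff (assumed choice)
        "rm_pass": 1,            # reuse repeat masking stored in maker_gff (assumed choice)
        "snaphmm": snap_hmm,     # SNAP HMM trained in the previous step
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# File names below assume decompressed local copies of the tutorial files.
print(second_round_opts("test_genome.fasta", "maker_out.gff", "snap_out.hmm"))
```

Only the options relevant to this step are shown; a full control file contains many more settings that would keep their defaults.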
Step 5: Creating a Workflow¶
This step will demo how to build a two-step workflow with previously completed MAKER and SNAP jobs.
Select the checkboxes for step 1 (SNAP) and step 2 (MAKER) in the History panel, then click the ‘build a workflow’ link to load the Workflow building page.
The History panel checkboxes and the workflow building page are interactive. Use the ‘Select All’ or ‘Reset’ button to simplify the selection process.
Modify Workflow Name and Workflow Description, then click the ‘Build Workflow’ button to visualize the workflow.
The connection between SNAP-0.0.1 and MAKER-2.32 (via snap_out.hmm) was recorded when you dragged and dropped the file in Step 4; it feeds the output of SNAP into MAKER as an input.
On the ‘Workflow Diagram’, you can save the workflow. Your saved workflows will appear in ‘My workflows’ (under the ‘Workflow’ menu in the top navigation panel).
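Conceptually, the workflow built here is a small dependency graph in which one step consumes another step's output. The sketch below illustrates that idea with a hypothetical in-memory structure. It is not the format SciApps uses to store workflows, and the dictionary layout and the function name `execution_order` are assumptions made for illustration. It relies on `graphlib`, which is in the standard library from Python 3.9.

```python
from graphlib import TopologicalSorter

# Hypothetical description of the two-step workflow (not the SciApps format).
workflow = {
    "SNAP-0.0.1": {"inputs": {"GFF file": "maker_out.gff.gz"}},
    "MAKER-2.32": {
        "inputs": {
            "Genome sequence file": "test_genome.fasta.gz",
            "Maker annotations": "maker_out.gff.gz",
            # A (step, output) tuple marks an input fed by another step.
            "SNAP HMM file": ("SNAP-0.0.1", "snap_out.hmm"),
        }
    },
}


def execution_order(steps):
    """Return step names ordered so each step follows the steps it depends on."""
    dependencies = {
        name: {
            value[0]
            for value in spec["inputs"].values()
            if isinstance(value, tuple)
        }
        for name, spec in steps.items()
    }
    return list(TopologicalSorter(dependencies).static_order())


print(execution_order(workflow))  # ['SNAP-0.0.1', 'MAKER-2.32']
```

Plain file names are treated as external inputs, so only the tuple-valued input creates an edge between the two steps.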
Step 6: Running a Workflow¶
This step will demo how to run a workflow you created or someone shared with you.
Navigate to ‘Workflow’, then ‘My workflows’, to load the workflow you created and saved (in Step 5).
Alternatively, you can load the app forms and job histories directly if you have the direct link for a workflow.
Scroll down the main panel, then click Submit Workflow. You will be asked to confirm and prompted to check the job status in the right panel. Then a live workflow diagram will be displayed with real-time analysis status updates.
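The status updates follow from the dependency between the two steps: MAKER can only start once SNAP has finished. The sketch below shows that rule in isolation. The `PENDING` label, the `depends_on` mapping, and the function name `ready_steps` are assumptions made for illustration; only `COMPLETED` appears in this tutorial, and the code does not call or model any SciApps API.

```python
def ready_steps(status, depends_on):
    """Return pending steps whose upstream steps have all completed."""
    return [
        step
        for step, upstream in depends_on.items()
        if status[step] == "PENDING"
        and all(status[parent] == "COMPLETED" for parent in upstream)
    ]


# Upstream steps for each step of the two-step workflow.
depends_on = {"SNAP-0.0.1": [], "MAKER-2.32": ["SNAP-0.0.1"]}

status = {"SNAP-0.0.1": "PENDING", "MAKER-2.32": "PENDING"}
print(ready_steps(status, depends_on))  # ['SNAP-0.0.1']

status["SNAP-0.0.1"] = "COMPLETED"
print(ready_steps(status, depends_on))  # ['MAKER-2.32']
```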
Step 7: Using Apollo for Community Annotation¶
In practice, genes annotated by MAKER are further filtered, or even manually curated, before being released (for example, by Gramene/Plant Ensembl). For manual annotation of the MAKER results with Apollo, we have set up a demo at http://data.maizecode.org/apollo. You can log in with username: email@example.com, and password: demo.
This tutorial covers how to use SciApps for your annotation work, including accessing data in the CyVerse Data Store, launching jobs, building workflows, running workflows, visualizing results, and importing workflows to re-run.
More help and additional information¶
- GMOD MAKER tutorial
- MAKER 2.31.9 with CCTOOLS Jetstream Tutorial
- Bioinformatics workshop of 2017 Plant Genome & Biotechnology meeting