Quantify Reads with Kallisto
Kallisto uses ‘hash-based’ pseudoalignment to match RNA-Seq reads against the transcriptome index extremely quickly. Each Kallisto job in this tutorial takes only a few minutes to complete.
|Kallisto Index||Indexed transcriptome||Example Kallisto index|
|RNA-Seq Reads||Cleaned fastq files||Example fastq files|
Quantify RNA-Seq reads with Kallisto
A Kallisto analysis must be run for each set of RNA-Seq reads mapped to the index. In this tutorial, we have 36 fastq files (18 pairs), so a complete run requires one Kallisto analysis per pair. Here it is sufficient to launch a single Kallisto job to examine the input, and then use the completed results (which are small files) for the Sleuth analysis.
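Grouping the 36 files into 18 read pairs can be done by eye, but a short script makes the pairing explicit. The following Python sketch assumes the `_R1_`/`_R2_` naming pattern seen in this tutorial's file names; the function name `pair_fastq_files` is hypothetical and is not part of Kallisto or CyVerse.

```python
import re
from collections import defaultdict

# Assumption: the two files of a pair differ only in an "_R1_" / "_R2_" token,
# as in SRR1761506_R1_001.fastq.gz_fp.trimmed.fastq.gz.
READ_TOKEN = re.compile(r"_R([12])_")


def pair_fastq_files(filenames):
    """Group fastq file names into (read1, read2) tuples keyed by sample ID."""
    groups = defaultdict(dict)
    for name in filenames:
        match = READ_TOKEN.search(name)
        if match is None:
            raise ValueError(f"no _R1_/_R2_ token in file name: {name}")
        sample = name[: match.start()]  # text before the read token
        groups[sample][match.group(1)] = name

    pairs = {}
    for sample, reads in sorted(groups.items()):
        if set(reads) != {"1", "2"}:
            raise ValueError(f"incomplete pair for sample {sample}")
        pairs[sample] = (reads["1"], reads["2"])
    return pairs


if __name__ == "__main__":
    files = [
        "SRR1761506_R2_001.fastq.gz_fp.trimmed.fastq.gz",
        "SRR1761506_R1_001.fastq.gz_fp.trimmed.fastq.gz",
    ]
    for sample, (read1, read2) in pair_fastq_files(files).items():
        print(sample, read1, read2)
```

Each resulting pair corresponds to one Kallisto analysis. Raising an error on an incomplete pair is a deliberate choice in this sketch, so that a missing file is noticed before any job is launched.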
- If necessary, log in to the CyVerse Discovery Environment.
- Open the Kallisto-0.42.3-quant-PE App.
- Name your analysis and, if desired, enter comments. In the App’s ‘Input’ step, under ‘Index file’, browse to and select the Kallisto index generated in the previous tutorial section. For the output directory, enter the name of the directory that will be created. For this tutorial, name your output directory pair01_wt_mock_r1 (this is our first pair of WT reads, mock treatment, replicate 1).
- Under ‘Read 1 Fastq files’ and ‘Read 2 Fastq files’, select the respective left and right read files. In this tutorial, click ‘Add’ to select the following files located in Community Data > cyverse_training > tutorials > kallisto > 00_input_fastq_trimmed:
  - SRR1761506_R1_001.fastq.gz_fp.trimmed.fastq.gz
  - SRR1761506_R2_001.fastq.gz_fp.trimmed.fastq.gz
- Click ‘Launch Analyses’ to launch the job and monitor its progress.
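For readers curious about what the App's inputs correspond to outside the Discovery Environment, the steps above map roughly onto a paired-end `kallisto quant` command. The Python sketch below only builds the argument list. It assumes kallisto is installed locally, and the index file name and bootstrap count are placeholders chosen for illustration, not values taken from the App.

```python
import shlex


def build_kallisto_quant_command(index, output_dir, read1, read2, bootstraps=0):
    """Return the argument list for a paired-end `kallisto quant` run."""
    command = ["kallisto", "quant", "-i", index, "-o", output_dir]
    if bootstraps > 0:
        # -b sets the number of bootstrap samples.
        command += ["-b", str(bootstraps)]
    # Paired-end reads are positional arguments: read 1 first, then read 2.
    command += [read1, read2]
    return command


if __name__ == "__main__":
    cmd = build_kallisto_quant_command(
        index="transcriptome.idx",  # hypothetical index file name
        output_dir="pair01_wt_mock_r1",
        read1="SRR1761506_R1_001.fastq.gz_fp.trimmed.fastq.gz",
        read2="SRR1761506_R2_001.fastq.gz_fp.trimmed.fastq.gz",
        bootstraps=100,  # assumed value, for illustration only
    )
    # Print the command for inspection; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems if it is later passed to `subprocess.run`.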
Kallisto jobs will generate 3 files per read pair:
|abundance.h5||HDF5 binary file containing run info, abundance estimates, bootstrap estimates, and transcript length information. This file can be read by Sleuth.||example abundance.h5|
|abundance.tsv||Plaintext file of the abundance estimates. It does not contain bootstrap estimates. It is written when plaintext mode is selected; alternatively, kallisto h5dump converts an HDF5 file to plaintext. The first line is a header naming each column, including estimated counts, TPM, and effective length.||example abundance.tsv|
|run_info.json||JSON file containing information about the run||example json|
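As a quick sanity check before moving on to Sleuth, the plaintext output can be inspected with a few lines of Python. This sketch assumes the usual `abundance.tsv` column names (`target_id`, `length`, `eff_length`, `est_counts`, `tpm`). The transcript IDs and numbers in `SAMPLE_TSV` are made up for illustration and are not results from this tutorial.

```python
import csv
import io


def read_abundance(handle):
    """Parse a kallisto abundance.tsv stream into dicts with numeric fields."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        rows.append({
            "target_id": row["target_id"],
            "length": int(row["length"]),
            "eff_length": float(row["eff_length"]),
            "est_counts": float(row["est_counts"]),
            "tpm": float(row["tpm"]),
        })
    return rows


def top_transcripts(rows, n=5):
    """Return the n rows with the highest TPM, highest first."""
    return sorted(rows, key=lambda r: r["tpm"], reverse=True)[:n]


# Made-up example content in the abundance.tsv layout (not real results).
SAMPLE_TSV = (
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "transcript_A\t1500\t1321.5\t120\t250000\n"
    "transcript_B\t900\t721.5\t300\t700000\n"
    "transcript_C\t2100\t1921.5\t40\t50000\n"
)

if __name__ == "__main__":
    # For a real run, replace the StringIO with something like:
    #   open("pair01_wt_mock_r1/abundance.tsv", newline="")
    rows = read_abundance(io.StringIO(SAMPLE_TSV))
    for row in top_transcripts(rows, n=2):
        print(row["target_id"], row["tpm"])
```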
Description of results and next steps
Kallisto quantifies RNA-Seq reads against an indexed transcriptome and generates a folder of results for each set of RNA-Seq reads. Sleuth will be used to examine the Kallisto results in RStudio.