1. PARS
1. Download and decompress SRA files to FastQ format using the NCBI SRA Toolkit:
# Download/decompress reads
$ fastq-dump -A SRR972714 # Nuclease S1 sample
$ fastq-dump -A SRR972715 # RNase V1 sample
# Rename files
$ mv SRR972714.fastq S1.fastq
$ mv SRR972715.fastq V1.fastq
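Before moving on, it can help to confirm that both downloads produced non-empty FastQ files. The following is a minimal sketch, not part of RNA Framework: `count_reads` is a hypothetical helper, and it assumes uncompressed FastQ files with the common four-lines-per-record layout.

```shell
# Print the number of reads in an uncompressed FastQ file.
# Assumption: every record spans exactly 4 lines.
count_reads() {
    awk 'END { printf "%d\n", NR / 4 }' "$1"
}

# Report read counts for the two renamed samples, if present
for fq in S1.fastq V1.fastq; do
    if [ -f "$fq" ]; then
        printf '%s\t%s reads\n' "$fq" "$(count_reads "$fq")"
    else
        printf '%s\tnot found\n' "$fq" >&2
    fi
done
```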
2. Prepare the reference index using rf-index. To build the RefSeq gene annotation for Homo sapiens (hg38 assembly), simply type:
$ rf-index -g hg38 -a refGene
This will build a Bowtie v1 reference index. To use Bowtie v2, simply append the -b2 (or --bowtie2) parameter to the previous command:
$ rf-index -g hg38 -a refGene --bowtie2
A folder named "hg38_refGene_bt/" (or "hg38_refGene_bt2/" in case Bowtie v2 is used) will be created in the current working directory.
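Because the index folder name depends on the aligner, later commands must point to the matching prefix. A small sketch of how a wrapper script might select it is shown here; `ALIGNER` and `index_prefix` are hypothetical names introduced only for illustration, while the folder and prefix names are the ones used in this walkthrough.

```shell
# ALIGNER is a hypothetical variable for this sketch: "bowtie1" or "bowtie2"
ALIGNER=${ALIGNER:-bowtie1}

# Print the index prefix that matches the chosen aligner
index_prefix() {
    case "$1" in
        bowtie2) echo "hg38_refGene_bt2/hg38_refGene" ;;
        *)       echo "hg38_refGene_bt/hg38_refGene" ;;
    esac
}

INDEX=$(index_prefix "$ALIGNER")
echo "Index prefix: $INDEX"
```

The resulting value of `INDEX` is what would be passed to the -bi parameter of rf-map.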
3. Map reads to reference using rf-map (Note: according to the GEO dataset's page, the last 51 nt of reads should be trimmed):
# Reads will be trimmed by 51 nt from their 3'-end, and mapped to the sense strand
# of transcripts only, allowing a maximum of 20 equally scoring alignments
$ rf-map -bnr -b3 51 -bm 20 -bi hg38_refGene_bt/hg38_refGene S1.fastq V1.fastq
To use Bowtie v2, simply append the -b2 (or --bowtie2) parameter to the previous command:
$ rf-map -bnr -b3 51 -bi hg38_refGene_bt2/hg38_refGene S1.fastq V1.fastq --bowtie2
4. Count RT-stops in both samples using rf-count:
$ rf-count -r -f hg38_refGene_bt/hg38_refGene.fa rf_map/*.bam
5. Normalize data using rf-norm:
# Data will be normalized by default using Ding et al., 2014
# scoring method, and 2-8% normalization
$ rf-norm -u rf_count/V1.rc -t rf_count/S1.rc -i rf_count/index.rci
6. Perform transcriptome-wide inference of secondary structures using rf-fold:
# Inference will be performed by default according to Deigan et al., 2009,
# using the ViennaRNA algorithm
$ rf-fold -g S1_vs_V1_norm/
A folder named "rf_fold/" will be generated, containing two subdirectories:
- "structures/": inferred structures in dot-bracket notation
- "images/": graphical summaries in SVG format
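As a quick sanity check on an inferred structure, one can verify that its dot-bracket string is balanced. This is only a sketch: `is_balanced` is a hypothetical helper that works on a structure string the reader has already extracted (the exact layout of the files in "structures/" is not assumed here), and it handles round brackets only, not pseudoknot notation.

```shell
# Succeed if every "(" in a dot-bracket string has a matching ")".
# Any other character (e.g. ".") is treated as unpaired.
is_balanced() {
    printf '%s\n' "$1" | awk '{
        depth = 0
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "(") {
                depth++
            } else if (c == ")") {
                depth--
                if (depth < 0) exit 1   # closing bracket with no partner
            }
        }
        exit (depth != 0)               # leftover opening brackets
    }'
}

# Made-up example structure
if is_balanced "((...))..((..))"; then
    echo "balanced"
fi
```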
2. DMS-MaPseq
1. Download and decompress SRA file to FastQ format using the NCBI SRA Toolkit:
# Download/decompress reads
$ fastq-dump -A SRR3929629 # S. cerevisiae Tagmented rRNA
# Rename file
$ mv SRR3929629.fastq Sc_Tag_rRNA.fastq
2. Prepare the reference index using rf-index. To download the pre-built Bowtie v2 Saccharomyces cerevisiae ribosomal RNAs reference index, simply type:
$ rf-index -pb 3 --bowtie2
3. Map reads to reference using rf-map:
$ rf-map -ca3 CTGTCTCTTATACACATCT -bs -bi Scerevisiae_rRNA_bt2/reference Sc_Tag_rRNA.fastq --bowtie2
4. Count mutations using rf-count:
$ rf-count -r -m -f Scerevisiae_rRNA_bt2/reference.fa rf_map/Sc_Tag_rRNA.bam
5. Normalize data using rf-norm:
# Data will be normalized on A/C residues only, using Zubradt et al., 2016
# scoring method, and 90% Winsorising
$ rf-norm -t rf_count/Sc_Tag_rRNA.rc -i rf_count/index.rci -sm 4 -nm 2 -rb AC
A folder named "Sc_Tag_rRNA_norm/" will be generated, containing one XML file for each analyzed transcript.
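To confirm that normalization produced output for the expected number of transcripts, the XML files can simply be counted. A minimal sketch follows; `count_xml` is a hypothetical helper, and it only assumes that the output files carry an ".xml" extension.

```shell
# Count the XML files directly inside a folder
count_xml() {
    find "$1" -maxdepth 1 -type f -name '*.xml' | wc -l | tr -d ' '
}

if [ -d Sc_Tag_rRNA_norm ]; then
    echo "XML files written: $(count_xml Sc_Tag_rRNA_norm)"
fi
```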
3. SHAPE-MaP
1. Obtain the HIV-1 genome sequence from NCBI (extracting only bases 455-9626, corresponding to the primary transcript) and save it to HIV.fasta. If you have Entrez Direct installed, simply type:
$ esearch -db nucleotide -query "M19921.2" | efetch -format fasta | perl -e 'while(<>) { chomp; next if (m/^>/); $seq .= $_; } print ">HIV\n" . substr($seq, 454, 9172) . "\n";' > HIV.fasta
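The two numbers passed to substr() follow from converting the 1-based, inclusive interval 455-9626 into a 0-based offset and a length. The arithmetic can be sketched as follows (the variable names are illustrative only):

```shell
start=455   # first base to keep (1-based, inclusive)
end=9626    # last base to keep (1-based, inclusive)

offset=$((start - 1))         # Perl's substr() uses 0-based offsets
length=$((end - start + 1))   # both endpoints are included

echo "substr(\$seq, $offset, $length)"   # prints: substr($seq, 454, 9172)
```

The sequence written to HIV.fasta should therefore be 9172 nt long.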
2. Create the reference index:
$ bowtie2-build HIV.fasta HIV
3. Obtain FastQ files from SRA Database:
$ fastq-dump -A SRR1301979 --split-files -O Denatured/
$ fastq-dump -A SRR1301974 --split-files -O 1M7/
$ fastq-dump -A SRR1301978 --split-files -O Untreated/
4. Rename FastQ files:
$ mv Denatured/SRR1301979_1.fastq Denatured_R1.fastq
$ mv Denatured/SRR1301979_2.fastq Denatured_R2.fastq
$ mv 1M7/SRR1301974_1.fastq 1M7_R1.fastq
$ mv 1M7/SRR1301974_2.fastq 1M7_R2.fastq
$ mv Untreated/SRR1301978_1.fastq Untreated_R1.fastq
$ mv Untreated/SRR1301978_2.fastq Untreated_R2.fastq
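The six mv commands follow one pattern, so they can also be expressed as a loop over the two mates. This is a sketch only; `rename_mates` is a hypothetical helper, and it assumes the "<folder>/<accession>_1.fastq" and "<folder>/<accession>_2.fastq" layout shown above.

```shell
# Move <dir>/<accession>_1.fastq and <dir>/<accession>_2.fastq
# to <dir>_R1.fastq and <dir>_R2.fastq in the current directory
rename_mates() {
    dir=$1
    acc=$2
    for mate in 1 2; do
        mv "${dir}/${acc}_${mate}.fastq" "${dir}_R${mate}.fastq"
    done
}

# Usage (equivalent to the commands above):
#   rename_mates Denatured SRR1301979
#   rename_mates 1M7       SRR1301974
#   rename_mates Untreated SRR1301978
```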
5. Map reads to reference using rf-map:
$ rf-map -p 3 -b2 -cqo -cq5 20 -bs -bl 15 -bN 1 -bD 20 -bR 3 -bdp 100 -bma 2 -bmp 6,2 -bdg 5,1 -bfg 5,1 -bd \
-mp "--maxins 200" -bi HIV Denatured_R1.fastq,Denatured_R2.fastq 1M7_R1.fastq,1M7_R2.fastq \
Untreated_R1.fastq,Untreated_R2.fastq
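Each paired-end sample is passed as a comma-separated "R1,R2" pair, and each pair must list both mates of the same sample. Building the pairs from the sample names avoids pairing a file with itself by mistake. A minimal sketch (the `pairs` variable is illustrative only):

```shell
# Build one "R1,R2" argument per sample
pairs=""
for sample in Denatured 1M7 Untreated; do
    pairs="${pairs}${pairs:+ }${sample}_R1.fastq,${sample}_R2.fastq"
done

echo "$pairs"
```

The resulting string can be placed after the -bi parameter of rf-map in place of the hand-typed file list.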
6. Count mutations using rf-count:
$ rf-count -p 3 -r -f HIV.fasta -m -na -md 200 rf_map/Denatured.bam rf_map/1M7.bam rf_map/Untreated.bam
7. Normalize data using rf-norm:
# Data will be normalized using Siegfried et al., 2014
# scoring method, and Box-plot normalization
$ rf-norm -t rf_count/1M7.rc -u rf_count/Untreated.rc -d rf_count/Denatured.rc -i rf_count/index.rci -sm 3 -nm 3 -o HIV_norm/
A folder named "HIV_norm/" will be generated, containing a single XML file.
8. Fold HIV-1 genome using rf-fold:
$ rf-fold -m 2 -g -md 500 -w -pk -km 2 -ko 100 -pw 1600 -po 375 -wt 300 -fw 3000 -fo 300 HIV_norm/
4. m6A-seq
1. Download and decompress SRA files to FastQ format using the NCBI SRA Toolkit:
# Download/decompress reads
$ fastq-dump -A SRR456551 # m6A IP sample
$ fastq-dump -A SRR456555 # Input sample
# Rename files
$ mv SRR456551.fastq IP.fastq
$ mv SRR456555.fastq Input.fastq
2. Prepare the reference index using rf-index. To build the Homo sapiens mRNAs reference index, simply type:
$ rf-index -g hg19 -a refGene
This will build a Bowtie v1 reference index. To use Bowtie v2, simply append the -b2 (or --bowtie2) parameter to the previous command:
$ rf-index -g hg19 -a refGene --bowtie2
A folder named "hg19_refGene_bt/" (or "hg19_refGene_bt2/" in case Bowtie v2 is used) will be created in the current working directory.
3. Map reads to reference using rf-map:
$ rf-map -ca3 GATCGGAAGAGCGGTTCAGCAG -bm 20 -bi hg19_refGene_bt/hg19_refGene Input.fastq IP.fastq
To use Bowtie v2, simply append the -b2 (or --bowtie2) parameter to the previous command, and point to the Bowtie v2 index:
$ rf-map -ca3 GATCGGAAGAGCGGTTCAGCAG -bm 20 -bi hg19_refGene_bt2/hg19_refGene Input.fastq IP.fastq --bowtie2
4. Calculate read coverage in both samples using rf-count:
$ rf-count -r -co -f hg19_refGene_bt/hg19_refGene.fa rf_map/*.bam
5. Call m6A peaks using rf-peakcall:
$ rf-peakcall -c rf_count/Input.rc -I rf_count/IP.rc -i rf_count/index.rci -e 2.5
A BED file named "IP_vs_Input.bed" will be generated, containing the called peaks.
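A quick way to summarize the result is to tally peaks per transcript. The sketch below assumes a standard tab-separated BED layout in which the first column holds the transcript ID (reads were mapped to transcripts); `peaks_per_transcript` is a hypothetical helper, not an RNA Framework tool.

```shell
# Print "<transcript><TAB><number of peaks>", sorted by transcript ID.
# Track, browser, and comment lines are skipped.
peaks_per_transcript() {
    awk -F '\t' '
        !/^(track|browser|#)/ && NF >= 3 { n[$1]++ }
        END { for (t in n) print t "\t" n[t] }
    ' "$1" | sort
}

if [ -f IP_vs_Input.bed ]; then
    peaks_per_transcript IP_vs_Input.bed
fi
```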
5. 2OMe-seq
1. Download and decompress SRA files to FastQ format using the NCBI SRA Toolkit:
# Download/decompress reads
$ fastq-dump -A SRR2414087 # High dNTP (1 mM) sample
$ fastq-dump -A SRR2414088 # Low dNTP (4 nM) sample
# Rename files
$ mv SRR2414087.fastq HeLa_1mM_dNTP.fastq
$ mv SRR2414088.fastq HeLa_4nM_dNTP.fastq
2. Prepare the reference index using rf-index. To download the pre-built Homo sapiens ribosomal RNAs reference index, simply type:
$ rf-index -pb 1
This will download a Bowtie v1 reference index. To use Bowtie v2, simply append the -b2 (or --bowtie2) parameter to the previous command:
$ rf-index -pb 1 --bowtie2
A folder named "Hsapiens_rRNA_bt/" (or "Hsapiens_rRNA_bt2/" in case Bowtie v2 is used) will be created in the current working directory.
3. Map reads to reference using rf-map:
$ rf-map -b5 5 -bi Hsapiens_rRNA_bt/reference HeLa_1mM_dNTP.fastq HeLa_4nM_dNTP.fastq
To use Bowtie v2, simply append the -b2 (or --bowtie2) parameter to the previous command, and point to the Bowtie v2 index:
$ rf-map -b5 5 -bi Hsapiens_rRNA_bt2/reference HeLa_1mM_dNTP.fastq HeLa_4nM_dNTP.fastq --bowtie2
4. Count RT-stops in both samples using rf-count:
$ rf-count -r -fh -f Hsapiens_rRNA_bt/reference.fa rf_map/*.bam
5. Calculate per-base score and ratio using rf-modcall:
$ rf-modcall -u rf_count/HeLa_1mM_dNTP.rc -t rf_count/HeLa_4nM_dNTP.rc -i rf_count/index.rci
A folder named "HeLa_4nM_dNTP_vs_HeLa_1mM_dNTP/" will be generated, containing one XML file for each analyzed transcript.