The RF ModCall module takes two RC files generated by the RF Count module, and performs transcriptome-wide single-base resolution calling of Ψ/2'-OMe residues from Ψ-seq/Pseudo-seq and 2OMe-seq experiments.
Information
For more details, please refer to Carlile et al., 2014 (Pseudo-seq, PMID: 25192136), Schwartz et al., 2014 (Ψ-seq, PMID: 25219674), and Incarnato et al., 2016 (2OMe-seq, PMID: 27614074).
For each transcript's position, two measures are computed:
where $S_i$ and $R_i$ are, respectively, the score and the ratio at position $i$ of the transcript, $w$ is the size (in nt) of a window centered on position $i$, $n_{T_i}$ and $n_{U_i}$ are, respectively, the number of RT-stops in the CMCT treated (or low dNTP) and CMCT untreated (or high dNTP) samples, and $c_{T_i}$ is the read coverage at position $i$ in the CMCT treated (or low dNTP) sample.
The score is a measure of the RT-stop enrichment in the CMCT treated (or low dNTP) sample at a given position, with respect to the surrounding bases, and to the CMCT untreated (or high dNTP) sample.
The ratio is a relative quantitation of the modification stoichiometry at a given position in the CMCT treated (or low dNTP) sample.
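As a minimal sketch of how the two measures could be computed for one transcript, the Python function below assumes the score takes the windowed form S_i = w · (nT_i − nU_i) / Σ_j (nT_j + nU_j), with j running over the window centered on i, and that the ratio is R_i = nT_i / cT_i. These formulas are an assumption consistent with the symbol definitions above, not taken from the RF ModCall source. The function name, the truncation of the window at transcript ends, and the use of treated-sample coverage for NaN masking are likewise hypothetical, and the scaling between samples is not modelled.

```python
import math


def modcall_measures(n_treated, n_untreated, cov_treated, window=150, min_cov=10):
    """Return (scores, ratios) for one transcript.

    Assumed formulas (NOT taken from the RF ModCall source):
      S_i = w * (nT_i - nU_i) / sum over window of (nT_j + nU_j)
      R_i = nT_i / cT_i
    """
    length = len(n_treated)
    half = window // 2
    scores, ratios = [], []
    for i in range(length):
        # Window centered on i, truncated at the transcript ends (assumption).
        lo, hi = max(0, i - half), min(length, i + half + 1)
        total = sum(n_treated[lo:hi]) + sum(n_untreated[lo:hi])
        # Low-coverage positions (or empty windows) are reported as NaN.
        if cov_treated[i] < min_cov or total == 0:
            scores.append(math.nan)
            ratios.append(math.nan)
            continue
        scores.append(window * (n_treated[i] - n_untreated[i]) / total)
        ratios.append(n_treated[i] / cov_treated[i])
    return scores, ratios


# Toy data: position 2 carries an RT-stop enrichment in the treated sample.
scores, ratios = modcall_measures(
    n_treated=[1, 1, 20, 1, 1],
    n_untreated=[1, 1, 2, 1, 1],
    cov_treated=[100] * 5,
    window=3,
)
print(scores[2], ratios[2])
```

Under these assumptions, a position whose RT-stop count matches its neighbours and the untreated sample scores close to 0, while an enriched position scores well above it.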
Usage
To list the required parameters, simply type:
$ rf-modcall -h
Parameter | Type | Description |
---|---|---|
-u or --untreated | string | Path to the RC file for the CMCT untreated (or high dNTP) sample |
-t or --treated | string | Path to the RC file for the CMCT treated (or low dNTP) sample |
-i or --index | string[,string] | A comma-separated (no spaces) list of RCI index files for the provided RC files.<br/>Note #1: RCI files must be provided in the order 1. Untreated, 2. Treated.<br/>Note #2: If a single RCI file is specified, it will be used for all RC files.<br/>Note #3: If no RCI index is provided, it will be generated at runtime and stored in the same folder as the untreated/treated samples |
-p or --processors | int | Number of processors (threads) to use (Default: 1) |
-o or --output-dir | string | Output directory for writing site scores and ratios in XML format (Default: <treated>_vs_<untreated>/) |
-ow or --overwrite | | Overwrites the output directory if it already exists |
-w or --window | int | Window size (in nt) for score calculation (≥3, Default: 150) |
-ts or --to-smaller | | The larger sample will be scaled toward the smaller one (Default: scale smaller sample to the larger one) |
-mc or --mean-coverage | float | Discards any transcript with mean coverage below this threshold (≥0, Default: 0) |
-ec or --median-coverage | float | Discards any transcript with median coverage below this threshold (≥0, Default: 0) |
-D or --decimals | int | Number of decimals for reporting scores/ratios (1-10, Default: 3) |
-n or --nan | int | Transcript positions with read coverage below this threshold will be reported as NaN in the reactivity profile (>0, Default: 10) |
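As an illustration of how the parameters above combine, the following Python sketch assembles an `rf-modcall` command line from a few of the documented flags. The helper name `build_modcall_cmd` and all file names are hypothetical; only the flags themselves come from the table.

```python
import shlex


def build_modcall_cmd(untreated, treated, index=None, processors=1,
                      output_dir=None, window=150, overwrite=False):
    """Build an argument list for rf-modcall (hypothetical helper)."""
    if window < 3:
        raise ValueError("window must be >= 3")
    cmd = ["rf-modcall", "-u", untreated, "-t", treated,
           "-p", str(processors), "-w", str(window)]
    if index:
        # RCI files go in the order untreated, treated, comma-separated, no spaces.
        cmd += ["-i", ",".join(index)]
    if output_dir:
        cmd += ["-o", output_dir]
    if overwrite:
        cmd.append("-ow")
    return cmd


# Hypothetical input files, for illustration only.
cmd = build_modcall_cmd(
    "untreated.rc", "treated.rc",
    index=["untreated.rci", "treated.rci"],
    processors=4, output_dir="modcall_out/", overwrite=True,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where RNA Framework is installed.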
Output XML files
RF ModCall produces an XML file for each analyzed transcript, with the following structure:
<?xml version="1.0" encoding="UTF-8"?>
<data [attributes]>
<transcript id="Transcript ID" length="Transcript length">
<sequence>
Transcript sequence
</sequence>
<score>
Comma-separated list of scores
</score>
<ratio>
Comma-separated list of ratios
</ratio>
</transcript>
</data>
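A file with this structure can be read with Python's standard `xml.etree.ElementTree` module. The sketch below is a minimal example, assuming one `<transcript>` element per file as in the template above. The function name `parse_modcall_xml` and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the structure shown above.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<data win="150" tosmaller="FALSE">
  <transcript id="tx1" length="3">
    <sequence>
      ACU
    </sequence>
    <score>
      0.000,2.077,NaN
    </score>
    <ratio>
      0.010,0.200,NaN
    </ratio>
  </transcript>
</data>
"""


def parse_modcall_xml(text):
    """Parse one RF ModCall-style XML document into a dictionary."""
    root = ET.fromstring(text)
    tx = root.find("transcript")

    def values(tag):
        # Drop all whitespace, then split the comma-separated list.
        raw = "".join(tx.findtext(tag, default="").split())
        # float() accepts "NaN", so masked positions become nan.
        return [float(v) for v in raw.split(",")] if raw else []

    return {
        "attributes": dict(root.attrib),
        "id": tx.get("id"),
        "length": int(tx.get("length")),
        "sequence": "".join(tx.findtext("sequence", default="").split()),
        "scores": values("score"),
        "ratios": values("ratio"),
    }


record = parse_modcall_xml(SAMPLE)
print(record["id"], record["length"], record["scores"])
```

For files on disk, `ET.parse(path).getroot()` can replace `ET.fromstring(text)`.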
The data tag’s attributes allow keeping track of the analysis performed:
Attribute | Possible values | Description |
---|---|---|
win | Positive integer ≥ 3 | Window size (in nt) for score calculation |
tosmaller | TRUE/FALSE | Whether the larger dataset has been scaled to the size of the smaller one |