RF JackKnife takes one or more XML reactivity files and a set of reference RNA structures in dotbracket notation, and iteratively calls rf-fold while tuning the slope and intercept folding parameters. This is useful for calibrating the folding parameters for a specific probing reagent or experiment type.
It produces a CSV table reporting the FMI (Fowlkes-Mallows index, the geometric mean of PPV and sensitivity) or the mFMI (modified FMI; for additional details, check the Metrics section of RF Compare) for each slope/intercept pair:
[Figure: PPV/sensitivity table]
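
As an illustration of the metric described above, the following minimal sketch computes PPV, sensitivity and FMI for a predicted structure against a reference, both given in dotbracket notation. It is not the RF JackKnife implementation: the function names `base_pairs` and `fmi` are hypothetical, only `(`, `)` and `.` are handled (no pseudoknot brackets), and a base pair counts as correct only if it matches the reference exactly (the relaxed criteria enabled by -x are not modelled).

```python
from __future__ import annotations

import math


def base_pairs(dotbracket: str) -> set[tuple[int, int]]:
    """Return the set of (i, j) base pairs encoded by a dotbracket string."""
    stack: list[int] = []
    pairs: set[tuple[int, int]] = set()
    for pos, char in enumerate(dotbracket):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def fmi(reference: str, predicted: str) -> tuple[float, float, float]:
    """Return (PPV, sensitivity, FMI) for a predicted vs. reference structure."""
    ref, pred = base_pairs(reference), base_pairs(predicted)
    true_pos = len(ref & pred)
    # PPV: fraction of predicted pairs that are present in the reference
    ppv = true_pos / len(pred) if pred else 0.0
    # Sensitivity: fraction of reference pairs that were predicted
    sensitivity = true_pos / len(ref) if ref else 0.0
    # FMI: geometric mean of PPV and sensitivity
    return ppv, sensitivity, math.sqrt(ppv * sensitivity)
```

For example, `fmi("((..))", "(....)")` compares a reference with two base pairs against a prediction that recovers only the outer one, so PPV is 1, sensitivity is 0.5, and the FMI is the square root of 0.5.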

Combining multiple experiments

Since version 2.9.0 it is possible to identify a single slope/intercept pair that yields the best prediction across multiple experiments.

In the following example:

$ rf-jackknife -x -r reference.db experiment_1/ experiment_2/ experiment_3/

slope and intercept will be optimized on the reference transcripts present in reference.db. All transcripts, including those that are present only in a subset of the experiments, will by default be used for parameter optimization. If parameter -oc is enabled, however, only transcripts for which reactivity data is available across all experiments will be used.
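
The difference between the default behaviour and -oc can be pictured as a union versus an intersection of the transcript IDs covered by each experiment, restricted to the transcripts that have a reference structure. The sketch below only illustrates that selection logic; `select_transcripts` and the transcript IDs are hypothetical, and how RF JackKnife actually collects the IDs from the input directories is not shown.

```python
from __future__ import annotations


def select_transcripts(
    reference_ids: set[str],
    experiments: dict[str, set[str]],
    only_common: bool = False,
) -> set[str]:
    """Pick the reference transcripts to use for parameter optimization.

    experiments maps an experiment name to the set of transcript IDs
    for which that experiment provides reactivity data.
    """
    id_sets = list(experiments.values())
    if not id_sets:
        return set()
    if only_common:
        # -oc behaviour: keep transcripts covered by every experiment
        covered = set.intersection(*id_sets)
    else:
        # default behaviour: keep transcripts covered by at least one experiment
        covered = set.union(*id_sets)
    # Only transcripts with a reference structure can be scored
    return covered & reference_ids


# Hypothetical transcript IDs, for illustration only
experiments = {
    "experiment_1": {"tRNA", "5S", "RNaseP"},
    "experiment_2": {"tRNA", "5S"},
    "experiment_3": {"tRNA", "RNaseP", "U1"},
}
reference_ids = {"tRNA", "5S", "RNaseP"}

default_set = select_transcripts(reference_ids, experiments)
common_set = select_transcripts(reference_ids, experiments, only_common=True)
```

With these toy inputs, the default selection keeps all three reference transcripts, while the -oc style selection keeps only the one covered by all three experiments.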

For additional details on how multiple replicates are combined into a single prediction, please refer to the "Combining multiple experiments" paragraph of the RF Fold documentation page.

Usage

To list the required parameters, simply type:

$ rf-jackknife -h
| Parameter | Type | Description |
| --- | --- | --- |
| -r or --reference | string | A file containing reference structures in Vienna format (dotbracket notation) |
| -oc or --only-common | | In case of replicates, only transcripts covered across all experiments will be used to derive the optimal slope/intercept pair |
| -p or --processors | int | Number of processors to use (Default: 1) |
| -o or --output-dir | string | Output directory (Default: rf_jackknife/) |
| -ow or --overwrite | | Overwrites output directory (if the specified path already exists) |
| -g or --img | | Generates heatmap of grid search results (requires R) |
| -sl or --slope | float,float | Range of slope values to test (Default: 0,5) |
| -in or --intercept | float,float | Range of intercept values to test (Default: -3,0) |
| -ss or --slope-step | float | Step for testing slope values (Default: 0.2) |
| -is or --intercept-step | float | Step for testing intercept values (Default: 0.2) |
| -x or --relaxed | | Uses relaxed criteria (Deigan et al., 2009) to calculate the FMI |
| -m or --mFMI | | Uses modified FMI (mFMI, Lan et al., 2022; for additional details check the Metrics section of RF Compare) instead of standard FMI to quantify the agreement between predicted and reference structure |
| -kp or --keep-pseudoknots | | Keeps pseudoknotted basepairs in reference structure |
| -kl or --keep-lonelypairs | | Keeps lonely basepairs (helices of length 1 bp) in reference structure |
| -i or --ignore-sequence | | Ignores sequence differences (e.g. SNVs) between the compared structures |
| -e or --median | | The FMI across multiple reference structures is aggregated by median. Note: by default, FMI values are aggregated by geometric mean |
| -am or --arithmetic-mean | | The FMI across multiple reference structures is aggregated by arithmetic mean. Note: by default, FMI values are aggregated by geometric mean |
| -rf or --rf-fold | string | Path to rf-fold executable (Default: assumes rf-fold is in PATH) |
| -rp or --rf-fold-params | string | Manually specify additional RF Fold parameters (e.g. -rp "-md 500 -m 2") |
| -R or --R-path | string | Path to R executable (Default: assumes R is in PATH). Note: also check $RF_RPATH under Environment variables |
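
To make the interplay of the range, step and aggregation parameters more concrete, here is a minimal sketch of a slope/intercept grid search. It is an illustration under stated assumptions, not the tool's code: both ends of each range are assumed to be included, `score_fn` is a hypothetical stand-in for "run rf-fold with these parameters and score each prediction against its reference", and `toy_scores` returns made-up values purely so the example runs.

```python
from __future__ import annotations

import math
import statistics
from typing import Callable, Sequence


def grid_values(lower: float, upper: float, step: float) -> list[float]:
    """Evenly spaced values from lower to upper (both ends included, by assumption)."""
    n_steps = int(round((upper - lower) / step))
    # Rounding avoids floating-point drift such as 1.8000000000000003
    return [round(lower + i * step, 10) for i in range(n_steps + 1)]


def aggregate(scores: Sequence[float], method: str = "geometric") -> float:
    """Combine per-structure scores: geometric mean (default), median or arithmetic mean."""
    if method == "median":
        return statistics.median(scores)
    if method == "arithmetic":
        return statistics.fmean(scores)
    if method == "geometric":
        if any(s <= 0 for s in scores):
            return 0.0  # a single zero score drives the geometric mean to zero
        return math.exp(statistics.fmean([math.log(s) for s in scores]))
    raise ValueError(f"unknown aggregation method: {method}")


def best_pair(
    score_fn: Callable[[float, float], Sequence[float]],
    slopes: Sequence[float],
    intercepts: Sequence[float],
    method: str = "geometric",
) -> tuple[float, float, float]:
    """Return (slope, intercept, aggregated score) for the best-scoring pair."""
    best = None
    for slope in slopes:
        for intercept in intercepts:
            value = aggregate(score_fn(slope, intercept), method)
            if best is None or value > best[2]:
                best = (slope, intercept, value)
    if best is None:
        raise ValueError("empty grid")
    return best


def toy_scores(slope: float, intercept: float) -> list[float]:
    """Made-up scores for two structures, peaking at slope 1.8 and intercept -0.6."""
    value = max(0.0, 1 - abs(slope - 1.8) / 5 - abs(intercept + 0.6) / 3)
    return [value, value]


slopes = grid_values(0, 5, 0.2)       # default -sl 0,5 with -ss 0.2
intercepts = grid_values(-3, 0, 0.2)  # default -in -3,0 with -is 0.2
best = best_pair(toy_scores, slopes, intercepts)
```

Under the inclusive-range assumption, the default settings give 26 slope values and 16 intercept values, that is 416 slope/intercept pairs, each of which requires folding every selected transcript. Coarser steps or narrower ranges reduce this cost accordingly.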


Output CSV files

RF JackKnife produces a CSV file reporting the FMI (or mFMI) for each pair of intercept (x-axis) and slope (y-axis) values.
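
A table of this kind can be scanned for the best-scoring pair with a few lines of Python. The sketch below rests on assumptions about the file layout that should be checked against an actual output file: a header row holding the intercept values, one row per slope with the slope value in the first column, and a comma as the field separator (exposed as the `delimiter` argument in case the file uses a different one). The function name `best_from_table` is hypothetical and the numbers in `toy_table` are invented for illustration.

```python
from __future__ import annotations

import csv
import io
from typing import TextIO


def best_from_table(handle: TextIO, delimiter: str = ",") -> tuple[float, float, float]:
    """Return (slope, intercept, score) for the highest score in the table.

    Assumed layout: first row = corner cell followed by intercept values,
    following rows = slope value followed by one score per intercept.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    intercepts = [float(cell) for cell in header[1:]]
    best = None
    for row in reader:
        if not row:
            continue  # skip blank lines
        slope = float(row[0])
        for intercept, cell in zip(intercepts, row[1:]):
            score = float(cell)
            if best is None or score > best[2]:
                best = (slope, intercept, score)
    if best is None:
        raise ValueError("table contains no scores")
    return best


# Invented values, only to demonstrate the assumed layout
toy_table = (
    "slope,-0.8,-0.6,-0.4\n"
    "1.6,0.71,0.74,0.70\n"
    "1.8,0.73,0.79,0.75\n"
    "2.0,0.72,0.76,0.74\n"
)
best_slope, best_intercept, best_score = best_from_table(io.StringIO(toy_table))
```

To read a real file, open it and pass the handle to `best_from_table`, adjusting `delimiter` to match the file.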