SCRUB(1) Man Page

Brian E. Coggins

Version 1.0.0, November 2013

NAME

scrub - removes artifacts from NMR spectra acquired with sparse sampling

SYNOPSIS

scrub sampling-pattern input-file [output-file] [options]

DESCRIPTION

The scrub program uses the SCRUB algorithm (see J. Am. Chem. Soc. 134, 18619-18630 (2012)) to remove sampling artifacts from NMR spectra acquired with sparse sampling.

The sampling-pattern file should be a text file with one line for each sampling point. Within each line, whitespace- and/or comma-separated numbers indicate the coordinates for the sampling point, normally as integer multiples of the dwell time in each indirect dimension for "on-grid" experiments. There should be one column for each sparsely sampled dimension. For "off-grid" experiments, the coordinates should be given as floating-point values denoting evolution times in seconds. An additional column of floating-point weighting values may be included after the coordinates. Comment lines beginning with the "#" character may be included. Sampling patterns with extra columns of numbers can be parsed using the special override options described below.
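As an illustration of the file layout described above, the following Python sketch reads a small on-grid pattern with two sparse dimensions and a weight column. It is not scrub's own parser: the function name, the sample data, the skipping of blank lines, and the default weight of 1.0 are all assumptions made for the example.

```python
import re

def parse_sampling_pattern(text, n_dims, has_weights=False):
    """Parse sampling-pattern text into (coordinates, weight) tuples.

    Illustrative only: mirrors the layout described in this man page,
    not scrub's actual parser.
    """
    points = []
    for line in text.splitlines():
        line = line.strip()
        # Skip comment lines; skipping blank lines is an assumption.
        if not line or line.startswith("#"):
            continue
        # Fields may be separated by whitespace, commas, or both.
        fields = [f for f in re.split(r"[,\s]+", line) if f]
        coords = tuple(float(f) for f in fields[:n_dims])
        # Assumed default weight of 1.0 when no weight column is present.
        weight = float(fields[n_dims]) if has_weights else 1.0
        points.append((coords, weight))
    return points

# Hypothetical pattern: U and V as multiples of the dwell time, then a weight.
example = """# u v weight
0 0 1.0
3, 5, 0.8
7,2 0.5
"""
points = parse_sampling_pattern(example, n_dims=2, has_weights=True)
```

Each entry of `points` pairs the coordinates of one sampling point with its weight. An off-grid pattern would have the same layout, with evolution times in seconds in place of the integer multiples.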

The input-file may be in NMRPipe, NMRView, SPARKY/UCSF, or XEASY format; scrub will identify the format from the file extension.

To process the input data in-place, use the --in-place option; otherwise, provide an output-file to receive the processed data. This file is created if it does not exist, with the format determined from the output-file's file extension; the output format need not be the same as the input format. If the output-file already exists, the -w/--overwrite option must be given to indicate that it is safe to overwrite it.
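The two modes can be sketched by assembling the argument list in Python, following the ordering in the SYNOPSIS. The helper name and the file names are hypothetical, and the .ft3 and .ucsf extensions are assumed for illustration only; this page does not list the extensions scrub recognizes.

```python
import shlex

def build_scrub_argv(pattern, input_file, output_file=None,
                     in_place=False, overwrite=False):
    """Assemble a scrub argument list: positional arguments, then options."""
    if not in_place and output_file is None:
        raise ValueError("give an output file or request in-place processing")
    argv = ["scrub", pattern, input_file]
    if output_file is not None:
        argv.append(output_file)
    if in_place:
        argv.append("--in-place")
    if overwrite:
        argv.append("--overwrite")
    return argv

# Hypothetical file names; the extensions are assumptions.
in_place_cmd = shlex.join(
    build_scrub_argv("pattern.txt", "hnco.ft3", in_place=True))
convert_cmd = shlex.join(
    build_scrub_argv("pattern.txt", "hnco.ft3", "hnco_scrub.ucsf",
                     overwrite=True))
```

The first command rewrites the input spectrum in place. The second writes to a separate file in a different format and permits overwriting it if it already exists.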

By default, scrub assumes that the dimensions of the experiment correspond to the dimensions of the sampling pattern, with the F1 dimension matching to the first column (called "U" by scrub), F2 matching to the second column ("V"), etc. Any leftover experimental dimensions are treated as non-sparse. This mapping can be overridden using the dimension assignment options described below.
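The default mapping described above can be sketched as follows. The function is a hypothetical illustration of the rule, not code from scrub.

```python
def default_assignments(n_experiment_dims, n_sparse_dims):
    """Map F1, F2, ... to u, v, w in order; leftover dimensions are index."""
    labels = ["u", "v", "w"]
    if not 1 <= n_sparse_dims <= len(labels):
        raise ValueError("the sampling pattern has one to three sparse dimensions")
    if n_sparse_dims > n_experiment_dims:
        raise ValueError("more sparse dimensions than experimental dimensions")
    return {
        f"F{i + 1}": labels[i] if i < n_sparse_dims else "index"
        for i in range(n_experiment_dims)
    }

# A 3-D experiment with a two-column sampling pattern.
mapping = default_assignments(3, 2)
```

Here `mapping` assigns F1 to u, F2 to v, and F3 to index, matching the defaults described above.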

OPTIONS

Options may be given with either one or two dashes (-threads 3 and --threads 3 are both acceptable) and with either a space or an equal sign between the option name and value (--threads 3 and --threads=3 are both acceptable). In this man page, we follow custom and separate the option name and value by a space when there is one dash and by an equal sign when there are two.

General Options

-h, -help, --help

Show usage information.

--version

Show version information.

--in-place

Process the input spectrum in-place.

-w, -overwrite, --overwrite

Overwrite an existing output file with the same name.

-q, -quiet, --quiet

Suppress information about the configuration and progress of the calculation.

-t THREADS, -threads THREADS, --threads=THREADS

Use THREADS threads for parallel computation. The default is to set the number of threads to match the number of processor cores.

Dimension Assignment Options

Note: providing any of the following dimension assignment options turns off automatic dimension assignment. If you provide one of these options, provide the full set for all of the dimensions in the experiment.

-1 ASSIGNMENT, -f1 ASSIGNMENT, --f1=ASSIGNMENT

Dimension assignment for the F1 dimension of the experiment, matching it to one of the sparse sampling dimensions in the sampling pattern (U, V, or W, corresponding to the first, second, or third column in the pattern file, respectively) or indicating that it is not sparse (an "index" dimension). Valid ASSIGNMENT values are u, v, w, and index.

-2 ASSIGNMENT, -f2 ASSIGNMENT, --f2=ASSIGNMENT

Dimension assignment for the F2 dimension of the experiment, matching it to one of the sparse sampling dimensions in the sampling pattern (U, V, or W, corresponding to the first, second, or third column in the pattern file, respectively) or indicating that it is not sparse (an "index" dimension). Valid ASSIGNMENT values are u, v, w, and index.

-3 ASSIGNMENT, -f3 ASSIGNMENT, --f3=ASSIGNMENT

Dimension assignment for the F3 dimension of the experiment, matching it to one of the sparse sampling dimensions in the sampling pattern (U, V, or W, corresponding to the first, second, or third column in the pattern file, respectively) or indicating that it is not sparse (an "index" dimension). Valid ASSIGNMENT values are u, v, w, and index.

-4 ASSIGNMENT, -f4 ASSIGNMENT, --f4=ASSIGNMENT

Dimension assignment for the F4 dimension of the experiment, matching it to one of the sparse sampling dimensions in the sampling pattern (U, V, or W, corresponding to the first, second, or third column in the pattern file, respectively) or indicating that it is not sparse (an "index" dimension). Valid ASSIGNMENT values are u, v, w, and index.

Dimension Apodization Information

-f1-apod APOD, --f1-apod=APOD

Apodization information for the F1 dimension. Required for each indirect dimension that was apodized during post-processing, unless the data are in NMRPipe format, in which case this information can be extracted from the file header automatically. Specify the apodization function by putting the corresponding NMRPipe command in quotes, e.g.

--f1-apod="-fn EM -lb 10 -c 0.5"

for an exponential function with 10 Hz line-broadening and a first-point correction of 0.5.

-f2-apod APOD, --f2-apod=APOD

Apodization information for the F2 dimension. Required for each indirect dimension that was apodized during post-processing, unless the data are in NMRPipe format, in which case this information can be extracted from the file header automatically. Specify the apodization function by putting the corresponding NMRPipe command in quotes, e.g.

--f2-apod="-fn EM -lb 10 -c 0.5"

for an exponential function with 10 Hz line-broadening and a first-point correction of 0.5.

-f3-apod APOD, --f3-apod=APOD

Apodization information for the F3 dimension. Required for each indirect dimension that was apodized during post-processing, unless the data are in NMRPipe format, in which case this information can be extracted from the file header automatically. Specify the apodization function by putting the corresponding NMRPipe command in quotes, e.g.

--f3-apod="-fn EM -lb 10 -c 0.5"

for an exponential function with 10 Hz line-broadening and a first-point correction of 0.5.

-f4-apod APOD, --f4-apod=APOD

Apodization information for the F4 dimension. Required for each indirect dimension that was apodized during post-processing, unless the data are in NMRPipe format, in which case this information can be extracted from the file header automatically. Specify the apodization function by putting the corresponding NMRPipe command in quotes, e.g.

--f4-apod="-fn EM -lb 10 -c 0.5"

for an exponential function with 10 Hz line-broadening and a first-point correction of 0.5.
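Because the APOD value contains spaces, it must reach scrub as a single argument, which is why the examples above quote it. The Python sketch below shows how a shell would split such a command line, and how to build one from a script without quoting by hand. The file names and the .ucsf extension are hypothetical.

```python
import shlex

apod = "-fn EM -lb 10 -c 0.5"

# Building the argument list directly keeps the apodization string whole.
argv = ["scrub", "pattern.txt", "input.ucsf", "output.ucsf",
        f"--f1-apod={apod}"]

# shlex.join adds the quoting a POSIX shell needs.
command_line = shlex.join(argv)

# Splitting the quoted form from this man page yields one argument.
typed = 'scrub pattern.txt input.ucsf output.ucsf --f1-apod="-fn EM -lb 10 -c 0.5"'
typed_argv = shlex.split(typed)
```

Both `argv` and `typed_argv` end with the single element `--f1-apod=-fn EM -lb 10 -c 0.5`. Without the quotes, the shell would pass `-fn`, `EM`, and the rest as separate arguments.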

-f1-fpc FPC, --f1-fpc=FPC

First-point correction for the F1 dimension, for use when a correction is needed but is not specified in either the file header or an --f1-apod option. FPC should be a number, typically 0.5.

-f2-fpc FPC, --f2-fpc=FPC

First-point correction for the F2 dimension, for use when a correction is needed but is not specified in either the file header or an --f2-apod option. FPC should be a number, typically 0.5.

-f3-fpc FPC, --f3-fpc=FPC

First-point correction for the F3 dimension, for use when a correction is needed but is not specified in either the file header or an --f3-apod option. FPC should be a number, typically 0.5.

-f4-fpc FPC, --f4-fpc=FPC

First-point correction for the F4 dimension, for use when a correction is needed but is not specified in either the file header or an --f4-apod option. FPC should be a number, typically 0.5.

Sampling Pattern Parsing Options

Note: providing one or more of the --u-col, --v-col, --w-col, or --weight-col options turns off automatic parsing of the sampling pattern. In such cases, supply as many of the other options as are needed to interpret the sampling pattern.

-u-col COL, --u-col=COL

The column of the sampling pattern containing the U dimension. The first column is numbered 0, the second 1, etc.

-v-col COL, --v-col=COL

The column of the sampling pattern containing the V dimension, if applicable. The first column is numbered 0, the second 1, etc.

-w-col COL, --w-col=COL

The column of the sampling pattern containing the W dimension, if applicable. The first column is numbered 0, the second 1, etc.

-weight-col COL, --weight-col=COL

The column of the sampling pattern containing the weighting information, if applicable. The first column is numbered 0, the second 1, etc.
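The zero-based column numbering used by these options can be sketched in Python. The row layout below (a running index followed by U, V, and a weight) is a hypothetical pattern file with one extra column, and the helper is illustrative rather than part of scrub.

```python
def select_columns(fields, u_col, v_col=None, w_col=None, weight_col=None):
    """Pick coordinate and weight fields from one row by 0-based column."""
    coords = [fields[c] for c in (u_col, v_col, w_col) if c is not None]
    weight = fields[weight_col] if weight_col is not None else None
    return coords, weight

# Hypothetical row: column 0 is a running index that should be skipped.
row = ["17", "4", "9", "0.75"]

# Equivalent in spirit to: --u-col=1 --v-col=2 --weight-col=3
coords, weight = select_columns(row, u_col=1, v_col=2, weight_col=3)
```

Here `coords` holds the U and V fields taken from the second and third columns, and `weight` holds the fourth.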

-off-grid, --off-grid

Interpret the sampling pattern as off-grid.

-ignore-weights, --ignore-weights

Ignore any weighting information found in the sampling pattern.

Options for Calculating Only Part of an Input Spectrum

-position POS [additional POS], --position=POS [additional POS]

Instead of processing the entire input spectrum, process one or more specific vectors, planes, or cubes (depending on the number of sparse indirect dimensions). Because the artifacts at each position along the index dimension(s) are independent of those at every other position, individual index-dimension positions can be processed on their own.

Specify a position by giving one integer for each index dimension in the spectrum, separated by commas but no white space, where each integer designates a data point ranging from zero at the low-frequency end to the number of points on that dimension minus one at the high-frequency end. The dimensions should be listed in the order of least-frequently changing to most-frequently changing.

To process multiple positions, repeat this option for each position to calculate (e.g. -position 10 -position 14) OR list the positions, separated by spaces, after the flag (e.g. -position 10 14).
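The position syntax can be sketched with a small Python helper that turns zero-based positions into the space-separated form of the option. The helper names are hypothetical.

```python
def ordinal_to_index(n):
    """Convert a 1-based ordinal (the 8th plane) to a 0-based position (7)."""
    return n - 1

def position_args(positions):
    """Format positions as the arguments of a single -position option.

    Each position is an int (one index dimension) or a tuple of ints
    (several index dimensions), joined by commas with no white space.
    """
    args = ["-position"]
    for pos in positions:
        if isinstance(pos, int):
            pos = (pos,)
        args.append(",".join(str(p) for p in pos))
    return args

single = position_args([ordinal_to_index(8)])
several = position_args([(14, 9), (15, 9), (16, 9)])
```

Joined with spaces, `single` reads `-position 7` and `several` reads `-position 14,9 15,9 16,9`.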

Example: in a 3-D spectrum with two sparse dimensions, process the 8th F1/F2 plane:

-position 7

Example: in a 4-D spectrum with three sparse dimensions, process the 63rd and 64th F1/F2/F3 cubes:

-position 62 63

Example: in a 4-D spectrum where F2 and F3 are sparse and F1 and F4 are conventional, process the F2/F3 planes numbered 14-16 at F4 position 9:

-position 14,9 15,9 16,9

-insert-in-place, --insert-in-place

When processing specific positions, write the results either:

  • back to the input file, overwriting the original data, if --in-place is also provided on the command-line

  • to the corresponding locations in output-file, creating the file if need be, otherwise overwriting the existing data

-separate-outputs, --separate-outputs

When processing specific positions, write the result for each position into a separate lower-dimensional file. These files are named based on the output-file parameter, with an underscore and one or more position numbers added.

SCRUB Options

-g GAIN, -gain GAIN, --gain=GAIN

The gain, in percent. The default is 10% for spectra with one or two sparse dimensions and 50% for those with three sparse dimensions.

-b BASE, -base BASE, --base=BASE

Parameter indicating how far to continue subtraction during SCRUB processing. SCRUB will continue to subtract until the estimated residual artifacts from all signals currently being processed are less than BASE times the current estimated noise level. The default is 0.01.
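The defaults and the stopping rule stated above can be written out as a short Python sketch. It restates only what this page says. How scrub estimates residual artifacts and the noise level is not described here, so those values appear as plain inputs, and both function names are hypothetical.

```python
def default_gain_percent(n_sparse_dims):
    """Default gain in percent, as stated for the -gain option."""
    if n_sparse_dims in (1, 2):
        return 10.0
    if n_sparse_dims == 3:
        return 50.0
    raise ValueError("expected one to three sparse dimensions")

def keep_subtracting(residual_artifact_estimates, noise_level, base=0.01):
    """True while any signal's estimated residual artifact level is at
    least base times the current estimated noise level."""
    threshold = base * noise_level
    return any(a >= threshold for a in residual_artifact_estimates)

# Placeholder numbers in arbitrary intensity units.
still_going = keep_subtracting([2.0, 0.02], noise_level=100.0)
finished = not keep_subtracting([0.5, 0.02], noise_level=100.0)
```

With the default BASE of 0.01 and a noise estimate of 100, the threshold is 1.0. The first call continues because one estimate is above it, and the second stops because all estimates are below it.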

Pure Component Calculation Options

-pure-comp-mode MODE, --pure-comp-mode=MODE

The method used to determine the pure component, where MODE may equal contour-irregular, contour-ellipsoid (the default), or fixed-ellipsoid.

-pc-contour-irregular-level LEVEL, --pc-contour-irregular-level=LEVEL

The contour level for the contour-irregular calculation.

-pc-contour-irregular-margin MARGIN, --pc-contour-irregular-margin=MARGIN

If MARGIN is nonzero, add a one-point margin around the contour; otherwise (the default), do not add a margin.

-pc-contour-ellipsoid-level LEVEL, --pc-contour-ellipsoid-level=LEVEL

The contour level for the contour-ellipsoid calculation.

-pc-contour-ellipsoid-margin MARGIN, --pc-contour-ellipsoid-margin=MARGIN

Expand the ellipsoid by a margin of MARGIN percent beyond what is needed to circumscribe the contour. Default is no margin.

-pc-fixed-ellipsoid-size SIZE, --pc-fixed-ellipsoid-size=SIZE

Size of the ellipsoid for fixed-ellipsoid calculations. Each semiaxis is SIZE percent of the spectral width.
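As a worked example of the SIZE parameter, the Python sketch below computes the semiaxes for a two-dimensional case and tests whether a point lies inside the resulting ellipse. The spectral widths, the use of Hz as the unit, and the function names are assumptions made for illustration.

```python
def semiaxes(spectral_widths, size_percent):
    """Each semiaxis is size_percent of that dimension's spectral width."""
    return [sw * size_percent / 100.0 for sw in spectral_widths]

def inside_ellipsoid(offsets, axes):
    """Standard ellipsoid test: sum((d / a)^2) <= 1 for offsets d from
    the center and semiaxes a."""
    return sum((d / a) ** 2 for d, a in zip(offsets, axes)) <= 1.0

# Hypothetical spectral widths of 2000 Hz and 1500 Hz with SIZE = 2.
axes = semiaxes([2000.0, 1500.0], 2.0)
near = inside_ellipsoid([20.0, 15.0], axes)
far = inside_ellipsoid([40.0, 30.0], axes)
```

With these numbers the semiaxes are 40 Hz and 30 Hz. A point offset by (20, 15) Hz from the center falls inside the ellipse, while one offset by (40, 30) Hz falls outside.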

PSF and Pure Component Output

-psf PSFFILE, --psf=PSFFILE

Save the calculated PSF to the spectrum file PSFFILE.

-pure-comp PURECOMPFILE, --pure-comp=PURECOMPFILE

Save the calculated pure component to the spectrum file PURECOMPFILE.

Diagnostic Options

-l LOGFILE, -log LOGFILE, --log=LOGFILE

Record a log to the text file LOGFILE.

-verbose-log, --verbose-log

Record complete details of the calculation to the log file; without this option, only abbreviated information is recorded.

-r REPORT, -noise-report-csv REPORT, --noise-report-csv=REPORT

Record a report on noise (artifact) reduction to the CSV file REPORT.

SEE ALSO

clean(1), pipewash(1)

HTML and PDF user's manuals are provided in the doc subdirectory of the nmr_wash distribution, and at the nmr_wash web page.