GelBuddy's Automatic Signal Detection Algorithm
This document provides an outline of GelBuddy's automatic signal detection algorithm and attempts
to explain the significance of many of the settings available in the Analyze Gel
window. A detailed description of the algorithm is in press.
GelBuddy begins automatic signal detection by extracting one-dimensional electropherogram data from the
two-dimensional image data, using previously established lane tracks.
1. Raw image data, 700nm channel.
2. Extracted electropherogram data, presented as a "virtual gel" image.
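The extraction step can be pictured with a minimal sketch. It assumes that a lane track gives the x-coordinate of the lane centre for every image row and that the pixels in a small window around that centre are averaged. The function name extract_lane, the half_width parameter, and the averaging rule are illustrative assumptions, not GelBuddy's actual implementation.

```python
def extract_lane(image, track, half_width=1):
    """Average the pixels around the lane centre in every image row.

    image      -- list of rows, each a list of pixel intensities
    track      -- hypothetical lane track: one x-coordinate per row
    half_width -- assumed number of pixels taken on each side of the centre
    """
    profile = []
    for row, centre in zip(image, track):
        lo = max(0, centre - half_width)
        hi = min(len(row), centre + half_width + 1)
        window = row[lo:hi]
        profile.append(sum(window) / len(window))
    return profile


# A tiny 3x4 "image" in which the lane drifts one pixel to the right per row.
image = [
    [0, 9, 0, 0],
    [0, 0, 6, 0],
    [0, 0, 0, 3],
]
track = [1, 2, 3]
profile = extract_lane(image, track, half_width=0)
```

Stacking one such profile per lane side by side gives a rectangular array of the kind shown as a "virtual gel".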
Next, the data are resampled, using the de-smiling curves to
stretch or squish each lane to compensate for artifactual differences in mobility.
This "flattens" the data so that co-migrating bands appear at the
same y-coordinate.
3. Resampled data.
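As a rough illustration of the resampling step, the sketch below assumes that the de-smiling curves can be reduced, for each lane, to a list of fractional source positions: for every output row, the position in the original lane profile that should be read. Linear interpolation is also an assumption; the text does not say which interpolation GelBuddy uses.

```python
def resample_lane(profile, source_positions):
    """Read a lane profile at fractional positions by linear interpolation.

    source_positions is a hypothetical per-lane form of the de-smiling
    curve: one (possibly fractional) source row for each output row.
    """
    last = len(profile) - 1
    out = []
    for pos in source_positions:
        pos = min(max(pos, 0.0), float(last))  # clamp to the valid range
        i = int(pos)
        j = min(i + 1, last)
        frac = pos - i
        out.append(profile[i] * (1.0 - frac) + profile[j] * frac)
    return out


lane = [0.0, 0.0, 10.0, 0.0, 0.0]
# This lane is assumed to have run one row "slow", so each output row
# reads from one row further down the original profile.
shifted = resample_lane(lane, [1.0, 2.0, 3.0, 4.0, 4.0])
```

After resampling, the band that sat at row 2 appears at row 1, where it would line up with the same band in a neighbouring lane.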
The intensity values are scaled to mean value 1 (reducing variations in lane intensity),
and the 20th percentile intensity value in each row is calculated, forming an artificial
background pattern. The percentile used is set by the user-adjustable Background Percentile value.
A higher value yields a more accurate background pattern, but it is also more likely to let dark
bands at the positions of common polymorphisms appear in the background pattern, which makes
detection of signals at these mobilities erratic.
4. Rescaled data and calculated background pattern.
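The rescaling and background calculation can be sketched as follows. The sketch assumes that each lane is scaled separately and that the percentile is taken across lanes within a row using linear interpolation between ranks; percentile definitions vary, and the one GelBuddy uses is not stated here. All function names are illustrative.

```python
def scale_to_unit_mean(lane):
    """Divide a lane profile by its mean so that its mean becomes 1."""
    mean = sum(lane) / len(lane)
    return [v / mean for v in lane]


def percentile(values, pct):
    """Linearly interpolated percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def background_pattern(lanes, pct=20.0):
    """One background value per row: the pct-th percentile across lanes."""
    n_rows = len(lanes[0])
    return [percentile([lane[r] for lane in lanes], pct) for r in range(n_rows)]


# Five featureless lanes plus one lane carrying a band in the middle row.
raw = [[3.0, 3.0, 3.0] for _ in range(5)] + [[1.0, 4.0, 1.0]]
scaled = [scale_to_unit_mean(lane) for lane in raw]

low = background_pattern(scaled, 20.0)    # band is excluded from the background
high = background_pattern(scaled, 100.0)  # band leaks into the background
```

In this toy case the 20th percentile ignores the single lane with a band, while an extreme setting of 100 copies the band into the background pattern, which is the trade-off described above.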
GelBuddy then uses a decorrelation algorithm to subtract the background pattern from each lane,
producing a set of "foreground" data in which bands are more easily detected.
5. Decorrelation Output.
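This document does not specify the decorrelation algorithm. As a stand-in only, the sketch below uses ordinary least-squares regression of a lane on the background pattern and keeps the residual, which is by construction uncorrelated with the background. It should be read as an illustration of the general idea, not as GelBuddy's method; the function name decorrelate is hypothetical.

```python
def decorrelate(lane, background):
    """Remove the part of a lane that is linearly explained by the background.

    Stand-in technique (an assumption): least-squares fit of
    lane ~ gain * background + offset, returning the residual.
    """
    n = len(lane)
    mean_l = sum(lane) / n
    mean_b = sum(background) / n
    cov = sum((l - mean_l) * (b - mean_b) for l, b in zip(lane, background))
    var = sum((b - mean_b) ** 2 for b in background)
    gain = cov / var if var else 0.0
    return [(l - mean_l) - gain * (b - mean_b) for l, b in zip(lane, background)]


background = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
# A lane that is twice the background plus an extra band at row 2.
lane = [2.0, 4.0, 5.0, 4.0, 2.0, 4.0]
foreground = decorrelate(lane, background)
```

In the residual the alternating background banding is gone and the extra band at row 2 is the largest value.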
The decorrelated data contain high-spatial-frequency artifacts caused by inaccuracy of
the de-smiling curves and lane-specific variation in background banding.
A smoothing pass reduces the intensity of these artifacts.
6. Smoothed data.
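The kind of filter used for the smoothing pass is not stated. A simple moving average, shown below as an assumed example, illustrates how such a pass reduces narrow, high-frequency spikes; the half_width parameter is hypothetical.

```python
def smooth(values, half_width=1):
    """Moving average along a lane; the window shrinks at the ends."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


spiky = [0.0, 3.0, 0.0, 0.0, 0.0]
smoothed = smooth(spiky, half_width=1)
```

The single-row spike of height 3 is spread over its neighbours and its height is reduced, at the cost of some loss of resolution.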
Each peak in the smoothed data is assigned a signal score based on signal strength. Very weak
signals (those below the signal detection threshold) are eliminated, as are signals
that appear in the same location in both channels, and those appearing where lane markers
are expected. Next, GelBuddy searches for pairs of strong signals (dark bands)
whose combined fragment lengths are close to the full PCR product length. Weak signals will be
given a higher pair score if a strong complementary signal is present in the other channel. Signals
whose pair score exceeds the confirmation threshold are then marked for review by the user.
7. Marked bands. 700nm bands are marked in red; 800nm bands (not visible in this image) are marked in blue.
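The thresholding and pairing logic can be sketched under several stated assumptions: peaks are simple local maxima, the signal score is the peak height, a complementary pair is one signal from each channel whose fragment lengths sum to the product length within a tolerance, and the pair score is the signal's own score plus the score of its best partner. None of these formulas are given in this document, and the removal of same-position and lane-marker signals is omitted for brevity.

```python
def find_peaks(profile, detection_threshold):
    """Return (row, height) for local maxima at or above the threshold."""
    peaks = []
    for i in range(1, len(profile) - 1):
        is_peak = profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
        if is_peak and profile[i] >= detection_threshold:
            peaks.append((i, profile[i]))
    return peaks


def pair_score(signal, other_channel, product_length, tolerance):
    """Hypothetical pair score: own score plus the best complementary score.

    signal and other_channel entries are (fragment_length, signal_score).
    """
    length, score = signal
    partners = [s for (l, s) in other_channel
                if abs(length + l - product_length) <= tolerance]
    return score + max(partners, default=0.0)


peaks = find_peaks([0.0, 0.2, 0.1, 1.5, 0.3, 0.0], detection_threshold=1.0)

# Signals already converted to fragment lengths (assumed values).
ch700 = [(300, 2.0), (500, 1.0)]
ch800 = [(700, 3.0)]
confirmation_threshold = 4.0
marked = [s for s in ch700
          if pair_score(s, ch800, product_length=1000, tolerance=10)
          > confirmation_threshold]
```

Here the 300-base signal has a complementary 700-base partner in the other channel and is marked for review, while the unpaired 500-base signal is not.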