Monday 2 June 2014

wigCorrelate

Download:

wigCorrelate is downloadable from https://github.com/adamlabadorf/ucsc_tools/blob/master/executables/wigCorrelate

After downloading, wigCorrelate must be made executable before it can be called (for example with chmod +x wigCorrelate).

Principle:

wigCorrelate finds the regions that overlap between two wig files and calculates a base-by-base Pearson correlation coefficient over them.

Cited from https://www.biostars.org/p/85006/

wigCorrelate - Produce a table that correlates all pairs of wigs.
usage:
wigCorrelate one.wig two.wig ... n.wig
This works on bigWig as well as wig files.
The output is to stdout
options:
-clampMax=N - values larger than this are clipped to this value
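
The effect of -clampMax can be pictured with a very small sketch. This only illustrates the description in the usage message; clampToMax is a hypothetical name, not a function taken from the wigCorrelate source.

```c
/* Hypothetical helper illustrating -clampMax=N:
 * values larger than clampMax are clipped to clampMax,
 * everything else is passed through unchanged. */
double clampToMax(double value, double clampMax)
{
return (value > clampMax) ? clampMax : value;
}
```

With a clamp of 10, a signal value of 50 would enter the correlation as 10, while a value of 5 would stay 5.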

It works by finding items that overlap in the different wig files.
Within the overlap it considers each base a separate
observation and calculates Pearson's R based on that. Here's two
relevant snippets of the code from lib/correlate.c

void correlateNext(struct correlate *c, double x, double y)
/* Add next sample to correlation. */
{
c->sumX += x;
c->sumXX += x*x;
c->sumXY += x*y;
c->sumY += y;
c->sumYY += y*y;
c->n += 1;
}

double correlateResult(struct correlate *c)
/* Returns correlation (aka R) */
{
double r = 0;
if (c->n > 0)
    {
    double sp = c->sumXY - c->sumX*c->sumY/c->n;
    double ssx = c->sumXX - c->sumX*c->sumX/c->n;
    double ssy = c->sumYY - c->sumY*c->sumY/c->n;
    double q = ssx*ssy;
    if (q != 0)
        r = sp/sqrt(q);
    }
return r;
}

It has a little optimization that lets it work faster than this when it
knows it has a multiple-base window where x and y are constant that
keeps the base-by-base approach from actually getting expensive when
it's not really needed.

Unfortunately it doesn't do anything particularly sensible with the
regions where there is data in one wig but not another. It's treated the
same as something that wasn't covered by either wig.


Cited from http://redmine.soe.ucsc.edu/forum/index.php?t=msg&goto=4215&S=793c2bd15fc7a2af0bf8351cb46e32b4
