Difference in Mean

The mean of each subgroup is taken at each datapoint, and the difference between these means is displayed. If subgroup1 has the larger mean, the point is plotted as a red bar above the midline; if subgroup2 has the larger mean, as a green bar below the midline.

More...
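As a minimal sketch of the plotted quantity (the helper name is hypothetical; subgroups are assumed to be numeric arrays):

```python
import numpy as np

def mean_difference(subgroup1, subgroup2):
    """Difference in subgroup means at one datapoint: a positive value
    would plot as a red bar above the midline, a negative value as a
    green bar below it."""
    return float(np.mean(subgroup1) - np.mean(subgroup2))

mean_difference([2.0, 3.0, 4.0], [1.0, 1.0, 1.0])  # -> 2.0 (red, above midline)
```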

Student's T-Test

The negative log-probability (-log10(P)) that the two subgroups are drawn from the same distribution is plotted. The calculation assumes the data in both subgroups are normally distributed. Loci with P<=0.05 (i.e. values >= 1.3) are displayed in color, indicating significance at that locus. If subgroup1 has the larger mean, the point is plotted above the midline; if subgroup2 has the larger mean, below it.

More...
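The plotted score could be computed as follows (a sketch assuming SciPy; the helper name and sign convention are hypothetical):

```python
import numpy as np
from scipy import stats

def signed_neglog_ttest(g1, g2):
    """-log10(p) from a two-sample t-test, signed by which subgroup
    has the larger mean. Assumes approximately normal data."""
    t, p = stats.ttest_ind(g1, g2)
    score = -np.log10(p)
    # |score| >= 1.3 corresponds to p <= 0.05
    return score if np.mean(g1) >= np.mean(g2) else -score
```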

Mann-Whitney-Wilcoxon Test (Wilcoxon test)

The negative log-probability that the two subgroups are drawn from the same distribution is plotted. This is a non-parametric calculation, and each subgroup must contain at least 5 members. Loci with P<=0.05 (i.e. values >= 1.3) are displayed in color, indicating significance at that locus. If subgroup1 has the larger mean, the point is plotted above the midline; if subgroup2 has the larger mean, below it.

More...
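A sketch of the same signed score using the rank-based test (assuming SciPy; the helper name and the enforcement of the 5-member minimum are illustrative):

```python
import numpy as np
from scipy import stats

def signed_neglog_mwu(g1, g2):
    """-log10(p) from the Mann-Whitney-Wilcoxon test, signed by
    which subgroup has the larger mean."""
    if min(len(g1), len(g2)) < 5:  # mirror the 5-member minimum
        raise ValueError("need at least 5 members per subgroup")
    u, p = stats.mannwhitneyu(g1, g2, alternative="two-sided")
    score = -np.log10(p)
    return score if np.mean(g1) >= np.mean(g2) else -score
```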

Fisher's Exact Test

The negative log-probability that, given the subgroups, the values are as unbalanced around zero as observed is plotted. This assumes zero is a global midpoint in the data, as in log-ratios. The test is computationally expensive, so it is limited to subgroups of N < 300 and may be slow to appear for subgroups of N > 100. There is no subgroup directionality associated with this statistic, so it is always plotted above the midline. Significant loci (P<=0.05, i.e. values >= 1.3) are colored red.

More...
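One plausible construction of this test (the helper name and the exact 2x2 table layout are assumptions, not the tool's confirmed implementation) counts values above and below zero in each subgroup and runs Fisher's exact test on the resulting table:

```python
import numpy as np
from scipy import stats

def neglog_fisher_sign_test(g1, g2):
    """-log10(p) from Fisher's exact test on a 2x2 table of counts
    above/below zero per subgroup. Assumes zero is a meaningful
    midpoint, e.g. for log-ratio data."""
    table = [[np.sum(np.array(g) > 0), np.sum(np.array(g) <= 0)]
             for g in (g1, g2)]
    _, p = stats.fisher_exact(table)
    return -np.log10(p)
```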

Fisher's Linear Discriminant

The negative log-probability of obtaining the separation observed between the two subgroups is plotted. This is a function of the ratio of within-group to between-group variance. This is a non-parametric test. As there is no directionality associated with this statistic, all points are plotted above the midline. Loci with P<=0.05 (i.e. values >= 1.3) are plotted in red, indicating significance.

More...
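One common way to attach a p-value to a between- vs within-group variance ratio is a one-way ANOVA F-test; the sketch below (hypothetical helper, assuming SciPy) illustrates that idea, though the tool's exact calculation may differ:

```python
import numpy as np
from scipy import stats

def neglog_variance_ratio(g1, g2):
    """-log10(p) for the between-group vs within-group variance
    ratio, via a one-way ANOVA F-test. For two groups the F
    statistic equals the squared two-sample t statistic."""
    f, p = stats.f_oneway(g1, g2)
    return -np.log10(p)
```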

Jarque-Bera Test of Normality

This test indicates whether the data in the defined subgroups are normally distributed, which is useful to check before relying on parametric tests such as the Student's T-Test. Two log-probabilities are calculated (one per subgroup) and the lower probability is displayed. Significantly non-normal loci (P<=0.05, i.e. values >= 1.3) are colored: red and above the midline indicates subgroup1 is the more non-normal; green and below the midline indicates subgroup2 is the more non-normal. Note that this test has low power at small sample sizes, so it may give false negatives.

More...
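The per-subgroup comparison could look like this (a sketch assuming SciPy; the helper name, sign convention, and the tiny-p floor are assumptions):

```python
import numpy as np
from scipy import stats

def signed_neglog_jarque_bera(g1, g2):
    """Run Jarque-Bera on each subgroup and plot the lower p-value,
    signed by which subgroup is the more non-normal."""
    _, p1 = stats.jarque_bera(np.asarray(g1))
    _, p2 = stats.jarque_bera(np.asarray(g2))
    p1, p2 = max(p1, 1e-300), max(p2, 1e-300)  # avoid log10(0)
    if p1 <= p2:
        return -np.log10(p1)  # subgroup1 more non-normal: above midline
    return np.log10(p2)       # subgroup2 more non-normal: below midline
```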

Brown-Forsythe Test for Homogeneity of Variance

The negative log-probability that the underlying distributions for the subgroups have the same variance, given the observed datapoints. Unlike the statistical tests above, which ask whether levels differ between subgroups, the Brown-Forsythe and Levene's tests indicate where the spread differs between subgroups. Loci with significantly different variance (P<=0.05, i.e. values >= 1.3) are colored: red and above the midline indicates subgroup1 is more variable; green and below the midline indicates subgroup2 is more variable. This test uses the median to center the variance, so it is more robust to skewed, non-Gaussian data than Levene's original mean-centered test.

More...

Levene's Test for Homogeneity of Variance

The negative log-probability that the underlying distributions for the subgroups have the same variance, given the observed datapoints. Unlike the statistical tests above, which ask whether levels differ between subgroups, the Brown-Forsythe and Levene's tests indicate where the spread differs between subgroups. Loci with significantly different variance (P<=0.05, i.e. values >= 1.3) are colored: red and above the midline indicates subgroup1 is more variable; green and below the midline indicates subgroup2 is more variable. Levene's original test uses the mean to center the variance, so it can give false positives when the data are skewed; the Brown-Forsythe test's median centering is the more robust choice for non-Gaussian data.

More...
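The two tests are the same calculation apart from the centering statistic, and SciPy exposes both through the center parameter of scipy.stats.levene (the helper name and sign convention below are hypothetical):

```python
import numpy as np
from scipy import stats

def signed_neglog_variance_test(g1, g2, center="median"):
    """-log10(p) for equality of variances, signed by which subgroup
    is more spread out. center="median" gives the Brown-Forsythe
    test; center="mean" gives Levene's original test."""
    _, p = stats.levene(g1, g2, center=center)
    score = -np.log10(p)
    return score if np.var(g1) >= np.var(g2) else -score
```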

Bonferroni Correction

This is a conservative method for correcting estimated p-values when testing multiple hypotheses. The correction simply multiplies the p-values calculated by the methods listed above by the number of tests performed. The number of tests depends on the current view of the data. In "chromosome" view, p-values are multiplied by the number of visible probes. When more than one chromosome is visible, the number of probes used for the Bonferroni correction comes from the down-sampled table of approximately 3,000 probes. When zoomed to within a single chromosome, the number of tests is the number of high-resolution probes for that chromosome. In "genesets" view, the Bonferroni correction uses the total number of probes summed across all displayed genesets.

More...
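The correction itself is a one-liner; choosing the multiplier (visible probes, down-sampled probes, or geneset totals, as described above) is view-dependent and left to the caller in this sketch (helper name hypothetical):

```python
import numpy as np

def bonferroni(pvalues, n_tests=None):
    """Multiply each p-value by the number of tests, capping at 1.
    n_tests defaults to the number of p-values given, but per the
    text it may instead be e.g. the number of visible probes."""
    p = np.asarray(pvalues, dtype=float)
    n = len(p) if n_tests is None else n_tests
    return np.minimum(p * n, 1.0)

bonferroni([0.01, 0.04, 0.5])  # -> array([0.03, 0.12, 1.0])
```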

Benjamini-Hochberg (FDR)

FDR (false discovery rate) control is a multiple-hypothesis correction method that is less conservative than the Bonferroni method. The false discovery rate is the expected proportion of false positives among all significant tests. FDR control allows researchers to identify a set of "candidate positives", of which a high proportion are likely to be true positives. This is similar to R's p.adjust(p, "BH"), where "BH" stands for the Benjamini & Hochberg method (aka "fdr").

More...
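The step-up procedure behind p.adjust(p, "BH") can be sketched as follows (a self-contained NumPy version; the function name is illustrative):

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values: sort, scale each p by
    n/rank, then enforce monotonicity from the largest p downwards,
    matching R's p.adjust(p, "BH")."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # running minimum from the right keeps adjusted values monotone
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.minimum(ranked, 1.0)
    return out

benjamini_hochberg([0.01, 0.02, 0.03, 0.04])  # -> array([0.04, 0.04, 0.04, 0.04])
```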
And more...