Per comparison error rate (PCER)
Per-family error rate (PFER)
Family-wise error rate (FWER)
False discovery rate (FDR)
Positive false discovery rate (pFDR)
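As a rough illustration of how these five rates differ, the sketch below estimates each of them from simulated outcomes. It assumes the usual notation: m tests per family, V false rejections, and R total rejections in each simulated family. The function name `empirical_error_rates` and the toy numbers are hypothetical and only serve the example.

```python
from typing import Dict, Sequence


def empirical_error_rates(V: Sequence[int], R: Sequence[int], m: int) -> Dict[str, float]:
    """Estimate multiple-testing error rates from simulated families.

    V[i] is the number of false rejections and R[i] the total number of
    rejections in simulated family i; each family contains m tests.
    """
    n = len(V)
    # V/R is taken to be 0 when nothing is rejected (R == 0).
    ratios = [v / r if r > 0 else 0.0 for v, r in zip(V, R)]
    n_with_rejections = sum(1 for r in R if r > 0)
    return {
        "PCER": sum(V) / (n * m),                    # E[V] / m
        "PFER": sum(V) / n,                          # E[V]
        "FWER": sum(1 for v in V if v >= 1) / n,     # P(V >= 1)
        "FDR": sum(ratios) / n,                      # E[V/R], with V/R = 0 if R = 0
        "pFDR": (sum(ratios) / n_with_rejections     # E[V/R | R > 0]
                 if n_with_rejections else float("nan")),
    }


# Toy input: four simulated families of m = 10 tests each.
rates = empirical_error_rates(V=[0, 1, 2, 0], R=[0, 2, 4, 1], m=10)
print(rates)
```

The only difference between FDR and pFDR in this sketch is the denominator: FDR averages over all families, while pFDR averages only over families with at least one rejection.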
Multiple Testing
Wednesday, 25 February 2015
Tuesday, 24 February 2015
Gene Set Enrichment Analysis
Command line options for running GSEA: Syntax
Excerpts from GSEAPreranked Page
- When using the GSEAPreranked tool, we recommend you provide a ranked list that already has unique human gene symbols and select false for the parameter Collapse data set to gene symbols.
- In standard GSEA you can choose to set the parameter Permutation type to phenotype (the default) or gene set, but this option is not available in GSEAPreranked.
- In the case of GSEAPreranked, you should make sure that this weighted scoring scheme applies to your choice of ranking statistic. When in doubt, we recommend using a more conservative scoring approach by setting Enrichment statistic to classic.
- Select Tools>GseaPreranked.
- Gene sets database.
- Number of permutations. Specify the number of gene_set permutations to perform when assessing the statistical significance of the enrichment score. It is best to start with a small number, such as 10. After the analysis completes successfully, run it again with a full set of permutations. The GSEA team recommends 1000 gene_set permutations.
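The recommendations in these excerpts (no collapsing, classic scoring, a short trial run before the full run) can be sketched as a small helper that assembles command-line arguments. The flag names `-rnk`, `-gmx`, `-collapse`, `-scoring_scheme`, `-nperm` and `-out` are given as recalled from the GSEA 2.x command-line documentation and should be checked against the installed version; the function name and file names are hypothetical.

```python
from typing import List


def preranked_args(rnk_path: str, gmx_path: str, nperm: int,
                   out_dir: str = "gsea_out") -> List[str]:
    """Build an argument list for a GSEAPreranked run (assumed flag names)."""
    return [
        "-rnk", rnk_path,             # ranked list with unique human gene symbols
        "-gmx", gmx_path,             # gene sets database
        "-collapse", "false",         # list already uses gene symbols
        "-scoring_scheme", "classic", # conservative enrichment statistic
        "-nperm", str(nperm),         # number of gene_set permutations
        "-out", out_dir,
    ]


# Trial run with few permutations first, then the full run.
trial = preranked_args("ranked_genes.rnk", "gene_sets.gmt", nperm=10)
full = preranked_args("ranked_genes.rnk", "gene_sets.gmt", nperm=1000)
print(" ".join(trial))
print(" ".join(full))
```

Keeping both runs behind one function means the trial and the full analysis differ only in the permutation count, so a successful trial is a fair check of the input files.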
Friday, 20 February 2015
Monday, 16 February 2015
Batch Effect Visualisation
An excerpt from A comparison of batch effect removal methods for enhancement of prediction performance using MAQC-II microarray gene expression data
"Significant batch effects can be seen by the perfect separation of different batches on the PCA score plots for most data sets. Other visualization techniques can also be used to evaluate batch effects such as hierarchical clustering dendrogram, correlation heat-map and variance components pie chart from analysis of variance. The latter is a quantitative technique that gives the variances contributed by all factors when the class labels of all the samples are available. This allows the comparison of variances contributed by batch effects, biological effects and other effects. However, for cross-batch prediction in real applications, the class labels of the samples in the test set (future batch) are to be predicted and are unavailable, and thus analysis of variance cannot be applied for the endpoint factor. This approach is useful for evaluating the sources of variation and process control of sample handling and processing when all of these factors are recorded and reported."
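The variance comparison described in the excerpt can be illustrated with a much simpler stand-in: for a single gene, compute the fraction of the total sum of squares explained by each recorded factor. This is a one-way sum-of-squares decomposition, not the full variance components analysis used in the paper, and the expression values and labels below are made up for the example.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def fraction_ss_explained(values: Sequence[float], labels: Sequence[Hashable]) -> float:
    """Fraction of the total sum of squares explained by one factor."""
    grand_mean = sum(values) / len(values)
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2 for vs in groups.values()
    )
    return ss_between / ss_total if ss_total > 0 else 0.0


# Hypothetical expression of one gene in four samples.
expression = [5.0, 5.2, 7.0, 7.2]
batch = ["A", "A", "B", "B"]
phenotype = ["control", "case", "control", "case"]

batch_fraction = fraction_ss_explained(expression, batch)
class_fraction = fraction_ss_explained(expression, phenotype)
print(f"batch: {batch_fraction:.3f}, class: {class_fraction:.3f}")
```

In this toy data the batch factor accounts for almost all of the variation. As the excerpt notes, the class fraction can only be computed when class labels are available, which is not the case for a future test batch.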
Wednesday, 11 February 2015
Monday, 2 February 2015
PCA and eQTLs of RNA-seq
PCA
Gene expression analysis identifies global gene dosage sensitivity in cancer
eQTLs
Gene-gene and gene-environment interactions detected by transcriptome sequence analysis in twins
Putative cis-regulatory drivers in colorectal cancer
Impact of regulatory variation from RNA to protein