Shortcuts in Science: September 2014

Migration drift equilibrium

Mutation drift equilibrium: The equilibrium value of diversity is known as the population mutation parameter (theta).

Recombination drift equilibrium: As a consequence, LD can reach an equilibrium value in finite populations. This equilibrium value of LD is determined by the population recombination parameter (rho), which combines the effective population size (Ne) and the recombination rate (c): rho = 4Ne*c. The precise relationship between measures of LD and rho is complex, but under certain conditions r^2 = 1/(1 + rho), so that when rho is large, r^2 is approximately 1/rho.
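As a rough illustration of the two relationships above, the following sketch computes rho from Ne and c and then the approximate expected r^2. The function names and the example values of Ne and c are assumptions chosen for illustration only, and the r^2 formula holds only under the simplifying conditions mentioned in the text.

```python
def population_recombination_parameter(ne: float, c: float) -> float:
    """Return rho = 4 * Ne * c."""
    return 4.0 * ne * c


def expected_r2(rho: float) -> float:
    """Return the approximate expected r^2 = 1 / (1 + rho)."""
    return 1.0 / (1.0 + rho)


# Illustrative values only (not taken from the notes).
rho = population_recombination_parameter(ne=10_000, c=1e-3)
print("rho:", rho)
print("r^2 from 1/(1+rho):", expected_r2(rho))
print("large-rho approximation 1/rho:", 1.0 / rho)
```

The last two printed values should be close to each other whenever rho is much larger than 1.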

If we are examining LD over a large genomic region containing many polymorphic markers, it is unclear how best to combine the information from measures of LD based on comparisons between individual pairs of markers (i.e., D, |D'| or r^2). Therefore recent attention has focused on estimating rho itself for these kinds of data, as this gives a single measure of LD for the entire region.

Mutation selection equilibrium:
Some deleterious mutational events have sufficiently high mutation rates that within a large population they occur several times within a single generation, and can be considered recurrent mutations.

The rate at which new mutations are generated can be balanced by the eventual elimination of each mutant by selection so that the average number of a given mutation reaches an equilibrium value within the population.

If we consider all deleterious alleles together, a balance between mutation and selection may operate over the genome as a whole, such that at equilibrium each genome contains a certain number of deleterious alleles.

A general rule for diploid loci is that for selection to operate effectively, the following relationship should hold:
s > 1/(2Ne)
For haploid loci with one quarter the effective population size of diploid loci, the relevant rule is s > 2/Ne.
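The two thresholds can be checked with a small helper. This is a minimal sketch: the function names are hypothetical, and `ne` is taken to be the diploid effective population size in both cases, so the haploid threshold 2/Ne is simply 1/(2 * Ne/4).

```python
def selection_threshold(ne: float, haploid: bool = False) -> float:
    """Smallest selection coefficient at which selection outweighs drift.

    ne is the diploid effective population size. For haploid loci, assumed
    to have one quarter of that size, the threshold is 2 / ne.
    """
    return 2.0 / ne if haploid else 1.0 / (2.0 * ne)


def selection_is_effective(s: float, ne: float, haploid: bool = False) -> bool:
    """Return True when s exceeds the relevant threshold."""
    return s > selection_threshold(ne, haploid)


# Illustrative values only: the same s can be visible to selection at a
# diploid locus yet effectively neutral at a haploid one.
print(selection_is_effective(s=1e-4, ne=10_000))
print(selection_is_effective(s=1e-4, ne=10_000, haploid=True))
```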

Migration from the Evolutionary Perspective

Colonization is the process of movement into previously unoccupied land, thus entailing a founder effect. By contrast, migration is the movement from one occupied area to another. Gene flow is the outcome when a migrant contributes to the next generation in their new location.

Selection from the Evolutionary Perspective

Balancing selection: A selective regime that favours more than one allele and thus prevents the fixation of any allele.

Genetic Drift

Effective population size (Ne) measures the magnitude of genetic drift: the smaller the effective population size, the greater the drift.

There are two ways of defining effective population sizes: one is based on the sampling variance of allele frequencies (i.e., how an allele's frequency might vary from one generation to the next), and the other utilizes the concept of inbreeding (i.e., the probability that the two alleles within an individual are identical by descent from a common ancestor).

Under most simple population models these two definitions give identical values for Ne, but in more complex situations this is not the case.

It is not easy to relate the effective population size (Ne) to the census size of a population (N), as there are many parameters that can affect this relationship, only some of which are relevant to humans.

With no favouring of either outcome, the fixation probability of a new allele in the absence of selection is equal to its initial frequency of 1/(2N).
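The fixation probability can be checked numerically. The sketch below assumes a simple Wright-Fisher model (random sampling of 2N gene copies each generation, no selection or mutation), which is one common way to model drift and not necessarily the model behind the notes. The function name and parameter values are hypothetical.

```python
import random


def fixation_fraction(n_diploid: int, replicates: int, seed: int = 0) -> float:
    """Fraction of replicates in which a single new neutral copy fixes."""
    rng = random.Random(seed)
    total_copies = 2 * n_diploid
    fixed = 0
    for _ in range(replicates):
        count = 1  # one new mutant copy among 2N
        # Resample the next generation until the allele is lost or fixed.
        while 0 < count < total_copies:
            p = count / total_copies
            count = sum(rng.random() < p for _ in range(total_copies))
        if count == total_copies:
            fixed += 1
    return fixed / replicates


n = 10
estimate = fixation_fraction(n_diploid=n, replicates=5000)
print("simulated fixation fraction:", estimate)
print("expected 1/(2N):", 1.0 / (2 * n))
```

With enough replicates, the simulated fraction should settle near 1/(2N); small populations are used here only to keep the run time short.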

The average time to fixation (t) in generations has been shown to be:
t=4Ne

The long-term effective population size has been shown to be approximately equal to the harmonic mean, rather than the arithmetic mean, of the population sizes over time. The harmonic mean is the reciprocal of the mean of the reciprocals: 1/Ne = (1/t) sum_{i=1}^{t}(1/Ni) for t generations.
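A short sketch of the harmonic-mean calculation shows how strongly a single small generation pulls the long-term Ne down. The function name and the population sizes are made up for illustration.

```python
def long_term_ne(sizes):
    """Harmonic mean of per-generation population sizes."""
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty list of positive numbers")
    mean_of_reciprocals = sum(1.0 / n for n in sizes) / len(sizes)
    return 1.0 / mean_of_reciprocals


# Hypothetical history: one bottleneck generation of 10 among sizes of 1000.
history = [1000, 1000, 10, 1000]
print("arithmetic mean:", sum(history) / len(history))
print("harmonic mean (long-term Ne):", long_term_ne(history))
```

The harmonic mean here is far below the arithmetic mean, which is why bottlenecks dominate the long-term effective population size.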

Founder effects relate to the process of colonization and the genetic separation of a subset of the diversity present within the source population. In contrast, bottlenecks refer to the reduction in size of a single, previously large, population and a loss of prior diversity.

In general, the higher the reproductive variance, the lower the effective population size, because parental contributions become more and more unequal. It is worth noting that when reproductive variance is less than expected under a Poisson distribution then Ne can be greater than N.

Reproductive variance: The variance in number of offspring produced by a group of individuals.

Subpopulation: A randomly mating population that exchanges migrants with other populations to form a meta-population.

Meta-population: A group of populations connected by migration.

Fst: the fixation index is a measure of the deviation of observed heterozygote frequencies from those expected under the Hardy-Weinberg theorem. Fst compares the mean amount of genetic diversity found within subpopulations (Hs: the expected heterozygosity) to the genetic diversity of the meta-population (Ht).
Fst = (Ht - Hs)/Ht
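The formula can be sketched for the simplest case. The code below assumes a single biallelic locus, equally sized subpopulations, and expected heterozygosity 2p(1-p) under Hardy-Weinberg; these are simplifying assumptions for illustration, and the function names and allele frequencies are hypothetical.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(subpop_freqs):
    """Fst = (Ht - Hs) / Ht from per-subpopulation allele frequencies."""
    k = len(subpop_freqs)
    # Hs: mean expected heterozygosity within subpopulations.
    hs = sum(expected_heterozygosity(p) for p in subpop_freqs) / k
    # Ht: expected heterozygosity at the pooled (mean) allele frequency.
    p_bar = sum(subpop_freqs) / k
    ht = expected_heterozygosity(p_bar)
    if ht == 0.0:
        raise ValueError("Ht is zero: the locus is monomorphic overall")
    return (ht - hs) / ht


print(fst_biallelic([0.2, 0.8]))  # strongly differentiated subpopulations
print(fst_biallelic([0.5, 0.5]))  # identical frequencies, no differentiation
```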

Recombination: the evolutionary perspective

The benefits of recombination:

Recombination makes combinations of alleles across two or more loci that may be advantageous. This is especially important with epistasis (interactions between loci) favouring a specific combination of alleles at the two loci.

Recombination helps get rid of bad mutations to create mutation-free offspring.

========================================

Recombination is capable of breaking up advantageous allelic combinations. This results in the theoretical possibility that by increasing the likelihood of disrupting a beneficial haplotype, outbreeding can result in a drop in fitness known as outbreeding depression.

========================================

An allele that rises to high frequency through positive selection at a linked locus is said to be "hitchhiking". The reduction in diversity at loci linked to a recently fixed allele is dubbed a selective sweep. Conversely, negative selection at a locus also reduces diversity at linked loci, albeit at a slow rate, by a process known as background selection.

========================================

If we know the recombination rate per generation (r) between the newly mutated locus and a given locus, after a certain number of generations (t) we can track the decay of LD over time, by relating the present value of D (Dt) to the initial value of D (D0) using the equation:
Dt = (1 - r)^t × D0
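The decay equation is easy to evaluate directly. The following sketch computes Dt and, by rearranging the same equation, the number of generations needed for D to fall to a given fraction of D0. The function names and example values are assumptions for illustration.

```python
import math


def ld_after(d0: float, r: float, t: int) -> float:
    """Return Dt = (1 - r)^t * D0."""
    return (1.0 - r) ** t * d0


def generations_to_fraction(r: float, fraction: float) -> float:
    """Generations until Dt / D0 equals `fraction` (requires 0 < r < 1)."""
    return math.log(fraction) / math.log(1.0 - r)


# Illustrative values only: r = 0.01 per generation, D0 = 0.25.
for t in (0, 10, 100, 500):
    print(t, ld_after(d0=0.25, r=0.01, t=t))
print("generations to halve D:", generations_to_fraction(r=0.01, fraction=0.5))
```

For unlinked loci (r = 0.5), D halves every generation, while for tightly linked loci the decay takes many generations.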

The Glossary of Evolution

Back mutation: A mutation from the derived state back to the ancestral state.

Recurrent mutation: A mutation that independently generates a derived state previously observed within the population.

Thursday 18 September 2014

Publications of Selections on Non-coding Sequences

"The second hypothesis proposes that natural selection operates differently on mutations in cis-regulatory sequences⁶,⁷,⁸,¹¹. This hypothesis is based on two properties of the organization and function of cis-regulatory regions. First, allele-specific measures of transcript abundance indicate that each allele in a diploid organism is transcribed largely independently¹¹,¹²,¹³,¹⁴, suggesting that mutations in cis-regulatory regions are often co-dominant."

Natural selection operates far more efficiently on co-dominant mutations because they can have fitness consequences as heterozygotes: a new variant is visible to selection immediately rather than requiring drift to raise allele frequencies to the point at which homozygotes begin to appear in the population¹¹.

Review: the evolutionary significance of cis-regulatory mutations

Purifying Selection in Deeply Conserved Human Enhancers Is More Consistent than in Coding Sequences

Genome-wide inference of natural selection on human transcription factor binding sites

The high degree of protein sequence similarity between phenotypically diverged species has led some to propose that regulatory evolution may be of considerably more importance than protein evolution⁴,⁵.

Most of our direct knowledge regarding the evolution of regulatory elements comes from a handful of direct functional studies⁵,⁶. A second, indirect approach is based on comparative genomics⁷. The rationale for this second approach is that if newly arising mutations are typically detrimental to gene function, functionally important parts of the genome are expected to evolve more slowly than those lacking function⁸,⁹,¹⁰,¹¹.

There are some limitations to the comparative genomics approach. First, a given genomic region might be conserved owing simply to a lower mutation rate¹². Second, known regulatory elements do not seem to be particularly well conserved as a class, at least in Drosophila¹⁰. This finding suggests that taking an approach based on sequence conservation alone may lead to a biased view of regulatory evolution. Functionality of DNA sequences implies that they can be subject to both negative and positive selection. If a significant fraction of divergence between species observed in non-coding DNA is positively selected rather than selectively neutral or constrained, this could lead to underestimates of the functional importance of non-coding DNA and cause researchers to overlook the contribution of arguably the most interesting class of mutations in genome evolution—those reflecting adaptive differences between populations and species.

Adaptive evolution of non-coding DNA in Drosophila

Sunday 14 September 2014

Gene-level summaries of transcript expression estimates (the genesum software)

Kingsford-Group/genesum

Sunday 7 September 2014

Ingenuity Pathway Analysis

Causal analysis approaches in Ingenuity Pathway Analysis

Friday 5 September 2014

C Programming isinf function

Unix, C, and C++ Function Reference Errors

Thursday 4 September 2014

Bash Array Variable Slicing

# Copy the first k-1 elements of the array init into a new array b
b=("${init[@]:0:$((k-1))}")

Monday 1 September 2014

C Programming Assignment of Values to C Pointers

Directly assigning values to C Pointers

Tuesday 30 September 2014

GATK: DepthOfCoverage and DiagnoseTargets

Monday 29 September 2014

Evolution on the Regulatory Sequences

Context Likelihood of Relatedness (CLR) Algorithm

Wednesday 24 September 2014

Sambamba: Filter Expression Syntax

Samtools: Counting the Number of Reads in a BAM File

Tuesday 23 September 2014

Download SRR Files Based on ID

Monday 22 September 2014

Reply to Reviewer Comments

Papers to Read

Saturday 20 September 2014

Models of DNA Evolution

Gene Set Enrichment Analysis

The INSIGHT algorithm: Inference of Natural Selection from Interspersed Genomically coHerent elemenTs

Friday 19 September 2014

Interplay among different forces of evolution