Recent selective sweeps in North American Drosophila melanogaster show signatures of soft sweeps

Our paper on detecting hard and soft sweeps in D. melanogaster population genomic data from North America is finally published in PLoS Genetics!

Check it out here.

Current Issues In Genetics: Pervasive long-range linkage disequilibrium in natural populations of D. melanogaster

On Friday, November 14, I will be giving a talk to the Stanford Genetics department about a paper I am writing on pervasive long-range linkage disequilibrium (LD) in natural populations of D. melanogaster. LD is a measure of the correlation between pairs of polymorphisms in the data, also known in statistics as r^2. The expectation is that polymorphisms far apart from one another should have low LD because recombination and mutation events should break up any structure in the genome. However, I show that LD is actually very high even at long ranges, where neutral expectations suggest there should be little to no LD. I suggest that a plausible explanation for the genome-wide elevation in LD is repeatable selective events in the Drosophila genome.
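As a rough illustration of the r^2 measure mentioned above, here is a minimal Python sketch for two biallelic sites. It assumes phased haplotypes coded as 0/1 with no missing data; the function name `r_squared` and the input layout are hypothetical and are not taken from the paper or its code.

```python
def r_squared(site_a, site_b):
    """Return r^2 between two biallelic sites.

    site_a and site_b are equal-length lists of 0/1 alleles, one entry
    per sampled haplotype (an assumed encoding for this sketch).
    """
    n = len(site_a)
    if n == 0 or n != len(site_b):
        raise ValueError("sites must be non-empty and of equal length")

    # Frequencies of the '1' allele at each site and of the '1-1' haplotype.
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")

    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the two sites were independent.
    d = p_ab - p_a * p_b
    return d * d / denom
```

Two sites that always carry the same allele give r^2 = 1, and two sites whose alleles are independent in the sample give r^2 = 0. A long-range LD analysis would average values like this over many pairs of sites binned by the distance between them.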

Github repository: SelectionHapStats

SelectionHapStats is a repository of Python scripts written to identify natural selection events in the genome and R scripts written to visualize the signatures of selective sites. The Python code calculates the haplotype homozygosity statistics H12 and H2/H1 in a genome-wide scan and identifies H12 peaks in genomic data. The R code visualizes the haplotype frequency spectra for the top peaks in the data and the genome-wide scan of H12.
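For readers unfamiliar with these statistics, the sketch below shows how H12 and H2/H1 can be computed for a single analysis window. It is an illustration written for this post, not code from the repository: the function name `haplotype_homozygosity` and the input format (one string per sampled haplotype) are assumptions, and it ignores missing data and the clustering of near-identical haplotypes.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return (H12, H2/H1) for the haplotypes observed in one window.

    haplotypes is a list of hashable haplotype labels, for example the
    string of alleles each sampled chromosome carries in the window.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # Haplotype frequencies, sorted from most to least common.
    counts = Counter(haplotypes)
    freqs = sorted((c / n for c in counts.values()), reverse=True)

    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0

    # H1: homozygosity, the sum of squared haplotype frequencies.
    h1 = sum(p * p for p in freqs)
    # H12: homozygosity after pooling the two most common haplotypes.
    h12 = (p1 + p2) ** 2 + sum(p * p for p in freqs[2:])
    # H2: homozygosity excluding the most common haplotype.
    h2 = h1 - p1 * p1

    return h12, h2 / h1
```

For example, a window with three haplotypes at frequencies 0.5, 0.3, and 0.2 has H1 = 0.38, H12 = 0.68, and H2/H1 = 0.13/0.38. A genome-wide scan would apply a function like this to successive windows along each chromosome.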

The code presented in this repository is based on the paper posted to arXiv, Recent selective sweeps in North American Drosophila melanogaster show signatures of soft sweeps (http://arxiv.org/abs/1303.0906).

Check out my blog post for further description on the project and examples of visual output from the code!

URL: https://github.com/ngarud/SelectionHapStats

Video of my talk on the detection of hard and soft sweeps

Here is a video of me giving a talk at the Bay Area Population Genetics meeting at Stanford University in January 2013 on my project on the detection of hard and soft selective sweeps.

Stanford CEHG Fellow

I have been awarded a fellowship from the Stanford Center for Human Genomics to support my PhD studies for the upcoming Fall and Winter quarters.

SMBE oral presentation: Disentangling the effects of demography and selection on haplotype structure in Drosophila melanogaster

This year at the Society for Molecular Biology and Evolution meeting I am presenting a talk on my work on “Disentangling the effects of demography and selection on haplotype structure in Drosophila melanogaster”. In my paper, I show that current demographic models that have been fit to neutral regions of the genome fit some summary statistics that assume independence between polymorphic sites, such as the number of segregating sites (S) and nucleotide diversity (π), but fail to fit other summary statistics that take into account correlation in the data, such as long-range linkage disequilibrium and genome-wide haplotype homozygosity levels.

In addition, I am a co-author on Philipp Messer’s talk on “New statistical methods detect both hard and soft sweeps in malaria parasites.” In this paper, Philipp and I apply several different, but related, haplotype homozygosity statistics to the malaria genome and show that we have great power to recover several positive controls, depending on the method used.

We will both be presenting on Wednesday, June 11, in the session titled “Detecting selection in natural populations: making sense of genome scans and towards alternative solutions.”

Bay Area Population Genetics X

I presented a poster at the tenth Bay Area Population Genetics meeting based on my work with Dr. Noah Rosenberg on the mathematical properties of the H12 and H2/H1 statistics. The H12 statistic is a haplotype homozygosity statistic used to identify regions of the genome under positive selection, and the H2/H1 statistic is used to distinguish whether the candidate region under selection shows signatures of a hard versus soft sweep. In our paper, Noah and I show that there is an upper bound for H2/H1 as a function of the corresponding H12 value. We apply this upper bound to data and show that it can facilitate the interpretation of H12 and H2/H1 measured in heterogeneous data sets with varying sample sizes and missing data rates.

Here is a copy of my poster:

normalization_poster_052114

Stanford CEHG Evolgenome speaker seminar

This spring quarter at Stanford, I am co-organizing the Stanford CEHG Evolgenome speaker seminar. This is a weekly seminar with speakers from around campus who are part of the Center for Evolution and Human Genomics, as well as visitors from nearby institutions in the Bay Area. Check out our exciting lineup of speakers:

cehgPoster

Simons Institute for the Theory of Computing

I attended the Simons Institute workshop titled Computation-Intensive Probabilistic and Statistical Methods for Large-Scale Population Genomics. There was a great lineup of speakers and some opportunities to meet colleagues!

Talk at the Biomedical Computation at Stanford Conference (BCATS)

Today I gave a talk on my work on detecting hard and soft sweeps in Drosophila at BCATS. I presented new work inferring the softness of the sweeps in Drosophila, showing that sweeps on average have an adaptive theta compatible with the number of sweeping haplotypes to be around 12.8. The talk was well received, and I appreciate all the questions and comments from the audience.