Paper accepted: Elevation of linkage disequilibrium above neutral expectations in ancestral and derived populations of Drosophila melanogaster

My paper with Dmitri Petrov, "Elevation of linkage disequilibrium above neutral expectations in ancestral and derived populations of Drosophila melanogaster," has been accepted at Genetics. In this paper, we show that signatures of soft sweeps are common to multiple populations of D. melanogaster.

In our previous paper in PLoS Genetics, we showed that soft sweeps are common in the Raleigh population of D. melanogaster. However, that result raised questions about the extent to which soft sweeps are specific to the North Carolina population we studied. Several features of the North Carolina data set could confound the results, and we sought to address them. First, North American flies have experienced extensive admixture, the effects of which on LD are largely unknown. Second, the Raleigh data set was generated by extensive inbreeding, which could also affect LD. To address these concerns, Dmitri and I analyzed a sample of >100 fully sequenced strains from Zambia, an ancestral population that has experienced little to no admixture; the Zambian data were generated by sequencing haploid embryos rather than inbred strains. Our results revealed that soft sweeps are common in both Raleigh and Zambia. In addition, in Zambia we found evidence for some hard sweeps.

A copy of our paper is available here and will be available on the Genetics website soon.

Figure 3 from our paper: Haplotype frequency spectra for the top 25 H12 peaks in Zambian and Raleigh data. Shown are haplotype frequency spectra for the top 25 peaks in the Zambian H12 scan conducted in 801-SNP windows down-sampled to 401 SNPs (A) and the Raleigh H12 scan conducted in 401-SNP windows (B). For each peak, the frequency spectrum corresponding to the analysis window with the highest H12 value is plotted. The height of the uppermost shaded region (light blue) in each bar indicates the frequency of the most prevalent haplotype in the sample of 145 individuals, and the heights of subsequent colored regions indicate the frequencies of the second, third, and so on most frequent haplotypes in the sample. Grey bars indicate singletons. In Zambia, sweeps reach a smaller partial frequency than in Raleigh. Many peaks in the Zambian data have multiple haplotypes present at high frequency, indicative of soft sweeps, and many peaks have a single haplotype dominating the haplotype spectrum, indicative of hard sweeps. In Raleigh, all sweeps have multiple haplotypes at high frequency, consistent with signatures of soft sweeps.

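For readers who have not seen a haplotype frequency spectrum before, here is a minimal Python sketch of how one bar in the figure and its H12 value could be computed for a single analysis window. This is an illustration, not the pipeline used in the paper. It assumes that each haplotype in a window is represented as a string of alleles, and that H12 is the haplotype homozygosity computed after pooling the two most frequent haplotypes, as I recall it from our earlier PLoS Genetics paper. The function names and the toy windows are hypothetical.

```python
from collections import Counter


def haplotype_frequency_spectrum(haplotypes):
    """Return haplotype frequencies in one window, most frequent first.

    `haplotypes` is a list with one entry per sampled strain; identical
    entries are treated as the same haplotype (an assumption of this sketch).
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sorted((c / n for c in counts.values()), reverse=True)


def h12(freqs):
    """Homozygosity with the two most frequent haplotypes pooled (assumed definition).

    `freqs` must be sorted from most to least frequent.
    """
    pooled = sum(freqs[:2]) ** 2
    rest = sum(f * f for f in freqs[2:])
    return pooled + rest


def singleton_count(haplotypes):
    """Number of haplotypes seen exactly once (the grey bars in the figure)."""
    return sum(1 for c in Counter(haplotypes).values() if c == 1)


# Toy windows with 10 strains each (made-up data, not from the paper).
hard_like = ["A"] * 8 + ["B", "C"]             # one haplotype dominates
soft_like = ["A"] * 4 + ["B"] * 4 + ["C", "D"]  # two haplotypes at high frequency

for name, window in [("hard-like", hard_like), ("soft-like", soft_like)]:
    spectrum = haplotype_frequency_spectrum(window)
    print(name, spectrum, round(h12(spectrum), 2), singleton_count(window))
```

In the hard-like window the spectrum is dominated by one haplotype, while in the soft-like window two haplotypes share the high-frequency part of the bar. Both windows have two singletons and an elevated H12 (0.82 and 0.66 respectively), which is why pooling the top two haplotypes lets a scan pick up both kinds of sweep.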