Promoting Inclusivity in Computing at SFSU

Last week I had the opportunity to present at the ‘Promoting Inclusivity in Computing (PINC)’ program at San Francisco State University, hosted by Dr. Pleuni Pennings. I spoke about why computing is useful and necessary for analyzing metagenomic data. I also met with the students in small groups to answer their questions about computer science in biology and job opportunities for programmers. I really enjoyed my time at SFSU!

Bay Area Population Genetics and Lake Arrowhead Microbial Genetics

I’m looking forward to attending two meetings this week: Bay Area Population Genetics, hosted by San Francisco State University, and the Lake Arrowhead Microbial Genetics conference hosted by UCLA.

Detection of strain-level variation in the microbiome

A paper I recently contributed to is now accepted at Genome Research:

An integrated metagenomics pipeline for strain profiling reveals novel patterns of bacterial transmission and biogeography

Stephen Nayfach, Beltran Rodriguez-Mueller, Nandita Garud, Katherine S Pollard
In this paper, we introduce a new software tool, MIDAS, which identifies SNPs and CNVs in shotgun metagenomic data. We then apply MIDAS to a mother-infant data set and show that, while infant gut microbiomes come to resemble their mothers’ microbiomes over time at the species level, the majority of strain transmissions from mother to infant occur close to birth rather than later in life. We also apply MIDAS to ocean metagenomic data and show that there is strain-level substructure across geographic regions. MIDAS makes it possible to track strain-level variation in the microbiome and thus to delve more deeply into the evolutionary forces shaping the microbiome.

Paper accepted: Elevation of linkage disequilibrium above neutral expectations in ancestral and derived populations of Drosophila melanogaster

My paper with Dmitri Petrov, Elevation of linkage disequilibrium above neutral expectations in ancestral and derived populations of Drosophila melanogaster, has been accepted at Genetics. In this paper, we show that signatures of soft sweeps are common to multiple populations of D. melanogaster.

In our previous paper in PLoS Genetics, we showed that soft sweeps are common in the Raleigh population of D. melanogaster. However, many questions were raised about the extent to which soft sweeps are specific to the North Carolina population we studied. We sought to address two factors that could challenge the results from the North Carolina data set. First, North American flies have experienced extensive admixture, the effects of which on LD are largely unknown. Second, the Raleigh data set was generated by extensive inbreeding, which could also impact LD. Dmitri and I analyzed a sample of >100 fully sequenced strains from Zambia, an ancestral population that has experienced little to no admixture; this data set was generated by sequencing haploid embryos rather than inbred strains. Our results revealed that soft sweeps are common to both Raleigh and Zambia. In addition, in Zambia we found evidence for some hard sweeps.

A copy of our paper is available here and will be available on the Genetics website soon.

Figure 3 from our paper: Haplotype frequency spectra for the 25 H12 peaks in Zambian and Raleigh data. Shown are haplotype frequency spectra for the top 25 peaks in the Zambian H12 scan conducted in 801-SNP windows down-sampled to 401 SNPs (A) and the Raleigh H12 scan conducted in 401-SNP windows (B). For each peak, the frequency spectrum corresponding to the analysis window with the highest H12 value was plotted. The height of the uppermost shaded region (light blue) in each bar indicates the frequency of the most prevalent haplotype in the sample of 145 individuals, and the heights of subsequent colored bars indicate the frequencies of the second, third, and so on most frequent haplotypes in the sample. Grey bars indicate singletons. In Zambia, sweeps reach a smaller partial frequency than in Raleigh. Many peaks in the Zambian data have multiple haplotypes present at high frequency, indicative of soft sweeps, and many peaks have a single haplotype dominating the haplotype spectrum, indicative of hard sweeps. In Raleigh, all sweeps have multiple haplotypes at high frequency, consistent with signatures of soft sweeps.

Videos on our review on adaptation in pathogens

Along with Pleuni Pennings, Ben Wilson, Alison Feder, and Zoe Assaf, I recently published a review on adaptation in pathogens in Molecular Ecology. In this review, we discuss state-of-the-art population genetic analyses conducted in a wide array of pathogens, including P. falciparum (the malaria-causing pathogen), HIV, tuberculosis, Staph, and flu. Please check out our paper here.

We made short videos highlighting the different pathogens that we wrote about.

This is me discussing adaptation in P. falciparum:

Ben on influenza:

Here’s Pleuni talking about HIV:

Alison talking about tuberculosis:

And Zoe sharing work on Staphylococcus aureus:

Review accepted: The population genetics of drug resistance evolution in natural populations of viral, bacterial, and eukaryotic pathogens.

My co-authors, Pleuni Pennings, Zoe Assaf, Alison Feder, and Ben Wilson, and I recently wrote a review: The population genetics of drug resistance evolution in natural populations of viral, bacterial, and eukaryotic pathogens. Our review will be coming out in Molecular Ecology.

Paper on elevation of LD in Drosophila on bioRxiv

I posted my latest paper with Dmitri Petrov, Elevation of linkage disequilibrium above neutral expectations in ancestral and derived populations of Drosophila melanogaster, on bioRxiv. In this paper, we show that signatures of elevated LD and haplotype homozygosity are common in multiple populations of D. melanogaster and that signatures of partial soft sweeps are generic to multiple populations. We welcome any feedback or questions about the paper.

SMBE talk: Pervasive long-range linkage disequilibrium in D. melanogaster

I had the opportunity to present my latest paper draft on long-range linkage disequilibrium in D. melanogaster at the SMBE 2015 meeting held in Vienna, Austria. In this paper, Dmitri Petrov and I show that levels of LD at both short and long distances are elevated above neutral expectations in both the Raleigh and Zambian populations of D. melanogaster. Furthermore, we find that levels of haplotype homozygosity are also elevated in both populations. Examination of the haplotype frequency spectra in the two populations reveals that signatures of soft sweeps are common in both populations, suggesting that soft sweeps are generic to multiple populations of Drosophila.

Here is a picture that Alex Cagan drew of me and my talk!

Enhancing the mathematical properties of new haplotype homozygosity statistics for the detection of selective sweeps

I’m really pleased to share that my paper with Noah Rosenberg on the mathematical properties of H12 and H2/H1 is now published in Theoretical Population Biology. In this paper we introduce a normalization for the H2/H1 statistic as a function of H12 and show that the two statistics must be used in conjunction with each other to be able to differentiate hard and soft sweeps.
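
For readers unfamiliar with these statistics, here is a minimal sketch of how H1, H12, and H2/H1 can be computed from the haplotypes observed in a single analysis window. It uses the usual definitions in terms of sorted haplotype frequencies p1 >= p2 >= ...: H1 is the sum of squared frequencies, H12 pools the two most common haplotypes before squaring, and H2 is H1 without the contribution of the most common haplotype. The function name and the toy haplotypes are made up for illustration; this is not the code used in the paper, and it does not implement the normalization of H2/H1 that the paper introduces.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return (H1, H12, H2/H1) for the haplotypes sampled in one window.

    `haplotypes` is a list of hashable haplotype labels (e.g. strings),
    one entry per sampled individual. Hypothetical helper for illustration.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # Haplotype frequencies, sorted from most to least common.
    freqs = sorted((count / n for count in Counter(haplotypes).values()),
                   reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0

    # H1: standard haplotype homozygosity.
    h1 = sum(p * p for p in freqs)
    # H12: treat the two most common haplotypes as a single haplotype.
    h12 = (p1 + p2) ** 2 + sum(p * p for p in freqs[2:])
    # H2: homozygosity after dropping the most common haplotype.
    h2 = h1 - p1 * p1

    return h1, h12, h2 / h1


# Toy window dominated by one haplotype (hard-sweep-like pattern).
hard_like = ["AAAA"] * 8 + ["AAAT", "TAAA"]
# Toy window with two haplotypes at high frequency (soft-sweep-like pattern).
soft_like = ["AAAA"] * 4 + ["TTTT"] * 4 + ["AAAT", "TAAA"]

for label, window in [("hard-like", hard_like), ("soft-like", soft_like)]:
    h1, h12, ratio = haplotype_homozygosity(window)
    print(f"{label}: H1={h1:.3f} H12={h12:.3f} H2/H1={ratio:.3f}")
```

In these toy windows both patterns give a fairly high H12, but H2/H1 is low when a single haplotype dominates and much higher when two haplotypes are common, which gives some intuition for why the two statistics are read together.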

SMBE Young Investigator Travel Award

I am very grateful to have received the SMBE Young Investigator Travel Award for my oral presentation in Vienna this July on Long-range linkage disequilibrium in multiple natural populations of D. melanogaster.