Crawford et al. BMC Biology (2017) 15:16 DOI 10.1186/s12915-017-0351-0 RESEARCH ARTICLE Open Access Population genomics reveals that an anthropophilic population of Aedes aegypti mosquitoes in West Africa recently gave rise to American and Asian populations of this major disease vector Jacob E. Crawford1,2†, Joel M. Alves3,4†, William J. Palmer3†, Jonathan P. Day
RESEARCH ARTICLE  Open Access

Population genomics reveals that an anthropophilic population of Aedes aegypti mosquitoes in West Africa recently gave rise to American and Asian populations of this major disease vector

Jacob E. Crawford1,2†, Joel M. Alves3,4†, William J. Palmer3†, Jonathan P. Day3, Massamba Sylla5, Ranjan Ramasamy6, Sinnathamby N. Surendran6,7, William C. Black IV5, Arnab Pain8 and Francis M. Jiggins3*

Abstract

Background: The mosquito Aedes aegypti is the main vector of dengue, Zika, chikungunya and yellow fever viruses. This major disease vector is thought to have arisen when the African subspecies Ae. aegypti formosus evolved from being zoophilic and living in forest habitats into a form that specialises on humans and resides near human population centres. The resulting domestic subspecies, Ae. aegypti aegypti, is found throughout the tropics and largely blood-feeds on humans.

Results: To understand this transition, we have sequenced the exomes of mosquitoes collected from five populations from around the world. We found that Ae. aegypti specimens from an urban population in Senegal in West Africa were more closely related to populations in Mexico and Sri Lanka than they were to a nearby forest population. We estimate that the populations in Senegal and Mexico split just a few hundred years ago, and we found no evidence of Ae. aegypti aegypti mosquitoes migrating back to Africa from elsewhere in the tropics. The out-of-Africa migration was accompanied by a dramatic reduction in effective population size, resulting in a loss of genetic diversity and rare genetic variants.

Conclusions: We conclude that a domestic population of Ae. aegypti in Senegal and domestic populations on other continents are more closely related to each other than to other African populations. This suggests that an ancestral population of Ae. aegypti evolved to become a human specialist in Africa, giving rise to the subspecies Ae.
aegypti aegypti. The descendants of this population are still found in West Africa today, and the rest of the world was colonised when mosquitoes from this population migrated out of Africa. This is the first report of an African population of Ae. aegypti aegypti mosquitoes that is closely related to Asian and American populations. As the two subspecies differ in their ability to vector disease, their existence side by side in West Africa may have important implications for disease transmission.

Keywords: Aedes aegypti, Anthropophilic, Dengue virus, Zika virus, Arboviral diseases, Mosquito evolution, Vector-borne diseases

* Correspondence: fmj1001@cam.ac.uk
† Equal contributors
3 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
Full list of author information is available at the end of the article

© Jiggins et al. 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Crawford et al. BMC Biology (2017) 15:16 DOI 10.1186/s12915-017-0351-0

Background

Arthropod-borne viruses (arboviruses) are a major threat to human health in many tropical and subtropical countries. The most important vector of human arboviruses is the mosquito Aedes aegypti, which transmits dengue, chikungunya, yellow fever and Zika viruses. A widespread epidemic of the Zika virus has recently occurred across South America, Central America and the Caribbean and has been linked to fetal brain abnormalities [1].
Over the last decade, chikungunya virus, which is transmitted by both Aedes albopictus and Ae. aegypti, has emerged as a major cause for concern, causing epidemics in Asia and many Indian Ocean islands as well as in southern Europe and the Americas [2]. Dengue virus, which is responsible for the most common human arboviral disease infecting millions of people every year, has greatly increased its range in tropical and subtropical regions [3, 4].

Ae. aegypti occurs throughout the tropics and subtropics, but populations vary in their ability to transmit disease (vector capacity) [5–11]. Outside of Africa, Ae. aegypti has a strong genetic preference for entering houses to blood-feed on humans and an ability to survive and oviposit in relatively clean water in man-made containers in the human environment [5, 6]. However, across sub-Saharan Africa there is considerable variation among populations in their ecology, behaviour and appearance [10, 12–15]. Some populations are less strongly human associated, being found in forests, ovipositing in tree holes and feeding on other mammals [5–8]. Elsewhere, populations have become 'domesticated,' developing in water in and around homes and feeding on humans. Aside from a few locations on the coast of Kenya that appear to have been colonised by non-African populations, African populations tend to cluster together genetically regardless of whether they are forest or domestic forms [12]. This was interpreted as suggesting that these human-associated populations in Africa have arisen independently from the domestic populations found elsewhere in the tropics [12]. However, as we discuss later, such interpretations of genetic data can be misleading.

Ae. aegypti has long been hypothesised to have originated in Africa, probably travelling in ships along trading routes [7, 8]. This out-of-Africa model has been supported by genetic data, as African populations have higher genetic diversity than those from elsewhere in the tropics [16].
Furthermore, rooted trees constructed from the sequences of a small number of nuclear genes have consistently found that the genetic diversity in Asian and New World populations is a subset of that found in Africa [16]. The exact origin of this migration out of Africa remains uncertain. Furthermore, it is not known whether the species evolved to specialise on humans in Africa or after it had migrated out of Africa [17].

The species Ae. aegypti has been split into two subspecies [7]. Outside Africa, nearly all populations belong to the subspecies Ae. aegypti aegypti, which is light in colour and strongly anthropophilic. In Africa the subspecies Ae. aegypti formosus is darker in colour and lives in forested habitats. The two subspecies were originally defined based on these differences in colouration, with Ae. aegypti aegypti having pale scales on the first abdominal tergite [7]. However, West African populations that have these pale scales appear to be genetically more similar to Ae. aegypti formosus populations than Ae. aegypti aegypti from elsewhere in the tropics [10, 14, 15]. This has led some authors to call all African populations Ae. aegypti formosus, while others have continued to use the original morphological definition.

Population genetics studies of Ae. aegypti have a long history, but until recently they were limited by the small numbers of genetic markers available. Whole genome sequencing is prohibitively expensive due to the large genome size [18], but three approaches have made genome-scale analyses possible. Restriction site-associated DNA (RAD) sequencing has been used to score large numbers of single nucleotide polymorphisms (SNPs) [16, 19, 20], although the repetitive genome coupled with PCR duplicates due to the low DNA yield of mosquitoes can complicate this approach [20]. An Ae. aegypti SNP chip can genotype more than 25,000 SNPs [21], although the analysis of these data can be complicated because a biased set of SNPs is genotyped [22].
Finally, we recently developed exome capture probes, which allow the protein-coding regions of the genome to be selectively resequenced [23]. This makes sequencing affordable, minimises ascertainment bias and avoids repetitive regions where it is difficult to map short sequence reads.

Here we have used exome sequencing to investigate the origins of the domestic Ae. aegypti aegypti populations that are the main vectors of human viruses. To do this, we sampled mosquitoes from two nearby populations in Senegal, West Africa, one of which was from a forested region and has the classical phenotype of Ae. aegypti formosus, and the other of which was from an urban location and resembled Ae. aegypti aegypti. These samples were then compared to populations from East Africa, Mexico and Sri Lanka. We found that the domestic population in West Africa is most closely related to domestic populations in Mexico and Sri Lanka. We conclude that the species likely became domesticated in Africa, and the migration out of Africa came from populations related to extant domestic African populations. Furthermore, the out-of-Africa migration and probably the original domestication event in Africa were associated with population bottlenecks.

Methods

Mosquito samples

We investigated Ae. aegypti from five populations (the sample details are given in Additional file 1). Wherever possible, mosquitoes were sampled from multiple nearby sites. Mexican mosquitoes were all collected from independent sites in Yucatán state and supplied as extracted DNA by William Black. This group of mosquitoes was a mixture of males and females, with the sex of individuals unknown. The collection sites were urban and peri-urban.

Female Sri Lankan Ae. aegypti were supplied by Ranjan Ramasamy and Sinnathamby Surendran.
Nine individuals from the Jaffna district [24] and one from the Batticaloa district [24] had been collected from separate oviposition traps in 2012 and reared to adulthood in the laboratory. These specimens were from urban and peri-urban areas.

Female Ugandan Ae. aegypti were supplied by Jeff Powell. They had been collected in Lunyo, Entebbe in 2012 using oviposition traps and reared in the laboratory.

The samples from two populations in Senegal were supplied as extracted DNA by William Black [10]. They fell into two phenotypically and geographically distinct groups. The first of these we called 'Senegal Forest'; this group is from the rural forested locations near Kedougou [10]. Here the mosquitoes lacked pale scales on the first abdominal tergite, which is the classical phenotype associated with Ae. aegypti formosus [10, 25]. This group of mosquitoes was a mixture of males and females, with the sex of individuals unknown. The second group of mosquitoes, which we call 'Senegal Urban', came from the urban location of Kaolack and had the pale scales on the first abdominal tergite that are classically associated with Ae. aegypti aegypti [10, 25]. This sample consisted of 2 males and 10 females. The two locations are approximately 420 km apart.

Aedes bromeliae eggs were collected in July 2010 from Kilifi in coastal Kenya using oviposition traps. Eggs were hatched in the laboratory in the UK and reared to maturity. A single female was then used for sequencing.

Library preparation and sequencing

DNA was extracted from Ae. aegypti mosquitoes using the DNeasy Blood and Tissue Kit (Qiagen).
Illumina sequencing libraries were constructed from individual mosquitoes using the Illumina TruSeq Library Prep Kit. The concentration of each library was estimated by quantitative PCR, and four equimolar pools of the libraries from Mexico, Senegal, Uganda and Sri Lanka were made.

Exome capture was then performed to enrich for coding sequences using custom SeqCap EZ Developer probes (Nimblegen) [23]. Overlapping probes covering the protein-coding sequence, not including untranslated regions (UTRs), in the AaegL1.3 gene annotations [18] were produced by Nimblegen based on coding sequence coordinates (covering 22.2 Mb) specified by us. In total, 26.7 Mb representing 2% of the genome was targeted by capture probes, which includes regions flanking the coding sequence that were added during the proprietary design process. Exome capture coordinates are available in Additional file 2 (from [23]). Each of the four exome-captured pools of libraries was then separately sequenced in one lane each of 100-bp paired-end HiSeq2000 runs by the Beijing Genomics Institute (China).

DNA was then extracted from a single Ae. bromeliae individual using the QIAamp DNA Mini Kit. A whole-genome sequencing library was constructed using the Illumina Nextera DNA Library Prep Kit. This library was sequenced in one lane of MiSeq (2×250 bp paired-end reads; Oxford Genomics) and two lanes of HiSeq2000 (2×100 bp paired-end reads; King Abdullah University of Science and Technology, KAUST, sequencing core).

Sequence alignment and variant calling

Initially Aedes aegypti reads were demultiplexed using fastq-grep [26] and hard matching of Illumina barcodes. As such, reads with any errors in barcode sequence were discarded.
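The hard-matching rule described above can be illustrated with a minimal Python sketch. This is not the fastq-grep procedure the authors ran; it only shows the logic of exact barcode matching, in which any read whose barcode differs from every expected barcode is dropped rather than rescued. The function name, the sample labels and the assumption that the barcode is the last colon-separated field of the read header are all hypothetical.

```python
from collections import defaultdict


def demultiplex_exact(records, barcode_to_sample):
    """Assign reads to samples by exact barcode match; discard all others.

    records: iterable of (header, sequence, quality) tuples.
    barcode_to_sample: dict mapping expected barcode -> sample name.
    Assumption (hypothetical): the barcode is the last colon-separated
    field of the header, e.g. "@read1 1:N:0:ATCACG".
    """
    by_sample = defaultdict(list)
    discarded = 0
    for header, seq, qual in records:
        barcode = header.rsplit(":", 1)[-1].strip()
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            # Hard matching: a single mismatch means the read is dropped.
            discarded += 1
            continue
        by_sample[sample].append((header, seq, qual))
    return dict(by_sample), discarded


# Toy input: the second read carries a one-base barcode error.
example_records = [
    ("@r1 1:N:0:ATCACG", "ACGT", "IIII"),
    ("@r2 1:N:0:ATCACC", "ACGT", "IIII"),
    ("@r3 1:N:0:CGATGT", "TTTT", "IIII"),
]
example_barcodes = {"ATCACG": "sample_A", "CGATGT": "sample_B"}
assigned, n_discarded = demultiplex_exact(example_records, example_barcodes)
```

Exact matching trades some yield for certainty: no read is assigned to the wrong individual because of a sequencing error in the barcode.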
The following steps were then performed on reads from each of the populations, and Aedes bromeliae, separately.

Paired reads were quality trimmed from the 3′ end, cutting when average quality scores in sliding windows of 5 bp dropped below 30, and trimmed when the quality score at the end of the read dropped below 30 using Trimmomatic version 0.27 [27]. As the insert size from some individuals was shorter than the length of two sequencing reads, we initially observed some sequence overlap of paired-end reads. This is undesirable, as when mapped they violate the later sampling assumption that a given SNP observation results from a single molecule. As such, overlapping reads were merged into single pseudoreads with FLASH version 1.2.11 [28] and then treated as single-end sequencing reads. Both paired-end reads and single-end pseudoreads were then aligned to the Aedes aegypti reference genome AaegL3.3 using BWA-MEM version 0.7.10 [29]. Unmapped reads as well as those mapping below a mapQ of 30 were then discarded using SAMtools view [30]. SAMtools was then used to merge and sort the paired-end and single-end pseudoread alignments into a single BAM file, which was used for all subsequent analyses. We observed a number of Ae. bromeliae reads mapping with coordinates outside the normal range, so for this set we used a custom script to remove read pairs with mapping start positions less than 100 bp or greater than 400 bp. Reads were then realigned around indels using GATK version 3.4-0 [31], and both optical and PCR duplicates were removed using Picard [32] version 1.90. An uncompressed BCF was generated using SAMtools mpileup version 0.1.19 with indel calling disabled; skipping bases with a baseQ/BAQ less than 30; and mapQ adjustment (-C) set to 30. This was finally converted to a VCF using bcftools.
Low-quality SNPs were removed by using SNPcleaner version 2.2.4 [33] to remove sites that had a total depth across all individuals of >1500 or had less than 10 individuals with at least 10 reads. Additional sites were filtered based on default settings within the SNPcleaner script. VCF files were queried using SNPcleaner for each population separately in order to obtain a set of robust sites for analysis. This list was used as a -sites file input for ANGSD [34], such that subsequent analysis within ANGSD was restricted to these sites. For some analyses that require comparison among populations, we found the intersect between the lists of high-quality sites for each population and used this common set for analysis. Minimum map quality and base quality thresholds of 30 and 20 were used. For some analyses we converted genotype likelihoods into hard-called genotypes using the doGeno function in ANGSD with a cutoff of 0.95 for posterior probabilities on the genotype calls and a minimum read depth of 8. This read processing and genotype calling process was repeated for the sequence reads from Ae. bromeliae, except that the Ae. aegypti sites list was used since SNPcleaner is not intended for single diploid samples.

Population genetics analysis

We estimated the nucleotide diversity π using ANGSD, which calculates π based on estimates of per-site allele frequencies across each population sample (i.e. without the need to call genotypes), directly accounting for sample size and read depth. We estimated 95% bootstrap confidence intervals (CIs) by resampling scaffolds with replacement 500 times and recalculating the statistic. As nucleotide diversity is reduced in coding sequence due to purifying selection, we only used sites >500 bp from exons in this analysis (≥399,259 in each population).

To construct a neighbour-joining tree of our samples, we first estimated the pairwise genetic distance (D_xy) between all pairs of samples based on genotype calls.
D_xy was calculated from the called genotypes as (h + 2H)/(2L), where h is the number of sites where one or both individuals carry heterozygous genotypes, H is the number of sites where the two individuals are homozygous for different alleles and L is the number of sites where both individuals have called genotypes.

To investigate population structure and the ancestry of individual mosquitoes, we performed an admixture analysis using NGSadmix, which makes inferences based on genotype likelihoods [35]. We also analysed data from the three chromosomes separately using the chromosome assignments of Juneja et al. [20]. As an alternative approach to investigate genetic structure, we performed a principal component analysis (PCA). The PCA was based on a covariance matrix among individuals that was computed while accounting for genotype uncertainty using the function ngsCovar in ngsTools [33].

We calculated F_ST [36] between populations from allele frequencies estimated for each population directly from read data using ANGSD. This analysis used data from 17,351,731 coding and non-coding sites with no minimum minor allele frequency.

We investigated the historical relationships between our populations by reconstructing a population maximum likelihood tree based on allele frequencies using the program TreeMix [37]. This analysis used all high-quality coding and non-coding sites in our dataset, and Ae. bromeliae was used as an outgroup. We chose this species, as the more closely related outgroup Ae. mascarensis frequently shares polymorphisms with Ae. aegypti [16]. To account for the non-independence of sites due to linkage disequilibrium, we used a block size (k) of 100 SNPs. To evaluate the confidence in the inferred tree topology, 1000 bootstrap replicates were conducted by resampling blocks of 100 SNPs. To test whether there had been migrations between the populations after they split, we used the three- and four-population tests of Reich et al.
[38], also implemented in TreeMix.

We estimated one- and two-dimensional site frequency spectra (SFS) using the doSaf function within ANGSD to estimate per-site allele frequencies combined with the realSFS program [39] to optimize the genome-wide SFS. We minimised the effect of natural selection on the SFS by including only third codon position sites as well as non-coding sites more than 100 bp from the nearest exon, and as before, only sites passing all filters were included for analysis. Approximately 6.44 Mb was included in this dataset. To facilitate comparison among populations, we down-sampled the larger population samples and chose 10 randomly selected individuals from each population. Two-dimensional (2D) spectra were plotted using dadi [40].

We fit two classes of demographic models to the data from Senegal Forest, Senegal Urban and Mexico using fastsimcoal2 version 2.5.2 [41] to distinguish between the hypotheses that Senegal Urban is evolutionarily intermediate because it (1) is admixed with domesticated, non-African ancestry, or (2) represents the domesticated form within Africa that is the genetic ancestor of non-African domesticated populations. We first fit simple three-population models with no size changes for each of the two classes, and then fit a second version of the model including size changes in each of the three populations. Schematics of the two models and their parameters can be found in Additional file 3.
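The pairwise distance used for the neighbour-joining tree in the Population genetics analysis section, D_xy = (h + 2H)/(2L), can be written out as a short Python sketch. This is an illustration of the formula as stated in the text, not the authors' script. The genotype encoding (0 and 2 for the two homozygotes, 1 for a heterozygote, None for an uncalled site) and the function name are assumptions made for this example.

```python
def pairwise_dxy(genotypes_x, genotypes_y):
    """Pairwise genetic distance D_xy = (h + 2H) / (2L) between two individuals.

    Each argument is a sequence of per-site genotype calls, encoded
    (hypothetically) as 0 or 2 for homozygotes, 1 for heterozygotes,
    and None where no genotype was called.
    """
    h = 0  # sites where one or both individuals are heterozygous
    H = 0  # sites where both are homozygous for different alleles
    L = 0  # sites where both individuals have called genotypes
    for gx, gy in zip(genotypes_x, genotypes_y):
        if gx is None or gy is None:
            continue  # site not called in both individuals
        L += 1
        if gx == 1 or gy == 1:
            h += 1
        elif gx != gy:
            H += 1
    if L == 0:
        raise ValueError("no sites with genotype calls in both individuals")
    return (h + 2 * H) / (2 * L)


# Toy genotypes at six sites; the fourth site is missing in individual x.
ind_x = [0, 1, 2, None, 0, 1]
ind_y = [0, 0, 0, 2, 2, 1]
distance = pairwise_dxy(ind_x, ind_y)  # h = 2, H = 2, L = 5
```

Applying the function to every pair of individuals would give the distance matrix that a neighbour-joining routine takes as input.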