Genomic Profiling of Insecticide Resistance in Malaria Vectors: Insights into Molecular Mechanisms.

preprint OA: closed
Full text JSON View at publisher
Full text 156,263 characters · extracted from preprint-html · click to expand
Genomic Profiling of Insecticide Resistance in Malaria Vectors: Insights into Molecular Mechanisms. | Research Square window.SnipcartSettings = { analytics: { enabled: false } }; (function() { var accessVector = localStorage.getItem('access_vector') || ''; window.dataLayer = window.dataLayer || []; if (accessVector) { window.dataLayer.push({ user: { profile: { profileInfo: { snid: accessVector } } } }); } })(); (function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src='https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);})(window,document,'script','dataLayer','GTM-K279D39R'); Browse Preprints In Review Journals COVID-19 Preprints AJE Video Bytes Research Tools Research Promotion AJE Professional Editing AJE Rubriq About Preprint Platform In Review Editorial Policies Our Team Advisory Board Help Center Sign In Submit a Preprint Cite Share Download PDF Article Genomic Profiling of Insecticide Resistance in Malaria Vectors: Insights into Molecular Mechanisms. Victoria Ingham, Sanjay Nagi This is a preprint; it has not been peer reviewed by a journal. https://doi.org/ 10.21203/rs.3.rs-3910702/v1 This work is licensed under a CC BY 4.0 License Status: Posted Version 1 posted You are reading this latest preprint version Abstract Malaria control faces challenges from widespread insecticide resistance in major Anopheles species. This study, employing a cross-species approach, integrates RNA-Sequencing, whole-genome sequencing, and microarray data to elucidate drivers of insecticide resistance in Anopheles gambiae complex and An. funestus . Findings show an inverse relationship between genetic diversity and gene expression, with highly expressed genes experiencing stronger purifying selection. These genes cluster physically in the genome, revealing potential coordinated regulation. We identified known and novel candidate insecticide resistance genes, enriched in metabolic, cuticular, and behavioural functions. We also present AnoExpress, a Python package, and an online interface for user-friendly exploration of resistance candidate expression. Despite millions of years of speciation, convergent gene expression responses to insecticidal selection pressures are observed across Anopheles species, providing crucial insights for malaria vector control. This study culminates in a rich dataset that allows us to understand molecular mechanisms, better enabling us to combat insecticide resistance effectively. Biological sciences/Genetics/Gene expression Biological sciences/Genetics/Genetic association study Anopheles malaria transcriptomics insecticide resistance pyrethroids selective sweep purifying selection regulatory network Figures Figure 1 Figure 2 Figure 3 Figure 4 Figure 5 Figure 6 Introduction Malaria remains a leading cause of worldwide morbidity and mortality, resulting in an estimated 619,000 deaths in 2021 [ 1 ]. Insecticide-based vector control interventions are the most successful tools for reducing malaria incidence, responsible for over 80% of the malaria cases averted between 2000 and 2015 [ 2 ]; however, the gains made in malaria control have plateaued in the last eight years and cases are beginning to rise [ 1 ]. The reversal of gains is partially due to widespread insecticide resistance within populations of the anopheline mosquito vector [ 3 ], reducing the efficacy of insecticide-treated bed neds (ITNs) and indoor residual spraying (IRS), the two primary malaria intervention tools [ 1 , 4 ]. Across sub-Saharan Africa, four mosquito species are responsible for the majority of malaria transmission, the An. gambiae species complex ( An. gambiae , An. arabiensis , and An. coluzzii ) and An. funestus [ 5 – 7 ]. Understanding the similarities and differences in resistance mechanisms between the different dominant vector species is critical to inform malaria mitigation strategies, which are mostly employed indiscriminately despite differences in vector composition between and within areas of sub-Saharan Africa. Sharing of resistance mechanisms between insects is commonly observed; mutations at the insecticide target site genes Rdl , Kdr and Ace-1 occur at equivalent codons throughout arthropods [ 8 , 9 ], and insecticide detoxification is often driven by cytochrome P450s from the CYP6 insect clade [ 10 ]; this may occur through parallel evolution, or in closely related insects, introgression, where genetic material is passed between species via hybridisation. Although adaptive introgression is seen at resistance loci within the An. gambiae species complex [ 11 , 12 ], there is no evidence of hybridisation or introgression with An. funestus , which diverged approximately 80 million years ago [ 13 , 14 ]. Insecticide resistance in Anopheles mosquitoes commonly involves large gene families, such as cytochrome P450 monooxygenases (P450s), glutathione S-transferases (GSTs), carboxylesterases (COEs), and UDP-glycosyltransferases (UDPs) [ 15 , 16 ]. Upregulation of key P450s has been reported in insecticide resistant populations of each Anopheles species, for example CYP6Z1, CYP9K1, CYP6AA1 and CYP6P9a in An. funestus [ 17 – 19 ]; CYP9K1, CYP6AA1, CYP6P3 and CYP6M2 in An. gambiae and An. coluzzii [ 20 – 22 ]; and CYP6P4 in An. arabiensis [ 23 ]. Similarly, cuticular resistance [ 24 ], which involves thickening and modification of the composition of the cuticle, slowing the rate of absorption of insecticides has been reported in both An. gambiae s.l [ 25 ] and An. funestus [ 26 ]. Work comparing distinct Anopheles species found that gene numbers within these families were similar across species but lineage-specific losses and gains are regularly observed [ 14 ]. Despite these similarities, differences between the An. gambiae species complex and An. funestus are also observed. A prominent example is single nucleotide polymorphisms (SNPs) in the target site of insecticide, which reduce the efficacy of insecticides through changes to binding affinity. The two best studied are Kdr and Ace-1 [ 27 , 28 ], both of which are found in the An. gambiae species complex, but which are seemingly absent in An. funestus . Novel mechanisms of resistance have also recently been discovered, including the direct binding of pyrethroids by chemosensory proteins (CSPs) [ 29 ], as well as evidence for a role of both hexamerins, α-crystallins [ 30 ]. The recent discovery of such mechanisms highlights the complexity of resistance and the need for further study into molecular mechanisms of resistance in major malaria vectors. A plethora of ‘omics data has now been generated for malaria vectors. Over the past two decades, transcriptomic studies have driven discoveries into mechanisms of insecticide resistance, first in the form of microarrays, and subsequently RNA-Sequencing. In addition, since publishing the first phase of the Anopheles 1000 genomes project in 2017 [ 31 ], the MalariaGEN Vector Observatory has generated and made public thousands of high-quality whole-genome sequences of major malaria vectors from throughout sub-Saharan Africa. Although individual -omics experiments have had great success in identifying highly over-expressed transcripts involved in resistance, these studies result in thousands of differentially expressed genes resulting from noise from the susceptible comparator. Previous work has successfully used microarray data in exploratory analyses to identify patterns of gene expression across the An. gambiae species complex, identifying meta-signatures across geographically and temporally disparate data, highlighting the importance of meta-analyses [ 29 , 30 ]. In this study we characterise gene-expression profiles from published transcriptomic studies across two malaria vector complexes and establish relationships with whole-genome sequence data. To achieve this, we explored expression in resistance candidate genes and protein-coding families across 28 individual experiments of An. gambiae s.l and An. funestus [ 32 – 38 ]. These data have been integrated both with the previous microarray meta-analysis [ 30 ] and with whole-genome sequence data from the Anopheles 1000 genome project [ 31 ]. The meta-analysis performed here has been made available as a resource to the community as a user-friendly python package, AnoExpress, combined with convenient interactive notebooks intended to be run in Google Colaboratory. Results AnoExpress We performed a differential expression meta-analysis on read count data from 28 RNA-sequencing datasets representing insecticide-resistant Anopheles populations from 11 countries in sub-Saharan Africa [ 32 – 38 ] (Fig. 1 , Supplementary Table 1). Of the 28 resistant populations, 15 are An. coluzzii , 3 are An. gambiae , 5 are An. arabiensis and 5 are An. funestus . Read-count data from different species were derived from aligning to each respective reference genome assemblies. For differential expression analysis, each resistant population was compared to an insecticide-susceptible strain of the same species from the same study. To enable cross-species comparison of putative resistance genes, we located orthologs between each different genome assembly and the AgamP4 PEST reference genome, using the OrthoMCL algorithm in VectorBase. As the inclusion of each successive assembly reduces the numbers of genes that are present, we provide four differential expression analyses; the full dataset, termed ‘ gamb_colu_arab_fun ’ (8599 genes), and secondary analyses – ‘ gamb_colu_arab’ (8651 genes), ‘ gamb_colu ’ (11288 genes), ‘ fun ’ (14176 genes). Analyses herein were performed on the full gamb_colu_arab_fun dataset unless specified. These data are presented within a python package, AnoExpress, made for the community, with example notebooks provided to run in Google Colab. Users can load, explore, and visualise the gene expression meta-analysis data, including reproducing most analyses presented herein. AnoExpress can also directly load gene expression data from an earlier meta-analysis of microarray data [ 30 ]. AnoExpress is located here: github.com/sanjaynagi/AnoExpress. Documentation and a video user-guide are provided to improve ease of access. Dataset structure To investigate overall structure in the dataset, we performed principal components analysis on both the count and fold-change data. PCA on the log-transformed count data revealed five distinct clusters present. Four clusters contain samples only from An. funestus, An. coluzzii , or An. arabiensis , with a fifth cluster containing a mixture of An. gambiae, An. arabiensis and An. coluzzii (Fig. 1 A). The four species did not resolve into four distinct clusters, suggesting that batch effects from different studies could be present. We therefore only performed within-study differential expression analysis. To compare RNA-Sequencing data with the earlier microarray studies, we performed a PCA on the fold-change data from all available experiments (Supplementary Fig. 1). There was no evidence of clustering between the two technologies, suggesting that the results between the two methodologies are comparable and can be used in combined analyses. Correlations between pairs of orthologous genes count data from different species were high, with lower correlation between An. funestus and the other three species, which is to be expected given their considerable divergence time (Fig. 1 B). Gene expression is physically clustered in the genome As changes in gene expression are likely to be related to cis-acting factors and the local genetic context, genes in close physical proximity are more likely to show similar patterns of expression. To test this within our dataset, we calculated and averaged the Pearson’s correlation on log2 fold-changes between each neighbouring gene in the dataset, resulting in a mean Pearson’s R of 0.10 for neighbouring genes. To determine whether this value was higher than expected based on chance, we randomly permuted the positions of all genes in the genome 1000 times and re-calculated the mean Pearson’s R for now neighbouring genes. Figure 2 C shows the histogram of mean Pearson’s correlations, demonstrating a clear effect of genome proximity on gene expression. To further explore physical clustering of resistance-related gene expression, we visualised fold-change values across the whole genome. In Fig. 3 , we use AnoExpress to plot gene expression for all genes across all comparisons in the gamb_colu_arab_fun analysis and calculate the moving average log2 fold change across the genome in sliding windows. We identify gene clusters known to be associated with insecticide resistance, such as the CYP6P, CYP9K, CYP6M/Z and GSTD gene clusters. This analysis also highlighted numerous clusters of cuticle proteins which show elevated expression. Genetic diversity and purifying selection scale with levels of gene expression We postulated that genes showing higher levels of absolute expression are more likely to be functional in multiple tissues or pathways, and so should be highly conserved, experiencing the highest levels of purifying selection. We integrated whole-genome sequence data from the Anopheles 1000 genomes project [ 31 ], to explore the relationship between gene expression and genetic diversity, for each species in which genomic data was publicly available. To compare between species, we selected representative populations of An. gambiae (Nagongera, Uganda), An. coluzzii (Hauts-Bassins, Burkina Faso), and An. arabiensis (Muleba, Tanzania), collected between 2012 and 2015. These cohorts show no signatures of inbreeding of major demographic events. As a measure of absolute expression, we used normalised count data. We then counted the number of segregating sites of synonymous and non-synonymous mutations, as well as calculating nucleotide diversity, π, in each gene, for each species. Figure 4 shows the ratio of pN/pS and nucleotide diversity, for various levels of expression binned into 5% percentiles for the gamb_colu_arab analysis. We show that average nucleotide diversity is reduced for the most highly expressed genes and is lower overall in An. arabiensis compared to An. coluzzii and An. gambiae , which fits with expectations on genetic diversity from the literature [ 39 ]. We show that for each species, the pN/pS ratio is lower for the most highly expressed genes compared to lowly expressed genes. This holds more strongly in An. gambiae and An. coluzzii than An. arabiensis , which likely relates to the efficiency of purifying selection depending on population size – the effects of purifying selection will be stronger in larger populations. Highly overexpressed genes are associated with selective sweeps in wild-caught mosquitoes Given that insecticide resistance is often associated with increased expression of metabolic genes and that beneficial resistance mutations can cause selective sweeps which spread through a population, we hypothesised that highly expressed genes are more likely to be found in proximity to genomic regions under selection in wild-caught mosquitoes. To determine this, we investigated genome-wide signals of recent selection using the H12 statistic [ 40 ] on data from phase 3 of the Anopheles 1000 genomes project (Nagi et al., in prep). Every signal was then mined to determine which genes in the AgamP4 assembly, if any, lie at the location of the peak of a selection signal. Further information on genome-wide selection scans and signal calling is found in Supplementary Text 1. Candidate genes were then defined as those genes showing a median fold-change of greater than two in the An. gambiae, An. coluzzii and An. arabiensis dataset (gamb_colu_arab) , as these three species are represented in the Ag1000G phase 3 data [ 38 ]. We additionally filtered out lowly expressed genes (median counts < 5). After filtering, 106 candidate genes remained, of which 34 were located at the site of a signal of selection (Supplementary Table 2). Figure 5 displays the location of these genes in the An. gambiae PEST reference genome, and recovers known insecticide resistance candidates such as CYP6P3 , CYP6M2 , GSTE2 and CYP9K1 , as well as numerous other genes putatively involved in resistance. To determine whether this was a significant enrichment of genes compared to random chance, we carried out a permutation analysis. In 10000 permutations, only 2 equalled or exceeded the value of 34 (p-value = 0.0002), suggesting that regions of the genome under selection are indeed enriched for the most highly overexpressed genes. Separately, we utilised the entire dataset of genes and found a positive association between a genes’ median fold-change and whether it was found at the site of an H12 selective sweep signal (p GLM = 0.000). Signatures of resistance-associated gene expression To explore highly over-expressed genes across all four species, we ranked genes by mean and median log 2 fold change across all experiments (Supplementary Tables 3 and 4). Included in the over-expressed set (cut-off of fold change > 2, median counts > 5), are 65 and 78 genes respectively. Within these lists, the known pyrethroid metaboliser CYP6P3 [ 39 ] (the ortholog of CYP6P9a/b in An. funestus ) had a median and mean fold change of > 6.8, demonstrating the utility of the meta-analytic approach. Included in both lists were other cytochrome P450s including CYP6P4, CYP9K1, CYP6M2, CYP6Z2, CYP6Z3 and CYP6P5 , all of which have previously been linked to pyrethroid resistance across both species [ 20 , 21 ]. Interestingly, CYP4H17 which has never been explored for a role in pyrethroid resistance, appears 7th (mean) and 11th (median) with a median fold change of 3.47. Similarly appearing high across both mean and median rankings are cuticular related proteins CPLCG4, CPR150 and CPLCA3 , a venom allergen (AGAP006417), an alkaline phosphatase (AGAP001684), gustatory receptor 49 (AGAP001169) and a protein with no known function (AGAP009327). In total, 50 genes are present in both the median and mean over-expressed sets including one carboxylesterases, 11 cytochrome P450s, three glutathione-S-transferases and one UGT from the major detoxification enzyme families (Fig. 6 A). Additionally, 9 cuticular proteins, one D7 salivary gland protein and two hexamerins are present from gene families previously linked to resistance [ 30 , 41 ]. Two genes related to odorant or taste detection are also present. We then performed gene set enrichment analysis on the top 5% of overexpressed genes based on median fold-changes (419 genes) (Supplementary Table 5). We performed enrichment on both GO terms and protein (PFAM) domains; significant enrichment terms are displayed in Fig. 6 D, categorised by broad resistance mechanism. In expectation with known gene families involved in insecticide resistance, we observe highly significant enrichment for many GO terms relating to detoxification, such as oxidoreductase activity (fdr-corrected p-value = 8e-16), iron ion binding (p = 4.75e-15), heme-binding (p = 6.8e-14), monoxygenase activity (p = 9e-14), glutathione activity (p = 0.001) and glutathione metabolic process (p = 0.001), and the ‘P450’ (p = 1.6e-16), ‘GST_N_3’ (p = 0.002) and ‘GST_C’ (p = 0.004) protein domains. In addition, cuticular-related genes were also enriched, with the GO terms structural constituent of cuticle (p = 1.1e-23), chitin binding (p = 2.3e-09), chitin metabolic process (p = 1.3e-8), fatty acid elongase activity (p = 0.04), and the protein domains ‘Chitin_bind_4’ (p = 3.2e-20), ‘CMB_14’ (p = 1.3e-18), and CPCFC (p = 0.03). Finally, a large group of terms were related to sensory perception, including the GO terms ‘sensory perception of smell’ (p = 1.4e-6), odorant binding (p = 2.4e-6), olfactory receptor activity (p = 0.0002), sensory perception of taste (p = 0.018), taste receptor activity (p = 0.04), and the protein domains ‘7tm_6’ (p = 0.00003) and ‘7tm_7’ (p = 0.03), which indicate odorant and gustatory receptors. We then explored highly down-regulated genes based on median fold-changes (Fig. 6 C). Amongst the most down-regulated genes was CYP307A1 , a known regulator of ecdysone synthesis [ 42 ]. In addition to hormone related transcripts CLIPA13 and Galectin 6 are heavily down-regulated. Zero GO terms were enriched based on the 5% lowest median fold-changes. If we instead use the 5% lowest mean fold-changes, we observe significant enrichment for sodium channel activity (p = 0.00013), sodium ion transport (p = 0.00059) and sodium ion transmembrane transport (p = 0.0044) the target sites for pyrethroids. Consistently differentially expressed genes We then explored genes which were significantly up and down-regulated most consistently. After filtering genes with very lowly expressed genes, only two genes were upregulated in 22 out of 28 RNA-Sequencing experiments, CYP6P3 and CYP9K1. GSTD3 and CYP6P4 were upregulated in 21 out 28. Figure 6 B shows all 18 genes significantly over-expressed in at least 19 out of 28 experiments, and includes CYP4H17, CYP4H18, CYP6M2, GSTE2, a UGT and a trypsin TRYP4 (Supplementary Table 6). This set of genes were enriched for seven P450-related GO terms, including oxidoreductase activity (p = 1.68e-09) and heme binding (p = 5.66e-09), and the PFAM domain ‘p450’ (p = 9.19e-10). In contrast, only 8 genes showed consistent down-regulation in at least 19 out of 28 experiments (Supplementary Table 6), including two opsin genes, GPROP4 and GPROP6, and a molecular chaperone HtpG gene (AGAP006961). The genes are enriched in seven GO terms relating to light detection, such as photoreceptor activity (p = 0.029), visual perception (p = 0.029) and phototransduction (p = 0.039). Detoxification genes In total 18 cytochrome P450s show either a median and mean fold change of > 2 and 9 are significantly upregulated in at least 19 out of 28 of all datasets. The cytochrome P450 family most represented is the CYP6 family, orthologous to the CYP3 clade in humans, responsible for the vast majority of xenobiotic metabolism. The CYP6Ps , CYP6M2 and CYP9K1 which have been repeatedly associated with insecticide resistance are all heavily implicated and are located in regions of the genome under selection (Fig. 5 ). Additionally, four members of the CYP4 family appear to be overexpressed across multiple resistant populations, particularly CYP4H17 , but also CYP4H16 , CYP4H18 , and CYP4C28 . Similarly, 4 out of 27 GSTs (Fig. 6 ) have high mean or median fold-changes with one, GSTD3 showing consistently significant overexpression. Of these, GSTE2 has been shown to be involved in pyrethroid resistance [ 38 ], whilst GSTE1, 2 and 5 have been implicated in DDT resistance [ 43 ]. Of the other phase two and three detoxification families, two UGTs (AGAP006222 and AGAP011564) and two carboxylesterases ( COEAE6O and COEAE8O ) are expressed with high mean or median expression or consistently over-expressed (Fig. 6 ). Cuticular Proteins 113 cuticular proteins were included in the full gamb_colu_arab_fun analysis; after filtering for lowly-expressed genes, only 61 of these remained. 20 of these genes showed a mean or median fold-change of above 2 in resistant populations, and although none showed significant over-expression in 19/28 experiments, 9 showed upregulation in at least 16, including CPLCX3, CPR130, CPAP3-C, CPR16, CPR140, CPR59, CPR79, CPLCP12 and CPR81 . Expression of neuronal-related genes Gene-set enrichment analysis of GO terms on the 5% most down-regulated genes for median expression revealed significant enrichments in synapse (15 genes), sodium channel activity (10 genes) and chemical synaptic transmission (13 genes). The primary target site for pyrethroid insecticides, the Vgsc , is not present in the full gamb_colu_arab_fun analysis, but investigation of the secondary analyses shows sporadic down-regulation in An. coluzzii and An. funestus but not An. gambiae , suggesting that changes in expression of this gene are unlikely to be playing a role in resistance. Interestingly, Ace-1 expression demonstrated up-regulation in almost all An. coluzzii and An. gambiae populations whilst it is down-regulated in all 5 of the An. funestus populations. The gene coding for the beta-2 subunit of the nicotinic acetylcholine receptor is significantly overexpressed in 11 out of 28 experiments. Other a priori resistance candidates A number of genes and gene families have recently been implicated in insecticide resistance, including CSPs, hexamerins, alpha-crystallin, the transcription factor Maf-S and d7 salivary proteins [ 29 , 30 , 41 , 44 ]. Of the 8 chemosensory proteins, just three remained in the full gamb_colu_arab_fun analysis, CSP3 , CSP4 and SAP1 , none of which showed high mean or median fold-changes across four species. Similarly, three hexamerins were included in these data, all three of which showed high mean fold changes ranging from 2.64 to 3.4. Further exploration revealed that overexpression was observed in all An. arabiensis and the majority of An. gambiae experiments, with the orthologs of AGAP001345 and AGAP001657 being overexpressed in two (Uganda and FuMoz) out of five An. funestus datasets. Just two a-crystallins were present in the data, neither of which showed an expression-association with resistance. The transcription factor Maf-S was previously identified as up-regulated across multiple An. gambiae s.l populations in the meta-analysis of microarray data [ 30 ]. It also shows similarly consistent overexpression in the RNA-Sequencing data, with 12 out of 28 populations exhibiting significant up-regulation. Transcriptional regulation of insecticide resistance Previous work using transcriptomic time-course data revealed putative transcription factors regulating the expression of transcripts post-pyrethroid exposure [ 45 ]. Here, we used the GENIE3 algorithm in the Grenadine package to re-capitulate a gene regulatory network utilising the 28 RNA-sequencing and including 31 microarray experiments. After filtering score for the top 5th percentile (Supplementary Table 7), 50,967 interactions were predicted. Firstly, we investigated Maf-S due to its prior link with insecticide resistance [ 44 ]. 269 putative interactions were identified, including the pyrethroid-resistance related transcripts CYP6M2 , CYP6Z3 , CYP9J4 , two ABC transporters and three GSTs including GSTMS1 . To validate the utility of this approach, microarray data published from a Maf-S knockdown [ 44 ] was compared with the model predictions; 83 out of the predicted 269 genes were present on the list. We then carried out a permutation analysis, and out of 1000 permutations only 18 had 83 or more overlaps (p = 0.018) giving confidence to the model predictions. To determine which transcription factors are likely to play a role in resistance, enrichment analysis was performed on the predicted interaction partners for each transcription factor: Rootletin was associated with genes significantly enriched for oxidioreductase activity (p = 0.00943) and was linked with CYP6P3, CYP6P4 and CYP6P5, CYP6AA1, CYP6M2 and CYP6Z4 ; genes predicted to interact with Su(H) were similarly enriched in oxoreductase activity (p = 9.19e-3); Antp, Cbt and Sug were significantly enriched in cellular response to stress (p = 5.19e-4; 3.83e-4; 2.30e-8), Asciz and Org-1 to sensory response to chemical stimulus (p = 5.56e-4; 1.69e-4), E(spl)m3-HLH and Grh with structural constituent of the cuticle (p = 8.36e-5; 1.71e-2); exex with acetylcholine metabolic process (p = 3.57e-3); Grau, Scrt and Toy with G protein-coupled receptor activity (p = 1.47e-5; 1.01e-2; 2.46e-6); HLH3B with neurogenesis (p = 4.01e-2); Onecut with synpase organisation (p = 5.26e-3); Row with response to stress (p = 8.97e-6) and TFAM with electron transport chain (1.25e-7). A number of transcription factors were enriched in higher order functions, such as RNA metabolic processes, gene expression and aromatic compound metabolic processes. Discussion Despite the wide availability of transcriptomic data for major malaria vectors, no study to date has analysed these data across the four primary vectors, An. gambiae, An. coluzzii, An. arabiensis and An. funestus. In this study we performed a meta-analysis of these data, demonstrating clear convergence of molecular mechanisms of known insecticide resistance genes, as well as discovering novel candidates showing high levels of expression across the divergent species. We present a user-friendly python package, AnoExpress, which allows users to explore, visualise, and analyse the transcriptomic meta-analysis data. Integrating differential expression and single nucleotide polymorphism (SNP) data has the potential to enhance the discovery of novel resistance markers [ 46 , 47 ]. We integrated whole-genome sequence data from the Ag1000G with our transcriptomic meta-analysis, estimating signals of positive selection in wild mosquitoes, and locating the intersection of genes that both are highly overexpressed and are located at sites of selection signals. As over-expression of genes is a common mechanism which can confer insecticide resistance, cis-acting mutations which are likely to increase the expression of nearby genes are expected to be under positive selection. Indeed, cis-acting factors have been identified on a triple-mutant haplotype around CYP6AA1 in East African An. gambiae [ 22 ], and in An. funestus driving expression of CYP6P9a/b [ 48 ]. The transcriptomic meta-analysis recovered many known resistance genes from metabolic gene families, including CYP6P3 (the CYP6P9a/b ortholog), CYP6P4 , CYP6AA1 , CYP9K1 , CYP6Z2, CYP6Z3, CYP6M2 and Gste2 [ 20 – 22 ]. Each of these cytochrome P450s have been shown to directly metabolise insecticides [ 20 , 49 ], as well as being associated with selective sweeps in field populations [ 19 , 31 ]. Additionally, several detoxification genes showing consistent overexpression have recently been implicated in gene duplication events in the An. gambiae complex, including CYP9K1 , CYP6AA1, COEAE60 and GSTE2-4 [ 22 , 50 ]. The appearance of these genes in both candidates with consistently high mean/median fold change, as well as those consistently expressed over multiple datasets demonstrates the power of this approach. CYP4H17 is a striking example of a gene which seems to have been missed in earlier studies and is a promising candidate for functional validation in both An. gambiae s.l and An. funestus . In addition to detoxification genes, we see a signature of oxidative stress response. For example, Maf-S is involved in the Nrf2-cnc pathway, which induces expression of stress response genes upon changes to the oxidative stress levels of the cell [ 44 ]. Interestingly, Maf-S has been shown to have a role in pyrethroid resistance and was identified from consistent overexpression in microarray datasets [ 44 ] and is similarly consistently overexpressed in the RNA-Sequencing data, whilst another member of the pathway, Keap1 , is upregulated in An. funestus from Ghana. In addition to this characterised pathway, protein DJ-1 (AGAP000705) and Catalase are consistently overexpressed, both these genes play a role in protection against oxidative stress [ 51 ]. These data are consistent with previous publications indicating that insecticide exposure induces an oxidative stress response [ 52 – 54 ]. Signatures of expression in candidates related to cuticular hydrocarbons were also present, which may indicate a change to lipid processing as recently proposed [ 55 , 56 ] or be related to a thicker cuticle [ 24 ]. Genome-wide expression scans highlighted multiple gene clusters of cuticular protein families as outliers of gene expression, including the CLPCA, CPLCG, CPLCP and CPR families, which is reflected in the large number of these genes in the top candidate gene lists based on median and mean fold-changes. AGAP010368, an ortholog to a gene involved in fatty acid alpha-oxidation in Drosophila , a long chain fatty acid CoA ligase (AGAP009159) and a fatty acid elongase (AGAP003195) are consistently over expressed. The large number of cuticular proteins (20/113) highly expressed and the lack of those consistent across all populations may suggest high levels of redundancy. We observe little differential expression of the major target site genes, such as the target of pyrethroids the Vgsc and dieldrin Rdl . As previously reported, the expression profile of the OP and carbamate target Ace-1 is an exception, often upregulated in An. coluzzii and An. gambiae yet showing negative fold-changes An. funestus experiments. It is worth noting that all five An. funestus experiments included in this study use the same susceptible comparator strain, FANG, and so this result could be a peculiarity of this laboratory strain. We find regular over-expression of the gene coding for the beta-2 subunit of nicotinic acetylcholine receptor (AGAP010057), recently associated with Pirimiphos-methyl resistance in a genome-wide association study [ 12 ]. As well as physiological changes which directly confer resistance to insecticides, the potential for mosquitoes to avoid contact with insecticides and exploit humans when they are least protected is a serious threat to malaria vector control. Mosquito behaviour is likely to be driven in part by gene expression [ 57 ] and transcriptomic studies may allow us to identify candidate genes involved in insecticide-resistant behaviours. This is of particular interest given the difficulty in measuring behavioural resistance in wild mosquito populations. Interestingly, enrichment analyses highlighted that our expression candidate genes were enriched for genes involved in olfactory processes. Gustatory receptor 49 (AGAP001169) shows a median fold-change of 3.39, the highest of any olfactory receptor in our data, and is a promising candidate for behaviourally-linked insecticide resistance. The odorant receptors 13 (AGAP009396) and 9 (AGAP008333) also show high median fold-changes of 2.73 and 2.71, respectively. It is, however, important to note that adaptation of susceptible strains to insectary rearing could cause the downregulation of olfactory genes, and so contribute to these signals. We also identify candidate genes from gene families not previously associated with resistance in malaria vectors. The ortholog of neural lazarillo (AGAP009281) has a high median fold-change, a gene which is known to contribute to longevity, stress resistance and behavioural change through the JNK stress response pathway in Drosophila [ 58 ]. Several venom allergens also exhibit high median fold-changes. Interestingly, Venom allergen 5 has been associated with resistance in Culex through dsRNA and over-expression in cells [ 59 ]. These genes are poorly understood but a number have been found in the salivary gland proteome where they have a diverse range of functions, with none currently being characterised [ 60 ]. Trans-acting regulation plays an important role in modulating gene expression to insecticide stress in insects [ 44 , 61 – 63 ]. To explore whether we could identify transcription factors involved in insecticide resistance, we inferred a gene regulatory network. The predicted interactors of several transcription factors were enriched in GO terms related to insecticide resistance. Amongst them is Rootletin , which is known to play a role in behavioural response in Drosophila [ 64 ] and was previously linked to resistance as a key hub in response to insecticide exposure [ 45 ], whilst Grau has been linked to pyrethroid resistance in Aedes populations [ 65 ]. Other transcription factors enriched in insecticide resistance-related terms that are consistent with known roles in Drosophila includes Cabut , which has been shown to regulate stress response [ 66 ], E(spl)m3-HLH which is linked with stress granules [ 67 ], Grh that is involved in cuticular repair [ 68 ], Exex is involved in neuronal function [ 69 ], Onecut with synapse organisation [ 70 ] and Tfam is a mitochondrial transcription factor, with a role in oxidative stress response [ 71 ]. We also explored the role that gene expression plays in shaping levels of genetic diversity across the genome. Protein evolution is constrained by purifying selection, which acts to prevent deleterious mutations from spreading in a population [ 72 ]. Previous studies have shown a relationship between higher gene expression and a slower accumulation of deleterious mutations [ 46 ]. We find a similar relationship in Anopheles mosquitoes, with the most highly expressed genes displaying lower rates of pN/pS, as well as a reduction in measures of genetic diversity, and replicate this effect across three mosquito species. Anopheles gambiae s.l populations exhibit extremely large effective population sizes, which should accelerate the removal of deleterious mutations. Between species differences support this, as An. arabiensis displays lower nucleotide diversity, but higher rates of pN/pS for the most highly expressed genes. In this study, we present a holistic overview of signatures of gene expression between from the major malaria vectors Anopheles gambiae s.l and Anopheles funestus . As well as establishing relationships between gene expression and whole-genome sequence data, we demonstrate the importance and convergence of detoxification genes, as well as highlighting putative transcription factors and novel gene families involved in insecticide resistance. Despite the power in this study there are several limitations that should be considered; firstly the An. funestus data is from one experiment, and therefore is likely to have significant batch effects. Secondly, Anopheles have a gene/loss factor five times that of Drosophila [ 14 ] resulting in many gene families have one-to-many orthologs which have been excluded from this study, which results in a loss of data. Finally, here we take ‘insecticide resistance’ as an overall phenotype because the populations grouped in this study have differential resistances to different classes, although all are pyrethroid resistance. After taking these limitations into count, these data highlight novel functional validation and genomic surveillance targets for malaria vector control in Africa. Methods RNA-sequencing data All RNA-sequencing data published comparing resistant and susceptible members of An. gambiae , An. coluzzii, An. arabiensis and An. funestus were retrieved from Google Scholar in April 2022. Of the 28 resistant populations studied, 15 are An. coluzzii , 3 are An. gambiae , 5 are An. arabiensis and 5 are An. funestus (Supplementary Table 1) [ 32 – 38 ]. As multiple studies used the same susceptible populations due to the widespread resistance found in endemic settings, replicates of the same populations across studies were present (Supplementary Table 1). Count files were retrieved from authors of each paper and combined to form one large counts file. VectorBase was used to retrieve orthologs from each species to An. gambiae PEST and the counts of one-to-many relationships were averaged. Differential expression analysis The data was then normalised using the DEseq2 1.26.0 [ 73 ] using estimateSizeFactors, estimateDispersions and the varianceStabilizingTransformation. Differential expression analysis (DEA) was performed with DESeq2 v1.26 in R v3.6.3. Due to batch effects and large differences in library depth between the included RNA-Sequencing studies, we only performed DEA within each experiment, comparing the susceptible (control) to the resistant strain (case). Positive fold changes indicate over-expression in the resistant strain. Hypothesis testing was performed with the DESeq2 wald test. Microarray data We integrated Microarray data from an earlier meta-analysis [ 30 ] into AnoExpress. As this was performed at the transcript level rather than genes, we averaged log2 fold change and adjusted p-value data across transcripts for a given gene to match the AnoExpress meta-analysis. Gene-set enrichment analysis Gene set-enrichment was performed using the hypergeometric test implementation in Scipy, incorporated into AnoExpress. GO terms were extracted from VectorBase and PFAM domains from [ 74 ]. Genomic clustering Using the gamb_colu_arab analysis we calculated and averaged the Pearson’s correlation between log2 fold changes for each neighbouring gene pair in the dataset, and then calculated the mean pearsons R across all pairs. We excluded An. funestus due to differences in synteny compared to An. gambiae s.l. To determine whether this value was higher than expected based on chance, we randomly permuted the positions of all genes in the genome 1000 times, calculated the Pearsons correlation between neighbouring genes, and calculated the mean. We considered the effect significant if the fraction of permutations showing a more extreme value than the actual true value was below 0.05. Genetic diversity Using log2 count data from the gamb_colu_arab analysis, we calculated the median counts across all experiments, and binned genes into 5% percentiles of expression level. We excluded An. funestus , as sufficient whole-genome sequence data is not yet publicly available. We selected three cohorts to analyse from the Anopheles 1000 genome project [ 31 ]; An. gambiae from Nagongera, Uganda, An. coluzzii from Hauts-Bassins, Burkina Faso, and An. arabiensis from Muleba, Tanzania, collected between 2012 and 2015. We randomly selected 50 individuals from each cohort. Using segregating sites only, we calculated the number of non-synonymous and synoynmous sites for each transcript, using the first transcript annotation for each gene, up to -RC. We calculated the overall CDS length per transcript and calculated the number of synonymous or non-synoynmous mutations per 1000bp of CDS. We also calculated nucleotide diversity and Wattersons Theta across the entire length of the gene (including introns), using scikit-allel v1.2.1 [ 75 ]. Genome-wide selection scan (GWSS) data We integrate data from H12 genome-wide selection scans [ 40 ] from the selection-atlas. More information on the GWSS and peak-calling algorithm can be found in Supplementary Text 1. Genome-wide expression scans We calculate a sliding window mean of log2 fold-change data, for the gamb_colu_arab_fun analysis, with a window size of 10 genes and a step of 2 genes. Declarations Data availability Results from the differential expression analyses are stored in the github repository for the AnoExpress python package - https://github.com/sanjaynagi/AnoExpress, and can be explored in the cloud with a series of Google Colaboratory notebooks, or on users’ local machines. All RNA-sequencing data is available from the publications listed in Supplementary Table 1. All code used for this study is available on GitHub: https://github.com/sanjaynagi/AnoExpress. The authors declare that all other data supporting the findings of this study, are available within the article and its Supplementary Information files. Acknowledgements This study was funded by a Deutsches Zentrum für Infektionsforschung grant (TTU 03.705) to VAI. SCN was funded by an MRC CASE studentship (MR/R015678/1) and a National Institute of Allergy and Infectious Diseases grant (NIAID R01-AI116811 to Martin Donnelly and David Weetman). The authors would like to thank all those who freely gave count data from prior publications, including Louisa Messenger, Dieunel Derilus, Pie Müller, Nadja Wipf, Charles Wondji, Sulaiman Ibrahim, Jack Hearn, Thomas van Leeuwen and Wannes Dermauw. We also thank Hilary Ranson, David Weetman and Alistair Miles for feedback on the manuscript. Competing Interests The authors declare no competing interests. Contributions Conceptualisation, analysis, paper drafting (VAI); Conceptualisation, analysis, creation of python package, paper drafting (SCN). References Organization, W.H., World Malaria Report . 2022, WHO. Bhatt, S., et al., The effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015. Nature, 2015. 526 : p. 207-211. Organization, W.H., Global report on insecticide resistance in malaria vectors: 2010–2016 . 2018, Geneva: World Health Organization. p. 72. Churcher, T.S., et al., The impact of pyrethroid resistance on the efficacy and effectiveness of bednets for malaria control in Africa . 2016, eLife Sciences Publications Limited. p. e16090. Cornelie, S., et al., Salivary Gland Proteome Analysis Reveals Modulation of Anopheline Unique Proteins in Insensitive Acetylcholinesterase Resistant Anopheles gambiae Mosquitoes . 2014, Public Library of Science. p. e103816. Coetzee, M., et al., Anopheles coluzzii and Anopheles amharicus, new members of the Anopheles gambiae complex . 2013. p. 246-274. Sinka, M.H., et al., The Dominant Anopheles Vectors of Human Malaria in Africa, Europe and the Middle East: Occurrence Data, Distribution Maps and Bionomic Précis. 2010, BioMed Central. p. 117-152. Guo, D., et al., ACE: an efficient and sensitive tool to detect insecticide resistance-associated mutations in insect acetylcholinesterase from RNA-Seq data. BMC Bioinformatics, 2017. 18 (1): p. 330. Scott, J.G., Life and Death at the Voltage-Sensitive Sodium Channel: Evolution in Response to Insecticide Use. Annual Review of Entomology, 2019. 64 (1): p. 243-257. Nauen, R., et al., The Role of Cytochrome P450s in Insect Toxicology and Resistance. Annual Review of Entomology, 2022. 67 (1): p. 105-124. Clarkson, C.S., et al., The genetic architecture of target-site resistance to pyrethroid insecticides in the African malaria vectors Anopheles gambiae and Anopheles coluzzii . 2021, John Wiley & Sons, Ltd. p. 5303-5317. Grau-Bové, X., et al., Resistance to pirimiphos-methyl in West African Anopheles is spreading via duplication and introgression of the Ace1 locus . 2021, Public Library of Science. p. e1009253. Small, S.T., et al., Radiation with reticulation marks the origin of a major malaria vector. Proceedings of the National Academy of Sciences, 2020. 117 (50): p. 31583-31590. Neafsey, D.E., et al., Mosquito genomics. Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes . 2015. Hemingway, J. and H. Ranson, Insecticide resistance in insect vectors of human disease. 2000. p. 371-391. Ingham, V.A., L. Grigoraki, and H. Ranson, Pyrethroid resistance mechanisms in the major malaria vector species complex. Entomologia Generalis, 2023. 43 (3): p. 515-526. Irving, H., et al., Positional cloning of rp2 QTL associates the P450 genes CYP6Z1, CYP6Z3 and CYP6M7 with pyrethroid resistance in the malaria vector Anopheles funestus . 2012, The Author(s). p. 383. Ibrahim, S.S., et al., The P450 CYP6Z1 confers carbamate/pyrethroid cross-resistance in a major African malaria vector beside a novel carbamate-insensitive N485I acetylcholinesterase-1 mutation . 2016, John Wiley & Sons, Ltd (10.1111). p. 3436-3452. Hearn, J., et al., Multi‐omics analysis identifies a CYP9K1 haplotype conferring pyrethroid resistance in the malaria vector Anopheles funestus in East Africa. Molecular ecology, 2022. 31 (13): p. 3642-3657. Yunta, C., et al., Cross-resistance profiles of malaria mosquito P450s associated with pyrethroid resistance against WHO insecticides . 2019. Vontas, J., et al., Rapid selection of a pyrethroid metabolic enzyme CYP9K1 by operational malaria control activities . 2018. p. 4619 LP - 4624. Njoroge, H., et al., Identification of a rapidly‐spreading triple mutant for high‐level metabolic insecticide resistance in Anopheles gambiae provides a real‐time molecular diagnostic for antimalarial intervention deployment. Molecular ecology, 2022. 31 (16): p. 4307-4318. Ibrahim, S.S., et al., The cytochrome P450 CYP6P4 is responsible for the high pyrethroid resistance in knockdown resistance-free Anopheles arabiensis. 2015. Balabanidou, V., L. Grigoraki, and J. Vontas, Insect cuticle: a critical determinant of insecticide resistance . 2018. p. 68-74. Balabanidou, V., et al., Cytochrome P450 associated with insecticide resistance catalyzes cuticular hydrocarbon production in Anopheles gambiae. Proceedings of the National Academy of Sciences, 2016. 113 (33): p. 9268-9273. Wood, O.R., et al., Cuticle thickening associated with pyrethroid resistance in the major malaria vector Anopheles funestus. Parasites & Vectors, 2010. 3 : p. 67. Martinez-Torres, D., et al., Molecular characterization of pyrethroid knockdown resistance (kdr) in the major malaria vector Anopheles gambiae s.s. 1998. p. 179-184. Weill, M., et al., The unique mutation in ace-1 giving high insecticide resistance is easily detectable in mosquito vectors . 2004, Blackwell Publishing Ltd. p. 1-7. Ingham, V.A., et al., A sensory appendage protein protects malaria vectors from pyrethroids. Nature, 2019. 577 : p. 376-380. Ingham, V.A., S. Wagstaff, and H. Ranson, Transcriptomic meta-signatures identified in Anopheles gambiae populations reveal previously undetected insecticide resistance mechanisms. Nature Communications, 2018. 9 : p. 5282. Consortium, A.g.G., Genetic diversity of the African malaria vector Anopheles gambiae . 2017, Nature Publishing Group. p. 96. Williams, J., et al., Sympatric Populations of the Anopheles gambiae Complex in Southwest Burkina Faso Evolve Multiple Diverse Resistance Mechanisms in Response to Intense Selection Pressure with Pyrethroids . 2022. Simma, E.A., et al., Genome-wide gene expression profiling reveals that cuticle alterations and P450 detoxification are associated with deltamethrin and DDT resistance in Anopheles arabiensis populations from Ethiopia. 2019. Messenger, L.A., et al., A whole transcriptomic approach provides novel insights into the molecular basis of organophosphate and pyrethroid resistance in Anopheles arabiensis from Ethiopia . 2021, Elsevier. p. 103655. Ingham, V.A., et al., Integration of whole genome sequencing and transcriptomics reveals a complex picture of the reestablishment of insecticide resistance in the major malaria vector Anopheles coluzzii. PLOS Genetics, 2021. 17 : p. e1009970. Ibrahim, S.S., et al., Molecular drivers of insecticide resistance in the Sahelo-Sudanian populations of a major malaria vector Anopheles coluzzii. BMC Biology, 2023. 21 (1): p. 125. Wipf, N.C., et al., Multi-insecticide resistant malaria vectors in the field remain susceptible to malathion, despite the presence of Ace1 point mutations . 2022, Public Library of Science. p. e1009963. Kouamo, M.F.M., et al., Genome-Wide Transcriptional Analysis and Functional Validation Linked a Cluster of Epsilon Glutathione S-Transferases with Insecticide Resistance in the Major Malaria Vector Anopheles funestus across Africa . 2021. Leffler, E.M., et al., Revisiting an Old Riddle: What Determines Genetic Diversity Levels within Species? PLOS Biology, 2012. 10 (9): p. e1001388. Garud, N.R., et al., Recent Selective Sweeps in North American Drosophila melanogaster Show Signatures of Soft Sweeps. PLOS Genetics, 2015. 11 (2): p. e1005004. Isaacs, A.T., et al., Genome-wide transcriptional analyses in Anopheles mosquitoes reveal an unexpected association between salivary gland gene expression and insecticide resistance . 2018. p. 225. Namiki, T., et al., Cytochrome P450 CYP307A1/Spook: a regulator for ecdysone synthesis in insects . 2005, Elsevier. p. 367-374. Ortelli, F., et al., Heterologous expression of four glutathione transferase genes genetically linked to a major insecticide-resistance locus from the malaria vector Anopheles gambiae . 2003. p. 957-963. Ingham, V.A., et al., The transcription factor Maf-S regulates metabolic resistance to insecticides in the malaria vector Anopheles gambiae. BMC Genomics, 2017. 18 : p. 669. Ingham, V.A., et al., Capturing the transcription factor interactome in response to sub-lethal insecticide exposure. Current Research in Insect Science, 2021: p. 100018. Duret, L. and D. Mouchiroud, Determinants of Substitution Rates in Mammalian Genes: Expression Pattern Affects Selection Intensity but Not Mutation Rate. Molecular Biology and Evolution, 2000. 17 (1): p. 68-070. Nagi, S.C., et al., RNA-Seq-Pop: Exploiting the sequence in RNA sequencing—A Snakemake workflow reveals patterns of insecticide resistance in the malaria vector Anopheles gambiae. Molecular Ecology Resources, 2023. 23 (4): p. 946-961. Leon, M., et al., A rapidly selected 4.3kb transposon-containing structural variation is driving a P450-based resistance to pyrethroids in the African malaria vector Anopheles funestus . 2023, Authorea Preprints. Moyes, C.L., et al., Assessing cross-resistance within the pyrethroids in terms of their interactions with key cytochrome P450 enzymes and resistance in vector populations . 2021. p. 115. Lucas, E.R., et al., Whole-genome sequencing reveals high complexity of copy number variation at insecticide resistance loci in malaria mosquitoes . 2019, Cold Spring Harbor Laboratory Press. p. 1250-1261. Lavara-Culebras, E. and N. Paricio, Drosophila DJ-1 mutants are sensitive to oxidative stress and show reduced lifespan and motor deficits. Gene, 2007. 400 (1): p. 158-165. Champion, C.J. and J. Xu, Redox state affects fecundity and insecticide susceptibility in Anopheles gambiae . 2018. p. 13054. Oliver, S.V. and B.D. Brooke, The Role of Oxidative Stress in the Longevity and Insecticide Resistance Phenotype of the Major Malaria Vectors Anopheles arabiensis and Anopheles funestus . 2016, Public Library of Science. p. e0151049. Otali, D., et al., Increased production of mitochondrial reactive oxygen species and reduced adult life span in an insecticide-resistant strain of Anopheles gambiae . 2014. p. 323-333. Adams, K.L., et al., Cuticular hydrocarbons are associated with mating success and insecticide resistance in malaria vectors. Communications Biology, 2021. 4 : p. 911. Adams, K.L., et al., Selection for insecticide resistance can promote Plasmodium falciparum infection in Anopheles. PLOS Pathogens, 2023. 19 (6): p. e1011448. Govella, N.J., et al., Heritability of biting time behaviours in the major African malaria vector Anopheles arabiensis. Malaria Journal, 2023. 22 (1): p. 238. Ruiz, M., et al., Grasshopper Lazarillo, a GPI-anchored Lipocalin, increases Drosophila longevity and stress resistance, and functionally replaces its secreted homolog NLaz. Insect Biochemistry and Molecular Biology, 2012. 42 (10): p. 776-789. Lv, Y., et al., Venom allergen 5 is Associated With Deltamethrin Resistance in Culex pipiens pallens (Diptera: Culicidae). Journal of Medical Entomology, 2015. 52 (4): p. 672-682. Arcà, B., et al., Anopheline salivary protein genes and gene families: an evolutionary overview after the whole genome sequence of sixteen Anopheles species. BMC Genomics, 2017. 18 (1): p. 153. Gao, L., et al., Xenobiotic responses in insects . 2022, John Wiley & Sons, Ltd. p. e21869. Wondji, C.S., et al., Mapping a Quantitative Trait Locus (QTL) conferring pyrethroid resistance in the African malaria vector Anopheles funestus. BMC Genomics, 2007. 8 (1): p. 34. Mugenzi, L.M.J., et al., Cis-regulatory CYP6P9b P450 variants associated with loss of insecticide-treated bed net efficacy against Anopheles funestus . 2019, Nature Publishing Group. p. 1-11. Styczynska-Soczka, K. and A.P. Jarman, The Drosophila homologue of Rootletin is required for mechanosensory function and ciliary rootlet formation in chordotonal sensory neurons. Cilia, 2015. 4 (1): p. 9. Campbell, C.L., et al., Vgsc-interacting proteins are genetically associated with pyrethroid resistance in Aedes aegypti. PLOS ONE, 2019. 14 (1): p. e0211497. Mukherjee, S., N. Paricio, and N.S. Sokol, A stress-responsive miRNA regulates BMP signaling to maintain tissue homeostasis. Proceedings of the National Academy of Sciences, 2021. 118 (21): p. e2022583118. Singh, A., et al., The transcriptional response to oxidative stress is independent of stress-granule formation. Molecular Biology of the Cell, 2022. 33 (3): p. ar25. Mace, K.A., J.C. Pearson, and W. McGinnis, An Epidermal Barrier Wound Repair Pathway in Drosophila Is Mediated by grainy head. Science, 2005. 308 (5720): p. 381-385. Joanne, P.O., H. Scott, and Q.D. Chris, DrosophilaHB9 Is Expressed in a Subset of Motoneurons and Interneurons, Where It Regulates Gene Expression and Axon Pathfinding. The Journal of Neuroscience, 2002. 22 (21): p. 9143. Nguyen, D.N.T., M. Rohrbaugh, and Z.-C. Lai, The Drosophila homolog of Onecut homeodomain proteins is a neural-specific transcriptional activator with a potential role in regulating neural differentiation. Mechanisms of Development, 2000. 97 (1): p. 57-72. Matsuda, T., et al., Effects of overexpression of mitochondrial transcription factor A on lifespan and oxidative stress response in Drosophila melanogaster . 2013, Elsevier. p. 717-721. Emiliano, T., et al., Gene expression is the main driver of purifying selection in large penguin populations. bioRxiv, 2023: p. 2023.08.08.552445. Love, M.I., W. Huber, and S. Anders, Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 . 2014, Springer. p. 550. Grau-Bové, X., et al., Evolution of the Insecticide Target Rdl in African Anopheles Is Driven by Interspecific and Interkaryotypic Introgression. Molecular Biology and Evolution, 2020. 37 (10): p. 2900-2917. Miles, A. and N.J. Harding, scikit-allel: A Python package for exploring and analysing genetic variation data . 2016. Additional Declarations There is NO Competing Interest. Supplementary Files SupplementaryFigure1.png Supplementary Figure 1: PCA analysis on combined microarray and RNAseq fold changes. PCA analysis coloured by species of the combined RNAseq data from this study and collated microarray data from Ingham et al. 2018. SupplementaryTable1.xlsx SupplementaryTable2.xlsx SupplementaryTable3.xlsx SupplementaryTable4.xls SupplementaryTable5.xlsx SupplementaryTable6.xlsx SupplementaryTable7.xlsx SupplementaryText1.docx Cite Share Download PDF Status: Posted Version 1 posted You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-3910702","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":272675365,"identity":"99799eb7-c7fe-46cf-b2da-9bc8140f08c3","order_by":0,"name":"Victoria Ingham","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABTElEQVRIie2RP2uDQByGTw7Mcup6DjZfIXKQBtrSr9Ij0EyFjBlCOBHM0sZVv4VTp0IUIV3SPWCHQKAdshgKpX9C6KkFazSdC/XhOF7eu8efcADU1PxFYLrjNCT5CACB8YAF1kDpmcBKSlLlCskVeED5WSUKZd/hoNK6h8/LzV2nCRpSuO4Pb3teZDLYH3Q0G0rBavABNHtPCUXdcJ6wzqB8eeLMoivvMWDQmWPimnJXf5gA4hbHqCYSTOSnf94mSOTKgo4/JQtTL0Rt1bgG1PMrlfNM2UW91oIyKO0wnYbo+D1RpkVFgZnCryGykqzoIlMYn8I/IrA3PgXsKaLuOj7uWvwClCaR7iYKmmHihHJXNRgPxSmiEi7j2B+d2cqcvKDXqCkvekuIhiPNHt8EG7Y91ezyw2QuX7hcC1ZFmQPjinL7m1FTU1PzP/gCGDFy2m4EsV0AAAAASUVORK5CYII=","orcid":"https://orcid.org/0000-0001-5708-4741","institution":"Heidelberg University","correspondingAuthor":true,"prefix":"","firstName":"Victoria","middleName":"","lastName":"Ingham","suffix":""},{"id":272675366,"identity":"dd0e290e-08ee-4b07-959a-682d15e958cf","order_by":1,"name":"Sanjay Nagi","email":"","orcid":"","institution":"Liverpool School of Tropical Medicine","correspondingAuthor":false,"prefix":"","firstName":"Sanjay","middleName":"","lastName":"Nagi","suffix":""}],"badges":[],"createdAt":"2024-01-30 13:33:03","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-3910702/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-3910702/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":51099988,"identity":"09fe43ef-2fe8-409d-9f96-8c97034d8683","added_by":"auto","created_at":"2024-02-14 06:02:53","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":1656257,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eOverview of collection location for resistant populations in each RNA-Sequencing study. \u003c/strong\u003eDifferent colours show different species. Sample sites with multiple colours indicate multiple species comparisons at that location. Information on the susceptible comparator species is available in Supplementary Table 1.\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/5968dd1e28be2153af61cd5b.png"},{"id":51099986,"identity":"fd0a4bb7-c9c4-4b75-a91b-6e597918ce2e","added_by":"auto","created_at":"2024-02-14 06:02:53","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":1737838,"visible":true,"origin":"","legend":"\u003cp\u003eA) A principal components analysis of normalised log2 read count data from the AnoExpress differential expression meta-analysis. \u003cem\u003eAn. gambiae = orange, An. coluzzii = blue, and An. arabiensis = green\u003c/em\u003e B) Per-gene correlations of normalised log2 read count data between species. C) A histogram of mean pearsons R values of log2 fold change between neighbouring genes, from 1000 random permutations of genome position, with the true value plotted as a dashed vertical line.\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/6dd45347aa3fde8e0cde4431.png"},{"id":51099989,"identity":"a437ab07-74ec-4032-a3d8-0d1466198d4f","added_by":"auto","created_at":"2024-02-14 06:02:53","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":2722035,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eGenome-wide expression scans. \u003c/strong\u003eExpression for all genes in the \u003cem\u003egamb_colu_arab_fun \u003c/em\u003eanalysis is plotted against the \u003cem\u003eAn. gambiae \u003c/em\u003ePEST reference genome. Different colours represent different species – blue = \u003cem\u003eAn. coluzzii\u003c/em\u003e, orange = \u003cem\u003eAn. gambiae\u003c/em\u003e, green = \u003cem\u003eAn. arabiensis\u003c/em\u003e and red = \u003cem\u003eAn. funestus\u003c/em\u003e. A moving average of gene expression is plotted as a black line, calculated in sliding windows of 10 genes, with a step of 2 genes. The Y-axis is truncated between -8 and +10 log2 Fold-change to ease interpretation. Signals in close proximity to known, or putative, IR loci are labelled and highlighted with dashed grey lines. Gene density is displayed below the x-axis. In AnoExpress, plots are interactive, aiding interpretation.\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/4c27570024ade6322f9f1c7c.png"},{"id":51099987,"identity":"1d395149-93ef-452a-8948-3bfa41250fba","added_by":"auto","created_at":"2024-02-14 06:02:53","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":371334,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eGene Expression shapes levels of purifying selection and nucleotide diversity across \u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eAnopheles \u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003especies\u003c/strong\u003e. A) The ratio of non-synonymous to synonymous segregating sites (pN/pS) in\u003cem\u003e An. coluzzii = blue,\u003c/em\u003e \u003cem\u003eAn. gambiae = orange, and An. arabiensis = green, \u003c/em\u003eat different levels of gene expression. B) Nucleotide diversity, π, calculated across the entire length of a gene. Levels of gene expression are based on median log2 counts from the \u003cem\u003egamb_colu_arab\u003c/em\u003e analysis.\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/2a0d74737ea0be9e6de4a0a1.png"},{"id":51099996,"identity":"bc033e8e-2b22-45d2-9d5a-506daf6a0e48","added_by":"auto","created_at":"2024-02-14 06:02:54","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":63241,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eInsecticide resistance-associated selection and expression in \u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eAn. gambiae. \u003c/strong\u003e\u003c/em\u003eThe location of genes in the AgamP4 PEST reference genome, which are both found in regions of selective sweeps in whole-genome sequence data, and which are also resistance candidates based on average fold-changes in the RNA-Sequencing meta-analysis data.\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/19dac9591bacac18fbb59b2a.png"},{"id":51099994,"identity":"c3f08332-48a1-40a1-9948-1c5e68ee9f4e","added_by":"auto","created_at":"2024-02-14 06:02:54","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":762019,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eResistance candidates based on gene expression and gene set enrichment analyses. A)\u003c/strong\u003e Top genes based on the intersection of genes with mean and median fold-change as above 2, filtered for lowly expressed genes. Ag: \u003cem\u003eAn. gambiae, \u003c/em\u003eAc: \u003cem\u003eAn. coluzzii\u003c/em\u003e, Aa: \u003cem\u003eAn. arabiensis\u003c/em\u003e, Af: \u003cem\u003eAn. funestus\u003c/em\u003e \u003cstrong\u003eB)\u003c/strong\u003e Consistently over-expressed genes. Genes that show significant (p \u0026lt; 0.05) and positive log2 fold-changes in more than 19 out of 28 experiments. \u003cstrong\u003eC) \u003c/strong\u003eMost down-regulated genes based on median-fold changes\u003cstrong\u003e D) \u003c/strong\u003eMost consistently down-regulated genes. Genes that show significant and negative log2 fold-changes in more than 19 out of 28 experiments\u003cstrong\u003e E) \u003c/strong\u003eSignificant GO terms and PFAM domains from gene set enrichment analysis (hypergeometric test) for the top 5% of overexpressed genes based on median fold-changes. Green shows up-regulation of genes whilst purple shows down-regulation, intensity shows relative levels of differential expression.\u003c/p\u003e","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/667bac797b9e806ad8b69193.png"},{"id":51100583,"identity":"f4a13a8c-8bd5-40e5-bb74-6ded26caca47","added_by":"auto","created_at":"2024-02-14 06:18:57","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2473356,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/238da7e1-8510-4e96-8905-20966e830328.pdf"},{"id":51100356,"identity":"de2e46b8-c217-4ced-856b-ddf09341dd53","added_by":"auto","created_at":"2024-02-14 06:10:53","extension":"png","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":35909,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSupplementary Figure 1: PCA analysis on combined microarray and RNAseq fold changes.\u003c/strong\u003e PCA analysis coloured by species of the combined RNAseq data from this study and collated microarray data from Ingham et al. 2018.\u003c/p\u003e","description":"","filename":"SupplementaryFigure1.png","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/417d3a24b4051d056f72c471.png"},{"id":51099984,"identity":"1839f377-b176-4fb5-82d1-d99855e22a7e","added_by":"auto","created_at":"2024-02-14 06:02:53","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":8373,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable1.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/d3210a4c5e9caaf137d367d4.xlsx"},{"id":51099993,"identity":"af400210-1aa2-4185-94c0-60e8355f5415","added_by":"auto","created_at":"2024-02-14 06:02:54","extension":"xlsx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":12634,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable2.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/ec596bffa6344e6165f43eac.xlsx"},{"id":51099990,"identity":"6bc874d2-35d1-4b74-aaee-7842a1976f54","added_by":"auto","created_at":"2024-02-14 06:02:54","extension":"xlsx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":370709,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable3.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/eebbca308a3095c128ac4444.xlsx"},{"id":51099995,"identity":"24baa8ad-25a1-4eca-a3cd-d93ba1787bfe","added_by":"auto","created_at":"2024-02-14 06:02:54","extension":"xls","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":1249280,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable4.xls","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/2fb0d3ef4b6b5eb14bc86141.xls"},{"id":51099998,"identity":"e584bf86-e09c-47d8-8953-00c283a91303","added_by":"auto","created_at":"2024-02-14 06:02:54","extension":"xlsx","order_by":6,"title":"","display":"","copyAsset":false,"role":"supplement","size":265445,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable5.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/daaf975258b35749770af5c2.xlsx"},{"id":51099991,"identity":"8b5a93fe-1aa4-49b2-96e3-f8505d3a10a4","added_by":"auto","created_at":"2024-02-14 06:02:54","extension":"xlsx","order_by":7,"title":"","display":"","copyAsset":false,"role":"supplement","size":14637,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable6.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/2961a59e23ee960b392664c2.xlsx"},{"id":51099992,"identity":"bd9b108f-d9ee-4ebe-a840-b2a375fcc122","added_by":"auto","created_at":"2024-02-14 06:02:54","extension":"xlsx","order_by":8,"title":"","display":"","copyAsset":false,"role":"supplement","size":1894774,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable7.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/d86de28b78c4c450e41ef1f0.xlsx"},{"id":51100357,"identity":"680555da-e0fd-45d8-86c8-b4cfe15b8ba2","added_by":"auto","created_at":"2024-02-14 06:10:54","extension":"docx","order_by":9,"title":"","display":"","copyAsset":false,"role":"supplement","size":6129,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cbr\u003e\u003c/p\u003e","description":"","filename":"SupplementaryText1.docx","url":"https://assets-eu.researchsquare.com/files/rs-3910702/v1/818d551d3329092c11977b82.docx"}],"financialInterests":"There is \u003cb\u003eNO\u003c/b\u003e Competing Interest.","formattedTitle":"Genomic Profiling of Insecticide Resistance in Malaria Vectors: Insights into Molecular Mechanisms.","fulltext":[{"header":"Introduction","content":"\u003cp\u003eMalaria remains a leading cause of worldwide morbidity and mortality, resulting in an estimated 619,000 deaths in 2021 [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Insecticide-based vector control interventions are the most successful tools for reducing malaria incidence, responsible for over 80% of the malaria cases averted between 2000 and 2015 [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]; however, the gains made in malaria control have plateaued in the last eight years and cases are beginning to rise [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. The reversal of gains is partially due to widespread insecticide resistance within populations of the anopheline mosquito vector [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e], reducing the efficacy of insecticide-treated bed neds (ITNs) and indoor residual spraying (IRS), the two primary malaria intervention tools [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eAcross sub-Saharan Africa, four mosquito species are responsible for the majority of malaria transmission, the \u003cem\u003eAn. gambiae\u003c/em\u003e species complex (\u003cem\u003eAn. gambiae\u003c/em\u003e, \u003cem\u003eAn. arabiensis\u003c/em\u003e, and \u003cem\u003eAn. coluzzii\u003c/em\u003e) and \u003cem\u003eAn. funestus\u003c/em\u003e [\u003cspan additionalcitationids=\"CR6\" citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Understanding the similarities and differences in resistance mechanisms between the different dominant vector species is critical to inform malaria mitigation strategies, which are mostly employed indiscriminately despite differences in vector composition between and within areas of sub-Saharan Africa.\u003c/p\u003e \u003cp\u003eSharing of resistance mechanisms between insects is commonly observed; mutations at the insecticide target site genes \u003cem\u003eRdl\u003c/em\u003e, \u003cem\u003eKdr\u003c/em\u003e and \u003cem\u003eAce-1\u003c/em\u003e occur at equivalent codons throughout arthropods [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], and insecticide detoxification is often driven by cytochrome P450s from the CYP6 insect clade [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]; this may occur through parallel evolution, or in closely related insects, introgression, where genetic material is passed between species via hybridisation. Although adaptive introgression is seen at resistance loci within the \u003cem\u003eAn. gambiae\u003c/em\u003e species complex [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e], there is no evidence of hybridisation or introgression with \u003cem\u003eAn. funestus\u003c/em\u003e, which diverged approximately 80\u0026nbsp;million years ago [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eInsecticide resistance in \u003cem\u003eAnopheles\u003c/em\u003e mosquitoes commonly involves large gene families, such as cytochrome P450 monooxygenases (P450s), glutathione S-transferases (GSTs), carboxylesterases (COEs), and UDP-glycosyltransferases (UDPs) [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. Upregulation of key P450s has been reported in insecticide resistant populations of each \u003cem\u003eAnopheles\u003c/em\u003e species, for example \u003cem\u003eCYP6Z1, CYP9K1, CYP6AA1\u003c/em\u003e and \u003cem\u003eCYP6P9a\u003c/em\u003e in \u003cem\u003eAn. funestus\u003c/em\u003e [\u003cspan additionalcitationids=\"CR18\" citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]; \u003cem\u003eCYP9K1, CYP6AA1, CYP6P3 and CYP6M2\u003c/em\u003e in \u003cem\u003eAn. gambiae and An. coluzzii\u003c/em\u003e [\u003cspan additionalcitationids=\"CR21\" citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]; and \u003cem\u003eCYP6P4\u003c/em\u003e in \u003cem\u003eAn. arabiensis\u003c/em\u003e [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. Similarly, cuticular resistance [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e], which involves thickening and modification of the composition of the cuticle, slowing the rate of absorption of insecticides has been reported in both \u003cem\u003eAn. gambiae s.l\u003c/em\u003e [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e] and \u003cem\u003eAn. funestus\u003c/em\u003e [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]. Work comparing distinct \u003cem\u003eAnopheles\u003c/em\u003e species found that gene numbers within these families were similar across species but lineage-specific losses and gains are regularly observed [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. Despite these similarities, differences between the \u003cem\u003eAn. gambiae\u003c/em\u003e species complex and \u003cem\u003eAn. funestus\u003c/em\u003e are also observed. A prominent example is single nucleotide polymorphisms (SNPs) in the target site of insecticide, which reduce the efficacy of insecticides through changes to binding affinity. The two best studied are \u003cem\u003eKdr\u003c/em\u003e and \u003cem\u003eAce-1\u003c/em\u003e [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e, \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e], both of which are found in the \u003cem\u003eAn. gambiae\u003c/em\u003e species complex, but which are seemingly absent in \u003cem\u003eAn. funestus\u003c/em\u003e. Novel mechanisms of resistance have also recently been discovered, including the direct binding of pyrethroids by chemosensory proteins (CSPs) [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e], as well as evidence for a role of both hexamerins, α-crystallins [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. The recent discovery of such mechanisms highlights the complexity of resistance and the need for further study into molecular mechanisms of resistance in major malaria vectors.\u003c/p\u003e \u003cp\u003eA plethora of \u0026lsquo;omics data has now been generated for malaria vectors. Over the past two decades, transcriptomic studies have driven discoveries into mechanisms of insecticide resistance, first in the form of microarrays, and subsequently RNA-Sequencing. In addition, since publishing the first phase of the Anopheles 1000 genomes project in 2017 [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e], the MalariaGEN Vector Observatory has generated and made public thousands of high-quality whole-genome sequences of major malaria vectors from throughout sub-Saharan Africa. Although individual -omics experiments have had great success in identifying highly over-expressed transcripts involved in resistance, these studies result in thousands of differentially expressed genes resulting from noise from the susceptible comparator. Previous work has successfully used microarray data in exploratory analyses to identify patterns of gene expression across the \u003cem\u003eAn. gambiae\u003c/em\u003e species complex, identifying meta-signatures across geographically and temporally disparate data, highlighting the importance of meta-analyses [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. In this study we characterise gene-expression profiles from published transcriptomic studies across two malaria vector complexes and establish relationships with whole-genome sequence data. To achieve this, we explored expression in resistance candidate genes and protein-coding families across 28 individual experiments of \u003cem\u003eAn. gambiae s.l\u003c/em\u003e and \u003cem\u003eAn. funestus\u003c/em\u003e [\u003cspan additionalcitationids=\"CR33 CR34 CR35 CR36 CR37\" citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. These data have been integrated both with the previous microarray meta-analysis [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e] and with whole-genome sequence data from the Anopheles 1000 genome project [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. The meta-analysis performed here has been made available as a resource to the community as a user-friendly python package, AnoExpress, combined with convenient interactive notebooks intended to be run in Google Colaboratory.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\n\u003ch2\u003eAnoExpress\u003c/h2\u003e\n\u003cp\u003eWe performed a differential expression meta-analysis on read count data from 28 RNA-sequencing datasets representing insecticide-resistant \u003cem\u003eAnopheles\u003c/em\u003e populations from 11 countries in sub-Saharan Africa [\u003cspan class=\"CitationRef\"\u003e32\u003c/span\u003e\u0026ndash;\u003cspan class=\"CitationRef\"\u003e38\u003c/span\u003e] (Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e, Supplementary Table\u0026nbsp;1). Of the 28 resistant populations, 15 are \u003cem\u003eAn. coluzzii\u003c/em\u003e, 3 are \u003cem\u003eAn. gambiae\u003c/em\u003e, 5 are \u003cem\u003eAn. arabiensis\u003c/em\u003e and 5 are \u003cem\u003eAn. funestus\u003c/em\u003e. Read-count data from different species were derived from aligning to each respective reference genome assemblies. For differential expression analysis, each resistant population was compared to an insecticide-susceptible strain of the same species from the same study. To enable cross-species comparison of putative resistance genes, we located orthologs between each different genome assembly and the AgamP4 PEST reference genome, using the OrthoMCL algorithm in VectorBase. As the inclusion of each successive assembly reduces the numbers of genes that are present, we provide four differential expression analyses; the full dataset, termed \u0026lsquo;\u003cem\u003egamb_colu_arab_fun\u003c/em\u003e\u0026rsquo; (8599 genes), and secondary analyses \u0026ndash; \u0026lsquo;\u003cem\u003egamb_colu_arab\u0026rsquo;\u003c/em\u003e (8651 genes), \u0026lsquo;\u003cem\u003egamb_colu\u003c/em\u003e\u0026rsquo; (11288 genes), \u0026lsquo;\u003cem\u003efun\u003c/em\u003e\u0026rsquo; (14176 genes). Analyses herein were performed on the full \u003cem\u003egamb_colu_arab_fun\u003c/em\u003e dataset unless specified.\u003c/p\u003e\n\u003cp\u003eThese data are presented within a python package, AnoExpress, made for the community, with example notebooks provided to run in Google Colab. Users can load, explore, and visualise the gene expression meta-analysis data, including reproducing most analyses presented herein. AnoExpress can also directly load gene expression data from an earlier meta-analysis of microarray data [\u003cspan class=\"CitationRef\"\u003e30\u003c/span\u003e]. AnoExpress is located here: github.com/sanjaynagi/AnoExpress. Documentation and a video user-guide are provided to improve ease of access.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\n\u003ch2\u003eDataset structure\u003c/h2\u003e\n\u003cp\u003eTo investigate overall structure in the dataset, we performed principal components analysis on both the count and fold-change data. PCA on the log-transformed count data revealed five distinct clusters present. Four clusters contain samples only from \u003cem\u003eAn. funestus, An. coluzzii\u003c/em\u003e, or \u003cem\u003eAn. arabiensis\u003c/em\u003e, with a fifth cluster containing a mixture of \u003cem\u003eAn. gambiae, An. arabiensis\u003c/em\u003e and \u003cem\u003eAn. coluzzii\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003eA). The four species did not resolve into four distinct clusters, suggesting that batch effects from different studies could be present. We therefore only performed within-study differential expression analysis. To compare RNA-Sequencing data with the earlier microarray studies, we performed a PCA on the fold-change data from all available experiments (Supplementary Fig.\u0026nbsp;1). There was no evidence of clustering between the two technologies, suggesting that the results between the two methodologies are comparable and can be used in combined analyses.\u003c/p\u003e\n\u003cp\u003eCorrelations between pairs of orthologous genes count data from different species were high, with lower correlation between \u003cem\u003eAn. funestus\u003c/em\u003e and the other three species, which is to be expected given their considerable divergence time (Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003eB).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\n\u003ch2\u003eGene expression is physically clustered in the genome\u003c/h2\u003e\n\u003cp\u003eAs changes in gene expression are likely to be related to cis-acting factors and the local genetic context, genes in close physical proximity are more likely to show similar patterns of expression. To test this within our dataset, we calculated and averaged the Pearson\u0026rsquo;s correlation on log2 fold-changes between each neighbouring gene in the dataset, resulting in a mean Pearson\u0026rsquo;s R of 0.10 for neighbouring genes. To determine whether this value was higher than expected based on chance, we randomly permuted the positions of all genes in the genome 1000 times and re-calculated the mean Pearson\u0026rsquo;s R for now neighbouring genes. Figure\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003eC shows the histogram of mean Pearson\u0026rsquo;s correlations, demonstrating a clear effect of genome proximity on gene expression.\u003c/p\u003e\n\u003cp\u003eTo further explore physical clustering of resistance-related gene expression, we visualised fold-change values across the whole genome. In Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e, we use AnoExpress to plot gene expression for all genes across all comparisons in the \u003cem\u003egamb_colu_arab_fun\u003c/em\u003e analysis and calculate the moving average log2 fold change across the genome in sliding windows. We identify gene clusters known to be associated with insecticide resistance, such as the \u003cem\u003eCYP6P, CYP9K, CYP6M/Z\u003c/em\u003e and \u003cem\u003eGSTD\u003c/em\u003e gene clusters. This analysis also highlighted numerous clusters of cuticle proteins which show elevated expression.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\n\u003ch2\u003eGenetic diversity and purifying selection scale with levels of gene expression\u003c/h2\u003e\n\u003cp\u003eWe postulated that genes showing higher levels of absolute expression are more likely to be functional in multiple tissues or pathways, and so should be highly conserved, experiencing the highest levels of purifying selection. We integrated whole-genome sequence data from the Anopheles 1000 genomes project [\u003cspan class=\"CitationRef\"\u003e31\u003c/span\u003e], to explore the relationship between gene expression and genetic diversity, for each species in which genomic data was publicly available. To compare between species, we selected representative populations of \u003cem\u003eAn. gambiae\u003c/em\u003e (Nagongera, Uganda), \u003cem\u003eAn. coluzzii\u003c/em\u003e (Hauts-Bassins, Burkina Faso), and \u003cem\u003eAn. arabiensis\u003c/em\u003e (Muleba, Tanzania), collected between 2012 and 2015. These cohorts show no signatures of inbreeding of major demographic events. As a measure of absolute expression, we used normalised count data. We then counted the number of segregating sites of synonymous and non-synonymous mutations, as well as calculating nucleotide diversity, \u0026pi;, in each gene, for each species.\u003c/p\u003e\n\u003cp\u003eFigure \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e shows the ratio of pN/pS and nucleotide diversity, for various levels of expression binned into 5% percentiles for the \u003cem\u003egamb_colu_arab\u003c/em\u003e analysis. We show that average nucleotide diversity is reduced for the most highly expressed genes and is lower overall in \u003cem\u003eAn. arabiensis\u003c/em\u003e compared to \u003cem\u003eAn. coluzzii and An. gambiae\u003c/em\u003e, which fits with expectations on genetic diversity from the literature [\u003cspan class=\"CitationRef\"\u003e39\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eWe show that for each species, the pN/pS ratio is lower for the most highly expressed genes compared to lowly expressed genes. This holds more strongly in \u003cem\u003eAn. gambiae\u003c/em\u003e and \u003cem\u003eAn. coluzzii\u003c/em\u003e than \u003cem\u003eAn. arabiensis\u003c/em\u003e, which likely relates to the efficiency of purifying selection depending on population size \u0026ndash; the effects of purifying selection will be stronger in larger populations.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\n\u003ch2\u003eHighly overexpressed genes are associated with selective sweeps in wild-caught mosquitoes\u003c/h2\u003e\n\u003cp\u003eGiven that insecticide resistance is often associated with increased expression of metabolic genes and that beneficial resistance mutations can cause selective sweeps which spread through a population, we hypothesised that highly expressed genes are more likely to be found in proximity to genomic regions under selection in wild-caught mosquitoes.\u003c/p\u003e\n\u003cp\u003eTo determine this, we investigated genome-wide signals of recent selection using the H12 statistic [\u003cspan class=\"CitationRef\"\u003e40\u003c/span\u003e] on data from phase 3 of the Anopheles 1000 genomes project (Nagi et al., in prep). Every signal was then mined to determine which genes in the AgamP4 assembly, if any, lie at the location of the peak of a selection signal. Further information on genome-wide selection scans and signal calling is found in Supplementary Text 1. Candidate genes were then defined as those genes showing a median fold-change of greater than two in the \u003cem\u003eAn. gambiae, An. coluzzii\u003c/em\u003e and \u003cem\u003eAn. arabiensis\u003c/em\u003e dataset \u003cem\u003e(gamb_colu_arab)\u003c/em\u003e, as these three species are represented in the Ag1000G phase 3 data [\u003cspan class=\"CitationRef\"\u003e38\u003c/span\u003e]. We additionally filtered out lowly expressed genes (median counts\u0026thinsp;\u0026lt;\u0026thinsp;5). After filtering, 106 candidate genes remained, of which 34 were located at the site of a signal of selection (Supplementary Table\u0026nbsp;2). Figure\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e5\u003c/span\u003e displays the location of these genes in the \u003cem\u003eAn. gambiae\u003c/em\u003e PEST reference genome, and recovers known insecticide resistance candidates such as \u003cem\u003eCYP6P3\u003c/em\u003e, \u003cem\u003eCYP6M2\u003c/em\u003e, \u003cem\u003eGSTE2\u003c/em\u003e and \u003cem\u003eCYP9K1\u003c/em\u003e, as well as numerous other genes putatively involved in resistance.\u003c/p\u003e\n\u003cp\u003eTo determine whether this was a significant enrichment of genes compared to random chance, we carried out a permutation analysis. In 10000 permutations, only 2 equalled or exceeded the value of 34 (p-value\u0026thinsp;=\u0026thinsp;0.0002), suggesting that regions of the genome under selection are indeed enriched for the most highly overexpressed genes. Separately, we utilised the entire dataset of genes and found a positive association between a genes\u0026rsquo; median fold-change and whether it was found at the site of an H12 selective sweep signal (p\u003csub\u003eGLM\u003c/sub\u003e = 0.000).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\n\u003ch2\u003eSignatures of resistance-associated gene expression\u003c/h2\u003e\n\u003cp\u003eTo explore highly over-expressed genes across all four species, we ranked genes by mean and median log\u003csub\u003e2\u003c/sub\u003e fold change across all experiments (Supplementary Tables\u0026nbsp;3 and 4). Included in the over-expressed set (cut-off of fold change\u0026thinsp;\u0026gt;\u0026thinsp;2, median counts\u0026thinsp;\u0026gt;\u0026thinsp;5), are 65 and 78 genes respectively. Within these lists, the known pyrethroid metaboliser \u003cem\u003eCYP6P3\u003c/em\u003e [\u003cspan class=\"CitationRef\"\u003e39\u003c/span\u003e] (the ortholog of \u003cem\u003eCYP6P9a/b\u003c/em\u003e in \u003cem\u003eAn. funestus\u003c/em\u003e) had a median and mean fold change of \u0026gt;\u0026thinsp;6.8, demonstrating the utility of the meta-analytic approach. Included in both lists were other cytochrome P450s including \u003cem\u003eCYP6P4, CYP9K1, CYP6M2, CYP6Z2, CYP6Z3\u003c/em\u003e and \u003cem\u003eCYP6P5\u003c/em\u003e, all of which have previously been linked to pyrethroid resistance across both species [\u003cspan class=\"CitationRef\"\u003e20\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e21\u003c/span\u003e]. Interestingly, CYP4H17 which has never been explored for a role in pyrethroid resistance, appears 7th (mean) and 11th (median) with a median fold change of 3.47. Similarly appearing high across both mean and median rankings are cuticular related proteins \u003cem\u003eCPLCG4, CPR150\u003c/em\u003e and \u003cem\u003eCPLCA3\u003c/em\u003e, a venom allergen (AGAP006417), an alkaline phosphatase (AGAP001684), gustatory receptor 49 (AGAP001169) and a protein with no known function (AGAP009327). In total, 50 genes are present in both the median and mean over-expressed sets including one carboxylesterases, 11 cytochrome P450s, three glutathione-S-transferases and one UGT from the major detoxification enzyme families (Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003eA). Additionally, 9 cuticular proteins, one D7 salivary gland protein and two hexamerins are present from gene families previously linked to resistance [\u003cspan class=\"CitationRef\"\u003e30\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e41\u003c/span\u003e]. Two genes related to odorant or taste detection are also present.\u003c/p\u003e\n\u003cp\u003eWe then performed gene set enrichment analysis on the top 5% of overexpressed genes based on median fold-changes (419 genes) (Supplementary Table\u0026nbsp;5). We performed enrichment on both GO terms and protein (PFAM) domains; significant enrichment terms are displayed in Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003eD, categorised by broad resistance mechanism. In expectation with known gene families involved in insecticide resistance, we observe highly significant enrichment for many GO terms relating to detoxification, such as oxidoreductase activity (fdr-corrected p-value\u0026thinsp;=\u0026thinsp;8e-16), iron ion binding (p\u0026thinsp;=\u0026thinsp;4.75e-15), heme-binding (p\u0026thinsp;=\u0026thinsp;6.8e-14), monoxygenase activity (p\u0026thinsp;=\u0026thinsp;9e-14), glutathione activity (p\u0026thinsp;=\u0026thinsp;0.001) and glutathione metabolic process (p\u0026thinsp;=\u0026thinsp;0.001), and the \u0026lsquo;P450\u0026rsquo; (p\u0026thinsp;=\u0026thinsp;1.6e-16), \u0026lsquo;GST_N_3\u0026rsquo; (p\u0026thinsp;=\u0026thinsp;0.002) and \u0026lsquo;GST_C\u0026rsquo; (p\u0026thinsp;=\u0026thinsp;0.004) protein domains. In addition, cuticular-related genes were also enriched, with the GO terms structural constituent of cuticle (p\u0026thinsp;=\u0026thinsp;1.1e-23), chitin binding (p\u0026thinsp;=\u0026thinsp;2.3e-09), chitin metabolic process (p\u0026thinsp;=\u0026thinsp;1.3e-8), fatty acid elongase activity (p\u0026thinsp;=\u0026thinsp;0.04), and the protein domains \u0026lsquo;Chitin_bind_4\u0026rsquo; (p\u0026thinsp;=\u0026thinsp;3.2e-20), \u0026lsquo;CMB_14\u0026rsquo; (p\u0026thinsp;=\u0026thinsp;1.3e-18), and CPCFC (p\u0026thinsp;=\u0026thinsp;0.03). Finally, a large group of terms were related to sensory perception, including the GO terms \u0026lsquo;sensory perception of smell\u0026rsquo; (p\u0026thinsp;=\u0026thinsp;1.4e-6), odorant binding (p\u0026thinsp;=\u0026thinsp;2.4e-6), olfactory receptor activity (p\u0026thinsp;=\u0026thinsp;0.0002), sensory perception of taste (p\u0026thinsp;=\u0026thinsp;0.018), taste receptor activity (p\u0026thinsp;=\u0026thinsp;0.04), and the protein domains \u0026lsquo;7tm_6\u0026rsquo; (p\u0026thinsp;=\u0026thinsp;0.00003) and \u0026lsquo;7tm_7\u0026rsquo; (p\u0026thinsp;=\u0026thinsp;0.03), which indicate odorant and gustatory receptors.\u003c/p\u003e\n\u003cp\u003eWe then explored highly down-regulated genes based on median fold-changes (Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003eC). Amongst the most down-regulated genes was \u003cem\u003eCYP307A1\u003c/em\u003e, a known regulator of ecdysone synthesis [\u003cspan class=\"CitationRef\"\u003e42\u003c/span\u003e]. In addition to hormone related transcripts \u003cem\u003eCLIPA13\u003c/em\u003e and \u003cem\u003eGalectin 6\u003c/em\u003e are heavily down-regulated. Zero GO terms were enriched based on the 5% lowest median fold-changes. If we instead use the 5% lowest mean fold-changes, we observe significant enrichment for sodium channel activity (p\u0026thinsp;=\u0026thinsp;0.00013), sodium ion transport (p\u0026thinsp;=\u0026thinsp;0.00059) and sodium ion transmembrane transport (p\u0026thinsp;=\u0026thinsp;0.0044) the target sites for pyrethroids.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\n\u003ch2\u003eConsistently differentially expressed genes\u003c/h2\u003e\n\u003cp\u003eWe then explored genes which were significantly up and down-regulated most consistently.\u003c/p\u003e\n\u003cp\u003eAfter filtering genes with very lowly expressed genes, only two genes were upregulated in 22 out of 28 RNA-Sequencing experiments, \u003cem\u003eCYP6P3\u003c/em\u003e and \u003cem\u003eCYP9K1. GSTD3\u003c/em\u003e and \u003cem\u003eCYP6P4\u003c/em\u003e were upregulated in 21 out 28. Figure\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003eB shows all 18 genes significantly over-expressed in at least 19 out of 28 experiments, and includes CYP4H17, CYP4H18, CYP6M2, GSTE2, a UGT and a trypsin TRYP4 (Supplementary Table\u0026nbsp;6). This set of genes were enriched for seven P450-related GO terms, including oxidoreductase activity (p\u0026thinsp;=\u0026thinsp;1.68e-09) and heme binding (p\u0026thinsp;=\u0026thinsp;5.66e-09), and the PFAM domain \u0026lsquo;p450\u0026rsquo; (p\u0026thinsp;=\u0026thinsp;9.19e-10).\u003c/p\u003e\n\u003cp\u003eIn contrast, only 8 genes showed consistent down-regulation in at least 19 out of 28 experiments (Supplementary Table\u0026nbsp;6), including two opsin genes, GPROP4 and GPROP6, and a molecular chaperone HtpG gene (AGAP006961). The genes are enriched in seven GO terms relating to light detection, such as photoreceptor activity (p\u0026thinsp;=\u0026thinsp;0.029), visual perception (p\u0026thinsp;=\u0026thinsp;0.029) and phototransduction (p\u0026thinsp;=\u0026thinsp;0.039).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\n\u003ch2\u003eDetoxification genes\u003c/h2\u003e\n\u003cp\u003eIn total 18 cytochrome P450s show either a median and mean fold change of \u0026gt;\u0026thinsp;2 and 9 are significantly upregulated in at least 19 out of 28 of all datasets. The cytochrome P450 family most represented is the \u003cem\u003eCYP6\u003c/em\u003e family, orthologous to the \u003cem\u003eCYP3\u003c/em\u003e clade in humans, responsible for the vast majority of xenobiotic metabolism. The \u003cem\u003eCYP6Ps\u003c/em\u003e, \u003cem\u003eCYP6M2\u003c/em\u003e and \u003cem\u003eCYP9K1\u003c/em\u003e which have been repeatedly associated with insecticide resistance are all heavily implicated and are located in regions of the genome under selection (Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e5\u003c/span\u003e). Additionally, four members of the \u003cem\u003eCYP4\u003c/em\u003e family appear to be overexpressed across multiple resistant populations, particularly \u003cem\u003eCYP4H17\u003c/em\u003e, but also \u003cem\u003eCYP4H16\u003c/em\u003e, \u003cem\u003eCYP4H18\u003c/em\u003e, and \u003cem\u003eCYP4C28\u003c/em\u003e.\u003c/p\u003e\n\u003cp\u003eSimilarly, 4 out of 27 GSTs (Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e) have high mean or median fold-changes with one, \u003cem\u003eGSTD3\u003c/em\u003e showing consistently significant overexpression. Of these, \u003cem\u003eGSTE2\u003c/em\u003e has been shown to be involved in pyrethroid resistance [\u003cspan class=\"CitationRef\"\u003e38\u003c/span\u003e], whilst \u003cem\u003eGSTE1, 2\u003c/em\u003e and \u003cem\u003e5\u003c/em\u003e have been implicated in DDT resistance [\u003cspan class=\"CitationRef\"\u003e43\u003c/span\u003e]. Of the other phase two and three detoxification families, two UGTs (AGAP006222 and AGAP011564) and two carboxylesterases (\u003cem\u003eCOEAE6O\u003c/em\u003e and \u003cem\u003eCOEAE8O\u003c/em\u003e) are expressed with high mean or median expression or consistently over-expressed (Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\n\u003ch2\u003eCuticular Proteins\u003c/h2\u003e\n\u003cp\u003e113 cuticular proteins were included in the full \u003cem\u003egamb_colu_arab_fun\u003c/em\u003e analysis; after filtering for lowly-expressed genes, only 61 of these remained. 20 of these genes showed a mean or median fold-change of above 2 in resistant populations, and although none showed significant over-expression in 19/28 experiments, 9 showed upregulation in at least 16, including \u003cem\u003eCPLCX3, CPR130, CPAP3-C, CPR16, CPR140, CPR59, CPR79, CPLCP12\u003c/em\u003e and \u003cem\u003eCPR81\u003c/em\u003e.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\n\u003ch2\u003eExpression of neuronal-related genes\u003c/h2\u003e\n\u003cp\u003eGene-set enrichment analysis of GO terms on the 5% most down-regulated genes for median expression revealed significant enrichments in synapse (15 genes), sodium channel activity (10 genes) and chemical synaptic transmission (13 genes). The primary target site for pyrethroid insecticides, the \u003cem\u003eVgsc\u003c/em\u003e, is not present in the full \u003cem\u003egamb_colu_arab_fun\u003c/em\u003e analysis, but investigation of the secondary analyses shows sporadic down-regulation in \u003cem\u003eAn. coluzzii\u003c/em\u003e and \u003cem\u003eAn. funestus\u003c/em\u003e but not \u003cem\u003eAn. gambiae\u003c/em\u003e, suggesting that changes in expression of this gene are unlikely to be playing a role in resistance. Interestingly, \u003cem\u003eAce-1\u003c/em\u003e expression demonstrated up-regulation in almost all \u003cem\u003eAn. coluzzii\u003c/em\u003e and \u003cem\u003eAn. gambiae\u003c/em\u003e populations whilst it is down-regulated in all 5 of the \u003cem\u003eAn. funestus\u003c/em\u003e populations. The gene coding for the beta-2 subunit of the nicotinic acetylcholine receptor is significantly overexpressed in 11 out of 28 experiments.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eOther\u003c/strong\u003e \u003cstrong\u003ea priori\u003c/strong\u003e \u003cstrong\u003eresistance candidates\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA number of genes and gene families have recently been implicated in insecticide resistance, including CSPs, hexamerins, alpha-crystallin, the transcription factor Maf-S and d7 salivary proteins [\u003cspan class=\"CitationRef\"\u003e29\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e30\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e41\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e44\u003c/span\u003e]. Of the 8 chemosensory proteins, just three remained in the full \u003cem\u003egamb_colu_arab_fun\u003c/em\u003e analysis, \u003cem\u003eCSP3\u003c/em\u003e, \u003cem\u003eCSP4\u003c/em\u003e and \u003cem\u003eSAP1\u003c/em\u003e, none of which showed high mean or median fold-changes across four species. Similarly, three hexamerins were included in these data, all three of which showed high mean fold changes ranging from 2.64 to 3.4. Further exploration revealed that overexpression was observed in all \u003cem\u003eAn. arabiensis\u003c/em\u003e and the majority of \u003cem\u003eAn. gambiae\u003c/em\u003e experiments, with the orthologs of AGAP001345 and AGAP001657 being overexpressed in two (Uganda and FuMoz) out of five \u003cem\u003eAn. funestus\u003c/em\u003e datasets. Just two a-crystallins were present in the data, neither of which showed an expression-association with resistance. The transcription factor \u003cem\u003eMaf-S\u003c/em\u003e was previously identified as up-regulated across multiple \u003cem\u003eAn. gambiae\u003c/em\u003e s.l populations in the meta-analysis of microarray data [\u003cspan class=\"CitationRef\"\u003e30\u003c/span\u003e]. It also shows similarly consistent overexpression in the RNA-Sequencing data, with 12 out of 28 populations exhibiting significant up-regulation.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\n\u003ch2\u003eTranscriptional regulation of insecticide resistance\u003c/h2\u003e\n\u003cp\u003ePrevious work using transcriptomic time-course data revealed putative transcription factors regulating the expression of transcripts post-pyrethroid exposure [\u003cspan class=\"CitationRef\"\u003e45\u003c/span\u003e]. Here, we used the GENIE3 algorithm in the Grenadine package to re-capitulate a gene regulatory network utilising the 28 RNA-sequencing and including 31 microarray experiments. After filtering score for the top 5th percentile (Supplementary Table\u0026nbsp;7), 50,967 interactions were predicted. Firstly, we investigated \u003cem\u003eMaf-S\u003c/em\u003e due to its prior link with insecticide resistance [\u003cspan class=\"CitationRef\"\u003e44\u003c/span\u003e]. 269 putative interactions were identified, including the pyrethroid-resistance related transcripts \u003cem\u003eCYP6M2\u003c/em\u003e, \u003cem\u003eCYP6Z3\u003c/em\u003e, \u003cem\u003eCYP9J4\u003c/em\u003e, two ABC transporters and three GSTs including \u003cem\u003eGSTMS1\u003c/em\u003e. To validate the utility of this approach, microarray data published from a \u003cem\u003eMaf-S\u003c/em\u003e knockdown [\u003cspan class=\"CitationRef\"\u003e44\u003c/span\u003e] was compared with the model predictions; 83 out of the predicted 269 genes were present on the list. We then carried out a permutation analysis, and out of 1000 permutations only 18 had 83 or more overlaps (p\u0026thinsp;=\u0026thinsp;0.018) giving confidence to the model predictions.\u003c/p\u003e\n\u003cp\u003eTo determine which transcription factors are likely to play a role in resistance, enrichment analysis was performed on the predicted interaction partners for each transcription factor: \u003cem\u003eRootletin\u003c/em\u003e was associated with genes significantly enriched for oxidioreductase activity (p\u0026thinsp;=\u0026thinsp;0.00943) and was linked with \u003cem\u003eCYP6P3, CYP6P4\u003c/em\u003e and \u003cem\u003eCYP6P5, CYP6AA1, CYP6M2\u003c/em\u003e and \u003cem\u003eCYP6Z4\u003c/em\u003e; genes predicted to interact with \u003cem\u003eSu(H)\u003c/em\u003e were similarly enriched in oxoreductase activity (p\u0026thinsp;=\u0026thinsp;9.19e-3); \u003cem\u003eAntp, Cbt\u003c/em\u003e and \u003cem\u003eSug\u003c/em\u003e were significantly enriched in cellular response to stress (p\u0026thinsp;=\u0026thinsp;5.19e-4; 3.83e-4; 2.30e-8), \u003cem\u003eAsciz\u003c/em\u003e and \u003cem\u003eOrg-1\u003c/em\u003e to sensory response to chemical stimulus (p\u0026thinsp;=\u0026thinsp;5.56e-4; 1.69e-4), \u003cem\u003eE(spl)m3-HLH\u003c/em\u003e and \u003cem\u003eGrh\u003c/em\u003e with structural constituent of the cuticle (p\u0026thinsp;=\u0026thinsp;8.36e-5; 1.71e-2); \u003cem\u003eexex\u003c/em\u003e with acetylcholine metabolic process (p\u0026thinsp;=\u0026thinsp;3.57e-3); \u003cem\u003eGrau, Scrt\u003c/em\u003e and \u003cem\u003eToy\u003c/em\u003e with G protein-coupled receptor activity (p\u0026thinsp;=\u0026thinsp;1.47e-5; 1.01e-2; 2.46e-6); HLH3B with neurogenesis (p\u0026thinsp;=\u0026thinsp;4.01e-2); \u003cem\u003eOnecut\u003c/em\u003e with synpase organisation (p\u0026thinsp;=\u0026thinsp;5.26e-3); \u003cem\u003eRow\u003c/em\u003e with response to stress (p\u0026thinsp;=\u0026thinsp;8.97e-6) and \u003cem\u003eTFAM\u003c/em\u003e with electron transport chain (1.25e-7). A number of transcription factors were enriched in higher order functions, such as RNA metabolic processes, gene expression and aromatic compound metabolic processes.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eDespite the wide availability of transcriptomic data for major malaria vectors, no study to date has analysed these data across the four primary vectors, \u003cem\u003eAn. gambiae, An. coluzzii, An. arabiensis\u003c/em\u003e and \u003cem\u003eAn. funestus.\u003c/em\u003e In this study we performed a meta-analysis of these data, demonstrating clear convergence of molecular mechanisms of known insecticide resistance genes, as well as discovering novel candidates showing high levels of expression across the divergent species. We present a user-friendly python package, AnoExpress, which allows users to explore, visualise, and analyse the transcriptomic meta-analysis data.\u003c/p\u003e\n\u003cp\u003eIntegrating differential expression and single nucleotide polymorphism (SNP) data has the potential to enhance the discovery of novel resistance markers [\u003cspan class=\"CitationRef\"\u003e46\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e47\u003c/span\u003e]. We integrated whole-genome sequence data from the Ag1000G with our transcriptomic meta-analysis, estimating signals of positive selection in wild mosquitoes, and locating the intersection of genes that both are highly overexpressed and are located at sites of selection signals. As over-expression of genes is a common mechanism which can confer insecticide resistance, cis-acting mutations which are likely to increase the expression of nearby genes are expected to be under positive selection. Indeed, cis-acting factors have been identified on a triple-mutant haplotype around \u003cem\u003eCYP6AA1\u003c/em\u003e in East African \u003cem\u003eAn. gambiae\u003c/em\u003e [\u003cspan class=\"CitationRef\"\u003e22\u003c/span\u003e], and in \u003cem\u003eAn. funestus\u003c/em\u003e driving expression of \u003cem\u003eCYP6P9a/b\u003c/em\u003e [\u003cspan class=\"CitationRef\"\u003e48\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eThe transcriptomic meta-analysis recovered many known resistance genes from metabolic gene families, including \u003cem\u003eCYP6P3\u003c/em\u003e (the \u003cem\u003eCYP6P9a/b\u003c/em\u003e ortholog), \u003cem\u003eCYP6P4\u003c/em\u003e, \u003cem\u003eCYP6AA1\u003c/em\u003e, \u003cem\u003eCYP9K1\u003c/em\u003e, \u003cem\u003eCYP6Z2, CYP6Z3, CYP6M2\u003c/em\u003e and \u003cem\u003eGste2\u003c/em\u003e [\u003cspan class=\"CitationRef\"\u003e20\u003c/span\u003e\u0026ndash;\u003cspan class=\"CitationRef\"\u003e22\u003c/span\u003e]. Each of these cytochrome P450s have been shown to directly metabolise insecticides [\u003cspan class=\"CitationRef\"\u003e20\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e49\u003c/span\u003e], as well as being associated with selective sweeps in field populations [\u003cspan class=\"CitationRef\"\u003e19\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e31\u003c/span\u003e]. Additionally, several detoxification genes showing consistent overexpression have recently been implicated in gene duplication events in the \u003cem\u003eAn. gambiae\u003c/em\u003e complex, including \u003cem\u003eCYP9K1\u003c/em\u003e, \u003cem\u003eCYP6AA1, COEAE60\u003c/em\u003e and \u003cem\u003eGSTE2-4\u003c/em\u003e [\u003cspan class=\"CitationRef\"\u003e22\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e50\u003c/span\u003e]. The appearance of these genes in both candidates with consistently high mean/median fold change, as well as those consistently expressed over multiple datasets demonstrates the power of this approach. \u003cem\u003eCYP4H17\u003c/em\u003e is a striking example of a gene which seems to have been missed in earlier studies and is a promising candidate for functional validation in both \u003cem\u003eAn. gambiae s.l\u003c/em\u003e and \u003cem\u003eAn. funestus\u003c/em\u003e.\u003c/p\u003e\n\u003cp\u003eIn addition to detoxification genes, we see a signature of oxidative stress response. For example, \u003cem\u003eMaf-S\u003c/em\u003e is involved in the \u003cem\u003eNrf2-cnc\u003c/em\u003e pathway, which induces expression of stress response genes upon changes to the oxidative stress levels of the cell [\u003cspan class=\"CitationRef\"\u003e44\u003c/span\u003e]. Interestingly, \u003cem\u003eMaf-S\u003c/em\u003e has been shown to have a role in pyrethroid resistance and was identified from consistent overexpression in microarray datasets [\u003cspan class=\"CitationRef\"\u003e44\u003c/span\u003e] and is similarly consistently overexpressed in the RNA-Sequencing data, whilst another member of the pathway, \u003cem\u003eKeap1\u003c/em\u003e, is upregulated in \u003cem\u003eAn. funestus\u003c/em\u003e from Ghana. In addition to this characterised pathway, protein \u003cem\u003eDJ-1\u003c/em\u003e (AGAP000705) and \u003cem\u003eCatalase\u003c/em\u003e are consistently overexpressed, both these genes play a role in protection against oxidative stress [\u003cspan class=\"CitationRef\"\u003e51\u003c/span\u003e]. These data are consistent with previous publications indicating that insecticide exposure induces an oxidative stress response [\u003cspan class=\"CitationRef\"\u003e52\u003c/span\u003e\u0026ndash;\u003cspan class=\"CitationRef\"\u003e54\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eSignatures of expression in candidates related to cuticular hydrocarbons were also present, which may indicate a change to lipid processing as recently proposed [\u003cspan class=\"CitationRef\"\u003e55\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e56\u003c/span\u003e] or be related to a thicker cuticle [\u003cspan class=\"CitationRef\"\u003e24\u003c/span\u003e]. Genome-wide expression scans highlighted multiple gene clusters of cuticular protein families as outliers of gene expression, including the \u003cem\u003eCLPCA, CPLCG, CPLCP\u003c/em\u003e and \u003cem\u003eCPR\u003c/em\u003e families, which is reflected in the large number of these genes in the top candidate gene lists based on median and mean fold-changes. AGAP010368, an ortholog to a gene involved in fatty acid alpha-oxidation in \u003cem\u003eDrosophila\u003c/em\u003e, a long chain fatty acid CoA ligase (AGAP009159) and a fatty acid elongase (AGAP003195) are consistently over expressed. The large number of cuticular proteins (20/113) highly expressed and the lack of those consistent across all populations may suggest high levels of redundancy.\u003c/p\u003e\n\u003cp\u003eWe observe little differential expression of the major target site genes, such as the target of pyrethroids the \u003cem\u003eVgsc\u003c/em\u003e and dieldrin \u003cem\u003eRdl\u003c/em\u003e. As previously reported, the expression profile of the OP and carbamate target \u003cem\u003eAce-1\u003c/em\u003e is an exception, often upregulated in \u003cem\u003eAn. coluzzii\u003c/em\u003e and \u003cem\u003eAn. gambiae\u003c/em\u003e yet showing negative fold-changes \u003cem\u003eAn. funestus\u003c/em\u003e experiments. It is worth noting that all five \u003cem\u003eAn. funestus\u003c/em\u003e experiments included in this study use the same susceptible comparator strain, FANG, and so this result could be a peculiarity of this laboratory strain. We find regular over-expression of the gene coding for the beta-2 subunit of nicotinic acetylcholine receptor (AGAP010057), recently associated with Pirimiphos-methyl resistance in a genome-wide association study [\u003cspan class=\"CitationRef\"\u003e12\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eAs well as physiological changes which directly confer resistance to insecticides, the potential for mosquitoes to avoid contact with insecticides and exploit humans when they are least protected is a serious threat to malaria vector control. Mosquito behaviour is likely to be driven in part by gene expression [\u003cspan class=\"CitationRef\"\u003e57\u003c/span\u003e] and transcriptomic studies may allow us to identify candidate genes involved in insecticide-resistant behaviours. This is of particular interest given the difficulty in measuring behavioural resistance in wild mosquito populations. Interestingly, enrichment analyses highlighted that our expression candidate genes were enriched for genes involved in olfactory processes. Gustatory receptor 49 (AGAP001169) shows a median fold-change of 3.39, the highest of any olfactory receptor in our data, and is a promising candidate for behaviourally-linked insecticide resistance. The odorant receptors 13 (AGAP009396) and 9 (AGAP008333) also show high median fold-changes of 2.73 and 2.71, respectively. It is, however, important to note that adaptation of susceptible strains to insectary rearing could cause the downregulation of olfactory genes, and so contribute to these signals.\u003c/p\u003e\n\u003cp\u003eWe also identify candidate genes from gene families not previously associated with resistance in malaria vectors. The ortholog of \u003cem\u003eneural lazarillo\u003c/em\u003e (AGAP009281) has a high median fold-change, a gene which is known to contribute to longevity, stress resistance and behavioural change through the \u003cem\u003eJNK\u003c/em\u003e stress response pathway in \u003cem\u003eDrosophila\u003c/em\u003e [\u003cspan class=\"CitationRef\"\u003e58\u003c/span\u003e]. Several venom allergens also exhibit high median fold-changes. Interestingly, \u003cem\u003eVenom allergen 5\u003c/em\u003e has been associated with resistance in \u003cem\u003eCulex\u003c/em\u003e through dsRNA and over-expression in cells [\u003cspan class=\"CitationRef\"\u003e59\u003c/span\u003e]. These genes are poorly understood but a number have been found in the salivary gland proteome where they have a diverse range of functions, with none currently being characterised [\u003cspan class=\"CitationRef\"\u003e60\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eTrans-acting regulation plays an important role in modulating gene expression to insecticide stress in insects [\u003cspan class=\"CitationRef\"\u003e44\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e61\u003c/span\u003e\u0026ndash;\u003cspan class=\"CitationRef\"\u003e63\u003c/span\u003e]. To explore whether we could identify transcription factors involved in insecticide resistance, we inferred a gene regulatory network. The predicted interactors of several transcription factors were enriched in GO terms related to insecticide resistance. Amongst them is \u003cem\u003eRootletin\u003c/em\u003e, which is known to play a role in behavioural response in \u003cem\u003eDrosophila\u003c/em\u003e [\u003cspan class=\"CitationRef\"\u003e64\u003c/span\u003e] and was previously linked to resistance as a key hub in response to insecticide exposure [\u003cspan class=\"CitationRef\"\u003e45\u003c/span\u003e], whilst \u003cem\u003eGrau\u003c/em\u003e has been linked to pyrethroid resistance in \u003cem\u003eAedes\u003c/em\u003e populations [\u003cspan class=\"CitationRef\"\u003e65\u003c/span\u003e]. Other transcription factors enriched in insecticide resistance-related terms that are consistent with known roles in \u003cem\u003eDrosophila\u003c/em\u003e includes \u003cem\u003eCabut\u003c/em\u003e, which has been shown to regulate stress response [\u003cspan class=\"CitationRef\"\u003e66\u003c/span\u003e], \u003cem\u003eE(spl)m3-HLH\u003c/em\u003e which is linked with stress granules [\u003cspan class=\"CitationRef\"\u003e67\u003c/span\u003e], \u003cem\u003eGrh\u003c/em\u003e that is involved in cuticular repair [\u003cspan class=\"CitationRef\"\u003e68\u003c/span\u003e], \u003cem\u003eExex\u003c/em\u003e is involved in neuronal function [\u003cspan class=\"CitationRef\"\u003e69\u003c/span\u003e], \u003cem\u003eOnecut\u003c/em\u003e with synapse organisation [\u003cspan class=\"CitationRef\"\u003e70\u003c/span\u003e] and \u003cem\u003eTfam\u003c/em\u003e is a mitochondrial transcription factor, with a role in oxidative stress response [\u003cspan class=\"CitationRef\"\u003e71\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eWe also explored the role that gene expression plays in shaping levels of genetic diversity across the genome. Protein evolution is constrained by purifying selection, which acts to prevent deleterious mutations from spreading in a population [\u003cspan class=\"CitationRef\"\u003e72\u003c/span\u003e]. Previous studies have shown a relationship between higher gene expression and a slower accumulation of deleterious mutations [\u003cspan class=\"CitationRef\"\u003e46\u003c/span\u003e]. We find a similar relationship in \u003cem\u003eAnopheles\u003c/em\u003e mosquitoes, with the most highly expressed genes displaying lower rates of pN/pS, as well as a reduction in measures of genetic diversity, and replicate this effect across three mosquito species. \u003cem\u003eAnopheles gambiae s.l\u003c/em\u003e populations exhibit extremely large effective population sizes, which should accelerate the removal of deleterious mutations. Between species differences support this, as \u003cem\u003eAn. arabiensis\u003c/em\u003e displays lower nucleotide diversity, but higher rates of pN/pS for the most highly expressed genes.\u003c/p\u003e\n\u003cp\u003eIn this study, we present a holistic overview of signatures of gene expression between from the major malaria vectors \u003cem\u003eAnopheles gambiae s.l\u003c/em\u003e and \u003cem\u003eAnopheles funestus\u003c/em\u003e. As well as establishing relationships between gene expression and whole-genome sequence data, we demonstrate the importance and convergence of detoxification genes, as well as highlighting putative transcription factors and novel gene families involved in insecticide resistance. Despite the power in this study there are several limitations that should be considered; firstly the \u003cem\u003eAn. funestus\u003c/em\u003e data is from one experiment, and therefore is likely to have significant batch effects. Secondly, \u003cem\u003eAnopheles\u003c/em\u003e have a gene/loss factor five times that of \u003cem\u003eDrosophila\u003c/em\u003e [\u003cspan class=\"CitationRef\"\u003e14\u003c/span\u003e] resulting in many gene families have one-to-many orthologs which have been excluded from this study, which results in a loss of data. Finally, here we take \u0026lsquo;insecticide resistance\u0026rsquo; as an overall phenotype because the populations grouped in this study have differential resistances to different classes, although all are pyrethroid resistance. After taking these limitations into count, these data highlight novel functional validation and genomic surveillance targets for malaria vector control in Africa.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec16\" class=\"Section3\"\u003e\n\u003ch2\u003eRNA-sequencing data\u003c/h2\u003e\n\u003cp\u003eAll RNA-sequencing data published comparing resistant and susceptible members of \u003cem\u003eAn. gambiae\u003c/em\u003e, \u003cem\u003eAn. coluzzii, An. arabiensis\u003c/em\u003e and \u003cem\u003eAn. funestus\u003c/em\u003e were retrieved from Google Scholar in April 2022. Of the 28 resistant populations studied, 15 are \u003cem\u003eAn. coluzzii\u003c/em\u003e, 3 are \u003cem\u003eAn. gambiae\u003c/em\u003e, 5 are \u003cem\u003eAn. arabiensis\u003c/em\u003e and 5 are \u003cem\u003eAn. funestus\u003c/em\u003e (Supplementary Table\u0026nbsp;1) [\u003cspan class=\"CitationRef\"\u003e32\u003c/span\u003e\u0026ndash;\u003cspan class=\"CitationRef\"\u003e38\u003c/span\u003e]. As multiple studies used the same susceptible populations due to the widespread resistance found in endemic settings, replicates of the same populations across studies were present (Supplementary Table\u0026nbsp;1). Count files were retrieved from authors of each paper and combined to form one large counts file. VectorBase was used to retrieve orthologs from each species to \u003cem\u003eAn. gambiae\u003c/em\u003e PEST and the counts of one-to-many relationships were averaged.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e\n\u003ch2\u003eDifferential expression analysis\u003c/h2\u003e\n\u003cp\u003eThe data was then normalised using the DEseq2 1.26.0 [\u003cspan class=\"CitationRef\"\u003e73\u003c/span\u003e] using estimateSizeFactors, estimateDispersions and the varianceStabilizingTransformation. Differential expression analysis (DEA) was performed with DESeq2 v1.26 in R v3.6.3. Due to batch effects and large differences in library depth between the included RNA-Sequencing studies, we only performed DEA within each experiment, comparing the susceptible (control) to the resistant strain (case). Positive fold changes indicate over-expression in the resistant strain. Hypothesis testing was performed with the DESeq2 wald test.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e\n\u003ch2\u003eMicroarray data\u003c/h2\u003e\n\u003cp\u003eWe integrated Microarray data from an earlier meta-analysis [\u003cspan class=\"CitationRef\"\u003e30\u003c/span\u003e] into AnoExpress. As this was performed at the transcript level rather than genes, we averaged log2 fold change and adjusted p-value data across transcripts for a given gene to match the AnoExpress meta-analysis.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec19\" class=\"Section2\"\u003e\n\u003ch2\u003eGene-set enrichment analysis\u003c/h2\u003e\n\u003cp\u003eGene set-enrichment was performed using the hypergeometric test implementation in Scipy, incorporated into AnoExpress. GO terms were extracted from VectorBase and PFAM domains from [\u003cspan class=\"CitationRef\"\u003e74\u003c/span\u003e].\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec20\" class=\"Section2\"\u003e\n\u003ch2\u003eGenomic clustering\u003c/h2\u003e\n\u003cp\u003eUsing the \u003cem\u003egamb_colu_arab\u003c/em\u003e analysis we calculated and averaged the Pearson\u0026rsquo;s correlation between log2 fold changes for each neighbouring gene pair in the dataset, and then calculated the mean pearsons R across all pairs. We excluded \u003cem\u003eAn. funestus\u003c/em\u003e due to differences in synteny compared to \u003cem\u003eAn. gambiae s.l.\u003c/em\u003e To determine whether this value was higher than expected based on chance, we randomly permuted the positions of all genes in the genome 1000 times, calculated the Pearsons correlation between neighbouring genes, and calculated the mean. We considered the effect significant if the fraction of permutations showing a more extreme value than the actual true value was below 0.05.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec21\" class=\"Section2\"\u003e\n\u003ch2\u003eGenetic diversity\u003c/h2\u003e\n\u003cp\u003eUsing log2 count data from the \u003cem\u003egamb_colu_arab\u003c/em\u003e analysis, we calculated the median counts across all experiments, and binned genes into 5% percentiles of expression level. We excluded \u003cem\u003eAn. funestus\u003c/em\u003e, as sufficient whole-genome sequence data is not yet publicly available. We selected three cohorts to analyse from the Anopheles 1000 genome project [\u003cspan class=\"CitationRef\"\u003e31\u003c/span\u003e]; \u003cem\u003eAn. gambiae\u003c/em\u003e from Nagongera, Uganda, \u003cem\u003eAn. coluzzii\u003c/em\u003e from Hauts-Bassins, Burkina Faso, and \u003cem\u003eAn. arabiensis\u003c/em\u003e from Muleba, Tanzania, collected between 2012 and 2015. We randomly selected 50 individuals from each cohort. Using segregating sites only, we calculated the number of non-synonymous and synoynmous sites for each transcript, using the first transcript annotation for each gene, up to -RC. We calculated the overall CDS length per transcript and calculated the number of synonymous or non-synoynmous mutations per 1000bp of CDS. We also calculated nucleotide diversity and Wattersons Theta across the entire length of the gene (including introns), using scikit-allel v1.2.1 [\u003cspan class=\"CitationRef\"\u003e75\u003c/span\u003e].\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec22\" class=\"Section2\"\u003e\n\u003ch2\u003eGenome-wide selection scan (GWSS) data\u003c/h2\u003e\n\u003cp\u003eWe integrate data from H12 genome-wide selection scans [\u003cspan class=\"CitationRef\"\u003e40\u003c/span\u003e] from the selection-atlas. More information on the GWSS and peak-calling algorithm can be found in Supplementary Text 1.\u003c/p\u003e\n\u003cdiv id=\"Sec23\" class=\"Section3\"\u003e\n\u003ch2\u003eGenome-wide expression scans\u003c/h2\u003e\n\u003cp\u003eWe calculate a sliding window mean of log2 fold-change data, for the \u003cem\u003egamb_colu_arab_fun\u003c/em\u003e analysis, with a window size of 10 genes and a step of 2 genes.\u003c/p\u003e\n\u003c/div\u003e\n\u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eResults from the differential expression analyses are stored in the github repository for the AnoExpress python package - https://github.com/sanjaynagi/AnoExpress, and can be explored in the cloud with a series of Google Colaboratory notebooks, or on users\u0026rsquo; local machines.\u003c/p\u003e\n\u003cp\u003eAll RNA-sequencing data is available from the publications listed in Supplementary Table 1. All code used for this study is available on GitHub: https://github.com/sanjaynagi/AnoExpress. The authors declare that all other data supporting the findings of this study, are available within the article and its Supplementary Information files.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was funded by a Deutsches Zentrum f\u0026uuml;r Infektionsforschung grant (TTU 03.705) to VAI. SCN was funded by an MRC CASE studentship (MR/R015678/1) and a National Institute of Allergy and Infectious Diseases grant (NIAID R01-AI116811 to Martin Donnelly and David Weetman). The authors would like to thank all those who freely gave count data from prior publications, including Louisa Messenger, Dieunel Derilus, Pie M\u0026uuml;ller, Nadja Wipf, Charles Wondji, Sulaiman Ibrahim, Jack Hearn, Thomas van Leeuwen and Wannes Dermauw. We also thank Hilary Ranson, David Weetman and Alistair Miles for feedback on the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting Interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eContributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConceptualisation, analysis, paper drafting (VAI); Conceptualisation, analysis, creation of python package, paper drafting (SCN).\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eOrganization, W.H., \u003cem\u003eWorld Malaria Report\u003c/em\u003e. 2022, WHO.\u003c/li\u003e\n\u003cli\u003eBhatt, S., et al., \u003cem\u003eThe effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015.\u003c/em\u003e Nature, 2015. \u003cstrong\u003e526\u003c/strong\u003e: p. 207-211.\u003c/li\u003e\n\u003cli\u003eOrganization, W.H., \u003cem\u003eGlobal report on insecticide resistance in malaria vectors: 2010\u0026ndash;2016\u003c/em\u003e. 2018, Geneva: World Health Organization. p. 72.\u003c/li\u003e\n\u003cli\u003eChurcher, T.S., et al., \u003cem\u003eThe impact of pyrethroid resistance on the efficacy and effectiveness of bednets for malaria control in Africa\u003c/em\u003e. 2016, eLife Sciences Publications Limited. p. e16090.\u003c/li\u003e\n\u003cli\u003eCornelie, S., et al., \u003cem\u003eSalivary Gland Proteome Analysis Reveals Modulation of Anopheline Unique Proteins in Insensitive Acetylcholinesterase Resistant Anopheles gambiae Mosquitoes\u003c/em\u003e. 2014, Public Library of Science. p. e103816.\u003c/li\u003e\n\u003cli\u003eCoetzee, M., et al., \u003cem\u003eAnopheles coluzzii and Anopheles amharicus, new members of the Anopheles gambiae complex\u003c/em\u003e. 2013. p. 246-274.\u003c/li\u003e\n\u003cli\u003eSinka, M.H., et al., \u003cem\u003eThe Dominant Anopheles Vectors of Human Malaria in Africa, Europe and the Middle East: Occurrence Data, Distribution Maps and Bionomic Pr\u0026eacute;cis.\u003c/em\u003e 2010, BioMed Central. p. 117-152.\u003c/li\u003e\n\u003cli\u003eGuo, D., et al., \u003cem\u003eACE: an efficient and sensitive tool to detect insecticide resistance-associated mutations in insect acetylcholinesterase from RNA-Seq data.\u003c/em\u003e BMC Bioinformatics, 2017. \u003cstrong\u003e18\u003c/strong\u003e(1): p. 330.\u003c/li\u003e\n\u003cli\u003eScott, J.G., \u003cem\u003eLife and Death at the Voltage-Sensitive Sodium Channel: Evolution in Response to Insecticide Use.\u003c/em\u003e Annual Review of Entomology, 2019. \u003cstrong\u003e64\u003c/strong\u003e(1): p. 243-257.\u003c/li\u003e\n\u003cli\u003eNauen, R., et al., \u003cem\u003eThe Role of Cytochrome P450s in Insect Toxicology and Resistance.\u003c/em\u003e Annual Review of Entomology, 2022. \u003cstrong\u003e67\u003c/strong\u003e(1): p. 105-124.\u003c/li\u003e\n\u003cli\u003eClarkson, C.S., et al., \u003cem\u003eThe genetic architecture of target-site resistance to pyrethroid insecticides in the African malaria vectors Anopheles gambiae and Anopheles coluzzii\u003c/em\u003e. 2021, John Wiley \u0026amp; Sons, Ltd. p. 5303-5317.\u003c/li\u003e\n\u003cli\u003eGrau-Bov\u0026eacute;, X., et al., \u003cem\u003eResistance to pirimiphos-methyl in West African Anopheles is spreading via duplication and introgression of the Ace1 locus\u003c/em\u003e. 2021, Public Library of Science. p. e1009253.\u003c/li\u003e\n\u003cli\u003eSmall, S.T., et al., \u003cem\u003eRadiation with reticulation marks the origin of a major malaria vector.\u003c/em\u003e Proceedings of the National Academy of Sciences, 2020. \u003cstrong\u003e117\u003c/strong\u003e(50): p. 31583-31590.\u003c/li\u003e\n\u003cli\u003eNeafsey, D.E., et al., \u003cem\u003eMosquito genomics. Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes\u003c/em\u003e. 2015.\u003c/li\u003e\n\u003cli\u003eHemingway, J. and H. Ranson, \u003cem\u003eInsecticide resistance in insect vectors of human disease.\u003c/em\u003e 2000. p. 371-391.\u003c/li\u003e\n\u003cli\u003eIngham, V.A., L. Grigoraki, and H. Ranson, \u003cem\u003ePyrethroid resistance mechanisms in the major malaria vector species complex.\u003c/em\u003e Entomologia Generalis, 2023. \u003cstrong\u003e43\u003c/strong\u003e(3): p. 515-526.\u003c/li\u003e\n\u003cli\u003eIrving, H., et al., \u003cem\u003ePositional cloning of rp2 QTL associates the P450 genes CYP6Z1, CYP6Z3 and CYP6M7 with pyrethroid resistance in the malaria vector Anopheles funestus\u003c/em\u003e. 2012, The Author(s). p. 383.\u003c/li\u003e\n\u003cli\u003eIbrahim, S.S., et al., \u003cem\u003eThe P450 CYP6Z1 confers carbamate/pyrethroid cross-resistance in a major African malaria vector beside a novel carbamate-insensitive N485I acetylcholinesterase-1 mutation\u003c/em\u003e. 2016, John Wiley \u0026amp; Sons, Ltd (10.1111). p. 3436-3452.\u003c/li\u003e\n\u003cli\u003eHearn, J., et al., \u003cem\u003eMulti‐omics analysis identifies a CYP9K1 haplotype conferring pyrethroid resistance in the malaria vector Anopheles funestus in East Africa.\u003c/em\u003e Molecular ecology, 2022. \u003cstrong\u003e31\u003c/strong\u003e(13): p. 3642-3657.\u003c/li\u003e\n\u003cli\u003eYunta, C., et al., \u003cem\u003eCross-resistance profiles of malaria mosquito P450s associated with pyrethroid resistance against WHO insecticides\u003c/em\u003e. 2019.\u003c/li\u003e\n\u003cli\u003eVontas, J., et al., \u003cem\u003eRapid selection of a pyrethroid metabolic enzyme CYP9K1 by operational malaria control activities\u003c/em\u003e. 2018. p. 4619 LP - 4624.\u003c/li\u003e\n\u003cli\u003eNjoroge, H., et al., \u003cem\u003eIdentification of a rapidly‐spreading triple mutant for high‐level metabolic insecticide resistance in Anopheles gambiae provides a real‐time molecular diagnostic for antimalarial intervention deployment.\u003c/em\u003e Molecular ecology, 2022. \u003cstrong\u003e31\u003c/strong\u003e(16): p. 4307-4318.\u003c/li\u003e\n\u003cli\u003eIbrahim, S.S., et al., \u003cem\u003eThe cytochrome P450 CYP6P4 is responsible for the high pyrethroid resistance in knockdown resistance-free Anopheles arabiensis.\u003c/em\u003e 2015.\u003c/li\u003e\n\u003cli\u003eBalabanidou, V., L. Grigoraki, and J. Vontas, \u003cem\u003eInsect cuticle: a critical determinant of insecticide resistance\u003c/em\u003e. 2018. p. 68-74.\u003c/li\u003e\n\u003cli\u003eBalabanidou, V., et al., \u003cem\u003eCytochrome P450 associated with insecticide resistance catalyzes cuticular hydrocarbon production in Anopheles gambiae.\u003c/em\u003e Proceedings of the National Academy of Sciences, 2016. \u003cstrong\u003e113\u003c/strong\u003e(33): p. 9268-9273.\u003c/li\u003e\n\u003cli\u003eWood, O.R., et al., \u003cem\u003eCuticle thickening associated with pyrethroid resistance in the major malaria vector Anopheles funestus.\u003c/em\u003e Parasites \u0026amp; Vectors, 2010. \u003cstrong\u003e3\u003c/strong\u003e: p. 67.\u003c/li\u003e\n\u003cli\u003eMartinez-Torres, D., et al., \u003cem\u003eMolecular characterization of pyrethroid knockdown resistance (kdr) in the major malaria vector Anopheles gambiae s.s.\u003c/em\u003e 1998. p. 179-184.\u003c/li\u003e\n\u003cli\u003eWeill, M., et al., \u003cem\u003eThe unique mutation in ace-1 giving high insecticide resistance is easily detectable in mosquito vectors\u003c/em\u003e. 2004, Blackwell Publishing Ltd. p. 1-7.\u003c/li\u003e\n\u003cli\u003eIngham, V.A., et al., \u003cem\u003eA sensory appendage protein protects malaria vectors from pyrethroids.\u003c/em\u003e Nature, 2019. \u003cstrong\u003e577\u003c/strong\u003e: p. 376-380.\u003c/li\u003e\n\u003cli\u003eIngham, V.A., S. Wagstaff, and H. Ranson, \u003cem\u003eTranscriptomic meta-signatures identified in Anopheles gambiae populations reveal previously undetected insecticide resistance mechanisms.\u003c/em\u003e Nature Communications, 2018. \u003cstrong\u003e9\u003c/strong\u003e: p. 5282.\u003c/li\u003e\n\u003cli\u003eConsortium, A.g.G., \u003cem\u003eGenetic diversity of the African malaria vector Anopheles gambiae\u003c/em\u003e. 2017, Nature Publishing Group. p. 96.\u003c/li\u003e\n\u003cli\u003eWilliams, J., et al., \u003cem\u003eSympatric Populations of the Anopheles gambiae Complex in Southwest Burkina Faso Evolve Multiple Diverse Resistance Mechanisms in Response to Intense Selection Pressure with Pyrethroids\u003c/em\u003e. 2022.\u003c/li\u003e\n\u003cli\u003eSimma, E.A., et al., \u003cem\u003eGenome-wide gene expression profiling reveals that cuticle alterations and P450 detoxification are associated with deltamethrin and DDT resistance in Anopheles arabiensis populations from Ethiopia.\u003c/em\u003e 2019.\u003c/li\u003e\n\u003cli\u003eMessenger, L.A., et al., \u003cem\u003eA whole transcriptomic approach provides novel insights into the molecular basis of organophosphate and pyrethroid resistance in Anopheles arabiensis from Ethiopia\u003c/em\u003e. 2021, Elsevier. p. 103655.\u003c/li\u003e\n\u003cli\u003eIngham, V.A., et al., \u003cem\u003eIntegration of whole genome sequencing and transcriptomics reveals a complex picture of the reestablishment of insecticide resistance in the major malaria vector Anopheles coluzzii.\u003c/em\u003e PLOS Genetics, 2021. \u003cstrong\u003e17\u003c/strong\u003e: p. e1009970.\u003c/li\u003e\n\u003cli\u003eIbrahim, S.S., et al., \u003cem\u003eMolecular drivers of insecticide resistance in the Sahelo-Sudanian populations of a major malaria vector Anopheles coluzzii.\u003c/em\u003e BMC Biology, 2023. \u003cstrong\u003e21\u003c/strong\u003e(1): p. 125.\u003c/li\u003e\n\u003cli\u003eWipf, N.C., et al., \u003cem\u003eMulti-insecticide resistant malaria vectors in the field remain susceptible to malathion, despite the presence of Ace1 point mutations\u003c/em\u003e. 2022, Public Library of Science. p. e1009963.\u003c/li\u003e\n\u003cli\u003eKouamo, M.F.M., et al., \u003cem\u003eGenome-Wide Transcriptional Analysis and Functional Validation Linked a Cluster of Epsilon Glutathione S-Transferases with Insecticide Resistance in the Major Malaria Vector Anopheles funestus across Africa\u003c/em\u003e. 2021.\u003c/li\u003e\n\u003cli\u003eLeffler, E.M., et al., \u003cem\u003eRevisiting an Old Riddle: What Determines Genetic Diversity Levels within Species?\u003c/em\u003e PLOS Biology, 2012. \u003cstrong\u003e10\u003c/strong\u003e(9): p. e1001388.\u003c/li\u003e\n\u003cli\u003eGarud, N.R., et al., \u003cem\u003eRecent Selective Sweeps in North American Drosophila melanogaster Show Signatures of Soft Sweeps.\u003c/em\u003e PLOS Genetics, 2015. \u003cstrong\u003e11\u003c/strong\u003e(2): p. e1005004.\u003c/li\u003e\n\u003cli\u003eIsaacs, A.T., et al., \u003cem\u003eGenome-wide transcriptional analyses in Anopheles mosquitoes reveal an unexpected association between salivary gland gene expression and insecticide resistance\u003c/em\u003e. 2018. p. 225.\u003c/li\u003e\n\u003cli\u003eNamiki, T., et al., \u003cem\u003eCytochrome P450 CYP307A1/Spook: a regulator for ecdysone synthesis in insects\u003c/em\u003e. 2005, Elsevier. p. 367-374.\u003c/li\u003e\n\u003cli\u003eOrtelli, F., et al., \u003cem\u003eHeterologous expression of four glutathione transferase genes genetically linked to a major insecticide-resistance locus from the malaria vector Anopheles gambiae\u003c/em\u003e. 2003. p. 957-963.\u003c/li\u003e\n\u003cli\u003eIngham, V.A., et al., \u003cem\u003eThe transcription factor Maf-S regulates metabolic resistance to insecticides in the malaria vector Anopheles gambiae.\u003c/em\u003e BMC Genomics, 2017. \u003cstrong\u003e18\u003c/strong\u003e: p. 669.\u003c/li\u003e\n\u003cli\u003eIngham, V.A., et al., \u003cem\u003eCapturing the transcription factor interactome in response to sub-lethal insecticide exposure.\u003c/em\u003e Current Research in Insect Science, 2021: p. 100018.\u003c/li\u003e\n\u003cli\u003eDuret, L. and D. Mouchiroud, \u003cem\u003eDeterminants of Substitution Rates in Mammalian Genes: Expression Pattern Affects Selection Intensity but Not Mutation Rate.\u003c/em\u003e Molecular Biology and Evolution, 2000. \u003cstrong\u003e17\u003c/strong\u003e(1): p. 68-070.\u003c/li\u003e\n\u003cli\u003eNagi, S.C., et al., \u003cem\u003eRNA-Seq-Pop: Exploiting the sequence in RNA sequencing\u0026mdash;A Snakemake workflow reveals patterns of insecticide resistance in the malaria vector Anopheles gambiae.\u003c/em\u003e Molecular Ecology Resources, 2023. \u003cstrong\u003e23\u003c/strong\u003e(4): p. 946-961.\u003c/li\u003e\n\u003cli\u003eLeon, M., et al., \u003cem\u003eA rapidly selected 4.3kb transposon-containing structural variation is driving a P450-based resistance to pyrethroids in the African malaria vector Anopheles funestus\u003c/em\u003e. 2023, Authorea Preprints.\u003c/li\u003e\n\u003cli\u003eMoyes, C.L., et al., \u003cem\u003eAssessing cross-resistance within the pyrethroids in terms of their interactions with key cytochrome P450 enzymes and resistance in vector populations\u003c/em\u003e. 2021. p. 115.\u003c/li\u003e\n\u003cli\u003eLucas, E.R., et al., \u003cem\u003eWhole-genome sequencing reveals high complexity of copy number variation at insecticide resistance loci in malaria mosquitoes\u003c/em\u003e. 2019, Cold Spring Harbor Laboratory Press. p. 1250-1261.\u003c/li\u003e\n\u003cli\u003eLavara-Culebras, E. and N. Paricio, \u003cem\u003eDrosophila DJ-1 mutants are sensitive to oxidative stress and show reduced lifespan and motor deficits.\u003c/em\u003e Gene, 2007. \u003cstrong\u003e400\u003c/strong\u003e(1): p. 158-165.\u003c/li\u003e\n\u003cli\u003eChampion, C.J. and J. Xu, \u003cem\u003eRedox state affects fecundity and insecticide susceptibility in Anopheles gambiae\u003c/em\u003e. 2018. p. 13054.\u003c/li\u003e\n\u003cli\u003eOliver, S.V. and B.D. Brooke, \u003cem\u003eThe Role of Oxidative Stress in the Longevity and Insecticide Resistance Phenotype of the Major Malaria Vectors Anopheles arabiensis and Anopheles funestus\u003c/em\u003e. 2016, Public Library of Science. p. e0151049.\u003c/li\u003e\n\u003cli\u003eOtali, D., et al., \u003cem\u003eIncreased production of mitochondrial reactive oxygen species and reduced adult life span in an insecticide-resistant strain of Anopheles gambiae\u003c/em\u003e. 2014. p. 323-333.\u003c/li\u003e\n\u003cli\u003eAdams, K.L., et al., \u003cem\u003eCuticular hydrocarbons are associated with mating success and insecticide resistance in malaria vectors.\u003c/em\u003e Communications Biology, 2021. \u003cstrong\u003e4\u003c/strong\u003e: p. 911.\u003c/li\u003e\n\u003cli\u003eAdams, K.L., et al., \u003cem\u003eSelection for insecticide resistance can promote Plasmodium falciparum infection in Anopheles.\u003c/em\u003e PLOS Pathogens, 2023. \u003cstrong\u003e19\u003c/strong\u003e(6): p. e1011448.\u003c/li\u003e\n\u003cli\u003eGovella, N.J., et al., \u003cem\u003eHeritability of biting time behaviours in the major African malaria vector Anopheles arabiensis.\u003c/em\u003e Malaria Journal, 2023. \u003cstrong\u003e22\u003c/strong\u003e(1): p. 238.\u003c/li\u003e\n\u003cli\u003eRuiz, M., et al., \u003cem\u003eGrasshopper Lazarillo, a GPI-anchored Lipocalin, increases Drosophila longevity and stress resistance, and functionally replaces its secreted homolog NLaz.\u003c/em\u003e Insect Biochemistry and Molecular Biology, 2012. \u003cstrong\u003e42\u003c/strong\u003e(10): p. 776-789.\u003c/li\u003e\n\u003cli\u003eLv, Y., et al., \u003cem\u003eVenom allergen 5 is Associated With Deltamethrin Resistance in Culex pipiens pallens (Diptera: Culicidae).\u003c/em\u003e Journal of Medical Entomology, 2015. \u003cstrong\u003e52\u003c/strong\u003e(4): p. 672-682.\u003c/li\u003e\n\u003cli\u003eArc\u0026agrave;, B., et al., \u003cem\u003eAnopheline salivary protein genes and gene families: an evolutionary overview after the whole genome sequence of sixteen Anopheles species.\u003c/em\u003e BMC Genomics, 2017. \u003cstrong\u003e18\u003c/strong\u003e(1): p. 153.\u003c/li\u003e\n\u003cli\u003eGao, L., et al., \u003cem\u003eXenobiotic responses in insects\u003c/em\u003e. 2022, John Wiley \u0026amp; Sons, Ltd. p. e21869.\u003c/li\u003e\n\u003cli\u003eWondji, C.S., et al., \u003cem\u003eMapping a Quantitative Trait Locus (QTL) conferring pyrethroid resistance in the African malaria vector Anopheles funestus.\u003c/em\u003e BMC Genomics, 2007. \u003cstrong\u003e8\u003c/strong\u003e(1): p. 34.\u003c/li\u003e\n\u003cli\u003eMugenzi, L.M.J., et al., \u003cem\u003eCis-regulatory CYP6P9b P450 variants associated with loss of insecticide-treated bed net efficacy against Anopheles funestus\u003c/em\u003e. 2019, Nature Publishing Group. p. 1-11.\u003c/li\u003e\n\u003cli\u003eStyczynska-Soczka, K. and A.P. Jarman, \u003cem\u003eThe Drosophila homologue of Rootletin is required for mechanosensory function and ciliary rootlet formation in chordotonal sensory neurons.\u003c/em\u003e Cilia, 2015. \u003cstrong\u003e4\u003c/strong\u003e(1): p. 9.\u003c/li\u003e\n\u003cli\u003eCampbell, C.L., et al., \u003cem\u003eVgsc-interacting proteins are genetically associated with pyrethroid resistance in Aedes aegypti.\u003c/em\u003e PLOS ONE, 2019. \u003cstrong\u003e14\u003c/strong\u003e(1): p. e0211497.\u003c/li\u003e\n\u003cli\u003eMukherjee, S., N. Paricio, and N.S. Sokol, \u003cem\u003eA stress-responsive miRNA regulates BMP signaling to maintain tissue homeostasis.\u003c/em\u003e Proceedings of the National Academy of Sciences, 2021. \u003cstrong\u003e118\u003c/strong\u003e(21): p. e2022583118.\u003c/li\u003e\n\u003cli\u003eSingh, A., et al., \u003cem\u003eThe transcriptional response to oxidative stress is independent of stress-granule formation.\u003c/em\u003e Molecular Biology of the Cell, 2022. \u003cstrong\u003e33\u003c/strong\u003e(3): p. ar25.\u003c/li\u003e\n\u003cli\u003eMace, K.A., J.C. Pearson, and W. McGinnis, \u003cem\u003eAn Epidermal Barrier Wound Repair Pathway in Drosophila Is Mediated by grainy head.\u003c/em\u003e Science, 2005. \u003cstrong\u003e308\u003c/strong\u003e(5720): p. 381-385.\u003c/li\u003e\n\u003cli\u003eJoanne, P.O., H. Scott, and Q.D. Chris, \u003cem\u003eDrosophilaHB9 Is Expressed in a Subset of Motoneurons and Interneurons, Where It Regulates Gene Expression and Axon Pathfinding.\u003c/em\u003e The Journal of Neuroscience, 2002. \u003cstrong\u003e22\u003c/strong\u003e(21): p. 9143.\u003c/li\u003e\n\u003cli\u003eNguyen, D.N.T., M. Rohrbaugh, and Z.-C. Lai, \u003cem\u003eThe Drosophila homolog of Onecut homeodomain proteins is a neural-specific transcriptional activator with a potential role in regulating neural differentiation.\u003c/em\u003e Mechanisms of Development, 2000. \u003cstrong\u003e97\u003c/strong\u003e(1): p. 57-72.\u003c/li\u003e\n\u003cli\u003eMatsuda, T., et al., \u003cem\u003eEffects of overexpression of mitochondrial transcription factor A on lifespan and oxidative stress response in Drosophila melanogaster\u003c/em\u003e. 2013, Elsevier. p. 717-721.\u003c/li\u003e\n\u003cli\u003eEmiliano, T., et al., \u003cem\u003eGene expression is the main driver of purifying selection in large penguin populations.\u003c/em\u003e bioRxiv, 2023: p. 2023.08.08.552445.\u003c/li\u003e\n\u003cli\u003eLove, M.I., W. Huber, and S. Anders, \u003cem\u003eModerated estimation of fold change and dispersion for RNA-seq data with DESeq2\u003c/em\u003e. 2014, Springer. p. 550.\u003c/li\u003e\n\u003cli\u003eGrau-Bov\u0026eacute;, X., et al., \u003cem\u003eEvolution of the Insecticide Target Rdl in African Anopheles Is Driven by Interspecific and Interkaryotypic Introgression.\u003c/em\u003e Molecular Biology and Evolution, 2020. \u003cstrong\u003e37\u003c/strong\u003e(10): p. 2900-2917.\u003c/li\u003e\n\u003cli\u003eMiles, A. and N.J. Harding, \u003cem\u003escikit-allel: A Python package for exploring and analysing genetic variation data\u003c/em\u003e. 2016.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Anopheles, malaria, transcriptomics, insecticide resistance, pyrethroids, selective sweep, purifying selection, regulatory network","lastPublishedDoi":"10.21203/rs.3.rs-3910702/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-3910702/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eMalaria control faces challenges from widespread insecticide resistance in major \u003cem\u003eAnopheles\u003c/em\u003e species. This study, employing a cross-species approach, integrates RNA-Sequencing, whole-genome sequencing, and microarray data to elucidate drivers of insecticide resistance in \u003cem\u003eAnopheles gambiae\u003c/em\u003e complex and \u003cem\u003eAn. funestus\u003c/em\u003e. Findings show an inverse relationship between genetic diversity and gene expression, with highly expressed genes experiencing stronger purifying selection. These genes cluster physically in the genome, revealing potential coordinated regulation. We identified known and novel candidate insecticide resistance genes, enriched in metabolic, cuticular, and behavioural functions. We also present AnoExpress, a Python package, and an online interface for user-friendly exploration of resistance candidate expression. Despite millions of years of speciation, convergent gene expression responses to insecticidal selection pressures are observed across \u003cem\u003eAnopheles\u003c/em\u003e species, providing crucial insights for malaria vector control. This study culminates in a rich dataset that allows us to understand molecular mechanisms, better enabling us to combat insecticide resistance effectively.\u003c/p\u003e","manuscriptTitle":"Genomic Profiling of Insecticide Resistance in Malaria Vectors: Insights into Molecular Mechanisms.","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-02-14 06:02:42","doi":"10.21203/rs.3.rs-3910702/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"08073093-85e1-4856-95b7-1c44668e813c","owner":[],"postedDate":"February 14th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":28741810,"name":"Biological sciences/Genetics/Gene expression"},{"id":28741811,"name":"Biological sciences/Genetics/Genetic association study"}],"tags":[],"updatedAt":"2024-02-14T06:02:50+00:00","versionOfRecord":[],"versionCreatedAt":"2024-02-14 06:02:42","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-3910702","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-3910702","identity":"rs-3910702","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

Text is read by the "Ask this paper" AI Q&A widget below. Extraction quality varies by source — PMC NXML preserves structure cleanly, OA-HTML may include some navigation residue, and OA-PDF can have broken hyphenation. The publisher copy (via DOI) is the canonical version.

My notes (saved in your browser only)

Ask this paper AI returns verbatim quotes from the full text · source: preprint-html

Answers must be backed by verbatim quotes from this paper's full text. Hallucinated quotes are dropped automatically; if no verbatim passage answers the question, we say so. How this works

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2024) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-19T01:45:01.086888+00:00