Weak genetic divergence and signals of adaptation obscured by high gene flow in an economically important aquaculture species

Research Article

Bernarda Calla, Jingwei Song, Neil Thompson

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-5418899/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 05 Feb, 2025, in BMC Genomics.

Abstract

Background: The genetic diversity of a population defines its ability to adapt to episodic and fluctuating environmental changes. For species of agricultural value, available genetic diversity also determines their breeding potential and remains fundamental to the development of practices that maintain health and productivity.
In this study, we used whole-genome resequencing to investigate genetic diversity within and between naturalized and captively reared populations of Pacific oysters from the US Pacific coast. The analyses included individuals from preserved samples dating to 1998 and 2004, two contemporary naturalized populations, and one domesticated population.

Results: Despite high overall heterozygosity, there was extremely low but significant genetic divergence between populations, indicative of high gene flow. The captive population, which was reared for over 25 years, was the most genetically distinct population and exhibited reduced nucleotide diversity, attributable to inbreeding. Individuals from populations that were separated both geographically and temporally did not show detectable genetic differences, illustrating the consequences of human intervention in the form of translocation of animals between farms, hatcheries and natural settings. Fifty-nine significant FST outlier sites were identified, the majority of which were present in high proportions of the captive population individuals, and which are possibly associated with domestication.

Conclusion: Pacific oysters on the US Pacific coast harbor high genetic heterozygosity, which obscures weak population structure. Differences between these Pacific oyster populations could be leveraged for breeding and might be a source of adaptation to new environments.

Keywords: gene flow, divergence, Pacific oysters, adaptation, captivity, inbreeding, diversity

Background

The genetic diversity of a population determines its ability to adapt to episodic and fluctuating environmental changes. For domesticated species, including those used for agriculture, genetic diversity is a key component of their potential for genetic selection and for adaptation to non-natural environments.
Under this view, comprehensive knowledge of the gene pool of a species remains fundamental for developing practices that maintain its health and productivity. This knowledge may also inform the establishment of founder populations for broodstock and enable the measurement and use of the organism's inherent resilience. In addition, since the genetic diversity of a species is the result of an interplay between gene flow, genetic drift, and selection, it offers a window into its evolutionary and demographic history.

The Pacific oyster Magallana gigas (previously Crassostrea gigas, Thunberg, 1793), a species native to the Pacific Coast of Asia, was introduced to the Pacific Coast of the United States through multiple independent transplants from Northeast Japan occurring mainly during the period between 1902 and 1960 [1]. These introductions followed the depletion of native oyster (Ostrea lurida) populations due to overharvesting and pollution [1, 2]. Because of the inherent fast growth and high reproductive potential of M. gigas, this species quickly became established in commercial aquaculture and is now the most cultivated and consumed shellfish species in the U.S. and around the world. Annual production of Pacific oysters in the U.S. reached 2,433 metric tons in 2021 (https://www.fisheries.noaa.gov/). In the 1980s, efforts focused on developing local Pacific oyster hatcheries that could satisfy the demand for seed, eliminating the dependency on imports from Japan and on seed collections from natural settlements occurring in limited areas of the North American Pacific Coast [3]. The Oregon State University Molluscan Broodstock Program (OSU-MBP) was initiated in 1996 using multiple collections of naturalized Pacific oysters from Willapa and Dabob bays (Washington, US) and from Pipestem Inlet, Vancouver Island (British Columbia, Canada).
These founders were used to establish a hatchery-domesticated population maintained through artificial spawning and selected for yield traits and, more recently, for survival to OsHV-1 [4–7].

The unique history of the origin, introduction, naturalization, and domestication of M. gigas in the United States Pacific Northwest, together with the availability of genetic material from both “museum” specimens (i.e., samples collected and preserved since the 1990s) and contemporary naturalized and captive populations, offers an opportunity to evaluate the imprint of environmental challenges and of hatchery and selection practices on the genetic pattern within this species. In-depth knowledge of the available genetic diversity and population structure of the Pacific oyster is also paramount for understanding its potential to adapt to existing and future environmental challenges. Because of the repeated introductions that might have resulted in genetic remixing and/or bottlenecks, and due to hatchery and out-planting practices, M. gigas genomic variability and population structure on the U.S. Pacific West Coast could be complex [8]. In addition, typical population genetic assumptions may not hold for this species due to its high fecundity, high larval dispersion, and variable reproductive success [9, 10]. M. gigas is known to harbor high levels of DNA polymorphism and heterozygosity [11] and to exhibit “reproductive sweepstakes”, which limits effective population size in specific cohorts [10], affects Mendelian segregation ratios via null alleles, and increases genetic load [12, 13].

Several studies have documented genetic variability in M. gigas from different regions of the native and introduced species range. Zhang et al. [14] assembled a reference genome from a 4th generation inbred M.
gigas individual and identified over 3 million SNPs by comparing reads from a wild oyster against this newly assembled genome. This study had the obvious limitation that a single re-sequenced individual cannot represent variability between or within populations. More recent studies have used larger numbers of individuals spanning geographical areas. For example, Qi et al. [15] documented 2.7 million SNPs by resequencing 472 Pacific oyster individuals from China, Japan, Korea, and Canada. A subset of the identified SNPs was used to construct a high-density genotyping array [15]. Variation in European M. gigas populations was investigated by sequencing 203 individuals from eight different geographical sources to build a medium-density array that also contained SNPs from Ostrea edulis (the European flat oyster); the study identified 12.4 million SNPs [16]. Vendrami et al. [17] reported high genetic differentiation between southern and northern Pacific oyster populations from Western Europe using a SNP array. Low effects of translocation (movement of juveniles between natural settings and farms) and of farm and hatchery practices on genetic connectivity between populations were identified for Canadian Pacific Northwest populations using roughly 17,000 markers identified via RAD-seq [18]. Within the native range, low-coverage whole genome resequencing identified 12.2 million SNPs among two wild populations from Northeast Japan and Northeast China and two breeding populations derived from each wild collection [19]. Other research has found low genetic diversity and low population differentiation based on over 2 million SNPs in seven wild populations of M. gigas in Dalian, China, the major producer of Pacific oysters in the region [20]. To date, no studies have documented genetic variation among populations from the U.S.
Pacific Coast using whole genome resequencing. Available studies for this geographical range have been restricted to those using a limited number of markers (e.g., allozymes and low-coverage SNP markers) [9] or reduced-representation methods (e.g., RAD-seq) [18]. The main factor precluding whole genome scans that involve a high number of samples is cost, but low-coverage sequencing, when paired with the appropriate analysis tools, can provide a cost-effective way of measuring diversity with larger sample sizes.

In the present study, we obtained a catalog of current genetic diversity in M. gigas populations along the U.S. Pacific Coast, including naturalized and aquaculture populations, as well as founders and contemporary populations. Our aim was to evaluate available genetic diversity and to identify potential signatures of genetic selection resulting from either naturalization to the U.S. Pacific Coast or artificial selection and hatchery practices. For this purpose, we collected 100 individuals from five temporally and geographically segregated populations of M. gigas, including individuals that were part of the OSU-MBP collections, as well as naturalized individuals from Willapa Bay, Washington, and from a presumably self-recruiting population in San Diego Bay, California.

Methods

Sample collection: Pacific oyster samples from the U.S. Pacific Coast were obtained from five populations (Fig. 1). Adductor or mantle tissue was taken from each animal and preserved in liquid nitrogen or 95% ethanol. U.S. naturalized Pacific oysters were sampled from two sites: Willapa Bay, Washington (WB), and San Diego Bay, California (SD). Animals from the WB population were obtained from an oyster bed (“Parcel A”) near Nahcotta, Washington, in 2022.
SD animals were collected from a pier located approximately 125 m south of Tuna Harbor in San Diego Bay; this location experienced oyster mass mortalities due to Ostreid Herpesvirus microvariant (OsHV-1 µvar) outbreaks in 2018 and 2020 [21]. Individuals for this study were collected in July 2020 while the mortality event was occurring, but sampled animals were not exhibiting symptoms of infection or stress. Two other populations were obtained from OSU-MBP. MBP cohort 6 (MBP6) samples were obtained from a collection of preserved individuals dating to 1998; these oysters were naturalized animals collected from Dabob Bay in 1996 and used as broodstock to initiate, in part, the OSU-MBP selective breeding program. The MBP cohort 30 (MBP30) samples were juveniles produced in 2021; these animals represent the seventh generation of the MBP and have a mixed lineage (founders were from different locations, and descendant families were widely intercrossed) [4, 6]. Both MBP cohorts belong to the “Miyagi” population, which was repeatedly introduced to the U.S. from Northeast Japan starting in the 1920s [1]. A fifth population (MID) was obtained from preserved individuals that were collected in 2004 from the Kumamoto Prefecture in Southern Japan by the OSU-MBP. The samples were taken from originally collected individuals (G0), which were later spawned to create the “Midori” population [22]. This last population (MID) has undergone little to no selection and is currently cultured widely along the U.S. Pacific coast.

Whole genome resequencing

DNA extraction, library construction, and sequencing were carried out by the OSU Center for Quantitative Life Sciences. Libraries were constructed using the PrepX DNA library prep kit (Takara) and sequenced on an Illumina NextSeq 2000 with a P3 flow cell to obtain 150 bp paired-end reads. The target read coverage was approximately 5.6X per sample.
Variant calling

The resulting sequencing reads were checked for quality with FASTQC v.0.11.9 [23]; reads with trimmed adapters were then mapped to the reference Magallana gigas genome (NCBI GenBank # GCA_902806645.1) [24] using BWA v. 0.7.17-r1188 with the mem algorithm. For compatibility with the downstream pipeline, read groups (RG) were added to the BAM files with ‘AddOrReplaceReadGroups’ from Picard v. 2.27.1 (https://broadinstitute.github.io/picard/). Mapping quality and coverage depth were evaluated with Qualimap v. 2.3 [25]. The following steps were completed using the tools and Best Practices workflow from GATK v.4.1.4.1 [26, 27]. Duplicates were marked with MarkDuplicatesSpark, the GATK Spark implementation of Picard’s MarkDuplicates; this process also sorted the records by coordinate. Next, SNPs and indels were called with HaplotypeCaller, generating GVCF files for each sample. All GVCFs were then merged into a database with GenomicsDBImport, first without and then with the “all-sites” option to maintain both variant and invariant sites. Joint genotyping was done with GenotypeGVCFs using the database generated in the previous step as input. For memory efficiency when using the “all-sites” flag, both GenomicsDBImport and joint genotyping were run on intervals created by splitting the reference genome into 18 segments. All files with called genotypes were then merged into ten separate chromosome files plus one file containing the 236 unmapped scaffolds for the remaining steps. Filtering strategies were explored in a subsample of the called genotypes using vcftools [28] and the ggplot package in R v.4.3.1. Filtering was done in two separate steps following GATK best practices recommendations.
First, a hard filtering step was applied using the following cutoffs: Fisher Strand > 60.0; Strand Odds Ratio (SOR) > 3; RMS Mapping Quality (MQ) < 40; Mapping Quality Rank Sum Test (MQRankSum) 12; Quality by Depth (QD) < 2.0; Read Position Rank Sum (ReadPosRankSum) 500. A second filtering step was applied to remove individual genotypes with poor values (based on by-sample fields in the vcf file), namely Read Depth (FMT/DP) < 3 and Genotype Quality likelihood (FMT/GQ) < 20, and genotypes which, after filtering off sites in the previous step, had all remaining sites missing. In addition, SNPs within ten bp of indels and SNPs with more than two alleles were removed (--SnpGap 10; -m2 -M2). The file containing invariants was filtered separately to preserve invariant and variant sites as input to the Pixy software. To test deviation from HWE, we used the exact test as defined in Wigginton et al. [29] and implemented in vcftools v0.1.17 (“--hardy”). The max-missing flag filter was set to allow no more than 10% missing data. P-values were adjusted based on false discovery rate (FDR) [30]. SNP loci with an FDR < 0.1 were considered significant.

Genetic diversity and population structure

To evaluate genetic structure and relatedness between the populations, linked SNPs were first removed with PLINK2 [31] using --indep-pairwise 50 5 0.5. The resulting pruned set was used in a Principal Component Analysis (PCA). To assess the proportion of ancestry and relatedness from each population, we used ADMIXTURE v. 1.3.0 [32, 33] with the PLINK pruned output; the optimal K-value was evaluated with the ADMIXTURE cross-validation feature with K ranging from 1 to 7. Average nucleotide diversity within populations (π) and average divergence between populations (dxy) were estimated with Pixy v.2.1.7 beta [28] using non-overlapping 10 kb windows on the vcf file that had both variant and invariant sites.
Per-population π and pairwise dxy averages were calculated by adding the raw by-window counts of pairwise differences between genotypes and then dividing by the sum of total raw by-window non-missing sites. Unbiased Weir and Cockerham FST [34] and Nei’s genetic distances [35] between populations were estimated with StAMPP v. 1.6.3 [36]. The significance of the FST values was calculated with 95% confidence intervals over 100 bootstrap replications. An Analysis of Molecular Variance (AMOVA) was run with 1000 permutations to estimate variance partitioning with StAMPP.

Outlier detection

FST outliers were analyzed with OutFLANK v 0.2 [37] using SNPs that were present in at least 80% of the samples. To infer the χ² distribution against which the outliers were tested, the tails of the FST distribution were trimmed with the OutFLANK default trimming values, and low-heterozygosity loci were also removed (retaining loci with He > 0.1). The detected outliers were mapped back to the chromosomes in the reference genome assembly and were functionally annotated with SnpEff v. 5.1d [38]. Annotations were manually verified for accuracy by querying the NCBI databases.

Results

One hundred individuals from five populations (WB, SD, MID, MBP6, and MBP30) were sequenced (n = 20 per population) (Fig. 1). The 2.5 million mapped reads resulted in a mean sample coverage of 5X. The GC content of mapped reads was 34.93%, and the mean mapping quality per sample (BAM QC) was 36.76.

Variant calling

The initial SNP calling by GATK contained over 100 million variants, including indels and SNPs. After site-level filtering, sample-level filtering, removal of indels, and removal of repetitive regions, there were a total of 57.9 million biallelic SNPs, of which 20.3 million had a MAF > 1% and were not singletons (SNP density 31.3 SNPs per kb of the genome). Only 230 loci deviated significantly from HWE in all five sampling locations. No loci were removed based on the HWE test.
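The FDR adjustment applied to the HWE exact-test p-values in Methods can be sketched as follows. This is a minimal illustration that assumes the Benjamini-Hochberg procedure (the text cites an FDR method without naming it); the function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the FDR correction in the text is the BH step-up procedure.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-locus HWE exact-test p-values (illustration only).
hwe_p = [0.0001, 0.004, 0.03, 0.2, 0.9]
q_values = benjamini_hochberg(hwe_p)

# Loci flagged at the FDR < 0.1 threshold used in the text.
significant = [i for i, q in enumerate(q_values) if q < 0.1]
```

In practice the p-values would be read from the per-locus output of the vcftools HWE test rather than typed in by hand.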
Principal component analysis using the filtered SNP set after pruning linked sites identified three principal components explaining 33.5% of the total variation (PC1, PC2, and PC3). PC1 explained 13.4% of the variation, whereas PC2 and PC3 explained an additional 10.4% and 9.75% of the variation, respectively. PC1 and PC2 both captured the variation between MBP30 and all other populations; these two PCs also showed high overlap between the MID and SD populations and showed that the MBP6 and WB populations were indistinguishable in the PC space. PC3 separated MID from SD more distinctly (Fig. 2). An ADMIXTURE analysis showed no population substructure: the lowest cross-validation error was found at K = 1 (Fig. 3A). However, a result more congruent with the PCA was seen when using a subsample of 10% of the raw (unfiltered) SNPs. In that analysis, the Q matrix from ADMIXTURE did show that MBP30 could be distinguished from the other populations, while the MID and SD populations may share a genetic history that separates them slightly from the rest of the populations (with K = 3) (Fig. 3B), indicating that weak population structure signals might be lost when rare variants are removed through MAF filtering [39].

Genetic variation and differentiation

To further characterize genetic diversity in the samples, we calculated weighted, unbiased π values across the genome in 10 kb windows using the dataset containing both variant and invariant sites (--allsites output from GATK). MBP30 had the lowest within-population nucleotide diversity (π), equal to 0.0043, compared to MBP6 = 0.0079; MID = 0.00786; SD = 0.00738; and WB = 0.00779. This difference was consistent across the whole genome (Fig. 4). Population differentiation, as measured using overall pairwise Weir & Cockerham's FST, showed low divergence between populations. MBP30 had the highest divergence, with FST values between 0.0097 (with WB) and 0.0145 (with MID) (Table 1, lower triangle).
The lowest FST value was found between MBP6 and WB (0.0004). All FST values in all the pairwise comparisons were highly significant, with p-values < 0.001 obtained by sampling with replacement (100x) and correcting for multiple testing with the Bonferroni method. In general, divergence (FST) closely resembled the PCA results. Dxy was also used to compare population differentiation; the results showed, again, that MBP30 has the highest divergence from all the other populations. However, pairwise dxy values across all the pairwise comparisons were higher and less variable than FST (dxy values ranging between 0.07717 and 0.09627; Table 1, upper triangle), indicating that the low within-population nucleotide diversity in MBP30 relative to that of other populations might be a major driver for the divergence results. An AMOVA, to test for partitioning of the genetic variation, found significant differentiation between populations but not within any of the populations (Table 2).

Table 1 Measures of genetic divergence. Lower triangle: Weir & Cockerham's FST; upper triangle (bold): dxy.

      | SD       | WB       | MBP30    | MBP6     | MID
SD    |          | 0.080359 | 0.096269 | 0.079779 | 0.080080
WB    | 0.003257 |          | 0.091172 | 0.078181 | 0.077920
MBP30 | 0.011596 | 0.009732 |          | 0.090906 | 0.089730
MBP6  | 0.003245 | 0.000396 | 0.010290 |          | 0.077172
MID   | 0.002795 | 0.004311 | 0.014500 | 0.005008 |

Table 2 AMOVA

                    | SSD        | MSD    | df | sigma2   | P.value
Between populations | 0.00360666 | 0.0009 | 4  | 1.84E-05 | 0
Within populations  | 0.05077457 | 0.0005 | 95 | 5.34E-04 |
Total               | 0.05438122 | 0.0005 | 99 |          |

Outlier analyses: The OutFLANK software, which conservatively fits the distribution of FST from neutral loci to a χ² distribution, identified 59 top outlier candidate SNPs. These outliers were mapped in the reference genome and found to be distributed across nine of the ten contiguous chromosomes and in seven out of the 236 non-assembled scaffolds (Fig. 5).
MBP30 had the most individuals with outlier SNPs as heterozygous or homozygous alternative allele (51 out of 59), followed by SD (41 out of 59), whereas MBP6 and WB had the fewest SNPs as heterozygous or homozygous ALT alleles (Fig. 6). A functional annotation analysis of the genomic regions where the top outliers were found shows that 39% of these sites are in transcribed regions (protein and non-protein coding), and the remaining 61% are in non-coding regions. SNPs that translate into non-synonymous variants within protein-coding sequences, which could be considered the most consequential, include two missense variants, one in a transcript coding for “tripartite motif-containing protein 2” (XM_034452193.1) in Chromosome 10 (only present in individuals from the MBP30 population), and another in a transcript coding for “histone-lysine N-methyltransferase SETMAR-like” (XM_034451927.1) in Chromosome 8 (present in more than half of MBP30 individuals and only in up to 3 individuals from each of the other populations). Additionally, eleven outliers (18% of the total) mapped to five long non-coding RNA (lncRNA) loci (Table 3). Annotation of outliers also included genomic regions within 5 kb of the outlier if the outlier did not fall directly in a coding region. Those were classified as “upstream” or “downstream” gene variants following the SnpEff annotation scheme. Half of the outliers that did not map into a coding region were within those categories (Table 3).

Table 3 FST outliers with position and annotations

Outlier Number | Chr. Number | POS | variant type | Nearest coding sequence NCBI refseq ID | sequence type* | functional annotation*
1 | Chr: 1 | 7680836 | synonymous_variant | XM_034445312.1 | transcript | homeobox protein 6
2 | Chr: 1 | 33625290 | downstream_gene_variant | XM_011422575.2 | transcript | tubulin-specific chaperone C-like
3 | Chr: 1 | 37904677 | intragenic_variant | XR_004595737.1 | non-protein coding | lncRNA
4 | Chr: 1 | 37904740 | intragenic_variant | XR_004595737.1 | non-protein coding | lncRNA
5 | Chr: 1 | 37904809 | intragenic_variant | XR_004595737.1 | non-protein coding | lncRNA
6 | Chr: 1 | 55548026 | intergenic_variant | LOC105335559 | transcript | HCLS1-binding protein 3
7 | Chr: 2 | 33394675 | upstream_gene_variant | XR_004601088.1 | transcript | uncharacterized LOC117688662
8 | Chr: 2 | 39522736 | downstream_gene_variant | XM_034466364.1 | transcript | epidermal growth factor-like domains
9 | Chr: 2 | 51858408 | intergenic_variant | | |
10 | Chr: 2 | 59631568 | intron_variant | XM_034465519.1 | transcript | uncharacterized protein
11 | Chr: 3 | 27743234 | intron_variant | LOC105329645 | transcript | sentrin-specific protease 1
12 | Chr: 3 | 30213493 | intergenic_variant | | |
13 | Chr: 3 | 47602303 | downstream_gene_variant | XR_004601319.1 | transcript |
14 | Chr: 4 | 12185248 | intergenic_variant | | |
15 | Chr: 4 | 22791392 | intron_variant | LOC15343840 | transcript | growth hormone-regulated TBC protein 1-A
16 | Chr: 5 | 263654 | intron_variant | XM_034479567.1 | transcript | platelet glycoprotein Ib alpha chain
17 | Chr: 5 | 39465652 | downstream_gene_variant | XM_034475582.1 | transcript | uncharacterized protein
18 | Chr: 5 | 41465770 | upstream_gene_variant | XM_011438014.3 | transcript | cell division cycle 5-like protein
19 | Chr: 5 | 41912112 | intron_variant | XR_010714176.1 | non-protein coding | lncRNA
20 | Chr: 6 | 49280926 | upstream_gene_variant | XM_011427417.3 | transcript | E3 ubiquitin-protein ligase CHFR
21 | Chr: 7 | 30647 | intergenic_variant | | |
22 | Chr: 7 | 6307936 | intron_variant | LOC105329705 | transcript | uncharacterized protein
23 | Chr: 8 | 120446 | intergenic_variant | | |
24 | Chr: 8 | 3525666 | intron_variant | XM_034452193.1 | transcript | tripartite motif-containing protein 3-like
25 | Chr: 8 | 38861094 | downstream_gene_variant | XM_034448703.1 | transcript | uncharacterized protein
26 | Chr: 8 | 40231691 | missense_variant | XM_034451927.1 | transcript | histone-lysine N-methyltransferase SETMAR-like
27 | Chr: 8 | 46762427 | downstream_gene_variant | XR_002201338.2 | transcript |
28 | Chr: 8 | 52522525 | upstream_gene_variant | XM_034449879 | transcript | uncharacterized protein
29 | Chr: 10 | 20830318 | upstream_gene_variant | XM_034455499.1-1 | transcript | protein FAM133A-like
30 | Chr: 10 | 33311896 | missense_variant | XM_011451359.3 | transcript | tripartite motif-containing protein 2
31 | Chr: 10 | 41514923 | upstream_gene_variant | XM_034455021.1 | transcript | opine dehydrogenase
32 | NW_022994786.1 | 28721 | intergenic_variant | | |
33 | NW_022994786.1 | 35777 | intergenic_variant | | |
34 | NW_022994786.1 | 62891 | downstream_gene_variant | XM_011420943.3 | transcript | uncharacterized LOC105322303
35 | NW_022994801.1 | 28046 | intergenic_variant | | |
36 | NW_022994829.1 | 71361 | downstream_gene_variant | XR_004599051.1 | transcript | lncRNA
37 | NW_022994829.1 | 80271 | intron_variant | XR_004599053.1 | transcript | lncRNA
38 | NW_022994829.1 | 124001 | intron_variant | XM_034459954.1 | transcript | uncharacterized protein LOC105341878
39 | NW_022994829.1 | 124008 | intron_variant | XM_034459954.1 | transcript | uncharacterized protein LOC105341878
40 | NW_022994852.1 | 173056 | downstream_gene_variant | XM_034460498.1 | transcript | uncharacterized protein LOC117685913
41 | NW_022994852.1 | 173152 | downstream_gene_variant | XM_034460498.1 | transcript | uncharacterized protein LOC117685913
42 | NW_022994852.1 | 303488 | intragenic_variant | XR_004599282.1 | non-protein coding | lncRNA
43 | NW_022994852.1 | 303501 | intragenic_variant | XR_004599282.1 | non-protein coding | lncRNA
44 | NW_022994852.1 | 304348 | intragenic_variant | XR_004599282.1 | non-protein coding | lncRNA
45 | NW_022994852.1 | 313977 | intergenic_variant | | |
46 | NW_022994852.1 | 314413 | intergenic_variant | | |
47 | NW_022994852.1 | 319933 | intergenic_variant | | |
48 | NW_022994852.1 | 328132 | intergenic_variant | | |
49 | NW_022994864.1 | 196193 | intergenic_variant | | |
50 | NW_022994865.1 | 3222987 | intergenic_variant | | |
51 | NW_022994890.1 | 32136 | upstream_gene_variant | XM_034461254.1 | transcript | zinc finger protein 862-like isoform X1
52 | NW_022994890.1 | 37030 | intergenic_variant | | | zinc finger protein 862-like isoform X1
53 | NW_022994890.1 | 42354 | intergenic_variant | | | zinc finger protein 862-like isoform X1
54 | NW_022994890.1 | 42663 | intergenic_variant | | | zinc finger protein 862-like isoform X1
55 | NW_022994890.1 | 73226 | upstream_gene_variant | LOC105348138 | transcript | uncharacterized protein LOC105348138
56 | NW_022994890.1 | 122131 | intron_variant | LOC117686395 | transcript |
57 | NW_022994890.1 | 127673 | intragenic_variant | XR_004599579.1 | non-protein coding | lncRNA
58 | NW_022994890.1 | 127710 | intragenic_variant | XR_004599579.1 | non-protein coding | lncRNA
59 | NW_022994940.1 | 16981 | intron_variant | LOC105324010 | transcript | uncharacterized protein LOC105324010

Discussion

Magallana gigas, like other bivalves, is known to have extremely high levels of genetic polymorphism. While this variability could be a source of adaptability and a resource for genetic selection, it cannot be easily characterized since it is strongly affected by a combination of life history traits (e.g., high fecundity, high larval dispersal, variable sex ratios, variable reproductive success) and the unique demographic history of this species (i.e., M. gigas has been artificially transported, distributed, domesticated, and naturalized across the world). In the present study, we set out to investigate the genetic diversity that exists in M. gigas populations on the US Pacific Coast to understand the demographic history, the potential for genetic selection, and the effects of naturalization and domestication on this economically and ecologically important species. We detected high heterozygosity through the identification of roughly 20 M SNPs by resequencing one hundred individuals from five different populations from the U.S. Pacific Coast.
This is the highest number of variant sites in this species recorded to date, with the second highest from a study that identified 12.2 M SNPs by evaluating populations from a selective breeding program in China together with two wild endemic populations from Japan and China [19]. The study by Hu et al. [19] is among the few that also used whole-genome resequencing, while previous studies might have been limited by either the method, the number of sites tested, or both.

Although we found a high number of single-site variations, the divergence between populations was low. The 2021 cohort of selectively bred animals (MBP30) was the most divergent, whereas the other populations, except for WB and MBP6, could also be distinguished based on their genetic makeup but with overall signals of weak population divisions. These results were supported across multiple tests: PCA showed population subdivisions, while ADMIXTURE showed weak structure that appears to get lost when stringently filtering out rare alleles. Independent estimators of divergence (FST and dxy) also indicated low but significant differentiation between populations. By-population FST showed weak but highly significant differentiation (0.00973–0.01450, P < 0.0001) between MBP30 and the other four populations. These FST values are comparable to those calculated by Sun et al. [9] between a population of M. gigas from Hiroshima (Japan) and six from North America (FST = 0.0151–0.0212), and slightly lower than values from a study using 33 loci where multiple MBP cohorts clustered distinctly away from naturalized Pacific coast populations (FST = 0.0218) [40]. Likewise, a study involving a hatchery population of mixed lineages from the US, British Columbia, and South America showed high divergence from local naturalized and other hatchery populations [41]. However, divergence estimates in the latter study were much larger than those obtained in the present work (average FST = 0.06).
In another study [ 20 ], the authors obtained F ST estimates comparable to those in our study when comparing populations in the Dalian Sea, a prominent aquaculture region in China.

MBP30 has been reared in captivity with controlled reproduction for over 25 years. While there is no record of intentional new introductions of animals to minimize inbreeding during those years, the founding broodstock used to initiate the MBP has been mixed with broodstock from other localities and has undergone multiple mating schemes [ 6 ]. MBP30 also has the lowest nucleotide diversity (average and by window or chromosome), possibly as a result of inbreeding and/or a reduction in effective population size, since the number of crosses per cohort ranged from 24 to 85 [ 6 ]. This result is not unexpected given the domestication history of MBP30 and the challenges in maintaining high genetic diversity and low rates of inbreeding accumulation in aquatic animal breeding programs.

In this study, we used the Weir and Cockerham F ST, which (like all other F ST statistics) is a relative measure of population divergence that depends largely on current within-population diversity (π). Conversely, dxy is independent of extant population diversity and better reflects relationships between populations that arise from shared ancestry [ 42 ]. The dxy estimates generally supported the F ST results, but the values were more similar across pairs of populations involving MBP30, indicating that the lower nucleotide diversity within MBP30 is what causes, at least in part, the elevated F ST values in pairs including MBP30. An AMOVA based on Nei’s genetic distances also supported significant genetic variability between populations. Although AMOVA does not allow for dissecting the specific pairs that diverge, all other results suggest that MBP30 is likely the major driver of such variability.
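The contrast between the relative measure (F ST) and the absolute measure (dxy) can be illustrated with a small numerical sketch. The Python below is not the pipeline used in this study. It assumes biallelic sites with known allele frequencies, uses hypothetical helper names (`pi_site`, `dxy_site`, `summarize`), and substitutes a simple Hudson-style ratio, F ST = (dxy - mean π) / dxy, for the Weir and Cockerham estimator. It is meant only to show that lowering diversity in one population raises F ST even when dxy does not change.

```python
# Toy illustration (assumed allele frequencies, not data from this study).

def pi_site(p):
    """Within-population diversity at one biallelic site: 2p(1-p)."""
    return 2.0 * p * (1.0 - p)


def dxy_site(p1, p2):
    """Absolute divergence at one site: chance that two alleles, one drawn
    from each population, differ."""
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)


def summarize(freqs_a, freqs_b):
    """Average pi, dxy, and a Hudson-style F ST over sites.

    Note: this ratio is a simplification and is NOT the Weir and
    Cockerham estimator.
    """
    n = len(freqs_a)
    pi_a = sum(pi_site(p) for p in freqs_a) / n
    pi_b = sum(pi_site(p) for p in freqs_b) / n
    dxy = sum(dxy_site(a, b) for a, b in zip(freqs_a, freqs_b)) / n
    fst = (dxy - 0.5 * (pi_a + pi_b)) / dxy if dxy > 0 else 0.0
    return {"pi_a": pi_a, "pi_b": pi_b, "dxy": dxy, "fst": fst}


# Population A: intermediate frequencies at four sites.
pop_a = [0.5, 0.5, 0.5, 0.5]
# Population B, same diversity as A.
pop_b_diverse = [0.5, 0.5, 0.5, 0.5]
# Population B after losing diversity: two sites have drifted to fixation.
pop_b_reduced = [1.0, 0.0, 0.5, 0.5]

baseline = summarize(pop_a, pop_b_diverse)
reduced = summarize(pop_a, pop_b_reduced)

print("baseline:", baseline)
print("reduced: ", reduced)
```

In this toy case dxy stays at 0.5 in both comparisons, while the ratio rises from 0 to 0.25 once two of the four sites are fixed in the second population. The change in F ST comes entirely from the drop in within-population diversity.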
None of the tests we performed detected differences between MBP6 (the broodstock collected from Dabob Bay in 1998) and WB, the contemporary, naturally recruiting population from Willapa Bay collected in 2022. This pair of populations had the lowest F ST (0.000396, P < 0.0001) and overlapped substantially in the PCA. This is an unexpected result since these two populations are separated geographically and temporally, and it underscores the challenge of characterizing population genetics in this species. If we assume two generations per year and that the standardized temporal variation (Fs) [ 43 ] is expected to be double the F ST between two populations [ 9 ], temporal change between these two populations is still very low (Fs 0.0096). Although Dabob Bay and Willapa Bay are geographically segregated, there was and continues to be extensive human-led movement of animals between those two locations. These results contrast with the study by Sun and Hedgecock [ 9 ], which detected large temporal genetic divergence among populations collected in Dabob Bay in 1985, 1996, and 2006. A more direct comparison using contemporary samples from Dabob Bay would be useful to clarify these discrepancies.

The SD population, assumed to be a naturalized, self-recruiting population sampled from San Diego Bay, CA, clustered tightly with the MID population, which has been in captivity as part of the OSU-MBP in Newport, Oregon, since its collection in Japan in 2004. ADMIXTURE analysis using all SNPs (before filtering for indels and rare alleles) supports that these two populations share some weak genetic ancestry. This finding was unexpected, and one possible explanation is that animals from the same region in Japan from which MID originated were somehow moved to San Diego Bay. Unintentional transport via ship ballast and fouling is plausible since San Diego is one of the most active ports on the U.S. West Coast.
The data suggest that the progenitors of SD were more likely from Southern Japan near Kumamoto than from the Miyagi prefecture in Northeastern Japan, where the bulk of seed imports originated in the previous century [ 1 ]. In addition, since samples were collected for this study, the assumption that the SD population is self-recruiting has been questioned. There is at least one report of Pacific oysters established in southern California, although it was unclear at the time of that report whether these clusters of established oysters would persist, since many intentional attempts to introduce Pacific oysters in that region had failed in the past [ 44 ]. Recently, an in-depth report of feral M. gigas presence on the U.S. Pacific Coast showed that this species has increased in abundance in Southern California and pointed to possible permanent establishment and dispersal [ 45 ]. An investigation into the self-recruiting ability of animals from San Diego Bay is warranted but outside the scope of this study. Importantly, given the population genetic characteristics of SD found here (i.e., low but marked genetic divergence from other populations), this location could be a source of genetic diversity exploitable for breeding purposes.

To evaluate F ST outliers, we used the OutFLANK algorithm developed by Whitlock and Lotterhos [ 37 ], which is based on the first formal method for detecting F ST outliers [ 46 ]. Importantly, OutFLANK does not assume that populations are independent, allowing population pairs to exchange migrants and to have shared evolutionary histories, which we deemed appropriate in this study. MBP30 had the most individuals with identified outlier sites, while some outliers were present only in some populations and not in others. Several outliers fell into coding regions and regulatory regions.
Importantly, many of the outliers were found in genes encoding lncRNAs, which are important regulators of transcription through a variety of mechanisms and are believed to be involved in immunity and stress response in bivalves [ 47 , 48 ]. LncRNAs have also been associated with growth regulation in Pacific oysters [ 49 ]. These associations are plausible given MBP’s history of selection for yield (a composite of growth and survival) and, more recently, for field survival in an OsHV-1-positive bay.

Although we were able to functionally annotate the loci where the F ST outliers were identified, both the reference genome assembly used in this study and its associated annotations are still relatively unrefined. Many of the identified outlier sites are in or near genes coding for uncharacterized proteins. Moreover, during the writing of the present manuscript, at least three new Magallana gigas genome assemblies were made public in the NCBI repository alone, and one of them (GCF_963853765.1, Wellcome Sanger Institute) has replaced the genome assembly used in this study as the “reference”. None of the currently available genome assemblies derives from a U.S.-sourced Pacific oyster. High genetic and molecular diversity in bivalves is thought to be largely rooted in presence/absence variation found through pan-genome studies involving individuals from geographically distant populations [ 50 ]. In Pacific oysters, this variation has been found across European populations [ 51 , 52 ].

Conclusions

In this study, we confirmed that North American Pacific oysters harbor very high genetic heterozygosity. The divergence between Pacific oyster populations along the U.S. Pacific coast is very low but detectable and significant. The captive population in our set, which has been used for breeding for over 25 years, is the most genetically differentiated and shows low nucleotide diversity attributable to the effects of domestication and inbreeding.
In addition, this captive population seems to harbor loci with a high probability of having been selected as a result of domestication and artificial selection. Overall, the results presented here are indicative of high gene flow and weak but detectable population structure among the contemporary populations in this set. The genetic variability detected could be exploitable for breeding purposes and probably confers on Pacific oysters an increased ability to adapt to changing environments.

Declarations

Ethics approval and consent to participate: Not applicable

Availability of data and materials: Sequencing data are available in the NCBI SRA under BioProject accession PRJNA1165834

Competing interests: The authors declare that they have no competing interests

Funding: This research was supported by the U.S. Department of Agriculture, Agricultural Research Service (Project number 2076-63000-005-000-D). This research used resources provided by the SCINet project and/or the AI Center of Excellence of the USDA Agricultural Research Service, ARS project numbers 0201-88888-003-000D and 0201-88888-002-000D. USDA is an equal opportunity provider and employer.

Authors' contributions: BC and NT conceived and designed the experiment; BC and NT prepared samples and obtained data; BC, JS, and NT analyzed and interpreted the data; BC drafted the manuscript; BC, JS, and NT revised the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We thank Dr. Brett Dumbauld for critical review of the manuscript. We thank Brooke McIntyre, Dr. Brett Dumbauld, and Zach Forster for collecting and providing oyster samples from Willapa Bay. Thanks to Dr. Colleen Burge (California Department of Fish and Wildlife) for providing oyster samples from San Diego Bay. We thank Dr. Chris Langdon of the Oregon State University MBP for granting access to breeding populations and samples.

References

1. Steele EN: The immigrant oyster (Ostrea gigas) now known as the Pacific oyster. Olympia, Washington: Quick Print; 1964.
2. Lavoie RE: Oyster Culture in North America History, Present and Future. In: The 1st International Oyster Symposium Proceedings: 2005; Tokyo, Japan: Oyster Research Institute News 17: 14-21; 2005.
3. Chew KK: Recent advances in the cultivation of molluscs in the Pacific United States and Canada. Aquaculture 1984, 39(1):69-81.
4. Langdon C, Evans F, Jacobson D, Blouin M: Yields of cultured Pacific oysters Crassostrea gigas Thunberg improved after one generation of selection. Aquaculture 2003, 220(1):227-244.
5. Evans S, Langdon C: Effects of genotype×environment interactions on the selection of broadly adapted Pacific oysters (Crassostrea gigas). Aquaculture 2006, 261(2):522-534.
6. de Melo CMR, Durland E, Langdon C: Improvements in desirable traits of the Pacific oyster, Crassostrea gigas, as a result of five generations of selection on the West Coast, USA. Aquaculture 2016, 460:105-115.
7. Divilov K, Schoolfield B, Mancilla Cortez D, Wang X, Fleener GB, Jin L, Dumbauld BR, Langdon C: Genetic improvement of survival in Pacific oysters to the Tomales Bay strain of OsHV-1 over two cycles of selection. Aquaculture 2021, 543:737020.
8. Camara MD: Changes in molecular genetic variation at AFLP loci associated with naturalization and domestication of the Pacific oyster (Crassostrea gigas). Aquat Living Resour 2011, 24(1):35-43.
9. Sun X, Hedgecock D: Temporal genetic change in North American Pacific oyster populations suggests caution in seascape genetics analyses of high gene-flow species. Mar Ecol Prog Ser 2017, 565:79-93.
10. Hedgecock D: Does variance in reproductive success limit effective population sizes of marine organisms? In: Genetics and Evolution of Aquatic Organisms. Edited by Beaumont A; 1994.
11. Sauvage C, Bierne N, Lapègue S, Boudry P: Single Nucleotide polymorphisms and their relationship to codon usage bias in the Pacific oyster Crassostrea gigas. Gene 2007, 406(1):13-22.
12. Launey S, Hedgecock D: High genetic load in the Pacific oyster Crassostrea gigas. Genetics 2001, 159(1):255-265.
13. Plough LV, Hedgecock D: Quantitative trait locus analysis of stage-specific inbreeding depression in the Pacific oyster Crassostrea gigas. Genetics 2011, 189(4):1473-1486.
14. Zhang G, Fang X, Guo X, Li L, Luo R, Xu F, Yang P, Zhang L, Wang X, Qi H et al: The oyster genome reveals stress adaptation and complexity of shell formation. Nature 2012, 490(7418):49-54.
15. Qi H, Song K, Li C, Wang W, Li B, Li L, Zhang G: Construction and evaluation of a high-density SNP array for the Pacific oyster (Crassostrea gigas). PLOS ONE 2017, 12(3):e0174007.
16. Gutierrez AP, Turner F, Gharbi K, Talbot R, Lowe NR, Peñaloza C, McCullough M, Prodöhl PA, Bean TP, Houston RD: Development of a medium density combined-species SNP array for Pacific and European oysters (Crassostrea gigas and Ostrea edulis). G3 Genes|Genomes|Genetics 2017, 7(7):2209-2218.
17. Vendrami DLJ, Houston RD, Gharbi K, Telesca L, Gutierrez AP, Gurney-Smith H, Hasegawa N, Boudry P, Hoffman JI: Detailed insights into pan-European population structure and inbreeding in wild and hatchery Pacific oysters (Crassostrea gigas) revealed by genome-wide SNP data. Evolutionary Applications 2019, 12(3):519-534.
18. Sutherland BJG, Rycroft C, Ferchaud A-L, Saunders R, Li L, Liu S, Chan AM, Otto SP, Suttle CA, Miller KM: Relative genomic impacts of translocation history, hatchery practices, and farm selection in Pacific oyster Crassostrea gigas throughout the Northern Hemisphere. Evolutionary Applications 2020, 13(6):1380-1399.
19. Hu B, Tian Y, Li Q, Liu S: Genomic signatures of artificial selection in the Pacific oyster, Crassostrea gigas. Evolutionary Applications 2022, 15(4):618-630.
20. Mao J, Tian Y, Liu Q, Li D, Ge X, Wang X, Hao Z: Revealing genetic diversity, population structure, and selection signatures of the Pacific oyster in Dalian by whole-genome resequencing. Frontiers in Ecology and Evolution 2024, 11.
21. Burge CA, Friedman CS, Kachmar ML, Humphrey KL, Moore JD, Elston RA: The first detection of a novel OsHV-1 microvariant in San Diego, California, USA. Journal of Invertebrate Pathology 2021, 184:107636.
22. de Melo CMR, Divilov K, Durland E, Schoolfield B, Davis J, Carnegie RB, Reece KS, Evans F, Langdon C: Introduction and evaluation on the US West Coast of a new strain (Midori) of Pacific oyster (Crassostrea gigas) collected from the Ariake Sea, southern Japan. Aquaculture 2021, 531:735970.
23. Andrews S: FastQC: A Quality Control Tool for High Throughput Sequence Data [Online]. 2010.
24. Peñaloza C, Gutierrez AP, Eöry L, Wang S, Guo X, Archibald AL, Bean TP, Houston RD: A chromosome-level genome assembly for the Pacific oyster Crassostrea gigas. GigaScience 2021, 10(3):giab020.
25. Okonechnikov K, Conesa A, García-Alcalde F: Qualimap 2: Advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics 2016, 32(2):292-294.
26. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M et al: The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 2010, 20(9):1297-1303.
27. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J et al: From FastQ Data to High-Confidence Variant Calls: The Genome Analysis Toolkit Best Practices Pipeline. Current Protocols in Bioinformatics 2013, 43(1):11.10.11-11.10.33.
28. Korunes KL, Samuk K: pixy: Unbiased estimation of nucleotide diversity and divergence in the presence of missing data. Molecular Ecology Resources 2021, 21(4):1359-1368.
29. Wigginton JE, Cutler DJ, Abecasis GR: A note on exact tests of Hardy-Weinberg equilibrium. The American Journal of Human Genetics 2005, 76(5):887-893.
30. Benjamini Y, Hochberg Y: Controlling the False Discovery Rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological) 1995, 57(1):289-300.
31. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ et al: PLINK: A tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics 2007, 81(3):559-575.
32. Alexander DH, Novembre J, Lange K: Fast model-based estimation of ancestry in unrelated individuals. Genome Research 2009, 19(9):1655-1664.
33. Alexander DH, Lange K: Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinformatics 2011, 12(1):246.
34. Weir BS, Cockerham CC: Estimating F-statistics for the analysis of population structure. Evolution 1984, 38(6):1358-1370.
35. Nei M: Genetic distance between populations. The American Naturalist 1972, 106(949):283-292.
36. Pembleton LW, Cogan NOI, Forster JW: StAMPP: an R package for calculation of genetic differentiation and structure of mixed-ploidy level populations. Molecular Ecology Resources 2013, 13(5):946-952.
37. Whitlock MC, Lotterhos KE: Reliable detection of loci responsible for local adaptation: Inference of a null model through trimming the distribution of FST. The American Naturalist 2015, 186(S1):S24-S36.
38. Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM: A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 2012, 6(2):80-92.
39. Linck E, Battey CJ: Minor allele frequency thresholds strongly affect population structure inference with genomic data sets. Molecular Ecology Resources 2019, 19(3):639-647.
40. Hedgecock D, Pan FTC: Genetic divergence of selected and wild populations of Pacific oysters (Crassostrea gigas) on the West Coast of North America. Aquaculture 2021, 530:735737.
41. Sutherland BJG, Thompson NF, Surry LB, Gujjula KR, Carrasco CD, Chadaram S, Lunda SL, Langdon CJ, Chan AM, Suttle CA et al: An amplicon panel for high-throughput and low-cost genotyping of Pacific oyster. G3 Genes|Genomes|Genetics 2024:jkae125.
42. Cruickshank TE, Hahn MW: Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow. Molecular Ecology 2014, 23(13):3133-3157.
43. Jorde PE, Ryman N: Unbiased estimator for genetic drift and effective population size. Genetics 2007, 177(2):927-935.
44. Crooks J, Crooks KR, Crooks AJ: Observations of the non-native Pacific oyster (Crassostrea gigas) in San Diego County, California. California Fish and Game 2015, 101:101-107.
45. Kornbluth A, Perog BD, Crippen S, Zacherl D, Quintana B, Grosholz ED, Wasson K: Mapping oysters on the Pacific coast of North America: A coast-wide collaboration to inform enhanced conservation. PLoS One 2022, 17(3):e0263998.
46. Lewontin RC, Krakauer J: Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics 1973, 74(1):175-195.
47. Pereiro P, Moreira R, Novoa B, Figueras A: Differential expression of long non-coding RNA (lncRNA) in Mediterranean mussel (Mytilus galloprovincialis) hemocytes under immune stimuli. Genes 2021, 12.
48. Cai C, He Q, Xie B, Xu Z, Wang C, Yang C, Liao Y, Zheng Z: Long non-coding RNA LncMPEG1 responds to multiple environmental stressors by affecting biomineralization in pearl oyster Pinctada fucata martensii. Frontiers in Marine Science 2022, 9.
49. Li Y, Yang B, Shi C, Tan Y, Ren L, Mokrani A, Li Q, Liu S: Integrated analysis of mRNAs and lncRNAs reveals candidate marker genes and potential hub lncRNAs associated with growth regulation of the Pacific Oyster, Crassostrea gigas. BMC Genomics 2023, 24(1):453.
50. Gerdol M, Moreira R, Cruz F, Gómez-Garrido J, Vlasova A, Rosani U, Venier P, Naranjo-Ortiz MA, Murgarella M, Greco S et al: Massive gene presence-absence variation shapes an open pan-genome in the Mediterranean mussel. Genome Biology 2020, 21(1):275.
51. Sollitto M, Kenny NJ, Greco S, Tucci CF, Calcino AD, Gerdol M: Detecting Structural Variants and Associated Gene Presence–Absence Variation Phenomena in the Genomes of Marine Organisms. In: Marine Genomics: Methods and Protocols. Edited by Verde C, Giordano D. New York, NY: Springer US; 2022: 53-76.
52. Tucci CF: Unveiling and open pan-genomic structure in the Pacific oyster Crassostrea gigas through the identification of PAV phenomena. In: Mollusc genomics EMBO. University of Namur, Namur Belgium; 2024.

Additional Declarations

No competing interests reported.

Status: Published
Journal Publication published 05 Feb, 2025 in BMC Genomics (https://doi.org/10.1186/s12864-025-11259-9)
Version 1 posted
Editorial decision: Revision requested 16 Dec, 2024
Reviews received at journal 13 Dec, 2024
Reviews received at journal 25 Nov, 2024
Reviewers agreed at journal 15 Nov, 2024
Reviewers agreed at journal 15 Nov, 2024
Reviewers invited by journal 14 Nov, 2024
Editor invited by journal 14 Nov, 2024
Editor assigned by journal 11 Nov, 2024
Submission checks completed at journal 11 Nov, 2024
First submitted to journal 08 Nov, 2024
Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-5418899","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":382464029,"identity":"c7277b72-e08e-4702-b04b-0f6c4484388b","order_by":0,"name":"Bernarda Calla","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABAklEQVRIiWNgGAWjYBACAwjFDCIYDzxgsAHSCcRrYTiQwJBGupbDhLWYs/c+e/Bzj7U8A/vhBwcSas4nzm9PfsD4peIwTi2WPcfNDXuepRs28KQZHEg4djtxw5lnBswyZ3BrMbiRxibBc+AwY4MEA1ALG1CLRIIBs2RbGm4t95+xSf45cNi+QYL9w4GEf+cS589I/4Bfyw02NmmgLYkNEjwGBxLbDiQ23MgxYPzYZoPHL2ls0jIH0pPbeHIKDiT2JRtvOPOm4DDDGdxazNmPsUm+OWBt289+fOODD9/sZOe3p298+KNCAqcWOGBD5hzmIawBDTD+IFnLKBgFo2AUDGMAALWjW7FlPMHXAAAAAElFTkSuQmCC","orcid":"","institution":"United States Department of Agriculture","correspondingAuthor":true,"prefix":"","firstName":"Bernarda","middleName":"","lastName":"Calla","suffix":""},{"id":382464030,"identity":"710713af-c584-4de2-a78f-ce96824d0b51","order_by":1,"name":"Jingwei Song","email":"","orcid":"","institution":"Oregon State University","correspondingAuthor":false,"prefix":"","firstName":"Jingwei","middleName":"","lastName":"Song","suffix":""},{"id":382464031,"identity":"2005ee65-d0dc-417e-b86d-3bd8910a7e07","order_by":2,"name":"Neil Thompson","email":"","orcid":"","institution":"United States Department of 
Agriculture","correspondingAuthor":false,"prefix":"","firstName":"Neil","middleName":"","lastName":"Thompson","suffix":""}],"badges":[],"createdAt":"2024-11-08 22:08:17","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-5418899/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-5418899/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s12864-025-11259-9","type":"published","date":"2025-02-05T15:57:51+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":70206196,"identity":"bbd13532-c9a6-4a71-a0b4-7da2f17ca1a8","added_by":"auto","created_at":"2024-11-29 13:46:04","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":2844621,"visible":true,"origin":"","legend":"\u003cp\u003eSampling locations of Pacific oysters (\u003cem\u003eMagallana gigas\u003c/em\u003e) along the US Pacific Coast. Five populations were sampled, locations are marked with a star. BLUE= Dabob Bay (MBP6 from OSU-MBP); BLACK = Willapa Bay (WB); GREEN = Midori (MID from OSU-MBP); RED = MBP30 (from OSU-MBP); San Diego Bay (SD). OSU-MBP= Oregon State University Molluskan Broodstock Program.\u003c/p\u003e","description":"","filename":"Binder71.png","url":"https://assets-eu.researchsquare.com/files/rs-5418899/v1/ba95724f4573c105c98d6af8.png"},{"id":70205413,"identity":"327c8302-4394-484a-a753-3dc68622a8d6","added_by":"auto","created_at":"2024-11-29 13:38:04","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":104382,"visible":true,"origin":"","legend":"\u003cp\u003ePrincipal Component Analysis (PCA) based on 20.3 million SNPs across five Pacific oyster (\u003cem\u003eMagallana gigas\u003c/em\u003e) populations from the U.S. Pacific coast. 
MID = Midori, SD= San Diego, WB = Willpa Bay; MBP6=Molluskan Broodstock Program founder from Dabob Bay; MBP30=Molluskan Broodstock Program Cohort 30.\u003c/p\u003e","description":"","filename":"Binder72.png","url":"https://assets-eu.researchsquare.com/files/rs-5418899/v1/1a644b96726aa1be0f1a099d.png"},{"id":70205411,"identity":"0f1da840-c6a1-4098-9373-e011e1073491","added_by":"auto","created_at":"2024-11-29 13:38:04","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":49773,"visible":true,"origin":"","legend":"\u003cp\u003eAdmixture analysis in one hundred individuals from five populations of Pacific oysters (\u003cem\u003eMagallana gigas\u003c/em\u003e). \u003cstrong\u003ePanel A.\u003c/strong\u003e Population structure detected using a subsample of SNPs before filtering for rare variants; \u003cstrong\u003ePanel B.\u003c/strong\u003e After MAF filtering showing no population structure. SNPs were called for five populations: MID = Midori, SD= San Diego, WB = Willpa Bay; MBP6=Molluskan Broodstock Program founder from Dabob Bay; MBP30=Molluskan Broodstock Program Cohort 30.\u003c/p\u003e","description":"","filename":"Binder73.png","url":"https://assets-eu.researchsquare.com/files/rs-5418899/v1/d7a6fb6e7c31bd9036094ce6.png"},{"id":70205410,"identity":"c78e2b11-9665-4d0d-880e-380a43a7277c","added_by":"auto","created_at":"2024-11-29 13:38:04","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":69316,"visible":true,"origin":"","legend":"\u003cp\u003eNucleotide diversity (\u003cu\u003eπ) \u003c/u\u003ein five populations of Pacific oyster (\u003cem\u003eMagallana gigas\u003c/em\u003e) from the U.S. Pacific coast. 
MID = Midori, SD= San Diego, WB = Willpa Bay; MBP6=Molluskan Broodstock Program founder from Dabob Bay; MBP30=Molluskan Broodstock Program Cohort 30.\u003c/p\u003e","description":"","filename":"Binder74.png","url":"https://assets-eu.researchsquare.com/files/rs-5418899/v1/c78e82983dbe648206ce2aa2.png"},{"id":70205412,"identity":"ec38a2d6-2e0b-4be9-b207-465029609beb","added_by":"auto","created_at":"2024-11-29 13:38:04","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":25706,"visible":true,"origin":"","legend":"\u003cp\u003eFifty-nine F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e outliers, indicative of local adaptation, from five populations of Pacific oysters (\u003cem\u003eMagallana gigas\u003c/em\u003e) across the genome. Outliers are marked with red circles.\u003c/p\u003e","description":"","filename":"Binder75.png","url":"https://assets-eu.researchsquare.com/files/rs-5418899/v1/05bd46dca6aca94bcdf8a9b1.png"},{"id":70207807,"identity":"bd539a48-10de-4842-b37d-9e22b4f33bf9","added_by":"auto","created_at":"2024-11-29 13:54:04","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":87672,"visible":true,"origin":"","legend":"\u003cp\u003ePresence of F\u003csub\u003e\u003cem\u003eST \u003c/em\u003e\u003c/sub\u003eoutliers across populations. Twenty individuals were sequenced per population. Vertical bars are the 59 F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e outliers in order of genome position. 
Bubble size indicates the number of individuals that had the outlier SNP either as homozygous alternative allele or as heterozygous.\u003c/p\u003e","description":"","filename":"Binder76.png","url":"https://assets-eu.researchsquare.com/files/rs-5418899/v1/e5c719b6e912bb99e90665f1.png"},{"id":75931196,"identity":"ae3480c6-1f49-41cd-90b3-1f07d876c7d9","added_by":"auto","created_at":"2025-02-10 16:14:01","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":6600015,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-5418899/v1/63781939-fc7b-4875-9980-7edbcdb369cd.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"\u003cp\u003eWeak genetic divergence and signals of adaptation obscured by high gene flow in an economically important aquaculture species\u003c/p\u003e","fulltext":[{"header":"Background","content":"\u003cp\u003eThe genetic diversity of a population determines its ability to adapt to episodic and fluctuating environmental changes. For domesticated species, including those used for agriculture, genetic diversity is a key component of their potential for genetic selection and for adaptation to non-natural environments. Under this view, comprehensive knowledge of the gene pool of a species remains fundamental for developing practices that maintain its health and productivity. This knowledge may also inform establishment of founder populations for broodstock and enable the measuring and utilization of the organism inherent resiliency. 
In addition, since the genetic diversity of a species is the result of an interplay between gene flow, genetic drift, and selection, it offers a window into its evolutionary and demographic history.\u003c/p\u003e \u003cp\u003eThe Pacific oyster \u003cem\u003eMagallana gigas\u003c/em\u003e (previously \u003cem\u003eCrassostrea gigas\u003c/em\u003e, Thunberg, 1973), a species native to the Pacific Coast of Asia, was introduced to the Pacific Coast of the United States through multiple independent transplants from Northeast Japan occurring mainly during the period between 1902 and 1960 [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. These introductions followed the depletion of native oyster (\u003cem\u003eOstrea lurida\u003c/em\u003e) populations due to overharvesting and pollution [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. Because of the inherent fast growth and high reproductive potential of \u003cem\u003eM. gigas\u003c/em\u003e, this species quickly became established in commercial aquaculture, being now the most cultivated and consumed shellfish species in the U.S. and around the world. Annual production of Pacific oysters in the U.S. reached 2,433 metric tons in 2021 (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.fisheries.noaa.gov/\u003c/span\u003e\u003cspan address=\"https://www.fisheries.noaa.gov/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). In the 1980s, efforts focused on developing local Pacific oyster hatcheries that could satisfy the demand for seed, eliminating the dependency on imports from Japan and on seed collections from natural settlements occurring in limited areas of the North American Pacific Coast [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. 
The Oregon State University Molluscan Broodstock Program (OSU-MBP) was initiated in 1996 using multiple collections of naturalized Pacific oysters from Willapa and Dabob bays (Washington, US) and from Pipestem Inlet, Vancouver Island (British Columbia, Canada). These founders were used to establish a hatchery-domesticated population maintained through artificial spawning and selected for yield traits and, more recently, for survival to OsHV-1 traits [\u003cspan additionalcitationids=\"CR5 CR6\" citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]\u003c/p\u003e \u003cp\u003eThe unique history of the origin, introduction, naturalization, and domestication of \u003cem\u003eM. gigas\u003c/em\u003e in the United States Pacific Northwest, together with the availability of genetic material from both \u0026ldquo;museum\u0026rdquo; specimens (i.e., samples collected and preserved since the 1990s) and contemporary naturalized and captive populations offers an opportunity to evaluate the imprint of environmental challenges and of hatchery and selection practices on the genetic pattern within this species. In-depth knowledge of the available genetic diversity and population structure of the Pacific oyster is also paramount for the understanding of its potential to adapt to existing and future environmental challenges. Because of the repeated introductions that might have resulted in genetic remixing and/or bottlenecks, and due to hatchery and out-planting practices, \u003cem\u003eM. gigas\u003c/em\u003e genomic variability and population structure on the U.S. Pacific West Coast could be complex [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. 
In addition, typical population genetic assumptions may not hold for this species due to its high fecundity, high larval dispersion, and variable reproductive success [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. \u003cem\u003eM. gigas\u003c/em\u003e is known to harbor high levels of DNA polymorphism and heterozygosity [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e], and to exhibit \u0026ldquo;reproductive sweepstakes\u0026rdquo;, which limit effective population size in specific cohorts [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e], affect Mendelian segregation ratios via null alleles, and increase genetic load [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eSeveral studies have documented genetic variability in \u003cem\u003eM. gigas\u003c/em\u003e from different regions of the native and introduced species range. G Zhang, X Fang, X Guo, L Li, R Luo, F Xu, P Yang, L Zhang, X Wang, H Qi, et al. [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] assembled a reference genome from a 4th generation inbred \u003cem\u003eM. gigas\u003c/em\u003e individual and identified over 3\u0026nbsp;million SNPs by comparing reads from a wild oyster against this newly assembled genome. This study had the obvious limitation that a single re-sequenced individual cannot represent variability between or within populations. 
More recent studies have used larger numbers of individuals spanning geographical areas. For example, H Qi, K Song, C Li, W Wang, B Li, L Li and G Zhang [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e] documented 2.7\u0026nbsp;million SNPs by resequencing 472 Pacific oyster individuals from China, Japan, Korea, and Canada. A subset of the identified SNPs was used to construct a high-density genotyping array [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. Variation in \u003cem\u003eM. gigas\u003c/em\u003e European populations was investigated by sequencing 203 individuals from eight different geographical sources to build a medium-density array that also contained SNPs from \u003cem\u003eOstrea edulis\u003c/em\u003e (the European flat oyster); the study identified 12.4\u0026nbsp;million SNPs [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. DLJ Vendrami, RD Houston, K Gharbi, L Telesca, AP Gutierrez, H Gurney-Smith, N Hasegawa, P Boudry and JI Hoffman [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e] reported high genetic differentiation between southern and northern Pacific oyster populations from Western Europe using a SNP array. Low effects of translocation (movement of juveniles between natural settings and farms) and farm and hatchery practices on genetic connectivity between populations were identified for Canadian Pacific Northwest populations using roughly 17,000 markers identified via RAD-seq [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. Within the native range, low-coverage whole genome resequencing identified 12.2\u0026nbsp;million SNPs among two wild populations from Northeast Japan and Northeast China and two breeding populations derived from each wild collection [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. 
Yet other research has found low genetic diversity and low population differentiation based on over 2\u0026nbsp;million SNPs in seven wild populations of \u003cem\u003eM. gigas\u003c/em\u003e in Dalian, China, the major producer of Pacific oysters in the region [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eTo date, no studies have documented genetic variation among populations from the U.S. Pacific Coast using whole genome resequencing. Available studies for this geographical range have been restricted to those utilizing a limited number of markers (e.g., allozymes and low-coverage SNP markers) [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e] or reduced-representation methods (e.g., RAD-seq) [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. The main factor precluding whole genome scans that involve a high number of samples is cost, but low-coverage sequencing, when paired with the appropriate analysis tools, can provide a cost-effective way of measuring diversity with larger sample sizes. In the present study, we obtained a catalog of current genetic diversity in \u003cem\u003eM. gigas\u003c/em\u003e populations along the U.S. Pacific Coast, including naturalized and aquaculture populations, as well as founders and contemporary populations. Our aim was to evaluate available genetic diversity and to identify potential signatures of genetic selection resulting from either naturalization to the U.S. Pacific Coast or artificial selection and hatchery practices. For this purpose, we collected 100 individuals from 5 temporally and geographically segregated populations of \u003cem\u003eM. 
gigas\u003c/em\u003e, including individuals that were part of the OSU-MBP collections, as well as naturalized individuals from Willapa Bay, Washington, and from a presumably self-recruiting population in San Diego Bay, California.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eSample collection:\u003c/h2\u003e \u003cp\u003ePacific oyster samples from the U.S. Pacific Coast were obtained from five populations (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). Adductor or mantle tissue was taken from each animal and preserved in liquid nitrogen or 95% ethanol. U.S. naturalized Pacific oysters were sampled from two sites: Willapa Bay, Washington (WB), and San Diego Bay, California (SD). Animals from the WB population were obtained from an oyster bed (\u0026ldquo;Parcel A\u0026rdquo;) near Nahcotta, Washington, in 2022. SD animals were collected from a pier located approximately 125 m south of Tuna Harbor in San Diego Bay; this location experienced oyster mass mortalities due to Ostreid Herpesvirus microvariant (OsHV-1 \u0026micro;var) outbreaks in 2018 and 2020 [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. Individuals for this study were collected in July 2020 while the mortality event was occurring, but sampled animals were not exhibiting symptoms of infection or stress.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTwo other populations were obtained from OSU-MBP. MBP cohort 6 (MBP6) samples were obtained from a collection of preserved individuals dating to 1998; these oysters were naturalized animals collected from Dabob Bay in 1996 and used as broodstock to initiate, in part, the OSU-MBP selective breeding program. 
The MBP cohort 30 (MBP30) samples were juveniles produced in 2021; these animals represent the seventh generation of the MBP and have a mixed lineage (founders were from different locations, and descendant families were widely intercrossed) [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Both MBP cohorts belong to the \u0026ldquo;Miyagi\u0026rdquo; population, which was repeatedly introduced to the U.S. from Northeast Japan starting in the 1920s [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. A fifth population (MID) was obtained from preserved individuals that were collected in 2004 from the Kumamoto Prefecture in Southern Japan by the OSU-MBP. The samples were taken from originally collected individuals (G0), which were later spawned to create the \u0026ldquo;Midori\u0026rdquo; population [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. This last population (MID) has undergone little to no selection and is currently cultured widely along the U.S. Pacific coast.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eWhole genome resequencing\u003c/h2\u003e \u003cp\u003eDNA extraction, library construction, and sequencing were carried out by the OSU Center for Quantitative Life Sciences. Libraries were constructed using the PrepX DNA library prep kit (Takara) and sequenced on an Illumina NextSeq 2000 with a P3 flow cell to obtain 150 bp paired-end reads. 
The target read coverage was approximately 5.6X per sample.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eVariant calling\u003c/h2\u003e \u003cp\u003eThe resulting sequencing reads were checked for quality with FASTQC v.0.11.9 [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]; reads with trimmed adapters were then mapped to the reference \u003cem\u003eMagallana gigas\u003c/em\u003e genome (NCBI GenBank # GCA_902806645.1) [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e] using BWA v. 0.7.17-r1188 with the \u003cem\u003emem\u003c/em\u003e algorithm. For compatibility with the downstream pipeline, read groups (RG) were added to the BAM files with \u0026lsquo;AddOrReplaceReadGroups\u0026rsquo; from Picard v. 2.27.1 (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://broadinstitute.github.io/picard/\u003c/span\u003e\u003cspan address=\"https://broadinstitute.github.io/picard/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Mapping quality and coverage depth were evaluated with Qualimap v. 2.3 [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe following steps were completed using the tools and Best Practices workflow from GATK v.4.1.4.1 [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e, \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]. Duplicates were marked with MarkDuplicatesSpark, the GATK Spark implementation of Picard\u0026rsquo;s MarkDuplicates; this process also sorted the records by coordinate. Next, SNPs and indels were called with HaplotypeCaller, generating GVCF files for each sample. All GVCFs were then merged into a database with GenomicsDBImport, first without and then with the \u0026ldquo;all-sites\u0026rdquo; option, to maintain both variant and invariant sites. 
Joint genotyping was done with GenotypeGVCFs using the database generated in the previous step as input. For memory efficiency when using the \u0026ldquo;all-sites\u0026rdquo; flag, both GenomicsDBImport and joint genotyping were run on intervals created by splitting the reference genome into 18 segments. All files with called genotypes were then merged into ten separate chromosome files plus one file containing the 236 unmapped scaffolds for the remaining steps. Filtering strategies were explored in a subsample of the called genotypes using vcftools [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] and the ggplot package in R v.4.3.1. Filtering was done in two separate steps following GATK best practices recommendations. First, a hard filtering step was applied using the following cutoffs: Fisher Strand\u0026thinsp;\u0026gt;\u0026thinsp;60.0; Strand Odds Ratio (SOR)\u0026thinsp;\u0026gt;\u0026thinsp;3; RMS Mapping Quality (MQ)\u0026thinsp;\u0026lt;\u0026thinsp;40; Mapping Quality Rank Sum Test (MQRankSum)\u0026lt;-4.0; Mapping Quality Rank Sum (MQRankSum)\u0026thinsp;\u0026gt;\u0026thinsp;12; Quality by Depth (QD)\u0026thinsp;\u0026lt;\u0026thinsp;2.0; Read Position Rank Sum (ReadPosRankSum)\u0026lt;-3.0; and combined depth across all samples (INFO/DP)\u0026thinsp;\u0026gt;\u0026thinsp;500. A second filtering step was applied to remove individual genotypes with poor values (based on by-sample fields in the vcf file) with Read Depth (FMT/DP)\u0026thinsp;\u0026lt;\u0026thinsp;3 and Genotype Quality likelihood (FMT/GQ)\u0026thinsp;\u0026lt;\u0026thinsp;20, and genotypes which, after removal of sites in the previous step, had all remaining sites missing. In addition, SNPs within ten bp of indels and with more than two alleles were removed (--SnpGap 10; -m2 -M2). 
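The site-level cutoffs listed above can be read as a single predicate over the INFO annotations of one site. The following is a minimal sketch under that reading; the function name and the dictionary layout are hypothetical and are not part of the authors' pipeline, and skipping annotations that are absent for a site is an assumption.

```python
def fails_hard_filter(info):
    """Return True when a site trips any site-level cutoff listed in the text.

    `info` is a hypothetical dict of INFO annotations for a single site.
    Annotations that are absent are skipped (an assumption of this sketch).
    """
    checks = [
        ("FS", lambda v: v > 60.0),                      # Fisher Strand
        ("SOR", lambda v: v > 3.0),                      # Strand Odds Ratio
        ("MQ", lambda v: v < 40.0),                      # RMS Mapping Quality
        ("MQRankSum", lambda v: v < -4.0 or v > 12.0),   # both tails
        ("QD", lambda v: v < 2.0),                       # Quality by Depth
        ("ReadPosRankSum", lambda v: v < -3.0),
        ("DP", lambda v: v > 500),                       # combined depth
    ]
    return any(key in info and is_bad(info[key]) for key, is_bad in checks)


# Illustrative values only, not taken from the study.
good_site = {"FS": 1.2, "SOR": 0.8, "MQ": 58.0, "QD": 21.0, "DP": 430}
bad_site = {"FS": 75.0, "SOR": 0.8, "MQ": 58.0, "QD": 21.0, "DP": 430}
```

Here `fails_hard_filter(good_site)` is false and `fails_hard_filter(bad_site)` is true, because only the second site exceeds the Fisher Strand cutoff.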
The file containing invariants was filtered separately to preserve invariant and variant sites as input to Pixy software.\u003c/p\u003e \u003cp\u003eTo test deviation from HWE, we used the exact test as defined in JE Wigginton, DJ Cutler and GR Abecasis [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e] and implemented in vcftools v0.1.17 \u0026ldquo;--hardy\u0026rdquo;. The max-missing flag filter was set to no more than 10% missing data. P-values were adjusted based on false discovery rate (FDR) [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. SNP loci with an FDR\u0026thinsp;\u0026lt;\u0026thinsp;0.1 were considered significant.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eGenetic diversity and population structure\u003c/h2\u003e \u003cp\u003eTo evaluate genetic structure and relatedness between the populations, linked SNPs were first removed with PLINK2 [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e] using --indep-pairwise 50 5 0.5. The resulting pruned set was used in a Principal Component Analysis (PCA). To assess the proportion of ancestry and relatedness from each population, we used ADMIXTURE v. 1.3.0 [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e] with the PLINK pruned output; the optimal K-value was evaluated with the ADMIXTURE cross-validation feature with K ranging from 1 to 7. Average nucleotide diversity within populations (π) and average divergence between populations (\u003cem\u003edxy\u003c/em\u003e) were estimated with Pixy v.2.1.7 beta [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] using non-overlapping 10 kb windows on the vcf file that had both variant and invariant sites. 
Per-population π and pairwise \u003cem\u003edxy\u003c/em\u003e averages were calculated by adding the raw by-window counts of pairwise differences between genotypes and then dividing by the sum of total raw by-window non-missing sites.\u003c/p\u003e \u003cp\u003eUnbiased Weir and Cockerham F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e] and Nei\u0026rsquo;s genetic distances [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e] between populations were estimated with StAMPP v. 1.6.3 [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. The significance of the F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e values was calculated with 95% confidence intervals over 100 bootstrap replications. An Analysis of Molecular Variance (AMOVA) was run with 1000 permutations to estimate variance partitioning with StAMPP.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eOutlier detection\u003c/h2\u003e \u003cp\u003eF\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e outliers were analyzed with Outflank v 0.2 [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e] using SNPs that were present in at least 80% of the samples. To infer the χ\u003csup\u003e2\u003c/sup\u003e distribution against which the outliers were tested, the tails of the F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e distribution were trimmed with the Outflank default trimming values, and low-heterozygosity loci were also removed (retaining only loci with He values\u0026thinsp;\u0026gt;\u0026thinsp;0.1). The detected outliers were mapped back to the chromosomes in the reference genome assembly and were functionally annotated with SnpEff v. 5.1d [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. 
Annotations were manually verified for accuracy by querying the NCBI databases.\u003c/p\u003e \u003c/div\u003e"},{"header":"RESULTS","content":"\u003cp\u003eOne hundred individuals from five populations (WB, SD, MID, MBP6, and MBP30) were sequenced (n\u0026thinsp;=\u0026thinsp;20 per population) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The 2.5\u0026nbsp;million mapped reads resulted in a mean sample coverage of 5X. The GC content of mapped reads was 34.93%, and the mean mapping quality per sample (BAM QC) was 36.76.\u003c/p\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eVariant calling\u003c/h2\u003e \u003cp\u003eThe initial SNP calling by GATK contained over 100\u0026nbsp;million variants, including indels and SNPs. After site-level filtering, sample-level filtering, removal of indels, and removal of repetitive regions, there were a total of 57.9\u0026nbsp;million biallelic SNPs, of which 20.3\u0026nbsp;million had a MAF\u0026thinsp;\u0026gt;\u0026thinsp;1% and were not singletons (SNP density 31.3 SNPs per kb of the genome). Only 230 loci deviated significantly from HWE in all five sampling locations. No loci were removed based on the HWE test.\u003c/p\u003e \u003cp\u003ePrincipal component analysis using the filtered SNP set after pruning linked sites identified three principal components explaining 33.5% of the total variation (PC1, PC2, and PC3). PC1 explained 13.4% of the variation, whereas PC2 and PC3 explained an additional 10.4% and 9.75% of the variation, respectively. PC1 and PC2 both captured the variation between MBP30 and all other populations; these two PCs also showed high overlap between MID and SD populations and showed that MBP6 and WB populations were indistinguishable in the PC space. 
PC3 separated MID from SD more distinctly (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAn ADMIXTURE analysis showed no population substructure: the lowest cross-validation error was found at K\u0026thinsp;=\u0026thinsp;1 (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). However, a result more congruent with the PCA was seen when using a subsample of 10% of the raw (unfiltered) SNPs: the Q matrix from ADMIXTURE showed that MBP30 could be distinguished from the other populations, while the MID and SD populations may share a genetic history that separates them slightly from the rest of the populations (with K\u0026thinsp;=\u0026thinsp;3) (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB), indicating that weak population structure signals might be lost when rare variants are removed through MAF filtering [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e].\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eGenetic variation and differentiation\u003c/h2\u003e \u003cp\u003eTo further characterize genetic diversity in the samples, we calculated weighted, unbiased π values across the genome in 10 kb windows using the dataset containing both variant and invariant sites (--allsites output from GATK). MBP30 had the lowest within-population nucleotide diversity (π), equal to 0.0043, compared to MBP6\u0026thinsp;=\u0026thinsp;0.0079; MID\u0026thinsp;=\u0026thinsp;0.00786; SD\u0026thinsp;=\u0026thinsp;0.00738; and WB\u0026thinsp;=\u0026thinsp;0.00779. 
This difference was consistent across the whole genome (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003ePopulation differentiation, as measured using overall pairwise Weir \u0026amp; Cockerham's \u003cem\u003eF\u003c/em\u003e\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e, showed low divergence between populations. MBP30 had the highest divergence, with \u003cem\u003eF\u003c/em\u003e\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e values between 0.0097 (with WB) and 0.0145 (with MID) (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, lower triangle). The lowest F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e value was found between MBP6 and WB (0.0004). All \u003cem\u003eF\u003c/em\u003e\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e values in all the pairwise comparisons were highly significant, with p-values\u0026thinsp;\u0026lt;\u0026thinsp;0.001 obtained by sampling with replacement (100x) and correcting for multiple testing with the Bonferroni method. In general, divergence (F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e) closely resembled the PCA results. \u003cem\u003eDxy\u003c/em\u003e was also used to compare population differentiation; the results showed, again, that MBP30 had the highest divergence from all the other populations. 
However, pairwise \u003cem\u003edxy\u003c/em\u003e values across all the pairwise comparisons were higher and less variable than F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e (\u003cem\u003edxy\u003c/em\u003e values ranging between 0.07717 and 0.09627; Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e upper triangle), indicating that the low within-population nucleotide diversity in MBP30 relative to that of other populations might be a major driver for the divergence results. An AMOVA, to test for partitioning of the genetic variation, found significant differentiation between populations but not within any of the populations (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eMeasures of genetic divergence. 
Lower triangle Weir \u0026amp; Cockerham's F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e, upper triangle (bold) \u003cem\u003edxy\u003c/em\u003e.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSD\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eWB\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eMBP30\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eMBP6\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eMID\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSD\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e0.080359\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e0.096269\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e 
\u003cp\u003e\u003cb\u003e0.079779\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e0.080080\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eWB\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.003257\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e0.091172\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003e0.078181\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e0.077920\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMBP30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.011596\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.009732\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003e0.090906\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e0.089730\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMBP6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.003245\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.000396\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e 
\u003cp\u003e0.010290\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e0.077172\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMID\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.002795\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.004311\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.014500\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.005008\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eAMOVA\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" 
colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSSD\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMSD\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003edf\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003esigma2\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eP.value\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBetween populations\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.00360666\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.0009\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1.84E-05\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eWithin populations\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.05077457\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.0005\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e95\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e5.34E-04\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTotal\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c2\"\u003e \u003cp\u003e0.05438122\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.0005\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e99\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eOutlier analyses:\u003c/h2\u003e \u003cp\u003eThe Outflank software, which conservatively fits the distribution of F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e from neutral loci to a χ\u003csup\u003e2\u003c/sup\u003e distribution, identified 59 top outlier candidate SNPs. These outliers were mapped to the reference genome and found to be distributed across nine of the ten contiguous chromosomes and in seven out of the 236 non-assembled scaffolds (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). MBP30 had the most individuals with outlier SNPs as heterozygous or homozygous alternative allele (51 out of 59), followed by SD (41 out of 59), whereas MBP6 and WB had the fewest SNPs as heterozygous or homozygous ALT alleles (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eA functional annotation analysis of the genomic regions where the top outliers were found shows that 39% of these sites are in transcribed regions (protein and non-protein coding), and the remaining (61%) are in non-coding regions. 
SNPs that translate into non-synonymous variants within protein-coding sequences, which could be considered the most consequential, include two missense variants: one in a transcript coding for \u0026ldquo;tripartite motif-containing protein 2\u0026rdquo; (XM_011451359.3, as listed in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e) on Chromosome 10 (present only in individuals from the MBP30 population), and another in a transcript coding for \u0026ldquo;histone-lysine N-methyltransferase SETMAR-like\u0026rdquo; (XM_034451927.1) on Chromosome 8 (present in more than half of MBP30 individuals and in at most 3 individuals from each of the other populations). Additionally, eleven outliers (18% of the total) mapped to five long non-coding RNA (lncRNA) loci (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). For outliers that did not fall directly in a coding region, annotation also included genes within 5 kb of the outlier. These outliers were classified as \u0026ldquo;upstream\u0026rdquo; or \u0026ldquo;downstream\u0026rdquo; gene variants following the SnpEff annotation scheme. 
Half of the outliers that did not map into a coding region were within those categories (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eF\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e outliers with position and annotations\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOutlier Number\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr.\u003c/p\u003e \u003cp\u003eNumber\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePOS\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003evariant type\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eNearest coding sequence NCBI refseq ID\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003esequence type*\u003c/p\u003e \u003c/th\u003e \u003cth 
align=\"left\" colname=\"c7\"\u003e \u003cp\u003efunctional annotation*\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e7680836\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003esynonymous_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034445312.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003ehomeobox protein 6\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e33625290\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003edownstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_011422575.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003etubulin-specific chaperone C-like\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e37904677\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c4\"\u003e \u003cp\u003eintragenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004595737.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003enon-protein coding\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e37904740\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintragenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004595737.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003enon-protein coding\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e37904809\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintragenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004595737.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003enon-protein coding\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e6\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e55548026\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eLOC105335559\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eHCLS1-binding protein 3\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e33394675\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eupstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004601088.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized LOC117688662\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e39522736\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003edownstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034466364.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eepidermal growth factor-like domains\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e51858408\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e59631568\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034465519.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e27743234\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e 
\u003cp\u003eLOC105329645\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003esentrin-specific protease 1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e12\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e30213493\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e13\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e47602303\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003edownstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004601319.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e14\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e12185248\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e 
\u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e15\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e22791392\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eLOC15343840\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003egrowth hormone-regulated TBC protein 1-A\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e16\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e263654\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034479567.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eplatelet glycoprotein Ib alpha chain\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e17\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 5\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e39465652\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003edownstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034475582.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e18\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e41465770\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eupstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_011438014.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003ecell division cycle 5-like protein\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e19\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e41912112\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_010714176.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003enon-protein coding\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e 
\u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e20\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e49280926\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eupstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_011427417.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eE3 ubiquitin-protein ligase CHFR\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e21\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e30647\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e22\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e6307936\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eLOC105329705\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e23\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e120446\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e24\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3525666\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034452193.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003etripartite motif-containing protein 3-like\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e25\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e38861094\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e 
\u003cp\u003edownstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034448703.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e26\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e40231691\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003emissense_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034451927.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003ehistone-lysine N-methyltransferase SETMAR-like\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e27\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e46762427\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003edownstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_002201338.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e28\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e52522525\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eupstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034449879\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e29\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e20830318\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eupstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034455499.1-1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eprotein FAM133A-like\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e33311896\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003emissense_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_011451359.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003etripartite motif-containing protein 2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e31\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr: 10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e41514923\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eupstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034455021.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eopine dehydrogenase\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e32\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994786.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e28721\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e33\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994786.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e35777\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e34\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994786.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e62891\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003edownstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_011420943.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized LOC105322303\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e35\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994801.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e28046\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e36\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994829.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e71361\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e 
\u003cp\u003edownstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004599051.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e37\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994829.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e80271\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004599053.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e38\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994829.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e124001\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034459954.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein LOC105341878\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e39\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994829.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e124008\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034459954.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein LOC105341878\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e40\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994852.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e173056\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003edownstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034460498.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein LOC117685913\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e41\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994852.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e173152\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003edownstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034460498.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e 
\u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein LOC117685913\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e42\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994852.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e303488\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintragenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004599282.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003enon-protein coding\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e43\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994852.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e303501\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintragenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004599282.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003enon-protein coding\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e44\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994852.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e304348\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintragenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004599282.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003enon-protein coding\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e45\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994852.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e313977\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e46\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994852.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e314413\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e47\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994852.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e 
\u003cp\u003e319933\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e48\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994852.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e328132\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e49\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994864.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e196193\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e50\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994865.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3222987\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e51\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994890.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e32136\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eupstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXM_034461254.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003ezinc finger protein 862-like isoform X1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e52\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994890.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e37030\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003ezinc finger protein 862-like isoform X1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e53\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e 
\u003cp\u003eNW_022994890.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e42354\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003ezinc finger protein 862-like isoform X1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e54\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994890.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e42663\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintergenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003ezinc finger protein 862-like isoform X1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e55\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994890.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e73226\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eupstream_gene_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eLOC105348138\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein LOC105348138\u003c/p\u003e 
\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e56\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994890.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e122131\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eLOC117686395\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e57\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994890.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e127673\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintragenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004599579.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003enon-protein coding\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e58\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994890.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e127710\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintragenic_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eXR_004599579.1\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003enon-protein coding\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003elncRNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e59\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNW_022994940.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e16981\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eintron_variant\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eLOC105324010\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003etranscript\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003euncharacterized protein LOC105324010\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003e \u003cem\u003eMagallana gigas\u003c/em\u003e, like other bivalves, is known to have extremely high levels of genetic polymorphism. While this variability could be a source of adaptability and a basis for genetic selection, it cannot be easily characterized because it is strongly affected by a combination of life history traits (e.g., high fecundity, high larval dispersal, variable sex ratios, variable reproductive success) and the unique demographic history of this species (i.e., \u003cem\u003eM. gigas\u003c/em\u003e has been artificially transported, distributed, domesticated, and naturalized across the world).\u003c/p\u003e \u003cp\u003eIn the present study, we set out to investigate the genetic diversity that exists in \u003cem\u003eM. 
gigas\u003c/em\u003e populations on the U.S. Pacific Coast to understand the demographic history, the potential for genetic selection, and the effects of naturalization and domestication on this economically and ecologically important species. We detected high heterozygosity through the identification of roughly 20 M SNPs by resequencing one hundred individuals from five different populations from the U.S. Pacific Coast. This is the highest number of variant sites recorded in this species to date; the second highest comes from a study that identified 12.2 M SNPs by comparing populations from a selective breeding program in China with two wild endemic populations from Japan and China [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. The study by Hu et al. [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e] is among the few that also used whole-genome resequencing, while previous studies might have been limited by the method, the number of sites tested, or both.\u003c/p\u003e \u003cp\u003eAlthough we found a high number of single-site variations, the divergence between populations was low. The 2021 cohort of selectively bred animals (MBP30) was the most divergent, whereas the other populations, except for WB and MBP6, could also be distinguished based on their genetic makeup, although the overall signal of population division was weak. These results were supported across multiple tests: PCA showed population subdivisions, while ADMIXTURE showed weak structure that appears to be lost when rare alleles are stringently filtered out. 
Independent estimators of divergence (F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e and \u003cem\u003edxy\u003c/em\u003e) also indicated low but significant differentiation between populations.\u003c/p\u003e \u003cp\u003eBy-population F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e showed weak but highly significant differentiation (0.00973\u0026ndash;0.01450, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.0001) between MBP30 and the other four populations. These F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e values are comparable to those calculated by Sun and Hedgecock [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e] between a population of \u003cem\u003eM. gigas\u003c/em\u003e from Hiroshima (Japan) and six from North America (F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e = 0.0151\u0026ndash;0.0212), and slightly lower than values from a study using 33 loci in which multiple MBP cohorts clustered distinctly away from naturalized Pacific Coast populations (F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e = 0.0218) [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. Likewise, a study involving a hatchery population of mixed lineages from the U.S., British Columbia, and South America showed high divergence from local naturalized and other hatchery populations [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e]. However, divergence estimates in the latter study were much larger than those obtained in the present work (average F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e = 0.06). 
In another study [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e], the authors obtained F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e estimates comparable to those in our study when comparing populations in the Dalian Sea, a prominent aquaculture region in China. MBP30 has been reared in captivity with controlled reproduction for over 25 years, and while there is no record of intentional new introductions of animals to minimize inbreeding during those years, the founding broodstock used to initiate the MBP has been mixed with broodstock from other localities and has undergone multiple mating schemes [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. MBP30 also has the lowest nucleotide diversity (average and by window or chromosome), possibly as a result of inbreeding and/or a reduction in effective population size, since the number of crosses per cohort ranged from 24 to 85 [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. This result is not unexpected given the domestication history of MBP30 and the challenges in maintaining high genetic diversity and low rates of inbreeding accumulation in aquatic animal breeding programs.\u003c/p\u003e \u003cp\u003eIn this study, we used the Weir and Cockerham F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e, which (like all other F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e statistics) is a relative measure of population divergence that depends largely on current within-population diversity (π). \u003cem\u003eDxy\u003c/em\u003e, conversely, is independent of extant population diversity and better reflects the relationships between populations that are shared by ancestry [\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e]. 
\u003cem\u003eDxy\u003c/em\u003e generally supported the F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e results, but the values were more similar across pairs of populations involving MBP30, indicating that the lower nucleotide diversity within MBP30 causes, at least in part, the elevated F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e values in pairs including MBP30. An AMOVA, based on Nei\u0026rsquo;s genetic distances, also supported significant genetic variability between populations. Although AMOVA does not allow for dissecting the specific pairs that diverge, all other results indicate that MBP30 is likely the major driver of such variability.\u003c/p\u003e \u003cp\u003eNone of the tests we performed detected differences between MBP6 (the broodstock collected from Dabob Bay in 1998) and WB, the contemporary, naturally recruiting population from Willapa Bay collected in 2022. This pair of populations had the lowest F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e (0.000396, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.0001) and overlapped substantially in the PCA. This is an unexpected result since these two populations are separated geographically and temporally, underscoring the challenge in characterizing population genetics in this species. If we assume two generations per year and that the standardized temporal variation (Fs) [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e] is expected to be double the F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e between two populations [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], temporal change between these two populations is still very low (Fs = 0.0096). Although Dabob Bay and Willapa Bay are geographically segregated, there was and continues to be extensive human-led movement of animals between those two locations. 
These results contrast with the study by Sun and Hedgecock [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], which detected large temporal genetic divergence among populations collected in Dabob Bay in 1985, 1996, and 2006. A more direct comparison using contemporary samples from Dabob Bay would be useful to clarify these discrepancies.\u003c/p\u003e \u003cp\u003eThe SD population, assumed to be a naturalized, self-recruiting population sampled from San Diego Bay, CA, clustered tightly with the MID population, which, in turn, has been in captivity as part of the OSU-MBP in Newport, Oregon, since its collection in Japan in 2004. ADMIXTURE analysis using all SNPs (before filtering for indels and rare alleles) supports that these two populations share some weak genetic ancestry. This finding was unexpected, and one possible explanation is that animals from the same region in Japan from which MID originated were somehow moved to San Diego Bay. Unintentional transport via ship ballast and fouling is plausible since San Diego is one of the most active ports on the U.S. West Coast. The data suggest that the progenitors of SD were more likely from Southern Japan near Kumamoto than from the Miyagi prefecture in Northeastern Japan, where the bulk of seed imports originated in the previous century [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. In addition, after samples were collected for this study, the assumption that the SD population is self-recruiting was questioned. There is at least one report of Pacific oysters established in southern California, although it was unclear, at the time of that report, whether these clusters of established oysters would persist, since many intentional attempts to introduce Pacific oysters in that region had failed in the past [\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e]. Recently, an in-depth report of feral \u003cem\u003eM. 
gigas\u003c/em\u003e presence on the U.S. Pacific Coast showed that this species has increased in abundance in Southern California and pointed to possible permanent establishment and dispersal [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e]. An investigation into the self-recruiting ability of animals from San Diego Bay is warranted but outside the scope of this study. Importantly, given the population genetic characteristics of SD found here (i.e., low but marked genetic divergence from other populations), this location could be a source of genetic diversity exploitable for breeding purposes.\u003c/p\u003e \u003cp\u003eTo evaluate F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e outliers, we used the OutFLANK algorithm developed by Whitlock and Lotterhos [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e], which is based on the first formal method for detecting F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e outliers [\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e]. Importantly, OutFLANK does not assume that populations are independent, allowing population pairs to exchange migrants and share evolutionary histories, which we deemed appropriate in this study. MBP30 had the most individuals with identified outlier sites, whereas some outliers were present in only some populations. Several outliers fell into coding regions and regulatory regions. Importantly, many of the outliers were found in genes encoding lncRNAs, important regulators of transcription through a variety of mechanisms and believed to be involved in immunity and stress response in bivalves [\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e, \u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e]. 
LncRNAs have also been associated with growth regulation in Pacific oysters [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e]. These associations are plausible given MBP\u0026rsquo;s history of selection for yield (a composite of growth and survival) and more recently for field survival in an OsHV-1-positive bay.\u003c/p\u003e \u003cp\u003eAlthough we were able to functionally annotate the loci where the F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e outliers were identified, both the reference genome assembly used in this study and its associated annotations are still relatively unrefined. Many of the identified outlier sites are in or near genes coding for uncharacterized proteins. Moreover, during the writing of the present manuscript, at least three new \u003cem\u003eMagallana gigas\u003c/em\u003e genome assemblies were made public in the NCBI repository alone, and one of them (GCF_963853765.1, Wellcome Sanger Institute) has replaced the assembly used in this study as the \u0026ldquo;reference\u0026rdquo; genome. None of the currently available genome assemblies derives from a U.S.-sourced Pacific oyster. High genetic and molecular diversity in bivalves is thought to be largely rooted in presence/absence variation, as found through pan-genome studies involving individuals from geographically distant populations [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. In Pacific oysters, this variation has been found across European populations [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e, \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e].\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eIn this study, we confirmed that North American Pacific oysters harbor very high genetic heterozygosity. The divergence between Pacific oyster populations along the U.S. Pacific Coast is very low but detectable and significant. 
The captive population in our set, which has been used for breeding for over 25 years, is the most genetically differentiated and shows low nucleotide diversity attributable to the effects of domestication and inbreeding. In addition, this captive population seems to harbor loci that have likely been under selection as a result of domestication and artificial selection. Overall, the results presented here are indicative of high gene flow and weak but detectable population structure among the contemporary populations in this set. The genetic variability detected could be exploitable for breeding purposes and probably confers on Pacific oysters an increased ability to adapt to changing environments.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate:\u0026nbsp;\u003c/strong\u003eNot applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSequencing data are available in the NCBI SRA under BioProject accession PRJNA1165834.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research was supported by the U.S. 
Department of Agriculture, Agricultural Research Service (Project number\u0026nbsp;2076-63000-005-000-D).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThis research used resources provided by the SCINet project and/or the AI Center of Excellence of the USDA Agricultural Research Service, ARS project numbers 0201-88888-003-000D and 0201-88888-002-000D.\u003c/p\u003e\n\u003cp\u003eUSDA is an equal opportunity provider and employer.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors' contributions:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eBC and NT conceived and designed the experiment; BC and NT prepared samples and obtained data; BC, JS, and NT analyzed and interpreted the data; BC drafted the manuscript; BC, JS, and NT revised the manuscript. All authors read and approved the final manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe thank Dr. Brett Dumbauld for critical review of the manuscript. We also thank Brooke McIntyre, Dr. Brett Dumbauld, and Zach Forster for collecting and providing oyster samples from Willapa Bay, and Dr. Colleen Burge (California Department of Fish and Wildlife) for providing oyster samples from San Diego Bay. We thank Dr. Chris Langdon of the Oregon State University MBP for granting access to breeding populations and samples.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eSteele EN: \u003cstrong\u003eThe immigrant oyster (\u003cem\u003eOstrea gigas\u003c/em\u003e) now known as the Pacific oyster\u003c/strong\u003e. Olympia, Washington: Quick Print; 1964.\u003c/li\u003e\n\u003cli\u003eLavoie RE: \u003cstrong\u003eOyster Culture in North America History, Present and Future\u003c/strong\u003e. 
In: \u003cem\u003eThe 1st International Oyster Symposium Proceedings: 2005; Tokyo, Japan\u003c/em\u003e: Oyster Research Institute News 17: 14-21 2005.\u003c/li\u003e\n\u003cli\u003eChew KK: \u003cstrong\u003eRecent advances in the cultivation of molluscs in the Pacific United States and Canada\u003c/strong\u003e. \u003cem\u003eAquaculture \u003c/em\u003e1984, \u003cstrong\u003e39\u003c/strong\u003e(1):69-81.\u003c/li\u003e\n\u003cli\u003eLangdon C, Evans F, Jacobson D, Blouin M: \u003cstrong\u003eYields of cultured Pacific oysters \u003cem\u003eCrassostrea gigas\u003c/em\u003e Thunberg improved after one generation of selection\u003c/strong\u003e. \u003cem\u003eAquaculture \u003c/em\u003e2003, \u003cstrong\u003e220\u003c/strong\u003e(1):227-244.\u003c/li\u003e\n\u003cli\u003eEvans S, Langdon C: \u003cstrong\u003eEffects of genotype\u0026times;environment interactions on the selection of broadly adapted Pacific oysters (Crassostrea gigas)\u003c/strong\u003e. \u003cem\u003eAquaculture \u003c/em\u003e2006, \u003cstrong\u003e261\u003c/strong\u003e(2):522-534.\u003c/li\u003e\n\u003cli\u003ede Melo CMR, Durland E, Langdon C: \u003cstrong\u003eImprovements in desirable traits of the Pacific oyster, Crassostrea gigas, as a result of five generations of selection on the West Coast, USA\u003c/strong\u003e. \u003cem\u003eAquaculture \u003c/em\u003e2016, \u003cstrong\u003e460\u003c/strong\u003e:105-115.\u003c/li\u003e\n\u003cli\u003eDivilov K, Schoolfield B, Mancilla Cortez D, Wang X, Fleener GB, Jin L, Dumbauld BR, Langdon C: \u003cstrong\u003eGenetic improvement of survival in Pacific oysters to the Tomales Bay strain of OsHV-1 over two cycles of selection\u003c/strong\u003e. 
\u003cem\u003eAquaculture \u003c/em\u003e2021, \u003cstrong\u003e543\u003c/strong\u003e:737020.\u003c/li\u003e\n\u003cli\u003eCamara MD: \u003cstrong\u003eChanges in molecular genetic variation at AFLP loci associated with naturalization and domestication of the Pacific oyster (\u003cem\u003eCrassostrea gigas\u003c/em\u003e)\u003c/strong\u003e. \u003cem\u003eAquat Living Resour \u003c/em\u003e2011, \u003cstrong\u003e24\u003c/strong\u003e(1):35-43.\u003c/li\u003e\n\u003cli\u003eSun X, Hedgecock D: \u003cstrong\u003eTemporal genetic change in North American Pacific oyster populations suggests caution in seascape genetics analyses of high gene-flow species\u003c/strong\u003e. \u003cem\u003eMar Ecol Prog Ser \u003c/em\u003e2017, \u003cstrong\u003e565\u003c/strong\u003e:79-93.\u003c/li\u003e\n\u003cli\u003eHedgecock D: \u003cstrong\u003eDoes variance in reproductive success limit effective population sizes of marine organisms? In A\u003c/strong\u003e. \u003cem\u003eGenetics and Evolution of Aquatic Organisms \u003c/em\u003e1994.\u003c/li\u003e\n\u003cli\u003eSauvage C, Bierne N, Lap\u0026egrave;gue S, Boudry P: \u003cstrong\u003eSingle Nucleotide polymorphisms and their relationship to codon usage bias in the Pacific oyster Crassostrea gigas\u003c/strong\u003e. \u003cem\u003eGene \u003c/em\u003e2007, \u003cstrong\u003e406\u003c/strong\u003e(1):13-22.\u003c/li\u003e\n\u003cli\u003eLauney S, Hedgecock D: \u003cstrong\u003eHigh genetic load in the pacific oyster \u003cem\u003eCrassostrea gigas\u003c/em\u003e\u003c/strong\u003e. \u003cem\u003eGenetics \u003c/em\u003e2001, \u003cstrong\u003e159\u003c/strong\u003e(1):255-265.\u003c/li\u003e\n\u003cli\u003ePlough LV, Hedgecock D: \u003cstrong\u003eQuantitative trait locus analysis of stage-specific inbreeding depression in the Pacific oyster \u003cem\u003eCrassostrea gigas\u003c/em\u003e\u003c/strong\u003e. 
\u003cem\u003eGenetics \u003c/em\u003e2011, \u003cstrong\u003e189\u003c/strong\u003e(4):1473-1486.\u003c/li\u003e\n\u003cli\u003eZhang G, Fang X, Guo X, Li L, Luo R, Xu F, Yang P, Zhang L, Wang X, Qi H\u003cem\u003e et al\u003c/em\u003e: \u003cstrong\u003eThe oyster genome reveals stress adaptation and complexity of shell formation\u003c/strong\u003e. \u003cem\u003eNature \u003c/em\u003e2012, \u003cstrong\u003e490\u003c/strong\u003e(7418):49-54.\u003c/li\u003e\n\u003cli\u003eQi H, Song K, Li C, Wang W, Li B, Li L, Zhang G: \u003cstrong\u003eConstruction and evaluation of a high-density SNP array for the Pacific oyster (Crassostrea gigas)\u003c/strong\u003e. \u003cem\u003ePLOS ONE \u003c/em\u003e2017, \u003cstrong\u003e12\u003c/strong\u003e(3):e0174007.\u003c/li\u003e\n\u003cli\u003eGutierrez AP, Turner F, Gharbi K, Talbot R, Lowe NR, Pe\u0026ntilde;aloza C, McCullough M, Prod\u0026ouml;hl PA, Bean TP, Houston RD: \u003cstrong\u003eDevelopment of a medium density combined-species snp array for pacific and european oysters (\u003cem\u003eCrassostrea gigas\u003c/em\u003e and \u003cem\u003eOstrea edulis\u003c/em\u003e)\u003c/strong\u003e. \u003cem\u003eG3 Genes|Genomes|Genetics \u003c/em\u003e2017, \u003cstrong\u003e7\u003c/strong\u003e(7):2209-2218.\u003c/li\u003e\n\u003cli\u003eVendrami DLJ, Houston RD, Gharbi K, Telesca L, Gutierrez AP, Gurney-Smith H, Hasegawa N, Boudry P, Hoffman JI: \u003cstrong\u003eDetailed insights into pan-European population structure and inbreeding in wild and hatchery Pacific oysters (\u003cem\u003eCrassostrea gigas\u003c/em\u003e) revealed by genome-wide SNP data\u003c/strong\u003e. 
\u003cem\u003eEvolutionary Applications \u003c/em\u003e2019, \u003cstrong\u003e12\u003c/strong\u003e(3):519-534.\u003c/li\u003e\n\u003cli\u003eSutherland BJG, Rycroft C, Ferchaud A-L, Saunders R, Li L, Liu S, Chan AM, Otto SP, Suttle CA, Miller KM: \u003cstrong\u003eRelative genomic impacts of translocation history, hatchery practices, and farm selection in Pacific oyster \u003cem\u003eCrassostrea gigas\u003c/em\u003e throughout the Northern Hemisphere\u003c/strong\u003e. \u003cem\u003eEvolutionary Applications \u003c/em\u003e2020, \u003cstrong\u003e13\u003c/strong\u003e(6):1380-1399.\u003c/li\u003e\n\u003cli\u003eHu B, Tian Y, Li Q, Liu S: \u003cstrong\u003eGenomic signatures of artificial selection in the Pacific oyster, \u003cem\u003eCrassostrea gigas\u003c/em\u003e\u003c/strong\u003e. \u003cem\u003eEvolutionary Applications \u003c/em\u003e2022, \u003cstrong\u003e15\u003c/strong\u003e(4):618-630.\u003c/li\u003e\n\u003cli\u003eMao J, Tian Y, Liu Q, Li D, Ge X, Wang X, Hao Z: \u003cstrong\u003eRevealing genetic diversity, population structure, and selection signatures of the Pacific oyster in Dalian by whole-genome resequencing\u003c/strong\u003e. \u003cem\u003eFrontiers in Ecology and Evolution \u003c/em\u003e2024, \u003cstrong\u003e11\u003c/strong\u003e.\u003c/li\u003e\n\u003cli\u003eBurge CA, Friedman CS, Kachmar ML, Humphrey KL, Moore JD, Elston RA: \u003cstrong\u003eThe first detection of a novel OsHV-1 microvariant in San Diego, California, USA\u003c/strong\u003e. \u003cem\u003eJournal of Invertebrate Pathology \u003c/em\u003e2021, \u003cstrong\u003e184\u003c/strong\u003e:107636.\u003c/li\u003e\n\u003cli\u003ede Melo CMR, Divilov K, Durland E, Schoolfield B, Davis J, Carnegie RB, Reece KS, Evans F, Langdon C: \u003cstrong\u003eIntroduction and evaluation on the US West Coast of a new strain (Midori) of Pacific oyster (Crassostrea gigas) collected from the Ariake Sea, southern Japan\u003c/strong\u003e. 
\u003cem\u003eAquaculture \u003c/em\u003e2021, \u003cstrong\u003e531\u003c/strong\u003e:735970.\u003c/li\u003e\n\u003cli\u003eSimon A: \u003cstrong\u003eFastQC: A Quality Control Tool for High Throughput Sequence Data [Online]\u003c/strong\u003e. In\u003cem\u003e.\u003c/em\u003e; 2010.\u003c/li\u003e\n\u003cli\u003ePe\u0026ntilde;aloza C, Gutierrez AP, E\u0026ouml;ry L, Wang S, Guo X, Archibald AL, Bean TP, Houston RD: \u003cstrong\u003eA chromosome-level genome assembly for the Pacific oyster Crassostrea gigas\u003c/strong\u003e. \u003cem\u003eGigaScience \u003c/em\u003e2021, \u003cstrong\u003e10\u003c/strong\u003e(3):giab020.\u003c/li\u003e\n\u003cli\u003eOkonechnikov K, Conesa A, Garc\u0026iacute;a-Alcalde F: \u003cstrong\u003eQualimap 2: Advanced multi-sample quality control for high-throughput sequencing data\u003c/strong\u003e. \u003cem\u003eBioinformatics \u003c/em\u003e2016, \u003cstrong\u003e32\u003c/strong\u003e(2):292-294.\u003c/li\u003e\n\u003cli\u003eMcKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M\u003cem\u003e et al\u003c/em\u003e: \u003cstrong\u003eThe Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data\u003c/strong\u003e. \u003cem\u003eGenome Research \u003c/em\u003e2010, \u003cstrong\u003e20\u003c/strong\u003e(9):1297-1303.\u003c/li\u003e\n\u003cli\u003eVan der Auwera GA, Carneiro MO, Hartl C, Poplin R, del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J\u003cem\u003e et al\u003c/em\u003e: \u003cstrong\u003eFrom FastQ Data to High-Confidence Variant Calls: The Genome Analysis Toolkit Best Practices Pipeline\u003c/strong\u003e. 
\u003cem\u003eCurrent Protocols in Bioinformatics \u003c/em\u003e2013, \u003cstrong\u003e43\u003c/strong\u003e(1):11.10.11-11.10.33.\u003c/li\u003e\n\u003cli\u003eKorunes KL, Samuk K: \u003cstrong\u003epixy: Unbiased estimation of nucleotide diversity and divergence in the presence of missing data\u003c/strong\u003e. \u003cem\u003eMolecular Ecology Resources \u003c/em\u003e2021, \u003cstrong\u003e21\u003c/strong\u003e(4):1359-1368.\u003c/li\u003e\n\u003cli\u003eWigginton JE, Cutler DJ, Abecasis GR: \u003cstrong\u003eA note on exact tests of Hardy-Weinberg equilibrium\u003c/strong\u003e. \u003cem\u003eThe American Journal of Human Genetics \u003c/em\u003e2005, \u003cstrong\u003e76\u003c/strong\u003e(5):887-893.\u003c/li\u003e\n\u003cli\u003eBenjamini Y, Hochberg Y: \u003cstrong\u003eControlling the False Discovery Rate: A practical and powerful approach to multiple testing\u003c/strong\u003e. \u003cem\u003eJournal of the Royal Statistical Society: Series B (Methodological) \u003c/em\u003e1995, \u003cstrong\u003e57\u003c/strong\u003e(1):289-300.\u003c/li\u003e\n\u003cli\u003ePurcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ\u003cem\u003e et al\u003c/em\u003e: \u003cstrong\u003ePLINK: A tool set for whole-genome association and population-based linkage analyses\u003c/strong\u003e. \u003cem\u003eThe American Journal of Human Genetics \u003c/em\u003e2007, \u003cstrong\u003e81\u003c/strong\u003e(3):559-575.\u003c/li\u003e\n\u003cli\u003eAlexander DH, Novembre J, Lange K: \u003cstrong\u003eFast model-based estimation of ancestry in unrelated individuals\u003c/strong\u003e. \u003cem\u003eGenome Research \u003c/em\u003e2009, \u003cstrong\u003e19\u003c/strong\u003e(9):1655-1664.\u003c/li\u003e\n\u003cli\u003eAlexander DH, Lange K: \u003cstrong\u003eEnhancements to the ADMIXTURE algorithm for individual ancestry estimation\u003c/strong\u003e. 
\u003cem\u003eBMC Bioinformatics \u003c/em\u003e2011, \u003cstrong\u003e12\u003c/strong\u003e(1):246.\u003c/li\u003e\n\u003cli\u003eWeir BS, Cockerham CC: \u003cstrong\u003eEstimating F-statistics for the analysis of population structure\u003c/strong\u003e. \u003cem\u003eEvolution \u003c/em\u003e1984, \u003cstrong\u003e38\u003c/strong\u003e(6):1358-1370.\u003c/li\u003e\n\u003cli\u003eNei M: \u003cstrong\u003eGenetic distance between populations\u003c/strong\u003e. \u003cem\u003eThe American Naturalist \u003c/em\u003e1972, \u003cstrong\u003e106\u003c/strong\u003e(949):283-292.\u003c/li\u003e\n\u003cli\u003ePembleton LW, Cogan NOI, Forster JW: \u003cstrong\u003eStAMPP: an R package for calculation of genetic differentiation and structure of mixed-ploidy level populations\u003c/strong\u003e. \u003cem\u003eMolecular Ecology Resources \u003c/em\u003e2013, \u003cstrong\u003e13\u003c/strong\u003e(5):946-952.\u003c/li\u003e\n\u003cli\u003eWhitlock MC, Lotterhos KE: \u003cstrong\u003eReliable detection of loci responsible for local adaptation: Inference of a null model through trimming the distribution of FST\u003c/strong\u003e. \u003cem\u003eThe American Naturalist \u003c/em\u003e2015, \u003cstrong\u003e186\u003c/strong\u003e(S1):S24-S36.\u003c/li\u003e\n\u003cli\u003eCingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM: \u003cstrong\u003eA program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff\u003c/strong\u003e. \u003cem\u003eFly \u003c/em\u003e2012, \u003cstrong\u003e6\u003c/strong\u003e(2):80-92.\u003c/li\u003e\n\u003cli\u003eLinck E, Battey CJ: \u003cstrong\u003eMinor allele frequency thresholds strongly affect population structure inference with genomic data sets\u003c/strong\u003e. 
\u003cem\u003eMolecular Ecology Resources \u003c/em\u003e2019, \u003cstrong\u003e19\u003c/strong\u003e(3):639-647.\u003c/li\u003e\n\u003cli\u003eHedgecock D, Pan FTC: \u003cstrong\u003eGenetic divergence of selected and wild populations of Pacific oysters (\u003cem\u003eCrassostrea gigas\u003c/em\u003e) on the West Coast of North America\u003c/strong\u003e. \u003cem\u003eAquaculture \u003c/em\u003e2021, \u003cstrong\u003e530\u003c/strong\u003e:735737.\u003c/li\u003e\n\u003cli\u003eSutherland BJG, Thompson NF, Surry LB, Gujjula KR, Carrasco CD, Chadaram S, Lunda SL, Langdon CJ, Chan AM, Suttle CA\u003cem\u003e et al\u003c/em\u003e: \u003cstrong\u003eAn amplicon panel for high-throughput and low-cost genotyping of Pacific oyster\u003c/strong\u003e. \u003cem\u003eG3 Genes|Genomes|Genetics \u003c/em\u003e2024:jkae125.\u003c/li\u003e\n\u003cli\u003eCruickshank TE, Hahn MW: \u003cstrong\u003eReanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow\u003c/strong\u003e. \u003cem\u003eMolecular Ecology \u003c/em\u003e2014, \u003cstrong\u003e23\u003c/strong\u003e(13):3133-3157.\u003c/li\u003e\n\u003cli\u003eJorde PE, Ryman N: \u003cstrong\u003eUnbiased estimator for genetic drift and effective population size\u003c/strong\u003e. \u003cem\u003eGenetics \u003c/em\u003e2007, \u003cstrong\u003e177\u003c/strong\u003e(2):927-935.\u003c/li\u003e\n\u003cli\u003eCrooks J, Crooks KR, Crooks AJ: \u003cstrong\u003eObservations of the non-native Pacific oyster (\u003cem\u003eCrassostrea gigas\u003c/em\u003e) in San Diego County, California\u003c/strong\u003e. \u003cem\u003eCalifornia Fish and Game \u003c/em\u003e2015, \u003cstrong\u003e101\u003c/strong\u003e:101-107.\u003c/li\u003e\n\u003cli\u003eKornbluth A, Perog BD, Crippen S, Zacherl D, Quintana B, Grosholz ED, Wasson K: \u003cstrong\u003eMapping oysters on the Pacific coast of North America: A coast-wide collaboration to inform enhanced conservation\u003c/strong\u003e. 
\u003cem\u003ePLoS One \u003c/em\u003e2022, \u003cstrong\u003e17\u003c/strong\u003e(3):e0263998.\u003c/li\u003e\n\u003cli\u003eLewontin RC, Krakauer J: \u003cstrong\u003eDistribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms\u003c/strong\u003e. \u003cem\u003eGenetics \u003c/em\u003e1973, \u003cstrong\u003e74\u003c/strong\u003e(1):175-195.\u003c/li\u003e\n\u003cli\u003ePereiro P, Moreira R, Novoa B, Figueras A: \u003cstrong\u003eDifferential expression of long non-coding rna (lncRNA) in mediterranean mussel (\u003cem\u003eMytilus galloprovincialis\u003c/em\u003e) hemocytes under immune stimuli\u003c/strong\u003e. In: \u003cem\u003eGenes.\u003c/em\u003e vol. 12; 2021.\u003c/li\u003e\n\u003cli\u003eCai C, He Q, Xie B, Xu Z, Wang C, Yang C, Liao Y, Zheng Z: \u003cstrong\u003eLong non-coding RNA LncMPEG1 responds to multiple environmental stressors by affecting biomineralization in pearl oyster \u003cem\u003ePinctada fucata martensii\u003c/em\u003e\u003c/strong\u003e. \u003cem\u003eFrontiers in Marine Science \u003c/em\u003e2022, \u003cstrong\u003e9\u003c/strong\u003e.\u003c/li\u003e\n\u003cli\u003eLi Y, Yang B, Shi C, Tan Y, Ren L, Mokrani A, Li Q, Liu S: \u003cstrong\u003eIntegrated analysis of mRNAs and lncRNAs reveals candidate marker genes and potential hub lncRNAs associated with growth regulation of the Pacific Oyster, \u003cem\u003eCrassostrea gigas\u003c/em\u003e\u003c/strong\u003e. \u003cem\u003eBMC Genomics \u003c/em\u003e2023, \u003cstrong\u003e24\u003c/strong\u003e(1):453.\u003c/li\u003e\n\u003cli\u003eGerdol M, Moreira R, Cruz F, G\u0026oacute;mez-Garrido J, Vlasova A, Rosani U, Venier P, Naranjo-Ortiz MA, Murgarella M, Greco S\u003cem\u003e et al\u003c/em\u003e: \u003cstrong\u003eMassive gene presence-absence variation shapes an open pan-genome in the Mediterranean mussel\u003c/strong\u003e. 
\u003cem\u003eGenome Biology \u003c/em\u003e2020, \u003cstrong\u003e21\u003c/strong\u003e(1):275.\u003c/li\u003e\n\u003cli\u003eSollitto M, Kenny NJ, Greco S, Tucci CF, Calcino AD, Gerdol M: \u003cstrong\u003eDetecting Structural Variants and Associated Gene Presence\u0026ndash;Absence Variation Phenomena in the Genomes of Marine Organisms\u003c/strong\u003e. In: \u003cem\u003eMarine Genomics: Methods and Protocols.\u003c/em\u003e Edited by Verde C, Giordano D. New York, NY: Springer US; 2022: 53-76.\u003c/li\u003e\n\u003cli\u003eTucci CF: \u003cstrong\u003eUnveiling an open pan-genomic structure in the Pacific oyster \u003cem\u003eCrassostrea gigas\u003c/em\u003e through the identification of PAV phenomena\u003c/strong\u003e. In: \u003cem\u003eMollusc genomics EMBO.\u003c/em\u003e University of Namur, Namur Belgium; 2024.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"bmc-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gics","sideBox":"Learn more about [BMC Genomics](http://bmcgenomics.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/gics","title":"BMC Genomics","twitterHandle":"#BMCGenomics","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"gene flow, divergence, Pacific oysters, adaptation, captivity, inbreeding, 
diversity","lastPublishedDoi":"10.21203/rs.3.rs-5418899/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-5418899/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground: \u003c/strong\u003eThe genetic diversity of a population defines its ability to adapt to episodic and fluctuating environmental changes. For species of agricultural value, available genetic diversity also determines their breeding potential and remains fundamental to the development of practices that maintain health and productivity. In this study, we used whole-genome resequencing to investigate genetic diversity within and between naturalized and captively reared populations of Pacific oysters from the US Pacific coast. The analyses included individuals from preserved samples dating to 1998 and 2004, two contemporary naturalized populations, and one domesticated population.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults:\u003c/strong\u003e Despite high overall heterozygosity, there was extremely low but significant genetic divergence between populations, indicative of high gene flow. The captive population, which was reared for over 25 years, was the most genetically distinct population and exhibited reduced nucleotide diversity, attributable to inbreeding. Individuals from populations that were separated both geographically and temporally did not show detectable genetic differences, illustrating the consequences of human intervention in the form of translocation of animals between farms, hatcheries and natural settings. 
Fifty-nine significant F\u003csub\u003e\u003cem\u003eST\u003c/em\u003e\u003c/sub\u003e outlier sites were identified, the majority of which were present in high proportions of the captive population individuals, and which are possibly associated with domestication.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusion:\u003c/strong\u003e Pacific oysters in the US Pacific coast harbor high genetic heterozygosity which obscures weak population structure. Differences between these Pacific oyster populations could be leveraged for breeding and might be a source of adaptation to new environments.\u003c/p\u003e","manuscriptTitle":"Weak genetic divergence and signals of adaptation obscured by high gene flow in an economically important aquaculture species","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-11-29 13:37:59","doi":"10.21203/rs.3.rs-5418899/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2024-12-16T06:53:34+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-12-13T07:21:05+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-11-25T16:36:28+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"254284369536221029510298501041444628356","date":"2024-11-15T06:41:00+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"196087947486785009572927584241558880545","date":"2024-11-15T05:53:03+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2024-11-14T21:30:52+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2024-11-14T20:52:04+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2024-11-12T02:59:31+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2024-11-12T02:59:22+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC 
Genomics","date":"2024-11-08T22:06:57+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"bmc-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gics","sideBox":"Learn more about [BMC Genomics](http://bmcgenomics.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/gics","title":"BMC Genomics","twitterHandle":"#BMCGenomics","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"be215767-60b8-4c77-ab75-dcaf8eb3fde4","owner":[],"postedDate":"November 29th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-02-10T16:08:39+00:00","versionOfRecord":{"articleIdentity":"rs-5418899","link":"https://doi.org/10.1186/s12864-025-11259-9","journal":{"identity":"bmc-genomics","isVorOnly":false,"title":"BMC Genomics"},"publishedOn":"2025-02-05 15:57:51","publishedOnDateReadable":"February 5th, 2025"},"versionCreatedAt":"2024-11-29 13:37:59","video":"","vorDoi":"10.1186/s12864-025-11259-9","vorDoiUrl":"https://doi.org/10.1186/s12864-025-11259-9","workflowStages":[]},"version":"v1","identity":"rs-5418899","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-5418899","identity":"rs-5418899","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
