A latitudinal gradient of reference genomes

preprint OA: closed CC-BY-NC-4.0

Abstract

Global inequality rooted in legacies of colonialism and uneven development can lead to systematic biases in scientific knowledge. In ecology and evolutionary biology, findings, funding and research effort are disproportionately concentrated at high latitudes while biological diversity is concentrated at low latitudes. This discrepancy may have a particular influence in fields like phylogeography, molecular ecology and conservation genetics, where the rise of genomics has increased the cost and technical expertise required to apply state-of-the-art methods. Here we ask whether a fundamental biogeographic pattern (the latitudinal gradient of species richness in tetrapods) is reflected in available reference genomes, an important data resource for various applications of molecular tools for biodiversity research and conservation. We also ask whether sequencing approaches differ between the Global South and Global North, reviewing the last five years of conservation genetics research in four leading journals. We find that extant reference genomes are scarce relative to species richness at low latitudes, and that reduced-representation and whole-genome sequencing are disproportionately applied to taxa in the Global North. We conclude with recommendations to close this gap and improve international collaborations in biodiversity genomics.

Keywords

biodiversity genomics, conservation genetics, tetrapods, macroecology

bioRxiv preprint doi: https://doi.org/10.1101/2024.07.09.602657; this version posted July 13, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.

Introduction

Scientific knowledge reflects both reality and the economic and social conditions that shape the practice of research (Hull 1990; Longino 2002; Stephan 2012). The coincidence of the Scientific Revolution and imperial expansion in Western Europe in the 16th through 18th centuries led to the accumulation and concentration of global capital in institutions of higher learning in the United Kingdom, France, Spain, the Netherlands, and Germany, funding foundational investigations in the natural sciences. Industrialization and the rise of capitalism in the 19th century stoked the boiler of the research endeavor, permitting the continued expansion of universities and birthing a learned aristocratic class with disposable time and income to commit to natural history (Opitz 2004; Opitz 2006). This legacy persists to this day: research expenditures are concentrated in a handful of majority Anglophone countries (May 1997), which in turn produce a disproportionate amount of scholarship (May 1997; King 2004) and often drive citation networks (Pasterkamp et al. 2007; Meneghini et al. 2008).

Ecology and evolutionary biology (EEB) faces a unique challenge in light of global scientific inequality: while funding and research effort are disproportionately concentrated at high latitudes (Melles et al. 2019), biological diversity is concentrated at low latitudes (Willig et al. 2003). The consequences of this discrepancy are many, including systematic biases in research effort (Titley et al. 2017) and taxonomy (Freeman & Pennell 2021), major gaps in our understanding of natural history and species distributions (Collen et al. 2008; Feeley & Silman 2011), and a blinkered view of patterns of diversity and diversification across the tree of life (Reddy 2014; Cornwell et al. 2019). Increased research attention on tropical ecosystems has therefore been a stated priority for decades (Collen et al. 2008; Nori et al.
2020), but despite some promising strides towards international collaborations (Perez et al. 2018), partners in the Global North frequently set research agendas (Bradley 2008; Asase et al. 2022), leading to accusations of parachute science and persistent gaps in data quantity, quality, and type (Stocks et al. 2008; Soares et al. 2023).

EEB disciplines where data collection and analysis are equipment- and cost-intensive are especially likely to show increased discrepancies between the Global North and Global South. In the 1960s and 70s, assays of molecular genetic markers in nonmodel organisms laid the groundwork for phylogenetic systematics to shift away from a traditional focus on morphology, increasing statistical power and requiring increasingly expensive tools and specialized skills (Hillis et al. 1996). Concurrently, the rise of phylogeography brought an evolutionary perspective to population biology, altering views on the fundamental units of management and conservation (Avise 2000). In the late 2000s and early 2010s, reduced representation high-throughput sequencing using library preparation methods such as RADseq and target-capture began to proliferate as an alternative to traditional Sanger sequencing (Ekblom & Galindo 2011; Lemmon & Lemmon 2013), boasting the advantages of rapidly generating much larger datasets and requiring little to no prior information about a focal taxon’s genome.
A third revolution is currently underway, as low-coverage whole genome resequencing (hereafter WGS) begins to supplant reduced representation methods in some fields and taxa (Ellegren 2014; Toews et al. 2016). Unlike RADseq, target capture, and related approaches, WGS typically requires a high-coverage reference genome with which to align samples. Because sequencing an individual at sufficient depth remains costly, and because assembling such large datasets is computationally intensive, reference genomes remain out of reach for many labs and, by extension, species.

The problem may be particularly acute in conservation biology, where decades of concern about the so-called “conservation genetics gap” have highlighted a persistent disconnect between the importance managers place on genetic information and its actual use in decision making (Taylor et al. 2017). Hypotheses to explain the conservation genetics gap include skepticism about its importance, the specialized knowledge required for analysis and interpretation, and cost (Hoban et al. 2013). At low-resource institutions in the tropics, a broader shift towards WGS approaches may further discourage the development of the field in the very regions where the greatest number of critically endangered species are found (Vamosi & Vamosi 2008; Bertola et al. 2024), especially if technological advances become publication requirements at high-impact and broadly read journals.
Here we ask whether a fundamental biogeographic pattern (the latitudinal gradient of species richness in tetrapods) is reflected in available reference genomes, an important data resource for various applications of molecular tools for biodiversity research and conservation. We hypothesized that inequities in economic development and access to scientific resources would lead the number of species with assembled genomes to be greatest in the temperate zone, not the tropics. We also ask whether sequencing approaches differ between the Global South and Global North, reviewing the last five years of conservation genetics research in four leading journals. We conclude by discussing strategies to improve international collaborations in biodiversity genomics and boost the representation of species from the Global South in sequence databases.

Methods

Reference genomes, species richness, and latitude. We used the NCBI Datasets command-line tools v.16.19.0 to download taxonomy metadata for the subset of species with an assembled reference genome in the following taxa: birds (Class: Aves), mammals (Class: Mammalia), squamates (Order: Squamata), amphibians (Class: Amphibia), turtles (Order: Testudines), crocodilians (Order: Crocodilia) and tuataras (Order: Rhynchocephalia). We selected these groups, together comprising extant tetrapods, to provide a snapshot of animal diversity in relatively well-studied clades with different ecologies and evolutionary histories, while restricting the total dataset to a computationally manageable size. From this initial list we retained species with an exact match to GBIF’s Backbone Taxonomy using rgbif v.3.8.0, and downloaded all observations of each backed by georeferenced voucher specimens in natural history museum collections (NHCs), excluding those without coordinates and those flagged for geospatial issues. We repeated this process for all species in each higher-level taxon represented in our list of reference genomes (i.e., downloaded metadata for all georeferenced tetrapod specimens on GBIF).

Filtering these aggregated datasets to contain only species with 10 or more records, we generated convex hull polygons for each as an approximation of their geographic distribution.
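To make the range-approximation step concrete, the following minimal Python sketch keeps species with at least 10 georeferenced records and builds a convex hull for each using Andrew's monotone chain algorithm. It is an illustration only, not the authors' R code: the function names (`convex_hull`, `hulls_by_species`), the `(species, lon, lat)` record format, and the treatment of longitude and latitude as planar coordinates are all assumptions made for the example.

```python
from collections import defaultdict

MIN_RECORDS = 10  # threshold stated in the text: species with 10 or more records


def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (Andrew's monotone chain).

    Points are (longitude, latitude) tuples treated as planar coordinates,
    which is a simplifying assumption of this sketch.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop the last vertex while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def hulls_by_species(records, min_records=MIN_RECORDS):
    """Group (species, lon, lat) records and hull species with enough records."""
    grouped = defaultdict(list)
    for species, lon, lat in records:
        grouped[species].append((lon, lat))
    return {
        sp: convex_hull(pts)
        for sp, pts in grouped.items()
        if len(pts) >= min_records
    }
```

Calling `hulls_by_species(records)` on a list of specimen records returns a dictionary mapping each retained species to its hull vertices; species below the record threshold are simply absent from the result.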
Overlaying the convex hulls on a shapefile of Earth’s landmasses from rnaturalearth v.1.0.1, we calculated species richness as the number of overlapping convex hulls in 2-degree x 2-degree grid cells, standardizing this value by subtracting mean global species richness and dividing by its standard deviation. We subtracted the number of species with reference genomes from total species richness to determine the regions with the largest representation gap in genomic resources, again standardizing the difference. To evaluate the significance and slope of a correlation between species richness and the absolute value of latitude, we performed simple linear regressions in R v.4.4.0, analyzing species with reference genomes and our full dataset separately.

Literature review. To evaluate how the geography of authorship might impact the sequencing strategy of studies in conservation biology, we performed a restricted Web of Science literature search on 29 June 2024 for conservation genetics papers published in the last five years in the journals Conservation Genetics, Molecular Ecology, Journal of Heredity, and Conservation Biology. We used the queries ‘SO="Conservation Genetics"’ and ‘SO=("Molecular Ecology" OR "Journal of Heredity" OR "Conservation Biology") AND (TS="Conservation Genet*" OR KP="Conservation Genet*" OR TI="Conservation Genet*")’, excluding reviews, genome announcements, meta-analyses, preprints, and studies that were purely simulations.
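Returning briefly to the reference genome analysis, the standardization and representation-gap calculation described above can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' implementation: the function names and the example counts are hypothetical, and the use of the sample standard deviation (as R's `sd()` does) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-score: subtract the mean and divide by the standard deviation.

    Uses the sample standard deviation (n - 1), as R's sd() does; whether the
    original analysis used this form is an assumption of the sketch.
    """
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]


def representation_gap(total_richness, genome_richness):
    """Per-cell gap: total richness minus richness of species with reference
    genomes, standardized across grid cells."""
    raw_gap = [t - g for t, g in zip(total_richness, genome_richness)]
    return standardize(raw_gap)


# Hypothetical richness counts for five grid cells (not data from the study).
total = [120, 300, 900, 450, 80]
with_genome = [40, 60, 70, 65, 30]

gap_z = representation_gap(total, with_genome)
# Index of the grid cell with the largest standardized gap.
largest_gap_cell = max(range(len(gap_z)), key=gap_z.__getitem__)
```

Standardizing the difference puts the gap on the same unitless scale as the standardized richness values, so cells can be compared directly across maps.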
For the literature review, we then manually reviewed each study returned by these searches, first assigning the home institution of the first and last author to the Global North or Global South using the 2024 UN Trade and Development Classifications. We categorized each study's sequencing approach as reduced representation, WGS, Sanger sequencing, microsatellites, or other, and described its overall focus using tiered categories based on discussion in Bertola et al. (2024). These tiers were: 1) Taxonomy / systematics, identification, or sexing; 2) Phylogeography / population genetic structure, estimating genetic diversity, and inferring demographic history; and 3) Detecting outlier loci, quantifying runs of homozygosity, and evaluating adaptive potential. When studies employed more than one sequencing approach or addressed goals belonging to multiple tiers, we assigned them to a single category on the basis of their most data-intensive method or question.
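The single-category rule in the last sentence can be expressed as a small function. The sketch below is illustrative only: the ranking of approaches from least to most data-intensive in `APPROACH_RANK` is an assumption (the text does not give an explicit ordering), as is the reading that a higher tier number corresponds to a more data-intensive question, and the name `assign_category` is hypothetical.

```python
# Ordering of sequencing approaches from least to most data-intensive.
# This ranking is an assumption of the sketch; the text does not state one.
APPROACH_RANK = {
    "other": 0,
    "microsatellites": 1,
    "Sanger sequencing": 2,
    "reduced representation": 3,
    "WGS": 4,
}


def assign_category(approaches, tiers):
    """Collapse a study to one sequencing approach and one tier (1-3).

    Picks the highest-ranked approach and the highest tier number, assuming
    higher tiers reflect more data-intensive questions.
    """
    if not approaches or not tiers:
        raise ValueError("a study needs at least one approach and one tier")
    approach = max(approaches, key=APPROACH_RANK.__getitem__)
    return approach, max(tiers)
```

For example, a study that used both microsatellites and RADseq to describe population structure (Tier 2) and to aid species identification (Tier 1) would be recorded once, as a reduced representation, Tier 2 study.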

Results

Sampled Gradients of Species Richness. Our list of tetrapod reference genomes from NCBI included 1,159 bird species, 795 mammal species, 123 amphibian species, 39 turtle species, 6 crocodilian species, and the tuatara (Sphenodon punctatus). This total represented 6.8% of the 30,832 tetrapod species with a georeferenced preserved specimen record on GBIF. Species with a reference genome were associated with 3,048,136 specimens, or 40.7% of the 7,478,867 tetrapod specimens with available metadata. After filtering out species with 10 or fewer observations, we retained 1,859 species with a reference genome, or 8.6% of the 21,583 tetrapod species meeting the same criteria.

Minimum tetrapod species richness in both datasets was 1. Maximum species richness calculated from the reference genome dataset was 705, occurring in a grid cell centered on south Florida, USA (Figure 1C). Maximum species richness in the full dataset was 2,698, occurring in a grid cell centered on the western Amazon basin and east slope of the Andes in Ecuador near the Peruvian border (Figure 1C); this was also where the greatest gap between sequenced species and total species richness was observed (Figure 1A).

Species richness was negatively related to the absolute value of latitude in both regressions, albeit with a much steeper slope when data from all tetrapod species were included (reference genomes only, β = -4.892, adjusted R² = 0.6439, p < 0.001; full data, β = -18.65, adjusted R² = 0.7856, p < 0.001) (Figure 1B).
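For readers who want to see what these regression statistics correspond to, the following Python sketch fits richness against absolute latitude by ordinary least squares and reports the slope, intercept, and adjusted R². It is a stand-in for the simple linear regressions the text reports running in R, not a reproduction of them: the function name `ols_abs_latitude` is hypothetical, and the choice of one observation per grid cell is an assumption.

```python
def ols_abs_latitude(latitudes, richness):
    """Fit richness = intercept + slope * |latitude| by ordinary least squares.

    Returns (slope, intercept, adjusted R^2). Each element is assumed to be
    one grid cell; that unit of analysis is an assumption of this sketch.
    """
    x = [abs(lat) for lat in latitudes]
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    x_bar = sum(x) / n
    y_bar = sum(richness) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, richness))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, richness)
    )
    ss_tot = sum((yi - y_bar) ** 2 for yi in richness)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R^2 for a model with a single predictor.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return slope, intercept, adj_r2
```

A negative slope means richness declines away from the equator in both hemispheres; a slope closer to zero, as reported for the reference genome dataset, indicates a flatter gradient.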
Because data visualization indicated there might be a distinct breakpoint in the relationship between species richness and absolute latitude at mid latitudes, potentially reflecting a transition from the influence of sampling effort to true biogeographic signal, we fit an additional piecewise linear regression model with the R package segmented v.2.1-0. This model identified a breakpoint at an absolute latitude of 39.819°, fitting a segment with a slope of β = -0.312 before it and a segment with a slope of β = -7.3634 after it (p = 0.0408; adjusted R² = 0.7184).

Literature Review. After excluding papers that did not meet our stated criteria, we reviewed 394 empirical conservation genetics articles published between 1 January 2020 and 29 June 2024. This list included 248 papers from the journal Conservation Genetics, 11 papers from Journal of Heredity, and 35 articles from Molecular Ecology; no appropriate studies from Conservation Biology were identified. Of these, 62 included a first or senior author from the Global South, while 342 included a first or senior author from the Global North. Ninety-eight included sampling from a focal taxon or focal taxa in the Global South, with 277 sampling a taxon or taxa from the Global North. Microsatellites were the most commonly used sequencing strategy among Global South authors and in studies of Global South taxa, while reduced representation genome sequencing approaches were most common in the Global North.
For authors and taxa in both the Global South and the Global North, Tier 2 studies (phylogeography / population genetic structure, estimating genetic diversity, or inferring demographic history) were most common. Further details are provided in Table 1 and Table 2.

Discussion

Extant reference genomes fail to reflect the overwhelming concentration of tetrapod species richness in the tropics and are strongly biased towards species at mid-latitudes in the Northern Hemisphere (Figure 1). This pattern is almost certainly a result of global inequalities in economic development and their resulting effects on research productivity (May 1997; King 2004). Its consequences will likely include increasing an already profound methodological gap in sequencing approaches between molecular ecologists in the Global North and the Global South (Table 1).

Our analysis contrasts patterns inferred from both a traditional source of biodiversity data (vouchered specimens in NHCs) and a contemporary genomic resource archive, the NIH NCBI. Our description of the latitudinal gradient of total tetrapod species richness is broadly similar to other recent macroecological studies of the phenomenon (Roll et al. 2017; Quintero et al. 2023), with global hotspots concentrated in northwest South America and Central Africa (Figure 1C) and an approximately monotonic decline in latitudinal species richness maxima from the equator to the poles. In contrast, the latitudinal gradient of richness of tetrapod species with reference genomes is flattened, showing only moderate declines at high latitudes and a mid-latitude peak in species richness in the Northern Hemisphere (Figure 1B).
This difference is especially notable because we made no effort to correct for disparities in historical specimen collection across latitude, with the consequence that our ‘true’ species richness gradient significantly underestimates biodiversity in the tropics. Across longitude, our analysis appears to underestimate diversity in East Asia, Indonesia, and Oceania (Quintero et al. 2023), likely due to both the coarse grain of our study and these regions’ greater distance from the large NHCs in Europe and North America that are the backbone of curated GBIF data. Regardless, species with publicly available reference genomes as of July 2024 are more reflective of socioeconomic conditions than biogeographic reality.

If both natural history collections and contemporary bioinformatics resources reflect historical inequalities in development and scientific capacity, why do data from the former better approximate the latitudinal gradient of species richness? Part of the answer lies in their different goals: while the mission of many NHCs is explicitly to catalog and archive regional or global biodiversity, the NCBI Genome Browser is often used as a repository for open data publication requirements and is less often an end unto itself. Another part of the answer lies in shifting global politics: as the birth and golden age of NHCs coincided with the heyday of Western colonialism (De Vos 2007; Quintero Toro 2012), access to tropical habitat by collectors was less restricted by concerns of either sovereignty or Indigenous land tenure.
We are not naive enough to believe that scientific colonialism is no longer a problem (Asase et al. 2022; Soares et al. 2023). However, we suggest that a combination of evolving norms and persistent obstacles of cost have led many relatively well-resourced scientists in the Global North to prioritize generating reference genomes for local taxa. In spite of rapid declines in the per base pair cost of whole genome sequencing (Lou et al. 2021), high-coverage sequencing remains a significant expense: while averages are hard to come by in this increasingly privatized sector, one of us (E.L.) recently paid ~$13,000 (2024 USD) for long-read and HiC sequencing of a North American passerine bird. Even in North America, this figure likely pushes small, single-PI labs to prioritize investing in generating resources for species at the center of their research program, or otherwise likely to provide long-term utility, which are often those in their own backyards.

In the spirit of the traditional mission of NHCs, the past decade has seen several interrelated, international initiatives to increase taxonomic diversity in high-quality reference genomes (e.g. O’Brien et al. 2014; Koepfli et al. 2015; Cheng et al. 2018; Rhie et al. 2021). These collaborations have had a profound impact on biodiversity genomics, helping to close what was surely an even larger gap between empirical patterns of species richness and the species richness represented on NCBI. In some cases, they have provided researchers with early access to draft assemblies of nonmodel organisms (e.g., Linck et al. 2020) or with support and resources to produce new assemblies (e.g., Cadena et al. 2024), a scenario that suggests the resource gap may be slightly less dire in practice than reported here.
Yet in a world of limited time, finite resources, and incompletely described biodiversity, sequencing at scale is not immune to its own biases. For example, a recent publication introducing an attempt to generate reference genomes for all vertebrates (Rhie et al. 2021) included 127 authors affiliated with 102 institutions. Of these, only 13 are in the Global South: 4 in China, 3 in Korea, 2 in Malaysia, 1 in Singapore, 1 in Qatar, and 1 in Colombia. No authors or institutional affiliations from Africa, Indonesia, or Oceania outside of Australia and New Zealand were included. Though understandable considering the current distribution of scientists and resources (though not of species), the imbalance seems likely to perpetuate representational biases in the near term.

Current efforts by international collaborations like the Amphibian Genomics Consortium to increase geographic representation and to offer support and opportunities to researchers from developing countries and underrepresented groups (Kosch et al. 2024) are steps in the right direction on the path to making the field of biodiversity genomics more equitable.
In line with the Convention on Biological Diversity’s Nagoya Protocol (Secretariat of the Convention on Biological Diversity 2011), another critical dimension of the conversation about equitable generation of encyclopedias of reference genomes is the need for researchers to build strong partnerships with Indigenous peoples and other local communities, allowing them to participate in and benefit from the different phases and products of sequencing projects (Ambler et al. 2021; Colella et al. 2023; Mc Cartney et al. 2023).

If tropical species are underrepresented on NCBI, we would expect that they are only rarely studied using whole-genome resequencing (and other sequencing strategies dependent on a reference genome). Our review of conservation genetics papers published in Molecular Ecology, Journal of Heredity, Conservation Genetics, and Conservation Biology over the last five years suggests this is indeed the case. While the Global South / Global North binary and measures of human development more generally are only imperfectly correlated with latitude, their association nonetheless indicates that WGS is only rarely applied in conservation genetics studies in the tropics (n=6), and almost never by a leading researcher with a primary affiliation to a research institution in the region (n=1) (Table 1).

Similarly, for Global South scientists and for Global South focal taxa, microsatellites remain the most common molecular approach; for scientists and focal taxa in the Global North, reduced representation approaches dominate.
We believe this is reflective of the expense and limited availability of high-throughput sequencing in the Global South, regardless of whether reads are assembled de novo or aligned to a reference. Lastly, we point out that across all categories, the vast majority of scientists (87%) and species (77%) in our sample originate in the Global North. Though our choice of North American or European-based, English language journals precludes generalization, it nevertheless seems safe to interpret this as indicating that the field of conservation genetics, let alone conservation genomics, is in its infancy in the tropics. Setting appropriate goals, targets, and indicators to effectively conserve and monitor global genetic diversity will require this situation to be remedied (Hoban et al. 2021).

We highlight discrepancies between available reference genomes and global biogeographic patterns to encourage increased, equitable collaboration between scientists in the Global North and Global South. In light of this, we make three simple recommendations (see also Bertola et al. 2024). First, we encourage scientists from resource-rich institutions to consider allocating effort and funds towards generating reference genomes that serve the needs of managers and researchers in the Global South. Second, we support the continued development of multinational sequencing projects, but ask funders and senior personnel to increasingly consider prioritization, inclusion, and capacity building in areas of the world with rich biodiversity and limited resources to study it using genomic tools. Third, we ask journals to consider issues of access and cost in editorial guidelines and decisions: while high-throughput sequencing is increasingly expected by editorial boards and reviewers at high-impact journals, it is not essential to address a variety of research questions (Bertola et al. 2024) and remains out of reach for most scientists residing where most of the world’s species occur. If ecology, evolution, and conservation aim to accurately catalog and effectively protect life on Earth, remedying inequalities in genomic resources should be a major priority.

Data Accessibility: DOIs for GBIF downloads, processed datasets, and a digital notebook containing code to perform these analyses and generate Figure 1 are available at https://github.com/elinck/lat_grad_genome and from Data Dryad (pending).

Acknowledgments: We thank Marty Kardos for the invitation to participate in this special issue.

References

Ambler, J., Dearden, P. K., Wilcox, P., Hudson, M., & Tiffin, N. (2021). Including digital sequence data in the Nagoya Protocol can promote data sharing. Trends in Biotechnology, 39(2), 116-125.
Asase, A., Mzumara-Gawa, T. I., Owino, J. O., Peterson, A. T., & Saupe, E. (2022). Replacing “parachute science” with “global science” in ecology and conservation biology. Conservation Science and Practice, 4(5), e517.
Avise, J. C. (2000). Phylogeography: the history and formation of species. Harvard University Press.
Bertola, L. D., Brüniche-Olsen, A., Kershaw, F., Russo, I. R. M., MacDonald, A. J., Sunnucks, P., ... & Segelbacher, G. (2024). A pragmatic approach for integrating molecular tools into biodiversity conservation. Conservation Science and Practice, 6(1), e13053.
Bradley, M. (2008). On the agenda: North–South research partnerships and agenda-setting processes. Development in Practice, 18(6), 673-685.
Cadena, C. D., Pabón, L., DoNascimiento, C., Abueg, L., Tilley, T., O-Toole, B., ... & Torres, M. (2024). A reference genome for the Andean cavefish Trichomycterus rosablanca (Siluriformes, Trichomycteridae): Building genomic resources to study evolution in cave environments. Journal of Heredity, 115(3), 311-316.
Cheng, S., Melkonian, M., Smith, S. A., Brockington, S., Archibald, J. M., Delaux, P. M., ... & Wong, G. K. S. (2018). 10KP: A phylodiverse genome sequencing plan. GigaScience, 7(3), giy013.
Colella, J. P., Silvestri, L., Súzan, G., Weksler, M., Cook, J. A., & Lessa, E. P. (2023). Engaging with the Nagoya Protocol on Access and Benefit-Sharing: recommendations for noncommercial biodiversity researchers. Journal of Mammalogy, 104(3), 430-443.
Collen, B., Ram, M., Zamin, T., & McRae, L. (2008). The tropical biodiversity data gap: addressing disparity in global monitoring. Tropical Conservation Science, 1(2), 75-88.
Cornwell, W. K., Pearse, W. D., Dalrymple, R. L., & Zanne, A. E. (2019). What we (don't) know about global plant diversity. Ecography, 42(11), 1819-1831.
De Vos, P. S. (2007). Natural history and the pursuit of empire in eighteenth-century Spain. Eighteenth-Century Studies, 40(2), 209-239.
Ekblom, R., & Galindo, J. (2011). Applications of next generation sequencing in molecular ecology of non-model organisms. Heredity, 107(1), 1-15.
Ellegren, H. (2014). Genome sequencing and population genomics in non-model organisms. Trends in Ecology & Evolution, 29(1), 51-63.
Feeley, K. J., & Silman, M. R. (2011). The data void in modeling current and future distributions of tropical species. Global Change Biology, 17(1), 626-630.
Freeman, B. G., & Pennell, M. W. (2021). The latitudinal taxonomy gradient. Trends in Ecology & Evolution, 36(9), 778-786.
Hillis, D. M., Moritz, C., & Mable, B. K. (1996). Molecular systematics (Vol. 23). Sinauer.
Hoban, S. M., Hauffe, H. C., Pérez-Espona, S., Arntzen, J. W., Bertorelle, G., Bryja, J., ... & Bruford, M. W. (2013). Bringing genetic diversity to the forefront of conservation policy and management. Conservation Genetics Resources, 5, 593-598.
Hoban, S., Bruford, M. W., Funk, W. C., Galbusera, P., Griffith, M. P., Grueber, C. E., ... & Vernesi, C. (2021). Global commitments to conserving and monitoring genetic diversity are now necessary and feasible. BioScience, 71(9), 964-976.
Hull, D. L. (1990). Science as a process: an evolutionary account of the social and conceptual development of science. University of Chicago Press.
King, D. A. (2004). The scientific impact of nations. Nature, 430(6997), 311-316.
Koepfli, K. P., Paten, B., Genome 10K Community of Scientists, & O’Brien, S. J. (2015). The Genome 10K Project: a way forward. Annu. Rev. Anim. Biosci., 3(1), 57-111.
Kosch, T. A., Torres-Sanchez, M., Liedtke, H. C., Summers, K., Yun, M. H., Crawford, A. J., ... & Amphibian Genomics Consortium (AGC). (2024). The Amphibian Genomics Consortium: advancing genomic and genetic resources for amphibian research and conservation. bioRxiv, 2024-06.
Lemmon, E. M., & Lemmon, A. R. (2013). High-throughput genomic data in systematics and phylogenetics. Annual Review of Ecology, Evolution, and Systematics, 44, 99-121.
Linck, E., Freeman, B. G., & Dumbacher, J. P. (2020). Speciation and gene flow across an elevational gradient in New Guinea kingfishers. Journal of Evolutionary Biology, 33(11), 1643-1652.
Longino, H. (2019). The Social Dimensions of Scientific Knowledge. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2019 Edition). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2019/entries/scientific-knowledge-social/
Lou, R. N., Jacobs, A., Wilder, A. P., & Therkildsen, N. O. (2021). A beginner's guide to low-coverage whole genome sequencing for population genomics. Molecular Ecology, 30(23), 5966-5993.
May, R. M. (1997). The scientific wealth of nations. Science, 275(5301), 793-796.
Mc Cartney, A. M., Head, M. A., Tsosie, K. S., Sterner, B., Glass, J. R., Paez, S., ... & Hudson, M. (2023). Indigenous peoples and local communities as partners in the sequencing of global eukaryotic biodiversity. npj Biodiversity, 2(1), 8.
Melles, S. J., Scarpone, C., Julien, A., Robertson, J., Levieva, J. B., Carrier, C., ... & Morales, K. (2019). Diversity of practitioners publishing in five leading international journals of applied ecology and conservation biology, 1987–2015 relative to global biodiversity hotspots. Ecoscience, 26(4), 323-340.
Meneghini, R., Packer, A. L., & Nassi-Calo, L. (2008). Articles by Latin American authors in prestigious journals have fewer citations. PLoS ONE, 3(11), e3804.
Nori, J., Loyola, R., & Villalobos, F. (2020). Priority areas for conservation of and research focused on terrestrial vertebrates. Conservation Biology, 34(5), 1281-1291.
O’Brien, S. J., Haussler, D., & Ryder, O. (2014). The birds of Genome10K. GigaScience, 3(1), 2047-217X.
Opitz, D. L. (2004). Aristocrats and professionals: Country-house science in late-Victorian Britain. University of Minnesota.
Opitz, D. L. (2006). "This House is a Temple of Research": Country-House Centres for Late-Victorian Science.
Pasterkamp, G., Rotmans, J., de Kleijn, D., & Borst, C. (2007).
Citation frequency: A 374 biased measure of research impact significantly influenced by the geographical origin of research 375 articles. Scientometrics, 70(1), 153-165. 376 Perez, T. M., & Hogan, J. A. (2018). The changing nature of collaboration in tropical 377 ecology and conservation. Biotropica, 50(4), 563-567. 378 Quintero, I., Landis, M. J., Jetz, W., & Morlon, H. (2023). The build-up of the present-379 day tropical diversity of tetrapods. Proceedings of the National Academy of Sciences, 120(20), 380 e2220672120. 381 Quintero Toro, C. (2012). Birds of empire, birds of nation: A history of science, 382 economy, and conservation in United States-Colombia relations. Ediciones Uniandes-383 Universidad de los Andes. 384 Reddy, S. (2014). What’s missing from avian global diversification analyses?. Molecular 385 Phylogenetics and Evolution, 77, 159-165. 386 Rhie, A., McCarthy, S. A., Fedrigo, O., Damas, J., Formenti, G., Koren, S., ... & Jarvis, 387 E. D. (2021). Towards complete and error-free genome assemblies of all vertebrate species. 388 .CC-BY-NC 4.0 International licenseavailable under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (whichthis version posted July 13, 2024. ; https://doi.org/10.1101/2024.07.09.602657doi: bioRxiv preprint Nature, 592(7856), 737-746. 389 Roll, U., Feldman, A., Novosolov, M., Allison, A., Bauer, A. M., Bernard, R., ... & Meiri, 390 S. (2017). The global distribution of tetrapods reveals a need for targeted reptile conservation. 391 Nature Ecology & Evolution, 1(11), 1677-1682. 392 Secretariat of the Convention on Biodiversity (2011). Nagoya Protocol on Access to 393 Genetic Resources and the Fair and Equitable Sharing of Benefits Arising from their Utilization 394 to the Convention on Biological Diversity : text and annex. Montreal, Canada, Secretariat of the 395 Convention on Biodiversity, 15pp. 
DOI: http://dx.doi.org/10.25607/OBP-789 396 Seth, S. (2009). Putting knowledge in its place: science, colonialism, and the 397 postcolonial. Postcolonial studies, 12(4), 373-388. 398 Soares, L., Cockle, K. L., Ruelas Inzunza, E., Ibarra, J. T., Miño, C. I., Zuluaga, S., ... & 399 Martins, P. V. R. (2023). Neotropical ornithology: Reckoning with historical assumptions, 400 removing systemic barriers, and reimagining the future. Ornithological Applications, 125(1), 401 duac046. 402 Stocks, G., Seales, L., Paniagua, F., Maehr, E., & Bruna, E. M. (2008). The geographical 403 and institutional distribution of ecological research in the tropics. Biotropica, 40(4), 397-404. 404 Stephan, P. (2012). How economics shapes science. Harvard University Press. 405 Taylor, H. R., Dussex, N., & van Heezik, Y. (2017). Bridging the conservation genetics 406 gap by identifying barriers to implementation for conservation practitioners. Global Ecology and 407 Conservation, 10, 231-242. 408 Titley, M. A., Snaddon, J. L., & Turner, E. C. (2017). Scientific research on animal 409 biodiversity is systematically biased towards vertebrates and temperate regions. PloS one, 410 12(12), e0189577. 411 .CC-BY-NC 4.0 International licenseavailable under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (whichthis version posted July 13, 2024. ; https://doi.org/10.1101/2024.07.09.602657doi: bioRxiv preprint Toews, D. P., Campagna, L., Taylor, S. A., Balakrishnan, C. N., Baldassarre, D. T., 412 Deane-Coe, P. E., ... & Winger, B. M. (2016). Genomic approaches to understanding population 413 divergence and speciation in birds. The Auk: Ornithological Advances, 133(1), 13-30. 414 Willig, M. R., Kaufman, D. M., & Stevens, R. D. (2003). Latitudinal gradients of 415 biodiversity: pattern, process, scale, and synthesis. 
Annual review of ecology, evolution, and 416 systematics, 34(1), 273-309. 417 Vamosi, J. C., & Vamosi, S. M. (2008). Extinction risk escalates in the tropics. PLoS 418 One, 3(12), e3886. 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 .CC-BY-NC 4.0 International licenseavailable under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (whichthis version posted July 13, 2024. ; https://doi.org/10.1101/2024.07.09.602657doi: bioRxiv preprint Figures & Tables: 435 Table 1. Summary of regional authorship affiliation, sequencing strategy, and sampled 436 focal species range for empirical conservation genetics papers from 2019-2024 in four leading 437 journals. Integers indicate the total number of studies in each category, while numbers in 438 parentheses refer to its proportion out of all reviewed articles (n=394). Papers were assigned to a 439 sequencing strategy based on the most data-intensive approach they employed (i.e., a study 440 applying both Sanger sequencing and microsatellites would be assigned to the ‘Microsatellites’ 441 category.) 442 443 Sequencing Strategy Global South Author Global North Author Global South Taxon Global North Taxon Sanger 13 (0.0329) 36 (0.0913) 21 (0.0532) 21 (0.0532) Microsatellites 28 (0.0710) 123 (0.3121) 40 (0.1015) 103 (0.2614) Reduced Representation 15 (0.0381) 146 (0.3706) 27 (0.0685) 127 (0.3223) WGS 2 (0.0050) 27 (0.0685) 6 (0.0152) 18 (0.0456) Other 4 (0.0101) 10 (0.0254) 4 (0.0101) 8 (0.0203) 444 445 446 447 448 .CC-BY-NC 4.0 International licenseavailable under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (whichthis version posted July 13, 2024. ; https://doi.org/10.1101/2024.07.09.602657doi: bioRxiv preprint Table 2. 
Summary of regional authorship affiliation, study goals, and sampled focal 449 species range among reviewed papers. Study goals refer to broad tiers of research questions with 450 increasing data requirements. Papers were assigned to each on the basis of the most data-451 intensive analysis they employed (i.e., a paper inferring population genetic structure and 452 identifying loci under selection would be assigned to the tier 3). 453 454 Study Goals Global South Author Global North Author Global South Taxon Global North Taxon 1. Taxonomy / systematics, identification, or sexing 6 (0.0152) 25 (0.0635) 11 (0.0279) 17 (0.04315) 2. Phylogeography / population genetic structure, estimating genetic diversity, and inferring demographic history 61 (0.1548) 285 (0.7234) 102 (0.2588) 248 (0.6294) 3. Detecting outlier loci, quantifying runs of homozygosity, and evaluating adaptive potential 1 (0.0025) 35 (0.0888) 4 (0.0102) 31 (0.0787) 455 456 .CC-BY-NC 4.0 International licenseavailable under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (whichthis version posted July 13, 2024. ; https://doi.org/10.1101/2024.07.09.602657doi: bioRxiv preprint 457 Figure 1. A) Tropical species are underrepresented among available reference genomes. Colors 458 reflect the standardized difference between total species richness and the number of species with 459 an assembled reference genome on the NCBI genome browser. B) Richness among species with 460

Reference

genomes does not reflect global patterns of biodiversity. Blue circles represent total 461 local species richness in 2-degree by 2-degree grid cells, while gold circles represent richness of 462 species with assembled reference genomes. C) Global patterns of species richness calculated 463 from species with reference genomes and all species with NHC specimen records on GBIF, 464 respectively. 465 .CC-BY-NC 4.0 International licenseavailable under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (whichthis version posted July 13, 2024. ; https://doi.org/10.1101/2024.07.09.602657doi: bioRxiv preprint
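
The assignment rule described in the Table 1 caption, and the proportions reported in parentheses, can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the ranking of strategies beyond "microsatellites outrank Sanger sequencing", the treatment of unranked methods as "Other", and the example studies are all assumptions made here; only the total of 394 reviewed articles comes from the caption.

```python
from collections import Counter

# Assumed ordering from least to most data-intensive. The caption only
# confirms that microsatellites outrank Sanger sequencing.
INTENSITY_ORDER = ["Sanger", "Microsatellites", "Reduced Representation", "WGS"]
N_REVIEWED = 394  # total reviewed articles, from the Table 1 caption


def assign_strategy(methods):
    """Return the most data-intensive method a study used (assumed ranking)."""
    ranked = [m for m in methods if m in INTENSITY_ORDER]
    if not ranked:
        return "Other"  # assumption: unranked methods fall into 'Other'
    return max(ranked, key=INTENSITY_ORDER.index)


def proportions(assignments, n_total=N_REVIEWED):
    """Map each category to (count, count / n_total)."""
    counts = Counter(assignments)
    return {cat: (n, n / n_total) for cat, n in counts.items()}


# Hypothetical studies, each listing every method it applied.
studies = [
    ["Sanger", "Microsatellites"],
    ["Sanger"],
    ["Reduced Representation", "Microsatellites"],
    ["WGS", "Sanger"],
    ["eDNA metabarcoding"],
]
assigned = [assign_strategy(s) for s in studies]
summary = proportions(assigned, n_total=len(studies))
```

Applied to the real counts, the same calculation gives, for instance, 27 / 394 ≈ 0.0685 for the 27 WGS studies with a Global North author, which matches the corresponding entry in Table 1.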

