Optimizing Cost-Effective Gene Expression Phenotyping Approaches in Cattle Using 3′ mRNA Sequencing

doi:10.1101/2024.06.18.599599

2024 · doi:10.1101/2024.06.18.599599

preprint OA: closed

Abstract

Background

Genetic and genomic selection programs require large numbers of phenotypes observed for animals in shared environments. Phenotypes like meat quality, methane emission, and disease susceptibility are critically important to livestock production, but measuring them directly at scale is difficult and expensive. Our work leans on our understanding of the “Central Dogma” of molecular genetics to leverage molecular intermediates as cheaply measured proxies of organism-level phenotypes. The rapidly declining cost of next-generation sequencing presents opportunities for population-level molecular phenotyping. While the cost of whole transcriptome sequencing has declined recently, its required sequencing depth still makes it an expensive choice for wide-scale molecular phenotyping. We aim to optimize 3′ mRNA sequencing (3′ mRNA-Seq) approaches for collecting cost-effective proxy molecular phenotypes for cattle from easy-to-collect tissue samples (i.e., whole blood). We used matched 3′ mRNA-Seq samples from 15 Holstein male calves in a heat stress trial to identify 1) the better of two library preparation kits (Takara SMART-Seq v4 3′ DE and Lexogen QuantSeq) and 2) the optimal sequencing depth (0.5 to 20 million reads/sample) for capturing gene expression phenotypes most cost-effectively.

Results

Takara SMART-Seq v4 3′ DE outperformed Lexogen QuantSeq libraries across all metrics: number of quality reads, expressed genes, informative genes, differentially expressed genes, and 3′ biased intragenic variants. Serial downsampling analyses identified that as few as 8.0 million reads per sample could effectively capture most of the between-sample variation in gene expression. However, progressively more reads did provide marginal increases in recall across metrics. These 3′ mRNA-Seq reads can also capture animal genotypes that could be used as the basis for downstream imputation. The 10 million read downsampled groups called an average of 104,386 SNPs and 20,131 INDELs, many of which segregate at moderate minor allele frequencies in the population.
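The idea behind a serial downsampling analysis can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a per-sample vector of gene-level read counts is already available and thins those counts binomially, rather than subsampling raw reads before alignment. The function names (`downsample_counts`, `log_cpm`, `depth_curve`) and the toy count vector are hypothetical, and correlation with the full-depth profile is only one possible recall metric.

```python
import numpy as np


def downsample_counts(counts, target_depth, rng):
    """Binomially thin one sample's gene counts to about target_depth reads."""
    counts = np.asarray(counts, dtype=np.int64)
    total = counts.sum()
    if target_depth >= total:
        # Nothing to remove: the sample is already at or below the target.
        return counts.copy()
    # Each read is kept independently with the same probability.
    keep_prob = target_depth / total
    return rng.binomial(counts, keep_prob)


def log_cpm(counts):
    """log2 counts-per-million with a pseudocount of 1."""
    counts = np.asarray(counts, dtype=float)
    return np.log2(counts / counts.sum() * 1e6 + 1.0)


def depth_curve(counts, depths, seed=0):
    """Pearson correlation of each downsampled profile with the full profile."""
    rng = np.random.default_rng(seed)
    full = log_cpm(counts)
    curve = {}
    for depth in depths:
        sub = downsample_counts(counts, depth, rng)
        curve[depth] = float(np.corrcoef(full, log_cpm(sub))[0, 1])
    return curve


if __name__ == "__main__":
    # Toy count vector (an assumption for illustration, not data from the paper).
    toy_counts = np.arange(1, 2001) * 1000
    for depth, r in depth_curve(toy_counts, [10_000, 1_000_000, 10_000_000]).items():
        print(f"{depth:>12,d} reads: r = {r:.4f}")
```

Plotting such a curve for each sample shows where additional reads stop adding much information, which is the kind of plateau the abstract reports at about 8.0 million reads per sample.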

Conclusion

This work demonstrates that 3′ mRNA-Seq with Takara SMART-Seq v4 3′ DE can provide an incredibly cost-effective (<$25/sample) approach to quantifying molecular phenotypes (gene expression) while discovering sufficient variation for use in genotype imputation. Ongoing work is evaluating the accuracy of imputation and the ability of much larger datasets to predict individual animal phenotypes.

Competing Interest Statement

The authors have declared no competing interest.
