Abstract
Tandem repeats (TRs) are among the most mutable loci in the human genome, but the genomic determinants of TR mutagenesis remain mysterious. We used PacBio HiFi long-read sequencing to profile nearly eight million TR loci in 28 members of a large, four-generation CEPH/Utah family designated K1463. We identified 1,270 de novo TR expansions and contractions across 20 children in the pedigree. De novo mutations (DNMs) were more likely to occur at loci that were longer, composed of uninterrupted motif sequences, and heterozygous in the parental germline. Children born to older fathers also exhibited more de novo mutations at short tandem repeats (STRs). A total of 43 TR loci were hyper-mutable in K1463, expanding or contracting up to twelve times across the pedigree. Among hyper-mutable loci that comprised multiple motifs (i.e., “complex” loci), specific motifs expanded and contracted more often than others; for example, all ten DNMs at a complex, hyper-mutable locus near the non-coding RNA LINC03021 involved the same 19bp motif. The mutability of particular motifs may be attributable to allele length, as 95% of DNMs at complex loci were expansions and contractions of the most abundant motif on a parental haplotype. However, future work will be required to disentangle the effects of nucleotide content and allele length on motif-specific mutability, especially at hyper-mutable TRs. Overall, this study combines long-read sequencing technologies with new software tools to comprehensively investigate the factors that influence TR mutagenesis.
Competing Interest Statement
E.E.E. is a scientific advisory board member of Variant Bio. E.D., T.M., Z.K., G.S.B., W.J.R., and M.A.E. are employees and/or shareholders of PacBio. M.A.E. is an employee of GeneDx. The other authors declare no competing interests.
Footnotes
We have added a brief new analysis of hyper-mutable and complex TR loci. We show that nearly all motifs that expand or contract at complex TR loci are the most abundant motifs on parental alleles (see new Supplementary Fig. 6). We have also added a number of co-authors who contributed to the generation, processing, and management of sequencing data, as well as to the software methods used in this manuscript. Regrettably, we did not include these co-authors in our original submission. We gratefully acknowledge their contributions.