Draft genome of Angelica Biserrata, a traditional Chinese medicinal herb of the Angelica genus (Apiaceae)

Data Note
Yuan-jiang Xu, Li Li, Xue Liu, Fang-yu Zhao, Yi-Quan Zhou, Xian-you Qu
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6430141/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 29 Oct, 2025 in BMC Genomic Data.

Abstract
Objectives
Angelica biserrata (commonly known as “Duhuo”), a traditional Chinese medicinal herb of the genus Angelica within the Apiaceae family, is clinically valued for its therapeutic effects in dispelling wind-dampness and alleviating arthralgia.
Its pharmacological properties are primarily attributed to coumarins. To elucidate the molecular mechanisms underlying coumarin biosynthesis and facilitate the breeding of high-coumarin cultivars, we present the first draft genome assembly and annotation of A. biserrata. This assembly will provide novel insights into coumarin biosynthesis and support evolutionary studies.

Data description
The genome of A. biserrata was sequenced using PacBio HiFi technology, generating 8.83 million high-fidelity reads with an average length of 14.2 kb (125.34 Gb, sequencing coverage 41×). The reads were assembled into a draft genome of 4.52 Gb with a contig N50 of 35.72 Mb. Chromosome-scale scaffolding was then performed using 300.87 Gb of Hi-C data, resulting in a final genome assembly of 3.89 Gb with improved continuity (contig N50 = 34.42 Mb, scaffold N50 = 325.77 Mb). Genome completeness was 96.59%, as assessed with Benchmarking Universal Single-Copy Orthologs (BUSCO) against the embryophyta database of OrthoDB 10. In addition, 3811.62 Mb of sequence was anchored to 11 chromosomes, accounting for 97.86% of the assembly.

Keywords: Angelica biserrata, Hi-Fi, Genome, Hi-C, Genome annotation

Objective
Angelica biserrata, originally named Angelica pubescens f. biserrata, belongs to the Apiaceae family. Its dried root is a traditional Chinese medicine [1]. According to the Flora of China (FOC), it is distributed mainly in Chongqing, Anhui, Hubei, Jiangxi, and Sichuan Provinces, among others, at altitudes of 1000–1700 m. A. biserrata has different common names across East Asia: it is known as “Duhuo” or “Rouduhuo” in China, “Dokwhal” in South Korea, and “Dokkatsu” in Japan [2]. The medicinal properties of A. biserrata were first recorded in “Shen Nong Ben Cao Jing” (Eastern Han Dynasty). It is used mainly to treat rheumatism, headache and other diseases [3]. Coumarins are considered the active ingredients of A.
biserrata and feature a wide range of pharmacological activities, including anti-inflammatory, antitumor and antimicrobial properties; they also take part in plant growth and development, such as seed germination, defense against plant pathogens and stress responses [4-5]. However, their complex biosynthetic pathways are not fully understood. The chromosome-level genome of Angelica sinensis (2.3 Gb) has been assembled, providing novel genomic insights into the biosynthetic machinery of simple coumarins [6]. The coumarins in A. biserrata are more complex and diverse than those of A. sinensis. High-quality reference genomes are important for research on the resource protection of A. biserrata and the biosynthetic pathways of coumarins and other active ingredients. This study completed the first assembly and annotation of the A. biserrata genome. This work provides critical data for further analysis of the biosynthetic pathways of the active ingredients of A. biserrata, the development of molecular marker-assisted breeding systems, and the formulation of scientific conservation strategies. Moreover, sequencing the genome of A. biserrata will be helpful for analyzing the evolutionary history of Umbelliferae medicinal plants.

Data
Several leaves were collected from a plant of A. biserrata cultivated in Lanying Township, Wuxi County, Chongqing City, China (N 29.531106°, E 106.598273°), snap-frozen in liquid nitrogen, and stored at -80 °C in the laboratory. Prior to sequencing, high-quality DNA and total RNA were extracted from the samples. First, genome survey sequencing (DNBSEQ-T7) was performed on the Illumina sequencing platform, and 161.67 Gb of data (data file 1) were obtained after quality control. Subsequently, approximately 125.34 Gb of data were generated through Hi-Fi sequencing on the PacBio Revio sequencer (data file 2).
A Hi-C library was constructed by cell cross-linking and Hind III restriction enzyme digestion and then sequenced on the DNBSEQ-T7 sequencer in 150-bp paired-end mode. Hi-C sequencing yielded approximately 300.87 Gb of data (data file 3). Finally, RNA-seq was performed on the Illumina X-plus sequencer, generating approximately 6 Gb of data (data file 4), which were used to assist in annotation.

The genome survey suggested that the genome size of A. biserrata is approximately 3.03 Gb, with a heterozygosity rate of approximately 0.27% and a repetitive sequence content of approximately 83.59%, indicating a large and complex genome. HiFi sequencing yielded 8.83 million high-fidelity reads with an average length of 14.2 kb. Hifiasm (v0.19) [7] was used to assemble the genome, and the total length of the contig sequences was 4.55 Gb (N50 = 35.4 Mb). After nucleotide alignment and removal of redundancy based on the Hi-C signal, the total length of the contig sequences was 3.89 Gb, and the contig N50 was 34.42 Mb. Based on OrthoDB 10 (http://cegg.unige.ch/orthodb) and BUSCO (v5.2.1) [8], the completeness of the A. biserrata genome was determined to be 96.59%. HiC-Pro (v2.10.0) [9] was used to filter and evaluate the Hi-C data to obtain high-quality clean data. The paired-end reads were aligned to the assembled genome with BWA [10] (version 0.7.17-r1188; alignment mode: aln; other parameters default). The contigs were clustered, ordered and oriented with LACHESIS [11], and the final version of the genome was obtained after Hi-C error correction, Hi-C-assisted chromosome anchoring and redundancy removal. The final assembly was 3894.84 Mb, with a contig N50 of 34.42 Mb and a scaffold N50 of 325.77 Mb (data file 5). Of this, 3811.62 Mb was anchored to 11 chromosomes, accounting for 97.86% of the genome, indicating high integrity of the assembly.
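To show how the assembly statistics reported here relate to one another, the following minimal Python sketch computes an N50 from a list of sequence lengths and recomputes the chromosome anchoring rate and HiFi coverage from the figures given in this note. It is not the authors' pipeline: the function names and the toy contig lengths are hypothetical, and using the 3.03 Gb survey estimate as the coverage denominator is an assumption.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Hypothetical toy contig lengths (arbitrary units), not data from this study.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Figures taken from the text (Mb and Gb respectively).
anchored_pct = percent(3811.62, 3894.84)  # sequence anchored to 11 chromosomes
hifi_coverage = 125.34 / 3.03             # assumed denominator: survey genome size

print(f"toy N50: {toy_n50}")
print(f"anchored: {anchored_pct:.2f}%")
print(f"HiFi coverage: {hifi_coverage:.1f}x")
```

For a real assembly, the lengths passed to `n50` would come from the contig or scaffold records of the FASTA file (data file 6 in Table 1).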
Then, RepeatModeler2 (v2.0.1) [12], RepeatMasker (v4.1.2) [13] and other software were used for ab initio prediction of repeat sequences; Augustus (v3.1.0) [14] and SNAP (2006-07-28) [15] were used for ab initio prediction of gene structures; GeMoMa (v1.7) [16] was used for homology-based gene prediction; and PASA (v2.4.1) [17] was used for transcriptome-assisted gene prediction. Finally, EVM (v1.1.1) [18] was used to combine the prediction results from the above three methods, and 98.66% of the genes (a total of 44603 genes) were annotated (data file 6), indicating high annotation quality.

Table 1: Overview of data files

| Label | Name of data file | File types (file extension) | Data repository and identifier (DOI or accession number) |
|---|---|---|---|
| Data file 1 | Raw short whole genome Illumina sequencing reads | Fastq file (.fastq) | CNCB Big sub-Genome Sequence Archive (GSA) Accession number CRA024320. https://ngdc.cncb.ac.cn/gsa/browse/CRA024320 [19] |
| Data file 2 | Raw long whole genome Hi-Fi sequencing reads | Fastq file (.fastq) | CNCB Big sub-Genome Sequence Archive (GSA) Accession number CRA024335. (Data submission, in verification) https://ngdc.cncb.ac.cn/gsa/browse/CRA024335 [20] |
| Data file 3 | Raw Hi-C reads | Fastq file (.fastq) | CNCB Big sub-Genome Sequence Archive (GSA) Accession number (Data submission to CNCB subCRA039514, in verification). [21] |
| Data file 4 | Raw RNA-seq reads | Fastq file (.fastq) | CNCB Big sub-Genome Sequence Archive (GSA) Accession number CRA024296. https://ngdc.cncb.ac.cn/gsa/browse/CRA024296 [22] |
| Data file 5 | Genome annotation | Gff3 file (.gff3) | Figshare, 10.6084/m9.figshare.28740092 [23] |
| Data file 6 | Assembled genome | Fasta file (.fa) | CNCB Big sub-Genome Warehouse (GWH) Accession number GWHFQYN00000000.1 https://ngdc.cncb.ac.cn/gwh/Assembly/reviewersPage/NYBbdmgyzOuHzkThMbjkEnifHkIOIJwyuCYcLbNdtKbdBIlAwoQYdynfzdbKqPdT [24] |

Limitations
Although we have successfully assembled a chromosome-level genome of A.
biserrata (Duhuo), the biosynthetic pathways of its active compounds (e.g., coumarins) and the genetic mechanisms governing its medicinal properties require further investigation. Future studies will employ integrated multiomics approaches involving comparative genomics, metabolomics, transcriptomics, and in vitro enzymatic functional validation to systematically elucidate the complex coumarin biosynthesis network and its regulation in this medicinal species.

Abbreviations
BUSCO: Benchmarking Universal Single-Copy Orthologs
CNCB: China National Center for Bioinformation
PacBio: Pacific Biosciences
Hi-C: High-throughput chromosome conformation capture

Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Availability of data and materials
The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive (GSA) of the National Genomics Data Center. The accession numbers for the Illumina, Hi-Fi, Hi-C, and RNA-seq data are CRA024320, CRA024335, CRA024295 and CRA024296, respectively. The assembled genome reported in this paper has been deposited in the Genome Warehouse (GWH) at the National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, under accession number GWHFQYN00000000.1. The results of the genome annotations have been uploaded to Figshare. Please see Table 1 for details and links to the data.
Competing interests
The authors declare that they have no competing interests.
Funding
The study was funded by the Analysis of Synthetic Pathway Study of Key Enzyme Genes of Chongqing Post-Bo Foundation (grant number cstc2021jcyj-bshX0015).
Authors’ contributions
Y. J. X. and L. L. were jointly responsible for sample organization and sequencing data analysis. Y. J. X. wrote the manuscript, and Y. Q. Z. and X. Y. Q. revised the manuscript. L. L. and X. L. assisted in genome assembly and annotation. Y. Q. Z. and X.
Y. Q. conceived and designed this project. All the authors have read and approved the final version of this manuscript.
Acknowledgements
The authors appreciate the use of biomarkers for providing high-quality sequencing.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References
1. Chinese Pharmacopoeia Commission. 2020. Pharmacopoeia of the People's Republic of China. Vol. Beijing: China Medical Science Publisher. 274.
2. Ma J, Huang J, Hua S, Zhang Y, Zhang Y, Li T, Dong L, Gao Q, Fu X. The ethnopharmacology, phytochemistry and pharmacology of Angelica biserrata - A review. J Ethnopharmacol. 2019;231:152-169. doi: 10.1016/j.jep.2018.10.040.
3. Lu Y, Wu H, Yu X, Zhang X, Luo H, Tang L, Wang Z. Traditional Chinese Medicine of Angelicae Pubescentis Radix: A Review of Phytochemistry, Pharmacology and Pharmacokinetics. Front Pharmacol. 2020;11. doi: 10.3389/fphar.2020.00335.
4. Venugopala KN, Rashmi V, Odhav B. Review on natural coumarin lead compounds for their pharmacological activity. Biomed Res Int. 2013:963248. doi: 10.1155/2013/963248.
5. Zhao Y, He Y, Han L, Zhang L, Xia Y, Yin F, Wang X, Zhao D, Xu S, Qiao F, Xiao Y, Kong L. Two types of coumarins-specific enzymes complete the last missing steps in pyran- and furanocoumarins biosynthesis. Acta Pharm Sin B. 2024;14(2):869-880. doi: 10.1016/j.apsb.2023.10.016.
6. Han X, Li C, Sun S, Ji J, Nie B, Maker G, Ren Y, Wang L. The chromosome-level genome of female ginseng (Angelica sinensis) provides insights into molecular mechanisms and evolution of coumarin biosynthesis. Plant J. 2022;112(5):1224-1237. doi: 10.1111/tpj.16007.
7. Cheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods. 2021;18(2):170-175. doi: 10.1038/s41592-020-01056-5.
8. Simão FA, Waterhouse RM, Ioannidis P, et al. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;(19).
9. Servant N, et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 2015;16(1):1-11.
10. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler Transform. Bioinformatics. 2009;25:1754-60.
11. Burton JN, et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat Biotechnol. 2013;31(12):1119-25.
12. Flynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, Smit AF. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci U S A. 2020;117(17):9451-9457.
13. Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2009:4.10.1-4.10.14.
14. Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 2008. doi: 10.1093/bioinformatics/btn013.
15. Korf I. Gene finding in novel genomes. BMC Bioinformatics. 2004;5:59.
16. Keilwagen J, Wenk M, Erickson JL, Schattat MH, Grau J, Hartung F. Using intron position conservation for homology-based gene prediction. Nucleic Acids Res. 2016;44:e89.
17. Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK Jr, Hannick LI, Maiti R, Ronning CM, Rusch DB, Town CD, et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 2003;31:5654-5666.
18. Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, White O, Buell CR, Wortman JR. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 2008;9:R7.
19. Xu Y J. Illumina sequencing reads of Angelica Biserrata. China National Center for Bioinformation (CNCB)-Genome Sequence Archive (GSA). 2025. https://ngdc.cncb.ac.cn/gsa/browse/CRA024320
20. Xu Y J. Hi-Fi sequencing reads of Angelica Biserrata. China National Center for Bioinformation (CNCB)-Genome Sequence Archive (GSA). 2025. https://ngdc.cncb.ac.cn/gsa/browse/CRA024335
21. Xu Y J. Hi-C reads of Angelica Biserrata. China National Center for Bioinformation (CNCB)-Genome Sequence Archive (GSA). 2025.
22. Xu Y J. RNA-seq of Angelica Biserrata. China National Center for Bioinformation (CNCB)-Genome Sequence Archive (GSA). 2025. https://ngdc.cncb.ac.cn/gsa/browse/CRA024296
23. Xu Y J. Genome annotation of Angelica Biserrata. Figshare. 2025. 10.6084/m9.figshare.28740092
24. Xu Y J. Draft Genome of Angelica Biserrata. China National Center for Bioinformation (CNCB)-Genome Warehouse (GWH). 2024. Accession number GWHFQYN00000000.1 https://ngdc.cncb.ac.cn/gwh/Assembly/reviewersPage/NYBbdmgyzOuHzkThMbjkEnifHkIOIJwyuCYcLbNdtKbdBIlAwoQYdynfzdbKqPdT

Editorial history at the journal
Editorial decision: Revision requested, 19 Jun, 2025
Reviews received at journal, 21 May, 2025
Reviewers agreed at journal, 21 May, 2025
Reviewers agreed at journal, 19 May, 2025
Reviews received at journal, 16 May, 2025
Reviewers agreed at journal, 06 May, 2025
Reviewers invited by journal, 30 Apr, 2025
Editor assigned by journal, 15 Apr, 2025
Submission checks completed at journal, 15 Apr, 2025
First submitted to journal, 11 Apr, 2025
Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-6430141","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Data Note","associatedPublications":[],"authors":[{"id":451191789,"identity":"f4abe41a-8821-4835-80b6-fa48c4c71c52","order_by":0,"name":"Yuan-jiang Xu","email":"","orcid":"","institution":"Chongqing Academy of Chinese Materia Medical","correspondingAuthor":false,"prefix":"","firstName":"Yuan-jiang","middleName":"","lastName":"Xu","suffix":""},{"id":451191790,"identity":"465a6105-2a81-4636-9bbd-88945297779e","order_by":1,"name":"Li Li","email":"","orcid":"","institution":"Chongqing Academy of Chinese Materia Medical","correspondingAuthor":false,"prefix":"","firstName":"Li","middleName":"","lastName":"Li","suffix":""},{"id":451191791,"identity":"e63a403f-c09c-49c0-98ba-c3cf1f2c2115","order_by":2,"name":"Xue Liu","email":"","orcid":"","institution":"Chongqing Academy of Chinese Materia Medical","correspondingAuthor":false,"prefix":"","firstName":"Xue","middleName":"","lastName":"Liu","suffix":""},{"id":451191792,"identity":"500d2adb-33e6-4d48-85f4-613c191e191b","order_by":3,"name":"Fang-yu Zhao","email":"","orcid":"","institution":"Xizang Agricultural and Animal Husbandry University","correspondingAuthor":false,"prefix":"","firstName":"Fang-yu","middleName":"","lastName":"Zhao","suffix":""},{"id":451191793,"identity":"1109d34e-19f4-465e-b160-1f705e8cd65a","order_by":4,"name":"Yi-Quan Zhou","email":"","orcid":"","institution":"Chongqing Academy of Chinese Materia 
Medical","correspondingAuthor":false,"prefix":"","firstName":"Yi-Quan","middleName":"","lastName":"Zhou","suffix":""},{"id":451191794,"identity":"3ce9368c-7917-4805-a93a-4f8481481cc2","order_by":5,"name":"Xian-you Qu","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAwElEQVRIiWNgGAWjYLCChIoaOTb25gMkaPlw5pgxH8+xBOJ1MM5sY06cJ5GjQJxyvhs5ZtI8bGzpbQw5DAw/KrYR1iJ5Iy1NmodHJreN4ewBxp4ztwlrMbiRfEyaR4Itt42xL4GZsY0oLYlt0jwGzOlszECSSC3JxyRnJDAnsLERq0XyzLNkiw8Hjhm28bAlHCTKL3zHcwxvJP6rkZef//jggx8VRGhhOMDAIoFgEwUOMDB/IE7lKBgFo2AUjFgAAOcvOvoxfR6AAAAAAElFTkSuQmCC","orcid":"","institution":"Chongqing Academy of Chinese Materia Medical","correspondingAuthor":true,"prefix":"","firstName":"Xian-you","middleName":"","lastName":"Qu","suffix":""}],"badges":[],"createdAt":"2025-04-11 16:53:17","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6430141/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6430141/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s12863-025-01371-w","type":"published","date":"2025-10-29T15:56:54+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":95039762,"identity":"b588b853-57d7-4ccf-9212-7466726e3a47","added_by":"auto","created_at":"2025-11-03 16:00:12","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":434735,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6430141/v1/cc9104c3-2c09-4ba3-ae50-16cf5dbf185f.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Draft genome of Angelica Biserrata, a traditional Chinese medicinal herb of the Angelica genus (Apiaceae)","fulltext":[{"header":"Objective ","content":"\u003cp\u003e\u003cem\u003eAngelica biserrata\u003c/em\u003e, with the original name \u003cem\u003eAngelica pubescens\u003c/em\u003e f. 
biserrata, belongs to the Apiaceae family. Its dry root is a traditional Chinese medicine [1]. Flora of China (FOC) is distributed mainly in Chongqing, Anhui, Hubei, Jiangxi, and Sichuan Provinces, etc.,\u0026nbsp;at altitudes of 1000\u0026ndash;1700 m. FOC is known as \u0026ldquo;Duhuo\u0026rdquo; or \u0026ldquo;Rouduhuo\u0026rdquo; in China, \u0026ldquo;Dokwhal\u0026rdquo; in South Korea, and \u0026ldquo;Dokkatsu\u0026rdquo; in Japan; \u003cem\u003eA. biserrata\u003c/em\u003e has different common names across East Asia [2].\u0026nbsp;The medicinal properties of\u0026nbsp;\u003cem\u003eA. biserrata\u003c/em\u003e were first recorded in\u0026nbsp;\u0026ldquo;Shen Nong Ben Cao Jing\u0026rdquo; (Eastern Han Dynasty). It is used mainly to treat rheumatism, headache and other diseases [3]. Coumarins are considered the active ingredients of\u0026nbsp;\u003cem\u003eA. biserrata\u003c/em\u003e and feature\u0026nbsp;a wide range of pharmacological activities, including anti-inflammatory, antitumor and antimicrobial properties, and\u0026nbsp;take part in\u0026nbsp;plant growth and development, such as seed germination, defense against plant pathogens and stress reactions [4-5].\u0026nbsp;However,\u0026nbsp;their\u0026nbsp;complex synthetic\u0026nbsp;pathways\u0026nbsp;are\u0026nbsp;not fully\u0026nbsp;known. The chromosome-level genome assembly of \u003cem\u003eAngelica sinensis\u003c/em\u003e (2.3 Gb) has been successfully deciphered, providing novel genomic insights into the biosynthetic machinery of simple coumarins\u0026nbsp;[6]. The coumarins in\u0026nbsp;\u003cem\u003eA. biserrata\u003c/em\u003e are more complex and diverse than those of \u003cem\u003eA\u003c/em\u003e\u003cem\u003e. sinensis\u003c/em\u003e.\u0026nbsp;High-quality reference genomes are\u0026nbsp;important for research on the resource protection of\u0026nbsp;\u003cem\u003eA. biserrata\u003c/em\u003e and the synthetic\u0026nbsp;pathways\u0026nbsp;of coumarin and other active ingredients. 
This study completed the first assembly and annotation of the\u0026nbsp;\u003cem\u003eA. biserrata\u003c/em\u003e genome. This work provides\u0026nbsp;critical\u0026nbsp;data for further analysis of the biosynthetic\u0026nbsp;pathways\u0026nbsp;of\u0026nbsp;the active\u0026nbsp;ingredients of\u0026nbsp;\u003cem\u003eA. biserrata\u003c/em\u003e, the development of molecular marker-assisted breeding systems, and the formulation of scientific conservation strategies.\u0026nbsp;Moreover, sequencing the genome of\u003cem\u003e\u0026nbsp;A. biserrata\u003c/em\u003e will be helpful for analyzing the evolutionary history of Umbelliferae medicinal plants.\u003c/p\u003e"},{"header":"Data ","content":"\u003cp\u003eSeveral\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003eleaves were collected from a plant of\u0026nbsp;\u003cem\u003eA\u003c/em\u003e\u003cem\u003e.\u0026nbsp;\u003c/em\u003e\u003cem\u003ebiserrata\u003c/em\u003e cultivated in\u0026nbsp;Lanying Township, Wuxi County, Chongqing City,\u0026nbsp;China (N 29.531106\u0026deg;, E 106.598273\u0026deg;), snap-frozen in liquid nitrogen, and stored in a -80\u0026nbsp;\u0026deg;C freezer in the laboratory. Prior to sequencing, high-quality\u0026nbsp;DNA\u0026nbsp;and total\u0026nbsp;RNA\u0026nbsp;were extracted from the samples.\u0026nbsp;First, a\u0026nbsp;DNBSEQ-T7\u0026nbsp;genomic\u0026nbsp;survey\u0026nbsp;and\u0026nbsp;sequencing were performed on the Illumina sequencing platform, and 161.67 Gb of data (data file 1) were\u0026nbsp;obtained after quality control.\u0026nbsp;Subsequently, approximately 125.34 Gb of data were generated through Hi-Fi sequencing via the PacBio Revio sequencer (data file 2). A Hi-C library was constructed via methods such as cell cross-linking and Hind III restriction enzyme digestion and then sequenced via the DNBSEQ-T7 sequencer in 150-bp paired-end mode. Hi-C sequencing yielded a total of approximately 300.87 Gb of data (data file 3). 
Finally, RNA-seq sequencing was performed via the Illumina X-plus sequencer, generating approximately 6 Gb of data (data file 4), which was used to assist in annotation.\u003c/p\u003e\n\u003cp\u003eThe findings suggested that the genome size of \u003cem\u003eA. biserrata\u003c/em\u003e is approximately 3.03 Gb, with a heterozygosity rate of approximately 0.27% and a repetitive sequence content of approximately 83.59%. It is a complex and large genome. HiFi sequencing yielded 8.83 million high-fidelity reads with an average length of 14.2 kb. Hifiasm (V 0.19) [7] was used to splice the genome, and the total length of the contig sequence was 4.55 Gb (N50 = 35.4 Mb). After nucleoside alignment and elimination of the Hi-C signal redundancy, the total length of the genome contig sequence was 3.89 Gb, and the contig N50 was 34.42 Mb. In accordance with OrthoDB 10 (http://cegg.unige.ch/orthodb) and BUSCO (V 5.2.1) [8], the integrity of the \u003cem\u003eA. biserrata\u003c/em\u003e genome was determined\u0026nbsp;to be\u0026nbsp;96.59%.\u0026nbsp;HiC-Pro\u0026nbsp;[9] (v2.10.0)\u0026nbsp;was adopted to\u0026nbsp;filter and evaluate the Hi-C data to obtain high-quality clean data.\u0026nbsp;Through\u0026nbsp;BWA\u0026nbsp;[10] (version: 0.7.17-r1188;\u0026nbsp;comparison mode: aln;\u0026nbsp;other parameters are default),\u0026nbsp;the double-end\u0026nbsp;sequence data were aligned with the sequence of the assembled genome. The genome sequence was grouped, sequenced and oriented\u0026nbsp;with\u0026nbsp;LACHESIS\u0026nbsp;[11] software, and then the final version of the genome was obtained after Hi-C error correction, auxiliary chromosome mounting and redundancy removal. 
The assembly result was 3894.84 Mb, the\u0026nbsp;contig N50\u0026nbsp;was 34.42 Mb,\u0026nbsp;and the scaffold N50\u0026nbsp;was 325.77 Mb (data file 5).\u0026nbsp;Among them, the 3811.62 Mb\u0026nbsp;genome sequence was located on 11 chromosomes, accounting for 97.86% of the genome, indicating high integrity of the assembly. Then,\u0026nbsp;repeatModeler2\u0026nbsp;[12] (v2.0.1),\u0026nbsp;RepeatMasker\u0026nbsp;[13] (V4.1.2) and other software were used for ab initio prediction of repeat sequences;\u0026nbsp;Augustus [14] (v3.1.0) and SNAP [15] (2006-07-28) were utilized\u0026nbsp;for ab initio prediction of structural genes;\u0026nbsp;GeMoMa\u0026nbsp;[16] (V1.7) was used to predict homologous species; and\u0026nbsp;PASA\u0026nbsp;[17] (v2.4.1) was used for\u0026nbsp;transcriptome-assisted gene prediction. Finally,\u0026nbsp;EVM\u0026nbsp;[18] (v1.1.1)\u0026nbsp;was adopted\u0026nbsp;to\u0026nbsp;combine\u0026nbsp;the prediction results\u0026nbsp;from\u0026nbsp;the above three methods,\u0026nbsp;and\u0026nbsp;98.66% of the genes\u0026nbsp;(a total of 44603 genes)\u0026nbsp;were annotated\u0026nbsp;(data file 6), indicating\u0026nbsp;high\u0026nbsp;annotation quality.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 1\u003c/strong\u003e: Overview of data files\u003c/p\u003e\n\u003ctable border=\"0\" cellspacing=\"0\" cellpadding=\"0\" width=\"606\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 65px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eLabel\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 133px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eName\u0026nbsp;of\u0026nbsp;data\u0026nbsp;file\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 110px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eFile\u0026nbsp;types\u0026nbsp;(file extension)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 
298px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eData repository and identifier (DOI or accession numbe\u003c/strong\u003e\u003cstrong\u003er)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 65px;\"\u003e\n \u003cp\u003eData file\u0026nbsp;1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 133px;\"\u003e\n \u003cp\u003eRaw short whole genome\u0026nbsp;Illumina\u0026nbsp;sequencing reads\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 110px;\"\u003e\n \u003cp\u003eFasta file\u0026nbsp;(.fastq)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 298px;\"\u003e\n \u003cp\u003eCNCB Big sub-Genome Sequence Archive (GSA) Accession number CRA024320. https://ngdc.cncb.ac.cn/gsa/browse/CRA024320 [19]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 65px;\"\u003e\n \u003cp\u003eData file\u0026nbsp;2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 133px;\"\u003e\n \u003cp\u003eRaw\u0026nbsp;long whole genome\u0026nbsp;Hi-Fi\u0026nbsp;sequencing reads\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 110px;\"\u003e\n \u003cp\u003eFasta file\u0026nbsp;(.fastq)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 298px;\"\u003e\n \u003cp\u003eCNCB Big sub-Genome Sequence Archive (GSA) Accession number CRA024335. 
(Data submission, in verification) https://ngdc.cncb.ac.cn/gsa/browse/CRA024335 [20]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 65px;\"\u003e\n \u003cp\u003eData file\u0026nbsp;3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 133px;\"\u003e\n \u003cp\u003eRaw\u0026nbsp;Hi-C\u0026nbsp;reads\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 110px;\"\u003e\n \u003cp\u003eFasta file\u0026nbsp;(.fastq)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 298px;\"\u003e\n \u003cp\u003eCNCB\u0026nbsp;Big sub-Genome Sequence Archive\u0026nbsp;(GSA) Accession\u0026nbsp;number\u0026nbsp;(Data submission to CNCB subCRA039514, in verification). [21]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 65px;\"\u003e\n \u003cp\u003eData file\u0026nbsp;4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 133px;\"\u003e\n \u003cp\u003eRaw\u0026nbsp;RNA-seq\u0026nbsp;reads\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 110px;\"\u003e\n \u003cp\u003eFasta file\u0026nbsp;(.fastq)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 298px;\"\u003e\n \u003cp\u003eCNCB Big sub-Genome Sequence Archive (GSA) Accession number CRA024296. 
https://ngdc.cncb.ac.cn/gsa/browse/CRA024296 [22]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 65px;\"\u003e\n \u003cp\u003eData file\u0026nbsp;5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 133px;\"\u003e\n \u003cp\u003eGenome annotation\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 110px;\"\u003e\n \u003cp\u003eGff3 file (.gff3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 298px;\"\u003e\n \u003cp\u003eFigshare,\u0026nbsp;10.6084/m9.figshare.28740092\u0026nbsp;[23]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 65px;\"\u003e\n \u003cp\u003eData file\u0026nbsp;6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 133px;\"\u003e\n \u003cp\u003eAssembled genome\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 110px;\"\u003e\n \u003cp\u003eFasta file (.fa)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 298px;\"\u003e\n \u003cp\u003eCNCB Big sub-Genome Warehouse (GWH) Accession number GWHFQYN00000000.1 https://ngdc.cncb.ac.cn/gwh/Assembly/reviewersPage/NYBbdmgyzOuHzkThMbjkEnifHkIOIJwyuCYcLbNdtKbdBIlAwoQYdynfzdbKqPdT [24]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e"},{"header":"Limitations","content":"\u003cp\u003eAlthough we have successfully assembled a chromosome-level genome of \u003cem\u003eA. biserrata\u003c/em\u003e (Duhuo), the biosynthetic pathways of its active compounds (e.g., coumarins) and the genetic mechanisms governing its medicinal properties require further investigation. 
Future studies will employ integrated multiomics approaches involving comparative genomics, metabolomics, transcriptomics, and in vitro enzymatic functional validation to systematically elucidate the complex coumarin biosynthesis network and its regulation in this medicinal species.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003e\u003cstrong\u003eBUSCO\u003c/strong\u003e\u003cstrong\u003e:\u003c/strong\u003e Benchmarking Universal Single-Copy Orthologs\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCNCB: China National Center for Bioinformation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003ePacBio: Pacific Biosciences\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eHi-C:\u003c/strong\u003e High-throughput chromosome conformation capture\u0026nbsp;\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003cbr\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003cbr\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe raw sequence data reported in this paper have been deposited in the Genome Sequence Archive (GSA) of the National Genomics Data Center. The accession numbers for the Illumina, Hi-Fi, Hi-C, and RNA-seq data are CRA024320, CRA024335, CRA024295 and CRA024296, respectively. The assembled genome reported in this paper has been deposited in the Genome Warehouse (GWH) at the National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, under accession number GWHFQYN00000000.1. The results of the genome annotations have been uploaded to Figshare. 
Please see Table 1 for details and links to the data.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003cbr\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare\u0026nbsp;that they have\u0026nbsp;no competing\u0026nbsp;interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003cbr\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe study was funded by the\u0026nbsp;Analysis of Synthetic Pathway Study of Key Enzyme Genes of Chongqing Post-Bo Foundation (grant number cstc2021jcyj-bshX0015).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors\u0026rsquo; contributions\u003cbr\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eY.\u0026nbsp;J\u0026nbsp;X and L.\u0026nbsp;L. are jointly responsible for sample organization and sequencing data analysis. Y. J X ultimately wrote the manuscript, and Y.\u0026nbsp;Q\u0026nbsp;Z\u0026nbsp;and X.\u0026nbsp;Y Q revised the manuscript. L.\u0026nbsp;L. and X.\u0026nbsp;L. assisted in genome assembly and annotation. Y.Q Z and X.\u0026nbsp;Y Q conceived and designed this project. All the authors have read and approved the final version of this manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors appreciate the use of biomarkers for providing high-quality sequencing.\u003c/p\u003e\u003cp\u003e\u003cstrong\u003ePublisher\u0026rsquo;s note\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSpringer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eChinese Pharmacopoeia Commission. 2020. Pharmacopoeia of the People\u0026apos;s Republic of China. Vol. Beijing: China Medical Science Publisher. 274.\u003c/li\u003e\n\u003cli\u003eMa J, Huang J, Hua S, Zhang Y, Zhang Y, Li T, Dong L, Gao Q, Fu X. 2019. 
The ethnopharmacology, phytochemistry and pharmacology of Angelica biserrata - A review. J Ethnopharmacol. 231:152-169. doi: 10.1016/j.jep.2018.10.040.\u003c/li\u003e\n\u003cli\u003eLu Y, Wu H, Yu X, Zhang Xiao, Luo Hanyan, Tang Liying, Wang Zhuju. Traditional Chinese Medicine of Angelicae Pubescentis Radix: A Review of Phytochemistry, Pharmacology and Pharmacokinetics[J]. Frontiers in Pharmacology, 2020, 11. DOI: 10.3389/fphar.2020.00335.\u003c/li\u003e\n\u003cli\u003eK.N. Venugopala, V. Rashmi, B. Odhav, Review on natural coumarin lead compounds for their pharmacological activity, Biomed Res Int. (2013) 963248. https://doi.org/10.1155/2013/963248.\u003c/li\u003e\n\u003cli\u003eZhao Y, He Y, Han L, Zhang L, Xia Y, Yin F, Wang X, Zhao D, Xu S, Qiao F, Xiao Y, Kong L. Two types of coumarins-specific enzymes complete the last missing steps in pyran- and furanocoumarins biosynthesis. Acta Pharm Sin B. 2024 Feb;14(2):869-880. doi: 10.1016/j.apsb.2023.10.016.\u003c/li\u003e\n\u003cli\u003eHan X, Li C, Sun S, Ji J, Nie B, Maker G, Ren Y, Wang L. The chromosome-level genome of female ginseng (Angelica sinensis) provides insights into molecular mechanisms and evolution of coumarin biosynthesis. Plant J. 2022 Dec;112(5):1224-1237. doi: 10.1111/tpj.16007.\u003c/li\u003e\n\u003cli\u003eCheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods. 2021 Feb;18(2):170-175. doi: 10.1038/s41592-020-01056-5.\u003c/li\u003e\n\u003cli\u003eSim\u0026atilde;o Felipe A, Waterhouse R M, Panagiotis I, et al. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs[J]. Bioinformatics. (2015):19.\u003c/li\u003e\n\u003cli\u003eServant, Nicolas, et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biology, 2015.16(1):1-11.\u003c/li\u003e\n\u003cli\u003eLi H, Durbin R. 
Fast and accurate short read alignment with Burrows‒Wheeler Transform.Bioinformatics, 2009 25:1754-60.\u003c/li\u003e\n\u003cli\u003eBurton, J.N., et al., Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat Biotechnol, 2013. 31(12): p. 1119-25.\u003c/li\u003e\n\u003cli\u003eFlynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, Smit AF. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci U S A. 2020 Apr 28;117(17):9451-9457.\u003c/li\u003e\n\u003cli\u003eTarailo‐Graovac M, Chen N: Using RepeatMasker to identify repetitive elements in genomic sequences.Current Protocols in Bioinformatics 2009:4.10. 11-14.10. 14.\u003c/li\u003e\n\u003cli\u003eMario Stanke, Mark Diekhans, Robert Baertsch, David Haussler (2008)Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics, doi: 10.1093/bioinformatics/btn013\u003c/li\u003e\n\u003cli\u003eKorf I. Gene finding in novel genomes. BMC bioinformatics 2004, 5:59.\u003c/li\u003e\n\u003cli\u003eKeilwagen J, Wenk M, Erickson J L, Schattat, M. H., Jan, G., Frank, H: Using intron position conservation for homology-based gene prediction. Nucleic acids research 2016, 44: e89-e89.\u003c/li\u003e\n\u003cli\u003eHaas, B.J., Delcher, A.L., Mount, S.M., Wortman, J.R., Smith Jr, R.K., Jr., Hannick, L.I., Maiti, R., Ronning, C.M., Rusch, D.B., Town, C.D. et al. (2003) Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res, 31, 5654-5666.\u003c/li\u003e\n\u003cli\u003eHaas B J, Salzberg S L, Zhu W, Pertea M, Allen J E, Orvis J, White O, Buell C R, Wortman JR: Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol 2008, 9:R7. \u003c/li\u003e\n\u003cli\u003eXu Y J. Illumina sequencing reads of \u003cem\u003eAngelica Biserrata\u003c/em\u003e. 
China National Center for Bioinformation (CNCB)-Genome Sequence Archive (GSA). 2025. https://ngdc.cncb.ac.cn/gsa/browse/CRA024320 \u003c/li\u003e\n\u003cli\u003eXu Y J. Hi-Fi sequencing reads of \u003cem\u003eAngelica Biserrata\u003c/em\u003e. China National Center for Bioinformation (CNCB)-Genome Sequence Archive (GSA). 2025. https://ngdc.cncb.ac.cn/gsa/browse/CRA024335 \u003c/li\u003e\n\u003cli\u003eXu Y J. Hi-C reads of \u003cem\u003eAngelica Biserrata\u003c/em\u003e. China National Center for Bioinformation (CNCB)-Genome Sequence Archive (GSA). 2025. \u003c/li\u003e\n\u003cli\u003eXu Y J. RNA-seq of \u003cem\u003eAngelica Biserrata\u003c/em\u003e. China National Center for Bioinformation (CNCB)-Genome Sequence Archive (GSA). 2025. https://ngdc.cncb.ac.cn/gsa/browse/CRA024296 \u003c/li\u003e\n\u003cli\u003eXu Y J. Genome annotation of \u003cem\u003eAngelica Biserrata\u003c/em\u003e. Figshare. 2025. Figshare, 10.6084/m9.figshare.28740092\u003c/li\u003e\n\u003cli\u003eXu Y J. Draft Genome of Angelica Biserrata. China National Center for Bioinformation (CNCB)- Genome Warehouse (GWH). 2024. 
Accession number GWHFQYN00000000.1 https://ngdc.cncb.ac.cn/gwh/Assembly/reviewersPage/NYBbdmgyzOuHzkThMbjkEnifHkIOIJwyuCYcLbNdtKbdBIlAwoQYdynfzdbKqPdT\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"bmc-genomic-data","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gtic","sideBox":"Learn more about [BMC Genomic Data](http://bmcgenet.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/gtic/default.aspx","title":"BMC Genomic Data","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Angelica Biserrata, Hi-Fi Genome, Hi-C, Genome annotation","lastPublishedDoi":"10.21203/rs.3.rs-6430141/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6430141/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eObjectives \u003c/strong\u003e\u003cem\u003eAngelica biserrata \u003c/em\u003e(commonly known as “Duhuo”), a traditional Chinese medicinal herb of the genus Angelica within the Apiaceae family, is clinically valued for its therapeutic effects in dispelling wind-dampness and alleviating arthralgia. 
Its pharmacological properties are primarily attributed to coumarins. To elucidate the molecular mechanisms underlying coumarin biosynthesis and facilitate the breeding of high-coumarin cultivars, we present the first draft genome assembly and annotation of \u003cem\u003eA. biserrata\u003c/em\u003e. This first genome assembly of \u003cem\u003eA. biserrata\u003c/em\u003e will provide novel insights into coumarin biosynthesis and support evolutionary studies.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData description \u003c/strong\u003eThe genome of \u003cem\u003eA. biserrata\u003c/em\u003e was sequenced using PacBio HiFi technology, generating 8.83 million high-fidelity reads with an average length of 14.2 kb (125.34 Gb, sequencing coverage 41 ×). The reads were assembled to give a draft genome of 4.52 Gb with an N50 contig length of 35.72 Mb. Chromosome-scale scaffolding was then performed using 300.87 Gb Hi-C data, resulting in a final genome assembly of 3.89 Gb with improved continuity (contig N50 = 34.42 Mb, scaffold N50 = 325.77 Mb). Genome completeness, assessed with BUSCO (Benchmarking Universal Single-Copy Orthologs) against the embryophyta dataset of OrthoDB 10, was 96.59%. 
At the same time, 3811.62 Mb long sequences were attached to 11 chromosomes, accounting for 97.86%.\u003c/p\u003e","manuscriptTitle":"Draft genome of Angelica Biserrata, a traditional Chinese medicinal herb of the Angelica genus (Apiaceae)","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-05-06 09:40:39","doi":"10.21203/rs.3.rs-6430141/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-06-19T10:30:47+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-05-21T16:22:54+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"97002441752671116083080265689927846655","date":"2025-05-21T15:52:07+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"279700654761097980687525339338546218464","date":"2025-05-19T11:30:35+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-05-16T10:24:14+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"36398391589869338990497233400431490784","date":"2025-05-06T13:33:16+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-04-30T15:33:06+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-04-15T11:13:00+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-04-15T11:10:29+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Genomic Data","date":"2025-04-11T16:45:10+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"bmc-genomic-data","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gtic","sideBox":"Learn more about [BMC Genomic Data](http://bmcgenet.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/gtic/default.aspx","title":"BMC Genomic 
Data","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"db745271-10a3-4bbc-b931-a6fdbc902cdb","owner":[],"postedDate":"May 6th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-11-03T15:59:07+00:00","versionOfRecord":{"articleIdentity":"rs-6430141","link":"https://doi.org/10.1186/s12863-025-01371-w","journal":{"identity":"bmc-genomic-data","isVorOnly":false,"title":"BMC Genomic Data"},"publishedOn":"2025-10-29 15:56:54","publishedOnDateReadable":"October 29th, 2025"},"versionCreatedAt":"2025-05-06 09:40:39","video":"","vorDoi":"10.1186/s12863-025-01371-w","vorDoiUrl":"https://doi.org/10.1186/s12863-025-01371-w","workflowStages":[]},"version":"v1","identity":"rs-6430141","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6430141","identity":"rs-6430141","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
