Organ-Specific Long-Read Transcriptome Assembly of Stenochlaena palustris and Annotation of Anthocyanin Biosynthesis Genes

doi:10.21203/rs.3.rs-7105205/v1

2025 · doi:10.21203/rs.3.rs-7105205/v1

Research Article (Research Square)

Maria Stefanie Dwiyanti, Della Rahmawati, Maria Dewi Puspitasari Tirtaningtyas Gunawan-Puteri, and 4 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-7105205/v1

This work is licensed under a CC BY 4.0 License. Status: Posted (Version 1).

Abstract

Background

Stenochlaena palustris is valued as a vegetable and medicinal fern native to Southeast Asia; however, it remains largely underrepresented in genomic studies. People in Kalimantan (Indonesia) collect young leaves and fronds from wild populations for use as vegetables or as medicines to treat conditions such as ulcers, stomachaches, fever, diarrhoea, and skin infections. The young leaves and fronds of S.
palustris contain flavonoids, polyphenols, and anthocyanins. Here, we present a high-quality organ-specific transcriptome assembly of S. palustris based on long-read RNA sequencing of young leaves.

Results

The de novo assembly yielded 47,759 transcripts, with an N50 of 1,524 bp and a BUSCO completeness of 66.6%, consistent with organ-specific transcriptomes. Functional annotation identified key structural and regulatory genes involved in anthocyanin biosynthesis, including genes for chalcone synthase (CHS) and dihydroflavonol 4-reductase (DFR). We further analysed the expression of selected CHS and DFR genes via qRT-PCR in three phenotypically contrasting young leaf samples. No strong correlation was observed between gene expression levels and anthocyanin pigmentation, which suggests that pigmentation is under complex regulation that may involve post-transcriptional control or developmental timing.

Conclusions

This study provides the first long-read transcriptomic resource for S. palustris and valuable data for future investigations of secondary metabolism and gene regulation in ferns. Our findings complement broader fern transcriptome studies by offering tissue-specific resolution and a focused view of pigment biosynthesis.

Keywords: Stenochlaena palustris, transcriptome, Nanopore sequencing, long-read sequencing, chalcone synthase, dihydroflavonol 4-reductase

BACKGROUND

Stenochlaena palustris, a climbing fern of the Blechnaceae family, is naturally distributed in South Asia, Southeast Asia, Australia, and Polynesia [1, 2]. It is known by several local names, such as kelakai/kalakai in Kalimantan (Indonesia); lemidin, midin, or paku midin in Malaysia; and diliman or hagnaya in the Philippines [3, 4, 5, 6]. In these regions, young leaves and fronds of S. palustris are collected from wild habitats and consumed as a vegetable [3, 4, 5, 6]. In addition to its culinary uses, S.
palustris has traditionally been used to treat conditions such as ulcers, stomachaches, fever, diarrhoea, and skin infections [4, 5, 6]. It is also used to treat anaemia, promote breast milk production, aid recovery after childbirth, and prevent diabetes, and it is reported to have antimicrobial activity [4, 5, 6]. Additionally, S. palustris is a hardy plant capable of growing in challenging environments, such as acidic peatlands. In Kalimantan, it is commonly found in various habitats, including acidic swamp areas, riversides, natural and secondary forests, palm oil plantations, and residential zones [1, 6, 7]. It is also known as a pioneer plant, one of the first to grow after land disturbances such as peatland forest fires, thus creating conditions conducive to the establishment of other plant species [8]. These properties make S. palustris a valuable source of vegetables and medicines, particularly for the people of Kalimantan.

Studies have shown that fronds and young leaves of S. palustris possess antioxidant properties. Water extracts of fronds contain anthocyanins, polyphenols, and hydroxycinnamic acids, which contribute to their antioxidant activity [9]. Young leaves of S. palustris are either red or green, indicating differences in anthocyanin content. Young red leaves turn green when they mature [6]. Two independent studies identified α-glucosidase inhibitory activity in S. palustris [9, 10]. This inhibitory activity may be attributed to hydroxycinnamic acids or astragalin [9, 10]. Astragalin is kaempferol 3-O-β-glucopyranoside, a glucosylated form of kaempferol. It is found in various plant species, including persimmon leaves, lotus leaves, green tea seeds, and roots of Astragalus membranaceus [11]. Understanding the regulation of flavonoid and anthocyanin biosynthesis is crucial for further phytochemical-related research on S. palustris.

Flavonoid biosynthesis begins with the conversion of phenylalanine to the flavonoid precursor p-coumaroyl-CoA through a series of enzymatic steps [12]. The key enzymes involved in this process are phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H), and 4-coumarate:CoA ligase (4CL) [12]. p-Coumaroyl-CoA is converted to naringenin chalcone by chalcone synthase (CHS). This intermediate is further converted into naringenin by chalcone isomerase (CHI). The addition of a hydroxyl group at the 3-position of naringenin, catalysed by flavanone 3-hydroxylase (F3H), results in the formation of dihydrokaempferol. Dihydrokaempferol is subsequently converted into kaempferol by flavonol synthase (FLS). Kaempferol is glycosylated by UDP-dependent glycosyltransferases (UGT) to form astragalin. Additionally, dihydrokaempferol serves as a precursor for anthocyanins, with several enzymes involved in the conversion, including dihydroflavonol 4-reductase (DFR), flavonoid 3ʹ-hydroxylase (F3ʹH), flavonoid 3ʹ,5ʹ-hydroxylase (F3ʹ5ʹH), anthocyanidin synthase (ANS), flavonoid 3-O-glucosyltransferase (F3GT), and flavonoid 5-O-glucosyltransferase (F5GT).

The growing interest in flavonoid and anthocyanin biosynthesis in ferns has led to the identification of related genes. The first step in flavonoid biosynthesis, which is the conversion of p-coumaroyl-CoA to naringenin chalcone by CHS, is often rate-limiting. CHS genes were isolated from the ferns Dryopteris fragrans (GenBank accession numbers KF530802.1, KP420005.1, and KP420004.1), D. erythrosora (GenBank accession number KJ135628.1), and Ceratopteris thalictroides (GenBank: JX027616.1). As CHS can exist as a single copy or form a multigene family, the actual number of CHS-encoding genes in ferns may be higher [13]. DFR is the key enzyme involved in anthocyanin biosynthesis. DFR genes have been identified and characterised in the water fern Azolla filiculoides and in D. erythrosora. Chen et al.
[14] isolated two DFR genes from D. erythrosora, namely DeDFR1 (GenBank: MK920230) and DeDFR2 (GenBank: MK920231). A GenBank search showed that Chen et al. [14] also deposited three other DFR genes: MK920232, MK920233, and MK920234. In this study, these genes were designated as DFR3, DFR4, and DFR5. Two DFR gene families were identified in A. filiculoides: DFR1, consisting of Azfi_s0035.g025620 and Azfi_s0245.g059984, and DFR2, consisting of Azfi_s0008.g011655 and Azfi_s0008.g011657 [15, 16].

Limited genomic information on S. palustris hinders the characterisation of flavonoid biosynthesis regulation. Currently, only three transcriptomic datasets are publicly available in the National Center for Biotechnology Information Sequence Read Archive (NCBI SRA) database (https://www.ncbi.nlm.nih.gov/sra; last accessed March 20th, 2025). The three datasets were generated from samples from China and Singapore [17, 18, 19]. Shen et al. [18] generated transcriptome data for S. palustris sporophyll and trophophyll tissues. Sequencing was performed using the Illumina HiSeq 2500 system. De novo assembly produced 58,416 contigs with an average length of 945.83 bp and a GC content of 48.15% [18]. Ali et al. [19] determined the transcriptome profile of S. palustris from various organs, such as sterile and fertile leaves, petioles, rhizomes, and roots. Sequencing was performed using the Illumina NovaSeq 6000 system. The de novo assembly produced 54,843 contigs with an average length of 832.1 bp and a GC content of 45.25% [19]. The de novo assembly of the transcriptome data produced by Ali et al. [19] is publicly available (https://conekt.plant.tools/, last accessed March 21st, 2025).

In this study, we used the Nanopore long-read sequencing platform to obtain transcriptome data from leaf samples collected from Palangka Raya, Indonesia. The data obtained in the present study can support genomic research on S.
palustris by providing additional information that complements previously published data. In the present study, we identified genes potentially involved in flavonoid and anthocyanin biosynthesis, particularly those encoding CHS and DFR.

METHODS

Sample collection, RNA extraction, and sequencing

Young green nonfibrous sterile leaves from S. palustris naturally grown in the University of Palangka Raya (UPR), Indonesia (2°12′57.2″S, 113°54′04.3″E) were collected and immediately stored in RNAlater solution (ThermoFisher Scientific, Japan). Plant material from a location in UPR (2°12′48.4″S, 113°54′02.8″E) had been previously identified at the Herbarium Bogoriense, Research Center for Biology, Cibinong, Indonesia (No. 252/IPH.1.01/If.07/II/2019). The samples were stored at 4°C until RNA extraction.

RNA was extracted from leaves using 2% cetyltrimethylammonium bromide (CTAB) and 4% polyvinylpyrrolidone (PVP) extraction buffer, following the protocol of Kiss et al. [20], with several modifications. Leaf tissue (5 cm) was ground to a fine powder in liquid nitrogen using a mortar and a pestle. A preheated extraction buffer (6 mL) was added to the powder, and the mixture was homogenised to remove clumps. The mixture was transferred to six 2 mL tubes. The tubes were then incubated at 65°C for 10 min. Seven hundred microlitres of chloroform:isoamyl alcohol (24:1) was added to each tube. The tubes were vortexed and centrifuged at 15,000 rpm for 10 min at 4°C. The upper phase was transferred to a new 1.5 mL tube, and an equal volume of chloroform:isoamyl alcohol (24:1) was added. The addition of chloroform:isoamyl alcohol was repeated three times. The upper phase was then mixed with 500 µL 8 M LiCl. The mixture was incubated at 4°C overnight. The following day, the solution was centrifuged at 15,000 rpm for 45 min at 4°C. The RNA pellet was washed with 50 µL ice-cold ethanol (80% v/v) and then centrifuged at 15,000 rpm for 5 min at 4°C.
The supernatant was carefully removed using a pipette, and the tubes were centrifuged and then dried at 37°C to remove any residual ethanol. The RNA pellets were resuspended in 270 µL RNase-free water. Two microlitres of DNase I (Nippongene, Japan), 27 µL DNase I buffer, and 1 µL recombinant RNase inhibitor (Takara Shuzo, Kyoto, Japan) were added to the RNA solution, which was then incubated at 37°C for 30 min. DNase was removed using phenol:chloroform:isoamyl alcohol (25:24:1). RNA was precipitated in 99.5% ethanol overnight at -30°C and then centrifuged at 15,000 rpm for 45 min at 4°C. The RNA pellet was washed using 100 µL cold 80% ethanol, followed by another centrifugation at 15,000 rpm for 5 min at 4°C. Residual ethanol was removed using a pipette, and the pellet was air-dried at room temperature. The pellet was eluted in 50 µL RNase/DNase-free water. RNA quantity was determined using a NanoDrop spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA) and Qubit (Thermo Fisher Scientific, Japan). RNA quality was assessed via 1% gel electrophoresis and a Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA).

Transcriptome sequencing was performed on the PromethION platform using a cDNA-PCR Barcoding Kit 24 V14 (SQK-PCR114.24) at GeneBay Inc., Japan. Base calling was performed using the Dorado base-call server (version 7.4). A high-accuracy model was used to generate sequence reads after filtering. To ensure high-quality data in the downstream analysis, low-quality and short reads were removed, and barcode sequences were trimmed using Porechop 0.2.4 [21], with the remove-middle option applied to eliminate barcodes embedded within the reads.

De novo assembly, redundancy reduction, and assembly quality assessment

De novo sequence assembly of reads that passed filtering was performed using RNA-Bloom2 version 2.0.1, with the default parameters [22].
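
As a rough illustration of the removal of low-quality and short reads described above, the following Python sketch filters reads by length and mean quality. It is not the filtering procedure used in this study: the thresholds min_length and min_quality are hypothetical placeholders (the cut-offs applied are not stated in the text), and the function names are introduced only for this example.

```python
import math


def mean_phred(quality_string, offset=33):
    """Mean read quality, computed from the average per-base error probability."""
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(records, min_length=200, min_quality=9.0):
    """Keep (name, sequence, quality) records that pass both thresholds.

    min_length and min_quality are hypothetical values, not taken from the study.
    """
    return [
        (name, seq, qual)
        for name, seq, qual in records
        if len(seq) >= min_length and mean_phred(qual) >= min_quality
    ]


# Toy reads: one good read, one too short, one with low base qualities.
reads = [
    ("read_ok", "A" * 300, "I" * 300),
    ("read_short", "A" * 50, "I" * 50),
    ("read_lowq", "A" * 300, "#" * 300),
]
kept = filter_reads(reads)
```

Averaging error probabilities rather than Phred scores avoids overestimating the quality of reads that contain a few very poor bases.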
To reduce redundancy, the contigs were clustered based on sequence similarity using CD-HIT-EST in the CD-HIT package version 4.8.1 [23], with the following parameters: similarity threshold (-c 0.90) and word size (-n 8). Contigs were discarded if (1) read coverage was < 1, based on reads mapped to each contig with Minimap2 [24]; (2) the sequence was predicted to originate from a non-fern organism by the NCBI Foreign Contamination Screen (NCBI FCS) provided in Galaxy (https://usegalaxy.org, last accessed March 27th, 2025); (3) barcode adapter sequences remained, as identified by manual screening before submission to the NCBI TSA database; or (4) the GC content of the contig was more than 60% or less than 40%. The completeness of the assembly was evaluated using Benchmarking Universal Single-Copy Orthologs v5 (BUSCO v5) [25, 26]. The online BUSCO analysis used the Embryophyta ortholog set from OrthoDB v10 (https://gvolante.riken.jp [27]). The final dataset was examined for the contig length distribution, including the minimum, maximum, and average values, and the %GC content was assessed using QUAST [28].

Identification of open reading frame

Open reading frames (ORFs) for each contig were predicted using ORFipy version 0.0.4 [29]. ORFipy was run with the start codon set to ATG, so that ORFs beginning at the common start codon were searched, and with translation table 1 (the standard genetic code, which is applicable to most organisms). The --longest option was also added to output only the longest ORF for each contig. The longest ORFs predicted for each contig were assembled into a dataset to predict gene function.

Table 1. Primers used for quantitative real-time PCR.
Gene    Primer name   Sequence (5′-3′)
CHS     CHS1-F        CAGAAGGTGTTTGGTGAAGATGCTC
CHS     CHS1-R        TCGACATGTTCCCGAATTCTGAGAG
CHS     CHS2-F        GGCTTACATTCCACCTCATGAAG
CHS     CHS2-R        GACATGTTTCCATAGTCCGACAG
DFR     DFR1-F        GAGAAAGCGGCGGTGGAGTT
DFR     DFR1-R        GCTATTGGGAATCTTGGAAAG
DFR     DFR2-F        TCTCCCTTGTCACAGGTGAT
DFR     DFR2-R        GCGGACCCAATATAGCGACC
DFR     DFR3-F        CTGTGGAGTACTTAAGGCACAAGA
DFR     DFR3-R        TGGAATAGAAGGTGTGAGAAAGGG
Actin   Actin-F       GAGACCACTTACAACTCCATCATG
Actin   Actin-R       TCCAGACACTGTATTTTCTCAGGAGG

Gene functional annotation using eggNOG-mapper v2

Functional annotation was performed using the eggNOG-mapper v2 web version (http://eggnog-mapper.embl.de [30, 31]) by uploading the longest contigs obtained from the CD-HIT-EST analysis. The input consisted of nucleotide sequences, and the following parameters were used: data type, genomic; gene prediction method, BLASTX-like; allow frameshifts (Diamond --frameshift option); database, eggNOG 5; minimum hit e-value, 0.001; minimum hit bit-score, 60; percentage identity, 40%; minimum query coverage, 20%; and minimum subject coverage, 20%. For further analysis, the results were filtered again to retain only the contigs whose “max_annotation_level” was “Streptophyta” or “Viridiplantae”. KEGG orthology (KO) and Clusters of Orthologous Groups (COG) distributions were summarised for the final set of contigs.

Mining flavonoid biosynthesis genes

Based on the output from eggNOG-mapper v2, genes potentially involved in flavonoid biosynthesis were identified and compiled (Fig. 3, Additional File 1). Multiple sequence alignment and phylogenetic tree analysis were performed to define gene clustering. For CHS, only the 16 genes with lengths of > 800 bp were included in the analysis, whereas all 11 DFR genes were included. In addition to S. palustris genes, known CHS and DFR genes from other ferns were included for comparison. The CHS genes included were those from D. fragrans (GenBank: KF530802.1, KP420005.1, KP420004.1), D. erythrosora (GenBank: KJ135628.1), and C. thalictroides (GenBank: JX027616.1), whereas the DFR genes included were D.
erythrosora (MK920230.1, MK920231.1, MK920232.1, MK920233.1, MK920234.1) and A. filiculoides (Azfi_s0035.g025620, Azfi_s0245.g059984, Azfi_s0008.g011655, Azfi_s0008.g011657). Multiple sequence alignment was performed using MAFFT version 7 with the L-INS-i algorithm (https://mafft.cbrc.jp/alignment/server/index.html [32]). A phylogenetic tree was constructed using the neighbour-joining method with the Jukes-Cantor substitution model and 1,000 bootstrap replicates, as provided on the MAFFT web server. The phylogenetic tree was edited using Archaeopteryx.js, which is provided on the same server.

Gene expression analysis via quantitative real-time PCR

To confirm the expression of genes involved in flavonoid biosynthesis, quantitative real-time PCR (qRT-PCR) was conducted using RNA extracted from young leaves collected from natural populations at three locations in Central Kalimantan: the UPR, Kalampangan, and Tjilik Riwut (Fig. 5A). Sampling locations were the UPR (2°12′57.2″S, 113°54′04.3″E), Tjilik Riwut (2°14′02.8″S, 113°57′09.1″E), and Kalampangan (2°16′44.8″S, 114°00′30.3″E). The leaves from the UPR site were green, whereas those from Kalampangan and Tjilik Riwut were red. The supernatant phase of the water-methanol-chloroform extract of the UPR leaves was clear, whereas that of the Kalampangan and Tjilik Riwut leaves was pink, probably due to the presence of anthocyanins [33]. The samples were kept fresh in an icebox and were lyophilised for 24 hours before storage at -80°C. RNA extraction and DNase I treatment were performed as previously described. RNA quantity and quality were measured using the NanoDrop spectrophotometer. cDNA synthesis was performed as follows: a mixture consisting of 200 ng total RNA, 1 µL 50 µM oligo(dT)20 primer (Toyobo, Osaka, Japan), and 1 mol dNTP mixture (Takara Shuzo, Kyoto, Japan) was denatured at 65°C for 5 min and then immediately placed on ice.
5× RT buffer (4 µL), RNase inhibitor (1 µL), and M-MLV RT (1 µL, Invitrogen, Tokyo, Japan) were added to the mixture. The cDNA synthesis reaction was then carried out at 42°C for 60 min, followed by denaturation at 70°C for 15 min and a hold at 10°C. A 10 µL reaction mixture was prepared for qRT-PCR: 5 µL SsoAdvanced Universal SYBR Green Supermix (Bio-Rad, Hercules, CA, USA), 2 µL cDNA, and 250 nM forward and reverse primers. The qRT-PCR conditions were as follows: initial denaturation at 95°C for 30 s, followed by 40 cycles of 95°C for 30 s, 58°C for 15 s, and 72°C for 20 s. To examine the specificity of the qRT-PCR amplification, a melting curve analysis was performed after completion of the amplification cycles, from 56 to 95°C with the temperature increased by 0.5°C every 10 s. The expression level of each gene in each sample was calculated relative to that of the internal control gene actin. Primers designed for the selected genes are listed in Table 1. For gene expression analysis, primers were designed to target conserved regions. qRT-PCR was performed using the Bio-Rad CFX Connect machine at the Faculty of Agriculture, Hokkaido University. Three biological replicates were prepared.

RESULTS

Long-read nanopore sequencing and de novo assembly results

Long-read nanopore sequencing generated 15,632,749 raw reads with a total nucleotide length of 13,033,432,831 bp. After filtering, 14,913,665 reads, with an average length of 682.5 bp, were retained for de novo assembly. The de novo assembly produced 112,160 contigs. The redundancy-reduction step using CD-HIT-EST organised the 112,160 contigs into 74,467 clusters, from which the longest contig in each cluster was selected as the representative and used for further analysis. After further filtering based on read coverage, absence of adapter sequences, and GC%, a final set of 47,759 contigs was obtained.
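
To make the coverage and GC% filters and the reported assembly statistics concrete, the following minimal Python sketch applies the two numeric criteria stated in the Methods (read coverage of at least 1 and GC content between 40% and 60%) and computes the N50. It is an illustration only, not the pipeline used in this study; the function names, contig names, sequences, and coverage values are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def filter_contigs(contigs, coverage, min_cov=1.0, min_gc=40.0, max_gc=60.0):
    """Drop contigs with coverage < min_cov or GC% outside [min_gc, max_gc]."""
    kept = {}
    for name, seq in contigs.items():
        if coverage.get(name, 0.0) < min_cov:
            continue
        if not (min_gc <= gc_percent(seq) <= max_gc):
            continue
        kept[name] = seq
    return kept


# Toy data (hypothetical): only contig_a passes both filters.
contigs = {
    "contig_a": "ATGC" * 10,  # GC 50%, coverage 5
    "contig_b": "AT" * 20,    # GC 0%
    "contig_c": "GC" * 20,    # GC 100%
    "contig_d": "ATGC" * 5,   # GC 50%, coverage 0.5
}
coverage = {"contig_a": 5.0, "contig_b": 3.0, "contig_c": 2.0, "contig_d": 0.5}
kept = filter_contigs(contigs, coverage)
```

The contamination and adapter screens are not represented here because they rely on external tools and manual inspection.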
The minimum length was 500 bp, the maximum length was 11,260 bp, and the average length was 1328.8 bp (Fig. 1C). The GC content was 45.15%. Completeness scores based on BUSCO assessment were 66.6% complete (44.9% single-copy and 21.7% duplicated), 6.2% fragmented, and 27.2% missing (Fig. 1C).

Gene annotation based on eggNOG-mapper

The longest ORFs from 47,759 contigs were predicted using ORFipy. The resulting dataset was functionally annotated using eggNOG-mapper. A total of 30,010 contigs were successfully annotated, with the highest annotation level assigned to either Viridiplantae or Streptophyta. KO terms were assigned to each contig; some contigs had a single KO term, whereas others had multiple terms. Based on the KO classifications, 3,745 contigs were associated with genetic information processing, 1,206 with environmental information processing, 82 with organismal systems, and 5,960 with metabolism (Fig. 2A). Among those classified under “Metabolism”, 301 contigs were linked to “biosynthesis of other secondary metabolites”, including genes involved in the biosynthesis of phenylpropanoid, flavone and flavonol, isoflavonoid, flavonoid, and anthocyanin. Based on COG annotations, 5,531 contigs were assigned to information storage and processing, 8,238 to cellular processes and signalling, and 8,858 to metabolism (Fig. 2B). Within the metabolism category, the contigs related to energy production and conversion (C), carbohydrate transport and metabolism (G), and amino acid transport and metabolism (E) were the most abundant. Additionally, 967 contigs were classified as associated with secondary metabolite biosynthesis, transport, and catabolism (Q) (Fig. 2B).

Mining flavonoid biosynthesis-related genes

Genes or contigs encoding enzymes for phenylpropanoid and flavonoid biosynthesis were identified based on eggNOG-mapper annotation. There were 16 genes for PAL, 10 for C4H, 18 for 4CL, 51 for CHS, 7 for CHI, and 11 for DFR (Fig. 3).
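
One way such per-enzyme counts could be tallied from an annotation table is sketched below in Python. This is an assumption-laden illustration, not the procedure used in this study: the column names max_annot_lvl and KEGG_ko, the "taxid|name" and "ko:Kxxxxx" value formats, and the KO identifiers mapped to CHS and DFR are assumptions that should be checked against the actual eggNOG-mapper output and the KEGG database, and the contig names are hypothetical.

```python
from collections import Counter

# Annotation levels retained in the filtering step described in the Methods.
PLANT_LEVELS = ("Viridiplantae", "Streptophyta")

# Assumed KO identifiers for two enzymes; verify against KEGG before use.
KO_TO_ENZYME = {"ko:K00660": "CHS", "ko:K13082": "DFR"}


def count_enzymes(rows):
    """Count contigs per enzyme, using only plant-level annotations."""
    counts = Counter()
    for row in rows:
        level = row["max_annot_lvl"].split("|")[-1]
        if level not in PLANT_LEVELS:
            continue
        kos = [ko.strip() for ko in row["KEGG_ko"].split(",")]
        # A set ensures each contig is counted once per enzyme.
        enzymes = {KO_TO_ENZYME[ko] for ko in kos if ko in KO_TO_ENZYME}
        counts.update(enzymes)
    return counts


# Toy annotation rows (hypothetical contigs and values).
rows = [
    {"query": "contig_1", "max_annot_lvl": "35493|Streptophyta", "KEGG_ko": "ko:K00660"},
    {"query": "contig_2", "max_annot_lvl": "33090|Viridiplantae", "KEGG_ko": "ko:K13082,ko:K00660"},
    {"query": "contig_3", "max_annot_lvl": "2|Bacteria", "KEGG_ko": "ko:K00660"},
    {"query": "contig_4", "max_annot_lvl": "35493|Streptophyta", "KEGG_ko": "-"},
]
counts = count_enzymes(rows)
```

In this toy input, the bacterial hit and the contig without a KO term are ignored, so the tally contains only the two plant-level CHS contigs and the single DFR contig.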
No genes were identified for F3′H, F3′5′H, FLS, ANS, F3GT, and F5GT.

The selected CHS genes identified in this study were aligned with CHS genes from the ferns D. fragrans, D. erythrosora, and C. thalictroides. Phylogenetic analysis using the neighbour-joining method clustered CHS genes into three major groups, designated CHS1, CHS2, and CHS3 (Fig. 4; Additional Files 2–4). The CHS1 cluster included three genes: rb_8683, rb_49489, and rb_54131 (Additional File 2). Compared with rb_49489, rb_54131 had a 264-bp deletion and a 1-bp insertion that shifted the predicted stop codon downstream. rb_8683 showed a 99-bp deletion but retained the original stop codon position (Additional File 2). CHS2 comprised rb_4659, rb_49361, and rb_103035 (Additional File 3). rb_103035 lacked the first 306 bp, whereas rb_4659 and rb_49361 differed by four single nucleotide polymorphisms. The CHS3 group included four genes: rb_7618, rb_74283, rb_15005, and rb_52876 (Additional File 4). While their 5′ regions are highly conserved, variations in nucleotide composition and deletions in the 3′ regions resulted in differences in amino acid sequences and stop codon positions (Additional File 4).

To investigate the relationships among DFR genes, 11 putative DFR sequences, along with DFR genes from D. erythrosora and A. filiculoides, were analysed using the neighbour-joining method (Fig. 5). rb_54792 and rb_103797 clustered with D. erythrosora DFR1 and were classified as part of the DFR1 group. Compared with rb_103797, rb_54792 contained a 117-bp deletion at the 5′ end and an additional 146-bp deletion that resulted in an upstream shift of the stop codon (Additional File 5). rb_13889 grouped with D. erythrosora DFR2 and was designated as part of the DFR2 group (Additional File 6). Three genes, rb_10724, rb_11399, and rb_57372, clustered with D. erythrosora DFR3 and were assigned to the DFR3 group.
Within this group, rb_10724 exhibited a 315-bp deletion at the 5′ end compared with rb_11399, whereas rb_57372 had a 90-bp insertion-deletion (indel) in the same region. Additionally, rb_57372 showed a 114-bp indel and several nucleotide variations in the 3′ region, leading to an earlier stop codon (Additional File 7). The remaining four genes formed a distinct clade, separate from the known DFR genes of A. filiculoides and D. erythrosora, suggesting the presence of novel DFR variants.

To assess gene expression via qRT-PCR, gene-specific primers were designed based on the conserved regions of CHS1, CHS2, DFR1, DFR2, and DFR3. Expression of these genes was confirmed in young leaf tissues collected from populations in the UPR, Kalampangan, and Tjilik Riwut (Fig. 6A). qRT-PCR further validated the expression of both CHS and DFR across all populations (Fig. 6B). However, the variations in leaf and leaf extract colouration observed among the UPR, Kalampangan, and Tjilik Riwut genotypes did not correspond to differences in CHS or DFR gene expression levels.

DISCUSSION

The transcriptome profile of S. palustris was generated via Nanopore long-read sequencing. De novo assembly yielded 47,759 contigs (Fig. 1). However, the completeness of the assembly, as evaluated by BUSCO using the Embryophyta dataset, was 66.6% (Fig. 1). This relatively low score may be attributed to the use of RNA extracted solely from young leaf tissues, which likely did not capture the full transcriptome complexity of S. palustris [26].

Putative genes encoding enzymes involved in flavonoid and anthocyanin biosynthesis were successfully identified. The long-read transcriptome enabled the differentiation of highly similar isoforms of CHS and DFR, highlighting the advantage of this technology for distinguishing closely related sequences in de novo assemblies. The expression of CHS and DFR genes was confirmed in young leaves through qRT-PCR.
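
For readers unfamiliar with normalisation to a reference gene, the comparison of target gene expression with actin described in the Methods can be sketched in Python as follows. The text does not state which formula was used, so the 2^-ΔCt calculation shown here, which assumes approximately 100% amplification efficiency for both primer pairs, is an assumption; the Ct values are invented for illustration and are not data from this study.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct):
    """Expression of a target gene relative to a reference gene (2^-dCt).

    target_ct and reference_ct are lists of Ct values from replicate
    reactions of the same sample. Assumes ~100% amplification efficiency.
    """
    delta_ct = mean(target_ct) - mean(reference_ct)
    return 2 ** (-delta_ct)


# Hypothetical Ct values for one sample (three replicates each).
chs_ct = [24.0, 24.0, 24.0]
actin_ct = [20.0, 20.0, 20.0]
chs_relative = relative_expression(chs_ct, actin_ct)
```

With these invented values the target gene crosses the threshold four cycles later than actin, so its relative expression is 2^-4 of the actin level.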
However, because the primers targeted conserved regions, the expression of individual gene isoforms could not be determined, warranting further investigation. Interestingly, the gene expression levels of CHS and DFR did not correlate with visible leaf pigmentation, suggesting that post-transcriptional mechanisms or metabolic regulation influenced pigment accumulation (Fig. 6B). Samples were collected from natural populations at three sites with distinct soil types: UPR, Kalampangan, and Tjilik Riwut. The observed variation in leaf colouration may be due to underlying genetic differences or environmental factors [16, 34]. Previous studies have shown that environmental conditions influence the accumulation of flavonoids and phenolic compounds in fern leaves. In particular, salinity and full sunlight did not alter total polyphenol content in D. erythrosora or A. nipponicum var. “Red Beauty” but increased the total flavonoid content in D. erythrosora [16, 34]. Similarly, high light intensity and low temperature enhanced anthocyanin accumulation in A. filiculoides fronds [16]. Future studies should assess the expression of flavonoid and anthocyanin biosynthesis genes across various tissues, developmental stages, and controlled environmental conditions to better understand the regulatory mechanisms underlying flavonoid and anthocyanin production in S. palustris.

Notably, genes encoding key enzymes in the flavonoid biosynthetic pathway, such as FLS, ANS, F3GT, and F5GT, were not detected. This finding aligns with that of Ali et al. [19], who reported that several fern species, including S. palustris, lack these genes. Similarly, FLS and ANS were absent in C. richardii [35], although the upstream and downstream genes were present. Although FLS-like and ANS-like genes have been predicted in A. filiculoides [15], BLAST searches of the current transcriptome data did not yield any matches (data not shown).
These results suggest that ferns may utilise alternative biosynthetic routes or enzymes to fulfil the roles typically performed by the canonical FLS and ANS.

This transcriptome profiling represents an important step in the characterisation of S. palustris. Future studies should focus on obtaining full genome sequences to confirm gene copy numbers, identify genetic markers, and support comparative genomics in ferns. Ultimately, integrated transcriptomic and metabolomic analyses will be essential to fully elucidate the flavonoid and anthocyanin biosynthetic pathways in S. palustris.

CONCLUSIONS

In the present study, we used long-read sequencing and de novo assembly to generate a transcriptome profile of S. palustris leaves. The genes involved in flavonoid biosynthesis were also identified. The expression of genes encoding CHS and DFR was confirmed in young leaves of S. palustris. These data will facilitate advancements in the genetic and molecular studies of S. palustris.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data and materials

The datasets generated and/or analysed during the current study are available in the NCBI Sequence Read Archive repository under the BioProject ID PRJNA1237869, BioSample accession number SAMN47441734, and SRA accession number SRR32761149 [https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1237869/, https://www.ncbi.nlm.nih.gov/biosample/SAMN47441734/, https://www.ncbi.nlm.nih.gov/sra/SRR32761149]. Contigs resulting from the de novo assembly were deposited in the NCBI Transcriptome Shotgun Assembly Sequence Database under the accession number GLFY00000000.

Competing interests

The authors declare that they have no competing interests.

Funding

This study was supported by the Heiwa Nakajima Foundation for MSD.

Authors’ contributions

MSD conceived the study, performed experiments, and wrote the manuscript. DR collected the samples and edited the manuscript.
DP performed the real-time PCR analysis. EW, FS, MDPTGP, and YAN selected the sampling locations, helped with sample collection and preparation before RNA extraction, and edited the manuscript.

Acknowledgements
We would like to thank Editage (www.editage.jp) for English language editing.

References
1. Giesen W, Wulffraat S, Zieren M, Scholten L. Mangrove Guidebook for Southeast Asia. Thailand: FAO and Wetlands International; 2006. p. 19.
2. Irawan DC, Wijaya CH, Limin SH, Hashidoko Y, Osaki M, Kulu IP. Ethnobotanical study and nutrient potency of local traditional vegetables in Central Kalimantan. Tropics. 2006;15:441–8.
3. Amoroso VB, Lagumbay AJD, Mendez RA, de la Cruz RY, Villalobos AP. Bioactives in three Philippine edible ferns. Asia Life Sci. 2014;23:445–54.
4. Chai TT, Panirchellvum E, Ong HC, Wong FC. Phenolic contents and antioxidant properties of Stenochlaena palustris, an edible medicinal fern. Bot Stud. 2012;53:439–46.
5. Chai PPK. Midin (Stenochlaena palustris), the popular wild vegetable of Sarawak. Utar Agr Sci J. 2016;2:2.
6. Pandiangan FI, Oslo EA, Destine F, Josephine J, Anwar RN. A review on the health benefits of kalakai (Stenochlaena palustris). JFFN. 2022;4. https://doi.org/10.33555/jffn.v4i1.98.
7. Rahmawati D, Wijaya CH, Hashidoko Y, Djajakirana G, Haraguchi A, Watanabe T, et al. Concentration of some trace elements in two wild edible ferns, Diplazium esculentum and Stenochlaena palustris, inhabiting tropical peatlands under different environments in Central Kalimantan. Eurasian J Res. 2017;20:11–20.
8. Budiman I, Bastoni, Sari ENN, Hadi EE, Asmaliyah, Siahaan H, et al. Progress of paludiculture projects in supporting peatland ecosystem restoration in Indonesia. GECCO. 2020;23:e01084. https://doi.org/10.1016/j.gecco.2020.e01084.
9. Chai TT, Kwek MT, Ong HC, Wong FC. Water fraction of edible medicinal fern Stenochlaena palustris is a potent α-glucosidase inhibitor with concurrent antioxidant activity. Food Chem. 2015;186:26–31.
10. Gunawan-Puteri MDPT, Kato E, Rahmawati D, Teji S, Santoso JA, Pandiangan FI, et al. Post-harvest and extraction conditions for the optimum alpha glucosidase inhibitory activity of Stenochlaena palustris. Int J Technol. 2021;12:649–60.
11. Chen J, Zhong K, Qin S, Jing Y, Liu S, Li D, et al. Astragalin: a food-origin flavonoid with therapeutic effect for multiple diseases. Front Pharmacol. 2023;14:1265960. https://doi.org/10.3389/fphar.2023.1265960.
12. Davies KM, Landi M, van Klink JW, Schwinn KE, Brummell DA, Albert NW, et al. Evolution and function of red pigmentation in land plants. Ann Bot. 2022;130(5):613–36. https://doi.org/10.1093/aob/mcac109.
13. Dao TT, Linthorst HJ, Verpoorte R. Chalcone synthase and its functions in plant resistance. Phytochem Rev. 2011;10:397–412. https://doi.org/10.1007/s11101-011-9211-7.
14. Chen X, Liu W, Huang X, Fu H, Wang Q, Wang Y, et al. Arg-type dihydroflavonol 4-reductase genes from the fern Dryopteris erythrosora play important roles in the biosynthesis of anthocyanins. PLoS ONE. 2020;15:e0232090. https://doi.org/10.1371/journal.pone.0232090.
15. Güngör E, Brouwer P, Dijkhuizen LW, Shaffar DC, Nierop KGJ, de Vos RCH, et al. Azolla ferns testify: seed plants and ferns share a common ancestor for leucoanthocyanidin reductase enzymes. New Phytol. 2020;229:1118–32. https://doi.org/10.1111/nph.16896.
16. Costarelli A, Cannavò S, Cerri M, Pellegrino RM, Reale L, Paolocci F, et al. Light and temperature shape the phenylpropanoid profile of Azolla filiculoides fronds. Front Plant Sci. 2021;12:727667. https://doi.org/10.3389/fpls.2021.727667.
17. Qi X, Kuo LY, Guo C, Li H, Li Z, Qi J, et al. A well-resolved fern nuclear phylogeny reveals the evolution history of numerous transcription factor families. Mol Phylogenet Evol. 2018;127:961–77. https://doi.org/10.1016/j.ympev.2018.06.043.
18. Shen H, Jin D, Shu JP, Zhou XL, Lei M, Wei R, et al. Large-scale phylogenomic analysis resolves a backbone phylogeny in ferns. GigaScience. 2018;7:1–11. https://doi.org/10.1093/gigascience/gix116.
19. Ali ZM, Tan QW, Lim PK, Chen H, Pfeifer L, Julca I, et al. Comparative transcriptomics in ferns reveals key innovations and divergent evolution of the secondary cell walls. Nat Plants. 2025;11:1028–48. https://doi.org/10.1038/s41477-025-01978-y.
20. Kiss T, Karácsony Z, Gomba-Tóth A, Szabadi KL, Spitzmüller Z, Hegyi-Kaló J, et al. A modified CTAB method for the extraction of high-quality RNA from mono- and dicotyledonous plants rich in secondary metabolites. Plant Methods. 2024;20:62. https://doi.org/10.1186/s13007-024-01198-z.
21. Wick RR, Judd LM, Gorrie CL, Holt KE. Completing bacterial genome assemblies with multiplex MinION sequencing. Microb Genom. 2017;3:e000132. https://doi.org/10.1099/mgen.0.000132.
22. Nip KM, Hafezqorani S, Gagalova KK, Chiu R, Yang C, Warren RL, et al. Reference-free assembly of long-read transcriptome sequencing data with RNA-Bloom2. Nat Commun. 2023;14:2940. https://doi.org/10.1038/s41467-023-38553-y.
23. Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22:1658–9. https://doi.org/10.1093/bioinformatics/btl158.
24. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–100. https://doi.org/10.1093/bioinformatics/bty191.
25. Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO Update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 2021;38:4647–54.
26. Waterhouse RM, Seppey M, Simão FA, Manni M, Ioannidis P, Klioutchnikov G, et al. BUSCO applications from quality assessments to gene prediction and phylogenomics. Mol Biol Evol. 2018;35:543–8. https://doi.org/10.1093/molbev/msx319.
27. Nishimura O, Hara Y, Kuraku S. Evaluating genome assemblies and gene models using gVolante. Methods Mol Biol. 2019;1962:247–56. https://doi.org/10.1007/978-1-4939-9173-0_15.
28. Mikheenko A, Saveliev V, Hirsch P, Gurevich A. WebQUAST: online evaluation of genome assemblies. Nucleic Acids Res. 2023;51:W601–6. https://doi.org/10.1093/nar/gkad406.
29. Singh U, Wurtele ES. orfipy: a fast and flexible tool for extracting ORFs. Bioinformatics. 2021;37:3019–20. https://doi.org/10.1093/bioinformatics/btab090.
30. Huerta-Cepas J, Forslund K, Coelho LP, Szklarczyk D, Jensen LJ, von Mering C, et al. Fast genome-wide functional annotation through orthology assignment by eggNOG-Mapper. Mol Biol Evol. 2017;34:2115–22. https://doi.org/10.1093/molbev/msx148.
31. Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. biorxiv. Mol Biol Evol. 2021;38:5825–9. https://doi.org/10.1101/2021.06.03.446934.
32. Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. 2019;20:1160–6.
33. Klaric SV, Galvão Maciel A, Arend GD, Tres MV, de Lima M, Soares LS. Application of plant extracts rich in anthocyanins in the development of intelligent biodegradable packaging: an overview. Processes. 2025;13(1):191. https://doi.org/10.3390/pr13010191.
34. Pietrak A, Salachna P, Łopusiewicz Ł. Changes in growth, ionic status, metabolites content and antioxidant activity of two ferns exposed to shade, full sunlight, and salinity. Int J Mol Sci. 2022;24:296. https://doi.org/10.3390/ijms24010296.
35. Marchant DB, Chen G, Cai S, Chen F, Schafran P, Jenkins J, et al. Dynamic genome evolution in a model fern. Nat Plants. 2022;8(9):1038–51. https://doi.org/10.1038/s41477-022-01226-7.

Additional Declarations
No competing interests reported.

Supplementary Files
AdditionalFile1.xlsx: Additional File 1. List of genes (contigs) annotated as flavonoid biosynthesis genes.
AdditionalFile2.pdf: Additional File 2. Alignment of genes in the CHS1 group. Only coding sequences were used for alignment. Primer binding sites are indicated by blue arrows.
AdditionalFile3.pdf: Additional File 3. Alignment of genes in the CHS2 group.
Only coding sequences were used for alignment. Primer binding sites are indicated by blue arrows.
AdditionalFile4.pdf: Additional File 4. Alignment of genes in the CHS3 group. Only coding sequences were used for alignment.
AdditionalFile5.pdf: Additional File 5. Alignment of genes in the DFR1 group. Only coding sequences were used for alignment. Primer binding sites are indicated by blue arrows.
AdditionalFile6.pdf: Additional File 6. Alignment of genes in the DFR2 group. Only coding sequences were used for alignment. Primer binding sites are indicated by blue arrows.
AdditionalFile7.pdf: Additional File 7. Alignment of genes in the DFR3 group. Only coding sequences were used for alignment. Primer binding sites are indicated by blue arrows.
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7105205","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":510034914,"identity":"5e328c0c-705c-414a-ac33-96cf987f9ca1","order_by":0,"name":"Maria Stefanie Dwiyanti","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABHElEQVRIie2RMUvEMBSAX6joEu/WBPX8C68EuljOv3KlcF06CK4OB0I279aC/guXGwMHdsksFQctglOFuhwIKiaeokJbHAXzDe+9hHy8lwTA4fiLKDKxifZtNDUCjEB9LLoUpHzye+U9IqD6pnTSK2LJnubP2yLPL2syDwXmcangaAjeaXMbXkSSn2ikgU4PM6LHAep70/EiBnKmGhW8jiTblEZRqTklFyEWI6OsKyBZ84RW4S9GEbPqU0lqBa/dypbtgixZWCXAIkVFZLvCb8rjvR0pKCsqDyI5FlxXByqaxrTtLj2dl1cPcrDfnyV38ChDf5on53W9HA78lhczrLFVpvj1I6agftZmgFev8sbtz/1d1qo4HA7H/+INaP1oVIYGbyoAAAAASUVORK5CYII=","orcid":"","institution":"Hokkaido University","correspondingAuthor":true,"prefix":"","firstName":"Maria","middleName":"Stefanie","lastName":"Dwiyanti","suffix":""},{"id":510034915,"identity":"3071edcb-0e17-4f88-8fbe-e88a0ca9134f","order_by":1,"name":"Della Rahmawati","email":"","orcid":"","institution":"Swiss German University","correspondingAuthor":false,"prefix":"","firstName":"Della","middleName":"","lastName":"Rahmawati","suffix":""},{"id":510034916,"identity":"bf592f00-ef91-4e0c-8ddc-bfabac99c3bc","order_by":2,"name":"Maria Dewi Puspitasari Tirtaningtyas Gunawan-Puteri","email":"","orcid":"","institution":"Swiss German University","correspondingAuthor":false,"prefix":"","firstName":"Maria","middleName":"Dewi Puspitasari 
Tirtaningtyas","lastName":"Gunawan-Puteri","suffix":""},{"id":510034917,"identity":"a102eeec-27ec-4519-b165-de3e0c5a5be3","order_by":3,"name":"Deshika Panapitiya","email":"","orcid":"","institution":"Hokkaido University","correspondingAuthor":false,"prefix":"","firstName":"Deshika","middleName":"","lastName":"Panapitiya","suffix":""},{"id":510034918,"identity":"d4e20a35-a159-40fd-880a-b1a29b98e7a7","order_by":4,"name":"Yanetri Asi Nion","email":"","orcid":"","institution":"University of Palangka Raya","correspondingAuthor":false,"prefix":"","firstName":"Yanetri","middleName":"Asi","lastName":"Nion","suffix":""},{"id":510034919,"identity":"2a5823ee-8dd1-4f2b-b0f6-005d4fa2586f","order_by":5,"name":"Elza Wijaya","email":"","orcid":"","institution":"Swiss German University","correspondingAuthor":false,"prefix":"","firstName":"Elza","middleName":"","lastName":"Wijaya","suffix":""},{"id":510034921,"identity":"b83648cc-74c7-4c12-9539-eadc2010e129","order_by":6,"name":"Filiana Santoso","email":"","orcid":"","institution":"Swiss German University","correspondingAuthor":false,"prefix":"","firstName":"Filiana","middleName":"","lastName":"Santoso","suffix":""}],"badges":[],"createdAt":"2025-07-12 02:23:19","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7105205/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7105205/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":90664366,"identity":"5ff337f3-16cf-41a0-8bf4-b5ef2aa59df6","added_by":"auto","created_at":"2025-09-05 12:17:46","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":483997,"visible":true,"origin":"","legend":"\u003cp\u003eA) Sampling location on Kalimantan Island. The map was sourced from Google Maps (https://maps.google.com/. Accessed April 23, 2025). B) Leaf samples used for Nanopore long-read sequencing. 
C) Assembly statistics.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/69f3eccfe79b34df1cc9df64.png"},{"id":90664369,"identity":"54c7c993-2733-43f7-bcae-8b9bb11d88c1","added_by":"auto","created_at":"2025-09-05 12:17:46","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":142651,"visible":true,"origin":"","legend":"\u003cp\u003eEggNOG-mapper analysis result. Genes were categorised based on A. KEGG orthology groups and B. Clusters of Orthologous Groups. The numbers show the count of genes belonging to each group.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/1eaf2c28f306a2287adc62a2.png"},{"id":90664367,"identity":"d371cc95-0888-4b02-826a-66018cb56f56","added_by":"auto","created_at":"2025-09-05 12:17:46","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":85608,"visible":true,"origin":"","legend":"\u003cp\u003ePhenylpropanoid, flavonoid, and anthocyanin pathway for \u003cem\u003eStenochlaena palustris.\u003c/em\u003e PAL, phenylalanine ammonia-lyase; C4H, cinnamate 4-hydroxylase; 4CL, 4-coumarate:CoA ligase; CHS, chalcone synthase; CHI, chalcone isomerase; F3H, flavanone 3-hydroxylase; DFR, dihydroflavonol 4-reductase; ANS, anthocyanidin synthase; F3GT, flavonoid 3-\u003cem\u003eO\u003c/em\u003e-glucosyltransferase; F5GT, flavonoid 5-\u003cem\u003eO\u003c/em\u003e-glucosyltransferase; FLS, flavonol synthase; F3ʹH, flavonoid 3ʹ-hydroxylase; F3ʹ5ʹH, flavonoid 3ʹ,5ʹ-hydroxylase; UGT, UDP-dependent glycosyltransferase. 
The numbers in brackets show the number of genes identified for each enzyme.\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/c6af014d6ecfb7f61f5a8c3d.png"},{"id":90665154,"identity":"90b79f96-6225-4e91-9ba4-1b9d979f20fe","added_by":"auto","created_at":"2025-09-05 12:25:46","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":172497,"visible":true,"origin":"","legend":"\u003cp\u003eNeighbor-joining tree of chalcone synthase (\u003cem\u003eCHS\u003c/em\u003e)\u003cem\u003e \u003c/em\u003egenes. Fifty-one CHS-coding genes obtained in this study and \u003cem\u003eCHS\u003c/em\u003e genes from \u003cem\u003eDryopteris fragrans\u003c/em\u003e(KF530802.1), \u003cem\u003eD. erythrosora\u003c/em\u003e(KJ135628.1) and \u003cem\u003eCeratopteris thalictroides\u003c/em\u003e (JX027616.1) are included. The tree was built using 1,000 bootstrap replicates.\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/61483a007707e9921026f568.png"},{"id":90665152,"identity":"2deacec0-d49f-44cc-8d72-29a62e9f859d","added_by":"auto","created_at":"2025-09-05 12:25:46","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":79848,"visible":true,"origin":"","legend":"\u003cp\u003eNeighbor-joining tree of dihydroflavonol 4-reductase (\u003cem\u003eDFR\u003c/em\u003e)\u003cem\u003e \u003c/em\u003egenes. DFR-coding genes obtained in this study and five \u003cem\u003eDFR\u003c/em\u003e genes from \u003cem\u003eDryopteris erythrosora, \u003c/em\u003eand four \u003cem\u003eDFR \u003c/em\u003egenes from \u003cem\u003eAzolla filiculoides \u003c/em\u003eare included. 
The tree was built using 1,000 bootstrap replicates.\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/e3d2c34c8c58303db875395a.png"},{"id":90665381,"identity":"ff2637a5-4661-4000-a49c-4a4e447d01f6","added_by":"auto","created_at":"2025-09-05 12:33:46","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":192091,"visible":true,"origin":"","legend":"\u003cp\u003eA. Young leaf samples from the University of Palangka Raya (UPR), Tjilik Riwut, and Kalampangan sites, and water-methanol-chloroform extract from the corresponding samples. Sampling locations are the UPR (2°12′57.2″S, 113°54′04.3″E), Tjilik Riwut (2°14′02.8″S, 113°57′09.1″E), and Kalampangan (2°16′44.8″S, 114°00′30.3″E). The map was sourced from Google Maps, 2025. B. Expression levels of \u003cem\u003eCHS1, CHS2, DFR1\u003c/em\u003e, \u003cem\u003eDFR2\u003c/em\u003e, and \u003cem\u003eDFR3 \u003c/em\u003egenes in young leaves of the UPR, Kalampangan, and Tjilik Riwut samples. Expression levels were normalised to actin. Three replicates were prepared for each sample. 
Asterisks indicate a statistically significant difference at \u003cem\u003ep\u003c/em\u003e\u0026lt;0.05.\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/236a06c983f97cd0e00477e7.png"},{"id":90798063,"identity":"a66765cc-3ac6-4b7b-8bb0-7bfffc6a232b","added_by":"auto","created_at":"2025-09-08 09:32:23","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1901386,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/34c459fe-69fd-4b03-bd07-2db13e2ec810.pdf"},{"id":90666279,"identity":"84e5ead4-dd3c-4d68-91fd-501c2deef7d4","added_by":"auto","created_at":"2025-09-05 12:41:46","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":16943,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional File 1. List of genes (contigs) annotated as flavonoid biosynthesis genes.\u003c/p\u003e","description":"","filename":"AdditionalFile1.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/f2688aa7966dcc93a49fabf0.xlsx"},{"id":90664371,"identity":"5757c447-1a10-4611-a737-9099d0a3d19d","added_by":"auto","created_at":"2025-09-05 12:17:46","extension":"pdf","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":108624,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional File 2. Alignment of genes in the \u003cem\u003eCHS1\u003c/em\u003egroup. Only coding sequences were used for alignment. 
Primer binding sites are indicated by blue arrows.\u003c/p\u003e","description":"","filename":"AdditionalFile2.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/e70ac82855061734bf8edc27.pdf"},{"id":90664376,"identity":"198168a4-d05c-4de8-aacf-cc3a935df08a","added_by":"auto","created_at":"2025-09-05 12:17:46","extension":"pdf","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":88064,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional File 3. Alignment of genes in the \u003cem\u003eCHS2\u003c/em\u003egroup. Only coding sequences were used for alignment. Primer binding sites are indicated by blue arrows.\u003c/p\u003e","description":"","filename":"AdditionalFile3.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/ec1e4c6410268577602f4402.pdf"},{"id":90664378,"identity":"ce8df59d-0f86-4950-826f-f10a3ce722c2","added_by":"auto","created_at":"2025-09-05 12:17:46","extension":"pdf","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":1187414,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional File 4. Alignment of genes in the \u003cem\u003eCHS3\u003c/em\u003egroup. Only coding sequences were used for alignment.\u003c/p\u003e","description":"","filename":"AdditionalFile4.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/4af913fa2c13b937ae2c8c87.pdf"},{"id":90665383,"identity":"6f41bb40-5f78-489b-9244-c41bbe12ac1c","added_by":"auto","created_at":"2025-09-05 12:33:46","extension":"pdf","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":97950,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional File 5. Alignment of genes in the \u003cem\u003eDFR1\u003c/em\u003egroup. Only coding sequences were used for alignment. 
Primer binding sites are indicated by blue arrows.\u003c/p\u003e","description":"","filename":"AdditionalFile5.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/372656a2325afad79ee5cb8c.pdf"},{"id":90665156,"identity":"b09c252c-988f-420b-bd77-d34a32a00265","added_by":"auto","created_at":"2025-09-05 12:25:46","extension":"pdf","order_by":6,"title":"","display":"","copyAsset":false,"role":"supplement","size":62291,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional File 6. Alignment of genes in the \u003cem\u003eDFR2\u003c/em\u003egroup. Only coding sequences were used for alignment. Primer binding sites are indicated by blue arrows.\u003c/p\u003e","description":"","filename":"AdditionalFile6.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/288209788b95f681360102d7.pdf"},{"id":90664386,"identity":"a4a2ef39-b256-4628-a8b2-d8e76d84bd7d","added_by":"auto","created_at":"2025-09-05 12:17:47","extension":"pdf","order_by":7,"title":"","display":"","copyAsset":false,"role":"supplement","size":116438,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional File 7. Alignment of genes in the \u003cem\u003eDFR3\u003c/em\u003egroup. Only coding sequences were used for alignment. 
Primer binding sites are indicated by blue arrows.\u003c/p\u003e","description":"","filename":"AdditionalFile7.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7105205/v1/4f9bb2fc13eaecea91d7e005.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Organ-Specific Long-Read Transcriptome Assembly of Stenochlaena palustris and Annotation of Anthocyanin Biosynthesis Genes","fulltext":[{"header":"BACKGROUND","content":"\u003cp\u003e\u003cem\u003eStenochlaena palustris\u003c/em\u003e, a climbing fern of the Blechnaceae family, is a naturally distributed plant in South Asia, Southeast Asia, Australia, and Polynesia [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. It is known by several local names, such as kelakai/kalakai in Kalimantan (Indonesia); lemidin, midin, or paku midin in Malaysia; and diliman or hagnaya in the Philippines [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. In these regions, young leaves and fronds of \u003cem\u003eS. palustris\u003c/em\u003e are collected from wild habitats and consumed as a vegetable [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. In addition to its culinary uses, \u003cem\u003eS. 
palustris\u003c/em\u003e has been traditionally used for its medicinal properties for treating conditions, such as ulcers, stomachaches, fever, diarrhoea, and skin infection [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. It also treats anaemia, promotes breast milk production, aids recovery after childbirth, prevents diabetes, and reduces antimicrobial activity [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Additionally, \u003cem\u003eS. palustris\u003c/em\u003e is a hardy plant capable of growing in challenging environments, such as acidic peatlands. In Kalimantan, it is commonly found in various habitats, including acidic swamp areas, riversides, natural and secondary forests, palm oil plantations, and residential zones [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. It is also known as a pioneer plant, which can grow first after land disturbances, such as peatland forest fires, thus creating conditions conducive to the establishment of other plant species [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. These properties render \u003cem\u003eS. palustris\u003c/em\u003e a valuable source of vegetables and medicines, particularly for the people of Kalimantan.\u003c/p\u003e\u003cp\u003eStudies have shown that fronds and young leaves of \u003cem\u003eS. palustris\u003c/em\u003e possess antioxidant properties. 
Water extracts of fronds contain anthocyanins, polyphenols, and hydroxycinnamic acids, which contribute to their antioxidant activity [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. Young leaves of \u003cem\u003eS. palustris\u003c/em\u003e are either red- or green-coloured, indicating the differences in anthocyanin content. Young red leaves turn green when they mature [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Two independent studies identified the α-glucosidase inhibitor activity of \u003cem\u003eS. palustris\u003c/em\u003e [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. This inhibitory activity may be attributed to hydroxycinnamic acids or astragalin [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. Astragalin is kaempferol 3-O-β- glucopyranoside, a glucosylated form of kaempferol. It is found in various plant species, including persimmon leaves, lotus leaves, green tea seeds, and roots of \u003cem\u003eAstragalus membranaceus\u003c/em\u003e [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eUnderstanding the regulation of flavonoid and anthocyanin biosynthesis is crucial for further phytochemical-related research on \u003cem\u003eS. palustris.\u003c/em\u003e Flavonoid biosynthesis begins with the conversion of phenylalanine to the flavonoid precursor \u003cem\u003ep\u003c/em\u003e-coumaroyl-CoA through a series of enzymatic steps [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. 
The key enzymes involved in this process are phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H), and 4-coumaroyl-CoA ligase (4CL) [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. \u003cem\u003ep\u003c/em\u003e-Coumaroyl-CoA is converted to naringenin chalcone by chalcone synthase (CHS). This intermediate is further converted into naringenin by chalcone isomerase (CHI). The addition of a hydroxyl group at the 3-position of naringenin results in the formation of dihydrokaempferol, a process catalysed by flavanone 3-hydroxylase (F3H). Dihydrokaempferol is subsequently converted into kaempferol by flavonol synthase (FLS). Kaempferol is glycosylated by UDP-dependent glycosyltransferases (UGT) to form astragalin. Additionally, dihydrokaempferol serves as a precursor for anthocyanins, with several enzymes involved in the conversion, including dihydroflavonol 4-reductase (DFR), flavonoid 3ʹ-hydroxylase (F3ʹH), flavonoid 3ʹ,5ʹ-hydroxylase (F3ʹ5ʹH), anthocyanidin synthase (ANS), flavonoid 3-O-glucosyltransferase (F3GT), and flavonoid 5-O-glucosyltransferase (F5GT).\u003c/p\u003e\u003cp\u003eThe growing interest in flavonoid and anthocyanin biosynthesis in ferns has led to the identification of related genes. The first step in flavonoid biosynthesis, which is the conversion of \u003cem\u003ep\u003c/em\u003e-coumaryl CoA to naringenin chalcone by CHS, is often rate-limiting. CHS genes were isolated from the ferns \u003cem\u003eDryopteris fragrans\u003c/em\u003e (GenBank accession numbers KF530802.1, KP420005.1, and KP420004.1), \u003cem\u003eD. erythrosora\u003c/em\u003e (GenBank accession number KJ135628.1), and \u003cem\u003eCeratopteris thalictroides\u003c/em\u003e (GenBank: JX027616.1). As CHS can exist as a single copy or form a multigene family, the actual number of CHS-expressing genes in ferns may be higher [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. 
DFR is the key enzyme involved in anthocyanin biosynthesis. DFR genes have been identified and characterised in water ferns (\u003cem\u003eAzolla filiculoides\u003c/em\u003e) and \u003cem\u003eD. erythrosora.\u003c/em\u003e Chen et al. [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] isolated two DFR genes from \u003cem\u003eD. erthyrosora\u003c/em\u003e, namely \u003cem\u003eDeDFR1\u003c/em\u003e (GenBank: MK920230) and \u003cem\u003eDeDFR2\u003c/em\u003e (GenBank: MK920231). Based on a GenBank search, Chen et al. [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] deposited three other DFR genes: MK920232, MK920233, and MK920234. In this study, these genes were designated as \u003cem\u003eDFR3\u003c/em\u003e, \u003cem\u003eDFR4\u003c/em\u003e, and \u003cem\u003eDFR5\u003c/em\u003e. Two DFR gene families were identified in \u003cem\u003eA. filiculoides\u003c/em\u003e: \u003cem\u003eDFR1\u003c/em\u003e consisting of Azfi_s0035.g025620 and Azfi_s0245.g059984 and \u003cem\u003eDFR2\u003c/em\u003e with Azfi_s0008.g011655 and Azfi_s0008.g011657. [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eLimited genomic information on \u003cem\u003eS. palustris\u003c/em\u003e hinders the characterisation of flavonoid biosynthesis regulation. Currently, only three transcriptomic datasets are publicly available in the National Center for Biotechnology Information Sequence Read Archive (NCBI SRA) database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ncbi.nlm.nih.gov/sra\u003c/span\u003e\u003cspan address=\"https://www.ncbi.nlm.nih.gov/sra\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e; last accessed March 20th, 2025). 
The three datasets were generated from samples from China and Singapore [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e, \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. Shen et al. [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e] generated the transcriptome data for \u003cem\u003eS. palustris\u003c/em\u003e sporophyll and trophophyll tissues. Sequencing was performed using the Illumina HiSeq 2500 system. \u003cem\u003eDe novo\u003c/em\u003e assembly produced 58,416 contigs with 945.83 bp average length and 48.15% GC content [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. Ali et al. [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e] determined the transcriptome profile of \u003cem\u003eS. palustris\u003c/em\u003e from various organs, such as sterile and fertile leaves, petioles, rhizomes, and roots. Sequencing was performed using the Illumina NovaSeq 6000 system. The \u003cem\u003ede novo\u003c/em\u003e assembly produced 54,843 contigs with 832.1 bp average length and 45.25% GC content [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. The \u003cem\u003ede novo\u003c/em\u003e assembly of the transcriptome data produced by Ali et al. [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e] is publicly available (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://conekt.plant.tools/\u003c/span\u003e\u003cspan address=\"https://conekt.plant.tools/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e, last accessed March 21st, 2025).\u003c/p\u003e\u003cp\u003eIn this study, we used the Nanopore long-read sequencing platform to obtain transcriptome data from leaf samples collected from Palangka Raya, Indonesia. 
The data obtained in the present study can support genomic research on \u003cem\u003eS. palustris\u003c/em\u003e by providing additional information that complements previously published data. In the present study, we identified genes potentially involved in flavonoid and anthocyanin biosynthesis, particularly those encoding CHS and DFR.\u003c/p\u003e"},{"header":"METHODS","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003eSample collection, RNA extraction, and sequencing\u003c/h2\u003e\u003cp\u003eYoung green nonfibrous sterile leaves from \u003cem\u003eS. palustris\u003c/em\u003e naturally grown in the University of Palangka Raya (UPR), Indonesia (2\u0026deg;12\u0026prime;57.2\u0026Prime;S, 113\u0026deg;54\u0026prime;04.3\u0026Prime;E) were collected and immediately stored in RNAlater solution (ThermoFisher Scientific, Japan). Plant material from a location in UPR (2\u0026deg;12\u0026prime;48.4\u0026Prime;S, 113\u0026deg;54\u0026prime;02.8\u0026Prime;E) had been previously identified at the Herbarium Bogoriense, Research Center for Biology, Cibinong, Indonesia (No. 252/IPH.1.01/If.07/II/2019).\u003c/p\u003e\u003cp\u003eThe samples were stored at 4\u0026deg;C until RNA extraction. RNA was extracted from leaves using 2% cetyltrimethylammonium bromide (CTAB) and 4% polyvinylpyrrolidone (PVP) extraction buffer, following the protocol of Kiss et al. [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e], with several modifications. Leaf tissue (5 cm) was ground to a fine powder in liquid nitrogen using a mortar and a pestle. A preheated extraction buffer (6 mL) was added to the powder, and the mixture was homogenised to remove clumps. The mixture was transferred to six 2 mL tubes. The tubes were then incubated at 65\u0026deg;C for 10 min. Seven hundred microlitres of chloroform:isoamyl alcohol (24:1) was added to each tube. The tubes were vortexed and centrifuged at 15,000 rpm for 10 min at 4\u0026deg;C. 
The upper phase was transferred to a new 1.5 mL tube, and an equal volume of chloroform:isoamyl alcohol (24:1) was added. The addition of chloroform:isoamyl alcohol was repeated three times. The upper phase was then mixed with 500 \u0026micro;L 8 M LiCl. The mixture was incubated at 4\u0026deg;C overnight. The following day, the solution was centrifuged at 15,000 rpm for 45 min at 4\u0026deg;C. The RNA pellet was washed with 50 \u0026micro;L ice-cold ethanol (80% v/v) and then centrifuged at 15,000 rpm for 5 min at 4\u0026deg;C. The supernatant was carefully removed using a pipette, and the tubes were centrifuged and then dried at 37\u0026deg;C to remove any residual ethanol. The RNA pellets were resuspended in 270 \u0026micro;L RNase-free water. Two microlitres of DNase I (Nippongene, Japan), 27 \u0026micro;L DNase I buffer, and 1 \u0026micro;L recombinant RNase inhibitor (Takara Shuzo, Kyoto, Japan) were added to the RNA solution, which was then incubated at 37\u0026deg;C for 30 min. DNase was removed using phenol:chloroform:isoamyl alcohol (25:24:1). RNA was precipitated in 99.5% ethanol overnight at -30\u0026deg;C and then centrifuged at 15,000 rpm for 45 min at 4\u0026deg;C. The RNA pellet was washed using 100 \u0026micro;L cold 80% ethanol, followed by another centrifugation at 15,000 rpm for 5 min at 4\u0026deg;C. Residual ethanol was removed using a pipette, and the pellet was air-dried at room temperature. The pellet was eluted in 50 \u0026micro;L RNase/DNase-free water.\u003c/p\u003e\u003cp\u003eRNA quantity was determined using a NanoDrop spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA) and a Qubit fluorometer (Thermo Fisher Scientific, Japan). RNA quality was assessed via 1% gel electrophoresis and a Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Transcriptome sequencing was performed on the PromethION platform using a cDNA-PCR Barcoding Kit 24 V14 (SQK-PCR114.24) at GeneBay Inc., Japan. 
Base calling was performed using the Dorado base-call server (version 7.4). A high-accuracy model was used to generate sequence reads after filtering. To ensure high-quality data in the downstream analysis, low-quality and short reads were removed, and barcode sequences were trimmed using Porechop 0.2.4 [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e], with the remove-middle option applied to eliminate barcodes embedded within the reads.\u003c/p\u003e\u003cp\u003e\u003cb\u003eDe novo\u003c/b\u003e \u003cb\u003eassembly, redundancy reduction, and assembly quality assessment\u003c/b\u003e\u003c/p\u003e\u003cp\u003e\u003cem\u003eDe novo\u003c/em\u003e sequence assembly of reads that passed filtering was performed using RNA-Bloom2 version 2.0.1, with the default parameters [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. To reduce redundancy, the contigs were clustered based on sequence similarity using CD-HIT-EST in the CD-HIT package version 4.8.1 [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e], with the following parameters: sequence identity threshold (-c 0.90) and word size (-n 8). 
The resulting contigs were discarded if 1) read coverage was\u0026thinsp;\u0026lt;\u0026thinsp;1, based on reads mapped to each contig using minimap [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]; 2) the sequence was predicted to originate from a non-fern organism by the NCBI Foreign Contamination Screen (NCBI FCS) provided in Galaxy (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://usegalaxy.org\u003c/span\u003e\u003cspan address=\"https://usegalaxy.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e, last accessed March 27th, 2025); 3) barcode adapters remained in the sequence, as identified by manual filtering before upload to the NCBI TSA database; or 4) the GC content of the contig was more than 60% or less than 40%. The completeness of the assembly was evaluated using Benchmarking Universal Single-Copy Orthologs v5 (BUSCO v5) [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]. BUSCO analysis was run online with the Embryophyta ortholog set from OrthoDB v10 (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://gvolante.riken.jp\u003c/span\u003e\u003cspan address=\"https://gvolante.riken.jp\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]). 
The contig length distribution (minimum, maximum, and average length) and GC content of the final dataset were assessed using QUAST [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e].\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eIdentification of open reading frames\u003c/h3\u003e\n\u003cp\u003eOpen reading frames (ORFs) for each contig were predicted using ORFipy version 0.0.4 [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. ORFipy was run using the following parameters: start ATG, to search for ORFs beginning at the common start codon ATG, and translation table 1, the standard genetic code, which is applicable to most organisms. Option \u0026ndash;longest was also added to filter and output the longest ORFs. The longest ORFs predicted for each contig were assembled into a dataset to predict gene function.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003ePrimers used for quantitative real-time PCR.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"3\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGene\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003ePrimer name\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eSequence 
(5\u0026prime;-3\u0026prime;)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eCHS\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCHS1-F\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCAGAAGGTGTTTGGTGAAGATGCTC\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCHS1-R\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eTCGACATGTTCCCGAATTCTGAGAG\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCHS2-F\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eGGCTTACATTCCACCTCATGAAG\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCHS2-R\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eGACATGTTTCCATAGTCCGACAG\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eDFR\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eDFR1-F\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eGAGAAAGCGGCGGTGGAGTT\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eDFR1-R\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003eGCTATTGGGAATCTTGGAAAG\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eDFR2-F\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eTCTCCCTTGTCACAGGTGAT\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eDFR2-R\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eGCGGACCCAATATAGCGACC\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eDFR3-F\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCTGTGGAGTACTTAAGGCACAAGA\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eDFR3-R\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eTGGAATAGAAGGTGTGAGAAAGGG\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eActin\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eActin-F\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eGAGACCACTTACAACTCCATCATG\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eActin-R\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003eTCCAGACACTGTATTTTCTCAGGAGG\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\n\u003ch3\u003eGene functional annotation using eggNOG-mapper v2\u003c/h3\u003e\n\u003cp\u003eFunctional annotation was performed using the eggNOG-mapper v2 web version (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://eggnog-mapper.embl.de\u003c/span\u003e\u003cspan address=\"http://eggnog-mapper.embl.de\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e, \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]) by uploading the longest contigs obtained from the CD-HIT-EST analysis. The input consisted of nucleotide sequences, and the following parameters were used: genomic data; gene prediction method, BLASTX-like; allow frameshifts (Diamond \u0026ndash;frameshift option); database, eggNOG 5; minimum hit e-value, 0.001; minimum hit bit-score, 60; percentage identity, 40%; minimum query coverage, 20%; and minimum subject coverage, 20%. For further analysis, an additional filtering step retained only the contigs whose \u0026ldquo;max_annotation_level\u0026rdquo; was \u0026ldquo;Streptophyta\u0026rdquo; or \u0026ldquo;Viridiplantae\u0026rdquo;. KEGG orthology (KO) and Clusters of Orthologous Groups (COG) distributions were summarised for the final set of contigs.\u003c/p\u003e\n\u003ch3\u003eMining flavonoid biosynthesis genes\u003c/h3\u003e\n\u003cp\u003eBased on the output from the eggNOG-mapper v2, genes potentially involved in flavonoid biosynthesis were identified and compiled (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Additional File 1). Multiple sequence alignment and phylogenetic tree analysis were performed to define gene clustering. 
For CHS, only 16 genes with lengths of \u0026gt;\u0026thinsp;800 bp were included in the analysis, whereas all 11 DFR genes were included. In addition to \u003cem\u003eS. palustris\u003c/em\u003e genes, known CHS and DFR genes from other ferns were included for comparison. The CHS genes included were from \u003cem\u003eD. fragrans\u003c/em\u003e (GenBank: KF530802.1, KP420005.1, KP420004.1), \u003cem\u003eD. erythrosora\u003c/em\u003e (GenBank: KJ135628.1), and \u003cem\u003eC. thalictroides\u003c/em\u003e (GenBank: JX027616.1), whereas the DFR genes included were from \u003cem\u003eD. erythrosora\u003c/em\u003e (MK920230.1, MK920231.1, MK920232.1, MK920233.1, MK920234.1) and \u003cem\u003eA. filiculoides\u003c/em\u003e (Azfi_s0035.g025620, Azfi_s0245.g059984, Azfi_s0008.g011655, Azfi_s0008.g011657). Multiple sequence alignment was performed using MAFFT version 7 with the L-INS-i algorithm (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://mafft.cbrc.jp/alignment/server/index.html\u003c/span\u003e\u003cspan address=\"https://mafft.cbrc.jp/alignment/server/index.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]). A phylogenetic tree was constructed using the neighbour-joining method with the Jukes-Cantor substitution model and 1,000 bootstrap replicates, as provided on the MAFFT web server. 
The phylogenetic tree was edited using Archaeopteryx.js, which is provided on the same server.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eGene expression analysis via quantitative real-time PCR\u003c/h3\u003e\n\u003cp\u003eTo confirm the expression of genes involved in flavonoid biosynthesis, quantitative real-time PCR (qRT-PCR) was conducted using RNA extracted from young leaves collected from natural populations at three locations in Central Kalimantan: the UPR, Kalampangan, and Tjilik Riwut (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e5\u003c/span\u003eA). Sampling locations were the UPR (2\u0026deg;12\u0026prime;57.2\u0026Prime;S, 113\u0026deg;54\u0026prime;04.3\u0026Prime;E), Tjilik Riwut (2\u0026deg;14\u0026prime;02.8\u0026Prime;S, 113\u0026deg;57\u0026prime;09.1\u0026Prime;E), and Kalampangan (2\u0026deg;16\u0026prime;44.8\u0026Prime;S, 114\u0026deg;00\u0026prime;30.3\u0026Prime;E). The leaves from the UPR site were green, whereas those from Kalampangan and Tjilik Riwut were red. The supernatant phase of the water-methanol-chloroform extract of the UPR leaves was clear, whereas that of Kalampangan and Tjilik Riwut leaves was pink, probably due to the presence of anthocyanins [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. The samples were kept fresh in an icebox and were lyophilised for 24 hours before storage at -80\u0026deg;C. RNA extraction and DNase I treatment were performed as described above. RNA quantity and quality were measured using the NanoDrop spectrophotometer.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003ecDNA synthesis was performed as follows: a mixture consisting of 200 ng total RNA, 1 \u0026micro;L 50 \u0026micro;M oligo (dT)\u003csub\u003e20\u003c/sub\u003e primer (Toyobo, Osaka, Japan), and 1 mol dNTP mixture (Takara Shuzo, Kyoto, Japan) was denatured at 65\u0026deg;C for 5 min and then immediately placed on ice. 
5\u0026times; RT buffer (4 \u0026micro;L), RNase inhibitor (1 \u0026micro;L), and M-MLV RT (1 \u0026micro;L, Invitrogen, Tokyo, Japan) were added to the mixture. The cDNA synthesis reaction was then carried out at 42\u0026deg;C for 60 min, followed by denaturation at 70\u0026deg;C for 15 min and a hold at 10\u0026deg;C.\u003c/p\u003e\u003cp\u003eA 10 \u0026micro;L reaction mixture was prepared for qRT-PCR: 5 \u0026micro;L SsoAdvanced Universal SYBR Green Supermix (Bio-Rad, Hercules, CA, USA), 2 \u0026micro;L cDNA, and 250 nM forward and reverse primers. The qRT-PCR conditions were as follows: initial denaturation at 95\u0026deg;C for 30 s, followed by 40 cycles each at 95\u0026deg;C for 30 s, 58\u0026deg;C for 15 s, and 72\u0026deg;C for 20 s. To examine the specificity of the qRT-PCR amplification, melting curve analysis was performed from 56 to 95\u0026deg;C after completion of the amplification cycles, with the temperature increased by 0.5\u0026deg;C every 10 s. The gene expression level of each sample was calculated relative to the expression level of the internal control gene actin. Primers designed for the selected genes are listed in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. For gene expression analysis, primers were designed to target the conserved regions. qRT-PCR was performed using the Bio-Rad CFX Connect system at the Faculty of Agriculture, Hokkaido University. Three biological replicates were prepared.\u003c/p\u003e"},{"header":"RESULTS","content":"\u003cp\u003e\u003cb\u003eLong-read nanopore sequencing and\u003c/b\u003e \u003cb\u003ede novo\u003c/b\u003e \u003cb\u003eassembly results\u003c/b\u003e\u003c/p\u003e\u003cp\u003eLong-read nanopore sequencing generated 15,632,749 raw reads with a total nucleotide length of 13,033,432,831 bp. After filtering, 14,913,665 reads, with an average length of 682.5 bp, were retained for \u003cem\u003ede novo\u003c/em\u003e assembly. 
The \u003cem\u003ede novo\u003c/em\u003e assembly produced 112,160 contigs. The redundancy-reduction step using CD-HIT-EST grouped the 112,160 contigs into 74,467 clusters; the longest contig in each cluster was selected as the representative and used for further analysis. After further filtering based on read coverage, absence of adapter sequences, and GC content, a final set of 47,759 contigs was obtained. The minimum length was 500 bp, the maximum length was 11,260 bp, and the average length was 1,328.8 bp (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e1\u003c/span\u003eC). The GC content was 45.15%. Completeness scores based on BUSCO assessment were 66.6% complete (44.9% single-copy and 21.7% duplicated), 6.2% fragmented, and 27.2% missing (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e1\u003c/span\u003eC).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eGene annotation based on eggNOG-mapper\u003c/h3\u003e\n\u003cp\u003eThe longest ORFs from 47,759 contigs were predicted using ORFipy. The resulting dataset was functionally annotated using eggNOG-mapper. A total of 30,010 contigs were successfully annotated, with the highest annotation level assigned to either \u003cem\u003eViridiplantae\u003c/em\u003e or \u003cem\u003eStreptophyta\u003c/em\u003e. KO terms were assigned to each contig; some contigs had a single KO term, whereas others had multiple terms. Based on the KO classifications, 3,745 contigs were associated with genetic information processing, 1,206 with environmental information processing, 82 with organismal systems, and 5,960 with metabolism (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). 
Among those classified under \u0026ldquo;Metabolism\u0026rdquo;, 301 contigs were linked to \u0026ldquo;biosynthesis of other secondary metabolites\u0026rdquo;, including genes involved in the biosynthesis of phenylpropanoid, flavone and flavonol, isoflavonoid, flavonoid, and anthocyanin.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eBased on COG annotations, 5,531 contigs were assigned to information storage and processing, 8,238 to cellular processes and signaling, and 8,858 to metabolism (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). Within the metabolism category, the contigs related to energy production and conversion (C), carbohydrate transport and metabolism (G), and amino acid transport and metabolism (E) were the most abundant. Additionally, 967 contigs were classified as associated with secondary metabolite biosynthesis, transport, and catabolism (Q) (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eB).\u003c/p\u003e\n\u003ch3\u003eMining flavonoid biosynthesis-related genes\u003c/h3\u003e\n\u003cp\u003eGenes or contigs encoding enzymes for phenylpropanoid and flavonoid biosynthesis were identified based on eggNOG-mapper annotation. There were 16 genes for PAL, 10 for C4H, 18 for 4CL, 51 for CHS, 7 for CHI, and 11 for DFR (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e3\u003c/span\u003e). No genes were identified for F3\u0026prime;H, F3\u0026prime;5\u0026prime;H, FLS, ANS, F3GT, and F5GT.\u003c/p\u003e\u003cp\u003eThe selected \u003cem\u003eCHS\u003c/em\u003e genes identified in this study were aligned with \u003cem\u003eCHS\u003c/em\u003e genes from the ferns \u003cem\u003eD. fragrans\u003c/em\u003e, \u003cem\u003eD. erythrosora\u003c/em\u003e, and \u003cem\u003eC. thalictroides\u003c/em\u003e. 
Phylogenetic analysis using the neighbour-joining method clustered \u003cem\u003eCHS\u003c/em\u003e genes into three major groups, designated CHS1, CHS2, and CHS3 (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e4\u003c/span\u003e; Additional File 2\u0026ndash;4). The CHS1 cluster included three genes: rb_8683, rb_49489, and rb_54131 (Additional File 2). Compared with rb_49489, rb_54131 had a 264-bp deletion and a 1-bp insertion that shifted the predicted stop codon downstream. rb_8683 showed a 99-bp deletion but retained the original stop codon position (Additional File 2). CHS2 comprised rb_4659, rb_49361, and rb_103035 (Additional File 3). rb_103035 lacked the first 306 bp whereas rb_4659 and rb_49361 differed by four single nucleotide polymorphisms. The CHS3 group included four genes: rb_7618, rb_74283, rb_15005, and rb_52876 (Additional File 4). While their 5\u0026prime; regions are highly conserved, variations in nucleotide composition and deletions in the 3\u0026prime; regions resulted in differences in amino acid sequences and stop codon positions (Additional File 4).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTo investigate the relationships among \u003cem\u003eDFR\u003c/em\u003e genes, 11 putative \u003cem\u003eDFR\u003c/em\u003e sequences, along with \u003cem\u003eDFR\u003c/em\u003e genes from \u003cem\u003eD. erythrosora\u003c/em\u003e and \u003cem\u003eA. filiculoides\u003c/em\u003e, were analysed using the neighbour-joining method (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e5\u003c/span\u003e). rb_54792 and rb_103797 clustered with \u003cem\u003eD. erythrosora DFR1\u003c/em\u003e and were classified as part of the DFR1 group. Compared with rb_103797, rb_54792 contained a 117-bp deletion at the 5\u0026prime; end and an additional 146-bp deletion that resulted in an upstream shift of the stop codon (Additional File 5). rb_13889 grouped with \u003cem\u003eD. 
erythrosora DFR2\u003c/em\u003e and was designated as part of the DFR2 group (Additional File 6). Three genes, rb_10724, rb_11399, and rb_57372, clustered with \u003cem\u003eD. erythrosora DFR3\u003c/em\u003e and were assigned to the DFR3 group. Within this group, rb_10724 exhibited a 315-bp deletion at the 5\u0026prime; end compared with rb_11399, whereas rb_57372 had a 90-bp insertion-deletion (indel) in the same region. Additionally, rb_57372 showed a 114-bp indel and several nucleotide variations in the 3\u0026prime; region, leading to an earlier stop codon (Additional File 7). The remaining four genes formed a distinct clade, separate from the known \u003cem\u003eDFR\u003c/em\u003e genes of \u003cem\u003eA. filiculoides\u003c/em\u003e and \u003cem\u003eD. erythrosora\u003c/em\u003e, suggesting the presence of novel \u003cem\u003eDFR\u003c/em\u003e variants.\u003c/p\u003e\u003cp\u003eTo assess gene expression via qRT-PCR, gene-specific primers were designed based on the conserved regions of \u003cem\u003eCHS1\u003c/em\u003e, \u003cem\u003eCHS2\u003c/em\u003e, \u003cem\u003eDFR1\u003c/em\u003e, \u003cem\u003eDFR2\u003c/em\u003e, and \u003cem\u003eDFR3\u003c/em\u003e. Expression of these genes was confirmed in young leaf tissues collected from populations in the UPR, Kalampangan, and Tjilik Riwut (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA). qRT-PCR further validated the expression of both \u003cem\u003eCHS\u003c/em\u003e and \u003cem\u003eDFR\u003c/em\u003e across all populations (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eB). 
However, the variations in leaf and leaf extract colouration observed among the UPR, Kalampangan, and Tjilik Riwut genotypes did not correspond to differences in \u003cem\u003eCHS\u003c/em\u003e or \u003cem\u003eDFR\u003c/em\u003e gene expression levels.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eThe transcriptome profile of \u003cem\u003eS. palustris\u003c/em\u003e was generated via Nanopore long-read sequencing. \u003cem\u003eDe novo\u003c/em\u003e assembly yielded 47,759 contigs (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e1\u003c/span\u003e). However, the completeness of the assembly, as evaluated by BUSCO using the Embryophyta dataset, was 66.6% (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e1\u003c/span\u003e). This relatively low score may be attributed to the use of RNA extracted solely from young leaf tissues, which likely did not capture the full transcriptome complexity of \u003cem\u003eS. palustris\u003c/em\u003e [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e].\u003c/p\u003e\u003cp\u003ePutative genes encoding enzymes involved in flavonoid and anthocyanin biosynthesis were successfully identified. The long-read transcriptome enabled the differentiation of highly similar isoforms of \u003cem\u003eCHS\u003c/em\u003e and \u003cem\u003eDFR\u003c/em\u003e, highlighting the advantage of this technology for distinguishing closely related sequences in \u003cem\u003ede novo\u003c/em\u003e assemblies. The expression of \u003cem\u003eCHS\u003c/em\u003e and \u003cem\u003eDFR\u003c/em\u003e genes was confirmed in young leaves through qRT-PCR. However, because the primers targeted conserved regions, the expression of individual gene isoforms could not be determined, warranting further investigation. 
Interestingly, the gene expression levels of \u003cem\u003eCHS\u003c/em\u003e and \u003cem\u003eDFR\u003c/em\u003e did not correlate with visible leaf pigmentation, suggesting that post-transcriptional mechanisms or metabolic regulation may influence pigment accumulation (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eB).\u003c/p\u003e\u003cp\u003eSamples were collected from natural populations at three sites with distinct soil types: UPR, Kalampangan, and Tjilik Riwut. The observed variation in leaf colouration may be due to underlying genetic differences or environmental factors [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e, \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. Previous studies have shown that environmental conditions influence the accumulation of flavonoids and phenolic compounds in fern leaves. In particular, salinity and full sunlight did not alter total polyphenol content in \u003cem\u003eD. erythrosora\u003c/em\u003e or \u003cem\u003eA. nipponicum\u003c/em\u003e var. \u0026ldquo;Red Beauty\u0026rdquo; but increased the total flavonoid content in \u003cem\u003eD. erythrosora\u003c/em\u003e [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e, \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. Similarly, high light intensity and low temperature enhanced anthocyanin accumulation in \u003cem\u003eA. filiculoides\u003c/em\u003e fronds [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. Future studies should assess the expression of flavonoid and anthocyanin biosynthesis genes across various tissues, developmental stages, and controlled environmental conditions to better understand the regulatory mechanisms underlying flavonoid and anthocyanin production in \u003cem\u003eS. 
palustris\u003c/em\u003e.\u003c/p\u003e\u003cp\u003eNotably, genes encoding key enzymes in the flavonoid biosynthetic pathway, such as \u003cem\u003eFLS\u003c/em\u003e, \u003cem\u003eANS\u003c/em\u003e, \u003cem\u003eF3GT\u003c/em\u003e, and \u003cem\u003eF5GT\u003c/em\u003e, were undetected. This finding aligns with that of Ali et al. [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e], who reported that several fern species, including \u003cem\u003eS. palustris\u003c/em\u003e, lack these genes. Similarly, \u003cem\u003eFLS\u003c/em\u003e and \u003cem\u003eANS\u003c/em\u003e were absent in \u003cem\u003eC. richardii\u003c/em\u003e [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e], although the upstream and downstream genes were present. Although \u003cem\u003eFLS\u003c/em\u003e-like and \u003cem\u003eANS\u003c/em\u003e-like genes have been predicted in \u003cem\u003eA. filiculoides\u003c/em\u003e [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e], BLAST searches of the current transcriptome data did not yield any matches (data not shown). These results suggest that ferns utilise alternative biosynthetic routes or enzymes to fulfil the roles typically performed by the canonical \u003cem\u003eFLS\u003c/em\u003e and \u003cem\u003eANS\u003c/em\u003e.\u003c/p\u003e\u003cp\u003eThis transcriptome profiling represents an important step in the characterisation of \u003cem\u003eS. palustris\u003c/em\u003e. Future studies should focus on obtaining full genome sequences to confirm gene copy numbers, identify genetic markers, and support comparative genomics in ferns. Ultimately, integrated transcriptomic and metabolomic analyses will be essential to fully elucidate the flavonoid and anthocyanin biosynthetic pathways in \u003cem\u003eS. 
palustris\u003c/em\u003e.\u003c/p\u003e"},{"header":"CONCLUSIONS","content":"\u003cp\u003eIn the present study, we used long-read sequencing and \u003cem\u003ede novo\u003c/em\u003e assembly to generate a transcriptome profile of \u003cem\u003eS. palustris\u003c/em\u003e leaves. The genes involved in flavonoid biosynthesis were also identified. The expression of genes encoding CHS and DFR was confirmed in young leaves of \u003cem\u003eS. palustris\u003c/em\u003e. These data will facilitate advancements in the genetic and molecular studies of \u003cem\u003eS. palustris\u003c/em\u003e.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets generated and/or analysed during the current study are available in the NCBI Sequence Read Archive repository under the BioProject ID PRJNA1237869, BioSample accession number SAMN47441734, and SRA accession number SRR32761149 [https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1237869/, https://www.ncbi.nlm.nih.gov/biosample/SAMN47441734/, https://www.ncbi.nlm.nih.gov/sra/SRR32761149]. 
Contigs resulting from the de novo assembly were deposited in the NCBI Transcriptome Shotgun Assembly Sequence Database under the accession number GLFY00000000.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was supported by the Heiwa Nakajima Foundation for MSD.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors\u0026rsquo; contributions\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eMSD conceived the study, performed experiments, and wrote the manuscript. DR collected the samples and edited the manuscript. DP performed the real-time PCR analysis. EW, FS, MDPTGP, and YAN selected the sampling locations, helped with sample collection and preparation before RNA extraction, and edited the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe would like to thank Editage (www.editage.jp) for English language editing.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eGiesen W, Wulffraat S, Zieren M, Scholten L. Mangrove Guidebook for Southeast Asia. Thailand: FAO and Wetlands International; 2006. p. 19.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eIrawan DC, Wijaya CH, Limin SH, Hashidoko Y, Osaki M, Kulu IP. Ethnobotanical study and nutrient potency of local traditional vegetables in Central Kalimantan. Tropics. 2006;15:441\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAmoroso VB, Lagumbay AJD, Mendez RA, de la Cruz RY, Villalobos AP. Bioactives in three Philippine edible ferns. Asia Life Sci. 2014;23:445\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChai TT, Panirchellvum E, Ong HC, Wong FC. 
Phenolic contents and antioxidant properties of \u003cem\u003eStenochlaena palustris\u003c/em\u003e, an edible medicinal fern. Bot Stud. 2012;53:439\u0026ndash;46.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChai PPK. Midin (\u003cem\u003eStenochlaena palustris\u003c/em\u003e), the popular wild vegetable of Sarawak. Utar Agr Sci J. 2016;2:2.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePandiangan FI, Oslo EA, Destine F, Josephine J, Anwar RN. A review on the health benefits of kalakai (\u003cem\u003eStenochlaena palustris\u003c/em\u003e). JFFN. 2022;4. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.33555/jffn.v4i1.98\u003c/span\u003e\u003cspan address=\"10.33555/jffn.v4i1.98\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRahmawati D, Wijaya CH, Hashidoko Y, Djajakirana G, Haraguchi A, Watanabe T, et al. Concentration of some trace elements in two wild edible ferns, \u003cem\u003eDiplazium esculentum\u003c/em\u003e and \u003cem\u003eStenochlaena palustris\u003c/em\u003e, inhabiting tropical peatlands under different environments in Central Kalimantan. Eurasian J Res. 2017;20:11\u0026ndash;20.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBudiman I, Bastoni, Sari ENN, Hadi EE, Asmaliyah, Siahaan H, et al. Progress of paludiculture projects in supporting peatland ecosystem restoration in Indonesia. GECCO. 2020;23:e01084. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.gecco.2020.e01084\u003c/span\u003e\u003cspan address=\"10.1016/j.gecco.2020.e01084\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChai TT, Kwek MT, Ong HC, Wong FC. 
Water fraction of edible medicinal fern \u003cem\u003eStenochlaena palustris\u003c/em\u003e is a potent α-glucosidase inhibitor with concurrent antioxidant activity. Food Chem. 2015;186:26\u0026ndash;31.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGunawan-Puteri MDPT, Kato E, Rahmawati D, Teji S, Santoso JA, Pandiangan FI, et al. Post-harvest and extraction conditions for the optimum alpha glucosidase inhibitory activity of \u003cem\u003eStenochlaena palustris\u003c/em\u003e. Int J Technol. 2021;12:649\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen J, Zhong K, Qin S, Jing Y, Liu S, Li D, et al. Astragalin: a food-origin flavonoid with therapeutic effect for multiple diseases. Front Pharmacol. 2023;14:1265960. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fphar.2023.1265960\u003c/span\u003e\u003cspan address=\"10.3389/fphar.2023.1265960\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDavies KM, Landi M, van Klink JW, Schwinn KE, Brummell DA, Albert NW, et al. Evolution and function of red pigmentation in land plants. Ann Bot. 2022;130(5):613\u0026ndash;36. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/aob/mcac109\u003c/span\u003e\u003cspan address=\"10.1093/aob/mcac109\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDao TT, Linthorst HJ, Verpoorte R. Chalcone synthase and its functions in plant resistance. Phytochem Rev. 2011;10:397\u0026ndash;412. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s11101-011-9211-7\u003c/span\u003e\u003cspan address=\"10.1007/s11101-011-9211-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen X, Liu W, Huang X, Fu H, Wang Q, Wang Y, et al. Arg-type dihydroflavonol 4-reductase genes from the fern \u003cem\u003eDryopteris erythrosora\u003c/em\u003e play important roles in the biosynthesis of anthocyanins. PLoS ONE. 2020;15:e0232090. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pone.0232090\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0232090\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eG\u0026uuml;ng\u0026ouml;r E, Brouwer P, Dijkhuizen LW, Shaffar DC, Nierop KGJ, de Vos RCH, et al. \u003cem\u003eAzolla\u003c/em\u003e ferns testify: seed plants and ferns share a common ancestor for leucoanthocyanidin reductase enzymes. New Phytol. 2020;229:1118\u0026ndash;32. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/nph.16896\u003c/span\u003e\u003cspan address=\"10.1111/nph.16896\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCostarelli A, Cannav\u0026ograve; S, Cerri M, Pellegrino RM, Reale L, Paolocci F, et al. Light and temperature shape the phenylpropanoid profile of \u003cem\u003eAzolla filiculoides\u003c/em\u003e fronds. Front Plant Sci. 2021;12:727667. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fpls.2021.727667\u003c/span\u003e\u003cspan address=\"10.3389/fpls.2021.727667\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eQi X, Kuo LY, Guo C, Li H, Li Z, Qi J, Wang L, Hu Y, Xiang J, Zhang C, Guo J, Huang CH, Ma H. A well-resolved fern nuclear phylogeny reveals the evolution history of numerous transcription factor families. Mol Phylogenet Evol. 2018;127:961\u0026ndash;77. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.ympev.2018.06.043\u003c/span\u003e\u003cspan address=\"10.1016/j.ympev.2018.06.043\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShen H, Jin D, Shu JP, Zhou XL, Lei M, Wei R, et al. Large-scale phylogenomic analysis resolves a backbone phylogeny in ferns. GigaScience. 2018;7:1\u0026ndash;11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/gigascience/gix116\u003c/span\u003e\u003cspan address=\"10.1093/gigascience/gix116\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAli ZM, Tan QW, Lim PK, Chen H, Pfeifer L, Julca I, et al. Comparative transcriptomics in ferns reveals key innovations and divergent evolution of the secondary cell walls. Nat Plants. 2025;11:1028\u0026ndash;48. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41477-025-01978-y\u003c/span\u003e\u003cspan address=\"10.1038/s41477-025-01978-y\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKiss T, Kar\u0026aacute;csony Z, Gomba-T\u0026oacute;th A, Szabadi KL, Spitzm\u0026uuml;ller Z, Hegyi-Kal\u0026oacute; J, et al. A modified CTAB method for the extraction of high-quality RNA from mono-and dicotyledonous plants rich in secondary metabolites. Plant Methods. 2024;20:62. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s13007-024-01198-z\u003c/span\u003e\u003cspan address=\"10.1186/s13007-024-01198-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWick RR, Judd LM, Gorrie CL, Holt KE. Completing bacterial genome assemblies with multiplex MinION sequencing. Microb Genom. 2017;3:e000132. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1099/mgen.0.000132\u003c/span\u003e\u003cspan address=\"10.1099/mgen.0.000132\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNip KM, Hafezqorani S, Gagalova KK, Chiu R, Yang C, Warren RL, et al. Reference-free assembly of long-read transcriptome sequencing data with RNA-Bloom2. Nat Commun. 2023;14:2940. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41467-023-38553-y\u003c/span\u003e\u003cspan address=\"10.1038/s41467-023-38553-y\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 
2006;22:1658\u0026ndash;9. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/bioinformatics/btl158\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/btl158\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094\u0026ndash;100. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/bioinformatics/bty191\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/bty191\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eManni M, Berkeley MR, Seppey M, Sim\u0026atilde;o FA, Zdobnov EM. BUSCO Update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 2021;38:4647\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWaterhouse RM, Seppey M, Sim\u0026atilde;o FA, Manni M, Ioannidis P, Klioutchnikov G, et al. BUSCO applications from quality assessments to gene prediction and phylogenomics. Mol Biol Evol. 2018;35:543\u0026ndash;8. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/molbev/msx319\u003c/span\u003e\u003cspan address=\"10.1093/molbev/msx319\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNishimura O, Hara Y, Kuraku S. Evaluating genome assemblies and gene models using gVolante. Methods Mol Biol. 2019;1962:247\u0026ndash;56. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/978-1-4939-9173-0_15\u003c/span\u003e\u003cspan address=\"10.1007/978-1-4939-9173-0_15\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMikheenko A, Saveliev V, Hirsch P, Gurevich A. WebQUAST: online evaluation of genome assemblies. Nucleic Acids Res. 2023;51:W601\u0026ndash;6. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/nar/gkad406\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkad406\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSingh U, Wurtele ES. orfipy: a fast and flexible tool for extracting ORFs. Bioinformatics. 2021;37:3019\u0026ndash;20. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bioinformatics/btab090\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/btab090\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHuerta-Cepas J, Forslund K, Coelho LP, Szklarczyk D, Jensen LJ, von Mering C, et al. Fast genome-wide functional annotation through orthology assignment by eggNOG-Mapper. Mol Biol Evol. 2017;34:2115\u0026ndash;22. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/molbev/msx148\u003c/span\u003e\u003cspan address=\"10.1093/molbev/msx148\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCantalapiedra CP, Hern\u0026aacute;ndez-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol Biol Evol. 2021;38:5825\u0026ndash;9. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1101/2021.06.03.446934\u003c/span\u003e\u003cspan address=\"10.1101/2021.06.03.446934\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKatoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. 2019;20:1160\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKlaric SV, Galv\u0026atilde;o Maciel A, Arend GD, Tres MV, de Lima M, Soares LS. Application of plant extracts rich in anthocyanins in the development of intelligent biodegradable packaging: an overview. Processes. 2025;13(1):191. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/pr13010191\u003c/span\u003e\u003cspan address=\"10.3390/pr13010191\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePietrak A, Salachna P, Łopusiewicz Ł. Changes in growth, ionic status, metabolites content and antioxidant activity of two ferns exposed to shade, full sunlight, and salinity. Int J Mol Sci. 2022;24:296. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/ijms24010296\u003c/span\u003e\u003cspan address=\"10.3390/ijms24010296\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMarchant DB, Chen G, Cai S, Chen F, Schafran P, Jenkins J, et al. Dynamic genome evolution in a model fern. Nat Plants. 2022;8(9):1038\u0026ndash;51. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41477-022-01226-7\u003c/span\u003e\u003cspan address=\"10.1038/s41477-022-01226-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Stenochlaena palustris, transcriptome, Nanopore sequencing, long-read sequencing, chalcone synthase, dihydroflavonol 4-reductase","lastPublishedDoi":"10.21203/rs.3.rs-7105205/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7105205/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e\u003cp\u003e\u003cem\u003eStenochlaena palustris\u003c/em\u003e is valued as a vegetable and medicinal fern native to Southeast Asia; however, it remains largely underrepresented in genomic studies. People in Kalimantan (Indonesia) collect young leaves and fronds from wild populations for use as vegetables or medicines to treat conditions, such as ulcers, stomachaches, fever, diarrhoea, and skin infections. 
The young leaves and fronds of \u003cem\u003eS. palustris\u003c/em\u003e contain flavonoids, polyphenols, and anthocyanins. Here, we present a high-quality organ-specific transcriptome assembly of \u003cem\u003eS. palustris\u003c/em\u003e based on long-read RNA sequencing of young leaves.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e\u003cp\u003eThe de novo assembly yielded 47,759 transcripts, with an N50 of 1,524 bp and a BUSCO completeness of 66.6%, consistent with organ-specific transcriptomes. Functional annotation identified key structural and regulatory genes involved in anthocyanin biosynthesis, including genes for chalcone synthase (CHS) and dihydroflavonol 4-reductase (DFR). We further analysed the expression of the selected \u003cem\u003eCHS\u003c/em\u003e and \u003cem\u003eDFR\u003c/em\u003e genes via qRT-PCR of three phenotypically contrasting young leaf samples. Although no strong correlation was observed between gene expression levels and anthocyanin pigmentation, the results suggest complex regulation, possibly involving post-transcriptional control or developmental timing.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e\u003cp\u003eThis study provides the first long-read transcriptomic resource for \u003cem\u003eS. palustris\u003c/em\u003e and valuable data for future investigations of secondary metabolism and gene regulation in ferns. 
Our findings complement broader fern transcriptome studies by offering tissue-specific resolution and a focused view of pigment biosynthesis.\u003c/p\u003e","manuscriptTitle":"Organ-Specific Long-Read Transcriptome Assembly of Stenochlaena palustris and Annotation of Anthocyanin Biosynthesis Genes","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-09-05 12:17:41","doi":"10.21203/rs.3.rs-7105205/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"b5c945e2-5ff0-4686-9aa6-cc4d92628352","owner":[],"postedDate":"September 5th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-09-08T09:24:10+00:00","versionOfRecord":[],"versionCreatedAt":"2025-09-05 12:17:41","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7105205","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7105205","identity":"rs-7105205","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

