Efficient training of neural networks using natural vectors with covariates for the plant microRNA precursor prediction

preprint OA: closed
Abstract

Fabaceae plants (legumes) are important for the economy and food sovereignty of México. The development of traits of agronomic interest, and other biological activities of Fabaceae plants, are tightly linked to gene regulation, such as the post-transcriptional gene repression mediated by microRNAs (miRNAs). Several artificial intelligence models have been developed for predicting miRNA precursor (pre-miRNA) sequences, based mainly on Convolutional Neural Network and Multi-Layer Perceptron architectures. Although the numerical encodings of pre-miRNA nucleotide sequences and their secondary structures implemented in these neural networks performed well, other encoding methods remain unexplored. Recently, a geometric construction of the viral genome space and the numerical encoding of archaeal, bacterial, fungal and viral genomes were successfully achieved using natural vectors with a covariance component. Natural vectors have also been used as input data when training neural networks to classify viral genomes. Consequently, in this work we mainly assessed the performance of neural networks, as regression or classifier models, trained with nucleotide sequences and their secondary structure representations encoded by natural vectors with a covariance component, either alone or nested within the three sequences method. Additionally, we tested other characteristics of neural networks. Neural networks trained with natural vectors with covariates performed better at predicting intrinsic nucleotide features, such as the percentage of guanine and cytosine and the pairwise-aligned sequence identity. They also showed good accuracy in categorizing miRNA precursor sequences, compared with the results obtained from other encoding methods that are often used for the numerical representation of nucleotide sequences.

Competing Interest Statement

The authors have declared no competing interest.
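As a rough illustration of the natural-vector encoding named in the abstract, the sketch below computes the classic 12-dimensional natural vector of a sequence (per-nucleotide count, mean position, and normalized second central moment) together with the GC percentage, one of the intrinsic features mentioned as a regression target. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define the covariance component or the three sequences method, so both are omitted, and the function names `natural_vector` and `gc_percent`, the `ACGU` alphabet, and the T-to-U conversion are choices made here for illustration.

```python
# Hypothetical helpers for illustration only; not taken from the paper.
NUCLEOTIDES = "ACGU"  # assumption: RNA alphabet, with T mapped to U


def natural_vector(seq):
    """Return [n_A, n_C, n_G, n_U, mu_A, ..., mu_U, D2_A, ..., D2_U].

    n_k  : number of occurrences of nucleotide k
    mu_k : mean 1-based position of nucleotide k
    D2_k : sum((pos - mu_k)^2) / (n_k * n), with n the sequence length
    The covariance component used in the paper is NOT included here.
    """
    seq = seq.upper().replace("T", "U")
    n = len(seq)

    # Collect the 1-based positions of each nucleotide.
    positions = {k: [] for k in NUCLEOTIDES}
    for i, base in enumerate(seq, start=1):
        if base in positions:  # ambiguous symbols such as N are skipped
            positions[base].append(i)

    counts, means, moments = [], [], []
    for k in NUCLEOTIDES:
        pos = positions[k]
        n_k = len(pos)
        if n_k == 0:
            # Convention assumed here: absent nucleotides contribute zeros.
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu = sum(pos) / n_k
        d2 = sum((p - mu) ** 2 for p in pos) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def gc_percent(seq):
    """Percentage of G and C among all symbols of the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    example = "UGAGGUAGUAGGUUGUAUAGUU"  # arbitrary short RNA string
    print(natural_vector(example))
    print(gc_percent(example))
```

A fixed-length vector of this kind can be fed to a Multi-Layer Perceptron regardless of sequence length, which is one plausible reason such encodings are attractive as neural network inputs.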

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2026) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00