Kisan-BiQAS: A Bilingual Retrieval-Augmented Framework for Agricultural Question Answering System

Research Article · Research Square

Authors:
Ramesh Kumar (corresponding author), Dr. B. R. Ambedkar National Institute Of Technology (Jalandhar, Punjab)
Lalit kumar Awasthi, Vice Chancellor Sardar Patel University
Amrit Lal Sangal, Dr. B. R. Ambedkar National Institute Of Technology (Jalandhar, Punjab)

This is a preprint; it has not been peer reviewed by a journal.
DOI: https://doi.org/10.21203/rs.3.rs-9610140/v1
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Status: Posted. Version 1 posted May 11th, 2026.
Editorial history: submitted to The Journal of Supercomputing (2026-05-04); checks complete and editor assigned (2026-05-07); revision requested (2026-05-07).

Abstract

Agricultural advisory systems require high factual accuracy, multilingual accessibility, and robustness to noisy real-world data. This paper presents Kisan-BiQAS, a bilingual (Hindi–English) question answering framework designed for farmer-centric advisory support using Kisan Call Centre (KCC) data and curated agricultural knowledge sources. The proposed system integrates retrieval-based and retrieval-augmented generation (RAG) paradigms within a unified architecture, enabling a systematic comparison between extractive and generative approaches. The framework employs dual retrieval mechanisms for English and multilingual queries, semantic embedding-based indexing, and parallel large language model (LLM) inference using LLaMA and Qwen, followed by an attention-based fusion strategy. A comprehensive evaluation is conducted using Exact Match, F1-score, BLEU, ROUGE, and additional similarity metrics. Experimental results reveal that retrieval-only configurations achieve near-perfect performance (Accuracy and F1 ≈ 1.0) with zero hallucination in a closed-domain setting, establishing an upper-bound benchmark for agricultural QA. In contrast, generative models exhibit performance degradation due to paraphrasing and hallucination effects, with LLaMA achieving moderate performance (F1 ≈ 0.45) and Qwen showing recall-heavy but low-precision behavior (F1 ≈ 0.15, hallucination ≈ 58%). Further analysis demonstrates that errors primarily originate from the generation stage rather than retrieval, as the correct answers are consistently present in retrieved contexts. These findings highlight a fundamental trade-off between factual reliability and linguistic flexibility, emphasizing the importance of retrieval grounding in high-stakes domains such as agriculture. The study contributes a bilingual benchmark, a hybrid retrieval–generation framework, and a detailed hallucination analysis, providing insights for designing reliable multilingual advisory systems for real-world deployment.

Keywords: Decision Support Systems, Agricultural Advisory Systems, Bilingual Natural Language Processing, Retrieval-Augmented Generation, Hallucination Analysis, Kisan Call Centre Data

Additional Declarations: No competing interests reported.
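
The abstract reports Exact Match and F1 but does not define them. The sketch below assumes the common token-overlap definitions used in extractive QA evaluation, which may differ from the authors' implementation. The function names, the normalization rule (lowercasing and replacing Unicode punctuation with spaces, so that Devanagari vowel signs are preserved), and the example sentences are all illustrative assumptions and are not taken from the paper or the KCC data.

```python
import unicodedata
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, replace Unicode punctuation with spaces, split on whitespace.

    Assumption: punctuation is any character whose Unicode category starts
    with "P" (this covers the Devanagari danda). Combining marks are kept so
    that Hindi words stay intact.
    """
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in text.lower()
    )
    return cleaned.split()


def exact_match(prediction: str, reference: str) -> int:
    """Return 1 if the normalized token sequences are identical, else 0."""
    return int(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall (bag-of-words overlap)."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical reference answer and two hypothetical system outputs.
    reference = "Apply neem oil spray at 5 ml per litre."
    retrieved = "Apply neem oil spray at 5 ml per litre"
    paraphrased = "Spray neem oil, 5 ml in each litre of water"

    print(exact_match(retrieved, reference), token_f1(retrieved, reference))
    print(exact_match(paraphrased, reference), token_f1(paraphrased, reference))
```

Under these assumed definitions, a verbatim retrieved answer scores perfectly on both metrics, while a faithful paraphrase fails Exact Match and loses part of its F1. This is consistent with the abstract's observation that paraphrasing, and not only hallucination, lowers the scores of the generative configurations.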
