Issue Detection and Future Proofing Dutch Government Apps Using Language Technologies

Research Article
Anca-Mihaela Matei, Flor Miriam Plaza-del-Arco, Natalia Amat-Lefort
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7410714/v1
This work is licensed under a CC BY 4.0 License
Status: Posted (Version 1)

Abstract
As public services increasingly shift to digital platforms due to e-Government initiatives, understanding and incorporating user feedback has become critical for improving the quality and usability of government applications. Natural Language Processing (NLP) techniques have emerged as a crucial response to the need to process and analyze diverse user feedback. Among these techniques, Large Language Models (LLMs) have become key tools because they are scalable and versatile.
This research explores the application of LLMs to extract, classify, and forecast issues reported in user reviews of four Dutch government applications, namely KopieID, Reisapp, MijnOverheid, and DigiD. This research is structured around four core tasks: (1) issue extraction, (2) multi-label review classification, (3) assessment of how different issues impact star ratings, including a temporal analysis, and (4) forecasting of future issues and actionable recommendations. A comparative analysis between LLMs and Latent Dirichlet Allocation (LDA) is performed to evaluate coherence and classification confidence (via Shannon Entropy). The results show that LLMs outperform LDA in coherence, flexibility, and interpretability, though challenges such as hallucination and classification ambiguity were observed. The star rating assessment highlights that technical reliability remains a key driver of user dissatisfaction, while usability-related concerns exhibit more variable effects across applications. Forecasting analysis reveals that LLMs can partially identify emerging issues and generate precise, app-specific recommendations, though the prediction of issue frequency remains limited. This research offers a replicable, unsupervised pipeline for multilingual user feedback analysis and provides practical insights for enhancing citizen-centric digital services in the public sector. Government institutions could use and build on this pipeline to identify critical pain points in their applications, create an evidence-based prioritization framework based on the evolution of discovered issues, and employ focused recommendation strategies.

Subject area: Other Business
Keywords: Natural Language Processing (NLP), Large Language Models (LLMs), Latent Dirichlet Allocation (LDA), citizen feedback, digital public services

Additional Declarations
The authors declare no competing interests.
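The abstract states that classification confidence is evaluated via Shannon Entropy but this page does not show how. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes each review comes with a vector of per-issue scores (a hypothetical input) and treats low entropy as a confident assignment and high entropy as an ambiguous one. The function names and example scores are illustrative assumptions.

```python
import math
from typing import Sequence


def shannon_entropy(scores: Sequence[float]) -> float:
    """Shannon entropy in bits, H = -sum(p_i * log2(p_i)).

    Scores are normalized to sum to 1 first, so raw weights are accepted.
    """
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    entropy = 0.0
    for s in scores:
        if s > 0:  # terms with p = 0 contribute nothing
            p = s / total
            entropy -= p * math.log2(p)
    return entropy


def normalized_entropy(scores: Sequence[float]) -> float:
    """Entropy divided by its maximum, log2(n), giving a value in [0, 1]."""
    n = len(scores)
    if n < 2:
        return 0.0
    return shannon_entropy(scores) / math.log2(n)


# Hypothetical per-issue scores for two reviews (illustrative values only).
confident_review = [0.97, 0.01, 0.01, 0.01]
ambiguous_review = [0.25, 0.25, 0.25, 0.25]

print(normalized_entropy(confident_review))
print(normalized_entropy(ambiguous_review))
```

Normalizing by log2(n) is a design choice made here so that values are comparable across models with different numbers of topics or labels; whether the paper normalizes in this way is not stated on this page.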
Author affiliations: Anca-Mihaela Matei (Leiden University, corresponding author); Flor Miriam Plaza-del-Arco (Leiden University, ORCID https://orcid.org/0000-0002-3020-5512); Natalia Amat-Lefort (Leiden University, ORCID https://orcid.org/0000-0001-7602-8614)
Posted: August 21st, 2025 on Research Square
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
