OpsisVision: A Multimodal AI Assistance System for the Blind and Visually Impaired with Hybrid Edge-Cloud Architecture

This is a preprint and has not been peer reviewed. Data may be preliminary.

8 January 2026, V1 (latest version)
Author: Grigorios Tsinaforniotis (ORCID 0009-0008-3665-8564)
https://doi.org/10.22541/au.176789702.20056088/v1
863 views, 180 downloads

Abstract

Assistive technologies for the blind and visually impaired have gained enormous potential through advances in artificial intelligence (AI) and wearable computing systems. However, many existing solutions remain limited by high costs, proprietary hardware, or dependence on cloud-only services. This paper presents OpsisVision, a multimodal AI assistance system based on a low-cost, 3D-printed glasses frame with standard hardware.
The system combines local real-time obstacle detection using YOLOv11 with precise depth measurement through stereo vision. For deeper semantic understanding of the environment, a hybrid architecture pairs local object detection with visual analysis by a Large Language Model (GPT-4o) in the cloud. Interaction is entirely via voice commands in Greek, enabled by local wake-word detection (Vosk) and cloud-based speech recognition (Whisper). We describe the system's architecture, its implementation on an NVIDIA Jetson Orin, and the results of a successful test run. A detailed cost analysis shows that OpsisVision, with estimated hardware costs of approximately 470 euros (8GB cloud version) to 2,200 euros (64GB offline version), is significantly cheaper than commercial alternatives such as Envision Glasses ($3,500) or OrCam MyEye Pro ($6,000). Furthermore, we discuss the ethical implications, particularly with regard to the EU General Data Protection Regulation (GDPR) and the European Accessibility Act (EAA) 2025. OpsisVision demonstrates a scalable and cost-effective approach that can evolve from a pure cloud prototype to a fully offline-capable system, thereby significantly increasing the accessibility and independence of the blind and visually impaired.
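The abstract names stereo vision as the depth source but does not state how distance is computed. As a minimal sketch, assuming a rectified pinhole stereo pair, depth follows the standard relation Z = f * B / d (focal length in pixels, baseline in metres, disparity in pixels). The function names `depth_from_disparity` and `nearest_obstacle`, and the focal length and baseline values in the usage note, are hypothetical and are not taken from the paper.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth in metres from the stereo relation Z = f * B / d (rectified pair assumed)."""
    if disparity_px <= 0:
        # No valid stereo match, or the point is effectively at infinity.
        return float("inf")
    return focal_px * baseline_m / disparity_px


def nearest_obstacle(detections, focal_px: float, baseline_m: float):
    """Return (label, distance_m) for the closest detection, or None if there are none.

    `detections` is a hypothetical list of (label, disparity_px) pairs, e.g. one
    representative disparity per bounding box produced by the object detector.
    """
    distances = [
        (label, depth_from_disparity(disparity, focal_px, baseline_m))
        for label, disparity in detections
    ]
    if not distances:
        return None
    return min(distances, key=lambda item: item[1])
```

With an assumed focal length of 700 px and an assumed baseline of 0.06 m, calling `nearest_obstacle([("chair", 21.0), ("door", 7.0)], 700.0, 0.06)` selects the chair, because a larger disparity corresponds to a closer object.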

Supplementary Material
File: opsisvision_paper_en_complete.md.pdf (398.97 KB)

Information & Authors
Version history: Version 1 (V1), 08 January 2026
Copyright: This work is licensed under a Creative Commons Attribution 4.0 International License.
Keywords: assistive technology; computer vision; edge computing; grigorios tsinaforniotis, protos ai agency (single member private company); object detection; stereo vision; visually impaired; wearable devices; yolo

Citation
Grigorios Tsinaforniotis. OpsisVision: A Multimodal AI Assistance System for the Blind and Visually Impaired with Hybrid Edge-Cloud Architecture. Authorea. 08 January 2026. DOI: https://doi.org/10.22541/au.176789702.20056088/v1