Using Image-based positioning for seamless localization in cultural heritage setting | Research Square

Bashar Egbariya, Rotem Dror, Tsvi Kuflik, Ilan Shimshoni

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-6142584/v1 This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1 posted.

Abstract

This study presents the development and evaluation of an image-based positioning system for a mobile museum visitors’ guide system comprising two components: An Android application and a backend service. The application identifies visitors’ location by capturing and sending images to the server, which then determines their location within the museum. The server maintains a comprehensive dataset of the museum's points of interest (POIs), and content about them.
The content was created automatically using a large language model (LLM) and corrected by museum staff, who can also upload videos and descriptive information for each POI via the application. The image-based indoor positioning solution uses a deep learning-based model to represent an image as a vector of features. This approach lets the system calculate distances between vectors and thereby determine the similarity between images, allowing for accurate POI identification. A user study aimed at evaluating users' perception of the system's accuracy and ease of use was conducted at the Hecht Museum, where participants used the developed application and subsequently completed a System Usability Scale (SUS) questionnaire, along with open-ended questions. The high scores and the highly positive feedback obtained indicate an overall excellent usability experience, especially with respect to the accuracy and speed of POI identification. The feedback also provided insights into areas where our solution can be enhanced and further developed.

Subject areas: Physical sciences/Mathematics and computing/Information technology; Physical sciences/Mathematics and computing/Computer science

Keywords: Image-based positioning; Mobile visitors' guide; Indoor positioning; Cultural Heritage

1 INTRODUCTION

Indoor localization refers to determining the precise location of objects or people within an indoor environment. Outdoor localization primarily relies on GPS technology, which provides global coverage with an error of up to 3.5 meters under optimal conditions (Misra et al. 2006, Morley et al. 2017). Indoors, however, GPS cannot be relied upon because of signal reflection and diffraction caused by indoor obstacles, necessitating the use of alternative methods (X. Wang et al. 2016, Chintalapudi et al. 2010).
The distinction between indoor and outdoor localization lies in the environmental factors affecting signal propagation and the technologies employed. While GPS works well in open spaces, its accuracy diminishes significantly indoors due to signal blockage and interference from walls and other structures (Jo HJ et al. 2018, Tan et al. 2020, del Horno et al. 2019, Yao et al. 2020). Indoor localization thus often requires alternative technologies (K. Chintalapudi et al. 2010), each offering different levels of accuracy and requiring varying infrastructure investments. Developing effective solutions that overcome these challenges therefore remains an important research problem. Researchers have developed various technologies to achieve precise indoor localization and tracking. These include Radio-frequency Identification (RFID) (Poulose et al. 2019a, Ashraf et al. 2019a), Wi-Fi (Y. Gu et al. 2016), Bluetooth (Guo et al. 2019), acoustic methods (Liu et al. 2017, Gupta et al. 2024), and inertial sensors (Aeshah et al. 2023). RFID requires the presence of active or passive tags within the environment or on the user, along with a scanner to read these tags (Poulose et al. 2019a). Wi-Fi and Bluetooth methods necessitate additional infrastructure to be effective. Acoustic localization offers centimeter-level accuracy but is susceptible to interference from reflections and obstructions like walls (Liu et al. 2020). Pedestrian Dead Reckoning (PDR) is a notable approach for estimating a pedestrian's location using data from inertial measurement units (IMUs) (Zhang et al. 2022). Research on PDR has investigated aspects such as step detection, step length estimation, heading determination, and position estimation based on this information (Poulose et al. 2019a, Klette et al. 2014). PDR is widely employed in smart devices for localization purposes (Guo et al. 2019).
However, the accuracy of PDR deteriorates over time because its error accumulates progressively. Another method, received signal strength (RSS) or received signal strength indicator (RSSI), involves measuring the strength of a radio signal, which theoretically diminishes as the distance between the transmitter and receiver grows (Naser et al. 2023). Each localization method faces unique challenges, including issues with accuracy, cost, coverage, complexity, and applicability. With the growth in smartphone computing power and distribution, there has been an increasing trend toward utilizing smartphone sensors for position detection (Naser et al. 2023, Stockman et al. 2001). Accurate indoor localization is crucial in pervasive computing environments, with applications in domains such as emergency security, crowd monitoring, intelligent warehousing, precision marketing, mobile health, and augmented reality (Hsu et al. 2018, Ashraf et al. 2019b). In healthcare, for example, it is used to track the location of patients and equipment within hospitals (Shenoy et al. 2022). In retail, it aids in understanding customer behavior by monitoring their movement within stores (Lin et al. 2020). The advent of smart buildings and the internet of things (IoT) has further increased the demand for precise indoor localization, facilitating automation and enhancing user experiences. However, indoor localization faces numerous challenges, including signal multipath effects and signal attenuation (X. Wang et al. 2016, Chintalapudi et al. 2010), as well as the requirement for extensive infrastructure, which can be costly and demand regular maintenance (Z. Yang et al. 2012, He et al. 2015, N. Ravi et al. 2005). Indoor positioning systems (IPS) are highly valuable in cultural heritage settings like museums and historical sites, where they address unique challenges.
The preservation of artifacts and structures often limits the installation of intrusive infrastructure, making non-invasive localization solutions essential. The complexity and scale of museums, along with diverse visitor profiles, require accurate and user-friendly navigation aids. IPS enhances visitor experiences by providing context-aware information, guiding visitors through exhibits, and offering detailed descriptions of artifacts. Additionally, IPS plays a crucial role in crowd management and security, ensuring the safety and smooth operation of these venues while adhering to aesthetic preservation and conservation standards (Y. Yin et al. 2017).

Image-based indoor positioning offers a promising solution for the unique challenges posed by cultural heritage environments. This approach leverages the ubiquity of smartphones equipped with cameras to determine the user's location by matching captured images with a pre-existing database of images tagged with location data. Unlike other methods, image-based positioning does not require the installation of hardware like beacons or sensors, preserving the integrity of cultural sites (Conte et al. 2008, Kuo et al. 2014, Li et al. 2017, Dong et al. 2019). The primary advantage of image-based positioning lies in its accuracy and non-intrusiveness. By analyzing visual features in the captured images, such systems can pinpoint locations with high precision. This method also enables a rich, interactive visitor experience, as images can be linked to multimedia content, providing immersive storytelling opportunities. The approach's scalability and low cost further enhance its appeal, as it primarily requires the maintenance of a digital image database rather than physical infrastructure (Conte et al. 2008, Kuo et al. 2014, Li et al. 2017, Dong et al. 2019).
It is worth noting that images are used for identifying users' positions outdoors as well, as demonstrated, for instance, by Brusch (2022), but our focus is on indoor positioning; hence, we do not cover outdoor image-based positioning.

In our study, we explored the viability of applying computer vision techniques combined with a smartphone to better derive the location of a person. Our research question was “How can image-based positioning be integrated efficiently into a mobile, location aware museum visitors guide?” We developed an Android-based mobile application that leverages image-based positioning to guide visitors within the (removed for anonymization). A significant challenge we faced was determining how to compare images representing different locations and accurately assess their similarity. To solve this, we used the CLIP model (Radford et al. 2021) for image representation, which allowed us to effectively match images and pinpoint the visitor's location. The application captures photos of exhibits and compares them to a pre-curated database to determine the visitor’s position within the museum. Integrated with a backend service, the system provides visitors with real-time information about nearby points of interest (POIs), enhancing their educational experience with timely and relevant insights. Since experimentation in a realistic setting also requires high-quality content, we followed (AAA) and used a large language model to create commentaries about the POIs, which were then corrected manually. We conducted a user study to evaluate the system's usability and accuracy, and to gather feedback on its effectiveness and user interface design. The results indicate that image-based indoor positioning is a viable and efficient solution for enhancing visitor experiences in cultural heritage settings. Additionally, the system's performance and accuracy were continuously monitored through comprehensive log data, providing valuable insights.
Analysis of these logs revealed consistently high performance and accuracy, further demonstrating the system’s reliability and ease of use. By addressing the challenges and opportunities of indoor positioning in cultural heritage, this work contributes to the broader field of location-based services and specifically to their application in cultural heritage.

2 BACKGROUND AND RELATED WORK

This section reviews related work on applications in indoor localization, with a particular focus on their use in museums.

2.1 Indoor localization

Indoor localization is a domain concerned with facilitating precise and efficient wayfinding within enclosed structures such as commercial buildings, shopping malls, healthcare facilities, and airports. Unlike outdoor localization, which relies on satellite-based systems like GPS, indoor localization systems focus on addressing the unique challenges posed by the absence of reliable GPS signals within indoor environments (Jo HJ et al. 2018, Tan Esther et al. 2020, del Horno et al. 2019, Yao et al. 2020). Central components of indoor localization encompass:

- Location Determination: Indoor localization systems employ a variety of technologies, including Wi-Fi signals, Bluetooth beacons, Radio-Frequency Identification (RFID), computer vision, and smartphone sensor data, to pinpoint a user's exact position within a building (Huang et al. 2009, El-Sheimy et al. 2021).
- Cartography: The creation of detailed indoor maps is a fundamental aspect of indoor localization. These maps encompass floor plans, room layouts, key points of interest, and optimal routing information (Huang et al. 2009).
- Routing and Guidance: Once a user's location is ascertained, the system provides them with step-by-step directions, comprising turn-by-turn instructions, visual cues, and audible guidance, facilitating their journey to a specified destination (Huang et al. 2009).
- Points of Interest (POIs): Indoor localization systems typically highlight POIs, including restrooms, exits, retail establishments, offices, or specific locations within a building, assisting users in locating desired destinations (El-Sheimy et al. 2021).
- User Interface: Users engage with the localization system through interfaces such as mobile applications, kiosks, or wearable devices. These interfaces furnish users with pertinent information, including maps, directions, and related details (El-Sheimy et al. 2021).
- Accessibility: Indoor localization systems are invaluable for individuals with disabilities, particularly those with visual impairments, empowering them to navigate autonomously and securely within indoor spaces (El-Sheimy et al. 2021).

Indoor localization systems fulfill a diverse set of objectives, including enhancing user experiences, bolstering safety and security measures, optimizing logistical operations within expansive facilities, and offering location-based services to visitors and patrons. Their application extends across many settings, ranging from commercial complexes and transportation hubs to healthcare institutions, cultural establishments, and intelligent building infrastructures (Barbieri et al. 2021, Butun et al. 2019, Pundir et al. 2019, Zou et al. 2014, Moreno et al. 2012a, Moreno et al. 2012b, Yin et al. 2017, Basiri et al. 2017, Kim Geok et al. 2020, Xingli et al. 2018).
Ongoing innovations in indoor localization technology, underscored by advancements in fields such as computer vision, machine learning, and sensor technology, have significantly enhanced the accuracy and utility of these systems, fostering their continual development and integration into various real-world scenarios (Basiri et al. 2017, Kim Geok et al. 2020, Lu et al. 2018, Gu et al. 2009, Dabove et al. 2018, Lymberopoulos et al. 2017, Yuan et al. 2018, Podevijn et al. 2018, Xiong et al. 2011, Correa et al. 2017, Jackermeier et al. 2018, Kárník et al. 2016, Davidson et al. 2017, Benini et al. 2006).

Early technical approaches for indoor localization systems relied on techniques like Wi-Fi triangulation (Youssef et al. 2005), Bluetooth signal strength (Feldmann et al. 2003), and Radio Frequency Identification (RFID) tags (Renaudin et al. 2007) to estimate the user's position. In more robust approaches, inertial sensors such as accelerometers and gyroscopes have become integral components of indoor localization systems. These sensors measure changes in velocity and orientation to estimate the user’s position. Inertial localization systems were initially used in military localization (Barbour et al. 2001). However, a significant drawback of this approach is that it loses accuracy over time due to the accumulation of small errors in the sensor data. Other approaches involve beacons, such as those based on Bluetooth Low Energy (BLE) technology. Beacons are small wireless devices that transmit signals for indoor positioning. They can be placed strategically throughout an indoor environment to provide location information to user devices, and they became popular due to their low power consumption and ease of deployment. Example uses include a localization system for car searching in indoor parking (Wang et al. 2018) and the use of Estimote beacons (https://estimote.com/) to detect the user’s presence (Sawaby et al. 2019).
Finally, sensor fusion is a modern indoor localization approach that integrates and analyzes data from various sources to provide more reliable positioning information (Elmenreich et al. 2002). Examples of such implementations include the use of LiDAR and inertial measurement units (IMUs) to estimate the pose reliably and with high precision (Ye et al. 2019), thus providing more effective tracking. Smartphones have emerged as the predominant tool for indoor localization due to their versatility and multifunctionality. Various indoor localization systems harness the capabilities of smartphones, utilizing features such as Wi-Fi connectivity, built-in sensors, and cameras (M. Piras et al. 2014).

2.2 Indoor localization in museums

Indoor localization technologies have gained prominence in recent years, particularly within the context of museums and cultural institutions. Traditional localization methods, such as paper maps and static signage, often prove inadequate in large and intricate museum spaces. Visitors may experience difficulties in finding their way, leading to suboptimal experiences (Jamshidi et al. 2020). A wide variety of technologies have been tried in museums, including IR beacons (Stock et al. 2007), RF Zigbee beacons (XXX), Wi-Fi (Jiang et al. 2023), and landmark-based navigation (CCC). Museums have increasingly turned to computer vision-based indoor localization solutions to enhance visitor experiences. This approach offers several advantages, including cost-effectiveness, real-time tracking, and the ability to recognize visual cues and landmarks within the museum environment (Conte et al. 2008, Kuo et al. 2014, Li et al. 2017, Dong et al. 2019). Some museums have integrated augmented reality (AR) into their indoor localization systems (Kolivand et al. 2019). AR overlays digital information onto the visitor's view through a smartphone or smart glasses.
This provides dynamic, real-time guidance, allowing visitors to interact with exhibits more effectively. For instance, visitors can point their devices at artworks or artifacts and receive detailed information about them. Ghouaie et al. (2017) introduce a handheld Augmented Reality (AR) system termed the "Mobile Augmented Reality Touring System" (M.A.R.T.S). A key feature of this system is its proposal to replicate the role of a human guide using a virtual human counterpart within the M.A.R.T.S framework. The overarching objective of these interaction schemes is to facilitate the real-time linkage of digital information with exhibits, enhancing the visitor experience and understanding.

Indoor localization solutions in museums not only guide visitors but also offer interactive and immersive experiences. Users can access multimedia content related to exhibits, including audio guides, videos, and additional contextual information (Stock et al. 2007, XXX). These features transform the localization experience into an educational and engaging journey through the museum's collections. For example, Villaespesa et al. (2021) employed computer vision technology to establish a subject tagging system within a web-based platform, using computer vision algorithms capable of quickly generating subject tags from digital images of objects within a curated collection. The methodology involved acquiring an extensive dataset of images captured within a museum setting. These images were then processed by computer vision algorithms to extract descriptive tags, which were cataloged in a database. The system thereby enabled users to search the website based on the assigned tags, ultimately engaging them with the museum’s collections.
Computer vision-based indoor localization systems can be tailored to enhance accessibility and inclusivity within museums. They can provide specialized guidance for visitors with disabilities, such as audio descriptions for visually impaired individuals. Additionally, multilingual support ensures that diverse audiences can navigate and engage with museum exhibits comfortably. Meliones et al. (2018) introduce an interactive autonomous localization system designed for indoor use, specifically targeting individuals and groups with visual impairments, referred to as the "Blind Museum Tourer." The core functionality of the Blind Museum Tourer system hinges upon a robust indoor localization module that serves as a guide for individuals who are blind or visually impaired, facilitating self-guided tours within museum premises. In real time, the system can pinpoint the user's location within the indoor environment and then provide guidance towards the next exhibit according to the predefined tour route. Upon reaching each exhibit, the system delivers auditory presentations to the user for an informative and engaging museum experience.

In conclusion, indoor localization technologies, particularly those based on computer vision and augmented reality, are redefining the way visitors explore and interact with museums. These solutions not only streamline localization but also contribute to richer, more immersive, and inclusive museum experiences. As technology continues to advance, museums can look forward to further enhancing visitor engagement and education through indoor localization systems.

2.3 Computer vision in indoor localization

Computer vision is a field focused on techniques for capturing, processing, and analyzing images, allowing systems to interpret and understand visual information.
Computer vision algorithms typically rely on both low-level and high-level visual features to extract meaningful data from images. Low-level features include color, texture, edges, and corners, while high-level features involve semantic information and object relationships. Techniques like image filtering, feature detection, and feature extraction play a key role in extracting these features (Klette et al. 2014, Stockman et al. 2001, Morris et al. 2004, Jähne et al. 2000). Recent advancements in hardware, software, and machine learning have significantly boosted the capabilities of computer vision. Notably, deep learning methods have revolutionized the field, with convolutional neural networks (CNNs) (O'Shea et al. 2015) and recurrent neural networks (RNNs) (Wang et al. 2016) becoming central to modern computer vision tasks. CNNs, in particular, excel in autonomously learning layered representations from large image datasets, enabling accurate image recognition and classification (Li et al. 2021).

One crucial application of computer vision is scene recognition, where algorithms identify and categorize environments or contexts within images. Scene recognition relies on analyzing visual cues such as objects, textures, and spatial arrangements to classify images into various scene types (Lin Xie et al. 2020). This capability is vital in fields such as smart city infrastructure, where it helps recognize urban scenes, monitor traffic, and optimize city planning (Syahidi et al. 2023). It is also crucial in augmented reality (AR), where devices must recognize scenes and objects in real time to interact with the user's surroundings effectively (Ghasemi et al. 2022).

In recent years, there has been a growing interest in applying computer vision to indoor localization due to its ability to provide accurate and robust localization in environments where GPS signals are unavailable or unreliable.
Traditional positioning signals such as GPS fail indoors, and even wireless signals like Wi-Fi and Bluetooth can suffer from interference caused by building materials or obstructions (Morar et al. 2020). By contrast, computer vision provides a robust alternative by leveraging images or video streams captured by cameras to estimate a user’s location accurately (Kim et al. 2018). According to Chen et al. (2021), the most significant advantage of cellular positioning technology is that it achieves seamless indoor positioning. Despite its potential, computer vision for indoor localization faces certain challenges, including accuracy (affected by lighting conditions, camera quality, and moving objects) and computational complexity (which can limit real-time application) (Zhang et al. 2019). Nevertheless, computer vision is becoming a promising tool for indoor localization, and ongoing improvements in algorithms and hardware continue to address these challenges. Yang et al. (2020) propose an improved vision-based positioning method that uses a pixel threshold-based eight-point method to enhance the quality of feature points, thereby eliminating mismatching caused by pixel drift. The method also improves the epipolar constraint and introduces a new cost function for better accuracy in fundamental matrix calculation, achieving superior results compared to traditional methods.

An innovative approach in this field is Visual SLAM (Simultaneous Localization and Mapping), which uses computer vision to simultaneously map an environment and estimate the user's position within it. By analyzing visual features extracted from camera images, Visual SLAM algorithms track user movement and generate a real-time map of the surrounding environment (Raul et al. 2015). This technology has been employed in various indoor localization systems to provide precise localization. For instance, a study by Poulose et al. (2019) demonstrated that combining Visual SLAM with hybrid sensors on a smartphone camera reduced the localization error from 0.1398 meters to 0.0690 meters; the major drawback of this approach is its very high computational cost.

In addition to Visual SLAM, augmented reality (AR) is often integrated into computer vision-based indoor localization. AR systems overlay digital information onto the camera view, offering real-time localization assistance (Cavallari et al. 2019). Such systems typically analyze images captured by the user's camera to recognize specific objects or landmarks. By utilizing deep learning models for object recognition, these systems enhance the real-time interactivity between the user and their environment (Zhou et al. 2015). Varalatchoumy et al. (2023) proposed an AR-based indoor localization solution that integrates smartphone sensors and cameras. Experimental results from this study showed that localization errors ranged from 0.1 to 0.25 meters for short distances and up to 1.2 meters over longer distances of 200 meters.

Another widely used technique in indoor localization is visual landmark recognition, where computer vision algorithms identify distinctive landmarks within an environment. These systems estimate the user's position based on their relative proximity and orientation to these landmarks. By integrating landmark recognition with sensor fusion techniques, such systems can enhance accuracy and provide more reliable localization (CCC). Table 1 provides a summary of advantages and disadvantages of the different methods.

In our specific application, we employ a camera-based solution for museum localization. Users wear a smartphone around their neck while exploring the museum, and the device continuously captures images of their surroundings. These images are analyzed to detect when users pause in front of exhibits, signaling engagement with specific points of interest (POIs).
The system’s reliance on camera-based assessments for location identification negates the need for external tools like Wi-Fi or additional sensors. This vision-based positioning system is an efficient solution for indoor localization, providing visitors with real-time, accurate guidance without manual intervention. As research in computer vision continues to progress, the application of this technology in indoor localization is expected to become even more accurate, robust, and efficient. These advancements will further enhance the user experience, particularly in cultural heritage environments, such as museums, where accurate and non-intrusive localization is crucial.

Table 1: An overview of each method's strengths and weaknesses, offering insights into their applications in various indoor environments, including museums.

| Method | Advantages | Disadvantages |
| Wi-Fi Triangulation | Widely available in most indoor environments; cost-effective as it uses existing Wi-Fi networks | Susceptible to interference from walls and objects; limited accuracy in complex or crowded environments |
| Bluetooth Beacons (BLE) | Low power consumption; easy to deploy; provides good accuracy within short ranges | Requires maintenance (e.g., battery replacement); signal may weaken in large spaces or through obstacles |
| RFID | High precision for short-range localization; no reliance on batteries for tags | Limited range; expensive to deploy over large areas; requires specialized readers |
| Inertial Sensors | Can work without external infrastructure; suitable for real-time tracking | Accumulates errors over time (drift); limited standalone accuracy |
| Computer Vision | High accuracy in recognizing locations and landmarks; cost-effective using existing cameras | Dependent on lighting conditions; computationally intensive; accuracy can be impacted by moving objects |
| Visual SLAM | Provides simultaneous localization and mapping; works in real time for dynamic environments | High computational cost; requires advanced hardware for real-time processing |
| Augmented Reality (AR) | Enhances user engagement; real-time overlay of information onto the environment | Requires good camera quality; limited accuracy for large or cluttered spaces; high battery usage on devices |
| Landmark Recognition | High reliability using unique landmarks; improves accuracy with sensor fusion | Requires pre-mapped landmarks; less effective in environments lacking distinctive features |
| Sensor Fusion | Combines data from multiple sources for improved accuracy; works in diverse conditions | High computational complexity; may require multiple sensors, increasing system cost |
| Smartphones (General) | Widely available and versatile; integrates multiple features (Wi-Fi, sensors, cameras) | Dependent on smartphone hardware capabilities; battery drain can be significant during continuous use |

2.4 Using Large Language Models in Cultural Heritage

Since large language models (LLMs) appeared, they have quickly found their way into a wide range of application domains, as they speed up the process of content creation. As in many other domains, LLMs have also been adopted in cultural heritage. Trichopoulos et al. (2023) presented MAGICAL: Museum AI Guide for Augmenting Cultural Heritage with Intelligent Language Model, a system that demonstrated the capability of ChatGPT-4 (cite GPT4) to be used as a tour guide that responds to visitors' questions and provides answers about objects (also using speech-to-text and text-to-speech technologies). Few additional studies have explored the potential of LLMs as smart and personalized museum guides, and reviewing them is beyond the scope of this paper. However, one issue needs to be noted: the validity of the content created by LLMs.
(AAA) demonstrated the potential of automatic content generation for descriptions of artifacts in a museum, where an image, a title, and sometimes a Wikipedia article were used to guide the creation of a textual description of the object of interest, not as a replacement for a tour guide but as an assistant to the museum's content curator. The authors suggested that the created content be verified manually and only then used by a visitors' guide system, thus becoming a "curator's helper". We adopted this approach in our study, as the quality of the content is an important part of the overall experience.
3 LOCALIZATION ALGORITHM
The problem we are addressing is determining the location of a museum visitor. As visitors explore the museum, they may come across POIs they want to learn more about. Our goal is to provide a solution that enables them to easily access detailed information about the POIs they encounter. The proposed solution is an application that helps visitors identify and learn about any POI they are standing in front of. To build this solution, we need to overcome several key challenges. First, the application needs an efficient way to represent the input images so that the matching process is both accurate and fast. Proper image representation is crucial to ensure that the system can quickly and reliably match the input image to the correct POI. Second, it is essential to construct a well-organized data set for each POI, which will allow for smoother identification and comparison processes. The system must maintain a robust database to support effective POI recognition. Finally, the solution needs to provide a fast and reliable method to recognize whether a POI exists in the database or not. If the POI does exist, the solution should correctly recognize it, ensuring accurate identification.
This includes handling cases where a POI is not yet present in the data set and making that decision swiftly to maintain a smooth user experience.
3.1 Image Representation and Matching
Many research efforts have focused on using feature extraction techniques to solve the challenge of image representation and matching. For example, Yang et al. (2020) employed the SIFT algorithm for feature extraction and matching. SIFT (G. Lowe et al. 2004) works by detecting key points in an image and describing them using distinctive feature vectors, which are then used to match images based on similarity. At the beginning of our research, we explored using SIFT and other tools such as OpenCV’s ORB (Rublee et al. 2011). However, these methods, which are based on detecting key points in the image, proved to be too slow and relatively inaccurate for our purposes. The complex and computationally expensive matching process they rely on significantly impacted processing time, making them unsuitable for online applications where real-time processing is critical. We then turned to deep learning-based representations, which are relatively new compared to traditional methods like SIFT, SURF, and ORB. We evaluated several candidates, including ResNet (He et al. 2016) and CLIP (Radford et al. 2021), to assess their feasibility. One advantage of these models is that each image is reduced to a single feature vector, so comparing two images requires only one distance computation between vectors, which is much more computationally efficient than the key point matching process used by SIFT. Among the models we tested, CLIP yielded the best performance and accuracy. As a result, we decided to further investigate and ultimately adopt CLIP as the feature extraction model behind our solution. While CLIP was initially designed to represent images and text together, we chose to utilize its dense layer output, which produces a feature vector representation of the image.
During our investigation of the CLIP model, we discovered several versions available. After testing and research, we selected the CLIP-ViT-Large-Patch14 (https://huggingface.co/openai/clip-vit-large-patch14) model, which best suited our needs. We also experimented with the CLIP-ViT-Base-Patch32 version (https://huggingface.co/openai/clip-vit-base-patch32), but it did not perform as well as CLIP-ViT-Large-Patch14 in terms of accuracy and performance.
3.2 POI Data Set Construction
In order to build a database, we captured videos for each POI, ensuring that the video frames cover every possible snapshot a visitor might take. For each video, we extracted all the frames and then employed CLIP on the entire set of frames, resulting in a set of embeddings representing those frames. After this, the ARIDF (BBB) algorithm is applied to these embeddings. The ARIDF (Automatic Representative Image Dataset Finder for Image Based Localization) algorithm processes a set of embeddings to identify a minimal subset that best represents the entire set. The process begins by initializing a distance matrix to store the pairwise Euclidean distances between the embeddings. To optimize performance, only half of the matrix is computed since the distance between any two embeddings is symmetric. Once the distance matrix is computed, it is converted into a binary matrix using a threshold. If a distance is greater than 0.38 (a value determined through a grid search on this parameter), the corresponding cell is set to 0, indicating dissimilarity; otherwise, it is set to 1, indicating similarity. The goal of the algorithm is to identify the most representative embeddings. It does this by iteratively selecting the column with the most 1s, which indicates the embedding closest to the majority of other embeddings. The corresponding embedding index is added to a subset that will represent the entire set.
The matrix is then updated to mark all similar embeddings as covered, reducing redundancy in subsequent iterations. This process repeats until no 1s remain in the matrix, meaning all similar embeddings have been accounted for. The final output is a subset of embeddings that effectively represents the diversity within the entire set. This subset is returned as the result, providing a reduced but representative view of the data. The ARIDF algorithm is efficient in identifying significant relationships within the data and minimizing redundancy, making it well suited for reducing the number of stored embeddings while preserving diversity. The ARIDF reduction yielded different subset sizes for different videos, as it is influenced by several factors, such as the diversity between frames and the video's length. Nevertheless, the algorithm eliminated more than 70% of the total frames for each video, effectively reducing redundancy while preserving important visual content.
3.3 POI Recognition
The algorithm aims to find a matching Point of Interest (POI) in a database based on an input image by using image embeddings. The process starts when an image is received as input. The first step is to generate an embedding (a vector of features that numerically represents the image) for this input image using the CLIP model. Next, the algorithm initializes two variables: “minDistance” to keep track of the smallest Euclidean distance found and “matchedPOI” to store the data associated with the embedding that is closest to the input image. Initially, “minDistance” is set to infinity, indicating that no distance has been calculated yet. The algorithm then loops over all subsets of embeddings stored in the database. For each subset, it calculates the Euclidean distance between the input embedding and each embedding within the subset. The Euclidean distance measures the dissimilarity between two vectors; a smaller distance indicates higher similarity.
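As a brief aside on the data set construction of Section 3.2, the ARIDF selection step described there can be sketched as a greedy cover over the binary similarity matrix. This is only an illustrative sketch and not the authors' implementation: the function name `select_representatives`, the tie-breaking rule, and the restriction of candidates to still-uncovered embeddings are assumptions; only the 0.38 threshold and the overall procedure come from the text.

```python
import math


def select_representatives(embeddings, threshold=0.38):
    """Greedily pick indices of embeddings that together cover the whole set.

    Two embeddings count as similar when their Euclidean distance is at most
    `threshold` (0.38 is the value reported in the text).
    """
    n = len(embeddings)
    # Binary similarity matrix. Only the upper half is computed and then
    # mirrored, because the distance is symmetric.
    similar = [[False] * n for _ in range(n)]
    for i in range(n):
        similar[i][i] = True  # distance of an embedding to itself is 0
        for j in range(i + 1, n):
            close = math.dist(embeddings[i], embeddings[j]) <= threshold
            similar[i][j] = similar[j][i] = close

    uncovered = set(range(n))
    chosen = []
    while uncovered:
        # Column with the most remaining 1s, counting only uncovered rows.
        # Assumption: candidates are limited to uncovered embeddings and
        # ties go to the lowest index.
        best = max(
            sorted(uncovered),
            key=lambda j: sum(similar[i][j] for i in uncovered),
        )
        chosen.append(best)
        # Mark everything similar to the chosen embedding as covered.
        uncovered -= {i for i in uncovered if similar[i][best]}
    return chosen
```

For example, with two tight clusters of one-dimensional toy "embeddings", one representative per cluster is selected. Because every embedding is similar to itself, each iteration covers at least one embedding and the loop always terminates.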
Within each subset, the algorithm maintains a local minimum distance (“subsetMinDistance”). As it iterates through each embedding in the subset, it updates “subsetMinDistance” whenever a smaller distance is found. Once all embeddings in a subset have been processed, the algorithm compares “subsetMinDistance” with the overall “minDistance”. If “subsetMinDistance” is smaller, “minDistance” is updated, and the corresponding data for that subset is stored in “matchedPOI”. After processing all subsets of embeddings, the algorithm checks whether the smallest distance found (“minDistance”) is below a predefined threshold. This threshold is determined based on the desired balance between false positives and true positives. If “minDistance” is below the threshold, it indicates that a sufficiently similar POI was found, and the corresponding data for that POI is returned. Otherwise, the algorithm returns a general response indicating that no matching location was found.
Runtime complexity. To analyze the runtime complexity of the described algorithm, we break it down into its components. Key variables: n is the number of image embeddings in the database, and d is the dimensionality of each embedding (in this case, 768).
1. CLIP Extraction Process: The extraction process that generates an embedding from the input image is computationally expensive. However, this process happens only once per query, so we denote its time complexity as O(E), where E is the cost of computing the embedding for the input image using the CLIP model.
2. Euclidean Distance Calculation: The algorithm calculates the Euclidean distance between the input embedding and each embedding in the database. Since each embedding is a vector of length d=768, calculating the Euclidean distance between two embeddings takes O(d) time, which is constant for a fixed d.
3. Looping Over Embeddings: The algorithm loops over all embeddings in each subset, that is, over all n embeddings, with an O(d) distance computation for each comparison. Thus, the total time for looping through all the embeddings is O(n×d).
4. Comparison and Thresholding: After computing the Euclidean distance for each embedding, the algorithm keeps track of the minimum distance in each subset and compares it with a global minimum. These comparisons and updates take O(n) time in total, which does not significantly affect the overall complexity.
5. Final Threshold Check: At the end of the process, the algorithm compares the minimum distance to a threshold, which is an O(1) operation.
Total Runtime Complexity: The CLIP extraction process is O(E), a heavy but fixed cost incurred once per query. The distance calculation and looping through embeddings result in O(n×d). Final checks and comparisons are O(n). Thus, the overall time complexity of the algorithm is O(E+n×d). Given that E≫n×d for the data set sizes considered here, the running time is dominated by the CLIP extraction process. As a result, the number of points of interest (POIs) can grow without significantly affecting the running time, as long as n×d remains small relative to E.
3.4 LLM-assisted Content Creation and Delivery
During the process of creating the content to be delivered to the visitors, multiple LLMs were experimented with to identify the one producing the most concise and clear text. The input to the LLMs included the title, an image of the POI, and the following prompt: “Create a short description for the artifact in the attached image.” Gemini API (Gemini 1.0 Pro https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemini-pro-vision?hl=it&pli=1) was selected for this purpose, as its generated text proved to be the most effective. The text, as proposed in AAA, was subsequently verified, edited, and carefully examined by the authors to ensure its soundness.
Once the descriptions were generated, they were added to the dataset of the relevant POIs. When users directed their smartphone cameras at a specific artifact, the system identified the POI and displayed the corresponding description on the screen. In cases where a showcase contained multiple items, the system first presented a general explanation about the items in the showcase and then expanded on several specific items included in the presentation. The system supports the creation and updating of explanations for POIs in two distinct ways: users can either enrich a new POI in the database by attaching a title and an image or update the description of an existing POI. In both cases, the provided title, image, and associated prompt trigger Gemini, and the resulting output is automatically saved to the dataset.
4 SYSTEM
Our system includes two main components: an Android application that we named Wandering Application (WA) and a back-end service that we named Matching Service (MS). The two components interact through an API that is exposed by the MS and used by the WA. The system serves two types of users: visitors and museum staff members.
4.1 The Wandering Application
An Android application was developed to enhance the experience of museum visitors by delivering location-aware information about POIs. The app is designed to continuously track the user's movement within the museum by utilizing the device's camera. Users are required to hold their device in a way that allows the camera to capture images of their path (holding it in their hand or hanging it on their chest), enabling the app to gather data about their activities. The primary aim is to determine the visitor's location, specifically identifying which exhibit the visitor is currently viewing. In this context, "location" refers to the exhibit in front of the visitor rather than any arbitrary position within the museum.
To achieve this goal, the application incorporates two key functionalities: 1) detecting that the visitor is standing still, and 2) querying a service with images captured in that position. The app is programmed to detect when the user has stopped moving, as it is assumed that visitors will pause mainly in front of exhibits of interest. When the app detects that the user has stopped, it captures an image and sends it to a back-end service. The service then determines whether the image matches a known exhibit location. This process allows the app to accurately identify the visitor's position within the museum. Additionally, the application supports an enrichment process intended for museum staff or other designated users. These users can upload content related to specific POIs, which may be a textual description or any kind of multimedia. The app facilitates the submission of this data to a back-end service responsible for maintaining and updating the dataset of POIs and their associated content. This dual functionality ensures that the application not only enhances the visitor experience by providing location-aware content but also continuously improves the museum's data repository through museum staff members' contributions. Figure 1 is a sequence diagram that illustrates the overall interaction between the visitor and the system’s components.
4.1.1 Standing detection
To detect when a user is standing still in front of a POI, we assume that visitors typically stop moving when they are observing an exhibit of interest. By monitoring the movement of users, the application can discern when a visitor has paused, thereby indicating a point of interest or engagement with the exhibit. The detection mechanism involves comparing consecutive images captured from the stream of images taken continuously by the camera of the mobile device.
Specifically, we compute the difference between the average pixel values of two successive images using the following equation:
$$D(I_t, I_{t+1}) = \lvert E(I_t) - E(I_{t+1}) \rvert$$
where E(I) represents the average pixel value of an image I, and I_t and I_{t+1} are two successive images. This difference helps identify whether the scene in the images remains largely unchanged. If the user moves only slightly, the difference does not vary significantly, as most of the image remains continuous. To determine whether a user is standing still, the application calculates the difference in averages between consecutive images and checks if the difference remains below a predetermined threshold. This threshold accounts for small movements, ensuring robustness to minor positional adjustments. If the difference stays below the threshold for a continuous duration, configured to be 3 seconds, the application concludes that the user is stationary. This duration was chosen to balance accuracy and speed, allowing the system to efficiently process image data in real time. We aimed to avoid using phone sensors like gyroscopes because these sensors vary across devices, with newer models having better sensors than older ones. Relying on sensors would limit the system's usability, as our users may use any type of device. Regarding the continuous use of the camera, it does affect the battery; however, other functions of the application are lightweight, so the camera is the primary factor impacting battery life. Nonetheless, this did not present a significant issue, nor did device conditions such as temperature. Once the stationary state is detected, an image is captured and transmitted to the backend service (see Figure 2a and 2b for screenshots during the location identification process).
4.1.2 Querying the Matching Service
Once an image is captured, the application sends a request to the Matching Service (MS). The MS responds with either a positive or a negative response.
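Before continuing with the service response, the standing-detection rule of Section 4.1.1 can be illustrated with a minimal sketch. It assumes grayscale frames given as nested lists and timestamps in seconds; the function names and the value passed as `diff_threshold` are hypothetical, since the text does not report the pixel-difference threshold. Only the 3-second duration comes from the text.

```python
def mean_intensity(image):
    """Average pixel value E(I) of a grayscale image given as rows of numbers."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def detect_standing(frames, diff_threshold, hold_seconds=3.0):
    """Return the timestamp at which the user is judged stationary, else None.

    `frames` is an iterable of (timestamp_in_seconds, image) pairs.
    `diff_threshold` is a hypothetical tuning parameter.
    """
    prev_time = prev_mean = still_since = None
    for timestamp, image in frames:
        mean = mean_intensity(image)
        if prev_mean is not None and abs(mean - prev_mean) < diff_threshold:
            if still_since is None:
                # The still period began at the previous frame.
                still_since = prev_time
            if timestamp - still_since >= hold_seconds:
                return timestamp
        else:
            # Movement detected (or first frame): restart the timer.
            still_since = None
        prev_time, prev_mean = timestamp, mean
    return None
```

A stream of unchanged frames sampled once per second triggers detection at the 3-second mark, while a stream whose average brightness keeps changing never does. In the application, the returned moment would be the point at which an image is captured and sent to the MS.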
If the response is positive, the application displays a pop-up screen with detailed information about the POI, including metadata about the identified location such as the exhibit's ID, name, and description (see Figure 2c). If the response is negative, the application continues to capture the image stream and informs the user that the position could not be identified (see Figure 2b).
4.2 Matching Service
A back-end service was built using Python, which was chosen for this project due to its robust library ecosystem and compatibility with deep learning models, particularly the CLIP model. Libraries such as pandas and NumPy facilitate efficient data manipulation, numerical computations, and model development. Moreover, Python's seamless integration with frameworks such as TensorFlow allows for the effective implementation of advanced models like CLIP. The Matching Service maintains a dataset in which minimal sets of images of all the POIs are stored. These minimal sets are created automatically from large sets of images by applying the ARIDF algorithm (BBB). For storage, we used a NoSQL database, MongoDB, so our POIs are stored as documents, with each document representing a POI.
4.2.1 Matching service system design (Figure 4)
Find API: The Find API is designed to handle requests for the image recognition process. It initiates this process by receiving image-based queries, triggering the necessary operations, and returning the relevant responses.
Orchestrator: The Orchestrator is responsible for coordinating interactions between various services and functionalities in all the processes. It serves as an intermediary, receiving requests from the API and systematically forwarding them to the appropriate downstream services. For example, upon receiving a request, the Orchestrator sends the relevant data to the Features Extractor service, which processes the input and returns a result.
The Orchestrator then directs this result to the next designated service in the workflow.
Features Extractor: The Features Extractor service is designed to process individual images or sets of images, returning their corresponding embeddings using CLIP.
Finding Service: The Finding Service is designed to accept an input embedding and identify the most similar embedding in the database. This process involves querying the entire database via the Data-Base Service to retrieve all relevant documents. Once the documents are obtained, the Finding Service iterates over them, calculating the Euclidean distance between the input embedding and each embedding from the database. The embedding with the minimum Euclidean distance is identified, and this distance is evaluated against a configurable threshold. If the distance is below this threshold, the service considers the result sufficiently similar and returns the corresponding data to the Orchestrator.
Enrichment API: The Enrichment API is designed to manage enrichment requests. These requests typically contain a video of a POI accompanied by text that describes the POI. Upon receiving a request, the Enrichment API initiates the enrichment process by triggering the Orchestrator with the provided request data.
ARIDF (BBB): see Section 3.2.
Storing Service: The Storing Service is tasked with the comprehensive assembly and storage of documents. Each document encapsulates all pertinent data about a POI, including the minimum set of embeddings that represent it. This service handles the creation of these detailed documents, ensuring that all relevant information is accurately compiled. In collaboration with the Data-Base Service, the Storing Service subsequently stores these documents in a database. The end result is a collection of documents, each uniquely representing a POI.
Data-Base Service: The Data-Base Service is designed to facilitate Create, Read, Update, and Delete (CRUD) actions within the database, supporting the needs of various other services. This service establishes and manages a connection to the database while incorporating a caching layer that maintains a local copy of the dataset. This local copy includes all data existing in the database, significantly enhancing the performance of retrieval (get) actions. The Data-Base Service is also responsible for ensuring that the cached copy and the actual database remain consistently aligned, thereby guaranteeing data integrity and synchronization. This dual-layered approach not only optimizes access times but also ensures the reliability and accuracy of data across all interacting services.
MongoDB: MongoDB serves as the database platform that stores comprehensive information about Points of Interest (POIs). Each POI is represented by a MongoDB document, structured to include several critical fields. Each document features a unique "id," which is automatically generated by MongoDB, and a "name" for the POI, extracted from the input description provided by the application. The "description" field contains a narrative that describes the POI, intended for presentation to the end user. In addition, the document contains a "minimum set of embeddings," representing a condensed version of the larger set of embeddings, and a "set of rest embeddings," which includes all other embeddings not part of the minimum set. Each embedding is a vector with a dimension of (1, 768), encapsulating the feature representation of a frame of the POI. Moreover, the document records other essential metadata about the input video, such as the number of frames extracted from the video, the title of the video, its length, and the creation date of the document. This comprehensive dataset is managed by our system, particularly the Data-Base Service, to ensure the efficient and effective utilization of POI information.
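To make the document layout concrete, the following is a minimal sketch of how such a POI document might be assembled before it is handed to the database. The key names (`min_embeddings`, `rest_embeddings`, `video`, `created_at`) and the function `build_poi_document` are hypothetical; the text names the fields only informally. The "id" field is omitted because, as stated above, MongoDB generates it on insertion.

```python
from datetime import datetime, timezone

EMBEDDING_DIM = 768  # embedding length reported in the text


def build_poi_document(name, description, min_embeddings, rest_embeddings,
                       video_title, video_length_seconds, frame_count):
    """Assemble a plain dict describing one POI (hypothetical key names)."""
    for vector in list(min_embeddings) + list(rest_embeddings):
        if len(vector) != EMBEDDING_DIM:
            raise ValueError(
                f"expected {EMBEDDING_DIM} values, got {len(vector)}"
            )
    return {
        "name": name,
        "description": description,
        # Representative subset selected by ARIDF.
        "min_embeddings": [list(v) for v in min_embeddings],
        # All remaining frame embeddings.
        "rest_embeddings": [list(v) for v in rest_embeddings],
        # Metadata about the input video.
        "video": {
            "title": video_title,
            "length_seconds": video_length_seconds,
            "frame_count": frame_count,
        },
        "created_at": datetime.now(timezone.utc),
    }
```

Keeping the representative subset separate from the remaining embeddings means the search path only needs to read the smaller list.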
The structure and organization of the MongoDB documents facilitate streamlined access and maintenance, supporting the overall functionality and performance of the POI dataset.
4.2.2 System Processes (Figure 3)
Enriching a POI process: The Matching Service (MS) provides an entry point for enriching requests from the application. These requests typically include a video file representing a POI and accompanying metadata that describes it. Upon receiving an enriching request, the Enrichment API forwards the request data to the Orchestrator. The Orchestrator initiates the process by splitting the video into a set of images or frames, which are then sent to the Features Extractor. The Features Extractor processes each image or frame iteratively, applying the CLIP model to generate embeddings, each a vector of features with a size of 768. After processing all frames, the Features Extractor returns a new set of embeddings to the Orchestrator. Next, the Orchestrator forwards the embeddings set to the ARIDF algorithm (see Section 3.2, POI Data Set Construction). ARIDF processes these embeddings to identify a minimal set of representative embeddings and separates the remaining embeddings into another set. Both sets of embeddings, along with all the input video data and metadata, are then sent to the Storing Service by the Orchestrator. The Storing Service assembles this information into a data object that is forwarded to the Data-Base Service, which is responsible for storing the data in the database and updating the cache accordingly. As a result, the new POI is added to both the database and the cache, making it readily available for querying through our searching flow.
Searching process: The Matching Service (MS) provides a find entry point, which initiates the searching process. When a request is accepted by the Find API, typically containing an image, the request data is forwarded to the Orchestrator.
The Orchestrator begins by requesting the embeddings of the input image from the Features Extractor, which uses the CLIP model to generate a feature vector of length 768. This single embedding is then sent back to the Orchestrator. Subsequently, the Orchestrator forwards the embedding to the Finding Service that requests all POIs from the Data-Base Service, which returns the cached POIs for performance efficiency. The Finding Service iterates over each POI, calculating the Euclidean distance between the input embedding and the embeddings in the minimum set for each POI. The smallest distance is stored and compared across all POIs. At the end of this process, the smallest distance is compared to a pre-determined threshold, established through research to minimize false positives while maintaining a satisfactory rate of true positives. If the minimum distance is below the threshold, the image is deemed sufficiently similar to a specific POI in the database. The Finding Service then returns the corresponding description of the most similar POI to the Orchestrator. If the minimum distance is equal to or greater than the threshold, a message indicating that no sufficiently similar POI was found is returned. The Orchestrator then forwards the response to the Find API, which in turn provides an appropriate response to the client. This entire searching flow is optimized to execute within approximately 2-3 seconds, leveraging techniques such as caching, the ARIDF model, storing embeddings instead of images and the use of Euclidean distance calculations. These optimizations ensure that the system performs efficiently while maintaining high accuracy in identifying similar POIs. 
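The matching step at the core of the searching process (and of Section 3.3) can be sketched as follows. This is a simplified illustration, not the deployed service: the function name `find_poi`, the dictionary keys, and the threshold value in the usage example are assumptions, as the text does not report the recognition threshold. The variable names mirror “minDistance”, “subsetMinDistance”, and “matchedPOI” from the description.

```python
import math


def find_poi(query_embedding, pois, threshold):
    """Return the POI whose minimal embedding set is closest to the query.

    `pois` is a list of dicts with a "min_embeddings" list of vectors.
    Returns None when the smallest distance is not below `threshold`.
    """
    min_distance = math.inf
    matched_poi = None
    for poi in pois:
        # Smallest distance between the query and this POI's embeddings.
        subset_min_distance = min(
            (math.dist(query_embedding, e) for e in poi["min_embeddings"]),
            default=math.inf,
        )
        if subset_min_distance < min_distance:
            min_distance = subset_min_distance
            matched_poi = poi
    if min_distance < threshold:
        return matched_poi
    return None  # no sufficiently similar POI was found
```

With two toy POIs in a two-dimensional embedding space, a query near one of the stored vectors returns that POI, and a query far from all of them returns `None`. The loop performs one O(d) distance computation per stored embedding, which matches the O(n×d) term of the complexity analysis.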
5 EXPERIMENTATION
All methods were carried out in accordance with relevant guidelines and regulations.
5.1 Introduction to the Experiment
The primary purpose of this experiment was to evaluate the effectiveness and usability of the proposed application solution in addressing our research question: “How can image-based positioning be integrated efficiently into a mobile, location aware museum visitors guide?” The key objectives of the experiment were:
Accuracy Evaluation: We aimed to assess the accuracy of the application's outputs and responses, particularly in delivering relevant and correct location-aware information to users. This involved verifying that the application provided accurate and consistent results across different scenarios and user interactions.
Quality and Performance: The experiment sought to evaluate the quality and performance of the application under real-world conditions. We examined whether the application operated smoothly, with minimal delays or technical issues, and maintained high responsiveness and reliability. This was particularly important given the high computational demands of the problem the application addresses.
Since we had already developed a system, we also wanted to know what users thought of it beyond location identification, including usability, comfort, and their overall enjoyment and engagement with the application:
Usability Evaluation: To assess the usability of the application, we aimed to determine how intuitive and user-friendly the interface is for participants. This involved measuring how easily users could navigate the application, understand the content, and complete tasks without requiring extensive instruction or prior experience.
Comfort Assessment: We aimed to evaluate the comfort level of users while interacting with the application. This included assessing the physical ease of use, the clarity and accessibility of the interface, and the overall user experience in terms of cognitive load and satisfaction.
However, this goal was secondary, as it was not the primary focus of our study. Since users interacted with the application, we included some related questions to gather additional insights.
User Enjoyment and Engagement: Since overall enjoyment and engagement with the application is a key indicator of its success in enhancing user interaction and satisfaction, we tried to assess this aspect as well.
By focusing on these objectives, the experiment sought to provide a comprehensive evaluation of the application's capabilities and identify areas for improvement. The findings from this experiment are intended to inform the development of future iterations of the application, ensuring that it meets the needs and expectations of a diverse user base.
5.2 Experimental Setup
The experiment was conducted at the Hecht Museum (https://mushecht.haifa.ac.il/), located at the University of Haifa. The museum's unique environment is rich in archaeological displays and art, offering a diverse array of exhibits that provide a stimulating backdrop for research. For the purpose of this experiment, we selected the "Ancient Crafts and Industries" exhibition area, which focuses on seven ancient crafts: metalworking, woodworking, stone vessels, glassmaking, mosaic art, the art of writing, and the physician's craft. This area includes 29 displays, each showcasing artifacts and information related to these ancient practices. We decided to consider each display as a POI for the experiment. These POIs were integrated into our application and dataset, allowing us to gather data on participant interactions and experiences within this specific context. The choice of this area was motivated by its rich historical and educational content, which enabled us to assess various aspects of the application's usability, including localization, content engagement, and overall user satisfaction.
This setting offered a unique opportunity to test the application in an environment that mimics real-world usage, where visitors engage with cultural and educational content. For each POI in the "Ancient Crafts and Industries" area, a short video was taken. Most of the videos were short (average duration about 22.6 seconds, standard deviation about 15.3 seconds; see Table 2). The MS converts each video into a set of frames, with the number of extracted frames proportional to the duration of the video: each second is configured to yield about 2 frames. After extracting the frames, we applied the ARIDF algorithm (BBB) to each POI and saved only the representative frames to the database (see Figures 5 and 6 for examples of frames stored in the database).
5.3 Methodology
5.3.1 Experimental Design and Participants
The experiment aimed to evaluate the usability, comfort, quality, and accuracy of our application in a real-world setting. The experiment was approved by the IRB of the Faculty of Social Sciences of the University of Haifa (approval number 094/24). We recruited 30 participants without imposing specific demographic restrictions, requiring only that they be adults. Although the study was open to all eligible individuals, we particularly targeted older adults, as they are more likely to face challenges with new technologies. Our primary focus was on museum visitors, who generally have an interest in archaeological artifacts. Participants were approached randomly and invited to join the study. Those who agreed to participate were asked to complete an "Application Form for Participation in Research and Informed Consent," which provided detailed information about the research and the experiment. Each participant was provided with a dedicated device with the application pre-installed to ensure a seamless experience. They were instructed to explore at least 15 Points of Interest (POIs) within the "Ancient Crafts and Industries" zone of the Hecht Museum.
Although demographic data was not consistently collected, we managed to gather information for about half of the participants. Of the 16 participants for whom we collected demographic data, 9 were male and 7 were female, with an average age of 44.5 years. The application was designed to provide detailed information about the exhibits at each POI, and participants were asked to verify the accuracy of this information. The study adhered to the ethical standards of the approved protocol (IRB approval number 094/24).

5.3.2 Procedure

Upon entering the museum, visitors were approached and given a brief overview of the research. Interested visitors were then given more detailed information about the study and its procedures and were asked to sign an informed consent form, which included comprehensive details about their role and what was expected of them. Participants were then provided with an Android device pre-loaded with the application. They were instructed to navigate the designated area and interact with at least 15 of the 29 POIs available in the area. For each POI, participants were required to stand in front of the exhibit and point the device’s camera towards it. The application was expected to present information related to the POI. If the information provided was accurate, the interaction was logged as successful. In cases where the information was incorrect, it was logged as a false positive. If the application failed to recognize the POI, participants could try again several times from different angles or distances. All interactions were automatically logged and were also closely monitored and documented by a researcher accompanying the participant. Upon completing their exploration, participants were asked to fill out a System Usability Scale (SUS) questionnaire (Bangor et al. 2009) and to participate in a semi-structured interview.
5.3.3 Data Collection Methods

Data collection was conducted through multiple channels to ensure a comprehensive analysis of the application's performance. The primary source of data was the log from our remote service, which kept a time-stamped record of every interaction, including the images captured, the results of the queries, POI IDs, similarity scores, and response times. This data was crucial for assessing the application's accuracy and performance. Additionally, each participant completed a SUS questionnaire and took part in a semi-structured interview, in which participants also suggested improvements and pointed out features they found beneficial or lacking. This comprehensive data collection strategy allowed us to evaluate the application's effectiveness, identify areas for improvement, and gather user-centric insights to refine the solution.

5.3.4 Data Analysis

Log analysis was used to examine issues related to the identification of the POIs: errors in identification and the time it took to identify a position. For the SUS questionnaire, we used an online SUS calculator (https://blucado.com/sus-calculator/) to compute the usability scores, allowing us to quantitatively assess the usability of our solution. Regarding the semi-structured interviews, most of the questions required yes/no responses. We reviewed all responses, tallying the number of affirmative and negative answers. For the open-ended questions, we performed a qualitative analysis to identify the most frequently mentioned topics. This process involved categorizing and coding the responses to determine common themes and insights. By employing these methods, we ensured a systematic and thorough analysis of both quantitative and qualitative data, providing a comprehensive understanding of the usability and user experience associated with our solution.
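For readers who want to check the SUS arithmetic by hand, the following is a minimal sketch of the standard SUS scoring rule: odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5. It assumes the usual 10-item questionnaire with answers on a 1 to 5 scale; the function names `sus_score` and `mean_sus` are illustrative and are not part of the study's tooling, which relied on the online calculator.

```python
from statistics import mean


def sus_score(responses):
    """Score one 10-item SUS questionnaire; each answer is an integer from 1 to 5."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs exactly ten answers, each between 1 and 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribution is answer - 1.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribution is 5 - answer.
            total += 5 - answer
    # The summed contributions (0..40) are rescaled to the 0..100 range.
    return total * 2.5


def mean_sus(all_responses):
    """Average SUS score over several questionnaires."""
    return mean(sus_score(r) for r in all_responses)


# Made-up answers for illustration only (not data from the study).
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))
print(mean_sus([[5, 1] * 5, [3] * 10]))
```

A questionnaire answered with the most favorable option on every item scores 100, and one answered with the neutral option throughout scores 50.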
5.3.5 Data Availability

The datasets used and/or quantitatively analyzed during the current study are available from the corresponding author on reasonable request. The questionnaires are in Hebrew and on paper; we will look into ways of making them available as well.

5.4 Results

5.4.1 Quantitative Findings

ARIDF Performance (Table 2): In most cases, a short video of no more than 100 frames (less than 1 minute) was sufficient to represent a POI. This indicates that these videos provided enough data for ARIDF to successfully create a representative set that accurately reflects the original video. As shown in Table 2, the minimum number of frames selected to represent a POI was as low as 6, from an initial set of more than 20 frames, demonstrating that many POIs were adequately represented with just 6 frames. On average, ARIDF reduced the number of frames in the input videos by about three-quarters (from 43.678 frames to 10.071 frames). This reduction minimized the storage required for unnecessary data, thereby enhancing the performance of the MS. However, some POIs required longer videos for ARIDF to represent them adequately, resulting in a larger set of frames being stored in the database. In some cases, this number exceeds 40 frames per POI. This is particularly true for POIs that are large or have multiple viewing angles, such as "Mosaic Art," which needed coverage from all angles (see Figure 7). Another example is "De Materia Medica by the Greek," which also has multiple angles, requiring a video that accounts for its three corners and includes overhead coverage.

Table 2: An overview of the image-based representation of the POIs, including the number of frames before applying ARIDF and the number of frames remaining after applying ARIDF. Additionally, the table includes various statistics related to these POIs.
| POI # | POI Name | Frames before ARIDF | Frames after ARIDF | Duration (sec) | % reduction | # of visits | # of errors | # of unrecognized POIs |
|---|---|---|---|---|---|---|---|---|
| 1 | De Materia Medica by the Greek | 157 | 55 | 78 | 65% | 25 | 0 | 1 |
| 2 | Human Illnesses in Ancient Times | 21 | 7 | 10 | 67% | 15 | 0 | 0 |
| 3 | Metal working | 27 | 6 | 13 | 78% | 15 | 1 | 0 |
| 4 | Lost Wax | 27 | 6 | 13 | 78% | 15 | 1 | 0 |
| 5 | Bronze Vessels | 26 | 6 | 13 | 77% | 15 | 0 | 1 |
| 6 | Selection of Metal Objects | 28 | 6 | 14 | 79% | 20 | 0 | 0 |
| 7 | Artifacts made of Iron | 28 | 6 | 14 | 79% | 20 | 0 | 1 |
| 8 | Physician | 90 | 25 | 45 | 72% | 20 | 0 | 1 |
| 9 | Glassmaking-Part1 | 34 | 6 | 17 | 82% | 17 | 0 | 0 |
| 10 | Glassmaking-Part2 | 45 | 6 | 22 | 87% | 18 | 1 | 1 |
| 11 | Producing the Raw Glass-Part1 | 52 | 6 | 25 | 88% | 15 | 1 | 0 |
| 12 | Producing the Raw Glass-Part2 | 38 | 6 | 18 | 84% | 28 | 0 | 0 |
| 13 | Ossuary | 27 | 11 | 13 | 59% | 12 | 0 | 1 |
| 14 | Burial coffin | 43 | 13 | 22 | 70% | 15 | 0 | 0 |
| 15 | Selection of Wooden Objects | 56 | 6 | 27 | 89% | 15 | 0 | 0 |
| 16 | Lead coffin | 42 | 10 | 20 | 76% | 15 | 2 | 1 |
| 17 | Carpenter’s Tools | 71 | 26 | 34 | 63% | 19 | 0 | 0 |
| 18 | Frieze fragment | 27 | 9 | 13 | 67% | 17 | 0 | 0 |
| 19 | Burial coffin (Sarcophagus) | 36 | 8 | 17 | 78% | 15 | 0 | 0 |
| 20 | Hebrew Promissory Note | 25 | 6 | 12 | 76% | 15 | 0 | 0 |
| 21 | Alphabetic Script | 68 | 7 | 32 | 90% | 17 | 1 | 0 |
| 22 | Jewish Tombstone | 30 | 7 | 14 | 77% | 18 | 0 | 0 |
| 23 | Hieroglyphic Script | 36 | 6 | 17 | 83% | 12 | 0 | 0 |
| 24 | Jewish ossuaries | 32 | 7 | 15 | 78% | 19 | 0 | 0 |
| 25 | Stone Vessels Everyday Life | 47 | 6 | 23 | 87% | 21 | 0 | 1 |
| 26 | Tables | 31 | 6 | 15 | 81% | 16 | 0 | 0 |
| 27 | Stone Vessels (Late 2nd Temple Period) | 48 | 6 | 23 | 88% | 18 | 0 | 0 |
| 28 | Stone Jar | 31 | 7 | 15 | 77% | 22 | 0 | 0 |
| 29 | Mosaic Art | 123 | 41 | 61 | 67% | 21 | 0 | 0 |
| | Avg | 43.678 | 10.071 | 22.58 | | | | |
| | Min | 21 | 6 | 10 | | | | |
| | Max | 157 | 55 | 78 | | | | |
| | STD | 27.386 | 10.194 | 15.281 | | | | |

3. Logs Analysis: As previously mentioned, we implemented a comprehensive logging mechanism to monitor every action within the system. Additionally, we stored all snapshots captured by users to query the system. This setup enabled us to thoroughly track each request, including its status, answer and progression, providing valuable insights into system performance and user interactions.
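As an illustration of how such logs can be turned into the counts reported in this section, the sketch below tallies correct, wrong, and unrecognized queries and averages the response time. The record layout is an assumption made for this example: the class `QueryLog`, its field names, and the function `summarize` are hypothetical and do not describe the actual log schema of the service.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class QueryLog:
    """One logged location query (all field names are illustrative assumptions)."""
    participant: int
    true_poi: int                  # POI the visitor was facing, as noted by the researcher
    predicted_poi: Optional[int]   # POI returned by the service, None if nothing was recognized
    similarity: float              # similarity score of the best match
    response_ms: float             # time from request to response, in milliseconds


def summarize(logs):
    """Count query outcomes and average the response time over QueryLog records."""
    if not logs:
        raise ValueError("no log records to summarize")
    unrecognized = sum(1 for entry in logs if entry.predicted_poi is None)
    correct = sum(1 for entry in logs if entry.predicted_poi == entry.true_poi)
    wrong = len(logs) - correct - unrecognized
    return {
        "queries": len(logs),
        "correct": correct,
        "wrong": wrong,
        "unrecognized": unrecognized,
        "mean_response_ms": mean(entry.response_ms for entry in logs),
    }


# Made-up records for illustration only (not data from the study).
example_logs = [
    QueryLog(1, 3, 3, 0.91, 850.0),
    QueryLog(1, 10, 11, 0.74, 900.0),
    QueryLog(1, 5, None, 0.40, 950.0),
]
print(summarize(example_logs))
```

Keeping the researcher-noted POI next to the predicted one in each record is what makes it possible to separate wrong recognitions from unrecognized queries after the fact.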
Participant Perspective (Table 3):

Table 3: Summary of the experiment from the participants' perspective. For each participant, the table lists the number of POIs visited, the number of search tries, the number of POIs that were not detected at all, the number of POIs that were detected wrongly, and statistics on the number of tries needed per successfully recognized POI.

| Participant # | # of Visited POIs | # of Successful Searching Tries | # of Unrecognized POIs | # of Wrongly recognized POIs | Tries per successfully recognized POI: Avg | Min | Max | STD |
|---|---|---|---|---|---|---|---|---|
| 1 | 25 | 27 | 1 | 0 | 1.08 | 1 | 2 | 0.34 |
| 2 | 15 | 19 | 0 | 0 | 1.27 | 1 | 3 | 0.59 |
| 3 | 15 | 20 | 0 | 1 | 1.33 | 1 | 3 | 0.65 |
| 4 | 15 | 17 | 0 | 1 | 1.13 | 1 | 2 | 0.43 |
| 5 | 15 | 16 | 1 | 0 | 1.07 | 1 | 2 | 0.36 |
| 6 | 20 | 25 | 0 | 0 | 1.25 | 1 | 2 | 0.44 |
| 7 | 20 | 22 | 1 | 0 | 1.10 | 1 | 2 | 0.37 |
| 8 | 20 | 22 | 1 | 0 | 1.10 | 1 | 3 | 0.55 |
| 9 | 17 | 22 | 0 | 0 | 1.29 | 1 | 2 | 0.47 |
| 10 | 18 | 20 | 1 | 1 | 1.11 | 1 | 3 | 0.58 |
| 11 | 15 | 15 | 0 | 1 | 0.93 | 1 | 1 | 0.00 |
| 12 | 28 | 36 | 0 | 0 | 1.29 | 1 | 2 | 0.46 |
| 13 | 12 | 13 | 1 | 0 | 1.08 | 1 | 3 | 0.60 |
| 14 | 15 | 16 | 0 | 0 | 1.07 | 1 | 2 | 0.26 |
| 15 | 15 | 17 | 0 | 0 | 1.13 | 1 | 2 | 0.35 |
| 16 | 15 | 15 | 1 | 2 | 1.00 | 1 | 2 | 0.38 |
| 17 | 19 | 25 | 0 | 0 | 1.32 | 1 | 2 | 0.48 |
| 18 | 17 | 20 | 0 | 0 | 1.18 | 1 | 3 | 0.53 |
| 19 | 15 | 15 | 0 | 0 | 1.00 | 1 | 1 | 0.00 |
| 20 | 15 | 20 | 0 | 0 | 1.33 | 1 | 3 | 0.62 |
| 21 | 17 | 21 | 0 | 1 | 1.24 | 1 | 3 | 0.60 |
| 22 | 18 | 19 | 0 | 0 | 1.06 | 1 | 2 | 0.24 |
| 23 | 12 | 13 | 0 | 0 | 1.08 | 1 | 2 | 0.29 |
| 24 | 19 | 23 | 0 | 0 | 1.21 | 1 | 2 | 0.42 |
| 25 | 21 | 22 | 1 | 0 | 1.05 | 1 | 2 | 0.22 |
| 26 | 16 | 17 | 0 | 0 | 1.06 | 1 | 2 | 0.25 |
| 27 | 18 | 21 | 0 | 0 | 1.17 | 1 | 2 | 0.38 |
| 28 | 22 | 26 | 0 | 0 | 1.18 | 1 | 2 | 0.39 |
| 29 | 21 | 26 | 0 | 0 | 1.24 | 1 | 3 | 0.54 |
| 30 | 16 | 16 | 0 | 0 | 1.00 | 1 | 1 | 0.00 |
| Sum | 526 | 606 | 8 | 7 | | | | |
| Avg | 17.53 | 20.17 | 0.27 | 0.23 | | | | |
| Min | 12.00 | 13.00 | 0.00 | 0.00 | | | | |
| Max | 28.00 | 36.00 | 1.00 | 2.00 | | | | |
| STD | 3.54 | 4.96 | 0.45 | 0.50 | | | | |

According to Table 3, participants varied in how engaging they found the application. For instance, 8 out of 30 participants visited at least 20 POIs, indicating a higher level of engagement. Conversely, some participants found the application less engaging: although they were asked to visit at least 15 POIs, 2 participants visited only 12. However, the average number of POIs visited across all participants was about 17.5, which exceeds the minimum requirement.
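The engagement figures quoted above can be re-derived directly from the "# of Visited POIs" column of Table 3. The short sketch below does so; the list is copied from the table, while the function name `engagement_summary` and its thresholds are illustrative choices for this example.

```python
from statistics import mean

# Number of POIs visited by participants 1..30, copied from Table 3.
VISITED = [25, 15, 15, 15, 15, 20, 20, 20, 17, 18,
           15, 28, 12, 15, 15, 15, 19, 17, 15, 15,
           17, 18, 12, 19, 21, 16, 18, 22, 21, 16]


def engagement_summary(visited, minimum=15, high=20):
    """Summarize visit counts against the requested minimum and a 'high engagement' mark."""
    return {
        "total_visits": sum(visited),
        "mean_visits": mean(visited),
        "at_least_high": sum(1 for v in visited if v >= high),
        "below_minimum": sum(1 for v in visited if v < minimum),
    }


summary = engagement_summary(VISITED)
print(summary)
```

The totals agree with the Sum and Avg rows of Table 3: 526 visits overall, a mean of about 17.53, 8 participants at or above 20 POIs, and 2 participants below the requested 15.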
The application demonstrated high reliability, with users needing fewer than 2 attempts on average to receive a positive response (see Table 3). In fact, some participants consistently received a positive response on their first attempt throughout the entire experiment. There were also differences in how easily participants could use the application. For example, participant number 19 visited 15 POIs and received a positive response on the first try for all of them, while participant number 3 had less success, needing up to 3 tries for a single POI. Overall, there was little variation across participants in the number of POIs visited (17.53 on average) and in the number of search tries (20.17 on average). Additionally, our analytics reveal a low incidence of errors: only 8 of the 30 participants encountered a situation where a POI was not recognized at all (0.27 instances per participant on average), and there were 7 instances of incorrect recognition, spread over 6 participants (0.23 per participant on average). This suggests that the application was generally accurate and user-friendly.

POI Perspective (Table 2): The Points of Interest (POIs) were not equally easy to represent: some POIs required more effort to be adequately represented in our database, leading to the need for longer videos compared to others. For example, the POI "De Materia Medica by the Greek" had a high average number of attempts needed for successful recognition, indicating that participants had to try more often than with other POIs to achieve a correct recognition. This suggests that this POI was particularly challenging. To address this issue, we uploaded an additional video of this specific POI to our database, which increased the likelihood of accurate detection. Another issue worth mentioning occurred with the POI "Glassmaking-Part 2." As indicated in the table, this POI was incorrectly recognized twice, which is relatively high compared to other POIs. In both cases, the POI was mistakenly identified as "Producing the Raw Glass-Part 1."
This error is understandable, as these two POIs appear quite similar (see Figure 8), leading to confusion and incorrect recognition. Although we attempted to resolve this by uploading more videos for each of these POIs, the solution was not particularly effective. Thus, determining the best approach to handle such cases remains an open question.

4. SUS score: The average SUS score for all 30 participants was 87.08, which is classified as an excellent rating according to Bangor et al. (2009): a score within the range of approximately 85 to 100 is typically deemed "Excellent," signifying a high level of perceived usability for the system. This result indicates that users found the system highly intuitive and easy to use, requiring minimal effort to learn and navigate.

5.4.2 Qualitative Findings

The qualitative findings from the semi-structured interviews provide valuable insights into the user experience and effectiveness of the system. Below is a summary of the responses to each interview question:

5. Errors in Location Identification: Among the 30 participants, there were only 7 instances of incorrect location identification out of more than 500 requests. These findings were discussed in the quantitative findings section.

6. Total Number of Location Queries: The participants made a total of approximately 559 location queries, averaging about 17.5 requests per user, which is above the required minimum of 15. This suggests that users were actively engaging with the system.

7. Perceived Response Time: Regarding the system's response time, 1 user reported that the response time was too long, 10 users indicated that it was "a little" too long, while the remaining 19 users felt that the response time was satisfactory.
This feedback suggests that while most users were satisfied with the response time, there is room for improvement to enhance the user experience.

8. Instances of Failed Location Identification: Across all 30 participants, there were only 8 instances where the system failed to identify a location out of more than 500 requests. These findings were discussed in the quantitative findings section.

9. Clarity on Location Search Initiation: When asked if it was clear when the application started searching for their location, 29 participants answered "yes," while 1 participant responded "no." This indicates that the majority of users found the system's cues for initiating location searches to be clear and understandable.

10. Clarity on Location Identification: All 30 participants responded "yes" when asked if it was clear when their location was identified. This unanimous positive feedback highlights the effectiveness of the system in communicating successful location identification to users.

11. General Feedback on System Use: The open-ended question on general feedback yielded predominantly positive responses, such as "Easy to use," "Great app," "Encouraging to visit the museum," "Working great," "Recommended app," and "Understandable flow." However, two users mentioned that they needed a little help or guidance when first using the app, suggesting a potential area for improving the onboarding experience.

12. Suggestions for Improvement: Participants also provided constructive feedback for enhancing the system. The most frequently mentioned suggestions included support for different languages (8 mentions) and audio description (8 mentions), faster response times (4 mentions), improvements to the app's design (3 mentions), clearer instructions (2 mentions), enriching descriptions with images and links (1 mention) and making the system more interactive by answering user questions (1 mention).
These suggestions offer valuable insights into potential areas for future development.

6 Discussion

The experiment conducted at the Hecht Museum provides a comprehensive evaluation of the proposed application's usability, comfort, quality, and accuracy in delivering a seamless user experience within a real-world setting. The results, both quantitative and qualitative, offer significant insights into the application's performance and areas for future enhancement.

6.1 Accuracy Evaluation

One of the primary objectives of this experiment was to evaluate the application's accuracy in identifying and providing information about various Points of Interest (POIs). As outlined in the quantitative findings section, the results are highly promising, with only 7 instances of incorrect location identification out of more than 500 queries and just 8 instances of failed location identification. These low error rates demonstrate the application's reliability and accuracy in delivering the correct information, which is crucial for maintaining user trust and satisfaction.

The quantitative findings also indicate that most issues occurred with specific POIs, suggesting that certain POIs are more challenging due to factors such as lighting effects, size, position, structure, and more. For example, "De Materia Medica by the Greek" presents unique challenges as it is of moderate height, allowing for multiple angles of photography, such as from above or the side. Creating a comprehensive video that captures all these angles proved difficult. Initially, we uploaded a video that was not sufficiently comprehensive, leading to several recognition failures. We resolved this by later enriching the database with better representative videos. As a result, as seen in Table 2, this POI was ultimately represented by a larger video consisting of 157 frames. This solution proved effective, as no participants encountered issues with this POI afterward.
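To make the failure modes discussed above concrete, the following is a minimal sketch of nearest-neighbor matching over stored feature vectors. It is not the system's actual implementation: the use of cosine similarity, the acceptance threshold of 0.8, and the names `cosine_similarity` and `identify_poi` are all assumptions chosen for illustration. A query below the threshold is reported as unrecognized, and adding reference vectors for a POI (the database enrichment described above) can turn such a failure into a match.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_poi(query, references, threshold=0.8):
    """Return (poi, score) for the best-matching stored vector.

    `references` maps a POI name to the list of feature vectors stored for it.
    The POI is None when the best score falls below the (assumed) threshold,
    which corresponds to an unrecognized query.
    """
    best_poi, best_score = None, -1.0
    for poi, vectors in references.items():
        for vector in vectors:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_poi, best_score = poi, score
    if best_score < threshold:
        return None, best_score
    return best_poi, best_score


# Toy 3-dimensional vectors for illustration only; real features are much longer.
references = {"A": [[1.0, 0.0, 0.0]], "B": [[0.0, 1.0, 0.0]]}
print(identify_poi([0.9, 0.1, 0.0], references))   # close to the stored view of A
print(identify_poi([0.0, 0.0, 1.0], references))   # a viewing angle nobody recorded

# Enriching the database with a frame taken from the missing angle.
references["A"].append([0.0, 0.0, 1.0])
print(identify_poi([0.0, 0.0, 1.0], references))
```

In this toy setting the second query is rejected until a reference frame from the missing angle is stored. The sketch also hints at why two visually similar POIs are harder to fix this way: additional frames raise the scores for both candidates rather than separating them.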
Another challenge arose when two POIs were very similar, such as "Glassmaking-Part 2" and "Producing the Raw Glass-Part 1." The visual similarity between these two POIs led to identification errors for the former. This issue underscores the complexity of accurately distinguishing between similar exhibits. Our experiment was conducted in a single area of the museum, raising further questions about the application's performance if deployed across the entire museum or in larger museums. The strategy of enriching the database with longer videos did not significantly improve the results in these cases, leaving this area an open question for future research.

6.2 Quality and Performance

While the response time was generally satisfactory for most users, it was identified as an area with potential for improvement. Nineteen participants reported being satisfied with the response time; however, 11 participants felt that it was either too long or slightly too long. Improving the application's responsiveness could further enhance the user experience, minimizing moments of frustration or disengagement.

The application demonstrated good quality and performance, as evidenced by its low error rate and the low average number of attempts required to obtain a positive result (see Table 3 for the average number of visited POIs and the average number of attempts needed to successfully identify a POI). Additionally, the relatively good response time contributed to a positive user experience. While there is still room for improvement, the application generally meets the desired quality and performance standards.

6.3 Usability Evaluation

The System Usability Scale (SUS) score of 87.08, categorized as "Excellent," is a strong indicator of the application's high usability. This score not only reflects the ease with which participants navigated the application but also suggests that the application requires minimal effort to learn and use, which is crucial for user adoption and satisfaction.
The SUS score places the application in the top tier of systems, making it highly competitive in the realm of digital tools designed for museum environments or similar contexts.

A key aspect of this experiment was the intentional targeting of older visitors; the average age was around 44.5 among the 16 participants for whom demographic data was collected. This demographic is often less comfortable with using new technologies, making their feedback particularly valuable in assessing the application's usability. The positive responses from this group underscore the success of the application in delivering a user-friendly experience that is accessible even to those who might typically struggle with technology. Comments such as "Easy to use," "Great app," and "Recommended app" highlight the application's effectiveness in overcoming common barriers faced by older users. The high level of engagement, as evidenced by the average of 17.5 location queries per participant, indicates that users found the application both intuitive and engaging, further validating its usability.

6.4 Comfort Assessment

Comfort, both physical and cognitive, is another critical aspect of user experience. The experiment's results indicate that the application successfully provides a comfortable user experience, with users able to easily interact with the system and understand its functionality. The fact that 29 out of 30 participants found the location search initiation cues clear, and all participants understood when their location was identified, underscores the clarity and accessibility of the application’s interface.

6.5 User Enjoyment and Engagement

The experiment also highlighted the application's success in engaging users, as demonstrated by the active participation and the generally positive feedback. The application not only facilitated an informative and enjoyable experience but also encouraged participants to explore the museum in greater depth.
Comments like "Encouraging to visit the museum" suggest that the application has the potential to significantly enhance visitor engagement and satisfaction, which is a key goal of the solution.

The constructive feedback provided by participants offers valuable insights into areas for further development. The most common suggestions, such as the need for faster response times, more interactive features, and support for additional languages, provide a clear roadmap for future iterations of the application. By addressing these areas, the application can evolve to better meet the needs of a broader audience and provide an even more enriched user experience.

6.6 Limitations

While the experiment yielded valuable insights, it is important to acknowledge its limitations. The study was conducted within a specific area of the Hecht Museum, focusing on the "Ancient Crafts and Industries" section with a limited number of POIs. This restricted scope may not fully represent the application's performance in other museum settings or with a broader range of exhibits. Additionally, the application relies on wireless or device connectivity to upload requests and download responses, a feature that may not be available in every indoor environment. This dependence on connectivity could limit the app's effectiveness in settings with poor or no network coverage. Future research should explore the application’s effectiveness in larger and more diverse museums, where it may encounter different challenges in terms of layout, exhibit types, and visitor interactions, as well as varying levels of connectivity.

6.7 Contribution to the Field of Indoor Localization in Cultural Heritage Sites

The field of indoor localization within cultural heritage sites presents unique challenges that differ significantly from those encountered in other environments.
Unlike conventional indoor localization systems, which are typically designed for locations such as shopping malls or office buildings, cultural heritage sites like museums and historical landmarks require localization solutions that are not only accurate but also sensitive to the context and significance of the environment. These sites often feature complex layouts, dense exhibits, and a rich array of POIs, all of which must be navigated in a manner that enhances, rather than detracts from, the visitor's experience.

One of the primary difficulties in developing indoor localization systems for cultural heritage sites is the need to balance technological innovation with the preservation of the site's integrity. Any solution must be unobtrusive, ensuring that the technology does not overshadow the cultural and educational value of the exhibits. Additionally, the system must be robust enough to handle the unique architectural features of heritage sites, such as thick walls, varied room sizes, and often limited or inconsistent lighting conditions. These factors can significantly impact the performance of traditional localization technologies, making it difficult to achieve the level of accuracy and reliability required.

Our proposed application addresses these challenges by integrating an image-based location detection service into a user-friendly interface that is tailored to the needs of cultural heritage sites. The application’s ability to accurately identify and provide information about various POIs, even in the complex environment of the Hecht Museum, demonstrates its potential as a valuable tool for indoor localization in similar contexts. By utilizing the camera on a mobile device, the application minimizes the need for additional infrastructure, such as beacons or Wi-Fi triangulation, which can be intrusive or challenging to deploy in historical settings.
7 Conclusion and Future Directions

The experiment conducted at the Hecht Museum has provided substantial evidence of the potential of image-based positioning. The results demonstrate that an image-based positioning system can be effectively designed to create a robust and user-friendly application, as indicated by the high SUS score and positive feedback from participants.

The experiment also identified several areas for future development to enhance the application's usability and effectiveness. The development team should focus on improving the application's response times, expanding language support, and adding more interactive features to further boost user satisfaction. By addressing the feedback gathered from this experiment, the application can continue to evolve and maintain a competitive edge in the digital tools market for educational and cultural institutions.

Another promising direction for future work is the integration of audio descriptions, which could either replace or complement text descriptions. This feature would make the application more accessible to users with visual impairments or those who prefer auditory information, thereby enhancing the overall user experience and broadening the application’s appeal across different demographic groups.

Finally, expanding the application’s scope to function effectively in larger and more diverse museum environments is essential. Future research should investigate how the application performs in different settings, with various exhibit types and layouts, to ensure its versatility and adaptability within the cultural heritage sector.

During the preparation of this work the author(s) used ChatGPT in order to enhance the text and improve clarity. After using this tool, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.
Declarations

Author Contribution

Bashar Egbariya developed the system, carried out the experimentation, analyzed the results and drafted the initial version of the manuscript. Rotem Dror guided and supervised the use of LLM for content generation for the mobile guide. Tsvi Kuflik was the initiator of the study, guided the project, took active part in the evaluation and iterative development of the system, guided the analysis of the results and revised, reviewed and reformatted the manuscript. Ilan Shimshoni led and guided the machine vision aspects, took an active part in analyzing the results and reviewing the manuscript.

References

Abdulateef, A. T., & Makki, S. A. (2023, December). A survey of indoor positioning system based-smartphone. In AIP Conference Proceedings (Vol. 2977, No. 1). AIP Publishing. Ashraf, I., Hur, S., & Park, Y. (2019 a). Application of deep convolutional neural networks and smartphone sensors for indoor localization. Applied Sciences, 9(11), 2337. Ashraf, I., Hur, S., Park, S., & Park, Y. (2019 b). DeepLocate: Smartphone based indoor localization with a deep neural network ensemble classifier. Sensors, 20(1), 133. Bangor, A., Kortum, P., & Miller, J. (2009). Determining what individual SUS scores mean: Adding an adjective rating scale. Journal of usability studies, 4(3), 114-123. Barbieri, L., Brambilla, M., Trabattoni, A., Mervic, S., & Nicoli, M. (2021). UWB localization in a smart factory: Augmentation methods and experimental assessment. IEEE Transactions on Instrumentation and Measurement, 70, 1-18. Barbour, N., & Schmidt, G. (2001). Inertial sensor technology trends. IEEE Sensors journal, 1(4), 332-339. Basiri, A., Lohan, E. S., Moore, T., Winstanley, A., Peltola, P., Hill, C., ... & e Silva, P. F. (2017). Indoor location based services challenges, requirements and usability of current solutions. Computer Science Review, 24, 1-12. Benini, L., Farella, E., & Guiducci, C. (2006).
Wireless sensor networks: Enabling technology for ambient intelligence. Microelectronics journal, 37(12), 1639-1649. Brusch, I. (2022). Identification of travel styles by learning from consumer-generated images in online travel communities. Information & Management, 59(6), 103682. Butun, I., Österberg, P., & Gidlund, M. (2019, June). Preserving location privacy in cyber-physical systems. In 2019 IEEE Conference on Communications and Network Security (CNS) (pp. 1-6). IEEE. Cavallari, T., Golodetz, S., Lord, N. A., Valentin, J., Prisacariu, V. A., Di Stefano, L., & Torr, P. H. (2019). Real-time RGB-D camera pose estimation in novel scenes using a relocalisation cascade. IEEE transactions on pattern analysis and machine intelligence, 42(10), 2465-2477. Chen, R., & Chen, L. (2021). Smartphone-based indoor positioning technologies. Urban informatics, 467-490. Chintalapudi, K., Padmanabha Iyer, A., & Padmanabhan, V. N. (2010, September). Indoor localization without the pain. In Proceedings of the sixteenth annual international conference on Mobile computing and networking (pp. 173-184). Conte, G., & Doherty, P. (2008, March). An integrated UAV navigation system based on aerial image matching. In 2008 IEEE Aerospace Conference (pp. 1-10). IEEE. Correa, A., Barcelo, M., Morell, A., & Vicario, J. L. (2017). A review of pedestrian indoor positioning systems for mass market applications. Sensors, 17(8), 1927. Dabove, P., Di Pietra, V., Piras, M., Jabbar, A. A., & Kazim, S. A. (2018, April). Indoor positioning using Ultra-wide band (UWB) technologies: Positioning accuracies and sensors' performances. In 2018 IEEE/ION Position, Location and Navigation Symposium (PLANS) (pp. 175-184). IEEE. Davidson, P., & Piché, R. (2016). A survey of selected indoor positioning methods for smartphones. IEEE Communications surveys & tutorials, 19(2), 1347-1370. Dong, E., Xu, J., Wu, C., Liu, Y., & Yang, Z. (2019, April). Pair-navi: Peer-to-peer indoor navigation with mobile visual slam. 
In IEEE INFOCOM 2019-IEEE conference on computer communications (pp. 1189-1197). IEEE. AAA removed for anonymization El-Sheimy, N., & Li, Y. (2021). Indoor navigation: State of the art and future trends. Satellite Navigation, 2(1), 7. Elmenreich, W. (2002). An introduction to sensor fusion. Vienna University of Technology, Austria, 502, 1-28. Feldmann, S., Kyamakya, K., Zapater, A., & Lue, Z. (2003, June). An Indoor Bluetooth-Based Positioning System: Concept, Implementation and Experimental Evaluation. In International conference on wireless networks (Vol. 272). Ghasemi, Y., Jeong, H., Choi, S. H., Park, K. B., & Lee, J. Y. (2022). Deep learning-based object detection in augmented reality: A systematic review. Computers in Industry, 139, 103661. Ghouaiel, N., Garbaya, S., Cieutat, J. M., & Jessel, J. P. (2017). Mobile augmented reality in museums: towards enhancing visitor's learning experience. International Journal of Virtual Reality, 17(1), 21-31. Gu, Y., Chen, M., Ren, F., & Li, J. (2016, April). HED: Handling environmental dynamics in indoor WiFi fingerprint localization. In 2016 IEEE wireless communications and networking conference (pp. 1-6). IEEE. Gu, Y., Lo, A., & Niemegeers, I. (2009). A survey of indoor positioning systems for wireless personal networks. IEEE Communications surveys & tutorials, 11(1), 13-32. Guo, G., Chen, R., Ye, F., Peng, X., Liu, Z., & Pan, Y. (2019). Indoor smartphone localization: A hybrid WiFi RTT-RSS ranging approach. Ieee Access, 7, 176767-176781. Gupta, P., Sharma, V., Gairolla, J., Thakur, U., Pandey, N., Khurana, D., & Ramavat, A. S. (2024). Mobile Based Indoor Hospital Navigation System for Tertiary Care Setup: A Scoping Review. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778). He, S., & Chan, S. H. G. (2015). Wi-Fi fingerprint-based indoor positioning: Recent advances and comparisons. 
IEEE Communications Surveys & Tutorials, 18(1), 466-490. Hsu, H. H., Chang, J. K., Peng, W. J., Shih, T. K., Pai, T. W., & Man, K. L. (2018). Indoor localization and navigation using smartphone sensory data. Annals of Operations Research, 265, 187-204. Huang, H., & Gartner, G. (2010). A survey of mobile indoor navigation systems (pp. 305-319). Springer Berlin Heidelberg. Jackermeier, R., & Ludwig, B. (2018). Exploring the limits of PDR-based indoor localisation systems under realistic conditions. Journal of Location Based Services, 12(3-4), 231-272. Jahne, B. (Ed.). (2000). Computer vision and applications: a guide for students and practitioners. Elsevier. Jamshidi, S., Ensafi, M., & Pati, D. (2020). Wayfinding in interior environments: An integrative review. Frontiers in Psychology, 11, 549628. Jiang, Y., Zheng, X., & Feng, C. (2023). Toward Multi-area Contactless Museum Visitor Counting with Commodity WiFi. ACM Journal on Computing and Cultural Heritage, 16(1), 1-26. Jo, H. J., & Kim, S. (2018). Indoor smartphone localization based on LOS and NLOS identification. Sensors, 18(11), 3987. Kárník, J., & Streit, J. (2016). Summary of available indoor location techniques. IFAC-PapersOnLine, 49(25), 311-317. Kim Geok, T., Zar Aung, K., Sandar Aung, M., Thu Soe, M., Abdaziz, A., Pao Liew, C., ... & Yong, W. H. (2020). Review of indoor positioning: Radio wave technology. Applied Sciences, 11(1), 279. Kim, J., Lee, S., & Kim, H. (2018). A survey on computer vision-based indoor localization methods. Sensors, 18(10), 3234. Klette, R. (2014). Concise computer vision (Vol. 233, pp. 2-1). London: Springer. Kolivand, H., El Rhalibi, A., Tajdini, M., Abdulazeez, S., & Praiwattana, P. (2018). Cultural heritage in marker-less augmented reality: A survey. In Advanced methods and new materials for cultural heritage preservation. IntechOpen. Kuo, Y. S., Pannuto, P., Hsiao, K. J., & Dutta, P. (2014, September). Luxapose: Indoor positioning with mobile phones and visible light. 
In Proceedings of the 20th annual international conference on Mobile computing and networking (pp. 447-458). Li, Q., Zhu, J., Liu, T., Garibaldi, J., Li, Q., & Qiu, G. (2017, November). Visual landmark sequence-based indoor localization. In Proceedings of the 1st Workshop on Artificial Intelligence and Deep Learning for Geographic Knowledge Discovery (pp. 14-23). Li, Z., Liu, F., Yang, W., Peng, S., & Zhou, J. (2021). A survey of convolutional neural networks: analysis, applications, and prospects. IEEE transactions on neural networks and learning systems, 33(12), 6999-7019. Liu, M., Cheng, L., Qian, K., Wang, J., Wang, J., & Liu, Y. (2020). Indoor acoustic localization: A survey. Human-centric Computing and Information Sciences, 10, 1-24. Liu, T., Zhang, X., Li, Q., & Fang, Z. (2017). A visual-based approach for indoor radio map construction using smartphones. Sensors, 17(8), 1790. Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International journal of computer vision, 60, 91-110. Lymberopoulos, D., & Liu, J. (2017). The microsoft indoor localization competition: Experiences and lessons learned. IEEE Signal Processing Magazine, 34(5), 125-140. Martinez del Horno, M., García-Varea, I., & Orozco Barbosa, L. (2019). Calibration of Wi-Fi-based indoor tracking systems for Android-based smartphones. Remote Sensing, 11(9), 1072. Meliones, A., & Sampson, D. (2018). Blind MuseumTourer: A system for self-guided tours in museums and blind indoor navigation. Technologies, 6(1), 4. Misra, P. (2006). Global positioning system: Signals, measurements, and performance. Ganga-Jamuna Press. BBB removed for anonymization Morar, A., Moldoveanu, A., Mocanu, I., Moldoveanu, F., Radoi, I. E., Asavei, V., ... & Butean, A. (2020). A comprehensive survey of indoor localization methods based on computer vision. Sensors, 20(9), 2641. Moreno, A., & Angulo, I. (2012a). 
A Reliable ICT Solution for Organ Transport Traceability and Incidences Reporting Based on Sensor Networks and Wireless Technologies. In Distributed Computing and Artificial Intelligence: 9th International Conference (pp. 395-402). Springer Berlin Heidelberg. Moreno, A., Angulo, I., Perallos, A., Landaluce, H., Zuazola, I. J. G., Azpilicueta, L., ... & Villadangos, J. (2012b). IVAN: Intelligent van for the distribution of pharmaceutical drugs. Sensors, 12(5), 6587-6609. Morley, S. K., Sullivan, J. P., Carver, M. R., Kippen, R. M., Friedel, R. H. W., Reeves, G. D., & Henderson, M. G. (2017). Energetic particle data from the global positioning system constellation. Space Weather, 15(2), 283-289. Morris, T. (2004). Computer vision and image processing. Palgrave Macmillan Ltd. Mur-Artal, R., Montiel, J. M. M., & Tardos, J. D. (2015). ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE transactions on robotics, 31(5), 1147-1163. Naser, R. S., Lam, M. C., Qamar, F., & Zaidan, B. B. (2023). Smartphone-based indoor localization systems: A systematic literature review. Electronics, 12(8), 1814. O'Shea, K. (2015). An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458. Piras, M., Lingua, A., Dabove, P., & Aicardi, I. (2014, May). Indoor navigation using Smartphone technology: A future challenge or an actual possibility? In 2014 IEEE/ION Position, Location and Navigation Symposium-PLANS 2014 (pp. 1343-1352). IEEE. Podevijn, N., Plets, D., Trogh, J., Karaagac, A., Haxhibeqiri, J., Hoebeke, J., ... & Joseph, W. (2018, September). Performance comparison of RSS algorithms for indoor localization in large open environments. In 2018 International Conference on Indoor Positioning and Indoor Navigation (IPIN) (pp. 1-6). IEEE. Poulose, A., & Han, D. S. (2019a). Hybrid indoor localization using IMU sensors and smartphone camera. Sensors, 19(23), 5084. Poulose, A., Eyobu, O. S., & Han, D. S. (2019b). 
An indoor position-estimation algorithm using smartphone IMU sensor data. IEEE Access, 7, 11165-11177. Pundir, A. K., Jagannath, J. D., & Ganapathy, L. (2019, January). Improving supply chain visibility using IoT-internet of things. In 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC) (pp. 0156-0162). IEEE. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., ... & Sutskever, I. (2021, July). Learning transferable visual models from natural language supervision. In International conference on machine learning (pp. 8748-8763). PMLR. Ravi, N., Shankar, P., Frankel, A., Elgammal, A., & Iftode, L. (2005, August). Indoor localization using camera phones. In Seventh IEEE Workshop on Mobile Computing Systems & Applications (WMCSA'06 Supplement) (pp. 1-7). IEEE. Renaudin, V., Yalak, O., Tomé, P., & Merminod, B. (2007). Indoor navigation of emergency agents. European Journal of Navigation, 5(3), 36-45. Rublee, E., Rabaud, V., Konolige, K., & Bradski, G. (2011, November). ORB: An efficient alternative to SIFT or SURF. In 2011 International conference on computer vision (pp. 2564-2571). IEEE. Sawaby, A. M., Noureldin, H. M., Mohamed, M. S., Omar, M. O., Shaaban, N. S., Ahmed, N. N., ... & Mostafa, H. (2019, May). A smart indoor navigation system over BLE. In 2019 8th International Conference on Modern Circuits and Systems Technologies (MOCAST) (pp. 1-4). IEEE. Shenoy, A., & Thillaiarasu, N. (2022, March). A survey on different computer vision based human activity recognition for surveillance applications. In 2022 6th International Conference on Computing Methodologies and Communication (ICCMC) (pp. 1372-1376). IEEE. Stock, O., Zancanaro, M., Busetta, P., Callaway, C., Krüger, A., Kruppa, M., ... & Rocchi, C. (2007). Adaptive, intelligent presentation of information for the museum visitor in PEACH. User Modeling and User-Adapted Interaction, 17, 257-304. Stockman, G., & Shapiro, L. G. (2001). Computer vision. Prentice Hall PTR. 
Syahidi, A. A., Kiyokawa, K., & Okura, F. (2023, October). Computer Vision in Smart City Application: A Mapping Review. In 2023 6th International Conference on Applied Computational Intelligence in Information Systems (ACIIS) (pp. 1-6). IEEE. Tan, S. Y., Lee, K. J., & Lam, M. C. (2020). A Shopping Mall Indoor Navigation Application using Wi-Fi Positioning System. International Journal, 9(4). Trichopoulos, G., Konstantakis, M., Caridakis, G., Katifori, A., & Koukouli, M. (2023). Crafting a Museum Guide Using ChatGPT4. Big Data and Cognitive Computing, 7(3), 148. Varalatchoumy, M., Divakaran, S., & Ram, R. A. (2023, May). Foodflare: An Indoor Navigation System. In International Conference on Applications of Machine Intelligence and Data Analytics (ICAMIDA 2022) (pp. 722-734). Atlantis Press. Villaespesa, E., & Crider, S. (2021). Computer vision tagging the metropolitan museum of art's collection: A comparison of three systems. Journal on Computing and Cultural Heritage (JOCCH), 14(3), 1-17. Wang, B., Liu, K., & Zhao, J. (2016, August). Inner attention based recurrent neural networks for answer selection. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1288-1297). Wang, S. S. (2018). A BLE-based pedestrian navigation system for car searching in indoor parking garages. Sensors, 18(5), 1442. Wang, X., Gao, L., Mao, S., & Pandey, S. (2016). CSI-based fingerprinting for indoor localization: A deep learning approach. IEEE transactions on vehicular technology, 66(1), 763-776. CCC removed for anonymization Wu, J. (2017). Introduction to convolutional neural networks. National Key Lab for Novel Software Technology. Nanjing University. China, 5(23), 495. Xie, L., Lee, F., Liu, L., Kotani, K., & Chen, Q. (2020). Scene recognition: A comprehensive survey. Pattern Recognition, 102, 107205. Xingli, G., Yaning, L., & Ruihui, Z. (2018, March). Indoor positioning technology based on deep neural networks. 
In 2018 Ubiquitous Positioning, Indoor Navigation and Location-Based Services (UPINLBS) (pp. 1-6). IEEE. Xiong, Z., Sottile, F., Spirito, M. A., & Garello, R. (2011, February). Hybrid indoor positioning approaches based on WSN and RFID. In 2011 4th IFIP International Conference on New Technologies, Mobility and Security (pp. 1-5). IEEE. Yang, S., Ma, L., Jia, S., & Qin, D. (2020). An improved vision-based indoor positioning method. IEEE Access, 8, 26941-26949. Yang, Z., Wu, C., & Liu, Y. (2012, August). Locating in fingerprint space: Wireless indoor localization with little human intervention. In Proceedings of the 18th annual international conference on Mobile computing and networking (pp. 269-280). Yao, Y., Pan, L., Fen, W., Xu, X., Liang, X., & Xu, X. (2020). A robust step detection and stride length estimation for pedestrian dead reckoning using a smartphone. IEEE Sensors Journal, 20(17), 9685-9697. Ye, H., Chen, Y., & Liu, M. (2019, May). Tightly coupled 3d lidar inertial odometry and mapping. In 2019 International Conference on Robotics and Automation (ICRA) (pp. 3144-3150). IEEE. Yin, Y., Yu, F., Xu, Y., Yu, L., & Mu, J. (2017). Network location-aware service recommendation with random walk in cyber-physical systems. Sensors, 17(9), 2059. Youssef, M. A., Agrawala, A., & Shankar, A. U. (2003, March). WLAN location determination via clustering and probability distributions. In Proceedings of the First IEEE International Conference on Pervasive Computing and Communications, 2003 (PerCom 2003) (pp. 143-150). IEEE. Yuan, Y., Melching, C., Yuan, Y., & Hogrefe, D. (2018). Multi-device fusion for enhanced contextual awareness of localization in indoor environments. IEEE Access, 6, 7422-7431. Zhang, L., Huang, L., Yi, Q., Wang, X., Zhang, D., & Zhang, G. (2022, September). Positioning method of pedestrian dead reckoning based on human activity recognition assistance. In 2022 IEEE 12th International Conference on Indoor Positioning and Indoor Navigation (IPIN) (pp. 
1-8). IEEE. Zhang, J., & Shah, M. (2019). Visual indoor localization: A survey. IEEE Signal Processing Magazine, 36(5), 128-140. Zhou, F., & De la Torre, F. (2015). Factorized graph matching. IEEE transactions on pattern analysis and machine intelligence, 38(9), 1774-1789. Zou, Z., Chen, Q., Uysal, I., & Zheng, L. (2014). Radio frequency identification enabled wireless sensing for intelligent food logistics. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 372(2017), 20130313. Additional Declarations No competing interests reported.
© Research Square 2026 | ISSN 2693-5015 (online)
Figures
Figure 1. Sequence diagram representing the interaction between the application components for detecting the standing position and querying the backend service for searching position flow.
Figure 2. Application Screenshots: (a) A screenshot displaying a museum exhibit, where the exhibit is recognized as a position based on its storage in the datastore. The application is actively attempting to determine the exhibit's position, sending a request and awaiting a response. (b) A screenshot showing a random position that is not stored in the datastore. The instructions on the screen inform the user that "Location can't be detected" and advise them to "Please try a different angle or get closer to the object." (c) A screenshot showing a successful response for the scenario in (a), where the position has been accurately detected. The relevant data is retrieved from the database and presented to the user.
Figure 3. Application diagram: depicts the overall architecture of the wandering application and its interaction with the backend service.
Figure 4. Diagram of Server-Side Service Architecture: Illustrates the interactions between service components and their interactions with the client-side, including detailed flow processes.
Figure 5. Illustrates the frames selected by the ARIDF that are stored in the database to represent the video describing the "Bronze Vessels" POI. These frames are not necessarily in the same order as they appear in the original video. In this particular instance, the frames are sorted as in the original sequence, but this is not the case for all videos. According to ARIDF, frame (1) represents 5 other frames, (2) represents 1, (3) represents 1, (4) represents 15, (5) represents 1 and frame (6) represents 4, these covers the whole 27 frames extracted from "Bronze Vessels" video.
Figure 6. Another example that illustrates the frames selected by the ARIDF that are stored in the database to represent the video describing the "Lost Wax" POI.
Figure 7. (1) Depicts the "De Materia Medica by the Greek" POI, it has many angles that need to be taken into consideration, (2) Depicts the "Mosaic Art" POI, located in the center of the "Ancient Crafts and Industries" exhibition area. This central positioning allows for a greater variety of viewing angles around the POI.
Figure 8. (1) Glassmaking-Part2, (2) Producing the Raw Glassp-Part1.
(Renaudin et al.2007) to estimate the user\u0026apos;s position.\u003c/p\u003e\n\u003cp\u003eMore robust approaches use inertial sensors, such as accelerometers and gyroscopes, which have become integral components of indoor localization systems. These sensors measure changes in velocity and orientation to estimate the user\u0026rsquo;s position. Inertial localization systems were initially used in military localization (Barbour et al. 2001). However, a significant drawback of this approach is that it loses accuracy over time due to the accumulation of small errors in the sensor data. Other approaches use beacons based on Bluetooth Low Energy (BLE) technology. Beacons are small wireless devices that transmit signals and can be placed strategically throughout an indoor environment to provide location information to user devices. They became popular due to their low power consumption and ease of deployment. Examples include a localization system for finding cars in indoor parking facilities (Wang et al.2018) and the use of Estimote beacons (https://estimote.com/) to detect the user\u0026rsquo;s presence (Sawaby et al.2019). Finally, sensor fusion, a modern indoor localization approach, integrates and analyzes data from various sources to provide more reliable positioning information (Elmenreich et al.2002). An example of such an implementation is the use of LiDAR and inertial measurement units (IMUs) to estimate pose reliably and with high precision (Ye et al.2019), thus providing more effective tracking.\u003c/p\u003e\n\u003cp\u003eSmartphones have emerged as the predominant tool for indoor localization due to their versatility and multifunctionality. Various indoor localization systems harness the capabilities of smartphones, utilizing features such as Wi-Fi connectivity, built-in sensors, and cameras (M. 
Piras et al.2014).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.2 Indoor localization in museums\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eIndoor localization technologies have gained prominence in recent years, particularly within the context of museums and cultural institutions. Traditional localization methods, such as paper maps and static signage, often prove inadequate in large and intricate museum spaces. Visitors may experience difficulties in finding their way, leading to suboptimal experiences (Jamshidi et al.2020).\u003c/p\u003e\n\u003cp\u003eA wide variety of technologies have been tried in museums, including IR beacons (Stock et al.2007), RF Zigbee beacons (XXX), Wi-Fi (Jiang et al.2023), and landmark-based navigation (CCC).\u003c/p\u003e\n\u003cp\u003eMuseums have increasingly turned to computer vision-based indoor localization solutions to enhance visitor experiences. This approach offers several advantages, including cost-effectiveness, real-time tracking, and the ability to recognize visual cues and landmarks within the museum environment (Conte et al.2008, Kuo et al.2014, Li et al.2017, Dong et al.2019).\u003c/p\u003e\n\u003cp\u003eSome museums have integrated augmented reality (AR) into their indoor localization systems (Kolivand et al.2019). AR overlays digital information onto the visitor\u0026apos;s view through a smartphone or smart glasses. This provides dynamic, real-time guidance, allowing visitors to interact with exhibits more effectively. For instance, visitors can point their devices at artworks or artifacts and receive detailed information about them. Ghouaie et al. (2017) introduce an innovative handheld AR system termed the \u0026quot;Mobile Augmented Reality Touring System\u0026quot; (M.A.R.T.S). A key feature of this system is its proposal to replicate the role of a human guide using a virtual human counterpart within the M.A.R.T.S framework. 
The overarching objective of these interaction schemes is to facilitate the real-time linkage of digital information with exhibits, enhancing the visitor experience and understanding.\u003c/p\u003e\n\u003cp\u003eIndoor localization solutions in museums not only guide visitors but also offer interactive and immersive experiences. Users can access multimedia content related to exhibits, including audio guides, videos, and additional contextual information (Stock et al.2007), (XXX). These features transform the localization experience into an educational and engaging journey through the museum\u0026apos;s collections. For example, in Villaespesa et al. (2021), computer vision technology was employed to establish a subject tagging system within a web-based platform. This entailed using computer vision algorithms capable of quickly generating subject tags from digital images of objects in a curated collection. The methodology involved acquiring an extensive dataset of images captured within a museum setting. These images were then processed by computer vision algorithms to extract descriptive tags, which were cataloged in a database. The system thereby enabled users to search the website by the assigned tags, ultimately engaging them with the museum\u0026rsquo;s collections.\u003c/p\u003e\n\u003cp\u003eComputer vision-based indoor localization systems can be tailored to enhance accessibility and inclusivity within museums. They can provide specialized guidance for visitors with disabilities, such as audio descriptions for visually impaired individuals. Additionally, multilingual support ensures that diverse audiences can navigate and engage with museum exhibits comfortably. 
Meliones et al. (2018) introduce an interactive autonomous localization system designed for indoor use, specifically targeting individuals and groups with visual impairments, referred to as the \u0026quot;Blind Museum Tourer.\u0026quot; The core functionality of the Blind Museum Tourer system hinges upon a robust indoor localization module, which serves as a guide for individuals who are blind or visually impaired, facilitating self-guided tours within museum premises. In real time, the system can pinpoint the user\u0026apos;s location within the indoor environment and then guide the user towards the next exhibit according to the predefined tour route. Upon reaching each exhibit, the system delivers auditory presentations to the user for an informative and engaging museum experience.\u003c/p\u003e\n\u003cp\u003eIn conclusion, indoor localization technologies, particularly those based on computer vision and augmented reality, are redefining the way visitors explore and interact with museums. These innovative solutions not only streamline localization but also contribute to richer, more immersive, and inclusive museum experiences. As technology continues to advance, museums can look forward to further enhancing visitor engagement and education through indoor localization systems.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.3 Computer vision in indoor localization\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eComputer vision\u003c/strong\u003e is a field focused on techniques for capturing, processing, and analyzing images, allowing systems to interpret and understand visual information. Computer vision algorithms typically rely on both low-level and high-level visual features to extract meaningful data from images. Low-level features include color, texture, edges, and corners, while high-level features involve semantic information and object relationships. 
Techniques like image filtering, feature detection, and feature extraction play a key role in extracting these features (Klette et al.2014, Stockman et al.2001, Morris et al.2004, J\u0026auml;hne et al.2000).\u003c/p\u003e\n\u003cp\u003eRecent advancements in hardware, software, and machine learning have significantly boosted the capabilities of computer vision. Notably, \u003cstrong\u003edeep learning\u003c/strong\u003e methods have revolutionized the field, with \u003cstrong\u003econvolutional neural networks (CNNs)\u003c/strong\u003e (O\u0026apos;Shea et al.2015) and \u003cstrong\u003erecurrent neural networks (RNNs)\u003c/strong\u003e (Wang et al.2016) becoming central to modern computer vision tasks. CNNs, in particular, excel in autonomously learning layered representations from large image datasets, enabling accurate image recognition and classification tasks (Li et al.2021).\u003c/p\u003e\n\u003cp\u003eOne crucial application of computer vision is \u003cstrong\u003escene recognition\u003c/strong\u003e, where algorithms identify and categorize environments or contexts within images. Scene recognition relies on analyzing visual cues such as objects, textures, and spatial arrangements to classify images into various scene types (Lin Xie et al.2020). This capability is vital in fields such as \u003cstrong\u003esmart city infrastructure\u003c/strong\u003e, where it helps recognize urban scenes, monitor traffic, and optimize city planning (Syahidi et al.2023). 
It is also crucial in \u003cstrong\u003eaugmented reality (AR)\u003c/strong\u003e, where devices must recognize scenes and objects in real time to interact with the user\u0026apos;s surroundings effectively (Ghasemi et al.2022).\u003c/p\u003e\n\u003cp\u003eIn recent years, there has been a growing interest in applying computer vision to \u003cstrong\u003eindoor localization\u003c/strong\u003e due to its ability to provide accurate and robust localization in environments where GPS signals are unavailable or unreliable. Traditional positioning signals such as GPS fail indoors, and even wireless signals like Wi-Fi and Bluetooth can suffer from interference caused by building materials or obstructions (Morar et al.2020). By contrast, computer vision provides a robust alternative by leveraging images or video streams captured by cameras to estimate a user\u0026rsquo;s location accurately (Kim et al.2018).\u003c/p\u003e\n\u003cp\u003eAccording to Chen et al. (2021), the most significant advantage of cellular positioning technology is that it achieves seamless indoor positioning.\u003c/p\u003e\n\u003cp\u003eDespite its potential, computer vision for indoor localization faces certain challenges, including \u003cstrong\u003eaccuracy\u003c/strong\u003e (affected by lighting conditions, camera quality, and moving objects) and \u003cstrong\u003ecomputational complexity\u003c/strong\u003e (which can limit real-time application) (Zhang et al.2019). Nevertheless, computer vision is becoming a promising tool for indoor localization, and ongoing improvements in algorithms and hardware continue to address these challenges. Yang et al. (2020) propose an improved vision-based positioning method that uses a pixel threshold-based eight-point method to enhance the quality of feature points, thereby eliminating mismatching caused by pixel drift. 
The method also improves the epipolar constraint and introduces a new cost function for better accuracy in fundamental matrix calculation, achieving superior results compared to traditional methods.\u003c/p\u003e\n\u003cp\u003eAn innovative approach in this field is \u003cstrong\u003eVisual SLAM (Simultaneous Localization and Mapping)\u003c/strong\u003e, which uses computer vision to simultaneously map an environment and estimate the user\u0026apos;s position within it. By analyzing visual features extracted from camera images, \u003cstrong\u003eVisual SLAM\u003c/strong\u003e algorithms track user movement and generate a real-time map of the surrounding environment (Raul et al.2015). This technology has been employed in various indoor localization systems to provide precise localization. For instance, a study by Poulose et al. (2019) demonstrated that combining Visual SLAM with hybrid sensors on a smartphone camera reduced the localization error from 0.1398 meters to 0.0690 meters. The major drawback of this approach is its high computational cost.\u003c/p\u003e\n\u003cp\u003eIn addition to Visual SLAM, \u003cstrong\u003eaugmented reality (AR)\u003c/strong\u003e is often integrated into computer vision-based indoor localization. AR systems overlay digital information onto the camera view, offering real-time localization assistance (Cavallari et al.2019). Such systems typically analyze images captured by the user\u0026apos;s camera to recognize specific objects or landmarks. By utilizing \u003cstrong\u003edeep learning models\u003c/strong\u003e for object recognition, these systems enhance the real-time interactivity between the user and their environment (Zhou et al.2015). Varalatchoumy et al. (2023) proposed an AR-based indoor localization solution that integrates smartphone sensors and cameras. 
Experimental results from this study showed that localization errors ranged from 0.1 to 0.25 meters for short distances and up to 1.2 meters over longer distances of 200 meters.\u003c/p\u003e\n\u003cp\u003eAnother widely-used technique in indoor localization is \u003cstrong\u003evisual landmark recognition\u003c/strong\u003e, where computer vision algorithms identify distinctive landmarks within an environment. These systems estimate the user\u0026apos;s position based on their relative proximity and orientation to these landmarks. By integrating landmark recognition with \u003cstrong\u003esensor fusion techniques\u003c/strong\u003e, such systems can enhance accuracy and provide more reliable localization (CCC).\u003c/p\u003e\n\u003cp\u003eTable 1 provides a summary of advantages and disadvantages of the different methods.\u003c/p\u003e\n\u003cp\u003eIn our specific application, we employ a camera-based solution for museum localization. Users wear a smartphone around their neck while exploring the museum, and the device continuously captures images of their surroundings. These images are analyzed to detect when users pause in front of exhibits, signaling engagement with specific points of interest (POIs). The system\u0026rsquo;s reliance on camera-based assessments for location identification negates the need for external tools like Wi-Fi or additional sensors. This vision-based positioning system is an efficient solution for indoor localization, providing visitors with real-time, accurate guidance without manual intervention.\u003c/p\u003e\n\u003cp\u003eAs research in computer vision continues to progress, the application of this technology in indoor localization is expected to become even more accurate, robust, and efficient. 
These advancements will further enhance the user experience, particularly in cultural heritage environments, such as museums, where accurate and non-intrusive localization is crucial.\u003c/p\u003e\n\u003cp\u003eTable 1: An overview of each method\u0026apos;s strengths and weaknesses, offering insights into their applications in various indoor environments, including museums.\u003c/p\u003e\n\u003ctable border=\"0\" cellpadding=\"0\" title=\"Table 1: This table provides a concise overview of each method's strengths and weaknesses, offering insights into their applications in various indoor environments, including museums.\"\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eMethod\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eAdvantages\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eDisadvantages\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eWi-Fi Triangulation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Widely available in most indoor environments.\u003c/p\u003e\n \u003cp\u003e- Cost-effective as it uses existing Wi-Fi networks.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Susceptible to interference from walls and objects.\u003c/p\u003e\n \u003cp\u003e- Limited accuracy in complex or crowded environments.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eBluetooth Beacons (BLE)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Low power consumption.\u003c/p\u003e\n \u003cp\u003e- Easy to deploy.\u003c/p\u003e\n \u003cp\u003e- Provides good accuracy within short ranges.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n 
\u003cp\u003e- Requires maintenance (e.g., battery replacement).\u003c/p\u003e\n \u003cp\u003e- Signal may weaken in large spaces or through obstacles.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eRFID\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- High precision for short-range localization.\u003c/p\u003e\n \u003cp\u003e- No reliance on batteries for tags.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Limited range.\u003c/p\u003e\n \u003cp\u003e- Expensive to deploy over large areas.\u003c/p\u003e\n \u003cp\u003e- Requires specialized readers.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eInertial Sensors\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Can work without external infrastructure.\u003c/p\u003e\n \u003cp\u003e- Suitable for real-time tracking.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Accumulates errors over time (drift).\u003c/p\u003e\n \u003cp\u003e- Limited standalone accuracy.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eComputer Vision\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- High accuracy in recognizing locations and landmarks.\u003c/p\u003e\n \u003cp\u003e- Cost-effective using existing cameras.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Dependent on lighting conditions.\u003c/p\u003e\n \u003cp\u003e- Computationally intensive.\u003c/p\u003e\n \u003cp\u003e- Accuracy can be impacted by moving objects.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eVisual SLAM\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Provides simultaneous localization and 
mapping.\u003c/p\u003e\n \u003cp\u003e- Works in real time for dynamic environments.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- High computational cost.\u003c/p\u003e\n \u003cp\u003e- Requires advanced hardware for real-time processing.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eAugmented Reality (AR)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Enhances user engagement.\u003c/p\u003e\n \u003cp\u003e- Real-time overlay of information onto the environment.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Requires good camera quality.\u003c/p\u003e\n \u003cp\u003e- Limited accuracy for large or cluttered spaces.\u003c/p\u003e\n \u003cp\u003e- High battery usage on devices.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eLandmark Recognition\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- High reliability using unique landmarks.\u003c/p\u003e\n \u003cp\u003e- Improves accuracy with sensor fusion.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Requires pre-mapped landmarks.\u003c/p\u003e\n \u003cp\u003e- Less effective in environments lacking distinctive features.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eSensor Fusion\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Combines data from multiple sources for improved accuracy.\u003c/p\u003e\n \u003cp\u003e- Works in diverse conditions.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- High computational complexity.\u003c/p\u003e\n \u003cp\u003e- May require multiple sensors, increasing system cost.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n 
\u003cp\u003e\u003cstrong\u003eSmartphones (General)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Widely available and versatile.\u003c/p\u003e\n \u003cp\u003e- Integrates multiple features (Wi-Fi, sensors, cameras).\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e- Dependent on smartphone hardware capabilities.\u003c/p\u003e\n \u003cp\u003e- Battery drain can be significant during continuous use.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u003cstrong\u003e2.4 Using Large Language Models in Cultural Heritage\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSince large language models (LLMs) appeared, they have quickly found their way into a large and diverse range of application domains, as they speed up the process of content creation. As in many other domains, LLMs have also been adopted in cultural heritage. Trichopoulos et al. (2023) presented MAGICAL (Museum AI Guide for Augmenting Cultural Heritage with Intelligent Language Model), a system that demonstrated the capability of ChatGPT4 (cite GPT4) to be used as a tour guide that responds to visitors\u0026apos; questions and provides answers about objects (also using speech-to-text and text-to-speech technologies). Few additional studies have explored the potential of LLMs as smart, personalized museum guides, and reviewing them is beyond the scope of this paper. One issue that must be noted, however, is the validity of the content created by LLMs. (AAA) demonstrated the potential of automatic content generation for descriptions of artifacts in a museum, where an image, a title, and sometimes a Wikipedia article were used to guide the creation of a textual description of the object of interest, not as a replacement for a tour guide but as an assistant to the museum\u0026apos;s content curator. 
The authors suggested that the created content should be verified manually and only then used by a visitors\u0026apos; guide system, thus becoming a \u0026quot;curator\u0026apos;s helper.\u0026quot; We adopted this approach in our study, as content quality is an important part of the overall experience.\u003c/p\u003e"},{"header":"3\tLOCALIZATION ALGORITHM","content":"\u003cp\u003eThe problem we are addressing is determining the location of a museum visitor. As visitors explore the museum, they may come across POIs they want to learn more about. Our goal is to provide a solution that enables them to easily access detailed information about the POIs they encounter. The proposed solution is an application that helps visitors identify and learn about any POI they are standing in front of.\u003c/p\u003e\n\u003cp\u003eTo build this solution, we need to overcome several key challenges. First, the application needs an efficient way to represent the input images so that the matching process is both accurate and fast. Proper image representation is crucial to ensure that the system can quickly and reliably match the input image to the correct POI. Second, it is essential to construct a well-organized data set for each POI, which will allow for smoother identification and comparison processes. The system must maintain a robust database to support effective POI recognition.\u003c/p\u003e\n\u003cp\u003eFinally, the solution needs to provide a fast and reliable method to recognize whether a POI exists in the database or not. If the POI does exist, the solution should \u003cstrong\u003ecorrectly\u003c/strong\u003e recognize it, ensuring accurate identification. 
This includes handling cases where a POI is not yet present in the data set and making that decision swiftly to maintain a smooth user experience.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e3.1 Image Representation and Matching\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eMany research efforts have focused on using feature extraction techniques to solve the challenge of image representation and matching. For example, Yang et al. (2020) employed the SIFT algorithm for feature extraction and matching. SIFT (G. Lowe et al.2004) works by detecting key points in an image and describing them using distinctive feature vectors, which are then used to match images based on similarity.\u003c/p\u003e\n\u003cp\u003eAt the beginning of our research, we explored using SIFT and other tools like OpenCV\u0026rsquo;s ORB (Rublee et al.2011) model. However, these models (which are based on detecting key points in the image) proved to be too slow and relatively inaccurate for our purposes. The complex matching process they rely on significantly impacted performance, making them unsuitable for online applications where real-time processing is critical. These solutions employed computationally expensive approaches that were not practical from a processing time perspective.\u003c/p\u003e\n\u003cp\u003eWe then turned to deep learning-based representations, which are relatively new compared to traditional methods like SIFT, SURF, and ORB. We evaluated several candidates, including ResNet (He et al.2016) and CLIP (Radford et al.2021), to assess their feasibility. One advantage of these models is that they allow us to compute distances between feature vectors, which is much more computationally efficient than the matching process used by SIFT.\u003c/p\u003e\n\u003cp\u003eAmong the models we tested, CLIP yielded the best performance and accuracy. As a result, we decided to further investigate and ultimately adopt CLIP as the feature extraction model behind our solution. 
While CLIP was initially designed to represent images and text together, we chose to utilize its dense layer output, which produces a feature vector representation of the image.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eDuring our investigation of the CLIP model, we discovered several versions available. After testing and research, we selected the CLIP-ViT-Large-Patch14 (https://huggingface.co/openai/clip-vit-large-patch14) model, which best suited our needs. We also experimented with the CLIP-ViT-Base-Patch32 version (https://huggingface.co/openai/clip-vit-base-patch32), but it did not perform as well as CLIP-ViT-Large-Patch14 in terms of accuracy and performance.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e3.2 POI Data Set Construction\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eIn order to build a database, we captured videos for each POI, ensuring that the video frames cover every possible snapshot a visitor might take. For each video, we extracted all the frames and then employed CLIP on the entire set of frames, resulting in a set of embeddings representing those frames. After this, the ARIDF (BBB) algorithm is applied to these embeddings.\u003c/p\u003e\n\u003cp\u003eThe ARIDF (Automatic Representative Image Dataset Finder for Image Based Localization) algorithm processes a set of embeddings to identify a minimal subset that best represents the entire set. The process begins by initializing a distance matrix to store the pairwise Euclidean distances between the embeddings. To optimize performance, only half of the matrix is computed since the distance between any two embeddings is symmetric.\u003c/p\u003e\n\u003cp\u003eOnce the distance matrix is computed, it is converted into a binary matrix using a threshold. 
If a distance is greater than 0.38 (a value determined through a grid search on this parameter), the corresponding cell is set to 0, indicating dissimilarity; otherwise, it is set to 1, indicating similarity.\u003c/p\u003e\n\u003cp\u003eThe goal of the algorithm is to identify the most representative embeddings. It does this by iteratively selecting the column with the most 1s, that is, the embedding that is similar to the largest number of other embeddings. The corresponding embedding index is added to a subset that will represent the entire set. The matrix is then updated to mark all similar embeddings as covered, reducing redundancy in subsequent iterations.\u003c/p\u003e\n\u003cp\u003eThis process repeats until no 1s remain in the matrix, meaning all similar embeddings have been accounted for. The final output is a subset of embeddings that effectively represents the diversity within the entire set. This subset is returned as the result, providing a reduced but representative view of the data. The ARIDF algorithm is efficient in identifying significant relationships within the data and minimizing redundancy, making it well suited to reducing the size of the data set while preserving diversity.\u003c/p\u003e\n\u003cp\u003eThe ARIDF reduction yielded different subset sizes for different videos, as it is influenced by several factors, such as the diversity between frames and the video\u0026apos;s length. Nevertheless, the algorithm eliminated more than 70% of the frames of each video, effectively reducing redundancy while preserving important visual content.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e3.3 POI Recognition\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe algorithm aims to find a matching Point of Interest (POI) in a database based on an input image by using image embeddings. The process starts when an image is received as input. The first step is to generate an embedding (a vector of features) for this input image using the CLIP model. 
This embedding is a numerical representation of the image\u0026apos;s features.\u003c/p\u003e\n\u003cp\u003eNext, the algorithm initializes two variables: \u0026ldquo;minDistance\u0026rdquo; to keep track of the smallest Euclidean distance found and \u0026ldquo;matchedPOI\u0026rdquo; to store the data associated with the embedding that is closest to the input image. Initially, \u0026ldquo;minDistance\u0026rdquo; is set to infinity, indicating that no distance has been calculated yet.\u003c/p\u003e\n\u003cp\u003eThe algorithm then loops over all subsets of embeddings stored in the database. For each subset, it calculates the Euclidean distance between the input embedding and each embedding within the subset. The Euclidean distance measures the similarity between two vectors; a smaller distance indicates higher similarity.\u003c/p\u003e\n\u003cp\u003eWithin each subset, the algorithm maintains a local minimum distance (\u0026ldquo;subsetMinDistance\u0026rdquo;). As it iterates through each embedding in the subset, it updates \u0026ldquo;subsetMinDistance\u0026rdquo; whenever a smaller distance is found. Once all embeddings in a subset have been processed, the algorithm compares \u0026ldquo;subsetMinDistance\u0026rdquo; with the overall \u0026ldquo;minDistance\u0026rdquo;. If \u0026ldquo;subsetMinDistance\u0026rdquo; is smaller, \u0026ldquo;minDistance\u0026rdquo; is updated, and the corresponding data for that subset is stored in \u0026ldquo;matchedPOI\u0026rdquo;.\u003c/p\u003e\n\u003cp\u003eAfter processing all subsets of embeddings, the algorithm checks whether the smallest distance found (\u0026ldquo;minDistance\u0026rdquo;) is below a predefined threshold. This threshold is determined based on the desired balance between false positives and true positives. If \u0026ldquo;minDistance\u0026rdquo; is below the threshold, it indicates that a sufficiently similar POI was found, and the corresponding data for that POI is returned. 
If \u0026ldquo;minDistance\u0026rdquo; is greater than or equal to the threshold, the algorithm returns a general response indicating that no matching location was found.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eRuntime complexity.\u0026nbsp;\u003c/strong\u003eTo analyze the runtime complexity of the described algorithm, we break it down into its components:\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eKey variables:\u003c/strong\u003e\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\u003cstrong\u003en\u003c/strong\u003e: Number of image embeddings in the database.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003ed\u003c/strong\u003e: Dimensionality of each embedding (in this case,\u0026nbsp;\u003cstrong\u003e768\u003c/strong\u003e).\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003e1. CLIP Extraction Process:\u003c/p\u003e\n\u003cp\u003eThe extraction process that generates an embedding from the input image is computationally expensive. However, this process only happens once per query, so we can denote its time complexity as\u0026nbsp;\u003cstrong\u003eO(E)\u003c/strong\u003e, where E is the cost of computing the embedding for the input image using the CLIP model.\u003c/p\u003e\n\u003cp\u003e2. Euclidean Distance Calculation:\u003c/p\u003e\n\u003cp\u003eThe algorithm calculates the Euclidean distance between the input embedding and each embedding in the database. Since each embedding is a vector of length d=768, calculating the Euclidean distance between two embeddings takes\u0026nbsp;\u003cstrong\u003eO(d)\u003c/strong\u003e time. So for each embedding, the distance computation is\u0026nbsp;\u003cstrong\u003eO(768)=O(d)\u003c/strong\u003e (constant time, since d is fixed).\u003c/p\u003e\n\u003cp\u003e3. Looping Over Embeddings: The algorithm loops over all embeddings in each subset. 
The loop operates at:\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\u003cstrong\u003eLoop over all embeddings\u003c/strong\u003e:\u0026nbsp;\u003cstrong\u003eO(n)\u003c/strong\u003e.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eEuclidean distance\u003c/strong\u003e:\u0026nbsp;\u003cstrong\u003eO(d)\u003c/strong\u003e for each comparison.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThus, the total time for looping through all the embeddings is\u0026nbsp;\u003cstrong\u003eO(n\u0026times;d)\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003e4. Comparison and Thresholding:\u003c/p\u003e\n\u003cp\u003eAfter computing the Euclidean distance for each embedding, the algorithm keeps track of the minimum distance in each subset and compares it with a global minimum. These comparisons and updates are\u0026nbsp;\u003cstrong\u003eO(n)\u003c/strong\u003e operations, which don\u0026rsquo;t significantly affect the overall complexity.\u003c/p\u003e\n\u003cp\u003e5. Final Threshold Check:\u003c/p\u003e\n\u003cp\u003eAt the end of the process, the algorithm compares the minimum distance to a threshold, which is also an\u0026nbsp;\u003cstrong\u003eO(1)\u003c/strong\u003e operation.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTotal Runtime Complexity\u003c/strong\u003e\u003cstrong\u003e:\u003c/strong\u003e\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003eThe CLIP extraction process is\u0026nbsp;\u003cstrong\u003eO(E)\u003c/strong\u003e, which is a constant heavy operation performed once.\u003c/li\u003e\n \u003cli\u003eThe distance calculation and looping through embeddings result in\u0026nbsp;\u003cstrong\u003eO(n\u0026times;d)\u003c/strong\u003e.\u003c/li\u003e\n \u003cli\u003eFinal checks and comparisons are\u0026nbsp;\u003cstrong\u003eO(n)\u003c/strong\u003e.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThus, the overall time complexity of the algorithm 
is:\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eO(E+n\u0026times;d)\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eGiven that, in practice, E≫n\u0026times;d for the dataset sizes considered here, the overall running time is dominated by the CLIP extraction process. As a result, the number of points of interest (POIs) can grow without significantly affecting the observed runtime, even though the matching step itself remains linear in n.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e3.4 LLM-assisted Content Creation and Delivery\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDuring the process of creating the content to be delivered to the visitors, multiple LLMs were experimented with to identify the one producing the most concise and clear descriptions. The input to the LLMs included the title, an image of the POI, and the following prompt: \u0026ldquo;Create a short description for the artifact in the attached image.\u0026rdquo; Gemini API (Gemini 1.0 Pro https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemini-pro-vision?hl=it\u0026amp;pli=1) was selected for this purpose, as its generated text proved to be the most effective. The text, as proposed in AAA, was subsequently verified, edited, and carefully examined by the authors to ensure its soundness.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eOnce the descriptions were generated, they were added to the dataset of the relevant POIs. When users directed their smartphone cameras at a specific artifact, the system identified the POI and displayed the corresponding description on the screen. In cases where a showcase contained multiple items, the system first presented a general explanation about the items in the showcase and then expanded on several specific items included in the presentation.\u003c/p\u003e\n\u003cp\u003eThe system supports the creation and updating of explanations for POIs in two distinct ways: users can either enrich a new POI in the database by attaching a title and an image or update the description of an existing POI. 
In both cases, the provided title, image, and associated prompt trigger Gemini, and the resulting output is automatically saved to the dataset.\u003c/p\u003e"},{"header":"4\tSYSTEM","content":"\u003cp\u003eOur system includes two main components: an Android application that we named Wandering Application (WA) and a back-end service that we named Matching Service (MS).\u003c/p\u003e\n\u003cp\u003eThe two components interact using an API that was built in the MS and is used by the WA. The whole system serves two types of users: visitors and museum staff members.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.1 The Wandering Application\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAn Android application was developed to enhance the experience of museum visitors by delivering location-aware information about POIs. The app is designed to continuously track the user\u0026apos;s movement within the museum by utilizing the device\u0026apos;s camera. Users are required to hold their device in a way that allows the camera to capture images of their path (either holding it in their hand or hanging it on their chest), enabling the app to gather data about their activities. The primary aim is to determine the visitor\u0026apos;s location, specifically identifying which exhibit the visitor is currently viewing. In this context, \u0026quot;location\u0026quot; refers to the exhibit in front of the visitor rather than any arbitrary position within the museum.\u003c/p\u003e\n\u003cp\u003eTo achieve this goal, the application incorporates two key functionalities: 1) detecting the visitor\u0026apos;s standing position, and 2) querying a service with images captured in that position. The app is programmed to detect when the user has stopped moving, as it is assumed that visitors will pause mainly in front of exhibits of interest. When the app detects that the user has stopped, it captures an image and sends it to a back-end service. 
The service then determines whether the image matches a known exhibit location. This process allows the app to accurately identify the visitor\u0026apos;s position within the museum.\u003c/p\u003e\n\u003cp\u003eAdditionally, the application supports an enrichment process intended for museum staff or other designated users. These users can upload content related to specific POIs; this may be a textual description or any kind of multimedia. The app facilitates the submission of this data to a back-end service responsible for maintaining and updating the dataset of POIs and their associated content.\u003c/p\u003e\n\u003cp\u003eThis dual functionality ensures that the application not only enhances the visitor experience by providing location-aware content but also continuously improves the museum\u0026apos;s data repository through contributions from museum staff members. Figure 1 is a sequence diagram that illustrates the overall interaction between the visitor and the system\u0026rsquo;s components.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.1.1 Standing detection\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo detect when a user is standing still in front of a POI, we assume that visitors typically stop moving when they are observing an exhibit of interest. By monitoring the movement of users, the application can discern when a visitor has paused, thereby indicating a point of interest or engagement with the exhibit.\u003c/p\u003e\n\u003cp\u003eThe detection mechanism involves comparing consecutive images captured from the stream of images taken continuously by the camera of the mobile device. 
Specifically, we compute the difference between the average pixel values of two successive images using the following equation:\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cimg src=\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAASwAAAAsCAYAAADfCoXBAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAAAoxSURBVHhe7ZxJjw5dFMer3y8ghpWICBYEkZiDSEhoYWXVhIWVaSERCYKNGKKFLW0hsUNI2JgTElPMISQssBCxMsQn6Lf+R51HvdV3OLeGfuq+zi+pvk9V3RruOef+71BV3TOYkiiKokTAP1mqKIrSelSwFEWJBhUsRVGiQQVLUZRoUMFSFCUaVLAURYkGFSxFUaJBBUtRlGhQwVIUJRpUsBRFiQYVLEVRouGvE6xPnz4l+/btSyZPnpxtiYPTp08nK1euTM6fP59tiYMHDx4kPT09ybFjx7ItzVC3X2O1d2yExkctggXH4qJY1q5dm20tB2587ty5nfP5AvD169d0zVGjRnWOsQUZts+ZMyf5+fNncuvWrWxrHGzevDnZuHFjsn//frL39+/fsz1KE379W+zdRAOO+rht2zYSo9rBf2uog61bt+K/Pgz29/dnW/5w//592tfX15dtMZMG3eDIkSMpP0B+1zFXr16l8+La3759G3z16hWtIy1y7tw52ofUBvbnF1NZqtLb2/ufa2A9BJQTdsKC322HfW+zJe+XLKbySvxaxeZttXexTLZl79692RFDgc1Q31B/Pn78mG39jckvXC99wE7wN87tuj7wxUeR2gQLN2YrFG4G+wYGBrItQ+E8rsDLA6PAIJMmTcq22IEzJMZDPtwDFl/esnAFwDUQdGUqgbQ8bUASkCw6WIqwn2GzIlI7VLV5G+2dL5PJtlyf0KibkAh93i+287hAxwF2gyDa6JpgsfHKAuHB8dJgYmNKgohbI8m5kS/kPsrA91MmCBiUG+eQtnrdQhKQnAd2MYHtpuND/FrV5m20N5fJdE9sUxNSAeZzmBoLKTwKstmdrzHsgoWL2gLOB7cGUsPA4CxwvgBEXuRzqTzDQ8oqDpKAYMF1qohiSLm6iSQg0fP25SkSWv6qNm+jvblMoUiFnv1StWeJ+oR7NV0vVLCCJ915kpsnuDFBjkk7MHv2bEqZfD5MXBbhJwS7d++m9efPn3fy4ymNCZwnFaskDSBaX716NeXHpLuJ27dvU7pkyRJKXTx+/JjSFStWUNoEsB8mh1MnJqNHj862hjNx4kSyw//hKRZsAqZPn06phBC/1mHzttkbk+VcphBw3M2bN5NUeL22uHv3LqWLFi2itCw7d+6ke2WfVSFIsOCspUuXklhgSQUvOXDgQHLkyBHaXywY8sMwYNmyZZTmWbx4MZ0jVXBaT3tLtI4FT2lM3LhxI0mVmn7DWZz/x48ftK3I5cuXKZ06dSqlLupykIs6RRHlRyBIn8bgCSw3CCFLI097cvCTvSlTplAKcE1bIwRC/FqXzUPt3SRPnz6lFB2GPGjQbY09CBF69sv8+fMpLcu8efMovXLlCqVVEAsWlBmPKhFEEA20OGDVqlWUgnzAMWlXkNKFCxdSauLFixeUmo438e7dO0qLzjLx4cMHSmfOnEmpC/TwgPQ+ysC9iRkzZlBahVmzZlH65csXSn3s2rWrI/AhCxqWpkBcofFDnHBM4RWCHTt2dALdRIhf67J5qL2b5N69e5Tmy48OAnpP06ZNy7YMRSr03INDr7LKSADAr/DvhQsXsi3lEQvW8ePHqQCHDh0yFgAF44DLw2LkMhCMnA9YH48ePaJUEqw8dPTBFcdWjrrgVstVGUP5/Plz9is+3r9/Tylii3t0Y8a
MocbD1CtnpH4Fddu8DfZ+9uwZpVu2bOnYbd26dbTN1cBIhZ57cHVNj7DtUc+qIBIsXOTUqVMkKsUXQ7n1so2lIUbYZ1NpPj7EMC9fvqR0wYIFlNZB3Q4yMVyiGBMPHz6kdGBgoNOj4yF/yJyWjW7ZnOdnQxfJG9/ogULQUR/zPWFMrdjqISMVeu7BSYaOIXz9+jX7VQ6RYLkqM88PYG6rCI/1XUM3Pp672xK4xZT0sKQ05aA8wyGKLto4h8W+zA9juHGrY2jeLZvz/GzogmG7jydPnlBa7DGOGDFCNE0ioYmRQB2IBIu7wCZR4TGxadzMQzeXCHAPyzXHlQctJoYPvb292RY3aFkl+ByEyu6aBJZgE0WUiT+PgEDgOpgvlH4OMn78+OyXmzbOYfG8YfEauK6rRyT1q68hunbtGtk9RJSl9m6Kt2/fUlocMsO/J0+ezNbK4+qVYp6s+JYAbDhciOewTMDJGPIBU1DfuXOHUtf8FQuF5GkP4Baz+AqFDf5GioXRhM9BcMrRo0dJKKtgE0WIE4bcZ8+epYq6Z88eWr906VKWwwwPjceNG0dpbLBI2BofNBKm12GAxK/AZnNcG+fevn27eJjUFntzvTI18mjk0ODZBFgi9LZeKc6JeTKcH8N22A11YsOGDVkOP2PHjs1+laO0YMEwmIDHmNkWcL4JPpyDhUL6JOLNmzeUSp/4rFmzhlJ+smiCHVT8ABRCht4lnopCRFxwi2ODRRGYeg54/YNFn4cF3Hu1wfMYTfaAmoR74KbJdcQGGgnbxLvEry6bX79+nXojaCSktMXerk7CmTNnSERs9ygReu6VTpgwgdIiBw8epPoKm27atImuZxNIhuuYKfZDEAkWC9LFixcpkBAI69evT/r7+8mJ2AbQIuYnDfndKOznrmQeHov7JgrzhPbIli9fTik7oQju7cSJE53feWBciIdUTEHxHAyesjLFPBDEw4cPZ2t/9nOlNMGVsWjTmEA8gWJPAWVD7wcVwTZV4PMrcNkc9g6pPG2xNw+/TJ0EvH+Fl7Bd0yU+oYeQoa6CX79+UcpABDECyNcH5PGJOGwHX/b19WVbKpDegAi8pp/eWOfzhNR5tB2/sS3tJQ15vZ6/98NxeL2/+Go+8mN/yCcZyI8lhNSBdIzp0wDexwvWTfC92uDjTfCx+cUF/kNFKuLOTydgT5wHnza0GdunFyhj3h62xYXLr1Kb8/357NgGe+PaXAddi6s+od4iD+qtiZBz8adsrg+oAeuAKZ8tPmy4I6JhOOCkQcCFs4mKDZ+TJHAFMMHnd/0rHCm4R59Y1VGe4SI0IEOoww58f64YjMneElxCL4X/E4NPrADiGXlN1wuNj64KFm4UBZHCouH6NzU2XCovwSVYaH1tDglBIlbYhzy+fG2hScECVf3qE6zY7C2hqgCHiJXPP9EIFv/biRCjIWhwDAxeBhgNhjYNT33YBAvnQYsFJ1aBRc91HjgXQ29cL5bK07RggSp+dQlWjPaW4hMSG6h7sLWk3rKwufKGxkdXBAvOh/igMNJAYANXDXwYHIGNQJSCe+RutCmwqwLxLooVfuOaDHqVWA8NsG4TGpBlKeNXgGNM9xervUNA2UKFHvW2KECYCinWC57zxrldtF6wOEBQSElPCYZEfgRimaFgVXBt01KncKFspmvkBUupF26AikvTwto2QoSeOw2mJV8fULchanXWEaYHf9ILKoqitJ5Kb7oriqIMJypYiqJEgwqWoijRoIKlKEo0qGApihINKliKokSDCpaiKNGggqUoSjSoYCmKEglJ8i+Q8D2S5tWPFAAAAABJRU5ErkJggg==\" height=\"44\" width=\"300\"\u003e\u003c/p\u003e\n\u003cp\u003ewhere \u003cem\u003eE(I)\u0026nbsp;\u003c/em\u003erepresents the average pixel value of an image 
\u003cem\u003eI\u003c/em\u003e. This difference helps identify whether the scene in the images remains largely unchanged. If the user moves only slightly, the difference does not vary significantly, as most of the image remains unchanged.\u003c/p\u003e\n\u003cp\u003eTo determine whether a user is standing still, the application calculates the difference in averages between consecutive images and checks if the difference remains below a predetermined threshold. This threshold accounts for small movements, ensuring robustness to minor positional adjustments. If the difference stays below the threshold for a continuous duration (configured to be 3 seconds), the application concludes that the user is stationary. This threshold was chosen to balance accuracy and speed, allowing the system to efficiently process image data in real time.\u003c/p\u003e\n\u003cp\u003eWe aimed to avoid using phone sensors like gyroscopes because these sensors vary across devices, with newer models having better sensors than older ones. Relying on sensors would limit the system\u0026apos;s usability, as our users may use any type of device. Regarding the continuous use of the camera, it does affect the battery; however, other functions of the application are lightweight, so the camera is the primary factor impacting battery life. Nonetheless, this did not present a significant issue, nor did device conditions such as temperature.\u003c/p\u003e\n\u003cp\u003eOnce the stationary state is detected, an image is captured and transmitted to the backend service (see \u003cstrong\u003eFigure 2 a \u0026amp; b\u003c/strong\u003e for screenshots during the location identification process).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.1.2 Querying the Matching Service\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eOnce an image is captured, the application sends a request to the Matching Service (MS). The MS responds with either a positive or a negative response. 
If the response is positive, the application displays a pop-up screen with detailed information about the POI, including metadata about the identified location such as the exhibit\u0026apos;s ID, name, and description (see \u003cstrong\u003eFigure 2 c\u003c/strong\u003e). If the response is negative, the application continues to capture the image stream and informs the user that the position could not be identified (see \u003cstrong\u003eFigure 2 b\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.2 Matching Service\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA back-end service was built using Python, which was chosen for this project due to its robust library ecosystem and compatibility with deep learning models, particularly the CLIP model. Libraries such as pandas and NumPy facilitate efficient data manipulation, numerical computations, and model development. Moreover, Python\u0026apos;s seamless integration with frameworks such as TensorFlow allows for the effective implementation of advanced models like CLIP.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe Matching Service maintains a dataset, where minimal sets of images of all the POIs are stored. These minimal sets are created automatically from large sets of images by applying the ARIDF algorithm (BBB). For storage, we used a NoSQL database, MongoDB, so our POIs are stored as documents, with each document representing a POI.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.2.1 Matching service system design (Figure 4)\u003c/strong\u003e\u003c/p\u003e\n\u003col\u003e\n \u003cli\u003eFind API: The Find API is designed to handle requests for the image recognition process. 
It initiates this process by receiving image-based queries, triggering the necessary operations, and returning the relevant responses.\u003c/li\u003e\n \u003cli\u003eOrchestrator: The Orchestrator is responsible for coordinating interactions between various services and functionalities in all the processes. It serves as an intermediary, receiving requests from the API and systematically forwarding them to the appropriate downstream services. For example, upon receiving a request, the Orchestrator sends the relevant data to the Features Extractor service, which processes the input and returns a result. The Orchestrator then directs this result to the next designated service in the workflow.\u003c/li\u003e\n \u003cli\u003eFeatures Extractor: The Features Extractor service is designed to process individual images or sets of images, returning their corresponding embeddings using CLIP.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eFinding Service: The Finding Service is designed to accept input embeddings and identify the most similar embeddings from a database. This process involves querying the entire database via the Data-Base Service to retrieve all relevant documents. Once the documents are obtained, the Finding Service iterates over them, calculating the Euclidean distance between the input embeddings and each embedding from the database. The embedding with the minimum Euclidean distance is identified, and that distance is evaluated against a configurable threshold. If the minimum distance is below this threshold, the service considers the result sufficiently similar and returns the corresponding data to the Orchestrator.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eEnrichment API: The Enrichment API is designed to manage enrichment requests. These requests typically contain a video of a POI accompanied by text that describes the POI. 
Upon receiving a request, the Enrichment API initiates the enrichment process by triggering the Orchestrator with the provided request data.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eARIDF (BBB): see section 3.2.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eStoring Service: The Storing Service is tasked with the comprehensive assembly and storage of documents. Each document encapsulates all pertinent data about a POI, including the minimum set of embeddings that represent it. This service handles the creation of these detailed documents, ensuring that all relevant information is accurately compiled. In collaboration with the Data-Base Service, the Storing Service subsequently stores these documents in a database. The end result is a collection of documents, each uniquely representing a POI.\u003c/li\u003e\n \u003cli\u003eData-Base Service: The Data-Base Service is designed to facilitate Create, Read, Update, and Delete (CRUD) actions within the database, supporting the needs of various other services. This service establishes and manages a connection to the database while incorporating a caching layer that maintains a local copy of the dataset. This local copy includes all data existing in the database, significantly enhancing the performance of retrieval (get) actions. The Data-Base Service is also responsible for ensuring that the cached copy and the actual database remain consistently aligned, thereby guaranteeing data integrity and synchronization. This dual-layered approach not only optimizes access times but also ensures the reliability and accuracy of data across all interacting services.\u003c/li\u003e\n \u003cli\u003eMongoDb: MongoDB serves as the database platform that stores comprehensive information about Points of Interest (POIs). Each POI is represented by a MongoDB document, structured to include several critical fields. 
Each document features a unique \u0026quot;id,\u0026quot; which is automatically generated by MongoDB, and a \u0026quot;name\u0026quot; for the POI, extracted from the input description provided by the application. The \u0026quot;description\u0026quot; field contains a narrative that describes the POI, intended for presentation to the end user.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eIn addition, the document contains a \u0026quot;minimum set of embeddings,\u0026quot; representing a condensed version of the larger set of embeddings, and a \u0026quot;set of rest embeddings,\u0026quot; which includes all other embeddings not part of the minimum set. Each embedding is a vector with a dimension of (1, 768), encapsulating the feature representation of the POI.\u003c/p\u003e\n\u003cp\u003eMoreover, the document records other essential metadata about the input video, such as the number of frames extracted from the video, the title of the video, its length, and the creation date of the document. This comprehensive dataset is managed by our system, particularly the Data-Base Service, to ensure the efficient and effective utilization of POI information. The structure and organization of the MongoDB documents facilitate streamlined access and maintenance, supporting the overall functionality and performance of the POI dataset.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.2.2 System Processes (Figure 3)\u003c/strong\u003e\u003c/p\u003e\n\u003col\u003e\n \u003cli\u003eEnriching a POI process: The Matching Service (MS) provides an entry point for enriching requests from the application. These requests typically include a video file representing a POI and accompanying metadata that describes it. Upon receiving an enriching request, the Enrichment API forwards the request data to the Orchestrator. The Orchestrator initiates the process by splitting the video into a set of images or frames, which are then sent to the Features Extractor. 
The Features Extractor processes each image or frame iteratively, applying the CLIP model to generate embeddings\u0026mdash;a vector of features with a size of 768. After processing all frames, the Features Extractor returns a new set of embeddings to the Orchestrator. Next, the Orchestrator forwards the embeddings set to the ARIDF model (see \u003cstrong\u003ePOI Data Set Construction\u003c/strong\u003e). ARIDF processes these embeddings to identify a minimal set of representative embeddings and separates the remaining embeddings into another set. Both sets of embeddings, along with all the input video data and metadata, are then sent to the Storing Service by the Orchestrator. The Storing Service assembles this information into a data object that is forwarded to the Data-Base Service, which is responsible for storing the data in the database and updating the cache accordingly. As a result, the new POI is added to both the database and the cache, making it readily available for querying through our searching flow.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eSearching process: The Matching Service (MS) provides a find entry point, which initiates the searching process. When a request is accepted by the Find API, typically containing an image, the request data is forwarded to the Orchestrator. The Orchestrator begins by requesting the embeddings of the input image from the Features Extractor, which uses the CLIP model to generate a feature vector of length 768. This single embedding is then sent back to the Orchestrator. Subsequently, the Orchestrator forwards the embedding to the Finding Service that requests all POIs from the Data-Base Service, which returns the cached POIs for performance efficiency. The Finding Service iterates over each POI, calculating the Euclidean distance between the input embedding and the embeddings in the minimum set for each POI. The smallest distance is stored and compared across all POIs. 
At the end of this process, the smallest distance is compared to a pre-determined threshold, established through research to minimize false positives while maintaining a satisfactory rate of true positives. If the minimum distance is below the threshold, the image is deemed sufficiently similar to a specific POI in the database. The Finding Service then returns the corresponding description of the most similar POI to the Orchestrator. If the minimum distance is equal to or greater than the threshold, a message indicating that no sufficiently similar POI was found is returned. The Orchestrator then forwards the response to the Find API, which in turn provides an appropriate response to the client. This entire searching flow is optimized to execute within approximately 2-3 seconds, leveraging techniques such as caching, the ARIDF model, storing embeddings instead of images, and the use of Euclidean distance calculations. These optimizations ensure that the system performs efficiently while maintaining high accuracy in identifying similar POIs.\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"5\tEXPERIMENTATION","content":"\u003cp\u003eAll methods were carried out in accordance with relevant guidelines and regulations.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.1 Introduction to the Experiment\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe primary purpose of this experiment was to evaluate the effectiveness and usability of the proposed application solution in addressing our research question: \u0026ldquo;\u003cstrong\u003eHow can image-based positioning be integrated efficiently into a mobile, location aware museum visitors guide?\u003c/strong\u003e\u0026rdquo;.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe key objectives of the experiment were:\u003c/p\u003e\n\u003col\u003e\n \u003cli\u003eAccuracy Evaluation: We aimed to assess the accuracy of the application\u0026apos;s outputs and responses, particularly in delivering relevant and correct location-aware 
information to users. This involved verifying that the application provided accurate and consistent results across different scenarios and user interactions.\u003c/li\u003e\n \u003cli\u003eQuality and Performance: The experiment sought to evaluate the quality and performance of the application under real-world conditions. We examined whether the application operated smoothly, with minimal delays or technical issues, and maintained high responsiveness and reliability. This was particularly important given the high computational demands of the problem the application addresses.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eSince we had already developed a system, we also wanted to know what users thought of it beyond location identification, including usability, comfort, and their overall enjoyment and engagement with the application:\u003c/p\u003e\n\u003col start=\"3\"\u003e\n \u003cli\u003eUsability Evaluation: To assess the usability of the application, we aimed to determine how intuitive and user-friendly the interface is for participants. This involved measuring how easily users could navigate the application, understand the content, and complete tasks without requiring extensive instruction or prior experience.\u003c/li\u003e\n \u003cli\u003eComfort Assessment: We aimed to evaluate the comfort level of users while interacting with the application. This included assessing the physical ease of use, the clarity and accessibility of the interface, and the overall user experience in terms of cognitive load and satisfaction. However, this goal was secondary, as it was not the primary focus of our study. 
Since users interacted with the application, we included some related questions to gather additional insights.\u003c/li\u003e\n \u003cli\u003eUser Enjoyment and Engagement: Since overall enjoyment and engagement with the application is a key indicator of its success in enhancing user interaction and satisfaction, we tried to assess this aspect as well.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eBy focusing on these objectives, the experiment sought to provide a comprehensive evaluation of the application\u0026apos;s capabilities and identify areas for improvement. The findings from this experiment are intended to inform the development of future iterations of the application, ensuring that it meets the needs and expectations of a diverse user base.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.2 Experimental Setup\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe experiment was conducted at the Hecht Museum (https://mushecht.haifa.ac.il/), located at the University of Haifa. The museum\u0026apos;s unique environment is rich in archaeological displays and art, offering a diverse array of exhibits that provide a stimulating backdrop for research.\u003c/p\u003e\n\u003cp\u003eFor the purpose of this experiment, we selected the \u0026quot;Ancient Crafts and Industries\u0026quot; exhibition area, which focuses on seven ancient crafts: metalworking, woodworking, stone vessels, glassmaking, mosaic art, the art of writing, and the physician\u0026apos;s craft. This area includes 29 displays, each showcasing artifacts and information related to these ancient practices.\u003c/p\u003e\n\u003cp\u003eWe decided to consider each display as a POI for the experiment. These POIs were integrated into our application and dataset, allowing us to gather data on participant interactions and experiences within this specific context. 
The choice of this area was motivated by its rich historical and educational content, which enabled us to assess various aspects of the application\u0026apos;s usability, including localization, content engagement, and overall user satisfaction.\u003c/p\u003e\n\u003cp\u003eThis setting offered a unique opportunity to test the application in an environment that mimics real-world usage, where visitors engage with cultural and educational content.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eFor each POI in the \u0026quot;Ancient Crafts and Industries\u0026quot; area, a short video was taken. Most of the videos were short (average duration of about 22.6 seconds, with a standard deviation of about 15.3 seconds; see Table 2). The MS converts each video into a set of frames, where the number of frames extracted is proportional to the duration of the video: each second is configured to yield about 2 frames.\u003c/p\u003e\n\u003cp\u003eAfter extracting the frames, we applied the ARIDF algorithm (BBB) to each POI and saved only the representative frames to the database; see Figures 5 and 6 for examples of frames that we stored in the database.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.3 Methodology\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.3.1 Experimental Design and Participants\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe experiment aimed to evaluate the usability, comfort, quality, and accuracy of our application in a real-world setting. The experiment was approved by the IRB of the Faculty of Social Sciences of the University of Haifa (approval number 094/24). We recruited 30 participants without imposing specific demographic restrictions, requiring only that they be adults. Although the study was open to all eligible individuals, we particularly targeted older adults, as they are more likely to face challenges with new technologies. 
Our primary focus was on museum visitors, who generally have an interest in archaeological artifacts.\u003c/p\u003e\n\u003cp\u003eParticipants were approached randomly and invited to join the study. Those who agreed to participate were asked to complete an \u0026quot;Application Form for Participation in Research and Informed Consent,\u0026quot; which provided detailed information about the research and the experiment. Each participant was provided with a dedicated device with the application pre-installed to ensure a seamless experience. They were instructed to explore at least 15 Points of Interest (POIs) within the \u0026quot;Ancient Crafts and Industries\u0026quot; zone of the Hecht Museum.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eAlthough demographic data was not consistently collected, we managed to gather information for about half of the participants. Out of the 16 participants for whom we collected demographic data, there were 9 males and 7 females, with an average age of 44.5 years. The application was designed to provide detailed information about the exhibits at each POI, and participants were asked to verify the accuracy of this information. The study adhered to ethical standards and was approved by the Faculty of Social Sciences IRB (IRB approval number 094/24).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.3.2 Procedure\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eUpon entering the museum, visitors were approached and given a brief overview of the research. If interested, they were provided with more detailed information about the study and the procedures involved. Those who agreed to participate were asked to sign an informed consent form, which included comprehensive details about their role and what was expected of them.\u003c/p\u003e\n\u003cp\u003eParticipants were then provided with an Android device pre-loaded with the application. 
They were instructed to navigate the designated area and interact with at least 15 of the 29 POIs available in the area. For each POI, participants were required to stand in front of the exhibit and point the device\u0026rsquo;s camera towards it. The application was expected to present information related to the POI. If the information provided was accurate, the interaction was logged as successful. If the information was incorrect, the interaction was logged as a false positive. If the application failed to recognize the POI, participants could try again multiple times from different angles or distances. All interactions were automatically logged, and were also closely monitored and documented by a researcher accompanying the participant. Upon completing their exploration, participants were asked to fill out a System Usability Scale (SUS) (Bangor et al. 2009) questionnaire and participate in a semi-structured interview.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.3.3 Data Collection Methods\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eData collection was conducted through multiple channels to ensure a comprehensive analysis of the application\u0026apos;s performance. The primary source of data was the log from our remote service, which kept a time-stamped record of every interaction, including the images captured, the results of the queries, POI IDs, similarity scores, and response times. This data was crucial for assessing the application\u0026apos;s accuracy and performance.\u003c/p\u003e\n\u003cp\u003eAdditionally, participants completed a SUS questionnaire and took part in a semi-structured interview. Participants also suggested improvements and features they found beneficial or lacking. 
This comprehensive data collection strategy allowed us to evaluate the application\u0026apos;s effectiveness, identify areas for improvement, and gather user-centric insights to refine the solution.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.3.4 Data Analysis\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eLog analysis was used to examine issues related to POI identification: identification errors and the time it took to identify a position.\u003c/p\u003e\n\u003cp\u003eFor the SUS questionnaire, we used an online SUS calculator (https://blucado.com/sus-calculator/) to compute the usability scores, allowing us to quantitatively assess the usability of our solution.\u003c/p\u003e\n\u003cp\u003eRegarding the semi-structured interviews, most of the questions required yes/no responses. We reviewed all responses, tallying the number of affirmative and negative answers. For the open-ended questions, we performed a qualitative analysis to identify the most frequently mentioned topics. This process involved categorizing and coding the responses to determine common themes and insights.\u003c/p\u003e\n\u003cp\u003eBy employing these methods, we ensured a systematic and thorough analysis of both quantitative and qualitative data, providing a comprehensive understanding of the usability and user experience associated with our solution.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.3.5 Data Availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets used and/or quantitatively analyzed during the current study are available from the corresponding author on reasonable request. 
The questionnaires are in Hebrew and on paper; we will explore ways to make them available as well.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.4 Results\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.4.1 Quantitative Findings\u003c/strong\u003e\u003c/p\u003e\n\u003col class=\"decimal_type\"\u003e\n \u003cli\u003eARIDF Performance (Table 2): In most cases, a short video of no more than 100 frames (less than 1 minute) was sufficient to represent a POI, indicating that these videos provided enough data for ARIDF to create a representative set that accurately reflects the original video. As shown in Table 2, as few as 6 frames were selected to represent some POIs, from an initial set of more than 20 frames, demonstrating that many POIs were adequately represented by a very small set. On average, ARIDF reduced the number of frames in the input videos by about three-quarters (from 43.678 frames to 10.071 frames). This reduction minimized the storage spent on redundant data, thereby enhancing the performance of the MS. However, some POIs required longer videos for ARIDF to represent them adequately, resulting in a larger set of frames being stored in the database; in some cases, this number exceeds 40 frames per POI. This is particularly true for POIs that are large or have multiple viewing angles, such as \u0026quot;Mosaic Art,\u0026quot; which needed coverage from all angles (see Figure 7). Another example is \u0026quot;De Materia Medica by the Greek,\u0026quot; which also has multiple angles, requiring a video that covers its three corners and includes overhead coverage.\u003c/li\u003e\n \u003cli\u003eTable 2: An overview of the image-based representation of the POIs, including the number of frames before applying ARIDF and the number of frames remaining after applying ARIDF. 
Additionally, the table includes various statistics related to these POIs.\u003c/li\u003e\n\u003c/ol\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"605\"\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003ePOI #\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003ePOI Name\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003eFrames before\u003c/p\u003e\n \u003cp\u003eARIDF\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003eFrames after\u003c/p\u003e\n \u003cp\u003eARIDF\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003eDuration\u003c/p\u003e\n \u003cp\u003e(sec)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e\u0026nbsp;% reduction\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003e# of visits\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003e# of errors\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e# of Unrecognized POIs\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eDe Materia Medica by the Greek\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e157\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e55\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n 
\u003cp\u003e78\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e65%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eHuman Illnesses in Ancient Times\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e67%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eMetal working\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n 
\u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e78%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eLost Wax\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e78%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eBronze Vessels\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e26\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n 
\u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e77%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eSelection of Metal Objects\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e14\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e79%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eArtifacts made of Iron\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e14\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e79%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003ePhysician\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e90\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e45\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e72%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eGlassmaking-Part1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n 
\u003cp\u003e34\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e82%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eGlassmaking-Part2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e45\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e87%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e11\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eProducing the Raw Glassp-Part1\u003c/p\u003e\n \u003c/td\u003e\n 
\u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e52\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e88%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eProducing the Raw Glassp-Part2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e38\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e84%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n 
\u003cp\u003eOssuary\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e11\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e59%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e14\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eBurial coffin\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e43\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e70%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" 
style=\"width: 189px;\"\u003e\n \u003cp\u003eSelection of Wooden Objects\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e56\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e89%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eLead coffin\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e42\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e76%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n 
\u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eCarpenter\u0026rsquo;s Tools\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e71\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e26\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e34\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e63%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eFrieze fragment\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e67%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n 
\u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eBurial coffin (Sarcophagus)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e36\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e78%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eHebrew Promissory Note\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e76%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n 
\u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eAlphabetic Script\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e68\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e32\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e90%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eJewish Tombstone\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e30\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e14\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e77%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e23\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eHieroglyphic Script\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e36\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e83%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e24\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eJewish ossuaries\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e32\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e78%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n 
\u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eStone Vessels Everyday Life\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e47\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e23\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e87%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e26\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eTables\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e81%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eStone Vessels (Late 2\u003csup\u003end\u003c/sup\u003e Temple Period)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e48\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e23\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e88%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eStone Jar\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e77%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 38px;\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003eMosaic Art\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e123\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e41\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e61\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 57px;\"\u003e\n \u003cp\u003e67%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 38px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 66px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e43.678\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e10.071\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e22.58\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n 
\u003cp\u003eAvg\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003eMin\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e157\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e55\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n 
\u003cp\u003e78\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003eMax\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 189px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 47px;\"\u003e\n \u003cp\u003e27.386\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e10.194\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003e15.281\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 57px;\"\u003e\n \u003cp\u003eSTD\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e3. Logs Analysis: As previously mentioned, we implemented a comprehensive logging mechanism to monitor every action within the system. Additionally, we stored all snapshots captured by users to query the system. 
This setup enabled us to thoroughly track each request, including its status, answer and progression, providing valuable insights into system performance and user interactions.\u003c/p\u003e\n\u003cul\u003e\n \u003cli style=\"font-weight: bold;\"\u003e\u003cstrong\u003eParticipant Perspective (Table 3):\u0026nbsp;\u003c/strong\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eTable 3 summarizes the experiment in numbers from the participants\u0026rsquo; perspective. For each participant, it reports the number of POIs visited, the number of successful search tries, the number of POIs that were not recognized at all, the number of POIs that were recognized wrongly, and the average, minimum, maximum and standard deviation of the number of tries needed to successfully recognize a POI.\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"100%\"\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"2\" valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003eParticipant #\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd rowspan=\"2\" valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e# of\u0026nbsp;\u003c/p\u003e\n \u003cp\u003eVisited POIs\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd rowspan=\"2\" valign=\"top\" style=\"width: 12px;\"\u003e\n \u003cp\u003e# of\u0026nbsp;\u003c/p\u003e\n \u003cp\u003eSuccessful\u0026nbsp;\u003c/p\u003e\n \u003cp\u003eSearching Tries\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd rowspan=\"2\" valign=\"top\" style=\"width: 14px;\"\u003e\n \u003cp\u003e# of Unrecognized POIs\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd rowspan=\"2\" valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e# of\u0026nbsp;\u003c/p\u003e\n \u003cp\u003eWrongly recognized POIs\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003eAvg\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003eMin\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n 
\u003cp\u003eMax\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 6px;\"\u003e\n \u003cp\u003eSTD\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"4\" valign=\"top\" style=\"width: 38px;\"\u003e\n \u003cp\u003eNumber of tries per successfully recognized POI\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.08\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.34\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.59\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.33\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.65\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.43\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.07\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.36\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.44\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.37\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.55\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.47\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.11\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.58\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e11\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e0.93\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e36\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.46\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.08\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.60\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e14\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.07\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.26\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.35\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.38\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.32\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.48\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.53\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.33\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.62\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.24\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.60\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.06\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.24\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e23\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.08\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.29\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e24\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e23\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.42\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.22\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e26\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.06\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.25\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.38\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e26\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.39\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e26\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1.24\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.54\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003e30\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 10px;\"\u003e\n \u003cp\u003e1.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 6px;\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003eSum\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e526\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e606\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"4\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003eAvg\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e17.53\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e20.17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0.27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0.23\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"4\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003eMin\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n 
\u003cp\u003e12.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e13.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"4\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003eMax\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e28.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e36.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e1.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e2.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"4\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 11px;\"\u003e\n \u003cp\u003eSTD\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 10px;\"\u003e\n \u003cp\u003e3.54\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 12px;\"\u003e\n \u003cp\u003e4.96\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 14px;\"\u003e\n \u003cp\u003e0.45\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 11px;\"\u003e\n \u003cp\u003e0.50\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"4\" style=\"width: 38px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eAccording to Table 3, there is variation 
in how participants perceived the application\u0026apos;s interest level. For instance, 8 out of 30 participants visited at least 20 POIs, indicating a higher level of engagement. Conversely, some participants found the application less engaging, despite being asked to visit at least 15 POIs; 2 participants visited only 12 POIs. However, the overall average number of POIs visited by all participants was about 17, which exceeds the minimum requirement.\u003c/p\u003e\n\u003cp\u003eThe application demonstrated high reliability, with users needing fewer than 2 attempts on average to receive a positive response (606 attempts for 526 POI visits, roughly 1.15 attempts per visit). In fact, some participants consistently received a positive response on their first attempt throughout the entire experiment. There were also differences in how easily participants could use the application. For example, participant number 19 managed to visit 15 POIs and received a positive response on the first try for all of them, while participant number 3 had less success. Overall, the average number of attempts per participant (20.17) was only slightly higher than the average number of POIs visited (17.53).\u003c/p\u003e\n\u003cp\u003eAdditionally, our analytics reveal a low incidence of errors: across the 30 participants there were only 8 instances (0.27 per participant on average) in which the POI was not recognized at all, and 7 instances (0.23 per participant on average) in which the POI was incorrectly recognized. This suggests that the application was generally accurate and user-friendly.\u0026nbsp;\u003c/p\u003e\n\u003cul\u003e\n \u003cli style=\"font-weight: bold;\"\u003e\u003cstrong\u003ePOI Perspective (Table 2):\u003c/strong\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThe Points of Interest (POIs) were not uniformly managed, meaning some POIs required more effort to be adequately represented in our database, leading to the need for longer videos compared to others. 
For example, the POI \u0026quot;De Materia Medica by the Greek\u0026quot; had a high average number of attempts needed for successful recognition, meaning that participants had to retry more often than with other POIs. This suggests that this POI was particularly challenging. To address this issue, we uploaded an additional video of this specific POI to our database, which increased the likelihood of accurate detection.\u003c/p\u003e\n\u003cp\u003eAnother issue worth mentioning occurred with the POI \u0026quot;Glassmaking-Part 2.\u0026quot; As indicated in the table, this POI was incorrectly recognized twice, which is relatively high compared to other POIs. In both cases, the POI was mistakenly identified as \u0026quot;Producing the Raw Glass-Part 1.\u0026quot; This error is understandable, as these two POIs appear quite similar (see Figure 8). Although we attempted to resolve this by uploading more videos for each of these POIs, the solution was not particularly effective. Thus, determining the best approach to handle such cases remains an open question.\u003c/p\u003e\n\u003cp\u003e4. SUS score: The average SUS score for all 30 participants was 87.08, which is classified as an excellent rating according to Bangor et al. (2009) (a score within the range of approximately 85 to 100 is typically deemed \u0026quot;Excellent,\u0026quot; signifying a high level of perceived usability for the system). 
This result indicates that users found the system highly intuitive and easy to use, requiring minimal effort to learn and navigate.\u003c/p\u003e\n\u003cp\u003eThe datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5.4.2 Qualitative Findings\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe qualitative findings from the semi-structured interviews provide valuable insights into the user experience and effectiveness of the system. Below is a summary of the responses to each interview question:\u003c/p\u003e\n\u003cp\u003e5. Errors in Location Identification:\u003c/p\u003e\n\u003cp\u003eAmong the 30 participants, there were only 7 instances of incorrect location identification out of more than 500 requests. The findings here were discussed in the quantitative findings section.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e6. Total Number of Location Queries:\u003c/p\u003e\n\u003cp\u003eThe participants made a total of approximately 559 location queries, averaging about 17.5 requests per user.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThis is above the required minimum of 15, which suggests that users were actively engaging with the system.\u003c/p\u003e\n\u003cp\u003e7. Perceived Response Time:\u003c/p\u003e\n\u003cp\u003eRegarding the system\u0026apos;s response time, 1 user reported that the response time was too long, 10 users indicated that it was \u0026quot;a little\u0026quot; too long, while the remaining 19 users felt that the response time was satisfactory. This feedback suggests that while most users were satisfied with the response time, there is room for improvement to enhance the user experience.\u003c/p\u003e\n\u003cp\u003e8. Instances of Failed Location Identification:\u003c/p\u003e\n\u003cp\u003eAcross all 30 participants, there were only 8 instances where the system failed to identify a location out of more than 500 requests. 
The findings here were discussed in the quantitative findings section.\u003c/p\u003e\n\u003cp\u003e9. Clarity on Location Search Initiation:\u003c/p\u003e\n\u003cp\u003eWhen asked if it was clear when the application started searching for their location, 29 participants answered \u0026quot;yes,\u0026quot; while 1 participant responded \u0026quot;no.\u0026quot; This indicates that the majority of users found the system\u0026apos;s cues for initiating location searches to be clear and understandable.\u003c/p\u003e\n\u003cp\u003e10. Clarity on Location Identification:\u003c/p\u003e\n\u003cp\u003eAll 30 participants responded \u0026quot;yes\u0026quot; when asked if it was clear when their location was identified. This unanimous positive feedback highlights the effectiveness of the system in communicating successful location identification to users.\u003c/p\u003e\n\u003cp\u003e11. General Feedback on System Use:\u003c/p\u003e\n\u003cp\u003eThe open-ended question on general feedback yielded predominantly positive responses, such as \u0026quot;Easy to use,\u0026quot; \u0026quot;Great app,\u0026quot; \u0026quot;Encouraging to visit the museum\u0026quot;, \u0026quot;Working great\u0026quot;, \u0026quot;Recommended app\u0026quot; and \u0026quot;Understandable flow.\u0026quot; However, two users mentioned that they needed a little help or guidance when first using the app, suggesting a potential area for improving the onboarding experience.\u003c/p\u003e\n\u003cp\u003e12. Suggestions for Improvement:\u003c/p\u003e\n\u003cp\u003eParticipants also provided constructive feedback for enhancing the system. 
The most frequently mentioned suggestions included support for different languages (8 mentions) and audio description (8 mentions), faster response times (4 mentions), improvements to the app\u0026apos;s design (3 mentions), clearer instructions (2 mentions), enriching descriptions with images and links (1 mention) and making the system more interactive by answering user questions (1 mention). These suggestions offer valuable insights into potential areas for future development.\u003c/p\u003e\n\u003cp\u003e\u0026nbsp;\u003c/p\u003e"},{"header":"6 Discussion","content":"\u003cp\u003eThe experiment conducted at the Hecht Museum provides a comprehensive evaluation of the proposed application's usability, comfort, quality, and accuracy in delivering a seamless user experience within a real-world setting. The results, both quantitative and qualitative, offer significant insights into the application's performance and areas for future enhancement.\u003c/p\u003e \u003cdiv id=\"Sec21\" class=\"Section2\"\u003e \u003ch2\u003e6.1 Accuracy Evaluation\u003c/h2\u003e \u003cp\u003eOne of the primary objectives of this experiment was to evaluate the application's accuracy in identifying and providing information about various Points of Interest (POIs). As outlined in the quantitative findings section, the results are highly promising, with only 7 instances of incorrect location identification out of more than 500 queries and just 8 instances of failed location identification. These low error rates demonstrate the application's reliability and accuracy in delivering the correct information, which is crucial for maintaining user trust and satisfaction.\u003c/p\u003e \u003cp\u003eThe quantitative findings also indicate that most issues occurred with specific POIs, suggesting that certain POIs are more challenging due to factors such as lighting effects, size, position, structure, and more. 
For example, \"De Materia Medica by the Greek\" presents unique challenges as it is of moderate height, allowing for multiple angles of photography, such as from above or the side. Creating a comprehensive video that captures all these angles proved difficult. Initially, we uploaded a video that was not sufficiently comprehensive, leading to several recognition failures. We resolved this by later enriching the database with better representative videos. As a result, as seen in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e, this POI was ultimately represented by a larger video consisting of 157 frames. This solution proved effective, as no participants encountered issues with this POI afterward.\u003c/p\u003e \u003cp\u003eAnother challenge arose when two POIs were very similar, such as \"Glassmaking-Part 2\" and \"Producing the Raw Glass-Part 1.\" The visual similarity between these two POIs led to identification errors for the former. This issue underscores the complexity of accurately distinguishing between similar exhibits. Our experiment was conducted in a single area of the museum, raising further questions about the application's performance if deployed across the entire museum or in larger museums. The strategy of enriching the database with longer videos did not significantly improve the results in these cases, leaving this area an open question for future research.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec22\" class=\"Section2\"\u003e \u003ch2\u003e6.2 Quality and Performance\u003c/h2\u003e \u003cp\u003eWhile the response time was generally satisfactory for most users, it was identified as an area with potential for improvement. Nineteen participants reported being satisfied with the response time; however, 11 participants felt that it was either too long or slightly too long. 
Improving the application's responsiveness could further enhance the user experience, minimizing moments of frustration or disengagement.\u003c/p\u003e \u003cp\u003eThe application demonstrated good quality and performance, as evidenced by its low error rate and the low average number of attempts required to obtain a positive result (see Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e for the average number of visited POIs and the average number of attempts needed to successfully identify a POI). Additionally, the relatively good response time contributed to a positive user experience. While there is still room for improvement, the application generally meets the desired quality and performance standards.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec23\" class=\"Section2\"\u003e \u003ch2\u003e6.3 Usability Evaluation\u003c/h2\u003e \u003cp\u003eThe System Usability Scale (SUS) score of 87.08, categorized as \"Excellent,\" is a strong indicator of the application's high usability. This score not only reflects the ease with which participants navigated the application but also suggests that the application requires minimal effort to learn and use, which is crucial for user adoption and satisfaction. The SUS score places the application in the top tier of systems, making it highly competitive in the realm of digital tools designed for museum environments or similar contexts.\u003c/p\u003e \u003cp\u003eA key aspect of this experiment was the intentional targeting of older-aged visitors, with an average participant age of around 44.5. This demographic is often less comfortable with using new technologies, making their feedback particularly valuable in assessing the application's usability. The positive responses from this group underscore the success of the application in delivering a user-friendly experience that is accessible even to those who might typically struggle with technology. 
Comments such as \"Easy to use,\" \"Great app,\" and \"Recommended app\" highlight the application's effectiveness in overcoming common barriers faced by older users. The high level of engagement, as evidenced by the average of 17.5 location queries per participant, indicates that users found the application both intuitive and engaging, further validating its usability.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec24\" class=\"Section2\"\u003e \u003ch2\u003e6.4 Comfort Assessment\u003c/h2\u003e \u003cp\u003eComfort, both physical and cognitive, is another critical aspect of user experience. The experiment's results indicate that the application successfully provides a comfortable user experience, with users able to easily interact with the system and understand its functionality. The fact that 29 out of 30 participants found the location search initiation cues clear, and all participants understood when their location was identified, underscores the clarity and accessibility of the application\u0026rsquo;s interface.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec25\" class=\"Section2\"\u003e \u003ch2\u003e6.5 User Enjoyment and Engagement\u003c/h2\u003e \u003cp\u003eThe experiment also highlighted the application's success in engaging users, as demonstrated by the active participation and the generally positive feedback. The application not only facilitated an informative and enjoyable experience but also encouraged participants to explore the museum in greater depth. Comments like \"Encouraging to visit the museum\" suggest that the application has the potential to significantly enhance visitor engagement and satisfaction, which is a key goal of the solution.\u003c/p\u003e \u003cp\u003eThe constructive feedback provided by participants offers valuable insights into areas for further development. 
The most common suggestions, such as the need for faster response times, more interactive features, and support for additional languages, provide a clear roadmap for future iterations of the application. By addressing these areas, the application can evolve to better meet the needs of a broader audience and provide an even more enriched user experience.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec26\" class=\"Section2\"\u003e \u003ch2\u003e6.6 Limitations\u003c/h2\u003e \u003cp\u003eWhile the experiment yielded valuable insights, it is important to acknowledge its limitations. The study was conducted within a specific area of the Hecht Museum, focusing on the \"Ancient Crafts and Industries\" section with a limited number of POIs. This restricted scope may not fully represent the application's performance in other museum settings or with a broader range of exhibits. Additionally, the application relies on wireless or device connectivity to upload requests and download responses, a feature that may not be available in every indoor environment. This dependence on connectivity could limit the app's effectiveness in settings with poor or no network coverage.\u003c/p\u003e \u003cp\u003eFuture research should explore the application\u0026rsquo;s effectiveness in larger and more diverse museums, where it may encounter different challenges in terms of layout, exhibit types, and visitor interactions, as well as varying levels of connectivity.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec27\" class=\"Section2\"\u003e \u003ch2\u003e6.7 Contribution to the Field of Indoor Localization in Cultural Heritage Sites\u003c/h2\u003e \u003cp\u003eThe field of indoor localization within cultural heritage sites presents unique challenges that differ significantly from those encountered in other environments. 
Unlike locations such as shopping malls or office buildings, for which conventional indoor localization systems are typically designed, cultural heritage sites like museums and historical landmarks require localization solutions that are not only accurate but also sensitive to the context and significance of the environment. These sites often feature complex layouts, dense exhibits, and a rich array of POIs, all of which must be navigated in a manner that enhances, rather than detracts from, the visitor's experience.\u003c/p\u003e \u003cp\u003eOne of the primary difficulties in developing indoor localization systems for cultural heritage sites is the need to balance technological innovation with the preservation of the site's integrity. Any solution must be unobtrusive, ensuring that the technology does not overshadow the cultural and educational value of the exhibits. Additionally, the system must be robust enough to handle the unique architectural features of heritage sites, such as thick walls, varied room sizes, and often limited or inconsistent lighting conditions. These factors can significantly impact the performance of traditional localization technologies, making it difficult to achieve the level of accuracy and reliability required.\u003c/p\u003e \u003cp\u003eOur proposed application addresses these challenges by integrating an image-based location detection service into a user-friendly interface that is tailored to the needs of cultural heritage sites. The application\u0026rsquo;s ability to accurately identify and provide information about various POIs, even in the complex environment of the Hecht Museum, demonstrates its potential as a valuable tool for indoor localization in similar contexts. 
By utilizing the camera on a mobile device, the application minimizes the need for additional infrastructure, such as beacons or Wi-Fi triangulation, which can be intrusive or challenging to deploy in historical settings.\u003c/p\u003e \u003c/div\u003e"},{"header":"7 Conclusion and Future Directions","content":"\u003cp\u003eThe experiment conducted at the Hecht Museum has provided substantial evidence of the potential of image-based positioning. The results demonstrate that an image-based positioning system can be effectively designed to create a robust and user-friendly application, as indicated by the high SUS score and positive feedback from participants.\u003c/p\u003e \u003cp\u003eThe experiment also identified several areas for future development to enhance the application's usability and effectiveness. The development team should focus on improving the application's response times, expanding language support, and adding more interactive features to further boost user satisfaction. By addressing the feedback gathered from this experiment, the application can continue to evolve and maintain a competitive edge in the digital tools market for educational and cultural institutions.\u003c/p\u003e \u003cp\u003eAnother promising direction for future work is the integration of audio descriptions, which could either replace or complement text descriptions. This feature would make the application more accessible to users with visual impairments or those who prefer auditory information, thereby enhancing the overall user experience and broadening the application\u0026rsquo;s appeal across different demographic groups.\u003c/p\u003e \u003cp\u003eFinally, expanding the application\u0026rsquo;s scope to function effectively in larger and more diverse museum environments is essential. 
Future research should investigate how the application performs in different settings, with various exhibit types and layouts, to ensure its versatility and adaptability within the cultural heritage sector.\u003c/p\u003e \u003cp\u003e\u003cb\u003eDuring the preparation of this work the author(s) used ChatGPT in order to enhance the text and improve clarity. After using this tool, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.\u003c/b\u003e\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eBashar Egbariya developed the system, carried out the experimentation, analyzed the results and drafted the initial version of the manuscript. Rotem Dror guided and supervised the use of LLM for content generation for the mobile guide. Tsvi Kuflik was the initiator of the study, guided the project, took active part in the evaluation and iterative development of the system, guided the analysis of the results and revised, reviewed and reformatted the manuscript. Ilan Shimshoni led and guided the machine vision aspects, took an active part in analyzing the results and reviewing the manuscript.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eAbdulateef, A. T., \u0026amp; Makki, S. A. (2023, December). A survey of indoor positioning system based-smartphone. In\u0026nbsp;AIP Conference Proceedings\u0026nbsp;(Vol. 2977, No. 1). AIP Publishing.\u003c/li\u003e\n \u003cli\u003eAshraf, I., Hur, S., \u0026amp; Park, Y. (2019 a). Application of deep convolutional neural networks and smartphone sensors for indoor localization.\u0026nbsp;Applied Sciences,\u0026nbsp;9(11), 2337.\u003c/li\u003e\n \u003cli\u003eAshraf, I., Hur, S., Park, S., \u0026amp; Park, Y. (2019 b). 
DeepLocate: Smartphone based indoor localization with a deep neural network ensemble classifier.\u0026nbsp;Sensors,\u0026nbsp;20(1), 133.\u003c/li\u003e\n \u003cli\u003eBangor, A., Kortum, P., \u0026amp; Miller, J. (2009). Determining what individual SUS scores mean: Adding an adjective rating scale.\u0026nbsp;Journal of usability studies,\u0026nbsp;4(3), 114-123.\u003c/li\u003e\n \u003cli\u003eBarbieri, L., Brambilla, M., Trabattoni, A., Mervic, S., \u0026amp; Nicoli, M. (2021). UWB localization in a smart factory: Augmentation methods and experimental assessment.\u0026nbsp;IEEE Transactions on Instrumentation and Measurement,\u0026nbsp;70, 1-18.\u003c/li\u003e\n \u003cli\u003eBarbour, N., \u0026amp; Schmidt, G. (2001). Inertial sensor technology trends.\u0026nbsp;IEEE Sensors journal,\u0026nbsp;1(4), 332-339.\u003c/li\u003e\n \u003cli\u003eBasiri, A., Lohan, E. S., Moore, T., Winstanley, A., Peltola, P., Hill, C., ... \u0026amp; e Silva, P. F. (2017). Indoor location based services challenges, requirements and usability of current solutions.\u0026nbsp;Computer Science Review,\u0026nbsp;24, 1-12.\u003c/li\u003e\n \u003cli\u003eBenini, L., Farella, E., \u0026amp; Guiducci, C. (2006). Wireless sensor networks: Enabling technology for ambient intelligence.\u0026nbsp;Microelectronics journal,\u0026nbsp;37(12), 1639-1649.\u003c/li\u003e\n \u003cli\u003eBrusch, I. (2022). Identification of travel styles by learning from consumer-generated images in online travel communities. Information \u0026amp; Management, 59(6), 103682.\u003c/li\u003e\n \u003cli\u003eButun, I., \u0026Ouml;sterberg, P., \u0026amp; Gidlund, M. (2019, June). Preserving location privacy in cyber-physical systems. In\u0026nbsp;2019 IEEE Conference on Communications and Network Security (CNS)\u0026nbsp;(pp. 1-6). IEEE.\u003c/li\u003e\n \u003cli\u003eCavallari, T., Golodetz, S., Lord, N. A., Valentin, J., Prisacariu, V. A., Di Stefano, L., \u0026amp; Torr, P. H. (2019). 
Real-time RGB-D camera pose estimation in novel scenes using a relocalisation cascade.\u0026nbsp;IEEE transactions on pattern analysis and machine intelligence,\u0026nbsp;42(10), 2465-2477.\u003c/li\u003e\n \u003cli\u003eChen, R., \u0026amp; Chen, L. (2021). Smartphone-based indoor positioning technologies.\u0026nbsp;Urban informatics, 467-490.\u003c/li\u003e\n \u003cli\u003eChintalapudi, K., Padmanabha Iyer, A., \u0026amp; Padmanabhan, V. N. (2010, September). Indoor localization without the pain. In\u0026nbsp;Proceedings of the sixteenth annual international conference on Mobile computing and networking\u0026nbsp;(pp. 173-184).\u003c/li\u003e\n \u003cli\u003eConte, G., \u0026amp; Doherty, P. (2008, March). An integrated UAV navigation system based on aerial image matching. In\u0026nbsp;2008 IEEE Aerospace Conference\u0026nbsp;(pp. 1-10). IEEE.\u003c/li\u003e\n \u003cli\u003eCorrea, A., Barcelo, M., Morell, A., \u0026amp; Vicario, J. L. (2017). A review of pedestrian indoor positioning systems for mass market applications.\u0026nbsp;Sensors,\u0026nbsp;17(8), 1927.\u003c/li\u003e\n \u003cli\u003eDabove, P., Di Pietra, V., Piras, M., Jabbar, A. A., \u0026amp; Kazim, S. A. (2018, April). Indoor positioning using Ultra-wide band (UWB) technologies: Positioning accuracies and sensors\u0026apos; performances. In\u0026nbsp;2018 IEEE/ION Position, Location and Navigation Symposium (PLANS)\u0026nbsp;(pp. 175-184). IEEE.\u003c/li\u003e\n \u003cli\u003eDavidson, P., \u0026amp; Pich\u0026eacute;, R. (2016). A survey of selected indoor positioning methods for smartphones.\u0026nbsp;IEEE Communications surveys \u0026amp; tutorials,\u0026nbsp;19(2), 1347-1370.\u003c/li\u003e\n \u003cli\u003eDong, E., Xu, J., Wu, C., Liu, Y., \u0026amp; Yang, Z. (2019, April). Pair-navi: Peer-to-peer indoor navigation with mobile visual slam. In\u0026nbsp;IEEE INFOCOM 2019-IEEE conference on computer communications\u0026nbsp;(pp. 1189-1197). 
IEEE.\u003c/li\u003e\n \u003cli\u003e\u0026nbsp;AAA removed for anonymization\u003c/li\u003e\n \u003cli\u003eEl-Sheimy, N., \u0026amp; Li, Y. (2021). Indoor navigation: State of the art and future trends.\u0026nbsp;Satellite Navigation,\u0026nbsp;2(1), 7.\u003c/li\u003e\n \u003cli\u003eElmenreich, W. (2002). An introduction to sensor fusion.\u0026nbsp;Vienna University of Technology, Austria,\u0026nbsp;502, 1-28.\u003c/li\u003e\n \u003cli\u003eFeldmann, S., Kyamakya, K., Zapater, A., \u0026amp; Lue, Z. (2003, June). An Indoor Bluetooth-Based Positioning System: Concept, Implementation and Experimental Evaluation. In\u0026nbsp;International conference on wireless networks\u0026nbsp;(Vol. 272).\u003c/li\u003e\n \u003cli\u003eGhasemi, Y., Jeong, H., Choi, S. H., Park, K. B., \u0026amp; Lee, J. Y. (2022). Deep learning-based object detection in augmented reality: A systematic review.\u0026nbsp;Computers in Industry,\u0026nbsp;139, 103661.\u003c/li\u003e\n \u003cli\u003eGhouaiel, N., Garbaya, S., Cieutat, J. M., \u0026amp; Jessel, J. P. (2017). Mobile augmented reality in museums: towards enhancing visitor\u0026apos;s learning experience.\u0026nbsp;International Journal of Virtual Reality,\u0026nbsp;17(1), 21-31.\u003c/li\u003e\n \u003cli\u003eGu, Y., Chen, M., Ren, F., \u0026amp; Li, J. (2016, April). HED: Handling environmental dynamics in indoor WiFi fingerprint localization. In\u0026nbsp;2016 IEEE wireless communications and networking conference\u0026nbsp;(pp. 1-6). IEEE.\u003c/li\u003e\n \u003cli\u003eGu, Y., Lo, A., \u0026amp; Niemegeers, I. (2009). A survey of indoor positioning systems for wireless personal networks.\u0026nbsp;IEEE Communications surveys \u0026amp; tutorials,\u0026nbsp;11(1), 13-32.\u003c/li\u003e\n \u003cli\u003eGuo, G., Chen, R., Ye, F., Peng, X., Liu, Z., \u0026amp; Pan, Y. (2019). 
Indoor smartphone localization: A hybrid WiFi RTT-RSS ranging approach.\u0026nbsp;IEEE Access,\u0026nbsp;7, 176767-176781.\u003c/li\u003e\n \u003cli\u003eGupta, P., Sharma, V., Gairolla, J., Thakur, U., Pandey, N., Khurana, D., \u0026amp; Ramavat, A. S. (2024). Mobile Based Indoor Hospital Navigation System for Tertiary Care Setup: A Scoping Review.\u003c/li\u003e\n \u003cli\u003eHe, K., Zhang, X., Ren, S., \u0026amp; Sun, J. (2016). Deep residual learning for image recognition. In\u0026nbsp;Proceedings of the IEEE conference on computer vision and pattern recognition\u0026nbsp;(pp. 770-778).\u003c/li\u003e\n \u003cli\u003eHe, S., \u0026amp; Chan, S. H. G. (2015). Wi-Fi fingerprint-based indoor positioning: Recent advances and comparisons.\u0026nbsp;IEEE Communications Surveys \u0026amp; Tutorials,\u0026nbsp;18(1), 466-490.\u003c/li\u003e\n \u003cli\u003eHsu, H. H., Chang, J. K., Peng, W. J., Shih, T. K., Pai, T. W., \u0026amp; Man, K. L. (2018). Indoor localization and navigation using smartphone sensory data.\u0026nbsp;Annals of Operations Research,\u0026nbsp;265, 187-204.\u003c/li\u003e\n \u003cli\u003eHuang, H., \u0026amp; Gartner, G. (2010).\u0026nbsp;A survey of mobile indoor navigation systems\u0026nbsp;(pp. 305-319). Springer Berlin Heidelberg.\u003c/li\u003e\n \u003cli\u003eJackermeier, R., \u0026amp; Ludwig, B. (2018). Exploring the limits of PDR-based indoor localisation systems under realistic conditions.\u0026nbsp;Journal of Location Based Services,\u0026nbsp;12(3-4), 231-272.\u003c/li\u003e\n \u003cli\u003eJahne, B. (Ed.). (2000).\u0026nbsp;Computer vision and applications: a guide for students and practitioners. Elsevier.\u003c/li\u003e\n \u003cli\u003eJamshidi, S., Ensafi, M., \u0026amp; Pati, D. (2020). Wayfinding in interior environments: An integrative review.\u0026nbsp;Frontiers in Psychology,\u0026nbsp;11, 549628.\u003c/li\u003e\n \u003cli\u003eJiang, Y., Zheng, X., \u0026amp; Feng, C. (2023).
Toward Multi-area Contactless Museum Visitor Counting with Commodity WiFi.\u0026nbsp;ACM Journal on Computing and Cultural Heritage,\u0026nbsp;16(1), 1-26.\u003c/li\u003e\n \u003cli\u003eJo, H. J., \u0026amp; Kim, S. (2018). Indoor smartphone localization based on LOS and NLOS identification.\u0026nbsp;Sensors,\u0026nbsp;18(11), 3987.\u003c/li\u003e\n \u003cli\u003eK\u0026aacute;rn\u0026iacute;k, J., \u0026amp; Streit, J. (2016). Summary of available indoor location techniques.\u0026nbsp;IFAC-PapersOnLine,\u0026nbsp;49(25), 311-317.\u003c/li\u003e\n \u003cli\u003eKim Geok, T., Zar Aung, K., Sandar Aung, M., Thu Soe, M., Abdaziz, A., Pao Liew, C., ... \u0026amp; Yong, W. H. (2020). Review of indoor positioning: Radio wave technology.\u0026nbsp;Applied Sciences,\u0026nbsp;11(1), 279.\u003c/li\u003e\n \u003cli\u003eKim, J., Lee, S., \u0026amp; Kim, H. (2018). A survey on computer vision-based indoor localization methods. Sensors, 18(10), 3234.\u003c/li\u003e\n \u003cli\u003eKlette, R. (2014).\u0026nbsp;Concise computer vision\u0026nbsp;(Vol. 233, pp. 2-1). London: Springer.\u003c/li\u003e\n \u003cli\u003eKolivand, H., El Rhalibi, A., Tajdini, M., Abdulazeez, S., \u0026amp; Praiwattana, P. (2018). Cultural heritage in marker-less augmented reality: A survey. In\u0026nbsp;Advanced methods and new materials for cultural heritage preservation. IntechOpen.\u003c/li\u003e\n \u003cli\u003eKuo, Y. S., Pannuto, P., Hsiao, K. J., \u0026amp; Dutta, P. (2014, September). Luxapose: Indoor positioning with mobile phones and visible light. In\u0026nbsp;Proceedings of the 20th annual international conference on Mobile computing and networking\u0026nbsp;(pp. 447-458).\u003c/li\u003e\n \u003cli\u003eLi, Q., Zhu, J., Liu, T., Garibaldi, J., Li, Q., \u0026amp; Qiu, G. (2017, November). Visual landmark sequence-based indoor localization. In\u0026nbsp;Proceedings of the 1st Workshop on Artificial Intelligence and Deep Learning for Geographic Knowledge Discovery\u0026nbsp;(pp. 
14-23).\u003c/li\u003e\n \u003cli\u003eLi, Z., Liu, F., Yang, W., Peng, S., \u0026amp; Zhou, J. (2021). A survey of convolutional neural networks: analysis, applications, and prospects.\u0026nbsp;IEEE transactions on neural networks and learning systems,\u0026nbsp;33(12), 6999-7019.\u003c/li\u003e\n \u003cli\u003eLiu, M., Cheng, L., Qian, K., Wang, J., Wang, J., \u0026amp; Liu, Y. (2020). Indoor acoustic localization: A survey.\u0026nbsp;Human-centric Computing and Information Sciences,\u0026nbsp;10, 1-24.\u003c/li\u003e\n \u003cli\u003eLiu, T., Zhang, X., Li, Q., \u0026amp; Fang, Z. (2017). A visual-based approach for indoor radio map construction using smartphones.\u0026nbsp;Sensors,\u0026nbsp;17(8), 1790.\u003c/li\u003e\n \u003cli\u003eLowe, D. G. (2004). Distinctive image features from scale-invariant keypoints.\u0026nbsp;International journal of computer vision,\u0026nbsp;60, 91-110.\u003c/li\u003e\n \u003cli\u003eLymberopoulos, D., \u0026amp; Liu, J. (2017). The Microsoft indoor localization competition: Experiences and lessons learned.\u0026nbsp;IEEE Signal Processing Magazine,\u0026nbsp;34(5), 125-140.\u003c/li\u003e\n \u003cli\u003eMartinez del Horno, M., Garc\u0026iacute;a-Varea, I., \u0026amp; Orozco Barbosa, L. (2019). Calibration of Wi-Fi-based indoor tracking systems for Android-based smartphones.\u0026nbsp;Remote Sensing,\u0026nbsp;11(9), 1072.\u003c/li\u003e\n \u003cli\u003eMeliones, A., \u0026amp; Sampson, D. (2018). Blind MuseumTourer: A system for self-guided tours in museums and blind indoor navigation.\u0026nbsp;Technologies,\u0026nbsp;6(1), 4.\u003c/li\u003e\n \u003cli\u003eMisra, P. (2006).\u0026nbsp;Global positioning system: Signals, measurements, and performance. Ganga-Jamuna Press.\u003c/li\u003e\n \u003cli\u003e\u0026nbsp;BBB removed for anonymization\u003c/li\u003e\n \u003cli\u003eMorar, A., Moldoveanu, A., Mocanu, I., Moldoveanu, F., Radoi, I. E., Asavei, V., ... \u0026amp; Butean, A. (2020).
A comprehensive survey of indoor localization methods based on computer vision.\u0026nbsp;Sensors,\u0026nbsp;20(9), 2641.\u003c/li\u003e\n \u003cli\u003eMoreno, A., \u0026amp; Angulo, I. (2012 a). A Reliable ICT Solution for Organ Transport Traceability and Incidences Reporting Based on Sensor Networks and Wireless Technologies. In\u0026nbsp;Distributed Computing and Artificial Intelligence: 9th International Conference\u0026nbsp;(pp. 395-402). Springer Berlin Heidelberg.\u003c/li\u003e\n \u003cli\u003eMoreno, A., Angulo, I., Perallos, A., Landaluce, H., Zuazola, I. J. G., Azpilicueta, L., ... \u0026amp; Villadangos, J. (2012 b). IVAN: Intelligent van for the distribution of pharmaceutical drugs.\u0026nbsp;Sensors,\u0026nbsp;12(5), 6587-6609.\u003c/li\u003e\n \u003cli\u003eMorley, S. K., Sullivan, J. P., Carver, M. R., Kippen, R. M., Friedel, R. H. W., Reeves, G. D., \u0026amp; Henderson, M. G. (2017). Energetic particle data from the global positioning system constellation.\u0026nbsp;Space Weather,\u0026nbsp;15(2), 283-289.\u003c/li\u003e\n \u003cli\u003eMorris, T. (2004).\u0026nbsp;Computer vision and image processing. Palgrave Macmillan Ltd.\u003c/li\u003e\n \u003cli\u003eMur-Artal, R., Montiel, J. M. M., \u0026amp; Tardos, J. D. (2015). ORB-SLAM: a versatile and accurate monocular SLAM system.\u0026nbsp;IEEE transactions on robotics,\u0026nbsp;31(5), 1147-1163.\u003c/li\u003e\n \u003cli\u003eNaser, R. S., Lam, M. C., Qamar, F., \u0026amp; Zaidan, B. B. (2023). Smartphone-based indoor localization systems: A systematic literature review.\u0026nbsp;Electronics,\u0026nbsp;12(8), 1814.\u003c/li\u003e\n \u003cli\u003eO\u0026apos;Shea, K. (2015). An introduction to convolutional neural networks. \u003cem\u003earXiv preprint arXiv:1511.08458\u003c/em\u003e.\u003c/li\u003e\n \u003cli\u003ePiras, M., Lingua, A., Dabove, P., \u0026amp; Aicardi, I. (2014, May). Indoor navigation using Smartphone technology: A future challenge or an actual possibility? 
In\u0026nbsp;2014 IEEE/ION Position, Location and Navigation Symposium-PLANS 2014\u0026nbsp;(pp. 1343-1352). IEEE.\u003c/li\u003e\n \u003cli\u003ePodevijn, N., Plets, D., Trogh, J., Karaagac, A., Haxhibeqiri, J., Hoebeke, J., ... \u0026amp; Joseph, W. (2018, September). Performance comparison of RSS algorithms for indoor localization in large open environments. In\u0026nbsp;2018 International Conference on Indoor Positioning and Indoor Navigation (IPIN)\u0026nbsp;(pp. 1-6). IEEE.\u003c/li\u003e\n \u003cli\u003ePoulose, A., \u0026amp; Han, D. S. (2019a). Hybrid indoor localization using IMU sensors and smartphone camera.\u0026nbsp;Sensors,\u0026nbsp;19(23), 5084.\u003c/li\u003e\n \u003cli\u003ePoulose, A., Eyobu, O. S., \u0026amp; Han, D. S. (2019b). An indoor position-estimation algorithm using smartphone IMU sensor data.\u0026nbsp;IEEE Access,\u0026nbsp;7, 11165-11177.\u003c/li\u003e\n \u003cli\u003ePundir, A. K., Jagannath, J. D., \u0026amp; Ganapathy, L. (2019, January). Improving supply chain visibility using IoT-internet of things. In\u0026nbsp;2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC)\u0026nbsp;(pp. 0156-0162). IEEE.\u003c/li\u003e\n \u003cli\u003eRadford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., ... \u0026amp; Sutskever, I. (2021, July). Learning transferable visual models from natural language supervision. In\u0026nbsp;International conference on machine learning\u0026nbsp;(pp. 8748-8763). PMLR.\u003c/li\u003e\n \u003cli\u003eRavi, N., Shankar, P., Frankel, A., Elgammal, A., \u0026amp; Iftode, L. (2005, August). Indoor localization using camera phones. In\u0026nbsp;Seventh IEEE Workshop on Mobile Computing Systems \u0026amp; Applications (WMCSA\u0026apos;06 Supplement)\u0026nbsp;(pp. 1-7). IEEE.\u003c/li\u003e\n \u003cli\u003eRenaudin, V., Yalak, O., Tom\u0026eacute;, P., \u0026amp; Merminod, B. (2007).
Indoor navigation of emergency agents.\u0026nbsp;European Journal of Navigation,\u0026nbsp;5(3), 36-45.\u003c/li\u003e\n \u003cli\u003eRublee, E., Rabaud, V., Konolige, K., \u0026amp; Bradski, G. (2011, November). ORB: An efficient alternative to SIFT or SURF. In\u0026nbsp;2011 International conference on computer vision\u0026nbsp;(pp. 2564-2571). IEEE.\u003c/li\u003e\n \u003cli\u003eSawaby, A. M., Noureldin, H. M., Mohamed, M. S., Omar, M. O., Shaaban, N. S., Ahmed, N. N., ... \u0026amp; Mostafa, H. (2019, May). A smart indoor navigation system over BLE. In\u0026nbsp;2019 8th International Conference on Modern Circuits and Systems Technologies (MOCAST)\u0026nbsp;(pp. 1-4). IEEE.\u003c/li\u003e\n \u003cli\u003eShenoy, A., \u0026amp; Thillaiarasu, N. (2022, March). A survey on different computer vision based human activity recognition for surveillance applications. In\u0026nbsp;2022 6th International Conference on Computing Methodologies and Communication (ICCMC)\u0026nbsp;(pp. 1372-1376). IEEE.\u003c/li\u003e\n \u003cli\u003eStock, O., Zancanaro, M., Busetta, P., Callaway, C., Kr\u0026uuml;ger, A., Kruppa, M., ... \u0026amp; Rocchi, C. (2007). Adaptive, intelligent presentation of information for the museum visitor in PEACH. User Modeling and User-Adapted Interaction, 17, 257-304.\u003c/li\u003e\n \u003cli\u003eStockman, G., \u0026amp; Shapiro, L. G. (2001).\u0026nbsp;Computer vision. Prentice Hall PTR.\u003c/li\u003e\n \u003cli\u003eSyahidi, A. A., Kiyokawa, K., \u0026amp; Okura, F. (2023, October). Computer Vision in Smart City Application: A Mapping Review. In\u0026nbsp;2023 6th International Conference on Applied Computational Intelligence in Information Systems (ACIIS)\u0026nbsp;(pp. 1-6). IEEE.\u003c/li\u003e\n \u003cli\u003eTan, S. Y., Lee, K. J., \u0026amp; Lam, M. C. (2020).
A Shopping Mall Indoor Navigation Application using Wi-Fi Positioning System.\u0026nbsp;International Journal,\u0026nbsp;9(4).\u003c/li\u003e\n \u003cli\u003eTrichopoulos, G., Konstantakis, M., Caridakis, G., Katifori, A., \u0026amp; Koukouli, M. (2023). Crafting a Museum Guide Using ChatGPT4. Big Data and Cognitive Computing, 7(3), 148.\u003c/li\u003e\n \u003cli\u003eVaralatchoumy, M., Divakaran, S., \u0026amp; Ram, R. A. (2023, May). Foodflare: An Indoor Navigation System. In\u0026nbsp;International Conference on Applications of Machine Intelligence and Data Analytics (ICAMIDA 2022)\u0026nbsp;(pp. 722-734). Atlantis Press.\u003c/li\u003e\n \u003cli\u003eVillaespesa, E., \u0026amp; Crider, S. (2021). Computer vision tagging the metropolitan museum of art\u0026apos;s collection: A comparison of three systems.\u0026nbsp;Journal on Computing and Cultural Heritage (JOCCH),\u0026nbsp;14(3), 1-17.\u003c/li\u003e\n \u003cli\u003eWang, B., Liu, K., \u0026amp; Zhao, J. (2016, August). Inner attention based recurrent neural networks for answer selection. In\u0026nbsp;Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)\u0026nbsp;(pp. 1288-1297).\u003c/li\u003e\n \u003cli\u003eWang, S. S. (2018). A BLE-based pedestrian navigation system for car searching in indoor parking garages.\u0026nbsp;Sensors,\u0026nbsp;18(5), 1442.\u003c/li\u003e\n \u003cli\u003eWang, X., Gao, L., Mao, S., \u0026amp; Pandey, S. (2016). CSI-based fingerprinting for indoor localization: A deep learning approach.\u0026nbsp;IEEE transactions on vehicular technology,\u0026nbsp;66(1), 763-776.\u003c/li\u003e\n \u003cli\u003e\u0026nbsp;CCC removed for anonymization\u003c/li\u003e\n \u003cli\u003eWu, J. (2017). Introduction to convolutional neural networks.\u0026nbsp;National Key Lab for Novel Software Technology. Nanjing University. China,\u0026nbsp;5(23), 495.\u003c/li\u003e\n \u003cli\u003eXie, L., Lee, F., Liu, L., Kotani, K., \u0026amp; Chen, Q. 
(2020). Scene recognition: A comprehensive survey.\u0026nbsp;Pattern Recognition,\u0026nbsp;102, 107205.\u003c/li\u003e\n \u003cli\u003eXingli, G., Yaning, L., \u0026amp; Ruihui, Z. (2018, March). Indoor positioning technology based on deep neural networks. In\u0026nbsp;2018 Ubiquitous Positioning, Indoor Navigation and Location-Based Services (UPINLBS)\u0026nbsp;(pp. 1-6). IEEE.\u003c/li\u003e\n \u003cli\u003eXiong, Z., Sottile, F., Spirito, M. A., \u0026amp; Garello, R. (2011, February). Hybrid indoor positioning approaches based on WSN and RFID. In\u0026nbsp;2011 4th IFIP International Conference on New Technologies, Mobility and Security\u0026nbsp;(pp. 1-5). IEEE.\u003c/li\u003e\n \u003cli\u003eYang, S., Ma, L., Jia, S., \u0026amp; Qin, D. (2020). An improved vision-based indoor positioning method.\u0026nbsp;IEEE Access,\u0026nbsp;8, 26941-26949.\u003c/li\u003e\n \u003cli\u003eYang, Z., Wu, C., \u0026amp; Liu, Y. (2012, August). Locating in fingerprint space: Wireless indoor localization with little human intervention. In\u0026nbsp;Proceedings of the 18th annual international conference on Mobile computing and networking\u0026nbsp;(pp. 269-280).\u003c/li\u003e\n \u003cli\u003eYao, Y., Pan, L., Fen, W., Xu, X., Liang, X., \u0026amp; Xu, X. (2020). A robust step detection and stride length estimation for pedestrian dead reckoning using a smartphone.\u0026nbsp;IEEE Sensors Journal,\u0026nbsp;20(17), 9685-9697.\u003c/li\u003e\n \u003cli\u003eYe, H., Chen, Y., \u0026amp; Liu, M. (2019, May). Tightly coupled 3d lidar inertial odometry and mapping. In\u0026nbsp;2019 International Conference on Robotics and Automation (ICRA)\u0026nbsp;(pp. 3144-3150). IEEE.\u003c/li\u003e\n \u003cli\u003eYin, Y., Yu, F., Xu, Y., Yu, L., \u0026amp; Mu, J. (2017). Network location-aware service recommendation with random walk in cyber-physical systems.\u0026nbsp;Sensors,\u0026nbsp;17(9), 2059.\u003c/li\u003e\n \u003cli\u003eYoussef, M. A., Agrawala, A., \u0026amp; Shankar, A. U. 
(2003, March). WLAN location determination via clustering and probability distributions. In\u0026nbsp;Proceedings of the First IEEE International Conference on Pervasive Computing and Communications, 2003 (PerCom 2003)\u0026nbsp;(pp. 143-150). IEEE.\u003c/li\u003e\n \u003cli\u003eYuan, Y., Melching, C., Yuan, Y., \u0026amp; Hogrefe, D. (2018). Multi-device fusion for enhanced contextual awareness of localization in indoor environments.\u0026nbsp;IEEE Access,\u0026nbsp;6, 7422-7431.\u003c/li\u003e\n \u003cli\u003eZhang, L., Huang, L., Yi, Q., Wang, X., Zhang, D., \u0026amp; Zhang, G. (2022, September). Positioning method of pedestrian dead reckoning based on human activity recognition assistance. In\u0026nbsp;2022 IEEE 12th International Conference on Indoor Positioning and Indoor Navigation (IPIN)\u0026nbsp;(pp. 1-8). IEEE.\u003c/li\u003e\n \u003cli\u003eZhang, J., \u0026amp; Shah, M. (2019). Visual indoor localization: A survey. IEEE Signal Processing Magazine, 36(5), 128-140.\u003c/li\u003e\n \u003cli\u003eZhou, F., \u0026amp; De la Torre, F. (2015). Factorized graph matching.\u0026nbsp;IEEE transactions on pattern analysis and machine intelligence,\u0026nbsp;38(9), 1774-1789.\u003c/li\u003e\n \u003cli\u003eZou, Z., Chen, Q., Uysal, I., \u0026amp; Zheng, L. (2014). Radio frequency identification enabled wireless sensing for intelligent food logistics.
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 372(2017), 20130313.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Image-based positioning, Mobile visitors' guide, Indoor positioning, Cultural Heritage","lastPublishedDoi":"10.21203/rs.3.rs-6142584/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6142584/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThis study presents the development and evaluation of an image-based positioning system for a mobile museum visitors\u0026rsquo; guide system comprising two components: An Android application and a backend service. The application identifies visitors\u0026rsquo; location by capturing and sending images to the server, which then determines their location within the museum. The server maintains a comprehensive dataset of the museum's points of interest (POIs), and content about them. The content was created automatically, using large language model (LLM) and corrected by museum staff, who can also upload videos and descriptive information for each POI via the application. The image-based indoor positioning solution uses a deep learning-based model for representing an image as a vector of features. This approach enables the system to simply calculate distances between vectors and ultimately determine the similarity between them, allowing for accurate POI identification. A user study aimed at evaluating users' perception of the systems' accuracy and ease of use was conducted at the Hecht Museum, where participants used the developed application and subsequently completed a System Usability Scale (SUS) questionnaire, along with other open-ended questions. 
The high scores and the highly positive feedback obtained indicate an overall excellent usability experience, especially with respect to the accuracy and speed of POI identification. The feedback also provided insights into areas where our solution can be enhanced and further developed.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e","manuscriptTitle":"Using Image-based positioning for seamless localization in cultural heritage setting","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-04-03 11:11:08","doi":"10.21203/rs.3.rs-6142584/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"d9c20867-c0a0-4563-830c-531094477d27","owner":[],"postedDate":"April 3rd, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":46394114,"name":"Physical sciences/Mathematics and computing/Information technology"},{"id":46394115,"name":"Physical sciences/Mathematics and computing/Computer science"}],"tags":[],"updatedAt":"2025-04-24T04:08:30+00:00","versionOfRecord":[],"versionCreatedAt":"2025-04-03 11:11:08","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6142584","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6142584","identity":"rs-6142584","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}